does SE clustering actually work? #16

thorek1 · 2021-08-26T10:04:50Z

Hi,

I was wondering whether standard error clustering actually works. In the code example you can see that clustering does not change the standard errors whereas I would expect them to move and fixest estimates them to change:

n <- 1000

set.seed(1)
dat <- data.frame(y = runif(n) > .5,
                  x1 = rnorm(n),
                  x2 = rnorm(n) + 2,
                  x2 = (rnorm(n) + 2) / 2,
                  x3 = sample(LETTERS,n, T),
                  x4 = sample(letters,n, T))


library(alpaca)

alpaca::feglm(y ~ x1 + x2 | x3, dat) |> summary()
#binomial - logit link
#
#y ~ x1 + x2 | x3
#
#Estimates:
#   Estimate Std. error z value Pr(> |z|)
#x1 -0.01404    0.06397  -0.219     0.826
#x2 -0.01322    0.06193  -0.213     0.831
#
#residual deviance= 1341.74,
#null deviance= 1384.69,
#n= 1000, l= [26]
#
#Number of Fisher Scoring Iterations: 4 
alpaca::feglm(y ~ x1 + x2 | x3 | x4, dat) |> summary()
#binomial - logit link
#
#y ~ x1 + x2 | x3 | x4
#
#Estimates:
#   Estimate Std. error z value Pr(> |z|)
#x1 -0.01404    0.06397  -0.219     0.826
#x2 -0.01322    0.06193  -0.213     0.831
#
#residual deviance= 1341.74,
#null deviance= 1384.69,
#n= 1000, l= [26]
#
#Number of Fisher Scoring Iterations: 4 

library(fixest)

fixest::feglm(y ~ x1 + x2 | x3, dat, family = 'binomial')
#GLM estimation, family = binomial, Dep. Var.: y
#Observations: 1,000 
#Fixed-effects: x3: 26
#Standard-errors: Clustered (x3) 
#    Estimate Std. Error   t value Pr(>|t|)) 
#x1 -0.014040   0.054451 -0.257846  0.796526 
#x2 -0.013218   0.055863 -0.236617  0.812954 
#---
#Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Log-Likelihood:  -670.9   Adj. Pseudo R2: -0.007979
#           BIC: 1,535.2     Squared Cor.:  0.041794
fixest::feglm(y ~ x1 + x2 | x3, dat, family = 'binomial', cluster = 'x4')
#GLM estimation, family = binomial, Dep. Var.: y
#Observations: 1,000 
#Fixed-effects: x3: 26
#Standard-errors: Clustered (x4) 
#    Estimate Std. Error   t value Pr(>|t|)) 
#x1 -0.014040   0.058249 -0.241034  0.809529 
#x2 -0.013218   0.058738 -0.225035  0.821952 
#---
#Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Log-Likelihood:  -670.9   Adj. Pseudo R2: -0.007979
#           BIC: 1,535.2     Squared Cor.:  0.041794

amrei-stammann · 2021-08-26T17:25:12Z

Hello,

summary() returns per default standard errors based on the inverse of the negative Hessian (not clustered).

If you want to compute clustered standard errors, you need to specify further arguments, see example below:

# clustered by x3
summary(feglm(y ~ x1 + x2 | x3, dat) , type = "clustered", cluster = ~ x3)

# clustered by x4
summary(feglm(y ~ x1 + x2 | x3 | x4, dat) , type = "clustered", cluster = ~ x4)

A further example is available in the vignette:
https://cran.r-project.org/web/packages/alpaca/vignettes/trade.html

Best wishes,
Amrei

thorek1 · 2021-09-11T16:36:48Z

Thanks for your answer. I should have read the documentation more thoroughly.

In the second example you give, why would you need to specify the x4 in the formula? It seems redundant and does not change the functionality (not needed in your first example). I would have expected, from reading the help page of feglm that specifying it there would be enough to have clustered standard errors printed.

Using summary to print the clustered standard errors unfortunately makes a typical workflow of mine impossible:

results <- list()
alpaca::feglm(y ~ x1 + x2 | x3, dat) |> summary() -> results[[1]]
alpaca::feglm(y ~ x1 + x2 | x3 | x4, dat) |> summary() -> results[[2]]

library(texreg)
results |> texreg()
#Error in extract(l[[i]], ...) : 
#  Neither texreg nor broom supports models of class summary.feglm.


alpaca::feglm(y ~ x1 + x2 | x3, dat) -> results[[1]]
alpaca::feglm(y ~ x1 + x2 | x3 | x4, dat) -> results[[2]]

library(texreg)
results |> texreg()
# no clustered standard errors

Integration with other tools (such as texreg, stargazer,...) makes life a lot easer as is the case for fixest.

pachadotdev · 2024-07-23T01:47:58Z

here is a a modification that I made to simplify the clustering #17 (comment)

jay-sf mentioned this issue Oct 21, 2022

Cluster specification in feglm formula has no effect? #17

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

does SE clustering actually work? #16

does SE clustering actually work? #16

thorek1 commented Aug 26, 2021

amrei-stammann commented Aug 26, 2021

thorek1 commented Sep 11, 2021

pachadotdev commented Jul 23, 2024

does SE clustering actually work? #16

does SE clustering actually work? #16

Comments

thorek1 commented Aug 26, 2021

amrei-stammann commented Aug 26, 2021

thorek1 commented Sep 11, 2021

pachadotdev commented Jul 23, 2024