We discuss when and how to deal with possibly clustered errors in linear regression models. Specifically, we consider situations in which a regression model may plausibly be treated as having error terms that are arbitrarily correlated within known clusters but uncorrelated across them. The methods we discuss include various covariance matrix estimators, possibly combined with various methods of obtaining critical values, several bootstrap procedures, and randomization inference. Special attention is given to models with few treated clusters and clusters that vary in size, where inference may be problematic. Two empirical examples and a simulation experiment illustrate the methods we discuss and the concerns we raise.
QED Working Paper Number
clustered data, cluster-robust variance estimator, CRVE, wild cluster bootstrap, robust inference
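
As an illustration of the cluster-robust variance estimator (CRVE) named in the keywords, the following is a minimal sketch. It assumes one common form of the estimator, often labelled CV1, which sums the outer products of the within-cluster score vectors and applies the small-sample factor G/(G-1) × (N-1)/(N-k). The function name `ols_cluster_robust` and its arguments are hypothetical and are not taken from the paper, which may rely on other variants.

```python
import numpy as np


def ols_cluster_robust(X, y, cluster_ids):
    """OLS estimates with a CV1-type cluster-robust covariance matrix.

    Hypothetical helper for illustration only. Errors are allowed to be
    arbitrarily correlated within clusters and are assumed to be
    uncorrelated across clusters.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    n, k = X.shape

    # OLS coefficients and residuals.
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta

    # "Meat" of the sandwich: sum over clusters of s_g s_g',
    # where s_g = X_g' u_g is the score vector for cluster g.
    labels = np.unique(cluster_ids)
    G = len(labels)
    meat = np.zeros((k, k))
    for g in labels:
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)

    # Assumed small-sample correction (the CV1 convention).
    correction = (G / (G - 1)) * ((n - 1) / (n - k))
    cov = correction * (XtX_inv @ meat @ XtX_inv)
    return beta, cov, G
```

A t statistic formed from `beta` and the square root of a diagonal element of `cov` is often compared with the t distribution with G - 1 degrees of freedom. That is only one of several ways of obtaining critical values, and it can be unreliable when there are few treated clusters or when cluster sizes vary a lot, which are the cases the abstract singles out.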