Inference With Large Clustered Datasets | Queen's Economics Department

QED Working Paper Number

1365

Inference using large datasets is not nearly as straightforward as conventional econometric theory suggests when the disturbances are clustered, even with very small intra-cluster correlations. The information contained in such a dataset grows much more slowly with the sample size than it would if the observations were independent. Moreover, inferences become increasingly unreliable as the dataset gets larger. These assertions are based on an extensive series of estimations undertaken using a large dataset taken from the U.S. Current Population Survey.
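The claim that information grows slowly with the sample size can be illustrated with a textbook special case. The sketch below is not the paper's earnings-equation setup. It assumes a simple sample mean, equal-sized clusters, and a constant intra-cluster correlation `rho`. Under those assumptions the variance of the mean is inflated by the factor 1 + (m - 1) * rho, where m is the cluster size, so the effective number of independent observations can never exceed G / rho for G clusters. The function names and the values G = 50 and rho = 0.01 are illustrative assumptions, not figures taken from the paper.

```python
def design_effect(cluster_size: int, rho: float) -> float:
    """Variance inflation of a sample mean under equicorrelated clusters.

    Assumes every cluster has `cluster_size` observations and a common
    intra-cluster correlation `rho` (a textbook simplification).
    """
    return 1.0 + (cluster_size - 1) * rho


def effective_sample_size(num_clusters: int, cluster_size: int, rho: float) -> float:
    """Number of independent observations carrying the same information."""
    n = num_clusters * cluster_size
    return n / design_effect(cluster_size, rho)


if __name__ == "__main__":
    num_clusters = 50   # hypothetical number of clusters
    rho = 0.01          # hypothetical, deliberately small intra-cluster correlation
    for cluster_size in (100, 10_000, 1_000_000):
        n = num_clusters * cluster_size
        n_eff = effective_sample_size(num_clusters, cluster_size, rho)
        print(f"N = {n:>11,d}  effective N = {n_eff:,.1f}")
```

With the number of clusters held fixed, adding observations per cluster raises the nominal sample size without limit, while the effective sample size stays below `num_clusters / rho`. In this simplified setting, that is why even a very small intra-cluster correlation matters a great deal in a large dataset.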

Keywords

placebo laws

cluster-robust inference

earnings equation

wild cluster bootstrap

CPS data

sample size
