QED Working Paper Number 1355
          Inference using difference-in-differences with clustered data requires care. Previous research has shown that, when there are few treated clusters, t-tests based on cluster-robust variance estimators (CRVEs) severely overreject, and different variants of the wild cluster bootstrap can either overreject or underreject dramatically. We study two randomization inference (RI) procedures. A procedure based on estimated coefficients may be unreliable when clusters are heterogeneous. A procedure based on t-statistics typically performs better (although by no means perfectly) under the null, but at the cost of some power loss. An empirical example demonstrates that alternative procedures can yield dramatically different inferences.
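          The contrast between the two RI procedures can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a balanced panel with a common treatment date, so that each cluster can be summarized by one number d_g (its post-period mean outcome minus its pre-period mean) and the DiD estimate reduces to a treated-minus-control difference in means of d_g. The function names `did_stats` and `ri_pvalues`, the heteroskedasticity-robust variance formula, the requirement of at least two treated clusters, and the choice to count the actual assignment among the re-randomizations are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def did_stats(d, treated):
    """Return (beta_hat, t_stat) for a treated-minus-control difference in means.

    d       : one post-minus-pre outcome change per cluster (assumed summary)
    treated : boolean mask marking the treated clusters
    """
    d1, d0 = d[treated], d[~treated]
    beta = d1.mean() - d0.mean()
    # Robust (HC0-style) variance of a difference in means, one obs per cluster.
    var = ((d1 - d1.mean()) ** 2).sum() / len(d1) ** 2 \
        + ((d0 - d0.mean()) ** 2).sum() / len(d0) ** 2
    # A zero variance makes the t-statistic undefined; report NaN in that case.
    t = beta / np.sqrt(var) if var > 0 else float("nan")
    return beta, t


def ri_pvalues(d, treated, tol=1e-12):
    """RI p-values based on the coefficient and on the t-statistic.

    Every way of choosing the same number of treated clusters is enumerated,
    and each statistic is compared in absolute value with its observed value.
    """
    d = np.asarray(d, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    n_clusters, n_treated = len(d), int(treated.sum())
    if n_treated < 2 or n_clusters - n_treated < 2:
        raise ValueError("this sketch needs at least two clusters per group")

    beta_obs, t_obs = did_stats(d, treated)
    n_assign = n_beta = n_t = 0
    for combo in combinations(range(n_clusters), n_treated):
        mask = np.zeros(n_clusters, dtype=bool)
        mask[list(combo)] = True
        beta, t = did_stats(d, mask)
        n_assign += 1
        n_beta += abs(beta) >= abs(beta_obs) - tol
        n_t += abs(t) >= abs(t_obs) - tol  # NaN compares as False
    return n_beta / n_assign, n_t / n_assign


if __name__ == "__main__":
    # Hypothetical data: six clusters, the first two treated.
    changes = [5.0, 4.0, 0.1, -0.2, 0.3, 0.0]
    is_treated = [True, True, False, False, False, False]
    p_beta, p_t = ri_pvalues(changes, is_treated)
    print(f"RI-beta p-value: {p_beta:.4f}, RI-t p-value: {p_t:.4f}")
```

          With six clusters and two treated there are only 15 possible assignments, so neither p-value in this sketch can fall below 1/15. This shows how coarse RI p-values become when the number of possible assignments is small.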
Keywords
          CRVE
          grouped data
          clustered data
          panel data
          randomization inference
          difference-in-differences
          wild cluster bootstrap
          DiD