Abstract:
This paper proposes a causal supervised machine learning algorithm to uncover treatment
effect heterogeneity in classical regression discontinuity (RD) designs. Extending
Athey and Imbens (2016), I develop a criterion for building an honest “regression discontinuity
tree”, where each leaf of the tree contains the RD estimate of a treatment
(assigned by a common cutoff rule) conditional on the values of some pre-treatment
covariates. Which covariates capture treatment effect heterogeneity is unknown a priori, and
the task of the algorithm is to discover them without invalidating inference. I study the
performance of the method through Monte Carlo
simulations, and apply it to the data set compiled by Pop-Eleches and Urquiola (2013)
to uncover various sources of heterogeneity in the impact of attending a better secondary
school in Romania.
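
To make the idea of a leaf-level RD estimate concrete, the following is a minimal sketch in Python with NumPy. It is not the splitting criterion developed in the paper. It assumes a sharp design with the cutoff normalized to zero, a uniform kernel with a fixed bandwidth, and a single split on one covariate that is specified by hand rather than discovered by the algorithm. It also omits the honest sample splitting that valid inference would require. The function names (`rd_estimate`, `leaf_estimates`) and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np


def rd_estimate(x, y, cutoff=0.0, bandwidth=1.0):
    """Sharp RD estimate: jump at the cutoff from a local linear fit of y on x."""
    x = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)
    keep = np.abs(x) <= bandwidth          # uniform kernel, fixed bandwidth (assumption)
    x, y = x[keep], y[keep]
    d = (x >= 0).astype(float)             # treated at or above the cutoff
    # Columns: intercept, jump at the cutoff, slope below, change in slope above
    design = np.column_stack([np.ones_like(x), d, x, d * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def leaf_estimates(x, y, z, threshold):
    """RD estimate in each of the two leaves defined by the split z <= threshold."""
    left = z <= threshold
    return {
        "z <= threshold": rd_estimate(x[left], y[left]),
        "z > threshold": rd_estimate(x[~left], y[~left]),
    }


# Simulated data (hypothetical): the treatment effect is 1 when z <= 0.5 and 3 otherwise.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1, 1, n)                  # running variable, cutoff at 0
z = rng.uniform(0, 1, n)                   # pre-treatment covariate
tau = np.where(z <= 0.5, 1.0, 3.0)
y = 0.5 * x + tau * (x >= 0) + rng.normal(0, 0.1, n)

est = leaf_estimates(x, y, z, threshold=0.5)
print(est)
```

In this sketch each leaf reports its own discontinuity estimate, which is the object the abstract describes. Choosing the splitting covariate and threshold from the data, while keeping inference valid, is what the proposed criterion is meant to handle.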