Date of Graduation
Doctor of Philosophy in Educational Statistics and Research Methods (PhD)
Rehabilitation, Human Resources and Communication Disorders
Ronna C. Turner
Second Committee Member
George S. Denny
Third Committee Member
Gary W. Ritter
Education, Causal design, Educational assessment, Educational statistics, Quasi-experimental methods, Synthetic control
Synthetic control methods are an innovative matching technique first introduced within the economics and political science literature and have begun to find application in educational research as well. Synthetic controls create an aggregate-level, time-series comparison for a single treated unit of interest for causal inference with observational data. However, the strict statistical assumptions associated with matching methods for causal inference raise concerns about unobserved bias under some data models and conditions of data availability. The small but increasing set of existing synthetic control studies with student achievement measures as the outcome of interest suggests that research is warranted into the effectiveness of this methodology in creating unbiased comparisons with the sensitivity necessary to detect treatment effects typical of educational interventions. In this study I examined these concerns at an empirical level by analyzing patterns of minimum necessary effects for statistical significance across multiple data models, contrasting covariate specifications, and pools of available comparison units. Data included five years of public elementary school math and reading scores from Measures of Academic Progress (MAP) exams for approximately 35,000 unique students. Using placebo tests for statistical inference as recommended in the literature, I calculated the standardized differences necessary for both a cross-sectional and a cohort model of student progress. Results showed that the addition of demographic covariates provided no additional predictive power over matching on prior MAP achievement alone. Further, near-perfect matches across the pretreatment period were found often enough that a treated unit could not reasonably reach posttreatment effect sizes necessary for detection without also achieving a near-perfect synthetic control match.
The placebo tests were sensitive to the increased difficulty of finding close matches when additional pretreatment time points were included, but overall the magnitudes of necessary effects decreased as a result. Average z-score differences across four pools of comparison units ranged from 0.13 to 0.45 for statistical significance at the 5% level and from 0.10 to 0.35 at the 10% level. I offer recommendations for using synthetic controls in evaluating educational interventions with student achievement outcomes and for further research into the effectiveness of these methods in reaching conclusions based on unbiased comparisons.
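The placebo-test logic summarized in the abstract can be illustrated with a minimal sketch. This is not the author's code, and the abstract does not state the exact procedure. The sketch assumes a simple rank-based placebo test: each comparison unit is treated as a placebo, its posttreatment gap from its own synthetic control is recorded in z-score units, and the treated unit's gap is ranked against those placebo gaps. The function names `placebo_p_value` and `minimum_detectable_gap` and all numbers are hypothetical.

```python
def placebo_p_value(treated_gap, placebo_gaps):
    """Share of all units (treated included) whose absolute posttreatment
    gap is at least as large as the treated unit's absolute gap."""
    all_gaps = [treated_gap] + list(placebo_gaps)
    extreme = sum(1 for g in all_gaps if abs(g) >= abs(treated_gap))
    return extreme / len(all_gaps)


def minimum_detectable_gap(placebo_gaps, alpha):
    """Absolute gap the treated unit must strictly exceed for the
    rank-based p-value to be at most alpha (assumed procedure).

    Returns None when the pool is too small to ever reach alpha.
    """
    n_units = len(placebo_gaps) + 1  # placebos plus the treated unit
    # Number of placebo gaps allowed to match or exceed the treated gap.
    allowed = int(alpha * n_units + 1e-9) - 1
    if allowed < 0:
        return None
    ranked = sorted((abs(g) for g in placebo_gaps), reverse=True)
    if allowed >= len(ranked):
        return 0.0
    return ranked[allowed]


# Hypothetical placebo gaps for a pool of 19 comparison units.
gaps = [(i / 100) * (1 if i % 2 else -1) for i in range(1, 20)]
print(minimum_detectable_gap(gaps, 0.05))
print(minimum_detectable_gap(gaps, 0.10))
print(placebo_p_value(0.25, gaps))
```

Under this assumed procedure, the necessary effect depends on both the size of the comparison pool and how closely the placebo units are matched, which is consistent with the abstract's point that near-perfect pretreatment matches shape the effect sizes needed for detection.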
Johnson, Clay Stephen, "Compared to What? The Effectiveness of Synthetic Control Methods for Causal Inference in Educational Assessment" (2013). Theses and Dissertations. 946.