Continuing to Learn from School Choice in Washington, D.C.
“There is no conclusive evidence that the [DC Opportunity Scholarship Program (OSP)] affected student achievement.” This one statement, taken from a 214-page government report for which one of us (Wolf) was lead author, is often treated as the first, last and only word on the effectiveness of our nation’s only federally sponsored private school choice program.
While the statement is technically correct, it is by no means complete.
Our study of the OSP pilot program from 2004 to 2010 provided conclusive evidence that the OSP increased high school graduation rates. In addition, positive impacts of the program on student reading scores were clear for certain subgroups in years two, three and four, and there was a positive effect on overall student reading scores in year three that reached the 99 percent level of statistical confidence. Unfortunately, these bright spots are often ignored when our study is discussed.
Our final-year estimate of the overall positive effect on reading scores only reached the 94 percent level of statistical confidence, but the U.S. Department of Education insists on 95 percent confidence, so our overall effects had to be characterized as not “conclusive.” When we published our findings in the highly ranked, peer-reviewed Journal of Policy Analysis and Management (JPAM), our scientific peers agreed that 90 percent statistical confidence was sufficient, so our published article reports positive effects on student reading scores in years two, three and four. Reporting results at the 90 percent confidence level as marginally significant is relatively standard in social science, even though the U.S. Department of Education sets a higher bar.
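For readers who want to see the arithmetic behind these thresholds, here is a minimal sketch in Python, using made-up numbers rather than our actual estimates, of how a point estimate and its standard error translate into a “level of statistical confidence,” and how the same result can clear the 90 percent bar while missing the 95 percent bar.

```python
from scipy import stats

# Illustrative numbers only -- not the actual OSP estimates.
effect = 0.13        # hypothetical impact estimate, in effect-size units
std_error = 0.069    # hypothetical standard error

z = effect / std_error                       # test statistic
p_value = 2 * (1 - stats.norm.cdf(abs(z)))   # two-sided p-value
confidence = 1 - p_value                     # "level of statistical confidence"

print(f"p = {p_value:.3f}, confidence = {confidence:.0%}")   # roughly p = 0.060, 94%
print("significant at 95 percent?", p_value < 0.05)          # the Department of Education bar
print("significant at 90 percent?", p_value < 0.10)          # the bar accepted by the JPAM reviewers
```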
The initial OSP evaluation was also limited to comparing the results for students who won a scholarship lottery with similar students who applied for a scholarship but lost the lottery. This purely experimental analysis was certainly unbiased, but it only yielded the average effect of being offered a scholarship, not the effect of actually using it.
Some lottery winners never used their scholarships, but outcomes for these “treatment decliners” were factored into the treatment group average. Some lottery losers still gained access to private schooling, but outcomes for these “control crossovers” were factored into the control group average. If there is an effect of private schooling at all, these two groups that didn’t comply with their lottery assignment would push the estimate of that effect towards zero. Surely a program has a larger average effect just on the people who actually use it.
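A back-of-the-envelope illustration, with invented numbers, shows how much this non-compliance can dilute a simple winner-versus-loser comparison.

```python
# Invented numbers for illustration -- not the actual OSP compliance rates or effects.
effect_on_users = 0.25            # assumed true effect of private schooling, in effect-size units
share_of_winners_who_use = 0.70   # the other 30% are "treatment decliners"
share_of_losers_who_use = 0.10    # "control crossovers"

# The winner-vs-loser comparison mixes users and non-users in both groups,
# so it recovers only a fraction of the effect on actual users.
offer_effect = (share_of_winners_who_use - share_of_losers_who_use) * effect_on_users
print(round(offer_effect, 2))     # 0.15 -- just 60 percent of the 0.25 effect on users
```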
Instrumental variables (IV) analysis is a powerful tool for recovering unbiased estimates of the effect of using a program even when some participants don’t comply with their lottery assignment. Using the original lottery result as an instrumental variable adjusts for that non-compliance and isolates the effect of the program on the students who actually used it. At the time of our original OSP evaluation, however, IV analysis was considered too novel to use. It is now quite common, and one of us (Wolf) used this method in a follow-up analysis focused on the impact of the OSP on parent satisfaction.
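Here is a minimal simulation of the idea, using the simplest version of IV (the Wald estimator). The compliance rates and the 0.25 effect size are invented for illustration; the point is only that dividing the offer effect by the gap in private school attendance recovers the effect on the students who actually used a scholarship.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated lottery: winning raises the chance of actually attending a private school.
won_lottery = rng.integers(0, 2, n)
uses_private = (rng.random(n) < np.where(won_lottery == 1, 0.70, 0.10)).astype(int)

# Simulated outcome: baseline ability plus an assumed 0.25 boost from private schooling.
ability = rng.normal(0, 1, n)
score = ability + 0.25 * uses_private + rng.normal(0, 0.5, n)

# Wald/IV estimate: (offer effect on scores) / (offer effect on private school use)
offer_effect = score[won_lottery == 1].mean() - score[won_lottery == 0].mean()
take_up_gap = uses_private[won_lottery == 1].mean() - uses_private[won_lottery == 0].mean()
print(round(offer_effect / take_up_gap, 2))   # close to the true 0.25 effect on users
```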
Beyond IV analysis, there are other, less experimental methods for estimating the effect of using a school voucher or scholarship. We were interested in testing how these methods stack up against experimental methods such as IV analysis, so we conducted a “Within Study Comparison” (WSC) of the effects of the OSP on student reading and math outcomes as estimated by a variety of statistical methods. We used the IV estimates of the effect of using an Opportunity Scholarship to attend a private school of choice as our “benchmark” estimate of the true effect.
We found a positive effect on student reading scores that was statistically significant beyond the 90 percent confidence level in years two, three, and four, just as the JPAM article reported. When we used the highly reliable IV approach, however, the estimated reading gains of OSP participants actually increased each year, ending with a gain of 24 percent of a standard deviation, which is equivalent to about an extra 5.5 months of learning over the four-year period.
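One common way to translate an effect size into “months of learning” is to divide it by an assumed rate of typical annual growth and scale by the length of a school year. The annual growth figure below is an assumption chosen for illustration, not a number taken from our study, but it shows how a gain of 24 percent of a standard deviation works out to roughly five and a half months.

```python
# Hedged, illustrative conversion of an effect size into "months of learning."
effect_size_sd = 0.24            # the reported four-year reading gain, in standard deviations
assumed_annual_growth_sd = 0.40  # assumption: typical annual reading growth for this age group
months_per_school_year = 9       # assumption: length of a school year in months

extra_months = effect_size_sd / assumed_annual_growth_sd * months_per_school_year
print(round(extra_months, 1))    # 5.4 months under these assumptions
```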
Importantly, in our WSC, many non-experimental methods produced biased results.
School choice researchers, including us, sometimes use student matching or control variables when lotteries are not available. We are always careful to match on or control for student test scores at baseline, before the program started. Good thing, too, because our WSC shows that methods that do not control for baseline student ability produce estimates that are alarmingly biased, and not in predictable ways. The results of our WSC validate the approach of researchers (such as Greg Forster here, and us here and here) of including only experimental studies in summary reviews of the test-score effects on students participating in school choice programs.
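A small simulation, again with invented numbers, illustrates why the baseline control matters. In this simplified world, where students who seek a voucher differ only in their observed baseline scores, adding the baseline control removes the bias; the lesson of our WSC is that real-world selection is not always captured this easily.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Invented scenario: students with lower baseline scores are more likely to seek a voucher.
baseline = rng.normal(0, 1, n)
uses_voucher = (rng.random(n) < 1 / (1 + np.exp(baseline))).astype(int)
true_effect = 0.20
outcome = 0.8 * baseline + true_effect * uses_voucher + rng.normal(0, 1, n)

def ols_slope(y, *covariates):
    """Return the coefficient on the first covariate from an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

print(round(ols_slope(outcome, uses_voucher), 2))            # naive: badly biased against the program (negative here)
print(round(ols_slope(outcome, uses_voucher, baseline), 2))  # with baseline control: about 0.20
```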
For most of our WSC, we use the same sample of students as in the original evaluation. For our final analysis, however, we include all public school students in Washington, D.C., including those who never applied to the OSP.
Interestingly, we find that non-experimental methods are all biased against the D.C. school voucher program. OSP participants were disadvantaged relative to their non-applicant peers in subtle ways that were accounted for with IV analysis but NOT by non-experimental methods.
In other words, private school choice programs do not always attract the “cream of the crop,” as is often assumed. Students seek private school choice when they are struggling for reasons that may not be captured by non-experimental analytic methods. The policy question, of course, is whether or not they should be given that choice.