Excerpted from the SEP by the Communities in Schools program, a subgrantee of the Edna McConnell Clark Foundation.
The introductory paragraph describes how each design will address specific research questions and aspects of the program (i.e., different levels of analysis).
This section provides detail on our proposed evaluation design for estimating the impact of CIS. The CIS model, as described earlier, operates on two levels: Level 2 intensive services targeted to individual high-need students (e.g., students with poor academic achievement, poor attendance, or other signs of risk for school failure or dropping out), and Level 1 preventive services for the whole school. Given the multilevel nature of the intervention strategies, our proposed evaluation design uses two complementary studies to address the effects of CIS at each of its levels. The first is an individual-level randomized controlled trial (RCT) in middle schools and high schools, designed to investigate the impact of Level 2 services for individual students; the second is a school-level comparative interrupted time-series (CITS) study that investigates the impact of the full model on the whole school. The RCT evaluates the student-level impact of the most intensive CIS services for the students with the greatest need for support, while the CITS design explores the broader impact of CIS services experienced across a whole school at the elementary, middle, and high school levels. Details of each approach are provided throughout this plan.
A rationale for the combined approach is discussed, identifying how that approach addresses threats to internal and external validity.
To provide robust estimates of program impact, the evaluation measures outcomes both within and across schools, pooling results across schools to increase the sample sizes for statistical analyses and thereby improve the power of the research to detect impacts. When possible, analyses will also pool data across grade levels within each school, further increasing sample sizes and power. The inclusion of multiple schools and grades also improves the generalizability of the overall impact findings. Pooling findings across schools is not uncommon; it was used, for example, in estimating the impact of the Reading First program.1
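As a rough illustration of why pooling matters for power (a hypothetical sketch: the per-arm sample sizes are placeholders, and the clustering of students within schools, which would inflate these figures, is ignored), the minimum detectable effect size of a simple two-group comparison shrinks as the pooled sample grows:

```python
# Hypothetical illustration of how pooling samples across schools and grades
# shrinks the minimum detectable effect size (MDES). Sample sizes below are
# placeholders, not the study's actual enrollment figures, and school-level
# clustering (design effects) is ignored for simplicity.
from statsmodels.stats.power import TTestIndPower

power_solver = TTestIndPower()

for label, n_per_arm in [("one school", 60), ("ten schools pooled", 600)]:
    # Solve for the smallest standardized effect detectable with 80% power
    # at alpha = .05 in a balanced two-group comparison.
    mdes = power_solver.solve_power(nobs1=n_per_arm, alpha=0.05, power=0.80,
                                    ratio=1.0, alternative="two-sided")
    print(f"{label}: n = {n_per_arm} per arm -> MDES ~ {mdes:.2f} SD")
```

Under these placeholder numbers, pooling ten schools cuts the detectable effect from roughly half a standard deviation to under a fifth of one, which is the power rationale stated above.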
Student-Level Impact Evaluation of CIS Level 2 Services:
The Random Assignment Study (Level 2 RCT)
The randomized controlled trial is considered the “gold standard” for rigor in evaluating program effectiveness. The key distinguishing feature of this design is that students, after being assessed for eligibility and recruited for participation in the program but before the intervention begins, are randomly assigned either to receive the services or “treatment” under study or not. In this case, a subset of students in each grade at each CIS school will be selected at random from a larger pool of eligible students to receive the Level 2 CIS intervention services. The remaining eligible students are assigned to a “control” or “business as usual” condition within the school and do not receive the Level 2 services. Because the random assignment process creates two comparable groups of students, any differences in post-assignment outcomes between students within grades and/or within schools can be attributed to the impact of the CIS Level 2 services.2
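A minimal sketch of how such a lottery could be run, stratified by school and grade (the field names, the 50/50 assignment ratio, and the seed are illustrative assumptions, not CIS's actual procedure):

```python
# Illustrative stratified lottery: within each school-by-grade pool of
# eligible students, randomly select a subset for Level 2 services and
# assign the rest to the business-as-usual control condition.
# Field names and the 50/50 split are hypothetical.
import random
from collections import defaultdict

def assign_within_strata(eligible_students, treat_share=0.5, seed=20240101):
    """eligible_students: list of dicts with 'id', 'school', 'grade' keys."""
    rng = random.Random(seed)  # fixed seed so the lottery is reproducible and auditable
    strata = defaultdict(list)
    for student in eligible_students:
        strata[(student["school"], student["grade"])].append(student)

    assignments = {}
    for (school, grade), pool in strata.items():
        rng.shuffle(pool)
        n_treat = round(len(pool) * treat_share)
        for i, student in enumerate(pool):
            assignments[student["id"]] = "treatment" if i < n_treat else "control"
    return assignments
```

Running the lottery separately within each school-by-grade stratum guarantees that the treatment and control groups are balanced on school and grade by construction, which supports the within-grade, within-school comparisons described above.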
School-Level Impact Evaluation: Comparative Interrupted Time-Series Design (CITS Study)
The impact of CIS on a student and/or school outcome equals the difference between what the outcome was after CIS was under way and what it would have been without CIS. One can estimate this difference by comparing the change in outcomes over time for schools that adopted CIS with the corresponding change for similar comparison schools that did not adopt it (the “counterfactual”). Thus, the impact estimate represents the observed improvement of the CIS program schools relative to the observed improvement of their comparison schools.3 Ideally, the time-series design used to produce impact estimates should have data on consistently measured outcomes for multiple pre-intervention baseline years, multiple post-intervention follow-up years, multiple program schools, and multiple comparison schools.4 Exhibit 4.a is a representation of this CITS design that uses five observations (O) prior to an intervention and three after it has been implemented.
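In stylized form (this notation is ours, not the plan's), the CITS impact estimate for a follow-up year t contrasts each group's deviation from the outcome projected from its own baseline trend:

```latex
% Stylized CITS impact estimate for follow-up year t (notation assumed, not from the plan).
% Y^P_t, Y^C_t             : observed mean outcomes in program (P) and comparison (C) schools
% \hat{Y}^P_t, \hat{Y}^C_t : outcomes projected from each group's pre-intervention trend
\hat{\delta}_t \,=\, \bigl( Y^{P}_{t} - \hat{Y}^{P}_{t} \bigr) \;-\; \bigl( Y^{C}_{t} - \hat{Y}^{C}_{t} \bigr)
```

A deviation common to both groups (a district-wide policy change, say) cancels out of the difference, which is the intuition formalized in the next paragraph.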
The key to this design is knowing when the intervention occurred. The logic is straightforward: if the intervention has had an impact, the observations after the intervention should show a different slope or level from those before it (i.e., the series should reflect an “interruption” in the prior pattern or trend at the time the intervention was implemented). The comparison group addresses a central threat to causal inference: the possibility that some other concurrent event, rather than CIS, caused the changes in observed outcomes after implementation. If some other concurrent event (e.g., within a district or state) were a plausible cause, we should see similar changes in the pattern of outcomes for the comparison schools; if we do not, then CIS is more likely to have been the cause. Thus, the CITS is a powerful quasi-experimental alternative when randomization is not feasible, since it combines a traditional interrupted time-series analysis with a comparative-schools analysis, each building on the strengths of the other.5
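A minimal sketch of how this logic translates into a regression on a school-by-year panel (column names and the single-outcome setup are illustrative assumptions, not the study's actual specification): the CIS impact appears as an extra shift in level and/or slope for CIS schools beyond any shift shared with the comparison schools.

```python
# Illustrative CITS model on a school-by-year panel. The "interruption" is
# modeled as a shift in level (post) and slope (post_years) after adoption;
# the CIS effect is the *extra* shift for CIS schools relative to comparison
# schools (the cis:post and cis:post_years interactions). Column names are
# hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

def fit_cits(panel: pd.DataFrame):
    """panel columns (assumed): school, outcome, year (centered at adoption),
    cis (1 = CIS school), post (1 = year >= adoption),
    post_years (years since adoption, 0 in baseline years)."""
    model = smf.ols(
        "outcome ~ year + post + post_years"  # shared baseline trend + common interruption
        " + cis + cis:year"                   # group differences in level and baseline trend
        " + cis:post + cis:post_years",       # CIS impact: extra level and slope shifts
        data=panel,
    )
    # Cluster standard errors by school, since yearly observations
    # within a school are not independent.
    return model.fit(cov_type="cluster", cov_kwds={"groups": panel["school"]})
```

In this parameterization, the cis:post and cis:post_years coefficients carry the interruption attributable to CIS net of any concurrent events that also affected the comparison schools.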