
Measuring Intervention Effectiveness: The Benefits of an Item Response Theory Approach

Assessing the effectiveness of educational interventions relies on quantifying differences between intervention groups over time in a between-within design. Binary outcome variables (e.g., correct versus incorrect responses) are often assessed. Widespread approaches use percent correct on assessments and repeated measures analysis of variance (ANOVA) methods to detect differences between groups. However, this approach is not ideal: several of its assumptions are often violated, which can result in less informative and at times biased or spurious findings (Dixon, 2008; Embretson, 1994). An alternative approach is to use item response models to detect differences between intervention groups over time. The benefits of item response methodology for intervention research are contrasted with repeated measures ANOVA approaches, using a longitudinal intervention dataset with a between-within design from elementary students learning about mathematical equivalence. The dependent measures (percent correct in repeated measures ANOVA approaches, item responses in item response models) are contrasted, as are the methods each approach uses to quantify differences between groups. Second and third grade students who scored below 75% correct on the pretest participated in a 20-minute one-on-one tutoring intervention that focused on mathematical equivalence problems. In conclusion, item response models offer many methodological advantages over repeated measures ANOVA approaches based on percent correct outcomes in quantifying individual learning and group change over time. In particular, the generalized explanatory longitudinal item response model for multidimensional tests (Cho et al., in press) quantifies and tests for differences between intervention conditions, while using more informative and less problematic metrics of student performance.
In addition to being methodologically more sound, these analyses can be performed using the free, open-source program R. Details of the model, as well as information on how to run these analyses, can be found in Cho et al. (in press). One drawback of performing IRT analyses is that they require more technical proficiency on the part of the data analyst than ANOVA approaches. Nevertheless, researchers should strive to adopt this more informative and less biased metric in the evaluation of intervention effectiveness. Generalized Explanatory IRT Model R Code and Select Results are appended. (Contains 3 figures and 1 footnote.)
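The concern about percent correct as a metric can be illustrated with a minimal sketch. It assumes a simple Rasch model, not the generalized explanatory longitudinal model of Cho et al., and is written in Python purely for illustration. The item difficulties and student abilities are hypothetical and are not taken from the study. The sketch shows that the same one-logit gain in ability produces a smaller gain in expected percent correct for a student who starts near the ceiling.

```python
import math

def expected_percent_correct(ability, difficulties):
    """Expected percent correct under a Rasch model,
    where P(correct) = logistic(ability - difficulty)."""
    probs = [1.0 / (1.0 + math.exp(-(ability - b))) for b in difficulties]
    return 100.0 * sum(probs) / len(probs)

# Hypothetical item difficulties on the logit scale (illustrative only).
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Two hypothetical students each gain one logit of ability,
# one starting mid-scale and one starting near the ceiling.
gain_mid = (expected_percent_correct(0.5, difficulties)
            - expected_percent_correct(-0.5, difficulties))
gain_high = (expected_percent_correct(3.0, difficulties)
             - expected_percent_correct(2.0, difficulties))

print(f"Mid-scale gain in percent correct:  {gain_mid:.1f}")
print(f"Near-ceiling gain in percent correct: {gain_high:.1f}")
```

Under these assumptions the two students improve equally on the latent scale, yet a comparison of percent-correct gains would suggest that the mid-scale student learned more.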

Authors

McEldoon, Katherine; Cho, Sun-Joo; Rittle-Johnson, Bethany

Authorizing Institution
Society for Research on Educational Effectiveness (SREE)
Education Level
Elementary Education, Grade 2, Grade 3
Peer Reviewed
No
Publication Type
Reports - Research
Published in
United States of America

Table of Contents