Focusing on achievement gaps is the fatal flaw of DE ESEA Flexibility waiver….. #ShankerBlog
February 29, 2012 Leave a comment
Shanker Blog » Interpreting Achievement Gaps In New Jersey And Beyond
But the big issues arise when achievement gaps are viewed over time. This is the primary focus of the NJDOE statement (as well as NJ’s overall 2011 education reform agenda). The state presents a bunch of graphs illustrating the trend in gaps between 2005 and 2011. The accompanying text asserts that race- and income-based gaps as measured by the state’s tests have been persistent since 2005 (NAEP results support this characterization – for instance, the difference between students eligible and not eligible for free/reduced-price lunch in 2011 is not statistically different from that in 2005 in any of the four main assessments).**
Differences in performance between student subgroups are important, but the “narrowing of the achievement gap,” as a policy goal, also entails serious, well-known measurement problems, which can lead to misinterpretations. For example, as most people realize, the gap between two groups can narrow even if both decline in performance, so long as the higher-performing group decreases more rapidly. Similarly, an achievement gap can remain constant – and suggest policy failure – if the two groups attain strong, but similar, rates of improvement.
Put differently, trends in achievement gaps, by themselves, frequently hide as much as they reveal as far as the performance of the student subgroups they are comparing. These issues can only be addressed by , at the very least, decomposing the gaps and looking at each group separately. And that’s precisely the case in NJ.
The simple table below compares the change (between 2005 and 2011) in average NAEP scale scores for NJ students who are eligible for free/reduced-price lunch (lower-income) versus those who are not eligible (higher-income). I want to quickly note that these data are cross-sectional, and might therefore conceal differences in the cohorts of students taking the test, even when broken down by subgroups.***
This table shows that, in three out of four NAEP tests, both low- and higher-income cohorts’ scores have increased substantially, at roughly similar rates. In fourth grade math, students eligible for free/reduced-price lunch scored six points higher in 2011 compared with 2005, the equivalent of roughly half a “year of learning,” compared with a similar, statistically discernible five point increase among non-eligible students. The results for eighth grade math and fourth grade reading are more noteworthy – on both tests, eligible students in NJ scored 12 points higher in 2011 than in 2005, while the 2011 cohorts of non-eligible students were higher by roughly similar margins.In other words, achievement gaps in NJ didn’t narrow during these years because both the eligible and non-eligible cohorts scored higher in 2011 versus 2005. Viewed in isolation, the persistence of the resulting gaps might seem like a policy failure. But, while nobody can be satisfied with these differences and addressing them must be a focus going forward, the stability of the gaps actually masks notable success among both groups of students (at least to the degree that these changes reflect “real” progress rather than compositional changes).
Only in eighth grade reading was there a discrepancy between subgroups –the score for the 2011 cohort of FRLP-eligible students is statistically indistinguishable from that of the 2005 cohort. This means that, by social science conventions, we cannot dismiss the possibility that the former change was really just random noise. So, to the degree that there was a widening of the NJ achievement gap in eighth grade reading between 2005 and 2011 (and, as stated above, it also wasn’t large enough to be statistically significant), it’s because there was a discernible change for one subgroup but not the other. This exception is something worth looking into, and it is only revealed when both groups are viewed in terms of their absolute, not relative scores.
Similarly, if one looks exclusively at achievement gap trends in a simplistic manner, the substantial increases between cohorts in seven out of eight subgroup/exam combinations would be ignored, as would the fact (not shown in the table) that both eligible and non-eligible students score significantly higher than their counterparts nationally on all four assessments.
Roughly identical results are obtained for the subgroup changes if the achievement gap is defined in terms of race – there were equally large increases among both white and African-American cohorts between 2005 and 2011 in all four tests except eighth grade reading, where the change between African-American student cohorts was positive but not statistically significant.
(One important note about the interpretation of these data: The cohort changes among seven of the eight groups shown in the table [or the gaps in any given year], assuming that some of it is “real progress,” should not necessarily be entirely chalked up to the success of NJ schools per se. This represents the rather common error of conflating student and school performance. That is, assuming that students’ testing results [and changes therein] are entirely due to schools’ performance – though it’s well-established empirically that this is not the case. Some of the change is school-related, while some of it is a function of non-school factors [and/or sampling variation]. Without multivariate analysis using longitudinal data, it’s very difficult to tease out the proportion attributable to instructional quality.)
Nevertheless, the simple data above do suggest that an overintepretation of the achievement gap as an educational measure, without, at the very least, attention to the performance of constituent subgroups, can be problematic. Yes, in any given year, the differences between groups can serve as a useful gauge of inequality in outcomes, and, without question, we should endeavor to narrow these gaps going forward, while hopefully also boosting the achievement of all groups.
But it’s important to remember that the gaps by themselves, especially viewed over time, often mask as much important information as they reveal about the performance of each group, within and between states and districts, as well as the ways in which the actual quality of schools interact with them. Their significance can only be judged in context. States and districts must interpret gaps in a nuanced, multidimensional manner, lest they risk making policy decisions that could actually impede progress among the very students they most wish to support.
- Matt Di Carlo
*****
* It’s also worth noting that the achievement gap as defined above – the difference in scores between students eligible and not eligible for free/reduced-price lunch – is not statistically different from the U.S. public school student average in three out of four NAEP tests (with the exception being eighth grade reading, where the NJ gap is moderately larger).
** Most of the data presented in the NJDOE statement are achievement gaps on the state’s tests, as defined in terms of proficiency rates – that is, the difference in the overall proficiency rate between subgroups, such as students who are and are not eligible for free/reduced-price lunch. Using proficiency rates in serious policy analysis is almost always poor practice – they only tell you how many students are above or below a particular (and sometimes arbitrary) level of testing performance. However, measuring achievement gaps using these rates, especially over time, is almost certain to be misleading– an odd decision that one would not expect of a large state education agency. In this post, I use actual scale scores.
*** Since most achievement gaps compare two groups of students, they often mask huge underlying variation. For example, the comparison of students who are eligible versus those not eligible for free- and reduced-price lunches completely ignores the fact that this poverty measure only looks at students below a certain threshold, thus concealing the fact that some students below that line are much more impoverished than others, making comparisons between states and districts (and over time) extremely difficult (making things worse, income is a very limited measure of student background). On a related note, one infrequently-used but potentially informative conceptualization of the achievement gap is the difference between and high- and low-performing students (e.g., comparisons of scores between percentiles).





Recent Comments