Today’s release of the 2015 National Assessment of Educational Progress (NAEP) means that pundits, politicians, and, yes, even some researchers, will soon begin the biennial exercise of making unwarranted inferences from the NAEP results. We draw on a new Urban Institute report to show, first, how to make more responsible comparisons across states and, second, that the declines in NAEP scores from 2013 to 2015 are unlikely to be explained by shifts in student demographics.
NAEP, often called the “nation’s report card,” is the only standardized test regularly administered to a nationally representative sample of U.S. students. Unfortunately, “misNAEPery” has become common practice, with education stakeholders touting high-scoring states that have adopted their preferred policies, or low-scoring states that have done the opposite. The fundamental problem is that there’s no widely accepted way to factor student demographics into state NAEP scores. Urban’s new report, Breaking the Curve: Promises and Pitfalls of Using NAEP Data to Assess the State Role in Student Achievement, proposes better ways to compare NAEP scores across states and over time.
Breaking the Curve calculates adjusted NAEP scores, based on the 2013 results, that account for differences in student demographics among the states. Using a rich set of control variables, the report generates a ranking that shows which states are “breaking the curve” by producing stronger academic outcomes for their students than those of demographically similar students across the U.S.
Breaking the Curve demonstrates that just reporting raw NAEP scores obscures a deeper narrative: significant variation among state achievement results even after demographic factors are taken into account. It’s no surprise that Massachusetts has better outcomes than Mississippi, but it may be a surprise that Texas performed significantly better than California. The demographic adjustment substantially weakens the relationship between NAEP scores and state poverty rates. But even after this adjustment, the difference between the highest- and lowest-ranked states is equivalent to nearly 18 months of learning.
Adjusting the 2015 NAEP scores
But now that the 2015 scores are out, aren’t the 2013 scores old news? The student-level dataset needed to conduct a 2015 “Breaking the Curve” analysis probably won’t be released by the National Center for Education Statistics until 2017, given that the 2013 data were not available until May 2015. We can, however, provide an approximation of the 2015 demographically adjusted scores using the 2013 adjustments from our recent report.
We do this by applying the 2013 adjustment (each state’s adjusted score minus its unadjusted score in 2013) to the 2015 state average scores released today. The assumption at work here is that the underlying demographics of a given state likely haven’t changed substantially in the two years since the last test was given. This strategy isn’t perfect, but a validation analysis using the 2011 and 2013 data showed that it produces estimated adjusted scores that are very highly correlated (r = 0.97) with the actual adjusted scores.
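The carry-forward approach described here can be sketched in a few lines of Python. This is only an illustration, not the code used for the report: the function name, the state labels, and every number below are made up, and the scores are assumed to share a single scale.

```python
def approximate_adjusted_scores(raw_prior, adjusted_prior, raw_current):
    """Carry each state's prior-year demographic adjustment forward.

    All three arguments are dicts mapping a state name to a score.
    The adjustment is (adjusted - raw) in the prior year, and it is
    added unchanged to the current raw score.
    """
    estimates = {}
    for state, raw_now in raw_current.items():
        adjustment = adjusted_prior[state] - raw_prior[state]
        estimates[state] = raw_now + adjustment
    return estimates


# Hypothetical inputs for two invented states (not real NAEP results).
raw_2013 = {"State A": 250.0, "State B": 240.0}
adjusted_2013 = {"State A": 247.0, "State B": 244.0}
raw_2015 = {"State A": 249.0, "State B": 241.0}

estimated_2015 = approximate_adjusted_scores(raw_2013, adjusted_2013, raw_2015)
print(estimated_2015)  # {'State A': 246.0, 'State B': 245.0}
```

In this toy example State A outscores State B on raw 2015 results, but the order flips once the 2013 adjustments are applied, which is the kind of reshuffling the adjusted ranking is meant to reveal.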
The figure below shows our adjusted NAEP ranking for states based on the 2015 data. Each state’s score (averaged across the tests in math and reading in the 4th and 8th grades) is reported in months of learning, compared to an overall average adjusted score of zero. The 2015 ranking of adjusted scores is fairly similar to the 2013 list, with the most notable exception being Maryland, which fell from 6th in 2013 to 37th in 2015.
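To make the construction of such a ranking concrete, here is a rough Python sketch. It assumes each state's adjusted score on each of the four tests is already expressed in months of learning, and it centers the scores on the simple mean of the states supplied, which may not match how the report defines its zero point. The state labels and values are invented.

```python
from statistics import mean


def rank_states(scores_by_test):
    """Average each state's four test scores, center on zero, and rank.

    scores_by_test maps a state name to a dict of test name -> adjusted
    score in months of learning (an assumed input format).
    """
    averages = {state: mean(tests.values()) for state, tests in scores_by_test.items()}
    overall = mean(averages.values())
    centered = {state: avg - overall for state, avg in averages.items()}
    # Highest score first.
    return sorted(centered.items(), key=lambda item: item[1], reverse=True)


# Invented scores for three hypothetical states.
example_scores = {
    "State A": {"math4": 4.0, "read4": 2.0, "math8": 3.0, "read8": 3.0},
    "State B": {"math4": 0.0, "read4": 0.0, "math8": 0.0, "read8": 0.0},
    "State C": {"math4": -3.0, "read4": -3.0, "math8": -3.0, "read8": -3.0},
}

ranking = rank_states(example_scores)
for position, (state, score) in enumerate(ranking, start=1):
    print(position, state, score)
```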
Do demographic shifts explain the recent decline in NAEP scores?
The decline in this year’s national NAEP scores on three of the four tests is likely to receive the most attention, especially after a long period of gradual increases in 4th and 8th grade reading and math scores. How much of this decline can be attributed to changing student demographics? A full analysis will have to await the eventual release of student-level scores, but the results in Breaking the Curve strongly suggest that demographics are unlikely to explain away the 2015 drop in scores, especially in 8th grade.
Based on the relationship between demographics and scores in 2003, Breaking the Curve generates a prediction for nationwide scores in 2013. NAEP test takers in 2013 were predicted to score an average of 3.5 months of learning lower than their 2003 predecessors. Instead of a decrease, however, the 2013 NAEP test produced a demographically adjusted increase of 5.4 months of learning over the 2003 data. In fact, every state “broke the curve” over the period from 2003 to 2013, posting gains of 2 to 16 months of learning over what was predicted based on demographic changes.
But nationwide scores fell by 1.5 months of learning between 2013 and 2015. That is more than twice the average biennial decline predicted from the 2003-2013 data: 3.5 months spread over five two-year intervals, or about 0.7 months per interval. The table below shows that demographic shifts are particularly unlikely to explain the drops in 8th-grade scores, which fell by about three months of learning over the last two years, compared to an average demographic-predicted score decrease of about one month of learning every two years. It seems highly unlikely that demographics shifted three times as quickly over the past two years as in earlier years.
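The comparison between the observed decline and the demographic prediction is simple arithmetic, reproduced below in Python using only the figures quoted in the text. The variable names are ours, and the 8th-grade values are the rounded "about three" and "about one" month figures, so the ratios are approximate.

```python
# Figures quoted in the text, in months of learning.
predicted_change_2003_2013 = -3.5  # demographics-only prediction over ten years
observed_change_2013_2015 = -1.5   # nationwide change over the latest two years

# NAEP is given every two years, so 2003-2013 spans five intervals.
two_year_intervals = (2013 - 2003) // 2
predicted_per_interval = predicted_change_2003_2013 / two_year_intervals
ratio_all_tests = observed_change_2013_2015 / predicted_per_interval

# Rounded 8th-grade figures from the text.
eighth_grade_observed = -3.0
eighth_grade_predicted_per_interval = -1.0
ratio_eighth_grade = eighth_grade_observed / eighth_grade_predicted_per_interval

print(f"Predicted change per two-year interval: {predicted_per_interval:.1f} months")
print(f"Observed decline is {ratio_all_tests:.2f} times the prediction (all tests)")
print(f"Observed decline is {ratio_eighth_grade:.0f} times the prediction (8th grade)")
```

The first ratio comes out slightly above 2, matching the "more than twice" in the text, and the 8th-grade ratio is 3.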
Raw NAEP scores are unhelpful at best and misleading at worst. Demographic adjustments are never perfect, but they allow for much more meaningful comparisons across states and over time. With a few exceptions, the adjusted 2015 ranking of states is not very different from the 2013 ranking. But the decreases in 8th-grade math and reading scores are too substantial to be blamed on changes in the characteristics of students taking the tests.