Study: Error rates high when student test scores used to evaluate teachers
Valerie Strauss – Washington Post
I don’t actually understand all of a new statistical study about error rates when “value-added” student test scores are used to evaluate teachers, but I do get this: The rates are high enough to give even supporters of such measures some pause about using them for high-stakes decisions.
There’s a more in-depth analysis of the report, which was undertaken for the Education Department’s Institute of Education Sciences, on Bruce D. Baker’s School Finance 101 blog.
But here’s my takeaway from the report, titled “Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains”:
Value-added measures have become all the rage in evaluating teachers. What does that mean? As explained in a guest blog this year by FairTest’s Lisa Guisbond, these measures use student standardized test scores to track the growth of individual students as they progress through the grades and see how much “value” a teacher has added.
An emerging body of research has found that these value-added estimates based on a few years of data can be imprecise. How imprecise?
According to the new report by Mathematica Policy Research:
*If three years of data is used, there is about a 25 percent chance that a teacher who is “average” would be identified as significantly worse than average and, under new evaluation systems, perhaps fired.
*If one year of data is used, there is a 35 percent chance of the same misidentification.
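The intuition behind those two numbers is simply that averaging more years of noisy estimates shrinks the noise. Here is a minimal simulation sketching that effect; the noise level (`NOISE_SD`) and the “significantly below average” cutoff (`CUTOFF`) are hypothetical values chosen for illustration, not parameters from the Mathematica report, so the rates it prints will not match the study’s 25 and 35 percent figures exactly.

```python
import random
import statistics

random.seed(0)

NOISE_SD = 1.0   # hypothetical year-to-year noise in a teacher's estimate
CUTOFF = -0.6    # hypothetical "significantly below average" threshold
TRIALS = 100_000

def misid_rate(n_years):
    """Fraction of truly average teachers (true effect = 0) whose
    n-year average estimate falls below the cutoff by chance alone."""
    hits = 0
    for _ in range(TRIALS):
        # Average n_years of noisy estimates around a true effect of zero.
        est = statistics.fmean(random.gauss(0, NOISE_SD) for _ in range(n_years))
        if est < CUTOFF:
            hits += 1
    return hits / TRIALS

for n in (1, 3):
    print(f"{n} year(s) of data: {misid_rate(n):.1%} of average teachers flagged")
```

The one-year rate is always higher than the three-year rate, because the standard error of the averaged estimate falls roughly with the square root of the number of years pooled.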
Considering that teachers are now being fired based partly on test scores — D.C. Schools Chancellor Michelle Rhee just let go dozens of teachers based on such evaluations — this error rate matters in a big way.
The report, written by Mathematica’s Peter Z. Schochet and Hanley S. Chiang, goes on to say that value-added estimates “in a given year are still fairly strong predictors of subsequent-year academic outcomes in the teachers’ classes.”
I wonder if the authors would like their evaluations to be based on such “fairly strong” criteria.
By the way, Baker’s analysis of the report mentions other major issues that he says undermine “the usefulness of value-added assessment for teacher evaluation and dismissal (on the assumption that majority weight is placed on value-added assessment).” According to Baker, they include:
*That students are not randomly assigned across teachers and that this non-random assignment may severely bias estimates of teacher quality.
*That only a fraction of teachers can even be evaluated this way, generally less than 20%.