Everyone who has any connection with education — teacher, student, parent, administrator — needs to read Todd Farley’s Making the Grades: My Misadventures in the Standardized Testing Industry. Yes, the book is a bit repetitive, and of course it reflects only one person’s views, and it doesn’t match my colleagues’ experiences scoring AP exams…but you still should read it. Not a statistical study, it is an easy-to-read narrative of Todd Farley’s work with Pearson (“the world’s leading education company”) and ETS (which “conducts assessment and policy research and develops assessments and related services to advance quality and equity in learning worldwide”). Farley started out scoring open-response questions, progressed to being a table leader and a trainer, and eventually wrote rubrics and test questions, all over a period of 15 years. Note, please, that none of this has anything to do with multiple-choice tests, which may have their own issues but at least are scored objectively and consistently.
Quoting from the book will be more effective than merely commenting on it. We’ll start with a portion of a rubric describing how to score an eighth-grade descriptive writing task on a statewide test (not MCAS, but similar), quoted verbatim including layout, punctuation, and capitalization:
A good response (3) includes
Good organization, including appropriate use of the five-paragraph format.
Good focus and development.
Good style and sentence fluency.
Good grammar, usage, and mechanics.
The excellent (4), inconsistent (2), and poor (1) portions of the rubric are identical to this one, as long as you find-and-replace “good” with the corresponding word. Now you have to understand that the typical scorer, abysmally paid and undereducated, somehow has to decide whether an essay is “good” in all four categories, with no more guidance than “good” = “good.” Of course the scorers and table leaders have to be trained, which means they have to pass a qualifying test. Here’s how the trainer, Maria, ensured that all the table leaders (Caitlin, Ricky, Harlan, etc.) would pass the qualifying test, on which they had to score seven out of ten essays “correctly”:
Maria held up her hand to tell Caitlin not to move. After Maria checked the scores, she handed the score sheet back to Caitlin, whispered something to her, and sent her back to her desk, where Caitlin started to rescore the ten essays. Then Maria whispered something to Ricky at her side, a something Ricky turned and whispered into the ear of the table leader closest to him. The whispering continued through the room. Harlan, on my left, passed on to me the useful nugget: “The same score is never given to successive essays. Pass it on.”
Then Maria passed word to Ricky, who passed it on to the table leaders, that essay 2 “was absolutely considered appropriate five-paragraph format,” meaning it would earn at least a 3.
And so forth.
Other examples of cheating pervade the first half of the book. For example:
I rarely, of course, actually looked at the essays in question, because I simply didn’t have the time. If I was looking at the score sheets of two scorers I didn’t trust (Louise and Harry, for example), eventually I compromised and erased bubbles from the score sheets of each, changing the scores until their agreement went from an unacceptable 50 or 60 percent up to an acceptable 70 percent.
The horrifying thing about examples like this one is that the desired statistics are determining the data, not the other way around. These are real students’ lives that the scorers are playing with. Even if, say, 70% of the scores are in some sense “correct,” that is of no comfort to you if you are in the remaining 30%.
Not that you would ever find out.
With all the pressures from politicians and the public these days, so-called “standardized tests” are going to become more and more prevalent. But the scorers won’t become better educated or better paid. And the more pressure there is for favorable statistics, the more likely it is that adults will cheat, as we’ve seen recently in Washington, New York, and Springfield. Replacing open-response questions with multiple-choice questions would solve several of these problems (though it would introduce other problems, of course).
We might get a rebuttal from one of my colleagues. But in the meantime…read this book!