Monday, March 19, 2012

(Mis)Evaluating Student Writing and Teachers

Uptown  (M.A. Reilly, January 2012)
I. Misreading a Student's Work

Today an email arrived from a former student my husband taught English to 8 years ago.  Now she is finishing her last semester abroad in Paris and will graduate from Columbia University in May. We have kept in contact with her and her family during these years and it has been a privilege to watch her grow up.

In 8th grade she was a fine writer and artist--years more mature than her age. It was a surprise, then, at the end of the school year when New Jersey State Test scores were returned and this same student received a *failing* writing test score. The student's score suggested that she had limited control of writing. I remember at the time that my husband was more than surprised, as was the student. He assumed that an error had been made. In New Jersey, students' written state tests are returned in the fall to each school district. Rob told the student that he would take a look at her work and ask for it to be re-scored.

I recall the writing 8 years later, as it was outstanding work. Unlike the typical student writing that begins with the inverted question as thesis statement, followed by three paragraphs of support and a clincher ending, this student's composition was an essay with an implied thesis. This is a student who had read essays by Annie Dillard, Barry Lopez, and Loren Eiseley and had studied their styles as part of her class work. She, like her peers, did not participate in test prepping. It is not surprising, then, that the essay she composed required work to read. It required evaluators who would spend more than the 90 seconds to 2 minutes allotted. After reading her returned essay, I knew that the issue wasn't the writing, but rather the two evaluators who failed to comprehend the work. Careful reading was required. I wondered how many other students' test writing received low scores because the writing was complex, elegant, and difficult to read quickly. The State insisted that the writing had been scored correctly. I thought then, as I do now, that an error had been made.

Today the email she sent was an announcement and an invitation. She learned that a piece she has authored has been selected for inclusion in Columbia's literary magazine and that she will do a public reading of her work later in the spring. She invited Rob to attend the reading, wanting her former teacher and mentor--and now friend--to be in attendance. An honor for sure.

II. Teacher Evaluation

A lot has been written about the challenges of using single measures to publicly evaluate a teacher. Specifically, discussions about the metric used to create a 'value-added' score have been widely published. One only has to look at New York City's teacher evaluation process to realize that serious flaws are present in any evaluation system when the margin of error is 53% (for ELA teachers). What I haven't seen discussed, though, is the question of the initial measure. What if the high-stakes test and/or the evaluation of a student's work that form the basis of the teacher evaluation are seriously flawed? At the time when this student's work was evaluated as being partially proficient, whether the teacher had taught the student well was not in question. Fortunately he had taught her other siblings, and his relationship with her family then, like today, was quite strong. No one doubted the importance of the learning that had taken place during the school year.

8 years later and the world is a bit changed. In today's data-driven world, oddly, the student, her parents, and her teacher would not be in a position to offer insights. Their understandings would not be collected, would not be valued. Their knowledge would not be factored in. Rather, a score would be generated based on a metric that assumed the assessment was accurate.

There's an irony here that I am sure you grasp. Perhaps the finest measure of a teacher's influence is not the single state test score, but rather the measure that comes with the passing of time. What finer measure is there than a student who, many years after the end of the official teaching, invites her former teacher to be her guest at a reading where she is being celebrated as an author?

Such understandings cannot and do not fit on a rubric.


  1. I have read with horror how some state essay tests are "graded." The evaluators need not have a college degree as long as they are quickly trained to use a rubric. The graders are crammed into rooms, and as long as their scores are similar, the essay receives the agreed-upon grade. Thoughtful feedback on writing takes time, and assigning a numerical value should take even longer, a luxury testing companies choose not to afford.

    1. No wonder mistakes occur. I recall reading a book about the test-scoring industry a few years ago. It had stories about scoring while drunk, hiring blind people to score, and more. Truly unsettling.

  2. All horror and commiseration aside, is there a collection vehicle (your blog is a start) where experiences, such as yours described above, can be compiled and brought as a critical mass to the powers that be? While at first a collection like this would surely be ignored and discounted, over time the only hope we have is an overwhelming number of exceptions to the RULE OF TESTING.

    Just a thought... OCCUPY Standardized Testing!

    1. I don't know of one. If you find one, please let me know and I will contribute. Thanks. A good question and action.