
Monday, March 19, 2012

(Mis)Evaluating Student Writing and Teachers

Uptown  (M.A. Reilly, January 2012)
I. Misreading a Student's Work

Today an email arrived from a former student my husband taught English to eight years ago. Now she is finishing her last semester abroad in Paris and will graduate from Columbia University in May. We have kept in contact with her and her family during these years, and it has been a privilege to watch her grow up.

In 8th grade she was a fine writer and artist--years more mature than her age. It was a surprise then, at the end of the school year, when New Jersey State Test scores were returned and this same student received a *failing* writing test score. The student's score suggested that she had limited control of writing. I remember at the time that my husband was more than surprised, as was the student. He assumed that an error had to have been made. In New Jersey, students' written state tests are returned in the fall to each school district. Rob told the student that he would take a look at her work and ask that it be re-scored.

I recall the writing, even eight years later, as it was outstanding work. Unlike the typical student writing that begins with an inverted question as thesis statement, followed by three paragraphs of support and a clincher ending--this student's composition was an essay with an implied thesis. This is a student who had read essays by Annie Dillard, Barry Lopez, and Loren Eiseley and had studied their styles as part of her class work. She, like her peers, did not participate in test prepping. As such, it is not surprising that the essay she composed required work to read. It required evaluators who would spend more than the 90 seconds to 2 minutes allotted. After reading her returned essay, I knew that the issue wasn't the writing, but rather the two evaluators who failed to comprehend the work. Careful reading was required. I wondered how many other students' test writing received low scores because the writing was complex, elegant, and difficult to read quickly. The State insisted that the writing had been scored correctly. I thought then, as I do now, that an error had been made.

Today the email she sent was an announcement and an invitation. She learned that a piece she authored has been selected for inclusion in Columbia's literary magazine and that she will do a public reading of her work later in the spring. She invited Rob to attend the reading, wanting her former teacher and mentor--and now friend--to be in attendance. An honor for sure.

II. Teacher Evaluation

A lot has been written about the challenges of using single measures to publicly evaluate a teacher. Specifically, discussions about the metric used to create a 'value-added' score have been widely published. One only has to look at New York City's teacher evaluation process to realize that serious flaws are present in any evaluation system when the margin of error is 53% (for ELA teachers). What I haven't seen discussed, though, is the question of the initial measure. What if the high stakes test and/or the evaluation of a student's work that form the basis of the teacher evaluation are seriously flawed? At the time this student's work was evaluated as partially proficient, no one questioned whether the teacher had taught her well. Fortunately, he had taught her siblings, and his relationship with her family then, like today, was quite strong. No one doubted the importance of the learning that had taken place during the school year.

Eight years later, and the world is a bit changed. In today's data-driven world, oddly, the student, her parents, and her teacher would not be in a position to offer insights. Their understandings would not be collected, would not be valued. Their knowledge would not be factored in. Rather, a score would be generated based on a metric that assumed the assessment was accurate.

There's an irony here that I am sure you grasp. Perhaps the finest measure of a teacher's influence is not the single state test score, but rather the measure that comes with the passing of time. What finer measure is there than a student who, many years after the end of the official teaching, invites her former teacher to be her guest at a reading where she is being celebrated as an author?

Such understandings cannot and do not fit on a rubric.


Wednesday, April 27, 2011

Student Performance on NJ State Tests: A Poor Measure of Teacher Effectiveness

I. The State Test

In 2010, 86,000 11th graders in New Jersey wrote responses to a "persuasive" prompt as part of the high-stakes state assessment. Students who do not pass the assessment cannot graduate. This was the prompt:

Writing Situation
Several teenagers in the neighborhood are suing a local fast food restaurant, claiming that their poor health resulted from consuming the restaurant's food. The lawsuit has forced the owners to close the restaurant. This has caused a controversy in the community.
You decide to write an article for your school newspaper expressing your opinion on the teens' lawsuit against the restaurant.
Directions for Writing
Write an article either supporting or opposing the teens' lawsuit. Use reasons, facts, examples, and other evidence to support your position.

These 86,000 students were not actually allowed to conduct any research or gather facts, examples, or other evidence to support their fictitious positions, as no resources or external research may be used, nor did their responses actually have to conform to the genre specified (a newspaper article). Rather, students had to invent reasons, facts, examples, and other evidence that supported being for or against the teens' lawsuit, put these "ideas" into paragraphs (hopefully 5), and do so in 60 minutes.

As I read the prompt and thought of so many students composing responses, I wondered what students had learned about writing and persuasion.
  • Did they learn that making up reasons, facts, examples, and other evidence is acceptable practice when writing to persuade?
  • Did they learn that the five-paragraph response penned in an hour is the same thing as composing a newspaper article?
  • Did they learn that all persuasive positions result in either being for or against something?
  • Did they learn that one takes a definitive position after being presented with only a situation, sans details or other texts?
  • Did they learn that research does not require any searching?
  • Did they learn that persuasion does not require actual facts, truthful statistics, or reasoned examples?
  • Did they learn that writing is a task you do without careful and honest thought?
  • Did they learn that audience doesn't actually matter to a writer?
Although I don't write a lot of persuasive texts, I cannot recall an instance when I resorted to making up facts, reasons, examples, or other types of evidence to prove being for or against something. Anywhere I have worked, such an approach would be classified as academic dishonesty, not a strategy for writing. Is this an apt measure that would allow you to feel confident that your child had developed the requisite skills, dispositions, and habits of mind to be able, and to want, to continue learning after high school? I would be disappointed if any of my child's teachers taught my son to write based on this limited sense of composing.

Now to be sure, this one task is not the whole of the two-day assessment. Students also read two different texts, answer multiple choice and open-ended response questions, and write another response, this time in 30 minutes to a different prompt. None of the work students do during this test would one characterize as authentic, the type of work one would expect to do outside of a testing situation. For example, when was the last time you tried to identify the "best" central meaning of a text you were reading? Or spent time matching a specific literary term to a sentence from the text you were reading? Is being able to select the statement from a field of four that includes personification really an important indicator of "language arts literacy" prowess?


II. Inside English Teachers' Classes
A few months after students took the HSPA, I was visiting a high school English teacher's class.  This is the narrative I recorded after visiting:

Students in Jillian Honoria's English class were performing monologues based on their reading of Fyodor Dostoevsky's Crime and Punishment on the day I visited. Once seated, I quickly noticed the life-size doll propped on a rolling desk chair. Dressed in a black suit, white shirt, and scruffy boots, this class-made version of Raskolnikov, the protagonist from Crime and Punishment, sported wild black hair topped with a hat. During the class period, the students would address him as they performed monologues written from the perspective of other characters from the text. Dr. Honoria began class by having students compose a 10-minute journal entry. Within seconds, the room was silent, save the noise of everyone in the class writing. After some discussion about the texts students had written, the student performances began. The students' performances demonstrated their significant understanding of the text. Dr. Honoria had directed students to choose a character other than the protagonist, compose a written monologue that would extend the narrative in some manner, and then perform the monologue in class. One student stepped into the character of Sonya, Raskolnikov's love, and set the scene for her monologue in Siberia, years after the novel's close. Seated on the tile floor of the classroom at Raskolnikov's feet, with a basket of knitting, wearing fingerless gloves, she delivered a powerful six-minute monologue that highlighted her remembered relationship with Raskolnikov, her faith in God, and how each informed her understanding of love. It is this newly made knowledge she conveyed that I found so compelling. Her performance was preceded by a young man who assumed the character of Alyona, the pawnbroker who visits Raskolnikov from her place in hell and who is more insulted by his disrespect of the money he stole than by being murdered by him...
Or consider this synopsis of class work:

In another humanities course, team-taught by Ms. Carmen and Ms. Joseph, I watched as eleventh-grade students gave ten-minute presentations intended to persuade their peers to vote for their global citizenship project. The students had determined five potential global projects and, working in teams, created film-based multimedia projects. Students researched global problems that they believed they could adequately spend the year helping to solve, determined a project, created a film that would not only explain the project but also potentially persuade an audience to adopt it, and then embedded that film into a ten-minute presentation delivered in front of their peers, teachers, principal, assistant principals, and me (the Director of Curriculum).
III. Teacher Evaluation

In New Jersey, Governor Christie and Acting Department of Education Commissioner Chris Cerf want to base at least 50% of a teacher's evaluation on how well students do on tests such as the HSPA. The NJDOE reported:
Governor Christie today proposed and sent to the legislature a package of bills that gets at the root of the problems in New Jersey’s public education system by reforming the tenure system to demand results for New Jersey’s children in the classroom and reward the best and brightest teachers
The Governor believes that basing at least 50% of a teacher's evaluation on his/her students' standardized test performance, such as on the HSPA, is an apt indicator of successful teaching. Keep in mind that NJ currently has no state-issued assessments that can be used to evaluate teachers who teach K-2, teachers who do not teach mathematics or language arts in grades 3-8, or teachers who do not teach science in grades 4 and 8. At the high school level, the state only has assessments that could be used to evaluate biology, algebra, and 11th-grade mathematics and English teachers.

No one else.

Already there are gross examples of abusive and frankly idiotic tests being used to measure teacher effectiveness. One example comes from Colorado, where 6-year-olds were subjected to an art test.

On exam day in Sabina Trombetta's Colorado Springs first-grade art class, the 6-year-olds were shown a slide of Picasso's "Weeping Woman," a 1937 cubist portrait of the artist's lover, Dora Maar, with tears streaming down her face. It is painted in vibrant -- almost neon -- greens, bluish purples, and yellows. Explaining the painting, Picasso once said, "Women are suffering machines."
The test asked the first-graders to look at "Weeping Woman" and "write three colors Picasso used to show feeling or emotion." (Acceptable answers: blue, green, purple, and yellow.) Another question asked, "In each box below, draw three different shapes that Picasso used to show feeling or emotion." (Acceptable drawings: triangles, ovals, and rectangles.) A separate section of the exam asked students to write a full paragraph about a Matisse painting.
In a 29-year career as an educator and as a working artist, I cannot think of anyone I have met who would subject children to such a test as a means to measure students' understanding of a state's fine arts standards.

IV. Effectiveness

To be sure, understanding teacher and administrator effectiveness is complex and needs to be determined locally. It is a mistake to think student performance on a single measure is a reliable indicator of a teacher's capacity to teach. Further, we have additional concerns when the test content is so disconnected from the century we live in. Using the results from that same out-of-date state test to measure teachers like Dr. Honoria, Ms. Carmen, and Ms. Joseph, who are engaging students in aesthetic and complex learning, is foolish at best.