data-driven decisionmaking

Why we won't publish individual teachers' value-added scores

Tomorrow’s planned release of 12,000 New York City teacher ratings raises questions for the courts, parents, principals, bureaucrats, teachers — and one other party: news organizations. The journalists who requested the release of the data in the first place now must decide what to do with it all.

At GothamSchools, we joined other reporters in requesting to see the Teacher Data Reports back in 2010. But you will not see the database here, tomorrow or ever, as long as it is attached to individual teachers’ names.

The fact is that we feel a strong responsibility to report on the quality of the work the 80,000 New York City public school teachers do every day. This is a core part of our job and our mission.

But before we publish any piece of information, we always have to ask a question. Does the information we have do a fair job of describing the subject we want to write about? If it doesn’t, is there any additional information — context, anecdotes, quantitative data — that we can provide to paint a fuller picture?

In the case of the Teacher Data Reports, “value-added” assessments of teachers’ effectiveness that were produced in 2009 and 2010 for reading and math teachers in grades 3 to 8, the answer to both those questions was no.

We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s “best” and “worst” teachers or to search for individual teachers at all.

It’s true that the ratings the city is releasing might turn out to be powerful measures of a teacher’s success at helping students learn. The problem lies in that word: might.

Value-added measures do, by many readings, appear to do the job that no measure of a teacher’s quality has done before: They estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability. That is, a teacher judged to be more effective one year by value-added is likely to continue to be judged effective the next year, and the year after that.

But this is not true for every teacher — hardly. Many teachers will be mislabeled; no one disputes this. Value-added scores may be more reliable than existing alternatives, but they are still far from perfectly reliable. It’s completely possible, for instance, that a teacher judged as less effective one year will be judged as very effective the next, and vice versa.

As we reported two years ago, when the NYU economist Sean Corcoran looked at New York City’s value-added data, he found that 31 percent of English teachers who ranked in the bottom quintile of teachers in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.

The fluctuation is acknowledged by even the strongest supporters of using value-added measures to evaluate teachers. One of the creators of the city’s original value-added model, the Columbia economist Jonah Rockoff, compares value-added scores to baseball players’ batting averages. One of his reasons: In each case, the year-to-year fluctuations of an individual’s score are about the same.

“If someone hit, you know, .280 last year, that doesn’t guarantee they’re going to hit .280 next year,” Rockoff said today. “However, if you hit .210 last year and I hit .300, there’s a very high likelihood I’m going to hit more than you next year, too. Whereas if you hit .280 and I hit .278, we’re basically the same.”
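To make the point about year-to-year fluctuation concrete, here is a toy simulation. It is a sketch under stated assumptions, not the city's value-added model: each hypothetical teacher gets a fixed "true" effect, each year's score adds independent random noise, and the function name and the `noise_sd` parameter are invented for illustration. It reports the share of bottom-quintile teachers in one year who land in the top two quintiles the next.

```python
import random


def simulate_quintile_jumps(n_teachers=10000, noise_sd=1.0, seed=0):
    """Share of year-1 bottom-quintile teachers who reach the top two
    quintiles in year 2. All distributions here are assumptions."""
    rng = random.Random(seed)
    # Assumed: true effects are normally distributed and fixed across years.
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    # Assumed: each year's measured score is the true effect plus fresh noise.
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    def quintiles(scores):
        # Rank teachers from lowest to highest, then cut into five groups.
        order = sorted(range(len(scores)), key=lambda i: scores[i])
        q = [0] * len(scores)
        for rank, i in enumerate(order):
            q[i] = rank * 5 // len(scores)  # 0 = bottom, 4 = top
        return q

    q1, q2 = quintiles(year1), quintiles(year2)
    bottom = [i for i in range(n_teachers) if q1[i] == 0]
    jumped = sum(1 for i in bottom if q2[i] >= 3)
    return jumped / len(bottom)


if __name__ == "__main__":
    for sd in (0.5, 1.0, 2.0):
        print(sd, simulate_quintile_jumps(noise_sd=sd))
```

With no noise, nobody changes quintile; as the assumed noise grows relative to real differences between teachers, more bottom-quintile teachers jump upward by chance alone. How much noise the real scores contain is exactly the question the researchers are debating.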

Another challenge is that many researchers still aren’t convinced that value-added scores are measuring the right sort of teacher impact. The problem lies in the flaws of the measures on which value-added scores depend: standardized state test scores.

Tests are supposed to measure what a student has learned about a subject, but they can also reflect other things, like how well her teacher prepared her for the test, or how well she mastered the narrow band of the subject the test assessed.

The test-prep concern is magnified by findings that a single teacher can generate two different value-added scores if evaluators use two different student tests to determine them. The Gates Foundation’s Measures of Effective Teaching study calculated value-added scores for teachers based on both state tests and more conceptual tests. They found substantial differences between the two, according to an analysis by the economist Jesse Rothstein of the University of California at Berkeley.

“If it’s right that some teachers are good at raising the state test scores and other teachers are good at raising other test scores, then we have to decide which tests we care about,” Rothstein said today. “If we’re not sure that this is the test that captures what good teaching is, then we might be getting our estimates of teaching quality very wrong.”

Questions about exactly what high value-added ratings reward are also raised by studies that ask how the ratings match up with measures of what teachers actually say and do in the classroom. Heather Hill, a professor at Harvard’s Graduate School of Education, rated math teachers’ teaching quality based on an observation rubric called the Mathematical Quality of Instruction, which looks at factors like whether the teacher made mathematical errors and the quality of her explanations. Then Hill compared the math teaching rating to value-added measures.

Two individual cases stood out: One teacher had made a slew of math errors in her teaching, and the other had failed to connect a class activity to math concepts. But both teachers’ value-added scores put them at the top of their cohort.

There is some reason to think that value-added measures reflect more than test prep. Rockoff points out that while different tests can produce different value-added scores for the same teacher, the two measures are still correlated. Using different tests, he said, is akin to looking at slugging percentage rather than batting average. “I’m sure those two things are positively correlated, but probably not one for one,” he said.

More persuasively, a recent study by Rockoff and two colleagues concluded that value-added measures can actually predict long-term life outcomes, including higher cumulative lifetime income, a reduced chance of teen pregnancy, and living in a high-quality neighborhood as an adult. The study examined a very large, unnamed urban school district that bears several similarities to New York City.

That study targeted another concern about value-added measures: that teachers score consistently well year after year not because of something they are doing, but because they consistently teach students with certain advantages.

Rothstein has used value-added models to conclude that fifth-grade teachers have strong effects on their students’ performance in third grade, something they could not possibly influence unless value-added scores reflect not just teachers’ influence but also advantages brought by students.

Rockoff and his colleagues tested that possibility. If high-value-added teachers do well because they get the “better” students of those in their grade, then their students’ high test score growth would be linked with mediocre performance in other classrooms. That would mean that, when researchers looked at growth for the entire grade, the “better” students’ growth would be canceled out by that of their less lucky peers. But the scores were not canceled out, suggesting that effective teachers did more than just have unusually good students.
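The logic of that grade-level check can be shown with a small worked example. The numbers below are hypothetical and chosen only for illustration; this is not the study's data or its actual method. If a teacher merely receives the students who would have grown the most anyway, the grade-wide average is unchanged. If she actually adds to her students' growth, the grade-wide average rises.

```python
def grade_mean_gain(classrooms):
    """Average test-score gain across every student in the grade."""
    all_gains = [gain for room in classrooms for gain in room]
    return sum(all_gains) / len(all_gains)


# Hypothetical score gains for six students in one grade, two classrooms.
mixed = [[3, 5, 7], [4, 5, 6]]        # students spread evenly across rooms
sorted_rooms = [[5, 6, 7], [3, 4, 5]]  # same students, strongest grouped together
real_effect = [[5, 7, 9], [4, 5, 6]]   # mixed rooms, first teacher adds 2 points each

print(grade_mean_gain(mixed))         # grade-wide average with even assignment
print(grade_mean_gain(sorted_rooms))  # unchanged: sorting only moves students around
print(grade_mean_gain(real_effect))   # higher: the teacher raised actual growth
```

In the sorted case the first classroom looks better and the second looks worse, but the grade as a whole is the same. Only a real teacher effect moves the grade-wide number, which is the pattern the researchers reported finding.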

None of this means that we won’t write about what the data dump includes or that we might not publish an adapted database that strips out information linking the city’s data to individual teachers. With more than 90 columns in the Excel sheet the city has developed — and more than 17,000 rows, representing the number of reports issued over their two-year lifespan — the release might well enable us to examine the city’s value-added experiment in new ways.

Value-added measures certainly aren’t going away. City officials stopped producing Teacher Data Reports only because they knew the State Education Department was preparing its own. The state’s measures, which are expected to come out in 2013, will make up 25 percent of the evaluation for teachers of math and English in tested grades.

To and Through

Newark’s post-grad paradox: More students are entering college, but few earn degrees

PHOTO: Patrick Wall/Chalkbeat
Newark Mayor Ras Baraka wants 25 percent of residents to have college degrees by 2025, up from 19 percent today.

When it comes to college, Newark faces a good news-bad news paradox.

More students than ever are graduating high school and enrolling in college, according to a new report. Yet fewer than one in four Newark students earns a college degree within six years of graduating high school — leaving many with limited job prospects in a city where an estimated one-third of jobs require a four-year college degree.

Now, city officials are promising to build on the report. They want to ramp up the rigor of high-school classes and create more early-college programs to increase the odds of students entering college and leaving with a degree.

“How do we teach our children to perform — to graduate?” Mayor Ras Baraka asked at a press conference Wednesday to mark the official release of the report of Newark students’ college outcomes. “We got them in the door,” he said of students who attend college. “Now how do we make them stay?”

The city’s plans, to which Superintendent Roger León is lending his support, reflect a growing recognition that simply getting students into college is not sufficient — and can even backfire if they drop out before graduation, leaving them with college debt but no degree.

Until recently, the charge given to high schools in Newark and across the country was to foster “college-going cultures.” And these efforts showed promising results: On average, 51 percent of Newark Public School students who graduated high school between 2011 and 2016 immediately enrolled in college, up from 39 percent who did so between 2004 and 2010, according to the report by the Newark City of Learning Collaborative, or NCLC, and Rutgers University-Newark’s School of Public Affairs and Administration.

But entering college didn’t guarantee its completion. Of those students who started college straight after high school, only 39 percent earned a degree within six years, the report found.

As a result, educators and policymakers have begun to think harder about how to help students “to and through” college, ensuring they actually earn degrees. Toward that end, Baraka and the NCLC, which includes roughly 40 colleges, schools, nonprofits, and corporations, have set a goal of 25 percent of Newark residents earning college degrees or comparable credentials by 2025.

Today, just 19 percent of Newark adults have associate degrees or higher — compared to 45 percent of adults across New Jersey and 40 percent nationally.

Superintendent León, who began overseeing the city’s schools on July 1, said his main strategy for supporting these efforts will be to expose students to challenging work early on.

“If we don’t do something dramatically in classrooms to improve instruction and make it rigorous,” León said after Wednesday’s event, then students are “getting into college but they’re not completing it.”

Source: “Post-Secondary Outcomes of Newark High School Graduates (2011-2016)” report. Note: The four-year rate is an average of the classes of 2011 to 2013. The six-year rate is from the class of 2011. Graphic: Sam Park/Chalkbeat

For starters, León said he wants high schools to offer more college-level classes. In the 2016-17 school year, just 21 percent of Newark students were enrolled in one or more Advanced Placement or International Baccalaureate classes — compared to 42 percent of students statewide.

He also vowed to raise the quality of instruction in the district’s traditional high schools. Only 14 percent of their graduates earn college degrees within six years, compared to 42 percent of graduates from the city’s selective magnet schools, the report found.

To do that, León said he will create specialized academies within the traditional schools modeled on the magnets, which have specialized themes such as science, technology, or the arts. The academies, which will partner with colleges, will most likely feature admissions criteria similar to those of magnet schools, which select students based on their academic and attendance records, León added.

And, for the first time, all ninth-grade students this academic year will take the Preliminary SAT, or PSAT, León said Wednesday. An additional 1,100 eighth-graders who passed at least one of their seventh-grade PARCC exams will also take the PSAT when it’s administered on Oct. 10.

Since 2016, the district has provided the PSAT to all 10th- and 11th-grade students. But León said that giving the test to younger students will focus their attention on college and help identify those who are ready for advanced classes. The PSAT is designed to help students prepare for the SAT, which is used in college admissions, and to qualify for National Merit Scholarships.

The district, which was under state control for 22 years until February, is getting some assistance in its effort to improve students’ college outcomes.

For instance, KIPP, the national charter-school network with eight schools in Newark, is sharing its strategies for helping students choose the right college with guidance counselors at three district high schools.

And the higher-education institutions in the Newark City of Learning Collaborative, including Essex County College and Rutgers University-Newark, plan to create more “dual-enrollment” programs that allow high-school students to earn college credits, said NCLC Executive Director Reginald Lewis.

“We’re all going to do a better job,” Lewis said, “of making sure that once Newark residents get in our doors, that we help them persist.”

Time crunch

In victory for teachers union, Newark superintendent scraps longer hours for low-performing schools

PHOTO: Patrick Wall/Chalkbeat
Superintendent Roger León at Hawkins Street School, one of the schools that will lose its extended hours.

Newark’s new superintendent is eliminating a program that extended the hours of struggling schools, which the teachers union has long attacked as ineffective and unfair to educators.

Teachers at roughly 30 schools will no longer receive $3,000 annual stipends for the extra hours, a provision written into the current teachers contract, which extends to 2019. Instead, all 64 district schools will get extra funding for before- and after-school programs, Superintendent Roger León said in an email to employees on Tuesday.

The changes will go into effect Monday, Sept. 10, resulting in new hours for the affected schools just days after the new school year began. The district is still working to adjust pickup times for students who are bused to school, according to León’s email. A few of the schools will phase out their extended hours later in the year, the email said.

“We will not continue to do the same things as before and be surprised when the results do not change,” León wrote, adding that cutting the extra hours would save the district $5 million.

In an interview with Chalkbeat Thursday, León said the move is intended to create more uniformity among schools and the services they provide. Now, all schools will get additional money to pay for programs outside of the regular school day, which schools can tailor to their individual needs, though students who are struggling academically will continue to receive “intensive” support, he said.

“Ultimately, the idea would be by October having completely different after-school and before-school programming that meets the needs of each respective school,” León said.

The extended time was first included in the teachers contract in 2012 as part of a larger improvement plan for the targeted schools, which was developed by Cami Anderson, Newark’s former state-appointed superintendent. The plan also designated some low-performing schools as “renew” schools, where teachers had to reapply for their positions and work longer hours.

Anderson also closed some schools and gave principals new hiring authority. Both actions left dozens of tenured teachers without positions, so Anderson created a fund to pay those teachers to perform support duties in schools. In 2014, that fund for “employees without placement” cost the district $35 million out of its nearly $1 billion budget, though by last year the fund had shrunk to $8 million for about 100 unassigned teachers, according to officials.

León said in Tuesday’s email that he was also eliminating the fund, which he said would save the district another $6 million. The teachers union president said he believed all the unassigned teachers now have placements, but the district did not respond to a request to confirm that.

León is also removing the “renew” and “turnaround” labels from low-performing schools, citing their “progress and student achievement,” according to the email.

“I applaud everyone’s efforts at renew or turnaround schools and acknowledge what has been accomplished,” he wrote.

Now that León has abolished his predecessors’ school-improvement program, he will be expected to create his own. Many schools remain mired in poor performance, even as the district overall has made strides in recent years.

When the teachers union agreed to the extended hours in its 2012 contract with the district, it was hailed nationally as a major breakthrough in efforts to revamp troubled schools. But even as the union agreed last year to keep the provision in its current contract, union officials have assailed the turnaround effort as a failure.

NTU President John Abeigon told Chalkbeat on Thursday that the program had been a “scam” and “nothing more than extended childcare.” He added that the stipend teachers received amounted to about $7 per hour for the extra time they worked.

In 2016, a district-commissioned survey of 787 teachers at schools with extended hours found that two-thirds of teachers at schools where the extra time was spent on student instruction said the time was valuable. But in a survey the union conducted in April, the 278 teachers who responded gave the extended hours low ratings for effectiveness in boosting student achievement.

Some teachers in the union survey praised the longer hours, saying their schools used them effectively to lengthen class periods, run after-school clubs, or allow teachers to plan lessons or review student data. But others said the extra time was squandered, leaving staff and students exhausted with little evidence of improved student outcomes to show for it. (Students’ pass rates on state tests stayed flat or declined at most “renew” schools in the first years of the program.)

The union also has complained that many teachers felt compelled to work the extra hours because those who refused to could be transferred to different schools. Under the terms of the original extended-day agreement, teachers were required to work an extra hour per day and attend trainings during the summer and some weekends.

In León’s email to employees, he said every extended-day school had set different work requirements and “none are consistent with the original design.” The longer days may also be contributing to high teacher turnover in those schools, he wrote, adding that principals of schools with regular hours told him they did not want to extend their hours.

Abeigon, the union president, applauded León’s decision to scrap the extra work hours.

“He came to the conclusion that we expected any true educator to reach: that the program was not working and was never going to work,” he said.

León said Thursday that he is now working on a new turnaround program. Once it’s ready, he promised to share the details with affected families before publicly announcing which schools are part of it — an effort to avoid the student protests that erupted when Anderson identified her “turnaround” schools.

He also said he was still considering whether he would ever close schools that fail to improve or to reverse their declining enrollments. Anderson’s decision to shutter nearly a dozen long-struggling schools continues to fuel resentment among her critics even years later.

“I think the whole idea of how much time does a school get to correct itself is a very important one and I’m going to need to be really reflective on it,” León said. “I’ve seen what closing schools does with people who do not feel that they were aware of it or a part of fixing it.”