data-driven decisionmaking

Why we won't publish individual teachers' value-added scores

Tomorrow’s planned release of 12,000 New York City teacher ratings raises questions for the courts, parents, principals, bureaucrats, teachers — and one other party: news organizations. The journalists who requested the release of the data in the first place now must decide what to do with it all.

At GothamSchools, we joined other reporters in requesting to see the Teacher Data Reports back in 2010. But you will not see the database here, tomorrow or ever, as long as it is attached to individual teachers’ names.

We feel a strong responsibility to report on the quality of the work that New York City’s 80,000 public school teachers do every day. This is a core part of our job and our mission.

But before we publish any piece of information, we always have to ask a question. Does the information we have do a fair job of describing the subject we want to write about? If it doesn’t, is there any additional information — context, anecdotes, quantitative data — that we can provide to paint a fuller picture?

In the case of the Teacher Data Reports, “value-added” assessments of teachers’ effectiveness that were produced in 2009 and 2010 for reading and math teachers in grades 3 to 8, the answer to both those questions was no.

We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s “best” and “worst” teachers or to search for individual teachers at all.

It’s true that the ratings the city is releasing might turn out to be powerful measures of a teacher’s success at helping students learn. The problem lies in that word: might.

Value-added measures do, by many accounts, appear to do the job that no measure of a teacher’s quality has done before: They estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do so with impressive reliability. That is, a teacher judged more effective one year by value-added is likely to be judged effective again the next year, and the year after that.

But this is hardly true for every teacher. Many teachers will be mislabeled; no one disputes this. Value-added scores may be more reliable than existing alternatives, but they are still far from perfectly reliable. It’s entirely possible, for instance, that a teacher judged less effective one year will be judged very effective the next, and vice versa.

As we reported two years ago, when the NYU economist Sean Corcoran looked at New York City’s value-added data, he found that 31 percent of English teachers who ranked in the bottom quintile in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.
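Part of that churn is what you would expect from measurement error alone. The following is a minimal simulation sketch of the idea (not the city’s model; the teacher count and the size of the noise term are illustrative assumptions), showing how teachers whose underlying effectiveness never changes can still bounce between quintiles when each year’s score carries substantial random error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 10_000

# Illustrative assumption: each teacher has a fixed "true" effect, and each
# year's value-added estimate adds independent measurement noise that is
# somewhat larger than the true differences between teachers.
true_effect = rng.normal(0.0, 1.0, n_teachers)
score_2007 = true_effect + rng.normal(0.0, 1.3, n_teachers)
score_2008 = true_effect + rng.normal(0.0, 1.3, n_teachers)

def quintile(scores):
    """Assign each teacher a quintile, 0 (bottom) through 4 (top)."""
    cutoffs = np.quantile(scores, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cutoffs, scores)

q07, q08 = quintile(score_2007), quintile(score_2008)
bottom_2007 = q07 == 0
share = ((q08 >= 3) & bottom_2007).sum() / bottom_2007.sum()
print(f"Bottom-quintile teachers in 2007 rated in the top two quintiles in 2008: {share:.0%}")
```

With noise of roughly that size, the simulation produces jump rates in the same general ballpark, even though no teacher’s underlying effectiveness changed at all.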

The fluctuation is acknowledged by even the strongest supporters of using value-added measures to evaluate teachers. One of the creators of the city’s original value-added model, the Columbia economist Jonah Rockoff, compares value-added scores to baseball players’ batting averages. One of his reasons: In each case, the year-to-year fluctuations of an individual’s score are about the same.

“If someone hit, you know, .280 last year, that doesn’t guarantee they’re going to hit .280 next year,” Rockoff said today. “However, if you hit .210 last year and I hit .300, there’s a very high likelihood I’m going to hit more than you next year, too. Whereas if you hit .280 and I hit .278, we’re basically the same.”

Another challenge is that many researchers still aren’t convinced that value-added scores measure the right sort of teacher impact. The concern lies in the flaws of the measures on which value-added scores depend — standardized state test scores.

Tests are supposed to measure what a student has learned about a subject, but they can also reflect other things, like how well her teacher prepared her for the test, or how well she mastered the narrow band of the subject the test assessed.

The test-prep concern is magnified by findings that a single teacher can generate two different value-added scores if evaluators use two different student tests to calculate them. The Gates Foundation’s Measures of Effective Teaching study calculated value-added scores for teachers based on both state tests and more conceptual tests. The study found substantial differences between the two, according to an analysis by the economist Jesse Rothstein of the University of California at Berkeley.

“If it’s right that some teachers are good at raising the state test scores and other teachers are good at raising other test scores, then we have to decide which tests we care about,” Rothstein said today. “If we’re not sure that this is the test that captures what good teaching is, then we might be getting our estimates of teaching quality very wrong.”

Flags about exactly what high value-added ratings reward are also raised by studies that ask how the ratings match up with measures of what teachers actually say and do in the classroom. Heather Hill, a professor at Harvard’s Graduate School of Education, rated math teachers’ teaching quality based on an observation rubric called the Mathematical Quality of Instruction, which looks at factors like whether the teacher made mathematical errors and the quality of her explanations. Then Hill compared the math teaching rating to value-added measures.

Two individual cases stood out: One teacher had made a slew of math errors in her teaching, and the other had failed to connect a class activity to math concepts. But both teachers’ value-added scores put them at the top of their cohort.

There is some reason to think that value-added measures reflect more than test prep. Rockoff points out that while different tests can produce different value-added scores for the same teacher, the two measures are still correlated. Using different tests, he said, is akin to looking at slugging percentage rather than batting average. “I’m sure those two things are positively correlated, but probably not one for one,” he said.

More persuasively, a recent study by Rockoff and two colleagues concluded that value-added measures can actually predict long-term life outcomes, including higher cumulative lifetime income, a reduced chance of teen pregnancy, and living in a higher-quality neighborhood as an adult. The study examined a very large urban school district that was not named but bears several similarities to New York City.

That study targeted another concern about value-added measures: that teachers score consistently well year after year not because of something they are doing, but because they consistently teach students with certain advantages.

Rothstein has used value-added models to conclude that fifth-grade teachers have strong effects on their students’ performance in third grade — something they could not possibly have influenced, unless value-added scores reflect not just teachers’ influence but also advantages their students bring with them.

Rockoff and his colleagues evaluated the possibility by testing a question. If high value-added teachers do well only because they get the “better” students in their grade, then their students’ high test-score growth should be matched by weaker growth in the grade’s other classrooms. That would mean that, when researchers looked at growth for the entire grade, the “better” students’ gains would be canceled out by those of their less lucky peers. But the gains were not canceled out, suggesting that effective teachers were doing more than just teaching unusually good students.
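The logic of that check can be seen in a toy example (the numbers below are invented for illustration and are not the study’s data): if a high score only reflects getting the strongest students in the grade, grade-wide average growth doesn’t move; if the teacher genuinely adds value, it does.

```python
# Toy illustration of the grade-level check (invented numbers, not the study's data).
# Three equal-sized classrooms in one grade; values are average test-score growth.

def grade_average(classroom_growth):
    return sum(classroom_growth) / len(classroom_growth)

# Scenario A: pure sorting. Teacher 1 gets students who would have grown a lot
# anyway, and the other rooms get the rest. Growth is only redistributed, so the
# grade-wide average is unchanged.
sorting = [8.0, 4.0, 3.0]
balanced = [5.0, 5.0, 5.0]
print(grade_average(sorting), grade_average(balanced))   # both 5.0

# Scenario B: a genuinely effective teacher. Her students grow more than they
# would have elsewhere, and nothing is subtracted from the other rooms, so the
# grade-wide average rises.
true_effect = [8.0, 5.0, 5.0]
print(grade_average(true_effect))                        # 6.0, higher than 5.0
```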

None of this means that we won’t write about what the data dump includes, or that we won’t publish an adapted database that strips out information linking the city’s data to individual teachers. With more than 90 columns in the Excel sheet the city has developed — and more than 17,000 rows, representing the number of reports issued over the reports’ two-year lifespan — the release might well enable us to examine the city’s value-added experiment in new ways.
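If we do publish a scrubbed version, the mechanical step is simple enough. Here is a rough sketch of the kind of processing involved, using pandas; the file and column names are hypothetical placeholders, since we have not seen the city’s final layout.

```python
import pandas as pd

# Hypothetical column names -- the city's actual file layout may differ.
IDENTIFYING_COLUMNS = ["teacher_name", "teacher_id", "file_number"]

# Load the city's spreadsheet (placeholder file name), drop any column that
# identifies a teacher, and write out a version safe to publish alongside
# our reporting.
reports = pd.read_excel("teacher_data_reports.xlsx")
scrubbed = reports.drop(columns=IDENTIFYING_COLUMNS, errors="ignore")
scrubbed.to_excel("teacher_data_reports_scrubbed.xlsx", index=False)
```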

Value-added measures certainly aren’t going away. City officials stopped producing Teacher Data Reports only because they knew the State Education Department was preparing its own. Those measures, which are expected to come out in 2013, will make up 25 percent of the evaluations for teachers of math and English in tested grades.

Betsy DeVos

To promote virtual schools, Betsy DeVos cites a graduate who’s far from the norm

U.S. Education Secretary Betsy DeVos spoke to the National Alliance for Public Charter Schools in June.

If Betsy DeVos is paying any attention to unfolding critiques of virtual charter schools, she didn’t let it show last week when she spoke to free-market policy advocates in Bellevue, Washington.

Just days after Politico published a scathing story about virtual charters’ track record in Pennsylvania, DeVos, the U.S. education secretary, was touting their successes at the Washington Policy Center’s annual dinner.

DeVos’s speech was largely identical in its main points to one she gave at Harvard University last month. But she customized the stories of students who struggled in traditional schools with local examples, and in doing so provided an especially clear example of why she believes in virtual schools.

From the speech:

I also think of Sandeep Thomas. Sandeep grew up impoverished in Bangalore, India and experienced terrible trauma in his youth. He was adopted by a loving couple from New Jersey, but continued to suffer from the unspeakable horrors he witnessed in his early years. He was not able to focus in school, and it took him hours to complete even the simplest assignment.

This changed when his family moved to Washington, where Sandeep was able to enroll in a virtual public school. This option gave him the flexibility to learn in the quiet of his own home and pursue his learning at a pace that was right for him. He ended up graduating high school with a 3.7 GPA, along with having earned well over a year of college credit. Today, he’s working in finance and he is a vocal advocate for expanding options that allow students like him a chance to succeed.

But Thomas — who spoke at a conference of a group DeVos used to chair, Advocates for Children, in 2013 as part of ongoing work lobbying for virtual charters — is hardly representative of online school students.

In Pennsylvania, Politico reported last week, 30,000 students are enrolled in virtual charters with an average 48 percent graduation rate. In Indiana, an online charter school that had gotten a stunning six straight F grades from the state — one of just three schools in that position — is closing. And an Education Week investigation into Colorado’s largest virtual charter school found that not even a quarter of its 4,000 students log on to do work every day.

The fact that in many states with online charters, large numbers of often needy students have enrolled without advancing has not held DeVos back from supporting the model. (A 2015 study found that students who enrolled in virtual charters in Michigan, Illinois, and Wisconsin did far worse than similar students who stayed in brick-and-mortar schools.) In fact, she appeared to ignore their track records during the confirmation process in January, citing graduation rates provided by a leading charter operator that were far higher — nearly 40 points in one case — than the rates recorded by the schools’ states.

She has long backed the schools, and her former organization has close ties to major virtual school operators, including K12, the one that generated the inflated graduation numbers. In her first week as education secretary, DeVos said, “I expect there will be more virtual schools.”

Correction: An earlier version of this article misstated the location of the dinner.

expansion plans

Here are the next districts where New York City will start offering preschool for 3-year-olds

PHOTO: Christina Veiga
Schools Chancellor Carmen Fariña, left, and Mayor Bill de Blasio, center, visited a "Mommy and Me" class in District 27 in Queens, where the city is set to expand 3-K For All.

New York City officials on Tuesday announced which school districts are next in line for free pre-K for 3-year-olds, identifying East Harlem and the eastern neighborhoods of Queens for expansion of the program.

Building on its popular universal pre-K program for 4-year-olds, the city this year began serving even younger students with “3-K For All” in two high-needs school districts. Mayor Bill de Blasio has said he wants to make 3-K available to every family who wants it by 2021.

“Our education system all over the country had it backwards for too long,” de Blasio said at a press conference. “We are recognizing we have to reach kids younger and more deeply if we’re going to be able to give them the foundation they need.”

But making preschool available to all of the city’s 3-year-olds will require an infusion of $700 million from the state or federal governments. In the meantime, de Blasio said the city can afford to expand to eight districts, at a cost of $180 million of city money a year.

Funding isn’t the only obstacle to making 3-K available universally. De Blasio warned that finding room for an estimated 60,000 students will be a challenge. Space constraints were a major factor in picking the next districts for expansion, he said.

“I have to tell you, this will take a lot of work,” he said, calling it “even harder” than the breakneck rollout of pre-K for all 4-year-olds. “We’re building something brand new.”

De Blasio, a Democrat who is running for re-election in November, has made expansion of early childhood education a cornerstone of his administration. The city kicked off its efforts this September in District 7 in the South Bronx, and District 23 in Brownsville, Brooklyn. More than 2,000 families applied for those seats, and 84 percent of those living in the pilot districts got an offer for enrollment, according to city figures.

According to the timeline released Tuesday, the rollout will continue next school year in District 4 in Manhattan, which includes East Harlem; and District 27 in Queens, which includes Broad Channel, Howard Beach, Ozone Park and the Rockaways.

By the 2019 – 2020 school year, the city plans to launch 3-K in the Bronx’s District 9, which includes the Grand Concourse, Highbridge and Morrisania neighborhoods; and District 31, which spans all of Staten Island.

The 2020 – 2021 school year would see the addition of District 19 in Brooklyn, which includes East New York; and District 29 in Queens, which includes Cambria Heights, Hollis, Laurelton, Queens Village, Springfield Gardens and St. Albans.

With all those districts up and running, the city expects to serve 15,000 students.

Admission to the city’s pre-K programs is determined by lottery. Families don’t have to live in the district where 3-K is being offered to apply for a seat, though preference will be given to students who do. With every expansion, the city expects it will take two years for each district to have enough seats for every district family who wants one.