checking your work (corrected)

New York City looks for a way out of its "bubble tests" problem

UFT President Michael Mulgrew testifies at a state senate hearing in New York City. At right, Senator John Flanagan, chair of the education committee, listens.

The city wants to get rid of unpopular “bubble sheet” tests that some of its youngest students are required to take this year, a top Department of Education official said on Tuesday.

“There are better ways to do assessments of early childhood and I think that we can find a better way to do it,” Chief Academic Officer Shael Polakow-Suransky told lawmakers in testimony at a state Senate hearing. The hearing was planned by Senator John Flanagan in large part as an opportunity for people to air their frustration with the state’s new standards and the tests associated with them.

The math tests in question, called Discovery Education Assessment, are being given to a small portion of students in kindergarten through second grade as part of their teachers’ evaluations, a portion of which must measure student learning over the course of a school year. Discovery’s tests include elements, like No. 2 pencils and standardized bubble answers, that teachers and experts have panned as developmentally inappropriate.

Polakow-Suransky echoed that criticism on Tuesday and vowed to soon offer an alternative measure of student learning that would take effect this school year.

It represents a somewhat sudden reversal for the city, which bought the Discovery tests from a vendor in August for this school year after declining to use its own elementary math assessments, an option that Commissioner John King preferred when he crafted DOE’s new teacher evaluation rules. Polakow-Suransky’s comments come as pushback against testing policies from parents and teachers has escalated statewide in recent weeks, prompting the State Education Department to make a series of its own changes to curtail the role of testing requirements.

It also fits into a pattern of conciliatory statements from Polakow-Suransky lately, all coming at a time when the direction of the Department of Education is likely to change dramatically in two months. Bill de Blasio, the overwhelming favorite to win next week’s mayoral election, has said he wants to reverse many of the city and state’s testing policies.

United Federation of Teachers President Michael Mulgrew, who testified before Polakow-Suransky, urged a more drastic action than the one offered by Polakow-Suransky.

“I think New York State should ban all standardized tests for pre-k through second grade,” Mulgrew said.

Told of Polakow-Suransky’s comments after the hearing, Mulgrew said he was pleased to hear of the possible changes, but added, “he’s the one who implemented them in the first place.”

Polakow-Suransky noted that only a relatively small number of students in early elementary grades were required to take the bubble tests. Teachers in more than 800 of the city’s elementary schools will be evaluated based on test scores earned by older students on existing state exams.

But older students don’t exist in 36 “early education” schools in New York City, which serve kindergarten through second grades only. Those schools, Polakow-Suransky said, were sacrificed “in order to protect the rest of the elementary schools” from a stipulation in King’s evaluation plan earlier this year.

Polakow-Suransky said the city had hoped to make the DOE’s newly developed math assessments available as an option to all elementary schools. But he pulled back on those plans over the summer after seeing King’s plan, which mandated that schools use performance assessments if the city made them available at all.

“It created a situation where we had, at the K-2 level, to make a choice between essentially putting out a test that would be mandated for every elementary school in the city, or not putting anything out at all,” Polakow-Suransky said.

Instead, Discovery’s bubble tests were picked from among the offerings of more than a dozen state-approved vendors.

Commissioner John King would need to approve any alternative measure, though a spokesman suggested that it won’t be a hard plan to move forward.

“Commissioner King and Chancellor [Merryl] Tisch are both opposed to using bubble tests for kindergarten to second grades,” said the spokesman, Dennis Tompkins. “If the city wants to move away from bubble tests, then we’d more than welcome that move.”

Tompkins noted that one option the city could consider for the 36 early education schools is to evaluate teachers based on the third grade test scores of former students after they’ve moved on to a different school.

Polakow-Suransky said the department was considering multiple ideas before sending in a formal proposal, including one in which teachers would look at student work samples over time to gauge growth.

Updated to clarify the Department of Education’s reasons for choosing to buy the Discovery assessments.

research report

Three years in, some signs of (slight) academic growth at struggling ‘Renewal’ schools

PHOTO: Patrick Wall
Mayor Bill de Blasio at Brooklyn Generation School — part of the Renewal program

When Mayor Bill de Blasio launched an aggressive and expensive campaign to turn around the city’s lowest performing schools, he made a big promise: Schools would see “fast and intense” improvements within three years.

Almost exactly three years later, and after flooding 78 schools with more than $386 million in new social services and academic support, there are signs that the Renewal program has generated gains in student learning. The evidence is based on two newly updated analyses of test score data: one from Marcus Winters, a fellow at the conservative-leaning Manhattan Institute, and the other from Aaron Pallas, a professor at Teachers College.

But the researchers caution that those improvements are modest — when they exist at all — and don’t yet match the mayor’s lofty promises.

The results may have implications far beyond New York City, as a national and political test case of whether injecting struggling schools with resources is more effective than closing them.

The two researchers previously reviewed the first two years of test score data in elementary and middle schools in the Renewal program: Winters found a positive effect on test scores, while Pallas generally found little to no effect.

Now, as the program reaches its third birthday, the pair of researchers have updated their findings with new test score data from last school year, and largely reaffirmed their earlier conclusions.

“We’re not seeing large increases” in student achievement, Pallas said. “And the reality is it’s hard to get large increases in struggling schools.”

Some advocates have argued that it is too early to expect big shifts in test scores, and that infusing schools with extra social services like mental health counseling and vision screenings is valuable in itself. But de Blasio’s promise of quick academic turnaround has invited questions about Renewal’s effectiveness and whether resources can be more effective in improving low-performing schools than shuttering them.

To assess the program’s academic effect, Pallas compared changes in Renewal school test scores to other schools that had similar test results and student demographics when the program started, but did not receive extra support.

The biggest gains Pallas found were concentrated at the elementary level.

Over the past three school years, 20 elementary schools in the Renewal program have made larger gains on average in math and reading than 23 similar schools that didn’t get extra resources. The proportion of elementary school students considered proficient in reading at Renewal schools increased from 7 percent in 2014 to 18 percent last year — an 11-point jump. Meanwhile, the comparison schools also saw gains, but only by seven percentage points, giving Renewal schools a four percentage point advantage.

At the middle school level, the results are less encouraging. The 45 Renewal middle schools did not collectively outperform a group of 50 similar schools outside the program in reading or math.

In math, for instance, Renewal school students improved from 5 percent proficient to 7 percent. However, the comparison schools outside the program improved by roughly the same margin, increasing proficiency from 6 to 9 percent (still far below the city average). In reading, Renewal middle schools showed slightly less growth than the comparison group.

City officials have argued that Pallas’ findings are misleading partly because Renewal schools and the comparison schools are not actually comparable. Renewal schools, they say, were designated based on a range of factors like school climate or teacher effectiveness, not just student demographics and test scores.

“The schools included in the study are neither similar nor comparable in quality and a comparison of the two dissimilar groups is unreliable at best,” Michael Aciman, an education department spokesman, said in a statement. Aciman added that Renewal schools have made larger gains in reading and math than similar schools across the state, and have made progress in reducing chronic absenteeism and improving instruction.

Pallas notes that there are some limitations to his approach, and acknowledges that he could not account for some differences between the two groups, such as the quality of a school’s principal. He also does not use student-level data, which would allow a more fine-grained analysis of whether the Renewal program is boosting student achievement. But Pallas, and other researchers who have previously reviewed his data, have said his model is rigorous.

The Manhattan Institute’s Winters found more positive trends than Pallas, consistent with his earlier findings. Using an approach that evaluates whether Renewal schools are outperforming historical trends compared with schools outside the program, Winters found that the Renewal program appeared to have a statistically significant effect on both reading and math scores — roughly equivalent to the difference in student achievement between charter schools and traditional district schools in New York City.

Asked about how to interpret the fact that his results tended to be more positive, Winters said either interpretation is plausible.

“It’s hard to tell which of these is exactly right,” he said. But “neither of us are finding results that are consistent with what we would expect if the program is having a large positive effect.”

explainer

Five things to know about the latest brouhaha over Tennessee’s TNReady test

PHOTO: Laura Faith Kebede

Last week’s revelation that nearly 10,000 Tennessee high school tests were scored incorrectly has unleashed a new round of criticism of the standardized test known as TNReady.

Testing company Questar says it muffed some tests this spring after failing to update its scanning software. A year earlier, a series of mistakes got its predecessor, Measurement Inc., fired when Tennessee had to cancel most of TNReady in its first year after a failed transition to online testing.

While the two companies’ glitches are hardly comparable in scope, Questar’s flub has uncorked a tempest of frustration and anger over the standardized assessment and how it’s used to hold teachers accountable.

Here are five things to know about the latest TNReady flap:

1. A relatively small number of students, teachers, and schools are affected.

State officials report that the scoring problem affected only high school tests, not those taken by grade-schoolers. Of the 600,000 high school end-of-course tests, about 9,400 were scored incorrectly. Most of the fixes were so small that fewer than 1,700 tests, or less than three-tenths of 1 percent, saw any change in their overall performance level. A state spokeswoman says the corrected scores have been shared with the 33 impacted districts.

2. But the TNReady brand has taken another huge hit.

Tennessee has sought to rebuild public trust in TNReady under Questar and celebrated a relatively uneventful testing season last spring. But the parade of problems that surfaced during TNReady’s rollout, combined with this year’s drops in student performance under the new test, have made subsequent bumps feel more like sinkholes to educators who already are frustrated with the state’s emphasis on testing. Questar’s scanning problems were also tied to delays in delivering preliminary scores to school systems this spring — another bump that exasperated educators and parents at the end of the school year and led many districts to exclude the data from student report cards.

3. State lawmakers will revisit TNReady — and soon.

House Speaker Beth Harwell asked Monday for a hearing into the latest testing problems, and discussion could happen as early as next week when a legislative study committee is scheduled to meet in Nashville. Meanwhile, one Republican gubernatorial candidate says the state should eliminate student growth scores from teacher evaluations, and a teachers union in Memphis called on Tennessee to invalidate this year’s TNReady results.

4. Still, those talks are unlikely to derail TNReady.

Tennessee is heavily invested in its new assessment as part of its five-year strategic plan for raising student achievement. Changing course now would be a surprise. Last school year was the first time that all students in grades 3-11 took TNReady, a standardized test aligned to the Common Core standards, even though those expectations for what students should learn in math and English language arts have been in Tennessee classrooms since 2012. State officials view TNReady results as key to helping Tennessee reach its goal of ranking in the top half of states on the Nation’s Report Card by 2019.

5. Tennessee isn’t alone in traveling a bumpy testing road.

Questar was criticized this summer for its design of two tests in Missouri. Meanwhile, testing giant Pearson has logged errors and missteps in New York, Virginia, and Mississippi. And in Tennessee and Ohio this spring, the ACT testing company administered the wrong college entrance exam to almost 3,000 juniors from 31 schools. Officials with the Tennessee Department of Education emphasized this week that they expect 100 percent accuracy on scoring TNReady. “We hold our vendor and ourselves to the highest standard of delivery because that is what students, teachers, and families in Tennessee deserve,” said spokeswoman Sara Gast.