What's wrong with this headline?
In poll reporting, the margin of error is like the late comic Rodney Dangerfield: It don't get no respect.
On Thursday, Oct. 21, this headline topped the front page of the San Jose Mercury News: "Bush ahead in 7 swing states." Reporting on a set of statewide polls, the Knight Ridder story asserts, "Bush leads in seven states that he carried closely in 2000."
But in the next two paragraphs, the writer contradicts himself: The 1-percentage-point margin between Bush and Kerry in Ohio is "a statistically insignificant difference that means the race is effectively a dead heat."
One paragraph later the reporter also describes the races in Florida and New Hampshire as "tossups" because the poll results place the candidates only 3 percentage points apart, and the poll has a margin of error of plus or minus 4 percentage points.
The poll results are straightforward, even if the writing reverses course. One cannot tell who leads in these three states. The headline and top of the story are misleading because they ignore the margin of error.
The Contra Costa Times, another Knight Ridder newspaper, also ran the poll article, which carried the byline of one of the chain's principal political writers, Steven Thomma.
The problem showed up again the following day in a story by the same author in both Bay Area papers. This time Mr. Kerry was said to have "trailed" Mr. Bush in three "blue" swing states. But a sentence later the claim was contradicted: "All were within the poll's margin of error and remained toss-ups."
If a race is a "toss-up" and the difference in the survey sample is "statistically insignificant," then it's counterfactual to say one party is leading. One can say candidate A is leading in our survey of 650 people. But to say the candidate is leading in the population of several million likely voters is to give the survey more precision than a probability sample can deliver.
It may help to review how probability surveys work.
Probability sampling seems almost magical. You talk to just 650 residents of a state like Ohio with 7.8 million registered voters. If everything goes right, there's a 95% chance the true proportion of the millions planning to vote for a candidate will fall within 4 percentage points on either side of the proportion in your small sample who prefer that office-seeker.
So there's a plus-or-minus-4-percentage-point sampling error around the estimate of each candidate's popularity. In Ohio, the sample showed Mr. Bush commanded the allegiance of 46% of likely voters. So he could be the favorite of as many as 50% of the population sampled, or as few as 42%. With a sample estimate of 45%, Mr. Kerry could be the choice of as many as 49% or as few as 41%.
The overlap is so great the poll can't tell you who's ahead.
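The arithmetic behind that overlap can be sketched in a few lines of Python. This is a minimal illustration, not the pollster's actual method, which the story does not describe. It assumes a simple random sample, the standard normal approximation for a 95% confidence level (z = 1.96) and the worst-case proportion of 50%. The function names are made up for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of sampling error for a proportion.

    Assumes a simple random sample of size n; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

def interval(share, moe):
    """Range in which the true proportion plausibly lies."""
    return (share - moe, share + moe)

def intervals_overlap(a, b):
    """True if two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

n = 650                      # sample size cited in the article
moe = margin_of_error(n)     # a bit under 4 points; polls round it up

bush = interval(0.46, moe)   # Ohio sample estimate for Bush
kerry = interval(0.45, moe)  # Ohio sample estimate for Kerry

print(f"margin of error: {moe * 100:.1f} points")
print(f"Bush:  {bush[0] * 100:.1f}% to {bush[1] * 100:.1f}%")
print(f"Kerry: {kerry[0] * 100:.1f}% to {kerry[1] * 100:.1f}%")
print("ranges overlap:", intervals_overlap(bush, kerry))
```

With 650 respondents the formula gives a little less than 4 percentage points, which matches the rounded figure in the story. The two Ohio ranges overlap almost entirely.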
Of the seven states sampled in the Mercury News article, in only one -- Nevada -- does the margin found in the poll exceed the margins of error around each candidate. The poll showed Bush with a 10-percentage-point lead in the Silver State.
The sample proportions, Bush's 46% and Kerry's 45%, are the best guesses of the true proportion for each candidate in Ohio. But they're only estimates. There's a declining, but real, chance the true proportion lies toward one or the other end of each candidate's error margin.
Rather than ignoring margins of sampling error, news media should consider them conservative estimates of inaccuracy.
Almost all modern polls are prone to a host of errors. The logic of probability polling depends on every member of the population having an equal chance of being included in the poll. That's almost never the case.
Some people screen out calls from pollsters, or aren't home when they call, or don't speak the poll-takers' language, or aren't included in the list of numbers because they own only a cell phone, or wish to keep their vote a secret. To the extent that those excluded differ from those who answer, the poll loses accuracy.
The wording of polls, the definition of "likely voters," even the order of questions can also introduce bias.
During this campaign season we have seen many polls asking the same question -- for example, "who won the debate?" -- of the same population at the same time, but coming up with answers far outside the margins of sampling error.
Grade the News conducted an analysis of the accuracy of Bay Area and California polls in 2000. We compared the actual difference between the winning and losing candidate or ballot measure with the predictions of the margin of victory in the poll taken closest to Election Day. We found fewer than half the poll predictions fell within their stated margins of error multiplied by two. The analysis was designed by Warren Mitofsky, one of the nation’s most respected veteran pollsters.
Given these problems, reporters ought not place too much faith in any single poll. Averaging is a good idea. Paying attention to trends is another. In the Knight Ridder story, the reporter notes that earlier polls showed Bush with a likely lead. Kerry's rising numbers should have raised a further warning flag about the president's "lead."
Mercury News editor Rich Ramirez has acknowledged the error and promised a correction. We have alerted Chris Lopez, managing editor of the Times, and will post any reply he may provide.
The poll story that we ran on Thursday, Oct. 21, reported:
1. That Knight Ridder conducted a poll that showed someone was ahead.
2. That the differences between the two candidates in most of the states polled were too small to be statistically significant, because they were within the margin of error.
These are both accurate statements and NOT contradictory. The poll said the candidate was ahead; we said the difference between the candidate and his opponent was too small to be of great significance. I believe readers are smart enough to absorb this information accurately.
In an election this close, most polls in the contested states are going to be within the margin of error. Yet, for all their flaws -- and there are many -- polls do give some indication of what may be happening. They are not predictive. They don't say what's going to happen on election day.
They reflect only what a group of people told a pollster at the moment he or she called. Is that perfect? Heck no, far from it. But it does provide a clue to the thoughts of what clearly is a fluid electorate.
I believe these nuances were captured in the story. What is more difficult for us is a headline that does not do as well capturing nuance, though the deck head -- "But leads not solid in several battlegrounds he won in 2000 election" -- certainly helps.
I'm sure that after this election, there will be stories written that "the polls were wrong." Again. And perhaps some of them were. Others may turn out to be wrong even if they were right when they were taken. They're polls -- not results. Clearly, I have more confidence than you that people are smart enough to understand that distinction.