Trust but verify when it comes to polling data
October 15, 2008
Last Wednesday, the afternoon after the second presidential debate, www.realclearpolitics.com, a political Web site that, among other things, averages recent polling data, showed Democrat Barack Obama with a 5.1-percentage-point lead over Republican John McCain in the presidential race.
However, when looking at the 10 polls averaged, taken at various times from Oct. 1 through 7, things got a bit confusing. While Obama led in each, his lead varied from one percentage point in the Reuters/C-SPAN/Zogby three-day tracking poll taken Oct. 5 through 7 to 11 percentage points in the Gallup tracking poll taken during the exact same three days.
A quick look at recent polling data for the Indiana governor's race, as well as that for the District Nine U.S. Congressional seat, shows similar discrepancies. Incumbent Republican Gov. Mitch Daniels leads Democratic challenger Jill Long Thompson by either 16 points or four points, depending on which polling firm you trust. The District Nine race is just as confusing, as one poll has incumbent Democrat Baron Hill up by 15, while another has him up by just three.
So, the central questions are these: first, which polls are correct, and second, why do they vary so much?
The answer to the first question won't be known until Election Day, but the second question may be a bit easier to understand.
Polls are much more scientific than they appear on the surface; it's not just a matter of calling 800 people and asking them, "If you had to vote for president and vice president today, would you cast your vote for the Democratic ticket of Barack Obama and Joseph Biden or the Republican ticket of John McCain and Sarah Palin?"
To be accurate, a poll first must include a large enough sample; some say at least 600 respondents. The reason is that every poll carries a margin of error, and the larger the sample, the smaller the margin of error. A poll with 2,000 respondents may have a margin of error of +/-2 points, whereas a poll of 600 respondents may have a margin of error of +/-4 points. In the latter case, if the poll shows a candidate up 48 percent to 44 percent, the actual margin could be as large as 12 points, or the lead could vanish entirely, with the trailing candidate actually ahead by as much as four points.
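The link between sample size and margin of error follows a standard statistical formula for a simple random sample; a minimal sketch, assuming a 95 percent confidence level and the worst case of an evenly split question (real polls adjust for design effects, so actual figures vary):

```python
import math

def margin_of_error(n, z=1.96):
    """95%-confidence margin of error, in percentage points,
    for a proportion near 50% (the worst case) in a simple
    random sample of n respondents."""
    return z * math.sqrt(0.25 / n) * 100

# Larger samples shrink the margin of error, but with
# diminishing returns: quadrupling n only halves it.
print(round(margin_of_error(2000), 1))  # roughly +/-2.2 points
print(round(margin_of_error(600), 1))   # roughly +/-4.0 points
```

This is why pollsters rarely go far beyond 1,000 or so respondents: cutting the margin of error from 4 points to 2 points requires roughly quadrupling the sample, and the cost quickly outweighs the gain in precision.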
Second, pay attention to who was polled. That's the key to a good poll.
Start by noting whether a poll surveyed registered voters or likely voters. While Indiana Secretary of State Todd Rokita expects the highest turnout of registered voters since the 1992 presidential election, he anticipates that number to be only 65 percent. That means 35 percent of registered Hoosier voters won't cast a ballot next month, and, therefore, what they tell a pollster is irrelevant. Likely voters, on the other hand, are just that: those who are expected to go to the polls on Election Day.
In addition, consider the breakdown of the polled sample. Pollsters weight the sample to reflect what they believe the voter makeup will be on Election Day. To do this, they look at voter identification and the makeup of recent elections, including gender, race and income levels. If they get this right, there's a good chance the poll will be accurate; if they weight the sample incorrectly, the poll will probably be wrong as well.
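How much weighting matters can be sketched with invented numbers (the group sizes, support levels and turnout shares below are hypothetical, purely for illustration):

```python
# Hypothetical raw sample: group -> (respondents, share backing candidate A).
# This sample over-represents Republicans relative to the assumed electorate.
raw_sample = {
    "Democrats":    (250, 0.90),
    "Republicans":  (450, 0.10),
    "Independents": (300, 0.50),
}

# The pollster's assumption about each group's share of actual turnout.
assumed_electorate = {
    "Democrats":    0.37,
    "Republicans":  0.35,
    "Independents": 0.28,
}

total = sum(n for n, _ in raw_sample.values())

# Unweighted: every respondent counts equally, so the skewed
# sample drags candidate A's number down.
unweighted = sum(n * p for n, p in raw_sample.values()) / total

# Weighted: each group's answers are scaled to its assumed turnout share.
weighted = sum(assumed_electorate[g] * p for g, (_, p) in raw_sample.items())

print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
# unweighted: 42.0%, weighted: 50.8%
```

The same raw interviews yield a nearly nine-point swing depending on the turnout model, which is exactly why two reputable pollsters surveying the same race in the same week can publish very different numbers.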
In 1996, pollster John Zogby's last poll before the presidential election showed President Bill Clinton winning re-election with 49 percent of the vote, compared with 41 percent for Republican Bob Dole and 8 percent for Ross Perot of the Reform Party, while other polls suggested Clinton would win by a wider margin. Zogby's poll was dead-on with the actual results.
However, while Zogby International's last poll prior to the 2004 presidential election showed Republican incumbent George W. Bush with a one-point lead over Democrat John Kerry (Bush won by 2.7 percent), Zogby predicted Kerry would win 311 electoral votes, giving him the election.
But some of the most glaring polling miscues happened in this year's Democratic primaries, most notably in New Hampshire and later in California, where several polls showed Obama with leads, some sizable, only for him to lose to Hillary Clinton by almost 3 and 10 points, respectively.
However, it's not the first time polling has been incorrect. Perhaps the most notable example was Gallup in 1948, when the organization's polling consistently showed Republican Thomas Dewey ahead of incumbent Democrat Harry Truman by five to 15 points. The lead was so consistent that Gallup stopped polling three weeks before the election.
Unfortunately, this prompted several print media outlets to proclaim victory for Dewey, the most famous example possibly being the Chicago Daily Tribune's front-page headline, "Dewey Defeats Truman," even though Truman won by 5.5 points.
But, to Gallup's credit, it correctly called President Franklin D. Roosevelt's 1936 re-election over Republican Alf Landon, when the leading poll of the day, in the Literary Digest magazine, which had called the five previous elections correctly, said Roosevelt would lose by a 56 to 44 margin.
The magazine, which had mailed 10 million questionnaires to readers and potential readers and received two million back, failed to account for the fact that more Republicans than Democrats subscribed to it. It therefore oversampled Republicans.
The magazine, hurt by not just being wrong but by being so wrong (Roosevelt beat Landon 61 to 37 percent), went out of print a few months later.
Polling, then, is part science and part art, so any one poll may or may not be accurate. Looked at as a whole, however, several polls can point to a trend.
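The averaging itself is simple arithmetic; a sketch using hypothetical margins loosely modeled on the one-to-11-point spread described above (the figures are illustrative, not actual poll results):

```python
import statistics

# Hypothetical leads, in percentage points, from six recent polls.
polls = [1, 4, 5, 6, 7, 11]

# A simple average smooths out individual polls' house effects
# and sampling noise; this is the basic idea behind a poll average.
mean_lead = statistics.mean(polls)

# The median is less sensitive to a single outlying poll.
median_lead = statistics.median(polls)

print(f"average lead: {mean_lead:.1f} points, median: {median_lead:.1f}")
```

No single poll in the list is necessarily right, but the average and the median land close together, which is the trend the column describes: the aggregate is steadier than any one survey.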
To look at the most recent polls, visit www.realclearpolitics.com or www.pollingreport.com.