Based on data conveniently aggregated for me by resident poll addict Adam C, I would now like to try to make sense of the recent national polling of the Presidential election. I am excluding tracking polls, which are more useful for measuring movement than status.
The polls included in my survey are as follows:
NOTA: None of the Above. LV: Likely Voters. RV: Registered Voters.

Poll      Date  McCain  Obama  NOTA  Sample
Zogby     8/4   42      41     17    LV
AP-Ipsos  8/4   42      48     10    RV
CNN       7/29  44      51     5     RV
Gallup    7/27  49      45     6     LV
Pew       7/27  42      47     11    RV
My conclusion: It’s a close one, with Obama barely ahead. Read on for the why and how.
When reading polling, the eye usually jumps straight to the difference between the two candidates. This is natural, but when it comes to analyzing this series of polls, I think it’s the wrong approach. These polls don’t measure the gap in support, if any, between Senators McCain and Obama. They measure the levels of support for each candidate.
So what are the levels of support? Ideally, to answer that question, we’d like a bunch of polls taken all at the same time, with the same methodology. With that, we could answer the question with a high degree of confidence. However, we have only five polls, all taken on different days, with different targeted samples of voters. That skews things: the differences between polls of Registered Voters and polls of Likely Voters can be large relative to the gap between the candidates. We go to war with the data we have, though, so let us see what comes out.
There are two ways we can try to pick at this data. One is to try to decide which single poll is most accurate; the other is to try to aggregate the poll data somehow and take a guess. I’ll try a combination of both. The polls show huge differences in the number of voters who favor neither Obama nor McCain. Given that the None of the Above vote was 1% in 2004 and 4% in 2000, I believe it safe to assume that it will not top 5% in 2008.
So the gap between 5% and the NOTA rate in any given poll, according to our assumptions, must represent undecided voters. What should that number be in an accurate poll? According to CNN’s 2000 exit polling, 11% of voters decided their votes in the three days before the election. In the 2004 exit polling, 9% chose on the final day or the last three days.
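As a rough sketch of that arithmetic in Python: subtract the assumed 5% ceiling on the true None of the Above vote from each poll's NOTA figure, and what remains is the implied undecided share. The names POLLS, NOTA_CEILING, and implied_undecided are illustrative only, and the 5% ceiling is the assumption stated above, not a measured quantity.

```python
# Poll figures as listed earlier in this post:
# (pollster, McCain %, Obama %, NOTA %, sample type)
POLLS = [
    ("Zogby", 42, 41, 17, "LV"),
    ("AP-Ipsos", 42, 48, 10, "RV"),
    ("CNN", 44, 51, 5, "RV"),
    ("Gallup", 49, 45, 6, "LV"),
    ("Pew", 42, 47, 11, "RV"),
]

# Assumption from the text: the real None of the Above vote will not top 5%.
NOTA_CEILING = 5


def implied_undecided(nota, ceiling=NOTA_CEILING):
    """Share of a poll's NOTA figure that must be undecided voters."""
    return max(nota - ceiling, 0)


undecided = {name: implied_undecided(nota) for name, _, _, nota, _ in POLLS}

for name, share in undecided.items():
    print(f"{name}: about {share}% undecided")
```

Under these assumptions, the implied undecided share can be compared directly against the 9% to 11% of late deciders seen in the exit polls.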
It is true that the exit polls have figures for people who decided a week before or a month before, but those people are thinking about the election in advance. They’re likely to show up in these polls as already having formed opinions, though those opinions may change before election day. I expect people who think about the election a month ahead to favor a candidate already.
So looking at these polls, it appears that CNN and Gallup are getting undecided figures all wrong: with NOTA at 5% and 6%, they leave room for almost no undecided voters at all. As we toss those out, notice that those are the two with the largest leads for each candidate (Gallup with McCain +4, CNN with Obama +7).
Looking at what’s left, we have three polls that show John McCain with 42% support. As for Obama, one shows him with 41%, one with 48%, and one with 47%. So if these polls mean anything, here is the state of the national opinion of voters, in my estimate: McCain 42, Obama 45, NOTA 13. Obama has a three-point lead, with 8-12% yet to choose between the two.
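The estimate above is a simple unweighted average of the three remaining polls, rounded to whole points. Here is a minimal sketch of that calculation in Python; the names KEPT and mean are illustrative, and equal weighting of the polls is an assumption, since nothing here adjusts for sample size or date.

```python
# The three polls kept after discarding CNN and Gallup:
# (pollster, McCain %, Obama %, NOTA %)
KEPT = [
    ("Zogby", 42, 41, 17),
    ("AP-Ipsos", 42, 48, 10),
    ("Pew", 42, 47, 11),
]


def mean(values):
    """Plain unweighted average."""
    return sum(values) / len(values)


# Average each column, then round to whole percentage points.
estimate = {
    "McCain": round(mean([p[1] for p in KEPT])),
    "Obama": round(mean([p[2] for p in KEPT])),
    "NOTA": round(mean([p[3] for p in KEPT])),
}
lead = estimate["Obama"] - estimate["McCain"]

print(estimate)
print(f"Obama lead: {lead} points")
```

The unrounded averages are 42, about 45.3, and about 12.7, so the rounded figures still add up to 100.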
That seems awfully close for a “Democratic Year,” given that in other elections Democrats have held much larger leads that Republicans successfully erased. It’s no surprise, then, that the polling map this year looks so much like the 2000-2004 maps. This one should be close to the wire.