# On Polling Models, Skewed & Unskewed

There’s a very large gulf between my conclusion, explained on Friday, that Obama is toast on Election Day and confident projections like Nate Silver’s poll-reading model still giving the president (at last check) a 77.4% chance of victory. Let me explain why, and what that says about the difference between my approach and Nate’s.

The Limits of Mathematical Models

“A page of history is worth a volume of logic”
- Oliver Wendell Holmes

Mathematical models are all the rage these days, but you need to start with the most basic of facts: a model is only as good as the underlying data, and that data comes in two varieties: (1) actual raw data about the current and recent past, and (2) historical evidence from which the future is projected from the raw data, on the assumption that the future will behave like the past. Consider the models under closest scrutiny right now: weather models such as hurricane models. These are the best kind of model, in the sense that the raw data is derived from intensive real-time observation and the historical data is derived from a huge number of observations and thus not dependent on a tiny and potentially unrepresentative sample.

Yet, as you watch any storm develop, you see its projected path change, sometimes dramatically. Why? Because the models are highly sensitive to changes in raw data, and because storms are dynamic systems: their path follows a certain logic, but does not track a wholly predictable trajectory. The constant adjustments made to weather models ought to give us a little more humility in dealing with models that suffer from greater flaws in raw data observations, smaller sample sizes in their bases of historical data, or that purport to explain even more complex or dynamic systems – models like climate modeling, financial market forecasts, economic and budgetary forecasting, or the behavior of voters. Yet somehow, liberals in particular seem so enamored of such models that they decry any skepticism of their projections as a “War on Objectivity,” in the words of Paul Krugman. Conservatives get labeled “climate deniers” or “poll deniers” (by the likes of Tom Jensen of PPP, Markos Moulitsas, Jonathan Chait and the American Prospect) or, in the case of disagreeing with budgetary forecasts that aren’t really even forecasts, “liars.” But if history teaches us anything, it’s that the more abuse that’s directed towards skeptics, the greater the need for someone to play Socrates.

Consider an argument Michael Lewis makes in his book The Big Short: nearly everybody involved in the mortgage-backed securities market (buy-side, sell-side, ratings agencies, regulators) bought into mathematical models that valued MBS as low-risk, models whose historical data didn’t go back far enough to capture a collapse in housing prices. And it was precisely such a collapse that destroyed all the assumptions on which the models rested. But the people who saw the collapse coming weren’t people who built better models; they were people who questioned the assumptions in the existing models and figured out how dependent they were on those unquestioned assumptions. Something similar is what I believe is going on today with poll averages and the polling models on which they are based. The 2008 electorate that put Barack Obama in the White House is the 2005 housing market, the Dow 36,000 of politics. And any model that directly or indirectly assumes its continuation in 2012 is – no matter how diligently applied – combining bad raw data with a flawed reading of the historical evidence.

Different sets of polls are, more or less, describing two alternate universes in terms of what the 2012 electorate will look like, one strongly favorable to Obama, one essentially decisive in favor of Romney. The pro-Obama view requires a number of things to happen that are effectively unprecedented in electoral history, but Nate Silver argues that we should trust the state polls that support it because state poll averages have a better track record in other elections than national polls. The pro-Romney view, by contrast, simply assumes that things that have gone wrong before have gone wrong again in a number of the polls’ samples. Sean Trende argues that the national pollsters currently in the field are more reliable, and that this (rather than the history of state and national pollsters in the abstract) should be significant:

Among national pollsters, you have a battle-tested group with a long track record performing national polls. Of the 14 pollsters producing national surveys in October, all but three were doing the same in 2004 (although AP used Ipsos as its pollster that year rather than GfK, and I believe a few others may have changed their data-collection companies). Of the 14 pollsters surveying Ohio in October, only four did so in 2004 (five if you count CNN/USAToday/Gallup and CNN/Opinion Research as the same poll).

Pollsters such as ABC/Washington Post, Gallup, Pew, Battleground, and NBC/WSJ are well-funded, well-staffed organizations. It’s not immediately obvious why the Gravises, Purple Strategies and Marists of the world should be trusted as much as them, let alone more. And since virtually none of the present state pollsters were around in 1996 or 2000 (except Rasmussen Reports, which had a terrible year in 2000 and has since overhauled its methodology), it’s even less clear why we should now defer to state poll performance based upon those years.

In my opinion, which view is correct is not one that can be resolved by mathematical models, but rather by an examination of the competing assumptions underlying the two sets of polls and an assessment of their reasonableness in light of history and current political reality.

Where Polls Come From

Polling is “scientific,” in the sense that it attempts to follow well-established mathematical concepts of random sampling, but political polls remain as much art as science, and each polling cycle presents different challenges to pollsters’ ability to accurately capture public sentiment. Quick summary: dating back roughly to George Gallup’s introduction of modern political polling in the 1936 election, a pollster seeks to extrapolate the voting behavior of many millions of people (130 million people voted in the 2008 presidential election) from a poll of several hundred or a few thousand people. In a poll that seeks only the opinion of the public at large, the pollster will seek to use a variety of sampling techniques to ensure that the population called actually matches the population as a whole in terms of age, gender, race, geography and other demographic factors. In some cases, where the raw data doesn’t provide a random sample, the pollster may re-weight the sample to reflect a fair cross-section.
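As a rough illustration of that re-weighting step, here is a minimal Python sketch. The age groups, sample counts and population shares are invented for the example and do not come from any real poll, and real pollsters weight on many variables at once; this shows only the basic idea of scaling each group up or down until the sample matches the target population.

```python
# Hypothetical illustration only: none of these numbers come from a real poll.

def poststratification_weights(sample_counts, population_shares):
    """Return one weight per group so the weighted sample matches the target shares."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        # Under-sampled groups get a weight above 1, over-sampled groups below 1.
        weights[group] = population_shares[group] / sample_share
    return weights


# Made-up raw sample: too few young respondents, too many older ones.
sample_counts = {"18-29": 100, "30-64": 600, "65+": 300}
# Made-up target shares for the population being modeled.
population_shares = {"18-29": 0.20, "30-64": 0.60, "65+": 0.20}

weights = poststratification_weights(sample_counts, population_shares)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

The catch, for political polling, is that the target shares are themselves an assumption: nobody knows the makeup of an electorate that has not voted yet.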

Political polling is a somewhat different animal, however: not all adults are registered voters, and not all registered voters show up to vote every time there’s an election. So, a pollster has to use a variety of different methods – in particular, a “likely voter” screen designed to tease out the poll respondent’s likelihood of voting – to try to figure out whether the pollster’s results have sampled a group of people who correspond to the actual electorate for a given election. This is complicated by the fact that voter turnout isn’t uniform: in some years and some states Republican enthusiasm is higher than others, in some Democratic enthusiasm is higher than others. You can conduct the best poll in the world in terms of accurately ascertaining the views of a population that mirrors your sample – but if your sample doesn’t mirror that season’s electorate, your poll will mislead its readers in the same way that the Literary Digest’s unscientific poll did in 1936, or the RCP averages in the Senate elections in Colorado and Nevada in 2010, or the polls that failed to capture the GOP surge in 2002.

Technology, economics and other factors affect polling. The rise of caller ID in particular has dramatically reduced response rates – that is, pollsters have to call 8 or 9 people for every one who will answer their poll. That raises the level of difficulty in ensuring that the people who actually do answer the questions are a representative sample. Liberals argue that pollsters undersample people who have only cell phones (a disproportionately younger and/or poorer group) and non-English speakers; conservatives counter that Tea Partiers may be less likely to talk to pollsters and that polls in some cases can suffer a “Shy Tory Factor” where voters are less likely to admit to voting Republican. Partisans dispute the relative merits of in-person versus automated polling and the structure of polls that ask a lot of leading questions before asking for voter preferences. And the economics of the polling business itself is under stress, as news organizations have less money to spend on polls and pollsters do public political polling for a variety of business reasons, only some of which have anything to do with a desire to be accurate – some pollsters like PPP make most of their money off serving partisan clients, news organizations do it to drive news, universities do it for name recognition.

2012, even more so than past elections, is apt to produce another round of reflection and recrimination on all of these issues, as a great many of the individual polls we have seen so far have been largely or wholly irreconcilable, especially in terms of their view of the partisan makeup of the 2012 electorate. If you assume that (1) the various players in national and state polling have essentially random tendencies towards inaccuracy in modeling the electorate in all conceivable environments and (2) each state’s poll average includes a large enough sample of different polls by different pollsters to bear out this assumption – in that case, state polling averages and the models that rest on them should be good predictors of turnout, as they have been in most (but not all) past elections. But when you consider that 2008 was a very unusual environment and that every turnout indicator we have other than the state poll averages is pointing to a different electorate, these become far more questionable assumptions.

Toplines and Internals

Nate Silver’s much-celebrated model is, like other poll averages, based simply on analyzing the toplines of public polls. This, more than any other factor, is where he and I part company.

If you read only the toplines of polls – the single number that says something like “Romney 48, Obama 47” – you would get the impression from a great many polls that this is a very tight race nationally, in which Obama has a steady lead in key swing states. In an ordinary year, the toplines of the polls eventually converge around the final result – but this year, there seem to be some stubborn splits among the poll toplines that reflect the pollsters’ struggles to come to agreement on who is going to vote.

Poll toplines are simply the sum of their internals: that is, the results among the different subgroups within the sample. The internals that poll-watchers track most closely are the partisan breakdowns: how each candidate is doing with Republican voters, Democratic voters and independent voters, two of which (the Rs & Ds) have relatively predictable voting patterns. Bridging the gap from those internals to the topline is the percentage of each group included in the poll, which of course derives from the likely-voter modeling and other sampling issues described above. And therein lies the controversy.
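The arithmetic here is simple enough to sketch in a few lines of Python. Everything in this example is hypothetical: the support levels within each party group and the two partisan mixes are invented to show the mechanism, not taken from any actual poll. The point is only that identical internals can produce opposite toplines once the assumed share of each group changes.

```python
# Hypothetical illustration only: the internals and party mixes are invented.

def topline(party_mix, support_by_party):
    """Topline = sum over party groups of (group's share of sample) x (support in group)."""
    return sum(party_mix[party] * support_by_party[party] for party in party_mix)


# Made-up internals: share of each party group backing each candidate.
romney_support = {"R": 0.93, "D": 0.06, "I": 0.52}
obama_support = {"R": 0.06, "D": 0.93, "I": 0.44}

# Two made-up assumptions about the partisan makeup of the electorate.
party_mixes = {
    "D+7 electorate": {"R": 0.32, "D": 0.39, "I": 0.29},
    "even electorate": {"R": 0.36, "D": 0.36, "I": 0.28},
}

# Romney-minus-Obama margin under each assumed electorate.
margins = {}
for label, mix in party_mixes.items():
    romney = topline(mix, romney_support)
    obama = topline(mix, obama_support)
    margins[label] = romney - obama
    print(f"{label}: Romney {romney:.1%}, Obama {obama:.1%}")
```

In this toy case the candidate winning independents by eight points trails under one party mix and leads under the other, which is exactly why the turnout assumption matters more than the averaging.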

My thesis, and that of a good many conservative skeptics of the 538 model, is that these internals are telling an entirely different story than some of the toplines: that Obama is getting clobbered with independent voters, traditionally the largest variable in any election and especially in a presidential election, where both sides will usually have sophisticated, well-funded turnout operations in the field. He’s on track to lose independents by double digits nationally, and the last three candidates to do that were Dukakis, Mondale and Carter in 1980. And he’s not balancing that with any particular crossover advantage (i.e., drawing more crossover Republican voters than Romney is drawing crossover Democratic voters). Similar trends are apparent throughout the state-by-state polls, not in every single poll but in enough of them to show a clear trend all over the battleground states.

If you averaged Obama’s standing in all the internals, you’d capture a profile of a candidate that looks an awful lot like a whole lot of people who have gone down to defeat in the past, and nearly nobody who has won. Under such circumstances, Obama can only win if the electorate features a historically decisive turnout advantage for Democrats – an advantage that none of the historically predictive turnout metrics are seeing, with the sole exception of the poll samples used by some (but not all) pollsters. Thus, Obama’s position in the toplines depends entirely on whether those pollsters are correctly sampling the partisan turnout.

That’s where the importance of knowing and understanding electoral history comes in. Because if your model is relying entirely on toplines that don’t make any sense when you look at the internals with a knowledge of the past history of what winning campaigns look like, you need to start playing Socrates.

Moneyball and PECOTA’s World

Let me use an analogy from baseball statistics, which I think is appropriate here because it’s where both I and Nate Silver first learned to read statistics critically and first got an audience on the internet: in terms of their predictive power, poll toplines are like pitcher win-loss records or batter RBI.

At a very general level, the job of a baseball batter is to make runs score, and the job of a baseball pitcher is to win games, so traditionally people looked at W-L records and RBI as evidence of who was good at their jobs. And it’s true that any group of pitchers with really good W-L records will, on average, be better than a group with bad ones; any group of batters with a lot of RBI will, on average, be better than a group with very few RBI. If you built a model around those numbers, you’d be right more often than you’d be wrong.

But wins and RBI are not skills; they are the byproducts of other skills (striking people out, hitting home runs, etc.) combined with opportunities: you can’t drive in runners who aren’t on base, and you can’t win games if your team doesn’t score runs. If you build your team around acquiring guys who get a lot of RBI and wins, you may end up making an awful lot of mistakes. Similarly, you can’t win the votes of people who don’t come to the polls.

Baseball analysis has come a long way in recent decades, because baseball is a closed system: nearly everything is recorded and quantified, so statistical analysis is less likely to founder on hidden, uncounted variables. Yet, even highly sophisticated baseball models can still make mistakes if they rest on mistaken assumptions. Baseball Prospectus.com’s PECOTA player projection system – designed by Nate Silver and his colleagues at BP – is one of the best state-of-the-art systems in the business. But one of PECOTA’s more recent, well-known failures presents an object lesson. In 2009, PECOTA projected rookie Orioles catcher Matt Wieters to hit .311/.395/.546 (batting/on base percentage/slugging). As regular consumers of PECOTA know, this is just a probabilistic projection of his most likely performance, and the actual projection provided a range of possible outcomes. But the projection clearly was wrong, and not just unsuccessful. While Wieters has developed into a good player, nothing in his major league performance since has justified such optimism: Wieters hit .288/.340/.412 as a rookie, and .260/.328/.421 over his first four major league seasons. What went wrong? Wieters had batted .355/.454/.600 between AA and A ball in 2008, and systems like PECOTA are supposed to adjust those numbers downward for the difference in the level of competition between A ball, AA ball and the major leagues. But as Colin Wyers noted at the time, the problem was that the context adjustments used by PECOTA that season used an unusually generous translation, assuming that the two leagues Wieters had played in – the Eastern League and the Carolina League – were much more competitive in 2008 than they had been in previous years. By getting the baseline of the 2008 environment Wieters played in wrong, PECOTA got the projection wrong, a projection that was out of step with what other models were much more realistically projecting at the time. The sophistication of the PECOTA system was no match for two bad inputs in the historical data.

My point is not to beat up on PECOTA, which as I said is a fantastic system and much better than anything I could design. Let’s consider for a further example one of PECOTA’s most notable successes, one where I questioned Nate Silver at the time and was wrong; I think it also illustrates the differing approaches at work here. In 2008, PECOTA projected the Tampa Bay Rays to win 88-89 games, a projection that Nate Silver touted in a widely-read Sports Illustrated article. It was a daring projection, seeing as the Rays had lost 95 or more games three years running and never won more than 70 games in franchise history. As Silver wrote, “[i]t’s in the field…that the Rays will make their biggest gains…the Rays’ defense projects to be 10 runs above average this year, an 82-run improvement.” I wrote at the time: “this is nuts. Last season, Tampa allowed 944 runs (5.83 per game), the highest in the majors by a margin of more than 50 runs. This season, BP is projecting them to allow 713 runs (4.40 per game), the lowest in the AL, third-lowest in the majors…and a 32% reduction from last season…it’s an incredibly ambitious goal.”

PECOTA was right, and if anything was too conservative. The Rays won 97 games and went to the World Series, without any improvement by their offense, almost entirely on the strength of an improved defense. I later calculated that their one-year defensive improvement was the largest since 1878. Looking at history and common sense, I was right that PECOTA was projecting an event nearly unprecedented in the history of the game, and I would raise the same objection again. But the model was right in seeing it coming.

If you looked closely, you could see why: the frontiers of statistical analysis had shifted. Michael Lewis’ book Moneyball, following the 2002 Oakland A’s, captured the era when statistical analysts stressed hitting and de-emphasized fielding on the theory that it was easier to use sophisticated metrics to find better hitters, but harder to quantify the benefits of defense. By 2008, the metrics were creating more opportunities to study defense, and – as captured in Jonah Keri’s book The Extra 2% (about the building of that Rays team) – the Rays took advantage.

But for the Rays, the 2008 environment was not so easily repeated in subsequent years. While still a successful club with a solid defense in a pitcher’s park (and still far better defensively than in 2007) they have led the league in “Defensive Efficiency Rating” only once in the past four years. It’s what Bill James called the Law of Competitive Balance: unsuccessful teams adapt more quickly to imitate the successes of the successful teams, bringing both sides closer to parity. Trende, in his book The Lost Majority, applies the same essential lesson to political coalitions. Assuming that the 2008 turnout models, which depended heavily on unusually low Republican turnout, still apply to Obama’s current campaign ignores the extent to which multiple factors favor a balance swinging back to the Republicans. And the polls that make up the averages – averages upon which Nate Silver’s model rests – are doing just that. Nate’s model might well work in an election where the relationship between the internals and the toplines was unchanged from 2008. But because that assumption is an unreasonable one, yet almost by definition not subject to question in his model, the model is delivering a conclusion at odds with current, observable political reality.

Painted Into A Corner

Poll analysis by campaign professionals often involves a large dollop of conscious partisan hackery: spinning the polls to suggest a result the campaigns know is not realistic, in the hopes of avoiding the bottom-drops-out loss of voter confidence that sets in when a campaign is visibly doomed. For the record, unlike some of my conservative colleagues, I don’t think Nate is a conscious partisan hack. I have a lot of respect for his intelligence and his thoroughness as a baseball analyst and we have mutual friends in the world of baseball analysis, and I think he undoubtedly recognizes that it will not be good for his credibility to be committed to the last ditch to defending Obama as a prohibitive favorite in an election he ends up losing. (It’s true that the 538 model is just probabilities, but as Prof. Jacobson notes, Nate won his reputation as an electoral forecaster with similar probabilistic projections in 2008; if you project a guy to have a 77% chance to win an election he loses, that will inevitably cause people to put less faith in your odds-laying later on).

I do, however, think that – for whatever reasons – Nate has painted himself into a corner from which there is no easy escape. If I’m right about the electorate and the polls are right about the internals, Romney wins – and if Romney wins, the 538 model will require some serious rethinking. There are a bunch of reasons why he finds himself in this position. One is that his model has been oversold: he made his poll-reading reputation based on a single election cycle, in which he had access to non-public polls to check his work. Nate is, in fact, not the first poll-reader to get 49 states right: RedState’s own Gerry Daly did the same thing in 2004, missing only Wisconsin (which Bush lost by half a point) in his Election Day forecast, and Gerry did this through careful common-sense reading of the state-by-state polls checked against the national polls, not through a model that purported to do his thinking for him. (As it happened, the RCP averages at the end of the cycle did the same thing, as they did in 2008.) I’m inclined to listen to guys like Gerry who have been doing this for years and have not only recounted the numbers from past elections but lived through the reading of polls while they were happening. In 2010, the 538 model fared well – but no better than the poll averages at RCP. And that was only after Nate was much slower to pick up on the coming GOP wave than Scott Rasmussen, who called it a lot earlier in the cycle.

There are a raft of methodological quibbles with the 538 model (some larger than others), many of which reek of confirmation bias (i.e., the tendency to question bad news more closely than good). For example, while Nate’s commentaries have included lengthy broadsides against Rasmussen and Gallup, his model tends to give a lot of weight to partisan pollster PPP. Ted Frank noted one example that perfectly captures the value of knowing your history: the 538 model’s assumptions about how late-deciding undecided voters will break are tilted towards Obama by including the 2000 election, when Gore did far better on Election Day than the late-October polls suggested. But Gore wasn’t an incumbent, and there was a major event (the Bush DUI story) that had a major impact on turnout and undecided voters. If you make different assumptions based on a different reading of history, you get different conclusions. The spirit of open scientific inquiry should welcome this kind of scrutiny, even in the heat of election season.

None of this is a reason to conclude that the 538 concept is broken beyond repair. If you regard poll analysis as something like an objective calling, you can learn from your failures as well as your successes. If Obama wins, my own assumptions (and indeed, nearly everything we know about winning campaigns) will have to be re-examined. If Romney wins, the model of simply aggregating the topline state-by-state poll averages will have to be sent back to the drawing board. But there will be no hiding, in that case, from the fact of its failure.

Unskewed Polls

One of the more widely-discussed efforts to fix the problem of topline poll data varying by turnout models is Dean Chambers’ UnskewedPolls.com, which takes the internals of each poll and re-weights them for a more Romney-friendly turnout model. In concept, what Chambers is doing is on the right track, because it lets us separate how much of the poll toplines is due to the sentiments of different groups and how much is due to assumptions about turnout. But his execution is a methodological hash.

I haven’t pulled apart all the pieces of Chambers’ model, but my main objection to UnskewedPolls is that it re-weights the electorate twice:

The QStarNews poll works with the premise that the partisan makeup of the electorate 34.8 percent Republicans, 35.2 percent Democrats and 30.0 percent independent voters. Additionally, our model is based on the electorate including approximately 41.0 percent conservatives, 20.0 percent moderates and 39.0 percent liberals.

Republicans are 89 percent conservative, 9 percent moderate and 2 percent liberal. Among Democrats, 3 percent are conservative, 23 percent are moderate and 74 percent are liberal. Independents include 33 percent conservatives, 49 percent moderates and 18 percent liberals.

Our polls are doubly-weighted, to doubly insure the results are most accurate and not skewed, by both party identification and self-identified ideology. For instance, no matter how many Republicans answer our survey, they are weighted at 34.8 percent. If conservatives are over-represented among Republicans in the raw sample, they are still weighted at 89 percent of Republicans regardless.

The problem with this method is that neither the raw data (the current polls) nor a lot of the historical data (past years’ exit polls) has crosstabs showing how the votes of each partisan group break out by ideology. That is, for example, we have nearly no separate polling (certainly none on the polls Chambers is “unskewing”) showing how Romney is polling among independents who self-identify as moderates, or how Obama is polling among Democrats who self-identify as conservatives. That’s aside from the question of whether ideological self-ID is nearly as predictive a variable as party ID. Re-weighting the samples twice by these two separate variables, without access to those crosstabs, means you don’t really have any idea whether you are just adding a multiplier that double-counts your adjustments to the turnout model. It’s more alchemy than science.
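A small Python sketch shows why the missing crosstabs matter. The ideological targets for independents (33/49/18) come from the QStarNews description quoted above; everything else, including the ideological mix of the raw sample and both sets of crosstabs, is invented for illustration. Two different hidden crosstabs can produce the identical published number among independents, yet give different answers once you re-weight by ideology, so without the crosstabs the second re-weighting is a guess.

```python
# Hypothetical illustration: only target_mix comes from the quoted QStarNews text.

def weighted_support(ideology_mix, support_by_ideology):
    """Candidate support among independents, given an ideological mix of that group."""
    return sum(ideology_mix[k] * support_by_ideology[k] for k in ideology_mix)


# Made-up ideological mix of independents in a poll's raw sample.
sample_mix = {"conservative": 0.40, "moderate": 0.45, "liberal": 0.15}
# Target mix for independents, as stated in the quoted methodology.
target_mix = {"conservative": 0.33, "moderate": 0.49, "liberal": 0.18}

# Two made-up crosstabs (Romney support by ideology) that the poll never publishes.
hidden_crosstabs = {
    "crosstab A": {"conservative": 0.80, "moderate": 0.35, "liberal": 0.15},
    "crosstab B": {"conservative": 0.50, "moderate": 0.50, "liberal": 0.50},
}

results = {}
for label, crosstab in hidden_crosstabs.items():
    published = weighted_support(sample_mix, crosstab)   # what the poll reports
    reweighted = weighted_support(target_mix, crosstab)  # what re-weighting would yield
    results[label] = (published, reweighted)
    print(f"{label}: published {published:.1%}, re-weighted {reweighted:.1%}")
```

Both crosstabs report the same support among independents in the raw sample, but they diverge by several points after the ideological re-weighting, and an outside analyst has no way to tell which one is real.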

Conclusion

We can’t know until Election Day who is right. I stand by my view that Obama is losing independent voters decisively, because the national and state polls both support that thesis. I stand by my view that Republican turnout will be up significantly from recent-historic lows in 2008 in the key swing states (Ohio, Wisconsin, Colorado) and nationally, because the post-2008 elections, the party registration data, the early-voting and absentee-ballot numbers, and the Rasmussen and Gallup national party-ID surveys (both of which have solid track records) all point to this conclusion. I stand by my view that no countervailing evidence outside of poll samples shows a similar surge above 2008 levels in Democratic voter turnout, as would be needed to offset Romney’s advantage with independents and increased GOP voter turnout. And I stand by the view that a mechanical reading of polling averages is an inadequate basis to project an event unprecedented in American history: the re-election of a sitting president without a clear-cut victory in the national popular vote.

Perhaps, despite the paucity of evidence to the contrary, these assumptions are wrong. But if they are correct, no mathematical model can provide a convincing explanation of how Obama is going to win re-election. He remains toast.

• kentucky

The Ohio top lines are certainly off. If you take any of the Ohio polls and adjust early voter portion down to the number who had actually voted early at the time of the poll, and adjust up the portion yet to vote, Romney is at least tied and quite often leading.

People who insist that the state polls can’t all be wrong should consider that nearly all of them are verifiably off by 10% in their estimation of the number of voters who have already cast their ballot. They would rather believe their mathematical estimation of how many beans are in a jar than an objective count of the beans.

• http://www.bohnetlaw.com rightappeal

Here’s what is bugging me about the polling: if Romney is up big with independents and partisan split is within a few points, shouldn’t Romney be up by at least the 4-5% Gallup has been showing? If so, why is everyone else underestimating his support? Shouldn’t there be some outliers showing a 7-8 point lead nationally, with at least modest leads in Ohio and other midwestern swing states?

Maybe I’m just paranoid, but I remember expecting Bush landslides in 2000 and 2004.

• APA Guy

Romney IS up 4-5 points…or more…as Gallup is showing. These nonsense pollsters are showing the race as basically dead-even because they are taking their RV sample and narrowing down what they BELIEVE is a LV model that assumes a Dem turnout of +7-12 as opposed to Republicans. In other words, they are saying the 2008 model will repeat (and in some cases, be augmented) in 2012. We know this to be absolutely nonsensical based on Republican enthusiasm in the 2010 and 2011 elections…enthusiasm that has only gotten stronger.

• http://impudent.edublogs.org/ kyle8

I have not trusted any polls in many years and guess what? I am never disappointed.

• The_Rebel

Here comes the October surprise. Drudge has top line report saying “sex scandal to hit campaign…developing”. He doesn’t say which campaign, but I think we know which one it will be.

• congressworksforus

Dan, what a wonderful piece of objective analysis, unlike Silver’s which ramble on and on trying to cover all of his bases while providing “poll porn” to his loyal followers.

I have one question for you though on whether you think we could see a massive landslide on Nov 6th.

Consider that from a turnout perspective (less Dem enthusiasm, more Rep/Ind enthusiasm) adjusting the State polls to accommodate this generally shows a comfortable Romney victory. Does the addition of a far superior Republican ground game (which I have been witness to, and yes, it is impressive!) than ever seen before and as tested and proved during the Walker recall election, increase the chances of say, an even larger victory on Nov 6th?

• Kyle-MI

Do they unskew the exit polls according to the actual votes? I have never heard of anything like that, but I suspect they do a little adjustment. Exit polls are not going to perfectly reflect the actual vote totals, so it would make sense to do some adjustments. I am just curious as to how much has been done in the past. How close are the raw exit polls to the actual vote? Would this comparison provide some clue as to how well the polls have done?

Do they just do exit polls in person or are there any telephone exit polls? I have always assumed that exit polls were in person near the polling sites. I think a telephone exit poll would be less accurate, but it is more comparable to the regular polling that has been going on.

It would be interesting to see how many people claim to have voted compared to actual turnout and how that breaks down by party.

• timmcg

Wow. This is an AWESOME post.

As you say, we won’t know who is right until Tuesday, but I love the logic and reasoning you’ve done. You’ve addressed what’s been on my mind the last few weeks.

Thank you. Thank you. Thank you!

• wrenhal

There’s just no way that white voter turnout is going to be as low as 75% or even lower.

• wrenhal

Can’t wait for the details….

• deltawing

Don’t forget the job numbers on Friday. Most economists expect the unemployment rate to go up, even with massive holiday hiring.

• wrenhal

Sorry, but tl;dr I just don’t read these long poll posts.
I believe they are trying to skew the polls so that they can cry voter fraud on the Republicans when Romney wins by about 5% or more.

• APA Guy

Oh look…two trolls down-rated me. How can I possibly sleep tonight?

• http://www.baseballcrank.com Dan McLaughlin

In my most wildly optimistic moments, I see something like Bush-Dukakis in 88, but that’s a reach. I think Romney’s Electoral College ceiling is in the 330s or so. More likely, he levels out around 295.

• http://www.baseballcrank.com Dan McLaughlin

Exits usually, but not always, match up with actual votes. Josh Jordan noted, of some importance, that the 2008 Ohio exits were off and you have to adjust them to fit the actual outcome.

• dreb93

I just wanted to say that this is the single finest piece of analysis I have read this election season. Extraordinarily well done.

• DerKrieger

I realize that 2010 wasn’t a presidential election year, but wouldn’t the turnout that year be more reflective of this year than any other year?
I don’t think Republican enthusiasm has abated at all, nor has Democrat enthusiasm risen.
There is no reason for Democrats to come out en masse for Obama. Dem voters will have to be dragged to the polls to play prevent defense. No voter is as enthusiastic playing defense as they are playing offense.
I see 2010 happening again.

• barleycorn

Thanks a heap for dragging Matt Wieters into this. I once wasted a high pick on him in a keeper league based on the almighty PECOTA.

There is one aspect of polling that is extremely critical and that is the likely voter screen. I don’t recall the writer but in just the past couple days it was pointed out that in the Washington Post poll they showed something like 74% of adults and 85% of registered voters making it through their “likely voter” screen. That’s obviously absurd unless this election is massively different than all other presidential elections over the past 100 years or so.

• larbost

All I can say is, if you really believe Romney is going to win, then go to Intrade.com and bet on Romney. He is getting 2 to 1 odds against. A \$100 bet can return you \$300. Put your money where your mouth is.
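To make the arithmetic behind odds quoted as "2 to 1 against" concrete, here is a minimal sketch. It takes the comment's 2-to-1 figure at face value and does not model how Intrade actually priced its contracts; the function names `implied_probability` and `payout` are hypothetical helpers invented for this illustration.

```python
def implied_probability(odds_against):
    """Win probability implied by odds of `odds_against` to 1 against."""
    return 1.0 / (odds_against + 1.0)


def payout(stake, odds_against):
    """Return (profit, total_returned) for a winning bet at the given odds."""
    profit = stake * odds_against
    return profit, stake + profit


# Assumption: the 2-to-1 figure is taken directly from the comment above.
print(implied_probability(2))
print(payout(100, 2))
```

Under these assumptions a winning \$100 stake yields \$200 in profit, or \$300 returned in total, and the odds imply roughly a one-in-three chance of a Romney win.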

• Bill S

• runner12

Excellent, excellent explanations for all of this wacky polling and the differences in methodologies. Thank you for being both thorough and using language that non-statisticians can understand clearly. It was informative and I learned a lot.

The only thing I disagree with you on is Nate Silver, respectfully, of course. I do think he is a partisan hack and would be perfectly willing to risk his reputation because he knows the elitist Left will defend him no matter what. He wants to continue to be invited to all of those cocktail parties, after all. I do appreciate you pointing out that his claim to fame was based on one election, and quite frankly not one that was difficult to predict.

• larbost

Dan are you going to try and make a killing on Intrade or is that beneath you?

• RealQuiet

One of the best comprehensive writings I have seen this election and quite possibly ever on a very complex science. Very well done.

• MoeLane

It wasn’t any cleverer the fourth time you wrote it, spambot. Shoo.

So, trolling. Noted.

Extremely well done, Dan.

• Kyle-MI

Moe, can you take care of “Mr. Science” above, too? The links are absolutely vile.

• MoeLane

Thank you for establishing that people who rely on Nate Silver to keep from panicking about this election find posts like these disquieting.

• tlhoward

I feel that Ohio voters are so fed up with the ad bombardment that they are the most unpredictable.

• stillmisstpaw

Speaking of polls, there’s one thing I think we should all steel ourselves for. Repeat after me, “The exit polls will look awful. The exit polls will look awful.” We also need to be prepared for the fact that for any given size of lead, we’ll see the networks MUCH quicker to call states for Obama than for Romney.

None of that will keep anyone on this site from going to the polls in the late afternoon if that was the time he/she had planned to vote. But, with marginal voters it could make a difference. (Think Central Time Zone voters in Florida who didn’t vote in 2000 because the state had been called for Gore — AFTER the Republicans had pointed out to the networks, days in advance, that the polls would still be open in the Panhandle after they had closed elsewhere in the state.)

Spread the word — don’t allow yourself to be discouraged by ridiculous polls (like National Journal’s “5-point lead for Obama nationally” poll that just came out), exit polls (which CONSISTENTLY look terrible), and early calls for Obama vs. very slow calls for Romney.

• agooglyminotaur

Can’t the mods show this man to the door for spamming the page?

• agooglyminotaur

Subtle.

• crusty

I saw that Rasmussen interview as well, and I too was put off by it. But I take comfort in the realization that, judging from other interviews, Rasmussen has always been cautious, never one to make bold statements. He is right about losing Virginia: it could be a fatal error.

• azaeroprof

Excellent article, Dan! As an amateur sabermetrician and engineer who loves math, I love this kind of talk. I don’t know if you are aware of this site or not, but I’ve become a fan of the analysis at Washington Dispatch. They take an average of the internals of a series of polls (both in party ID and in preference by party), then apply several different turnout models to this average set of data. Their bottom line is based on a party ID breakdown that is an average of the ’04/’06/’08/’10 exit polling, but they give the results for several other options so you can compare.
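The approach described here, averaging within-party preferences and then applying different turnout models, can be sketched roughly as follows. This is only an illustration of the general reweighting idea, not Washington Dispatch's actual method; the function name `reweight_by_party` is hypothetical, and every number below is made up for the example rather than taken from any real poll.

```python
def reweight_by_party(preference_by_party, turnout_model):
    """Project a topline from within-party preferences and an assumed party mix.

    preference_by_party: {party: {candidate: share within that party}}
    turnout_model: {party: share of the electorate}, expected to sum to 1
    """
    if abs(sum(turnout_model.values()) - 1.0) > 1e-9:
        raise ValueError("turnout shares must sum to 1")
    topline = {}
    for party, weight in turnout_model.items():
        for candidate, share in preference_by_party[party].items():
            topline[candidate] = topline.get(candidate, 0.0) + weight * share
    return topline


# Hypothetical averaged poll internals (illustrative numbers only).
prefs = {
    "D": {"Obama": 0.92, "Romney": 0.06},
    "R": {"Obama": 0.05, "Romney": 0.93},
    "I": {"Obama": 0.43, "Romney": 0.50},
}

# Two hypothetical turnout models: a six-point Democratic edge and an even split.
models = {
    "D+6": {"D": 0.38, "R": 0.32, "I": 0.30},
    "even": {"D": 0.35, "R": 0.35, "I": 0.30},
}

for name, model in models.items():
    result = reweight_by_party(prefs, model)
    print(name, {cand: round(share, 3) for cand, share in result.items()})
```

With these invented inputs, the same within-party preferences give Obama a lead of about two points under the D+6 mix and Romney a lead of nearly three points under the even mix, which is why the choice of turnout model matters so much to the bottom line.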

• independentandproud

everyone mentions nate silver at 538, but there is also prof. wang @ the princeton election consortium predicting 303 electoral college votes for obama, with a 93% chance of winning reelection.
for those of you who don’t know prof. wang, he came within one vote of correctly predicting the EC vote in 2008.
but like dan says in the article, we won’t know until election day who is right. i propose that whoever is wrong, should be banned for life from writing another article.

• crusty

Speaking of polls, did anyone see today’s FOX poll, where they report 46-46? Nowhere do they report the partisan breakdown. The numbers they cite in the text don’t lead me to conclude it is a tie. How can they claim independents give the edge to Romney by seven percentage points (46-39 percent), then say that among the subgroup of most interested voters, those who are “extremely” interested in the election, Romney leads Obama by 53-42 percent? They go further, claiming that voters who say the economy will be most important in their decision back Romney by 50-43 percent, and fiscal-issue voters back him by a similar 50-41 percent.

Nowhere are there any positive numbers for Obama. What am I missing here?

• MoeLane

You’re not a member of this community, independentandproud, so you don’t get a vote. Don’t ever presume to speak for us again.

• independentandproud

mr. lane, i would never presume to speak for you. the “we” i use is for all americans, be they red or blue.

• MoeLane

You know, at some point I’m sure that somebody rushing over here to defend Silver will actually elevate his game up beyond Appeal To Authority and/or Shut Up, He Explained.

• MoeLane

Don’t insult my intelligence, either. Do you understand? Indicate your understanding in your next comment here, and the clock is ticking. You aren’t particularly worth staying up for.

• cbartlett

Speaking of the evangelical vote: my husband was channel surfing last weekend and landed on some random show on BET that was documenting black southern preachers’ views on this election. Apparently many of them are having a hard time recommending a candidate to their congregations this time. They do not like the pro-choice and pro-gay stance of Obama and also think he is pushing blacks into poverty, BUT they are also very concerned about endorsing “a Mormon” because they consider it to be a cult. There were some implications that this group of voters might be staying home. That could potentially be a large number of usually conservative voters not voting at all. Just hadn’t heard this particular issue discussed before.

• independentandproud

Mr. Lane, have I offended you in some way? my post mentions only prof. wang and finding out whose prediction will be accurate come election day.

• MoeLane

Yeah, people coming on our site to try to get one of our writers blacklisted always offends me. Fortunately, I have methods of dealing with that, as you are discovering now. Bye!

…And let that be a lesson for the rest of you. When I (or another site moderator) tell you to do something, DO it. That’s why we have the ‘mod’ icon; to give fair warning.

• dpmaine

This is a really thoughtful piece, thanks for writing it. I have always held more or less the same opinion of Mr. Silver as you posit, which is that he is basically in over his head but not going out of his way to cook the books, so to speak.

Partisan split is the real deciding factor today in poll accuracy. One aspect that has often left me confused is how to suss out when pollsters assign someone a party affiliation other than what they claim; namely, there are a lot more claimed “independents” than actual “swing voters”. There are independents who always vote Democrat, and there are those who always vote Republican.

So in that regard, party identification is not all that useful, because when you press people and try to assign them to a party based on prior voting habits, there is a lot less uncertainty. (A voter who has always voted GOP on the Pres. election is a Republican, whether or not they self identify that way).

Dan, or anyone, is there any source documenting the weighting of democrats, republicans, independents, along with leaners, down to the question level? I

• uneasyrider

I really like Silver’s analysis, but, to borrow from baseball, this election could be his #6org moment. The #6org meme was popular a few years ago when one of my favorite (and one of the best) baseball analysts predicted that his home team, the Seattle Mariners, would be the 6th best team in the league. The Mariners ended up being terrible that year. Cameron’s argument for this rating was very well reasoned and he defended his prediction confidently. However, Cameron’s initial assumptions were flawed. He incorrectly weighed both the reliability and repeatability of defensive metrics and failed to account for all essential variables, such as roster depth or the lack thereof. Perhaps he was blinded a bit by homerism, as well. I guess what I’m trying to get to is that Silver’s underlying assumptions may well prove to be true. He can soundly argue all of his beliefs, but certainly the objections raised here are valid. Silver does, by necessity, make a number of assumptions in his models and, while these are in good faith, their veracity can be argued. They are based on historical trends (in the accuracy of polling data), and historical trends can be bucked. As Dan argues here, there is underlying evidence that they very well may be in this election, and this could be Silver’s #6org moment.

• gscandlen

I am skeptical of all polls. Look at the exit polls in Wisconsin (and Kerry/Bush in 2004, for that matter). These polls didn’t have the problem of estimating who will vote and who will not. Everyone they talked to was an actual voter. But they still were way off. Either they mis-weighted the precincts or people lied to them. Either way, it is an essentially human endeavor, and all the computer models in the world can’t change that.

• congressworksforus

Not necessarily; there’s somewhat of an assumption that someone who agrees to even take the poll is more likely to be a voter in the first place.

The real joke was the one a few weeks back (Pew?) that had 97% of the respondents as likely voters because “they told us they would vote” (or some nonsense to that effect…).

• congressworksforus

Bill… not so sure it was imbecilic. Either these pollsters are incompetent, or they are crooked…

• congressworksforus

Thanks. I saw a piece last night (forget where) that said Obama was on track to lose Independents by double digits. The last three to lose by that margin were Dukakis, Mondale and Carter. (Even McCain didn’t lose that badly…)

• tngal

Next time I buy lotto tickets, I’ll get my numbers from you Dan. After reading your in-depth explanation, I just don’t trust that Silver guy to help fund my retirement.

• milehighcon

I wonder if some of the independents are former Republicans? I personally know of many conservatives that self-identify as Independent now instead of Republican. They are Conservative first and vote libertarian or constitution or whatever in the primary, but they will vote for Romney.

If we assume all independents are swing-voter moderates, then I think we are making some of the mistakes that we’re attempting to correct. I personally think this will be a close election, but I trust Gallup to have the right numbers. It all comes down to turnout I think!

Right now Gallup says it’s 48% Obama, 47% Romney. It’s too close to call so make sure you vote!

Edit: I was accidentally looking at “registered voters.” Likely voters have Romney at 51% to Obama’s 46%, but it’s still close enough that every vote will matter!

• eauc

For a technical comparison of these two sites, http://electionanalytics.cs.illinois.edu and 538.com, visit http://punkrockor.wordpress.com/2012/10/01/forecasting-the-presidential-election-using-regression-simulation-or-dynamic-programming/

• mkr76

I can’t bear the thought of having to look at Barry Hussain Obama’s smug face for another four years, however after spending several years as a gaming analyst in Vegas, the current Intrade spread has me very concerned. Their call at 66% this late in the game would appear to be insurmountable. Barring some last minute surprise I just don’t see how Romney pulls this off. I trust markets more than pollsters. I pray that I am wrong.

• streiff

nice troll.
