There seems to be a fair bit of pollster-bashing of late. This is not surprising, given that the Brexit referendum was expected to go to Remain, and Clinton was expected to win the US presidential vote.
The naive criticism is that the pollsters got it wrong. I'm calling this naive because, in the case of Brexit, the polls were saying the vote would be close, not that Remain was a shoo-in. As late as June, some polls were saying that Leave would win. The Economist had the vote tied. If some polls say Remain will win, others say Leave will win, and everything is within the margin of error, then there's no sure bet. It's easy to overlook that the margin of error also needs to be reported. A poll that says option A is preferred by 48% and option B by 47% will predict option A. But suppose the margin of error is 2%: the two estimates overlap, and we should have far less confidence in that outcome than in a poll that reports, say, A at 60% and B at 38%.
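To make that overlap concrete, here's a minimal sketch, assuming a hypothetical poll of 1,000 respondents and the usual normal approximation for a sampled proportion:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sampled proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll of 1,000 respondents: A on 48%, B on 47%.
n = 1000
moe_a = margin_of_error(0.48, n)
moe_b = margin_of_error(0.47, n)
print(f"A: 48% +/- {moe_a:.1%}, B: 47% +/- {moe_b:.1%}")
# The two intervals overlap heavily, so the 1-point lead tells us very little.
```

With a sample this size the margin of error is around 3 points, so a 48/47 split is statistical noise; a 60/38 split is not.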
In short, we are very good at reporting poll predictions, but we don't emphasise the uncertainty attached to them. Several things increase that uncertainty: volatile polls showing wide swings in support, a lot of disagreement between polls, and a large number of undecideds. These are all signals that we need to be more cautious when interpreting polls.
Even in the case of Clinton versus Trump, the polls got a fundamental point right: Clinton was more popular. She won more of the popular vote. The margin we're seeing at the moment is around 1.2%. That would normally be large enough to win an election, but for the quirks of the US electoral system. Each state gets a fixed number of votes in the electoral college, and the electoral college elects the president.
Each state's electoral votes aren't allocated in fixed proportion to population, so densely populated states like California or New York get relatively fewer votes per capita than, say, Wyoming. A Democrat voter is not equal everywhere, and neither is a Republican. This is why forecasters like 538 were more bullish about Trump's chances (giving him roughly a 25% chance in the last month of the election). Clinton was much weaker in the electoral college, where her popularity in, say, California carried less weight.
Many of the tipping-point states (Florida, Pennsylvania, North Carolina) were also predicted to vote for Clinton, but the margins were very thin, within margins of error. Given the uncertainty of this election (a high percentage of undecideds, volatile polls), we should have expected some of these to flip both toward and against Clinton. They weren't a sure bet. In the end, if we look at a state like Pennsylvania (with 20 electoral college votes), nearly 6m people voted, and Trump was a mere 68,000 votes ahead. The margins here are small. That's mostly what the polls were saying.
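As a quick back-of-the-envelope check, using the rough figures above, the Pennsylvania margin sits well inside a typical poll's margin of error:

```python
# Rough figures from the post: ~6m votes cast in Pennsylvania, Trump ~68,000 ahead.
total_votes = 6_000_000
trump_lead = 68_000
margin = trump_lead / total_votes
print(f"Winning margin: {margin:.1%}")  # about 1.1%
```

A margin around 1.1% is smaller than the roughly 3-point margin of error of a typical 1,000-person state poll, so the polls simply couldn't resolve the winner.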
Nonetheless, there was a big polling failure at the state level. Instead of the close states flipping almost randomly, the traffic was largely one way: the errors were biased toward Trump, especially in the battleground states of Florida, Pennsylvania, Michigan and so on. Some of this may be down to polling methods. US polling companies aim to sample a group that is representative of the general voting public. With more registered Democrat voters than Republican, polls try to capture more Democrats in their samples. This introduces a potential bias if fewer Democrats actually turn out: the representative sample is no longer representative, and a lower Democrat participation rate means a shift in actual voting toward Republican candidates. In other words, the polling companies didn't screen their samples well enough to be representative of actual voters.
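A toy illustration of the turnout problem (all registration shares and turnout rates below are made up for illustration, not real data): a sample that mirrors registered voters can still misestimate actual voters if turnout differs by party.

```python
# Hypothetical registration shares and turnout rates -- illustrative only.
registered = {"D": 0.52, "R": 0.48}   # share of registered voters
turnout    = {"D": 0.55, "R": 0.62}   # fraction of each group that actually votes

votes = {party: share * turnout[party] for party, share in registered.items()}
total = sum(votes.values())
for party, v in votes.items():
    print(f"{party}: {v / total:.1%} of actual voters")
# A sample representative of registrations (52% D) suggests a D lead,
# but the electorate that actually shows up is majority R.
```

With these made-up numbers a "representative" sample overstates the Democrat share of actual voters by about three points, which is exactly the scale of error the battleground-state polls showed.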
This may carry forward, perhaps in the rust-belt states that flipped to Trump, to voters without previous voting experience. By trying to poll people who have voted in previous elections, pollsters omit those who participate infrequently. This is a hidden group of voters, of potentially significant size. With only roughly half the eligible US population voting, there are probably enough of them to swing a few percentage points if they participate. Normally these non-voters may not make a difference, but with two polarizing candidates, Trump may have tapped into this group. Pollsters didn't identify them, and that's another significant polling failure.
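To get a feel for the size of this hidden group, here is a rough sketch with order-of-magnitude assumptions (the eligible-population figure, usual turnout rate, extra-turnout fraction, and 2:1 split are all assumptions for illustration):

```python
# Order-of-magnitude assumptions: ~230m eligible voters, ~55% usually vote.
eligible = 230_000_000
usual_voters = int(eligible * 0.55)
non_voters = eligible - usual_voters

# Suppose just 4% of usual non-voters turn out, breaking 2:1 for one candidate.
new_voters = int(non_voters * 0.04)
net_swing = new_voters * (2 / 3 - 1 / 3)   # net extra votes for that candidate
total_votes = usual_voters + new_voters
print(f"Shift in the margin: {net_swing / total_votes:.1%}")
```

Even a small sliver of this unpolled group turning out lopsidedly moves the margin by around a point, which is enough to flip states decided by tens of thousands of votes.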
Another potential problem is that polls may influence voting behaviour. Some of this has been reported anecdotally in the UK following Brexit: voters opted for Leave because they expected Remain to win, then experienced remorse. If Clinton's reported poll margins before the election prompted more Democrats to stay home, or to vote for third-party candidates in protest, then crucial margins would be pulled back.
So basically, the US polling companies failed to screen for confounding factors at the state level, which, given the US electoral system, isn't an easy task. In the more general sense, they correctly predicted that Clinton would end up with more of the popular vote, not that that matters much for determining who the president will be. And the media perhaps needs to emphasise the uncertainty of polling data much better, rather than reporting the margin of error as some kind of footnote.