Forecasting elections is becoming more sophisticated, thanks in part to methods developed by Erin Hartman (BS ’07).
The pollsters are back.
With the 2016 presidential race fully underway, candidates are crisscrossing the country to shake hands with the common folk, eat pie, and take their case to the American public. Just where they stump, however, is increasingly determined by armies of statisticians, continually checking the pulse of the electorate. And this time, many of them are building upon methods introduced by Erin Hartman (BS ’07) back in 2012.
During the last presidential election cycle, Hartman served as an analyst on Obama’s campaign, where she was tasked with studying data from its massive polling operation, which was conducting more than 30,000 telephone interviews per week. Hartman’s team needed to quickly sift that data to predict which voters would turn out, how they would vote, and who was capable of being persuaded. Their analysis was then used to deploy armies of volunteers.
Hartman soon realized that the models could be more accurate. “I was completely new to the process. So I didn’t have any preconceptions about how things should be done,” she said. Hartman credits her broad-based analytical training at Caltech, which set her apart from other operatives with more traditional political campaign experience, for allowing her to spot an opportunity.
Hartman started with the observation that randomly conducted polls end up being not that random, after all.
“Usually pollsters will call 1,000 participants within a region as a sample. But it turns out not everyone likes to talk about politics when you call them at dinner,” Hartman said. Some demographics, particularly younger generations, are disaffected, discouraged, or not interested—and they just hang up the phone. “In so doing, these groups systematically remove themselves from the polling.”
And the trend is getting worse. Whereas 50 years ago, nearly half of those called might participate, today it’s down to just 5 percent. “So the question is, who are these 5 percent? Are they truly representative of the region’s voters?” Hartman asked. “It turns out they’re not.”
Imagine a fictional city in which half of the active voting population is over the age of 50, while those under 25 make up 10 percent. When it comes time to poll, however, the numbers shift: The Baby Boomers jump up to 75 percent of responses and Millennials fall to just 2 percent.
How do you predict the intentions of someone who doesn’t participate in a poll? After all, just because someone hangs up the phone on a pollster doesn’t mean he or she won’t actually vote.
While attempting to build models for Obama’s team, Hartman realized that if she knew the population’s actual voting habits, she could rebalance the poll. As it turned out, by 2012 new and publicly available voting data offered a wealth of just such information. Hartman was able to synthesize detailed demographics and voting histories and apply them to her models. In the case of our fictional city, Hartman could now reweight the younger group’s 2 percent of responses so that it counted for 10 percent of the total, matching that group’s true share of voters. The result: a more accurate prediction.
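The arithmetic behind that rebalancing can be sketched in a few lines of Python. This is only an illustration of generic reweighting by group share, not Hartman’s actual model, which the article does not describe in detail. The over-50 and under-25 shares come from the fictional city above; the middle age group is simply the remainder, and the support figures for a hypothetical candidate are invented for the example.

```python
# Shares of active voters and of poll responses in the fictional city.
# The "25_to_50" group is an assumption: it is whatever remains after the
# two groups named in the text.
population_share = {"over_50": 0.50, "25_to_50": 0.40, "under_25": 0.10}
sample_share = {"over_50": 0.75, "25_to_50": 0.23, "under_25": 0.02}

# Hypothetical support for one candidate within each group (illustrative only).
support = {"over_50": 0.40, "25_to_50": 0.55, "under_25": 0.70}


def group_weights(population, sample):
    """Weight each group by (true share of voters) / (share of poll responses)."""
    return {group: population[group] / sample[group] for group in population}


def estimate(support_by_group, sample, weights=None):
    """Average support across responses, optionally reweighted by group."""
    if weights is None:
        weights = {group: 1.0 for group in sample}
    total = sum(sample[g] * weights[g] for g in sample)
    weighted = sum(sample[g] * weights[g] * support_by_group[g] for g in sample)
    return weighted / total


weights = group_weights(population_share, sample_share)
raw = estimate(support, sample_share)
rebalanced = estimate(support, sample_share, weights)

print(f"under-25 weight: {weights['under_25']:.2f}")
print(f"raw poll estimate: {raw:.1%}")
print(f"rebalanced estimate: {rebalanced:.1%}")
```

With these made-up support figures, each under-25 response counts five times over, each over-50 response counts for about two-thirds, and the estimate moves from roughly 44 percent to 49 percent.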
How much more accurate? According to Andrew Claster, then the deputy chief analytics officer for the Obama campaign and now a consultant, it was a game changer. “At one point, we saw the Romney campaign dramatically increase their investment in Michigan and Minnesota. Clearly, they thought the states were in play. We wondered ‘What are they seeing?’” he recalled. “But our polling showed a comfortable lead, so we didn’t feel the need to counter.”
Obama won both states by more than 8 percentage points. Claster added, wryly, “Obviously, their models were flawed.”
Following the 2012 election, Hartman and several partners formed a consulting group, BlueLabs, that has since advised a number of Democratic campaigns, including those of New Jersey Senator Cory Booker and Virginia Governor Terry McAuliffe.
This year, the presidential campaigns have taken note. According to Matthew Holleque, who cofounded BlueLabs with Hartman, many of the current candidates employ operations with similarly sophisticated practices. “Polling is really about efficiency,” Holleque said. “Erin took a complex mathematical problem and found a solution that makes campaigns much more efficient.”
Claster put it more bluntly: “I believe Erin’s work resulted in the most significant improvement in public-opinion survey methodology in more than 30 years.” Within campaign circles, Hartman has received numerous accolades, including being named this year to the Influencers 50 list by Campaigns & Elections.
But while the 2016 campaigns may benefit from Hartman’s methods, they will do so without Hartman, who has returned to academia. Currently finishing postdoctoral work at Princeton, she will join UCLA as an associate professor in statistics and political science next year. She wants to go beyond observing and predicting how voters behave, she says. Now she wants to know why.
“Erin was an outstanding student who perhaps fell into social science,” said Michael Alvarez, professor of political science at Caltech. “She accomplished important work in the political sphere and shows the same rigor and inventiveness as a social scientist.”
Asked to predict the 2016 election, Hartman just smiled. “I look forward to finding out along with everyone else.”