Pollster.com

Brian Schaffner

 

Polling Registered vs. Likely Voters: 2004

As Pollster.com readers have no doubt noticed, there has been much discussion in the posts and the comments here about the merits of polling registered voters (RV) versus likely voters (LV). Mark and Charles have been debating this point in their most recent exchanges about whether it is better to include LV or RV results in the Pollster.com poll averages. Charles's last post on this topic raised the following question:

"There is a valid empirical question still open. Do LV samples more accurately predict election outcomes than do RV samples?"

Ideally, I'd have time to go back over 30 or more years of polling to weigh in on this question. Instead, I thought I'd go back to 2004 and get a sense of how well RV versus LV samples predicted the final outcome. To do this, I used the results from the final national surveys conducted by eight major survey organizations. For each of these eight polls (nearly all of which were conducted during the last three days of October), I tracked down the Bush margin among both RVs and among LVs. The figure below demonstrates the difference in the Bush margin for the LV subset relative to the RV sample from the same survey.

[Figure: Difference in Bush's margin among LVs relative to RVs, by survey (reg_likely_2004_1.PNG)]

In four of the eight polls, LV screens increased Bush's margin, including three surveys (Gallup, Pew, and Newsweek) where Bush did 4 points better among LVs than he did among RVs. But using an LV screen did not always help Bush. In three polls (CBS/New York Times, Los Angeles Times, and Fox News), his margin remained the same, and in the Time poll (which was conducted about a week earlier than the other surveys) Bush actually did 2 points worse among LVs.

Of course, this doesn't really tell us which method was more accurate in predicting the general election outcome, just which candidate benefited more from the LV screens. To answer which was more accurate, we can plot each poll's Bush margin among both RVs and LVs to see which came closest to the 2.4-point margin by which Bush won the popular vote. This information is presented in the figure below, which includes a dot for each survey along with red lines indicating the actual Bush margin.

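[Figure: Each survey's Bush margin among RVs and among LVs, with red lines marking the actual Bush margin (reg_likely_2004_2.PNG)]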

Presumably, the best place to be in this plot is where the red lines meet. That would mean that both your RV and LV margins came closest to predicting the eventual outcomes. But, if you are going to be closer to one line over the other, you'd rather be close to the vertical line than the horizontal line. This means that the polling organization's LV screen helped them improve their final prediction over just looking at RVs. If the opposite is true (an organization is closer to the horizontal line than they are to the vertical line), their LV screen actually reduced their predictive accuracy.

The CBS/New York Times poll predicted a 3 point Bush margin for both its RV and LV samples, meaning it was just 6/10ths of a point off regardless of whether it employed its LV screen. Four organizations (Pew, Gallup, ABC/Washington Post, and Time) increased the accuracy of their predictions by employing LV screens, coming closer to the vertical line than to the horizontal line. Gallup's LV screen appeared to be most successful, since it brought them closest to the actual result (predicting a 2 point victory for Bush despite the fact that their RV sample showed a 2 point advantage for Kerry).

On average, the RV samples for these eight polls predicted a 0.875 point Bush advantage while the LV samples predicted a 2.25 point advantage for Bush, remarkably close to the actual result. Of course, this is just one election, but it does appear as though likely voter samples did a better job of predicting the result in 2004 than registered voter samples. On the other hand, this analysis reinforces some other concerns about LV screens, the most important of which is that some LV screens created as much as a 4 point difference in an organization's predictions while in three cases LV screens produced no difference at all. It is also important to note that these are LV screens employed at the end of a campaign, not in the middle of the summer, when it is presumably more difficult to distinguish LVs. Ultimately, the debate over LV screens is an important one, and the 2008 campaign may very well provide the biggest challenge yet to pollsters trying to model likely voters.
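For readers who want to check the arithmetic, here is a minimal sketch in Python that compares the figures quoted in this post against Bush's 2.4 point popular-vote margin. It uses only the numbers stated above (the CBS/New York Times and Gallup margins and the two eight-poll averages); the names `quoted` and `abs_error` are just illustrative choices, and the other polls are left out because their individual margins are not given in the text.

```python
ACTUAL_BUSH_MARGIN = 2.4  # Bush's popular-vote margin in points, as stated in the post

# (RV margin, LV margin) in points; positive means Bush ahead.
# Only figures quoted in the post are included here.
quoted = {
    "CBS/New York Times": (3.0, 3.0),
    "Gallup": (-2.0, 2.0),
    "Eight-poll average": (0.875, 2.25),
}


def abs_error(margin, actual=ACTUAL_BUSH_MARGIN):
    """Absolute distance, in points, between a predicted and the actual margin."""
    return abs(margin - actual)


# Map each entry to its (RV error, LV error) pair.
errors = {
    name: (abs_error(rv), abs_error(lv)) for name, (rv, lv) in quoted.items()
}

for name, (rv_err, lv_err) in errors.items():
    print(f"{name}: RV off by {rv_err:.3f}, LV off by {lv_err:.3f}")
```

By this measure the average RV margin misses by about 1.5 points while the average LV margin misses by 0.15 points, which is the gap the paragraph above describes.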

By Brian Schaffner on August 13, 2008 4:42 PM

"Statistical Dead Heat?" Depends on Which Statistics You Use

"We have a race that by every measure of every poll is a statistical dead heat. McCain's not supposed to be in this thing, and Obama's supposed to be blowing everybody away and it just isn't happening, at least to this point."

Lou Dobbs (July 17th, Lou Dobbs Tonight)

If you have paid any attention to the news in the past month, you have had a hard time avoiding some journalist or pundit noting that the presidential race is currently a "statistical dead heat" or "essentially tied." The news media, of course, love to cover the horserace aspects of the campaign, particularly in a way that emphasizes how close the election is. But when you step back and gain a little perspective on the big picture, you realize that this race isn't quite the dead heat that it is made out to be.

The news media are often a bit myopic in their view of the contest, extrapolating too much from the most recent poll (or even the most recent "surprising" poll). Last week, Fox News released a national survey that showed Obama holding a 41-40% lead, well within the margin of error for the survey. Commentators were quick to emphasize this result and note that the candidates were essentially running neck-and-neck or that the race may even be tied. No doubt there will be a lot of commotion over the latest Gallup/USA Today survey showing McCain ahead (though also within the margin of error) among likely voters. Nevertheless, we gain much better perspective on the state of the race when we look at all available data.

Alan Abramowitz notes that Obama has consistently led in national polls over the past two months. In fact, according to national poll results listed on Pollster.com, Obama had been tied or ahead in 50 consecutive national polls through Sunday. Sure, many polls may show Obama holding a lead within the statistical margin of error, but if Obama and McCain were actually tied, we'd expect roughly as many polls showing McCain ahead as showing Obama ahead. Based on some basic calculations, the probability that 50 consecutive national surveys would show Obama tied or ahead if the candidates were actually tied is 0.0000000000000009. In short, this race is not a "statistical tie," despite what a few scattered surveys (drawing disproportionate attention from the pundits) indicate.
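The post does not spell out its "basic calculations," but the figure is consistent with a simple coin-flip model, sketched below in Python. The assumptions are mine, not stated in the post: each poll is treated as independent, and in a truly tied race each poll is given a 50% chance of showing Obama tied or ahead. The function name `streak_probability` is hypothetical.

```python
def streak_probability(n_polls, p_not_behind=0.5):
    """Probability that all n_polls independent polls show a candidate
    tied or ahead, if each does so with probability p_not_behind.

    The 0.5 default is an assumed coin-flip model of a truly tied race.
    """
    return p_not_behind ** n_polls


prob = streak_probability(50)
print(f"{prob:.1e}")  # 8.9e-16
```

Real polls are not fully independent, and counting exact ties in Obama's favor would push the per-poll probability somewhat above 0.5, so this is best read as an order-of-magnitude illustration, not a precise estimate.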

By Brian Schaffner on July 29, 2008 2:45 PM
