gelman and king - why are american presidential election campaign polls so variables when votes are...


  • 7/29/2019 Gelman and King - Why Are American Presidential Election Campaign Polls So Variables When Votes Are So Predict

    1/43

    B.J.Pol.S. 23, 409-451    Printed in Great Britain

    Copyright © 1993 Cambridge University Press

    Why Are American Presidential Election Campaign Polls So Variable When Votes Are So Predictable?

    ANDREW GELMAN AND GARY KING*

    As most political scientists know, the outcome of the American presidential election can be predicted within a few percentage points (in the popular vote), based on information available months before the election. Thus, the general campaign for president seems irrelevant to the outcome (except in very close elections), despite all the media coverage of campaign strategy. However, it is also well known that the pre-election opinion polls can vary wildly over the campaign, and this variation is generally attributed to events in the campaign. How can campaign events affect people's opinions on whom they plan to vote for, and yet not affect the outcome of the election? For that matter, why do voters consistently increase their support for a candidate during his nominating convention, even though the conventions are almost entirely predictable events whose effects can be rationally forecast?

    In this exploratory study, we consider several intuitively appealing, but ultimately wrong, resolutions to this puzzle and discuss our current understanding of what causes opinion polls to fluctuate while reaching a predictable outcome. Our evidence is based on graphical presentation and analysis of over 67,000 individual-level responses from forty-nine commercial polls during the 1988 campaign and many other aggregate poll results from the 1952-92 campaigns.

    We show that responses to pollsters during the campaign are not generally informed or even, in a sense we describe, 'rational'. In contrast, voters decide, based on their enlightened preferences, as formed by the information they have learned during the campaign, as well as basic political cues such as ideology and party identification, which candidate to support eventually. We cannot prove this conclusion, but we do show that it is consistent with the aggregate forecasts and individual-level opinion poll responses. Based on the enlightened preferences hypothesis, we conclude that the news media have an important effect on the outcome of presidential elections - not through misleading advertisements, sound bites, or spin doctors, but rather by conveying candidates' positions on important issues.

    * Gelman, Department of Statistics, University of California, Berkeley; King, Department of Government, Harvard University. We thank Eric Oliver and Maggie Trevor for research assistance, and Larry Bartels, Neal Beck, Tom Belin, Mo Fiorina, John Kessel, Mik Laver, Eileen McDonaugh, Phil Paolino, Keith Poole, Doug Price, Phillip Price, Sid Verba and D. Stephen Voss for helpful comments, and the National Science Foundation for a research grant. All graphs were made using the S system. This is a revised version of a paper which received the Pi Sigma Alpha award for the best paper at the annual meeting of the Midwest Political Science Association, Chicago, 1992.

    Something is amiss in the scholarly study of American presidential elections. For some time now, political scientists have forecast the outcome of presidential elections accurately using only information available before the start of the general election campaign. However, the numerous 'trial heat' public opinion surveys (polls about whether likely voters plan to cast their ballots for the Democratic or Republican candidate for president) conducted during the campaign vary enormously in support for the Democratic and Republican


    candidates. At one point during the 1988 general election campaign, survey respondents favoured Dukakis over Bush by 17 percentage points, and yet any reasonable application of the political science literature would have made George Bush almost certain to win the November election.

    In addition to being interesting in its own right, the puzzle stated in the title of this article is important for three related reasons. First, given our profession's heavy reliance on public opinion surveys for studying presidential elections and numerous other phenomena, the puzzle represents a large void in our substantive understanding and possibly also a very serious methodological problem for much existing research outside this area. The existence of the puzzle means that we cannot rely on answers to at least some survey questions. What political science obviously needs is a very clear broader theory of the survey response, so that we can decide which questions contain directly useful information. Although there has been much interesting work on the subject, we certainly have no fully satisfactory theory yet.¹ This is not a problem we solve in this article, but any resolution of the more general problem must also account for our puzzle.

    A second reason for studying this subject is its potential contribution to what political philosophers have called 'the epistemological problem of interests: how we can know what they are.'² Dahl defines 'interests' by appealing to the concept of enlightened understanding: 'A person's interest or good is whatever that person would choose with fullest attainable understanding of the experience resulting from that choice and its most relevant alternatives.' He and others have asked, 'What processes or institutions can best be counted on to protect these interests?' We have no final answer to this question, but the issues we address and evidence we provide may help to focus the question more precisely.

    ¹ Christopher Achen, 'Mass Political Attitudes and the Survey Response', American Political Science Review, 69 (1975), 1218-23; Stanley Feldman, 'What Do Survey Questions Really Measure?' Political Methodologist, 4 (1991), 8-12; T. Piazza, Paul Sniderman and Phillip Tetlock, 'Analysis of the Dynamics of Political Reasoning: A General-Purpose Computer Assisted Methodology', Political Analysis, 1 (1989), 99-120.

    ² Robert Dahl, Democracy and Its Critics (New Haven, Conn.: Yale University Press, 1989), p. 181.

    Finally, the puzzle has a practical consequence, since mainstream journalists respond to it largely by ignoring the lessons of political science and instead interpreting each short-term change in the public opinion polls as a serious change in the likely fortunes of the candidates. This focus is in part responsible for the relatively issue-free, or 'horse race', aspect of presidential campaign media coverage, which at its most extreme finds journalists interpreting the race by deconstructing the claims of competing 'spin doctors'. Conversely,


    some political observers, noting the success of forecasters in predicting elections months ahead of time, hold that the general election campaign has no effect on the outcome of the presidential election. Neither of these extreme positions fully captures the truth; at the end of this article, we return to a discussion of the role of the media in election campaigns.

    As far as we know, the arguments and evidence in this paper apply only to the general election campaign for the American President (see Section 2.2). Sorting out where it applies, and why, is an important topic for future research.

    In Section 1, we review the evidence regarding political scientists' forecasts and the variability of poll results. Underlying our ability to forecast is the profession's distinctive model of voter decision making. Section 2 discusses this model, as well as the alternative model implicitly followed by most accounts of the election in the news media. We work our way through several plausible, but flawed, explanations for this puzzle in Section 3. We are far from a final answer to our puzzle, but we do have one tentative explanation, which is consistent with all our existing evidence. We outline this hypothesis in Section 4 and present the evidence for it in Section 5. We conclude in Section 6 with a discussion of the implications for the role of the media in presidential election campaigns.

    Our intended contribution in this article is to raise the question in our title and provide evidence sufficient to dismiss many apparently reasonable and 'obvious' hypotheses (including our own prior beliefs). Because of the largely exploratory nature of relevant existing theories, we make extensive use of graphical techniques. This enables us to evaluate a series of specific hypotheses while still not obscuring features of the data that might suggest novel approaches or new hypotheses.

    1. FORECASTING EVIDENCE AND DATA SUMMARIES

    Rosenstone's forecasting model is one of the most developed and successful of the recent contributions to the literature, and it is the empirical results of this model on which we focus.³ His model is based on measurable economic and political variables that were discovered and analysed by numerous researchers over many decades, and not on trial heat polls. Even if one were to disagree with the particulars of Rosenstone's model, it would be hard to deny that past presidential elections have been forecast fairly accurately using these methods.⁴

    ³ Steven J. Rosenstone, Forecasting Presidential Elections (New Haven, Conn.: Yale University Press, 1983).

    1.1. Political Science Forecasts up to 1988

    Rosenstone summarizes his considerable success at forecasting presidential elections through 1980. Perhaps even stronger evidence is that his model has continued to forecast very well in the two elections since the publication of his book, as recounted by Rosenstone.⁵ In both 1984 and 1988, Rosenstone's forecasts fell within 1 per cent of the nationwide popular vote and predicted only a few states incorrectly, an excellent performance, considering that the forecasts were made months before the election. Table 1 summarizes the performance of Rosenstone's model, along with our forecasts for 1992 (see below), by comparing forecasts made at the start of the general election campaign with those from the national polls, media prognoses and judgements by political strategists taken at the same time.

    Rosenstone also presents what he calls 'perfect information forecasts', based on information theoretically, but not actually, available before the election, such as late changes in real disposable income. (This would be actually available if the government released this information earlier.) These perfect information forecasts are generally significant improvements. They are obviously of less use for actual forecasting, but they confirm the most important general point from our perspective: the outcomes of recent elections can be predicted within a few percentage points in the popular vote, based on events that have occurred before Labor Day (the first Monday in September).

    ⁴ Michael S. Lewis-Beck, 'Election Forecasts in 1984: How Accurate Were They?' PS, 18 (1985), 53-62, and Michael S. Lewis-Beck and Tom W. Rice, Forecasting Elections (Washington, DC: Congressional Quarterly Press, 1992) review many other statistical forecasting models. Allan J. Lichtman and Ken DeCell, The Thirteen Keys to the Presidency (Lanham, NY: Madison Books, 1990) and Robert Forsythe, Forrest Nelson, George Neumann and Jack Wright, 'The Iowa Presidential Stock Market: A Field Experiment', Research in Experimental Economics, 4 (1991), 1113, present some non-statistical approaches to forecasting presidential elections. Social scientists have been explaining and forecasting individual votes and aggregate election outcomes almost since the start of the discipline. The first quantitative article published in a political science journal (about political science) was on voting behaviour (William Ogburn and Inez Goltra, 'How Women Vote: A Study of an Election in Portland, Oregon', Political Science Quarterly, 34 (1919), 413-33), and voting, particularly in presidential elections, has almost always remained a lively area of research.

    ⁵ Steven J. Rosenstone, 'Predicting Elections' (Ann Arbor: University of Michigan, unpublished manuscript, 1990). In Forecasting Presidential Elections, p. 122, Rosenstone also reports sending letters on 14 October 1980 to twenty scholars with his forecasts of the November 1980 election.

    Other forecasting models, also based on economic and political variables measured before the start of the campaign, have performed well, and often


    better, in recent years.⁶ By contrast, public opinion polls at this time gave relatively useless forecasts of the election outcome. The predictions of media experts and political strategists were not much better.⁷

    ⁶ See, for example, Ian Budge and Dennis Farlie, Voting and Party Competition (New York: Wiley, 1977); Edward R. Tufte, Political Control of the Economy (Princeton, NJ: Princeton University Press, 1978); Ray C. Fair, 'The Effect of Economic Events on Votes for President', Review of Economics and Statistics, 60 (1978), 159-73; and 'The Effect of Economic Events on Votes for President: 1980 Update', Review of Economics and Statistics, 64 (1982), 322-5; and 'The Effect of Economic Events on Votes for President: 1984 Update', Political Behavior, 10 (1988), 168-79; James E. Campbell, 'Forecasting the Presidential Vote in the States', American Journal of Political Science, 36 (1992), 386-407; Lewis-Beck and Rice, Forecasting Elections.

    ⁷ See Lewis-Beck and Rice, Forecasting Elections, chap. 1.

    TABLE 1   Presidential Election Forecasting Errors

    Forecasts                                           Errors

    1984
      National Popular Vote
        Rosenstone                                      0.3 percent
        National polls (average miss)                   5.3 percent
      National Electoral Vote
        Rosenstone                                      48 electoral votes
        Media prognoses (average miss)                  129 electoral votes
        Political strategists (average miss)            115 electoral votes

    1988
      National Popular Vote
        Rosenstone                                      0.2 percent
        National polls (average miss)                   2.8 percent
      National Electoral Vote
        Rosenstone                                      82 electoral votes
        Media prognoses (average miss)                  131 electoral votes

    1992
      National Popular Vote
        Gelman and King                                 0.3 percent
        National polls, early September (average miss)  2.8 percent
        National polls, mid-October (average miss)      5.4 percent
      National Electoral Vote
        Gelman and King                                 5.6 electoral votes
        State polls, September                          59 electoral votes

    Note: All popular vote forecasts are expressed in terms of the Democratic candidate's share of the two-party vote. The 1984 forecasts were made in mid-July; the 1988 forecasts were made in early September; the 1992 forecasts were performed in early October, but only used information available in early September. When the media declared states as 'toss-ups', the electoral votes were divided evenly between the two major parties and states were counted as half a miss.

    Source for 1984 and 1988 forecasts: Rosenstone, 'Predicting Elections', Tables 1 and 2.
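The toss-up rule in the note to Table 1 can be made concrete with a short sketch. It rests on assumptions that the text does not spell out: that an electoral-vote error is the absolute difference between the Democratic electoral votes implied by a set of state calls and the actual Democratic total, and that a toss-up contributes half of a state's electoral votes to each party and half a miss. The function name, state labels and numbers are hypothetical.

```python
from typing import Dict


def score_state_calls(calls: Dict[str, str],
                      winners: Dict[str, str],
                      electoral_votes: Dict[str, int]) -> Dict[str, float]:
    """Score state calls ('D', 'R' or 'tossup') against the actual winners.

    Assumed scoring rule (not taken verbatim from the article): a toss-up
    gives half of the state's electoral votes to the Democrat and counts
    as half a miss.
    """
    predicted_dem = 0.0
    actual_dem = 0.0
    misses = 0.0
    for state, ev in electoral_votes.items():
        call = calls[state]
        actual = winners[state]
        if actual == "D":
            actual_dem += ev
        if call == "tossup":
            predicted_dem += ev / 2   # votes divided evenly between the parties
            misses += 0.5             # counted as half a miss
        else:
            if call == "D":
                predicted_dem += ev
            if call != actual:
                misses += 1.0
    return {"ev_error": abs(predicted_dem - actual_dem), "state_misses": misses}


# Hypothetical three-state example, for illustration only.
electoral_votes = {"A": 10, "B": 20, "C": 6}
winners = {"A": "D", "B": "R", "C": "R"}
calls = {"A": "D", "B": "tossup", "C": "D"}
result = score_state_calls(calls, winners, electoral_votes)
```

Here the calls imply 26 Democratic electoral votes against an actual 10, so the sketch reports an error of 16 electoral votes and 1.5 state misses.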


    1.2. Our Forecast for 1992

    In updating our paper to include the 1992 election and poll results, we wanted once again to compare Rosenstone's forecasts to those of the pundits and pollsters. Unfortunately, as the November election approached, we could not track down any official Rosenstone forecasts, so we decided to make our own. Our purpose was not to perform the most accurate forecasts or optimally to select variables for prediction, but rather to combine the elements of existing forecasting methods in the political science literature and accurately to assess the uncertainty in our forecast. We briefly outline our methodology here.⁸

    ⁸ Details appear in Andrew Gelman and Gary King, 'Forecasting the 1992 US Presidential Election', manuscript, in progress.

    ⁹ Campbell, 'Forecasting the Presidential Vote in the States'.

    Campbell's forecast. We started with what we viewed as the best currently-available forecasting model, that of Campbell,⁹ which predicts the Democratic share of the two-party vote for president in each state. Campbell fits a linear regression of the statewide vote proportions in the eleven elections since 1948 - 531 observations in all - on a set of nationwide, statewide and regional predictor variables. (The District of Columbia is ignored in the model, since it has reliably voted Democratic in every election.) The nationwide variables - which are constant in each election year - are the Democratic candidate's share of the trial heat polls two months before the election, incumbency (0, 1, or -1, depending on the party), and the change in Gross National Product (GNP) in the preceding year (counted positively or negatively, depending on whether the Democrats or the Republicans are the incumbent party). The statewide variables are the state's vote in the last two presidential elections (relative to the nationwide vote in each case), a presidential and vice-presidential home-state advantage (0, 1, or -1), the change in the state's economic growth in the past year (counted positively or negatively depending on the incumbent party), the partisanship of the state (measured by the proportion of Democrats in the state legislature) and the state's ideology (as measured by the average of its congressional representatives' ADA-ACA interest-group rating scores in 1988). The regional variables - meant to capture various regional effects, mostly from past elections - are dummy variables for the South in elections in which one of the candidates was a Southerner, for the South in 1964, for the deep South in 1964, for New England in 1964, the West in 1976, and for the North Central region in 1980. Except for the Southern effect (which counted for Clinton), the regional variables had no effect in the 1992 elections; their only role was to remove anomalies in past elections and thus allow more accurate estimation of the systematic effects. Because of the data structure, the division into national, state and regional variables is more than a convenience. With 531 observations, a large number of state variables can reasonably


    be fitted to the election data set. National variables, however, must be more restricted, since they are essentially being fitted to only eleven data points.¹⁰
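The remark that national variables are effectively fitted to only eleven data points can be illustrated with a small sketch. It is not Campbell's model: it uses synthetic data, a balanced panel of 11 years by 50 states (the article's data set has 531 observations), and a single hypothetical national predictor with no state-level predictors. Under those assumptions, the least-squares slope from the pooled state-year rows equals the slope from regressing the eleven yearly averages on the eleven yearly values.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)           # fixed seed so the sketch is reproducible
n_years, n_states = 11, 50       # balanced panel: an assumption of this sketch

# Synthetic national predictor (for example, a trial-heat share): one value per year.
national_x = [rng.uniform(0.40, 0.60) for _ in range(n_years)]

pooled_x, pooled_y, year_mean_y = [], [], []
for x_year in national_x:
    year_shock = rng.gauss(0, 0.02)          # error shared by all states in a year
    votes = [0.5 + 0.8 * (x_year - 0.5) + year_shock + rng.gauss(0, 0.05)
             for _ in range(n_states)]       # synthetic state vote shares
    pooled_x.extend([x_year] * n_states)     # predictor is constant within the year
    pooled_y.extend(votes)
    year_mean_y.append(sum(votes) / n_states)

slope_pooled = ols_slope(pooled_x, pooled_y)        # 550 state-year rows
slope_yearly = ols_slope(national_x, year_mean_y)   # 11 yearly averages
distinct_national_values = len(set(pooled_x))       # the predictor takes 11 values
```

The two slopes agree up to rounding, which is one way to see why adding more states does not add information about a variable that varies only across election years.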

    After estimating the regression coefficients, Campbell predicts the state-by-state results for 1992 based on the national and state-by-state explanatory variables for that year, which could be obtained by early September. (Earlier, Campbell had made rough predictions based on preliminary estimates of the GNP change.) Each state was counted in the Democratic or Republican column depending on whether its forecast Democratic vote proportion was greater or less than 0.5. In addition, the nationwide popular vote was estimated by multiplying each state's forecast vote proportions by an estimate of turnout. We were easily able to replicate Campbell's exact numerical results.

    For the purposes of forecasting the 1992 election - a task we undertook in early October 1992, but only using information available in early September - we altered Campbell's model in three ways.

    Choosing explanatory variables. One problem with Campbell's forecasting model is that it is based on a single regression specification that has been chosen because of its close fit to previous electoral data. As is well known in econometrics and statistics, a prediction method optimized in this way will often pick up the idiosyncratic, rather than systematic and persistent, features of these data and will therefore forecast poorly. For election forecasting, this means that (1) Campbell's standard errors are probably too low, and (2) it may be possible to generate better forecasts by choosing a fit by more substantive criteria.

    ¹⁰ The 1992 presidential election campaign drew an unusually large number of political scientists to make forecasts. The quality of these forecasts was quite uneven, as was their success. Models which ignored features of voter decision making that the political science literature has demonstrated to be important - especially candidate ideology and presidential approval - seemed to do especially poorly. (For summaries, see Nathaniel Beck, 'Forecasting the 1992 Presidential Election: The Message is in the Confidence Interval', Public Perspective, 3, No. 6 (1992), 32-3; Political Methodologist, April 1993; Jay P. Greene, 'Forewarned Before Forecast: Presidential Election Forecasting Models and the 1992 Election', PS, 26 (1993), 17-21.) It is easy to be too hard on all the forecasters of 1992, however, since this was a year without precedent: no president since Truman in 1948 has ever run for re-election with such low public approval. Fortunately, extreme observations such as occurred in 1992 should help substantially in making future forecasts. Of course, one should be especially wary of forecasting 'models' that are not precise enough to be replicable. For example, one co-authored method was applied by each co-author in different television interviews: according to one, the method picked Clinton as the likely winner; according to the other, it picked Bush.

    Rather than just selecting the one regression model that best fitted past data, we considered all models in which the chosen subset of explanatory variables were plausible from a substantive standpoint and had low residual variance when fitted to the state election results from 1948 to 1988. Even together, these criteria are not sufficient to narrow the search to a single set of explanatory variables. Indeed, several subsets of the available variables met these criteria, including Campbell's, and we considered them together to represent the uncertainty in our forecasts due to the choice of predictor variables. These gave


    varying forecasts of Clinton's votes, from about 50 per cent to 56 per cent. (Campbell's choice happened to favour Bush more than most of these.) The standard deviation of the estimates across models was about 1.5 per cent, which we considered to be the level of 'specification uncertainty' ignored by Campbell's (or any other reasonable) single linear model used to forecast. For the purpose of our estimation, we added the square of 1.5 per cent to the estimated predictive variance, thus producing more realistic estimates of the uncertainty of our forecasts. For our point estimate, we chose a model near the middle of the range of forecasts, which differed from Campbell's by including the following variables: (1) the president's approval rating, included as an interaction with the national presidential incumbency variable; (2) the absolute difference between state and candidate ideologies, as used by Rosenstone;¹¹ and (3) an additional regional variable for 1960 indicating the percentage of the state's population that was Catholic in that year. Our method is therefore equivalent to including all available explanatory variables, with appropriate prior weights.

    Modeling dependence among states. Campbell's model ignores the year-by-year structure of the data, treating them as 531 independent observations, rather than eleven sets of roughly fifty related observations each. Substantively, the feature of these data that Campbell's model misses is that partisan support across the states varies together: the Democratic candidate for president almost always does better in Massachusetts than Utah, but both states give relatively more to the Democrat when the Democratic candidate does better nationwide. Statistically, this grouping of the data, or dependence across states within an election year, can be accounted for by fitting an extra term in the regression model: a nationwide average forecasting error in addition to Campbell's state error term. As we show elsewhere,¹² it is clear from the historical data that Campbell's single error term underestimates the variance of nationwide aggregate presidential vote share forecasts. Fitting a two-error model does not change the point estimates of Democratic vote proportion in the states, but allows a more realistic assessment of forecasting uncertainty.

    ¹¹ Rosenstone, Forecasting Presidential Elections.

    ¹² Gelman and King, 'Forecasting the 1992 US Presidential Election'.

    Calculating the forecast. Campbell calculates the expected number of electoral college delegates for each candidate by allocating all the delegates in a state to the candidate forecast to get more than half the vote, and then adds over all the states. We use a slightly more sophisticated procedure to account for the uncertainty in the forecast. For each state, our model yields an estimate of the proportion of the two-party vote that the Democrat will win. From this estimate, along with the standard deviation of the forecast vote, we computed the probability that Clinton would win the state, based on the normal distribution used in the regression. Clinton's expected electoral vote count is just the sum of the electoral vote in each state, multiplied by the probability


    that he wins the state. According to our calculation, Clinton had a 0.85 probability of winning the election, with an expected total of 53.1 per cent of the two-party popular vote and 368 (of 535) electoral votes.¹³
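The calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the state forecasts, standard deviations and electoral votes are invented, and applying the 1.5 per cent specification uncertainty to each state's predictive standard deviation is an assumption of the sketch. It contrasts winner-take-all allocation with the probability-weighted expected count, using the normal distribution to turn a forecast share and its standard deviation into a probability of winning the state.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def total_sd(model_sd: float, specification_sd: float = 0.015) -> float:
    """Add the specification variance to the model's predictive variance."""
    return math.sqrt(model_sd ** 2 + specification_sd ** 2)


def win_probability(forecast_share: float, sd: float) -> float:
    """Probability that the two-party share exceeds 0.5 under a normal forecast."""
    return normal_cdf((forecast_share - 0.5) / sd)


def expected_electoral_votes(states) -> float:
    """Sum of each state's electoral votes times the probability of winning it.

    `states` is an iterable of (forecast_share, sd, electoral_votes) tuples.
    """
    return sum(ev * win_probability(share, sd) for share, sd, ev in states)


def winner_take_all_votes(states) -> int:
    """Allocate all of a state's votes to whoever is forecast above 0.5."""
    return sum(ev for share, _sd, ev in states if share > 0.5)


# Hypothetical states: (forecast Democratic share, sd, electoral votes).
sd = total_sd(model_sd=0.04)     # 0.04 is an invented model standard deviation
states = [(0.56, sd, 30), (0.50, sd, 20), (0.47, sd, 10)]
expected = expected_electoral_votes(states)
allocated = winner_take_all_votes(states)
```

Winner-take-all gives the Democrat only the first state's 30 votes, while the expected count also credits partial probability in the other two. The sketch does not attempt the probability of winning the election overall, which would also need the nationwide error term shared across states.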

    For comparison, we also provide a more detailed presentation of aggregate public opinion poll results over the last eleven presidential election campaigns. Our data for this inquiry, and for the rest of this article, include the Republican proportion of two-party support reported in surveys over these eleven elections. The data before 1988 are from Gallup; 1988 and 1992 also include all other polling organizations for which we could obtain relevant information.¹⁴ Our data include the aggregate information reported in Figure 1 and individual-level survey data from forty-nine cross-sectional polls during the 1988 campaign.¹⁵ In total, the 1988 data include surveys of 67,492 people, 69 per cent of whom were willing to state their candidate preference. The appendix describes these data in more detail.¹⁶

    Figure 1 summarizes these data for each election since 1952. The triangle on the right-hand side of each graph reports the actual election outcome, and the line traces out the changes in the Republican proportion of the two-party candidate support figures over the campaign.¹⁷

    ¹³ We presented these forecasts several weeks before the election in public lectures at Harvard University and the University of California, Berkeley, as well as in communications with several others.

    ¹⁴ Our extensive analyses, some of which are reported below, indicate that one can safely merge the data from the different polling organizations in order to study trends in candidate support but not the percentage undecided or not responding.

    ¹⁵ We chose the 1988 election because it was the most recent when we began our analyses. We completed all but the final draft of this article before the 1992 election.

    ¹⁶ These polls are a vast and relatively untapped data source for election studies. As the Appendix describes, most of the surveys also include a number of useful explanatory variables. Although each poll does not always include the exact question we would prefer, these data do contain a considerable amount of data - considerably more interviews from 1988 alone than the sum total of all the interviews from every presidential National Election Survey since 1952. See Herbert Asher, Polling and the Public: What Every Citizen Should Know (Washington, DC: Congressional Quarterly Press, 1988), for a general review of polls and the public.

    ¹⁷ The survey question asked most often was, 'If the 1988 Presidential election were being held today, would you vote for George Bush for President and Dan Quayle for Vice President, the Republican candidates, or for Michael Dukakis for President and Lloyd Bentsen for Vice President, the Democratic candidates?' Analogous questions were asked in the other years. We confront potential problems of question wording below.

    The graphs in Figure 1 show that, in most years, early public opinion polls give fairly miserable forecasts of the actual election outcome. The situation is somewhat better after the second party convention, but through almost the entire campaign it would not be wise to use polls to forecast the election outcome. Additionally, in virtually every presidential election in the last forty


    years, the polls converge to a point near the actual election outcome shortly before election day.

    [Figure 1: one panel per election year; horizontal axis: days before election, from -200 to 0.]

    Fig. 1. Presidential trial heats

    Notes: The solid line in each plot is the proportion of the survey respondents who would vote for the Republican candidate for president, among those who report a preference for the Democratic or Republican candidates. The 1992 and 1988 graphs include data from all available nationwide polls; plots for the other years are from the Gallup Report. The upward arrow marks the time of the Republican convention and the downward arrow marks the time of the Democratic convention.
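The quantity plotted in Figure 1 is a two-party share: Republican support among respondents who state a preference for one of the two major-party candidates. A minimal sketch of that calculation follows; the function name and the poll counts are hypothetical and are not taken from the article's data.

```python
def two_party_republican_share(republican: int, democratic: int) -> float:
    """Republican share among respondents who prefer a major-party candidate."""
    decided = republican + democratic
    if decided == 0:
        raise ValueError("no respondents stated a major-party preference")
    return republican / decided


# Hypothetical poll of 1,000 respondents: 380 Republican, 420 Democratic,
# 200 undecided, other, or not responding.
rep, dem, other = 380, 420, 200
share = two_party_republican_share(rep, dem)   # 380 / 800
raw = rep / (rep + dem + other)                # share of all respondents
```

Dropping respondents without a major-party preference raises the Republican figure here from 0.38 of all respondents to 0.475 of the two-party total, which is why the two measures should not be mixed when comparing polls with different undecided rates.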


    W h y Are Presidental Election Polls S o Variable? 4192 . M O D E L S O F V O T E R D E C I S I O N M A K I N G A N D T H E I R I M P L I C A T I O N S

2.1. Political Science Models

Most existing political science forecasting models are based on state-level or national-level aggregates, derived from the same ideas and underlying variables as the models of individual voter choice favoured by political scientists. Being aggregate results, though, these election predictions cannot truly confirm the individual-level models. To understand individual-level behaviour, political scientists have turned to numerous studies based on public opinion data.

Political scientists have developed numerous models of voter decision making, mostly in the context of studies of presidential campaigns. In the broadest terms, we have the sociological models dominated by the Columbia School, the social-psychological models connected with the Michigan School and the rational choice models developed by the Rochester School. These models, their descendants and numerous others are derived from diverse perspectives of voter choice. For the purposes of this study, though, these models do not differ among each other as much as they differ as a whole from the models implied by journalists in their coverage of presidential campaigns.

Although much debate still exists over proper models of voter decision making in political science, these models all seem to agree on some aspects of the same general picture: voters take the decision about whom to vote for relatively seriously. They might not be able to recite the reasons for their vote for president to a survey researcher (indeed, they might not even know the reasons), but voters at least base their decisions on relatively known and measurable variables. These fundamental variables measure their (or their group's) interests and include economic conditions, party identification, proximity of the voter's ideology and issue preferences to those of the candidates, etc. As discussed by Lewis-Beck and Rice, all the serious forecasting methods try to predict the election result using some versions of the same fundamental variables to measure economic well-being, party identification, candidate quality and so forth.18

2.2. Why Are Some Elections Harder to Predict than Others?

First, and most obviously, close elections such as 1960 and 1976 will always be hard to predict, since in these cases the best possible forecast will be statistically indistinguishable from 50 per cent. We consider a forecast successful if it predicts the vote closely, even if the forecast is 49 per cent and the outcome is 51 per cent.
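The success criterion in the preceding paragraph separates two things that are easy to conflate: how close the forecast is to the vote share, and whether it names the winner. A minimal sketch of that distinction follows. The function names are hypothetical and chosen only for illustration, and the inputs are the 49 and 51 per cent figures from the text.

```python
def forecast_error(forecast_share: float, actual_share: float) -> float:
    """Absolute error, in percentage points, between forecast and actual vote share."""
    return abs(forecast_share - actual_share)


def called_winner(forecast_share: float, actual_share: float) -> bool:
    """True if forecast and outcome fall on the same side of 50 per cent."""
    return (forecast_share > 50.0) == (actual_share > 50.0)


# The example from the text: forecast 49 per cent, outcome 51 per cent.
error = forecast_error(49.0, 51.0)
winner_ok = called_winner(49.0, 51.0)
print(f"error = {error:.1f} points, winner called correctly: {winner_ok}")
```

Under the criterion stated in the text, this forecast counts as a success because the error is small, even though it picks the wrong winner.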

More interestingly, in primaries, low-visibility elections, and uneven campaigns, we would not expect forecasting based on fundamental variables measured before the campaign to work. The fast-paced events during a primary campaign (such as verbal slips, gaffes, debates, particularly good photo opportunities, rhetorical victories, specific policy proposals, previous primary results, etc.) can make an important difference because they can affect voters' perceptions of the candidates' positions on fundamental issues. Also, primary election candidates often stand so close on fundamental issues that voters are more likely to base their decision on the minor issues that do separate the candidates. In addition, the inherent instability of a multi-candidate race heightens the importance of concerns such as electability that have little to do with positions on fundamental issues.

In a low-visibility election, if all a voter knows about a candidate is a few statements about reducing defence spending, say, then these statements may be very important in gauging a candidate's ideology. Thus, the voter might not have the opportunity to learn later on whether early statements reflect the candidate's ideology accurately.

The outcome of elections with uneven campaigns would also be hard to predict based on fundamental variables alone. After all, it is well known that financial resources are an important influence on the outcomes of uneven congressional races and ballot referendums, an effect which could be explained by the ability of the candidate with greater media resources better to manipulate many voters' perceptions of the candidates' positions on fundamental issues.

However, in the general election campaign for president, and in other high information and relatively balanced campaigns, the consensus in the political science literature is that these events are largely ephemeral, having little effect on the eventual outcome. They can have important effects for short periods and on different localities,19 but the overall result is little affected. The length of the general election campaign and the ample resources on both sides allow candidate mistakes and early voter misperceptions (perhaps based on these mistakes) to be corrected. By election day, voters are able to vote based largely on accurate measures of their fundamental variables. The argument here is that although presidential campaigns have an important effect, what is relevant is their existence; we expect the details of a competently run campaign to have a small effect on the election outcome. This is a similar argument to that of Markus.20

For example, among the first systematic studies of voting behaviour was a six-wave panel survey of the 1940 presidential election designed to show what the authors thought were huge campaign effects.21 In fact, they found very few campaign-specific effects of any kind. The considerable systematic research over the next half-century did little to change this basic conclusion.22

18 Lewis-Beck and Rice, Forecasting Elections.

19 See John Kessel, Presidential Campaign Politics (Belmont, Calif.: Dorsey Press, 1988).
20 See Gregory B. Markus, 'The Impact of Personal and National Economic Conditions on the Presidential Vote: A Pooled Cross-Sectional Analysis', American Journal of Political Science, 32 (1988), 137-54.
21 Paul F. Lazarsfeld, Bernard Berelson and Hazel Gaudet, The People's Choice: How the Voter Makes Up His Mind in a Presidential Campaign (New York: Duell, Sloan and Pearce, 1944).
22 Larry Bartels, 'Stability and Change in American Electoral Politics', in David Butler and Austin Ranney, eds, Electioneering (New York: Oxford University Press, in press).


Even those scholars who focus on the endogenous effect of the campaign (or expected votes) on fundamental variables like party identification emphasize that these endogenous effects are minimal, especially in the short run.23

2.3. The Implied Model of Journalists

Journalists have no similar tradition of detailing models of voter decision making. However, we can discern their implicit model by looking to the focus of media attention during election campaigns, and some explicit statements from newspapers, magazines and television. Of course, there are about as many opinions among journalists as among political scientists, but at least a 'mainstream model' can be identified. Under this model, voters base their intended votes partly on fundamental variables, but considerably more on the day-to-day events of the presidential campaign. Voters are assumed to have very short memories, relying for their decision disproportionately on the most recent campaign events and last piece of information they ran across. Candidates are thought to be able to easily 'fool' voters by changing their policy stance during the campaign or causing the opposing candidate to say or do something foolish.

For example, the San Francisco Chronicle reported (on 13 September 1988) that 'the survey [of Bush leading 49 per cent to 41 per cent] is the latest evidence that the vice-president's tough attacks on Dukakis are working . . . the Pledge of Allegiance in public schools has been particularly effective, with voters expressing disapproval of the Democrat's action by a 2-1 ratio.' Similarly, the Dallas Times Herald reported (on 9 August 1988) that 'If the race is indeed narrowing, it is an indication that this strategy [of Bush actively attacking Dukakis] is working.'

Also according to the journalists' model, voters do not take their role in the process very seriously, have very little information or knowledge of the campaign and the issues, and frequently do not vote on the basis of their own self-interest. For example, Profiles magazine approvingly quoted a top consultant who indicated that 'people vote for character traits, not policies or issues'.24 The typical advice of journalists to their colleagues is 'Don't assume any voter knowledge . . . In other words, the press must occasionally bore itself in order to inform the [...]'.25

Journalists justify their model (or stance) by interpreting public opinion polls.

23 See Charles H. Franklin and John E. Jackson, 'The Dynamics of Party Identification', American Political Science Review, 77 (1983), 957-73. We can distinguish between two kinds of fundamental variables: (1) characteristics of the voter and his or her situation, including their position on issues, party identification, ideology, economic conditions etc.; and (2) voters' perceived characteristics of the candidates, such as the candidates' ideology and positions on issues. There are also variables like incumbency which modulate the effect of the second category of fundamental variable: if you run a stronger campaign, you are more likely to convey a positive message about yourself relative to the other candidates. Variables in the first category change very little over the campaign, while variables in the second are directly influenced by the campaign.

24 Profiles, December 1991, p. 21.
25 Newsweek, 14 October 1991, p. 29.


They do no formal studies, and so they cannot be very confident of these interpretations, but the causal inferences seem clear to them on the basis of their detailed knowledge of the campaign and their close observations. For example, George Bush was gaining in the polls in 1988 just at the time when he was on the strong offensive against Dukakis, and Dukakis at the same time was avoiding getting into the fray. Dukakis lost a few points in the polls when he looked a bit foolish riding on a tank. Four days of the national media focusing on a candidate during a party convention certainly does seem to influence people to increase their support in the polls for that candidate. According to the journalists, Bush won because of these events, the Willie Horton television advertisements (and especially the media coverage of these advertisements), his opposition to flag burning and other campaign events. Campaign strategies and tricks play a central role in journalists' interpretation of poll results. For example: 'It was beyond brilliance the way Michael Dukakis handled Jesse Jackson'; 'Dukakis seemed to be stalled and passive'; 'Dukakis is a sourpuss compared to this amazing new Bush person.'26

A more sophisticated news media analysis argues that character matters more than campaign tricks: 'The Democrats . . . lost for a variety of reasons, but principal among them was that they presented a candidate whose virtues did not include plausibility as a president or, often, even an apparent feeling for the nature of the job.'27 This explanation does not, however, specify where the independent judgements of the candidates' characters come from.

Interestingly, during the 1992 campaign, the messages of political science seemed to reach the journalists: there was more mention of the state of the economy and even of individual forecasters such as Lewis-Beck and Campbell, amidst the usual saturation coverage of ephemeral campaign events.

3. FLAWED EXPLANATIONS

If political scientists can forecast the election outcome reasonably well on the basis of fundamental variables measured before the campaign, why do the polls vary so much? To put it another way, if the journalists' model is correct, then how can political scientists, or anyone else, forecast the outcome accurately? Alternatively, if the political science model is correct, why do polls vary at all, and why do they respond to specific campaign events such as conventions and advertising campaigns?

In this section, we raise several hypotheses that could explain this apparent paradox. Only some of these are competing hypotheses; many are complementary. We also provide, in most cases, sufficient evidence to discard each. We retain some features of some of the partially flawed explanations for later use. In most cases, we focus on the 1988 campaign, since our best data are from that contest.

26 Lesley Stahl, CBS News broadcast, 22 July 1988, during the Democratic convention; Newsweek, 5 September 1988; Newsweek, 19 September 1988.
27 Editorial, Washington Post, 14-20 October 1991.


We discuss flawed hypotheses for two reasons: first, they are plausible explanations, and many have been advanced by respected journalists and scholars. As such, they demand a hearing, and this work would be incomplete if it did not take them seriously. Secondly, exploring the implications of the various hypotheses gives us insight into the relation between political theories and electoral and poll data. By seeing how the data can refute certain ideas, we learn how to pose more sophisticated alternatives that are consistent with our observations.

We divide the flawed explanations into four classes: measurement theories, which explain the poll results as artefacts of flawed survey methods; journalists' theories, which dismiss the forecasts; political science theories, which are consistent with the forecasts, but do not explain the poll variation; and rational actor theories, which are consistent with some parts of the evidence but not all.

3.1. Measurement Theories

It is possible to resolve the paradox presented in the title of this article by simply dismissing the pre-election poll results. We list three hypotheses, in order of increasing plausibility, under which we would not trust the opinion polls.

The polls are meaningless. The simplest hypothesis holds that public opinion polls have nothing to do with real observable political behaviour, and are as meaningless as candidates behind in the polls make them out to be. Evidence for this hypothesis is the high rate of non-response, and the perception that respondents do not take the survey seriously, giving insincere or poorly thought-out answers to most questions.

There is obviously some truth to this hypothesis, since early polls in most election years appear to have very little to do with the eventual outcome of the general election. However, much evidence exists to conclude that survey responses are related to actual voting, notably the predictive accuracy of polls taken before the election (see Figure 1). To some scholars, it was no great surprise that polls a few days before the election could forecast that election. However, this does confirm that the polls are connected in some important way to observable political behaviour. These relationships hold even though as many as half of survey respondents refuse to state a presidential preference, as late as the final week of pre-election polling.

In addition, relationships among variables within virtually all polls are quite predictable and consistent with our theoretical understanding. For example, those who identify themselves as Democrats support the Democratic presidential candidate more frequently, Republicans more frequently describe themselves as conservatives, those who have higher levels of education tend to have higher levels of income, and so forth. There are numerous observable consequences of the thesis that the polls are meaningful, and indeed most of the evidence seems quite consistent with this idea. This does not explain why early polls do not forecast well, but it does provide some reason to dismiss this hypothesis.

A closely related hypothesis is that variation in the polls is due to sampling error. However, this cannot be true since the observed variation in the polls is often 10 or 20 per cent or more, as compared to typical sampling errors of about 4 percentage points.28

Question wording effects and survey organization methodology. Several versions of this hypothesis can be posed. One simple version is that variation in the polls largely derives from variations in question wording. We know from considerable research in public opinion that minor changes in the wording of survey questions can have large effects on poll results.

In order to study this hypothesis, we compared surveys taken at about the same time but with different question wordings, and found that support for Bush vs. Dukakis is not strongly related to the questions that have been asked. An example of the evidence for this point is the first graph in Figure 2. For eighteen groups of voters (Democrats, Independents, Republicans, low education, high education, liberals, etc.), this figure plots the proportion of respondents in each group who supported Bush, as recorded by the usual survey question posed in June, by support for Bush in another June survey that had an unusual question wording.29 Most groups (represented by numbers in Figure 2) fall on or close to the 45° line, indicating that this question wording did not have much effect on the measured level of support for Bush. There is a minor systematic pattern in the responses, since the non-whites and the liberals fall below the line, whereas the Republicans and the conservatives fall above it. This small effect appeared in a similar analysis, not shown here, of two September surveys. However, these patterns are much too small to account for significant parts of the main puzzle we seek to understand; moreover, they cancel out in the aggregate survey totals.

In similar analyses, we also rejected the related hypothesis that the different polling organizations produced systematically different results. We did extensive searches and explorations of this kind, finding only one systematic relationship: the proportion undecided or refusing to answer the survey question varied consistently and considerably with the question wording and polling organization. The bottom graph in Figure 2 demonstrates this by using the same two June polls. Groups of citizens in the two polls correlate moderately well; that is, since those groups more undecided on one question tend to be more undecided on the other, the groups fall roughly along a straight line. However,

28 See William Buchanan, 'Election Predictions: An Empirical Assessment', Public Opinion Quarterly, 50 (1986), 222-7.
29 The responses to the standard question wording refer to Gallup's poll conducted on 15 June 1988. The responses to the non-standard wording refer to Gallup's poll conducted on 22 June. The standard question wording and the unusual question wording are given in the notes to Figure 2.


[Figure 2, two panels. Upper panel: 'Bush support by question wording'. Lower panel: 'Proportion undecided by question wording'. Horizontal axis label in both panels: 'If election were held tomorrow . . .'.]

Fig. 2. Question wording effects, 1988*

Key to groups: 1. Republicans; 2. Conservatives; 3. Over $50,000/year; 4. Whites; 5. $25-50,000/year; 6. College education; 7. Non-South; 8. Men; 9. Over 30 years old; 10. Women; 11. Independents; 12. Under 30 years old; 13. No college education; 14. South; 15. Moderates; 16. $15-20,000/year; 17. Under $15,000/year; 18. Liberals; 19. Non-whites; 20. Democrats

* See the Notes on Figure 2 for a note on question-wording effects.


Notes on Figure 2

This figure shows how the wording of survey questions affects the proportion of respondents who support Bush, among those who express a preference, based on two surveys held at about the same time in July. Along the horizontal axis is the standard question wording: 'If the 1988 Presidential election were being held today, would you vote for George Bush for President and Dan Quayle for Vice President, the Republican candidates, or for Michael Dukakis for President and Lloyd Bentsen for Vice President, the Democratic candidates?' The alternative question is represented along the vertical axis: '(George Bush is the Republican nominee for president and Michael Dukakis is the Democratic nominee.) Which (1988) presidential candidate will you definitely vote for in this year's election?' Each number in these figures represents a group of survey respondents, coded as shown on the right-hand side of the graph (the groups are ordered in decreasing support for Bush). For example, at the top of the upper graph in this figure, the number '1' indicates that about 80 per cent of Republican respondents supported Bush when asked the standard question as compared to about 90 per cent under the alternative wording. Since most groups fall on or near the 45° line, we conclude that the differences in question wording are not very important to our analysis. However, the bottom figure indicates that question wording can greatly affect the proportion undecided.

the average undecided rate differs substantially between the two surveys (about 15 per cent undecided for the question on the horizontal axis and 60 per cent for the question on the vertical axis), which, because of differing axes' labels, can be seen in the figure by noting that 10 per cent undecided on one poll does not predict 10 per cent on the other. The unequal rate of undecided respondents is interesting but does not explain why support for the candidates varied so much over the course of the campaign.

Non-response bias. Another hypothesis holds that survey respondents selectively refuse to answer, or say they will not vote, when their candidate is not doing as well as the other candidate. In other words, under this assumption, voters are embarrassed to support the candidate that appears not to be doing well. For example, during one party's convention, when an eventual Republican voter is interviewed at home after watching four days of a Democratic party convention, he may feel more comfortable saying he does not plan to vote or is unsure of his candidate preference. If true, this would produce a systematic item non-response bias. Under this scenario, campaign events would have a big effect on reported support for the candidates, but could have no effect on the eventual outcome.

This is a theoretically satisfying explanation, essentially providing a completely self-consistent methodological answer to the question posed in the title to this article. Indeed, before we gathered our data, this explanation seemed plausible to us. Unfortunately, it is now clear to us that this non-response bias hypothesis is false.

Figure 3 presents the evidence in the form of three time-series plots of the proportion undecided broken down by party identification, ideology and race.30

30 These proportions are corrected for differences due to varying survey methodologies across the different survey organizations.


[Figure 3, three time-series panels: 'Undecided rate by party ID' (series Ind, Rep, Dem), 'Undecided rate by ideology' (series mod, cons, lib) and 'Undecided rate by race'. Horizontal axis in each panel: days before election, from -200 to 0.]

Fig. 3. Trends in undecided respondents, 1988

Notes: This figure includes three time-series plots of the proportion of survey respondents who report being undecided as to their vote. Each line in a plot represents a different group of voters. The party identification graph tracks political independents ('Ind'), Republicans ('Rep'), and Democrats ('Dem'). The ideology graph tracks ideological moderates ('mod'), conservatives ('cons'), and liberals ('lib'). The final graph plots white and non-white respondents. In most cases, the lines representing different groups within each figure move in the same rather than opposite directions, which confirms that the proportion undecided did not vary by these groups.

As can be plainly seen, the proportion undecided does not vary dramatically over the course of the campaign. But more important for this hypothesis is that the groups vary together, whereas if the non-response bias hypothesis were true we would expect the opposite. Thus, it could not be that Republicans are more likely to report being undecided during the Democratic convention, and conversely. The same holds for race and for ideology.31
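One way to make the 'groups vary together' test concrete is to correlate the poll-to-poll changes in the undecided rate for two opposing groups. The following is a minimal sketch, not the analysis behind Figure 3: the numbers are invented for illustration, and the helper names `changes` and `pearson` are hypothetical. A negative correlation is what the non-response bias hypothesis would predict; a positive one matches what the text reports.

```python
import math

# Hypothetical undecided proportions at five successive poll dates.
# These numbers are invented for illustration; they are not the 1988 data.
undecided = {
    "Rep": [0.20, 0.24, 0.22, 0.27, 0.23],
    "Dem": [0.25, 0.29, 0.26, 0.31, 0.28],
}


def changes(series):
    """Poll-to-poll changes in a series of proportions."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r = pearson(changes(undecided["Rep"]), changes(undecided["Dem"]))
# r > 0: the groups move together (the pattern the text describes).
# r < 0: the groups move in opposite directions (what non-response bias predicts).
print(f"correlation of changes: {r:+.2f}")
```

Correlating changes instead of levels keeps a shared trend in both series from producing a positive correlation on its own.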

31 Other variables also give similar results. We show in the Appendix that party identification and ideology are largely exogenous variables, not responding much to changes in voter preferences or anything else that changes during the campaign.
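The sampling-error argument made earlier in Section 3.1 compares poll swings of 10 to 20 points with sampling errors of about 4 points. It can be checked with the standard margin-of-error formula for a proportion. This is a rough sketch under stated assumptions: simple random sampling, a 95 per cent confidence level, and a sample of 600 respondents. The sample size is an assumption chosen because it gives roughly a 4-point margin; the source does not report it.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Assumed poll size (not from the source); p = 0.5 gives the widest margin.
moe = margin_of_error(0.5, 600)

# Smallest swing mentioned in the text: 10 points.
observed_swing = 0.10

print(f"margin of error: {moe * 100:.1f} points")
print(f"swing exceeds twice the margin: {observed_swing > 2 * moe}")
```

Even the smallest swing cited in the text is more than twice this margin, which is the sense in which sampling error alone cannot account for the variation.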


3.2. Journalists' Theories

An alternative way to resolve the paradox of volatile polls and accurate forecasts is to dismiss the forecasts, as in the first hypothesis below, or to accommodate the forecasts to the journalists' interpretation of the polls, as in the second hypothesis.

The forecasters were lucky because Bush ran a good campaign and Dukakis a poor one. The simplest way to dismiss the pre-campaign forecasts of the political scientists and economists is to say they were just lucky and happened to coincide with Bush running a good campaign and Dukakis running poorly. Evidence for this hypothesis is that Bush's rapid gain in the polls coincided with what seemed to be his particularly adept campaigning.

The success of out-of-sample forecasts discussed in Section 1 causes us to doubt this hypothesis. Moreover, as discussed by Lewis-Beck,32 several other scholars have also produced relatively successful presidential election forecasts (for previous elections) based on different statistical models.33 All these models do reasonably well in many election years, not only 1988. The success of all these forecasts is clearly due to more than chance, and we feel that, at this point, the burden of proof lies with the critics who still believe the forecasters are merely lucky.

In addition, what seemed to the journalists to be Bush's adept campaigning might just be a justification in hindsight of what 'explained' the polls. How can we test this alternative explanation of the media's interpretation? In other words, what can be done to avoid rationalization after the fact? One possibility is to use what journalists identified as the keys to success in previous campaigns and see how the Bush and Dukakis campaigns should be judged according to those rules.

This is easily resolved: in all recent presidential election campaigns before 1988, the main rule, according to the media, was which candidate was better at 'acting presidential'. Bush was the first candidate in modern times directly to attack his opponent, which clearly violates the rule. In recent previous campaigns, this task was taken up by the vice-presidential candidate, campaign commercials, or prominent supporters, but never by the presidential candidate. Thus, from this media perspective, Dukakis actually looked better than Bush during the campaign, since he was acting in more presidential style. If the polls had continued to favour Dukakis, and he had won the election, we doubt whether the media would have changed their criteria for evaluation. It may be that Bush's strategy was effective, but in this case the 1988 election provides only a hypothesis, not a confirmation of one. On the other hand, although resolving these points without careful studies of the effect of campaign media events is probably impossible, it does seem (almost!) undeniable at other times that events in the campaign do influence the poll results.

32 Lewis-Beck, 'Election Forecasts in 1984: How Accurate Were They?'
33 See, for example, Fair, 'The Effect of Economic Events on Votes for President' and updates.


Unbalanced campaigns or predictable convergence. Another hypothesis holds that the polls were accurate indicators of the candidates' fortunes throughout, and that they varied because Dukakis was legitimately ahead at the start of the campaign while Bush ran a better campaign and won the election. Rather than claim that the forecasters were lucky, this model assumes that the election result was successfully forecast because the convergence of the poll results to the general election outcome was predictable. Thus, under this hypothesis, support for the candidates really did change over the campaign, but this change was successfully predicted by the forecasts.

This hypothesis mixes journalists' and political science theories, in that it accepts the forecast, but still follows the story of the polls to understand why Bush won. It accords with the methods, but not the theories, of political science.

This hypothesis has a reasonable construction and is internally consistent. However, it does not explain why any forecasts should predict that Bush would run a better campaign - especially since the forecasting models include nothing which measures the two candidates' skills as campaigners. Certainly few journalists had any idea this was going to happen. Moreover, if Dukakis's advisers could have predicted that they were going to run a poor campaign, they certainly would have changed their strategy - thus making the forecast incorrect.

Uninformed voters. A final explanation posed by journalists is at the level of the voter. According to this idea, many people, or at least enough to swing elections, vote on the basis of factors that political scientists would not call 'fundamental', such as the personality of the candidates, gaffes, speaking style, campaign events and the like. Under this explanation, the voters who decide this way may truly care about these factors, or may just not know enough about the fundamental variables to make an informed decision. This model explains the swings in the pre-election polls, but does not explain how pre-campaign forecasting methods predict so well, given that the political science forecasts do not even try to account for personalities and campaign events.

3.3. Political Science Theories

In contrast, the political science theories take as a starting point that the ability of economists and political scientists to forecast election results accurately months ahead of time is evidence that the election came out just as predicted. We present two flawed explanations here: the first is quite possibly true, but incomplete, as it does not address the relation between the campaign and the opinion polls. The second hypothesis is plausible but can be refuted by our individual-level poll data.

Balanced campaigns. Under this hypothesis, forecasting models worked in 1988 because the campaigns were balanced, and thus the election outcome occurred roughly as Rosenstone and others had predicted on the basis of information available months before the election.

Although most journalists seem to deny it, political scientists believe this hypothesis to be almost certainly true. Unfortunately, even if true, it provides no solution to the key puzzle in the context of a model of voter decision making. The 1988 presidential election, like all modern presidential elections in which no incumbent was running, pitted two major-party campaigns that were roughly equal in strength and resources. There are plenty of examples during the campaign when astute political observers could suggest instances where one candidate could have done something better, but with equal funding and the best advisers each party has to offer, it would be surprising to see a campaign as unbalanced as for many voter referendums or for numerous local elections. We suspect that if a presidential election happened to be severely unbalanced (beyond the predictable unbalance associated with incumbency), political science forecasting models would probably not perform well. We happen not to have observed any such instances in modern times.

The fact that modern presidential campaigns seem to be balanced, which is consistent with the political science model of voter decision making, does not solve the puzzle about why the polls varied so much. The media wisdom about the 1988 election is that the outcome is explained by Dukakis running a poor campaign. Of course, this denies the hypothesis that the campaigns are balanced.

Thus, under the political science model, balanced campaigns cause no theoretical problems, but they say nothing about why the polls should vary so much. Under the journalists' model, balanced campaigns are inconsistent with the observation that the polls vary a lot. In neither case does this hypothesis explain the paradox.34

Partisans returning to the fold. Under another hypothesis, in January there is a large mass of undecided voters, and over the course of the campaign, the number of those who report being undecided drops, as different groups move towards their natural home. This is observationally similar to the non-response bias hypothesis, but is theoretically very different. An elaboration of this hypothesis is that strong partisans come home to their party first, then weaker partisans, and so on. Different events bring in different groups of voters, but under the hypothesis being discussed here, the strong ones come home first, then subsequent events bring in others later. In this model, the campaign ratchets in new groups of voters, who, once they migrate to the 'decided' category, tend to stay with their preference - perhaps due to psychological justification mechanisms.

The key evidence against this thesis is that the proportion of undecided voters does not drop over the course of the campaign (refer back to Figure 3). It is especially noteworthy that the proportion undecided does not drop during times of massive shifts in the polls (as recorded in Figure 1). The elaboration of this hypothesis also seems wrong since strong Republicans supported

    34 The two models are also inconsistent with one another about the evidence they provide on who ran a better campaign in 1988. Contrary to the journalists' claims (and even Dukakis himself), most political science models showed Dukakis doing as well or even better than expected, perhaps because Dukakis's vice-presidential selection was better (from an electoral perspective) than Bush's.


    Bush from the start, and did not move much over the course of the campaign. This can be seen in the first time-series plot of Figure 4 of Bush support by party identification.35 Moreover, support for Bush among the Democrats actually increased during the campaign, exactly opposite to what would be expected under this hypothesis. Short-term changes in overall support for Bush (conceivably in response to specific campaign events) actually appear to occur for Democrats, Republicans and Independents equally: the three series move together. Indeed, the same appears true for Bush support broken down by the other variables in Figure 4. It thus appears quite clear that support for this hypothesis in these data is largely non-existent.

    We do believe that voters are coming home to their natural preferences, but not that they are following the particular pattern of returning to the fold by party identification.

    3.4. Rational Actor Theories

    These theories are also political science theories, but they differ from those in the other categories because they are based on specific assumptions about individual voters. Because of the lack of any contrary evidence, we assume for each of the theories that voters answer survey questions about candidate support sincerely. This is consistent with theoretical evidence from two-candidate, winner-take-all races, where there is not much point in strategic voting. Moreover, it does not differ dramatically from the voting situation which, although somewhat more behavioural, is not more costly.

    Full information. Consider first the extreme version of the rational actor model. According to this model, people
    (1) have full information throughout the campaign about their fundamental variables,
    (2) are using all the information they have at any time to form their survey response or voting decision, and
    (3) are rationally accounting for this uncertainty, in the sense of maximizing some expected utility.
    If this model were accurate, political scientists would still forecast accurately, but the trial heat polls would not change at all over the campaign. Since the polls obviously do change, this model can be rejected, but it will nevertheless be useful in clarifying related models, as well as our preferred explanation presented in Section 4.

    Incomplete information. An incomplete information model assumes, from the full information model, that (1) is incorrect, but (2) and (3) hold. That is, voters gather information over the campaign, use this information in making their decisions, and rationally account for their uncertainty. If this model

    35 The Appendix shows that party identification and ideology in the population are roughly constant during the campaign.


    [Figure 4: time-series panels. Panel titles recoverable from the extraction: Ideology, Race, Region, Sex, Income, Education. Horizontal axis: days before election (-200 to 0). Vertical axis: 0.0 to 1.0.]

    Fig. 4. Trends in support for Bush by group, 1988
    Notes: This figure includes time-series plots of the proportion of survey respondents supporting Bush over Dukakis. Each graph tracks two to four groups, identified by the abbreviation on the left-hand side. These are defined more precisely in the Appendix. The lines within each graph tend to move together rather than in opposite directions, indicating that these different groups responded in a similar manner to events during the campaign.


    were correct, political science forecasts would work, as they do. On average over the whole campaign, we would expect changes in polls to occur in the direction of the forecasts; that is, as voters gathered more information, they would gradually move in the direction of their fundamental variables. This, too, is consistent with the evidence.

    However, the model implies that changes at any one time during the campaign would be relatively small, because voters would appropriately judge their uncertainty, at all times estimating the values of their fundamental variables and candidate positions. Sharp short-term changes in the polls - deviations from a trend towards the forecast poll positions - would occur only when campaign events were unexpected, such as if a candidate did much better than expected in a debate, or made a surprise change in his or her stand on an important issue.

    This model is partly right, but since we find (and show below) that the polls do respond to information that almost certainly was anticipated by voters, we reject this explanation.36

    4. TOWARDS AN EXPLANATION FOR POLL VARIATION

    Section 3 raised and then provided sufficient evidence to dismiss several plausible hypotheses of why the trial heat polls vary so much, even given our ability to forecast presidential election outcomes. We now turn to our preferred, but quite tentative, explanation, for which we present evidence in Section 5.

    Our working hypothesis is that voters cast their ballots in general election contests for president on the basis of their 'enlightened preferences'. As with the concept of enlightened preferences in the political philosophy literature,37 we do not require that people be able to discuss these preferences intelligently or even to know what they are; we only require that they know enough that their decisions are based on the true values of the fundamental variables. The function of the campaign, then, is to inform voters about the fundamental

    36 According to Condorcet's 'jury theorem', if some voters have incomplete information, then, under certain conditions, a majority-rule electoral system will produce outcomes equivalent to the situation that would exist if all voters were informed. This is obviously relevant to our inquiry, except that the assumptions required to prove this theorem are far too restrictive. Scholars have recently been quite successful at dropping some of these restrictive assumptions, so perhaps in the near future the two lines of research might converge. (See Nicholas R. Miller, 'Information, Electorates, and Democracy: Some Extensions and Interpretations of the Condorcet Jury Theorem', in Bernard Grofman and Guillermo Owen, eds, Information Pooling and Group Decision Making (Greenwich, Conn.: JAI Press, 1986); Krishna Ladha, 'Condorcet's Jury Theorem, Free Speech and Correlated Votes', American Journal of Political Science, forthcoming.) Related work in experimental economics has studied how markets proceed on the road to various types of equilibria. (See Charles R. Plott, 'An Updated Review of Industrial Organization: Applications of Experimental Methods', in R. Schmalensee and R. D. Willig, eds, Handbook of Industrial Organization, Volume II (Amsterdam: Elsevier Science Publishers, 1989); and 'Industrial Organization Theory and Experimental Economics', Journal of Economic Literature, 20 (1982), 1485-1527.)

    37 Dahl, Democracy and Its Critics.


    variables and their appropriate weights; notably, the candidates' ideologies and their positions on major issues.

    According to this explanation, only the second of the three assumptions under the full information rational actor model (see Section 3.4) is correct. That is, voters do not have full information and do not rationally judge or incorporate their uncertainty, but they do gather and use increasing amounts of information over the course of the campaign, with the largest increase occurring just before election day.38 We also assume that voters answer surveys about candidate support sincerely. We elaborate this model here.

    At the start of the campaign, voters do not have the information necessary to make enlightened voting decisions. Gathering this information is costly and most citizens have no particularly good reason to gather it in time for the pollster's visit, so long as it can be gathered when needed on election day.

    Most polls ask whether the respondent intends to vote, and the question appears to be answered sincerely and relatively accurately. Likely voters with insufficient information at the time of the poll still report that they will cast a ballot on election day. Unfortunately, those who consider themselves 'voters' are willing to report to pollsters their 'likely' voting decisions, even if they have not gathered sufficient information to make this report accurate. The reason is the quite general point, as much psychological research has shown, that human beings are very poor at estimating uncertainty and at making fully rational decisions based on uncertain or incomplete information.39 People also make decisions based on these incorrect uncertainty judgements, producing, in only this narrow sense, 'irrational' decisions. Compounding the problem is the awkward situation of the survey interview: imagine survey respondents who, when asked, indicate that they will vote; then, when later asked for the name of the candidate who will get their vote, they are embarrassed to reveal their ignorance or uncertainty, especially after already saying that they would vote.40

    Thus, without sufficient knowledge of their fundamental variables, and when asked to give an opinion anyway, most respondents act as they will in the voting booth on election day: they use information at their disposal about their fundamental variables, and report a 'likely' vote to the pollster. We believe that this report to the pollster is sincere, but the survey response is still based

    38 See Samuel Popkin, The Reasoning Voter: Communication and Persuasion in Presidential Campaigns (Chicago: University of Chicago Press, 1991).

    39 Daniel Kahneman, Paul Slovic and Amos Tversky, eds, Judgment Under Uncertainty: Heuristics and Biases (New York: Cambridge University Press, 1982).

    40 Designing surveys so as to reduce this embarrassment, making it easy to report 'no opinion', would not necessarily improve the forecasting ability of the polls, since those voters who express a 'certain' opinion seem to mirror the survey population as a whole; see the discussion of question wording in Section 3.1 and Figure 2. A very useful future research project would be to design a survey or experiment to encourage voters to account rationally for their uncertainty (perhaps by giving them more time or financial incentives to give the 'right' answer), and see if it makes a difference to their reply.


    on a different information set from that which will be available by the time of the election. It will therefore differ systematically from the eventual vote to the extent that the voter's information set improves over the course of the campaign. In relatively high-information, balanced campaigns, voters gradually improve their knowledge of their fundamental variables and generally have sufficient information by election day.

    Thus, the campaign itself will confer no large unexpected advantages on one party or the other. This accounts for forecasting models, based on information available only at the start of the general election campaign, working well. However, this does not make the campaign irrelevant, because without it election outcomes would be very different. Moreover, if one candidate were to slack off and not campaign as hard as usual, the campaigns would not be balanced and the election result would also be likely to change. Thus, under this explanation, presidential election campaigns play a central role in making it possible for voters to become informed so they can make decisions according to the equivalent of enlightened preferences when they get to the voting booth. This process then depends on the media to provide information, which they do throughout the campaign, and the voters to pay attention, which they do disproportionately just before election day.

    Note that we are not arguing that there exists an identifiable group of uninformed voters, who gradually become more informed than other groups over the course of the campaign. While it is undeniably true that knowledge varies considerably across citizens at any one time, we find that virtually all groups of eventual voters have their preferences gradually enlightened during the campaign by roughly the same amounts.

    If this explanation of our central puzzle is correct, the remaining question is not why the polls move in the direction they do, since we already know that they move in the direction of the political scientists' forecasts. The relevant question is why they begin where they do. Our hypothesis is that the early position of the polls is a result of the information that is readily available at the start of the general election campaign. For example, Dukakis's race against Jesse Jackson alone at the end of the Democratic nomination positioned him as quite conservative. In part as a result of this, Dukakis was seen at the start of the general election campaign as more conservative than he was (and at times even more conservative than Bush). As citizens learned more about the appropriate values of their fundamental variables, voter support for the candidates responded.
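The account in this section can be caricatured numerically. The Python sketch below is an illustrative assumption, not a model estimated or proposed in the article: it treats reported support as a weighted blend of an early impression (what is readily available at the start of the campaign) and the enlightened preference implied by the fundamental variables, with the weight on the latter rising as election day nears. The function names, the linear blend and all numbers are hypothetical.

```python
from typing import List


def reported_support(early_impression: float,
                     enlightened_preference: float,
                     info_weight: float) -> float:
    """Blend an early impression with the enlightened preference.

    info_weight is the (hypothetical) share of the survey response that
    already reflects the fundamental variables: 0.0 at the start of the
    campaign, 1.0 on election day.
    """
    if not 0.0 <= info_weight <= 1.0:
        raise ValueError("info_weight must lie in [0, 1]")
    return ((1.0 - info_weight) * early_impression
            + info_weight * enlightened_preference)


def poll_path(early_impression: float,
              enlightened_preference: float,
              weights: List[float]) -> List[float]:
    """Reported support at each point of the campaign."""
    return [reported_support(early_impression, enlightened_preference, w)
            for w in weights]


# Hypothetical numbers: early polls put the candidate at 40 per cent, the
# fundamentals imply 54 per cent, and most learning happens late.
weights = [0.0, 0.1, 0.2, 0.4, 1.0]
path = poll_path(0.40, 0.54, weights)
```

In this toy version the end point is fixed by the fundamentals, so a forecast made at the start of the campaign is right, while the starting point and the timing of the movement depend entirely on the early impression and on when information is absorbed. It deliberately leaves out the short-term swings discussed elsewhere in the article.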

    5. EVIDENCE FOR ENLIGHTENED PREFERENCES

    As we indicated at the start of this article, we have much more evidence about why many possible explanations are wrong than about which one is right. In particular, we are handicapped in our analysis here by having no direct measures of voter information over the campaign, or of some of the fundamental

  • 7/29/2019 Gelman and King - Why Are American Presidential Election Campaign Polls So Variables When Votes Are So Predict

    28/43

    variables the forecasters use in their models.41 Our strategy, then, is to extract whatever information is available in our data, and leave it to future research to more firmly establish or refute this explanation.

    We begin by providing evidence that preferences early in the campaign are relatively unenlightened. From one perspective, this should neither be difficult nor perhaps even necessary to show, since numerous scholarly studies have demonstrated the ignorance of Americans about most matters of policy and politics. However, we do not require citizens to be able to verbalize their motivations or detailed positions on their fundamental variables. The idea of making voting decisions on the basis of enlightened preferences only requires that voters cast their ballots in the same manner as they would if they had full information and time for a complete consideration of all issues. Thus, survey questions about citizen knowledge would not directly answer our concerns. For the same reasons, it would also not be a good strategy to ask survey respondents what their fundamental variables are. A measure of the 'revealed preferences' of this group of citizens would be better, but one cannot observe individual-level political behaviour in polling data.

    Instead we look for systematic discrepancies between actual voter support and expected support, which we calculate on the basis of measured demographic and fundamental variables. We do this in four different ways in this section, each a different observable consequence of the same theory of poll variation described in Section 4. We begin in Section 5.1 by demonstrating the 'irrationality' of early poll movements. Section 5.2 shows that the fundamental variables are of increasing importance over the campaign. We explore how voters weight the fundamental variables in decision-making in Section 5.3 and demonstrate in Section 5.4 that changes in these weights, and not the values of the variables, are what account for poll fluctuations.

    41 Some of the most important variables forecasters use do not change over the course of the campaign, such as incumbency status and some other national variables. That we have no information on these does not affect our inferences because they are effectively controlled by being held constant. The remaining variables that might have some effect include perceived economic well-being and perceived ideological distances between voters and candidates, both of which might change over the campaign.

    5.1. The 'Irrationality' of Early Poll Movements

    We first demonstrate that voters do not 'rationally' account for uncertainty in using information to make decisions about supporting presidential candidates. We show this by focusing on predictable changes associated with totally expected campaign events, something that should not occur if survey respondents were fully rational.

    Figure 1 presents the proportion of supporters for each party over the course of the campaign, and marks the dates of the Democratic and Republican party conventions. In order to see the effects of these party conventions on support for the presidential candidates more clearly, Figure 5 plots the proportion supporting the Republican candidate before and after each convention since 1964.42 Republican conventions are marked 'R' and Democratic conventions marked 'd'. If a point appears above the 45° line, Republican support went up after the convention; if it is below the line, Republican support dropped. If these conventions had no effect on the level of support, the points would be scattered randomly on and about the 45° line. The results are unambiguous: support for the Republican candidate increased after all Republican conventions and decreased after all but one Democratic convention. The 1988 conventions, which are circled, are fairly typical of the points on the graph, lending credence to our more detailed analysis of that election year.43

    [Figure 5: scatterplot. Horizontal axis: Republican support before the convention (0.3 to 0.6).]

    Fig. 5. Effects of party conventions on presidential campaign polls, 1964-88
    Notes: This figure summarizes all the plots in Figure 1 before and after the party conventions. Each 'R' refers to survey support for the Republicans before and after a Republican convention; each 'd' indicates support before and after a Democratic convention. When a symbol appears above the 45° line, it indicates that support for the Republican candidate increased during the convention, whereas symbols below the line indicate that support for the Republican candidate declined. Note that all R's appear above the line and almost all d's appear below the line. The 1988 conventions are circled and appear typical of public opinion swings during the conventions.
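The reading of Figure 5 given in the notes amounts to comparing each point with the 45° line, that is, checking whether Republican support after a convention is higher or lower than before it. A minimal Python sketch of that tally follows. The before/after values are hypothetical placeholders, not the Gallup figures plotted in the article, and the function name and the 'R'/'D' party codes are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# (party holding the convention, Republican support before, Republican support after)
# Hypothetical values for illustration only; not the data behind Figure 5.
Convention = Tuple[str, float, float]


def summarize_bumps(points: List[Convention]) -> Dict[str, Dict[str, int]]:
    """Count, for each party's conventions, how many points fall above,
    below or on the 45-degree line (where after == before)."""
    summary = {
        "R": {"above": 0, "below": 0, "on": 0},
        "D": {"above": 0, "below": 0, "on": 0},
    }
    for party, before, after in points:
        if after > before:
            summary[party]["above"] += 1   # Republican support rose
        elif after < before:
            summary[party]["below"] += 1   # Republican support fell
        else:
            summary[party]["on"] += 1      # no change
    return summary


example = [
    ("R", 0.45, 0.50),
    ("R", 0.52, 0.55),
    ("D", 0.50, 0.44),
    ("D", 0.48, 0.49),
]
result = summarize_bumps(example)
```

If conventions had no systematic effect, the 'above' and 'below' counts would be roughly balanced for both parties; the pattern reported in the article is instead that nearly all 'R' points fall above the line and nearly all 'd' points below it.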

    42 We omit 1952-60 from Figure 5 because Gallup did not list polls between the two conventions for those years.

    43 James E. Campbell, Lynna L. Cherry and Kenneth A. Wink, 'The Convention Bump', American Politics Quarterly (1993, forthcoming) also discuss poll movements during conventions.

  • 7/29/2019 Gelman and King - Why Are American Presidential Election Campaign Polls So Variables When Votes Are So Predict

    30/43

    The clear results from Figure 5 are consistent with our explanation in Section 4, for if people were informed and reflective about their candidate preferences early on in the campaign, they would also be able to predict that their opinions will change after each party convention. In that case, they would realize that they should change these preferences immediately. Thus, if people were rationally incorporating their uncertainty about future events we would not witness any predictable changes in support for the candidates. Recall that if the full or incomplete information rational models were correct, only unexpected information would change voter preferences. Yet, almost all aspects of modern political conventions have also been extremely predictable, from the nominee to most aspects of the platform, and even the 'spontaneous demonstrations' on the convention floor for various candidates. We know the conventions produce almost exclusively expected information from merely watching the news on the days leading up to the conventions. Moreover, any voter who was aware during the convention four years earlier (or was reminded of this by the media) should not be surprised by anything that happens during any recent political convention.44

    This logic also applies more generally, if the political science foreca