Invited session: Index models for forecasting
1. Introduction of the index method (Andreas Graefe)
2. Predicting the effectiveness of advertisements: A validation test of the index method (Scott Armstrong)
3. Who should be nominated to run in the 2012 U.S. Presidential Election? (Andreas Graefe)
4. Comment on “Who should be nominated to run in the 2012 U.S. Presidential Election?” (Michael Lewis-Beck)
Index method
Structured approach for summarizing domain knowledge (e.g., prior research, expert knowledge).
Long history in forecasting and decision-making.
An early advocate: Benjamin Franklin
“Divide half a sheet of paper by a line into two columns, writing over the one Pro and over the other Con.
I put down under the different headings
short lists of the different motives, that at different times occur to me for and against the measure.
When I have got them all together in one view, I endeavour to estimate the respective weights.”
An early application: Charles Darwin’s decision on marriage
Index method procedure
1. Identify variables from domain knowledge (i.e. prior empirical studies and/or subjective judgment by experts)
2. Use prior evidence to determine their directional influence on the outcome
3. Estimate relationships:
- Tallying (i.e., 1: favorable; 0: unfavorable/unknown)
- Linear model, Dawes’ rule (i.e., 1: favorable; 0: unknown; -1: unfavorable)
4. Sum up scores using unit or equal weights.
5. Select the option that is favored by most variables.
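The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the option names, and the variable names are hypothetical, and each rating is assumed to be stored as True (favorable), False (unfavorable), or None (unknown).

```python
from typing import Callable, Dict, Optional, Tuple

# Assumed encoding: True = favorable, False = unfavorable, None = unknown.
Ratings = Dict[str, Optional[bool]]


def tally_score(ratings: Ratings) -> int:
    """Tallying: 1 for each favorable variable, 0 for unfavorable or unknown."""
    return sum(1 for rating in ratings.values() if rating is True)


def dawes_score(ratings: Ratings) -> int:
    """Dawes' rule: +1 favorable, 0 unknown, -1 unfavorable (unit weights)."""
    points = {True: 1, None: 0, False: -1}
    return sum(points[rating] for rating in ratings.values())


def select_option(
    options: Dict[str, Ratings],
    score: Callable[[Ratings], int] = tally_score,
) -> Tuple[str, Dict[str, int]]:
    """Sum the unit-weighted scores and pick the option favored by most variables.

    Ties are resolved in favor of the option listed first (an arbitrary choice).
    """
    scores = {name: score(ratings) for name, ratings in options.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Hypothetical options and variables, for illustration only.
options = {
    "option_a": {"var_1": True, "var_2": True, "var_3": None},
    "option_b": {"var_1": True, "var_2": False, "var_3": False},
}

best_tally, tally_scores = select_option(options, tally_score)
best_dawes, dawes_scores = select_option(options, dawes_score)
print(best_tally, tally_scores)
print(best_dawes, dawes_scores)
```

Because no weights are estimated from data, adding a variable only means adding one more rating per option.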
Advantages of Index models
No limit in the number of variables
No need to estimate weights from the data and thus
- No sample necessary for building the model.
- Easy to incorporate new variables in the model.
Ease in understanding.
Disadvantages of Index models
Expensive to summarize prior knowledge.
Difficult to account for effect size (coefficients and the amount of change in predictor variables).
Conditions favoring the index method
1. Many variables
2. Few observations
3. Much prior domain knowledge
Who should be nominated to run in the 2012 U.S. presidential election?
Long-term forecasts based on candidates’ biographies
Andreas Graefe, Sky Deutschland
J. Scott Armstrong, Wharton School, University of Pennsylvania
This talk is an extension of: tinyurl.com/bioindex
International Symposium on Forecasting
Prague, June 27, 2011
Outline
1. Status-quo in election forecasting
2. Index models for forecasting elections
3. Bio-index model
4. 2012 forecasts of the bio-index model
U.S. Presidential Election forecasting: Evolution
1978: Economist Ray Fair publishes regression model that focuses on economic growth and inflation as predictor variables.
[Fair, 1978]
Over the next three decades, others would follow with models that use slightly different variables.
U.S. Presidential Election forecasting: Status quo (I)
In a brief review of 14 quantitative models by economists and political scientists:
- All were regression models
- 12 used a measure of the state of the economy
- 7 used a measure of the incumbent’s popularity
- 5 used both
[Jones & Cuzan, 2008]
Most models are economic vote models.
On average, these models perform well.
Widely-held view that a presidential election is a referendum on the incumbent president’s ability to handle the economy
Presidential campaigns, individual differences among candidates, and parties are assumed to have little impact on election outcomes.
U.S. Presidential Election forecasting: Status quo (II)
But what about the candidates?
Candidates play a vital role in election campaigns and are extensively discussed in the media, e.g., their
- Biography (experience)
- Personality
- Stands on the issues
- Endorsed policies
Yet, no existing model uses such information.
Most existing models provide little help for decision-making in campaigns (an exception is Cuzan & Bundrick (1984), which addresses fiscal policy).
Research with decision-making implications
Develop models that can help to advise on:
- A candidate’s decision on whether to run for office
- A party’s decision about whom to nominate
- Decisions as to what issues a candidate should emphasize in a campaign
- Decisions as to which policies to endorse
Improving the PollyVote forecast
Combining forecasts is most beneficial if one uses forecasts from different methods that use different information.
[Armstrong, 2001]
Index model forecasts should contribute to the accuracy of the combined forecast.
Outline
1. Status-quo in election forecasting
2. Index models for forecasting elections
3. Bio-index model
4. 2012 forecasts of the bio-index model
The first index model for forecasting elections
Lichtman’s 13 Keys to the White House has been published for years. He made forecasts of the past 38 presidential elections (7 prospectively).
[Lichtman, 2008]
In all cases, the model’s predictions of the popular vote winner have been correct. No other approach has come close to this record.
From 1984 to 2004, Lichtman’s “Keys” yielded forecast errors of the popular vote shares almost as low as three established econometric models.
[Armstrong & Cuzán, 2006]
Current forecast for 2012: Obama 55.0%.
The issue-index model (tinyurl.com/issueindexmodel)
Predicts U.S. presidential election winners based on how voters expect the candidates to handle the issues
Examples: economy, budget deficit, Afghanistan, health care
Correctly predicted the winner in 9 of the 10 elections from 1972 to 2008 and thereby outperformed polls, prediction markets, and many econometric models.
[Armstrong & Graefe, 2011]
Current forecast for 2012: Obama 54.1%
(check out pollyvote.com for continuous coverage)
Outline
1. Status-quo in election forecasting
2. Index models for forecasting elections
3. Bio-index model
4. 2012 forecasts of the bio-index model
Prediction problem
Forecast U.S. presidential election outcome from information about candidates’ biographies
Condition 1: Few observations
Biographical data were collected for the candidates of the two major parties for the 29 U.S. Presidential Elections from 1896 to 2008.
Condition 2: Large number of variables
The bio-index uses 59 variables.
Examples:
- Single child
- Intelligence
- Weight
- Age
- Education
- Prestigious college
- Political positions held
- Military experience
- Race
- Gender
Condition 3: Much domain knowledge
Large body of literature in psychology on the effect of biographical traits on leadership
[Antonakis, 2011: Predictors of leadership]
- Traits that objectively matter (e.g., intelligence, height)
- Traits that seemingly matter (e.g., facial competence, physical attractiveness)
We expected that voters are influenced by both types of traits when making their voting decisions.
Example of a biographical factor: Facial competence
Prior research found that facial competence led to
- 68% correct predictions in U.S. Congressional elections
[Todorov et al., 2005]
- 72% correct predictions in French parliamentary elections, even by children
[Antonakis & Dalgas, 2009]
- In May and August 2007, the most competent-looking candidates were Clinton and Obama among the 11 Democrats and McCain among the 13 Republicans
[Armstrong et al., 2011]
Bio-index
For each variable, the directional impact on the election outcome was determined based on
- prior research (e.g. intelligence, height, beauty, facial competence)
- common sense (e.g. married, not divorced, has children)
Coding
Four coders searched candidates’ biographies, fact books, and encyclopaedias and used data from prior studies or polls.
Yes / no variables
Each candidate was assigned a score of 1 if he possessed a certain trait (at the time of the election campaign) and 0 otherwise.
Examples: Orphan, Single child, Not divorced, Governor, …
Comparative variables
Each candidate was assigned a score of 1 if he scored higher than his opponent on a particular cue and 0 otherwise.
Examples: Height, IQ, beauty
Procedure for predicting the winner
Calculate the overall index score for each candidate.
Decision rule (Bio-index heuristic)
Predict the candidate with the higher index score to win the popular two-party vote.
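The coding scheme and decision rule can be illustrated with a short Python sketch. It rests on stated assumptions rather than the authors' actual coding sheet: the `Candidate` class, the trait and cue names, and the numbers are hypothetical, every yes/no trait is assumed to be coded so that possessing it is favorable, and a higher value is assumed favorable on every comparative cue.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Candidate:
    name: str
    # Yes/no variables: True if the candidate possesses the trait.
    traits: Dict[str, bool] = field(default_factory=dict)
    # Comparative variables: raw values; higher is assumed to be favorable.
    cues: Dict[str, float] = field(default_factory=dict)


def index_score(candidate: Candidate, opponent: Candidate) -> int:
    """Unit-weighted sum of yes/no points and comparative points."""
    yes_no_points = sum(1 for has_trait in candidate.traits.values() if has_trait)
    comparative_points = sum(
        1
        for cue, value in candidate.cues.items()
        if cue in opponent.cues and value > opponent.cues[cue]
    )
    return yes_no_points + comparative_points


def predict_winner(a: Candidate, b: Candidate) -> Optional[str]:
    """Bio-index heuristic: the higher index score wins; None signals a tie."""
    score_a = index_score(a, b)
    score_b = index_score(b, a)
    if score_a == score_b:
        return None
    return a.name if score_a > score_b else b.name


# Hypothetical candidates, for illustration only.
candidate_a = Candidate(
    "Candidate A",
    traits={"governor": True, "not_divorced": True, "single_child": False},
    cues={"height_cm": 185.0},
)
candidate_b = Candidate(
    "Candidate B",
    traits={"governor": False, "not_divorced": True, "single_child": False},
    cues={"height_cm": 180.0},
)

print(index_score(candidate_a, candidate_b), index_score(candidate_b, candidate_a))
print(predict_winner(candidate_a, candidate_b))
```

The slides do not say how ties are handled; returning None here is simply one way to make a tie visible.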
Performance of the Bio-index heuristic
- Correctly predicted 27 out of 29 election winners
- Hit rate: 93% correct predictions
- Missed Carter in 1976 and Clinton in 1992
- Higher hit rate than
  - Election Eve Gallup polls (15 of 19),
  - Election Eve prediction market forecasts (22 of 26), and
  - the average of three established regression models (12.5 of 15.5)
- Relying only on information from the respective election year
- Providing long-term forecasts
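The hit rates behind this comparison can be recomputed from the counts on the slide. The snippet below is a small sketch that uses only those counts; the dictionary layout and function name are illustrative choices.

```python
# (correct predictions, elections covered), as reported on the slide.
records = {
    "Bio-index heuristic": (27, 29),
    "Election Eve Gallup polls": (15, 19),
    "Election Eve prediction markets": (22, 26),
    "Average of three regression models": (12.5, 15.5),
}


def hit_rate(correct: float, total: float) -> float:
    """Share of correctly predicted winners, in percent."""
    return 100.0 * correct / total


hit_rates = {name: hit_rate(*counts) for name, counts in records.items()}
for name, rate in sorted(hit_rates.items(), key=lambda item: -item[1]):
    print(f"{name}: {rate:.1f}%")
```

Note that the benchmarks cover different numbers of elections, so the rates are not computed over identical samples.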
Bio-index model for predicting vote-shares
The simple heuristic performs well in predicting the winner.
But it does not allow for predicting the popular vote share.
Bio-index model
Simple linear regression to relate the relative index score (I) of the incumbent to the popular vote (V)
Vote equation: V = 18.0 + 0.65 * I
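A short sketch of how the vote equation might be applied. The slide does not define the relative index score, so the code assumes, purely for illustration, that I is the incumbent's index score expressed as a percentage of the sum of both candidates' scores; the function names and the example scores are hypothetical.

```python
def relative_index_score(incumbent_score: float, challenger_score: float) -> float:
    """ASSUMPTION: I = incumbent's share of the two candidates' summed scores, in percent."""
    total = incumbent_score + challenger_score
    if total == 0:
        raise ValueError("At least one candidate needs a non-zero index score.")
    return 100.0 * incumbent_score / total


def predict_vote_share(relative_score: float) -> float:
    """Vote equation from the slide: V = 18.0 + 0.65 * I."""
    return 18.0 + 0.65 * relative_score


# Hypothetical index scores: both candidates tied at 20 points.
i_tied = relative_index_score(20, 20)
print(i_tied, predict_vote_share(i_tied))
```

Under this assumed definition, two candidates with equal index scores give I = 50 and a predicted incumbent share slightly above 50%.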
Bio-index vs. 7 regression models (1996-2008)
Model                  Date of forecast   1996   2000   2004   2008   MAE
Bio-index              January             4.4    2.3    0.4    0.2   1.8
Norpoth                January             2.4    4.7    3.5    3.6   3.5
Abramowitz             Late July           2.1    2.9    2.5    0.6   2.0
Fair                   Late July           3.5    0.5    6.3    2.2   3.1
Wlezien and Erikson    Late August         0.2    4.9    0.5    1.5   1.8
Lewis-Beck and Tien    Late August         0.1    5.1    1.3*   3.6   2.5
Holbrook               Late August         2.5   10.0    3.3    2.0   4.4
Campbell               Early September     3.4    2.5    2.6    6.4*  3.7
* Predicted wrong election winner
Absolute error of out-of-sample forecasts for the past four elections
Bio-index MAE as low as MAE of most accurate model
Bio-index forecast calculated long before the forecasts of most other models
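The MAE column is the mean of the four absolute errors in each row. The sketch below recomputes it from the rounded table entries; because the inputs are already rounded, a recomputed value can differ from the reported MAE in the last digit. The dictionary and function names are illustrative.

```python
from statistics import mean
from typing import Dict, List

# Absolute errors for 1996, 2000, 2004, 2008, copied from the table.
absolute_errors: Dict[str, List[float]] = {
    "Bio-index": [4.4, 2.3, 0.4, 0.2],
    "Norpoth": [2.4, 4.7, 3.5, 3.6],
    "Abramowitz": [2.1, 2.9, 2.5, 0.6],
    "Fair": [3.5, 0.5, 6.3, 2.2],
    "Wlezien and Erikson": [0.2, 4.9, 0.5, 1.5],
    "Lewis-Beck and Tien": [0.1, 5.1, 1.3, 3.6],
    "Holbrook": [2.5, 10.0, 3.3, 2.0],
    "Campbell": [3.4, 2.5, 2.6, 6.4],
}


def mean_absolute_error(errors: List[float]) -> float:
    """Average of the absolute forecast errors across elections."""
    return mean(errors)


mae_by_model = {
    model: mean_absolute_error(errors) for model, errors in absolute_errors.items()
}
for model, mae in sorted(mae_by_model.items(), key=lambda item: item[1]):
    print(f"{model}: {mae:.3f}")
```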
Limitations of the bio-index
Costs
Must summarize prior knowledge about the field.
Must have various coders.
Acceptability
Easy to understand and thus easy to criticize.
People wrongly believe that complex methods are necessary to solve complex problems. They exhibit a general resistance to simple solutions.
[Hogarth, in press]
Benefits of bio-index
Simple to use and easy to understand.
Contributes to accuracy of the PollyVote by using a different method and drawing upon different information
Can aid political decision-making
1. Can help political parties in nominating candidates for office.
2. Can help political candidates to decide whether to run for office.
Outline
1. Status-quo in election forecasting
2. Index models for forecasting elections
3. Bio-index model
4. 2012 forecasts of the bio-index model
                     Chance to win GOP nomination   Chance to win election   Index score   Index model
Candidate            RCP polls**   Intrade**        (Intrade**)              difference    forecast
Rick Perry               5.3         17.1              6.6                      +1            50.3
David Petraeus           0            0.1              0                         0            49.5
Newt Gingrich*           7.1          1.3              1.0                      -2            47.9
Donald Trump             0            0.2              0.6                      -2            47.9
Michele Bachmann*        6.3          9                2.8                      -2            47.7
Rudy Giuliani           11.0          1.8              0                        -3            47.0
Mitt Romney*            24.4         35.6             16.5                      -4            46.3
Tim Pawlenty*            4.9          9.8              4.0                      -4            46.1
Rick Santorum*           3.7          0.6              0.2                      -4            46.1
Jon Huntsman*            1.3          9.6              5.4                      -5            45.3
Sarah Palin             16.0          5.1              3.2                      -5            44.6
Ron Paul*                6.9          2.4              1.7                      -6            44.4
Mike Huckabee            0            0.2              0.2                      -6            43.8
Herman Cain*             9.3          2.0              1.3                      -7            43.0
* Announced to run; ** RCP and Intrade forecasts as of June 25, 2011
Summary
The bio-index predicts a tough time for Republicans to gain back the White House.
Of 14 potential nominees, currently only Texas Governor Rick Perry achieves an index score higher than Obama.
Limitations:
- Some variables have not yet been estimated (e.g., facial competence, intelligence, weight).
- The bio-index model ignores much information such as performance and ability to handle issues.
The primary concern is with finding candidates that are within the index score range of Obama.
Are you fit to be president?
You think you know a candidate who could win against Obama in 2012?
Or, you want to test your own chances to win?
Check out the Are you fit to be president? test at
www.pollyvote.com
Conclusions
We used the index method to develop the bio-index, which is based on 59 cues about candidates’ biographies.
The bio-index correctly predicted the winner in 27 of the last 29 U.S. Presidential Elections.
Out of 14 potential Republican nominees, only Rick Perry is predicted to defeat Obama in a potential 2012 showdown.
PollyBio contributes to the accuracy of long-term election forecasting and can help parties to select candidates running for office.
Further improvements in accuracy are expected based on the index method – which itself can be used in many other applications.
Conditions for forecasting U.S. Presidential Elections
Conditions favoring multiple regression vs. the index method:
- Few observations (data on about 25 elections)
- Many variables
- Much domain knowledge (e.g., expertise, prior studies, polls)
These conditions favor the index method, in particular if one wants to incorporate individual differences between candidates.
Outline
1. Status-quo in election forecasting
2. Index models for forecasting elections
3. Bio-index model
4. 2012 forecasts of the bio-index model
5. Future applications of the index method
6. Conclusions
Future work on index models (1)
Predict the election outcome based on how voters perceive the candidates’ personalities.
E.g., which of the candidates is more likable, honest, etc.
Implications for decision-making:
- Helps candidates to decide whether to run
- Helps parties to decide who to nominate
Future work on index models (2)
Predict the election outcome based on how voters agree with candidates’ positions on policies.
Implications for decision-making:
- Helps candidates to decide which policies to pursue.
Examine policies related to issues such as gun control, income taxes, free trade, abortion, government spending to see which candidate is closest to the opinions of the voters on more policies.
The Index Model Challenge
Index method will be more accurate than econometric models in situations with
- many variables
- much prior knowledge (especially experiments)
and
- lack of data, measurement errors, and collinearity.
Examples: Selecting CEOs, drafting athletes, marriages, economic growth rates of nations, value of real estate, medical treatments, effectiveness of ads.
Background: PollyVote.com project
The PollyVote project was begun in 2003 to demonstrate the value of forecasting principles by applying them to election forecasting.
The initial focus was on combining forecasts.