Post on 13-Jan-2016
ResearchResearch
Prediction Markets and the Wisdom of Crowds
David Pennock, Yahoo! Research
Joint with:
Yiling Chen, Varsha Dani, Lance Fortnow, Ryan Fugger, Brian Galebach, Arpita Ghosh, Sharad Goel, Mingyu Guo, Joe Kilian, Nicolas Lambert, Omid Madani, Mohammad Mahdian, Eddie Nikolova, Daniel Reeves, Sumit Sanghai, Mike Wellman, Jenn Wortman
Bet = Credible Opinion
• Which is more believable? More informative?
  – “Obama will win the 2008 US Presidential election”
  – “I bet $100 Obama will win at 1 to 2 odds”
• Betting intermediaries
  – Las Vegas, Wall Street, Betfair, Intrade, ...
• Prices: stable consensus of a large number of quantitative, credible opinions
• Excellent empirical track record
A Prediction Market
• Take a random variable, e.g. “Bin Laden captured in 2008? (Y/N)”
• Turn it into a financial instrument: payoff = realized value of variable
• I am entitled to:
  – $1 if Bin Laden caught ’08
  – $0 if Bin Laden not caught ’08
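The contract on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the $0.25 price are hypothetical and do not come from the talk. It shows why a trader whose belief differs from the price has a reason to trade, which is what lets the price be read as a consensus probability.

```python
def payoff(event_occurred: bool) -> float:
    """Payoff of one share of a binary contract: $1 if the event occurs, else $0."""
    return 1.0 if event_occurred else 0.0


def expected_profit(belief: float, price: float, shares: float = 1.0) -> float:
    """Expected profit from buying `shares` at `price`, for a trader who
    believes the event happens with probability `belief`."""
    expected_payoff = belief * payoff(True) + (1.0 - belief) * payoff(False)
    return shares * (expected_payoff - price)


# Hypothetical numbers: the contract currently trades at $0.25.
price = 0.25
print(expected_profit(belief=0.40, price=price))  # positive: this trader wants to buy
print(expected_profit(belief=0.10, price=price))  # negative: this trader wants to sell
print(expected_profit(belief=0.25, price=price))  # zero: belief already matches the price
```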
http://intrade.com
Outline
• The Wisdom of Crowds
• The Wisdom of Markets
• Prediction Markets: Examples & Research
• Does Money Matter?
• Combinatorial Betting
[Segment labels: Story, Survey, Research, Research]
A WOC Story
• ProbabilitySports.com
• Thousands of probability judgments for sporting events
  – Alice: Jets 67% chance to beat Patriots
  – Bob: Jets 48% chance to beat Patriots
  – Carol, Don, Ellen, Frank, ...
• Reward: Quadratic scoring rule: Best probability judgments maximize expected score
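A minimal sketch of why a quadratic scoring rule rewards honest probabilities. It assumes the per-game form 100 - 400*(lose_price)^2 that appears in the results table later in this deck; the helper names and the grid search are illustrative only.

```python
def quadratic_score(prob_on_winner: float) -> float:
    """Score for one game, given the probability assigned to the eventual winner."""
    return 100.0 - 400.0 * (1.0 - prob_on_winner) ** 2


def expected_score(report: float, belief: float) -> float:
    """Expected score from reporting `report` for a team that the forecaster
    actually believes will win with probability `belief`."""
    return (belief * quadratic_score(report)
            + (1.0 - belief) * quadratic_score(1.0 - report))


belief = 0.67  # Alice's belief that the Jets beat the Patriots
reports = [i / 100 for i in range(101)]  # candidate reports 0.00, 0.01, ..., 1.00
best_report = max(reports, key=lambda r: expected_score(r, belief))
print(best_report)
```

Under this form a 50% forecast always scores 0, a certain and correct forecast scores 100, and a certain and wrong forecast scores -300, so overconfidence is punished heavily.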
Individuals
• Most individuals are poor predictors
• 2005 NFL Season
  – Best: 3747 points
  – Average: -944; Median: -275
  – 1,298 out of 2,231 scored below zero (takes work!)
Individuals
• Poorly calibrated (too extreme)
• Teams given < 20% chance actually won 30% of the time
• Teams given > 80% chance actually won 60% of the time
The Crowd
• Create a crowd predictor by simply averaging everyone’s probabilities
  – Crowd = 1/n (Alice + Bob + Carol + ... )
• 2005: Crowd scored 3371 points (7th out of 2231)!
• Wisdom of fools: Create a predictor by averaging everyone who scored below zero
  – 2717 points (62nd place)!
  – (the best “fool” finished in 934th place)
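The crowd predictor can be sketched as follows. All forecasts and outcomes below are made up for illustration, and the scoring function assumes the 100 - 400*(lose_price)^2 form shown later in the deck.

```python
from statistics import mean


def quadratic_score(prob_on_winner: float) -> float:
    """Per-game quadratic score (assumed form: 100 - 400 * lose_price^2)."""
    return 100.0 - 400.0 * (1.0 - prob_on_winner) ** 2


def total_score(probs, outcomes):
    """Sum of per-game scores; probs[i] is the probability that the home team wins game i."""
    return sum(quadratic_score(p if won else 1.0 - p)
               for p, won in zip(probs, outcomes))


# Hypothetical data: three contestants, three games.
forecasts = {
    "Alice": [0.67, 0.90, 0.20],
    "Bob":   [0.48, 0.60, 0.55],
    "Carol": [0.80, 0.30, 0.40],
}
home_won = [True, True, False]

# Crowd = 1/n (Alice + Bob + Carol), computed game by game.
crowd = [mean(game) for game in zip(*forecasts.values())]

crowd_score = total_score(crowd, home_won)
individual_scores = {name: total_score(p, home_won) for name, p in forecasts.items()}
average_individual = mean(list(individual_scores.values()))

print(crowd_score, average_individual)
```

Because the quadratic score is concave in the reported probability, the averaged forecast always scores at least as well as the average contestant. Finishing near the top of the field, as the 2005 crowd did, is an empirical result and not something this argument guarantees.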
The Crowd: How Big?
More:
http://blog.oddhead.com/2007/01/04/the-wisdom-of-the-probabilitysports-crowd/
http://www.overcomingbias.com/2007/02/how_and_when_to.html
Can We Do Better?: ML/Stats
• Maybe Not
  – CS “experts algorithms”
  – Other expert weights
  – Calibrated experts
  – Other averaging fn’s (geo mean, RMS, power means, mean of odds, ...)
  – Machine learning (NB, SVM, LR, DT, ...)
• Maybe So
  – Bayesian modeling + EM
  – Nearest neighbor (multi-year)
[Dani et al. UAI 2006]
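For concreteness, here is a sketch of a few of the alternative averaging functions named on this slide. The definitions are the standard textbook ones and are an assumption about what was tried; in particular, "mean of odds" is read here as averaging p/(1-p) and converting back to a probability. Inputs must lie strictly between 0 and 1.

```python
import math


def power_mean(probs, exponent):
    """Power mean: exponent=1 gives the arithmetic mean, exponent=2 gives RMS."""
    return (sum(p ** exponent for p in probs) / len(probs)) ** (1.0 / exponent)


def geometric_mean(probs):
    """Geometric mean, computed in log space."""
    return math.exp(sum(math.log(p) for p in probs) / len(probs))


def mean_of_odds(probs):
    """Average the odds p/(1-p), then convert the result back to a probability."""
    avg_odds = sum(p / (1.0 - p) for p in probs) / len(probs)
    return avg_odds / (1.0 + avg_odds)


probs = [0.5, 0.8]  # hypothetical forecasts from two contestants
print(power_mean(probs, 1))   # arithmetic mean
print(geometric_mean(probs))
print(power_mean(probs, 2))   # RMS
print(mean_of_odds(probs))
```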
Can We Do Better?: Markets
[Figure: Prediction Performance of Markets Relative to Individual Experts. x-axis: week into the NFL season (1-14); y-axis: rank (0-300); series: NewsFutures, Tradesports]
Prediction Markets: Examples & Research
The Wisdom of Crowds: Backed in dollars
What you can say/learn (% chance that ...), and where:
• Obama wins: IEM, Intrade.com
• GOP wins Texas: Intrade.com
• YHOO stock > 30: Stock options market
• Duke wins tourney: Las Vegas, Betfair
• Oil prices fall: Futures market
• Heat index rises: Weather derivatives
• Hurricane hits Florida: Insurance company
• Rains at place/time: Weatherbill.com
Prediction Markets: With Money / Without
The Wisdom of Crowds: Backed in “Points”
• HSX.com
• Newsfutures.com
• InklingMarkets.com
• Foresight Exchange
• CasualObserver.net
• FTPredict.com
• Yahoo!/O’Reilly Tech Buzz
• ProTrade.com
• StorageMarkets.com
• TheSimExchange.com
• TheWSX.com
• Alexadex, Celebdaq, Cenimar, BetBubble, Betocracy, CrowdIQ, MediaMammon, Owise, PublicGyan, RIMDEX, Smarkets, Trendio, TwoCrowds
• http://www.chrisfmasse.com/3/3/markets/#Play-Money_Prediction_Markets
http://tradesports.com
http://betfair.com
Screen capture 2007/05/18
Screen capture 2008/05/07
Example: IEM 1992
[Source: Berg, DARPA Workshop, 2002]
Example: IEM
[Source: Berg, DARPA Workshop, 2002]
Example: IEM
[Source: Berg, DARPA Workshop, 2002]
Does it work?
• Yes, evidence from real markets, laboratory experiments, and theory
• Racetrack odds beat track experts [Figlewski 1979]
• Orange Juice futures improve weather forecast [Roll 1984]
• I.E.M. beat political polls 451/596 [Forsythe 1992, 1999][Oliven 1995][Rietz 1998][Berg 2001][Pennock 2002]
• HP market beat sales forecast 6/8 [Plott 2000]
• Sports betting markets provide accurate forecasts of game outcomes [Gandar 1998][Thaler 1988][Debnath EC’03][Schmidt 2002]
• Laboratory experiments confirm information aggregation [Plott 1982;1988;1997][Forsythe 1990][Chen, EC’01]
• Theory: “rational expectations” [Grossman 1981][Lucas 1972]
• Market games work [Servan-Schreiber 2004][Pennock 2001]
[Thanks: Yiling Chen]
Prediction Markets: Does Money Matter?
The Wisdom of Crowds: With Money / Without
[Figure: two panels plotting actual against estimate, axis ticks at 1, 2, 5, 10, 20, 50, 100. With money: IEM, 237 candidates. Without: HSX, 489 movies.]
The Wisdom of Crowds: With Money / Without
Real markets vs. market games
HSX; FX, F1P6; probabilistic forecasts

forecast source           avg log score
F1P6 linear scoring       -1.84
F1P6 F1-style scoring     -1.82
betting odds              -1.86
F1P6 flat scoring         -2.03
F1P6 winner scoring       -2.32
Does money matter? Play vs real, head to head
Experiment:
• 2003 NFL Season
• ProbabilitySports.com: Online football forecasting competition
• Contestants assess probabilities for each game
• Quadratic scoring rule
• ~2,000 “experts”, plus:
  – NewsFutures (play $)
  – Tradesports (real $)
• Used “last trade” prices
Results:
• Play money and real money performed similarly
  – 6th and 8th respectively
• Markets beat most of the ~2,000 contestants
  – Average of experts came 39th (caveat)
Electronic Markets, Emile Servan-Schreiber, Justin Wolfers, David Pennock and Brian Galebach
Prices: TradeSports and NewsFutures
[Figure: scatter plot of TradeSports prices (y-axis, 0-100) against NewsFutures prices (x-axis, 0-100), with a fitted linear regression line and a 45 degree line]
n=416 over 208 NFL games. Correlation between TradeSports and NewsFutures prices = 0.97
[Figure: Prediction Performance of Markets Relative to Individual Experts. x-axis: week into the NFL season (1-14); y-axis: rank (0-300); series: NewsFutures, Tradesports]
Prediction Accuracy: Market Forecast Winning Probability and Actual Winning Probability
[Figure: observed frequency of victory (y-axis, 0-100) against trading price prior to game (x-axis, 0-100)]
TradeSports: Correlation=0.96; NewsFutures: Correlation=0.94
Data are grouped so that prices are rounded to the nearest ten percentage points; n=416 teams in 208 games
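The grouping behind this calibration plot can be sketched as below. The prices and outcomes are invented, and the half-up rounding rule is an assumption, since the slide only says prices were rounded to the nearest ten percentage points.

```python
from collections import defaultdict


def calibration_table(prices, won):
    """Group integer percent prices to the nearest ten and report, per group,
    (number of teams, observed win frequency in percent)."""
    groups = defaultdict(list)
    for price, team_won in zip(prices, won):
        bucket = (price + 5) // 10 * 10  # nearest ten, halves rounded up (assumption)
        groups[bucket].append(1 if team_won else 0)
    return {bucket: (len(hits), 100.0 * sum(hits) / len(hits))
            for bucket, hits in sorted(groups.items())}


# Hypothetical pre-game prices (in percent) and results.
prices = [12, 8, 33, 27, 68, 74, 71, 91]
won = [False, False, False, True, True, True, False, True]

table = calibration_table(prices, won)
for bucket, (count, freq) in table.items():
    print(f"price ~{bucket}%: n={count}, observed win frequency={freq:.1f}%")
```

A well-calibrated market would show observed frequencies close to the bucket prices, which is what the correlations of 0.96 and 0.94 on this slide summarize.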
Does money matter? Play vs real, head to head
                                        Probability-     TradeSports      NewsFutures      Difference
                                        Football Avg     (real-money)     (play-money)     TS - NF
Mean Absolute Error                     0.443 (0.012)    0.439 (0.011)    0.436 (0.012)    0.003 (0.016)
  = lose_price
  [lower is better]
Root Mean Squared Error                 0.476 (0.025)    0.468 (0.023)    0.467 (0.024)    0.001 (0.033)
  = sqrt(Average(lose_price^2))
  [lower is better]
Average Quadratic Score                 9.323 (4.75)     12.410 (4.37)    12.427 (4.57)    -0.017 (6.32)
  = 100 - 400*(lose_price^2)
  [higher is better]
Average Logarithmic Score               -0.649 (0.027)   -0.631 (0.024)   -0.631 (0.025)   0.000 (0.035)
  = Log(win_price)
  [higher (less negative) is better]

Statistically: TS ~ NF; NF >> Avg; TS > Avg
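The four accuracy measures in this table can be computed as in the sketch below. It assumes that lose_price = 1 - win_price (the two sides of a game sum to 1) and that Log is the natural logarithm; neither is stated on the slide. The sample prices are hypothetical.

```python
import math


def forecast_metrics(win_prices):
    """win_prices: for each game, the price (0..1) of the team that ended up winning."""
    lose_prices = [1.0 - p for p in win_prices]  # assumption: prices of both sides sum to 1
    n = len(win_prices)
    return {
        "mean_absolute_error": sum(lose_prices) / n,
        "root_mean_squared_error": math.sqrt(sum(l * l for l in lose_prices) / n),
        "average_quadratic_score": sum(100.0 - 400.0 * l * l for l in lose_prices) / n,
        "average_log_score": sum(math.log(p) for p in win_prices) / n,  # natural log assumed
    }


metrics = forecast_metrics([0.8, 0.6, 0.3])  # hypothetical prices on three winners
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```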
Discussion
• Are incentives for virtual currency strong enough?
  – Yes (to a degree)
  – Conjecture: Enough to get what people already know; not enough to motivate independent research
  – Reduced incentive for information discovery possibly balanced by better interpersonal weighting
• Statistical validations show HSX, FX, NF are reliable sources for forecasts
  – HSX predictions >= expert predictions
  – Combining sources can help
A Problem w/ Virtual Currency: Printing Money
Alice: 1000
Betty: 1000
Carol: 1000
A Problem w/ Virtual Currency: Printing Money
Alice: 5000
Betty: 1000
Carol: 1000
Yootles: A Social Currency
Alice: 0
Betty: 0
Carol: 0
Yootles: A Social Currency
I owe you 5
Alice: -5
Betty: 0
Carol: 5
Yootles: A Social Currency
credit: 5; credit: 10
I owe you 5
Alice: -5
Betty: 0
Carol: 5
Yootles: A Social Currency
credit: 5; credit: 10
I owe you 5; I owe you 5
Alice: -5
Betty: 0
Carol: 5
Yootles: A Social Currency
credit: 5; credit: 10
I owe you 5; I owe you 5
Alice: 3995
Betty: 0
Carol: 5
Yootles: A Social Currency
• For tracking gratitude among friends
• A yootle says “thanks, I owe you one”
Combinatorial Betting
Combinatorics Example: March Madness
Combinatorics Example: March Madness
• Typical today: Non-combinatorial
  – Team wins Rnd 1
  – Team wins Tourney
  – A few other “props”
  – Everything explicit (By def, small #)
  – Every bet indep: Ignores logical & probabilistic relationships
• Combinatorial
  – Any property
  – Team wins Rnd k; Duke > {UNC,NCST}; ACC wins 5 games
  – 2^(2^63) possible props (implicitly defined)
  – 1 bet affects related bets “correctly”; e.g., to enforce logical constraints
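The count of possible props, 2^(2^63), follows from simple arithmetic, sketched here under the assumption of a 64-team single-elimination bracket: 63 games with two possible results each give 2^63 complete brackets, and a prop is any subset of those brackets.

```python
import math

games = 63                # assumes a 64-team single-elimination bracket
outcomes = 2 ** games     # number of distinct complete brackets
# The number of props is 2 ** outcomes, far too large to materialize,
# so estimate only how many decimal digits it has.
approx_digits = outcomes * math.log10(2)

print(f"{outcomes:,} possible brackets")
print(f"the number of props has roughly {approx_digits:.3e} decimal digits")
```

This is why the props can only be defined implicitly: listing them explicitly, as non-combinatorial markets do, is out of the question.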
Expressiveness: Getting Information
• Things you can say today:
  – (63% chance that) Obama wins
  – GOP wins Texas
  – YHOO stock > 30 Dec 2007
  – Duke wins NCAA tourney
• Things you can’t say (very well) today:
  – Oil down, DOW up, & Obama wins
  – Obama wins election, if he wins OH & FL
  – YHOO btw 25.8 & 32.5 Dec 2007
  – #1 seeds in NCAA tourney win more than #2 seeds
Expressiveness: Processing Information
• Independent markets today:
  – Horse race win, place, & show pools
  – Stock options at different strike prices
  – Every game/proposition in NCAA tourney
  – Almost everything: Stocks, wagers, intrade, ...
• Information flow (inference) left up to traders
• Better: Let traders focus on predicting whatever they want, however they want: Mechanism takes care of logical/probabilistic inference
• Another advantage: Smarter budgeting
Market Combinatorics: Permutations
• A > B > C .1
• A > C > B .2
• B > A > C .1
• B > C > A .3
• C > A > B .1
• C > B > A .2
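Given a full distribution over orderings like the one on this slide, the probability of any property of the ordering is the sum over the orderings that satisfy it. A small sketch using the six probabilities above (the function names are illustrative):

```python
# Probabilities of the six orderings of A, B, C, copied from the slide.
ordering_probs = {
    ("A", "B", "C"): 0.1,
    ("A", "C", "B"): 0.2,
    ("B", "A", "C"): 0.1,
    ("B", "C", "A"): 0.3,
    ("C", "A", "B"): 0.1,
    ("C", "B", "A"): 0.2,
}


def prob_beats(x, y):
    """Probability that x finishes ahead of y."""
    return sum(p for order, p in ordering_probs.items()
               if order.index(x) < order.index(y))


def prob_wins(x):
    """Probability that x finishes first."""
    return sum(p for order, p in ordering_probs.items() if order[0] == x)


print(prob_beats("A", "B"))  # sums A>B>C, A>C>B and C>A>B
print(prob_wins("A"))        # sums A>B>C and A>C>B
```

With n candidates there are n! orderings, so this explicit table stops being practical very quickly, which motivates betting on properties instead.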
Market Combinatorics: Permutations
• D > A > B > C .01
• D > A > C > B .02
• D > B > A > C .01
• A > D > B > C .01
• A > D > C > B .02
• B > D > A > C .05
• A > B > D > C .01
• A > C > D > B .2
• B > A > D > C .01
• A > B > C > D .01
• A > C > B > D .02
• B > A > C > D .01
• D > B > C > A .05
• D > C > A > B .1
• D > C > B > A .2
• B > D > C > A .03
• C > D > A > B .1
• C > D > B > A .02
• B > C > D > A .03
• C > A > D > B .01
• C > B > D > A .02
• B > C > D > A .03
• C > A > D > B .01
• C > B > D > A .02
Bidding Languages
• Traders want to bet on properties of orderings, not explicitly on orderings: more natural, more feasible
  – A will win ; A will “show”
  – A will finish in [4-7] ; {A,C,E} will finish in top 10
  – A will beat B ; {A,D} will both beat {B,C}
• Buy 6 units of “$1 if A>B” at price $0.4
• Supported to a limited extent at racetrack today, but each in different betting pools
• Want centralized auctioneer to improve liquidity & information aggregation
Example
• A three-way match
  – Buy 1 of “$1 if A>B” for 0.7
  – Buy 1 of “$1 if B>C” for 0.7
  – Buy 1 of “$1 if C>A” for 0.7
[Figure: the three bets form a cycle among A, B, C]
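Why these three orders can be matched against each other: in any final ordering of A, B, C at most two of the three bets can win, so an auctioneer who accepts all three collects 2.1 and pays out at most 2. The sketch below checks this by brute force over all six orderings (variable names are illustrative).

```python
from itertools import permutations

# Each bet is (winner, loser, price) for one unit of "$1 if winner beats loser".
bets = [("A", "B", 0.7), ("B", "C", 0.7), ("C", "A", 0.7)]

collected = sum(price for _, _, price in bets)


def payout(order):
    """Total paid to bettors if the final ordering is `order` (best first)."""
    return sum(1 for winner, loser, _ in bets
               if order.index(winner) < order.index(loser))


worst_case_payout = max(payout(order) for order in permutations("ABC"))
guaranteed_profit = collected - worst_case_payout

print(worst_case_payout)            # at most two of the three bets can win
print(round(guaranteed_profit, 2))  # auctioneer's surplus in the worst case
```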
Pair Betting• All bets are of the form “A will beat B”
• Cycle of k bets with sum of prices > k-1 ==> Match (Find best cycle: Polytime)
• Match =/=> Cycle with sum of prices > k-1
• Theorem: The Matching Problem for Pair Betting is NP-hard (reduce from min feedback arc set)
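One way to look for such a cycle in polynomial time is sketched below. This is an illustration, not necessarily the authors' algorithm: it gives each bet "x beats y" at price p an edge weight of 1 - p, so that "sum of prices > k-1" becomes "total cycle weight < 1", and then finds the lightest directed cycle with Floyd-Warshall. It assumes one unit per bet and gives only a sufficient test, since the slide notes that a match need not come from such a cycle.

```python
import math


def best_cycle_weight(bets):
    """bets: list of (x, y, price) for one unit of "$1 if x beats y", with x != y.
    Returns the minimum over directed cycles of sum(1 - price), or math.inf
    if the bets contain no cycle."""
    nodes = sorted({name for x, y, _ in bets for name in (x, y)})
    # dist[u][u] starts at infinity so that it ends up holding the lightest
    # cycle through u rather than the trivial empty path.
    dist = {u: {v: math.inf for v in nodes} for u in nodes}
    for x, y, price in bets:
        dist[x][y] = min(dist[x][y], 1.0 - price)
    for k in nodes:  # Floyd-Warshall relaxation
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return min((dist[u][u] for u in nodes), default=math.inf)


def has_cycle_match(bets):
    """True if some cycle of k bets has sum of prices > k - 1."""
    return best_cycle_weight(bets) < 1.0


print(has_cycle_match([("A", "B", 0.7), ("B", "C", 0.7), ("C", "A", 0.7)]))
print(has_cycle_match([("A", "B", 0.6), ("B", "C", 0.6), ("C", "A", 0.6)]))
```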
Automated Market Makers
• A market maker (a.k.a. bookmaker) is a firm or person who is almost always willing to accept both buy and sell orders at some prices
• Why an institutional market maker? Liquidity!
  – Without market makers, the more expressive the betting mechanism is the less liquid the market is (few exact matches)
  – Illiquidity discourages trading: Chicken and egg
  – Subsidizes information gathering and aggregation: Circumvents no-trade theorems
• Market makers, unlike auctioneers, bear risk. Thus, we desire mechanisms that can bound the loss of market makers
• Market scoring rules [Hanson 2002, 2003, 2006]
• Dynamic pari-mutuel market [Pennock 2004]
[Thanks: Yiling Chen]
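As an illustration of a loss-bounded market maker, here is a minimal sketch of Hanson's logarithmic market scoring rule (LMSR), using its standard cost function C(q) = b * ln(sum_i exp(q_i / b)). The liquidity parameter b = 100 and the trade size are arbitrary choices for the example, and the function names are not from the talk.

```python
import math


def lmsr_cost(quantities, b):
    """LMSR cost function, computed with a max shift for numerical stability."""
    shift = max(q / b for q in quantities)
    return b * (shift + math.log(sum(math.exp(q / b - shift) for q in quantities)))


def lmsr_prices(quantities, b):
    """Instantaneous prices; they are positive and sum to 1."""
    shift = max(q / b for q in quantities)
    weights = [math.exp(q / b - shift) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def cost_to_buy(quantities, outcome, shares, b):
    """What a trader pays to buy `shares` of `outcome` from the market maker."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)


b = 100.0                # liquidity parameter (arbitrary for this example)
outstanding = [0.0, 0.0]  # shares sold so far on each of two outcomes

print(lmsr_prices(outstanding, b))         # equal prices before any trade
print(cost_to_buy(outstanding, 0, 10, b))  # cost of 10 shares of outcome 0
print(b * math.log(len(outstanding)))      # worst-case loss bound: b * ln(n)
```

The market maker always quotes a price, which supplies the liquidity discussed above, and its worst-case loss is capped at b * ln(n) for n outcomes.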
Overview: Complexity Results

Betting language               Call Market                Market Maker (LMSR)
Permutations: General          NP-hard (EC’07)            #P-hard (EC’08)
Permutations: Pair             NP-hard (EC’07)            #P-hard (EC’08)
Permutations: Subset           Poly (EC’07)               #P-hard (EC’08)
Boolean: General               NP-hard (DSS’05)           #P-hard (EC’08); Approx (STOC’08)
Boolean: 2-clause              co-NP-complete (DSS’05)    #P-hard (EC’08)
Boolean: Restrict Tourney      ?                          Poly (STOC’08)
Taxonomy: General              ?                          #P-hard (XYZ’09)
Taxonomy: Tree                 ?                          Poly (XYZ’09)
• March Madness bet constructor
• Bet on any team to win any game
  – Duke wins in Final 4
• Bet “exotics”:
  – Duke advances further than UNC
  – ACC teams win at least 5
  – A 1-seed will lose in 1st round
New Prediction Game: Yoopick
An Application on Facebook
Catalysts
• Markets have long history of predictive accuracy: why catching on now as tool?
• No press is bad press: Policy Analysis Market (“terror futures”)
• Surowiecki's “Wisdom of Crowds”
• Companies:
  – Google, Microsoft, Yahoo!; CrowdIQ, HSX, InklingMarkets, NewsFutures
• Press: BusinessWeek, CBS News, Economist, NYTimes, Time, WSJ, ...
  http://us.newsfutures.com/home/articles.html
CFTC Role
• MayDay 2008: CFTC asks for help
• Q: What to do with prediction markets?
• Yahoo!, Google entered suggestions
• Right now, the biggest prediction markets are overseas, academic (1), or just for fun
• CFTC may clarify, drive innovation. Or not
Conclusion
• Prediction Markets: hammer = market, nail = prediction
• Great empirical successes
• Momentum in academia and industry
• Fascinating (algorithmic) mechanism design questions, including combinatorial betting
• Points-paid peers produce pretty good predictions