Tobias Preis
Data Science Lab, Behavioural Science, Warwick Business School
[email protected] http://www.tobiaspreis.de
Measuring and predicting human behaviour using online data
Future Orientation Index 2010
Suzy Moat & Tobias Preis, based on Preis, Moat, Stanley and Bishop (2012)
Ratio of Google searches for “2011” to searches for “2009” during 2010 for 45 countries
[Figure: index values by country on a scale from 0.2 to 1.8; low values mean more Google searches for “2009”, high values mean more Google searches for “2011”]
Future Orientation Index 2012
Suzy Moat & Tobias Preis, based on Preis, Moat, Stanley and Bishop (2012)
Ratio of Google searches for “2013” to searches for “2011” during 2012 for 45 countries
[Figure: index values by country on a scale from 0.2 to 1.8]
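The ratio behind the index can be sketched in a few lines. This is a minimal illustration only: the function name, the country labels and the search volumes are hypothetical and are not data from Preis, Moat, Stanley and Bishop (2012).

```python
def future_orientation_index(future_searches: float, past_searches: float) -> float:
    """Ratio of searches for the coming year to searches for the previous year."""
    if past_searches <= 0:
        raise ValueError("past_searches must be positive")
    return future_searches / past_searches


# Hypothetical search volumes recorded during 2010 (illustrative, not real data).
volumes_2010 = {
    "Country A": {"2011": 120.0, "2009": 100.0},
    "Country B": {"2011": 45.0, "2009": 90.0},
}

index_2010 = {
    country: future_orientation_index(v["2011"], v["2009"])
    for country, v in volumes_2010.items()
}

for country, value in index_2010.items():
    # Values above 1 mean more searches for the coming year than for the previous one.
    print(f"{country}: {value:.2f}")
```

An index above 1 corresponds to the "more Google searches for “2011”" end of the scale, and an index below 1 to the "more Google searches for “2009”" end.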
Richer countries look forward
[Figure: search volume over time with weekly granularity, 2008 to 2010, for the search terms “2007”, “2008”, “2009”, “2010” and “2011” (panels A and B), and GDP per capita [10^4 USD] plotted against the Future-Orientation Index (0.0 to 2.0)]
Preis, Moat, Stanley & Bishop (2012)
http://www.nature.com/srep/2012/120405/srep00350/pdf/srep00350.pdf
Hypothetical strategy
Moat et al. (2013); Preis et al. (2013)
[Diagram: number of Google searches for a keyword in weeks t-3, t-2, t-1 and t]
Search volume decreased: BUY stock in week t+1
Search volume increased: SELL stock in week t+1
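The buy and sell rule can be sketched as follows. The slides do not spell out how "decreased" and "increased" are measured, so this sketch assumes that the volume in week t is compared with the mean of the three preceding weeks (t-1, t-2, t-3). The function name, the lookback parameter and the volumes are hypothetical, and closing the position afterwards is not modelled.

```python
from statistics import mean


def weekly_signals(search_volume, lookback=3):
    """Return (week, action) pairs, where the action is taken in week t+1.

    Assumption: week t is compared with the mean of the previous `lookback` weeks.
    """
    signals = []
    for t in range(lookback, len(search_volume)):
        baseline = mean(search_volume[t - lookback:t])
        change = search_volume[t] - baseline
        if change < 0:
            action = "BUY"    # search volume decreased
        elif change > 0:
            action = "SELL"   # search volume increased
        else:
            action = "HOLD"   # no change: not covered by the slides, assumed here
        signals.append((t + 1, action))
    return signals


# Hypothetical weekly search volumes for one keyword (weeks 0 to 6).
volume = [10, 12, 11, 9, 14, 8, 13]
signals = weekly_signals(volume)
print(signals)
# [(4, 'BUY'), (5, 'SELL'), (6, 'BUY'), (7, 'SELL')]
```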
Example: “culture”
[Figure: % profit (axis −40 to 40), 2005 to 2011, for the “culture” trading strategy, the buy and hold strategy, and the mean ± 1 sd of random strategies. Annotated value as extracted: 516.]
Preis, Moat & Stanley (2013)
Example: “debt”
[Figure: % profit (axis 0 to 300), 2005 to 2011, for the “debt” trading strategy, the buy and hold strategy, and the mean ± 1 sd of random strategies. Annotated values: 326 and 16.]
Preis, Moat & Stanley (2013)
http://www.nature.com/srep/2013/130425/srep01684/pdf/srep01684.pdf
How different keywords perform
Preis, Moat & Stanley (2013)
[Figure: return of each keyword strategy in units of the standard deviation (sd) of random strategies, on an axis from -1 to 2, with reference lines at the random strategy mean + 1 sd and mean + 2 sds. Labelled keywords: “debt”, “culture”, “stocks”, “credit”, “garden”, “train”.]
Financial relevance indicator: # occurrences in FT relative to # hits on Google
Returns significantly correlated with indicator of financial relevance
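Expressing a strategy's return in standard deviations of random strategies can be sketched like this. The sketch assumes that a random strategy buys or sells each week by coin flip and that returns compound simply. The weekly returns, the example strategy return and all function names are hypothetical and do not come from Preis, Moat & Stanley (2013).

```python
import random
from statistics import mean, stdev


def random_strategy_return(weekly_returns, rng):
    """Cumulative return when each week's position is chosen by coin flip."""
    total = 1.0
    for r in weekly_returns:
        position = rng.choice((1, -1))  # +1 = long, -1 = short
        total *= 1.0 + position * r
    return total - 1.0


def standardised_return(strategy_return, random_returns):
    """Strategy return in units of the standard deviation of random strategies."""
    return (strategy_return - mean(random_returns)) / stdev(random_returns)


# Hypothetical weekly market returns (illustrative, not real data).
weekly_returns = [0.01, -0.02, 0.015, 0.03, -0.01, 0.005]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
random_returns = [random_strategy_return(weekly_returns, rng) for _ in range(1000)]

keyword_strategy_return = 0.05  # hypothetical return of one keyword strategy
print(standardised_return(keyword_strategy_return, random_returns))
```

A value of 2 on this scale corresponds to the "Random strategy mean + 2 sds" reference line in the figure.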
Wikipedia: Dow Jones companies
Views strategies profitable
[Figure: density (0.0 to 0.6) of returns [Std. Dev. of Random Strategies], from −2 to 2, for Wikipedia views (DJIA companies), Wikipedia edits (DJIA companies) and the random strategy]
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
http://www.nature.com/srep/2013/130508/srep01801/pdf/srep01801.pdf
Wikipedia: Financial topics
Views strategies profitable
[Figure: density (0.00 to 1.00) of returns [Std. Dev. of Random Strategies], from −2 to 2, for Wikipedia views (financial topics), Wikipedia edits (financial topics) and the random strategy]
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
Wikipedia: Actors and filmmakers?
Strategies NOT profitable
[Figure: density (0.0 to 0.4) of returns [Std. Dev. of Random Strategies], from −2 to 2, for Wikipedia views (actors & filmmakers) and the random strategy]
Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)
What is searched for before falls?
Curme, Preis, Stanley & Moat (2014)
[Diagram: example search terms grouped by topic, such as “debt”, “housing”, “crisis” and “apple”, “orange”, “tree”]
55 groups of search terms
Business and politics most related
[Figure: cumulative returns (%), from -100 to 200, for the random strategy and for the topics “Politics I” and “Business”]
http://www.pnas.org/content/111/32/11600.full.pdf
Preis and Moat (under review)
Flickr and Hurricane Sandy
Preis, Moat, Bishop, Treleaven and Stanley (2013)
[Figure, panel A: Flickr photos with hurricane related tags. Normalised number of photos (0.00 to 0.10) over time, 20 Oct to 20 Nov, with the landfall of Hurricane Sandy marked.]
[Figure, panel B: averaged pressure in US state New Jersey. Atmospheric pressure [mbar] (960 to 1020) over time, 20 Oct to 20 Nov, with the landfall of Hurricane Sandy marked.]
Flickr: photos taken
Air pressure
http://www.nature.com/srep/2013/131105/srep03141/pdf/srep03141.pdf
Level of flu cases
Number of influenza cases in the US
[Figure: observed influenza-like illness [%] (axis 2 to 6) over time in weeks, 2010 to 2013]
Preis & Moat (2014)
Level of flu cases
Predicting the present number of influenza cases
[Figure: influenza-like illness [%] (axis 2 to 6) over time in weeks, 2010 to 2013, showing observed values, predicted values, and 80% and 95% prediction intervals, divided into a training period and an out-of-sample nowcast]
Preis & Moat (2014)
Forecast errors significantly reduced by between 16% and 53%.
http://rsos.royalsocietypublishing.org/content/royopensci/1/2/140095.full.pdf
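A percentage reduction in forecast error of this kind can be computed as sketched below. The slides do not name the error measure, so mean absolute error is assumed here. The baseline and search-assisted nowcasts, the numbers and the function names are all hypothetical and are not results from Preis & Moat (2014).

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def error_reduction_percent(observed, baseline_pred, improved_pred):
    """Percentage by which the improved nowcast lowers the baseline error."""
    baseline_mae = mean_absolute_error(observed, baseline_pred)
    improved_mae = mean_absolute_error(observed, improved_pred)
    return 100.0 * (baseline_mae - improved_mae) / baseline_mae


# Hypothetical influenza-like illness levels [%] (illustrative, not real data).
observed = [2.0, 3.0, 4.0, 3.0]
baseline = [2.5, 2.5, 3.5, 3.5]         # assumed nowcast from past flu data only
with_search = [2.25, 2.75, 3.75, 3.25]  # assumed nowcast that also uses online data

reduction = error_reduction_percent(observed, baseline, with_search)
print(f"{reduction:.0f}% lower error")  # 50% lower error
```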
Data from the Internet may help us measure and even predict
human behaviour
How can open data help you? [email protected]
Twitter: t_preis