Social media and predictive analytics

Date post: 18-Jul-2015
Upload: josef-slerka
Social media and predictive analytics, 15 June 2011, Josef Šlerka, Prague, conference Social media ve finančních službách (Social Media in Financial Services)
Transcript
Page 1: Social media and predictive analytics

Social media and predictive analytics
15 June 2011, Josef Šlerka, Prague
Conference: Social media ve finančních službách (Social Media in Financial Services)

Page 2: Social media and predictive analytics
Page 3: Social media and predictive analytics

Predictive analytics

Predictive analytics encompasses a variety of statistical techniques from modeling, data mining and game theory that analyze current and historical facts to make predictions about future events. (WIKIPEDIA)

Page 4: Social media and predictive analytics

Predictive analytics

In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. (WIKIPEDIA)

Page 5: Social media and predictive analytics

Search as a signal

Hyunyoung Choi, Hal Varian:

Predicting the Present with Google Trends

Page 6: Social media and predictive analytics
Page 7: Social media and predictive analytics

How is this possible?

Life is searching ... (too)

and before we make a decision, we search ... (too)

Page 8: Social media and predictive analytics
Page 9: Social media and predictive analytics

Google Insights

a service that Google provides for free

it can also be used for predictive analysis

Nikolaos Askitas, Klaus F. Zimmermann:

Google Econometrics and Unemployment Forecasting

Page 10: Social media and predictive analytics
Page 11: Social media and predictive analytics

of the song “Right Round” in terms of search volume closely tracks its rank on the Billboard Hot 100 chart.

Thus motivated, we now investigate whether search activity is a systematic leading indicator of consumer activity by forecasting (i) opening weekend box-office revenue for 119 feature films released in the United States between October 2008 and September 2009; (ii) first-month sales of video games across all gaming platforms (e.g., Xbox, PlayStation, etc.) for 106 games released between September 2008 and September 2009; and (iii) the weekly rank of 307 songs that appeared on the Billboard Hot 100 list between March and September 2009. Search data for movies and video games come from Yahoo!’s Web search query logs for the US market. Predictions in these domains are based on linear models with Gaussian error of the form

$$\log(\text{revenue}) = \beta_0 + \beta_1 \log(\text{search}) + \epsilon,$$

where, in order to account for the highly skewed distributions of popularity, both revenue and search volume are log-transformed. For songs, search data were collected from Yahoo!’s dedicated music site, music.yahoo.com. We predict the weekly Billboard rank using search rank from the current and previous weeks:

$$\text{billboard}_{t+1} = \beta_0 + \beta_1 \text{search}_t + \beta_2 \text{search}_{t-1} + \epsilon.$$

Fig. 2 A–C shows that search-based predictions are strongly correlated with realized outcomes for movies (0.85) and video games (0.76) and moderately correlated for music (0.56), where in each case revenue or rank is predicted on the day immediately preceding the event of interest. Moreover, Fig. 2 D–F shows that the predictive power of search persists as far out as several weeks in advance, for example, four weeks prior to a movie’s release

[Figure 1, panels A–C: search volume against time to release (days, −30 to 30) for Transformers 2 (A) and Tom Clancy's H.A.W.X (B); Billboard rank and search rank by week, Mar 09 to Aug 09, for Right Round (C).]

Fig. 1. Search volume for the movie Transformers 2 (A) and the video game Tom Clancy’s H.A.W.X. (B) prior to and after their release, and search and Billboard rank for the song “Right Round” by Flo Rida (C).

[Figure 2, panels A–F: actual against predicted revenue (dollars, log scale) for Movies (A) and Video Games (B, sequels and non-sequels marked separately); actual against predicted Billboard rank (0 to 100) for Music (C); model fit (0.4 to 0.9) against time to release (weeks, −6 to 0) for Movies (D), Video Games (E), and Music (F).]

Fig. 2. Search-based predictions for box-office movie revenue (A), first-month video game sales (B), and the Billboard rank of songs (C), where predictions are made immediately prior to the event of interest; correlation between predicted and actual outcomes when predictions are based on query data t weeks prior to the event (D–F).

Goel et al. PNAS | October 12, 2010 | vol. 107 | no. 41 | 17487
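
The log-log model quoted in the excerpt can be tried out in a few lines. What follows is a minimal sketch, not the authors' code: it fits log(revenue) = b0 + b1 * log(search) by ordinary least squares using only the standard library. The function names `fit_log_log` and `predict_revenue` and all numbers are made up for illustration.

```python
import math


def fit_log_log(search, revenue):
    """Least-squares fit of log(revenue) = b0 + b1 * log(search).

    Both inputs must be positive, since they are log-transformed.
    Returns the intercept b0 and the slope b1.
    """
    xs = [math.log(s) for s in search]
    ys = [math.log(r) for r in revenue]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict_revenue(b0, b1, search_volume):
    """Map a search volume back to the revenue scale."""
    return math.exp(b0 + b1 * math.log(search_volume))


# Hypothetical search volumes and opening revenues (not real data).
search = [1_000, 5_000, 20_000, 80_000, 300_000]
revenue = [2.0e5, 1.5e6, 4.0e6, 2.5e7, 6.0e7]

b0, b1 = fit_log_log(search, revenue)
print("intercept:", b0, "slope:", b1)
print("predicted revenue at 50,000 searches:", predict_revenue(b0, b1, 50_000))
```

Because both sides are log-transformed, the slope b1 reads as an elasticity: a 1% rise in search volume goes with roughly a b1% rise in predicted revenue.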


Page 12: Social media and predictive analytics

Does it work in the Czech Republic too?

there are no rigorous studies

there is no reason why it should not work

Page 13: Social media and predictive analytics
Page 14: Social media and predictive analytics
Page 15: Social media and predictive analytics

Social media as a signal

Life is NOT just searching ... Fans, followers, pages

“What’s on your mind?” (Facebook)

“What’s happening?” (Twitter)

Page 16: Social media and predictive analytics

Predicting the stock market

Page 17: Social media and predictive analytics

Predicting the stock market

To put it in simple words, when the emotions on Twitter fly high, that is when people express a lot of hope, fear, and worry, the Dow goes down the next day. When people have less hope, fear, and worry, the Dow goes up. It therefore seems that just checking on Twitter for emotional outbursts of any kind gives a predictor of how the stock market will be doing the next day.

Zhang, Fuehres, Peter A. Gloor: Predicting Stock Market Indicators Through Twitter “I hope it is not as bad as I fear”
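
The next-day relationship described in the quote amounts to a lagged correlation: today's emotional signal is compared with tomorrow's index movement. The sketch below illustrates that idea only and is not the method of the cited paper. The names `pearson` and `lagged_correlation` are hypothetical, and both data series are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant (otherwise the spread is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(signal, target, lag=1):
    """Correlate signal on day t with target on day t + lag."""
    if len(signal) != len(target):
        raise ValueError("signal and target must have the same length")
    if not 0 <= lag < len(signal) - 1:
        raise ValueError("lag must leave at least two paired observations")
    if lag == 0:
        return pearson(signal, target)
    # Drop the last `lag` signal days and the first `lag` target days
    # so that signal[t] lines up with target[t + lag].
    return pearson(signal[:-lag], target[lag:])


# Invented daily series: share of tweets containing hope/fear/worry words,
# and the daily percentage change of a stock index.
emotion_share = [0.021, 0.034, 0.018, 0.040, 0.025, 0.031, 0.015]
index_change_pct = [0.3, 0.5, -0.6, 0.4, -0.9, 0.2, -0.4]

print(lagged_correlation(emotion_share, index_change_pct, lag=1))
```

A clearly negative value at lag 1 would match the pattern in the quote (high emotion today, lower index tomorrow). With so few points, a result like this would only be a hint, not evidence.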

Page 18: Social media and predictive analytics

Predicting stock prices

stocks tracked: Starbucks, Coca-Cola, and Nike

signals used: Facebook fans, Twitter followers, YouTube views

Page 19: Social media and predictive analytics
Page 20: Social media and predictive analytics

Predicting elections

US Senate elections

the signal was the number of followers on Twitter

the correlation between winning and the number of followers was 71%

when comparing Facebook fans, it was as high as 80%
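
The slide does not say how the 71% and 80% figures were computed. One simple reading, assumed here purely for illustration, is the share of races in which the candidate with more followers (or fans) went on to win. The function name `agreement_rate` and the follower counts below are hypothetical.

```python
def agreement_rate(races):
    """Share of races in which the better-followed candidate won.

    Each race is a pair (winner_followers, loser_followers).
    Races where both candidates have the same count are skipped.
    """
    decided = [(w, l) for w, l in races if w != l]
    if not decided:
        raise ValueError("no races with a clear follower lead")
    hits = sum(1 for w, l in decided if w > l)
    return hits / len(decided)


# Invented follower counts: (winner, loser) for five races.
races = [
    (12_000, 8_000),
    (3_000, 9_500),
    (40_000, 15_000),
    (7_000, 7_000),   # tie, skipped
    (22_000, 5_000),
]

print(agreement_rate(races))
```

A rate like this is easy to misread: incumbents and well-funded candidates tend to have both more followers and better odds, so a high value does not by itself show that follower counts add predictive information.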

Page 21: Social media and predictive analytics

Does it work in the Czech Republic too?

It seems so :-)

Research on data from www.ataxosocialinsider.cz

Page 22: Social media and predictive analytics

Ataxo Social Insider

a tool for analyzing data from social networks, discussion forums, blogs, and news sites

Page 23: Social media and predictive analytics

Ataxo Social Insider

Page 24: Social media and predictive analytics
Page 25: Social media and predictive analytics
Page 26: Social media and predictive analytics
Page 27: Social media and predictive analytics

And what about prediction?

Case study:

number of mentions on Facebook and film attendance

Page 28: Social media and predictive analytics

mentions of Inception on Czech Facebook in 2010 and audience response

Page 29: Social media and predictive analytics

Harry Potter on Czech Facebook in 2010 and audience response

Page 30: Social media and predictive analytics

Facebook mentions as a signal

The correlation shows an ability to predict the dynamics of film revenues, because people mostly do what they say ...

Page 31: Social media and predictive analytics

The future?

Let's connect the data and watch ...

Page 32: Social media and predictive analytics

Client profiling

linking users' status updates with their financial behavior

predicting solvency

how reliable their network is

reality check

Page 33: Social media and predictive analytics

Finding products

tailoring products to the customer

discovering patterns in behavior

Page 34: Social media and predictive analytics

Can it be done?

It can! In the USA, the company RapLeaf does it.

In the Czech Republic there is no demand yet.

The data are there.

