Google Confidential and Proprietary 1

Predicting the Present With Google Trends

Hyunyoung Choi

Hal Varian

June 2009


Problem statement

Government agencies and other organizations produce monthly reports on economic activity
• Retail Sales
• House Sales
• Automotive Sales
• Unemployment

Problems with reports
• Compilation delay of several weeks
• Subsequent revisions
• Sample size may be small
• Not available at all geographic levels

Google Trends releases daily and weekly index of search queries by industry vertical
• Real time data
• No revisions (but some sampling variation)
• Large samples
• Available by country, state and city

Can Google Trends data help predict current economic activity?
• Before release of preliminary statistics
• Before release of final revision


Categories in Google Trends by Query Shares

Note: Queries from 2009-01-01 to 2009-04-30 & Growth Comparison w/ the same time window


Real Estate


Geography

Category

Time window


Subcategories under Real Estate by Query Shares

• Real Estate Agencies
• Rental Listings & Referrals
• Home Insurance
• Home Inspections & Appraisal
• Property Management
• Home Financing


Searches on Real Estate Agencies


Searches on Rental Listings & Referrals


Depicting trends

Google Trends measures the normalized query share of a particular category of queries, which controls for overall growth in search volume.

Often useful to look at year-on-year changes to eliminate seasonality.

Illustrate correlations and covariates.
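The year-on-year comparison can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `yoy_change`, the use of a simple difference (rather than a ratio) and the sample values are all assumptions.

```python
def yoy_change(index, period=52):
    """Year-on-year difference of a query index.

    index  : index values in time order
    period : observations per year (52 for weekly data, 12 for monthly)
    Returns one value per observation from position `period` onward.
    """
    return [index[t] - index[t - period] for t in range(period, len(index))]


# Hypothetical monthly index: the same seasonal pattern in both years,
# shifted up by 2 points in the second year.
year1 = [0, 3, 8, 12, 15, 14, 10, 6, 2, -3, -8, -10]
year2 = [v + 2 for v in year1]

# The seasonal pattern cancels out; only the 2-point shift remains.
print(yoy_change(year1 + year2, period=12))
```

Because each observation is compared with the same calendar position one year earlier, a purely seasonal swing drops out and only the underlying change is left.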

Improving predictions

Forecast time series using its own lagged values and add Trends data as a predictor.

• Statistical significance?

• Improved fit?

• Improved forecasts?

• Identify turning points?
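A minimal sketch of how such a regression could be set up is shown below. The helper name `lagged_rows`, the lag choice (1 and 12) and the sample series are assumptions for illustration; the resulting rows could be passed to any least-squares routine.

```python
def lagged_rows(y, x, lags=(1, 12)):
    """Build (target, features) pairs for y[t] ~ y[t-1], y[t-12], x[t].

    y : the series to forecast (e.g. monthly sales)
    x : a contemporaneous predictor (e.g. a Trends index), same length as y
    """
    start = max(lags)  # first t for which every lag is available
    rows = []
    for t in range(start, len(y)):
        features = [y[t - lag] for lag in lags] + [x[t]]
        rows.append((y[t], features))
    return rows


# Hypothetical data: 15 monthly observations.
y = list(range(100, 115))
x = [0.5 * v for v in range(15)]

rows = lagged_rows(y, x)
print(len(rows))  # 3 usable observations (t = 12, 13, 14)
print(rows[0])    # (112, [111, 100, 6.0])
```

Comparing a fit on the lag columns alone with a fit that also includes the last column is one way to address the four questions listed above.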

[Figure: Real Estate Agencies Query Index (2006 to 2008) and Real Estate Agencies YOY Growth Index]


15 yr Mortgage Rate vs. Home Financing


Forecasting primer

Basic forecasting models

Autoregressive: value at time t depends on

• Value at time t-1

Seasonal adjustment: value at time t depends on

• Value at time t-12 (for monthly data)

Transfer function: value at time t depends on

• Other contemporaneous or lagging variables

Seasonal autoregressive transfer model: Value at time t depends on

• Value at time t-12 (seasonality)

• Value at time t-1 (recent behavior)

• Other lagging or contemporaneous variables (such as Google Trends data)
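Written out, a seasonal autoregressive transfer model of this kind could take the following form. The symbols are assumptions introduced here for illustration: $y_t$ is the series being forecast, $x_t$ is an additional variable such as a Google Trends index, and $\varepsilon_t$ is the error term.

```latex
y_t = \beta_0 + \beta_1\, y_{t-1} + \beta_{12}\, y_{t-12} + \gamma\, x_t + \varepsilon_t
```

Setting $\gamma = 0$ gives the baseline that uses only the history of the series, so the size and significance of $\gamma$ indicate what the extra variable contributes.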

Typical question of interest

• How much more accurate do forecasts become with additional variables, over and above the accuracy you get from the history of the time series itself?


New Home Sales

Model

Time Series
• Recent trend: New Home Sales at t-1
• Seasonality: New Home Sales at t-12

Google Trends
Recent search activity on
• Real Estate Agencies
• Rental Listings & Referrals
• Home Inspections & Appraisal
• Property Management
• Home Insurance
• Home Financing

Exogenous Variables
• Housing affordability: Average/Median Home Price


Predicting the present

New Residential Sales from US Census
• Monthly release 24–28 days after the month
• Seasonally adjusted
• National and Regional aggregate

Google Trends Real Estate by Category
• Home Inspections & Appraisal
• Home Insurance
• Home Financing
• Property Management
• Rental Listings & Referrals
• Real Estate Agencies


New House Sales vs. Real Estate Google Trends


Analysis and Forecasting

Model:

$$Y_t = 446.1 + 0.864\,Y_{t-1} - 4.340\,\mathrm{us378.1}_t + 4.198\,\mathrm{us96.2}_t - 0.001\,\mathrm{AvgP}_{t-1}$$

• $Y_t$: new houses sold in month t
• $\mathrm{AvgP}_{t-1}$: average sales price of new one-family houses sold in month t-1
• $\mathrm{us378.1}_t$: Google Trends index for vertical id 378 (Rental Listings & Referrals), 1st week of month t
• $\mathrm{us96.2}_t$: Google Trends index for vertical id 96 (Real Estate Agencies), 2nd week of month t

July 2008
• Actual = 515K
• Predicted = 442.98K
• Z-score = 2.53

August 2008 Prediction = 417.52K
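The fitted equation can be applied directly, as in the sketch below. The coefficients are the ones reported on the slide; the function and argument names are hypothetical, and the units (sales in thousands, price in dollars) are assumptions. The slide does not state the residual standard deviation, so the last lines back it out under the assumption that the Z-score is (actual - predicted) / sd.

```python
def predict_new_home_sales(y_prev, us378_w1, us96_w2, avg_price_prev):
    """Apply the reported coefficients for new home sales.

    y_prev         : new home sales in the previous month (assumed thousands)
    us378_w1       : Trends index, Rental Listings & Referrals, week 1
    us96_w2        : Trends index, Real Estate Agencies, week 2
    avg_price_prev : average new-home price in the previous month (assumed dollars)
    """
    return (446.1
            + 0.864 * y_prev
            - 4.340 * us378_w1
            + 4.198 * us96_w2
            - 0.001 * avg_price_prev)


def z_score(actual, predicted, sd):
    """Standardized prediction error (assumed definition)."""
    return (actual - predicted) / sd


# July 2008 figures from the slide: actual 515K, predicted 442.98K, Z = 2.53.
implied_sd = (515 - 442.98) / 2.53
print(round(implied_sd, 2))  # 28.47 (thousands), under the assumed definition
```

Under that assumed definition, the July 2008 miss corresponds to a standard deviation of roughly 28K houses.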


Analysis and Forecasting

Observations

Since 2005 new house sales have been decreasing, with little seasonality

Google Trends captures seasonality & recent trends

Positive association with Real Estate Agencies (96)

Negative association with Rental Listings & Referrals (378) and Average Price


Travel


Subcategories under Travel by Query Shares

• Hotels & Accommodations
• Attractions & Activities
• Air Travel
• Bus & Rail
• Cruises & Charters
• Adventure Travel
• Car Rental & Taxi Services
• Vacation Destinations


Travel to Hong Kong

Visitors Arrival Statistics from Hong Kong Tourism Board
• Monthly summaries released with a 1 month lag
• Reports Country/Territory of Residence of visitors
• Data available 2004-2008

Google Trends Travel by Category
• Hotels & Accommodations
• Air Travel
• Car Rental & Taxi Services
• Cruises & Charters
• Attractions & Activities
• Vacation Destinations: Australia, Caribbean Islands, Hawaii, Hong Kong, Las Vegas, Mexico, New York City, Orlando
• Adventure Travel
• Bus & Rail


Visitors Arrival Statistics vs. Google Trends


Analysis and Forecasting

Model:

$$\log(Y_{i,t}) = 0.664 + 0.113\,\log(Y_{i,t-1}) + 0.828\,\log(Y_{i,t-12}) + 0.001\,X_{i,t,2} + 0.001\,X_{i,t,3} + 0.005\,\mathrm{FXrate}_{i,t} + \eta_i + e_{i,t}$$

$$e_{i,t} \sim N(0,\ 0.0938^2), \qquad \eta_i \sim N(0,\ 0.0228^2)$$

• $Y_{i,t}$ = Arrivals to Hong Kong in month t from the i-th country
• $X_{i,t,1}$ = Google Trends search index in the 1st week of month t from the i-th country
• $X_{i,t,2}$ = Google Trends search index in the 2nd week of month t from the i-th country
• $X_{i,t,3}$ = Google Trends search index in the 3rd week of month t from the i-th country
• $\mathrm{FXrate}_{i,t}$ = Hong Kong Dollars per one unit of the i-th country's local currency in month t. The average FX rate over the first week is used as a proxy for each month's FX rate.


Visitor Arrival Statistics - Actual & Fitted


Analysis and Forecasting

Conclusion

Arrivals at time t are positively associated with arrivals at time t-1 and at time t-12.

• Arrivals show strong seasonality and autocorrelation.

Arrivals at time t are positively associated with searches on [Hong Kong].

Arrivals at time t are positively associated with FX rates.

• When the local currency appreciates relative to the Hong Kong Dollar, visitors to Hong Kong increase.


Automobiles


US Auto Sales by Make
• Monthly summaries released 1 week after end of month
• Data available by Car Sales, Truck Sales and Total Sales for each make
• Data available from 2003-2008
• Source: Automotive News Data Center

Google Trends under Vehicle Brands Category
• Google Trends subcategory Vehicle Brands
• Weekly search query index
• Total 31 verticals in this subcategory
• 27 verticals matching makes with Monthly Sales available


Google Categories under Vehicle Brands

NOTE: Area represents query volume in the first half of 2008; color represents the yearly growth rate of queries.


Auto Sales by Make (Top 9 Makes by Sales): Monthly Sales vs. Google Trends at Second Week of Each Month


Analysis and Forecasting

Fixed effects model:

$$\log(Y_{i,t}) = 2.4276 + 0.2552\,\log(Y_{i,t-1}) + 0.4930\,\log(Y_{i,t-12}) + 0.0005\,X_{i,t,1} + 0.0014\,X_{i,t,2} + a_i\,\mathrm{Make}_i + e_{i,t}$$

$$e_{i,t} \sim N(0,\ 0.1347^2), \qquad \text{Adjusted } R^2 = 0.9829$$

• $Y_{i,t}$ = Auto Sales of the i-th make in month t
• $X_{i,t,1}$ = Google Trends search index in the 1st week of month t for the i-th make
• $X_{i,t,2}$ = Google Trends search index in the 2nd week of month t for the i-th make
• $\mathrm{Make}_i$ = Dummy variable for Auto Make
• $a_i$ = Coefficient to capture the mean level of Auto Sales by Make

ANOVA Table

                     Df    Sum Sq   Mean Sq      F value   Pr(>F)
trends1               1     12.89     12.89     710.3542   < 2e-16 ***
trends2               1      0.05      0.05       2.7987   0.09455 .
log(s1)               1   1532.95   1532.95   84452.7530   < 2e-16 ***
log(s12)              1     24.07     24.07    1325.9741   < 2e-16 ***
as.factor(brand)     26      3.34      0.13       7.0696   < 2e-16 ***
Residuals          1480     26.86      0.02


Actual vs. Fitted Sales (Top 9 Make by Sales)


Analysis and Forecasting

Conclusion

Sales at time t are positively associated with Sales at time t-1 and Sales at time t-12.

• Sales show strong seasonality and autocorrelation

Monthly Sales are positively correlated with search volume in the first and second weeks of each month.

• If search volume increases by 1%, sales volume increases by an average of 0.19%.
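The 0.19% figure appears to follow from the two Trends coefficients in the fixed effects model (0.0005 and 0.0014). The sketch below checks the arithmetic under the assumption that the Trends index is measured in percentage points, so that a "1%" increase means both weekly indexes rise by one unit; the variable and function names are hypothetical.

```python
import math

# Coefficients on the week-1 and week-2 Trends indexes in the log-sales model.
b_week1 = 0.0005
b_week2 = 0.0014


def pct_change_in_sales(delta_week1, delta_week2):
    """Percent change in sales implied by changes in the two Trends indexes.

    The model is linear in log(sales), so the implied change is
    exp(b1 * d1 + b2 * d2) - 1, expressed here in percent.
    """
    return 100.0 * math.expm1(b_week1 * delta_week1 + b_week2 * delta_week2)


print(round(pct_change_in_sales(1, 1), 2))  # 0.19
```

For coefficients this small the exponential is nearly linear, so the result is essentially the sum of the two coefficients times 100.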


Unemployment


YoY Growth in Initial Claims & Google Search

According to the NBER, the current recession started December 2007.

The national unemployment rate passed 5% in mid-2008, and search queries on [Welfare and Unemployment] increased at the same time.


Initial claims is an important leading indicator


Google Trends data [Search Insights screenshot]


Initial Claims and Google Trends

California, weekly data for March 2009, April 2009 and May 2009

US Dept of Labor

Week                 Initial Claims   Continued Claims   Covered Employment   Insured Unemployment Rate
3/15/09 - 3/21/09    81,236           859,561            15,395,215           5.58
3/22/09 - 3/28/09    74,179           826,924            15,395,215           5.37
3/29/09 - 4/4/09     69,471           866,734            15,395,215           5.63
4/5/09 - 4/11/09     75,875           834,569            15,356,117           5.43
4/12/09 - 4/18/09    84,410           846,477            15,356,117           5.51
4/19/09 - 4/25/09    Release at 5/7/09
4/26/09 - 5/2/09     Release at 5/14/09

Google Trends

Week                 Jobs    Welfare & Unemployment
3/15/09 - 3/21/09    9%      -2%
3/22/09 - 3/28/09    6%      -9%
3/29/09 - 4/4/09     2%      -13%
4/5/09 - 4/11/09     0%      -12%
4/12/09 - 4/18/09    1%      -6%
4/19/09 - 4/25/09    -9%     -9%
4/26/09 - 5/2/09     -11%    -10%


Strong Autocorrelation in Initial Claims

Time Series Autocorrelation Function
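For readers unfamiliar with the autocorrelation function, a minimal sketch of the usual sample estimate follows. The function name and the sample series are hypothetical and are not the claims data behind the plot.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag.

    Uses the common estimator: lag-k autocovariance divided by the
    lag-0 autocovariance, both computed around the overall mean.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical weekly claims (thousands) with a steady upward drift.
claims = [70, 72, 75, 79, 84, 88, 91, 95, 99, 104]
acf = autocorrelation(claims, 3)
print([round(r, 2) for r in acf])
```

A series that drifts steadily, as in this made-up example, shows large positive values at short lags, which is the pattern the slide title describes.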


Initial Claims Before/After Recession Started

California New York


Time Window for Analysis

Window For Long Term Model

Window For Short Term Model

Recession Starts


Model

• Reference ARIMA(0,1,1) × (1,0,0)_12 model
• ARIMA(0,1,1) × (1,0,0)_12 model with Google Trends

Model fit improves significantly: smaller standard deviation, higher log likelihood and smaller AIC.

Initial Claims are positively correlated with searches on Jobs and Welfare.

Reference Model

            Theta        Phi         Sigma   log likelihood   AIC
LT Model    -0.755 ***   0.619 ***   0.086   268.85           -531.69
ST Model    -0.691 ***   0.463 ***   0.098    99.04           -192.08

Model with Google Trends

            Theta        Phi         Jobs       Welfare     Sigma   log likelihood   AIC
LT Model    -0.725 ***   0.565 ***   0.004 **   0.003 **    0.083   285.96           -561.91
ST Model    -0.657 ***   0.359 **    0.002      0.007 ***   0.088   114.19           -218.38

Signif. codes: 0.001 ‘***’ 0.05 ‘**’ 0.01 ‘*’
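The reported AIC values can be checked against the reported log likelihoods. The sketch below assumes the usual definition AIC = 2k - 2 log L and assumes that sigma is counted as a parameter, giving k = 3 for the reference model (Theta, Phi, Sigma) and k = 5 once the Jobs and Welfare regressors are added. The dictionary layout and names are illustrative.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


# (log likelihood, assumed parameter count, AIC reported in the table)
reported = {
    "LT reference":   (268.85, 3, -531.69),
    "LT with Trends": (285.96, 5, -561.91),
    "ST reference":   (99.04, 3, -192.08),
    "ST with Trends": (114.19, 5, -218.38),
}

for name, (log_lik, k, table_aic) in reported.items():
    print(f"{name}: computed {aic(log_lik, k):.2f}, reported {table_aic}")
```

Under these assumptions the computed values agree with the table to within rounding, and the lower AIC for the Trends models holds even after the penalty for two extra parameters.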


Long Term Model: Prediction Comparison with MAE

With Google Trends, the out-of-sample prediction MAE decreases by 16.84%. Prediction with rolling window from 1/11/2009 to 4/12/2009

Prediction Error at t:

Mean Absolute Error:
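The formulas for these two quantities did not survive in this transcript. The sketch below uses the standard definitions (error = actual minus predicted; MAE = average of the absolute errors), which is an assumption about what the slide showed. Whether the errors are measured on the log scale or in levels is likewise not recoverable here, and the sample numbers are made up.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute one-step prediction error."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)


def pct_reduction(mae_reference, mae_with_trends):
    """Percent decrease in MAE relative to the reference model."""
    return 100.0 * (mae_reference - mae_with_trends) / mae_reference


# Hypothetical out-of-sample values for four forecast dates.
actual = [10.0, 12.0, 11.0, 13.0]
reference_forecast = [9.0, 13.5, 10.0, 14.5]
trends_forecast = [9.5, 12.5, 10.5, 14.0]

mae_ref = mean_absolute_error(actual, reference_forecast)
mae_trends = mean_absolute_error(actual, trends_forecast)
print(mae_ref, mae_trends, pct_reduction(mae_ref, mae_trends))
```

With a rolling window, each forecast would be produced by a model refit on the data available up to that date, and the MAE is then taken over all forecast dates in the window.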


Short Term Model: Prediction Comparison with MAE

With Google Trends, the out-of-sample prediction MAE decreases by 19.23%. Prediction errors are within the same range as LT Model.

Fit improvement is better with ST Model.


Summary

Google Trends significantly improves out-of-sample prediction of state unemployment, up to 18 days in advance of data release.

Mean absolute error for out-of-sample predictions declines by 16.84% for LT Model and 19.23% for ST Model.

Further work
• Can examine metro level data
• Other local data (real estate)
• Combine with other predictors
• Detect turning points?

