google trends predicting present v2
TRANSCRIPT
-
8/22/2019 Google Trends Predicting Present v2
1/42
Google Confidential and Proprietary 1
Predicting the Present With Google Trends
Hyunyoung Choi
Hal Varian
June 2009
-
Problem statement
Government agencies and other organizations produce monthly reports on economic activity
Retail Sales
House Sales
Automotive Sales
Unemployment
Problems with reports
Compilation delay of several weeks
Subsequent revisions
Sample size may be small
Not available at all geographic levels
Google Trends releases daily and weekly index of search queries by industry vertical
Real time data
No revisions (but some sampling variation)
Large samples
Available by country, state and city
Can Google Trends data help predict current economic activity?
Before release of preliminary statistics
Before release of final revision
-
Categories in Google Trends by Query Shares
Note: Queries from 2009-01-01 to 2009-04-30 & Growth Comparison w/ the same time window
-
Real Estate
-
Geography
Category
Time window
-
Subcategories under Real Estate by Query Shares
Real Estate Agencies
Rental Listings & Referrals
Home Insurance
Home Inspections & Appraisal
Property Management
Home Financing
-
Searches on Real Estate Agencies
-
Searches on Rental Listings & Referrals
-
Depicting trends
Google Trends measures the normalized query share of a particular category of queries, which controls for overall growth in query volume.
It is often useful to look at year-on-year changes to eliminate seasonality.
Illustrate correlations and covariates.
Improving predictions
Forecast a time series using its own lagged values and add Trends data as a predictor.
Statistical significance?
Improved fit?
Improved forecasts?
Identify turning points?
[Figure: Real Estate Agencies Query Index, 2006-2008]
[Figure: Real Estate Agencies YOY Growth Index]
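The year-on-year idea on this slide can be sketched in a few lines of Python. This is only an illustration: the deck does not say whether its YOY index is a simple difference or a percent change, so both variants are shown, and the function names and the sample data are hypothetical.

```python
def yoy_difference(index, period=12):
    """Year-on-year change as a difference: index[t] - index[t - period].

    `period` is the number of observations per year (12 for monthly data).
    The result is shorter than the input by `period` entries because the
    first year has nothing to be compared against.
    """
    return [index[t] - index[t - period] for t in range(period, len(index))]


def yoy_percent_change(index, period=12):
    """Year-on-year change in percent; only meaningful when base values are positive."""
    return [100.0 * (index[t] - index[t - period]) / index[t - period]
            for t in range(period, len(index))]


# Hypothetical monthly query index: the same seasonal pattern in both years,
# shifted up by 5 index points in the second year.
seasonal = [-10.0, -5.0, 0.0, 8.0, 12.0, 15.0, 12.0, 8.0, 0.0, -5.0, -12.0, -20.0]
index = seasonal + [v + 5.0 for v in seasonal]

# The seasonal swings cancel; only the 5-point shift remains.
print(yoy_difference(index))
```

Differencing at lag 12 removes any pattern that repeats exactly every year, which is why the seasonal swings vanish in the example.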
-
15 yr Mortgage Rate vs. Home Financing
-
Forecasting primer
Basic forecasting models
Autoregressive: value at time t depends on
Value at time t-1
Seasonal adjustment: value at time t depends on
Value at time t-12
For monthly data
Transfer function: value at time t depends on
Other contemporaneous or lagging variables
Seasonal autoregressive transfer model: Value at time t depends on
Value at time t-12 (seasonality)
Value at time t-1 (recent behavior)
Other lagging or contemporaneous variables (such as Google Trends data)
Typical question of interest
How much more accurate do forecasts become when additional variables are included, over and above the accuracy obtained from the history of the time series itself?
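A minimal sketch of the seasonal autoregressive transfer model described above, assuming it is fitted by ordinary least squares on an intercept, the value at t-1, the value at t-12, and one contemporaneous predictor. The deck does not state the estimation method; the helper names, the synthetic data, and the coefficients used to generate it are all hypothetical.

```python
import random


def design_row(y, x, t):
    """Regressors for time t: intercept, y[t-1], y[t-12], x[t]."""
    return [1.0, y[t - 1], y[t - 12], x[t]]


def solve(a, b):
    """Solve the square system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = s / m[i][i]
    return beta


def fit_ols(rows, targets):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yv for r, yv in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free data generated from known (made-up) coefficients,
# so the fit should recover them almost exactly.
rng = random.Random(0)
n = 120
x = [rng.random() for _ in range(n)]          # stand-in for a Trends index
y = [10.0] * 12                               # first year of history
for t in range(12, n):
    y.append(1.0 + 0.5 * y[t - 1] + 0.3 * y[t - 12] + 2.0 * x[t])

rows = [design_row(y, x, t) for t in range(12, n)]
beta = fit_ols(rows, y[12:])
print([round(b, 3) for b in beta])            # intercept, lag 1, lag 12, predictor
```

To address the "typical question of interest", one would fit the same regression with and without the last column and compare forecast accuracy on held-out months.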
-
New Home Sales
Model
Time Series
Recent trend with New Home Sales at t-1
Seasonality with New Home Sales at t-12
Google Trends
Recent search activity on
Real Estate Agencies
Rental Listings & Referrals
Home Inspections & Appraisal
Property Management
Home Insurance
Home Financing
Exogenous Variables
Housing affordability with Average/Median Home Price
-
Predicting the present
New Residential Sales from US Census
Monthly release 24-28 days after the month
Seasonally adjusted
National and regional aggregates
Google Trends Real Estate by Category
Home Inspections & Appraisal
Home Insurance
Home Financing
Property Management
Rental Listings & Referrals
Real Estate Agencies
-
New House Sales vs. Real Estate Google Trends
-
Analysis and Forecasting
Model:
$Y_t = 446.1 + 0.864\,Y_{t-1} - 4.340\,\mathrm{us378.1}_t + 4.198\,\mathrm{us96.2}_t - 0.001\,\mathrm{AvgP}_{t-1}$
$Y_t$: New houses sold in month t
$\mathrm{AvgP}_{t-1}$: Average sales price of new one-family houses sold in month t-1
us378.1: Google Trends index for vertical id 378 (Rental Listings & Referrals), 1st week of month t
us96.2: Google Trends index for vertical id 96 (Real Estate Agencies), 2nd week of month t
July 2008
Actual = 515K
Predicted = 442.98K
Z-score = 2.53
August 2008 Prediction = 417.52K
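The fitted equation and the z-score on this slide can be evaluated as below. This is a sketch under assumptions: the minus signs on the Rental Listings and average-price terms follow the negative associations reported in the Observations slide, the deck does not state the units of the inputs, and it does not state the standard deviation behind the z-score, so that value is backed out from the reported numbers. Function names are hypothetical.

```python
def predict_new_home_sales(prev_sales, rental_w1, agencies_w2, prev_avg_price):
    """Evaluate the slide's fitted equation for new home sales.

    Inputs must be in whatever units the authors used when fitting,
    which the deck does not state.
    """
    return (446.1
            + 0.864 * prev_sales        # Y at t-1
            - 4.340 * rental_w1         # vertical 378, week 1 of month t
            + 4.198 * agencies_w2       # vertical 96, week 2 of month t
            - 0.001 * prev_avg_price)   # average sales price at t-1


def z_score(actual, predicted, sigma):
    """Standardized forecast error."""
    return (actual - predicted) / sigma


def implied_sigma(actual, predicted, z):
    """Standard deviation implied by a reported z-score."""
    return (actual - predicted) / z


# July 2008 figures from the slide, in thousands of houses.
sigma = implied_sigma(515.0, 442.98, 2.53)
print(round(sigma, 2))
```

The implied standard deviation comes out at roughly 28.5K, which means the July 2008 actual sat about two and a half standard deviations above the model's prediction.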
-
Analysis and Forecasting
Observations
Since 2005 new house sales have been decreasing, with little seasonality
Google Trends captures seasonality & recent trends
Positive association with Real Estate Agencies (96)
Negative association with Rental Listings & Referrals (378) and Average Price
-
Travel
-
Subcategories under Travel by Query Shares
Hotels & Accommodations
Attractions & Activities
Air Travel
Bus & Rail
Cruises & Charters
Adventure Travel
Car Rental & Taxi Services
Vacation Destinations
-
Travel to Hong Kong
Visitors Arrival Statistics from Hong Kong Tourism Board
Monthly summaries released with a 1-month lag
Reports country/territory of residence of visitors
Data available 2004-2008
Google Trends Travel by Category
Hotels & Accommodations
Air Travel
Car Rental & Taxi Services
Cruises & Charters
Attractions & Activities
Vacation Destinations: Australia, Caribbean Islands, Hawaii, Hong Kong, Las Vegas, Mexico, New York City, Orlando
Adventure Travel
Bus & Rail
-
Visitors Arrival Statistics vs. Google Trends
-
Analysis and Forecasting
Model:
$\log(Y_{i,t}) = 0.664 + 0.113\,\log(Y_{i,t-1}) + 0.828\,\log(Y_{i,t-12}) + 0.001\,X_{i,t,2} + 0.001\,X_{i,t,3} + 0.005\,\mathrm{FXrate}_{i,t} + \alpha_i + e_{i,t}$
$e_{i,t} \sim N(0, 0.0938^2)$, $\alpha_i \sim N(0, 0.0228^2)$
$Y_{i,t}$ = Arrivals to Hong Kong in month t from the i-th country
$X_{i,t,1}$ = Google Trends search index in the 1st week of month t from the i-th country
$X_{i,t,2}$ = Google Trends search index in the 2nd week of month t from the i-th country
$X_{i,t,3}$ = Google Trends search index in the 3rd week of month t from the i-th country
$\mathrm{FXrate}_{i,t}$ = Hong Kong Dollars per one unit of the i-th country's local currency in month t. The average of the first week's FX rate is used as a proxy for the FX rate of each month.
$\alpha_i$ = Random effect for the i-th country
-
Visitor Arrival Statistics - Actual & Fitted
-
Analysis and Forecasting
Conclusion
Arrival at time t is positively associated with arrival at time t-1 and arrival at time t-12.
It shows strong seasonality and autocorrelation
Arrival at time t is positively associated with searches on [Hong Kong].
Arrival at time t is positively associated with FX rates.
When the local currency appreciates relative to Hong Kong Dollar, visitors to Hong Kong increase.
-
Automobiles
-
US Auto Sales by Make
Monthly summaries released 1 week after end of month
Data available by Car Sales, Truck Sales and Total Sales for each make
Data available from 2003-2008
Source: Automotive News Data Center
Google Trends under Vehicle Brands Category
Google Trends subcategory Vehicle Brands
Weekly search query index
Total 31 verticals in this subcategory
27 verticals matching to monthly sales available
-
Google Categories under Vehicle Brands
NOTE: Area represents query volume in the first half of 2008; color represents the yearly growth rate of queries.
-
Auto Sales by Make (Top 9 Makes by Sales)
Monthly Sales vs. Google Trends at Second Week of Each Month
-
Analysis and Forecasting
Fixed effects model:
$\log(Y_{i,t}) = 2.4276 + 0.2552\,\log(Y_{i,t-1}) + 0.4930\,\log(Y_{i,t-12}) + 0.0005\,X_{i,t,1} + 0.0014\,X_{i,t,2} + a_i\,\mathrm{Make}_i + e_{i,t}$
$e_{i,t} \sim N(0, 0.1347^2)$, Adjusted $R^2$ = 0.9829
$Y_{i,t}$ = Auto sales of the i-th make in month t
$X_{i,t,1}$ = Google Trends search index in the 1st week of month t for the i-th make
$X_{i,t,2}$ = Google Trends search index in the 2nd week of month t for the i-th make
$\mathrm{Make}_i$ = Dummy variable for auto make
$a_i$ = Coefficient capturing the mean level of auto sales by make
ANOVA Table
                    Df   Sum Sq  Mean Sq     F value  Pr(>F)
trends1              1    12.89    12.89    710.3542  < 2e-16 ***
trends2              1     0.05     0.05      2.7987  0.09455 .
log(s1)              1  1532.95  1532.95  84452.7530  < 2e-16 ***
log(s12)             1    24.07    24.07   1325.9741  < 2e-16 ***
as.factor(brand)    26     3.34     0.13      7.0696  < 2e-16 ***
Residuals         1480    26.86     0.02
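As a reading aid for the ANOVA table, each F value is the term's mean square divided by the residual mean square. The sketch below recomputes them from the rounded Sum Sq and Df columns, so the results only approximately match the printed F values; the function and variable names are hypothetical.

```python
def f_statistic(sum_sq, df, resid_sum_sq, resid_df):
    """F = (Sum Sq / Df) / (residual Sum Sq / residual Df)."""
    return (sum_sq / df) / (resid_sum_sq / resid_df)


# (Sum Sq, Df, reported F value) copied from the table above.
anova = {
    "trends1": (12.89, 1, 710.3542),
    "trends2": (0.05, 1, 2.7987),
    "log(s1)": (1532.95, 1, 84452.7530),
    "log(s12)": (24.07, 1, 1325.9741),
    "as.factor(brand)": (3.34, 26, 7.0696),
}
resid_sum_sq, resid_df = 26.86, 1480

recomputed = {
    term: f_statistic(ss, df, resid_sum_sq, resid_df)
    for term, (ss, df, _) in anova.items()
}
for term, value in recomputed.items():
    print(f"{term:18s} recomputed F = {value:10.2f}  reported F = {anova[term][2]:10.2f}")
```

The small gaps between recomputed and reported values come from the two-decimal rounding of the Sum Sq column.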
-
Actual vs. Fitted Sales (Top 9 Makes by Sales)
-
Analysis and Forecasting
Conclusion
Sales at time t are positively associated with Sales at time t-1 and Sales at time t-12.
Sales show strong seasonality and autocorrelation
Monthly sales are positively correlated with search volume in the first and second weeks of each month.
If the search volume increases by 1%, the sales volume increases by 0.19% on average.
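One plausible reading of the 0.19% figure, offered as an assumption rather than the authors' stated derivation: in a model for log sales, a one-point rise in a Trends index with coefficient b multiplies sales by exp(b). Raising both weekly indexes by one point gives exp(0.0005 + 0.0014) - 1, which is about 0.19%. The function name below is hypothetical.

```python
import math


def pct_change_in_sales(delta_week1, delta_week2, b1=0.0005, b2=0.0014):
    """Percent change in sales implied by the log-linear model when the
    week-1 and week-2 Trends indexes move by the given number of points.

    b1 and b2 are the two Google Trends coefficients from the fixed
    effects model; all other regressors are held fixed.
    """
    return 100.0 * (math.exp(b1 * delta_week1 + b2 * delta_week2) - 1.0)


# One index point in both weeks.
print(round(pct_change_in_sales(1, 1), 2))
```

Under this reading the "1%" in the conclusion refers to one point of the Trends index, which would be a percentage-point move if the index is expressed in percent.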
-
Unemployment
-
YoY Growth in Initial Claims & Google Search
According to the NBER, the current recession started December 2007.
The national unemployment rate passed 5% in mid-2008, and search queries on [Welfare and Unemployment] increased at the same time.
-
Initial claims is an important leading indicator
Google Trends data [Search Insights screenshot]
-
Google Trends data [Search Insights screenshot]
-
Initial Claims and Google Trends: California
Months covered: March 2009, April 2009, May 2009
Week (2009)                      3/15-3/21   3/22-3/28    3/29-4/4    4/5-4/11   4/12-4/18   4/19-4/25    4/26-5/2
US Dept of Labor
Initial Claims                      81,236      74,179      69,471      75,875      84,410     pending     pending
Continued Claims                   859,561     826,924     866,734     834,569     846,477     pending     pending
Covered Employment              15,395,215  15,395,215  15,395,215  15,356,117  15,356,117     pending     pending
Insured Unemployment Rate (%)         5.58        5.37        5.63        5.43        5.51     pending     pending
Google Trends
Jobs                                    9%          6%          2%          0%          1%         -9%        -11%
Welfare & Unemployment                 -2%         -9%        -13%        -12%         -6%         -9%        -10%
US Dept of Labor figures for the week 4/19-4/25: release at 5/7/09
US Dept of Labor figures for the week 4/26-5/2: release at 5/14/09
-
Strong Autocorrelation in Initial Claims
Time Series Autocorrelation Function
-
Initial Claims Before/After Recession Started
California New York
-
Time Window for Analysis
Window For Long Term Model
Window For Short Term Model
Recession Starts
-
Model
Reference ARIMA(0,1,1) x (1,0,0)_12 model
ARIMA(0,1,1) x (1,0,0)_12 model with Google Trends
Model fit improved significantly: smaller standard deviation, higher log likelihood and smaller AIC.
Initial Claims are positively correlated with searches on Jobs and Welfare.
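For reference, a seasonal ARIMA(0,1,1) x (1,0,0)_12 model with external regressors can be written as below. This is a sketch under assumptions the deck does not confirm: the Trends variables enter as regression terms with ARIMA errors, the moving-average sign follows the common (1 + theta B) convention, and y_t stands for the modelled initial-claims series, whose transformation is not stated. The symbols beta_1, beta_2 and n_t are introduced here for illustration.

```latex
\begin{aligned}
y_t &= \beta_1\,\mathrm{Jobs}_t + \beta_2\,\mathrm{Welfare}_t + n_t,\\
(1 - \Phi B^{12})\,(1 - B)\,n_t &= (1 + \theta B)\,\varepsilon_t,
\qquad \varepsilon_t \sim N(0, \sigma^2),
\end{aligned}
```

Here B is the backshift operator (B n_t = n_{t-1}), theta is the non-seasonal moving-average coefficient, Phi is the seasonal autoregressive coefficient at lag 12, and the reference model corresponds to beta_1 = beta_2 = 0.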
Reference Model
            Theta        Phi          Sigma   Log likelihood   AIC
LT Model    -0.755 ***   0.619 ***    0.086   268.85           -531.69
ST Model    -0.691 ***   0.463 ***    0.098   99.04            -192.08
Model with Google Trends
            Theta        Phi          Jobs        Welfare      Sigma   Log likelihood   AIC
LT Model    -0.725 ***   0.565 ***    0.004 **    0.003 **     0.083   285.96           -561.91
ST Model    -0.657 ***   0.359 **     0.002       0.007 ***    0.088   114.19           -218.38
Signif. codes: 0.001 *** 0.05 ** 0.01 *
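The AIC column can be cross-checked against the log likelihoods with AIC = 2k - 2 log L. The parameter counts below are an assumption that happens to reproduce the table: k = 3 for the reference model (Theta, Phi and the innovation variance) and k = 5 once the Jobs and Welfare coefficients are added. Names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# (log likelihood, assumed parameter count, AIC reported in the table)
fits = {
    "LT reference": (268.85, 3, -531.69),
    "LT with Trends": (285.96, 5, -561.91),
    "ST reference": (99.04, 3, -192.08),
    "ST with Trends": (114.19, 5, -218.38),
}

for name, (loglik, k, reported) in fits.items():
    print(f"{name:15s} computed AIC = {aic(loglik, k):8.2f}  reported = {reported:8.2f}")
```

Because AIC charges two points per extra parameter, the Trends models come out ahead only if the log likelihood rises by more than 2 for the two added coefficients, which it does in both windows.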
-
Long Term Model: Prediction Comparison with MAE
With Google Trends, the out-of-sample prediction MAE decreases by 16.84%.
Prediction with rolling window from 1/11/2009 to 4/12/2009
Prediction Error at t:
Mean Absolute Error:
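The formulas for these two quantities did not survive in this transcript. A minimal sketch of how such a comparison is commonly computed follows, assuming the prediction error is the log difference between predicted and actual values (approximately the relative error) and the MAE is the average of its absolute values. The function names and sample numbers are hypothetical.

```python
import math


def prediction_errors(actual, predicted):
    """Log-scale errors, log(predicted) - log(actual), one per forecast date."""
    return [math.log(p) - math.log(a) for a, p in zip(actual, predicted)]


def mean_absolute_error(errors):
    """Average of the absolute prediction errors."""
    return sum(abs(e) for e in errors) / len(errors)


def pct_improvement(mae_reference, mae_with_trends):
    """Percent reduction in MAE relative to the reference model."""
    return 100.0 * (mae_reference - mae_with_trends) / mae_reference


# Made-up weekly claims and two sets of one-step-ahead forecasts.
actual = [80000.0, 76000.0, 71000.0, 75000.0]
reference = [84000.0, 72000.0, 75000.0, 71000.0]
with_trends = [82000.0, 74000.0, 73000.0, 73000.0]

mae_ref = mean_absolute_error(prediction_errors(actual, reference))
mae_trends = mean_absolute_error(prediction_errors(actual, with_trends))
print(round(pct_improvement(mae_ref, mae_trends), 1))
```

In a rolling-window evaluation the model is refitted at each date using only data available at that time, and the errors are collected across dates before averaging.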
-
Short Term Model: Prediction Comparison with MAE
With Google Trends, the out-of-sample prediction MAE decreases by 19.23%.
Prediction errors are within the same range as LT Model.
Fit improvement is better with ST Model.
-
Summary
Google Trends significantly improves out-of-sample prediction of state unemployment, up to 18 days in advance of data release.
Mean absolute error for out-of-sample predictions declines by 16.84% for LT Model and 19.23% for ST Model.
Further work
Can examine metro level data
Other local data (real estate)
Combine with other predictors
Detect turning points?