empirical asset pricing via machine learning

Empirical Asset Pricing via Machine Learning Shihao Gu 1 Bryan Kelly 2 Dacheng Xiu 1 1 University of Chicago Booth School of Business 2 Yale School of Management New Methods for the Cross Section of Returns Sept. 2018, University of Chicago

Page 1: Empirical Asset Pricing via Machine Learning

Empirical Asset Pricing via Machine Learning

Shihao Gu¹, Bryan Kelly², Dacheng Xiu¹

¹ University of Chicago Booth School of Business

² Yale School of Management

New Methods for the Cross Section of Returns, Sept. 2018, University of Chicago

Page 3: Empirical Asset Pricing via Machine Learning

The Hype

The Intellectually Honest Answer

[Figure: news headlines]

- CNBC, July 12, 2017
- Bloomberg, July 15, 2017
- Bloomberg, Aug 9, 2017 (same reporter, 3 weeks later)
- Economist, May 2017
- Economist, Dec 2017

Despite this excitement, understanding of role/value of these methods in asset pricing setting is barely nascent

Page 4: Empirical Asset Pricing via Machine Learning

What We Do

- Comparative analysis of machine learning methods in context of perhaps most widely studied problem in finance, measuring equity risk premia

Clearest way to understand the relevance of ML for AP is to apply methods and compare performance in familiar empirical problem

Two Primary Contributions

1. New benchmark of accuracy in measuring risk premia (aggregate and individual asset)
   - Unprecedented out-of-sample predictive R²
   - Strategies that leverage ML forecasts earn Sharpe ratios > 2.0

2. Synthesize empirical AP with field of machine learning
   - Show ML well positioned to push frontier of risk premium measurement
   - We provide comparative overview of ML methods applied to the two canonical problems of empirical asset pricing: predicting returns in the cross section and time series

Page 7: Empirical Asset Pricing via Machine Learning

Return prediction is economically meaningful.

Fundamental goal of asset pricing is to understand the behavior of risk premia. If expected returns were perfectly observed, we would still need theories to explain their behavior and empirical analysis to test those theories

Efficient markets.

Risk premia are notoriously difficult to measure

- market efficiency forces return variation to be dominated by unforecastable news
- our empirical exercise is non-trivial because of the extremely low signal-to-noise ratio

Page 8: Empirical Asset Pricing via Machine Learning

What is Machine Learning?

Definition of “machine learning” is inchoate and often context specific.

Our definition:

i. a diverse collection of high-dimensional models for statistical prediction

+

ii. regularization methods for model selection and mitigation of overfit

+

iii. efficient algorithms for searching among a vast number of potential model specifications

Page 9: Empirical Asset Pricing via Machine Learning

Why Apply Machine Learning to Asset Pricing?

Reason 1: Measuring an asset’s risk premium is fundamentally a problem of prediction

Fama Nobel Lecture: Two pillars of empirical asset pricing research

1. Describe/understand differences in risk premia across assets

2. Describe/understand dynamics of the market equity risk premium

A risk premium is a conditional expectation of a future realized excess return.

ML methods specialize in prediction tasks, thus ideally suited to risk premium measurement.

Page 10: Empirical Asset Pricing via Machine Learning

Why Apply Machine Learning to Asset Pricing?

Reason 2: The collection of candidate conditioning variables for the risk premium is large

- We’ve accumulated a staggering list of return predictors
- They are often close cousins and highly correlated

Traditional prediction methods break down when predictor count approaches the observation count and/or predictors are highly correlated

- With its emphasis on variable selection and dimension reduction, ML well suited for such challenging empirical issues
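
A minimal simulated sketch of the breakdown described on this slide. Nothing here comes from the talk's data: the sample sizes, the one-factor correlation structure, the coefficient values, and the penalty `lam` are all illustrative assumptions, and ridge regression is swapped in as the simplest penalized estimator (the talk itself uses elastic net).

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_obs, n_pred, rng):
    """Highly correlated predictors (one common factor), weak signal, heavy noise."""
    common = rng.standard_normal((n_obs, 1))
    X = 0.9 * common + 0.45 * rng.standard_normal((n_obs, n_pred))
    beta = np.zeros(n_pred)
    beta[:5] = 0.05                      # assumption: only a few predictors matter
    y = X @ beta + rng.standard_normal(n_obs)
    return X, y

n_pred = 100
X_tr, y_tr = simulate(120, n_pred, rng)   # predictor count close to observation count
X_te, y_te = simulate(5000, n_pred, rng)  # large hold-out sample

# Unpenalized least squares
beta_ols = np.linalg.lstsq(X_tr, y_tr, rcond=None)[0]

# Ridge: closed-form solution with an (arbitrary, untuned) penalty
lam = 50.0
beta_ridge = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(n_pred), X_tr.T @ y_tr)

def oos_mse(beta):
    """Mean squared forecast error on the hold-out sample."""
    return float(np.mean((y_te - X_te @ beta) ** 2))

mse_ols, mse_ridge = oos_mse(beta_ols), oos_mse(beta_ridge)
print(f"OLS out-of-sample MSE:   {mse_ols:.2f}")
print(f"Ridge out-of-sample MSE: {mse_ridge:.2f}")
```

In this setup the unpenalized fit has roughly as many free parameters as observations, so its out-of-sample error is dominated by estimation noise, while shrinkage keeps the error near the irreducible noise level.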

Page 11: Empirical Asset Pricing via Machine Learning

Why Apply Machine Learning to Asset Pricing?

Reason 3: Functional form is unknown and likely complex

- Theoretical literature offers little guidance for winnowing list of conditioning variables and functional forms

Three aspects of ML suited to problems of ambiguous functional form

1. Suite of dissimilar methods. Casts wide net in model search
2. Nonparametric design to approximate complex nonlinear associations
3. Parameter penalization and conservative model selection criteria complement complexity, help avoid overfit and false discovery

Page 12: Empirical Asset Pricing via Machine Learning

The (Very Familiar) Empirical Setting

∼100 stock characteristics (usual suspects)

+

∼10 macroeconomic predictors (a la Goyal-Welch)

Monthly returns on 1) individual stocks and 2) stock portfolios

Page 13: Empirical Asset Pricing via Machine Learning

Which Machine Learning Methods?

- Linear Models
  - OLS(3) includes value, size, momentum
  - OLS + Elastic Net + Huber’s Loss
- Dimension Reduction
  - PCA, PLS
- Generalized Linear Models
  - Additive Series Regression
- Regression Trees
  - Random Forest
  - Gradient Boosted Regression Trees
- Neural Networks aka “Deep Learning”
  - up to 5 hidden layers
  - around 30,000 parameters
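
As a rough illustration of the "OLS + Elastic Net + Huber's Loss" entry, the sketch below writes out one common textbook form of a Huber loss combined with an elastic net penalty. The exact parameterization used in the talk is not given on the slide, so the function names, the threshold `delta`, and the hyperparameters `lam` and `rho` are assumptions.

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear for large ones (robust to outliers)."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

def elastic_net_penalty(theta, lam=0.1, rho=0.5):
    """Mix of L1 (selection) and L2 (shrinkage); rho=0 is lasso, rho=1 is ridge."""
    theta = np.asarray(theta, dtype=float)
    return lam * ((1.0 - rho) * np.abs(theta).sum() + 0.5 * rho * (theta**2).sum())

def objective(theta, X, y, delta=1.0, lam=0.1, rho=0.5):
    """Penalized robust objective for a linear model y ~ X @ theta."""
    residual = y - X @ theta
    return float(huber_loss(residual, delta).mean() + elastic_net_penalty(theta, lam, rho))

# Small residuals are penalized quadratically, large ones only linearly
small, large = huber_loss(np.array([0.5, 3.0]), delta=1.0)
```

The Huber term limits the influence of extreme return observations, and the penalty term is what makes a linear model with a very large predictor set estimable.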

Page 14: Empirical Asset Pricing via Machine Learning

Main Empirical Findings

Machine learning holds promise for empirical asset pricing

1. Vast predictor sets viable in linear prediction when penalization used

2. Non-linearities substantially improve predictions

3. Shallow learning outperforms deeper learning

4. Distance between non-linear methods and benchmark widens when predicting portfolios

5. Gains from machine learning forecasts are economically large

6. Most successful predictors: price trends, liquidity, and volatility

Page 15: Empirical Asset Pricing via Machine Learning

Data and Over-arching Model

- Goal: Find best representation of $E_t(r_{i,t+1})$

- We consider general model

  $$E_t(r_{i,t+1}) = g^\star(z_{i,t}), \qquad z_{i,t} = x_t \otimes c_{i,t},$$

  - $x_t$: vector of macro predictors
  - $c_{i,t}$: vector of stock characteristics
  - $g^\star(\cdot)$: functional form approximated by ML

- Our framework nests typical (time-varying) beta-pricing specification

  $$r_{i,t+1} = E_t(r_{i,t+1}) + \underbrace{\beta_{i,t}'\,(F_{t+1} - E_t(F_{t+1})) + \varepsilon_{i,t+1}}_{\text{Noise}}, \qquad E_t(r_{i,t+1}) = \beta_{i,t}'\lambda_t,$$
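
The Kronecker product that builds $z_{i,t}$ from $x_t$ and $c_{i,t}$ can be sketched with NumPy. The vectors below are made-up numbers, and putting a constant as the first element of `x_t` (so that the raw characteristics also appear in `z_it`) is an assumption, not something stated on the slide.

```python
import numpy as np

def build_features(x_t, c_it):
    """z_{i,t} = x_t (Kronecker product) c_{i,t}: every macro predictor
    interacted with every stock characteristic."""
    return np.kron(np.asarray(x_t, dtype=float), np.asarray(c_it, dtype=float))

# Hypothetical inputs: a constant plus two macro predictors, four characteristics
x_t = np.array([1.0, 0.02, -0.5])
c_it = np.array([0.3, -0.7, 0.1, 0.9])

z_it = build_features(x_t, c_it)
print(z_it.shape)  # one entry per (macro predictor, characteristic) pair
```

With roughly 100 characteristics and 10 macro predictors, as in the empirical setting described earlier, this interaction structure produces on the order of a thousand inputs per stock-month.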

Page 16: Empirical Asset Pricing via Machine Learning

Training, Validation, and Testing

We divide the 60 years of data into

- 18 years of training sample
- 12 years of validation sample
- and remaining 30 years for out-of-sample testing.
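
A small sketch of the chronological 18/12/30 split. The starting year 1957 is an assumption: it is chosen only because it makes the 30-year test window run from 1987 to 2016, the range shown on the time axes of the later figures. The helper name `split_years` is hypothetical.

```python
def split_years(first_year, n_train=18, n_val=12, n_test=30):
    """Split consecutive calendar years into training, validation, and test blocks,
    keeping time order so the test block never precedes the data used for fitting."""
    train = list(range(first_year, first_year + n_train))
    val = list(range(train[-1] + 1, train[-1] + 1 + n_val))
    test = list(range(val[-1] + 1, val[-1] + 1 + n_test))
    return train, val, test

train, val, test = split_years(1957)  # assumed start year
print(train[0], train[-1])  # training window
print(val[0], val[-1])      # validation window (used to tune hyperparameters)
print(test[0], test[-1])    # out-of-sample testing window
```

The validation block is what the regularization settings are tuned on, which keeps the test block untouched for the out-of-sample comparisons.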

Page 17: Empirical Asset Pricing via Machine Learning

Empirical Assessments

- Predictive performance in statistical terms

  $$R^2_{OOS} = 1 - \frac{\sum_{(i,t)\in T_3} (r_{i,t+1} - \hat{r}_{i,t+1})^2}{\sum_{(i,t)\in T_3} (r_{i,t+1} - 0)^2}$$

- Predictive performance in economic terms
  - Sharpe ratio
- Model comparison
  - Diebold-Mariano tests
- Variable importance
  - Decrease in R² from exclusion
- Benchmark predictive model (“OLS-3”)
  - Linear model using size, value, and momentum
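
A minimal sketch of two of the assessments listed on this slide. `r2_oos` follows the slide's formula, in which the benchmark forecast in the denominator is zero rather than the historical mean. `importance_by_exclusion` is one possible reading of "decrease in R² from exclusion": it zeroes out one predictor while leaving the fitted model unchanged. That reading, and both function names, are assumptions.

```python
import numpy as np

def r2_oos(realized, predicted):
    """Out-of-sample R^2 against a zero forecast, pooled over all (i, t) in the test sample."""
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((realized - predicted) ** 2) / np.sum(realized ** 2)

def importance_by_exclusion(predict, X, realized, j):
    """Drop in R^2 when predictor j is set to zero and the model is not re-estimated."""
    X_excl = np.array(X, dtype=float, copy=True)
    X_excl[:, j] = 0.0
    return r2_oos(realized, predict(X)) - r2_oos(realized, predict(X_excl))

# Toy check with a hypothetical fitted linear model that only uses column 0
X = np.array([[1.0, 2.0], [3.0, 4.0], [-2.0, 0.5]])
predict = lambda features: features @ np.array([0.5, 0.0])
realized = predict(X)

drop_used = importance_by_exclusion(predict, X, realized, j=0)
drop_unused = importance_by_exclusion(predict, X, realized, j=1)
```

Because the denominator is not demeaned, a forecast of zero for every stock scores exactly 0, and a model can score below 0 if its forecasts are worse than predicting nothing.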

Page 18: Empirical Asset Pricing via Machine Learning

Individual Stock Return Prediction: Monthly

                OLS  OLS-3    PLS    PCR   ENet    GLM     RF   GBRT    NN1    NN2    NN3    NN4    NN5
All           -4.60   0.16   0.18   0.28   0.09   0.19   0.27   0.30   0.35   0.38   0.39   0.37   0.35
Top 1000     -14.21   0.15  -0.10  -0.05   0.10   0.17   0.62   0.53   0.44   0.58   0.72   0.67   0.69
Bottom 1000   -2.13   0.37   0.29   0.36   0.18   0.28   0.29   0.27   0.41   0.45   0.46   0.42   0.40

[Figure: bar chart of out-of-sample R² (vertical axis, -0.1 to 0.8) by model (OLS-3+H, PLS, PCR, ENet+H, GLM+H, RF, GBRT+H, NN1 to NN5), with separate bars for All, Top, and Bottom stocks]

Similar for big/small stocks, monthly/annual returns

Page 19: Empirical Asset Pricing via Machine Learning

“Bottom-up” Prediction of Portfolio Returns

Predict portfolio returns by aggregating individual stock return predictions

Given weights $w^p_{i,t}$ in portfolio $p$, and individual return predictions $\hat{r}_{i,t+1}$,

$$\hat{r}^p_{t+1} = \sum_{i=1}^{N} w^p_{i,t} \times \hat{r}_{i,t+1}$$
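
The bottom-up aggregation is a weighted sum, which the following sketch makes concrete. The weights and forecasts are made-up numbers and the function name is hypothetical.

```python
import numpy as np

def portfolio_forecast(weights, stock_forecasts):
    """Bottom-up portfolio forecast: sum over stocks of weight times predicted return."""
    weights = np.asarray(weights, dtype=float)
    stock_forecasts = np.asarray(stock_forecasts, dtype=float)
    if weights.shape != stock_forecasts.shape:
        raise ValueError("weights and forecasts must cover the same stocks")
    return float(weights @ stock_forecasts)

# Hypothetical three-stock portfolio
w = [0.5, 0.3, 0.2]
r_hat = [0.010, 0.020, -0.005]
r_hat_p = portfolio_forecast(w, r_hat)
```

The same stock-level forecasts can be reused for any portfolio simply by changing the weight vector, which is how the pre-specified portfolios on the following slide can all be predicted from one model.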

Page 20: Empirical Asset Pricing via Machine Learning

Predicting Pre-specified Portfolios (Monthly R²)

OLS-3 PLS PCR ENet GLM RF GBRT NN1 NN2 NN3 NN4 NN5

S&P 500 -0.11 -0.86 -2.62 -0.38 0.86 1.39 1.13 0.84 0.96 1.80 1.46 1.60

Big Growth 0.41 0.75 -0.77 -1.55 0.73 0.99 0.80 0.70 0.32 1.67 1.42 1.40

Big Value -1.05 -1.88 -3.14 -0.03 0.70 1.41 1.04 0.78 1.20 1.57 1.17 1.42

Small Growth 0.35 1.54 0.72 -0.03 0.95 0.54 0.62 1.68 1.26 1.48 1.53 1.44

Small Value -0.06 0.40 -0.12 -0.57 0.02 0.71 0.90 0.00 0.47 0.46 0.41 0.53

Big Conservative -0.24 -0.17 -1.97 0.19 0.69 0.96 0.78 1.08 0.67 1.68 1.46 1.56

Big Aggressive -0.12 -0.77 -2.00 -0.91 0.68 1.83 1.45 1.14 1.65 1.87 1.55 1.69

Small Conservative 0.02 0.75 0.48 -0.46 0.55 0.59 0.60 0.94 0.91 0.93 0.99 0.88

Small Aggressive 0.14 0.97 0.06 -0.54 0.19 0.86 1.04 0.25 0.66 0.75 0.67 0.79

Big Robust -0.58 -0.22 -2.89 -0.27 1.54 1.41 0.70 0.60 0.84 1.14 1.05 1.21

Big Weak -0.24 -1.47 -1.95 -0.40 -0.26 0.67 0.83 0.24 0.60 1.21 0.95 1.07

Small Robust -0.77 0.77 0.18 -0.32 0.41 0.27 -0.06 -0.06 -0.02 0.06 0.13 0.15

Small Weak 0.02 0.32 -0.28 -0.25 0.17 0.90 1.31 0.84 0.85 1.09 0.96 1.08

Big Up -1.53 -2.54 -3.93 -0.21 0.40 1.12 0.68 0.46 0.85 1.28 0.99 1.05

Big Down -0.10 -1.20 -2.05 -0.26 0.36 1.09 0.77 0.48 0.89 1.34 1.17 1.36

Small Up -0.79 0.42 -0.36 -0.33 -0.33 0.31 0.40 0.23 0.60 0.67 0.55 0.61

Small Down 0.40 1.16 0.47 -0.46 0.62 0.93 1.20 0.80 0.97 0.97 0.97 0.96

Page 21: Empirical Asset Pricing via Machine Learning

Variable Importance

[Figure: heat map of variable importance by model (PLS, PCR, ENet+H, GLM+H, RF, GBRT+H, NN1 to NN5) and stock characteristic (mom1m, mvel1, chmom, maxret, mom12m, indmom, dolvol, and the remaining characteristics)]

Page 22: Empirical Asset Pricing via Machine Learning

Marginal Relation: Characteristics and $E_t(r_{i,t+1})$

[Figure: four panels (mom1m, retvol, mvel1, acc), each plotting expected return (vertical axis, -5×10⁻³ to 5×10⁻³) against the characteristic (horizontal axis, -1 to 1) for ENet+H, GLM+H, GBRT+H, RF, and NN3]

Page 23: Empirical Asset Pricing via Machine Learning

Machine Learning Long-Short Portfolios

              OLS-3+H                        PLS                          PCR
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low    -0.41   0.19   6.52   0.10    -0.85  -0.05   6.73  -0.03    -0.92  -0.49   7.02  -0.24
2      -0.08   0.40   5.28   0.26    -0.26   0.31   6.02   0.18    -0.29   0.15   6.20   0.08
9       1.44   0.76   7.10   0.37     1.50   1.10   5.34   0.71     1.45   1.34   5.44   0.85
High    1.81   1.84   8.52   0.75     2.15   1.51   5.79   0.90     2.09   1.91   6.08   1.09
H-L     2.22   1.65   6.41   0.89     3.01   1.56   4.45   1.22     3.01   2.40   4.65   1.79

              ENet+H                         GLM+H                        RF
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low     0.06  -0.31   7.01  -0.15    -0.43  -0.51   6.94  -0.25     0.26  -0.33   7.13  -0.16
2       0.33   0.39   6.03   0.22     0.02   0.33   6.08   0.19     0.41   0.32   5.68   0.19
9       1.34   1.03   5.92   0.60     1.37   1.29   5.59   0.80     0.89   1.17   5.64   0.72
High    1.65   1.80   7.31   0.86     1.84   1.64   6.34   0.90     1.07   1.83   6.78   0.93
H-L     1.59   2.12   5.48   1.34     2.27   2.15   4.39   1.70     0.81   2.16   5.34   1.40

              GBRT+H                         NN1                          NN2
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low    -0.03  -0.38   6.67  -0.20    -0.47  -0.80   7.47  -0.37    -0.37  -0.82   7.97  -0.36
2       0.16   0.43   5.66   0.26     0.14   0.21   6.24   0.12     0.19   0.17   6.44   0.09
9       0.81   1.11   5.26   0.73     1.64   1.21   5.49   0.76     1.40   1.13   5.46   0.72
High    1.02   1.70   6.57   0.90     2.46   2.13   7.30   1.01     2.32   2.36   8.03   1.02
H-L     1.04   2.08   4.25   1.70     2.93   2.93   4.81   2.11     2.69   3.18   4.90   2.25

              NN3                            NN4                          NN5
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low    -0.39  -0.96   7.77  -0.43    -0.28  -0.90   7.87  -0.40    -0.21  -0.76   7.93  -0.33
2       0.17   0.13   6.42   0.07     0.25   0.18   6.57   0.09     0.25   0.24   6.58   0.13
9       1.44   1.16   5.50   0.73     1.32   1.22   5.60   0.75     1.30   1.24   5.54   0.77
High    2.30   2.23   7.78   0.99     2.28   2.35   7.95   1.02     2.19   2.21   7.78   0.98
H-L     2.69   3.19   4.77   2.32     2.56   3.25   4.79   2.35     2.39   2.97   5.05   2.03
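
A sketch of how a long-short (H-L) return and its Sharpe ratio could be computed from forecasts. The slide does not spell out the construction, so sorting into ten equal-weighted bins and annualizing a monthly mean/std ratio by √12 are assumptions. The reported Avg, Std, and SR columns are at least consistent with that annualization, as the last two lines check using numbers copied from the H-L rows for OLS-3+H and NN4.

```python
import math
import numpy as np

def long_short_return(forecasts, realized, n_bins=10):
    """Equal-weighted return of the top forecast bin minus the bottom forecast bin."""
    forecasts = np.asarray(forecasts, dtype=float)
    realized = np.asarray(realized, dtype=float)
    n = len(forecasts) // n_bins
    if n == 0:
        raise ValueError("need at least n_bins stocks")
    order = np.argsort(forecasts)          # ascending: lowest forecasts first
    low, high = order[:n], order[-n:]
    return float(realized[high].mean() - realized[low].mean())

def annualized_sharpe(mean_monthly, std_monthly):
    """Annualized Sharpe ratio from a monthly mean and standard deviation."""
    return mean_monthly / std_monthly * math.sqrt(12)

# Toy cross section of 20 stocks where forecasts rank realized returns perfectly
hl = long_short_return(np.arange(20), np.arange(20) / 10.0)

sr_ols3 = annualized_sharpe(1.65, 6.41)  # H-L row, OLS-3+H
sr_nn4 = annualized_sharpe(3.25, 4.79)   # H-L row, NN4
```

Rounded to two decimals, `sr_ols3` and `sr_nn4` come out at 0.89 and 2.35, matching the SR column of the table.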

Page 24: Empirical Asset Pricing via Machine Learning

Machine Learning (Value-Weighted) Long-Short Portfolios

              OLS-3+H                        PLS                          PCR
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low    -0.42   0.39   5.22   0.26    -0.86   0.27   5.57   0.17    -0.90   0.04   5.92   0.03
2      -0.08   0.60   4.47   0.46    -0.27   0.49   5.10   0.33    -0.29   0.43   5.33   0.28
9       1.42   0.56   7.50   0.26     1.49   0.94   5.01   0.65     1.47   1.18   4.98   0.82
High    1.72   0.90   8.18   0.38     2.03   0.96   5.45   0.61     2.02   1.36   5.61   0.84
H-L     2.14   0.51   6.46   0.27     2.89   0.70   4.35   0.56     2.92   1.32   4.72   0.97

              ENet+H                         GLM+H                        RF
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low     0.05   0.08   5.64   0.05    -0.42   0.11   5.43   0.07     0.28   0.09   6.09   0.05
2       0.33   0.52   5.07   0.35     0.02   0.46   4.67   0.34     0.41   0.39   5.17   0.26
9       1.33   0.88   5.59   0.55     1.36   0.97   5.49   0.61     0.89   1.20   5.88   0.70
High    1.59   0.80   6.83   0.40     1.76   1.18   6.30   0.65     1.01   1.49   7.18   0.72
H-L     1.54   0.72   5.49   0.45     2.17   1.08   4.52   0.83     0.73   1.40   5.54   0.87

              GBRT+H                         NN1                          NN2
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low     0.00   0.03   5.76   0.02    -0.40  -0.37   7.16  -0.18    -0.30  -0.50   7.89  -0.22
2       0.16   0.50   5.00   0.34     0.15   0.40   6.03   0.23     0.18   0.36   6.13   0.21
9       0.81   0.99   5.08   0.67     1.60   0.94   5.09   0.64     1.40   1.01   5.52   0.63
High    0.97   1.20   5.81   0.71     2.18   1.37   6.31   0.75     2.03   1.43   6.95   0.72
H-L     0.97   1.16   4.27   0.94     2.58   1.73   5.62   1.07     2.32   1.94   5.68   1.18

              NN3                            NN4                          NN5
        Pred    Avg    Std     SR     Pred    Avg    Std     SR     Pred    Avg    Std     SR
Low    -0.21  -0.51   7.83  -0.23    -0.29  -0.43   7.74  -0.19    -0.15  -0.36   7.63  -0.16
2       0.26   0.32   6.39   0.18     0.20   0.39   6.15   0.22     0.26   0.29   6.36   0.16
9       1.28   1.20   5.79   0.72     1.36   1.07   5.87   0.63     1.26   1.31   5.77   0.79
High    1.99   1.58   7.33   0.74     2.02   1.47   7.11   0.72     1.91   1.55   6.90   0.78
H-L     2.20   2.09   5.78   1.25     2.30   1.90   5.83   1.13     2.06   1.91   6.01   1.10

Page 25: Empirical Asset Pricing via Machine Learning

Conclusion

Machine learning holds promise for empirical asset pricing

1. Vast predictor sets viable in linear prediction when penalization used

2. Non-linearities substantially improve predictions

3. Shallow learning outperforms deeper learning

4. Distance between non-linear methods and benchmark widens when predicting portfolios

5. Gains from machine learning forecasts are economically large

6. Most successful predictors: price trends, liquidity, and volatility

Page 26: Empirical Asset Pricing via Machine Learning

Can Machines Learn Finance?

2011: Google Brain launches. Uncertain if deep neural networks would identify a cat, let alone drive a car

Answer: Yes, but there is much more to learn

- Most anecdotes in low capacity settings (HFT, OTC, etc.)
- We provide first large scale evidence of value for long-term asset management
- These are early days (2011 cat recognition)

Page 28: Empirical Asset Pricing via Machine Learning

Comparing Predictions: Diebold-Mariano Tests

Positive numbers = column model outperforms row model

           OLS-3+H     PLS     PCR  ENet+H   GLM+H      RF  GBRT+H     NN1     NN2     NN3     NN4     NN5
OLS+H         3.81    3.82    3.85    3.81    3.83    3.91    3.94    3.96    3.96    3.98    3.97    3.96
OLS-3+H               0.23    1.72   -0.80    0.63    1.55    1.93    1.98    2.83    3.01    2.61    2.63
PLS                           1.58   -0.71    0.08    1.39    1.61    1.52    2.29    2.43    2.18    2.15
PCR                                  -1.51   -1.62    0.06    0.48    0.54    1.13    1.20    0.94    0.85
ENet+H                                        1.00    1.59    1.79    2.09    2.02    2.19    1.92    1.94
GLM+H                                                 1.21    1.59    1.70    2.55    2.76    2.44    2.33
RF                                                            0.66    0.66    1.12    1.30    0.94    0.90
GBRT+H                                                                0.24    0.73    0.83    0.53    0.46
NN1                                                                           0.87    1.11    0.49    0.31
NN2                                                                                   0.10   -1.09   -1.20
NN3                                                                                          -1.03   -1.92
NN4                                                                                                  -0.47
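
A simplified sketch of a Diebold-Mariano style statistic with the table's sign convention (positive means the column model has smaller squared forecast errors than the row model). Averaging the squared-error difference across stocks within each period, and using a plain standard error with no correction for serial correlation, are simplifying assumptions; the talk does not state which variance estimator is used.

```python
import math
import numpy as np

def diebold_mariano(errors_row, errors_col):
    """errors_*: arrays of shape (T, N) holding forecast errors for T periods and N stocks.
    Returns mean(d_t) / se(d_t), where d_t is the cross-sectional average of
    (row model squared error - column model squared error) in period t."""
    errors_row = np.asarray(errors_row, dtype=float)
    errors_col = np.asarray(errors_col, dtype=float)
    d_t = np.mean(errors_row**2 - errors_col**2, axis=1)
    n_periods = d_t.shape[0]
    return float(d_t.mean() / (d_t.std(ddof=1) / math.sqrt(n_periods)))

# Simulated errors: the column model's errors are uniformly 20% smaller
rng = np.random.default_rng(1)
e_row = rng.standard_normal((60, 50))
e_col = 0.8 * e_row

dm_stat = diebold_mariano(e_row, e_col)
dm_reverse = diebold_mariano(e_col, e_row)
```

Swapping the two models flips the sign of the statistic, which is why only the upper triangle of the table needs to be reported.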

Page 29: Empirical Asset Pricing via Machine Learning

Predicting Pre-specified Portfolios (Annual R²)

OLS-3 PLS PCR ENet GLM RF GBRT NN1 NN2 NN3 NN4 NN5

S&P 500 -3.31 0.43 -7.17 0.26 2.07 8.80 7.28 9.99 12.02 15.68 15.30 13.15

Big Growth 3.36 4.88 -4.04 3.62 0.49 9.50 5.86 8.76 8.54 12.42 9.95 7.56

Big Value -11.82 -6.92 -10.22 -2.13 2.44 7.14 6.93 7.47 11.06 11.67 13.37 10.03

Small Growth 6.11 10.81 8.94 8.41 4.31 8.05 3.75 7.24 6.37 7.48 6.60 4.81

Small Value 4.25 2.87 3.19 0.21 0.03 6.20 2.13 3.96 5.52 6.84 2.60 7.23

Big Conservative -8.34 -2.42 -9.77 -3.77 5.17 8.44 5.26 -1.31 8.64 9.65 12.47 6.09

Big Aggressive -0.92 1.89 -4.72 1.36 2.00 7.42 6.67 11.00 11.74 13.08 11.27 10.67

Small Conservative 1.30 6.36 5.01 3.19 2.35 4.60 0.62 5.31 5.39 5.97 4.22 4.71

Small Aggressive 5.53 5.12 2.88 1.04 0.37 6.43 3.23 2.50 4.50 5.50 1.47 6.56

Big Robust -7.17 -2.55 -9.18 1.33 5.42 7.61 6.60 12.55 12.04 13.92 15.29 13.35

Big Weak -1.81 3.09 -7.15 -1.02 -1.12 9.62 7.62 4.41 9.95 11.39 11.73 8.40

Small Robust -2.33 0.93 -0.20 0.76 3.72 0.41 -0.87 2.92 3.67 4.47 0.86 4.19

Small Weak 4.72 9.89 5.68 2.15 -1.11 7.53 3.10 -0.48 1.53 2.96 1.61 1.08

Big Up -24.02 -11.77 -19.16 -5.11 0.52 6.15 6.21 4.26 11.44 11.11 14.48 10.62

Big Down -2.32 0.39 -2.79 -0.15 0.71 7.64 5.53 3.58 8.78 9.54 10.32 6.79

Small Up -5.47 3.82 0.71 -2.83 1.57 1.84 -0.19 -4.22 0.70 1.12 -1.42 2.83

Small Down 4.72 5.59 4.84 2.87 0.50 7.23 3.49 3.24 4.63 5.90 3.28 5.22

Page 30: Empirical Asset Pricing via Machine Learning

Cumulative Returns of Long-Short ML Portfolios

[Figure: cumulative returns of the long position and the short position, 1987 to 2016, for OLS-3+H, PLS, PCR, ENet+H, GLM+H, RF, GBRT+H, NN3, and SP500-Rf; solid = long, dash = short]

Page 31: Empirical Asset Pricing via Machine Learning

Drawdowns, Turnover, and Risk Adjusted Alpha

              OLS-3+H     PLS     PCR  ENet+H   GLM+H      RF  GBRT+H     NN1     NN2     NN3     NN4     NN5   MOM1m

Drawdowns and Turnover
Max DD          64.74   35.51   34.17   35.25   27.05   50.22   35.24   19.70   23.72   14.84   18.66   21.73   68.91
Max 1M Loss     38.69   25.05   22.22   34.11   16.92   34.94   22.87   16.98   23.72   10.15   18.66   21.71   43.67
Turnover       156.78   76.92  106.79  143.61  129.22  113.91  136.82  110.87  112.08  113.07  114.08  113.57  172.56

Risk-adjusted Performance
Mean Ret.        1.65    1.56    2.40    2.12    2.15    2.16    2.08    2.93    3.18    3.19    3.25    2.97    1.80
FF3 α            1.43    1.39    2.44    1.96    2.15    2.09    2.08    2.89    3.20    3.20    3.24    2.98    1.54
  R²             4.92   12.75   11.59    4.35    3.85    8.64    0.31    8.89    7.28    8.45    8.34    8.38    5.92
FF5 α            1.64    0.96    2.01    1.58    1.74    1.82    1.95    2.66    2.97    3.00    3.03    2.73    1.88
  R²             7.05   25.10   23.50   10.96   15.28   14.97    4.09   12.59   10.03   10.74   11.11   11.43   10.60
FF5+Mom α        1.88    0.82    1.76    1.34    1.54    1.75    1.78    2.62    2.93    2.98    2.95    2.68    2.14
  R²            17.46   32.91   43.77   26.27   31.80   16.59   16.81   13.19   10.59   10.88   13.61   12.25   20.53

Note: All numbers are in percentage.
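
For readers unfamiliar with the Max DD row, here is one common way to compute a maximum drawdown. Measuring it on cumulative log returns is an assumption; the slide does not say which definition is used, and the return series below is made up.

```python
import numpy as np

def max_drawdown(log_returns):
    """Largest peak-to-trough decline of the cumulative log return path."""
    log_returns = np.asarray(log_returns, dtype=float)
    cumulative = np.concatenate(([0.0], np.cumsum(log_returns)))  # path starts at 0
    running_peak = np.maximum.accumulate(cumulative)
    return float(np.max(running_peak - cumulative))

# Hypothetical monthly log returns: up 10%, then two losses, then a recovery
mdd = max_drawdown([0.10, -0.05, -0.10, 0.20])
```

In this toy series the path peaks at 0.10 and bottoms at -0.05 before recovering, so the function returns a drawdown of about 0.15.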

Page 32: Empirical Asset Pricing via Machine Learning

Which Covariates Predict Returns?

[Figure: bar charts of the most important characteristics for each model, one panel each for PLS, PCR, ENet+H, GLM+H, RF, GBRT+H, NN2, and NN3; horizontal axis is variable importance starting at 0. mom1m and mvel1 appear in every panel.]

Page 33: Empirical Asset Pricing via Machine Learning

Characteristic Importance over Time by NN3

[Figure: heat map of characteristic importance for NN3 by year, 1987 to 2016; characteristics listed include mom1m, mvel1, mom12m, chmom, maxret, indmom, retvol, dolvol, and the remaining characteristics]