predicting risk from financial reports with regressionnasmith/slides/kogan+levin+routledge... ·...
TRANSCRIPT
![Page 1: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/1.jpg)
Predicting Risk from Financial Reports
with Regression
Shimon Kogan, University of Texas at AustinDimitry Levin, Carnegie Mellon UniversityBryan R. Routledge, Carnegie Mellon UniversityJacob S. Sagi, Vanderbilt UniversityNoah A. Smith, Carnegie Mellon University
![Page 2: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/2.jpg)
Talk In A Nutshell
financial risk = f(financial report)
volatility of returns
Form 10-K, Item 7
SV regression
![Page 3: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/3.jpg)
What This Talk Isn’t and Is
New statistical models for NLP ...
Exciting text domains like political blogs ...
Advances in applications like translation and summarization ...
![Page 4: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/4.jpg)
What This Talk Isn’t and Is
New statistical models for NLP ...
Exciting text domains like political blogs ...
Advances in applications like translation and summarization ...
Shay Cohen, 10:40 am yesterday
Tae Yano, 10:40 am tomorrow
Ashish Venugopal, right now
André Martins, 11 am Thursday
![Page 5: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/5.jpg)
What This Talk Isn’t and Is
New statistical models for NLP ...
Exciting text domains like political blogs ...
Advances in applications like translation and summarization ...
![Page 6: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/6.jpg)
What This Talk Isn’t and Is
New statistical models for NLP ...
Bag of terms representation and
SVR model.
Exciting text domains like political blogs ...
Boring (to read) text domain of financial
reports.Advances in applications like translation and summarization ...
Under-explored application: forecasting.
![Page 7: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/7.jpg)
See Also ...
•Lavrenko et al. (2000), Koppel and Shtrimberg (2004), and others: prices
•Blei and McAuliffe (2007): popularity
•Lerman et al. (2008): prediction markets
![Page 8: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/8.jpg)
Outline
•Mini-lesson in finance
•A new text-driven forecasting task
•Regression models trained on text
•Experimental results and analysis
•Outlook
![Page 9: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/9.jpg)
Finance
Allocation of wealth (e.g., money) across time and risk (states of nature).
![Page 10: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/10.jpg)
Finance
From an NLP perspective: crucial information about your investments that’s buried in documents you’d rather not read.
![Page 11: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/11.jpg)
financial risk = f(financial report)
![Page 12: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/12.jpg)
financial risk = f(financial report)
volatility of returns
![Page 13: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/13.jpg)
•Return on day t:
•Sample standard deviation from day t - τ to day t:
•This is called measured volatility.
What is Risk?
v[t!!,t] =
!""#!$
i=0
(rt!i ! r̄)2%
!
rt =closingpricet + dividendst
closingpricet!1
! 1
![Page 14: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/14.jpg)
Why Not Predict Returns, Get Rich, Retire Early?
•Hard: predicting a stock’s performance.
•To predict returns, we would need to find new information.
•Our reports probably don’t contain new information (10-Ks do not precede big price changes).
![Page 15: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/15.jpg)
Will This Talk Make Anyone Rich?
•Some people think you can exploit accurate volatility predictions.
•I’m not really qualified to give financial advice.
•Consulting to portfolio/wealth managers is a huge industry.
![Page 16: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/16.jpg)
So Then Why Do Finance Researchers Care?
•Models of economics and finance treat information simplistically.
•No notion of extracting information from large amounts of raw data.
•These reports are produced at huge expense. Are they worth it?
![Page 17: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/17.jpg)
Important Property of Volatility
•Autoregressive conditional heteroscedacity: volatility tends to be stable (over horizons like ours).
•v[t - τ, t] is a strong predictor of v[t, t + τ]
•This is our strong baseline.
![Page 18: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/18.jpg)
financial risk = f(financial report)
volatility of returns
Form 10-K, Item 7
![Page 19: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/19.jpg)
Form 10-K, Item 7
General Motors Corp.
March 5, 2009
Item 7. Management’s Discussion and Analysis of Financial Condition and Results of OperationsOverviewWe are primarily engaged in the worldwide production and marketing of cars and trucks. We operate in two businesses, consisting of our automotive operations, which we also refer to as Automotive, GM Automotive or GMA, that includes our four automotive segments consisting of GMNA, GME, GMLAAM and GMAP, and our financing and insurance operations (FIO). Our finance and insurance operations are primarily conducted through GMAC, a wholly-owned subsidiary through November 2006. On November 30, 2006, we sold a 51% controlling ownership interest in GMAC to a consortium of investors. After the sale, we have accounted for our 49% ownership interest in GMAC under the equity method. GMAC provides a broad range of financial services, including consumer vehicle financing, automotive dealership and other commercial financing, residential mortgage services, automobile service contracts, personal automobile insurance coverage and selected commercial insurance coverage.
Automotive IndustryIn 2008, the global automotive industry has been severely affected by the deepening global credit crisis, volatile oil prices and the recession in North America and Western Europe, decreases in the employment rate and lack of consumer confidence. The industry continued to show growth in Eastern Europe, the LAAM region and in Asia Pacific, although the growth in these areas moderated from previous levels and is beginning to show the effects of the credit market crisis which began in the United States and has since spread to Western Europe and the rest of the world. Global industry vehicle sales to retail and fleet customers were 67.1 million units in 2008, representing a 5.1% decrease compared to 2007. We expect industry sales to be approximately 57.5 million units in 2009.
![Page 20: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/20.jpg)
Our Corpus
•Edgar database at http://www.sec.gov
•26,806 examples of Item 7, 1996-2006
•247.7 million words in total
•http://www.ark.cs.cmu.edu/10K
![Page 21: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/21.jpg)
“Annotation”
•For each report at time t, we gathered
•“Historical” volatility: v[t - 1y, t]
•“Future” volatility: v[t, t + 1y]
•Source: Center for Research in Security Prices U.S. Stocks Databases
![Page 22: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/22.jpg)
Methodology
•Input: Item 7 and/or historical volatility
•Output: predicted future volatility
•Test on (input, output) pairs from year Y
•Train on (input, output) from years < Y
•Evaluation: MSE of (log) volatility
![Page 23: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/23.jpg)
financial risk = f(financial report)
volatility of returns
Form 10-K, Item 7
SV regression
![Page 24: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/24.jpg)
Support-Vector Regression (Drucker et al., 1997)
•Predicted future volatility is a function of a document (Item 7), d, and a weight vector w:
•The training criterion:
v̂ = f(d;w)
minw!Rd
12!w!2 +
C
N
N!
i=1
max
"0,
#####vi " f(di;w)
#####" !
$
regularize prediction within ε of correct
![Page 25: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/25.jpg)
Representation
•Vector-space model (tf, tfidf, etc.)
•So far, unigrams and bigrams
•Linear kernel (for interpretability)
f(d;w) = h(d)!w =N!
i=1
N!
i=1
!iK(d,di) =N!
i=1
N!
i=1
!ih(d)!h(di)
w =N!
i=1
!ih(di)
![Page 26: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/26.jpg)
Representation
•Vector-space model (tf, tfidf, etc.)
•So far, unigrams and bigrams
•Linear kernel (for interpretability)
w =N!
i=1
!ih(di)
dual
f(d;w) = h(d)!w =N!
i=1
!iK(d,di) =N!
i=1
!ih(d)!h(di)
![Page 27: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/27.jpg)
Experiment
•Test on year Y.
•Train on (Y - 5, Y - 4, Y - 3, Y - 2, Y - 1).
•Six such splits.
•Compare history-only baseline, text-only SVR, combined SVR.
![Page 28: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/28.jpg)
MSE of Log-Volatility
0.120
0.143
0.165
0.188
0.210
2001 2002 2003 2004 2005 2006 Micro-ave.
HistoryTextText + History
Using “log(1+freq.)” representation on all unigrams and bigrams. See paper.
*
**
*
*
lower is better
![Page 29: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/29.jpg)
Dominant Weights (2000-4)
loss 0.025 net income -0.021net loss 0.017 rate -0.017
year # 0.016 properties -0.014expenses 0.015 dividends -0.013
going concern 0.014 lower interest -0.012a going 0.013 critical accounting -0.012
administrative 0.013 insurance -0.011personnel 0.013 distributions -0.011
high volatility words low volatility words
![Page 30: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/30.jpg)
MSE of Log-Volatility
0.120
0.143
0.165
0.188
0.210
2001 2002 2003 2004 2005 2006 Micro-ave.
HistoryTextText + History
Using “log(1+freq.)” representation on all unigrams and bigrams. See paper.
*
**
*
*
lower is better
![Page 31: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/31.jpg)
Changes Over Time
0
3,250
6,500
9,750
13,000
‘96 ‘97 ‘98 ‘99 ‘00 ‘01 ‘02 ‘03 ‘04 ‘05 ‘06
average length of Item 7
![Page 32: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/32.jpg)
2002
•Enron and other accounting scandals
•Sarbanes-Oxley Act of 2002
•Longer reports
•Are the reports more informative after 2002? Because of Sarbanes-Oxley?
![Page 33: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/33.jpg)
Changes In w
50
54
58
62
‘97-’01 ‘98-’02 ‘99-’03 ‘00-’04 ‘01-’05
change from previous weights
Measured in L1 distance; based on unigram model with “log(1 + freq.)” representation.
![Page 34: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/34.jpg)
Language Over Time
-0.015
-0.010
-0.005
0
0.005
96-00 97-01 98-02 99-03 00-04 01-05
w
accounting policies
estimates
0
2
4
6
8
ave.
term
freq
uenc
y
![Page 35: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/35.jpg)
0
0.2
0.4
0.6
0.8
ave.
term
freq
uenc
y-0.010
-0.005
0
0.005
96-00 97-01 98-02 99-03 00-04 01-05
w
Language Over Time
reit (“Real Estate Investment Trust”)
mortgages
![Page 36: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/36.jpg)
Language Over Time
-0.010
-0.005
0
0.005
0.010
96-00 97-01 98-02 99-03 00-04 01-05
w
higher margin
lower margin
0
0.05
0.10
0.15
0.20
ave.
term
freq
uenc
y
![Page 37: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/37.jpg)
Delisting
•Rare (4%) event: delisting due to dissolution after bankruptcy, merger, violation of rules.
•bulletin, creditors, dip, otc, court
0
25
50
75
100
01 02 03 04 05 06
precision at 10precision at 100
![Page 38: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/38.jpg)
Conclusions
•Text-driven forecasting of volatility, by regression.
•Works nearly as well as strong history predictor.
•Often works better in combination.
•Suggestion of effects of legislation on a real-world text-generating process.
![Page 39: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/39.jpg)
Future Work
•Measuring the effect of Sarbanes-Oxley
•Other predictions
•Other text representations
•Other datasets
![Page 40: Predicting Risk from Financial Reports with Regressionnasmith/slides/kogan+levin+routledge... · 2009. 6. 9. · Predicting Risk from Financial Reports with Regression Shimon Kogan,](https://reader035.vdocument.in/reader035/viewer/2022071002/5fbfb91c47f46a64f459de35/html5/thumbnails/40.jpg)
Future Work (Text-Driven Forecasting)
•Application for NLP: techniques that use text to make real-world predictions.
•Many potential domains (finance, politics, government, sales, ...)
•There’s lots of room for improvement!