www.set.or.th/setresearch
Disclaimer: The views expressed in this working paper are those of the author(s) and do not necessarily represent the Capital Market Research Institute or the Stock Exchange of Thailand. Capital Market Research Institute Scholarship Papers are research in progress by the author(s) and are published to elicit comments and stimulate discussion.
Scholarship Project Paper 2019
Good News Travels Slowly and Bad News Has Wings: Can News-Based Sentiment Predict Stock Market Movements?
Sapphasak Chatchawan Faculty of Commerce and Accountancy
Thammasat University
20 April 2020
Abstract
This study investigates whether news-based market sentiment helps predict the returns of the SET, SET50, SET100 and MAI indices, and how it affects their volatility, from November 2017 to February 2019. The market sentiment is extracted from a corpus of news headlines related to the Stock Exchange of Thailand in Thomson Reuters Eikon newswires. We find that the news-based sentiment index has predictive power and contains significant information for SET, SET50, SET100 and MAI returns. In addition, positive market sentiment significantly affects the volatility of SET50 and SET100 returns: as positive sentiment increases, the volatility of SET50 and SET100 returns decreases.
JEL Classification: G14
Keywords: News-based Market Sentiment, Computational Linguistics, Financial Market Volatility
E-Mail Address: [email protected]
2019 Capital Market Research Institute, The Stock Exchange of Thailand
Content
Chapter 1 Introduction
Chapter 2 Literature Review
Chapter 3 Research Methodology
Chapter 4 Empirical Strategy
Chapter 5 Empirical Results
Chapter 6 Robustness Checks
Chapter 7 Discussion
Chapter 8 Conclusion
References
Table
1. Dependent variables
2. Descriptive statistics of dependent variables
3. Independent variables
4. Descriptive statistics of independent variables
5. Binary probit
6. Results of ARMA(2,2)
7. Results of ARMA(2,2)
8. Benchmark EGARCH(1,1) and volume-augmented EGARCH(1,1)
9. Positive and negative sentiment in EGARCH(1,1)
10. Positive and negative sentiment in volume-augmented EGARCH(1,1)
Figure
1. A sample of headlines from newswires
2. Word frequency in headlines
3. WordCloud before TF-IDF
4. WordCloud after TF-IDF
5. News-based sentiment index
6. Negative sentiment
7. Positive sentiment
Appendix
1. Unit root test
Acknowledgement
Chapter 1 Introduction
Financial firms benefit from real-time newswire services provided by international news agencies,
such as Thomson Reuters Eikon and Bloomberg. Newswires electronically distribute finance-related
press releases, headlines and stories as qualitative textual information and help market
participants monitor market conditions in real time. Newswires publish up-to-the-minute news
every second and naturally create a voluminous amount of textual data. This paper asks the
following empirical questions: To what extent does the sentiment of financial news from newswires
predict the direction of the stock market? How do stock prices react to positive and negative
market sentiment? And is news-based sentiment better at forecasting stock market volatility
than stock market movements?
A growing body of literature empirically explores causal relations between financial news and
stock market movements. The sentiment extracted from financial newspapers, such as The
Wall Street Journal, The New York Times and The Guardian, has predictive power over
stock market movements and corporate earnings (Tetlock, 2007; Tetlock, Saar-Tsechansky and
Macskassy, 2008; Ferguson, Philip, Lam and Guo, 2014). The content of news influences
investors’ decision making. The effect of an unfavourable tone in financial news on stock
market prices is more pronounced during economic recessions, and an unfavourable tone has more
predictive power than a favourable tone (Garcia, 2013). In addition to studies of printed
newspapers, a number of papers explore the impact of newswires on financial markets (Boudoukh,
Feldman, Kogan and Richardson, 2012). Sentiment constructed from newswires forecasts
stock market changes better than macroeconomic variables (Uhl, 2014). Daily sentiment
predicts stock market returns over a very short period, whereas weekly sentiment is able to
forecast returns over a longer period. Moreover, it seems that sentiment predicts the direction
of volatility better than that of stock prices (Allen, McAleer and Singh, 2013; Heston and Sinha,
2016; Atkins, Niranjan and Gerding, 2018). Beyond the stock market, news-based sentiment can
also predict and explain movements in the term structure of interest rates (Gotthelf and Uhl,
2018) and in the foreign exchange market (Uhl, 2017).
The paper attempts to quantify the daily market sentiment from Reuters Eikon’s newswires by
employing methods from computational linguistics. The sentiment index is computationally
extracted from a collection of more than 1,000 news headlines. The whole corpus consists of
more than 20,000 words. This paper empirically analyzes the predictive power of news-derived
sentiment, which is classified into negative and positive sentiment, for the direction of
prices and volatility in the Stock Exchange of Thailand (hereafter, SET) from 2017 to 2019.
Research objectives:
1. Does news-based market sentiment have predictive power over stock market returns?
2. How does news-based market sentiment affect financial market volatility in Thailand?
3. How do negative and positive sentiment variables affect market volatility?
Expected outcomes
1. News-based market sentiment is likely to contain meaningful information for predicting the direction of the Stock Exchange of Thailand.
2. Positive and negative market sentiment variables may affect stock market volatility in different ways.
Chapter 2 Literature Review
The paper contributes to the existing literature in three respects. First, the study employs the
user-defined dictionary method presented in Loughran and McDonald (2011) to classify
the market sentiment into positive and negative sentiment. The technique is automated and
replicable. Unlike Heston and Sinha (2016), Uhl (2014), Uhl (2017) and Gotthelf and Uhl (2018),
which use the Thomson Reuters sentiment index, the market sentiment for SET is constructed from a
collection of all news headlines relevant to SET from Thomson Reuters Eikon’s newswires. The
headlines are retrieved by searching for “BKstmad.BK”. Thus, the approach is more flexible than
using the Reuters sentiment index, which was published only until 2012. Second, the study
investigates the predictive power of the news-based sentiment index for both stock market prices
and volatility. This serves to check the findings of Allen, McAleer and Singh (2013), Heston and
Sinha (2016) and Atkins, Niranjan and Gerding (2018), which study the same aspect in the U.S.
market. Third, to my knowledge, the paper is the first attempt to explore the effects of
news-based sentiment on movements and volatility of prices and returns, and the effects of
sentiment on trading volume, in SET.
Chapter 3 Research Methodology
Sources and Data description
All financial data are collected from Thomson Reuters Eikon. The sample runs from
9 November 2017 to 8 February 2019, which is the longest period over which news
headlines could be scraped from Thomson Reuters Eikon’s newswires. All financial news
headlines were compiled using the Reuters Instrument Code (RIC) “BKstmad.BK”. Textual analysis
is employed to extract market sentiment from those headlines, which are specific to the
Stock Exchange of Thailand. The major reason why the paper does not use the Thomson Reuters
Eikon sentiment score is that the score was published only until 2012. It seems that the service
is no longer available.¹ There are 308 observations of both financial and news-based sentiment
variables. Data are daily time series, and unit root tests reject the null hypothesis of a unit
root at the 1% level of significance. That is, the variables are stationary, which guards against
spurious relationships among variables in the empirical results. The unit root tests are
presented in Table A1 in Appendix 1.
Dependent variable
There are four dependent variables in the study, all of which are stock market returns. Each
is the percentage change of an index of the Stock Exchange of Thailand (the SET, SET50 and
SET100 indices) or of the Market for Alternative Investment (the MAI index). Stock market
returns are computed as in equation 1.
\[ Ret_t = \ln\left(\frac{X_t}{X_{t-1}}\right) \times 100 \tag{1} \]
¹ The service is no longer available.
$X_t$ denotes the level of the SET, SET50, SET100 or MAI index on day $t$. In Table 1, SET_RET,
SET50_RET, SET100_RET and MAI_RET are the corresponding stock market returns. Table 2 shows
descriptive statistics for the dependent variables.
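The return definition in equation 1 can be sketched in a few lines of Python. This is only an illustration: the function name `log_returns` and the index levels are hypothetical and are not data from the study.

```python
import math


def log_returns(index_levels):
    """Percentage log returns: Ret_t = ln(X_t / X_{t-1}) * 100."""
    return [
        math.log(curr / prev) * 100.0
        for prev, curr in zip(index_levels, index_levels[1:])
    ]


# Hypothetical closing levels, for illustration only (not data from the study).
levels = [1700.0, 1717.0, 1708.4]
returns = log_returns(levels)
```

A series of T index levels yields T - 1 returns, because the first observation has no previous day to compare with.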
Table 1: Dependent variables in the study
Variable       Description
SET_RET        SET index return
SET50_RET      SET50 index return
SET100_RET     SET100 index return
MAI_RET        MAI index return
Source: Author’s calculation
Table 2: Descriptive statistics of dependent variables
                SET_RET   SET100_RET   SET50_RET   MAI_RET
Mean              -0.01         0.00        0.00     -0.13
Median             0.05         0.06        0.06     -0.01
Maximum            2.27         2.71        2.81      1.61
Minimum           -2.42        -2.65       -2.60     -2.65
Std. Dev.          0.71         0.80        0.80      0.72
Skewness          -0.32        -0.22       -0.17     -0.79
Kurtosis           4.12         4.26        4.31      4.13
Jarque-Bera       21.51        23.07       23.53     48.64
Probability        0.00         0.00        0.00      0.00
Sum               -3.74        -1.21        1.14    -38.64
Sum Sq. Dev.     156.21       195.01      195.92    160.56
Observations        308          308         308       308
Source: Author’s calculation
Independent variables
In the study, independent variables are classified into two groups, namely financial data and
textual data. Financial data are trading volumes in SET, SET50 and SET100. Trading volume
enters the mean equation of EGARCH (1,1) in the form of percentage changes. The
textual data in the study are the news-based market sentiment, the positive sentiment and the
negative sentiment, all of which are extracted from the corpus of news headlines scraped from
the Thomson Reuters newswire. Table 3 and Table 4 present the variable descriptions and
descriptive statistics of the independent variables.
Table 3: Independent variables
Variable        Description
SET_VOLM        Percentage change of trading volume in SET
SET50_VOLM      Percentage change of trading volume in SET50
SET100_VOLM     Percentage change of trading volume in SET100
MKT_SEN         News-based market sentiment
POS             Positive sentiment
NEG             Negative sentiment
Source: Author’s calculation
Table 4: Descriptive statistics of independent variables
                 SET_VOLM   SET50_VOLM   SET100_VOLM   MKT_SEN       POS     NEG
Mean                 0.14         0.17         -0.01     -0.02      0.03    0.05
Median              -0.89        -1.10         -2.12      0.00      0.00    0.04
Maximum             70.22       123.78         85.29      0.36      0.36    0.29
Minimum            -65.66       -89.31        -72.07     -0.29      0.00    0.00
Std. Dev.           20.35        32.13         27.77      0.08      0.05    0.06
Skewness             0.22         0.20          0.33      0.14      2.22    1.05
Kurtosis             3.83         3.41          3.10      4.16     11.03    3.66
Jarque-Bera         11.42         4.35          5.88     18.18   1080.08   62.06
Probability          0.00         0.11          0.05      0.00      0.00    0.00
Sum                 43.71        52.98         -4.47     -5.79      9.84   15.63
Sum Sq. Dev.    127097.50    316928.30     236801.70      2.09      0.70    0.99
Observations          308          308           308       308       308     308
Source: Author’s calculation
Textual analysis
Analytical pre-processing
Financial news headlines are unstructured data by nature. That is, numbers, punctuation marks,
and alphanumeric and non-alphanumeric characters are mixed together in a haphazard way. Text
mining is the process of transforming unstructured data into structured and quantifiable data.
That is, we quantify text. In natural language processing, textual pre-processing is the
procedure of cleaning text and transforming textual characters into the bag-of-words model.
In this paper, the corpus is a daily collection of financial news headlines from Reuters Eikon
newswires. One day is equivalent to one document in the corpus. For example, in Figure 1,
there are two documents: one for 8-Feb-19 and another for 7-Feb-19. The text data pertaining to
each day are used in the textual analysis.
Figure 1: Financial news headlines from Reuters Eikon’s newswire.
Source: Author’s calculation
The ultimate goal of pre-processing is to eliminate redundant dimensions in the bag-of-words in order to filter noise out of the documents. In this paper, textual pre-processing consists of concatenating words; removing white space, numbers, punctuation marks, non-alphanumeric symbols, Unicode characters and English stopwords; stemming words with the Porter stemmer; and tokenization. The financial corpus contains more than 20,000 terms before pre-processing.
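The cleaning steps listed above can be sketched as follows. This is a minimal illustration and not the author's pipeline: the stopword set is a tiny hypothetical subset, the headlines are invented, and Porter stemming is deliberately omitted because it would normally come from an external library.

```python
import re

# A tiny illustrative subset of English stopwords (assumption: the study used
# a full stopword list, which is not reproduced here).
STOPWORDS = {"a", "an", "and", "as", "at", "by", "for", "in", "of", "on", "the", "to"}


def preprocess(headlines):
    """Turn one day's headlines into a list of cleaned tokens (one document)."""
    text = " ".join(headlines).lower()                      # concatenate and lowercase
    text = text.encode("ascii", "ignore").decode("ascii")   # drop non-ASCII characters
    text = re.sub(r"[^a-z\s]", " ", text)                   # drop numbers, punctuation, symbols
    tokens = text.split()                                   # whitespace tokenization
    return [tok for tok in tokens if tok not in STOPWORDS]  # remove stopwords


# Hypothetical headlines, for illustration only.
day = ["Thai stocks gain 1.2% as banks rally", "SET index at 3-month high on inflows"]
tokens = preprocess(day)
```

Each trading day's headlines are joined into one string first, which matches the convention that one day is one document in the corpus.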
Figure 2: The number of words in each day's financial news headlines from the Thomson Reuters Eikon newswire (series: words; 5-day moving average).
Source: Author’s calculation
Some terms in financial headlines occur at a high frequency but do not contribute
meaningful information. In terms of Shannon entropy, terms with high frequency carry little
information. Therefore, I employ the term frequency-inverse document frequency (TF-IDF)
weighting scheme as the means to drop those terms from the documents. TF-IDF is a
weighting scheme, alternative to plain term frequency (TF), based on how important a word is in
explaining a document. Words that occur in few documents receive a high weight because they
contain meaningful information, whereas, under the standard inverse document frequency, words
that occur in every document receive a weight of zero. I drop terms whose TF-IDF values are zero
and keep terms whose values are greater than zero.
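The filtering rule can be illustrated with a small sketch. It assumes the textbook weighting tf × log(N / df), which gives a weight of exactly zero to any term that appears in every document; the exact TF-IDF variant used in the study is not stated, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # number of documents containing each term
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def drop_zero_weight_terms(documents):
    """Keep only the tokens whose TF-IDF weight is strictly positive."""
    weights = tf_idf(documents)
    return [[tok for tok in doc if w[tok] > 0] for doc, w in zip(documents, weights)]


# Toy corpus of three daily documents (hypothetical tokens).
docs = [["set", "index", "gain"], ["set", "index", "loss"], ["set", "baht", "gain"]]
filtered = drop_zero_weight_terms(docs)
```

In this toy corpus the token "set" appears in all three documents, so its inverse document frequency is log(3/3) = 0 and it is removed everywhere, while the other tokens survive.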
Figure 3: Word Cloud of the corpus of headline news before TF-IDF scheme.
Source: Author’s calculation
Figure 4: Word Cloud of the corpus of headline news after TF-IDF scheme
Source: Author’s calculation
Dictionary method and market sentiment
The dictionary method ensures the replicability and consistency of the results. This paper is
not the first to use this method: it is also used in Tetlock (2007), Tetlock et
al. (2008) and Loughran and McDonald (2011).
In order to construct the market sentiment variable, I employ the list of finance terms used in
Loughran and McDonald (2011) as a user-defined dictionary. Then, a simple counting method is
used to compute the market sentiment as presented in equation 2.
\[ MKT\_SENT_t = \frac{n_{d,t}^{Pos} - n_{d,t}^{Neg}}{n_{d,t}} \tag{2} \]
$t$ is a trading day. $MKT\_SENT_t$ is the news-based market sentiment extracted from
Reuters Eikon’s newswires. $n_{d,t}^{Pos}$ is the number of positive-tone words in document $d$
on date $t$. $n_{d,t}^{Neg}$ is the number of negative-tone words in document $d$ on date $t$.
$n_{d,t}$ is the total number of terms in the document.
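The counting rule in equation 2 can be sketched as below. The two word sets are toy stand-ins, not the actual Loughran and McDonald (2011) word lists, and returning zero for an empty document is an assumption made only so that the function is defined for every input.

```python
# Toy word lists standing in for the Loughran and McDonald (2011) dictionary
# (hypothetical; the real lists are much longer).
POSITIVE = {"gain", "rally", "strong", "improve"}
NEGATIVE = {"loss", "decline", "weak", "default"}


def sentiment_scores(tokens):
    """Net, positive and negative sentiment as shares of all terms in a document."""
    n = len(tokens)
    if n == 0:
        # Assumption: a day without any terms is treated as neutral.
        return {"MKT_SENT": 0.0, "POS": 0.0, "NEG": 0.0}
    n_pos = sum(tok in POSITIVE for tok in tokens)
    n_neg = sum(tok in NEGATIVE for tok in tokens)
    return {"MKT_SENT": (n_pos - n_neg) / n, "POS": n_pos / n, "NEG": n_neg / n}


# One hypothetical daily document of eight cleaned tokens.
doc = ["thai", "stocks", "gain", "banks", "rally", "baht", "weak", "outlook"]
scores = sentiment_scores(doc)
```

With two positive words and one negative word among eight terms, the net index is (2 - 1) / 8, and it always equals the positive share minus the negative share.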
Figure 5: The news-based sentiment index from news headlines (series: market sentiment; 5-day moving average).
Source: Author’s calculation
Chapter 4 Empirical strategy
Autoregressive model
In order to explore whether the sentiment contains meaningful information that can be used
to explain stock market movements, I resort to a basic regression framework. The econometric
specification is closely related to McLaren and Shanbhogue
(2011), who analyse the predictive power of Google search data in U.S. labour and housing
markets. The baseline model is a standard ARMA(p,q) model.
\[ X_t = \beta_0 + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \dots + \beta_{p-1} X_{t-p+1} + \beta_p X_{t-p} + \varepsilon_t + \gamma_1 \varepsilon_{t-1} + \gamma_2 \varepsilon_{t-2} + \dots + \gamma_q \varepsilon_{t-q} \tag{3} \]
In equation 3, $X_t$ is a dependent variable. Nine benchmark equations, which are SETI,
SETI50, SETI100, SET_RET, SET_RET50, SET_RET100, SET_VOL, SET50_VOL and
SET100_VOL, are estimated. $p$ is the autoregressive lag order and $q$ is the moving-average
order; $p$ is chosen by the Akaike Information Criterion (hereafter, AIC).
$X_{t-1}, X_{t-2}, \dots, X_{t-p}$ are autoregressive terms. $\varepsilon_t$ is an independently
and identically distributed normal disturbance term.
\[ X_t = \beta_0 + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \dots + \beta_{p-1} X_{t-p+1} + \beta_p X_{t-p} + \varepsilon_t + \gamma_1 \varepsilon_{t-1} + \gamma_2 \varepsilon_{t-2} + \dots + \gamma_q \varepsilon_{t-q} + \alpha MKT\_SENT_t \tag{4} \]
In order to examine the predictive power of market sentiment, I first fit the benchmark
equation 3, then fit equation 4 and compare its AIC with that of the benchmark model. If the
news-based sentiment contains meaningful information about stock market variables, the estimated
coefficient $\alpha$ will be statistically significant with a positive sign, and the AIC of
equation 4 will be lower than the AIC of equation 3.
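The comparison between equations 3 and 4 can be illustrated with a simplified sketch. It is not the estimation used in the paper: the moving-average terms are omitted, so the benchmark is an AR(2) fitted by ordinary least squares; the data are simulated; and the AIC uses the Gaussian form 2k - 2 ln L, which may be scaled differently from the values reported by the author's software. All function names are hypothetical.

```python
import math
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def gaussian_aic(X, y):
    """AIC = 2k - 2 ln L with the maximum-likelihood variance estimate RSS / n."""
    beta = ols(X, y)
    n, k = len(y), len(beta)
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    return 2.0 * k - 2.0 * log_lik


# Simulated data (not from the study): returns load on contemporaneous sentiment.
random.seed(0)
n = 300
sent = [random.gauss(0.0, 0.08) for _ in range(n)]
ret = [0.0, 0.0]
for t in range(2, n):
    ret.append(-0.1 * ret[t - 1] + 0.05 * ret[t - 2] + 5.0 * sent[t]
               + random.gauss(0.0, 0.5))

y = ret[2:]
X_bench = [[1.0, ret[t - 1], ret[t - 2]] for t in range(2, n)]   # analogue of equation 3
X_sent = [row + [sent[t]] for row, t in zip(X_bench, range(2, n))]  # analogue of equation 4
aic_bench, aic_sent = gaussian_aic(X_bench, y), gaussian_aic(X_sent, y)
```

Because the simulated returns depend on sentiment by construction, the sentiment-augmented regression should attain the lower AIC, which is the pattern the paper looks for in the actual data.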
Exponential Generalized Autoregressive Conditional Heteroskedasticity (EGARCH)
Volatility clustering is ubiquitous in high-frequency financial data. In order to explore the
effects of the news-based sentiment index on volatility, we employ EGARCH. There are three major
reasons why the paper uses EGARCH rather than GARCH. First, GARCH assumes that only the
magnitude of unanticipated excess returns, not their sign, influences the conditional
variance. This assumption rules out the evidence that volatility rises in response to bad
news and falls in response to good news. Second, non-negativity constraints must be imposed on
GARCH parameters to ensure that the conditional variance is positive. Third, it is difficult to
interpret the persistence of the conditional variance in GARCH. Thus, Nelson (1991) proposes the
Exponential GARCH model (EGARCH) as an alternative to GARCH.
The EGARCH model has an advantage over GARCH because it ensures that the conditional
variance is positive without parameter constraints and allows for an asymmetric response of
volatility to good and bad news. EGARCH also incorporates the asymmetric leverage effect. This
paper therefore employs Nelson’s EGARCH as a workhorse to analyze the effects of news-based
sentiment on the volatility of financial market data.
EGARCH (1,1) specification
\[ y_t = \beta_0 + \beta_1 VOL_t + \omega_t \tag{5} \]
\[ \omega_t \sim (0, h_t) \]
\[ \log(h_t) = \gamma_0 + \gamma_1 \left( \frac{\omega_{t-1}}{\sqrt{h_{t-1}}} \right) + \gamma_2 \left( \left| \frac{\omega_{t-1}}{\sqrt{h_{t-1}}} \right| - \sqrt{\frac{2}{\pi}} \right) + \gamma_3 \log(h_{t-1}) + \alpha S_t \tag{6} \]
I employ a parsimonious EGARCH(1,1) because the model is sufficient to capture the non-
normality of the data in this paper. There are 9 EGARCH equations, consistent
with the study in section 3.23. $y_t$ is the dependent variable. $\omega_t$ is the disturbance
term. $h_t$ is the conditional variance of $\omega_t$. Equations 5 and 6 are the mean and the
conditional variance equations. I control for the effects of the trading volume in each market
in the mean equation. In this section, the market sentiment is classified into positive and
negative sentiment. Equations 7 and 8 show the derivation.
\[ POS = \frac{n_{d,t}^{Pos}}{n_{d,t}} \tag{7} \]
\[ NEG = \frac{n_{d,t}^{Neg}}{n_{d,t}} \tag{8} \]
Equation 7 gives the fraction of positive terms in relation to the total number of terms in the
document, and equation 8 gives the corresponding fraction of negative terms. EGARCH is estimated
by maximum likelihood.² I fit the model under different assumptions about
the distribution of the residuals.
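To make the role of the sentiment term in the variance equation concrete, the recursion for the conditional variance can be iterated directly. This is a sketch under stated assumptions, not an estimation routine: the residuals, the starting variance `h0` and every parameter value are hypothetical, whereas in the study the parameters are estimated by maximum likelihood.

```python
import math


def egarch_variance_path(residuals, sentiment, g0, g1, g2, g3, alpha, h0):
    """Iterate log(h_t) = g0 + g1*z + g2*(|z| - sqrt(2/pi)) + g3*log(h_{t-1}) + alpha*S_t,
    where z = omega_{t-1} / sqrt(h_{t-1}) is the lagged standardized residual."""
    expected_abs_z = math.sqrt(2.0 / math.pi)  # E|z| for a standard normal z
    h = [h0]
    for t in range(1, len(residuals)):
        z = residuals[t - 1] / math.sqrt(h[t - 1])
        log_h = (g0 + g1 * z + g2 * (abs(z) - expected_abs_z)
                 + g3 * math.log(h[t - 1]) + alpha * sentiment[t])
        h.append(math.exp(log_h))  # exponentiating keeps h_t strictly positive
    return h


# Hypothetical residuals, sentiment paths and parameters, for illustration only.
omega = [0.5, -1.0, 0.3, -0.2]
no_news = [0.0, 0.0, 0.0, 0.0]
good_news = [0.0, 0.2, 0.2, 0.2]
params = dict(g0=-0.1, g1=-0.05, g2=0.1, g3=0.9, alpha=-1.0, h0=0.5)
h_base = egarch_variance_path(omega, no_news, **params)
h_good = egarch_variance_path(omega, good_news, **params)
```

With a negative coefficient on the sentiment variable, a positive sentiment share of 0.2 scales the next conditional variance by exp(-0.2) relative to the no-news path, and the variance stays positive without any parameter constraints.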
Figure 6: The negative sentiment index from news headlines (series: negative sentiment; 5-day moving average).
Source: Author’s calculation
² See Tsay (2013)
Figure 7: The positive sentiment index from news headlines.
Source: Author’s calculation
In order to analyze the effect of market sentiment on the volatility of SET, the sentiment
variable in the variance equation is $S_t \in \{POS, NEG\}$. Most importantly, the expected
signs of the parameters are $\alpha_{POS} < 0$ and/or $\alpha_{NEG} > 0$. This means that
relatively negative market sentiment is expected to lead to higher volatility, whereas more
positive market sentiment is expected to lower market volatility.
Chart for Figure 7: Positive market sentiment (series: positive sentiment; 5-day moving average; 10-day moving average).
Chapter 5 Empirical Results
Simple Probit model
Table 5: Results of the binary probit model
               MKT_SEN         MKT_SEN_{t-1}   MKT_SEN_{t-2}   MKT_SEN_{t-3}
SET_RET        2.06** (0.89)   1.24 (0.88)     1.33 (0.85)     -0.1 (0.87)
SET50_RET      1.95** (0.89)   1.60* (0.89)    1.45* (0.87)    0.41 (0.87)
SET100_RET     1.92** (0.89)   1.06 (0.88)     1.53* (0.87)    0.30 (0.87)
MAI_RET        1.81** (0.88)   -0.08 (0.87)    -0.20 (0.87)    0.94 (0.87)
Source: Author’s calculation
Results from the simple binary probit presented in Table 5 indicate that the news-
based market sentiment has predictive power over market returns to some degree. In Table
5, the market sentiment is significant for all four return series, and some of its lagged values
(the 1-day and 2-day lags for SET50 and the 2-day lag for SET100) significantly help predict the
probability of ups and downs of stock market returns.
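A probit coefficient maps sentiment into a probability through the standard normal distribution function. The sketch below uses the SET_RET coefficient on MKT_SEN from Table 5 (2.06); the intercept is not reported in the table, so setting it to zero is purely an assumption for illustration, and the function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_up(sentiment, beta, intercept=0.0):
    """Probit probability of a positive return: Phi(intercept + beta * sentiment)."""
    return normal_cdf(intercept + beta * sentiment)


# beta = 2.06 is the SET_RET coefficient on MKT_SEN in Table 5.
# Assumption: intercept = 0, because the table does not report it.
p_neutral = prob_up(0.0, 2.06)
p_positive = prob_up(0.10, 2.06)
```

Under that assumed intercept, neutral sentiment implies an even chance of an up day, and a sentiment reading of 0.10 raises the implied probability above one half.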
ARMA
In order to explore the predictive power of the news-based market sentiment and of the positive
and negative sentiment for stock market returns, the study employs the ARMA(p,q) process as a
workhorse to compare predictive power among the sentiment variables. For the
benchmark model, ARMA (2,2) is fitted to capture the process of market returns in SET, SET50
and SET100. As for MAI, it turns out that ARMA (1,1) sufficiently captures the data generating
process of the return of the MAI index.
Table 6: Results of ARMA (2,2). */**/*** denote significance at the 10%, 5% and 1% levels.
Dependent variable: SET_RET_t
                 Benchmark (1)      Market Sentiment (2)   Positive Sentiment (3)   Negative Sentiment (4)
MKT_SEN_t        -                  0.97* (0.50)           -                        -
POS_t            -                  -                      2.26** (0.87)            -
NEG_t            -                  -                      -                        -0.65 (0.72)
AR(1)            -0.44** (0.17)     -0.45** (0.18)         -0.92*** (0.12)          -0.44** (0.17)
AR(2)            -0.80*** (0.19)    -0.80*** (0.20)        -0.60*** (0.13)          -0.80*** (0.19)
MA(1)            0.47*** (0.16)     0.48** (0.17)          0.97*** (0.11)           0.47** (0.17)
MA(2)            0.82*** (0.18)     0.82*** (0.20)         0.69*** (0.11)           0.82*** (0.19)
Constant         -0.01 (0.04)       0.01 (0.04)            -0.08 (0.05)             0.02 (0.06)
Durbin Watson    1.98               1.99                   2.02                     1.99
AIC              2.1747             2.1685                 2.1631                   2.1785
Dependent variable: SET50_RET_t
                 Baseline (5)       Market Sentiment (6)   Positive Sentiment (7)   Negative Sentiment (8)
MKT_SEN_t        -                  1.06* (0.56)           -                        -
POS_t            -                  -                      2.22** (0.97)            -
NEG_t            -                  -                      -                        -0.70 (0.81)
AR(1)            -0.48** (0.17)     -0.49** (0.18)         -0.54*** (0.15)          -0.47** (0.18)
AR(2)            -0.81*** (0.21)    -0.80*** (0.23)        -0.84*** (0.19)          -0.80*** (0.22)
MA(1)            0.51*** (0.17)     0.52*** (0.18)         0.56*** (0.14)           0.50** (0.18)
MA(2)            0.82*** (0.21)     0.81*** (0.23)         0.86*** (0.19)           0.81*** (0.22)
Constant         0.01 (0.05)        0.03 (0.58)            -0.07 (0.06)             0.04 (0.06)
Durbin Watson    2.03               2.04                   2.03                     2.04
AIC              2.4042             2.3987                 2.3935                   2.4082
Table 7: shows the results of ARMA(2,2). */**/*** denotes the level of significance at 10%,5% and 1%
In table 6, column 1 and column 5 present benchmark equations for the returns of SET and
SET50. AR and MA terms are statistically significant and AIC values equal 2.17 and 2.40. In
addition, Durbin-Watson statistics indicate that there is no autocorrelation left. When
incorporating the news-based market sentiment into benchmark equations in column 2 and
column 6, coefficients of the news-based sentiment are statistically significant at 10% level
and adding such variables also decreases AIC values. Thus, the news-based market sentiment
improves the predictive power of ARMA (2,2) of SET and SET50 returns.
With regard to the effects of positive market sentiment on SET and SET50 returns in column 3
and 7, the coefficients of the positive market sentiment are significant at 5%. Their AIC values
are lower than those in the benchmark. The positive market sentiment therefore has the
predictive power over SET and SET50 returns. On the contrary, in column 4 and 8, coefficients
of the negative sentiment are not significant. AIC values in column 4 and 8 are greater than
AIC values of the benchmark models. Adding the negative market sentiment does not
significantly contribute to the predictive power of the model. In conclusion, the news-based
market sentiment and the positive market sentiment may contain meaningful information for
returns of SET and SET50.
For SET100 and MAI in Table 7, coefficients of the news-based market sentiment in columns 2
and 6 are positive and statistically significant. Furthermore, coefficients of the positive
sentiment are statistically significant at the 5% level and exhibit positive signs. However,
coefficients of the negative sentiment are insignificant for both SET100 and MAI. This
suggests that the news-based sentiment and the positive sentiment carry some meaningful
information that helps predict stock market returns.
When the news-based market sentiment or the positive sentiment is incorporated into
the model, AIC values of ARMA(2,2) and ARMA(1,1) also decrease, which suggests that
the augmented regressions take into account some important information.
EGARCH (1,1)
Table 8 shows the results of the benchmark EGARCH (1,1) and volume-augmented EGARCH (1,1)
models. Dependent variables in EGARCH (1,1) are SET, SET50 and SET100 returns, and dependent
variables in volume-augmented EGARCH (1,1) are likewise SET, SET50 and SET100 returns.
In EGARCH (1,1), the percentage change of trading volume enters the mean
equation, whereas in the volume-augmented EGARCH (1,1) models, the level of trading volume
enters the variance equation, in the spirit of Lamoureux and Lastrapes (1990).
In Table 9, which reports the results of EGARCH (1,1) for the negative and positive sentiment
variables, the left panel shows that the coefficients of the negative sentiment are insignificant
across SET, SET50, SET100 and MAI returns. However, except for MAI, the log likelihood values of
EGARCH (1,1) augmented with the negative sentiment are higher than those of the benchmark
models. The coefficients of the positive sentiment in EGARCH (1,1) are statistically significant
and show negative signs for SET, SET50 and SET100 returns. As good news becomes more prevalent
in the market, the overall volatility of market returns decreases.
Chapter 6 Robustness Checks
Volume-augmented EGARCH (1,1) is employed for the robustness check in Table 10. Under this
different setup, some results hold and some do not. In the left panel of Table 10, only the
coefficient of the negative sentiment for SET100 returns is statistically significant, and it
shows a positive sign. This suggests that bad news generates higher volatility in
SET100 returns. However, the result is not robust when compared with the estimates
from EGARCH (1,1).
In the right panel of Table 10, the coefficients of the positive sentiment in volume-augmented
EGARCH (1,1) are statistically significant and display negative signs for SET and SET50
returns. The finding for SET100 returns is not robust.
Chapter 7 Discussion
The study constructs a news-based sentiment index from the corpus of financial news specific
to the Stock Exchange of Thailand from Thomson Reuters Eikon’s newswire over 2017-2019.
The dictionary technique is used to extract the sentiment from the corpus. In addition to
the news-based sentiment, the study constructs the negative and the positive sentiment,
which are the two components of the news-based sentiment.
This allows the researcher to examine the effects of the positive and negative sentiment
on market returns in Thailand during 2017-2019.
Does the news-based sentiment from the media newswire have the predictive power over
market returns?
From the probit model, the news-based sentiment has some predictive power. The index itself
significantly explains the probability of the direction of stock market returns in SET, SET50,
SET100 and even MAI. The 1-day and 2-day lagged sentiment variables have some predictive
power over SET50 returns. In SET100, only the 2-day lagged sentiment is statistically significant.
Thus, according to the simple probit model, the news-based sentiment carries some meaningful
information about the probability of ups and downs of market returns, and its 1-day and 2-day
lagged values may contain useful information about the probability of the direction of stock
market returns in SET50 and SET100.
To provide more empirical evidence of the predictive power of the news-based
sentiment index, time series models, ARMA (2,2) for SET, SET50 and SET100 and ARMA
(1,1) for MAI, are employed as a standard workhorse to examine the predictive power of the
newly introduced index.
News-based market sentiment and its predictive power
Compared with the benchmark models, adding the news-based sentiment index to ARMA
(2,2) for SET, SET50 and SET100 returns and ARMA (1,1) for MAI returns reduces AIC values
in all equations. Thus, incorporating the news-based sentiment variable improves the model in
terms of the information criterion.
In addition, coefficients of the news-based sentiment are statistically significant. The news-
based sentiment contains significant information for market returns. As for the relationship
between the sentiment index and returns, each market-sentiment coefficient has a positive sign.
The more positive the market sentiment is, the higher the stock market returns across SET,
SET50, SET100 and MAI.
Separating the news-based sentiment into the positive and the negative sentiment gives
more insightful results. Only coefficients of the positive sentiment index are statistically
significant, whereas coefficients of the negative sentiment index are insignificant. The
coefficients of the positive sentiment show positive signs, which is consistent with the overall
market sentiment index. By contrast, the negative-sentiment coefficients have negative signs,
but they are insignificant.
Above all, the news-based sentiment index and the positive sentiment index contain
significant information for market prediction.
Effects of the positive and negative sentiment on market volatility
The positive and the negative sentiment indices are by-products of the dictionary method used
to calculate the news-based sentiment index. In this study, the effects of the positive and
negative sentiment indices on the volatility of stock market returns in SET, SET50 and SET100
are investigated. Parsimonious EGARCH (1,1) models with the trading volume are estimated to
explore the effects of positive sentiment (good news) and negative sentiment (bad news) on
volatility.
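
As a rough illustration of how a dictionary method yields positive, negative and net sentiment at the same time, consider the sketch below. The word lists are hypothetical miniatures and are not the dictionary used in the study. The normalization by the total word count is also an assumption; the study's own formula is given in the methodology chapter.

```python
import re

# Hypothetical miniature word lists, for illustration only.
POSITIVE_WORDS = {"gain", "gains", "rise", "rises", "profit", "strong"}
NEGATIVE_WORDS = {"loss", "losses", "fall", "falls", "weak", "default"}


def score_headlines(headlines):
    """Return (positive share, negative share, net sentiment) for a set of headlines."""
    # Lower-case each headline and keep alphabetic tokens only.
    tokens = [w for h in headlines for w in re.findall(r"[a-z]+", h.lower())]
    if not tokens:
        return 0.0, 0.0, 0.0
    pos = sum(w in POSITIVE_WORDS for w in tokens) / len(tokens)
    neg = sum(w in NEGATIVE_WORDS for w in tokens) / len(tokens)
    # Net sentiment is the positive share minus the negative share (assumed form).
    return pos, neg, pos - neg


day = ["Thai stocks rise on strong bank profit", "Baht falls as exports stay weak"]
pos, neg, net = score_headlines(day)
print(pos, neg, net)
```

In this example the two headlines contain 13 words, of which three are in the positive list and two in the negative list, so the positive share is 3/13, the negative share is 2/13 and the net sentiment is 1/13.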
Results show that the coefficients of the positive sentiment in the variance equations are
statistically significant in EGARCH (1,1) for SET, SET50 and SET100 returns. The positive
sentiment tends to have a larger effect on the volatility of SET50 returns than on that of SET
and SET100 returns. Each coefficient has a negative sign, in contrast to those in the ARMA
models. The interpretation is that when the positive sentiment increases, volatility in the stock
markets falls. The negative sentiment behaves differently in EGARCH (1,1): its coefficients in
the variance equation are insignificant.
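
The role of a negative positive-sentiment coefficient in the variance equation can be seen in a one-step sketch of an EGARCH (1,1) log-variance recursion. The functional form below is a generic textbook one with sentiment entering additively, which is an assumption; the parameter values are hypothetical and are not the estimates reported in the study.

```python
import math


def egarch_log_variance(prev_log_var, prev_z, pos, neg,
                        omega, alpha, gamma, beta, delta_pos, delta_neg):
    """One step of an EGARCH(1,1) log-variance recursion with sentiment regressors."""
    expected_abs_z = math.sqrt(2.0 / math.pi)  # E|z| for a standard normal shock
    return (omega
            + beta * prev_log_var                       # persistence
            + alpha * (abs(prev_z) - expected_abs_z)    # size effect of the shock
            + gamma * prev_z                            # sign (leverage) effect
            + delta_pos * pos                           # positive sentiment
            + delta_neg * neg)                          # negative sentiment


# Hypothetical parameters: delta_pos < 0 mirrors the sign reported in the text.
params = dict(omega=-0.2, alpha=0.1, gamma=-0.05, beta=0.95,
              delta_pos=-2.0, delta_neg=0.0)
low = math.exp(egarch_log_variance(prev_log_var=-9.0, prev_z=0.5,
                                   pos=0.1, neg=0.05, **params))
high = math.exp(egarch_log_variance(prev_log_var=-9.0, prev_z=0.5,
                                    pos=0.3, neg=0.05, **params))
print(low, high)
```

Because the recursion is written in logs, raising the positive sentiment by 0.2 scales the conditional variance by exp(delta_pos * 0.2). With a negative delta_pos this factor is below one, so more positive sentiment lowers volatility.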
To provide further evidence for the EGARCH (1,1) findings, the volume-augmented
EGARCH (1,1) is employed as a robustness check. The robustness check confirms the effect of
the positive sentiment on the volatility of SET50 and SET100 returns. The volatility of SET50
returns tends to be the most sensitive to the positive sentiment from the financial newswire.
As for the negative sentiment, the coefficients have positive signs but are insignificant in the
variance equation.
Chapter 8 Conclusion
The market sentiment is extracted from headline news about the Stock Exchange of Thailand in
the Thomson Reuters newswire. Textual analysis, specifically the dictionary method, quantifies
the market sentiment from the corpus of headline news in the study. The news-based sentiment
index covers the period from 2017 to 2019, and the positive and the negative market sentiment
variables are also calculated. The news-based sentiment index, together with the positive and
the negative sentiment, contains important information for predicting market returns. In
addition, the positive sentiment index has a significant impact on the volatility of market
returns. The more positive the sentiment, the lower the volatility in stock markets.
Reference
Allan, D. E., McAleer, M. J. and Singh, A. K. (2015) Machine News and Volatility: The Dow
Jones Industrial Average and the TRNA Real-Time High-Frequency Sentiment Series. In
Gregoriou, G.N. (Ed.), The handbook of high frequency trading (pp.327-344) Amsterdam:
Elsevier.
Atkins A., Niranjan M. and Gerding E. (2018). Financial News Predicts Stock Market Volatility
Better than Close Price. The Journal of Finance and Data Science, 4,120-137.
Boudoukh, F., Feldman, R. Kogan, S. and Richardson, M. (2012). Which News Moves Stock
Prices? a Textual Analysis, NBER working paper series, 18725.
Ferguson, N.J., Philip, D., Lam H.Y.T. and Guo, M. (2015). Media Content and Stock Returns:
the Predictive Power of Press, Multinational Finance Journal, 19(1), 1-31.
Garcia, D. (2013). Sentiment during Recessions. The Journal of Finance. 68(3), 1267-1300
2019 Capital Market Research Institute, The Stock Exchange of Thailand
34
Gotthelf, N. and Uhl, M.W., News Sentiment - A New Yield Curve Factor (March 14, 2018).
Journal of Behavioral Finance, Vol. 19 (3), 2018.
Heston, Steven L. and Sinha, Nitish Ranjan, News Versus Sentiment: Predicting Stock Returns
from News Stories (2016-06). FEDS Working Paper. No. 2016-048.
Lamoureux, C., & Lastrapes, W. (1990). Heteroskedasticity in Stock Return Data: Volume
versus GARCH Effects. The Journal of Finance, 45(1), 221-229. doi:10.2307/2328817
Loughran, T., and McDonald B. (2011). When is a Liability not a Liability? Textual Analysis,
Dictionaries, and 10-Ks. The Journal of Finance, 66, 35–65.
McLaren, Nick and Shanbhogue, Rachana, Using Internet Search Data as Economic Indicators
(June 13, 2011). Bank of England Quarterly Bulletin. No. 2011 Q2.
Nelson, D. B. ( 1991) . Conditional Heteroskedasticity in Asset Returns: a New Approach.
Econometrica: Journal of the Econometric Society, 347-370.
Tetlock, P. ( 2007) . Giving Content to Investor Sentiment: The Role of Media in the Stock
Market. The Journal of Finance, 62, 1139–1168.
2019 Capital Market Research Institute, The Stock Exchange of Thailand
35
Tetlock, P. , Saar-Tsechansky M. , and Macskassy S. (2008) . More Than Words: Quantifying
Language to Measure Firms’ Fundamentals. The Journal of Finance, 63, 1437–1467.
Tsay, R. (2013). An Introduction to Analysis of Financial Data with R. John Wiley & Sons, Inc.
Uhl, M.W. (2014). Reuters Sentiment and Stock Returns, Journal of Bahavioral Finance,15:4,
287-298.
Uhl, M.W. (2017). Emotions Matter: Sentiment and Momentum in Foreign Exchange., Journal
of Behavioral Finance,18:3,249-257.
Appendix I
Variable             t-statistic   P-value
MKT_SEN                   -14.27      0.00
POS                       -14.54      0.00
NEG                       -14.53      0.00
SET_RET                   -16.90      0.00
SET50_RET                 -17.32      0.00
SET100_RET                -17.06      0.00
MAI_RET                   -14.53      0.00
SET_VOLM                  -15.53      0.00
SET50_VOLM                -13.54      0.00
SET100_VOLM               -14.43      0.00
SET_VOLM_level            -10.47      0.00
SET50_VOLM_level          -10.17      0.00
SET100_VOLM_level          -8.18      0.00
Acknowledgement
I would like to thank the Stock Exchange of Thailand’s Capital Market Research Institute
(CMRI). I have benefited greatly from discussions with CMRI’s researchers. The study is
funded by the CM research innovation contest.