Towards a Social Negativity Index: Giving Content to Financial

Tweeting

Mohamed Al Guindy

Sprott School of Business, Carleton University

[email protected]

March 21st, 2018

ABSTRACT

I develop a linguistic index of investor negativity expressed on social media at the firm-day level

and call it the Social Negativity Index (SNI). Higher SNI levels correspond to lower stock returns

and greater trading volume. Consistent with the psychology literature, markets appear to respond

more to negative tweeting than they do to positive tweeting. For the universe of firms listed on the

NYSE, NYSE American and NASDAQ, firms with more retail ownership are tweeted about more,

suggesting that tweeting largely originates from the retail investor base. In addition, firms with

greater dispersion of analyst forecasts are tweeted about more, in what appears to be an attempt by

non-sophisticated investors to resolve the difference of opinion of analysts. The results suggest

that social media has assumed some of the roles traditionally associated with analysts and the

financial media.

JEL classification: G10, G12, G14

Keywords: Social media; Textual Analysis; Wisdom of the crowd; Media


1. Introduction

On August 28th, 2017, Gilead Sciences announced the acquisition of Kite Pharma. Five

days prior, an Artificial Intelligence (AI) system that monitors social media conversations had

predicted this imminent acquisition by observing changes to social media conversations (Ram and

Wigglesworth, 2017). Given the rise of AI systems in financial markets, particularly in the domain

of social media, the goal of this paper is to develop a systematic methodology to quantify large-

scale financial information derived from social media. In particular, I develop a text-based Social

Negativity Index (SNI), reflecting total negativity about a stock on social media, and show that SNI

relates to daily stock performance and trading volume. More generally, I illustrate that social media

has assumed some of the functions traditionally associated with analysts and the financial press.

In recent years, Twitter was used to mobilize Egyptian street protestors during Egypt’s

Arab Spring (Acemoglu, Hassan, and Tahoun, 2018), to predict flu epidemics in New York City

(Broniatowski, Paul and Dredze, 2013), and to aid efforts dealing with hurricanes and other natural

disasters (Seetharaman and Wells, 2017). The relationship between social media and stock returns

has captured attention in recent years. For example, tweets about the pharmaceutical industry by

former US Presidential Candidate, Hillary Clinton, sent the industry stocks down on two separate

events (Egan, 2015; Wang, 2016). Tweets by Senator Bernie Sanders also affected stock

performance of pharmaceuticals (Bloomfield, 2016). Most recently, and perhaps most

prominently, tweets by US President Donald Trump influenced the stock performance of such

companies as Boeing (Lovelace, 2016), Lockheed Martin (Wang, 2016b), and Toyota (Rich,

2017). The influence of tweets from the US President has become so well-established that a mobile

application has been developed to track his tweeting activities and send notifications to investors


who own stocks in a company when the President tweets about it. As Rachel Mayer, co-founder

of Trigger Finance (the company that provides this service) puts it, “Tweets really do matter.”

While tweets from prominent figures can affect stock performance, the goal of this paper

is to deal with the subject systematically to determine the extent to which our knowledge of

traditional financial media extends to the domain of social media. In particular, I investigate the

relationship between social negativity expressed on Twitter, and stock returns and trading volume.

One of the contributions of this paper is to develop and make available a daily firm-level index of

total investor negativity expressed on social media – Social Negativity Index (SNI). Both tweeting

volume and SNI appear to be reflected in securities’ prices. Interestingly, one of the key findings

of Tetlock (2007) – that investors respond more to negative language in the financial press than

they do to positive language – appears to extend to social media – which is congruent with the

psychology literature.

I also find that firms with less institutional ownership, and thus greater retail ownership,

are tweeted about more frequently than other firms. This is consistent with the notion that tweeting

likely originates from the retail investor-base. In the same vein, tweeting about firms is higher

where the dispersion of analyst forecasts is greater. This suggests that Twitter provides an outlet

where investors can discuss various views about a stock in the absence of analysts’ consensus.

This last point is particularly important as it suggests a role for social media in the information

production process – a function historically connected with analysts and traditional media.

The setting of this paper is a compelling one to study for a number of reasons. First, the

goal of this paper is to establish a daily systematic link between the aggregation of all opinions

about stocks, as depicted on Twitter, and stock performance. In doing so, this study mimics

previous studies about the financial media but does so in the context of social media. Second,


unlike studying tweets that originate from firms or from individuals, the number of tweets that

it is possible to analyze in this setting is very large (over 18 million tweets) and encompasses every

firm listed on all major US exchanges. Third, most tweeting originating from firms is positive in

tone – as predicted by theories of selective disclosure in Verrecchia (1983) and Jung and Kwon

(1988). Tweeting from individuals, on the other hand, exhibits more variance in tone, thus allowing

for the development of the SNI.

The “Buzz index”

In April 2016, Sprott Asset Management launched the BUZZ (sentiment) Social Media

Insights ETF (NYSE: BUZ)1. This ETF, distributed by ALPS Portfolio Solutions, aggregates the

sentiment of all stocks in the US based on their social media sentiment. Using proprietary textual

analysis, Big Data, and Artificial Intelligence (AI) algorithms, the index selects 75 stocks with the

most positive sentiment to include in the ETF. The ETF itself is reconstructed monthly. Jamie

Wise, the developer of the BUZZ ETF, says: “We discovered that the overall level of buzz or

sentiment around stocks was in fact predictive, and could lead to a process where you could select

stocks ranked based on that level of sentiment and ultimately come up with a portfolio of securities

that could outperform the market.2”

Tweets about stocks

Since its inception, Twitter used the hashtag “#” symbol to identify the topic of a tweet.

For example #StanleyCup mentioned in a tweet signifies that the tweet is about the Stanley Cup in

1 See http://www.businesswire.com/news/home/20160419005303/en/Investing-%E2%80%9CSocial%E2%80%9D-Sprott-BUZZ-Social-Media-Insights

2 See http://www.etf.com/sections/etf-industry-perspective/sprott-new-etf-captures-investor-buzz


particular. The use of the ‘#’ symbol not only makes it easy for individuals to tweet about topics,

but also facilitates searches for tweets about specific topics.

Because of the rise of Twitter discussions about the stock market, Twitter introduced the

“cashtag” symbol ($) in 2012. The cashtag is used in lieu of the hashtag to signify that a tweet is

about the stock of a specific firm3. For example, $AMZN in a tweet indicates that the tweet

is about the stock of Amazon Inc. The use of the cashtag makes it easy to identify and isolate

tweets that strictly pertain to the stock of a company.
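
As a minimal sketch of how cashtags might be isolated from tweet text, the following Python snippet uses a regular expression. The pattern (a dollar sign followed by one to five letters), the function name `extract_tickers`, and the sample ticker set are illustrative assumptions, not the procedure used in this paper.

```python
import re

# Assumed pattern: "$" followed by 1-5 letters, not glued to a preceding
# word character and not followed by further letters or digits.
CASHTAG = re.compile(r"(?<![A-Za-z0-9_])\$([A-Za-z]{1,5})(?![A-Za-z0-9])")


def extract_tickers(text, listed_tickers):
    """Return listed tickers mentioned as cashtags, in order of first mention."""
    found = []
    for match in CASHTAG.finditer(text):
        ticker = match.group(1).upper()
        if ticker in listed_tickers and ticker not in found:
            found.append(ticker)
    return found


# Hypothetical ticker universe and tweet, for illustration only.
listed = {"AMZN", "AAPL", "NFLX"}
print(extract_tickers("Watching $AMZN and $aapl today, paid $20 for lunch #stocks", listed))
```

Matching against a known ticker list filters out dollar amounts and other strings that merely begin with a dollar sign.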

Literature review

This paper relates to a number of strands in the literature. First, it relates to the literature

that examines the role of the media in financial markets. Second, it relates to the literature on

textual analysis. Finally, this paper relates to the emerging literature on the use of the Internet by

investors, and particularly social media, to communicate financial information.

The financial economics literature established the role of the media in financial markets.

For example, Tetlock (2007) illustrated that stock markets respond to the content of a popular Wall

Street Journal article. Tetlock finds that higher media pessimism predicts downward pressure on

market prices in the short term. Tetlock, Saar-Tsechansky, and Macskassy (2008) further illustrate

the role of media sentiment in that they show that the language content of the media can be used

to predict stock returns and accounting earnings. Engelberg and Parsons (2011) illustrate the causal

impact of media in financial markets, and Fang and Peress (2009) show that media coverage affects

stock returns due to the breadth of information dissemination.

3 See https://www.cnet.com/news/twitter-introduces-ticker-symbol-cashtags-for-finance-searches/ for details about

the introduction of the cashtag.


The literature on textual analysis is an emerging strand of literature in finance and is often

combined with studies examining the role of the media in financial markets. Tetlock (2007) and

Tetlock, Saar-Tsechansky, and Macskassy (2008) use the Harvard-IV-4 psychological dictionary

to conduct textual analysis on media content. More recently, Loughran and McDonald (2011)

introduced a second dictionary to extract textual sentiment that pertains specifically to financial

language. The application of textual analysis to social media is an emerging area of interest in the

literature. As Loughran and McDonald (2016) comment “Hopefully, (textual analysis) methods

can be developed that are better able to capture the information in this [social media] very noisy

yet rich source of data.”

Due to advances in technology, the landscape of how investors gather and process

information has evolved. For example, investors use Internet stock message boards (Antweiler and

Frank, 2004). They also use Google to search for and gather financial information (Da, Engelberg,

and Gao, 2011; Drake, Roulstone and Thornock, 2012). Investors also turn to EDGAR to collect

financial information (Loughran and McDonald, 2017). Blankespoor, Miller and White (2014)

show that firms that use Twitter to communicate information achieve a lower bid-ask spread,

consistent with a reduction in information asymmetry. Jung, Naughton, Tahoun and Wang (2017)

show that firms’ tweets about earnings announcements can improve their information

environments. Chen, De, Hu and Hwang (2014) show that the collective opinion, or wisdom of the

crowd, transmitted on Seeking Alpha, a popular investment crowd-sourcing platform,

can predict stock returns. Bartov, Faurel, and Mohanram (2016) show that opinions on Twitter

posted just before earnings announcements predict quarterly earnings. Chen, Hwang and Liu

(2016) examine tweeting of CEOs showing that such tweeting can increase customer base, and

improve stock liquidity, but that some of these effects are subsequently reversed. Chawla, Da, Xu


and Ye (2015) use data from TD Ameritrade to show that the diffusion of news, particularly trading

news, on social media is associated with lower bid-ask spreads on news days. Al Guindy (2016)

shows that corporate tweeting became significantly more prevalent after the Securities and

Exchange Commission (SEC) endorsed social media as an official channel for corporate

communication.

This paper proceeds as follows. Section 2 provides an overview of the data used,

particularly the Twitter dataset. Section 3 explores the predictability of tweeting about firms.

Section 4 constructs the Social Negativity Index (SNI), while Section 5 examines returns and

trading volume. Section 6 conducts a vector autoregression (VAR) analysis, Section 7

conducts robustness and additional tests, and Section 8 concludes.

2. Data and summary statistics

2.1 Twitter data collection

To collect the tweets used in this project, I set up a small laboratory consisting of computers

constantly collecting financial tweets about all firms listed on the three major US exchanges,

NYSE, NYSE American, and NASDAQ. These computers use programs that I wrote in the Python

programming language and make use of the Twitter Application Program Interface (API)4. The

collection of data takes place daily. I identify

financial tweets as those that contain the “cashtag” $ symbol and the stock ticker. In my Python

program, I provide the tickers of all stocks listed on the NYSE, NYSE American and NASDAQ.

Twitter makes these tweets searchable and collectable for a period of approximately seven days5,

4 See https://dev.twitter.com for the Twitter API details.

5 A description of the availability of tweets for search is available at: https://dev.twitter.com/rest/public/search


after which they are irretrievable. For this reason, it is necessary to build the infrastructure used in

this paper, and to collect the tweets on a daily basis6. After I collect the tweets, I store them in a

SQL database in preparation for further analysis.
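
The storage step could look roughly like the sketch below, which uses Python's built-in `sqlite3` module. The table layout, column names, and helper functions are hypothetical; the paper does not describe its schema or which SQL engine it uses. Because a search window of about seven days combined with daily collection plausibly returns the same tweet on several days, the sketch ignores rows that are already stored.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tweet store. The schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tweets (
            tweet_id   TEXT NOT NULL,
            ticker     TEXT NOT NULL,
            created_at TEXT NOT NULL,
            source     TEXT,
            lang       TEXT,
            text       TEXT NOT NULL,
            PRIMARY KEY (tweet_id, ticker)
        )
        """
    )
    return conn


def save_tweets(conn, rows):
    """Insert rows, skipping duplicates; return the number of new rows stored."""
    def count():
        return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]

    before = count()
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return count() - before


# Made-up row, for illustration only.
conn = open_store()
rows = [("1", "AMZN", "2017-03-01T14:30:00", "web", "en", "$AMZN strong quarter")]
print(save_tweets(conn, rows), save_tweets(conn, rows))
```

Keying on the tweet identifier together with the ticker lets a single tweet that mentions several cashtags be stored once per firm.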

In addition to the Twitter dataset, I obtain stock return and trading volume information

from CRSP, accounting details from COMPUSTAT, Institutional ownership data from Thomson

Reuters 13F filings, and analyst information from I/B/E/S. I exclude firms from regulated

industries, financials, and those that have no Fama-French 48 industry classification. In addition,

I winsorize daily returns at the 0.5% and the 99.5% levels.
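
For readers unfamiliar with winsorizing, the sketch below clips values to the 0.5th and 99.5th percentiles in plain Python. The linear-interpolation quantile rule and the function name `winsorize` are assumptions; the paper does not state which quantile convention it applies.

```python
def winsorize(values, lower=0.005, upper=0.995):
    """Clip values to the given lower and upper quantiles of the sample."""
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)

    def quantile(q):
        # Linear interpolation between neighbouring order statistics (assumed rule).
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    low_cut, high_cut = quantile(lower), quantile(upper)
    return [min(max(v, low_cut), high_cut) for v in values]


# Made-up daily returns: the two extreme observations are pulled toward the cut-offs.
returns = [-0.60, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.75]
print(winsorize(returns))
```

Unlike trimming, this keeps every firm-day in the sample and only limits the influence of extreme returns.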

Table 1 describes summary statistics about the sample of tweets collected. The sample

contains 18,319,583 tweets covering 2,292 firms in the period between January 1st, 2017 and

December 31st, 2017. The tweets originate from 1.02 million unique Twitter users. For a detailed

description of all variables used and their sources, please see Appendix A.

[Insert Table 1 here]

Interestingly, tweeters tweet financial information using numerous methods (devices).

Within the sample set, tweeters use more than 8,500 systems to tweet. As Table 2 illustrates, the

most common method used to tweet financial information is the Twitter website (25% of tweets),

which is not surprising. However, a substantial volume of tweeting originates from iPhone (12%)

and Android (9%) platforms. While it is generally thought that advances in technology are

allowing broader access to financial information, it appears that technological advancements, such

as mobile devices, are allowing market participants to generate financial information more easily.

6 Data appearing and disappearing quickly is known as “high velocity data” in which the data is only available for a

short period, after which it is not available.


[Insert Table 2 here]

One of the benefits of the technological infrastructure built to collect the data for this paper

is that it allows for the collection of numerous details about each tweet, including information about

the tweeter. For example, the sample includes information about the language of the tweeter. The

vast majority of tweets (92%) originate from users whose language is English. A small number of

tweets also originate in Russian, Spanish, French, German, Dutch, and Portuguese. Table 3

summarizes the top languages used by tweeters of financial information. It is not surprising that

most of the tweeting is in the English language for two reasons: firstly, most tweeting around the

globe is in English; secondly, the sample of firms used in this paper consists of those listed on the major

American exchanges. The fact that Twitter is officially blocked in China7 likely explains the

absence of tweeting in Mandarin and Cantonese.

[Insert Table 3 here]

2.2 Financial tweeting daily and hourly distribution

Next, it is useful to examine the distribution of financial tweeting throughout the week.

Figure 1 shows the percentage of all tweeting on each day of the week. As the figure shows, less

tweeting takes place on the weekend. In particular, the volume of tweeting on Saturdays and

Sundays is about half of the volume on weekdays. The volume of tweeting is higher, and generally

somewhat similar for most week days – with Tuesdays exhibiting slightly more tweeting than other

week days.

[Insert Figure 1 here]

7 See http://www.businessinsider.com/websites-blocked-in-china-2015-7/#facebook-4


The breakdown of financial tweeting by hour of day is illustrated in Figure 2. As expected,

financial tweeting is highest during market hours and is generally lower outside of market hours.

There is also a period of elevated discussions just prior to and just after market hours.

[Insert Figure 2 here]

Necessarily, not every firm is tweeted about as frequently as other firms. For example,

Apple appears to be the firm most tweeted about in this sample, with 556,499 tweets. Other firms

in the sample include Amazon (458,891 tweets), Twitter Inc (297,282 tweets), Netflix (203,608

tweets) and Starbucks (58,002 tweets). Table 4 shows the total number of tweets for a subsample

of firms used in this study.

[Insert Table 4 here]

3. Determinants of tweeting about firms

3.1 Firm characteristics that predict tweeting volume

Given that firms are tweeted about with varying frequencies, I seek to identify the

characteristics associated with firms that are tweeted about frequently. For this purpose, I use the

following regression model:

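\[
\begin{aligned}
\ln(\text{Total Tweeting Volume})_i ={}& \alpha_1 + \beta_1 \,\text{Firm Beta}_i + \beta_2 \,\text{Book to Market Ratio}_i + \beta_3 \,\text{Firm Size}_i \\
&+ \beta_4 \,\text{Leverage}_i + \beta_5 \,\text{Payout Ratio}_i + \beta_6 \,\text{Institutional Ownership}_i \\
&+ \beta_7 \,\text{Number of Analysts Following the Firm}_i \\
&+ \beta_8 \,\text{Dispersion of Analyst Forecasts}_i + \beta_9 \,\text{Industry}_i + \varepsilon_i
\end{aligned}
\]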


The dependent variable is the natural logarithm of the total number of tweets about a firm

in the full sample. Beta is the CAPM beta. The book to market ratio is the ratio of the

firm’s book value to its market value. Firm size is the natural logarithm of the dollar value

of the firm’s shares. Leverage is the proportion of debt in the firm’s capital structure. Institutional

ownership represents the percentage of shares in the firm held by institutional investors. The

number of analysts following the firm is the number of unique analysts providing EPS estimates

for the firm. Finally, the dispersion of analyst forecasts is the standard deviation of the analysts’

forecasts scaled by the mean estimate. In addition, industry fixed effects are included in the model.
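
To make the dispersion measure concrete, the following sketch computes the standard deviation of a set of EPS forecasts scaled by their mean. Using the sample standard deviation, taking the absolute value of the mean, and returning `None` when the measure is undefined are assumptions made for illustration; the paper does not specify these details.

```python
from statistics import mean, stdev


def forecast_dispersion(eps_forecasts):
    """Standard deviation of analysts' EPS forecasts scaled by the mean forecast."""
    if len(eps_forecasts) < 2:
        return None  # dispersion is undefined with fewer than two analysts
    average = mean(eps_forecasts)
    if average == 0:
        return None  # scaling by a zero mean is undefined
    # Assumption: scale by the absolute mean so the measure stays non-negative.
    return stdev(eps_forecasts) / abs(average)


# Made-up forecasts: mean 1.00, sample standard deviation 0.10.
print(forecast_dispersion([1.00, 1.10, 0.90]))
```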

The model is depicted in Table 5 and shows that the volume of tweeting about a firm

depends on many of the firm characteristics above. In particular, larger firms are tweeted about

more than smaller firms, which is consistent with the notion that investors are paying more

attention to larger firms. Firms with a higher CAPM beta are tweeted about more frequently than

firms with a lower beta, suggesting that riskier firms attract more discussions. Firms with lower

institutional ownership are tweeted about more than firms with greater institutional ownership.

This suggests that much of the tweeting of financial information originates from retail rather than

institutional investors. One somewhat surprising result is that firms with greater analyst coverage

are tweeted about more often than firms with less analyst coverage, but this may be because

the same factors that attract additional analyst coverage also attract additional retail interest.

However, where the dispersion of analyst forecasts is greatest, the volume of tweeting is also

higher. This suggests that in the absence of analysts’ consensus, investors tweet more about a stock

in what may be an attempt to resolve the difference of opinion.

[Insert Table 5 here]

3.2 Determinants of daily tweeting volume


After examining firm characteristics that predict the volume of investor tweeting about a

firm, I now focus on the determinants of daily tweeting volume. In particular, the goal is to uncover

the factors that lead to a large tweeting volume on a given day. For this purpose, I use the following

model:

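\[
\begin{aligned}
\ln(\text{Daily Tweeting Volume})_{it} ={}& \alpha_1 + \beta_1 \,\text{Firm return}_{it-1} + \beta_2 \,\text{Market return}_{t-1} + \beta_3 \,\text{VIX}_{t-1} \\
&+ \beta_4 \,\text{Earnings day}_{it} + \beta_5 \,\text{Week before earnings}_{it} \\
&+ \beta_6 \,\text{Week after earnings}_{it} \\
&+ \beta_7 \,\text{Tweeting volume on the previous day}_{it} + \beta_8 \,\text{Industry tweeting}_{it} \\
&+ \beta_9 \,\text{Firm fixed effects}_i + \varepsilon_{it}
\end{aligned}
\]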

This model is a panel regression model where the dependent variable is the natural

logarithm of the number of daily tweets about a firm. The independent variables are the firm’s

return on the previous trading day8, the market return on the previous trading day, and the volatility

index (VIX) on the previous trading day. Other independent variables include whether a given day is the

day when a firm announces its quarterly earnings, whether the day is in the week leading to the

earnings announcement, or the week following the earnings announcement. The model also

includes the firm’s tweeting volume on the previous day, as well as the tweeting volume for the

firm’s industry (based on the Fama-French 48 industry classification). Firm fixed effects are

included to account for the heterogeneity in firm characteristics. Standard errors are clustered by

firm and trading day as suggested by Petersen (2009).

8 Returns are calculated from the close of markets on the previous trading day to the close of markets on the current

trading day. This definition also corresponds to the definition of a ‘tweeting day’ which spans the same time from the

close of markets on the previous day to the close of markets on a given day.
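
The close-to-close definition of a tweeting day can be sketched as a simple date assignment. The 4:00 p.m. close, the use of naive timestamps already expressed in exchange local time, and the rolling of weekend tweets into the next weekday are all assumptions made for illustration; exchange holidays are ignored.

```python
from datetime import datetime, time, timedelta

MARKET_CLOSE = time(16, 0)  # assumed close, in exchange local time


def tweeting_day(timestamp):
    """Map a tweet timestamp to the trading day whose close-to-close window contains it."""
    day = timestamp.date()
    if timestamp.time() > MARKET_CLOSE:
        day += timedelta(days=1)  # after the close: belongs to the next tweeting day
    while day.weekday() >= 5:  # Saturday or Sunday: roll forward (holidays ignored)
        day += timedelta(days=1)
    return day


print(tweeting_day(datetime(2017, 3, 1, 15, 0)))  # during market hours
print(tweeting_day(datetime(2017, 3, 3, 17, 0)))  # Friday after the close
```

Under these assumptions, a tweet posted after Friday's close is matched with the following Monday's return.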


The results, documented in Table 6, show that tweeting volume about a firm is driven, in

part, by its return on the previous trading day. In particular, where the return on the previous day

is high, a firm is tweeted about more by investors. Interestingly, the coefficient on

$\text{Market return}_{t-1}$ is not statistically significant, suggesting that tweeting volume about a firm is

not dependent on the market’s previous day’s return, but only on the individual firm’s return.

Similarly, tweeting volume about a firm is not dependent on the volatility index (VIX) on the

previous day.

The coefficient on $\text{Earnings day}_{it}$ is positive and significant at the 1% level. This is

rather expected since investor attention is likely to be highest on the day of the firm’s earnings

announcements. Similarly, tweeting volume remains high during the week following earnings

releases, as suggested by the positive and significant coefficient on $\text{Week after earnings}_{it}$. In

addition, the volume of Twitter discussions is also high during the week leading to the day of

earnings announcements as suggested by the positive and significant coefficient on

$\text{Week before earnings}_{it}$. It is not surprising that the number of tweets about a stock is highest

during earnings season.

If tweeting volume is high on a given day, it is likely to be high on the next day, as

suggested by the positive and significant coefficient on $\text{Tweeting volume on the previous day}_{it}$.

This suggests autocorrelation of tweeting, or that tweeting about firms is “sticky”.

Finally, this model examines whether tweeting about a firm corresponds to tweeting

volume about other firms in the same industry. To construct the $\text{Industry tweeting}$ variable for

a given firm, I sum all the tweeting about all firms in the same Fama-French 48 industry

classification (excluding the firm in question). I then divide the total number of tweets by the total

number of firms in the industry (again excluding the firm in question) resulting in the average daily


tweeting volume for the industry. The coefficient on this variable is positive and significant at the

1% level of significance, suggesting that investors are likely to tweet about a firm on a given day

if they tweet about other firms in the same industry. This result is plausible insofar as investors are

likely to pay attention to firms in the same industry at the same time.
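
The leave-one-out construction of the industry tweeting variable described above can be sketched as follows. The function name, the dictionary inputs, and the choice to return `None` for a firm with no industry peers are illustrative assumptions.

```python
def industry_tweeting(tweet_counts, industry_of):
    """Average tweets per peer firm in the same industry, excluding the firm itself.

    tweet_counts: {firm: number of tweets on a given day}
    industry_of:  {firm: industry code, e.g. a Fama-French 48 code}
    """
    totals, sizes = {}, {}
    for firm, count in tweet_counts.items():
        industry = industry_of[firm]
        totals[industry] = totals.get(industry, 0) + count
        sizes[industry] = sizes.get(industry, 0) + 1

    result = {}
    for firm, count in tweet_counts.items():
        industry = industry_of[firm]
        peers = sizes[industry] - 1
        # Subtract the firm's own tweets and divide by the number of other firms.
        result[firm] = (totals[industry] - count) / peers if peers > 0 else None
    return result


# Made-up counts for one day: firms A, B and C share an industry, D has no peers.
counts = {"A": 10, "B": 20, "C": 30, "D": 5}
industries = {"A": 1, "B": 1, "C": 1, "D": 2}
print(industry_tweeting(counts, industries))
```

Excluding the firm's own tweets keeps the variable from mechanically correlating with the dependent variable.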

[Insert Table 6 here]

4. Constructing the Social Negativity Index (SNI)

In this section, I construct the Social Negativity Index (SNI). The SNI is an index bounded

by the values 0 and 1. 0 represents no negativity, while 1 represents maximum negativity about a

stock on a given day. Sections 5 and 6 will relate SNI to stock returns and trading volume.

Not all tweets are expected to have the same effect on asset prices; tweets that contain

positive information will affect markets differently from tweets containing negative information.

I conduct textual analysis to identify the tone or the linguistic sentiment of each tweet in the dataset.

A detailed procedure of the textual analysis algorithm used in this paper follows.

Textual analysis, as a subfield of finance, has gained prominence over the last decade.

Tetlock (2007) used the Harvard IV-4 Psychological Dictionary to analyze the tone of a popular

Wall Street Journal article. Tetlock showed that the aggregate sentiment of the article (identified by the

proportion of negative words contained in the article) affects stock returns for the subsequent

trading days. The SNI developed in this paper captures the negativity reflected on social media in

particular rather than print media.

Loughran and McDonald (2011) demonstrated that financial language is unique in

comparison to “normal” English language. For example, according to the Harvard Dictionary, a

word such as “cancer” or “debt” would be treated as a negative word. However, in financial


language, such words are not necessarily negative. For example, a pharmaceutical company

developing a drug for cancer will likely use the word ‘cancer’ extensively in its statements and

news. Similarly, the word ‘debt’ is used frequently without implying any negative meaning. For this

reason, Loughran and McDonald (2011) developed a new dictionary that is particularly suited to

analyzing financial language taking some of the above issues into account.

In this paper, the tweets from investors are likely to contain financial information by virtue

of being discussions about the stock of a firm. For this reason, I use the Loughran and McDonald

(2011) dictionary for the main analysis, and later use the Harvard Psychological Dictionary for

robustness.

To conduct the textual analysis, I use the Python programming language and obtain a copy

of the Loughran and McDonald (2011) dictionary9. The text of each tweet is analyzed individually

using the program. Each tweet receives a score corresponding to the count of the number of

positive and the number of negative words in the tweet. Tweets containing more positive than

negative words are deemed positive, and tweets containing more negative words are deemed

negative. Tweets containing an equal number of positive and negative words, as well as tweets

containing no key positive or negative words are deemed neutral. In the sample set, 1,956,800

tweets are identified as positive, 2,092,904 as negative, and the remaining tweets as neutral.
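
The counting rule described above can be sketched in a few lines of Python. The two small word sets are illustrative stand-ins only, not entries taken from the Loughran and McDonald (2011) word lists, and the lowercase alphabetic tokenization is an assumption.

```python
import re

# Illustrative stand-ins; the analysis in the paper relies on the full
# Loughran and McDonald (2011) positive and negative word lists.
POSITIVE = {"gain", "strong", "improve"}
NEGATIVE = {"loss", "decline", "weak"}


def classify_tweet(text, positive=POSITIVE, negative=NEGATIVE):
    """Label a tweet by comparing its counts of positive and negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in positive for word in words)
    neg = sum(word in negative for word in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # ties and tweets with no dictionary words


print(classify_tweet("$AMZN strong quarter, big gain"))
print(classify_tweet("$NFLX weak guidance and a loss, but strong subs"))
print(classify_tweet("$AAPL reports tomorrow"))
```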

In the next stage, I aggregate the tweets for each firm on each day. I add up the number of

positive tweets and negative tweets for each firm. Having identified the sentiment of each tweet

and knowing the total number of tweets for a given firm, I am now ready to define the SNI (for

each stock-day) as follows:

9 The dictionary of financial keywords is available on Bill McDonald’s website: https://www3.nd.edu/~mcdonald/Word_Lists.html


\[
\text{Social Negativity Index (SNI)} = \frac{\text{Number of tweets with negative sentiment}}{\text{Total number of tweets about firm}}
\]
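As an illustration of the aggregation step, the following sketch computes the SNI for each firm-day from tweets that have already been labelled. The input layout (tuples of firm, day, label) and the function name are assumptions made for the example, not details given in the paper.

```python
from collections import defaultdict


def social_negativity_index(labelled_tweets):
    """Map (firm, day) to the share of that firm-day's tweets labelled negative.

    `labelled_tweets` is an iterable of (firm, day, label) tuples, where
    label is 'positive', 'negative', or 'neutral' (assumed layout).
    """
    negative = defaultdict(int)
    total = defaultdict(int)
    for firm, day, label in labelled_tweets:
        total[(firm, day)] += 1
        if label == "negative":
            negative[(firm, day)] += 1
    # Neutral and positive tweets count in the denominator only.
    return {key: negative[key] / total[key] for key in total}


sample = [
    ("AAA", "2017-03-01", "negative"),
    ("AAA", "2017-03-01", "negative"),
    ("AAA", "2017-03-01", "positive"),
    ("AAA", "2017-03-01", "neutral"),
    ("BBB", "2017-03-01", "positive"),
]
print(social_negativity_index(sample))
```

Because neutral tweets enter the denominator, the index lies between 0 and 1 and equals 1 only when every tweet about the firm that day is negative.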

5. SNI, tweeting volume, and market reaction

5.1. SNI, tweeting volume and stock returns

In this section, I examine firms’ stock returns on a given market day to identify whether

SNI and tweeting volume correspond to stock returns. In Table 7, I conduct variations of the

following model (at a daily frequency):

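\[
\begin{aligned}
\text{Return (basis points)}_{it} = {} & \alpha_1 + \beta_1 \cdot \text{SNI}_{it} + \beta_2 \cdot \ln(\text{Number of tweets})_{it} + \beta_3 \cdot \text{Firm return}_{it-1} \\
& + \beta_4 \cdot \text{Market return}_{t} + \beta_5 \cdot \text{Market return}_{t-1} + \beta_6 \cdot \text{Earnings day}_{it} \\
& + \beta_7 \cdot \text{Week before earnings}_{it} + \beta_8 \cdot \text{Week after earnings}_{it} \\
& + \beta_9 \cdot \text{VIX}_{t} + \beta_{10} \cdot \text{Day of the week fixed effects} \\
& + \beta_{11} \cdot \text{Firm fixed effects}_{it} + \beta_{12} \cdot \text{Day fixed effects}_{it} + \varepsilon_{it}
\end{aligned}
\]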

In Table 7, I regress daily returns (in basis points) on SNI and the number of tweets

generated about each firm (the variables of interest). Control variables include the firm’s lagged

return, market return, lagged market return, a dummy variable that takes a value of 1 if the day is
the firm’s quarterly earnings release date, and dummy variables for the week before and the
week after a firm’s quarterly earnings announcement. I also include the volatility index (VIX) and

day of the week fixed effects. Furthermore, I include firm fixed effects, which account for the

heterogeneity of firm characteristics, and day fixed effects (in some of the models) which account

for daily market conditions. Standard errors are double clustered by firm and day.
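To make the role of firm fixed effects concrete, here is a deliberately simplified sketch of the within (demeaning) estimator with a single regressor. It is not the paper's estimation code: it omits the other controls, day fixed effects, and the double clustering of standard errors, and the data are synthetic values constructed with a slope of -30.

```python
from collections import defaultdict


def within_slope(rows):
    """OLS slope of y on x after removing firm-specific means.

    `rows` is a list of (firm, x, y) tuples. Demeaning x and y within each
    firm gives the same slope as including a dummy variable for every firm.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # firm -> [sum_x, sum_y, count]
    for firm, x, y in rows:
        entry = sums[firm]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    numerator = 0.0
    denominator = 0.0
    for firm, x, y in rows:
        sum_x, sum_y, count = sums[firm]
        dx = x - sum_x / count
        dy = y - sum_y / count
        numerator += dx * dy
        denominator += dx * dx
    return numerator / denominator


# Synthetic data: return = firm intercept - 30 * SNI (intercepts 10 and -20).
rows = [
    ("A", 0.0, 10.0), ("A", 0.5, -5.0), ("A", 1.0, -20.0),
    ("B", 0.2, -26.0), ("B", 0.8, -44.0),
]
print(round(within_slope(rows), 6))
```

The two firms have very different average returns, yet the estimator recovers the common slope because only within-firm variation in SNI is used.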


In the first model (1), I examine whether SNI corresponds to stock returns. As Table 7

shows, an increase of social negativity from 0 to 1 corresponds to a negative return of 35.5 basis

points. This result is significant at the 1% level. Furthermore, this result is robust in all the

specifications examined. In model (2), I focus on tweeting volume,
ln(Number of tweets), about a firm. The model shows that an increase of one unit of tweeting

volume corresponds to a 25.6 basis points increase in returns. This result is also significant at the

1% level and robust in all specifications examined. To the extent that the number of Twitter

discussions corresponds to investor attention, this result can be seen in light of Barber and Odean

(2008), who reported that retail investors are net buyers of attention-grabbing stocks.
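The magnitudes reported above can be read as simple linear calculations. The sketch below only multiplies out the two coefficients quoted in the text; the function names are hypothetical, and the numbers describe associations from two separate specifications rather than a forecast.

```python
import math

# Coefficients quoted in the text, in basis points.
BETA_SNI = -35.5        # model (1): SNI moving from 0 to 1
BETA_LN_TWEETS = 25.6   # model (2): one-unit increase in ln(number of tweets)


def sni_effect_bp(delta_sni):
    """Change in same-day return associated with a change in SNI."""
    return BETA_SNI * delta_sni


def volume_effect_bp(tweet_ratio):
    """Change in return associated with multiplying tweet count by `tweet_ratio`."""
    return BETA_LN_TWEETS * math.log(tweet_ratio)


print(sni_effect_bp(0.10))       # a 10 percentage point rise in SNI
print(volume_effect_bp(2.0))     # a doubling of tweets
print(volume_effect_bp(math.e))  # a one-unit rise in ln(tweets)
```

Because tweeting volume enters in logs, a one-unit increase corresponds to roughly a 2.7-fold rise in the number of tweets, while a doubling corresponds to an increase of ln 2, or about 0.69 units.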

In model 3, I combine SNI with tweeting volume in the same specification, and find that

the results are similar to the ones described above for each of tweeting volume and SNI. Taken

together, these results suggest that both SNI and tweeting volume correspond to stock returns. It

may be possible to think of tweeting volume as a proxy for attention, and of SNI as a proxy for

market sentiment.

Models 4-6 replicate the analysis of models 1-3, with the exception that day fixed effects are
included in the model. Because day fixed effects capture overall daily market conditions such as
market return and VIX, such variables are omitted from the control vector.

[Insert Table 7 here]

5.2 SNI, tweeting volume and trading volume

Having looked at the relationship between SNI, tweeting volume and returns, I now turn
to trading volume. The analysis is analogous to the one in section 5.1, with trading volume
replacing returns as the dependent variable. To examine this relationship, I use variations of the following model:


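\[
\begin{aligned}
\text{Trading volume}_{it} = {} & \alpha_1 + \beta_1 \cdot \text{SNI}_{it} + \beta_2 \cdot \ln(\text{Number of tweets})_{it} + \beta_3 \cdot \text{Firm return}_{it-1} \\
& + \beta_4 \cdot \text{Firm trading volume}_{it-1} + \beta_5 \cdot \text{Market return}_{t} \\
& + \beta_6 \cdot \text{Market return}_{t-1} + \beta_7 \cdot \text{Earnings day}_{it} \\
& + \beta_8 \cdot \text{Week before earnings}_{it} + \beta_9 \cdot \text{Week after earnings}_{it} \\
& + \beta_{10} \cdot \text{VIX}_{t} + \beta_{11} \cdot \text{Day of the week fixed effects} \\
& + \beta_{12} \cdot \text{Firm fixed effects}_{it} + \beta_{13} \cdot \text{Day fixed effects}_{it} + \varepsilon_{it}
\end{aligned}
\]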

$\text{Trading volume}_{it}$ is the natural logarithm of the number of shares traded for a given firm
on a given day. The independent variables of interest are SNI and the number of tweets. In addition,
I use the same control variables used in the analysis of stock returns in section 5.1, with the addition
of lagged trading volume. As before, standard errors are clustered by firm and day.

The results of this analysis are reported in Table 8. Specification 1 examines the relationship

between SNI and trading volume. Interestingly, an increase in SNI corresponds to an increase in

trading volume. This suggests that investors trade more on days of increased negativity

(pessimism); necessarily, this also suggests that investors trade less on days of low negativity (high

optimism).

Specification 2 focuses on tweeting volume as the parameter of interest. Not surprisingly,

greater tweeting volume corresponds to greater trading volume. I then combine SNI and tweeting

volume in specification 3 and find that the results remain consistent. Finally, I replicate the analysis
of specifications 1-3 in specifications 4-6 using day fixed effects and obtain similar results.

[Insert Table 8 here]


It is perhaps instructive to summarize these results, along with the return results from the

previous section. It appears that days of increased investor tweeting activities correspond to greater

stock returns (on the day of) as well as increased trading volume. Controlling for tweeting volume,

the SNI corresponds to asset returns; particularly, greater negativity expressed in the form of higher

SNI corresponds to lower returns. Trading volume, on the other hand, increases as the negativity

expressed on social media increases.

5.3 Social Positivity rather than social negativity

The discussion thus far has focused on social negativity. An alternative framing of this exposition
can focus on positive language rather than negative language. In other words, it is possible to

construct a Social Positivity Index instead of the Social Negativity Index, where the unit of measure

is a positive tweet rather than a negative tweet. I repeat the analysis of Table 7 with the exception

that I use social positivity rather than social negativity, and report the results in Table 9.

[Insert Table 9 here]

Table 9 shows that, while social positivity corresponds to positive stock returns, the results
are both economically and statistically weak, suggesting that markets do not necessarily respond
favorably to positive tweeting. This is in contrast to negativity, which has a strong negative impact
on asset prices. Tetlock (2007) reported that markets respond more to negative language than to
positive language in the financial press. This paper demonstrates that Tetlock’s finding extends to
social media: markets respond more to negativity than to positivity. This finding is also
consistent with the psychology literature, which argues that negative information has greater
impact and is processed more thoroughly than positive information (Rozin and Royzman (2001);
Baumeister, Bratslavsky, Finkenauer, and Vohs (2001)).


6. SNI and vector autoregression (VAR) analysis

Thus far, the analysis has been focused on examining a single day. In the previous sections,

I showed that firm returns and trading volume correspond to tweeting volume, and more

importantly to the linguistic sentiment expressed in the form of SNI. Given that both SNI and
returns are measured at a daily frequency, a natural way to model the
interaction between the two variables is a panel vector autoregression (VAR) analysis.

The VAR methodology was used by Tetlock (2007) to examine the dynamics of returns

and sentiment. In Tetlock’s setting, the sentiment is updated daily. Similarly, in the setting of this

paper, SNI is updated daily. The VAR analysis has two benefits. First, it allows us to determine
whether SNI has predictive power over returns on subsequent days. Second, and equally important, it allows
us to determine whether returns reverse on days following SNI shocks.

In this panel VAR setting, I define two endogenous variables: returns and SNI. The VAR
model accounts for 5 lags of returns and SNI, representing roughly a trading week. For this purpose,
it is useful to define the lag operator ($L^{x}$) as used in Tetlock (2007). The lag operator of a variable
represents a vector consisting of $x$ lags of the variable. For example, $L^{5}(z_t)$ is the vector
$[z_{t-1}, z_{t-2}, z_{t-3}, z_{t-4}, z_{t-5}]$. I also use the 0 subscript to denote the inclusion of the contemporaneous
term as follows: $L^{5}_{0}(z_t) = [z_t, z_{t-1}, z_{t-2}, z_{t-3}, z_{t-4}, z_{t-5}]$.

I run the following VAR model, with the results summarized in Table 10.

\[
\text{Return}_{it} = \beta_1 \cdot L^{5}_{0}(\text{SNI})_{it} + \beta_2 \cdot L^{5}(\text{Return})_{it} + \beta_3 \cdot \text{Exog}_{it} + \varepsilon_{it}
\]

In this model, returns are daily firm returns in basis points. Returns are included for the

current day as well as 5 lags of the returns. SNI is included with 5 lags in addition to the

contemporaneous term. The exogenous (control) variables include the market return with five


lags, contemporaneous and five lags of tweeting volume, dummy variables for earnings day, week

before earnings, week after earnings, and day of the week fixed effects.
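The lag operator and the layout of the endogenous regressors described above can be sketched as follows. The helper name `lags`, its arguments, and the sample series are hypothetical; the sketch only shows how the two lag vectors are assembled for one firm on one day, not how the VAR itself is estimated.

```python
def lags(series, t, n_lags=5, include_current=False):
    """Return [z_{t-1}, ..., z_{t-n}], optionally preceded by z_t.

    `series` is a list indexed by trading day for a single firm.
    """
    if t < n_lags or t >= len(series):
        raise ValueError("not enough history for the requested lags")
    start = 0 if include_current else 1
    return [series[t - k] for k in range(start, n_lags + 1)]


# Made-up daily values for one firm (index = trading day).
sni = [0.10, 0.20, 0.15, 0.30, 0.25, 0.40, 0.35]
ret = [5.0, -3.0, 2.0, -8.0, 1.0, -12.0, -6.0]

t = 6
# Contemporaneous SNI plus five lags, followed by five lags of returns.
regressors = lags(sni, t, include_current=True) + lags(ret, t)
print(regressors)
```

With five lags, the first five trading days of each firm's series cannot be used as observations, since their lag vectors would be incomplete.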

[Insert Table 10 here]

As Table 10 shows, when the system is exposed to an SNI shock (that is, a high social negativity
shock), the returns on the same day are reduced by 37 basis points. This figure is statistically
significant at the 1% level. The remainder of the table shows the returns 1, 2, ..., 5
days after the SNI innovation. Importantly, we see no evidence of return reversals (or any major
changes for that matter), as suggested by the lack of statistical significance on all of the subsequent days. This
suggests that the price impact of social negativity expressed through SNI is permanent (at least for the duration of the
trading week). This finding is consistent with the findings of Tetlock, Saar-Tsechansky, and

Macskassy (2008), showing that news stories – or in this case tweets – have a permanent impact

on prices. The results offer a contrast to the findings of Antweiler and Frank (2006), who show

that news stories about firms, regardless of tone, while triggering an initial market response, are

later reversed.

7. Additional and robustness tests

7.1 Effect of the earnings announcement period

One of the aims of this paper is to show that aggregate tweeting volume and sentiment
about stocks contain stock-relevant information. The results suggest that this is the case. One
possible concern, however, is that the results may be driven by the earnings announcement period.
More specifically, it may be that aggregate tweeting contains useful information during earnings
announcement periods, but not outside of those periods.


This possibility is already addressed in the analysis of sections 5.1 and 5.2 in which dummy

variables for earnings day, week before earnings, and week after earnings are included. To address

this issue more directly, however, I conduct further analysis on the sample after removing the
earnings announcement period (earnings day, week before earnings, and week after earnings).
The results of this analysis are reported in panel B in Tables 7 and 8. This robustness test confirms
the main finding that aggregate tweeting about firms contains useful information on a daily

basis, and not only during earnings season. Solomon (2012) explains that the media plays a more

important role outside of earnings season than during earnings season because earnings season is

a time where information is already abundant. In the case of social media, it appears to play a role

both during and outside earnings season.
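The sample restriction used in this robustness test can be illustrated with a small filter. The paper does not state how a "week" is measured, so the 7-calendar-day window below is an assumption, as are the function names and the sample dates.

```python
from datetime import date


def in_earnings_window(day, earnings_dates, window_days=7):
    """True if `day` is an earnings day or within `window_days` of one.

    Assumption: a 'week' before or after earnings means 7 calendar days.
    """
    return any(abs((day - e).days) <= window_days for e in earnings_dates)


def drop_earnings_periods(firm_days, earnings_by_firm):
    """Keep only (firm, day) observations outside every earnings window."""
    return [
        (firm, day)
        for firm, day in firm_days
        if not in_earnings_window(day, earnings_by_firm.get(firm, []))
    ]


# Hypothetical firm with a single earnings announcement.
earnings_by_firm = {"AAA": [date(2017, 5, 2)]}
observations = [
    ("AAA", date(2017, 4, 24)),
    ("AAA", date(2017, 4, 25)),
    ("AAA", date(2017, 5, 2)),
    ("AAA", date(2017, 5, 9)),
    ("AAA", date(2017, 5, 10)),
]
print(drop_earnings_periods(observations, earnings_by_firm))
```

A window defined in trading days rather than calendar days would shift the boundaries slightly, so the choice should follow the definition used for the earnings dummies.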

Tetlock, Saar-Tsechansky, and Macskassy (2008) show that the majority of news stories

published about firms are clustered close to days of earnings announcements. In the case of

tweeting, while significantly more tweeting occurs during earnings season (earnings day, week

before earnings day, and week after earnings day), much tweeting still occurs outside of this

window. Specifically, in the sample set, approximately 75% of tweeting occurs outside of earnings

season, while 25% of tweeting occurs during earnings season. Unlike print media, which is

physically constrained by space availability in the publication, tweeting does not face the same

constraints – this is an important distinction between print media and social media.

7.2 Alternative definition of SNI

One of the central tenets of this paper is the definition of SNI. In all the previous

analysis, I define SNI as the number of negative tweets divided by the total number of tweets


about a firm on a given day. One possible alternative definition of SNI is:

\[
\frac{\text{Number of negative tweets} - \text{Number of positive tweets}}{\text{Total number of tweets}}
\]

This alternate definition is different in that it directly accounts for the number of positive

tweets about a firm on a given day. I repeat the analysis of Tables 7 and 8 using this alternate

definition and find that the results are not affected by this choice. The results of this analysis are

shown in the Internet Appendix Tables IA.1 and IA.2.
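For comparison, the two definitions can be placed side by side in a short sketch. The function names are hypothetical and the counts are made up for illustration.

```python
def sni(n_negative, n_total):
    """Baseline definition: negative tweets over all tweets."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_negative / n_total


def net_sni(n_negative, n_positive, n_total):
    """Alternative definition: (negative - positive) over all tweets."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_negative - n_positive) / n_total


# Made-up firm-day: 10 tweets, of which 6 negative, 2 positive, 2 neutral.
print(sni(6, 10), net_sni(6, 2, 10))
```

The baseline index lies between 0 and 1, whereas the alternative lies between -1 and 1 and turns negative on days when positive tweets outnumber negative ones.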

7.3 Alternate dictionary for identifying sentiment

In the preceding analysis, I used the Loughran and McDonald (2011) dictionary to classify
tweets as positive or negative in tone. As explained, this dictionary is particularly well suited to
classifying the sentiment of financial language, and the tweets examined here are exclusively

financial. As a robustness test, however, I replicate the analysis of Table 7 using the Harvard

Psychological dictionary instead and report the results in Internet Appendix Table IA.3. The results

are generally consistent with those illustrated in Table 7.

8. Conclusion

This paper illustrates that social negativity expressed on social media corresponds to asset

prices. In particular, the Social Negativity Index (SNI), which measures the daily aggregate

negativity about a stock expressed on social media, corresponds to negative returns. Using a VAR

analysis, I show that these results are not reversed on subsequent trading days. Unlike negativity,

social media positivity has a much weaker positive relationship to asset returns – suggesting that


markets react more to negativity than to positivity. This is consistent with the psychology
literature, which finds that humans react more to negative information than to positive information.

For the universe of stocks listed on the NYSE, NYSE American and NASDAQ, firms with

more retail ownership are tweeted about more than firms with less retail ownership, suggesting that

tweeting originates largely from the retail investor base. Moreover, firms with greater dispersion

of analysts’ forecasts are tweeted about more than firms with less dispersion. This suggests that

investors may turn to social media for discussions about stocks when analysts disagree.

Overall, the results in this paper suggest that social media has the capacity to assume many

of the roles of traditional print media. Furthermore, social media may aid in the information

production process – a function traditionally associated with financial analysts.

Looking forward, as Artificial Intelligence (AI) becomes an important emerging trend in

financial markets, social media will follow suit as a source of abundant information about financial

securities. This paper contributes to this emerging domain by showing that social media contains

useful information about financial markets. Indeed, AI has been used to monitor social media

discussions and predict acquisition activities – this trend is likely on the rise.


References

Acemoglu, D., Hassan, T., Tahoun, A., 2018. The power of the street: evidence from Egypt’s Arab

Spring. Review of Financial Studies 31(1): 1–42.

Al Guindy, M., 2016. Is corporate tweeting informative or is it just hype? Evidence from the SEC

social media regulation. Working paper.

Antweiler, W., Frank, M., 2004. Is all that talk just noise? The information content of Internet

stock message boards. Journal of Finance 59: 1259–1293.

Antweiler, W., Frank, M., 2006. Do U.S. stock markets typically overreact to corporate news
stories? Working paper, University of British Columbia.

Barber, B., Odean, T., 2008. All that glitters: The effect of attention and news on the buying
behavior of individual and institutional investors. Review of Financial Studies 21 (2): 785–818.

Bartov, E., Faurel, L., Mohanram, P., 2016. Can Twitter Help Predict Firm-Level Earnings and Stock
Returns? Working paper.

Baumeister, R., Bratslavsky, E., Finkenauer, C., Vohs, K., 2001. Bad is stronger than good.
Review of General Psychology 5, 323–370.

Blankespoor, E., Miller, G., White, H., 2014. The role of dissemination in market liquidity:

evidence from firms’ use of Twitter. The Accounting Review, 89(1), 79–112.

Bloomfield, D., 2016. Sanders’ tweet on drugmaker Ariad’s ‘Greed’ sends stock plunging
(October 14). Available at:
https://www.bloomberg.com/news/articles/2016-10-14/sanders-tweet-on-drugmaker-ariad-s-greed-sends-stock-plunging.

Bodnaruk, A., Loughran, T., McDonald, B., 2015. Using 10-k text to gauge financial constraints.

Journal of Financial and Quantitative Analysis 50 (4): 623–646.

Broniatowski D., Paul M., Dredze M., 2013. National and local influenza surveillance through

Twitter: An Analysis of the 2012-2013 Influenza Epidemic. PLoS ONE 8(12): e83672.

https://doi.org/10.1371/journal.pone.0083672

Chawla, N., Da, Z., Xu, J., Ye, M., 2015. Catching fire: the diffusion of retail attention on Twitter.

Working Paper, Notre Dame University.

Chen, H., De, P., Hu, Y., Hwang, B.H., 2014. Wisdom of crowds: the value of stock opinions
transmitted through social media. Review of Financial Studies 27, 1367–1403.

Chen, H., Hwang, B.H., Liu, B., 2016. The Economic consequences of having ‘social’ executives.

Working paper, City University of Hong Kong, Cornell University, and Florida State

University.


Da, Z., Engelberg, J., Gao, P., 2011. In search of attention. Journal of Finance 66 (5): 1461–1499.

Diamond, D., Verrecchia, R., 1991. Disclosure, liquidity, and the cost of equity capital. Journal of

Finance 46: 1325–60.

Drake, M., Roulstone, D., Thornock, J., 2012. Investor information demand: evidence from

Google searches around earnings announcements. Journal of Accounting Research 50(4):

1001–1040.

Egan, M., 2015. Hillary Clinton tweet crushes biotech stocks, CNN.com (September 22).
Available at: http://money.cnn.com/2015/09/21/investing/hillary-clinton-biotech-price-gouging.

Engelberg, J., Parsons, C., 2011. The causal impact of media in financial markets. Journal of

Finance 66 (1) 67–99.

Fama, E., French, K., 1992. The cross-section of expected stock returns. The Journal of Finance

47, 427–465.

Fang, L., Peress, J., 2009. Media coverage and the cross-section of stock returns. Journal of

Finance 64 (5): 2023–2052.

Jung, M., Naughton, J., Tahoun, A., Wang, C., 2017. Do firms strategically disseminate? Evidence

from corporate use of social media. The Accounting Review, Forthcoming.

Jung W., Kwon, Y., 1988. Disclosures when the market is unsure of information endowment of

managers. Journal of Accounting Research 26 (1): 146–153.

Loughran, T., McDonald, B., 2011. When is a liability not a liability? Textual analysis,

dictionaries, and 10-Ks. Journal of Finance 66, 35–65.

Loughran, T., McDonald, B., 2016. Textual analysis in accounting and finance: a survey. Journal

of Accounting Research (forthcoming).

Loughran, T., McDonald B., 2017. The use of EDGAR filings by investors. Journal of Behavioral

Finance 18: 231–248.

Lovelace, B., 2016. Donald Trump just took a shot at Boeing in Trump Tower, CNBC.com (December
6). Available at:
https://www.cnbc.com/2016/12/06/boeing-shares-slide-after-trump-says-air-force-ones-cost-out-of-control.html.

Petersen, M., 2009. Estimating standard errors in finance panel data sets: comparing approaches. Review
of Financial Studies 22 (1): 435–480.

Q4 Web Systems, 2013. New Q4 Whitepaper: Public Company Use of Social Media for IR – Part
1 Twitter & StockTwits (August 15). Available at:
http://www.q4blog.com/2013/08/15/new-2013-q4-whitepaper-public-company-use-of-social-media-for-ir-part-1-twitter-stocktwits/


Ram, A., Wigglesworth, R., 2017. When Silicon Valley came to Wall Street. Financial Times (October
28).

Rich, M., 2017. Trump’s Twitter warning to Toyota unsettles Japanese carmaker. New York Times

(January 6).

Rozin, P., Royzman, E., 2001. Negativity bias, negativity dominance, and contagion. Personality

and Social Psychology Review 5, 296–320.

Scannell, K., 2013. Companies allowed to tweet #USearnings. Financial Times (April 2).

Securities and Exchange Commission (SEC), 2008. Commission guidance on the use of company

websites. Release No. 34–58288. Washington, D.C.: SEC.

Securities and Exchange Commission (SEC), 2013. SEC says social media ok for company

announcements if investors are alerted. Press Release 2013–51. Washington, D.C.: SEC.

Seetharaman, D., Wells, G., 2017. Hurricane Harvey victims turn to social media for assistance.

The Wall Street Journal (August 29).

Solomon, D., 2012. Selective publicity and stock prices. Journal of Finance 67 (2): 599–637.

Tetlock, P.C., 2007. Giving content to investor sentiment: the role of media in the stock market.

Journal of Finance 62, 1139–1168.

Tetlock, P.C., Saar-Tsechansky, M., Macskassy, S., 2008. More than words: quantifying language

to measure firms’ fundamentals. Journal of Finance, 63 (3), 1437–1467.

Verrecchia, R., 1983. Discretionary disclosure. Journal of Accounting and Economics 5 (3), 179–194.

Wang, C., 2016. Biotech takes a hit after Clinton tweets about EpiPen pricing, CNBC.com (August
24). Available at:
https://www.cnbc.com/2016/08/24/biotech-gains-amid-buyout-chatter-upbeat-clinical-trial-results.html.

Wang, C., 2016b. Lockheed Martin shares take another tumble after Trump tweet, CNBC.com
(December 22). Available at:
https://www.cnbc.com/2016/12/22/lockheed-martin-shares-take-another-tumble-after-trump-tweet.html.


Appendix A: Regression variable definitions and data sources

| Variable | Definition | Source |
|---|---|---|
| Panel A: Dependent Variables | | |
| Return | Daily return on company’s common share calculated on a 24-hour basis | CRSP |
| Trading volume | Natural logarithm of the number of shares traded | CRSP |
| Panel B: Control Variables | | |
| Beta | The result of the regression of firms’ monthly excess return on the excess return of the CRSP value-weighted portfolio using a 60-month rolling window defined in June of each year. Excess return is defined as the monthly return above the one-month treasury bill. | Author’s calculation from CRSP returns data |
| Book to market ratio | The ratio of book value of equity to the market value of equity. The book value is defined as: [the book value of shareholders’ equity + deferred taxes and investment tax credit - book value of preferred stocks] | Author’s calculation from COMPUSTAT data |
| Leverage | The ratio of the firm’s long term debt to the total assets of the firm | Author’s calculation from COMPUSTAT data |
| Ln (Size) | The natural logarithm of the market value of the firm’s equity (in millions of dollars). | Author’s calculation from COMPUSTAT data |
| Analyst following | The number of analysts providing one-year EPS estimates for the stock | Author’s calculation from I/B/E/S data |
| Dispersion of forecasts | The dispersion of analyst forecasts is the standard deviation of analysts’ one-year ahead forecasts scaled by the mean of estimates | Author’s calculation from I/B/E/S data |
| Institutional ownership | The total percentage of the company’s shares that are held by institutional investors | Author’s calculation from Thomson Reuter’s 13F |
| Payout | This is the ratio of the firm’s net income paid as dividends. Defined as common dividends/net income | Author’s calculation from COMPUSTAT |


Appendix A (Continued)

| Variable | Definition | Source |
|---|---|---|
| Industry | The industry membership of the firm in one of the Fama French 48 industry classifications | Determined from CRSP historical SIC codes and Kenneth French’s website (to convert SIC to FF 48) |
| Market return | The average daily value-weighted market return (vwretd) | CRSP |
| VIX | CBOE S&P 500 Volatility Index | CBOE Indexes |
| Earnings day | A binary variable that takes the value of 1 on a firm’s earnings announcements day | Author’s calculation from Compustat |
| Week before earnings | A binary variable that takes the value of 1 for the week prior to a firm’s earnings announcement day | Author’s calculation from Compustat |
| Week after earnings | A binary variable that takes the value of 1 for the week following a firm’s earnings announcement | Author’s calculation from Compustat |
| Panel C: Twitter Variables | | |
| Tweeting volume | The natural logarithm of the number of tweets about a firm on a given day | Twitter API/author’s calculation |
| Social Negativity Index (SNI) | Number of negative tweets about a stock on a given trading day divided by the total number of tweets for that day (bounded by 0 and 1). | Twitter API/author’s calculation |
| Tweeting on previous day | A binary variable that takes the value of 1 if the firm tweeted on the previous day | Twitter API/author’s calculation |
| Industry tweeting | A unit variable that represents the proportion of tweeting firms from a given industry on a given day (excluding the given firm) | Author’s calculation |


Figure 1: Tweeting distribution by day of the week. This figure shows the breakdown of financial

tweets (in percentages) by day of the week in the sample period. The tweets are those that strictly

discuss financial information.

[Figure 1 chart. Horizontal axis: Day of the week (Sunday through Saturday). Vertical axis: % of all tweets, scaled from 0% to 20%.]


Figure 2: Tweeting distribution by hour of day. This figure shows the breakdown of financial tweets

(in percentages) by hour of day. The sample includes a total of 12,440,121 financial tweets collected

between January 1st 2017 and October 1st 2017. The tweets are those that strictly discuss financial

information.

[Figure 2 chart. Horizontal axis: Hour, in one-hour bins from 0-1 through 23-24. Vertical axis: % of all tweets, scaled from 0% to 9%.]


Table 1

Twitter sample descriptive statistics

This table provides general descriptive statistics for the sample of tweets. The tweets include all

financial tweets in which a Twitter user mentions a firm’s stock using the $ symbol and the stock

ticker, indicating that the tweet strictly discusses a firm’s stock. Firms are those listed on the NYSE,

AMEX, and NASDAQ.

| Item | Value |
|---|---|
| Sample period | January 1st, 2017 - December 31st, 2017 |
| Number of tweets | 18,319,583 |
| Number of firms covered | 2,292 |
| Number of unique tweeters | 1,021,106 |
| Number of tweets classified as positive in tone | 1,956,880 |
| Number of tweets classified as negative in tone | 2,092,904 |


Table 2

Modes of tweeting financial information

This table shows the mode of communication used by Twitter users to tweet financial information.

The Twitter system records the device/method used by the tweeting user. The tweets include all

financial tweets in which a Twitter user mentions a firm’s stock using the $ symbol and the stock

ticker, indicating that the tweet strictly discusses a firm’s stock. Firms are those listed on the NYSE,

AMEX, and NASDAQ.

| Mode | % of all tweets |
|---|---|
| Twitter website | 24.75% |
| IFTTT (web-based service) | 16.23% |
| Twitter for iPhone | 12.15% |
| Twitter for Android | 8.92% |


Table 3

Top financial tweeting users’ languages

This table depicts the languages used by Twitter users tweeting financial information about stocks

listed on the NYSE, AMEX, or NASDAQ. A Twitter user indicates their language when they sign

up for a Twitter account and this data is summarized below for the dataset used in this paper.

| Language | % of all tweets |
|---|---|
| English | 92.28% |
| Russian | 1.39% |
| Spanish | 1.30% |
| French | 0.79% |
| German | 0.59% |
| Dutch | 0.57% |
| Portuguese | 0.49% |


Table 4

Number of tweets for a sample of firms in the dataset

This table depicts the number of tweets in which a firm is mentioned in a financial tweet for a subset of

firms within the sample period. A tweet is identified to belong to a firm when it contains the $ symbol and

the firm ticker (e.g. $AAPL). This signifies that the tweet strictly discusses the stock of the firm.

Firm              Number of tweets
Apple Inc.        556,499
Amazon Inc.       458,891
Twitter Inc.      297,282
Nvidia Corp.      221,653
Netflix Inc.      203,608
IBM               84,170
General Motors    71,934
Starbucks         58,002
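
The cashtag rule described in the caption of Table 4 (a $ symbol followed by the firm ticker, e.g. $AAPL) can be illustrated with a minimal sketch. The regular expression, the function names, and the sample tweets below are assumptions made for illustration only; the paper does not describe its matching code, and tickers with punctuation (such as share classes) are not handled here.

```python
import re
from collections import Counter

# Assumed pattern: "$" followed by 1-5 uppercase letters that are not part
# of a longer alphanumeric token. This is an illustration, not the paper's rule.
CASHTAG = re.compile(r"(?<![A-Za-z0-9_])\$([A-Z]{1,5})(?![A-Za-z0-9])")


def extract_tickers(tweet):
    """Return the set of tickers mentioned as cashtags in one tweet."""
    return set(CASHTAG.findall(tweet))


def count_tweets_per_ticker(tweets):
    """Count tweets per ticker; a tweet counts once per ticker it mentions."""
    counts = Counter()
    for tweet in tweets:
        counts.update(extract_tickers(tweet))
    return counts


# Made-up sample tweets.
sample = [
    "$AAPL earnings beat, also watching $AMZN",
    "Long $AAPL $AAPL into the close",   # repeated cashtag, counted once
    "Paid $20 for lunch",                # a dollar amount, not a cashtag
    "apple is great",                    # no cashtag, so not counted
]
tweet_counts = count_tweets_per_ticker(sample)
```

In this sketch a tweet that names a company without the cashtag is ignored, which mirrors the caption's point that the $ prefix marks a tweet as being about the firm's stock.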


Table 5

Determinants of tweeting about a firm given firm characteristics

This table depicts the predictability of the volume of tweets about a firm given lagged firm characteristics. The sample

covers all firms listed on NYSE, AMEX, and NASDAQ. The dependent variable of the regression takes the value of

100*ln(total number of tweets about a firm). The independent variables are previous year’s parameters: Beta, which

is the CAPM beta; B/M represents the book to market ratio of equity; Size is the natural logarithm of the market value

of equity; Leverage is the leverage ratio of the firm; Payout is the payout ratio; Institution is the percentage of shares

held by institutional investors; Analysts is the number of analysts following the firm. Dispersion is the standard

deviation of analyst forecasts scaled by the absolute value of the mean of forecasts in percentage points. Fama and

French 48 industry fixed effects are also included. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels

respectively. Standard errors are reported in parentheses.

                          Number of tweets about firm
Beta                      0.02**
                          (0.01)
B/M                       0.01
                          (0.02)
Size                      0.09***
                          (0.004)
Leverage                  0.05
                          (0.03)
Payout                    -0.05*
                          (0.03)
Institution               -0.64***
                          (0.11)
Analysts                  4.25***
                          (0.29)
Dispersion                1.04***
                          (0.40)
Industry fixed effects    Included
Adjusted R2               0.65
N                         1491
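
Two of the variable definitions in the caption of Table 5 are simple enough to sketch in code: Dispersion (standard deviation of analyst forecasts scaled by the absolute value of the mean forecast, in percentage points) and the dependent variable 100*ln(total number of tweets). The sketch below is illustrative only. It assumes the sample standard deviation, which the caption does not specify, and the forecast values and tweet count are made up.

```python
import math
import statistics


def forecast_dispersion(forecasts):
    """Std. dev. of forecasts scaled by |mean forecast|, in percentage points.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(forecasts) < 2:
        return None  # undefined with fewer than two forecasts
    mean = statistics.mean(forecasts)
    if mean == 0:
        return None  # scaling by a zero mean is undefined
    return 100.0 * statistics.stdev(forecasts) / abs(mean)


def tweet_volume_dependent(total_tweets):
    """Dependent variable as defined in the caption: 100 * ln(total tweets)."""
    return 100.0 * math.log(total_tweets)


eps_forecasts = [1.00, 1.10, 0.90]   # made-up analyst EPS forecasts
dispersion = forecast_dispersion(eps_forecasts)
dep_var = tweet_volume_dependent(1000)   # made-up tweet count
```

With these made-up inputs the forecasts have a mean of 1.00 and a sample standard deviation of 0.10, so the dispersion measure is about 10 percentage points.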


Table 6

Determinants of tweeting volume about a firm on a given day

This table documents the predictability of the number of tweets about a given firm on a trading day. The dependent

variable is the natural logarithm of the number of tweets about a firm. Estimates are from a panel regression with firm

fixed effects. Firm’s Return t-1 is the firm’s return on the previous trading day expressed in percentage points. Market

return t-1 is the market return on the previous trading day expressed in percentage points. VIX t-1 is the previous day’s

volatility index. Earnings day is the day of the firm’s earnings announcement; Week Before Earnings is the week prior

to the firm’s earnings announcement. Week After Earnings is the week after the firm’s earnings announcement.

Tweeting on previous day is the natural logarithm of the firm’s number of tweets on the previous trading day. Industry

tweeting represents the natural logarithm of the average number of tweets per firm in the same industry on a given

day. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively. Standard errors are reported

in parentheses.

                            Number of tweets about firm
Firm’s Return t-1           0.02***
                            (0.001)
Market return t-1           0.01
                            (0.03)
VIX t-1                     0.01
                            (0.01)
Earnings Day                0.83***
                            (0.03)
Week Before Earnings        0.04*
                            (0.02)
Week After Earnings         0.36***
                            (0.02)
Tweeting on previous day    0.30***
                            (0.01)
Industry tweeting           0.49***
                            (0.03)
Adjusted R2                 0.64
N                           459620
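
Because the dependent variable in Table 6 is the natural logarithm of the number of tweets, the coefficient on an indicator such as Earnings Day can be read as an approximate percentage change in tweet counts, holding the other regressors fixed. The short sketch below shows that standard conversion. The function name is hypothetical, and the interpretation is a reading aid rather than a result reported in the paper.

```python
import math


def pct_change_from_log_coef(coef):
    """Convert a coefficient on a dummy in a log-outcome regression
    into the implied percentage change in the outcome: 100 * (e^coef - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


# Coefficients taken from Table 6.
earnings_day_effect = pct_change_from_log_coef(0.83)
week_after_effect = pct_change_from_log_coef(0.36)
```

Under this conversion, the Earnings Day coefficient of 0.83 corresponds to roughly 129% more tweets than on an otherwise comparable day, and the Week After Earnings coefficient of 0.36 to roughly 43% more.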


Table 7

Returns, tweeting volume and Social Negativity Index (SNI)

This table documents the results of the panel regression of returns (in basis points) on the Social Negativity Index

(SNI), and tweeting volume about a given firm. Social Negativity Index (SNI) is the proportion of tweets with negative

sentiment about a firm on a given day. Tweeting volume is the natural logarithm of the number of tweets about a firm

on a given day. Control variables used but not shown in the table are: Lag_return is the firm’s previous day’s return.

Earnings day is the day of the firm’s earnings announcement; Week before earnings is the week prior to the firm’s

earnings announcement. Week after earnings is the week after the firm’s earnings announcement. Models 1, 2, and 3

also include the volatility index (VIX), market return on the previous day, and day of the week fixed effects. Firm

fixed effects are used in all the models, and models 4-6 include day fixed effects. Panel A shows the results for the

full sample while Panel B shows the result for the sample excluding earnings season. Standard errors, in parentheses,

are clustered by firm and day. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively.

Panel A: Full sample

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)         (5)         (6)
Social Negativity Index (SNI)   -35.52***   -----       -36.86***   -36.72***   -----       -38.62***
                                (2.96)                  (3.11)      (2.56)                  (2.72)
Tweeting volume                 -----       25.60***    25.70***    -----       26.18***    26.33***
                                            (1.42)      (1.42)                  (1.24)      (1.24)
Controls                        Included    Included    Included    Included    Included    Included
Day fixed effects               No          No          No          Included    Included    Included
Firm fixed effects              Included    Included    Included    Included    Included    Included
R2                              0.055       0.060       0.061       0.070       0.075       0.075
N                               461632      461632      461632      461632      461632      461632

Panel B: Sample excluding earnings period

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)         (5)         (6)
Social Negativity Index (SNI)   -27.98***   -----       -29.28***   -29.68***   -----       -31.60***
                                (3.02)                  (3.13)      (2.67)                  (2.77)
Tweeting volume                 -----       25.00***    25.08***    -----       26.36***    26.49***
                                            (1.50)      (1.50)                  (1.33)      (1.33)
Controls                        Included    Included    Included    Included    Included    Included
Day fixed effects               No          No          No          Included    Included    Included
Firm fixed effects              Included    Included    Included    Included    Included    Included
R2                              0.063       0.068       0.068       0.079       0.084       0.085
N                               377064      377064      377064      377064      377064      377064
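
The caption of Table 7 defines the Social Negativity Index (SNI) as the proportion of tweets with negative sentiment about a firm on a given day. A minimal sketch of that firm-day calculation follows. Everything about the classification step is an assumption: the tiny word list is a hypothetical stand-in for a sentiment dictionary, and the rule that a tweet is negative if it contains any listed word is chosen only to keep the example short. The tickers, dates, and tweets are made up.

```python
from collections import defaultdict

# Hypothetical stand-in for a negative word list; not the paper's dictionary.
NEGATIVE_WORDS = {"loss", "decline", "lawsuit", "weak"}


def is_negative(tweet_text):
    """Assumed rule: a tweet is negative if it contains any listed word."""
    tokens = {tok.strip(".,!?;:").lower() for tok in tweet_text.split()}
    return bool(tokens & NEGATIVE_WORDS)


def social_negativity_index(tweets):
    """Map (ticker, date) to the share of that firm-day's tweets flagged negative."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for ticker, date, text in tweets:
        key = (ticker, date)
        totals[key] += 1
        negatives[key] += is_negative(text)   # True counts as 1
    return {key: negatives[key] / totals[key] for key in totals}


# Made-up (ticker, date, text) records.
tweets = [
    ("AAPL", "2017-01-03", "$AAPL facing a lawsuit over patents"),
    ("AAPL", "2017-01-03", "$AAPL new phone looks great"),
    ("AAPL", "2017-01-03", "$AAPL weak guidance, expecting a decline"),
    ("AAPL", "2017-01-03", "$AAPL holding steady"),
    ("IBM", "2017-01-03", "$IBM cloud revenue up"),
]
sni = social_negativity_index(tweets)
```

In this example two of the four AAPL tweets are flagged, giving an SNI of 0.5 for that firm-day. Read against Panel A, model (1), and taking the linear specification at face value, an SNI of 0.5 rather than 0 would be associated with returns lower by about 0.5 × 35.52 ≈ 17.8 basis points.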


Table 8

Trading volume, tweeting volume and Social Negativity Index (SNI)

This table documents the results of the panel regression of trading volume, defined as the natural logarithm of the

number of shares traded, on the Social Negativity Index (SNI), and tweeting volume about a given firm. Tweeting

volume is the natural logarithm of the number of tweets about a firm on a given day. Social Negativity Index (SNI) is

the proportion of tweets with negative sentiment about a firm on a given day. Control variables used but not shown in the table are: Lag_return,

which is the firm’s previous day’s return, Lag trading volume, which is the trading volume of the firm’s stocks on the

previous day. Earnings day is the day of the firm’s earnings announcement; Week before earnings is the week prior

to the firm’s earnings announcement. Week after earnings is the week after the firm’s earnings announcement. Models

1, 2, and 3 also include the volatility index (VIX), market return on the previous day, and day of the week fixed effects.

Firm fixed effects are used in all the models, and models 4-6 include day fixed effects. Panel A shows the results for

the full sample while Panel B shows the result for the sample excluding earnings season. Standard errors, in

parentheses, are clustered by firm and day. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels

respectively.

Panel A: Full sample

Dependent variable: Trading volume
                                (1)         (2)         (3)         (4)         (5)         (6)
Social Negativity Index (SNI)   0.05***     -----       0.04***     0.05***     -----       0.04***
                                (0.01)                  (0.01)      (0.01)                  (0.01)
Tweeting volume                 -----       0.19***     0.19***     -----       0.21***     0.21***
                                            (0.01)      (0.01)                  (0.01)      (0.01)
Controls                        Included    Included    Included    Included    Included    Included
Day fixed effects               No          No          No          Included    Included    Included
Firm fixed effects              Included    Included    Included    Included    Included    Included
R2                              0.90        0.91        0.91        0.91        0.91        0.91
N                               460598      460598      460598      460598      460598      460598

Panel B: Sample excluding earnings period

Dependent variable: Trading volume
                                (1)         (2)         (3)         (4)         (5)         (6)
Social Negativity Index (SNI)   0.05***     -----       0.04***     0.05***     -----       0.04***
                                (0.01)                  (0.01)      (0.01)                  (0.01)
Tweeting volume                 -----       0.17***     0.17***     -----       0.19***     0.19***
                                            (0.01)      (0.01)                  (0.01)      (0.01)
Controls                        Included    Included    Included    Included    Included    Included
Day fixed effects               No          No          No          Included    Included    Included
Firm fixed effects              Included    Included    Included    Included    Included    Included
R2                              0.90        0.91        0.91        0.91        0.91        0.91
N                               376177      376177      376177      376177      376177      376177


Table 9

Returns, tweeting volume and Social Positivity

This table documents the results of the panel regression of returns (in basis points) on the Social Positivity Index, and

tweeting volume about a given firm. Social Positivity Index is the proportion of tweets with positive sentiment about

a firm on a given day. Tweeting volume is the natural logarithm of the number of tweets about a firm on a given day.

Control variables used but not shown in the table are: Lag_return is the firm’s previous day’s return. Earnings day is

the day of the firm’s earnings announcement; Week before earnings is the week prior to the firm’s earnings

announcement. Week after earnings is the week after the firm’s earnings announcement. Models 1, 2, and 3 also

include the volatility index (VIX), market return on the previous day, and day of the week fixed effects. Firm fixed

effects are used in all the models, and models 4-6 include day fixed effects. Panel A shows the results for the full

sample while Panel B shows the result for the sample excluding earnings season. Standard errors, in parentheses, are

clustered by firm and day. ***, **, * denote statistical significance at the 1%, 5%, and 10% levels respectively.

Panel A: Full sample

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Positivity Index         3.15        7.92**      3.88*       9.45***
                                (3.35)      (3.43)      (2.26)      (2.29)
Tweeting volume                 -----       25.68***    -----       26.29***
                                            (1.42)                  (1.23)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.055       0.060       0.070       0.075
N                               461632      461632      461632      461632

Panel B: Sample excluding earnings period

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Positivity Index         1.52        5.40        1.42        6.27**
                                (3.49)      (3.65)      (2.44)      (2.46)
Tweeting volume                 -----       25.04***    -----       26.43***
                                            (1.50)                  (1.32)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.062       0.068       0.078       0.084
N                               377064      377064      377064      377064


Table 10

Vector autoregression of returns and Social Negativity Index (SNI)

This table reports estimates from panel vector autoregressions: $y_{it} = \alpha_i + \sum_{k=1}^{5} \beta_k y_{i,t-k} + \beta_6 \mathit{Exog}_{it} + \varepsilon_{it}$. The

coefficients are obtained using system GMM estimations. The dependent variables are returns and Social Negativity

Index (SNI). SNI is calculated as the proportion of tweets of negative sentiment about a firm on a given day relative

to the body of tweets about the firm. The model focuses on the effect on returns due to a shock in Social Negativity Index

(SNI). Exogenous variables used (but not listed) are: ln_tweetCount is the natural logarithm of the number of tweets

about a firm (including five lags). Market return is the daily market return (including five lags); VIX is the volatility

index; Earnings day is the day of earnings announcement; Week before earnings and Week after earnings are the week

before and after earnings announcement. Day of the week fixed effects are also included. ***, **, * denote statistical

significance at the 1%, 5% and 10% levels respectively. Standard errors are reported in parentheses.

Social Negativity Index (SNI)    Dep. variable: Returns (basis points)
Tweeting Day t                   -37.32***
                                 (2.26)
Tweeting Day t-1                 0.55
                                 (2.27)
Tweeting Day t-2                 0.19
                                 (2.28)
Tweeting Day t-3                 -2.69
                                 (2.27)
Tweeting Day t-4                 -0.94
                                 (2.27)
Tweeting Day t-5                 0.19
                                 (2.25)
N                                459633


Internet Appendix

to the paper

Towards a Social Negativity Index: Giving Content to Financial

Tweeting


Table IA. 1

Returns, tweeting volume and alternate Social Negativity Index (SNI) definition

This table documents the results of the panel regression of returns (in basis points) on the Social Negativity Index

(SNI), and tweeting volume about a given firm. In this table, the Social Negativity Index (SNI) is defined

as [(number of negative tweets about a firm on a given day – number of positive tweets about a firm on a given

day)/Total number of tweets about a firm on a given day]. Tweeting volume is the natural logarithm of the number of

tweets about a firm on a given day. Control variables used but not shown in the table are: Lag_return is the firm’s

previous day’s return. Earnings day is the day of the firm’s earnings announcement; Week before earnings is the week

prior to the firm’s earnings announcement. Week after earnings is the week after the firm’s earnings announcement.

Models 1, 2, and 3 also include the volatility index (VIX), market return on the previous day, and day of the week

fixed effects. Firm fixed effects are used in all the models, and models 4-6 include day fixed effects. Panel A shows

the results for the full sample while Panel B shows the result for the sample excluding earnings season. Standard

errors, in parentheses, are clustered by firm and day. ***, **, * denote statistical significance at the 1%, 5%, and 10%

levels respectively.

Panel A: Full sample

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   -17.20***   -19.91***   -18.14***   -21.45***
(alternate definition)          (2.03)      (2.12)      (1.57)      (1.63)
Tweeting volume                 -----       25.84***    -----       26.51***
                                            (1.42)                  (1.24)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.055       0.061       0.070       0.075
N                               461632      461632      461632      461632

Panel B: Sample excluding earnings period

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   -12.94***   -15.24***   -13.71***   -16.72***
(alternate definition)          (2.16)      (2.26)      (1.71)      (1.74)
Tweeting volume                 -----       25.17***    -----       26.62***
                                            (1.50)                  (1.32)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.062       0.068       0.079       0.084
N                               377064      377064      377064      377064
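
The alternate definition used in Table IA. 1, (number of negative tweets minus number of positive tweets) divided by the total number of tweets about a firm on a given day, can be contrasted with the baseline proportion-of-negative-tweets definition in a short sketch. The function names and the counts below are made up for illustration, and how tweets are classified as negative or positive is outside the scope of the example.

```python
def baseline_sni(n_negative, n_total):
    """Baseline definition: share of a firm-day's tweets that are negative."""
    if n_total <= 0:
        return None  # no tweets, so the index is undefined
    return n_negative / n_total


def alternate_sni(n_negative, n_positive, n_total):
    """Alternate definition: (negative - positive) / total for a firm-day."""
    if n_total <= 0:
        return None
    return (n_negative - n_positive) / n_total


# Made-up firm-day: 4 tweets, of which 2 negative, 1 positive, 1 neutral.
base = baseline_sni(2, 4)
alt = alternate_sni(2, 1, 4)
```

The baseline measure lies between 0 and 1, while the net measure lies between -1 and 1 and turns negative when positive tweets outnumber negative ones, so coefficients on the two versions are not directly comparable in magnitude.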


Table IA. 2

Trading volume, tweeting volume and alternate Social Negativity Index (SNI) definition

This table documents the results of the panel regression of trading volume, defined as the natural logarithm of the

number of shares traded, on the Social Negativity Index (SNI), and tweeting volume about a given firm. Tweeting

volume is the natural logarithm of the number of tweets about a firm on a given day. In this table, the Social Negativity

Index (SNI) is defined as [(number of negative tweets about a firm on a given day – number of positive

tweets about a firm on a given day)/Total number of tweets about a firm on a given day]. Control variables used but not shown in the

table are: Lag_return, which is the firm’s previous day’s return, Lag trading volume, which is the trading volume of

the firm’s stocks on the previous day. Earnings day is the day of the firm’s earnings announcement; Week before

earnings is the week prior to the firm’s earnings announcement. Week after earnings is the week after the firm’s

earnings announcement. Models 1, 2, and 3 also include the volatility index (VIX), market return on the previous day,

and day of the week fixed effects. Firm fixed effects are used in all the models, and models 4-6 include day fixed

effects. Panel A shows the results for the full sample while Panel B shows the result for the sample excluding earnings

season. Standard errors, in parentheses, are clustered by firm and day. ***, **, * denote statistical significance at the

1%, 5%, and 10% levels respectively.

Panel A: Full sample

Dependent variable: Trading volume
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   0.05***     0.04***     0.05***     0.03***
(alternate definition)          (0.01)      (0.01)      (0.01)      (0.004)
Tweeting volume                 -----       0.19***     -----       0.21***
                                            (0.01)                  (0.01)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.90        0.91        0.91        0.91
N                               460598      460598      460598      460598

Panel B: Sample excluding earnings period

Dependent variable: Trading volume
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   0.05***     0.04***     0.05***     0.03***
(alternate definition)          (0.01)      (0.01)      (0.01)      (0.005)
Tweeting volume                 -----       0.17***     -----       0.19***
                                            (0.01)                  (0.01)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.90        0.91        0.91        0.91
N                               376177      376177      376177      376177


Table IA. 3

Returns, tweeting volume using Harvard Psychological Dictionary

This table documents the results of the panel regression of returns (in basis points) on the Social Negativity Index

(SNI), and tweeting volume about a given firm. In this table, the Social Negativity Index (SNI) is defined

as [(number of negative tweets about a firm on a given day – number of positive tweets about a firm on a given

day)/Total number of tweets about a firm on a given day]. Negative tweets are defined using the Harvard Psychological

Dictionary rather than the Loughran and McDonald Dictionary. Tweeting volume is the natural logarithm of the

number of tweets about a firm on a given day. Control variables used but not shown in the table are: Lag_return is the

firm’s previous day’s return. Earnings day is the day of the firm’s earnings announcement; Week before earnings is

the week prior to the firm’s earnings announcement. Week after earnings is the week after the firm’s earnings

announcement. Models 1, 2, and 3 also include the volatility index (VIX), market return on the previous day, and day

of the week fixed effects. Firm fixed effects are used in all the models, and models 4-6 include day fixed effects.

Panel A shows the results for the full sample while Panel B shows the result for the sample excluding earnings season.

Standard errors, in parentheses, are clustered by firm and day. ***, **, * denote statistical significance at the 1%, 5%,

and 10% levels respectively.

Panel A: Full sample

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   -4.59*      -6.07**     -6.63**     -7.38***
(alternate definition)          (2.76)      (2.76)      (2.55)      (2.53)
Tweeting volume                 -----       25.62***    -----       26.19***
                                            (1.42)                  (1.24)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.055       0.060       0.070       0.075
N                               461632      461632      461632      461632

Panel B: Sample excluding earnings period

Dependent variable: Returns (basis points)
                                (1)         (2)         (3)         (4)
Social Negativity Index (SNI)   -12.94***   -15.24***   -13.71***   -16.72***
(alternate definition)          (2.16)      (2.26)      (1.71)      (1.74)
Tweeting volume                 -----       25.17***    -----       26.62***
                                            (1.50)                  (1.32)
Controls                        Included    Included    Included    Included
Day fixed effects               No          No          Included    Included
Firm fixed effects              Included    Included    Included    Included
R2                              0.062       0.068       0.079       0.084
N                               377064      377064      377064      377064