
Media Network Based Investors’ Attention: A Powerful Predictor of Market Premium

Li Guo

Singapore Management University

Lin Peng

Baruch College

Yubo Tao


Jun Tu∗


April, 2018

∗Send correspondence to Jun Tu, Lee Kong Chian School of Business, Singapore Management University, Singapore 178899; Telephone: (+65) 6828 0764. E-mail: [email protected]. Jun Tu acknowledges that the study was funded through a research grant from Sim Kee Boon Institute for Financial Economics. The usual disclaimer applies. A previous version of this paper has been circulated under the title, “Media Network and Return Predictability”.


Abstract

Studies of equity premium predictability mostly examine information-based predictors, such as traditional fundamental economic variables (hard information) and, more recently, news tones (soft information). However, investor attention is largely ignored in the equity premium forecasting literature, even though it is crucial to how information is incorporated into stock prices. In this study, we propose an investor-attention-based predictor, the media attention index (MAI), constructed using the media news network. We show that the MAI can forecast the market premium significantly and outperforms various information-based predictors.

JEL Classification: G11, G12, G41.

Keywords: Investor Attention; Media Network; Return Predictability; News Sentiment

Among the numerous studies of stock market return predictability, almost all concern information-based predictors, mostly using hard information (e.g., fundamental economic variables in Goyal and Welch (2007)) and, more recently, soft information (e.g., news tones in Tetlock (2007)). However, without investors’ attention, information per se cannot move stock prices. Given that investors’ attention has been documented as an important driving force of stock returns in the recent literature, it is surprising that its impact on market premium forecasting has not been investigated. To our knowledge, this study is the first to apply a media news network to construct an investor-attention-based predictor, the media attention index (MAI), for forecasting the market premium.

There is evidence suggesting that attention is a scarce resource. An investor may choose to invest in a limited number of stocks and then pay attention only to information about the stocks they hold. However, when a news article mentions multiple stocks, including stocks the investor holds, the stocks mentioned by the article but not yet held by the investor (labelled connected stocks) are likely to grab the investor's attention as well.1

Because attention is paid to the connected stocks (but not to the unconnected stocks), the investor may start to react to information about the connected stocks, whether from these news articles or from other sources, while still ignoring information about the unconnected stocks. Moreover, because of the short-sales constraint, investors react more easily to long signals than to short signals, so more good information than bad information is incorporated into the prices of connected stocks. As a consequence, the prices of connected stocks are pushed above their fair level by the attention generated from news co-occurrence. In summary, the more frequently stocks are co-mentioned by media news, the more attention investors pay to connected stocks, and the higher the probability that connected stocks are overvalued. By aggregating across all the stocks in the market for a given

1 Investors’ attention can be drawn to a set of stocks mentioned by news (e.g., Barber and Odean (2007) and Yu (2015)).


time period, such as one month, we formulate a monthly MAI using the adjacency matrix from network theory to gauge the time-varying aggregate investor attention generated by news co-occurrence.

Empirically, we show that our proposed investor-attention-based predictor, MAI, forecasts the market premium with a significantly negative coefficient and monthly in-sample and out-of-sample $R^2$s of 3.01% and 3.36%, respectively. In addition, our findings

are statistically as well as economically significant even when we control for different

hard information, soft information and alternative attention proxies, including eco-

nomic predictors used in Goyal and Welch (2008), sentiment indices of Baker and

Wurgler (2006) and Huang et al. (2014), media coverage (Fang and Peress (2009)),

google search index (Google Search) following Da, Engelberg, and Gao (2011a), the

52-week high (PrcHigh) following George and Hwang (2004), change of average num-

ber of analysts aggregated from individual S&P500 stocks using equal weight or value

weight (∆ # of AnalystsEW or ∆ # of AnalystsVW ) and news tone measures based

on Loughran and McDonald (2011) dictionary (Engelberg (2008), Gurun and Butler

(2012), Hillert, Jacobs, and Muller (2014), Solomon, Soltes, and Sosyura (2014) and

Tetlock et al. (2008)). In fact, the MAI outperforms most of the existing predictors both in-sample and out-of-sample. We then examine the performance of the MAI in predicting returns during recession and expansion periods, and find that it obtains larger and positive $R^2$s in both periods compared with alternative predictors. Moreover, the MAI shows significant return predictability only when investors’ beliefs are highly divergent and the short-sales constraint is tight. This is consistent with the intuition that mispricing is more pronounced when belief divergence is high and the short-sales constraint is tight.

We further verify the news attention channel by predicting cross-sectional portfolio returns and find that more frequent news co-occurrence is followed by lower returns. Indeed, a long-short portfolio based on the media attention index generates a 0.74% monthly return with a monthly Sharpe ratio of 0.14. Moreover, conventional risk models such as the CAPM, the Fama-French (1993) three-factor model, and the Carhart (1997) four-factor model are unable to explain


the alphas generated by our media attention index.

In addition, we try to identify the fundamental source of the MAI. We check the average correlation of Google searches (or Bloomberg searches) between stock pairs and find that the more news articles mention two firms, the higher the correlation of Google searches (or Bloomberg searches) between the two firms’ stocks. This may provide direct evidence supporting the investor attention interpretation of the MAI.

Lastly, we study the role of the centrality score and value weight in the attention effect. In a media network, a stock attracts investors’ attention from its connected stocks, but the attracted attention does not load equally on those stocks. Our results reveal that a stock with a low centrality score (small size) tends to be more affected by this connection, and the effect is stronger when the stock is connected to a high-centrality stock (big stock) than when it is connected to a low-centrality stock (small stock). In particular, a long-short portfolio based on the number of connected news items earns a significant excess return of 1.40% (1.98%) using stocks with low centrality scores (small size) that are connected to high-centrality stocks (big stocks). For the remaining types of stocks, we do not find such strong results, suggesting that media-connection-induced investor attention mainly affects a specific type of stock rather than all stocks in the market.

Our paper sheds new light on a different aspect of investor attention. Peng and Xiong (2006) document that, owing to limited attention, investors tend to process more market information than firm-specific information, which generates important features of return co-movement. In follow-up work, Peng et al. (2007) show that limited attention combined with attention shifts can explain time-varying asset co-movement. In terms of media attention, Odean (1999) and Barber and Odean (2007) find that individual investors are more likely to trade stocks that have grabbed their attention, because limited attention constrains their search for what to trade, especially when buying. Fang and Peress (2009) and Fang et al. (2014) further examine cross-sectional return predictability and mutual funds’ trading and performance using media coverage as a proxy for attention-grabbing events, and they also


find evidence that both individual and institutional investors are subject to limited attention. Unlike those papers, we construct an efficient proxy for investor attention from the formation of the media network and apply it to return predictability.

We also contribute to the literature on the media’s role in return predictability. In past decades, this literature has mainly examined how the pessimistic tone of news content is associated with stock prices. Tetlock (2007) shows that linguistic tone, especially negative tone, can predict market excess returns. Tetlock et al. (2008) further explore the cross-sectional predictability of returns by processing firm-specific news. Similarly, Zhang et al. (2016) document a sector-specific reaction based on their distilled sentiment measure. Jegadeesh and Wu (2013) improve on Tetlock (2007) by using a term-weighting method of content analysis based on OLS and Naïve Bayes, and they also find significant return predictability from news articles. Unlike this literature, which focuses on extracting investor sentiment between the lines, our indices take into account connected news coverage, and this connectivity is shown to have powerful in-sample and out-of-sample predictive power for market returns.

Lastly, we contribute to the literature on the application of network analysis in financial studies. Cohen and Frazzini (2008) and Menzly and Ozbas (2010) find that economic links among certain individual firms and industries contribute to cross-firm and cross-industry return predictability. They interpret their results as evidence of gradual information diffusion across economically connected firms, in line with the theoretical model of Hong et al. (2007). Rapach et al. (2015) investigate the predictability of industry returns based on a wide range of industrial interdependencies. Unlike this literature, our paper is the first to construct a market-wide media network and provide direct evidence of its predictive power for market returns.

The rest of the paper is organized as follows. In section 1, we review the literature

exploring media network in financial markets and make some essential assumptions for

subsequent analysis. In section 2, we show how to construct a comprehensive media-network-based attention index. Then, we conduct some empirical tests and


present our results in section 3. In section 4, we provide economic explanations for the MAI. Lastly, we conclude in section 5.

1 Media Connection and Media Network

In this section, we review the literature on the impact of media connections and media networks on financial and economic matters, and introduce several reasonable assumptions for constructing the new predictors of stock returns.

Media connection, by definition, is an inter-relationship built via news stories, which may arise through explicit mentions or implicit influence. Explicit mentions, also known as media co-occurrence, are the most natural way of formulating the connectivity of two entities. Ozgur et al. (2008) first studied the social network inferred from the co-occurrence network of Reuters news. They show that the network exhibits small-world features with a power-law degree distribution and that it ranks the “importance” of people involved in the news better than other algorithms. Scherbina and Schlusche (2015) study the cross-predictability of stock returns by identifying economic linkages from co-mentions in news stories. They construct a linkage signal using the weighted average of the connected stocks’ returns and find that linked stocks significantly cross-predict one another’s future returns, and that the predictability increases with the number of connected news items.2

Apart from explicit mentions, a connection may also be built through implicit influence. One of the most popular channels is the industrial chain. As shown in Cohen and Frazzini (2008), economic links among certain individual firms and industries contribute significantly to cross-firm and cross-industry return predictability. Rapach et al. (2015) extend the perspective of Cohen and Frazzini (2008) by defining a connection between industries through the predictability of returns. Through these industrial interdependencies, the news that conveys information on one industry will

2 The connected news we refer to throughout this paper is defined as news that mentions more than one firm.


also percolate into the other industries. Further, owing to the competitive relation among stocks within an industry, good (bad) news for one stock will be bad (good) news for its competitors. In addition, business interaction is another important channel that transfers news information from one firm to another.

Based on media connections, we can formulate a media network by viewing the whole set of connected stocks as an undirected graph with news tones or connectivity tagged on each stock. In the network analysis context, all this information can be captured by the adjacency matrix or weighted adjacency matrix.3 Apart from the adjacency matrix, we also need to make some essential and reasonable assumptions on news arrival and network structure in advance to simplify our analysis.

Assumption 1 (Random News Arrival). Connected news arrives randomly and in-

vestors have no prior information on the distribution of news arrival.

Daley and Green (2012) and Rubin et al. (2017) presume that news arrival follows some stochastic process or is a priori unanticipated. This assumption is reasonable, as investors face two tiers of randomness. The first tier comes from the arrival of firm-specific news events, and the second from the news connections. In reality, a news event is always unpredictable, and even when investors realize that a news event will occur, the stocks that the news will mention remain unknown to them.

Assumption 2 (Multi-degree Network). The attention that the connected news at-

tracts not only affects the directly connected stocks but also indirectly connected stocks.

Once stocks are placed in a network structure based on media connections, the attention attracted by media news can travel through connected stocks. As a result, attention induced by media connections will not only affect directly linked stocks but also

3 In graph theory and computer science, an adjacency matrix is a square matrix used to represent an unweighted graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph. A weighted adjacency matrix is a square matrix used to represent a weighted graph whose edges are tagged with a weight denoting some relationship between the nodes, e.g. distance. The elements of the matrix are simply the weights of the edges.


affect those stocks with indirect connections. In this case, the importance of each node

(stock) will depend on its connections with all the other nodes (stocks) in this social

network. To take this indirect effect into account, we use value weight and eigenvector

centrality weight to determine the importance of a node in the market. Details will be

discussed in the methodology section.

Assumption 3 (Majority Opinion). The aggregated news tones reflect the majorities’

opinions on future prices of both connected stocks.

The last assumption ensures that our indices, constructed by aggregation, will not be dominated by extreme opinions. Martins (2008) studies the dynamics of extreme opinions in a model with a network structure. The paper shows that increasing contact between different opinions tends to make them less extreme. This result justifies our assumption in the sense that when news tones, which can be regarded as the journalist's opinion of the stock, are modelled in a network setting, extremists’ opinions have less chance of being dominant.

2 Data and Methodology

In this section, we introduce the data sources and explain the methodology for constructing the media attention index. Then, we introduce the alternative predictors that we compare against and the corresponding data sources.

2.1 Media Attention Index

The data we use for constructing media network is the firm-specific news from the

Thomson Reuters News Archive dataset ranging from Jan-1996 to Dec-2014. The

data contains various types of news, e.g. reviews, stories, analysis and reports etc.,

about markets, industries and corporations. It also provides news tones for all the

mentioned firms in each piece of news. The tones are expressed in three probabilities

SentPos (the probability of the article being positive), SentNeg (the probability of the


article being negative), and SentNeu (the probability of the article being neutral).

These three probabilities sum up to 1. Later, in our case, we use optimistic tone (i.e.

SentPos−SentNeg) to weigh the attention strength of each firm mentioned in the news

item. In this paper, we identify the news that has mentioned at least two stocks as

connected news and the others as self-connected news. This dichotomy allows us to

isolate the effect of the media connection by calculating the connectivity measure with

one stock as the centre of a news network, and the aggregation of connectivity measures

over the whole portfolio will provide information on the whole news network.

The media connection is identified through connected news, in which stocks are co-mentioned in the text. Based on connectivity, we can compute the monthly pairwise connection scores of the news for each stock mentioned. After that, we employ two ways to aggregate individual attention into a market-wide Media Attention Index (MAI), namely value weights and centrality weights. First, we expect large stocks to deliver a stronger attention effect to other stocks. For example, suppose stock A is connected to both stock B and stock C by the same number of connected news items, where stock B is a large stock and stock C is a small stock. In this case, on average we expect stock A to draw more attention from stock B than from stock C. Specifically, we construct the value-weighted connection score as follows:

$$
CS^{size}_{i,j,t} = \sum_{k=1}^{K_t} Size^{k}_{i,t} \times Size^{k}_{j,t} \times Occr^{k}_{i,t} \times Occr^{k}_{j,t}, \quad \text{with } i, j = 1, 2, \cdots, N. \tag{2.1}
$$

where N is the total number of stocks in the sample, the superscript k denotes the

kth news in month t and Kt is the total number of news of month t which may vary

every month, Occr is the dummy variable from occurrence information matrix below,


indicating stocks’ occurrence in news.

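$$
\begin{array}{c|ccc}
 & news_1 & \cdots & news_{K_t} \\ \hline
stock_1 & Occr^{1}_{1,t} & \cdots & Occr^{K_t}_{1,t} \\
\vdots & \vdots & \ddots & \vdots \\
stock_N & Occr^{1}_{N,t} & \cdots & Occr^{K_t}_{N,t}
\end{array}
$$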
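As a rough illustration of equation (2.1), the sketch below computes the pairwise connection scores for one month from a 0/1 occurrence matrix. It assumes that each stock's weight is constant across the news items of the month; the function name `connection_scores` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def connection_scores(occr: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Return the N x N matrix CS[i, j] = sum_k w_i * w_j * Occr[i, k] * Occr[j, k].

    occr   : (N, K) 0/1 occurrence matrix for one month (stocks x news items).
    weight : (N,) per-stock weight, assumed constant within the month.
    """
    co_mentions = occr @ occr.T              # entry (i, j) counts news items mentioning both i and j
    return np.outer(weight, weight) * co_mentions

# Toy month: 3 stocks, 4 news items (hypothetical data).
occr = np.array([[1, 1, 0, 1],
                 [1, 0, 1, 1],
                 [0, 0, 1, 0]])
size = np.array([0.5, 0.3, 0.2])             # hypothetical market-cap weights
cs_size = connection_scores(occr, size)
```

Passing centrality scores instead of size weights to the same function would give a centrality-weighted score of the same form.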

The second type of connection score accounts for the centering effect. We wish to identify the most important vertices in the network, which motivates a centrality indicator. In our case, more important stocks naturally attract more attention from investors. It is therefore essential to use centrality as a weight to incorporate the importance of stocks into our attention proxy. As introduced in Newman (2010), various centrality measures are applied in network analysis (such as degree centrality, closeness centrality, betweenness centrality and eigenvector centrality), and we choose eigenvector centrality for our study. Specifically, we first define the adjacency matrix $A_t$ based on the occurrence information matrix, that is

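$$
A_t =
\begin{array}{c|cccc}
 & stock_1 & stock_2 & \cdots & stock_N \\ \hline
stock_1 & a_{11,t} & a_{12,t} & \cdots & a_{1N,t} \\
stock_2 & a_{21,t} & a_{22,t} & \cdots & a_{2N,t} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
stock_N & a_{N1,t} & a_{N2,t} & \cdots & a_{NN,t}
\end{array}
$$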

where $a_{ij,t} = 1$ if $\sum_{k=1}^{K_t} Occr^{k}_{i,t}\, Occr^{k}_{j,t} \neq 0$, and 0 otherwise. Then, we calculate the eigenvector $x_t$ corresponding to the largest eigenvalue ($\lambda_{\max}$) of the adjacency matrix, which is defined as our centrality score, i.e.,

$$
A_t x_t = \lambda_{\max} x_t, \quad \text{for each } t = 1, 2, \cdots, T,
$$


where $x_t = (Ctry_{1,t}, Ctry_{2,t}, \cdots, Ctry_{N,t})'$ and $Ctry_{i,t}$ stands for the eigenvector centrality score of stock i at time t.

Unlike degree centrality, which awards one centrality point for every link a node receives, eigenvector centrality recognizes that not all vertices are equivalent: some are more relevant than others, and, reasonably, endorsements from important nodes count more. In other words, under eigenvector centrality a node is important if it is linked to by other important nodes. Based on the centrality scores, we can similarly formulate the connection scores as follows:

$$
CS^{ctr}_{i,j,t} = \sum_{k=1}^{K_t} Ctry_{i,t} \times Ctry_{j,t} \times Occr^{k}_{i,t} \times Occr^{k}_{j,t}, \quad \text{with } i, j = 1, 2, \cdots, N. \tag{2.2}
$$

To understand the centrality connection score better, we take the simple network structure in Figure 1 as an example. Each vertex in the network represents a firm, and the edges indicate the media connections induced by news co-occurrence. Degree centrality suggests that firms 1 and 3, firms 2 and 6, and firms 4 and 5 are pairwise equally important, since the firms in each pair have the same degree. However, although firms 2 and 6 both have degree 2 and both connect to firm 1, firm 6 also connects to firm 3, which has a higher degree, and is therefore more important, than firm 4, to which firm 2 connects. We should therefore expect firm 6 to be more important than firm 2 in spreading the news, as it has more second-degree connections. By a similar argument, we should also expect firm 5 to take a more central position than firm 4, and firm 1 to be more central than firm 3. Based on the adjacency matrix, we obtain the eigenvector centrality scores as the leading eigenvector, which is [0.5641, 0.2960, 0.5454, 0.1268, 0.2337, 0.4753]. Evidently, the eigenvector centrality scores describe the propagation of news better. Further, we can interpret the connection scores, constructed as the product of the centrality scores, as a measure of the radiating area of the news involving the two specific firms. In other words, the higher the connection score, the more firms in the network will be influenced, and thus the more


attention is caught by the news.
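The example can be checked numerically with a minimal sketch. Figure 1 is not reproduced in this text, so the edge list below is an assumption inferred from the verbal description (degrees and neighbours of firms 1 to 6); the helper name `eigenvector_centrality` is likewise illustrative.

```python
import numpy as np

def eigenvector_centrality(adj: np.ndarray) -> np.ndarray:
    """Unit-length eigenvector of a symmetric adjacency matrix for its largest eigenvalue."""
    _, eigenvectors = np.linalg.eigh(adj)    # eigh sorts eigenvalues in ascending order
    leading = eigenvectors[:, -1]
    return np.abs(leading)                   # entries share one sign when the graph is connected

# Assumed edge list (inferred from the description, not taken from the figure itself).
edges = [(1, 2), (1, 3), (1, 6), (2, 4), (3, 5), (3, 6)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i - 1, j - 1] = adj[j - 1, i - 1] = 1.0

ctry = eigenvector_centrality(adj)
print(np.round(ctry, 4))

# Centrality-weighted score for a single news item co-mentioning firms 1 and 3.
cs_ctr_1_3 = ctry[0] * ctry[2]
```

Under this assumed edge list the leading eigenvector should agree, up to rounding, with the scores quoted above, and it preserves the orderings discussed in the text (firm 6 above firm 2, firm 5 above firm 4, firm 1 above firm 3).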

[Insert Figure 1 here.]

With the basic elements available, we construct the Media Connectivity Matrices

on daily basis

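$$
C^{p}_{t} =
\begin{pmatrix}
CS^{p}_{1,1,t} & CS^{p}_{1,2,t} & \cdots & CS^{p}_{1,N,t} \\
CS^{p}_{2,1,t} & CS^{p}_{2,2,t} & \cdots & CS^{p}_{2,N,t} \\
\vdots & \vdots & \ddots & \vdots \\
CS^{p}_{N,1,t} & CS^{p}_{N,2,t} & \cdots & CS^{p}_{N,N,t}
\end{pmatrix},
\quad p \in \{size, ctr\}. \tag{2.3}
$$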

Based on this Media Connectivity Matrix, we finally aggregate the network information

to compose Media Attention Indices (MAI) on daily basis,

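$$
MAI^{p}_{t} = \Delta_t \left( \frac{\sum_{i=1}^{N} \sum_{j \neq i}^{N} CS^{p}_{i,j,t}}{\sum_{i=1}^{N} \sum_{j=1}^{N} CS^{p}_{i,j,t}} \right), \quad p \in \{size, ctr\}. \tag{2.4}
$$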

where size refers to the connection scores calculated with value weights and ctr to those calculated with centrality weights. The indices are the temporal differences ($\Delta_t$) of the ratio of the sum of off-diagonal elements to the sum of all elements. This formulation helps control for the news volume effect (larger firms may have greater news coverage) and eliminates potential persistence in the index series.
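A minimal sketch of this aggregation step, assuming one connection-score matrix per period is already available; the function names and the two toy matrices are hypothetical.

```python
from typing import Sequence

import numpy as np

def off_diagonal_share(cs: np.ndarray) -> float:
    """Sum of off-diagonal connection scores divided by the sum of all scores."""
    total = cs.sum()
    return float((total - np.trace(cs)) / total)

def media_attention_index(cs_by_period: Sequence[np.ndarray]) -> np.ndarray:
    """First difference over time of the off-diagonal share (the first period is lost)."""
    shares = np.array([off_diagonal_share(cs) for cs in cs_by_period])
    return np.diff(shares)

# Two hypothetical periods with two stocks each.
cs_t1 = np.array([[2.0, 1.0], [1.0, 2.0]])   # off-diagonal share 2/6
cs_t2 = np.array([[1.0, 3.0], [3.0, 1.0]])   # off-diagonal share 6/8
mai = media_attention_index([cs_t1, cs_t2])
```

Because the share is a ratio, scaling every entry of a period's matrix by the same constant leaves the index unchanged, which is one way to see how the construction controls for overall news volume.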

To combine the different aspects of information provided by the news network, we then form a composite media attention index, MAI, as the weighted average of the two

standardized individual media attention measures. Since both measures likely contain

information about investors’ attention as well as idiosyncratic non-attention noise, the

averaged media attention index thus helps to capture the common investor attention


component in connected news and diversify away the idiosyncratic noise. To do that,

we standardize both $MAI^{size}$ and $MAI^{ctr}$ and then calculate the monthly composite media attention index, MAI, as the simple average of the two:

$$
MAI_t = 0.5\, MAI^{size}_t + 0.5\, MAI^{ctr}_t. \tag{2.5}
$$

In Figure 2, we plot the composite media attention index and the two individual media attention indices. Overall, the size-based index shows a pattern similar to that of the centrality-weighted index. This is because large stocks also tend to have high centrality scores and both indices reflect investor attention induced by media connection. At the same time, the two indices still differ, especially during expansion periods, so it is still beneficial to combine them to remove non-attention noise. In addition, the correlation between $MAI^{size}$ and $MAI^{ctr}$ is 0.59, and the composite media attention index, MAI, has correlations of 0.89 and 0.88 with $MAI^{size}$ and $MAI^{ctr}$, respectively.
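The composite construction and the reported correlations can be illustrated with a short sketch. The series below are simulated stand-ins rather than the paper's data, and scaling by the sample standard deviation (`ddof=1`) is an assumption about how the standardization is done.

```python
import numpy as np

def standardize(x) -> np.ndarray:
    """Demean and scale to unit sample standard deviation (ddof=1 is an assumption)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def composite_mai(mai_size, mai_ctr) -> np.ndarray:
    """Equal-weighted average of the two standardized indices."""
    return 0.5 * standardize(mai_size) + 0.5 * standardize(mai_ctr)

# Simulated stand-ins for the two individual indices (not the paper's data).
rng = np.random.default_rng(0)
mai_size = rng.normal(size=200)
mai_ctr = 0.6 * mai_size + rng.normal(size=200)
mai = composite_mai(mai_size, mai_ctr)

rho = np.corrcoef(mai_size, mai_ctr)[0, 1]
corr_with_size = np.corrcoef(mai, mai_size)[0, 1]
# For standardized components: corr(composite, component) = sqrt((1 + rho) / 2).
implied = np.sqrt((1 + rho) / 2)
```

With a correlation of 0.59 between the two components, the identity in the last comment gives sqrt(0.795), roughly 0.89, which is in line with the correlations reported in the text.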


2.2 Alternative Predictors

According to Fang and Peress (2009), media coverage has a significant impact on

stock returns as a proxy for investor attention. Therefore, to ensure MAI’s predictive

power does not purely come from the media coverage, we then calculate the average

number of self news and average number of connected news to control for the effect

of media coverage. Given both variables are not stationary and show a strong time

trend, we take first order difference for both two predictors, labelled as ∆Self News and

∆Connected News. Meanwhile, a related type of literature suggests the use of linguistic

methods in order to quantify the tone of relevant textual documents (e.g. Engelberg

(2008), Gurun and Butler (2012), Hillert, Jacobs, and Muller (2014), Solomon, Soltes,

and Sosyura (2014), Tetlock et al. (2008)). The limited attention view then predicts


that this information has predictive power for the behavior of cognitively overloaded investors, as suggested by Jacobs (2015). We therefore construct soft information predictors using both value weights and equal weights to aggregate individual news tones of S&P500 stocks. In particular, the news tone for individual stock i in month t is calculated as

$$
\frac{\#\text{ of Pos Words}_{i,t} - \#\text{ of Neg Words}_{i,t}}{\text{Total } \#\text{ of Words}_{i,t}},
$$

where positive and negative words follow the Loughran and McDonald (2011) dictionary.
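A small sketch of the tone ratio follows. The two word sets are hypothetical stand-ins used only to make the example self-contained; they are not the actual dictionary lists.

```python
def news_tone(words, positive, negative):
    """(# positive words - # negative words) / total # of words; 0.0 for empty text."""
    if not words:
        return 0.0
    n_pos = sum(w in positive for w in words)
    n_neg = sum(w in negative for w in words)
    return (n_pos - n_neg) / len(words)

# Hypothetical mini word lists standing in for the dictionary.
positive = {"gain", "strong", "improve"}
negative = {"loss", "weak"}
text = "strong quarter with revenue gain despite weak demand".split()
tone = news_tone(text, positive, negative)   # (2 - 1) / 8
```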

Apart from the media news data, we also construct some alternative attention proxies, including the Google search index (Google Search) following Da, Engelberg, and Gao (2011a), the 52-week high (PrcHigh) following George and Hwang (2004), and the change in the average number of analysts aggregated from individual S&P500 stocks using equal weights or value weights (∆ # of AnalystsEW or ∆ # of AnalystsVW).

On top of that, the investor sentiment index of Baker and Wurgler (2006) and the aligned investor sentiment index of Huang et al. (2014) are included as well, for comparison with the sentiment content of the media attention index.

We then further collect 14 economic predictors that are linked directly to economic

fundamentals used in Goyal and Welch (2008) from Amit Goyal’s website. Specifically,

they are the log dividend-price ratio (D/P), log dividend yield (D/Y), log earnings-price

ratio (E/P), log dividend payout ratio (D/E), stock return variance (SVAR), book-to-

market ratio (B/M), net equity expansion (NTIS), treasury bill rate (TBL), long-term

bond yield (LTY), long-term bond return (LTR), term spread (TMS), default yield

spread (DFY), default return spread (DFR) and inflation rates (INFL).

Apart from controlling for the sentiment indices and economic predictors, we would also like to control for the general synchronicity of firm-level fundamentals, because the stocks co-mentioned by the news are potentially highly correlated in fundamentals. Therefore, we follow Morck et al. (2000) to construct the Earnings Co-movement Index (ECI) to control for fundamental correlations. To construct the index, we first run the regression

$$
ROA_i = a_i + b_i \times ROA_m + \varepsilon_i, \tag{2.6}
$$


for each firm i in each period. $ROA_i$ is the firm's return on assets, calculated as annual after-tax profit plus depreciation over total assets. $ROA_m$ is the value-weighted average of the return on assets for all firms.

$$
\text{Earnings Co-movement Index} = \frac{\sum_i R^2_i(ROA) \times SST_i(ROA)}{\sum_i SST_i(ROA)}, \tag{2.7}
$$

where $R^2_i(ROA)$ and $SST_i(ROA)$ are the $R^2$ and the total sum of squares from regression (2.6) for firm i. A higher ECI indicates that earnings move together more frequently.
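The index can be sketched as an SST-weighted average of firm-level $R^2$s. The code below is a minimal illustration under the assumption that each firm's regression is run over a common window of observations; the function name and the toy ROA series are hypothetical.

```python
import numpy as np

def earnings_comovement_index(roa: np.ndarray, roa_m: np.ndarray) -> float:
    """SST-weighted average of the R^2 from regressing each firm's ROA on market ROA.

    roa   : (N, T) array, one row of ROA observations per firm.
    roa_m : (T,) array of market ROA over the same window.
    """
    x = roa_m - roa_m.mean()
    weighted_r2, total_sst = 0.0, 0.0
    for y in roa:
        y_dm = y - y.mean()
        sst = np.sum(y_dm ** 2)                  # total sum of squares for this firm
        slope = np.dot(x, y_dm) / np.dot(x, x)   # OLS slope with an intercept
        ssr = np.sum((y_dm - slope * x) ** 2)    # residual sum of squares
        weighted_r2 += (1.0 - ssr / sst) * sst
        total_sst += sst
    return float(weighted_r2 / total_sst)

# Toy data: firm 1 moves one-for-one with the market, firm 2 is uncorrelated with it.
roa_m = np.array([1.0, 2.0, 3.0, 4.0])
roa = np.array([2.0 * roa_m + 1.0,
                [1.0, -1.0, -1.0, 1.0]])
eci = earnings_comovement_index(roa, roa_m)      # 20 / 24
```

Weighting by SST means that firms with more variable earnings contribute more to the index than firms whose earnings barely move.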

Moreover, in order to control for investors’ belief divergence, we construct the macro disagreement measure by applying principal component analysis to the same set of macroeconomic variables as in Li (2016). We also collect the VIX as a complement to macro disagreement. In addition, we compute the short interest ratio (SIR) to check how the short-sales constraint affects the return predictability of MAI.

[Insert Table 1 here.]

From the summary statistics in Table 1, we observe that the monthly excess market return has a mean of 0.41% and a standard deviation of 4.49%, implying a monthly Sharpe ratio of 0.09. Moreover, most of the economic predictors are highly persistent, while the excess market return has little autocorrelation. These summary statistics are generally consistent with the literature.

3 Predicting Stock Market Returns with News Co-occurrence

In this section, we provide a number of empirical results. Section 3.1 examines the

predictability of media attention index on the aggregate market. Section 3.2 compares

the media attention index with alternative predictors. Section 3.3 analyses the out-of-


sample predictability, and Section 3.4 assesses the cross-sectional predictability of the

media attention index.

3.1 Forecasting the Market

Consider the standard predictive regression model,

$$
R^{m}_{t+1} = \alpha + \beta\, MAI_t + \varepsilon_{t+1}, \tag{3.1}
$$

where $R^{m}_{t+1}$ is the excess market return, i.e., the monthly log return on the S&P500 index in excess of the risk-free rate. For comparison, we also run the same in-sample predictive regression with ∆Self News, ∆Connected News, the BW sentiment index, $Sent^{BW}_t$, and the PLS sentiment index, $Sent^{PLS}_t$. Specifically, we test the null hypothesis $H_0: \beta = 0$, which means that MAI has no predictive power for stock returns, against the alternative $H_1: \beta \neq 0$. Under the null hypothesis, (3.1) reduces to the constant expected return model, $R^{m}_{t+1} = \alpha + \varepsilon_{t+1}$.
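The regression in (3.1) can be sketched with plain least squares. This is only an illustration of the timing convention (the predictor at t paired with the return at t+1) and of the in-sample $R^2$; it does not compute standard errors or the test of the null hypothesis, and the function name and toy series are hypothetical.

```python
import numpy as np

def predictive_regression(returns, predictor):
    """OLS of R_{t+1} on a constant and predictor_t; returns (alpha, beta, r_squared)."""
    y = np.asarray(returns, dtype=float)[1:]      # R_{t+1}
    x = np.asarray(predictor, dtype=float)[:-1]   # predictor_t
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1], r_squared

# Toy series in which next-period returns are an exact linear function of the predictor.
predictor = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
returns = np.concatenate([[0.0], 0.5 - 0.2 * predictor[:-1]])
alpha, beta, r2 = predictive_regression(returns, predictor)
```

If the predictor is standardized beforehand, the slope can be read as the change in next month's expected excess return per one-standard-deviation change in the predictor.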


Table 2 reports the results of the in-sample predictive regressions. Panels A to E provide the estimation results for the media attention index, the media coverage indices, alternative attention proxies, soft information and sentiment indices. As shown in the table, MAI significantly predicts negative returns, with an in-sample $R^2$ of 3.26%. Consistent with Baker and Wurgler (2006) and Huang et al. (2014), both sentiment indices predict negative returns, although they are not statistically significant unless we apply a one-sided critical value. ∆Self News does not show strong return predictability, and ∆Connected News shows weak predictability compared with the MAI. This may suggest that the additional attention effect is stronger for stocks with few self news items. The last three columns report the overall $R^2$ and the $R^2$s in the expansion and recession periods dated by the NBER. The results show that MAIs

15

provide larger in-sample R2s than sentiment indices.

Economically, the OLS coefficient suggests that a one-standard-deviation increase in MAI is associated with a decrease of approximately 0.78% in the expected excess market return for the next month. On the one hand, recall that the average monthly excess market return during our sample period is 0.41%; the slope of -0.78% thus implies that the expected excess market return based on MAI varies by about 1.9 times its average level, which indicates a strong economic impact. On the other hand, if we annualize the 0.78% monthly decrease by multiplying by 12, the annualized level of 9.36% is somewhat large. In this case, one may interpret this as a model-implied expected change that need not be identical to the change that investors in the market would reasonably expect. Empirically, this level is comparable with conventional macroeconomic predictors. For example, a one-standard-deviation increase in the D/P ratio, the CAY and the net payout ratio tends to increase the risk premium by 3.60%, 7.39%, and 10.2% per annum, respectively (see, e.g., Lettau and Ludvigson (2001) and Boudoukh et al. (2007)).
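The arithmetic behind these magnitudes can be checked with a short sketch. The function name and arguments are hypothetical; the inputs are the 0.78% slope and the 0.41% average monthly excess return quoted in the text.

```python
def economic_magnitude(slope_pct, mean_excess_pct, periods_per_year=12):
    """Return (slope relative to the mean excess return, simple annualized slope)."""
    ratio_to_mean = abs(slope_pct) / mean_excess_pct
    annualized_pct = abs(slope_pct) * periods_per_year  # simple multiplication by 12
    return ratio_to_mean, annualized_pct

# Inputs taken from the text: slope of -0.78% and mean excess return of 0.41%.
ratio, annual = economic_magnitude(-0.78, 0.41)
print(round(ratio, 1), round(annual, 2))
```

The annualization here is the simple multiplication described in the text, not a compounded figure.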

Meanwhile, the $R^2$ of MAI with the OLS forecast is 2.63%, which is comparable to that of the PLS sentiment index and substantially greater than those of all alternative attention proxies and soft information predictors. This implies that if this level of predictability can be sustained out-of-sample, it will be of substantial economic significance (Kandel and Stambaugh (1996)). Indeed, Campbell and Thompson (2008) show that, given the large unpredictable component inherent in monthly market returns, a monthly out-of-sample $R^2$ of 0.5% can generate significant economic value, and our findings in Section 3.3 are consistent with this argument.

Apart from analysing predictability over the whole sample period, it is also important to analyse predictability over the business cycle to gain a better understanding of the fundamental driving forces. Following Rapach et al. (2010), we compute the $R^2$ statistics separately for economic expansions ($R^2_{up}$) and recessions ($R^2_{down}$),

$$R^2_c = 1 - \frac{\sum_{t=1}^{T} \mathbf{1}\{t \in T_c\}\,\hat{\varepsilon}_t^{2}}{\sum_{t=1}^{T} \mathbf{1}\{t \in T_c\}\,(R^m_t - \bar{R}^m)^2}, \qquad c \in \{up, down\}, \tag{3.2}$$

where $\mathbf{1}\{t \in T_{up}\}$ ($\mathbf{1}\{t \in T_{down}\}$) is an indicator that takes a value of one when month $t$ is in an NBER expansion (recession) period, i.e., $T_{up}$ ($T_{down}$), and zero otherwise; $\hat{\varepsilon}_t$ is the fitted residual based on the in-sample estimates of the predictive regression model in (3.1); $\bar{R}^m$ is the full-sample mean of $R^m_t$; and $T$ is the number of observations for the full sample. Note that, unlike the full-sample $R^2$ statistic, the $R^2_{up}$ and $R^2_{down}$ statistics have no sign restrictions. Columns 4 and 5 of Table 2 report the $R^2_{up}$ and $R^2_{down}$ statistics. MAI's return predictability is spread evenly over expansions and recessions. In addition, MAI has significantly higher return predictability than both sentiment indices over expansion periods, while it underperforms the PLS sentiment index over recessions. This reveals the stable return predictability of our media-network-based attention proxy.
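The regime-specific statistic in (3.2) can be sketched as follows. The function and argument names are hypothetical, and the regime flags stand in for the NBER dating, which is not reproduced here.

```python
def regime_r2(realized, fitted_resid, in_regime):
    """R^2 over the months flagged in `in_regime`, using the FULL-sample mean.

    realized     : list of excess returns R_t
    fitted_resid : list of in-sample residuals from the predictive regression
    in_regime    : list of booleans (True when month t belongs to the regime)
    """
    r_bar = sum(realized) / len(realized)  # full-sample mean, as in (3.2)
    num = sum(e * e for e, flag in zip(fitted_resid, in_regime) if flag)
    den = sum((r - r_bar) ** 2 for r, flag in zip(realized, in_regime) if flag)
    return 1.0 - num / den
```

Because the residuals come from the full-sample fit while the sums run over a subsample only, the statistic can be negative, which is the absence of a sign restriction noted in the text.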

3.2 Comparison with Economic Predictors

In this section, we compare the forecasting power of the media attention indices with that of alternative predictors and examine whether it is driven by omitted soft information, economic variables related to business cycle fundamentals, or investor sentiment. Specifically, we examine whether the forecasting power of MAI remains significant after controlling for soft information, alternative attention proxies, economic predictors and investor sentiment. To analyse the marginal forecasting power of MAI, we run the following bivariate predictive regressions on MAI and $Z_t$,

$$R^m_{t+1} = \alpha + \beta\,\mathrm{MAI}_t + \phi Z_t + \varepsilon_{t+1}, \tag{3.3}$$

where $Z_t$ is one of the alternative predictors described in Section 2.2. Our main interest is the coefficient $\beta$, for which we test $H_0: \beta = 0$ against $H_1: \beta \neq 0$.
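A minimal sketch of the bivariate regression in (3.3) is given below. It assumes the caller has already aligned the series so that `y[i]` is the next-month return for `x[i]` and `z[i]`; all names are hypothetical, and only point estimates are computed.

```python
def bivariate_ols(y, x, z):
    """OLS of y on a constant, x and z; returns (alpha, beta, phi).

    Solves the two-regressor normal equations on demeaned data.
    """
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    det = sxx * szz - sxz ** 2  # zero if x and z are perfectly collinear
    beta = (sxy * szz - szy * sxz) / det
    phi = (szy * sxx - sxy * sxz) / det
    alpha = my - beta * mx - phi * mz
    return alpha, beta, phi
```

Comparing `beta` from this fit with the slope from the univariate regression indicates how much of the predictor's effect survives the control variable.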


Table 3 shows that the estimates of $\beta$ in (3.3) are negative and stable in magnitude, in line with the results of predictive regression (3.1) reported in Table 2. More importantly, $\beta$ remains statistically significant when the regression is augmented with other predictors. These results demonstrate that MAI contains sizeable complementary forecasting information beyond what is contained in the separate news indices, economic predictors and investor sentiment. Moreover, controlling for other predictors does not weaken the MAI effect ($\beta$ remains almost the same in magnitude as reported in Table 2), suggesting that the information content of the media-connection-based predictors does not overlap with that of either the economic predictors or the investor sentiment predictors, and that it dominates the effect of the separate news indices (∆Self News and ∆Connected News).

3.3 Out-of-sample Forecasts

Although in-sample analysis provides more efficient parameter estimates, and thus more precise return forecasts, by utilizing all available data, Goyal and Welch (2008), among others, argue that out-of-sample tests are more relevant for assessing genuine return predictability in real time and for avoiding over-fitting. In addition, out-of-sample tests are much less affected by finite-sample biases such as the Stambaugh bias (Busetti and Marcucci (2013)). Hence, it is essential to investigate the out-of-sample predictive performance of the media attention indices.

For out-of-sample forecasts at time $t$, we use only information available up to $t$ to forecast stock returns at $t+1$. Following Goyal and Welch (2008), Kelly and Pruitt (2013), and many others, we run the out-of-sample analysis by estimating the predictive regression model recursively based on our media attention index,

$$\hat{R}^m_{t+1} = \hat{\alpha}_t + \hat{\beta}_t\,\mathrm{MAI}_{1:t;t}, \tag{3.4}$$

where $\hat{\alpha}_t$ and $\hat{\beta}_t$ are the OLS estimates obtained by recursively estimating model (3.1) on $\{R^m_{r+1}\}_{r=1}^{t-1}$. As with the in-sample analogues in Table 2, we consider different types of media attention indices based on optimism, positive and negative news tones, respectively. For comparison purposes, we also carry out the out-of-sample tests with $Sent^{BW}_t$ and $Sent^{PLS}_t$, and the results are reported in Panel B of Table 4.

To evaluate the out-of-sample forecasting performance, we apply the widely used Campbell and Thompson (2008) $R^2_{OS}$ statistic, based on both the unconstrained forecast and the truncated forecast that imposes a non-negative equity premium constraint. The unconstrained $R^2_{OS}$ statistic measures the proportional reduction in mean squared forecast error (MSFE) for the predictive regression forecast relative to the historical average benchmark. Goyal and Welch (2008) show that the historical average is a very stringent out-of-sample benchmark, and individual economic variables typically fail to outperform it. To compute $R^2_{OS}$, let $r$ be a fixed number of observations chosen for the initial training sample, so that the future expected return can be estimated at times $t = r+1, r+2, \ldots, T$. We then compute $s = T - r$ out-of-sample forecasts, $\{\hat{R}^m_{t+1}\}_{t=r}^{T-1}$. More specifically, we use the first third of the data, 1996:01 to 2002:06, as the initial estimation period, so that the forecast evaluation period spans 2002:07 to 2014:12. The statistic is

$$R^2_{OS} = 1 - \frac{\sum_{t=r}^{T-1} (R^m_{t+1} - \hat{R}^m_{t+1})^2}{\sum_{t=r}^{T-1} (R^m_{t+1} - \bar{R}^m_{t+1})^2}, \tag{3.5}$$

where $\bar{R}^m_{t+1}$ denotes the historical average benchmark corresponding to the constant expected return model ($R^m_{t+1} = \alpha + \varepsilon_{t+1}$), i.e.,

$$\bar{R}^m_{t+1} = \frac{1}{t} \sum_{s=1}^{t} R^m_s. \tag{3.6}$$

By construction, the $R^2_{OS}$ statistic lies in the range $(-\infty, 1]$. If $R^2_{OS} > 0$, the forecast $\hat{R}^m_{t+1}$ outperforms the historical average $\bar{R}^m_{t+1}$ in terms of MSFE.
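The recursive scheme and the $R^2_{OS}$ statistic can be sketched as below. This is a simplified illustration under stated assumptions: the predictor series is taken as given (the index itself is not re-estimated at each step), lists are 0-based, and all function names are hypothetical.

```python
def ols_fit(x, y):
    """Univariate OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - beta * mx, beta

def recursive_forecasts(ret, x, r):
    """Expanding-window forecasts using the first r observations for training.

    Returns (actual, model, bench) lists with len(ret) - r entries each.
    Requires r >= 3 so that the first regression has at least two points.
    """
    actual, model, bench = [], [], []
    for i in range(r - 1, len(ret) - 1):
        # Pairs (x_s, R_{s+1}) that are fully observed by period i.
        a, b = ols_fit(x[:i], ret[1:i + 1])
        model.append(a + b * x[i])                  # forecast of ret[i + 1]
        bench.append(sum(ret[:i + 1]) / (i + 1))    # historical average
        actual.append(ret[i + 1])
    return actual, model, bench

def r2_os(actual, model, bench, truncate=False):
    """Out-of-sample R^2; truncate=True floors the model forecasts at zero."""
    if truncate:
        model = [max(f, 0.0) for f in model]
    num = sum((a - f) ** 2 for a, f in zip(actual, model))
    den = sum((a - h) ** 2 for a, h in zip(actual, bench))
    return 1.0 - num / den
```

The `truncate` flag corresponds to the non-negative equity premium constraint: negative forecasts are replaced by zero before the squared errors are computed.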

The statistical significance of the out-of-sample $R^2$s we report is based on the MSFE-adjusted statistic of Clark and West (2007) (CW-test hereafter). It tests the null hypothesis that the historical average MSFE is not greater than the predictive regression forecast MSFE against the one-sided (right-tail) alternative hypothesis that the historical average MSFE is greater than the predictive regression forecast MSFE, corresponding to $H_0: R^2_{OS} \leq 0$ against $H_1: R^2_{OS} > 0$. Clark and West (2007) show that the test has a standard normal limiting distribution when comparing forecasts from nested models. Intuitively, under the null hypothesis that the constant expected return model generates the data, the predictive regression model produces a noisier forecast than the historical average benchmark because it estimates slope parameters whose population values are zero. We thus expect the benchmark model's MSFE to be smaller than the predictive regression model's MSFE under the null. The MSFE-adjusted statistic accounts for the negative expected difference between the historical average MSFE and the predictive regression MSFE under the null, so that it can reject the null even if the $R^2_{OS}$ statistic is negative.
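The MSFE-adjusted statistic can be sketched as follows. The adjustment term is the squared difference between the two forecasts; the sketch uses a plain (non-HAC) standard error, which is an assumption of this illustration, and the function name is hypothetical.

```python
import math

def clark_west(actual, model, bench):
    """MSFE-adjusted statistic and one-sided (right-tail) normal p-value."""
    # f_t = benchmark squared error - (model squared error - adjustment term)
    f = [(a - h) ** 2 - ((a - m) ** 2 - (h - m) ** 2)
         for a, m, h in zip(actual, model, bench)]
    n = len(f)
    mean_f = sum(f) / n
    var_f = sum((v - mean_f) ** 2 for v in f) / (n - 1)
    stat = mean_f / math.sqrt(var_f / n)   # t-statistic of f on a constant
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # upper-tail N(0, 1)
    return stat, p_value
```

Because the term `(h - m) ** 2` is added back, the statistic can be positive even when the model's raw MSFE exceeds the benchmark's, which is why the null can be rejected with a negative $R^2_{OS}$.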


Panel A of Table 4 shows that the MAI index generates positive and significant $R^2_{OS}$ statistics and thus delivers a lower MSFE than the historical average. We therefore conclude that MAI has strong out-of-sample predictive ability for market returns, which confirms the earlier in-sample results (Table 2). Compared with MAI, $Sent^{BW}$ exhibits much weaker out-of-sample predictive ability for market excess returns, as shown in Panel B. Its $R^2_{OS}$ is generally negative and insignificant, with the exception of expansion periods. Interestingly, the PLS sentiment index shows very good out-of-sample return predictability in all cases. This result once again shows that the sentiment-aligned approach extracts the true factors from the noise for predicting the market, as explained in Huang et al. (2014). Although $Sent^{PLS}$ shows strong predictive power, our media attention index ($\mathrm{MAI}_t$) still outperforms it in general. This indicates that our media attention index is a powerful predictor of market returns. In addition, the last two columns of Table 4 show that the predictability of the media attention index is significant and stable across both expansions and recessions.


Since MAI is constructed from media news, its predictability may partially come from investor sentiment. To understand the differences in forecasting power between the sentiment indices and MAI, Figure 4 depicts the predicted returns based on $Sent^{BW}_t$, $Sent^{PLS}_t$ and $\mathrm{MAI}_t$ for the 2002:07–2014:12 out-of-sample period. It is clear that the MAI-predicted returns are much more volatile than the forecasts of the sentiment indices. The actual realized excess returns (plotted in the figure as a 6-month moving average for better visibility) are even more volatile than the MAI-predicted returns. This explains why the connected-news-based index does a better job than the hard-information-based sentiment measures in capturing the expected variation in the market return.


Following Goyal and Welch (2008) and Rapach et al. (2010), Figure 5 presents time-series plots of the differences between the cumulative squared forecast error (CSFE) for the historical average benchmark forecasts and the CSFE for the predictive regression forecasts based on MAI and the sentiment indices over 2002:07–2014:12. This time-series plot is an informative graphical device for assessing the consistency of out-of-sample forecasting performance over time. When the difference in CSFE increases, the model forecast outperforms the historical average, while the opposite holds when the curve decreases. The solid blue line in Figure 5 shows that our media attention index, MAI, consistently outperforms the historical average in all periods. The curve rises most steeply during the recession periods, indicating that the good out-of-sample performance of MAI mainly stems from the recession periods. For comparison, we also plot the differences in CSFE for the investor sentiment indices as dashed lines. The dashed red line shows that $Sent^{BW}$ fails to consistently outperform the historical average. As a consequence, it does a poor job in terms of monthly out-of-sample forecasts. $Sent^{PLS}$, depicted by the dashed yellow line, performs better than $Sent^{BW}$ but is still not as good as the media attention index. These results suggest that MAI contains useful information for predicting market returns that the investor sentiment indices fail to capture.
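The CSFE difference curve described above can be computed with a short sketch. The function name is hypothetical, and the inputs are assumed to be aligned lists of realized returns, model forecasts and historical average forecasts.

```python
def csfe_difference(actual, model, bench):
    """Cumulative (benchmark squared error - model squared error) over time."""
    curve, total = [], 0.0
    for a, m, h in zip(actual, model, bench):
        total += (a - h) ** 2 - (a - m) ** 2
        curve.append(total)
    return curve
```

An upward step in the returned curve means the model forecast was closer to the realized return than the historical average in that month; a downward step means the opposite.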

Lastly, we compare the out-of-sample performance of the media attention index with the combined economic predictors proposed in Rapach et al. (2010). Panel C of Table 4 shows that the out-of-sample predictability of the combined economic predictors during our sample period is generally very poor, except in expansion periods. This result implies that the out-of-sample predictability of our media attention indices does not come from hard information either.

In summary, the out-of-sample analysis shows that the media attention index is a powerful and reliable predictor of excess market returns and consistently outperforms the investor sentiment indices and the combined economic predictors across different sample periods, which is consistent with our in-sample results (Tables 2 and 3).

3.4 Forecasting Cross-sectional Portfolio

Based on our theory, MAI should predict negative returns given short-sales constraints. The rationale is that news co-occurrence reveals investor attention to connected stocks, and this attention generates an asymmetric effect between good news and bad news. Investors can simply buy the stock in reaction to good news, whereas they are not able to short-sell the stock in reaction to bad news. In this case, increased news co-occurrence incorporates more good information than bad information into the prices of connected stocks, hence pushing those prices above fair value.


To test the conjecture above, we examine cross-sectional return predictability by sorting on the number of connected news.⁴ We form 10 equal-weighted portfolios and label the stocks with media attention in the top (bottom) decile as the high (low) attention group. The rest are grouped as the medium attention group. All portfolios are rebalanced monthly at the closing price of the next month. The performance of the sorted portfolios is shown in the first column of Table 5. As expected, the low media attention portfolio earns a significantly higher alpha than the high media attention portfolio, by 0.74% per month (t-statistic = 2.15).
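The decile sort can be illustrated with a minimal single-month sketch. It assumes each stock is represented as a hypothetical `(connected_news, next_month_return)` tuple and ignores ties and rebalancing details.

```python
def low_minus_high(stocks, n_groups=10):
    """Equal-weighted return of the lowest group minus that of the highest group.

    stocks: list of (connected_news, next_month_return) tuples for one month.
    """
    if len(stocks) < n_groups:
        raise ValueError("need at least one stock per group")
    ranked = sorted(stocks, key=lambda s: s[0])  # ascending media attention
    size = len(ranked) // n_groups

    def mean_return(group):
        return sum(ret for _, ret in group) / len(group)

    low, high = ranked[:size], ranked[-size:]
    return mean_return(low) - mean_return(high)
```

Averaging this spread across months would give the raw long-short return; the alphas in the text additionally adjust for risk factors, which this sketch does not do.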


In addition, in Table 5, we test whether the alphas generated by the media attention portfolio, which is long stocks with a small number of connected news and short stocks with a large number of connected news, can be explained by existing factors. We apply the CAPM (Markowitz, 1952), the Fama-French three-factor model (Fama and French, 1993) and the Carhart four-factor model (Carhart, 1997) to dissect the alphas generated by media attention. The results show that the media attention portfolio delivers a high alpha in all cases. Specifically, the media attention portfolio has Fama and French (1993) abnormal returns of 0.81% per month (t-statistic = 2.52). After further adjusting for the Carhart (1997) momentum factor, the media attention portfolio earns abnormal returns of 0.62% per month (t-statistic = 2.00). These results indicate that connected news indeed captures a different aspect of market excess returns that cannot be explained by conventional market factors.

⁴ Directly sorting on MAI is problematic because MAI is constructed from the aggregated change in media co-occurrence, and the change in a market-wide index differs from the cross-sectional aggregation of changes for individual stocks. Moreover, using changes in connections generates missing values. Cross-sectionally, therefore, we can only examine our intuition by studying the number of connections and the weighting scheme separately.

4 Economic Explanations

In this section, we explore the source of MAI's predictability from different angles. First and foremost, we test whether higher news co-occurrence induces more frequent search activity, which is an important proxy for investor attention (Da et al., 2011b). Second, we examine the performance of MAI under different environments of belief uncertainty and short-sales constraints. Lastly, we justify the economic meaning of using centrality and news tones in constructing MAI by checking how different stock combinations contribute to abnormal returns.

4.1 Google Search and Bloomberg Attention

As discussed in Da et al. (2011b), attention proxies based on media occurrence rely on the assumption that if a stock's name is mentioned in the news media, investors have paid attention to it. However, news occurrence does not guarantee attention unless investors actually read the news. Therefore, Da et al. (2011b) propose using Google search frequency as a direct measure of investor attention.

Following the argument in Da et al. (2011b), we test whether news co-occurrences induce search activity, in order to show that our MAI indeed reflects investor attention. First, we sort the connected stock pairs into quintiles based on the frequency of news co-occurrence. Then, in each month, we randomly pick five pairs in each group and calculate the corresponding Google and Bloomberg search volume correlations. The aggregated results are shown in Figure 6.⁵


⁵ The correlation coefficient series for each group are reported in the appendix, which is available upon request.

As shown in Figure 6, the average correlations of Google search and Bloomberg search increase strongly with news co-occurrence. Specifically, the average correlations in the group with the most news co-occurrences are 9% and 17% for Google search and Bloomberg search, respectively. However, the average correlations for the group with the fewest news co-occurrences are merely 2% and 3% for Google search and Bloomberg search, respectively. These results together provide strong evidence in support of the investor attention interpretation of news co-occurrences.
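The grouping-and-correlation step can be sketched as follows. This is a simplified illustration: it averages over all pairs in each group rather than drawing five random pairs per month, and the input layout (a list of `(co_occurrence_count, search_series_a, search_series_b)` tuples) is an assumption.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def mean_corr_by_group(pairs, n_groups=5):
    """Average search-volume correlation within groups sorted by co-occurrence.

    pairs: list of (co_occurrence_count, series_a, series_b).
    Returns one average per group, from fewest to most co-occurrences.
    """
    ranked = sorted(pairs, key=lambda p: p[0])
    size = len(ranked) // n_groups
    averages = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(ranked)
        group = ranked[start:end]
        averages.append(sum(pearson(a, b) for _, a, b in group) / len(group))
    return averages
```

An increasing sequence of group averages would correspond to the pattern reported in Figure 6.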

4.2 Belief Divergence and Short-sale Constraint

Miller (1977) asserts that stock prices in equilibrium will reflect only the optimists' view and hence will more likely be overvalued when investors have divergent opinions and short-selling is not allowed. Similarly, Hong and Stein (2007) argue that the two key ingredients for explaining stock overpricing are disagreement stemming from heterogeneous beliefs and short-sales constraints. Therefore, to verify these two assumptions, we check the return predictability of MAI across environments of high and low belief divergence and of tight and loose short-sales constraints.

For belief divergence, we construct a macro disagreement measure using the same set of macro variables suggested in Li (2016). Instead of using the simple average suggested in Li (2016), we apply principal component analysis to extract the most informative factor. In addition, we also use the VIX as a proxy for investors' belief divergence in the market. For the short-sales constraint, we follow Asquith et al. (2005) and use the short interest ratio as a proxy for its tightness. The in-sample return predictability results under each environment are summarized in Table 6.


As shown in Table 6, MAI shows strong return predictability only when investors' beliefs are highly divergent and the short-sales constraint is tight. This result supports our assumptions about how news co-occurrence generates market over-valuation. Indeed, media coverage of multiple stocks, in an environment of high belief divergence and tight short-sales constraints, can lead to correlated over-valuation of these stocks. The over-valuation then spreads to every corner of the market through the network structure and constitutes a market-wide over-valuation proxy. In addition, the results show that the weighting scheme is indeed important for capturing the attention-spreading effect in predicting stock returns; we discuss this in detail in the next subsection.

4.3 Centrality and Investors' Attention

In this section, we study the role of centrality scores in the attention effect. In the market, there are four types of stocks, namely, stocks with high centrality scores that connect to low centrality stocks (HL), stocks with high centrality scores that connect to high centrality stocks (HH), stocks with low centrality scores that connect to high centrality stocks (LH), and stocks with low centrality scores that connect to low centrality stocks (LL). In a media network, a stock attracts investors' attention from its connected stocks. Importantly, however, the attracted attention does not affect all connected stocks equally. In particular, a stock with a low centrality score tends to be more affected by this connection, and the effect is amplified when the stock is connected to a high centrality stock rather than to a low centrality stock. To examine this argument, we construct long-short portfolios within each type of stocks based on media attention, proxied by the number of connected news.

To balance the level of connections for both long and short stocks in each type of stocks, we independently sort stocks according to the number of connected news, the self centrality score (SCS) and the average centrality score of connected stocks (CCS). Specifically, SCS (CCS) classifies stocks into two groups by splitting at the median, while the number of connected news divides stocks into deciles. We then report the portfolio return and risk-adjusted alpha of the attention-based trading strategy for each type of stocks. Specifically, we label the group with the number of connected news in the top (bottom) decile as the high (low) attention group, and our portfolio strategy is to go long the stocks in the low attention group and short the stocks in the high attention group.⁶


Under this setting, we are able to identify which type of stocks is more sensitive to media connections and hence contributes to market-wide over-valuation. Table 7 reports the excess portfolio return (risk-adjusted portfolio return) of the media-connection-based trading strategy formed using the different types of stocks. Indeed, not all stocks suffer co-overvaluation: stocks with high centrality scores, as well as LL stocks, are less sensitive to the media connection effect, with insignificant excess portfolio returns (t-statistics are 0.87, 0.94 and 1.62 for HL, HH and LL stocks, respectively). Only stocks with low centrality that connect to high centrality stocks (LH) show a strong and significant trading profit. The trading strategy generates a 1.40% excess return with a t-statistic of 3.09. The results cannot be fully explained by conventional risk factors, including the CAPM, the Fama-French (1993) three factors and the Carhart (1997) four factors. This provides an intuitive way to understand the significance of our centrality weighting scheme: even though a stock may itself receive little attention, when it is linked to a giant through news co-occurrence, it receives excess attention and ends up over-valued.

4.4 Size and Investors' Attention

Similarly, in this section, we study the role of value weighting in the attention effect. We classify stocks into four types, namely, big stocks that are connected to small stocks (big-connect-small), big stocks that are connected to big stocks (big-connect-big), small stocks that are connected to small stocks (small-connect-small), and small stocks that are connected to big stocks (small-connect-big). Again, we construct long-short portfolios within each type of stocks based on media attention, following the same rule we apply for the centrality weight. Table 8 reports the excess portfolio return (risk-adjusted portfolio return) of the media-connection-based trading strategy formed using the different types of stocks. Consistent with our expectation, small stocks that are connected to big stocks tend to be the most affected by the attention effect. In particular, the excess portfolio return of the connected-news-based trading strategy reaches 1.98% per month with a significant t-statistic of 2.09 (compared with 0.24, -1.43 and -0.38 for big-connect-small, big-connect-big and small-connect-small stocks, respectively). The results are also robust to controlling for conventional risk factors, including the CAPM, the Fama-French (1993) three factors and the Carhart (1997) four factors. These findings give economic meaning to the value-weighting scheme: small stocks are more likely to be affected by media connections, especially when they are linked to big companies. By drawing market attention through big stocks, small stocks are subject to investors' asymmetric trading behavior due to short-sale constraints, hence contributing to an overall lower market premium. All in all, the network structure plays a powerful role in transmitting investor attention between stocks, which leads to stock mispricing.

⁶ For some periods, a certain type of stocks may not contain any long (short) stocks; we then replace its long (short) excess return with the risk-free rate (equivalent to going long or short a risk-free bond).


5 Conclusions

Investor attention affects market reactions to new information and has been documented as an important driving force of stock returns. The existing literature has constructed predictors using both hard and soft information, while the effect of investor attention remains underexplored. Based on the media news network, we propose a novel predictor, the media attention index (MAI), which proxies for investor attention using media news co-occurrence. In general, we find that MAI consistently provides negative return forecasts for both time-series and cross-sectional portfolios. In a sample of S&P 500 stocks from 1996 to 2014, we first document that MAI provides significant in-sample and out-of-sample return predictability. We then show that the return predictability is robust to controlling for other predictors, such as investor sentiment and economic factors. We also provide evidence that MAI captures investor attention by sorting cross-sectional portfolios on news co-occurrence frequencies and by examining the average correlations of Google and Bloomberg search frequencies.


Figure 1: This figure is a simple network example to illustrate how eigenvector centrality differs from degree centrality. Each node in the network represents a company, and two nodes are connected when there exists news that mentions both of them.

[Figure 2 plot: time series over 2000–2015; legend: MAI, MAIsize, MAIcentrality]

Figure 2: This figure plots the composite media attention index, the size-based media attention index, and the centrality-based media attention index. The solid red line depicts the composite media attention index, the dashed orange line depicts the centrality-based media attention index, and the dashed purple line depicts the size-based media attention index. All indices are standardized to have zero mean and unit variance. The shaded periods correspond to NBER-dated recessions. The sample period is 1996:01–2014:12.

[Figure 3 plot: time series over 2000–2015; legend: MAI, BW, PLS]

Figure 3: This figure plots the composite media attention index, the Baker and Wurgler (2006) investor sentiment index, and the Huang et al. (2014) investor sentiment aligned index. The solid red line depicts the media attention index, the dashed yellow line depicts the Baker and Wurgler (2006) investor sentiment index, and the dashed blue line depicts the Huang et al. (2014) investor sentiment aligned index. All indices are standardized to have zero mean and unit variance. The shaded periods correspond to NBER-dated recessions. The sample period is 1996:01–2014:12.

[Figure 4 plot: time series over 2005–2015; legend: BW, MAI, PLS, Realized Return]

Figure 4: This figure depicts the excess market return forecasts of the media attention index, the Baker and Wurgler (2006) investor sentiment index, and the Huang et al. (2014) investor sentiment aligned index. The solid green line depicts the realized market returns. The dashed red line depicts the out-of-sample predictive regression forecast for the excess market return based on the previous month's media attention index. The dashed yellow line depicts the out-of-sample excess market return forecast based on the Baker and Wurgler (2006) investor sentiment index, and the dashed blue line depicts the forecast based on the Huang et al. (2014) investor sentiment aligned index. The excess market return forecasts are estimated recursively based on information up to the forecast formation period t alone. The shaded periods correspond to NBER-dated recessions. The out-of-sample period is 2002:07–2014:12.

[Figure 5 plot: time series over 2005–2015; legend: MAI, BW, PLS]

Figure 5: This figure depicts the difference between the cumulative squared forecast error (CSFE) for the historical average benchmark and the CSFE for the out-of-sample predictive regression forecast based on the previous month's predictor values. The solid red line depicts the difference between the CSFE of the historical average benchmark and the CSFE of the media attention index, the dashed yellow line depicts the corresponding difference for the Baker and Wurgler (2006) investor sentiment index, and the dashed blue line depicts the corresponding difference for the Huang et al. (2014) investor sentiment aligned index. Both the indices and the regression coefficients are estimated recursively based on information up to the forecast formation period t alone. The shaded periods correspond to NBER-dated recessions. The out-of-sample period is 2002:07–2014:12.

[Figure 6 plot: x-axis: Ranking (1 to 5); y-axis: Average Correlation; legend: Google, Bloomberg]

Figure 6: This figure plots the average correlation coefficient of Google and Bloomberg search volumes within each group sorted on news attention. Within each group, the correlation coefficient is calculated monthly using stock pairs randomly chosen from the 5 sorted groups. The time span is 1996:01–2014:12.


Table 1: Summary Statistics

This table reports summary statistics for the log excess aggregate stock market return, defined as the log return on the value-weighted S&P 500 stocks in excess of the risk-free rate (Rm), the risk-free rate (Rf), media attention measures, the number of self-connected news (Self News), the number of connected news (Connected News), the Google search index (Google Search) following Da et al. (2011a), (PrcHigh) following George and Hwang (2004), the change in the average number of analysts aggregated from individual S&P 500 stocks using equal weights or value weights (∆ # of AnalystsEW or ∆ # of AnalystsVW), news tones based on the Loughran and McDonald (2011) dictionary using both equal weights and value weights (ToneEW and ToneVW), the Baker and Wurgler (2006) sentiment index, the Huang et al. (2014) PLS sentiment aligned index, the Morck et al. (2000) earnings co-movement index (ECI), macro disagreement (MDis), the VIX from CBOE, the Asquith et al. (2005) short interest ratio (SIR), and 14 economic variables from Amit Goyal's website: the log dividend-price ratio (D/P), the log dividend-yield ratio (D/Y), the log earnings-price ratio (E/P), the log dividend payout ratio (D/E), stock return variance (SVAR), the book-to-market ratio (B/M), net equity expansion (NTIS), the Treasury bill rate (TBL), the long-term bond yield (LTY), the long-term bond return (LTR), the term spread (TMS), the default yield spread (DFY), the default return spread (DFR), and the inflation rate (INFL). For each variable, the time-series average (Mean), standard deviation (Std. Dev.), skewness (Skew.), kurtosis (Kurt.), minimum (Min.), maximum (Max.), and first-order autocorrelation (ρ(1)) are reported. The sample period is 1996:01–2014:12 (Google Search is from 2004:01–2014:12).

Variable              Mean      Std.      Skew.     Kurt.      Min.       Max.       ρ(1)
Rm                    0.0041    0.0449   -0.6565    3.9294    -0.1702     0.1077     0.0841
Rf                    0.0020    0.0018    0.2342    1.4425     0.0000     0.0056     0.9760
MAI                   0.0018    0.7303    0.0559    8.9464    -3.0866     3.0804    -0.3484
MAI size              0.0000    0.0430    0.0317    2.9977    -0.1230     0.1251    -0.3733
MAI ctr               0.0011    0.0568    0.8071   13.7780    -0.2345     0.3309    -0.3209
Self News            42.1855   18.0060    0.6412    2.6174    18.0162   103.9885     0.8516
Connected News        0.0981    0.0755    1.2271    3.6229     0.0264     0.3379     0.7951
Google Search        19.6071   19.4225    0.5502    2.3120     0.0000    78.0000     0.9075
PrcHigh               0.9376    0.0954   -1.9519    6.5747     0.5249     1.0000     0.9242
∆ # of AnalystsEW     0.0212    0.1325    0.8471    4.7283    -0.3546     0.5410     0.0320
∆ # of AnalystsVW     0.0188    0.2680    1.6642   13.9122    -0.7986     1.8765    -0.0060
ToneEW                0.0000    0.0000   -0.6725    3.3833     0.0000     0.0000     0.5691
ToneVW               -0.0031    0.0014   -0.5315    3.2283    -0.0072     0.0008     0.5575
SentBW                0.0981    0.6979    1.6608    6.2224    -0.9300     2.8400     0.9740
SentPLS              -0.1912    0.8566    1.8387    5.9806    -1.1070     3.0270     0.9775
ECI                   0.1474    0.0660    0.4828    2.5107     0.0349     0.3097     0.9574
MDis                  0.8157    1.3221   -1.0760    4.2412    -3.3874     2.9949     0.9579
VIX                  21.3006    8.1753    1.8087    8.3610    10.8200    62.6400     0.8769
SIR                   0.0146    0.0025    0.4291    3.1964     0.0097     0.0221     0.9557
D/P                  -4.0157    0.3990    8.6644  108.6964    -4.5236     0.9531     0.3049
D/Y                  -4.0282    0.2293    0.4221    4.8505    -4.5309    -3.0061     0.8965
E/P                  -3.1708    0.4264   -1.8816    7.3344    -4.8365    -2.5656     0.9042
D/E                  -0.8449    0.6466    5.9170   52.4644    -1.2442     5.7558     0.5144
SVAR                  0.0033    0.0055    6.0983   52.2353    -0.0025     0.0581     0.6977
B/M                   0.2623    0.0786   -0.2289    2.3391     0.0003     0.4411     0.9002
NTIS                  0.0042    0.0188   -1.2641    4.4489    -0.0577     0.0311     0.9720
TBL                   2.4348    2.1300    0.2001    1.3899     0.0100     6.1700     0.9852
LTY                   4.7884    1.2597   -0.3014    2.7387     0.5642     7.2600     0.9426
LTR                   0.6887    3.0497    0.0287    5.6444   -11.2400    14.4300    -0.0147
TMS                   2.3536    1.4059   -0.4527    2.7105    -3.2258     4.5300     0.9032
DFY                   0.9899    0.5026    0.9458   17.0779    -2.2800     3.3800     0.7864
DFR                  -0.0164    1.8399   -0.4594    9.1939    -9.7500     7.3700     0.0198
INFL                  0.0020    0.0041    0.5341   13.7810    -0.0192     0.0290     0.3250

Table 2: Forecasting Market Return with News Network

This table provides in-sample estimation results for the predictive regression of the monthly excess market return on media attention indices, media coverage indices, alternative attention proxies, news tone measures, the Baker and Wurgler (2006) sentiment index, and the Huang et al. (2014) PLS aligned sentiment index:

$R^m_{t+1} = \alpha + \beta X_t + \varepsilon_{t+1},$

where $R^m_{t+1}$ denotes the monthly excess market return (%). *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The sample period is 1996:01–2014:12 (Google Search is from 2004:01–2014:12).

Predictor β t-stat. R2 R2up R2down

Panel A: Media Connection Indices

MAI -0.7766*** -2.6296 3.0075 2.9523 3.0753

MAIsizet -0.8048*** -2.7334 3.2419 2.8097 7.2027

MAIctrt -0.5856** -1.9644 1.7010 1.8219 1.0119

Panel B: Media Coverage Index

∆Self Newst -0.44 -1.47 0.96 0.60 4.19

∆Connected Newst -0.51* -1.72 1.32 0.83 3.08

Panel C: Alternative Attention Proxy

Google Search -0.2595 -0.8682 0.3369 0.0059 0.0466

PrcHigh 0.2068 0.6911 0.2137 0.0288 5.2961

∆ # of AnalystsEW 0.5529* 1.8596 1.5270 0.8227 11.2316

∆ # of AnalystsVW -0.1241 -0.4148 0.0771 0.0032 4.5502

Panel D: Soft Information

Toneew 0.4022 1.3424 0.8016 1.1346 0.0081

Tonecw 0.4564 1.5242 1.0310 1.1707 0.0387

Panel E: Investor Sentiment Index

SentBW -0.5912** -1.9927 1.7495 2.4105 0.2326

SentPLS -0.8019*** -2.7223 3.2164 2.0573 5.9064
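As an illustration of the univariate predictive regression behind this table, the following is a minimal Python sketch rather than the authors' implementation. The function name `predictive_regression` is hypothetical, and the conventional homoskedastic OLS standard error is an assumption, since the caption of this table does not say which standard errors were used.

```python
import math

def predictive_regression(returns, predictor):
    """OLS of returns[t+1] on predictor[t].

    Returns (alpha, beta, t_stat, r2_percent). Hypothetical helper.
    """
    # Align the predictor observed at t with the return realized at t+1.
    x = predictor[:-1]
    y = returns[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    # Assumption: plain OLS standard error, not a HAC (Newey-West) one.
    se_beta = math.sqrt(sse / (n - 2) / sxx)
    return alpha, beta, beta / se_beta, 100.0 * (1.0 - sse / sst)
```

A negative slope from such a regression would correspond to the negative β estimates reported for the media attention indices.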


Table 3: Comparison with Alternative Predictors

This table provides in-sample estimation results for the bivariate predictive regression of the monthly excess market return on the media attention indices, $X_t$, and on one of the media coverage measures, alternative attention proxies, news tones, 14 economic predictors, or investor sentiment indices, $Z_t$:

$R^m_{t+1} = \alpha + \beta X_t + \varphi Z_t + \varepsilon_{t+1},$

where $R^m_{t+1}$ denotes the monthly excess market return (%). The significance of the estimates is based on Newey-West t-statistics. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The sample period is 1996:01–2014:12 (Google Search is from 2004:01–2014:12).

Xt = MAIt

Predictor (Zt) β φ R2 R2up R2down

∆Self News -0.7506*** -0.2714 3.6046 1.9596 4.3424

∆Connected News -0.7364** -0.1523 3.3432 1.8681 4.9827

Google Search -0.7963** -0.6079* 6.4051 3.5823 3.7811

PrcHigh -0.8439*** 0.1603 4.3728 8.6483 8.5214

∆ # of AnalystsEW -0.7874*** 0.3089 4.7683 3.3622 9.5947

∆ # of AnalystsVW -0.8562*** -0.1075 4.2877 3.0691 21.5183

Toneew -0.816*** 0.4321 4.1416 4.1385 7.2092

Tonecw -0.8082*** 0.469 4.3122 4.1084 7.2599

SentBW -0.7919*** -0.5656* 4.8482 3.9898 1.2252

SentPLS -0.8113*** -0.803*** 6.4674 3.8304 7.2419

ECI -0.8119*** 0.0003 3.2560 1.8899 8.0039

D/P -0.8098*** 1.0173* 4.815 7.8207 2.3725

D/Y -0.8024*** 0.6569** 5.2149 6.8955 4.0616

E/P -0.8179*** 0.2503 3.5474 3.4656 12.276

D/E -0.8107*** 0.0673 3.268 1.8175 9.3369

SVAR -0.7945*** -0.6436** 5.3086 1.8169 3.4257

B/M -0.8086*** 0.3054 3.6976 2.5343 1.086

NTIS -0.8166*** 0.5968** 5.0304 1.9377 1.2861

TBL -0.8083*** -0.1757 3.4095 2.1509 2.4001

LTY -0.8079*** -0.3142 3.7232 2.641 2.0243

LTR -0.8113*** 0.1211 3.3291 1.811 2.6135

TMS -0.8119*** -0.0031 3.256 1.8175 2.0512

DFY -0.8123*** -0.3373 3.7151 1.8671 1.9623

DFR -0.8148*** 0.3411 3.8337 1.9528 1.4936

INFL -0.8114*** 0.1809 3.3871 2.738 7.8256


Table 4: Out-of-sample Forecasting

This table reports the out-of-sample performance of the media attention indices and competing predictors in forecasting the monthly excess market return. Panel A provides the results using the media attention indices; Panels B, C, and D use news coverage indices, alternative attention proxies, and news tones; Panel E uses the investor sentiment indices of Baker and Wurgler (2006) and Huang et al. (2014); and Panel F uses the combined economic predictors of Rapach et al. (2010). All of the predictors and regression slopes are estimated recursively using the data available at the forecast formation time t. $R^2_{OS}$ is the out-of-sample $R^2$ with no constraints. CW-test is the Clark and West (2007) MSFE-adjusted statistic computed relative to the prevailing mean model. The $R^2_{OS,up}$ ($R^2_{OS,down}$) statistics are calculated over NBER-dated business-cycle expansions (recessions) based on the unconstrained model. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The out-of-sample evaluation period is 2002:07–2014:12 (Google Search is from 2008:01–2014:12).

Predictor R2OS CW-test R2OS,up R2OS,down

Panel A: Media Attention Indices

MAI 3.3633*** 2.5138 3.6028 2.9255

MAIsizet 2.9996*** 2.4133 2.3011 4.2766

MAIctrt 1.8816* 1.7703 2.4312 0.8770

Panel B: News Coverage Indices

∆Self Newst -3.3057 0.0904 -0.1977 -8.9872

∆Connected Newst -0.7443* 1.7833 -0.8553 -0.5414

Panel C: Alternative Attention Proxy

Google Search 0.8591 1.0972 3.7750 -2.1958

PrcHigh -13.0300 0.3699 -1.5056 -25.1036

∆ # of AnalystsEW -1.0429 0.4147 -2.6142 0.6033

∆ # of AnalystsVW -3.1650 -0.5815 -8.2905 2.2047

Panel D: Soft Information

Toneew 0.0907 0.4552 0.3716 -0.4228

Tonecw 0.1394 0.5667 0.2278 -0.0222

Panel E: Investor Sentiment Indices

SentBWt -0.2470 0.7076 1.0609 -2.6379

SentPLSt 2.0618* 1.8737 0.4386 5.0292

Panel F: Combined Economic Predictors

Mean -0.6688 0.0031 -0.3302 1.3496

Median 0.0521 0.2242 0.1783 2.4225

Trimmed Mean -0.4926 -0.0008 -0.3277 1.8358

DMSPE, θ = 1.0 -0.6925 0.0203 -0.2110 1.1304

DMSPE, θ = 0.9 -0.6055 0.0973 -0.2394 1.3700
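The two out-of-sample statistics in this table can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the out-of-sample R2 is assumed to take the usual form of one minus the ratio of the model's mean squared forecast error (MSFE) to that of the prevailing mean benchmark, and the Clark and West (2007) statistic is computed as the t-statistic of the mean of the MSFE-adjusted loss differential without any autocorrelation correction.

```python
import math

def prevailing_mean_forecasts(returns, start):
    """Benchmark forecast for month t: mean of returns observed before t (start >= 1)."""
    return [sum(returns[:t]) / t for t in range(start, len(returns))]

def oos_r2_and_cw(actual, model_fc, bench_fc):
    """Return (R2_OS in percent, Clark-West MSFE-adjusted statistic)."""
    n = len(actual)
    e_bench = [a - b for a, b in zip(actual, bench_fc)]
    e_model = [a - m for a, m in zip(actual, model_fc)]
    sse_bench = sum(e * e for e in e_bench)
    sse_model = sum(e * e for e in e_model)
    r2_os = 100.0 * (1.0 - sse_model / sse_bench)
    # Adjusted loss differential: the benchmark's squared error minus the
    # model's squared error, corrected for the noise from estimating the
    # larger (nesting) model.
    f = [eb * eb - (em * em - (b - m) ** 2)
         for eb, em, b, m in zip(e_bench, e_model, bench_fc, model_fc)]
    f_bar = sum(f) / n
    var_f = sum((fi - f_bar) ** 2 for fi in f) / (n - 1)
    return r2_os, f_bar / math.sqrt(var_f / n)
```

A positive R2_OS means the predictor's forecasts have a lower MSFE than the prevailing mean, and the CW statistic is compared against one-sided standard normal critical values.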


Table 5: Performance of Sorted Decile Portfolios Based on Media Co-occurrence

This table reports the excess portfolio returns and risk-adjusted alphas of investment strategies based on the number of connected news in the previous month. The sample period is from January 1996 to December 2014. We first sort stocks into 10 deciles according to each firm's number of connected news and label all stocks with a number of connected news in the top (bottom) decile as the short (long) group. We hold each group of stocks for one month and rebalance at the closing price of the next month. Three risk models are considered: the CAPM, the Fama-French (1993) three-factor model, which includes size (SMB) and book-to-market (HML) factors, and the Carhart (1997) four-factor model, which accounts for the incremental impact of the momentum factor. t-statistics are reported in parentheses below the portfolio returns (risk-adjusted alphas).

Portfolios Rm CAPM FF-3 Carhart-4

Long 0.90% 0.30% 0.13% 0.21%

(2.59) (1.77) (0.98) (1.61)

2 0.88% 0.26% 0.11% 0.20%

(2.53) (1.73) (0.93) (1.85)

3 1.05% 0.44% 0.32% 0.44%

(3.04) (2.77) (2.23) (3.32)

4 1.06% 0.44% 0.31% 0.43%

(2.96) (2.57) (2.06) (3.03)

5 0.79% 0.15% 0.04% 0.20%

(2.13) (0.82) (0.22) (1.28)

6 0.83% 0.18% 0.11% 0.22%

(2.28) (1.16) (0.76) (1.57)

7 0.92% 0.21% 0.13% 0.27%

(2.28) (1.14) (0.73) (1.60)

8 0.60% -0.07% -0.11% -0.01%

(1.51) (-0.31) (-0.53) (-0.06)

9 0.68% -0.09% -0.11% 0.11%

(1.50) (-0.37) (-0.49) (0.53)

Short 0.16% -0.67% -0.67% -0.41%

(0.32) (-2.28) (-2.31) (-1.54)

Long - Short 0.74% 0.96% 0.81% 0.62%

(2.15) (2.92) (2.52) (2.00)
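The decile sort behind this table can be illustrated with a minimal Python sketch for a single formation month. It is not the authors' code: the function name `decile_long_short` is hypothetical, equal weighting within each decile is an assumption (the caption does not state the weighting scheme), and ties in news counts are broken by stock identifier purely to keep the example deterministic.

```python
def decile_long_short(news_counts, next_returns, n_groups=10):
    """Sort stocks by last month's connected-news count and form long-short.

    news_counts, next_returns: dicts keyed by stock identifier.
    Returns (long-minus-short return, list of groups from low to high count).
    Requires at least n_groups stocks so that no group is empty.
    """
    ranked = sorted(news_counts, key=lambda s: (news_counts[s], s))
    n = len(ranked)
    groups = [ranked[i * n // n_groups:(i + 1) * n // n_groups]
              for i in range(n_groups)]

    def average_return(group):
        # Assumption: equal-weighted portfolio return within the group.
        return sum(next_returns[s] for s in group) / len(group)

    long_ret = average_return(groups[0])     # fewest connected news: long
    short_ret = average_return(groups[-1])   # most connected news: short
    return long_ret - short_ret, groups
```

Repeating this each month and regressing the resulting long-short series on the factor returns would give the alphas in the last row of the table.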


Table 6: Return Predictability under Different Belief Divergence and Short-selling Constraints

This table provides in-sample estimation results for the predictive regression of the monthly excess market return on media attention indices over high and low belief divergence environments as well as high and low short-sales constraint periods. We use macro disagreement and the VIX as proxies for belief divergence and the value-weighted short interest ratio of S&P 500 stocks as a proxy for short-sales constraints. A high belief divergence (short-sales constraint) indicator equals one if the belief divergence index (short interest ratio) in the previous month is above the median value of the sample period and zero otherwise. The sample period is 1996:01–2014:12. ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively.

Predictor β (High) t-stat. (High) R2 (High) β (Low) t-stat. (Low) R2 (Low)

Panel A: Macro Disagreement

MAIsizet -1.1300*** -2.7425 0.0614 -0.2805 -0.6671 0.0042

MAIctrt -0.8295** -2.2978 0.0439 0.3743 0.6499 0.0040

MAIt -1.0497*** -2.7783 0.0629 -0.0622 -0.1264 0.0002

Panel B: VIX

MAI sizet -1.4200*** -2.8402 0.0672 0.0259 0.0976 0.0001

MAI ctrt -0.8240* -1.8496 0.0296 0.1188 0.3596 0.0012

MAIt -1.2438*** -2.6234 0.0579 0.0667 0.2328 0.0005

Panel C: Short Interest Ratio

MAI sizet -1.0502*** -2.5426 0.0546 -0.4873 -1.1639 0.0123

MAI ctrt -1.1917** -2.3672 0.0477 -0.2406 -0.6603 0.0040

MAI t -1.2827*** -2.8112 0.0659 -0.3774 -0.9840 0.0088
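The sample split used in this table can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: the function names are hypothetical, only the slope is estimated, and the timing is an assumption in which the conditioning variable observed at t (the month before the return at t+1) is compared with its full-sample median.

```python
from statistics import median

def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx

def split_sample_betas(returns, predictor, state):
    """Estimate the predictive slope separately in high and low regimes.

    A month t is 'high' when state[t] exceeds the full-sample median.
    Returns (beta_high, beta_low).
    """
    cut = median(state)
    high_x, high_y, low_x, low_y = [], [], [], []
    for t in range(len(returns) - 1):
        if state[t] > cut:
            high_x.append(predictor[t])
            high_y.append(returns[t + 1])
        else:
            low_x.append(predictor[t])
            low_y.append(returns[t + 1])
    return ols_slope(high_x, high_y), ols_slope(low_x, low_y)
```

Here `state` would be macro disagreement, the VIX, or the short interest ratio, and `predictor` one of the media attention indices.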


Table 7: Risk Adjusted Alphas of Attention-based Trading Strategies under Centrality Weights

We independently sort stocks according to the number of connected news, the self centrality score (SCS), and the average centrality score of connected stocks (CCS). SCS (CCS) classifies stocks into two groups at the median, while the number of connected news divides stocks into five groups. We then report the portfolio return and risk-adjusted alpha of the attention-based trading strategy for four types of stocks: stocks with high centrality scores that connect to low-centrality stocks (high-connect-low), stocks with high centrality scores that connect to high-centrality stocks (high-connect-high), stocks with low centrality scores that connect to low-centrality stocks (low-connect-low), and stocks with low centrality scores that connect to high-centrality stocks (low-connect-high). The trading strategy labels all stocks whose number of connected news in the last month falls in the top (bottom) group as the high (low) attention group, and the portfolio is formed by buying stocks in the low-attention group while selling stocks in the high-attention group. In periods when a given type of stock contains no long (short) stocks, we replace the long (short) excess return with the risk-free rate. We then hold this portfolio for one month and rebalance at the closing price of the next month. Three risk models are considered to compute the risk-adjusted alpha: the CAPM, the Fama-French (1993) three-factor model, which includes size (SMB) and book-to-market (HML) factors, and the Carhart (1997) four-factor model, which accounts for the incremental impact of the momentum factor. t-statistics are reported in parentheses below the portfolio returns (risk-adjusted alphas). The sample period is 1996:01–2014:12.


High-connect-Low 0.75% 0.78% 0.54% 0.26%

(0.87) (0.89) (0.61) (0.29)

High-connect-High 0.26% 0.22% 0.24% 0.24%

(0.94) (0.77) (0.83) (0.83)

Low-connect-Low 1.28% 1.04% 0.99% 1.14%

(1.62) (1.31) (1.24) (1.42)

Low-connect-High 1.40% 0.75% 0.57% 0.61%

(3.09) (2.38) (1.89) (2.02)


Table 8: Risk Adjusted Alphas of Attention-based Trading Strategies under Value Weights

We independently sort stocks according to the number of connected news, the firm's own value weight (SVW), and the average value weight of connected stocks (CVW). SVW (CVW) classifies stocks into two groups at the median, while the number of connected news divides stocks into five groups. We then report the portfolio return and risk-adjusted alpha of the attention-based trading strategy for four types of stocks: big stocks that are connected to small stocks (big-connect-small), big stocks that are connected to big stocks (big-connect-big), small stocks that are connected to small stocks (small-connect-small), and small stocks that are connected to big stocks (small-connect-big). The trading strategy labels all stocks whose number of connected news in the last month falls in the top (bottom) group as the high (low) attention group, and the portfolio is formed by buying stocks in the low-attention group while selling stocks in the high-attention group. In periods when a given type of stock contains no long (short) stocks, we replace the long (short) excess return with the risk-free rate. We then hold this portfolio for one month and rebalance at the closing price of the next month. Three risk models are considered to compute the risk-adjusted alpha: the CAPM, the Fama-French (1993) three-factor model, which includes size (SMB) and book-to-market (HML) factors, and the Carhart (1997) four-factor model, which accounts for the incremental impact of the momentum factor. t-statistics are reported below the portfolio returns (risk-adjusted alphas). The sample period is 1996:01–2014:12.


Big-connect-Small 0.08% 0.23% 0.17% -0.05%

(0.24) (0.67) (0.51) (-0.14)

Big-connect-Big -0.83% -0.34% -0.39% -0.58%

(-1.43) (-0.65) (-0.72) (-1.08)

Small-connect-Small -0.27% -0.29% -0.19% -0.22%

(-0.38) (-0.39) (-0.25) (-0.29)

Small-connect-Big 1.98% 2.10% 2.17% 1.85%

(2.09) (2.20) (2.26) (1.93)


References

Asquith, P., P. A. Pathak, and J. R. Ritter (2005): “Short interest, institutional ownership, and stock returns,” Journal of Financial Economics, 78, 243–276.

Baker, M. and J. Wurgler (2006): “Investor sentiment and the cross-section of stock returns,” The Journal of Finance, 61, 1645–1680.

Barber, B. M. and T. Odean (2007): “All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors,” The Review of Financial Studies, 21, 785–818.

Boudoukh, J., R. Michaely, M. Richardson, and M. R. Roberts (2007): “On the importance of measuring payout yield: Implications for empirical asset pricing,” The Journal of Finance, 62, 877–915.

Busetti, F. and J. Marcucci (2013): “Comparing forecast accuracy: a Monte Carlo investigation,” International Journal of Forecasting, 29, 13–27.

Campbell, J. Y. and S. B. Thompson (2008): “Predicting excess stock returns out of sample: Can anything beat the historical average?” Review of Financial Studies, 21, 1509–1531.

Carhart, M. M. (1997): “On persistence in mutual fund performance,” The Journal of Finance, 52, 57–82.

Clark, T. E. and K. D. West (2007): “Approximately normal tests for equal predictive accuracy in nested models,” Journal of Econometrics, 138, 291–311.

Cohen, L. and A. Frazzini (2008): “Economic links and predictable returns,” The Journal of Finance, 63, 1977–2011.

Da, Z., J. Engelberg, and P. Gao (2011a): “In search of attention,” The Journal of Finance, 66, 1461–1499.

——— (2011b): “In search of attention,” The Journal of Finance, 66, 1461–1499.

Daley, B. and B. Green (2012): “Waiting for News in the Market for Lemons,” Econometrica, 80, 1433–1504.

Engelberg, J. (2008): “Costly information processing: Evidence from earnings announcements,” .

Fama, E. F. and K. R. French (1993): “Common risk factors in the returns on stocks and bonds,” Journal of Financial Economics, 33, 3–56.

Fang, L. and J. Peress (2009): “Media coverage and the cross-section of stock returns,” The Journal of Finance, 64, 2023–2052.


Fang, L. H., J. Peress, and L. Zheng (2014): “Does Media Coverage of Stocks Affect Mutual Funds’ Trading and Performance?” The Review of Financial Studies, 27, 3441–3466.

George, T. J. and C.-Y. Hwang (2004): “The 52-week high and momentum investing,” The Journal of Finance, 59, 2145–2176.

Goyal, A. and I. Welch (2008): “A comprehensive look at the empirical performance of equity premium prediction,” Review of Financial Studies, 21, 1455–1508.

Gurun, U. G. and A. W. Butler (2012): “Don’t believe the hype: Local media slant, local advertising, and firm value,” The Journal of Finance, 67, 561–598.

Hillert, A., H. Jacobs, and S. Muller (2014): “Media makes momentum,” The Review of Financial Studies, 27, 3467–3501.

Hong, H. and J. C. Stein (2007): “Disagreement and the stock market,” Journal of Economic Perspectives, 21, 109–128.

Hong, H., W. Torous, and R. Valkanov (2007): “Do industries lead stock markets?” Journal of Financial Economics, 83, 367–396.

Huang, D., F. Jiang, J. Tu, and G. Zhou (2014): “Investor sentiment aligned: A powerful predictor of stock returns,” Review of Financial Studies, hhu080.

Jacobs, H. (2015): “The role of attention constraints for investor behavior and economic aggregates: what have we learnt so far?” Management Review Quarterly, 65, 217–237.

Jegadeesh, N. and D. Wu (2013): “Word power: A new approach for content analysis,” Journal of Financial Economics, 110, 712–729.

Kandel, S. and R. F. Stambaugh (1996): “On the predictability of stock returns: an asset-allocation perspective,” The Journal of Finance, 51, 385–424.

Kelly, B. and S. Pruitt (2013): “Market expectations in the cross-section of present values,” The Journal of Finance, 68, 1721–1756.

Lettau, M. and S. Ludvigson (2001): “Consumption, aggregate wealth, and expected stock returns,” The Journal of Finance, 56, 815–849.

Li, F. W. (2016): “Macro Disagreement and the Cross-Section of Stock Returns,” The Review of Asset Pricing Studies, 6, 1–45.

Loughran, T. and B. McDonald (2011): “When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks,” The Journal of Finance, 66, 35–65.

Markowitz, H. (1952): “Portfolio selection,” The Journal of Finance, 7, 77–91.


Martins, A. C. R. (2008): “Mobility and social network effects on extremist opinions,” Physical Review E, 78, 036104.

Menzly, L. and O. Ozbas (2010): “Market segmentation and cross-predictability of returns,” The Journal of Finance, 65, 1555–1580.

Miller, E. M. (1977): “Risk, uncertainty, and divergence of opinion,” The Journal of Finance, 32, 1151–1168.

Morck, R., B. Yeung, and W. Yu (2000): “The information content of stock markets: why do emerging markets have synchronous stock price movements?” Journal of Financial Economics, 58, 215–260.

Newman, M. (2010): Networks: an introduction, Oxford University Press.

Odean, T. (1999): “Do Investors Trade Too Much?” American Economic Review, 89, 1279–1298.

Ozgur, A., B. Cetin, and H. Bingol (2008): “Co-occurrence network of Reuters news,” International Journal of Modern Physics C, 19, 689–702.

Peng, L. and W. Xiong (2006): “Investor attention, overconfidence and category learning,” Journal of Financial Economics, 80, 563–602.

Peng, L., W. Xiong, and T. Bollerslev (2007): “Investor Attention and Time-varying Comovements,” European Financial Management, 13, 394–422.

Rapach, D., J. Strauss, J. Tu, and G. Zhou (2015): “Industry interdependencies and cross-industry return predictability,” Working Paper.

Rapach, D. E., J. K. Strauss, and G. Zhou (2010): “Out-of-sample equity premium prediction: Combination forecasts and links to the real economy,” Review of Financial Studies, 23, 821–862.

Rubin, A., B. Segal, and D. Segal (2017): “The interpretation of unanticipated news arrival and analysts skill,” Journal of Financial and Quantitative Analysis, 52, 1491–1518.

Scherbina, A. and B. Schlusche (2015): “Economic linkages inferred from news stories and the predictability of stock returns,” Working Paper.

Solomon, D. H., E. Soltes, and D. Sosyura (2014): “Winners in the spotlight: Media coverage of fund holdings as a driver of flows,” Journal of Financial Economics, 113, 53–72.

Tetlock, P. C. (2007): “Giving content to investor sentiment: The role of media in the stock market,” The Journal of Finance, 62, 1139–1168.


Tetlock, P. C., M. Saar-Tsechansky, and S. Macskassy (2008): “More than words: Quantifying language to measure firms’ fundamentals,” The Journal of Finance, 63, 1437–1467.

Yu, Y. (2015): “Market-wide attention, trading, and stock returns,” Journal of Financial Economics, 116, 548–564.

Zhang, J. L., W. K. Hardle, C. Y. Chen, and E. Bommes (2016): “Distillation of news flow into analysis of stock reactions,” Journal of Business & Economic Statistics, 34, 547–563.
