

Measuring Segmentation in the Financial News Market∗

Harm H. Schütt

Tilburg School of Economics and Management

Tilburg University

November 1, 2018

Abstract

This study examines the extent of financial news market segmentation. I propose that financial events leave room for interpretation, allowing news outlets to differentiate and target audiences with different levels of financial sophistication and dispositional optimism. I develop a probabilistic model to infer these unobservable audience characteristics from earnings announcement coverage and find economically significant audience heterogeneity. Consistent with this heterogeneity reflecting differences in earnings news interpretations, a larger difference in audiences exposed to an earnings announcement is associated with significantly higher trading volume and return volatility after the announcement.

Keywords: media coverage, company earnings, information dissemination, trading volume, differences in beliefs

JEL Classification: M41; G14; L10; D83

∗ I gratefully acknowledge support and funding from the German Research Society. I thank Eric Allen, Brian Ayash, Matthias Breuer, Alissa Brühne, Patricia Dechow, Joachim Gassen, Katharina Hombach, Martin Jacob, Maximilian Müller, Jeffrey Ng (discussant), Martin Nienhaus, Jan Riepe, Anna Rohlfing-Bastian, Thorsten Sellhorn, workshop participants at the Frankfurt School of Finance and Management, University of Bern, University of Cologne, Erasmus University of Rotterdam, Humboldt University, University of Göttingen, University of Paderborn, Tilburg University, University of Tübingen, WHU – Otto Beisheim School of Management, and participants at the EAA meeting in Milan for valuable comments. I thank Christopher Paciorek for his patience and introduction to Bayesian methods. I thank Konstantin Ignatov for exceptional research assistance. Any remaining errors are my own. A significant part of the study was written during my visit to the Haas Business School, Berkeley, CA, in 2016. An earlier version of this manuscript was circulated under the title: ‘Competition in Financial News Markets and Trading Activity’. The Online Appendix can be found at https://hschuett.github.io/mediacomp-paper/. Send correspondence to [email protected].


1 Introduction

The media plays a central role in financial markets due to its information processing and

dissemination function. However, the media is best described as a market for news. Processing and

dissemination by media outlets is not random, but the result of competition for profits. Such

competition can lead to a segmented market, where news outlets differentiate themselves by selecting

and framing news stories in accordance with specific beliefs (Mullainathan and Shleifer, 2005;

Gentzkow and Shapiro, 2006, 2011). The notion of news outlets catering to beliefs is of special

interest in financial markets, because it would not only reflect but possibly exacerbate differences

in investors’ beliefs. As differences in investor beliefs determine stock market outcomes, such as

trading activity (e.g., Kandel and Pearson, 1995; Hong and Stein, 2007; Banerjee and Kremer,

2010) or overpricing (e.g., Miller, 1977; Diether, Malloy, and Scherbina, 2002; Hong and Sraer,

2016), it is important to further our understanding of the information transmission dynamics that

influence belief formation.

However, in spite of its potentially significant role, the extent of financial news segmentation

has remained an open issue so far. The reasons are twofold. First, in contrast to political news,

there are theoretical arguments for and against financial news segmentation (Mullainathan and

Shleifer, 2005; Gentzkow and Shapiro, 2006). There is no segmentation if financial information

has little room for interpretation and the news market is competitive. In this case, slant (framed

and selective reporting) is quickly uncovered and penalized (Gentzkow and Shapiro, 2006). If, on

the other hand, financial information leaves room for interpretation and investors prefer news that

agrees with their own beliefs, then segmentation arises: news outlets offer a differentiated product

by slanting their news coverage towards their target audience’s beliefs (Mullainathan and Shleifer,

2005; Gentzkow and Shapiro, 2006).

Second, good measures of financial news segmentation have been lacking so far. Especially if

the interest is in the relation between differences in investors’ beliefs and news segmentation, then

measures of news audience beliefs for a broad range of outlets are necessary. However, an outlet’s

audience is typically unobservable, which has prevented researchers from examining financial news

segmentation and its consequences.


This study develops and employs an approach to estimate unobserved audience characteristics

in order to overcome this issue. The underlying idea is that if a news outlet caters to its audience,

then its writing and selection of news events must be distinctive enough to attract said audience.

If this is the case, one can reverse engineer the unknown (latent) audience characteristics from

observable coverage choice (i.e., which news events the outlet covers) and content choice (how it

frames its chosen stories). The rationale is roughly similar to that of LDA topic models, which infer

unknown text topics from observed word and document frequencies (Blei, 2014). Crucially, both

coverage and content choices need to be incorporated, because an audience’s beliefs will already

influence what the audience (does not) want to read. Thus, ignoring this pre-selection might

result in biased estimates. In addition, incorporating both sources of information about audience

characteristics will lead to more efficient estimation. This is achieved by formulating a Bayesian

hierarchical model that describes a news outlet’s coverage and content of earnings announcements

(EAs) as a joint function of two unobservable audience characteristics and various observable

firm characteristics. The two audience characteristics are labeled financial sophistication and

dispositional optimism. Dispositional optimism represents a stable expectancy regarding future

outcomes, such as a permanently bullish or bearish disposition (Puri and Robinson, 2007; Sharot,

Korn, and Dolan, 2011). Financial sophistication represents general financial knowledge. Both

reflect fundamental determinants of beliefs1 and have straightforward, inferable relations with

an outlet’s coverage choice (e.g., more financially sophisticated audiences will demand more news

about less visible stocks) and content choice (i.e., optimism affects the tone of the news piece

and sophistication affects the amount of financial jargon).

The model is fit using online news about company earnings. Three features make online

coverage of EAs an opportune setting to examine the extent of financial news segmentation. First,

the online news market is large and highly competitive with low barriers to entry. Second,

earnings-related news is of major interest to investors and highly relevant for belief formation (Kandel

and Pearson, 1995; Huberman and Regev, 2001; Busse and Green, 2002; Tetlock, Saar-Tsechansky,

and Macskassy, 2008; Engelberg, Sasseville, and Williams, 2012). Thus, demand for earnings news

1 Fundamental in the sense that they also affect other characteristics, such as risk aversion (e.g., Kuhnen and Miu, 2017).


is more likely to be driven by its information rather than its entertainment value.2 Third, earnings

information is easily accessible directly rather than through the media. Thus, investors can in

principle compare the content of news coverage with the original earnings releases, which would

make the detection of slant relatively costless in the absence of room for interpretation (Solomon,

2012).

To fit the model I constructed a new, comprehensive sample of online EA news. The sample

consists of 461,216 online posts by 2,128 outlets (websites) about 15,502 EAs of Russell 3000 firms

from April 2015 to November 2016. Out of the 2,128 outlets, 212 have a clear finance/investment

focus and produce 161,408 posts about 11,408 EAs. Both samples show substantial heterogeneity

in outlets’ audience characteristics. Even for the investment-only sample, the estimated variation

in outlet-specific dispositional optimism is approximately 75% of the variation that is EA-specific (0.4 vs. 0.53

in standard deviation estimates). Variation in outlet-specific financial sophistication far outweighs

its EA-specific counterpart (1.29 vs. 0.16 in standard deviation estimates).

To validate that the estimated variation in outlet characteristics relates to differences in investor

beliefs, I examine the relation between dispersion of attending audiences and abnormal trading

volume after EAs. Disagreements give rise to trading opportunities and thus higher volume (Varian,

1985, 1989; Kandel and Pearson, 1995; Hong and Stein, 2007; Banerjee and Kremer, 2010; Atmaz

and Basak, 2017). Results are consistent with variation in news outlet characteristics reflecting

differences in investor beliefs. I find that dispersion in audience characteristics among outlets

covering an EA is positively associated with post-EA trading activity while controlling for a firm’s

newsworthiness (via firm fixed effects and controls for the size of the attending audience, return

volatility, firm size, earnings surprise, and press release tone). An increase in the dispersion

of attending audiences’ negativity by one standard deviation is associated with an increase in

abnormal volume of around 7% of normal volume. A one standard deviation increase in the

dispersion of attending audiences’ sophistication corresponds to an increase in abnormal volume of

about 10%. As a comparison, Peress (2014) documents a reduction of 12% of volume on newspaper

strike days. A similar albeit weaker link exists between belief dispersion and volatility. This is

2 In contrast, other types of company news (e.g., news about boardroom infighting, CEO succession, product launches, etc.) might also be read for a certain amount of entertainment.


because volatility reflects the frequency and magnitude of the change in average belief among

investors. Consistent with this weaker link, I find an association between negativity dispersion

among covering outlets and return volatility, while there is no strong evidence of a relation between

dispersion in financial sophistication among covering outlets and volatility.

The results are robust to several alternative explanations and robustness tests. A possible

alternative explanation is that outlet owners influence coverage decisions. However, I find

considerable variation across outlets with the same owner, which is consistent with similar findings

of Gentzkow and Shapiro (2010). Another alternative is that negativity and complexity capture

the degree of journalistic quality of the outlet. Inconsistent with this explanation, an outlet’s

negativity and complexity are both negatively related to proxies of journalistic quality (size of the

website audience and average news post length).

Robustness tests give further support for the model. Inferences are naturally misleading if

the model is misspecified. To assess the model’s fit, I perform simulations and out-of-sample

prediction tests. In both cases the model performs better than alternatives. As the model is a

generative model, it can be used to simulate what the EA news sample would look like if the

model were correct. I compare characteristics (e.g., average news post tone, percentage of covered

EAs, standard deviation in outlet tone) of 1,000 simulated samples to the characteristics of the

real sample. These checks do not reveal striking misspecifications and indicate a generally better fit than

alternative models. Furthermore, the model with outlet characteristics substantially outperforms

a model without audience parameters in joint out-of-sample predictions of EA coverage, tone, and

complexity. The same holds for out-of-sample predictions of firms being mentioned in market

summary posts.

This study contributes to the literature on information diffusion in financial markets in three

ways. First, there exists a large number of media studies examining the causal effects of news

coverage on information transmission (e.g., Bushee, Core, Guay, and Hamm, 2010; Engelberg

and Parsons, 2011; Drake, Guest, and Twedt, 2014; Peress, 2014; Hillert, Jacobs, and Müller,

2014; Martin and Yurukoglu, 2017; Lawrence, Ryans, Sun, and Laptev, 2018). In contrast, there

is little evidence on what drives financial news coverage such as the incentives and potential


biases that arise in the media market (Engelberg, 2018). Miller (2006) finds that business-oriented

press is more likely to undertake original analysis while non-business periodicals focus primarily

on rebroadcasting. Gurun and Butler (2012) find positive slant that is driven by local firms’

advertising rather than demand in local newspapers from 2002 to 2006. They investigate all firm

news, not just financial news. Ahern and Sosyura (2014) find evidence in M&A news coverage that

is suggestive of firms encouraging positive coverage of share deals in a short time window beneficial

for the share exchange ratio. Drake, Thornock, and Twedt (2017) examine whether the amount of

online coverage from more professional reporters is associated with more positive capital market

effects. And while Gentzkow and Shapiro (2010) provide compelling evidence of slant in political

news coverage being driven by political belief heterogeneity, it is not clear ex-ante whether similar

mechanisms are at work in the financial news market. Gentzkow and Shapiro (2006) argue that

financial information provides less room for interpretation and disagreement than political news.

Thus, the existence of segmentation in today’s competitive online financial news market with its

low barriers to entry is an open question. I contribute to the literature by providing evidence

of sizable and predictable outlet-specific variation in earnings news coverage that is significantly

related to trading volume as well as return volatility. This pattern is consistent with the financial

news market catering to differences in beliefs rather than reducing them (e.g., Veldkamp, 2006; Li,

Ramesh, and Shen, 2011; Twedt, 2015), adding further pieces to the puzzle of a “more complete

theory of the role of the media in financial markets” (Miller and Skinner, 2015, p. 232).

Second, the results add to the growing literature on the consequences of differences in beliefs

for market outcomes. Apart from a wide theoretical literature examining the relation of differences

in beliefs with asset prices, volume, and volatility (e.g., Hong and Stein, 2007; Xiong and Yan,

2009; Banerjee and Kremer, 2010; Atmaz and Basak, 2017), recent studies have started to examine

the information acquisition consequences of investors’ heterogeneous beliefs in more detail. For

example, Han, Lu, and Zhou (2017) find that disagreement among investors in one firm has spillover

effects on the pricing of other stocks owned by the same investors. In terms of management

reactions to heterogeneous beliefs, Huang and Thakor (2013) find that firms are more likely to

strategically buy back shares when the level of investor-management agreement is lower. As


investors disagreeing with management are more likely to tender their shares, agreement improves

as a consequence. Thakor (2015) argues that “more valuable firms, and those whose strategies

investors are more likely to agree with, disclose less information in equilibrium.” To this literature

I add evidence that the structure of the financial media market is a function of differences in

beliefs as well. Given that news coverage guides investor attention (DellaVigna and Pollet, 2009;

Ben-Rephael, Da, and Israelsen, 2017), the documented significant variation in news audience

characteristics and its relation to trading volume provides important insights for future theories of

investors’ information acquisition strategies and firms’ disclosure strategies.

Third, since target audience characteristics are typically unobservable, I further contribute

to the emerging literature employing structural estimation approaches to accounting and finance

settings. The developed model offers a flexible, robust approach to inferring target audiences of

news outlets from multiple observable actions, such as the decision which EA to cover and how

to frame the content. Because of the number of parameters to be estimated (e.g., two latent

parameters for each outlet and two for each EA), there is a concern that estimates of variation are

easily confounded by noise or sparsely distributed data (e.g., Gelman, 2014; Gentzkow, Shapiro,

and Taddy, 2016). To avoid this, I use a Bayesian approach, which provides a methodological

innovation through regularization of estimates via conservative priors. This approach can also be

applied to other settings, such as inferring heterogeneous analyst or CEO characteristics.

The paper proceeds as follows. Section 2 lays out the theory on financial news market dynamics.

Section 3 describes the sample and the data used in the empirical tests. Section 5 derives and

discusses the probabilistic model for inferring target audiences. Section 6 describes the main

empirical tests and presents results. Section 7 presents robustness tests to evaluate the probabilistic

model. Section 8 concludes.


2 Theory development

2.1 Heterogeneity in news content and beliefs

One necessary condition for news segmentation is that financial information leaves room for

interpretation. This can best be explained by an example. On October 14, 2015, Netflix announced its

third-quarter results, significantly missing consensus estimates:3

While global growth was as we expected, our forecast was high for the US and low for

international. We added 0.88 million new US members in the quarter compared to 0.98 million

the prior year and a forecast of 1.15 million. Our over-forecast in the US for Q3 was due

to slightly higher-than-expected involuntary churn (inability to collect), which we believe was

driven in part by the ongoing transition to chip-based credit and debit cards. (NFLX 8-K,

filed October 14, 2015)

Unsurprisingly, news coverage was negative, on average. However, there was still noticeable

variation in sentiment and financial sophistication across news posts. Many posts remained neutral,

simply summarizing the press release:

The ... service added 3.62 million subscribers during the three months ended September, it

announced Wednesday .... That’s slightly more than the company had predicted. But Netflix

didn’t gain as many U.S. subscribers during the latest quarter as management anticipated, a

shortfall that it blamed on an unusually large number of accounts canceled because the company

couldn’t charge their credit cards. (mercurynews.com, Exhibit 1)

However, both negative and positive coverage could be found. Also, the level of financial

sophistication varied. For example, zerohedge.com posted:

Not only did the company miss Q3 results, but it also guided far lower than expected, and is

now expecting 1.65 million domestic streaming subs, below the 1.81 million estimate. And

while the company added a solid 2.7 million international subs, NFLX continues to burn a

ton of cash here, with the contribution margin on international streaming now at -13.1%.

And there will be much more losses. (zerohedge.com; Exhibit 2)

In contrast, commentary at techradar.com found a positive note in the price hike announced

and a positive explanation for high cash use:

3 Full article text for all quoted examples can be found in Online Appendix A.


In its third quarter earnings report, Netflix explained again why it increased the price of its

most popular high-definition 2-screen plan by $1, and it could be because you asked for it.

According to a letter to shareholders, the price hike will ‘improve our ability to acquire and

offer high quality content, which is the number one member request’. . . . And because of

all this “high quality content,” Netflix reported a lower free cash flow total than the second

quarter, saying it was due to the “intensity of our investment in originals.” (techradar.com,

Exhibit 3)

Both techradar.com and zerohedge.com picked up on the negative free cash flow numbers, but

one viewed it as an investment and the other as an expense due to low profitability. Indeed, the

same information can be interpreted differently based on what the prior belief is about future

profitability. More formally, assume investors want to form beliefs about the probability $p_s$ that

Netflix’s strategy pays off in the future. Assume further that the EA consists of $m$ clues about

the strategy’s chance of success. For now, consider the case where there is no room for interpretation

and each clue is either positive or negative. There is only uncertainty about the probability of

future success. We can think of the number of positive clues $y$ as following a binomial distribution

with probability $p_s$ of each clue being positive, $y \sim \mathrm{Bin}(m, p_s)$. Assuming for illustration purposes

that prior beliefs about $p_s$ follow a beta distribution, $p_s \sim \mathrm{Beta}(\alpha, \beta)$, the posterior belief

about $p_s$ is $p_s \mid y, m \sim \mathrm{Beta}(\alpha + y, \beta + m - y)$.4 Fig. 1 Panel A illustrates how investors update

their beliefs about $p_s$ for $y = 6$ positive out of $m = 15$ clues. Six out of 15 positive clues can plausibly arise

from various underlying probabilities to a greater or lesser degree (as depicted by the top middle

plot, which shows the likelihood of observing 6 out of 15 positive clues for a given probability $p_s$).

The top right plot shows that, depending on how positive or negative prior beliefs are, the posterior

beliefs about $p_s$ have converged but are still distinct.
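
The updating rule for the case without room for interpretation can be sketched in a few lines of Python. This is only a minimal illustration: the prior parameters for the "pessimist" and the "optimist" are hypothetical values chosen for the example and are not taken from Fig. 1; only y = 6 and m = 15 come from the text.

```python
# Beta-binomial updating when both investors agree on the data (no room
# for interpretation). Prior parameters below are hypothetical.

def update(alpha, beta, y, m):
    """Posterior Beta parameters after observing y positive clues out of m."""
    return alpha + y, beta + m - y

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Assumed priors for illustration only: Beta(2, 8) and Beta(8, 2).
priors = {"pessimist": (2.0, 8.0), "optimist": (8.0, 2.0)}
y, m = 6, 15  # both investors read 6 of the 15 clues as positive

posteriors = {name: update(a, b, y, m) for name, (a, b) in priors.items()}

prior_gap = beta_mean(*priors["optimist"]) - beta_mean(*priors["pessimist"])
posterior_gap = (beta_mean(*posteriors["optimist"])
                 - beta_mean(*posteriors["pessimist"]))

for name in priors:
    print(name,
          "prior mean:", round(beta_mean(*priors[name]), 3),
          "posterior mean:", round(beta_mean(*posteriors[name]), 3))
print("gap between mean beliefs:", round(prior_gap, 3), "->", round(posterior_gap, 3))
```

Under these assumed priors, the gap between the two investors' mean beliefs shrinks after the update but does not vanish, which is the pattern the text describes for Panel A.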

Consider now the case of room for interpretation (e.g., Netflix’s negative cash flow being a

sign of productive investments or cash burn). Rather than everyone agreeing that there are $y = 6$

positive clues, some of the clues are assumed open for interpretation. Fig. 1 Panel B illustrates the

situation where the pessimist interprets four clues as positive and the optimist eleven. Compared

to the situation in Panel A, the posterior beliefs about $p_s$ remain sharply distinct. What’s more,

4 The beta distribution is the conjugate prior distribution for the binomial likelihood, which implies that the resulting posterior is again a beta distribution.


both beliefs have become stronger, since both investors have “seen” belief-confirming data and

$\mathrm{var}(p_s) = \alpha\beta / \left((\alpha + \beta)^2 (\alpha + \beta + 1)\right)$ has decreased for both.
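
The case with room for interpretation can be illustrated the same way. The sketch below is a hedged example, not the computation behind Fig. 1: the prior parameters are hypothetical, while the clue counts (four positive clues for the pessimist, eleven for the optimist, out of m = 15) follow the text. It applies the beta-binomial update and the beta variance formula to each investor separately.

```python
# Beta-binomial updating when investors interpret the same announcement
# differently. Prior parameters are hypothetical illustration values.

def update(alpha, beta, y, m):
    """Posterior Beta parameters after observing y positive clues out of m."""
    return alpha + y, beta + m - y

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_var(alpha, beta):
    """Variance of Beta(alpha, beta): ab / ((a + b)^2 (a + b + 1))."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

m = 15
# name -> (assumed prior parameters, number of clues read as positive)
cases = {"pessimist": ((2.0, 8.0), 4), "optimist": ((8.0, 2.0), 11)}

results = {}
for name, (prior, y_seen) in cases.items():
    posterior = update(*prior, y_seen, m)
    results[name] = {
        "posterior": posterior,
        "posterior_mean": beta_mean(*posterior),
        "prior_var": beta_var(*prior),
        "posterior_var": beta_var(*posterior),
    }
    print(name, results[name])

gap = results["optimist"]["posterior_mean"] - results["pessimist"]["posterior_mean"]
print("gap between posterior means:", round(gap, 3))
```

With these assumed priors, the posterior means stay far apart and each investor's variance falls, so both end up more confident in diverging beliefs.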

Thus, as long as the information in the EA leaves room for interpretation, differences in prior

beliefs will not fully converge, can persist for a while, and can even be reinforced.5 For example,

one year later, at the next Q3 EA, subscriber growth at Netflix accelerated and zerohedge.com

wrote:

Netflix Soars 20% As Subscribers Smash Expectations But Cash Burn Explodes. ... But

while the subscriber growth was admirable, the biggest problem facing NFLX, its cash burn

just went into overdrive in Q3, with the company reporting that cash burn for the quarter

doubled from $254 million to over half a billion, or $504 million, most of it the result of the

spike in original produced content. ... For now, however, the reason why the shorts are being

massively squeezed is due to the return to a “growth” mode, one which cost the company over

half a billion in cash in the quarter, and the immediate result is a 20% surge in the company

stock after hours. At some point the relentless cash burn will matter, but not right now.

(zerohedge.com; Exhibit 4)

Naturally, the question arises why differences in prior beliefs exist in the first place. Going

back as far as Savage (1954), at least two reasons are discussed in the literature. The first can be

described as financial sophistication, which determines the amount of attention and information

processing errors (see, e.g., Brandenburger, Dekel, and Geanakoplos (1992); Kim and Verrecchia

(1994), who showed this in a formal sense). The second is dispositional optimism, a generally

stable positive or negative attitude toward expectancies regarding future outcomes (see Carver

and Scheier (2014) for a review).

Financial sophistication expresses itself in two ways: the level of financial literacy and access to

other sources of information. Both forms affect information processing and thus belief formation.

For example, retail investors (usually less sophisticated) have a lower level of financial literacy,

hold concentrated portfolios and are less well informed than institutional investors (Barber and

Odean, 2013; Hendershott, Livdan, and Schürhoff, 2015). They do not have direct access to analyst

research and their main source of information is financial news. Consistent with this, Ben-Rephael

5 Hales (2007) provides experimental evidence of such behavior based on motivated reasoning. In a similar scenario, Barron, Byard, and Yu (2017) observe that analysts’ private information increases if certain reporting items are disclosed, while their common information increases for others. This would be consistent with different degrees of room for interpretation for different items.


et al. (2017) find that, while news coverage of a stock draws the attention of both institutional and

retail investors, retail attention is heightened for a prolonged period, whereas institutional attention

quickly reverts to prior levels. Importantly, financial sophistication not only affects beliefs, but

also yields predictions of how news can be presented depending on a chosen target audience. Less

financially literate investors with the media as their main source of information prefer news that

summarizes and explains market and stock movements in a manner that is easy to understand. On

the other hand, such news coverage would not be valued highly by more sophisticated investors

with access to other information channels, since those investors would prefer more detail and

broader coverage.6

Similar arguments relate to the second driver of prior beliefs: dispositional optimism (Puri

and Robinson, 2007; Sharot et al., 2011; Carver and Scheier, 2014; Angelini and Cavapozzi, 2017;

Kuhnen and Miu, 2017). Dispositional optimism is a cognitive construct and concerns time-

stable expectancies regarding future outcomes, such as an unconditional bullishness, bearishness,

or general contrarian tendency against “hype” stocks (Hales, Kuang, and Venkataraman, 2011).7

Just as with sophistication, an audience’s dispositional optimism can be used to tailor news. Any

outlet can select EAs and frame its coverage to reflect a relatively more pessimistic or optimistic

outlook. Whether such tailoring actually occurs is the question of interest and depends on the

news market dynamics.

2.2 News market dynamics

Given differences in prior beliefs, the question arises whether the news market exploits those

differences. A number of media studies assume that news outlets have an incentive to do so because

readers prefer coverage consistent with their beliefs (George and Waldfogel, 2003; Hamilton, 2004;

6 Analyzing a related question, Drake et al. (2017) examine whether the amount of online coverage from more or less professional reporters is associated with more or less efficient pricing. The idea is that less sophisticated coverage from non-professional sources such as personal blogs and forums correlates with noise trading.

7 Recent household finance data provide some evidence: for example, Angelini and Cavapozzi (2017) find evidence consistent with the magnitude of dispositional optimism influencing financial decisions. Using survey data, they find that optimism is positively related to stock ownership as well as the share of gross financial wealth invested in these assets, controlling for cognitive skills, personality traits, and risk aversion. Kuhnen and Miu (2017) use experimental and household survey data and find that individuals with lower socioeconomic status form more pessimistic beliefs when learning about the distribution of stock returns and are less likely to invest in stocks when these investments are likely to have good outcomes.


Mullainathan and Shleifer, 2005; Gentzkow and Shapiro, 2008, 2010; Seamans and Zhu, 2017).

Mullainathan and Shleifer (2005) and Gentzkow and Shapiro (2006) provide two distinct theories

explaining such affirmative preferences. Mullainathan and Shleifer (2005) model the market for

(political) news as a form of product placement. They assume that readers incur “psychological

costs” if they receive news contrary to their own beliefs. Although not stated explicitly by

Mullainathan and Shleifer (2005), such psychological costs can arise because of: (1) confirmation bias

(Nickerson, 1998; Pouget, Sauvagnat, and Villeneuve, 2017), that is, preferring news confirming

one’s own beliefs; (2) investors’ overconfidence (Daniel and Hirshleifer, 2015),8 leading them to

overweight their own beliefs; and (3) strategic ignorance, that is, avoiding information sources

that contradict one’s beliefs for fear of demotivating oneself (Benabou and Tirole, 2016; Golman,

Hagmann, and Loewenstein, 2017).9 Because of readers’ preferences for confirmatory news,

outlets try to differentiate themselves by slanting (framing and selectively reporting) news. In this

scenario, the magnitude of slant is always an increasing function of competition among outlets.

In contrast, Gentzkow and Shapiro (2006) model the market for news as a strategic game

between outlets for revenues from readers who are Bayesian learners. In this game, readers do

not have an inbuilt preference for confirmatory news. Instead, slant emerges because readers,

being unsure about the quality of an outlet, tend to rationally view news disagreeing with their

strong prior beliefs as a signal of lower journalistic quality. Being perceived as being of lower quality

damages the reputations of news outlets, creating incentives for them to slant news content toward

the beliefs of their readers. At the same time, competition among outlets increases the chance of

readers being exposed to the “true” news, and being caught slanting news hurts outlet reputation.

Thus, if news is easy to interpret, competition reduces the incentives to slant.

What is common to both theories is that readers will be inclined towards news in line with their

prior beliefs about a stock’s value. A key difference between them is the nature of news and the

resulting effect of news market competition: Is the information unambiguous or open to interpretation?

8 Overconfidence is often cited as one cause of the active investment puzzle among individual investors (Odean, 1999). Overconfidence has even been found to affect the decisions of professionals such as CFOs and CEOs (Ben-David, Graham, and Harvey, 2013; Malmendier and Tate, 2015).

9 For example, Oster, Shoulson, and Dorsey (2013) and Ganguly and Tasoff (2016) have recently found that, irrationally, many at-risk patients do not want to participate in free, accurate, and anonymous tests for Huntington’s disease or HIV, respectively.

While competition always increases outlet incentives to differentiate in Mullainathan and

Shleifer (2005), Gentzkow and Shapiro (2006) require information to have room for interpretation

for slant to be prevalent in a competitive news market. In this case, the “correct” interpretation

is often difficult to identify ex ante. The question is thus how much room for interpretation there is in

financial news. As the Netflix example illustrates, it can be argued that even EAs often contain

information that one can reasonably disagree about. Consequently, a segmented financial news

market is a theoretical possibility.

2.3 The relation between news market dynamics and trading activity

A link between news market dynamics and prior beliefs also suggests a link between news market

dynamics and trading activity. A long literature explicitly links trading patterns to belief

heterogeneity (e.g., Varian, 1989; Kim and Verrecchia, 1994; Kandel and Pearson, 1995; Hong and Stein,

2007; Banerjee and Kremer, 2010). The argument is that, absent liquidity motives, two investors

must disagree on the value of a stock for a trade to occur. The investor with the higher valuation is

willing to buy and the other is willing to sell. In this case, there is a “... potential channel through

which media coverage can matter for the stock market: if one thinks of the arrival of public news as

creating the raw fodder for disagreement, then increases in the intensity of media coverage can act

as a direct stimulus to trading” (Hong and Stein, 2007, p. 119). Empirical evidence is consistent

with this claim (e.g., Gurun and Butler, 2012; Ben-Rephael et al., 2017).

In contrast to prior literature, I argue that news coverage not only influences disagreement,

but that (prior) disagreement influences news coverage itself. If outlets cater to a certain subgroup

of readers (e.g., unsophisticated but skeptical readers), the prior beliefs and thus news demand of

their audience will determine which EAs an outlet covers and how it covers them. What’s more,

such news segmentation can reinforce rather than reduce existing beliefs, as illustrated in Fig. 1

Panel B. This increases the potential for disagreements further as investors are less exposed to

opposing viewpoints (Mullainathan and Shleifer, 2005; Gentzkow and Shapiro, 2006, 2010).

Larger differences in attending news audiences of EAs should thus be associated with higher

trading volume after the EA. This provides a way to investigate whether heterogeneity in outlet


characteristics is consistent with differences in prior beliefs. A similar argument holds for return

volatility. However, the link between news dynamics and volatility is more subtle, since volume is

driven by differences in beliefs, whereas prices are driven by the average belief (or interpretation).

Hence, return volatility “depends on the variance of the average interpretation” (Banerjee and

Kremer, 2010, p. 1285). However, if differences in prior beliefs are large, an information event is

more likely to cause belief adjustments among investors that also change the average belief. Hence,

one would also expect greater volatility in the presence of large disagreements.

3 Data

4 Sample composition

The data used to investigate belief-driven segmentation come from a large, newly created sample

of EA-related online news posts published between April 2, 2015, and November 30, 2016. I

use the Spinn3r Firehose API, which continuously indexes and crawls weblogs, mainstream news,

and social media sites.10 The service provides more than 10 TB of crawled content per month and

contains a comprehensive set of mainstream online news and Twitter, forum, and blog posts across

the entire spectrum of news topics. The advantage of Spinn3r stream data over other data sets is

that Spinn3r provides the raw content of a news post in html format (after removing boilerplate

text). This raw content is crucial for a number of analysis steps. Most importantly, it allows for

the extraction of the news outlets’ writing styles and application of the word lists of Loughran and

McDonald (2011).

I extract the following information: date of publication (date), source (src lnk), permalink

(prmlnk), html content (html), raw text of the content (cntnt), type of news post (src type, i.e.,

whether a blog, mainstream news, forum, or Twitter post), author name (authr), and title (title)

of the post, if provided by Spinn3r.11 I exclude forum and Twitter posts. I identify news posts

relating to EAs of Russell 3000 companies as well as market summary posts using a specially

trained classifier whose construction is detailed in Online Appendix B. Only news about firms that

10 See http://www.spinn3r.com/.

11 Exhibit 5 in Online Appendix B lists the variables available for mainstream news items in the Spinn3r Firehose.


have been members of the Russell 3000 Index based on the index list from June 2014 is considered.

I define news outlets based on the main domain of each post’s url. This means that, for domains

such as wallstreet.blogs.fortune.cnn.com, the url is shortened to cnn.com. In some cases, an outlet’s

main website has several aliases. For example, sleekmoney.com and financial-market-news.com are

aliases and redirect to marketbeat.com. For each url, I check for redirects and group the main

domain and all aliases under one domain name.
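
The grouping of posts into outlets can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the main domain is always the last two labels of the host name (which fails for suffixes such as co.uk), and the alias table is hypothetical, seeded only with the two examples given above rather than built from real redirect checks.

```python
from urllib.parse import urlparse

# Hypothetical alias table built from the two examples in the text. In practice
# it would be filled by following redirects for every domain (not shown here).
ALIASES = {
    "sleekmoney.com": "marketbeat.com",
    "financial-market-news.com": "marketbeat.com",
}


def main_domain(url: str) -> str:
    """Return the last two host labels, e.g. 'cnn.com'.

    Simplifying assumption: the main domain is always the last two labels.
    """
    host = (urlparse(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def outlet_id(url: str) -> str:
    """Map a post URL to its outlet, collapsing known aliases."""
    domain = main_domain(url)
    return ALIASES.get(domain, domain)


print(outlet_id("http://wallstreet.blogs.fortune.cnn.com/some-post"))  # cnn.com
print(outlet_id("https://www.sleekmoney.com/some-post"))               # marketbeat.com
```

A production version would likely rely on a public-suffix list instead of the two-label rule.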

Table 1 provides a detailed description of the sample cleaning process. The full sample consists

of 461,216 announcement-related posts about 15,502 EAs from 2,128 different news outlets. I use

this sample to provide general descriptive evidence regarding the distribution and diffusion of firm

earnings-related information across the Internet. Based on the full sample, I define a subsample of

news posts only from news outlets with a clear finance/investment theme and about EAs for which

financial data are available from Compustat and similar sources. I label as investment-focused

outlets all news outlets whose Amazon Alexa web information service categories contain the words

investing, equities, markets, business, finance, investments, and economy. Those domains for which

the Alexa service does not provide categories were hand-checked and assigned the appropriate

category. Finally, I include all large-audience general-purpose news sites, such as cnn.com, since

they are a large source of investment-related information. This sample is used later to fit the

model and infer each sample website’s unknown audience characteristics. The advantage of using

the investment-outlets only sample is that slant should mainly be driven by differences in beliefs

rather than, for example, a bias towards local firms such as in local newspapers.
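
The keyword-based part of this labeling rule might look like the sketch below. The category strings are hypothetical inputs, and treating a single keyword match as sufficient is an assumption; the hand-checking of uncategorized domains and the addition of large general-purpose sites are not covered.

```python
import re

# Keywords listed in the text for identifying investment-focused outlets.
KEYWORDS = {
    "investing", "equities", "markets", "business",
    "finance", "investments", "economy",
}


def is_investment_focused(categories):
    """Return True if any category string contains one of the keywords.

    Assumption: one matching keyword in any category is enough.
    """
    for category in categories:
        words = set(re.findall(r"[a-z]+", category.lower()))
        if words & KEYWORDS:
            return True
    return False


# Hypothetical category strings, for illustration only.
print(is_investment_focused(["Business and Investing"]))   # True
print(is_investment_focused(["Regional news", "Sports"]))  # False
```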

The necessary financial data for the subsample are drawn from the Compustat, Center for

Research in Security Prices, Institutional Brokers’ Estimate System, and Thomson Reuters

Institutional (13F) Holdings databases. They provide the EA characteristics that are necessary for the

coverage model, Eq. (5). Table 2 presents the definitions and construction of all variables in the

tests. The main dependent variables are defined as follows: Ij,i is a binary indicator equal to one if

news outlet i writes at least one news post longer than 100 words about EA j and zero otherwise.12

yneg is the number of words with a negative connotation and ycplx is the number of financial words

12 I chose 100 words as the minimum to be informative about a typical quarterly EA.


with more than three syllables. The latter is used to proxy for the level of financial sophistication

that the news post targets (Loughran and McDonald, 2011). Both yneg and ycplx are constructed

using the financial word lists compiled by Loughran and McDonald (2011) and downloaded from

Bill McDonald’s word list page.13
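
The construction of the three dependent variables can be illustrated with a short sketch. The two word sets below are small placeholders chosen for illustration, not entries verified against the Loughran and McDonald (2011) lists, and the regular-expression tokenizer is likewise an assumption.

```python
import re

# Placeholder word sets standing in for the Loughran-McDonald lists.
NEGATIVE_WORDS = {"loss", "decline", "impairment"}
COMPLEX_WORDS = {"amortization", "collateralized", "derivative"}

MIN_WORDS = 100  # minimum post length stated in the text


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def post_measures(text):
    """Word counts for one post: total words, y_neg, and y_cplx."""
    words = tokenize(text)
    return {
        "n_words": len(words),
        "y_neg": sum(w in NEGATIVE_WORDS for w in words),
        "y_cplx": sum(w in COMPLEX_WORDS for w in words),
    }


def coverage_indicator(posts):
    """I_{j,i}: 1 if outlet i has at least one post on EA j longer than 100 words."""
    return int(any(len(tokenize(p)) > MIN_WORDS for p in posts))


example = "The loss widened after an impairment and a derivative writedown."
print(post_measures(example))
```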

4.1 Descriptive statistics

The full sample provides a unique opportunity to study how earnings information is represented

in the overall media landscape. Table 3 provides descriptive statistics for various news outlet

characteristics by outlet topic. The comparison between investment-focused and

non–investment-focused outlets highlights three stylized facts about variation in general EA coverage. First, most

EA-related news posts appear in non-finance/investment news outlets. These are largely regional

news sites or industry-specific business sites such as metal.com.14 Second, there is sizable variation

in outlet characteristics. UniqueVisitors is the average number of daily unique visitors (in millions)

to the outlet’s website during October 2016, as estimated by Amazon’s Alexa web information

service. The distribution of unique visitors is highly skewed, with a few outlets clearly dominating

in terms of audience reach. Perc.NegWords and Perc.ComplexWords are the proportions of words

in a news post from the word lists of Loughran and McDonald (2011) for negative and complex

words respectively, averaged over all news posts by an outlet. The average percentage of negative

and complex words is consistent with other financial texts. For example, the average management

discussion and analysis section of a 10-K filing contains 1.5% negative and 0.83% positive words

when measured using the financial word lists (Loughran and McDonald, 2011, Table 2). Variation

in Perc.NegWords and Perc.ComplexWords is broadly similar between outlets with and without

a finance/investing focus. This is consistent with fundamental information being a first-order

determinant of content. Articles/EA is the average number of articles an outlet writes about

an EA. The median of 253 for TotalWords (average number of words in EA-related news posts)

suggests that a significant number of outlets issue short summaries, similar to newswire alerts

– sometimes automatically generated. Third, given that the sample comprises seven earnings

13 http://sraf.nd.edu/data/.

14 See Online Appendix C for the list of websites comprising the samples.


seasons, the median number of covered EAs is low. EAs/Outlet is the number of EAs covered by a

news outlet during the sample period, with a median of 14 EAs for non–investment-focused outlets

and 39.5 EAs for investment-focused outlets. Both groups of outlets have sizable variation around

those low numbers. This result could suggest that immediate coverage of an average EA does

not always have a high priority for all investment-focused outlets, even though EAs are generally

regarded as the most important recurring firm-specific information events.

5 Estimating unobserved audience characteristics

To assess the magnitude of segmentation in the sample, the unobservable characteristics of each

outlet’s target audience need to be inferred based on the sample’s news posts. If news outlets cater

to stable characteristics, then one can identify those from observed coverage and content decisions.

This section lays out the probabilistic model to do so.

I assume that news source i’s target audience can be described by two unobserved characteristics:

its dispositional optimism (such as a generally bullish or bearish attitude) (Puri and Robinson,

2007; Sharot et al., 2011) and its financial sophistication. Because of competitive forces, the optimal

strategy for any news outlet i is to target an audience by positioning itself optimally with

respect to its competitors. In terms of the model, news outlet i optimally chooses two parameters

{θi,neg, θi,cplx} = θi that mark the optimism (operationalized as negativity) and sophistication

(complexity of writing) of the target audience. News outlet i’s choice of audience is inferable from

which EAs it decides to write about and the accompanying content. As Fig. 2 shows, EAs cluster

over time, with sometimes 300 or more EAs on the same day. The clustering makes it important for

outlets to choose which EAs to cover. The choice depends on the target audience θi and EA/firm

characteristics (denoted Xj) that make the announcement appealing to this particular audience. In

addition to choosing which EAs to cover, news outlets write the content to cater to their audience

(Hamilton, 2004; Gentzkow and Shapiro, 2010). Thus, three observable news attributes inform

about θi: (1) news outlet i’s decision about whether EA j is sufficiently interesting for its audience

to write about, given EA/firm characteristics (Ij,i|Xj); (2) the tone of a news post, measured as

the number of negative words (yi,j,neg); and (3) the amount of financial jargon used in a news post,


measured as the number of complex financial terms (words with three or more syllables) (yi,j,cplx).

I use negative tone as a sentiment proxy. It is a more stable summary measure of sentiment of news

text than measures of net tone (e.g., Tetlock, 2007; Gurun and Butler, 2012).15 Combining the

information in all three observable attributes leads to an identification strategy for unobservable,

latent target audience characteristics.

This setup implies a large number of parameters to be estimated (e.g., two latent parameters

for each outlet and two for each EA). There is a concern that estimates of variation are easily

confounded by noise or sparsely distributed data (e.g., Gelman, 2014; Gentzkow et al., 2016). To

illustrate the issue, I simulate a simple setting: one sample is drawn from a population of the form

yi,t = ai + ui,t, with ui,t ∼ N(0, 15) and ai ∼ N(2, σ = 3) – i.e., there is considerable noise in the

regression. The parameters of interest are the individual effects ai. Fig. 3 illustrates differences in

ai estimates by a simple fixed effects (dummy) estimator and a Bayesian estimator using a prior

for regularization. The fixed effects specification drastically overestimates the heterogeneity in ai

because it produces very noisy estimates for outlets i with a low number of observations in the

sample. The standard errors are large and correspondingly the estimates are often far away from

the true value.
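
The following sketch reproduces the spirit of this simulation under stated assumptions. It reads N(0, 15) as a standard deviation of 15, draws hypothetical unbalanced group sizes, and replaces the full Bayesian estimator with a simple partial-pooling formula that treats both variances as known. It is therefore an illustration of the shrinkage effect, not the estimator behind Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups = 200
mu, sd_a, sd_u = 2.0, 3.0, 15.0             # sd_u = 15 is an assumed reading of N(0, 15)
n_obs = rng.integers(2, 40, size=n_groups)  # hypothetical unbalanced group sizes (2..39)

# Population: y_it = a_i + u_it
a_true = rng.normal(mu, sd_a, size=n_groups)
group_means = np.array([
    (a + rng.normal(0.0, sd_u, size=n)).mean() for a, n in zip(a_true, n_obs)
])

# Fixed-effects (dummy) estimator: each a_i is estimated by its own group mean.
a_fe = group_means

# Partial pooling with known variances: shrink each group mean toward the
# grand mean, more strongly when the group has few observations.
weight = sd_a**2 / (sd_a**2 + sd_u**2 / n_obs)
grand_mean = np.average(group_means, weights=n_obs)
a_pp = grand_mean + weight * (group_means - grand_mean)

rmse_fe = np.sqrt(np.mean((a_fe - a_true) ** 2))
rmse_pp = np.sqrt(np.mean((a_pp - a_true) ** 2))

print("sd of true effects:        ", a_true.std())
print("sd of fixed-effects est.:  ", a_fe.std())
print("sd of partially pooled est:", a_pp.std())
print("RMSE fixed effects / pooled:", rmse_fe, rmse_pp)
```

In this setup the dispersion of the fixed-effects estimates exceeds the dispersion of the true effects because sampling noise is added on top of true heterogeneity, which is the overestimation described above.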

To avoid this issue and simultaneously incorporate multiple sources of information, I use a

Bayesian approach. The advantages are: (1) the hierarchical model below imposes automatic

regularization on imprecisely estimated coefficients via the choice of priors and thus counters

noise-fitting. By imposing normally distributed priors with a zero mean θi ∼ N (0, σθ) for

{θi,neg, θi,cplx} = θi, the model assumes a baseline prior of no market segmentation and applies

a certain amount of regularization to the estimates. The amount of regularization is to some

extent adaptive as the prior for the individual effects is learned from the data because each observation

informs about σθ (the amount of regularization further depends on the variance hyperprior

and the number of available observations per news outlet). Only if the data contain enough information

to the contrary will the estimates of θi move away from zero. This conservative approach

is especially important, given the low number of EAs covered by some outlets, which could lead to

15 The reason for this is that optimistic language words, such as extraordinary and great, are much more ambivalent in common usage.


highly imprecise estimates and high variation otherwise. (2) Bayesian updating provides a natural

approach to simultaneously incorporate multiple sources of information about the latent audience

characteristics (the content choice and the coverage choices).16

Settling on a Bayesian approach, the goal is to estimate the posterior probabilities p (θi|β, y, Ij,i, Xj).

These posteriors can be interpreted as the model’s belief about what each outlet’s audience most

likely is, given the observable data. β is a vector of regression coefficients associated with EA

characteristics (Xj) that are further determinants of the decision to cover an EA (e.g., larger firms

are generally more likely to receive coverage). To derive the posterior, one needs to specify the

likelihood (data model) of the three observable news attributes, as well as the prior distributions

for the parameters θi and β. Since yi,j = (yi,j,neg, yi,j,cplx) is only observed if outlet i covers EA

j, data are restricted to observable news posts yi,j,obs for part of the likelihood. Thus, the posterior

for the observable news posts is equal to the integrated full posterior that marginalizes out

not-covered news posts:

$$
p(\theta_i, \beta \mid y_{i,j,obs}, I_{j,i}, X_j) = \int p(\theta_i, \beta \mid y_{i,j}, I_{j,i}, X_j)\, dy_{i,j,miss} \tag{1}
$$

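where p(θi, β|yi,j, Ij,i, Xj) is proportional to the prior and the complete-data likelihood:

$$
\underbrace{p(\theta_i, \beta \mid y_{i,j}, I_{j,i}, X_j)}_{\text{posterior}} \propto \underbrace{p(\theta_i, \beta)}_{\text{prior}}\; \underbrace{p(y_{i,j}, I_{j,i} \mid \theta_i, \beta, X_j)}_{\text{likelihood}} \tag{2}
$$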

Assuming that once a news outlet’s target audience characteristics θi, coverage choice weights β,

and EA characteristics Xj are known, the decision to write about an EA (Ij,i) is independent

of how the post is actually written (yi,j), one can simplify: p(Ij,i|yi,j, θi, β,Xj) = p(Ij,i|θi, β,Xj).

Then, the complete-data likelihood p(yi,j, Ij,i|θi, β,Xj) is a simple product:

$$
p(y_{i,j}, I_{j,i} \mid \theta_i, \beta, X_j) = p(y_{i,j} \mid \theta_i, \beta, X_j)\; p(I_{j,i} \mid \theta_i, \beta, X_j) \tag{3}
$$

16 Furthermore, since all the parameters are jointly estimated, all inferences automatically incorporate the uncertainty in the other parameters. So, no adjustments, such as adjustments of standard errors in 2SLS estimators, are necessary.


Substituting (3) into the integral (1) and separating covered (yi,j ∈ Nobs) and non-covered (yi,j ∈

Nmiss) news posts yields

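$$
\begin{aligned}
\underbrace{p(\theta_i, \beta \mid y_{i,j,obs}, I_{j,i}, X_j)}_{\text{posterior}}
&= \int p(\theta_i, \beta \mid y_{i,j}, I_{j,i}, X_j)\, dy_{i,j,miss} \\
&\propto \int p(\theta_i, \beta)\; p(y_{i,j} \mid \theta_i, \beta, X_j)\; p(I_{j,i} \mid \theta_i, \beta, X_j)\, dy_{i,j,miss} \\
&\propto p(\theta_i, \beta) \prod_{n \in N_{obs}} p(y_{n,i,j} \mid \theta_i, \beta, X_j) \prod_{n \in N_{all}} p(I_{n,j,i} \mid \theta_i, \beta, X_j) \\
&\qquad \times \int \prod_{n \in N_{miss}} p(y_{n,i,j} \mid \theta_i, \beta, X_j)\, dy_{i,j,miss} \\
&\propto \underbrace{p(\theta_i, \beta)}_{\text{prior}}\; \underbrace{p(I_{j,i} \mid \theta_i, \beta, X_j)}_{\text{coverage decision likelihood}}\; \underbrace{p(y_{i,j,obs} \mid \theta_i, X_j)}_{\text{content likelihood}}
\end{aligned}
\tag{4}
$$

because $\int \prod_{n \in N_{miss}} p(y_{n,i,j} \mid \theta_i, \beta, X_j)\, dy_{i,j,miss}$ integrates to one.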

I model the coverage decision likelihood (p (Ij,i|θi, β,Xj)) based on previous research (e.g.,

Drake et al., 2014) by assuming Ij,i ∈ {0, 1} follows a Bernoulli distribution with probability

πi,j,cov. I then estimate that probability using a logit regression that includes the latent audience

characteristics and various covariates used in prior literature to proxy for the general information

environment of a firm and general determinants of the usefulness of earnings news:

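$$
\begin{aligned}
I_{j,i} &\sim \text{Bernoulli}(\pi_{i,j,cov}) \\
\operatorname{logit}(\pi_{i,j,cov}) &= \beta_0 + \beta_1 \theta_{i,neg} + \beta_2 \theta_{i,cplx} + \beta_3 \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_4 \mathit{RetVol}_j + \beta_5 \mathit{RetVol}_j\, \theta_{i,neg} + \beta_6 \mathit{RetVol}_j\, \theta_{i,cplx} + \beta_7 \mathit{RetVol}_j\, \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_8 \mathit{logMV}_j + \beta_9 \mathit{logMV}_j\, \theta_{i,neg} + \beta_{10} \mathit{logMV}_j\, \theta_{i,cplx} + \beta_{11} \mathit{logMV}_j\, \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_{12} \mathit{BtoM}_j + \beta_{13} \mathit{InstHold}_j + \beta_{14} \mathit{PriRet}_j + \beta_{15} \mathit{ESurp}_j \\
&\quad + \beta_{16} \mathit{PRNegTone}_j + \beta_{17} \mathit{NrAnalyst}_j + \beta_{18} \mathit{NrEAs}_j
\end{aligned}
\tag{5}
$$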

In asset markets, the purpose of acquiring information is to reduce the expected conditional variance

of future asset payoffs (Grossman and Stiglitz, 1980; Veldkamp, 2006). Accordingly, two key

determinants for information demand are the firm’s past stock payoff volatility (measured using


the prior six-month return volatility, RetVol) and the stock’s weight in the market portfolio

(determined by its market value, logMV). Both RetVol and logMV are interacted with the news

outlet-specific parameters, since information about uncertain stocks and large companies will be

valued heterogeneously depending on the audience. For example, more pessimistic investors could

demand more news about uncertain firms. At the same time, more sophisticated investors are likely

to have an information processing advantage over their unsophisticated counterparts and demand

more information on smaller stocks, where the information environment is sparser. Additional

covariates are included based on prior literature and theoretical considerations: the book-to-market

ratio (BtoM) captures expectations about future growth potential. The amount of institutional

holdings (InstHold) is included as a measure of information environment. PriRet denotes a firm’s

buy-and-hold stock return over the past six months. If a stock has performed especially well or

poorly recently, information about its future is likely to be in greater demand as well. Earnings

surprise (ESurp) and the tone in the press release (PRNegTone) are included to account for the

actual information in the earnings release (Tetlock et al. (2008) suggest that both are equally

strong signals of future firm performance). The number of analysts (NrAnalyst) is included as

a control for other information sources. Finally, the number of EAs on the same day (NrEAs)

is included as a control for the number of competing information events. The calculation of each

variable is described in Table 2.
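
To make the structure of the coverage equation concrete, the sketch below evaluates the coverage probability for a given outlet and EA using the standard logit link, π = logit⁻¹(η). All coefficient values are hypothetical, and the seven non-interacted controls (β12 to β18) are collapsed into a single pre-computed sum for brevity.

```python
import math


def inv_logit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def coverage_probability(theta_neg, theta_cplx, ret_vol, log_mv, beta, controls=0.0):
    """Coverage probability for one outlet-EA pair.

    beta     : sequence of 12 coefficients (beta_0 .. beta_11), ordered as in Eq. (5)
    controls : pre-computed sum of the beta_12 .. beta_18 terms (a simplification)
    """
    audience = (1.0, theta_neg, theta_cplx, theta_neg * theta_cplx)
    eta = controls
    # Blocks of four: intercept block, RetVol block, logMV block.
    for k, x in enumerate((1.0, ret_vol, log_mv)):
        for m, a in enumerate(audience):
            eta += beta[4 * k + m] * x * a
    return inv_logit(eta)


# Hypothetical coefficients, for illustration only.
beta = [0.0] * 12
beta[0] = -1.0   # intercept
beta[8] = 0.5    # main effect of logMV
print(coverage_probability(0.0, 0.0, 0.0, 0.0, beta))  # average outlet, average EA
print(coverage_probability(0.0, 0.0, 0.0, 1.0, beta))  # EA one s.d. larger in logMV
```

With standardized covariates, setting every input to zero returns the baseline probability implied by the intercept alone.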

Since the observed tone and complexity measures are word counts, the likelihood (p (yi,j,obs|θi, β,Xj))

is assumed to follow a binomial distribution, with πi,j,neg (πi,j,cplx) being the probability of any

given word n ∈ Ni,j,words being negative (complex). This parametrization then allows πi,j,neg to be

modeled using a logit regression:

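$$
\begin{aligned}
y_{i,j,neg} &\sim \text{Binom}(\pi_{i,j,neg}, N_{i,j,words}) \\
\operatorname{logit}(\pi_{i,j,neg}) &= \mu_{neg} + a_{j,neg} + \theta_{i,neg} \\
\theta_{i,neg} &\sim N(0, \sigma_{\theta,neg}) \\
a_{j,neg} &\sim N(0, \sigma_{a,neg})
\end{aligned}
\tag{6}
$$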


where θi,neg is the unobserved audience negativity and is assumed to be drawn randomly from an

unknown normal distribution: θi,neg ∼ N (0, σθ,neg). θi,neg is contrasted with general EA negativity

(aj,neg ∼ N (0, σa,neg)), which captures tone that is the same for all news posts about EA j. I

include aj,neg because news sources that have a generally more pessimistic target audience likely

use more negative words, irrespective of the EA’s content. However, such news sources are also

more likely to write about negative events. To control for this phenomenon, announcement effects

are included.17

The model for the number of complex financial terms yi,j,cplx and the parametrization for πi,j,cplx

are analogous to those for yi,j,neg and πi,j,neg:

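$$
\begin{aligned}
y_{i,j,cplx} &\sim \text{Binom}(\pi_{i,j,cplx}, N_{i,j,words}) \\
\operatorname{logit}(\pi_{i,j,cplx}) &= \mu_{cplx} + a_{j,cplx} + \theta_{i,cplx} \\
\theta_{i,cplx} &\sim N(0, \sigma_{\theta,cplx}) \\
a_{j,cplx} &\sim N(0, \sigma_{a,cplx})
\end{aligned}
\tag{7}
$$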
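
The content likelihood for a single post can be sketched as below. The function evaluates the binomial log-likelihood of a word count given the intercept, the EA effect, and the outlet effect, using the standard logit link. The numerical inputs are hypothetical and serve only to show that a post with many negative words is better explained by a larger outlet effect.

```python
import math


def inv_logit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binom_logpmf(y, n, p):
    """Log of the binomial probability mass function for 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log1p(-p)


def content_loglik(y, n_words, mu, a_j, theta_i):
    """Log-likelihood of y flagged words in a post of n_words words."""
    p = inv_logit(mu + a_j + theta_i)
    return binom_logpmf(y, n_words, p)


# Hypothetical post: 30 negative words out of 300, intercept of -4 on the logit scale.
print(content_loglik(30, 300, mu=-4.0, a_j=0.0, theta_i=0.0))
print(content_loglik(30, 300, mu=-4.0, a_j=0.0, theta_i=1.0))
```

The same function applies to the complexity count by substituting the corresponding intercept and effects.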

For the last component of the posterior, the vector of prior probabilities (p (θi, β)) is added. To

achieve shrinkage, I use weakly informative normally distributed priors with a large variance

N(0, 10) for the coefficients β0, ..., β18 and half-normal priors with mean zero and standard

deviation 0.5 for the scale hyperparameters σθ,neg, σθ,cplx, σa,neg, and σa,cplx. As illustrated before,

inclusion of priors is the key advantage of this approach (for example, as compared to modeling

θi,neg and aj,neg as fixed effects). The (informative) prior induces automatic shrinkage by assuming

that θi, aj are most likely zero and rarely greater than 1.18
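
What this prior implies for the outlet effects can be checked with a short prior simulation. The sketch draws a scale from the half-normal hyperprior and then one effect from the corresponding normal prior; the number of draws and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 100_000

# Half-normal(0, 0.5) hyperprior on the scale: absolute value of a normal draw.
sigma = np.abs(rng.normal(0.0, 0.5, size=n_draws))

# One effect per scale draw: theta ~ N(0, sigma).
theta = rng.normal(0.0, sigma)

share_large = np.mean(np.abs(theta) > 1.0)
print("prior share of |theta| > 1:", share_large)
print("prior median of |theta|:   ", np.median(np.abs(theta)))
```

The share of draws with an absolute value above one is small, which is the sense in which the prior treats large outlet effects as rare before seeing any data.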

The final estimates, the posteriors p (θi|β, y, Ij,i, Xj), are weighted averages of this prior and

the likelihood, with the weight given to the likelihood depending on how informative the data

17 Another way to control for this phenomenon would have been to use the tone of announcement j’s corporate press release as a benchmark. However, using announcement effects instead will capture not only the press release tone but also other things, such as the general sentiment or prior expectations about the EA that would not be included in the press release.

18 Remember that for a normal distribution approx. 95% of all observations lie in a two standard deviation band around the mean. As a frame of reference, Gelman, Jakulin, Pittau, Su et al. (2008, p. 4) remark: “For logistic regression, a change of 5 moves a probability from 0.01 to 0.5, or from 0.5 to 0.99”


is. As a result, the model should be less likely to overfit any θi due to noise or a small number

of articles for i and thus produces a more accurate or at least conservative estimate of audience

characteristics and their variation. Similarly, the posteriors of σθ,neg and σθ,cplx are straightforward,

regularized measures of how much variation in tone, writing, and coverage is outlet specific. E.g.,

if the estimate of σθ,neg is close to zero, there is little outlet-specific variability in the data. One can

think of such variance hyperparameters as crude measures of the effect’s relevance in explaining

variations in outcome (McElreath, 2016, p. 376) and I will use them as such in section 6.
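
The weighted-average logic can be made explicit in a stylized case. The expression below assumes a normal likelihood with known sampling variance, which is a simplification made here for exposition; the Bernoulli and binomial likelihoods used above do not yield a closed form. The symbols $\bar{y}_i$ (sample mean for outlet i), $n_i$ (number of observations for outlet i), and $s^2$ (sampling variance) are introduced only for this illustration. With prior θi ∼ N(0, σθ), the posterior mean is

$$
E[\theta_i \mid \bar{y}_i] = w_i\, \bar{y}_i + (1 - w_i) \cdot 0, \qquad w_i = \frac{\sigma_\theta^2}{\sigma_\theta^2 + s^2 / n_i}
$$

so the weight on the data approaches one as $n_i$ grows and approaches zero as $\sigma_\theta$ shrinks toward zero. Outlets with few articles are therefore pulled most strongly toward the prior mean of zero.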

6 Results

6.1 Estimates of unobserved target audiences

The model is fit using the sample of investment-focused news outlets (see Table 4 for descriptive

statistics). First, this sample is likely homogeneous in the sense that the readers are interested

in investment news. The full sample includes many industry-specific news outlets, where other

reasons for slant could play an additional role, such as a local advertising bias (Gurun and Butler,

2012). Second, large-scale MCMC methods become computationally slow for high numbers of

parameters, especially in a hierarchical setting. Because of the large sample size and hierarchical

structure, I employ a variational inference estimation procedure (Kucukelbir, Ranganath, Gelman,

and Blei, 2015).19

The model fit is presented in Table 5. It suggests economically significant variation in the

estimates of θi. For example, the posterior mean of the standard deviation in outlet negativity

(σθ,neg) is 0.53, which is of similar magnitude as the posterior mean of the standard deviation of

the EA effects in the news tone equation (σa,neg, 0.40). The variation in outlet-specific complexity

far outweighs that of EA-specific complexity (posterior mean of 1.29 versus 0.16), though part of

the large difference is driven by two extreme websites (moderngraham.com and raygent.com). One

should note that θi is also determined by coverage, whereas aj is not. Nevertheless, a sizable amount

of variation in news coverage seems determined by outlet specifics rather than by EA fundamental

19 The use of variational inference algorithms is most prominent in latent topic models in automated textual analysis (Blei, 2014). I use the mean field algorithm as implemented in Stan (Carpenter, Gelman, Hoffman, Lee, Goodrich, Betancourt, Brubaker, Guo, Li, and Riddell, 2016).


information. All variables were centered and standardized before the model was estimated. Thus,

each coefficient in the coverage decision likelihood can be interpreted as the change in log odds,

holding all other variables fixed at their average value (i.e., the average firm). Both main coefficients

on θi,neg and θi,cplx have a large magnitude and opposite signs. For the average EA and outlet, θi,neg

is negatively associated with coverage (-3.54) while θi,cplx exhibits a positive main effect (1.75). A

possible reason is that outlets with the least negative tone disseminate predominantly automated

EA news. Outlets with more financially sophisticated readers tend to cover more firms. As expected

from theory and prior literature (e.g., Veldkamp, 2006), return volatility (RetVol, 0.08) and market

size (logMV, 0.47) are highly associated with coverage. Moreover, the magnitude of the relation

strongly depends on the target audience, since the interactions with the audience parameters are

also large and highly significant. More negative outlets cover more EAs from firms with less historic

stock volatility (RetVol × θi,neg, -0.04), and this association is even stronger if the outlet also targets

more financially sophisticated readers (RetVol × θi,cplx, -0.85 and RetVol × θi,neg × θi,cplx, -0.65).

Both more negative and more sophisticated outlets also have a higher likelihood of covering smaller

firms (logMV × θi,neg, -0.21, logMV × θi,cplx, -2.76, and logMV × θi,neg × θi,cplx, -2.27). Judging

from the coefficient magnitudes and given that all the variables are standardized, the degree of

financial sophistication seems to have the most explanatory weight in all these interactions. In

examining the importance of other coverage determinants in the model, it is worth noting that

earnings surprise (ESurp, 0.04) and press release tone (PRNegTone, 0.03) do not seem to be

important predictors, given all the other parameters in the model.
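How the interaction terms change the role of return volatility can be made concrete with a short sketch. It uses the posterior means for the volatility terms reported above; the outlet with θ values of one on both dimensions is hypothetical, and the calculation ignores posterior uncertainty.

```python
# Posterior means reported in the text for the coverage equation (log-odds scale).
B_RETVOL = 0.08
B_RETVOL_NEG = -0.04
B_RETVOL_CPLX = -0.85
B_RETVOL_NEG_CPLX = -0.65


def retvol_slope(theta_neg, theta_cplx):
    """Change in the log odds of coverage per one-unit increase in standardized RetVol."""
    return (B_RETVOL
            + B_RETVOL_NEG * theta_neg
            + B_RETVOL_CPLX * theta_cplx
            + B_RETVOL_NEG_CPLX * theta_neg * theta_cplx)


average_outlet = retvol_slope(0.0, 0.0)
negative_sophisticated_outlet = retvol_slope(1.0, 1.0)  # hypothetical outlet
```

For the average outlet, the slope equals the main effect of 0.08. For the hypothetical outlet, the three interaction terms dominate and the slope turns clearly negative, which is the pattern described in the text.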

Fig. 4 shows the resulting distribution of outlet audiences based on their estimated target au-

diences. The figure plots the exponentiated posterior means of each outlet’s θ parameters. Since

both are estimated in a logistic regression system, the exponentiated versions can be interpreted

as odds ratios. For example, the exp(θi,neg) value of moderngraham.com of close to 1.5 indicates

that the odds of any word in a news post from that outlet being negative are about 50% higher

than the baseline odds. The baseline odds, based on the posterior mean of the intercepts, are

exp(−3.91) = 0.02 for a word being negative and exp(−0.85) = 0.42 for a word being a financial

and complex term. The distribution of outlets seems in accordance with prior anecdotal expecta-


tions. For example, the contrarian professional blog zerohedge.com is significantly more negative

than wsj.com and cnbc.com is less negative than wsj.com. In terms of extreme sites, the site using

the most financial jargon by far is moderngraham.com, which, as the name suggests, is a site (and

service) dedicated to fundamental value investing. Its free content encompasses stock lists and

detailed company valuations. The site raygent.com is a board publishing short trading suggestions

on life sciences stocks. Among the most negative sites are tradersmagazine.com (an online maga-

zine aimed at professional traders) and shareholdersfoundation.com (a site focusing on class action

lawsuits). The site tickerreport.com is the least negative, issuing a large number of short auto-

mated financial alerts about stock events, with barely any tone. Notably, the top 5% of outlets by

audience size (bbc.com, cnn.com, nytimes.com, dailymail.co.uk, theguardian.com, cnet.com, huff-

ingtonpost.com, washingtonpost.com, buzzfeed.com, foxnews.com, forbes.com, aol.com, rt.com)

cluster in a comparably concentrated region of the figure.
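The odds-ratio reading of the exponentiated θ parameters can be sketched as follows. The intercept and the odds ratio of roughly 1.5 are taken from the text; treating all other terms of the tone equation as zero (an average EA) is an assumption made for illustration.

```python
import math


def odds_to_probability(odds):
    """Convert odds into a probability."""
    return odds / (1.0 + odds)


baseline_odds_negative = math.exp(-3.91)  # intercept of the tone equation reported in the text
outlet_odds_ratio = 1.5                   # approximate exp(theta_neg) of the example outlet
outlet_odds_negative = baseline_odds_negative * outlet_odds_ratio

baseline_prob = odds_to_probability(baseline_odds_negative)
outlet_prob = odds_to_probability(outlet_odds_negative)
```

Because the baseline odds are small, the odds and the implied per-word probabilities are close to each other, so a 50% increase in odds corresponds to a nearly 50% increase in the probability that a word is negative.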

The data in Fig. 4 and Table 5 are consistent with significant news market segmentation along

sophistication and optimism characteristics. Therefore, I use the posterior means of θneg and θcplx

in a next step to analyze the relation between disagreement among news outlets and trading volume

after EAs.

6.2 Relation between news coverage dispersion and trading activity

If the estimated variation in audiences reflects differences in beliefs, then greater heterogeneity of

audiences exposed to an EA should be associated with higher trading volume and possibly higher

return volatility.

To provide initial univariate evidence, Fig. 5 plots the average size-adjusted abnormal trading

volume around the sample EAs for two groups of EAs. As a measure of attending audience hetero-

geneity, I compute the interquartile range (DispCovNeg) of all θi,neg estimates and DispCovSoph

for all θi,cplx estimates of all outlets that cover the EA.20 In panel A, the low group encompasses

EAs where DispCovNeg is below the sample median and the high group encompasses EAs with

20 I use the interquartile range rather than the standard deviation because it is more robust to differences in the number of news outlets per EA. In addition, the EA-specific standard deviations of negativity and financial sophistication are more highly correlated than the interquartile ranges, which reduces the precision of the coefficients. The results are similar and slightly weaker when measuring dispersion by the standard deviation.


DispCovNeg above the median. The grouping in panel B is equivalent for DispCovSoph. Ab-

normal turnover is the daily trading volume scaled by shares outstanding, minus the average daily

trading volume in the event period [-31, -1] (Chae, 2005). The abnormal volume is size adjusted by

regressing abnormal volume on size and keeping the residuals. Consistent with the results of Chae

(2005), volume spikes at the announcement and slowly returns to normal levels in the following

days. Both panels A and B show that this spike is significantly higher for EAs that are covered by

news outlets with differing audience characteristics. The difference is especially striking for large

differences in sophistication, where the difference between groups has a magnitude of about 70%

of normal volume (3.7 vs. 4.4 on day +1; the baseline is normal volume and 4.4 therefore means

4.4 × normal volume).
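The dispersion measure can be sketched as the interquartile range of the outlet-level posterior means across the outlets covering an EA. The outlet names, θ values, and coverage lists below are hypothetical, and the quantile convention (linear interpolation) is one common choice rather than necessarily the one used in the paper.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def interquartile_range(values):
    ordered = sorted(values)
    return quantile(ordered, 0.75) - quantile(ordered, 0.25)


# Hypothetical posterior means of theta_neg for five outlets.
theta_neg = {"a.com": -0.6, "b.com": -0.1, "c.com": 0.0, "d.com": 0.3, "e.com": 0.9}

# Hypothetical coverage: which outlets covered which earnings announcement (EA).
coverage = {"EA1": ["a.com", "b.com", "c.com", "d.com", "e.com"],
            "EA2": ["b.com", "c.com", "d.com"]}

disp_cov_neg = {ea: interquartile_range([theta_neg[o] for o in outlets])
                for ea, outlets in coverage.items()}
```

In this toy example, EA1 is covered by outlets with more heterogeneous negativity than EA2 and therefore receives the larger DispCovNeg value. The analogous computation on θi,cplx yields DispCovSoph.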

The univariate analysis in Fig. 5 does not take a key confounding determinant into account.

Information content, and thus demand by investors, affects both news outlet coverage and posterior

beliefs. To isolate the relation between news coverage and trading volume, I use a regression

setup and control for the information in the EA, the size of the overall audience, and the general

information demand of investors regarding that stock. Standard information demand models

(e.g., Veldkamp, 2006) suggest firm size (logMV) and ex ante uncertainty (as measured by return

volatility, RetVol) as key determinants of information demand. Accordingly, I include size, return

volatility, and firm fixed effects as controls. I include SumReach, which is the sum of the average

number of daily visitors across all covering outlets, as a further control for attention. To further

control for EA information, I include the earnings surprise (ESurp) and tone (PRNegTone) of

the announcement.

Results are presented in Table 6. Panel A shows a positive and significant association between

the dispersion of news outlet target audience characteristics (both DispCovNeg and DispCovSoph)

and abnormal trading volume. Abnormal trading volume is the average abnormal turnover in the

event period [0, +2]. Even after firm fixed effects and all covariates are included, the relations

remain sizable and significant. Considering that the whole sample period is 18 months, firm

fixed effects likely eliminate most of a firm’s current newsworthiness and information demand.

Depending on the specification, the coefficient of DispCovNeg ranges from 0.071 to 0.076 and


the coefficient of DispCovSoph from 0.121 to 0.108. In all cases, the percentile bootstrap 95%

confidence interval does not include zero.21 The magnitude is economically sizable. Increasing

DispCovNeg by one standard deviation is associated with an increase in abnormal volume of

around 7% of normal volume, controlling for firm fixed effects, the size of the attending audience,

return volatility, firm size, earnings surprise, and press release tone. A one standard deviation

increase in DispCovSoph corresponds to an increase in abnormal volume of about 10%. Panel B

presents a similar analysis, but with return volatility as the dependent variable. Return volatility

is measured as the scaled standard deviation of the daily log market-adjusted return in the event

period [-1, +5]. DispCovNeg exhibits a positive and stable association with return volatility

around EAs, even after controlling for the other covariates and including firm effects. In contrast,

DispCovSoph does not exhibit an apparent relation. As argued before, the link between return

volatility and news dynamics is not as straightforward as that with volume. Still, strong prior

disagreements among investors, as reflected by greater dispersion of the outlet-specific negativity

component DispCovNeg, are likely associated with more frequent/larger changes in the average

belief after an EA, as investors adjust their beliefs and the average market belief also likely shifts.
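The percentile bootstrap used for the confidence intervals can be illustrated with a minimal sketch. It is a plain pairs bootstrap of a univariate OLS slope on simulated data; it does not reproduce the fixed-effects regressions or the additional step of propagating uncertainty in the estimated dispersion measures, and all names and values are assumptions.

```python
import random


def ols_slope(xs, ys):
    """Slope of a univariate least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def percentile_bootstrap_ci(xs, ys, n_boot=1000, level=0.95, seed=0):
    """Resample observation pairs with replacement and take empirical quantiles."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    slopes.sort()
    lower = slopes[int((1.0 - level) / 2.0 * n_boot)]
    upper = slopes[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper


# Simulated data (not the paper's data): volume rises with dispersion plus noise.
rng = random.Random(1)
dispersion = [rng.gauss(0.0, 1.0) for _ in range(200)]
abnormal_volume = [0.5 * d + rng.gauss(0.0, 1.0) for d in dispersion]
ci_low, ci_high = percentile_bootstrap_ci(dispersion, abnormal_volume)
```

An interval that excludes zero, as in this simulated case, corresponds to the statement in the text that the bootstrap interval does not include zero.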

In summary, the economically significant association between coverage segmentation and trad-

ing activity documented in Fig. 5 and Table 6 is consistent with news outlets catering to readers

based on differences in beliefs. The economic magnitude suggests that news segmentation might

be an important factor in the belief formation process. What the preceding analysis cannot test is

the degree to which the observed association is causally driven by news coverage affecting beliefs

versus merely reflecting beliefs. A similar question, whether news affects political attitudes, has

been extensively researched in the economics and political science literature. For example, DellaV-

igna and Kaplan (2007) argue that the introduction of Fox News to cable TV caused a shift in

favor of the Republican vote between 1996 and 2000. It is possible that both mechanisms are at

work and disentangling them could be a fruitful area for future research.

21 Bootstrapping the standard errors and confidence intervals is necessary to reflect the additional uncertainty in DispCovNeg and DispCovSoph, stemming from the fact that these are themselves estimated and based on the estimated posterior means.


6.3 Variation in coverage and content characteristics among news outlets with common ownership

The preceding sections provide evidence consistent with substantial and predictable differences in

content and coverage of EA news. Furthermore, the association between coverage variation and

trading activity is consistent with the variation being belief driven and thus demand driven. The following

sections examine the plausibility of alternative explanations. First, in a competitive market,

supply should ultimately be driven by demand and the online news market, with its low barriers

to entry, is highly competitive. Still, the possibility of significant supply side effects, particularly

owner effects, exists. A famous example of owner-induced variation in coverage and content is the

English news gazette The Sun, whose owner, Rupert Murdoch, supposedly ordered a change in

political attitude twice during his ownership of the paper (Reeves, McKee, and Stuckler, 2016).

One way of exploring the issue, at least to some extent, is to introduce a separate owner effect

into the model and investigate whether news outlets with common ownership still demonstrate

considerable variation in tone and coverage characteristics. To separate owner-specific effects from

news outlet-specific coverage and content characteristics, I expand the model by introducing a

varying owner mean into the structure of the latent variables (θi):

$$
\begin{aligned}
\theta_{i(z),neg} &\sim N(\mu_{owner(z),neg}, \sigma_{\theta,neg}) \qquad (8) \\
\mu_{owner(z),neg} &\sim N(0, \sigma_{owner,neg})
\end{aligned}
$$

with the analogous version for θi(z),cplx. These owner-specific tone (µowner(z),neg) and complexity

(µowner(z),cplx) averages are shared by each news outlet i(z) belonging to the same owner z.22 This

specification requires ownership data, however. Attempts at identifying co-owned sites via identical

IP addresses lead to too many false positives. The Alexa web information service has data on

co-ownership for some sites. Thus, I use a subsample (including non–investment-focused outlets)

for which Alexa has co-ownership data and refit the model with owner effects. The results are

presented in Table 7. After including owner-specific means, σθ,neg and σθ,cplx represent variation among

22The main model in Eqs. (5) to (7) is essentially a special case with σowner,neg = 0 and σowner,cplx = 0.


outlets owned by the same owner. Both are still sizable.
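The nested structure of Eq. (8) can be illustrated by simulating from it. The sketch below assumes that the second argument of N(·,·) is a standard deviation and uses hypothetical values for the number of owners, the number of outlets per owner, and both scale parameters.

```python
import random


def simulate_outlets(n_owners, outlets_per_owner, sigma_owner, sigma_theta, seed=0):
    """Draw theta_i(z) ~ N(mu_owner(z), sigma_theta) with mu_owner(z) ~ N(0, sigma_owner)."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_owners):
        mu_owner = rng.gauss(0.0, sigma_owner)
        groups.append([rng.gauss(mu_owner, sigma_theta)
                       for _ in range(outlets_per_owner)])
    return groups


def within_owner_sd(groups):
    """Pooled standard deviation of outlets around their own owner's mean."""
    squared = 0.0
    dof = 0
    for thetas in groups:
        owner_mean = sum(thetas) / len(thetas)
        squared += sum((t - owner_mean) ** 2 for t in thetas)
        dof += len(thetas) - 1
    return (squared / dof) ** 0.5


groups = simulate_outlets(n_owners=200, outlets_per_owner=5,
                          sigma_owner=0.3, sigma_theta=0.5)
sd_within = within_owner_sd(groups)
```

The pooled within-owner standard deviation recovers the value chosen for σθ, which illustrates why σθ,neg and σθ,cplx measure variation among outlets of the same owner once owner-specific means are included.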

6.4 Association between outlet characteristics and journalistic quality

Besides supply side effects, larger values of news outlet-specific negativity or sophistication could reflect the

higher degree to which an outlet engages in journalistic content creation. To provide evidence on

such possible associations, Fig. 6 plots the distribution of both θneg and θcplx against an outlet’s

audience size and its average EA news post length. If larger websites have more resources to engage

in journalistic activities and negativity or sophistication capture journalistic activity, one would

expect a positive relation between the number of unique visitors and both audience parameters.

Panels A and B do not provide evidence of such an association. Panel B exhibits a slight nega-

tive correlation between news outlet-specific financial sophistication and audience size. Panels C

and D plot the relation with an outlet’s average EA news post length (measured in words). The

average post length could be another indicator of journalistic activity. Thus, if negativity cap-

tures journalistic activity, one would expect a positive relation with average post length. Panel C

presents a negative relation between negativity and post length, inconsistent with negativity cap-

turing journalistic activity. Panel D exhibits a positive relation between financial sophistication

and post length, albeit most of the slope of the regression line is driven by a few large observations,

consistent with the largest posts being more complex but also more neutral. Taken together, it

seems unlikely that outlet negativity and sophistication are driven by journalistic quality.

7 Robustness tests

7.1 Posterior predictive checks

The previous results are conditional on the model and test setup accurately describing the relevant

parts of the media and capital markets. If, for instance, the model is significantly misspecified,

inferences will be misleading. To provide some confidence in the model, I employ posterior

predictive checks and two out-of-sample prediction tests to examine how well the model fits the

data.


Posterior predictive checks make use of the fact that Bayesian models are generative. One

can draw from the posterior distribution, which yields a complete set of model parameter estimates

that can be used to replicate the entire sample with simulated, predicted values

($Y^{rep} = \{y^{rep}_{neg}, y^{rep}_{cplx}, I^{rep}\}$). By repeatedly sampling from the posterior distribution, a distribution of

predicted samples can be drawn that reflects the underlying uncertainty in the model parameters.

This approach allows for a comparison of replicated sample features with the actually observed

features. For example, for each replicated sample, I compute the percentage of EAs covered, which

yields a distribution of the sample test statistic whose variance reflects the uncertainty in the

model. This distribution can be compared to the actual percentage of covered EAs in the sample.

By selecting important sample features, one can assess how well the model fits the data for the

purposes at hand and spot hints for where the model might be misspecified.
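The logic of a posterior predictive check on the share of covered EAs can be sketched with a toy model. The sketch assumes a single coverage probability with a uniform Beta(1, 1) prior, which is much simpler than the hierarchical model in this paper; the counts are hypothetical.

```python
import random


def posterior_predictive_coverage(n_covered, n_pairs, n_draws=1000, seed=0):
    """Replicate the share of covered EA-outlet pairs under a toy Beta-Bernoulli model."""
    rng = random.Random(seed)
    replicated_shares = []
    for _ in range(n_draws):
        # One posterior draw of the coverage probability.
        p = rng.betavariate(1 + n_covered, 1 + n_pairs - n_covered)
        # One replicated sample of the same size, simulated from that draw.
        covered_rep = sum(1 for _ in range(n_pairs) if rng.random() < p)
        replicated_shares.append(covered_rep / n_pairs)
    return replicated_shares


observed_share = 100 / 500  # hypothetical data: 100 of 500 pairs covered
shares = posterior_predictive_coverage(n_covered=100, n_pairs=500)
mean_replicated = sum(shares) / len(shares)
# Share of replications at least as large as the observed statistic.
ppc_p_value = sum(1 for s in shares if s >= observed_share) / len(shares)
```

If the observed statistic sits far in the tail of the replicated distribution, the model fails to reproduce that feature of the data. The same comparison can be repeated for any statistic of interest, such as the standard deviation of mean tone across outlets.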

Table 8 lists several such test statistics for the actual sample of EA coverage, as well as distri-

butions based on two models: the full model, used to compute the results in Table 5, and a base

model that is equivalent to the probabilistic model from Eqs. (5) to (7), except it excludes θi,neg,

θi,cplx, and all interactions with both. The posterior predictive checks indicate that the full model,

incorporating outlet-specific negativity and sophistication parameters, generally fits observed as-

pects of the sample such as the mean and variation in tone and complexity of coverage better

than the model excluding outlet-specific parameters. In terms of content characteristics, the full

model is generally closer to the observed features of the data. Not surprisingly, this effect is most

apparent in the standard deviation of mean tone and complexity across news outlets, since the full

model is specifically designed to model this variation. The model without outlet-specific negativity

and sophistication predicts almost twice the percentage of covered EAs actually observed.

Both models drastically underestimate the number of EAs that are covered by no outlet at all.

In summary, the full model seems to approximate the main features of the data well, alleviating

concerns about misspecification.


7.2 Out-of-sample predictive ability of latent audience characteristics

As a second test of the model’s fit and to address potential overfitting concerns, I employ two

out-of-sample prediction tests: first, using five-fold cross-validation and, second, using a sample

of market summary news. The predictive accuracy gained by incorporating the latent audience

characteristics is also a natural measure of the importance of those characteristics for the coverage

and content decisions of the news outlets.

To determine the additional predictive accuracy gained from incorporating audience charac-

teristics estimates, I compare the same two nested models (used to compute posterior predictive

checks) in their ability to predict coverage, tone, and language complexity of news posts that were

not used to fit the model. The full model is more complex and contains a substantial number of

interactions. Thus it is likely to fit the sample well; but it is also more likely to overfit the data. In

that case, the full model would produce worse out-of-sample predictions, because the coefficients

(fitted on noise) will produce bad predictions for newly introduced data points. I adopt a five-fold

cross-validation out-of-sample prediction approach: I randomly divide the sample into five parts;

fit each model on four parts; use the fitted model to predict tone, complexity, and coverage in the

omitted fifth part of the sample; and repeat the exercise for all five parts to obtain predictions for

the whole sample.
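The five-fold procedure described above can be sketched as follows. The training-sample mean stands in for the fitted model purely as a placeholder; the data and function names are hypothetical.

```python
import random


def five_fold_indices(n_obs, n_folds=5, seed=0):
    """Randomly assign observation indices to folds of (nearly) equal size."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


def cross_validated_predictions(y, fit, predict, n_folds=5, seed=0):
    """Predict every observation from a model fitted without its own fold."""
    folds = five_fold_indices(len(y), n_folds, seed)
    predictions = [None] * len(y)
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(y)) if i not in held_out_set]
        model = fit([y[i] for i in train])
        for i in held_out:
            predictions[i] = predict(model, i)
    return predictions


# Placeholder "model": the training-sample mean (stands in for the Bayesian fit).
y = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
preds = cross_validated_predictions(y, fit=lambda ys: sum(ys) / len(ys),
                                    predict=lambda model, i: model)
```

Each observation receives exactly one prediction, and that prediction never uses the observation itself, so prediction errors are out of sample for the whole data set.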

The resulting distribution of the predictions from both models as well as the actual values are

presented in Table 9 and Fig. 7. The comparison provides a general overview of the predictive

quality and areas where the models might be misspecified. Table 9 compares the descriptive

statistics of the prediction errors of both models. The full model clearly outperforms the model

without outlet-specific parameters in terms of accurately predicting coverage decisions and the

number of negative words. The model without outlet-specific parameters is more accurate in

predicting the average number of complex words, but in turn also seems to produce overly

extreme forecasts in the tails of the distribution. Fig. 7 plots these distributions. Of particular

note are Panel C and Panel D. Panel C shows the estimated distribution of predicted probabilities

for EA–outlet pairs that do not result in coverage. Panel D shows the analogous distribution for

EA–outlet pairs that result in coverage. A perfect model would have all its probability mass at zero


for non-covered EAs (panel C) and all its probability mass at one for EAs that are actually covered

(panel D). The full model significantly improves on the predictive quality of the model without

target audience characteristics. As can be seen from panels C and D, using the full model and

designating EA–news outlet combinations with an estimated coverage probability of at least 0.2

as covered will correctly classify most cases. Thus, while the inclusion of news outlet dimensions

already improves the prediction of word counts, it strongly improves coverage prediction.23
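The classification rule with a probability cutoff of 0.2 can be sketched in a few lines. The predicted probabilities and coverage indicators below are hypothetical and only illustrate the mechanics.

```python
def classify_coverage(probabilities, threshold=0.2):
    """Label an EA-outlet pair as covered when its predicted probability reaches the threshold."""
    return [p >= threshold for p in probabilities]


def accuracy(predicted, actual):
    """Share of pairs whose predicted label matches the realized outcome."""
    return sum(1 for p, a in zip(predicted, actual) if p == a) / len(actual)


# Hypothetical out-of-sample probabilities and realized coverage indicators.
predicted_prob = [0.01, 0.03, 0.25, 0.60, 0.15, 0.02]
covered = [False, False, True, True, True, False]
labels = classify_coverage(predicted_prob)
share_correct = accuracy(labels, covered)
```

A cutoff well below 0.5 is natural here because coverage is a rare event, so even covered pairs often receive modest predicted probabilities.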

7.3 Out-of-sample predictions using market activity summary posts

The second out-of-sample robustness test examines how well θi,neg and θi,cplx help predict the

content of a second category of news posts, one that has not been used to estimate the latent

audience characteristics. A subset of news outlets publishes regular summaries of market activities

(MA), usually once or twice per day. Since these are intended to provide a brief overview of

noteworthy events, the beliefs of the target audience should also influence what would count as a

noteworthy event. I identify the market activity summary posts for each news outlet that produces

them, assign each sentence in the post to a company, and run the following logit prediction model—

mirroring the decision choice model in Eq. (5)—using the posterior means from the main model

as predictions for θi,neg and θi,cplx, with the model again fitted using five-fold cross-validation:

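$$
\begin{aligned}
I_{MA,j,i} &\sim \text{Bernoulli}(\pi_{cov,MA}) \qquad (9) \\
\text{logit}^{-1}(\pi_{cov,MA}) &= \beta_0 + \beta_1 \theta_{i,neg} + \beta_2 \theta_{i,cplx} + \beta_3 \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_4 RetVol_j + \beta_5 RetVol_j \theta_{i,neg} + \beta_6 RetVol_j \theta_{i,cplx} + \beta_7 RetVol_j \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_8 logMV_j + \beta_9 logMV_j \theta_{i,neg} + \beta_{10} logMV_j \theta_{i,cplx} + \beta_{11} logMV_j \theta_{i,neg}\theta_{i,cplx} \\
&\quad + \beta_{12} BtoM_j + \beta_{13} InstHold_j + \beta_{14} PriRet_j + \beta_{15} ESurp_j \\
&\quad + \beta_{16} PRNegTone_j + \beta_{17} NrAnalyst_j + \beta_{18} NrEAs_j
\end{aligned}
$$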
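The structure of Eq. (9) can be sketched in code. The sketch reads the right-hand side as the log odds of a mention and maps it to a probability with the inverse logit, which is an interpretive assumption; the coefficient values and inputs are hypothetical placeholders, not estimates from this paper.

```python
import math


def inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def coverage_probability(beta, theta_neg, theta_cplx, ret_vol, log_mv, controls):
    """Evaluate the linear predictor of Eq. (9) and map it to a probability.

    beta: list of 19 coefficients beta_0..beta_18 (hypothetical values below).
    controls: the seven remaining covariates in the order
              BtoM, InstHold, PriRet, ESurp, PRNegTone, NrAnalyst, NrEAs.
    """
    audience = [1.0, theta_neg, theta_cplx, theta_neg * theta_cplx]
    terms = (audience                            # beta_0..beta_3
             + [ret_vol * a for a in audience]   # beta_4..beta_7
             + [log_mv * a for a in audience]    # beta_8..beta_11
             + list(controls))                   # beta_12..beta_18
    linear_predictor = sum(b * t for b, t in zip(beta, terms))
    return inverse_logit(linear_predictor)


# Hypothetical coefficients: an intercept of -2 and a firm-size main effect of 0.5.
beta = [-2.0] + [0.0] * 18
beta[8] = 0.5
p_average_firm = coverage_probability(beta, 0.5, -0.3, 0.0, 0.0, [0.0] * 7)
p_large_firm = coverage_probability(beta, 0.5, -0.3, 0.0, 2.0, [0.0] * 7)
```

Building the terms from the shared `audience` vector keeps the main effects and the three sets of audience interactions aligned with the ordering of the coefficients in the equation.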

23 In untabulated tests, I compared the predictions of the full model with the predictions of a fully interacted model, namely, one that interacts each covariate in the coverage likelihood with θi,neg and θi,cplx. The fully interacted model had significantly worse out-of-sample characteristics and also did worse in the comparison of posterior predictive checks.


The distributions of the predictions from both models, compared to the actual values, are presented

in Fig. 8, similar to panels C and D of Fig. 7. The left panel shows the estimated distribution

of predicted probabilities for EA–outlet pairs that do not result in coverage. The right panel

shows the analogous distribution for EA–outlet pairs that result in coverage. Again, the full

model significantly outperforms the model without target audience characteristics in correctly

predicting summary occurrences. The right panel shows that the full model’s distribution of

predicted probabilities has significantly greater mass toward higher probabilities.

7.4 Other measures of prior beliefs

I use the number of negative words to measure negativity as a proxy for prior beliefs. This is

in accordance with a stream of other media and textual analysis studies (e.g., Tetlock, 2007;

Tetlock et al., 2008; Loughran and McDonald, 2011; Gurun and Butler, 2012) that use only the

number of negative words instead of a measure combining negative and positive word counts into

something akin to net tone. The reason for this is that negative words are more stable and do

nearly always convey negative sentiment, whereas positive words very often do not carry positive

sentiment. This is especially true for news texts. For example, the rather negative site

zerohedge.com (whose motto is “On a long enough timeline the survival rate of everyone drops

to zero”) uses many positive words, such as exceptional, remarkable, and great, but rarely in a

positive context. Untabulated results also show that using common net tone measures such as

log((1 + NrPosWords)/(1 + NrNegWords)) leads to a worse posterior predictive sample fit and

worse out-of-sample predictions. I therefore decided to follow the prior literature in finance on

media effects and use only negative words.
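The difference between the negative word count and the net tone measure can be illustrated with a toy example. The word lists and the sample sentence are hypothetical and far smaller than the dictionaries used in the literature.

```python
import math

# Tiny illustrative word lists (hypothetical; not the dictionaries used in the literature).
NEGATIVE_WORDS = {"loss", "decline", "weak", "lawsuit"}
POSITIVE_WORDS = {"exceptional", "remarkable", "great", "strong"}


def tone_measures(text):
    """Count dictionary words and compute log((1 + NrPosWords) / (1 + NrNegWords))."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    n_neg = sum(1 for w in words if w in NEGATIVE_WORDS)
    n_pos = sum(1 for w in words if w in POSITIVE_WORDS)
    net_tone = math.log((1 + n_pos) / (1 + n_neg))
    return {"n_words": len(words), "n_neg": n_neg, "n_pos": n_pos, "net_tone": net_tone}


post = "A remarkable decline: exceptional loss, great lawsuit risk."
measures = tone_measures(post)
```

The sentence is clearly negative, yet its positive words are used in a negative context, so net tone comes out neutral while the negative word count still registers the negativity.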

8 Conclusion

This study documents considerable variation across online news outlets in their coverage of EAs.

Not only does the choice of which EAs to cover vary significantly by outlet, but so does the

choice of content, including the amount of negativity and the amount of financial sophistication.

Furthermore, negativity and complexity dispersion of covering outlets exhibit economically sizable


associations with trading volume and volatility around EAs, which is consistent with greater

disagreement about the interpretation of the information. These findings refute the notion that

financial news is a homogeneous product and that outlets uniformly aim at uncovering a ‘true’

story. Associations with other outlet characteristics, audience size, and average post length, do not

support alternative explanations, such as negativity capturing the degree of journalistic activity.

Similarly, owner effects fail to fully account for the variation in outlet characteristics.

The results provide important insights into the causes and consequences of differences in beliefs.

Financial news market segmentation matters because it reflects and possibly exacerbates hetero-

geneity in investors’ beliefs, with significant implications not only for information diffusion but also

for firms’ resulting strategies (Dyck, Volchkova, and Zingales, 2008; Baloria and Heese, 2018). The

degree of belief-driven segmentation is an important indicator of how heterogeneously investors

interpret financial news. More importantly, studying demand-driven news market segmentation

provides insights into what drives those differences in interpretation about value-relevant informa-

tion and allows important insights into how investors demand information. What the preceding

analysis cannot test is the degree to which the observed association is causally driven by news cov-

erage affecting beliefs versus merely reflecting beliefs. Inquiries along these lines seem promising

avenues for future research.


References

Ahern, K. R. and D. Sosyura (2014): “Who writes the news? Corporate press releases during merger negotiations,” The Journal of Finance, 69, 241–291.

Angelini, V. and D. Cavapozzi (2017): “Dispositional optimism and stock investments,” Journal of Economic Psychology, 59, 113–128.

Atmaz, A. and S. Basak (2017): “Belief dispersion in the stock market,” SSRN Working Paper.

Baloria, V. P. and J. Heese (2018): “The Effects of Media Slant on Firm Behavior,” Journal of Financial Economics (forthcoming).

Banerjee, S. and I. Kremer (2010): “Disagreement and learning: Dynamic patterns of trade,” The Journal of Finance, 65, 1269–1302.

Barber, B. M. and T. Odean (2013): “The behavior of individual investors,” Handbook of the Economics of Finance, 2, 1533–1570.

Barron, O. E., D. Byard, and Y. Yu (2017): “Earnings Announcement Disclosures and Changes in Analysts’ Information,” Contemporary Accounting Research, 34, 343–373.

Ben-David, I., J. R. Graham, and C. R. Harvey (2013): “Managerial miscalibration,” The Quarterly Journal of Economics, 128, 1547–1584.

Ben-Rephael, A., Z. Da, and R. D. Israelsen (2017): “It depends on where you search: Institutional investor attention and underreaction to news,” Review of Financial Studies (forthcoming).

Benabou, R. and J. Tirole (2016): “Mindful economics: The production, consumption, and value of beliefs,” The Journal of Economic Perspectives, 30, 141–164.

Blei, D. M. (2014): “Build, compute, critique, repeat: Data analysis with latent variable models,” Annual Review of Statistics and Its Application, 1, 203–232.

Brandenburger, A., E. Dekel, and J. Geanakoplos (1992): “Correlated equilibrium with generalized information structures,” Games and Economic Behavior, 4, 182–201.

Bushee, B. J., J. E. Core, W. Guay, and S. J. W. Hamm (2010): “The role of the business press as an information intermediary,” Journal of Accounting Research, 48, 1–19.

Busse, J. A. and T. C. Green (2002): “Market efficiency in real time,” Journal of Financial Economics, 65, 415–437.

Carpenter, B., A. Gelman, M. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. A. Brubaker, J. Guo, P. Li, and A. Riddell (2016): “Stan: A probabilistic programming language,” Journal of Statistical Software, 20, 1–37.

Carver, C. S. and M. F. Scheier (2014): “Dispositional optimism,” Trends in cognitive sciences, 18, 293–299.


Chae, J. (2005): “Trading volume, information asymmetry, and timing information,” The Journal of Finance, 60, 413–442.

Daniel, K. and D. Hirshleifer (2015): “Overconfident investors, predictable returns, and excessive trading,” The Journal of Economic Perspectives, 29, 61–87.

DellaVigna, S. and E. Kaplan (2007): “The Fox News effect: Media bias and voting,” The Quarterly Journal of Economics, 122, 1187–1234.

DellaVigna, S. and J. M. Pollet (2009): “Investor inattention and Friday earnings announcements,” The Journal of Finance, 64, 709–749.

Diether, K. B., C. J. Malloy, and A. Scherbina (2002): “Differences of opinion and the cross section of stock returns,” The Journal of Finance, 57, 2113–2141.

Drake, M. S., N. M. Guest, and B. J. Twedt (2014): “The media and mispricing: The role of the business press in the pricing of accounting information,” Accounting Review, 89, 1673–1701.

Drake, M. S., J. R. Thornock, and B. J. Twedt (2017): “The internet as an information intermediary,” Review of Accounting Studies, 22, 543–576.

Dyck, A., N. Volchkova, and L. Zingales (2008): “The corporate governance role of the media: Evidence from Russia,” The Journal of Finance, 63, 1093–1135.

Engelberg, J. (2018): “Discussion of “earnings announcement promotions: A Yahoo Finance field experiment”,” Journal of Accounting and Economics.

Engelberg, J., C. Sasseville, and J. Williams (2012): “Market madness? The case of mad money,” Management Science, 58, 351–364.

Engelberg, J. E. and C. A. Parsons (2011): “The causal impact of media in financial markets,” The Journal of Finance, 66, 67–97.

Ganguly, A. R. and J. Tasoff (2016): “Fantasy and dread: The demand for information and the consumption utility of the future,” Management Science, forthcoming.

Gelman, A. (2014): “How Bayesian analysis cracked the red-state, blue-state problem,” Statistical science, 26–35.

Gelman, A., A. Jakulin, M. G. Pittau, Y.-S. Su, et al. (2008): “A weakly informative default prior distribution for logistic and other regression models,” The Annals of Applied Statistics, 2, 1360–1383.

Gentzkow, M., B. T. Kelly, and M. Taddy (2017): “Text as data,” National Bureau of Economic Research Working Paper.

Gentzkow, M. and J. M. Shapiro (2006): “Media bias and reputation,” Journal of Political Economy, 114.

Gentzkow, M. and J. M. Shapiro (2008): “Competition and truth in the market for news,” The Journal of Economic Perspectives, 22, 133–154.


Gentzkow, M. and J. M. Shapiro (2010): “What drives media slant? Evidence from US daily newspapers,” Econometrica, 78, 35–71.

Gentzkow, M. and J. M. Shapiro (2011): “Ideological segregation online and offline,” The Quarterly Journal of Economics, 126, 1799–1839.

Gentzkow, M., J. M. Shapiro, and M. Taddy (2016): “Measuring polarization in high-dimensional data: Method and application to congressional speech,” Tech. rep., National Bureau of Economic Research.

George, L. and J. Waldfogel (2003): “Who affects whom in daily newspaper markets?” Journal of Political Economy, 111, 765–784.

Golman, R., D. Hagmann, and G. Loewenstein (2017): “Information avoidance,” Journal of Economic Literature, 55, 96–135.

Grossman, S. J. and J. E. Stiglitz (1980): “On the impossibility of informationally efficient markets,” The American Economic Review, 70, 393–408.

Gurun, U. G. and A. W. Butler (2012): “Don’t believe the hype: Local media slant, local advertising, and firm value,” The Journal of Finance, 67, 561–598.

Hales, J. (2007): “Directional Preferences, Information Processing, and Investors’ Forecasts of Earnings,” Journal of Accounting Research, 45, 607–628.

Hales, J., X. J. Kuang, and S. Venkataraman (2011): “Who believes the hype? An experimental examination of how language affects investor judgments,” Journal of Accounting Research, 49, 223–255.

Hamilton, J. (2004): All the news that’s fit to sell: How the market transforms information into news, Princeton University Press.

Han, B., L. Lu, and Y. Zhou (2017): “Two trees with heterogeneous beliefs: Spillover effect of disagreement,” SSRN Working Paper.

Hendershott, T., D. Livdan, and N. Schürhoff (2015): “Are institutions informed about news?” Journal of Financial Economics, 117.

Hillert, A., H. Jacobs, and S. Müller (2014): “Media makes momentum,” The Review of Financial Studies, 27, 3467–3501.

Hong, H. and D. A. Sraer (2016): “Speculative betas,” The Journal of Finance, 71, 2095–2144.

Hong, H. and J. C. Stein (2007): “Disagreement and the stock market,” Journal of Economic Perspectives, 21, 109–128.

Huang, S. and A. V. Thakor (2013): “Investor heterogeneity, investor-management disagreement and share repurchases,” The Review of Financial Studies, 26, 2453–2491.

Huberman, G. and T. Regev (2001): “Contagious speculation and a cure for cancer: A nonevent that made stock prices soar,” The Journal of Finance, 56, 387–396.


Kandel, E. and N. D. Pearson (1995): “Differential interpretation of public signals and trade in speculative markets,” Journal of Political Economy, 103, 831–872.

Kim, O. and R. E. Verrecchia (1994): “Market liquidity and volume around earnings announcements,” Journal of Accounting and Economics, 17, 41–67.

Kucukelbir, A., R. Ranganath, A. Gelman, and D. Blei (2015): “Automatic Variational Inference in Stan,” in Advances in Neural Information Processing Systems 28, ed. by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, Curran Associates, Inc., 568–576.

Kuhnen, C. M. and A. C. Miu (2017): “Socioeconomic status and learning from financial information,” Journal of Financial Economics, (forthcoming).

Lawrence, A., J. Ryans, E. Sun, and N. Laptev (2018): “Earnings announcement promotions: A Yahoo Finance field experiment,” Journal of Accounting and Economics.

Li, E. X., K. Ramesh, and M. Shen (2011): “The role of newswires in screening and disseminating value-relevant information in periodic SEC reports,” The Accounting Review, 86, 669–701.

Li, F. (2010): “The information content of forward-looking statements in corporate filings – A Naïve Bayesian machine learning approach,” Journal of Accounting Research, 48, 1049–1102.

Loughran, T. and B. McDonald (2011): “When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks,” The Journal of Finance, 66, 35–65.

Malmendier, U. and G. Tate (2015): “Behavioral CEOs: The role of managerial overconfidence,” The Journal of Economic Perspectives, 29, 37–60.

Manning, C. D., H. Schütze, et al. (1999): Foundations of statistical natural language processing, vol. 999, MIT Press.

Martin, G. J. and A. Yurukoglu (2017): “Bias in cable news: Persuasion and polarization,” American Economic Review, 107, 2565–99.

McElreath, R. (2016): Statistical rethinking: A Bayesian course with examples in R and Stan, vol. 122, Boca Raton, Chapman and Hall/CRC.

Miller, E. M. (1977): “Risk, uncertainty, and divergence of opinion,” The Journal of Finance, 32, 1151–1168.

Miller, G. S. (2006): “The press as a watchdog for accounting fraud,” Journal of Accounting Research, 44, 1001–1033.

Miller, G. S. and D. J. Skinner (2015): “The evolving disclosure landscape: How changes in technology, the media, and capital markets are affecting disclosure,” Journal of Accounting Research, 53, 221–239.

Mullainathan, S. and A. Shleifer (2005): “The market for news,” American Economic Review, 95, 1031–1053.


Nickerson, R. S. (1998): “Confirmation bias: A ubiquitous phenomenon in many guises,” Review of General Psychology, 2, 175.

Odean, T. (1999): “Do investors trade too much?” American Economic Review, 89, 1279–1298.

Oster, E., I. Shoulson, and E. Dorsey (2013): “Optimal expectations and limited medical testing: evidence from Huntington disease,” The American Economic Review, 103, 804–830.

Peress, J. (2014): “The media and the diffusion of information in financial markets: Evidence from newspaper strikes,” The Journal of Finance, 69, 2007–2043.

Pouget, S., J. Sauvagnat, and S. Villeneuve (2017): “A mind is a terrible thing to change: confirmatory bias in financial markets,” The Review of Financial Studies, 30, 2066–2109.

Puri, M. and D. T. Robinson (2007): “Optimism and economic choice,” Journal of Financial Economics, 86, 71–99.

Reeves, A., M. McKee, and D. Stuckler (2016): “‘It’s The Sun Wot Won It’: Evidence of media influence on political attitudes and voting from a UK quasi-natural experiment,” Social Science Research, 56, 44–57.

Savage, L. J. (1954): The foundations of statistics, John Wiley, New York.

Seamans, R. and F. Zhu (2017): “Repositioning and cost-cutting: The impact of competition on platform strategies,” Strategy Science, 2, 83–99.

Sharot, T., C. W. Korn, and R. J. Dolan (2011): “How unrealistic optimism is maintained in the face of reality,” Nature neuroscience, 14, 1475–1479.

Solomon, D. H. (2012): “Selective Publicity and Stock Prices,” The Journal of Finance, 67, 599–638.

Tetlock, P. C. (2007): “Giving content to investor sentiment: The role of media in the stock market,” The Journal of Finance, 62, 1139–1168.

Tetlock, P. C., M. Saar-Tsechansky, and S. Macskassy (2008): “More than words: Quantifying language to measure firms’ fundamentals,” The Journal of Finance, 63, 1437–1467.

Thakor, A. V. (2015): “Strategic information disclosure when there is fundamental disagreement,” Journal of Financial Intermediation, 24, 131–153.

Twedt, B. (2015): “Spreading the word: Price discovery and newswire dissemination of management earnings guidance,” The Accounting Review, 91, 317–346.

Varian, H. R. (1985): “Divergence of opinion in complete markets: A note,” The Journal of Finance, 40, 309–317.

Varian, H. R. (1989): “Differences of opinion in financial markets,” in Financial Risk: Theory, Evidence, and Implications: Proceedings of the 11th Annual Economic Policy Conference of the Federal Reserve Bank of St. Louis, ed. by C. Stone, Kluwer Academic Publishers Boston, Massachusetts, 3–40.


Veldkamp, L. L. (2006): “Media frenzies in markets for financial information,” American Economic Review, 96, 577–601.

Witten, I. H., E. Frank, M. A. Hall, and C. J. Pal (2016): Data mining: Practical machine learning tools and techniques, Morgan Kaufmann.

Xiong, W. and H. Yan (2009): “Heterogeneous expectations and bond markets,” The Review of Financial Studies, 23, 1433–1466.


Tables and figures

Figure 1: Determinants of differences in beliefs
Graphical depiction of rational Bayesian updating about the success probability ps after observing y = 6 positive out of m = 15 clues. Prior beliefs are assumed to follow a beta distribution (p(ps) ∼ beta(α, β)), with the pessimistic prior being parameterized as beta(2, 7) and the optimistic prior as beta(7, 2). The likelihood is assumed to follow a binomial distribution (y ∼ Bin(m, ps)). The posterior belief about ps is p(ps | y, m) ∼ beta(α + y, β + m − y).

[Figure 1: two rows of three plots, each over the x-axis P(Investments will pay off). Panel A: No room for interpretation (density of prior beliefs; likelihood to see 6 positive clues; density of posterior beliefs). Panel B: Room for interpretation (density of prior beliefs; likelihood to see 4 (11) positive clues; density of posterior beliefs). Legend: Person (Bearish, Bullish).]
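The conjugate update in the caption of Figure 1 can be traced numerically. The following is a minimal sketch, not the code used to draw the figure: the class `Beta` and the function `update` are illustrative names, and the Panel B counts (4 positive clues for the bearish observer, 11 for the bullish one) are an assumed reading of the "4 (11)" axis label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) belief about the success probability ps."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def update(prior: Beta, y: int, m: int) -> Beta:
    """Conjugate beta-binomial update after y positive clues out of m."""
    if not 0 <= y <= m:
        raise ValueError("y must lie between 0 and m")
    return Beta(prior.alpha + y, prior.beta + m - y)


bearish = Beta(2, 7)  # pessimistic prior from the caption
bullish = Beta(7, 2)  # optimistic prior from the caption

# Panel A: both observers agree on y = 6 positive clues out of m = 15.
post_bear_a = update(bearish, y=6, m=15)  # Beta(8, 16)
post_bull_a = update(bullish, y=6, m=15)  # Beta(13, 11)

# Panel B (assumed reading of the "4 (11)" label): the bearish observer
# counts 4 positive clues, the bullish observer counts 11.
post_bear_b = update(bearish, y=4, m=15)   # Beta(6, 18)
post_bull_b = update(bullish, y=11, m=15)  # Beta(18, 6)

gap_a = post_bull_a.mean - post_bear_a.mean
gap_b = post_bull_b.mean - post_bear_b.mean
print(f"Panel A posterior means: {post_bear_a.mean:.3f} vs {post_bull_a.mean:.3f}")
print(f"Panel B posterior means: {post_bear_b.mean:.3f} vs {post_bull_b.mean:.3f}")
```

Under these assumptions, a shared reading of the evidence narrows the gap between the two posterior means relative to the priors, whereas differing readings of the same 15 clues keep the posteriors far apart.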


Figure 2: Distribution of EAs over time
Bar chart of the number of EAs per day of firms that were part of the Russell 3000 Index in July 2014, based on 18,917 EAs between April 2015 and November 2016.

[Figure 2: bar chart. x-axis: Sample period (monthly ticks from 2015-04-01 to 2016-12-01). y-axis: Nr of earnings announcements per day (0 to 300).]


Figure 3: Effect of regularization in panel structures
Illustration of the effect of regularization. Depicts a simulated sample from a population of the form yi,t = ai + ui,t, with ui,t ∼ N(0, 15) and ai ∼ N(2, σ = 3). 30 individuals i are drawn; each individual has a random number of observations between 1 and 50. The parameters of interest are the individual effects ai. The true ai are marked in red.

[Figure 3: two scatter panels, "Fixed effects specification" and "Bayesian prior regularization". x-axis: Group size (0 to 40). y-axis: ai vs. true ai (−20 to 20).]
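The simulation behind Figure 3 can be approximated in a few lines. This is a sketch under stated assumptions rather than the paper's code: N(0, 15) is read as a standard deviation of 15, the hyperparameters are treated as known (the paper estimates its model in Stan), and the names `a_shrunk` and `w` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

MU_A, SD_A = 2.0, 3.0  # population of individual effects, from the caption
SD_U = 15.0            # noise scale; reading N(0, 15) as sd = 15 is an assumption
N_IND = 30

a_true = rng.normal(MU_A, SD_A, size=N_IND)
n_obs = rng.integers(1, 51, size=N_IND)  # 1 to 50 observations per individual

# Per-individual sample means: the fixed effects estimate of a_i.
y_bar = np.array([
    rng.normal(a, SD_U, size=n).mean() for a, n in zip(a_true, n_obs)
])

# Normal-normal posterior mean with known hyperparameters: a precision-weighted
# average of the sample mean and the population mean.
w = (n_obs / SD_U**2) / (n_obs / SD_U**2 + 1.0 / SD_A**2)
a_shrunk = w * y_bar + (1.0 - w) * MU_A

mse_fe = np.mean((y_bar - a_true) ** 2)
mse_shrunk = np.mean((a_shrunk - a_true) ** 2)
print(f"MSE fixed effects: {mse_fe:.2f}, MSE with prior: {mse_shrunk:.2f}")
```

The weight `w` grows with group size, so individuals with few observations are pulled most strongly toward the population mean, which is the pattern the two panels contrast.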


Figure 4: Distribution of news outlets by target audience characteristics
A scatter plot of the posterior mean of θi,neg, the outlet-specific proportion of negativity across news posts, and θi,cplx, the outlet-specific complexity across news posts. Both θi,neg and θi,cplx were exponentiated. Since both measures are estimated in a logistic regression system, the exponentiated versions can be interpreted as odds ratios, such as the percentage change in odds of a word being negative (complex) if the news post is from outlet i. The baseline odds based on the posterior means of the intercepts from Table 5 are exp(−3.91) = 0.02 for a word being negative and exp(−0.85) = 0.42 for a word being a financial and complex term. The news outlets highlighted as the sample’s top 5% by audience size are bbc.com, businessinsider.com, buzzfeed.com, cnn.com, dailymail.co.uk, forbes.com, foxnews.com, huffingtonpost.com, nytimes.com, theguardian.com, and washingtonpost.com.

[Figure 4: scatter plot. x-axis: Financial sophistication: exp(θcplx). y-axis: Negativity: exp(θneg). Labeled outlets: cnbc.com, cnn.com, moderngraham.com, nytimes.com, raygent.com, shareholdersfoundation.com, tickerreport.com, tradersmagazine.com, wsj.com, zerohedge.com. Legend: Top 5% by UniqueVisitor (No, Yes).]
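The odds-ratio reading of the exponentiated outlet effects can be made concrete with a short calculation. The sketch below assumes that θi,neg enters the log odds additively next to the intercept; the intercept is the posterior mean reported in Table 5, while the outlet with exp(θi,neg) = 1.5 is hypothetical and not taken from the paper.

```python
import math


def odds_to_prob(odds: float) -> float:
    """Convert odds p / (1 - p) back into the probability p."""
    return odds / (1.0 + odds)


MU_NEG = -3.91  # posterior mean of the negativity intercept (Table 5)

baseline_odds = math.exp(MU_NEG)  # roughly 0.02

# Hypothetical outlet, not from the paper: exp(theta_neg) = 1.5, that is,
# 50% higher odds of a word being negative than at the baseline.
odds_ratio = 1.5
outlet_odds = baseline_odds * odds_ratio

print(f"baseline: odds {baseline_odds:.3f}, prob {odds_to_prob(baseline_odds):.3f}")
print(f"outlet:   odds {outlet_odds:.3f}, prob {odds_to_prob(outlet_odds):.3f}")
```

Because the baseline odds are small, odds and probabilities nearly coincide here, so an odds ratio of 1.5 corresponds to roughly a 50% higher share of negative words.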


Figure 5: Volume around EAs by news outlet dispersion
Plots of average size-adjusted abnormal trading volume around the sample EAs for two groups of EAs. For each announcement, the interquartile range of negativity (DispCovNeg) and financial sophistication parameters (DispCovSoph) of all outlets that cover the EA is computed; then, the sample of EAs is split at the median into two groups: high/low negativity dispersion (panel A) and high/low complexity dispersion (panel B). Abnormal turnover is the daily trading volume scaled by shares outstanding minus the average daily trading volume scaled by shares outstanding in the event period [-31, -1] (Chae, 2005). The abnormal volume is size adjusted by regressing on size and keeping the residuals.

[Figure 5: two line plots. Panel A: EAs grouped by dispersion in outlet negativity (legend: Negativity dispersion High, Low; annotated values 3.9 and 4.3). Panel B: EAs grouped by dispersion in outlet sophistication (legend: Sophistication dispersion High, Low; annotated values 3.7 and 4.4). x-axis: Business days relative to EA (−1 to 14). y-axis: Average abnormal daily turnover.]
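The two building blocks of Figure 5, the interquartile range of outlet parameters per announcement and the abnormal turnover per day, can be sketched as follows. The function names and the toy inputs are illustrative assumptions; the size adjustment described in the caption is omitted, and the percentile interpolation rule is NumPy's default, which the paper does not specify.

```python
from typing import Mapping, Sequence

import numpy as np


def iqr(values: Sequence[float]) -> float:
    """Interquartile range: 75th minus 25th percentile."""
    q25, q75 = np.percentile(np.asarray(values, dtype=float), [25, 75])
    return float(q75 - q25)


def abnormal_turnover(turnover: Mapping[int, float],
                      days: Sequence[int],
                      base: Sequence[int] = tuple(range(-31, 0))) -> dict:
    """Turnover on each requested day minus mean turnover over the base window.

    `turnover` maps a business day relative to the EA to volume divided by
    shares outstanding; the default base window is [-31, -1].
    """
    base_mean = float(np.mean([turnover[d] for d in base]))
    return {d: turnover[d] - base_mean for d in days}


# Toy inputs, not from the paper: outlet parameters for one announcement ...
theta_neg_of_covering_outlets = [0.1, 0.2, 0.3, 0.4, 0.5]
disp_cov_neg = iqr(theta_neg_of_covering_outlets)

# ... and a flat 1% daily turnover before the EA, followed by a spike.
toy_turnover = {d: 0.01 for d in range(-31, 0)}
toy_turnover.update({0: 0.03, 1: 0.02, 2: 0.01})
abn = abnormal_turnover(toy_turnover, days=[0, 1, 2])

print(f"DispCovNeg = {disp_cov_neg:.2f}")
print({d: round(v, 4) for d, v in abn.items()})
```

Averaging the values in `abn` over days 0 to +2 would give a quantity in the spirit of AbnVol as defined in Table 2.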


Figure 6: Relation between outlet parameters and other outlet characteristics
Scatter plots of the posterior mean of θi,neg, the outlet-specific proportion of negativity across news posts, and θi,cplx, the outlet-specific complexity across news posts, contrasted with the logarithm of average daily unique visitors per site and average EA news posts length per site. Both θi,neg and θi,cplx were exponentiated. The points to the far left in panels A and B reflect outlets for which Amazon’s Alexa service did not have unique visitors. Unique visitors are measured as the average number of estimated daily unique visitors (in millions) to the outlet’s website during the month of October 2016, as measured by Amazon’s Alexa web information service.

[Figure 6: four scatter panels. A: Negativity by audience size (x-axis: Unique visitors (log scale); y-axis: Negativity: exp(θneg)). B: Financial sophistication by audience size (x-axis: Unique visitors (log scale); y-axis: Financial sophistication: exp(θcplx)). C: Negativity by post length (x-axis: Mean # total words; y-axis: Negativity: exp(θneg)). D: Financial sophistication by post length (x-axis: Mean # total words; y-axis: Financial sophistication: exp(θcplx)).]


Figure 7: Distributions of out-of-sample predictions versus actual values
Histograms comparing sample distributions with the distribution of the predicted values of the two models. The predictions are based on five-fold cross-validation out-of-sample predictions. The first prediction model utilizes the full probabilistic model, including θi news outlet effects; the second equals the first, except that all θi terms as well as their interactions in the coverage equation are excluded.

[Figure 7: three histogram panels. A: Number negative words (x-axis: Prediction error for number negative words; y-axis: # news posts). B: Number complex financial words (x-axis: Prediction error for number complex financial words; y-axis: # news posts). C: Coverage Prediction, split into "EA not covered" and "EA covered" (x-axis: Predicted prob. of outlet covering EA; y-axis: # (EA, Outlet) pairs). Legend: Model employed (No outlet effects, With outlet effects).]


Figure 8: Distributions of market news predictions versus actual values
Histograms comparing the predicted probabilities of an EA j being covered by a market summary from news outlet i for the two models. The left panel shows the predicted probability distribution for announcements not actually covered. A perfect model would have all its mass at zero. The right panel shows the predicted probability distribution for announcements actually covered. Here, a perfect model would have all its mass at one.

[Figure 8: two histogram panels, "EA not covered" and "EA covered". x-axis: Predicted prob. of outlet covering EA in a market summary. y-axis: # (EA, Outlet) pairs. Legend: Model employed (No outlet effects, With outlet effects).]


Table 1: Sample composition

Sample composition step  News posts  EAs  Outlets

Full sample (used for descriptive statistics):
Posts identified as covering an EA  634,015  15,729  5,310
Removing search engines, blog hubs, and similar  595,433  15,644  5,292
If multiple posts with the same number of words on same day for the same EA, keeping the earliest  522,026  15,644  5,292
Keeping posts with more than 100 words  477,471  15,518  4,622
Keeping posts with fewer than 1500 words (mostly removing conference call transcripts, etc.)  476,956  15,518  4,607
Consolidating urls linking to the same website  466,557  15,518  4,341
Removing all news outlets that covered fewer than 5 EAs during the sample period  461,216  15,502  2,128

Sample with finance data and investing outlets only:
Based on the full sample:
Only EAs for which all covariates (Table 2) are available  403,011  12,883  2,127
Only finance/investment-related news outlets  161,408  11,429  212

a Appendix B details the sample construction procedures that lead to the initial 634,015 observations identified by the classifier. The sample covers news posts published between April 2, 2015, and November 30, 2016.


Table 2: Variable definitions

Variable  Description

Ij,i  An indicator variable equal to 1 if news outlet i wrote at least one news post about earnings announcement j.

RetVol  The standard deviation of a firm’s monthly stock returns over the last 6 months ending before the month of earnings announcement j.

BtoM  A firm’s book-to-market ratio based on Compustat’s total asset and closing price fields times shares (atq, prccq) upon the most recent fiscal quarter-end.

logMV  The natural logarithm of a firm’s market value, based on Compustat’s mkvaltq field upon the most recent fiscal quarter-end.

InstHold  The sum of all shares divided by the number of shares outstanding from the Thomson 13F database. If the firm is not covered in the database, InstHold is set to zero.

PriRet  The buy-and-hold return over the most recent 6-month period ending before the month that includes earnings announcement j.

ESurp  Trailing 12-month earnings before extraordinary items minus trailing 12-month earnings before extraordinary items from the previous year, divided by total assets at the beginning of the fiscal quarter 4 quarters ago (analyst consensus is not used because not all firms have an analyst following).

PRNegTone  The tone of the press release accompanying announcement j, measured as the sum of the number of words from the negative word list of Loughran and McDonald (2011) in the release, divided by the total number of words.

NrAnalyst  The natural logarithm of the number of analysts in the IBES details file who had an active recommendation at most 90 days before the earnings announcement. If the firm does not have an analyst following according to IBES or has no active recommendation, NrAnalyst is set to zero.

NrEAs  The natural logarithm of the number of Russell 3000 earnings announcements on the same day.

yneg  Sum of the number of words from the negative word list of Loughran and McDonald (2011) in a news post.

ycplx  Sum of the number of words from the complex word list of Loughran and McDonald (2011) in a news post.

Nwords  Total number of words in a news post.

AbnVol  Abnormal trading volume is the average abnormal turnover in the event period [0, +2], where abnormal turnover is the daily trading volume scaled by shares outstanding minus the average daily trading volume scaled by shares outstanding in the event period [-31, -1].

EARetVol  Earnings announcement return volatility is the standard deviation of the daily log market-adjusted return in the event period [-1, +5].

DispCovNeg  The interquartile range of negativity (θi,neg) of all outlets i that cover a given EA.

DispCovSoph  The interquartile range of financial sophistication (θi,cplx) of all outlets i that cover a given EA.

UniqueVisitors  The average number of estimated daily unique visitors (in millions) to the outlet’s website during October 2016, as measured by Amazon’s Alexa web information service. October 2016 is used as the reference month because it is the third quarter earnings season, which usually provides early guidance for the next year.

SumReach  The sum of UniqueVisitors of all outlets that cover a given EA.


Table 3: News outlet characteristics

Variable               N      Mean       StD       P5%      P25%      P50%      P75%      P95%

Panel A: Full sample
Perc.NegWords       2128     0.021     0.008     0.009     0.016     0.021     0.025     0.035
Perc.ComplxWords    2128     0.217     0.030     0.177     0.197     0.215     0.232     0.274
TotalWords          2128   271.050    99.256   154.962   208.255   252.796   306.662   446.683
UniqueVisitors      1882   176.869   955.526     2.000     3.685     8.748    36.161   723.984
EAs/Outlet          2128    89.551   564.551     5.000     8.000    15.000    37.000   235.300
Articles/EA         2128     1.231     0.366     1.000     1.000     1.125     1.294     1.819

Panel B: Outlets with investment focus
Perc.NegWords        214     0.022     0.008     0.010     0.017     0.021     0.024     0.036
Perc.ComplxWords     214     0.225     0.031     0.184     0.208     0.220     0.239     0.276
TotalWords           214   290.001   122.468   161.424   216.007   256.018   336.948   488.540
UniqueVisitors       190   834.181  2152.802     2.000     3.436    12.871   444.910  4704.694
EAs/Outlet           214   363.150  1261.287     5.000    15.000    39.500   145.500  1703.150
Articles/EA          214     1.357     0.498     1.000     1.047     1.197     1.500     1.987

Panel C: Outlets without investment focus
Perc.NegWords       1914     0.021     0.008     0.008     0.016     0.021     0.025     0.035
Perc.ComplxWords    1914     0.216     0.030     0.176     0.197     0.214     0.232     0.273
TotalWords          1914   268.932    96.121   154.782   207.641   252.342   305.062   435.320
UniqueVisitors      1692   103.058   666.051     2.000     3.729     8.484    32.248   386.790
EAs/Outlet          1914    58.960   409.794     5.000     8.000    14.000    33.000   182.350
Articles/EA         1914     1.217     0.345     1.000     1.000     1.125     1.279     1.750

a The term EAs/Outlet is the number of earnings announcements covered by the news outlet during the sample period; Articles/EA is the average number of articles an outlet writes about an EA; and UniqueVisitors is the average number of estimated daily unique visitors (in millions) to the outlet’s website during October 2016, as measured by Amazon’s Alexa web information service. The variables Perc.NegWords, Perc.PosWords, and Perc.ComplxWords are the sums of words from the respective word lists of Loughran and McDonald (2011) for negative, positive, and complex words in financial texts, scaled by the number of words in a news post and averaged over all news posts of a news outlet. The variable TotalWords is the average number of words in the news posts of a given outlet. Note that two of the 214 news outlets with investment focus are not included in the final investment sample used for estimating the latent audience characteristics, because they cover only EAs with insufficient financial data for the estimation.


Table 4: Descriptive statistics of the investment-focused outlet sample

Variable          N      Mean       StD       P5%      P25%      P50%      P75%      P95%

Panel A: Earnings announcement characteristics
BtoM          13040     2.321     4.766     0.217     0.562     1.043     2.163     8.296
logMV         13040     7.578     1.645     5.189     6.374     7.447     8.605    10.512
ESurp         13040    -0.007     0.114    -0.146    -0.018     0.001     0.015     0.089
NrAnalyst     13040     1.757     1.051     0.000     1.099     1.792     2.485     3.466
PriRet        13040     1.019     0.285     0.609     0.876     1.015     1.148     1.418
RetVol        13040     0.090     0.062     0.031     0.051     0.074     0.109     0.199
InstHold      13040     0.652     0.219     0.230     0.546     0.690     0.791     0.920
NrEAs         13040     4.642     1.063     2.398     4.111     4.942     5.407     5.690
PRNegTone     13040     0.016     0.010     0.003     0.009     0.014     0.021     0.034

Panel B: News post characteristics
yneg         161408     4.018     4.019     0.000     2.000     3.000     5.000    11.000
ycplx        161408    82.145    42.895    25.000    54.000    80.000   103.000   142.000
Nwords       161408   357.859   157.131   126.000   242.000   355.000   457.000   590.000

a The variable BtoM is a firm’s book-to-market ratio, logMV is the logarithm of a firm’s market value, ESurp denotes the earnings surprise measured as the year-on-year change in trailing 12-month earnings, NrAnalyst is the logarithm of the number of analysts covering the stock before the earnings announcement, PriRet denotes a firm’s buy-and-hold stock return during the most recent six months, RetVol is the prior six-month return volatility, InstHold is the amount of institutional holdings, NrEAs is the logarithm of the number of earnings announcements on the same day, and PRNegTone is the tone in the press release. The construction of all variables in Panels A and B is described in Table 2. The variable yneg is a news post’s number of words with a negative connotation, ypos the number of positively connoted words, and ycplx is the number of financial words with more than three syllables. All three variables are constructed using the financial word lists compiled by Loughran and McDonald (2011). The term Nwords is the number of words making up a news post.


Table 5: Posterior distribution summary of the main model

Parameter                        Mean     StD   P2.5%  P97.5%

Main parameters
µneg                            -3.91    0.00   -3.92   -3.91
µcplx                           -0.85    0.00   -0.85   -0.84
σa,neg                           0.40    0.00    0.39    0.40
σa,cplx                          0.16    0.00    0.16    0.16
σθ,cplx                          1.29    0.02    1.26    1.33
σθ,neg                           0.53    0.00    0.53    0.53

Coverage regression coefficients
intercept                       -3.75    0.01   -3.76   -3.73
θi,neg                          -3.54    0.01   -3.56   -3.51
θi,cplx                          1.75    0.03    1.69    1.81
θi,neg × θi,cplx                -4.23    0.05   -4.34   -4.14
RetVol                           0.08    0.01    0.07    0.09
RetVol × θi,neg                 -0.04    0.01   -0.05   -0.03
RetVol × θi,cplx                -0.85    0.02   -0.89   -0.81
RetVol × θi,neg × θi,cplx       -0.65    0.03   -0.70   -0.60
logMV                            0.47    0.00    0.46    0.47
logMV × θi,neg                  -0.21    0.00   -0.22   -0.20
logMV × θi,cplx                 -2.76    0.03   -2.82   -2.71
logMV × θi,neg × θi,cplx        -2.27    0.02   -2.32   -2.23
BtoM                            -0.07    0.01   -0.08   -0.06
InstHold                         0.03    0.00    0.03    0.04
PriRet                          -0.11    0.01   -0.12   -0.10
ESurp                            0.04    0.01    0.02    0.05
PRNegTone                        0.03    0.01    0.02    0.05
NrAnalyst                        0.01    0.01   -0.01    0.02
NrEAs                           -0.26    0.01   -0.27   -0.25

a To speed up computations (by reducing correlations between covariates), all variables were demeaned and scaled by their standard deviation. The terms Mean, StD, P2.5%, and P97.5% are the mean, standard deviation, and the 2.5 and 97.5 percentiles of the posterior distributions based on a random sample of 1000 draws from the joint posterior. The variable RetVol is the prior six-month return volatility, logMV is the logarithm of a firm’s market value, BtoM is a firm’s book-to-market ratio, InstHold is the amount of institutional holdings, PriRet denotes a firm’s buy-and-hold stock return during the most recent six months, ESurp denotes the earnings surprise measured as the year-on-year change in trailing 12-month earnings, PRNegTone is the tone in the press release, NrAnalyst is the logarithm of the number of analysts covering the stock before the earnings announcement, and NrEAs is the logarithm of the number of earnings announcements on the same day. Variable computations can be found in Table 2.
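The two operations described in the note to Table 5, standardizing covariates and summarizing posterior draws, can be illustrated as follows. This is a sketch only: `standardize` and `summarize` are hypothetical helper names, the use of the sample standard deviation (ddof = 1) is an assumption, and the draws are simulated stand-ins rather than output of the paper's Stan model.

```python
import numpy as np


def standardize(x):
    """Demean a covariate and scale it by its sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


def summarize(draws):
    """Mean, standard deviation, and 2.5/97.5 percentiles of posterior draws."""
    draws = np.asarray(draws, dtype=float)
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return {
        "Mean": float(draws.mean()),
        "StD": float(draws.std(ddof=1)),
        "P2.5%": float(lo),
        "P97.5%": float(hi),
    }


rng = np.random.default_rng(1)

# Stand-in for 1000 draws of one coefficient; location and scale are made up.
fake_draws = rng.normal(loc=-0.26, scale=0.01, size=1000)
print({k: round(v, 2) for k, v in summarize(fake_draws).items()})
```

Applied to each parameter's draws in turn, `summarize` yields one row in the layout of Table 5.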


Table 6: Regression of trading activity measures on coverage dispersion

Variable       Model 1                             Model 2                             Model 3
               Coefficient (SE)  95% CI            Coefficient (SE)  95% CI            Coefficient (SE)  95% CI

Panel A: Abnormal trading volume
DispCovNeg     0.071 (0.020)     0.028 : 0.106     0.076 (0.019)     0.035 : 0.112     0.076 (0.019)     0.036 : 0.110
DispCovSoph    0.121 (0.024)     0.073 : 0.166     0.106 (0.024)     0.056 : 0.150     0.108 (0.024)     0.058 : 0.150
SumReach       0.257 (0.042)     0.178 : 0.347     0.253 (0.042)     0.172 : 0.336     0.253 (0.042)     0.173 : 0.336
logMV                                              0.037 (0.113)     -0.165 : 0.284    0.045 (0.117)     -0.158 : 0.303
RetVol                                             -2.745 (0.577)    -3.919 : -1.690   -2.727 (0.584)    -3.935 : -1.643
ESurp                                                                                  0.119 (0.173)     -0.241 : 0.424
PRNegTone                                                                              8.108 (2.949)     2.883 : 14.663
R2             0.4347                              0.4378                              0.4383
N              11,426                              11,426                              11,426
Firm effects   Yes                                 Yes                                 Yes

Panel B: Return volatility
DispCovNeg     0.056 (0.010)     0.036 : 0.074     0.053 (0.009)     0.033 : 0.070     0.052 (0.009)     0.033 : 0.070
DispCovSoph    0.007 (0.010)     -0.012 : 0.026    0.012 (0.010)     -0.007 : 0.030    0.012 (0.010)     -0.007 : 0.031
SumReach       0.128 (0.021)     0.090 : 0.171     0.128 (0.020)     0.090 : 0.170     0.128 (0.021)     0.090 : 0.171
logMV                                              -0.275 (0.062)    -0.396 : -0.152   -0.271 (0.060)    -0.379 : -0.145
RetVol                                             -0.580 (0.389)    -1.394 : 0.157    -0.575 (0.380)    -1.319 : 0.187
ESurp                                                                                  -0.009 (0.152)    -0.312 : 0.282
PRNegTone                                                                              2.473 (1.469)     -0.203 : 5.514
R2             0.3711                              0.3759                              0.3761
N              11,425                              11,425                              11,425
Firm effects   Yes                                 Yes                                 Yes

a Standard errors are bootstrapped and confidence intervals are bias adjusted. The variable DispCovNeg is the interquartile range of θi,neg among all covering outlets; DispCovSoph is the interquartile range of θi,cplx among all covering outlets; and SumReach is the sum of UniqueVisitors of all outlets that cover a given EA, with all three measures standardized for ease of interpretation. The variable RetVol is the prior six-month return volatility, logMV is the logarithm of a firm’s market value, ESurp denotes the earnings surprise measured as the year-on-year change in trailing 12-month earnings, and PRNegTone is the tone in the press release. The abnormal trading volume is the average abnormal turnover in the event period [0, +2]. Abnormal turnover is the daily trading volume scaled by shares outstanding minus the average daily trading volume scaled by shares outstanding in the event period [-31, -1]. The EA return volatility (EARetVol) is the standard deviation of the daily logarithm of the market-adjusted return in the event period [-1, +5]. The term EARetVol is scaled to have a standard deviation of one for easier interpretation of the coefficients. Variable computations can be found in Table 2.
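The coverage dispersion and abnormal trading volume measures defined in the notes to Table 6 can be illustrated with a short sketch. It is not the author's implementation: the function names and input numbers are hypothetical, the quartile interpolation rule is an assumption (the notes do not specify one), and the final standardization of the measures is omitted.

```python
from statistics import mean, quantiles


def coverage_dispersion(thetas):
    """Interquartile range of an outlet characteristic among covering outlets."""
    # Assumption: "inclusive" linear interpolation between order statistics.
    q1, _, q3 = quantiles(thetas, n=4, method="inclusive")
    return q3 - q1


def sum_reach(unique_visitors):
    """Total unique visitors of all outlets covering one earnings announcement."""
    return sum(unique_visitors)


def abnormal_volume(event_turnover, pre_event_turnover):
    """Average abnormal turnover: event-day turnover minus the pre-event mean.

    Both inputs are daily trading volumes already scaled by shares outstanding,
    for days [0, +2] and [-31, -1] respectively.
    """
    baseline = mean(pre_event_turnover)
    return mean([t - baseline for t in event_turnover])


# Hypothetical outlet-level estimates for one earnings announcement.
disp = coverage_dispersion([0.0, 1.0, 2.0, 3.0, 4.0])
reach = sum_reach([120_000, 45_000, 8_000])
abvol = abnormal_volume([0.03, 0.02, 0.04], [0.01] * 31)
```

A wider interquartile range means the outlets covering an announcement serve more dissimilar audiences, which is the quantity the regressions relate to trading volume and return volatility.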


Table 7: Posterior Distribution Summary of the Owner Effects Model

Parameter                     Mean     StD   P2.5%  P97.5%

Main parameters
µneg                         -4.00    0.00   -4.01   -3.99
µcplx                        -0.46    0.00   -0.47   -0.46
σa,neg                        0.57    0.01    0.55    0.59
σa,cplx                       0.15    0.00    0.15    0.16
σθ,cplx                       0.84    0.01    0.82    0.85
σθ,neg                        0.67    0.02    0.64    0.70
σowner,cplx                   0.77    0.05    0.67    0.87
σowner,neg                    1.11    0.07    0.99    1.25

Coverage regression coefficients
intercept                    -5.62    0.02   -5.65   -5.57
θi,neg                        7.31    0.04    7.23    7.39
θi,cplx                       0.23    0.02    0.19    0.26
θi,neg × θi,cplx             -0.83    0.07   -0.97   -0.69
RetVol                        0.54    0.01    0.52    0.56
RetVol × θi,neg               0.81    0.05    0.71    0.91
RetVol × θi,cplx              0.09    0.01    0.07    0.12
RetVol × θi,neg × θi,cplx     0.91    0.04    0.83    0.99
logMV                         1.46    0.00    1.45    1.47
logMV × θi,neg               -1.31    0.03   -1.38   -1.25
logMV × θi,cplx              -0.08    0.01   -0.10   -0.06
logMV × θi,neg × θi,cplx     -0.75    0.05   -0.86   -0.65
BtoM                         -0.16    0.02   -0.19   -0.13
InstHold                     -0.01    0.02   -0.03    0.02
PriRet                       -0.24    0.01   -0.26   -0.22
ESurp                         0.06    0.01    0.04    0.09
PRNegTone                     0.04    0.01    0.02    0.05
NrAnalyst                     0.06    0.01    0.03    0.08
NrEAs                        -0.16    0.01   -0.17   -0.15

a The results are based on a subset of the full sample with ownership data (28,853 posts, 2,507 EAs, 425 outlets). To speed up computations, all variables were demeaned and scaled by their standard deviation. The terms Mean, StD, P2.5%, and P97.5% are the mean, standard deviation, and the 2.5 and 97.5 percentiles of the posterior distributions based on a random sample of 1000 draws from the joint posterior. The variable RetVol is the prior six-month return volatility, logMV is the log of a firm’s market value, BtoM is a firm’s book-to-market ratio, InstHold is the amount of institutional holdings, PriRet denotes a firm’s buy-and-hold stock return during the most recent six months, ESurp denotes the earnings surprise measured as the year-on-year change in trailing 12-month earnings, PRNegTone is the tone in the press release, NrAnalyst is the logarithm of the number of analysts covering the stock before the earnings announcement, and NrEAs is the logarithm of the number of earnings announcements on the same day. Variable computations can be found in Table 2.
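The summary columns of the table (Mean, StD, P2.5%, P97.5%) are computed from 1000 posterior draws per parameter. A minimal sketch of such a summary is given below. The draws here are simulated from a normal distribution purely for illustration and are not the paper's posterior; the helper names and the linear-interpolation percentile rule are assumptions.

```python
import random
from statistics import mean, stdev


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted values."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarize(draws):
    """Mean, standard deviation, and 95% interval of posterior draws."""
    s = sorted(draws)
    return {
        "Mean": mean(s),
        "StD": stdev(s),
        "P2.5%": percentile(s, 2.5),
        "P97.5%": percentile(s, 97.5),
    }


# Stand-in for 1000 posterior draws of one coefficient (simulated, not real output).
rng = random.Random(0)
draws = [rng.gauss(0.54, 0.01) for _ in range(1000)]
summary = summarize(draws)
```

The interval between P2.5% and P97.5% is the central 95% posterior interval, so a coefficient whose interval excludes zero has at least 97.5% of its posterior mass on one side of zero.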


Table 8: Posterior predictive distributions of test statistics

                            T(y^rep_base): Model w/o θi     T(y^rep_outlets): Model w θi
Test variable         T(y)   P2.5%    Mean  P97.5%   p-Val   P2.5%    Mean  P97.5%   p-Val

Coverage characteristics
% covered             0.02    0.05    0.06    0.07    0.00    0.04    0.05    0.06    0.00
Nr. never covered  1611.00    0.00    0.14    2.00    1.00    0.00    0.05    1.00    1.00

Content characteristics
Max(Yneg)            78.00   44.00   52.36   66.00    1.00   58.00   66.33   78.00    0.98
Max(Ycplx)          598.00  468.00  490.96  520.00    1.00  510.00  535.72  569.02    1.00
Min(Yneg)             0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00    1.00
Min(Ycplx)            0.00    5.00    7.39    9.00    0.00    4.00    6.62    8.00    0.00
Mean(Yneg)            4.02    3.88    3.91    3.94    1.00    4.11    4.13    4.15    0.00
Mean(Ycplx)          82.14   78.81   79.19   79.61    1.00   79.65   79.75   79.86    1.00
StD(Yneg)             4.02    3.22    3.24    3.27    1.00    3.83    3.86    3.88    1.00
StD(Ycplx)           42.89   37.97   38.15   38.35    1.00   40.77   40.87   40.96    1.00
StD(µi(Yneg))         2.73    1.69    1.78    1.86    1.00    3.43    3.58    3.74    0.00
StD(µi(Ycplx))       38.71   30.54   30.83   31.13    1.00   36.97   37.29   37.64    1.00

a This table compares the test statistics (T(y)) of the sample used to fit the models with the distribution of test variables generated from 1000 replicated samples, that is, T(y^rep_base) and T(y^rep_outlets), one for each draw from the posterior distribution of the model. The term % covered is the percentage of (EA, news outlet) combinations, where the news outlet covers the EA; Nr. never covered is the number of EAs not covered by any news outlet; Max(Yneg) and Max(Ycplx) are the highest values of negative words and complex words, respectively, in the sample; Min, Mean, and StD are the minimum, average, and standard deviation, respectively; StD(µi(Yneg)) and StD(µi(Ycplx)) denote the standard deviations across outlets of an outlet’s average number of negative and complex words, respectively; Mean, P2.5%, and P97.5% are the mean and 2.5 and 97.5 percentiles of the sampled distributions, respectively; and p-val is the percentile of the sampled distributions that corresponds to the actual test statistic T(y).
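The p-Val column of Table 8 locates the observed test statistic T(y) within the distribution of the same statistic across replicated samples. A minimal sketch of that calculation follows. The convention used here, the share of replicated statistics at or below T(y), is an assumption that appears consistent with the reported values; the toy simulator stands in for drawing a replicated sample from a fitted model, and all names and numbers are hypothetical.

```python
import random


def ppc_p_value(t_observed, t_replicated):
    """Share of replicated test statistics at or below the observed one."""
    return sum(t <= t_observed for t in t_replicated) / len(t_replicated)


def replicated_statistics(simulate, statistic, n_draws, rng):
    """Apply a test statistic to n_draws replicated samples."""
    return [statistic(simulate(rng)) for _ in range(n_draws)]


# Toy data: observed word counts with one extreme value (made-up numbers).
observed = [0, 3, 4, 78, 5, 2]


def simulate(rng):
    # Stand-in for one replicated sample; a real check would first draw
    # parameters from the posterior and then simulate data given them.
    return [rng.randint(0, 10) for _ in range(len(observed))]


rng = random.Random(1)
t_rep = replicated_statistics(simulate, max, 1000, rng)
p_val = ppc_p_value(max(observed), t_rep)
```

Values near 0 or 1 indicate that the replicated samples rarely reproduce the observed statistic, as in this toy case where the simulator cannot generate a maximum as large as the observed one.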


Table 9: Comparison of prediction errors by model

Model                       N    Mean     StD     P5%    P25%    P50%    P75%    P95%

Panel A: Coverage decision
No outlet effects     2764480    0.00    0.15   -0.06   -0.03   -0.02   -0.01    0.00
With outlet effects   2764480    0.00    0.13   -0.06   -0.01    0.00    0.00    0.00

Panel B: Number of negative words
No outlet effects      161408   -0.76   15.45  -20.34   -7.71   -1.12    5.31   19.37
With outlet effects    161408   -1.73   11.97  -19.47   -8.04   -1.80    4.44   16.18

Panel C: Number of complex financial words
No outlet effects      161408   -0.31    3.76   -5.47   -2.16   -0.66    1.11    6.05
With outlet effects    161408   -0.03    2.97   -3.75   -1.48   -0.27    1.17    4.59

a This table shows descriptive statistics for the distribution of prediction errors (actual value minus predicted value) for the news model without and the news model with audience outlet effects included. The actual value for Panel A: Coverage decision is 1 if an outlet covers an EA and 0 otherwise. The predicted value is the posterior mean of the probability of covering that EA. The other two panels use the actual number of negative (complex financial) words minus the posterior mean of the predicted number of negative (complex financial) words.
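The prediction-error statistics reported in Table 9 can be reproduced in outline as follows. This is a sketch under stated assumptions, not the author's code: the function name, the example inputs, and the percentile interpolation rule are hypothetical, and the predicted values stand in for posterior means from a fitted model.

```python
from statistics import mean, quantiles, stdev


def prediction_error_summary(actual, predicted):
    """Descriptive statistics of prediction errors (actual minus predicted)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    # n=20 yields cut points at 5%, 10%, ..., 95%; index k-1 is the (5k)th percentile.
    cuts = quantiles(errors, n=20, method="inclusive")
    return {
        "N": len(errors),
        "Mean": mean(errors),
        "StD": stdev(errors),
        "P5%": cuts[0],
        "P25%": cuts[4],
        "P50%": cuts[9],
        "P75%": cuts[14],
        "P95%": cuts[18],
    }


# Hypothetical coverage decisions (1 = covered) and predicted coverage probabilities.
actual = [1, 0, 0, 0, 1]
predicted = [0.8, 0.1, 0.05, 0.02, 0.3]
summary = prediction_error_summary(actual, predicted)
```

Comparing the StD and the outer percentiles of two models on the same observations, as the table does, shows whether adding outlet effects tightens the error distribution.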
