The Effects of Information-Based Trading on the Daily Returns and Risks of
Individual Stocks
Xiangkang Yin and Jing Zhao
La Trobe University
First Version: 27 March 2013
This Version: 2 April 2014
Corresponding author, Department of Finance, La Trobe Business School, La Trobe University, Bundoora, Victoria 3086, Australia. Tel: 61-3-9479 3120, Fax: 61-3-9479 1654, Email: [email protected]. The authors would like to thank Xiaozhou, Zhou, Rong Wang and participants of 2014 MFA Annual Conference, 26th Australasian Finance and Banking Conference, and seminars at La Trobe University, Audencia Nantes School of Management and Monash University for constructive comments.
ABSTRACT
This paper investigates the dynamic relation between information-based trading of a stock and its
daily return and risk. It develops a theoretical model to motivate the regression specifications for
empirical analysis. Based on two samples of stocks, we demonstrate that the expected trading
imbalances of a stock determine its daily return while the expected trades determine its volatility.
Trading imbalance arising from private information plays a dominant role in determining return,
but trading due to disputable public information is the dominant contributor to risk. Public-information
trading is closely associated with idiosyncratic risk rather than systematic risk.
JEL Classification: D82, G12, G14
Keywords: Information-based trading, Return volatility, Systematic risk, Idiosyncratic risk
The important role played by information in securities trading is well recognized. A common
theme of information-based trading is the adverse selection issue caused by private information
as analyzed in the seminal works of Grossman (1976), Glosten and Milgrom (1985), and Kyle (1985).
Because uninformed market players are at the risk of trading with privately informed speculators,
they require an information risk premium to compensate for their potential loss to informed
traders. In this sense, information risk is likely to be a risk factor determining asset return
differentials in a cross-section. Easley, Kiefer, O’Hara, and Paperman (1996) develop the
concept of PIN (Probability of INformed trading) to measure trading motivated by privately
informed investors. Since then, the PIN measure has been widely adopted in the literature, and
Easley, Hvidkjaer and O’Hara (2002, 2010) show that PIN is priced and can explain the cross-sectional
difference of securities returns.1 Duarte and Young (2009) introduce Symmetric
Order-flow Shocks (SOSs) into the original PIN model and define PSOS
(Probability of SOS) to isolate the illiquidity component of PIN.2 They find that the PSOS measure
is also a risk factor explaining the cross-sectional difference of stock returns.
This paper studies a related but different issue. It focuses on the effects of information-
based trading of an individual stock on its daily return and risk. Similar to Duarte and Young
(2009), we consider two types of information-based trading. The first type of trading is
originated from privately informed traders, who observe some private signals of the stock and
thus have superior information over other market players. They take this informational
advantage to buy or sell the stock to maximize their profits. The second type is caused by
1 Mohanram and Rajgopal (2009) replicate Easley, Hvidkjaer and O’Hara’s (2002) work and show that although PIN is priced for the sample period 1984-1988 it does not constitute a risk factor for the period of 1998-2002.
2 Symmetric Order-flow Shock (SOS) has two explanations in Duarte and Young (2009). One cause of an SOS is the occurrence of a public information event. Because traders have different opinions or interpretations about the public information, both buy and sell orders increase when the public information is released. The other cause of an SOS is that traders simply coordinate trading at particular times to reduce trading costs. This paper inclines to the first interpretation. We will call trading activities induced by disagreement on a piece of public information disputable-public-information-induced trading or simply SOS trading.
different opinions or beliefs about a public information event such as an earnings announcement or
disclosure of a new investment opportunity. Investors who are optimistic about the news buy the
asset while those who are pessimistic sell it. Our main findings can be summarized as
follows. First, the expected amounts of net buy orders (i.e., the expected trading imbalances
from various sources) determine daily stock return, while the expected amounts of trades
(including both buys and sells) determine return risk. Second, privately informed trading
dominates SOS trading in both marginal and total effects on stock return,3 while SOS trading
dominates privately informed trading in return risk. Third, SOS trading or trading induced by
disputable public information has a significant effect on total risk (return variance) of individual
stocks and idiosyncratic risk but its effect on systematic risk (market beta of stock return) is
mild.4 Fourth, the size of a firm matters in the sense that smaller firms provide stronger
evidence for the first three findings.
The intuitions behind the findings, particularly the first finding, are straightforward. On
a trading day with good (bad) news about a stock, privately informed speculators will quietly buy
(sell) it, which pushes the stock price up (down) and leads the market to end up with a high (low) daily
return. Such one-sided trading moves price in one direction and does not necessarily cause
excess volatility of return. On the other hand, a piece of disputable information can trigger a
surge in both buy and sell order flows. Such symmetric order-flow shocks lead price and return
to fluctuate but they do not substantially move return in one direction. We formalize these ideas
in a theoretical model based on Brennan, Chordia, Subrahmanyam, and Tong (2012), which
3 The marginal effect is defined as the change of dependent variable caused by one unit change of explanatory variable. The total effect is measured by the average of the absolute values of the marginal effect times the daily explanatory variable over the estimation window divided by the average of the absolute values of dependent variable over the same time period.
4 Idiosyncratic risk is measured by the standard deviation of return residuals obtained through a market model of intraday return.
builds up a relation between order flow and price change. We enrich the model by assuming that
order-flows arrive at the market following independent Poisson processes (Easley, Hvidkjaer and
O’Hara (2002), and Roşu (2009)). Following the model’s predictions, our empirical analysis
concentrates on the effects of expected trading imbalances caused by privately informed traders
and SOS traders on return and the effects of expected trades from these traders on stock risk.
Inspired by the concepts of PIN and PSOS,5 we use expected numbers of buy and sell
orders from different types of traders to characterize information-based trading. We use two sets
of measures. The first set includes expected numbers of net buy orders from informed traders
and SOS traders and expected numbers of trades from these traders. The second set adopts
relative measures; that is, all these expected numbers are scaled by the expected number of total
orders submitted by all investors.6 The measures based on expected numbers directly match the
variables used in the theoretical model and in general lead to better performance of
regression models. But the scaled measures are unit-free, which facilitates comparison across
stocks. Standard PIN and PSOS are time invariant over the estimation window, typically a couple
of months to a year. Since we intend to investigate stock return and risk at daily level, we
require dynamic or daily measures of information-based trading. To this end, we adopt the
Hidden Markov Model (HMM) approach developed by Yin and Zhao (2014), which can estimate
various expected numbers of buy orders and sell orders at daily level or even at a higher
frequency. Their simulation and empirical analyses demonstrate the HMM approach can
generate quite accurate estimates measuring information-based trading.
5 PIN is usually defined as the ratio of the expected number of buy and sell orders stemming from informed traders to the expected total trades, while PSOS is defined as the ratio of the expected number of trades from SOS traders to the expected total number of trades. Thus, they are our scaled measures of total orders from privately informed trading and SOS trading, respectively.
6 Our theoretical model and empirical analysis include a third type of trading, which is motivated by liquidity needs and is irrelevant to private or public information.
This paper is closely related to the growing literature on the PIN model and its variants as
we have mentioned. But instead of the effects of information on the cross-sectional differentials
of expected stock returns, we focus on how information drives the evolution of daily return and
risk of each individual stock. Our premise is that stock price is continuously adjusted to private
and/or disputable public information through trading dynamics. This evolution dictates the level
and variation of daily return of each stock. Although informed trading may lift the average return over a long
period because of adverse selection, negative private information should drive return down in
the short run. More importantly, it can be shown that it is disputable public information rather
than private information that dominates the effect on the total risk and idiosyncratic risk of stock
return. This is to a certain extent consistent with Duarte and Young’s (2009) argument that it is
the symmetric order-flow shock (SOS) rather than private information that is decisive to the
differential of expected returns across stocks.
This paper is also related to Chordia and Subrahmanyam (2004), who investigate the
relation between order imbalance and daily return of individual risky assets. Their findings are
consistent with ours in the effect of trading imbalance on daily return, although they are more
interested in revealing how market makers dynamically accommodate autocorrelated imbalances
emanating from large traders, in order to explore the positive relation between lagged imbalances
and return. Their key explanatory variable, order imbalance, is daily observation of the
difference between buy and sell orders, which differs from our measures in two ways. First, our
order imbalance is measured by the difference of expected buy and sell orders rather than
observed numbers. This treatment enables us to filter the noise in daily observations. Second,
we decompose order imbalance according to three trading motives, while Chordia and
Subrahmanyam (2004) do not consider such decomposition. By separating trading types, we are
able to specify which type of trading is the dominant contributor to the level of daily return.
Moreover, Chordia and Subrahmanyam (2004) do not consider the risk of stock return. Instead,
we address total risks, systematic risks and idiosyncratic risks of individual stocks.
Our finding of the strong link between dispersed beliefs on stock value and total risk
supports Banerjee and Kremer’s (2010) prediction that a jump in the difference of
opinions leads to an increase in return volatility. On the other hand, our finding of the
insignificant relation between SOS trading and systematic risk for most sample stocks is related
to Patton and Verardo (2012). They propose a model where investors can use public information
to extract information of the aggregate economy and find that daily realized market beta is higher
on earnings announcement days but it declines on post-announcement days to a level below its
non-event average. Thus, the relation between market beta and trading due to disputable public
news over a period could be quite ambiguous, which is consistent with our finding. Rees and
Thomas (2010) note that forecast dispersion proxies for idiosyncratic uncertainty about future
cash flows during earnings announcements, which is in line with our empirical result that
disagreement on public news is positively related to idiosyncratic risk.
The remainder of this paper is organized as follows. Section I develops a theoretical
model of the relation between return dynamics and trading activities. Section II introduces the
approach of estimating the daily measures of information-based trading. Data and samples are
described in Section III. Section IV examines the effects of different types of trades on daily
return and total risk, while Section V studies the dynamic relationship between information-based
trading and systematic or idiosyncratic risk. Further tests and robustness checks are
briefly reported in Section VI. The concluding remarks are provided in the last section.
I. A Theoretical Model and Its Specifications for Empirical Analysis
To motivate our empirical analysis, we develop a dynamic model associating asset return
and its volatility with information-based trading, based on the formulation of Brennan, Chordia,
Subrahmanyam, and Tong (2012), which originates from Glosten and Harris (1988). Let $m_{t,i}$
denote the expected value of a risky asset, conditional on the public information available
immediately after the $i$th transaction of day $t$. Similar to Brennan et al. (2012), we assume that
$m_{t,i}$ evolves according to
$$m_{t,i} = m_{t,i-1} + \lambda q_{t,i} + y_{t,i}, \tag{1}$$
where $q_{t,i}$ is the order size of the $i$th transaction of day $t$, with $q_{t,i} > 0$ corresponding to a
buyer-initiated trade and $q_{t,i} < 0$ a seller-initiated trade, and $\lambda$ is the (inverse) market depth parameter.
Thus, $\lambda q_{t,i}$ in (1) reflects the revision in expectations upon an order arrival. It captures the
adverse selection effect of a transaction on price because the transaction can potentially contain
private information unknown to the market. Term $y_{t,i}$ is the unobservable innovation between
the $(i-1)$th and $i$th transactions due to the arrival of public information, of which all market
players have a unanimous view. Let $D_{t,i}$ denote the direction of the $i$th transaction of day $t$, i.e.,
$D_{t,i} = 1$ if $q_{t,i} > 0$ and $D_{t,i} = -1$ if $q_{t,i} < 0$. Brennan et al. (2012) further consider a fixed
component of transaction costs, $\psi$, to account for inventory holding costs and fixed costs.
Therefore, the transaction price of the $i$th trade of day $t$, $p_{t,i}$, can be written as
$$p_{t,i} = m_{t,i} + \psi D_{t,i}. \tag{2}$$
Using (1) and (2), the price change between two transactions is equal to
$$\Delta p_{t,i} \equiv p_{t,i} - p_{t,i-1} = \lambda q_{t,i} + \psi (D_{t,i} - D_{t,i-1}) + y_{t,i}. \tag{3}$$
If there are $N_t$ transactions on day $t$, the total return of that day can be represented in terms of
aggregate trade-by-trade price changes, i.e., $R_t = \sum_{i=1}^{N_t} \Delta p_{t,i}$. Our focus is how the expected
return and volatility are affected by information-based trading. For simplicity of exposition,
we assume that the order size of each transaction is constant and normalize it to one share so that
$q_{t,i} = D_{t,i}$. Such a simplifying assumption is widely adopted in theoretical analysis (see, for
example, Glosten and Milgrom (1985)). It is also consistent with the concepts of PIN and PSOS
measures in the empirical analysis, where the number of transactions rather than trading volume
is considered. Based on the trade directions of order flows, we can determine the number of
buyer-initiated orders on day $t$, $B_t = \sum_{i=1}^{N_t} \max(D_{t,i}, 0)$, and the number of seller-initiated orders,
$S_t = -\sum_{i=1}^{N_t} \min(D_{t,i}, 0)$.
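As an illustration of the price dynamics in (1)-(3), the following minimal sketch simulates one day of unit-size trades. It is not the authors' code. The function name `simulate_day`, the parameter names `lam` (market depth parameter), `psi` (fixed cost component), `p_buy`, `sigma_y` and `m0`, and all numerical values are assumptions chosen for the example, as is the normal distribution used for the public-information innovations.

```python
import numpy as np

def simulate_day(lam, psi, n_trades, p_buy, sigma_y, m0=100.0, seed=0):
    """Simulate one day of unit-size trades under equations (1)-(3)."""
    rng = np.random.default_rng(seed)
    # Trade directions: +1 for a buyer-initiated trade, -1 for a seller-initiated one.
    d = rng.choice([1, -1], size=n_trades, p=[p_buy, 1.0 - p_buy])
    # Public-information innovations between consecutive trades (assumed normal).
    y = rng.normal(0.0, sigma_y, size=n_trades)
    # Equation (1): expected value after each trade, with order size normalized to one.
    m = m0 + np.cumsum(lam * d + y)
    # Equation (2): transaction price = expected value + fixed cost in the trade direction.
    p = m + psi * d
    # Daily counts of buys and sells recovered from the trade directions.
    buys = int(np.sum(np.maximum(d, 0)))
    sells = int(-np.sum(np.minimum(d, 0)))
    return d, y, p, buys, sells

lam, psi = 0.01, 0.005  # hypothetical parameter values
d, y, p, buys, sells = simulate_day(lam, psi, n_trades=500, p_buy=0.55, sigma_y=0.002)

# Summing equation (3) over trades 2..N telescopes to the price change over the day.
total_change = p[-1] - p[0]
implied = lam * d[1:].sum() + psi * (d[-1] - d[0]) + y[1:].sum()
```

The two quantities `total_change` and `implied` agree up to floating-point error, which is the telescoping property used when the daily return is written as a sum of trade-by-trade price changes.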
In the literature, it is widely assumed that the arrivals of buy orders and sell orders are
independent Poisson processes (see, for example, Easley, Hvidkjaer and O’Hara (2002) and Roşu
(2009)). Thus, we assume $B_t$ and $S_t$ follow Poisson distributions with parameters $\Lambda_{B;t}$ and $\Lambda_{S;t}$,7
respectively. A key innovation of our model is that it allows the distributions of buy and sell
order flows to vary from day to day, as reflected by the time-varying parameters $\Lambda_{B;t}$ and $\Lambda_{S;t}$. The
total number of trades on day $t$, $N_t = B_t + S_t$, is also Poisson distributed with time-varying
mean and variance $\Lambda_{B;t} + \Lambda_{S;t}$. We then obtain the probability of an arriving order being
buyer-initiated,
$$\Pr(D_{t,i} = 1) = \frac{\Lambda_{B;t}}{\Lambda_{B;t} + \Lambda_{S;t}},$$
and the probability of it being seller-initiated,
$$\Pr(D_{t,i} = -1) = \frac{\Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}.$$
The first two moments of the distribution of the order flow become
$$E(D_{t,i}) = \frac{\Lambda_{B;t} - \Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}} \quad \text{and} \quad E(D_{t,i}^2) = 1,$$
so that
$$\operatorname{Var}(D_{t,i}) = 1 - \left(\frac{\Lambda_{B;t} - \Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}\right)^2.$$
Applying these results, we obtain the
following proposition by routine computation.
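The moments of the trade direction can be verified in a few lines. The worked step below is a sketch in notation assumed for this illustration: $\Lambda_{B;t}$ and $\Lambda_{S;t}$ stand for the day-$t$ arrival rates of buys and sells, and $D_{t,i}$ for the direction ($+1$ or $-1$) of the $i$th trade. It uses only the fact that each arriving order is a buy with probability proportional to the buy arrival rate.

```latex
\begin{align*}
E(D_{t,i}) &= (+1)\,\frac{\Lambda_{B;t}}{\Lambda_{B;t}+\Lambda_{S;t}}
            + (-1)\,\frac{\Lambda_{S;t}}{\Lambda_{B;t}+\Lambda_{S;t}}
            = \frac{\Lambda_{B;t}-\Lambda_{S;t}}{\Lambda_{B;t}+\Lambda_{S;t}}, \\
E(D_{t,i}^2) &= (+1)^2 \Pr(D_{t,i}=1) + (-1)^2 \Pr(D_{t,i}=-1) = 1, \\
\operatorname{Var}(D_{t,i}) &= E(D_{t,i}^2) - \bigl[E(D_{t,i})\bigr]^2
            = 1 - \left(\frac{\Lambda_{B;t}-\Lambda_{S;t}}{\Lambda_{B;t}+\Lambda_{S;t}}\right)^2
            = \frac{4\,\Lambda_{B;t}\Lambda_{S;t}}{(\Lambda_{B;t}+\Lambda_{S;t})^2}.
\end{align*}
```

For example, with hypothetical rates $\Lambda_{B;t} = 60$ and $\Lambda_{S;t} = 40$, the expected direction is $0.2$ and the variance is $0.96$.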
7 The Poisson distribution is a one-parameter distribution with its mean (arrival rate) equal to variance. We will use order arrival rate and the expected number of orders interchangeably.
Proposition 1. If price response to order flows follows (3) and buy and sell orders arrive
following independent Poisson processes with parameters $\Lambda_{B;t}$ and $\Lambda_{S;t}$, respectively, the expected
total return and variance on trading day $t$ are
$$E(R_t) = \lambda (\Lambda_{B;t} - \Lambda_{S;t}) + E(Y_t), \tag{4}$$
$$\operatorname{Var}(R_t) = \lambda^2 (\Lambda_{B;t} + \Lambda_{S;t}) + 2\psi^2 \left[1 - \left(\frac{\Lambda_{B;t} - \Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}\right)^2\right] + \operatorname{Var}(Y_t), \tag{5}$$
where $Y_t \equiv \sum_{i=1}^{N_t} y_{t,i}$.
Proof: See Appendix A.
Since $\Lambda_{B;t}$ and $\Lambda_{S;t}$ are arrival rates and positive, we have
$\left|\frac{\Lambda_{B;t} - \Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}\right| < 1$. Therefore,
$\left(\frac{\Lambda_{B;t} - \Lambda_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}\right)^2$ is a higher-order small term and is negligible in comparison to other terms. On the
other hand, $\psi$ reflects the fixed costs, which are small and are not the focal point of this study. If
we set it to zero, the second term on the right-hand side of (5) disappears. With these
considerations, we rewrite (5) as
$$\operatorname{Var}(R_t) = \lambda^2 (\Lambda_{B;t} + \Lambda_{S;t}) + \operatorname{Var}(Y_t) + \text{Constant and higher-order terms}. \tag{6}$$
In (4)-(6), $Y_t$ represents the effects of non-disputable public news on day $t$, which is not
associated with abnormal trading (see, for example, Llorente, Michaely, Saar and Wang (2002)).
An important insight of the model is that the expected daily return of a risky asset is related
to the expected trade imbalance, while the volatility of return is positively related to the expected
total trade if the higher-order effects are ignored. Moreover, expected daily return has direction
in the sense that whether it is positive or negative depends on whether the expected amount of
net buy orders of the day is positive or negative. This distinguishes the model from asset pricing
models, where the focus is on the relation between expected long-term return and risk factors but return
direction is not a concern.
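The main message of Proposition 1 (mean return driven by the expected imbalance, variance driven by the expected total trade) can be checked numerically. The sketch below is a Monte Carlo illustration under simplifying assumptions that are not taken from the paper: the fixed cost component is set to zero, the public-news term is drawn as a single normal shock per day independent of trading, and the names `lam`, `rate_buy`, `rate_sell` and `sigma_Y` and their values are hypothetical.

```python
import numpy as np

def simulate_daily_returns(lam, rate_buy, rate_sell, sigma_Y, n_days, seed=0):
    """Daily return = lam * (buys - sells) + public-news shock, with zero fixed cost."""
    rng = np.random.default_rng(seed)
    buys = rng.poisson(rate_buy, size=n_days)    # daily number of buyer-initiated trades
    sells = rng.poisson(rate_sell, size=n_days)  # daily number of seller-initiated trades
    Y = rng.normal(0.0, sigma_Y, size=n_days)    # daily public-news effect (assumed normal)
    return lam * (buys - sells) + Y

lam, rate_buy, rate_sell, sigma_Y = 0.01, 120.0, 100.0, 0.05
r = simulate_daily_returns(lam, rate_buy, rate_sell, sigma_Y, n_days=200_000)

# Theoretical moments implied by (4) and (6) when E(Y) = 0 and the fixed cost is zero.
expected_mean = lam * (rate_buy - rate_sell)
expected_var = lam**2 * (rate_buy + rate_sell) + sigma_Y**2
```

Comparing `r.mean()` with `expected_mean` and `r.var()` with `expected_var` shows that the sample moments settle near the theoretical values as the number of simulated days grows.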
To study information-based trading, we should separate different types of transactions.
We consider three trading motives, as mentioned in the Introduction. The first is liquidity
needs, when investors want to adjust their portfolios to hedge their risks or rebalance their
portfolios due to some exogenous shocks. The second type of trading activity is generated by
speculative investors who have private information on the fundamental value of the asset. The
third type of trades comes from symmetric order flow shocks, such as a disputable public news
event, whose occurrence induces some investors to buy the asset but others to sell. In
Proposition 1, the numbers of buy orders $B_t$ and sell orders $S_t$ aggregate these three types of
orders. Let $\varepsilon_{B;t}$ ($\varepsilon_{S;t}$) be the arrival rate of liquidity buys (sells) on trading day $t$, $\mu_{B;t}$ ($\mu_{S;t}$) the
arrival rate of privately informed buys (sells), and $\delta_{B;t}$ ($\delta_{S;t}$) the arrival rate of buys (sells) due to
symmetric order flow shocks. Then, the means of $B_t$ and $S_t$, $\Lambda_{B;t}$ and $\Lambda_{S;t}$, can be decomposed
into three components:
$$\Lambda_{B;t} = \varepsilon_{B;t} + \mu_{B;t} + \delta_{B;t}, \qquad \Lambda_{S;t} = \varepsilon_{S;t} + \mu_{S;t} + \delta_{S;t}. \tag{7}$$
Since liquidity trading constitutes the base of each day’s trading activities, the associated
arrival rates $\varepsilon_{B;t}$ and $\varepsilon_{S;t}$ are strictly positive for all trading days. The remaining two components
in $\Lambda_{B;t}$ and $\Lambda_{S;t}$ can be zero if the trading day has no private information and/or disputable
public information. Moreover, we require that $\mu_{B;t}\,\mu_{S;t} = 0$ since a private signal is either
positive or negative, which induces informed traders either to buy or to sell. In contrast, $\delta_{B;t}$ and
$\delta_{S;t}$ are either both equal to zero if there is no disputable news event or both positive if
there is such an event. Barclay and Warner (1993), Hasbrouck (1995), and Chakravarty (2001)
document evidence of a disproportionately greater price impact that is attributable to informed
trading. Alexander and Peterson (2007) also note that trades resulting in greater proportional
price impacts are more likely to have been made by informed traders than noise or liquidity
traders. Therefore, we modify the impact of trading on price, $\lambda$, by considering these three
different types of orders in the empirical analysis. Furthermore, $Y_t$ can be treated as an independent
error term in the empirical analysis. In light of Proposition 1, we estimate the following regression
relations for each individual stock:
$$R_t = \alpha_0 + \alpha_1 (\varepsilon_{B;t} - \varepsilon_{S;t}) + \alpha_2 (\mu_{B;t} - \mu_{S;t}) + \alpha_3 (\delta_{B;t} - \delta_{S;t}) + \alpha_4 R_{t-1} + e_t, \tag{8}$$
$$RV_t = \beta_0 + \beta_1 (\varepsilon_{B;t} + \varepsilon_{S;t}) + \beta_2 (\mu_{B;t} + \mu_{S;t}) + \beta_3 (\delta_{B;t} + \delta_{S;t}) + \beta_4 RV_{t-1} + \eta_t, \tag{9}$$
where we use $RV_t$ to denote realized volatility to proxy for the variance of return. We include a
lagged term in the regressions to capture the persistence of dependent variables.
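To make the return regression concrete, here is a minimal sketch that fits a specification of this form by ordinary least squares on synthetic data. It is illustrative only: the function names, the ordering of the regressors, the use of the lagged dependent variable as the lagged term, and the synthetic coefficient values are assumptions, and the synthetic returns are generated without noise so that the fit can be checked exactly. The helper `total_effect` follows the verbal definition of the total effect given in footnote 3.

```python
import numpy as np

def fit_return_regression(r, imb_liq, imb_inf, imb_sos):
    """OLS of the daily return on three expected imbalances and the lagged return."""
    y = r[1:]
    X = np.column_stack([
        np.ones(len(y)),  # intercept
        imb_liq[1:],      # expected net liquidity buys
        imb_inf[1:],      # expected net privately informed buys
        imb_sos[1:],      # expected net SOS buys
        r[:-1],           # lagged return
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def total_effect(beta, x, y):
    """Average |beta * x_t| divided by average |y_t| over the estimation window."""
    return np.mean(np.abs(beta * x)) / np.mean(np.abs(y))

# Synthetic data: hypothetical imbalances and hypothetical true coefficients.
rng = np.random.default_rng(1)
T = 500
imb_liq = rng.normal(0.0, 5.0, T)
imb_inf = rng.normal(0.0, 20.0, T)
imb_sos = rng.normal(0.0, 10.0, T)
true = np.array([0.001, 0.0, 0.0004, 0.0001, 0.05])
r = np.zeros(T)
for t in range(1, T):
    r[t] = (true[0] + true[1] * imb_liq[t] + true[2] * imb_inf[t]
            + true[3] * imb_sos[t] + true[4] * r[t - 1])

coef = fit_return_regression(r, imb_liq, imb_inf, imb_sos)
te_informed = total_effect(coef[2], imb_inf[1:], r[1:])
```

With real data the residual is not zero, so standard errors would also be needed. A volatility regression of the same shape would replace the imbalances with expected total trades and the return with realized volatility.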
The variables we use to measure trading activities are expected numbers of buy and
sell orders on a trading day. For instance, $\delta_{B;t} - \delta_{S;t}$ and $\delta_{B;t} + \delta_{S;t}$ are the expected order
imbalance and the expected orders originating from SOS traders. Such raw measures are not
convenient for cross-sectional comparison if some stocks are heavily traded while others
are traded very lightly. To facilitate cross-sectional comparison, we scale these measures by the expected
number of total trades (or the arrival rate of all orders), $\Lambda_{B;t} + \Lambda_{S;t}$. Such scaling actually leads
us to the PIN developed by Easley et al. (1996) and the PSOS introduced by Duarte and Young
(2009):
$$PIN_t = \frac{\mu_{B;t} + \mu_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}, \qquad PSOS_t = \frac{\delta_{B;t} + \delta_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}. \tag{10}$$
However, we would like to point out that the original PIN and PSOS measures are constant over
the estimation window. The PIN and PSOS used in this paper are estimated by the Hidden
Markov Model (HMM) approach (see the next section), which vary from day to day. We further
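define the ratio of the expected number of net buys due to private information to the expected
total number of trades as PNbIN (probability of net buys due to private information) and the ratio
of the expected number of net buys due to SOSs to the expected total number of trades as
PNbSOS (probability of net buys due to symmetric order flow shocks):
$$PNbIN_t = \frac{\mu_{B;t} - \mu_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}, \qquad PNbSOS_t = \frac{\delta_{B;t} - \delta_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}. \tag{11}$$
Then, we can estimate an alternative version of regression models (8)-(9):
$$R_t = a_0 + a_1 PNbIN_t + a_2 PNbSOS_t + a_3 R_{t-1} + e_t, \tag{12}$$
$$RV_t = b_0 + b_1 PIN_t + b_2 PSOS_t + b_3 RV_{t-1} + \eta_t. \tag{13}$$
In these regressions, measures of liquidity trading are not explicitly included for two reasons.
First, $\frac{\varepsilon_{B;t} + \varepsilon_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}} + PIN_t + PSOS_t \equiv 1$, so that
$\frac{\varepsilon_{B;t} + \varepsilon_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}$ should not be included in (13). Second, because the
expected number of liquidity buys is expected to be similar to that of liquidity sells on average,
$\frac{\varepsilon_{B;t} - \varepsilon_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}$ is almost zero and negligible.8 To make (12) in the same format as (13), the expected net
liquidity buy, $\frac{\varepsilon_{B;t} - \varepsilon_{S;t}}{\Lambda_{B;t} + \Lambda_{S;t}}$, is also dropped in (12).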
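The scaled measures are simple ratios of expected order counts, and the adding-up identity that motivates dropping the liquidity share is easy to verify. The sketch below is illustrative: the function name `scaled_measures`, its argument names, and the example arrival rates are hypothetical and not taken from the paper.

```python
def scaled_measures(liq_buy, liq_sell, inf_buy, inf_sell, sos_buy, sos_sell):
    """Scale expected order counts by the expected total number of trades."""
    total = liq_buy + liq_sell + inf_buy + inf_sell + sos_buy + sos_sell
    if total <= 0:
        raise ValueError("expected total number of trades must be positive")
    return {
        "PIN": (inf_buy + inf_sell) / total,        # share of privately informed trades
        "PSOS": (sos_buy + sos_sell) / total,       # share of SOS trades
        "PNbIN": (inf_buy - inf_sell) / total,      # scaled net informed buys
        "PNbSOS": (sos_buy - sos_sell) / total,     # scaled net SOS buys
        "liquidity_share": (liq_buy + liq_sell) / total,
    }

# Hypothetical day: good private news (informed buys only) plus a disputable public event.
m = scaled_measures(liq_buy=200.0, liq_sell=200.0,
                    inf_buy=50.0, inf_sell=0.0,
                    sos_buy=75.0, sos_sell=75.0)
identity = m["liquidity_share"] + m["PIN"] + m["PSOS"]
```

In this example the liquidity share, PIN and PSOS sum to one, and the net SOS measure is zero because SOS buys and sells are equal.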
To preserve tractability, we analyze price change (total return) in the theoretical model,
which is standard practice in the microstructure literature on informed trading (see for example,
Hong and Stein (1999); Chordia and Subrahmanyam (2004)). However, in the empirical
analysis, in order to preserve comparability in the cross-section, we analyze simple daily returns
and report the associated results since the implications of the theoretical model hold for both price
changes and simple returns. Total return and log return, as alternative measures to simple return,
are considered in the robustness checks.
8 Some empirical studies actually presume $\varepsilon_{B;t} \equiv \varepsilon_{S;t}$; see, for example, Easley, Engle, O’Hara and Wu (2008).
II. Daily Measures of Information-Based Trading
In order to empirically test the effects of information-based trading on daily return and
risk of an individual asset, we need to estimate the daily order arrival rates according to their
trading motives to measure information-based trading. Although the estimation of the standard
PIN and PSOS measures of Easley, Hvidkjaer, and O’Hara (2002) and Duarte and Young (2009)
has been discussed thoroughly in the literature, these conventional measures are constant over
the whole estimation period, say a quarter, and cannot readily capture short-term
variations in information-based trading. To overcome the difficulty, Yin and Zhao (2014)
develop a new Hidden Markov Model (HMM) approach, which can estimate daily measures with
satisfactory accuracy. This section briefly outlines the approach and the estimation process.
The core of this approach is a Hidden Markov Model (HMM), which links the observable
trading activities to the unobservable information state of the market of a risky asset. The
information state is used to describe whether private and/or public information events of the asset
occur or not, and if they occur, how intense they are.9 More specifically, it is characterized by
the expected numbers of buyer-initiated and seller-initiated orders arriving at the market. Each
trading day is associated with a distribution of information states and its evolution portrays the
trading process of the risky asset. Formally, the HMM consists of two parts: a two-dimensional
unobservable stochastic process of states, $\{C_t \equiv (C_{B;t}, C_{S;t}) : t = 1, \dots, T\}$, satisfying the Markov
property; and a bivariate state-dependent trading process, $\{X_t \equiv (B_t, S_t) : t = 1, \dots, T\}$. In this
model, $T$ is the time horizon being considered, $C_t$ indicates the hidden state at time $t$, and $B_t$ and
$S_t$ represent the observable time series of buyer- and seller-initiated trades, respectively. The
9 For instance, informed investors may receive an extremely good signal about a company or one that is just slightly better than expected. The private signal can be observed by either a very limited number of investors or a relatively large number of investors. For a public information event, divergence in opinions can be either profound or mild.
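distributions of $B_t$ and $S_t$ depend only on the current state $C_t$ and not on previous states or trades,
i.e.,
$$\Pr(C_t \mid C^{(t-1)}) = \Pr(C_t \mid C_{t-1}) \quad \text{and} \quad \Pr(X_t \mid X^{(t-1)}, C^{(t)}) = \Pr(X_t \mid C_t),$$
where $C^{(t)} \equiv (C_1, C_2, \dots, C_t)$ and $X^{(t)} \equiv (X_1, X_2, \dots, X_t)$. Although the Markov property implies
that conditioning on the history of the process up to time $t$ is equivalent to conditioning only on
the most recent value of $C_t$, there exists a dependence structure in the evolution of hidden states.
The transition matrix of this 2-dimensional Markov chain can be written as
$$\Gamma = \begin{pmatrix}
\gamma_{1,1;1,1} & \gamma_{1,1;1,2} & \cdots & \gamma_{1,1;m,n-1} & \gamma_{1,1;m,n} \\
\gamma_{1,2;1,1} & \gamma_{1,2;1,2} & \cdots & \gamma_{1,2;m,n-1} & \gamma_{1,2;m,n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\gamma_{m,n-1;1,1} & \gamma_{m,n-1;1,2} & \cdots & \gamma_{m,n-1;m,n-1} & \gamma_{m,n-1;m,n} \\
\gamma_{m,n;1,1} & \gamma_{m,n;1,2} & \cdots & \gamma_{m,n;m,n-1} & \gamma_{m,n;m,n}
\end{pmatrix},$$
where
$$\gamma_{i,j;k,l} \equiv \Pr(C_{B;t+1} = k, C_{S;t+1} = l \mid C_{B;t} = i, C_{S;t} = j)$$
is the probability that the state is $(k,l)$ at time $t+1$ conditional on it being $(i,j)$ at time $t$, and $m$
and $n$ are the ranges of the two components of the hidden state. The unconditional probability of the
hidden state being in state $(i,j)$ at time $t$, $u_{i,j;t} \equiv \Pr(C_{B;t} = i, C_{S;t} = j)$, is a key variable of any
HMM. Denoting these probabilities by the row vector
$$u_t \equiv (u_{1,1;t}, u_{1,2;t}, \dots, u_{1,n;t}, \dots, u_{m,1;t}, \dots, u_{m,n-1;t}, u_{m,n;t}),$$
we can deduce the distribution of states at time $t+1$ from its distribution at time $t$ by
$u_{t+1} = u_t \Gamma$. Moreover, the distribution of future information states, over a forecast horizon of $h$ days,
can be calculated by $u_{t+h} = u_t \Gamma^h$.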
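The propagation of the state distribution is a vector-matrix product, and an h-day forecast is the same product with a matrix power. The following sketch illustrates this for a small chain. The function name `forecast_state_distribution`, the variable names, the ordering of the joint states, and the transition probabilities are all hypothetical choices made for the example.

```python
import numpy as np

def forecast_state_distribution(u_t, gamma, h):
    """Return the state distribution h days ahead: u_t times gamma to the power h."""
    u_t = np.asarray(u_t, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    if not np.allclose(gamma.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to one")
    return u_t @ np.linalg.matrix_power(gamma, h)

# Two buy states and two sell states give four joint states,
# ordered here as (1,1), (1,2), (2,1), (2,2).
gamma = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
u_t = np.array([1.0, 0.0, 0.0, 0.0])  # the chain is known to be in state (1,1) today

u_next = forecast_state_distribution(u_t, gamma, h=1)  # equals the first row of gamma
u_week = forecast_state_distribution(u_t, gamma, h=5)  # five trading days ahead
```

Because each row of the transition matrix sums to one, every forecast remains a probability distribution over the joint states.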
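Consistent with prior literature and the assumption in Section I, buy and sell order flows
arrive at the market according to a bivariate independent Poisson process for state $(i,j)$.10 Thus,
given the state being $(i,j)$, the probability of observing $b$ buy orders and $s$ sell orders at time $t$,
$\Pr(X_t = (b,s) \mid C_{B;t} = i, C_{S;t} = j)$, is $p_{B;i}(b)\,p_{S;j}(s)$, where
$$p_{B;i}(b) = \frac{e^{-\lambda_{B;i}} \lambda_{B;i}^{\,b}}{b!} \quad \text{and} \quad p_{S;j}(s) = \frac{e^{-\lambda_{S;j}} \lambda_{S;j}^{\,s}}{s!}.$$
In the above expressions, $\lambda_{B;i}$ and $\lambda_{S;j}$ are the arrival rates of buys and sells, respectively, when
the buy state is $i$ and the sell state is $j$. The marginal distribution of observing $(b,s)$ at time $t$ can
be calculated by
$$\Pr(X_t = (b,s)) = \sum_{i=1}^{m} \sum_{j=1}^{n} \Pr(X_t = (b,s) \mid C_{B;t} = i, C_{S;t} = j) \Pr(C_{B;t} = i, C_{S;t} = j) = u_t P(b,s) \mathbf{1}',$$
where the $mn \times mn$ diagonal matrix $P(b,s)$ and the column vector $\mathbf{1}'$ are defined by
$$P(b,s) \equiv \begin{pmatrix} p_{B;1}(b)\,p_{S;1}(s) & & 0 \\ & \ddots & \\ 0 & & p_{B;m}(b)\,p_{S;n}(s) \end{pmatrix}
\quad \text{and} \quad \mathbf{1}' \equiv \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}.$$
The HMM also yields a probability distribution of states for each day, conditional on the
history of observed trades:
$$\Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), \quad \text{for } t = 1, 2, \dots, T. \tag{14}$$
The parameters of the model include the initial distribution of states $u_1$, the transition matrix
$\Gamma$ and the order arrival rates $\lambda_{B;i}$ and $\lambda_{S;j}$ ($i = 1, 2, \dots, m$; $j = 1, 2, \dots, n$). They can be estimated by
maximizing the following likelihood function as shown by Yin and Zhao (2014):11
$$L_T(\Theta) = u_1 P(b_1, s_1)\, \Gamma P(b_2, s_2)\, \Gamma P(b_3, s_3) \cdots \Gamma P(b_T, s_T)\, \mathbf{1}'.$$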
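A likelihood of this product form is usually evaluated with a scaled forward recursion, which avoids numerical underflow when the sample is long. The sketch below is a generic illustration of that recursion for a Poisson HMM with independent buy and sell emissions. It is not the estimation code of Yin and Zhao (2014): the function names, the row-major ordering of joint states, and every number in the example are assumptions.

```python
import numpy as np
from math import lgamma

def poisson_pmf(k, rate):
    """Poisson probability of k events given the arrival rate, computed via logs."""
    return np.exp(k * np.log(rate) - rate - lgamma(k + 1))

def hmm_log_likelihood(buys, sells, u1, gamma, rate_buy, rate_sell):
    """Log of u1 P(x_1) Gamma P(x_2) ... Gamma P(x_T) 1', using a scaled forward pass."""
    m, n = len(rate_buy), len(rate_sell)
    phi = np.asarray(u1, dtype=float)
    log_lik = 0.0
    for t, (b, s) in enumerate(zip(buys, sells)):
        # Diagonal of P(b, s): emission probability of each joint state (i, j), row-major.
        emis = np.array([poisson_pmf(b, rate_buy[i]) * poisson_pmf(s, rate_sell[j])
                         for i in range(m) for j in range(n)])
        if t > 0:
            phi = phi @ gamma       # move the state distribution one day forward
        phi = phi * emis            # weight by the probability of today's trades
        scale = phi.sum()
        log_lik += np.log(scale)    # accumulate the log-likelihood
        phi = phi / scale           # rescale to keep the recursion numerically stable
    return log_lik

# Hypothetical example with two buy states and two sell states (four joint states).
u1 = np.full(4, 0.25)
gamma = np.full((4, 4), 0.25)
ll = hmm_log_likelihood(buys=[48, 131, 55], sells=[52, 104, 47], u1=u1, gamma=gamma,
                        rate_buy=[50.0, 120.0], rate_sell=[50.0, 110.0])
```

For very large daily trade counts the emission probabilities themselves can underflow, in which case the recursion would need to be carried out entirely in log space.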
10 Although buys and sells are independent in a specific state, the observed daily numbers of buys and sells are contemporaneously and serially correlated because of correlation between states.
11 The details of parameter estimation of the HMM based on Expectation and Maximization Algorithm (see Baum, Petrie, Soules, and Weiss (1970)) are given in Appendix B.1.
The numbers of buy and sell states, m and n, are determined in model selection according to
information criterion, such as Akaike information criterion (AIC) or Bayesian information
criterion (BIC).
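Model selection by information criteria can be sketched as follows. The AIC and BIC formulas are standard, but the parameter count is an assumption made for this illustration (an unrestricted transition matrix over all joint states, a free initial distribution, and one arrival rate per buy state and per sell state), and the log-likelihood values in the example are hypothetical rather than estimates from the paper.

```python
import math

def count_parameters(m, n):
    """Free parameters of an HMM with m buy states and n sell states (assumed layout)."""
    k = m * n                      # number of joint hidden states
    transition = k * (k - 1)       # each row of the transition matrix sums to one
    initial = k - 1                # the initial distribution sums to one
    rates = m + n                  # one Poisson arrival rate per buy state and sell state
    return transition + initial + rates

def aic(log_lik, n_params):
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    return -2.0 * log_lik + n_params * math.log(n_obs)

# Hypothetical maximized log-likelihoods for two candidate models over 504 trading days.
candidates = {(2, 2): -5200.0, (3, 3): -5100.0}
n_obs = 504
scores = {mn: bic(ll, count_parameters(*mn), n_obs) for mn, ll in candidates.items()}
best = min(scores, key=scores.get)
```

In this made-up example the larger model fits better but is penalized for its many additional transition probabilities, so the smaller model attains the lower BIC.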
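After obtaining $\lambda_{B;i}$, $\lambda_{S;j}$ and $\Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)})$ in the process of
estimating the HMM, Yin and Zhao (2014) further develop a two-step approach to decompose
the order arrival rates, $\lambda_{B;i}$ and $\lambda_{S;j}$, into three components:
$$\lambda_{B;i} = \varepsilon_{B;i,j} + \mu_{B;i,j} + \delta_{B;i,j}, \qquad \lambda_{S;j} = \varepsilon_{S;i,j} + \mu_{S;i,j} + \delta_{S;i,j}. \tag{15}$$
The first step applies k-means clustering together with the jump method (see Sugar and James
(2003)) to all observed trade imbalances $\{|B_t - S_t| : t = 1, 2, \dots, T\}$ in order to identify the arrival
rates of trades due to private information for each hidden state. They argue that the states
belonging to the cluster with the smallest mean of trade imbalances do not contain trades with
private information. The remaining states contain private information revealed by their substantial
expected trade imbalances. After partitioning the states in this way, the estimates of arrival rates
of trades due to private information, $\mu_{B;i,j}$ and $\mu_{S;i,j}$, can be easily obtained as specified in
Appendix B.2. The second step conducts a 2-means clustering analysis on the observations of
balanced trades over $t = 1, 2, \dots, T$ to separate states with disputable public information
from states without disputable public information. The estimation of arrival rates of trades due
to disputable public information, $\delta_{B;i,j}$ and $\delta_{S;i,j}$, is detailed in Appendix B.2. Therefore, we can
obtain the estimates of the arrival rates of different types of trades at trading day $t$ in the
framework of the HMM approach,
$$\begin{aligned}
\varepsilon_{B;t} &= \sum_{i,j} \varepsilon_{B;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), &
\varepsilon_{S;t} &= \sum_{i,j} \varepsilon_{S;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), \\
\mu_{B;t} &= \sum_{i,j} \mu_{B;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), &
\mu_{S;t} &= \sum_{i,j} \mu_{S;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), \\
\delta_{B;t} &= \sum_{i,j} \delta_{B;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}), &
\delta_{S;t} &= \sum_{i,j} \delta_{S;i,j} \Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)}),
\end{aligned} \tag{16}$$
where the conditional probability of the hidden state, $\Pr(C_{B;t} = i, C_{S;t} = j \mid X^{(T)})$, is available after
the estimation of the HMM as detailed in Appendix B.1.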
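Once state-level arrival rates and daily state probabilities are available, each daily measure is a probability-weighted average over hidden states. The sketch below shows that final aggregation step only. It takes the state-level rates and the conditional state probabilities as given inputs, and the function name, array layout and numbers are hypothetical.

```python
import numpy as np

def daily_expected_rates(state_probs, state_rates):
    """Probability-weighted arrival rate for each day.

    state_probs: array of shape (T, K), each row a distribution over K joint states.
    state_rates: array of shape (K,), the component arrival rate in each joint state.
    """
    state_probs = np.asarray(state_probs, dtype=float)
    state_rates = np.asarray(state_rates, dtype=float)
    if not np.allclose(state_probs.sum(axis=1), 1.0):
        raise ValueError("each day's state probabilities must sum to one")
    return state_probs @ state_rates

# Two days and four joint states (hypothetical numbers).
probs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # day 1: almost surely in states without private information
    [0.2, 0.1, 0.6, 0.1],   # day 2: most weight on a state with informed buying
])
informed_buy_rates = np.array([0.0, 0.0, 80.0, 40.0])  # informed buys per joint state
mu_buy = daily_expected_rates(probs, informed_buy_rates)
```

The same function can be applied to each of the six state-level components in turn to obtain the full set of daily arrival rates.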
III. Data and Sample Description
We use two samples of stocks for our empirical analysis. The first dataset is a sample of
120 stocks that were traded on the New York Stock Exchange (NYSE) in 2010 and 2011. It
consists of 40 stocks randomly selected from each of the S&P 500 Index, the S&P MidCap 400 Index, and the S&P
SmallCap 600 Index. The ticker symbols of these sample stocks are detailed in
Panel A of Table I. This dataset has been used by Yin and Zhao (2014), who demonstrate that
the HMM approach can effectively measure information-based trading for all the sample stocks
and performs better than prevailing approaches. In particular, both positive contemporaneous
correlation between buys and sells and serial dependence of order flows are captured with high
accuracy. Because the sample firms are selected from three indexes, they are representative of
a variety of industries and market capitalizations. This sample includes only NYSE stocks to
avoid possible variation caused by differences in trading protocols.
INSERT TABLE I HERE
The second sample consists of all constituent stocks of S&P 500 Index in 2010 and 2011.
We exclude the stocks added to or removed from the index over the sample period, and the final
sample contains 451 stocks. S&P 500 stocks are arguably the most actively traded stocks,
covering 75% of U.S. equities in terms of market capitalization. This sample presents
a more comprehensive picture of the market, particularly for large stocks. There are 40 stocks
appearing in both samples and serving as the bridge between them.
Transaction data of all sample stocks are taken from the Thomson Reuters Tick History
(TRTH) transaction database over a two-year period from January 1, 2010 through December 31,
2011. For each sample stock, transactions and quotes that occur before and at the open are
excluded, as well as those at and after the close. Quotes with zero bid or ask prices, quotes for
which the bid-ask spread is greater than 50% of the price, and transactions with zero prices are
also excluded to eliminate possible data errors. Data for November 26, 2010 and November 25,
2011 are removed because of the early day-after-Thanksgiving close. The Lee-Ready (1991)
algorithm is applied to the TRTH transaction data to determine the daily numbers of buys and
sells.
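The data filters and the trade classification described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names and the (price, bid, ask) record layout are hypothetical, the 50% spread cutoff is taken relative to the quote midpoint, each trade is compared with the quote assumed to prevail at the time of the trade (no quote lag is applied), and a trade paired with an invalid quote simply falls back to the tick test.

```python
def valid_quote(bid, ask):
    """Return True if a quote passes the filters described in the text."""
    if bid <= 0 or ask <= 0:
        return False  # zero bid or ask price
    mid = (bid + ask) / 2.0
    # Assumption: the 50% spread cutoff is measured relative to the midpoint.
    return (ask - bid) <= 0.5 * mid


def lee_ready_signs(trades):
    """Sign each trade: +1 buy, -1 sell, 0 if it cannot be classified.

    `trades` is a time-ordered list of (price, bid, ask) tuples, where bid/ask
    is the quote assumed to prevail at the time of the trade.
    """
    signs = []
    last_price = None
    last_tick = 0  # direction of the most recent non-zero price change
    for price, bid, ask in trades:
        if price <= 0:
            continue  # transactions with zero prices are dropped
        if last_price is not None and price != last_price:
            last_tick = 1 if price > last_price else -1
        sign = last_tick  # tick test, used when the quote rule is silent
        if valid_quote(bid, ask):
            mid = (bid + ask) / 2.0
            if price > mid:
                sign = 1   # quote rule: above the midpoint is a buy
            elif price < mid:
                sign = -1  # quote rule: below the midpoint is a sell
        signs.append(sign)
        last_price = price
    return signs


def daily_buys_and_sells(trades):
    """Count classified buys and sells for one stock-day."""
    signs = lee_ready_signs(trades)
    return signs.count(1), signs.count(-1)
```

Applied to one stock-day of trades, `daily_buys_and_sells` returns the pair of daily counts that would feed the estimation of arrival rates; trades at the midpoint with no earlier price change remain unclassified in this sketch.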
For each stock, we measure firm size by the average daily market capitalization over the
sample period. Panel B of Table I summarizes the characteristics of the first sample
of 120 stocks and its three size-based groups. It indicates a positive relation between firm size
and both the average daily total number of trades and the average daily absolute trade imbalance.
Consistent with prior studies, the subsample of S&P SmallCap 600 constituents has the largest
sample mean of average percentage effective bid-ask spread and the smallest sample mean of
average daily turnover (measured by the ratio of the number of shares traded to the number of
shares outstanding). The summary statistics of the characteristics of the second sample of 451
stocks constituting S&P 500 index are provided in Panel C of Table I.
For each sample stock i, we apply the HMM approach to estimate the arrival rates of
different types of trades on each trading day as specified by (16) in the previous section. We
also estimate the scaled daily measures of information-based trading, i.e., $PIN_{i,t}$, $PSOS_{i,t}$,
$PNbIN_{i,t}$, and $PNbSOS_{i,t}$, using (10) and (11).
We consider two types of return series here, i.e., open-to-close daily returns and close-to-
close daily returns. The close-to-close daily returns are obtained from the Center for Research in
Security Prices (CRSP), while the open-to-close daily returns are calculated from the opening
and closing prices taken from the Thomson Reuters Tick History (TRTH) transaction database.
Total risk of returns is defined as return variance. As a common indicator of return variance,
realized variance provides a relatively accurate measure (Andersen and Bollerslev (1998)) and
reflects time variations in total risk. For a particular trading day, the realized variance is
calculated as the sum of squared intraday returns. As the sampling frequency of intraday returns
approaches infinity, realized variance is free from measurement errors (Andersen, Bollerslev,
Diebold, and Labys (2001)). We choose sampling frequencies of 15 and 10 minutes for
intraday returns to balance the desire for reduced measurement error with the need to avoid the
microstructure biases that arise at the highest frequencies.
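The realized variance calculation described above can be illustrated with a short sketch. It assumes log returns computed from an evenly spaced intraday price series; the text does not state whether simple or log returns are used, and the function names are hypothetical.

```python
import math


def sample_prices(prices, step):
    """Keep every `step`-th price, e.g. step=15 turns 1-minute prices into 15-minute prices."""
    return prices[::step]


def realized_variance(prices):
    """Sum of squared intraday log returns for one trading day."""
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return sum(r * r for r in returns)
```

Calling `realized_variance(sample_prices(prices, 15))` and `realized_variance(sample_prices(prices, 10))` on a 1-minute price series would give the two versions of the daily measure used in the regressions.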
IV. The Effects of Information-based Trading on Return and Total Risk
In this section, we empirically examine the effects of information-based trading on stock
returns and their total risks.
A. Descriptive Statistics and Nonparametric Tests
To preserve comparability in the cross-section, Table II presents descriptive
statistics only for the scaled daily measures of information-based trading, i.e., PIN, PSOS, PNbIN, and
PNbSOS, because their non-scaled counterparts are hardly comparable across
stocks. The reported PIN, PSOS, PNbIN, and PNbSOS are the averages of their underlying daily
measures over the sample period. As Panel A shows, the cross-sectional sample means of PIN
and PSOS are 0.136 and 0.358, respectively, for the first sample of 120 stocks, reflecting the
existence of substantial information-based trading in the market. On the other hand, the cross-
sectional means of PNbIN and PNbSOS are close to zero. While each daily PNbSOS is likely to
be very small and close to zero, leading to a close-to-zero mean, a small PNbIN indicates that
privately informed buys and sells offset each other over a long period when a sell is treated as a
negative buy.
INSERT TABLE II HERE
In order to examine the daily return’s relation with information-based trading, we
separately calculate the averages of the scaled daily measures for days with a positive return and for
days with a negative return.12 The cross-sectional sample means are given in
Panel A of Table II as well. Consistent with the prediction of our theoretical
model, the cross-sectional mean of PNbIN on trading days with a positive return is 0.041, while
that on trading days with a negative return is -0.051. This demonstrates that positive (negative) daily
returns are associated with net privately informed buys (sells). In contrast, the difference
between the means of PNbSOS on positive and negative return days is much smaller and equal to 0.003
(= 0.002 - (-0.001)). In order to further explore the contemporaneous link between return and
information-based trading for the sample stocks, we test the equality of the average $PIN_{i,t}$ ($PSOS_{i,t}$,
$PNbIN_{i,t}$, or $PNbSOS_{i,t}$) on days with a positive return and that on days with a negative return.
Using the 5% significance level,13 we find only 13.33% of stocks whose PINs are significantly
12 We use close-to-close daily returns here. In the regression analysis in the next subsection, both close-to-close and open-to-close returns are used.
13 Throughout the paper, statistical significance is assessed at the 5% level unless otherwise specified.
different on positive and negative return days, and the corresponding figure for PSOS is 17.75%.
These figures indicate that PIN and PSOS are not good proxies for the determinants driving daily
returns. Although the PIN measure may be priced in long-term asset pricing tests as a risk factor, it
does not distinguish privately informed buys from privately informed sells and has an ambiguous
effect on contemporaneous daily returns. In particular, the cross-sectional mean of PIN is 0.136
on days with a positive return, which is very close to that on days with a negative return (0.135).
Similarly, the cross-sectional mean of PSOS on days with a positive return is close to that on
days with a negative return (0.350 vs. 0.365). However, for 94.17% and 49.17% of stocks,
respectively, PNbIN and PNbSOS on days with a positive return are significantly
different from those on days with a negative return. This implies that positive daily returns are
significantly driven by contemporaneous net privately informed buys but to a lesser extent
by net buys due to public information. This result is consistent with Alexander and Peterson
(2007), who argue that trades resulting in greater proportional price impacts are more likely, on
average, to have been made by informed traders than by noise or liquidity traders.
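The per-stock comparison of a daily measure on positive-return and negative-return days can be sketched as below. The text does not say which difference-in-means test is used, so the unequal-variance (Welch) t-statistic here is an assumption, as are the function names; with roughly 250 days in each group, comparing the absolute statistic with 1.96 approximates a two-sided test at the 5% level.

```python
import math
from statistics import mean, variance


def split_by_return_sign(measure, returns):
    """Split a daily measure into positive-return and negative-return days."""
    pos = [m for m, r in zip(measure, returns) if r > 0]
    neg = [m for m, r in zip(measure, returns) if r < 0]
    return pos, neg  # zero-return days are left out


def welch_t(x, y):
    """Unequal-variance t-statistic for the difference in means of x and y."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def differs_at_5pct(measure, returns):
    """Large-sample two-sided test at the 5% level (normal critical value 1.96)."""
    pos, neg = split_by_return_sign(measure, returns)
    return abs(welch_t(pos, neg)) > 1.96
```

Running `differs_at_5pct` stock by stock and averaging the outcomes would give a percentage comparable in spirit to the figures reported in Table II.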
For total risk, we sort the trading days of each sample stock into quintiles by realized
variance.14 The cross-sectional mean of PSOS is 0.446 for the trading days within the
largest realized-variance quintile, while it is 0.273 for the trading days within the smallest
quintile. The difference-in-means test shows that PSOS on trading days within the largest quintile is significantly
different from that within the smallest quintile for almost all sample stocks (i.e., 96.67% of 120 stocks), but the
corresponding figure for PIN drops to 66.67%. For the other two measures, i.e., PNbIN and
PNbSOS, the cross-sectional means for the trading days within the largest quintile are close to their counterparts
within the smallest quintile. This implies that excess total risk of returns is mainly determined by the number of
14 Realized variance is calculated based on a time interval of 10 minutes. Both 10- and 15-minute frequencies are used in the regression analysis in the next subsection.
orders rather than order imbalance and it is more profoundly related to disputable public
information than private information.
Panel B of Table II presents descriptive statistics of the four measures for the three size-
based subsamples of the 120 stocks, and the second sample of 451 stocks constituting S&P500
Index, respectively. The cross-sectional means of both PIN and PSOS obtained by averaging
over all trading days decrease with firm size, which implies that information-based trading is
more prevalent in the market for small firms than in that for large firms. The results of Panel A hold
for all three size-based subsamples and the second sample. In particular, for almost all sample
stocks, PNbIN on days with a positive return is significantly different from that on days with a
negative return, while PSOS on days within the largest realized-variance quintile is significantly different
from that on days within the smallest quintile. Monotonicity in firm size can also be seen
for difference-in-means tests of when days are sorted by daily return and tests of
and when days are sorted by . We also note that although the descriptive statistics
and nonparametric test results of the large-size subsample are similar to those of the
S&P500 sample, differences between the two still exist. These differences may reflect the fact that the
S&P500 includes both NYSE and NASDAQ stocks while the large-size subsample
consists of 40 NYSE stocks included in the S&P500 index.
In order to further investigate return and total risk in a simple nonparametric way, we sort
the trading days of each sample stock in another way, i.e., into quintiles according to
the value of one of the daily measures of information-based trading. We then take the averages of
daily return and realized variance of each quintile over time and across stocks. The results are
documented in Table III. Panel A shows that the average daily return is higher for the trading
days with a larger $PNbIN_{i,t}$, consistent with the theoretical prediction of our model.
This phenomenon is strongest for small, less frequently traded stocks, where the average daily
return is 1.1% for days within the smallest $PNbIN_{i,t}$ quintile and 1.2% for trading days within
the largest quintile. Meanwhile, the relationship of $PIN_{i,t}$, $PSOS_{i,t}$, or $PNbSOS_{i,t}$ with the
contemporaneous daily return is ambiguous. Although $PNbSOS_{i,t}$ distinguishes buys from
sells, SOS traders do not possess private information about the value of the stock, so
the profits of their trading activities cannot be assured in general. Panel B shows that the average
daily realized variance is higher for trading days with a larger $PSOS_{i,t}$, which is again
consistent with our theoretical prediction. The other three scaled measures of information-based
trading, i.e., $PIN_{i,t}$, $PNbIN_{i,t}$, and $PNbSOS_{i,t}$, do not have clear and consistent effects on the
contemporaneous total risk of returns.
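The quintile sort used for Table III can be sketched for a single stock as follows. The function name and the rule for cutting the sorted days into five nearly equal groups are assumptions made for illustration; the text does not describe how ties or uneven group sizes are handled.

```python
def quintile_means(sort_key, outcome):
    """Average `outcome` within quintiles of days sorted by `sort_key`.

    Both arguments are lists with one entry per trading day (at least five
    days are required). The first element of the result corresponds to the
    smallest quintile of `sort_key`, the last to the largest.
    """
    order = sorted(range(len(sort_key)), key=lambda day: sort_key[day])
    n = len(order)
    means = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5
        group = [outcome[day] for day in order[lo:hi]]
        means.append(sum(group) / len(group))
    return means
```

For example, passing a stock's daily net-buy measure as `sort_key` and its daily returns as `outcome` gives the five within-quintile average returns for that stock, which could then be averaged across stocks.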
INSERT TABLE III HERE
In summary, both Table II and Table III show that net information-based buys are
associated with positive contemporaneous daily returns and such effect is largely driven by
privately informed trading rather than SOS trading. The conventional measures of PIN and
PSOS do not distinguish buys from sells and thus cannot effectively reveal the short-term
association of returns with information-based trading. Excess total risk of return can be induced
by both buys and sells due to information arrivals, where the effect of trading due to disputable
public information is much stronger than that of privately informed trading. It implies the
differential effects of two-sided shocks and one-sided shocks on total risk.
B. The Effects of Information-Based Trading on Daily Return
To quantify the relation between information-based trading and daily return, we examine
regression model (8), where the dependent variable is measured by either the open-to-close or the
close-to-close daily return on day t. The averages of regression coefficients of (8) and
autocorrelation-corrected t-statistics
are reported in Panel A of Table IV. To see the direction of the effect of each explanatory
variable, we count and report the percentage of sample stocks whose regression coefficient is
significantly positive or negative. While the coefficient of each explanatory variable proxies its
marginal effect, we are also interested in the total effect of the variable. In order to examine the
normalized total effect of each regressor in individual regressions, we consider a measure of
effect size, which is the ratio of the average of the absolute total effects to the average of the absolute
values of the dependent variable. For instance, writing $x_t$ for a regressor, $\hat{\beta}$ for its
estimated coefficient, and $r_t$ for the daily return, the effect size is calculated by
$\sum_t |\hat{\beta} x_t| \big/ \sum_t |r_t|$. Panel A presents the results of the first sample of the 120
stocks, its three size-based subsamples, and the second sample of the 451 stocks constituting
S&P500 Index, respectively, for both open-to-close and close-to-close daily returns. Because the
results are robust to the choice of daily return, we discuss the open-to-close results as an example.
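The effect-size measure defined above reduces to a short computation. The sketch below follows the verbal definition (average absolute total effect divided by average absolute dependent variable); the function and argument names are hypothetical, and since both averages are taken over the same days the common divisor cancels.

```python
def effect_size(coef, regressor, dependent):
    """Ratio of summed absolute total effects to summed absolute dependent values.

    `coef` is the estimated regression coefficient of one explanatory variable,
    `regressor` its daily values, and `dependent` the daily values of the
    dependent variable (e.g. daily returns) over the same days.
    """
    total_effect = sum(abs(coef * x) for x in regressor)
    scale = sum(abs(y) for y in dependent)
    return total_effect / scale
```

For instance, `effect_size(0.5, [2.0, -4.0, 2.0], [1.0, -2.0, 1.0])` evaluates to 1.0, because the absolute fitted contributions sum to 4 and so do the absolute values of the dependent variable.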
First of all, the marginal impacts of different trades are quite different. In particular, the
regression coefficient of the expected number of net buys due to private information
is positive and significant for the majority of sample stocks. In contrast, the coefficients of the
expected numbers of net buys due to disputable public information or liquidity needs
are mostly positive but significant for no more than 32.50% of sample stocks.
In terms of magnitude, the marginal effect of informed trading is also the largest among the three
types of trades, as implied by its largest average regression coefficient. The
explanatory power of SOS trading or liquidity trading for daily return is relatively low. Previous
literature (see for example Chakravarty (2001); Alexander and Peterson (2007)) notes that trades
resulting in greater proportional price impacts are more likely, on average, to have been made by
informed traders than by noise or liquidity traders.15 Our findings are consistent with this claim. This
in turn demonstrates the validity of the two-step approach in identifying the order arrivals of
different types of trades detailed in Appendix B. The difference in total effects among the
three types of trading is even more striking because of the sizes of their expected net buys.
The total effect of expected net buys due to private information ranges from 26.6% to 39.3%. On the
other hand, the total effect of expected net buys due to disputable public information ranges from
9.96% to 13.3% and that of liquidity trading ranges from 7.6%
to 12.9%.
INSERT TABLE IV HERE
Information-based trading could affect the contemporaneous returns of large and
frequently traded stocks differently from those of small and infrequently traded stocks. We compare the
results of the three size-based subsamples of the 120 stocks reported in Panel A to explore this
possibility. The size-stratified results demonstrate that both the marginal effect and the total effect of
privately informed trading on daily return decrease with firm size. This implies that the price
impact of private information depends on the size of a stock’s market capitalization. All other things
being equal, private information moves price less effectively for a stock with larger market
capitalization. The intuition behind this is straightforward. Large stocks usually trade more
frequently, and private information is more easily hidden by the high transaction traffic.
Thus, informed trades are less likely to be followed by other investors in the market.
Moreover, the sheer size of the market capitalization of large stocks means that the resources owned
by informed speculators are relatively small, so that the role they play in these markets is relatively
small. Unlike privately informed trading, neither the marginal effects nor the total effects of
15 Some papers in the literature (see for example Barclay and Warner (1993); Hasbrouck (1995); Chakravarty (2001); and Alexander and Peterson (2007)) document the presence of stealth trading by institutional investors and find that medium-sized trades, more likely to be attributable to informed traders, tend to have a disproportionately greater aggregate price impact.
liquidity trading and public-information-induced trading appear to be monotonic in firm size.
Our model seems to perform better for small firms, as the average $R^2$ of the regressions decreases with
firm size from over 10% to less than 6%.
The coefficient of lagged return is negative on average, and the stocks with a significantly
negative coefficient far outnumber those with a significantly positive coefficient. This implies that
stock returns are more likely to reverse themselves than to continue their trends. The average
effect size of lagged return is smaller than that of each of the three expected net-buy regressors for both
samples and the three subsamples, which demonstrates the dominant role played by trading
activities in explaining return dynamics.
The results of testing regression (12) are documented in Panel B of Table IV in the same
format as Panel A. For more than 90% of the sample stocks, daily return is positively and
significantly associated with the probability of net buys due to private information, PNbIN.
There is no sample stock whose regression coefficient of PNbIN or PNbSOS is significantly
negative for either open-to-close or close-to-close return. The average regression coefficient of
PNbIN always exceeds that of PNbSOS by a substantial margin, implying a larger marginal
effect of private information than of disputable public information on daily return. Regarding total
effect, the average effect size of PNbIN is larger than 28.3% for all samples and subsamples
considered, while that of PNbSOS is less than 8.31%. Although the scaled measures of
information-based trading are adopted in (12) to explain daily return instead of their non-scaled
counterparts, the results of Panel A discussed above hold qualitatively here. However, in seven
out of 10 cases the average adjusted $R^2$ in Panel B is slightly smaller than that in Panel A,
indicating that regression specification (8) conforms marginally better to the prediction of the theoretical
model. But when the scaled measures of information-based trading are adopted as regressors in (12),
we can compare the marginal effects of the regressors in the cross-section by
performing a simple difference-in-means test of the equality of the coefficients of PNbIN and PNbSOS
for each sample and
subsample. As reported in the last row of Panel B, all the hypothesis tests yield a p-value less
than 0.05. This shows that the marginal effect of PNbIN on daily return is significantly larger
than that of PNbSOS at the 5% level.
C. The Effects of Information-Based Trading on Total Risk of Return
For total risk, we estimate regression (9) using the realized variance of intraday returns sampled at the
15- or 10-minute frequency. The results are presented in Panel A of Table V in the same fashion
as Panel A of Table IV. Let us first look at the first sample of 120 stocks and the case where
realized variance is computed from 15-minute returns. The regression coefficient of
the expected number of SOS trades is positive and significant for almost all the stocks (95%), showing
the substantial effect of SOS trading on the total risk of return. Meanwhile, the regression coefficients of
; ; and ; ; are significant and positive for only 16.67% and 8.33% of the sample
stocks, respectively. In terms of magnitude, the average regression coefficient of
; ; is also the largest, while the average coefficient of ; ; is marginally larger than
that of ; ; . These results are consistent with the prior literature that information arrival
may induce excess volatility. Our results, however, further show that the marginal effects of the
two types of information-based trading are different. The strong relation between the expected number
of SOS trades and realized variance reflects the substantial effect of divergent beliefs about public
news on the volatility of stock
price. When a public information event occurs, say an announcement of the profitability outlook of
a firm, investors may disagree about the implication of the event. Those with a positive view
actively buy the stock and push its price up, while those with a negative view (the announcement
may not be as good as expected) actively sell it and push its price down, which makes the stock
return volatile in a dynamic trading process. This evidence that trading motivated by
heterogeneous beliefs amplifies total risk of return is consistent with the findings in the literature.
For instance, Shalen (1993) shows that belief dispersion gives a measure of excess price
variability and is related to price volatility. On the other hand, for the majority of sample
stocks, the marginal effect of the expected number of trades due to private information is
insignificant. This reflects the one-sided impact of private information on stock price. If informed
traders obtain a signal about the stock price, it is either high or low and induces them to either buy or
sell. Thus, the stock price moves in one direction without substantial fluctuations. The regression
coefficient of the lagged realized variance is also significantly positive for the majority of sample
stocks. This demonstrates that the persistence of realized variance cannot be fully explained by
information-based trading.
INSERT TABLE V HERE
Panel A also reports the average effect sizes measuring the total effect of each
explanatory variable. The total effect of the expected number of SOS trades on return
volatility is dominant and larger than that of informed trading or liquidity trading.
About 49.7% of volatility can be explained by the expected number of SOS trades, 29.9% by that of
liquidity trades, and 9.12% by that of privately informed trades. Liquidity trading, as two-sided
trading, contributes more significantly to
realized variance than the one-sided trading of the privately informed. In addition, liquidity trading
exists in the market on all trading days while private information occurs less frequently. Admati
and Pfleiderer (1988) note that informed traders attempt to disguise their trades by placing them
during times of abnormally heavy trading, which helps to explain the relatively small total effect
of private-information trading. Overall, the regression in (9) exhibits pronounced explanatory
power, with an average adjusted $R^2$ of 37.09% for the first sample of 120 stocks.
In order to examine the possible variation in the relation between information-based
trading and total risk, we also report the results of the three size-based subsamples in Panel A.
The earlier results obtained from the whole sample hold for all three subsamples. However, we
notice the increase of average effect size of ; ; as firm becomes larger and the overall
explanatory power of regression (9), measured by the average adjusted $R^2$, also increases with
firm size. We further extend our analysis to the second sample of 451 stocks constituting
S&P500 Index, of which the results are similar to those of the large size group of the first sample.
To ensure that our results are not sensitive to the particular choice of return frequency in
estimating realized volatility, we also measure realized variance using 10-minute intraday returns and
report the results of regression (9) in Panel A of Table V as well. The findings about the
association between information-based trading and total risk of return under the 15-minute
frequency remain unchanged. However, the average adjusted $R^2$ across sample stocks is
higher than that based on the 15-minute intraday returns. The building block
of our theoretical model is (3), which specifies a trade-by-trade price impact. A higher-frequency
measurement of return variance is closer to the theoretical setting and can better capture the
risk driven by intraday trading activities. Thus, regression model (9) has more explanatory
power.
In the regression specification, we exclude the higher order term of ; ;
; ;. When there
is no private information on trading day t, ; ;
; ; is expected to be negligibly small and thus
the omission of its higher order term yields almost no error. If there are substantial trade
imbalances due to private information, the higher order term contributes negatively to the total
risk of return while the coefficient of ; ; is always positive as shown in (5). Since private
information is either positive or negative so that ; ; | ; ; | | ; ; |, the
regression coefficient of ; ; may become insignificant or even significantly negative
when the higher order term of ; ;
; ; is omitted in the regression specification. Panel A of
Table V shows that there exist a small number of sample stocks with regression coefficient of
; ; being negative and significant, which further validates our theoretical prediction in
Proposition 1.
In addition to the expected numbers of trades generated by different trading motives, we
further test the theoretical prediction using the relative measures of expected trades, PIN and
PSOS, by running regression (13) for each sample stock. We document the results in Panel B of
Table V with realized variance measured from 15-minute and 10-minute intraday returns, respectively.
As expected, PSOS is a dominant contributor to the return variance for almost all stocks in the
sample, and its marginal and total effects are more profound than those of PIN for all the
samples and subsamples considered. This demonstrates that more dispersed beliefs are likely to be
associated with higher return risk, while the impact of private-information trading tends to be less
influential. Compared with Panel A, the average adjusted $R^2$ is reduced by a noticeable margin
when these scaled measures of information-based trading are adopted. In the last row of Panel B,
we perform a difference-in-means test for the equality of the marginal effects of PIN and PSOS
in (13) for each sample and subsample. All of them yield a p-value less than 0.01.
This leads us to conclude that the effect of SOS trading on total risk is significantly larger than that of
private-information trading.
V. The Effects of Information-based Trading on Systematic Risk and Idiosyncratic Risk
The variance of daily return in (10) represents total risk, including both
systematic risk and non-systematic (idiosyncratic) risk. Financial analysts are keenly interested in the
composition of total risk because the former is supposed to be priced while the latter is not. In
this section, we further identify the effects of information-based trading on systematic risk and
idiosyncratic risk.
A. The Estimation of Daily Beta and Idiosyncratic Risk
An important concept for evaluating an asset's exposure to systematic risk is market beta
of the asset. We consider an intraday market model to obtain estimates of the beta and
idiosyncratic risk for individual stocks on each trading day. This enables us to analyze the time-
varying association of information-based trading with beta and idiosyncratic risk. In particular,
for each sample stock, we consider the following intraday market model on trading day t
$r_{t,j} = \beta_t\, r_{m,t,j} + \varepsilon_{t,j}$, (17)
where $r_{t,j}$ denotes the j-th intraday return of the stock on trading day t sampled at a certain
frequency, $r_{m,t,j}$ the j-th intraday market return, $\beta_t$ the daily beta measuring the systematic risk of
the asset, and $\varepsilon_{t,j}$ the j-th realized return residual. The corresponding daily idiosyncratic risk
is defined as the standard deviation of the realized residuals $\varepsilon_{t,j}$. Our first sample consists of
120 stocks randomly selected from three size-based indexes, and we use an exchange-traded
fund (ETF) tracking the total market return, as in Todorov and Bollerslev (2010). More specifically,
the Vanguard Total Stock Market ETF with ticker symbol VTI is used, and its intraday prices are
available in the Thomson Reuters Tick History database. Our second sample consists of the constituents of the
S&P 500 index; we therefore adopt the SPDR S&P 500 ETF with ticker symbol SPY to measure the
corresponding market return. Both ETFs are actively traded in the market, and arbitrage
opportunities ensure that their prices do not deviate considerably from the values of their
underlying indexes. Similar to the analysis of total risk, we consider intraday stock and market
returns sampled at 15-minute and 10-minute frequency.
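The daily estimation of beta and idiosyncratic risk from the intraday market model can be illustrated as follows. The sketch assumes a regression without an intercept and uses the sample standard deviation of the residuals; both choices, as well as the function name, are assumptions, since the text does not spell out these details.

```python
import math


def daily_beta_and_idio_risk(stock_returns, market_returns):
    """Estimate one day's beta and idiosyncratic risk from intraday returns.

    Both arguments hold the intraday returns of a single trading day, sampled
    at the same frequency (e.g. 15 or 10 minutes).
    """
    # Least-squares slope of stock returns on market returns (no intercept assumed).
    sxy = sum(r * m for r, m in zip(stock_returns, market_returns))
    sxx = sum(m * m for m in market_returns)
    beta = sxy / sxx
    # Idiosyncratic risk: sample standard deviation of the realized residuals.
    residuals = [r - beta * m for r, m in zip(stock_returns, market_returns)]
    mean_res = sum(residuals) / len(residuals)
    var_res = sum((e - mean_res) ** 2 for e in residuals) / (len(residuals) - 1)
    return beta, math.sqrt(var_res)
```

Repeating this calculation for every trading day of a stock yields the two daily series that serve as dependent variables in the regressions that follow.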
B. The Effects of Expected Amounts of Various Buy and Sell Orders on Market Beta and
Idiosyncratic Risk
To explore the effects of information-based trading, we run the following two regressions
for each sample stock respectively,
; ; ; ; ; ; , (18)
; ; ; ; ; ; . (19)
Table VI displays the averages of regression coefficients and autocorrelation-corrected t-
statistics of individual stock regressions, and the percentage of the sample stocks with regression
coefficient being significantly positive or negative for each explanatory variable. Panel A adopts
intraday returns sampled at the 15-minute frequency to estimate the daily beta and idiosyncratic
risk. A few stylized facts emerge. First, consistent with the results for total risk, the marginal
and total effects of SOS trading on both systematic risk and idiosyncratic risk are much larger
than those of the other two types of trading. The coefficient of the expected number of SOS trades is
not only larger on average but also much more likely to be significantly positive, and it is associated
with a much larger average effect size. Take the S&P500 sample for example. The average effect size of
the expected number of SOS trades is
5.18 times of that of ; ; and 6.57 times of that of ; ; . Second, the marginal effect
of ; ; on systematic and idiosyncratic risks decreases in firm size although the coefficient
of beta regression is not significant on average for medium and large subsamples. Third, all
three types of trades have much greater marginal and total effects on idiosyncratic risk than on
systematic risk for all samples and subsamples. The t-statistics of these three regressors are
small and insignificant for most samples and subsamples in the beta regressions, but they are
significant on average in the regressions of idiosyncratic risk. Fourth, corresponding to the
previous point, the average adjusted $R^2$ of the idiosyncratic risk regressions, which is no smaller
than 40%, is also larger than its counterpart in the beta regressions, which is no larger than 5%.
Fifth, the average adjusted $R^2$ of the idiosyncratic risk regressions increases with firm size, which
means that our model has more explanatory power for the idiosyncratic risk of large stocks.
INSERT TABLE VI HERE
Overall, our results demonstrate that information-based trading is more closely associated
with idiosyncratic risk than with systematic risk. When there are more order flows
induced by investors’ dispersed beliefs about public information, the stock’s idiosyncratic risk
increases significantly. Traders willingly bear idiosyncratic risk when they perceive the asset to be
mispriced. Therefore, dispersed beliefs lead to excessive idiosyncratic risk. For the link between
disputable public information and beta, our finding is related to prior empirical studies, which
document significant evidence of variation in beta typically associated with stock fundamentals.
Patton and Verardo (2012) propose a model where investors can extract public information about the
aggregate economy and find that daily realized beta increases on earnings announcement days
but declines on post-announcement days. Because an earnings announcement can trigger more
SOS trading activity, as the announcement may be interpreted differently by investors (see
Yin and Zhao (2014)), this is likely to explain the link between beta and the expected number of SOS
trades, averaged over a two-year period. This is more relevant to small firms, though.
In Panel B of Table VI, we use 10-minute intraday returns instead to estimate market beta
and idiosyncratic risk through (17). The results shown by Panel A are qualitatively unchanged,
although the overall explanatory power of (18) or (19) increases when the sampling frequency of
intraday return becomes higher. Once again, the better performance of the higher-frequency
analysis is consistent with the setting of our theoretical model, which is based on the trade-by-
trade impact on price.
C. The Effects of Daily PIN and PSOS on Market Beta and Idiosyncratic Risk
Although PIN and PSOS are considered proxies for risk factors in the literature, there
are no studies directly examining their relationships with systematic risk or idiosyncratic risk.
To explore the relations, we run the following two regression models for each sample stock
respectively,
(20)
(21)
Panels C and D of Table VI report the results of individual regressions when 15-minute or 10-
minute intraday returns are used in estimating the intraday market model (17), respectively.
Regarding systematic risk, the overall explanatory power is low, with an average adjusted $R^2$
of less than 7%, and the measures of information-based trading may contribute positively or
negatively to daily beta. With regard to idiosyncratic risk, the average adjusted $R^2$ is more than
37% for all samples and subsamples considered. For all sample stocks, PSOS is positively and
significantly associated with idiosyncratic risk. This implies that trading due to disputable public
information generates substantial idiosyncratic risk. Meanwhile, for no more than 60% of the sample stocks,
PIN is positively and significantly associated with idiosyncratic risk. Both the marginal and total effects of
PSOS on systematic risk or non-systematic risk are much larger than those of PIN. This is
consistent with the dominating role played by the expected number of SOS trades, as we have seen
previously in Panels A
and B. Moreover, PIN contributes more significantly to non-systematic risk than to systematic
risk, as evidenced by the larger percentage of sample stocks whose regression coefficient of PIN
is significant in explaining idiosyncratic risk than in explaining beta. This is consistent with the theoretical
prediction of Hughes, Liu and Liu (2007) that private signals at the firm level are generally
understood to be far more informative of idiosyncratic shocks than of systematic factors. Our study
empirically examines the dynamic relationship between systematic or non-systematic risk and
information risk and obtains qualitatively similar results. This implies that information risk may be
subsumed by existing risk factors, especially idiosyncratic risk.
In the last row of Panels C and D, we perform a difference-in-means test of the
equality of the coefficients of PIN and PSOS for each sample and subsample considered. For the
null hypothesis of equal coefficients in (20), which concerns the marginal effects on daily
systematic risk, the p-values are all larger than 0.1, suggesting that the equality of the
marginal effects of PIN and PSOS on daily beta cannot be rejected at the 10% level. As shown
by the individual stock regressions in Panels C and D, the marginal effects of both PIN and
PSOS on daily beta are relatively weak and close to zero. Thus, they are not significantly
different. For the null hypothesis of equal coefficients in (21), the p-values are all
smaller than 0.001, confirming significantly different effects of PIN and PSOS on daily
idiosyncratic risk.
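A difference-in-means test on two cross-sections of per-stock regression coefficients can be sketched as follows. This is only an illustration: the paper does not state which variant of the test is used, so the Welch-type statistic, the normal approximation for the p-value, the function name `diff_in_means_test`, and the sample numbers are all assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_in_means_test(coef_a, coef_b):
    """Welch-type test of H0: mean(coef_a) == mean(coef_b).

    Returns the difference in means, the test statistic, and a two-sided
    p-value based on a normal approximation (an assumption that is only
    reasonable for large cross-sections).
    """
    n_a, n_b = len(coef_a), len(coef_b)
    diff = mean(coef_a) - mean(coef_b)
    # Standard error with unequal variances in the two groups.
    se = sqrt(variance(coef_a) / n_a + variance(coef_b) / n_b)
    z = diff / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# Made-up per-stock coefficients, for illustration only.
b_pin = [0.02, -0.01, 0.03, 0.00, 0.01, -0.02]
b_psos = [0.40, 0.35, 0.50, 0.45, 0.38, 0.42]
diff, z, p = diff_in_means_test(b_pin, b_psos)
```

With only a handful of stocks a t distribution would be the more careful choice; the normal approximation is used here to keep the sketch within the standard library.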
VI. Further Tests and Robustness Checking
To examine the validity of the theoretical model’s predictions and the robustness of our
empirical findings, we have performed various further tests. This section briefly reports the
major tests and their conclusions.
A. The Significance of the Effects of Expected Amounts of Trades on Daily Return
Although whether PIN or PSOS is priced as a risk factor in long-term cross-sectional
analysis is heavily debated (see, for example, Duarte and Young (2009); Easley, Hvidkjaer and
O’Hara (2010)), no study examines how daily measures of PIN and PSOS explain the
contemporaneous return. According to Proposition 1, stock returns are determined by the expected
numbers of net buy orders (order imbalances) arising from different trading motives, while return
volatility is determined by the expected numbers of trades from different investors. These
predictions imply that the expected numbers of trades generated by private information signals
and disputable public information should have only weak and negligible impacts on daily return.
To examine this implication, we replace the expected numbers of net buys due to each trading
motive in (8) with the corresponding expected numbers of trades, replace the scaled net-buy
measures PNbIN and PNbSOS in (12) with PIN and PSOS, and rerun the regressions for each sample
stock.
We find that the average adjusted R² falls below 1% for all the samples and
subsamples considered, and the coefficients of PIN and PSOS in all individual regressions are
insignificant. This shows the importance of distinguishing information-based buys from
information-based sells in analyzing their associations with contemporaneous returns. It is
the net-buy measures PNbIN and PNbSOS, not PIN and PSOS, that determine daily return. These
findings, together with the results reported in Table IV, demonstrate not only the validity of
the theoretical predictions but also the effectiveness of the dynamic measures of
information-based trading derived from the HMM approach.
B. Measures of Stock Return
In the empirical analysis, we adopt the rate of simple return, while the theoretical model is
developed based on total return for tractability. To ensure that our findings are not sensitive
to the particular choice of return measure, we replicate the analysis of Table IV for the first
sample of 120 stocks using both total returns and log returns as alternatives to simple returns.
The results of Table IV hold qualitatively.
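The three return measures can be compared with a short sketch. The function names and prices are hypothetical, and reading "total return" as the price change in dollars is an assumption, since this section does not define it.

```python
from math import log


def simple_return(p_prev, p_now):
    """Rate of simple return between two prices."""
    return p_now / p_prev - 1.0


def log_return(p_prev, p_now):
    """Continuously compounded (log) return between two prices."""
    return log(p_now / p_prev)


def total_return(p_prev, p_now):
    """Price change in currency units (assumed meaning of 'total return')."""
    return p_now - p_prev


# Hypothetical prices, for illustration only.
p_prev, p_now = 100.0, 102.0
r_simple = simple_return(p_prev, p_now)
r_log = log_return(p_prev, p_now)
r_total = total_return(p_prev, p_now)
```

For small daily moves the simple and log returns are close (the log return equals log(1 + simple return)), which fits the finding that the results do not depend on the return measure.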
C. Higher Order Terms
The theoretical model specifies the effects of higher-order terms of information-based
trading on the variance of stock return. The regression results reported thus far are from
linear regressions, which ignore the second-order terms of information-based trading. We have
also run regressions of total risk that include these second-order terms. The modified models
perform better in the sense that the coefficients of the linear regressors are more significant
with the correct sign and/or more stocks have significant coefficients of the linear regressors
with the correct sign. The average adjusted R² also improves marginally.
VII. Concluding Remarks
This paper aims to gain a better understanding of the relation between trading activities
and the returns and risks of individual assets. We develop a theoretical model that sheds
light on the distinct effects of trading activities on stock return and risk in a dynamic
fashion. The expected net buys arising from different trading motives are expected to be
positively related to daily return, while the corresponding expected numbers of buy and sell
orders are related to return risk. Motivated by the theoretical predictions of the model, we run
daily time-series regressions for individual stocks. Using a sample of 120 NYSE stocks and
another sample of 451 stocks constituting the S&P 500 Index, we show that expected net buys due
to private information contribute significantly to daily returns, with marginal and total effects
larger than those of net buys due to disputable public information. On the other hand, the
expected number of orders due to disputable public information plays a dominant role in
determining return risk.
We further examine the effects of daily information-based trading on systematic and
idiosyncratic risk. We find a strong relation to idiosyncratic risk, and the overall explanatory
power of the regression models of idiosyncratic risk is substantial. However, the association of
information-based trading with systematic risk is quite weak. For all sample stocks, the effects
of SOS trading on both types of risk dominate those of private information trading. This suggests
that, on the one hand, there is some risk factor related to information-based trading which is
not captured by market beta but may be subsumed by idiosyncratic risk. On the other hand, it is
SOS trading rather than private information trading that determines this risk factor. These
claims are in line with Duarte and Young’s (2009) argument that PSOS, but not PIN, is priced as a
risk factor. However, we leave the investigation of why the effects on systematic risk are
insignificant to future research.
Appendix A: Proof of Proposition 1
Recalling Ε ,; ;
; ;, Var , 1 ; ;
; ;, and Ε Var ;
; , we have
E , E E , Ε Ε , Ε ; ;
; ;
; ; .
Var , Ε Var , Var E ,
Ε Var , Var E ,
Ε Var , Var E , Ε ; ; .
Therefore, after writing ∑ , as we have
E Ε , , ,
; ; E
Var Var , , ,
Var , Var , , Var
2Cov , , , ,
; ; Var , Var 2 Var ,
; ; 2 1 ; ;
; ;Var .
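The steps above appear to rest on three standard identities: the law of total expectation, the law of total variance, and the variance of a sum. They are stated here in generic notation as a reading aid; X, Y and Z_n are placeholders and not the paper's own symbols.

```latex
\begin{align*}
\mathrm{E}[X] &= \mathrm{E}\big[\mathrm{E}[X \mid Y]\big], \\
\mathrm{Var}(X) &= \mathrm{E}\big[\mathrm{Var}(X \mid Y)\big]
                 + \mathrm{Var}\big(\mathrm{E}[X \mid Y]\big), \\
\mathrm{Var}\Big(\sum_{n} Z_n\Big) &= \sum_{n} \mathrm{Var}(Z_n)
                 + 2 \sum_{n < m} \mathrm{Cov}(Z_n, Z_m).
\end{align*}
```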
Appendix B: Estimation of the HMM and ; , ; , ; and ;
B.1. Estimation of the HMM by the Expectation and Maximization Algorithm
We apply the Baum-Welch algorithm (see Baum, Petrie, Soules, and Weiss (1970)) in the
estimation of the HMM. In particular, we regard the hidden states as missing data while the
Complete-Data Log-Likelihood (CDLL) is the log-likelihood of the parameter set based on
observed time series of buy and sell order flows and the unobservable time series of states, i.e.,
log Pr , | , where is a time series realization of state variable Ht
with t ranging from 1 to T. For forward probabilities η whose 1 -th element is
defined as η , ; Pr , ; , ; , we have η and η η Γ
for 2, 3, … , . To apply the Expectation and Maximization (EM) algorithm, we also define
ζ Γ ζ as the vector of backward probabilities for 1, 2, … , 1 with ζ ′,
where the 1 -th element of is
ζ , ; Pr , , … , | ; , ; .
Further, let , ; and , ; , ; be zero-one variables that
, ; 1ifandonlyif ; , ; ,
, ; , ; 1ifandonlyif ; , ; , ; , ; .
With this notation, the CDLL of the HMM is given by
log Pr ,
, ; log , ; , ; , ; log , ; , , ; log ,
,,
.
We use the EM algorithm to estimate the HMM as follows:
E Step: Compute the conditional expectations of the missing data, given the observations
and the current estimate of . Specifically, conditional expectations of , ; and
, ; , ; are estimated by:
, ; Pr ; , ; ,η , ; ζ . ;
|,
, ; , ; Pr ; , ; , ; , ; | ,η , ; , ; , , ζ , ;
|.
M Step: Maximize the CDLL, where the missing data are replaced by their conditional
expectations, to determine the estimate of . Thus, replace all , ; and , ; , ; in CDLL
by their conditional means , ; and , ; , ; , and maximize it with respect to , Γ, and
λ ; and λ ; . The solution to the maximization problem consists of
, ; , ; , , ; ,∑ , ; , ;
∑ ∑ ∑ , ; , ;,
,
∑ ∑ , ;∑ ∑ , ;
and ,∑ ∑ , ;∑ ∑ , ;
.
The above E and M steps are repeated until a convergence criterion is satisfied, for
instance until the improvement in the CDLL is less than 10^{-6}. This EM algorithm
provides us with three sets of parameter estimates: , Γ, and λ ; and λ ; . Once and Γ are
estimated, we have Γ .
Applying Bayes’ rule, the posterior distribution of states in (14) can be calculated by
Pr ; , ;, ; , ; η , ; ζ , ;
| , ; .
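The E and M steps can be illustrated with a compact sketch for an HMM in which daily buys and sells are independent Poisson counts given the hidden state. This is not the authors' implementation. The function and variable names are hypothetical, the hidden states are indexed by a single integer rather than a pair, the recursions are rescaled at each date to avoid numerical underflow, and convergence is checked on the observed-data log-likelihood rather than on the CDLL.

```python
from math import exp, lgamma, log


def poisson_pmf(k, lam):
    """Poisson probability of k arrivals given a positive rate lam."""
    return exp(k * log(lam) - lam - lgamma(k + 1))


def em_step(buys, sells, delta, gamma, lam_b, lam_s):
    """One EM iteration. Returns updated parameters and the log-likelihood
    evaluated at the parameters that were passed in."""
    m, T = len(delta), len(buys)
    # State-dependent probabilities of each day's (buys, sells) observation.
    emis = [[poisson_pmf(buys[t], lam_b[j]) * poisson_pmf(sells[t], lam_s[j])
             for j in range(m)] for t in range(T)]
    # Scaled forward probabilities.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            a = [delta[j] * emis[0][j] for j in range(m)]
        else:
            a = [sum(alpha[-1][i] * gamma[i][j] for i in range(m)) * emis[t][j]
                 for j in range(m)]
        c = sum(a)
        alpha.append([x / c for x in a])
        scale.append(c)
    # Scaled backward probabilities.
    beta = [[1.0] * m for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(gamma[i][j] * emis[t + 1][j] * beta[t + 1][j]
                       for j in range(m)) / scale[t + 1] for i in range(m)]
    # E step: posterior state probabilities u and expected transition counts v.
    u = [[alpha[t][j] * beta[t][j] for j in range(m)] for t in range(T)]
    v = [[sum(alpha[t - 1][i] * gamma[i][j] * emis[t][j] * beta[t][j] / scale[t]
              for t in range(1, T)) for j in range(m)] for i in range(m)]
    # M step: closed-form updates of all parameters.
    new_delta = u[0][:]
    new_gamma = [[v[i][j] / sum(v[i]) for j in range(m)] for i in range(m)]
    w = [sum(u[t][j] for t in range(T)) for j in range(m)]
    new_lam_b = [sum(u[t][j] * buys[t] for t in range(T)) / w[j] for j in range(m)]
    new_lam_s = [sum(u[t][j] * sells[t] for t in range(T)) / w[j] for j in range(m)]
    loglik = sum(log(c) for c in scale)
    return new_delta, new_gamma, new_lam_b, new_lam_s, loglik


def fit(buys, sells, delta, gamma, lam_b, lam_s, tol=1e-6, max_iter=500):
    """Repeat EM steps until the log-likelihood improves by less than tol."""
    history = []
    for _ in range(max_iter):
        delta, gamma, lam_b, lam_s, ll = em_step(buys, sells, delta, gamma, lam_b, lam_s)
        if history and ll - history[-1] < tol:
            history.append(ll)
            break
        history.append(ll)
    return delta, gamma, lam_b, lam_s, history


# Made-up daily order counts with a calm and a busy regime, for illustration only.
buys = [20, 22, 19, 60, 65, 58, 21, 18, 62, 59]
sells = [18, 21, 20, 25, 22, 24, 19, 20, 23, 26]
delta, gamma, lam_b, lam_s, history = fit(
    buys, sells, [0.5, 0.5], [[0.8, 0.2], [0.2, 0.8]], [15.0, 50.0], [20.0, 20.0])
```

The posterior state probabilities `u` computed in the E step play the role of the posterior distribution of states obtained by Bayes' rule above.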
B.2. Estimation of ; , ; , ; and ;
Based on clustering analysis, Yin and Zhao (2014) introduce a two-step approach to
decompose the state-dependent order arrival rates λ ; and λ ; into the three components in
Equation (15).
Step One: Partitioning hidden states depending on whether or not they contain private
information.
K-means clustering is performed on the observed trading imbalances over the whole
estimation window, i.e., | | for 1, 2, … , , and the number of clusters is then determined
by the jump method of Sugar and James (2003). If there is only one cluster,16 we infer that
the observed trading imbalances are similar and there is no significant evidence of
private information during the period. Therefore, we have ; ; 0 for all hidden states.
The rationale behind this claim is the common “wisdom” that trading due to liquidity needs or
disputable public information is symmetric and generates only a small trade imbalance, while
privately informed trading is often associated with a substantial trade imbalance. If there is
privately informed trading, the daily trade imbalances cannot be consistent over time.
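A minimal sketch of choosing the number of clusters for one-dimensional trade imbalances with the jump method is given below. It is illustrative only. The function names, the quantile-based starting centers, the cap `max_k`, and the sample data are assumptions. The transformation power is set to p/2 = 1/2 for dimension p = 1, a common choice for the jump method.

```python
def kmeans_1d(xs, k, iters=100):
    """Lloyd's algorithm in one dimension with quantile-based starting centers.

    Returns the cluster centers and the distortion, i.e. the average squared
    distance between each observation and its closest center.
    """
    xs = sorted(xs)
    n = len(xs)
    centers = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            j = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[j].append(x)
        # Keep the old center if a cluster happens to be empty.
        new = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    distortion = sum(min((x - c) ** 2 for c in centers) for x in xs) / n
    return centers, distortion


def jump_method(xs, max_k=5):
    """Return the number of clusters with the largest jump in transformed distortion."""
    y = 0.5  # transformation power p / 2 with dimension p = 1
    prev, best_k, best_jump = 0.0, 1, float("-inf")
    for k in range(1, max_k + 1):
        _, d = kmeans_1d(xs, k)
        cur = d ** (-y) if d > 0 else float("inf")
        if cur - prev > best_jump:
            best_k, best_jump = k, cur - prev
        prev = cur
    return best_k


# Made-up absolute trade imbalances: calm days near 10, informed days near 100.
imbalances = [8, 9, 10, 11, 12, 9, 10, 11, 95, 98, 100, 102, 105, 99, 101]
n_clusters = jump_method(imbalances)
centers, _ = kmeans_1d(imbalances, n_clusters)
```

In this example the largest jump occurs at two clusters, so the days split into a low-imbalance and a high-imbalance group. With few observations the jumps at larger cluster counts can become erratic, which is why the candidate range is capped.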
If the clustering analysis indicates that there are multiple clusters with different centers of
trade imbalances, we treat λ ; λ ; as an out-of-sample observation and assign it to the
cluster whose center is the closest to it. If λ ; λ ; belongs to the cluster with the smallest
center, state , is identified as the one involving no privately informed trading because the
small trade imbalances can only be caused by liquidity trading and/or SOS trading. We have
; ; 0 and use to denote the set consisting of states in this cluster. If λ ; λ ;
belongs to one of the other clusters, state , contains private information, and we have
; λ ; λ ; λ ; # λ ; # and ; 0 if λ ; λ ; ,
; 0 and ; λ ; λ ; λ ; # λ ; # if λ ; λ ; ,
16 In our simulation and sample data analysis, the case of one cluster has never occurred.
where #, # is a matching state of state , , which is a state in with balanced trade the
closest to the balanced trade of state , .17 The matching state is used to proxy the small trade
imbalance caused by liquidity needs or disputable public information in state , .
Step Two: Classifying hidden states into two sets depending on whether or not they contain
disputable public information.
Clustering analysis together with the jump method is now applied to the observed
balanced trades over the whole estimation window, i.e., | | for 1, 2, … , . If
there is only one cluster,18 it implies that investors have very similar interpretations for public
information in the market and all balanced orders are generated by liquidity traders, because
disagreement on a public information disclosure should induce a considerable increase in both
buy and sell orders. Therefore, we have ; ; 0 for all , .
If more than one cluster is detected, we use 2-means clustering on observed balanced
trades to form two clusters. For state , , we treat the expected number of balanced trades
λ ; λ ; λ ; λ ; as an out-of-sample observation, and assign it to one of the two clusters
with the closer center to it. We use to denote the set consisting of the states in the cluster with
a smaller center and these states are considered not associated with disputable public information.
The remaining states constitute set , which involves trading due to disputable public information
because their balanced trades are much larger than those in set . If state , belongs to set ,
its arrival rates of public-information-driven buys and sells are zero, that is,
17 Mathematically, #, # ∗, ∗ ∈ λ ; λ ; λ ; λ ; λ ; ∗ λ ; ∗ .
18 This can be the case when we use the EHO or EEOW model to generate simulation data although it is hard to
imagine there are no diverse interpretations of all public information disclosed over a considerably long period. In
the application of real data, the trading of all sample stocks implies more than one cluster.
; ; 0. If state , belongs to set , we can determine its expected buys and sells
triggered by public information as follows
; λ ; ; max#, # ∈ ∩
λ ; #
; λ ; ; max#, # ∈ ∩
λ ; # .
where ; and ; are obtained in the first step. The last terms in the above equations proxy ; ,
and ; in (1), respectively. Set includes both liquidity trading and privately informed trading
while set includes both liquidity trading and public-information trading. Their intersection, i.e.,
set ∩ , includes states that involve only liquidity trading. We use the largest arrival rates of
buy and sell orders in ∩ to subtract liquidity order arrival rates from the aggregate buy and
sell order arrival rates to ensure that the arrival rates of buy and sell orders driven by disputable
public information are not exaggerated.
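The subtraction described above can be sketched as follows. All names are hypothetical: `no_private` stands for the set of states without privately informed trading from Step One, `no_sos` for the set of states without SOS trading from Step Two, and `priv_b` and `priv_s` for the private-information arrival rates obtained in Step One. The numbers are made up.

```python
def sos_rates(state, lam_b, lam_s, priv_b, priv_s, no_private, no_sos):
    """Arrival rates of buys and sells driven by disputable public information."""
    if state in no_sos:
        # States in the small-balanced-trade cluster carry no SOS trading.
        return 0.0, 0.0
    # States with neither private information nor SOS trading: liquidity only.
    liquidity_only = no_private & no_sos
    # The largest liquidity-only arrival rates serve as the liquidity proxy.
    liq_b = max(lam_b[s] for s in liquidity_only)
    liq_s = max(lam_s[s] for s in liquidity_only)
    return (lam_b[state] - priv_b[state] - liq_b,
            lam_s[state] - priv_s[state] - liq_s)


# Made-up state-dependent arrival rates for four hidden states.
lam_b = {"a": 20.0, "b": 22.0, "c": 60.0, "d": 90.0}
lam_s = {"a": 19.0, "b": 21.0, "c": 58.0, "d": 55.0}
priv_b = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 30.0}
priv_s = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
no_private = {"a", "b", "c"}
no_sos = {"a", "b"}

sos_c = sos_rates("c", lam_b, lam_s, priv_b, priv_s, no_private, no_sos)
sos_d = sos_rates("d", lam_b, lam_s, priv_b, priv_s, no_private, no_sos)
```

Using the maximum over the liquidity-only states removes as much liquidity trading as possible, which is what keeps the SOS rates from being overstated. The sketch assumes the intersection of the two sets is non-empty; `max` raises an error otherwise.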
REFERENCES
Admati, Anat R., and Paul Pfleiderer, 1988. A theory of intraday patterns: volume and price variability, Review of Financial Studies 1, 3–40.
Alexander, Gordon, and Mark A. Peterson, 2007. An analysis of trade-size clustering and its relation to stealth trading, Journal of Financial Economics 84, 435–471.
Andersen, Torben G., and Tim Bollerslev, 1998. Answering the sceptics: yes standard volatility models do provide accurate forecasts, International Economic Review 39, 885–905.
Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys, 2001. The distribution of realized exchange rate volatility, Journal of the American Statistical Association 96, 42–55.
Banerjee, Snehal, and Ilan Kremer, 2010. Disagreement and learning: dynamic patterns of trade, Journal of Finance 65, 1269–1302.
Barclay, Michael J., and Jerold B. Warner, 1993. Stealth trading and volatility: Which trades move prices? Journal of Financial Economics 34, 281–306.
Baum, Leonard E., Ted Petrie, George Soules, and Norman Weiss, 1970. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, Annals of Mathematical Statistics 41, 164–171.
Brennan, Michael J., Tarun Chordia, Avanidhar Subrahmanyam, and Qing Tong, 2012. Sell-order liquidity and the cross-section of expected stock returns, Journal of Financial Economics 105, 523–541.
Chakravarty, Sugato, 2001. Stealth trading: Which trader’s trades move prices? Journal of Financial Economics 61, 289–307.
Chordia, Tarun, and Avanidhar Subrahmanyam, 2004. Order imbalance and individual stock return: Theory and evidence, Journal of Financial Economics 72, 485–518.
Duarte, Jefferson, and Lance Young, 2009. Why is PIN priced? Journal of Financial Economics 91, 119–138.
Easley, David, Robert F. Engle, Maureen O’Hara, and Liuren Wu, 2008. Time-varying arrival rates of informed and uninformed trades, Journal of Financial Econometrics 6, 171–207.
Easley, David, Soeren Hvidkjaer, and Maureen O’Hara, 2002. Is information risk a determinant of asset returns? Journal of Finance 57, 2185–2221.
Easley, David, Soeren Hvidkjaer, and Maureen O’Hara, 2010. Factoring information into returns, Journal of Financial and Quantitative Analysis 45, 293–309.
Glosten, Lawrence R., and Lawrence E. Harris, 1988. Estimating the components of the bid-ask spread, Journal of Financial Economics 21, 123–142.
Glosten, Lawrence R., and Paul R. Milgrom, 1985. Bid, ask and transaction prices in a specialist market with heterogeneously informed traders, Journal of Financial Economics 14, 71–100.
Grossman, Sanford, 1976. On the efficiency of competitive stock markets where traders have diverse information, Journal of Finance 31, 573–585.
Hasbrouck, Joel, 1995. One security, many markets: determining the contributions to price discovery, Journal of Finance 50, 1175–1199.
Hughes, John S., Jing Liu, and Jun Liu, 2007. Information asymmetry, diversification, and cost of capital, The Accounting Review 82, 705–729.
Kyle, Albert S., 1985. Continuous auctions and insider trading, Econometrica 53, 1315–1335.
Lee, Charles M.C., Belinda Mucklow, and Mark J. Ready, 1993. Spreads, depths, and the impact of earnings information: An intraday analysis, Review of Financial Studies 6, 345–374.
Llorente, Guillermo, Roni Michaely, Gideon Saar, and Jiang Wang, 2002. Dynamic volume-return relation of individual stocks, Review of Financial Studies 15, 1005–1047.
Mohanram, Partha, and Shiva Rajgopal, 2009. Is PIN priced risk? Journal of Accounting and Economics 47, 226–243.
Newey, Whitney K., and Kenneth D. West, 1994. Automatic lag selection in covariance matrix estimation, Review of Economic Studies 61, 631–653.
Patton, Andrew J., and Michela Verardo, 2012. Does beta move with news? Firm-specific information flows and learning about profitability, Review of Financial Studies 25, 2789–2839.
Rees, Lynn, and Wayne Thomas, 2010. The stock price effects of changes in dispersion of investor beliefs during earnings announcements, Review of Accounting Studies 15, 1–31.
Roşu, Ioanid, 2009. A dynamic model of the limit order book, Review of Financial Studies 22, 4601–4641.
Shalen, Catherine T., 1993. Volume, volatility and the dispersion of beliefs, Review of Financial Studies 6, 405–434.
Sugar, Catherine A., and Gareth M. James, 2003. Finding the number of clusters in a dataset, Journal of the American Statistical Association 98, 750–763.
Todorov, Viktor, and Tim Bollerslev, 2010. Jumps and betas: A new framework for disentangling and estimating systematic risks, Journal of Econometrics 157, 220–235.
Yin, Xiangkang, and Jing Zhao, 2014. A hidden Markov model approach to information-based trading: Theory and applications, http://ssrn.com/abstract=2412321.
Zucchini, Walter, and Iain L. MacDonald, 2009. Hidden Markov models for time series: An introduction using R (Chapman & Hall, CRC Press).
Table I Ticker symbols of the 120 sample stocks and summary statistics of the sample characteristics
The first sample includes 120 stocks that traded on the NYSE in 2010 and 2011. They are randomly selected from S&P 500 Index, S&P MidCap 400 Index, and S&P SmallCap 600 Index with 40 stocks from each index. Panel A lists the ticker symbols of the sample stocks. Panel B presents descriptive statistics of stock characteristics for the entire sample of 120 stocks and three size-based groups. For each stock, AvgCap is the average daily market capitalization over the sample period, AvgTurn the average daily turnover (the number of shares traded divided by the number of shares outstanding), AvgNT the average daily total number of trades, AvgAIMB the average daily absolute trade imbalance measured by the difference between numbers of buyer- and seller-initiated orders, and AvgEfspd the average daily percentage effective bid-ask spread. The second sample consists of stocks constituting S&P 500 Index in 2010 and 2011, of which the descriptive statistics of characteristics are reported in Panel C.
Panel A: Ticker symbols of the 120 sample stocks

S&P 500 constituents     S&P MidCap 400 constituents     S&P SmallCap 600 constituents
ADM   CSC   JBL   RAI    AAP   HNI   CVD   SM            ABM   FOR   CCC   ORB
AVP   DE    LUK   ROK    ADS   IEX   DCI   SON           AHS   FUL   CRY   POL
BAX   ECL   MWV   SVU    AJG   JLL   DLX   SXT           AIT   HHS   CUZ   PPS
BA    EL    NEM   SWK    AYI   KEX   DRC   TKR           ALE   IVC   DAR   RT
BIG   EQT   NKE   TSO    BCO   LNT   ESI   TRN           AXE   KRG   DIN   SCL
C     ETR   NOC   UTX    BGC   MAN   FLO   URS           AZZ   KWR   DRH   SNX
CBS   FHN   NOV   VFC    BKH   MLM   FRT   VCI           BDC   LDL   DW    SUP
CEG   HAR   NU    WEC    CBT   NYT   GES   WRB           BGS   LTC   EIG   UBA
CF    HOT   PBI   WMT    CRL   PNM   GGG   WSO           BMI   MED   EXP   UNF
CFN   IVZ   PCG   XEL    CSL   ROL   GHL   XEC           CBR   ONB   FIX   ZEP

Panel B: Characteristics of the 120 sample stocks and its three size-based subsamples

                AvgCap          AvgTurn    AvgNT       AvgAIMB     AvgEfspd
                (in million $)  (in %)                             (in %)
Entire sample
  Mean          10582.85        0.312      2008.61     169.29      0.041
  Median        2558.76         0.284      1185.43     117.16      0.031
  Std. Dev.     34317.1         0.147      2590.98     271.24      0.033
  Minimum       148.43          0.086      101.85      0.0029      0.008
  Maximum       295774.9        0.958      19689.07    2393.64     0.191
Small size group: S&P SmallCap 600 constituents
  Mean          793.60          0.268      548.65      67.79       0.073
  Median        713.54          0.245      468.52      63.83       0.061
  Std. Dev.     422.18          0.122      349.93      36.67       0.039
Medium size group: S&P MidCap 400 constituents
  Mean          2811.65         0.333      1189.93     115.63      0.030
  Median        2603.26         0.339      1101.29     107.20      0.031
  Std. Dev.     1242.59         0.139      487.30      38.62       0.009
Large size group: S&P 500 constituents
  Mean          28143.3         0.335      4287.22     324.46      0.019
  Median        10119.11        0.294      3025.99     223.44      0.016
  Std. Dev.     55813.6         0.17       3453.29     428.29      0.013

Panel C: Characteristics of the 451 sample stocks constituting S&P 500

                AvgCap          AvgTurn    AvgNT       AvgAIMB     AvgEfspd
                (in million $)  (in %)                             (in %)
  Mean          24042           0.347      5048        324.16      0.020
  Median        11174.82        0.31       3690.07     264.69      0.016
  Std. Dev.     40391.58        0.17       4436.59     236.63      0.027
Table II Descriptive statistics and nonparametric tests of measures of information-based trading
This table presents descriptive statistics and nonparametric tests of the four daily scaled measures of information-based trading, i.e., PIN, PSOS, PNbIN, and PNbSOS. Descriptive statistics focus on the cross-sectional distributions of average daily measures over selected days, where time-series average is denoted by a bar over a variable. The nonparametric tests examine whether these daily measures on certain days statistically differ from those on other days. The percentage of stocks with a p-value less than 5% in a difference-in-means test is reported. For each sample stock, trading days are sorted into two groups by the sign of its daily return or into quintiles by its realized variance . Panel A is of the first sample of 120 stocks while Panel B is of the three size-based subsamples of the 120 stocks and the second sample of 451 stocks constituting S&P500 Index.
PanelA:Theentiresampleof120stocks Mean 25% Median 75% Mean 25% Median 75%
Alltradingdays 0.135 0.100 0.131 0.155 0.358 0.300 0.337 0.391Dayswith 0 0.136 0.100 0.129 0.153 0.350 0.292 0.321 0.390Dayswith 0 0.135 0.096 0.131 0.157 0.366 0.307 0.347 0.401Dayswith ∈ 0.115 0.081 0.112 0.144 0.446 0.394 0.427 0.489Dayswith ∈ 0.151 0.113 0.143 0.172 0.273 0.199 0.258 0.321 Mean 25% Median 75% Mean 25% Median 75%
Alltradingdays ‐0.004 ‐0.012 0.001 0.009 0.001 ‐0.012 0.001 0.019Dayswith 0 0.041 0.022 0.040 0.061 0.002 ‐0.011 0.002 0.021Dayswith 0 ‐0.051 ‐0.069 ‐0.044 ‐0.022 ‐0.001 ‐0.013 0.000 0.018Dayswith ∈ ‐0.009 ‐0.022 ‐0.007 0.004 0.001 ‐0.011 0.002 0.017Dayswith ∈ 0.006 ‐0.013 0.011 0.032 0.001 ‐0.013 0.001 0.022Hypothesistest , , , ,
PercentageofstockswithH0thatthemeasuresondayswith0and 0areequalrejected 13.33% 17.50% 94.17% 49.17%
PercentageofstockswithH0thatthemeasuresondayswithinthesmallestandlargestquintilesof areequal rejected. 66.67% 96.67% 9.17% 18.33%
PanelB:Thethreesize‐basedsubsamplesofthe120samplestocksandthesampleofS&P500constituents Cross‐sectionalmeanof Cross‐sectionalmeanof Smallsize Mediumsize Largesize S& 500 Smallsize Medium
size Largesize S& 500
Alltradingdays 0.170 0.131 0.105 0.092 0.384 0.355 0.335 0.334Dayswith 0 0.168 0.131 0.109 0.094 0.380 0.346 0.324 0.324Dayswith 0 0.173 0.131 0.101 0.091 0.389 0.365 0.345 0.345Dayswith ∈ 0.149 0.114 0.082 0.075 0.457 0.441 0.439 0.444Dayswith ∈ 0.187 0.145 0.122 0.107 0.313 0.277 0.228 0.220 Cross‐sectionalmeanof Cross‐sectionalmeanof Smallsize Mediumsize Largesize S& 500 Smallsize Medium
size Largesize S& 500
Alltradingdays ‐0.009 0.000 ‐0.004 ‐0.003 0.003 0.000 0.002 0.000Dayswith 0 0.057 0.045 0.020 0.024 0.004 0.001 0.002 0.001Dayswith 0 ‐0.077 ‐0.048 ‐0.029 ‐0.030 0.003 ‐0.001 ‐0.001 0.000Dayswith ∈ ‐0.009 ‐0.010 ‐0.010 ‐0.011 0.003 0.000 0.001 0.000Dayswith ∈ ‐0.001 0.012 0.008 0.008 0.003 ‐0.001 0.001 0.000 , , Smallsize Mediumsize Largesize S& 500 Smallsize Medium
size Largesize S& 500
PercentageofstockswithH0thatthemeasuresondayswith 0 and 0 areequalrejected 15.00% 2.50% 22.50% 16.63% 7.50% 25.00% 20.00% 20.40%
PercentageofstockswithH0thatthemeasuresondayswithinthesmallestandlargestquintilesof areequalrejected. 57.50% 60.00% 82.50% 83.15% 95.00% 95.00% 100% 99.78% , , Smallsize Mediumsize Largesize S& 500 Smallsize Medium
size Largesize S& 500
PercentageofstockswithH0thatthemeasuresondayswith 0 and 0 areequalrejected 100% 95.00% 87.50% 93.13% 57.50% 40.00% 50.00% 53.88%
PercentageofstockswithH0thatthemeasuresondayswithinthesmallestandlargestquintilesof areequalrejected. 10.00% 5.00% 12.50% 19.51% 15.00% 17.50% 22.50% 12.86%
Table III Daily return and total risk conditional on measures of information-based trading
For each sample stock, we sort its trading days by one of its daily measures of information-based trading, i.e., PIN, PSOS, PNbIN, or PNbSOS, into quintiles and then calculate the average daily return and realized variance for each quintile. Panel A presents the cross-sectional sample mean of average daily returns and Panel B presents results of realized variance. The difference between the quintiles is also reported.

Panel A: Daily return

                         Quintile                                          Comparison
                         1 (Smallest)  2       3       4       5 (Largest)   5-1     5-3     3-1
Sorted by PIN
  Entire 120 stocks      0.001         0.001   0.000   0.001   0.001         -0.001  0.001   -0.001
  Small size group       0.001         0.001   0.000   0.001   0.000         -0.001  -0.001  0.000
  Medium size group      0.001         0.001   0.000   0.000   0.001         0.000   0.001   -0.001
  Large size group       0.003         0.000   0.000   0.000   0.002         -0.001  0.002   -0.003
  S&P 500 constituents   0.001         0.000   0.000   0.000   0.001         0.000   0.001   -0.001
Sorted by PSOS
  Entire 120 stocks      0.001         0.001   0.001   0.001   -0.001        -0.003  -0.002  0.000
  Small size group       0.001         0.001   0.001   0.001   -0.001        -0.002  -0.002  0.000
  Medium size group      0.002         0.001   0.001   0.001   -0.002        -0.004  -0.003  -0.001
  Large size group       0.002         0.001   0.001   0.001   -0.001        -0.003  -0.002  -0.001
  S&P 500 constituents   0.002         0.002   0.001   0.000   -0.002        -0.004  -0.003  -0.001
Sorted by PNbIN
  Entire 120 stocks      -0.008        -0.002  0.001   0.004   0.009         0.017   0.008   0.009
  Small size group       -0.011        -0.004  0.001   0.005   0.012         0.023   0.011   0.012
  Medium size group      -0.007        -0.002  0.001   0.003   0.008         0.016   0.007   0.008
  Large size group       -0.006        -0.001  0.001   0.003   0.007         0.013   0.006   0.007
  S&P 500 constituents   -0.006        -0.002  0.001   0.003   0.007         0.013   0.006   0.006
Sorted by PNbSOS
  Entire 120 stocks      0.000         0.000   0.001   0.001   0.002         0.002   0.001   0.001
  Small size group       0.000         0.001   0.000   0.000   0.002         0.002   0.002   0.000
  Medium size group      0.000         0.000   0.000   0.001   0.002         0.002   0.003   0.000
  Large size group       0.001         0.000   0.002   0.000   0.003         0.002   0.001   0.001
  S&P 500 constituents   0.001         0.000   -0.001  0.001   0.001         0.000   0.003   -0.002

Panel B: Daily realized variance

                         Quintile                                          Comparison
                         1 (Smallest)  2       3       4       5 (Largest)   5-1     5-3      3-1
Sorted by PIN
  Entire 120 stocks      0.0003        0.0004  0.0004  0.0004  0.0003        0.0000  -0.0001  0.0001
  Small size group       0.0004        0.0005  0.0005  0.0005  0.0005        0.0000  -0.0001  0.0001
  Medium size group      0.0003        0.0003  0.0004  0.0003  0.0003        0.0000  -0.0001  0.0001
  Large size group       0.0003        0.0003  0.0003  0.0003  0.0002        0.0000  -0.0001  0.0000
  S&P 500 constituents   0.0002        0.0003  0.0003  0.0002  0.0002        0.0000  -0.0001  0.0001
Sorted by PSOS
  Entire 120 stocks      0.0002        0.0003  0.0003  0.0004  0.0007        0.0005  0.0004   0.0001
  Small size group       0.0003        0.0004  0.0004  0.0005  0.0009        0.0006  0.0004   0.0002
  Medium size group      0.0002        0.0002  0.0003  0.0003  0.0006        0.0004  0.0003   0.0001
  Large size group       0.0001        0.0002  0.0002  0.0003  0.0005        0.0004  0.0003   0.0001
  S&P 500 constituents   0.0001        0.0002  0.0002  0.0003  0.0005        0.0004  0.0003   0.0001
Sorted by PNbIN
  Entire 120 stocks      0.0003        0.0004  0.0003  0.0004  0.0003        0.0000  0.0000   0.0000
  Small size group       0.0005        0.0005  0.0004  0.0005  0.0005        0.0000  0.0001   0.0000
  Medium size group      0.0003        0.0004  0.0003  0.0003  0.0003        0.0000  0.0000   0.0000
  Large size group       0.0002        0.0003  0.0002  0.0003  0.0002        0.0000  0.0000   0.0000
  S&P 500 constituents   0.0003        0.0003  0.0002  0.0003  0.0002        0.0000  0.0000   0.0000
Sorted by PNbSOS
  Entire 120 stocks      0.0003        0.0003  0.0005  0.0004  0.0003        0.0001  -0.0002  0.0002
  Small size group       0.0004        0.0004  0.0007  0.0005  0.0004        0.0000  -0.0003  0.0004
  Medium size group      0.0003        0.0003  0.0004  0.0003  0.0003        0.0001  -0.0001  0.0002
  Large size group       0.0002        0.0003  0.0004  0.0003  0.0003        0.0001  -0.0001  0.0002
  S&P 500 constituents   0.0002        0.0003  0.0004  0.0003  0.0003        0.0001  -0.0001  0.0002
Table IV The effects of information-based trading on daily return
This table reports the results of following regression models for individual stocks: ; ; ; ; ; ; where denotes the daily return on trading day t. In the first regression, ; ; , ; ; and ; ; are the expected numbers of net buys due to private information, disputable public information, and liquidity needs, respectively. Scaled measures and in the second regression are the probabilities of net buys due to private information and disputable public information, respectively. The table displays average regression coefficients, average t-statistics (in parentheses) corrected for autocorrelation, the percentage of sample stocks with the regression coefficient being significantly positive or negative at the 5% level, average and adjusted of the individual regressions across the first sample of 120 stocks, its three size-based groups, and the second sample of 451 stocks constituting S&P500 Index. Effect size measures the normalized total effect of an explanatory variable. For instance, the effect size of ; ; is calculated by ∑ ; ; ∑ | | .⁄ Average effect sizes across the sample stocks are reported in brackets. The p-values of testing that the mean of coefficients of is equal to that of are reported in Panel B.
Panel A: Regressions of daily return on arrival rates of net buys of different types

                               Open-to-close daily return                             Close-to-close daily return
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          -0.051     -0.083     -0.027     -0.043     -0.039     -0.062     -0.091     -0.041     -0.053     -0.060
  Average t-statistic          -0.992     -1.656     -0.546     -0.775     -0.721     -1.100     -1.670     -0.741     -0.889     -1.069
  Coefficient > 0 significant  0.83%      0.00%      2.50%      0.00%      1.33%      1.67%      0.00%      5.00%      0.00%      0.22%
  Coefficient < 0 significant  25.83%     45.00%     15.00%     17.50%     14.19%     23.33%     45.00%     20.00%     5.00%      18.63%
  Average effect size          6.57%      8.85%      5.08%      5.39%      5.60%      7.30%      9.39%      6.49%      6.03%      6.58%
Expected net buys [symbol missing]
  Average coefficient          -1.00e-5   -2.05e-5   -2.91e-5   1.96e-5    1.80e-5    -1.44e-6   7.27e-5    -4.43e-5   3.27e-5    1.11e-5
  Average t-statistic          1.245      1.602      1.062      1.069      0.895      1.175      1.545      0.850      1.130      0.928
  Coefficient > 0 significant  26.67%     32.50%     22.50%     25.00%     19.29%     27.50%     32.50%     20.00%     30.00%     21.29%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.89%      0.83%      0.00%      0.00%      2.50%      2.22%
  Average effect size          10.0%      12.9%      7.60%      9.47%      7.63%      10.1%      12.3%      7.42%      10.6%      8.28%
Expected net buys [symbol missing]
  Average coefficient          5.13e-5    1.01e-4    3.98e-5    3.33e-5    2.20e-5    5.64e-5    1.08e-4    9.38e-5    2.70e-5    1.55e-5
  Average t-statistic          4.452      5.029      4.487      3.842      3.946      4.182      4.894      3.893      3.759      4.018
  Coefficient > 0 significant  86.67%     90.00%     85.00%     85.00%     84.48%     80.83%     87.50%     80.00%     75.00%     82.26%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          33.6%      39.3%      35.3%      26.2%      27.4%      32.8%      38.2%      32.8%      27.3%      29.5%
Expected net buys [symbol missing]
  Average coefficient          4.30e-5    3.41e-5    3.68e-5    2.80e-5    2.01e-5    4.35e-5    3.40e-5    6.87e-5    -2.33e-6   1.21e-5
  Average t-statistic          1.147      1.168      1.189      1.085      0.808      1.077      1.059      1.198      0.975      0.896
  Coefficient > 0 significant  23.33%     22.50%     22.50%     25.00%     17.07%     22.50%     20.00%     22.50%     25.00%     19.73%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      1.55%      0.00%      0.00%      0.00%      0.00%      1.11%
  Average effect size          11.5%      13.2%      9.96%      11.3%      10.2%      11.8%      12.5%      10.4%      13.3%      11.3%
Average adjusted R2            8.18%      10.52%     8.88%      5.14%      5.51%      7.87%      10.28%     7.68%      5.66%      6.31%
Average R2                     8.91%      11.23%     9.61%      5.90%      6.27%      8.61%      10.99%     8.42%      6.41%      7.06%
Panel B: Regressions of daily return on daily measures of PIN and PSOS

                               Open-to-close daily return                             Close-to-close daily return
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          -0.049     -0.082     -0.024     -0.040     -0.037     -0.060     -0.091     -0.040     -0.050     -0.057
  Average t-statistic          -0.942     -1.641     -0.453     -0.73      -0.664     -1.059     -1.628     -0.685     -0.863     -0.995
  Coefficient > 0 significant  0.83%      0.00%      2.50%      0.00%      1.55%      1.67%      0.00%      5.00%      0.00%      0.22%
  Coefficient < 0 significant  23.33%     45.00%     10.00%     15.00%     13.30%     20.00%     42.50%     12.50%     5.00%      14.41%
  Average effect size          6.41%      8.72%      4.78%      5.73%      5.52%      7.09%      9.32%      6.18%      5.78%      6.34%
PIN
  Average coefficient          0.044      0.050      0.041      0.042      0.043      0.053      0.057      0.047      0.055      0.058
  Average t-statistic          5.835      0.050      5.88       4.710      4.788      5.605      6.697      5.385      4.734      5.081
  Coefficient > 0 significant  95.83%     100%       95.00%     92.50%     93.13%     95.00%     100%       95.00%     90.00%     93.35%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          37.4%      45.8%      37.9%      28.3%      28.7%      37.0%      44.8%      36.6%      29.6%      31.1%
PSOS
  Average coefficient          0.033      0.032      0.014      0.051      0.039      0.033      0.042      0.025      0.033      0.052
  Average t-statistic          0.793      0.628      0.89       0.862      0.584      0.774      0.568      0.948      0.808      0.653
  Coefficient > 0 significant  12.50%     12.50%     15.00%     10.00%     11.53%     12.50%     7.50%      17.50%     12.50%     14.63%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.89%      0.00%      0.00%      0.00%      0.00%      1.11%
  Average effect size          6.54%      6.23%      6.45%      6.93%      6.24%      7.18%      6.22%      7.01%      8.31%      6.57%
Average adjusted R2            8.06%      10.90%     8.47%      4.81%      4.98%      7.80%      10.79%     7.57%      5.05%      5.64%
Average R2                     8.61%      11.43%     9.02%      5.38%      5.55%      8.35%      11.32%     8.13%      5.62%      6.20%
Hypothesis test: the means of the regression coefficients of PIN and PSOS are equal.
p-value                        1.05e-7    3.12e-2    1.5e-4     0.002      0.011      2.18e-4    0.011      0.019      0.037      1.12e-6
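The cross-sectional summary statistics used in this table (average coefficient, average t-statistic, and the share of stocks with a significantly positive or negative coefficient at the 5% level) can be illustrated with a minimal sketch. The function name `summarize_across_stocks` and the two-sided critical value of 1.96 (a normal approximation) are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean

def summarize_across_stocks(coefs, tstats, crit=1.96):
    """Aggregate per-stock regression output for one explanatory variable.

    coefs, tstats: one entry per stock (hypothetical inputs).
    crit: assumed two-sided 5% critical value (normal approximation).
    """
    n = len(coefs)
    pos = sum(1 for t in tstats if t > crit)    # significantly positive
    neg = sum(1 for t in tstats if t < -crit)   # significantly negative
    return {
        "avg_coef": mean(coefs),
        "avg_t": mean(tstats),
        "pct_pos_sig": 100.0 * pos / n,
        "pct_neg_sig": 100.0 * neg / n,
    }

# Toy example with four stocks
summary = summarize_across_stocks([0.1, 0.2, -0.1, 0.0], [2.5, 3.0, -2.2, 0.5])
```

Note that an average t-statistic can be small even when a sizeable share of stocks has individually significant coefficients, which is why the table reports both.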
Table V The effects of information-based trading on total risk of return
This table reports the results of two regression models estimated for individual stocks, with the realized variance of intraday returns sampled at the 10-minute or 15-minute interval as the dependent variable. In the first regression, the explanatory variables are the expected numbers of trades due to liquidity needs, private information, and disputable public information, respectively, on day t. In the second regression, the scaled measures are the probabilities of privately informed trading and SOS trading. The table displays average regression coefficients, average t-statistics corrected for autocorrelation, the percentage of sample stocks whose regression coefficient is significantly positive or negative at the 5% level, and the average R2 and adjusted R2 of the individual regressions, across the first sample of 120 stocks, its three size-based groups, and the second sample of 451 stocks in the S&P 500 Index. Effect size measures the normalized total effect of an explanatory variable. For instance, the effect size of an expected-trade measure is calculated as a ratio of sums over the sample days, with the sum of daily realized variances in the denominator. Average effect sizes across the sample stocks are also reported. The p-values for testing that the mean coefficient of the privately-informed-trading measure equals that of the SOS-trading measure are reported in Panel B.
Panel A: Regressions of daily realized variance on arrival rates of trades of different types

                               15-minute                                              10-minute
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.290      0.326      0.332      0.211      0.194      0.303      0.345      0.327      0.237      0.224
  Average t-statistic          3.894      4.611      4.169      2.902      2.596      4.248      4.687      4.641      3.415      3.072
  Coefficient > 0 significant  74.17%     85.00%     82.50%     55.00%     51.22%     74.17%     85.00%     75.00%     62.50%     61.20%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.22%      0.00%      0.00%      0.00%      0.00%      0.67%
  Average effect size          29.1%      32.7%      33.2%      21.4%      20.7%      30.6%      34.4%      32.9%      24.5%      23.4%
Expected trades due to liquidity needs
  Average coefficient          2.79e-7    8.22e-7    8.79e-9    5.92e-9    7.51e-9    2.92e-7    8.27e-7    4.38e-8    6.31e-9    1.19e-8
  Average t-statistic          0.675      1.085      0.476      0.465      0.677      0.932      1.288      0.625      0.882      1.112
  Coefficient > 0 significant  16.67%     25.00%     12.50%     12.50%     19.07%     28.33%     37.50%     22.50%     25.00%     27.05%
  Coefficient < 0 significant  0.83%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          29.9%      36.2%      26.8%      26.7%      37.9%      30.3%      36.9%      24.5%      29.5%      37.4%
Expected trades due to private information
  Average coefficient          2.98e-7    7.63e-7    9.09e-8    4.03e-8    2.98e-8    2.58e-7    6.24e-7    1.15e-7    3.50e-8    2.90e-8
  Average t-statistic          0.571      0.864      0.384      0.466      0.293      0.611      0.787      0.511      0.535      0.299
  Coefficient > 0 significant  8.33%      12.50%     7.50%      5.00%      7.10%      6.67%      7.50%      5.00%      7.50%      6.21%
  Coefficient < 0 significant  0.83%      0.00%      0.00%      2.50%      3.55%      0.00%      0.00%      0.00%      0.00%      3.33%
  Average effect size          9.12%      11.6%      6.49%      9.26%      9.36%      8.00%      9.31%      6.77%      7.93%      8.47%
Expected trades due to disputable public information
  Average coefficient          5.44e-7    1.16e-7    3.40e-7    1.32e-7    1.21e-7    5.64e-7    1.23e-6    3.17e-7    1.40e-7    1.18e-7
  Average t-statistic          3.740      3.449      3.934      3.837      3.975      4.071      3.746      4.090      4.379      4.533
  Coefficient > 0 significant  95.00%     90.00%     95.00%     100%       97.56%     94.17%     92.50%     92.50%     97.50%     96.90%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          49.7%      43.8%      46.3%      59.1%      61.1%      47.4%      42.3%      44.9%      54.9%      55.8%
Average adjusted R2            37.09%     32.91%     39.10%     39.25%     40.42%     40.82%     37.76%     41.11%     43.59%     45.03%
Average R2                     37.59%     33.44%     39.59%     39.73%     40.90%     41.29%     38.25%     41.58%     44.04%     45.47%
Panel B: Regressions of daily realized variance on daily measures of PIN and PSOS

                               15-minute                                              10-minute
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.342      0.361      0.374      0.289      0.278      0.354      0.381      0.140      0.312      0.302
  Average t-statistic          4.663      5.080      4.867      4.044      4.048      5.057      5.538      0.550      4.508      4.199
  Coefficient > 0 significant  89.17%     95.00%     90.00%     82.50%     78.49%     85.83%     87.50%     87.50%     82.50%     76.05%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.22%
  Average effect size          34.1%      36.9%      37.3%      28.9%      27.9%      35.4%      38.1%      36.9%      31.2%      30.3%
PIN
  Average coefficient          2.48e-4    4.39e-4    1.50e-4    1.55e-4    1.60e-4    2.40e-4    3.89e-4    1.66e-4    1.65e-4    1.64e-4
  Average t-statistic          1.377      1.832      1.145      1.152      0.950      1.442      1.841      1.289      1.197      0.965
  Coefficient > 0 significant  32.50%     45.00%     25.00%     27.50%     22.17%     30.00%     40.00%     30.00%     20.00%     19.96%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.89%      0.00%      0.00%      0.00%      0.00%      1.55%
  Average effect size          9.84%      13.9%      8.27%      7.34%      7.62%      9.14%      12.1%      8.20%      7.05%      7.05%
PSOS
  Average coefficient          6.45e-4    8.07e-4    5.46e-4    5.82e-4    6.24e-4    6.46e-4    8.12e-4    5.46e-4    5.80e-4    6.22e-4
  Average t-statistic          4.346      4.291      4.467      4.280      4.570      4.624      4.461      4.613      4.799      5.126
  Coefficient > 0 significant  100%       100%       100%       100%       99.33%     99.17%     97.50%     100%       100%       98.45%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          53.7%      49.7%      50.2%      61.3%      62.6%      51.5%      47.7%      48.9%      57.8%      59.1%
Average adjusted R2            30.43%     26.32%     32.76%     31.06%     31.63%     33.86%     31.84%     34.17%     35.56%     36.84%
Average R2                     30.85%     37.50%     33.16%     31.47%     32.04%     34.25%     32.25%     34.56%     35.95%     37.22%
Hypothesis test: the means of the regression coefficients of PIN and PSOS are equal.
p-value                        2.95e-6    0.004      0.002      5.64e-3    2.20e-9    2.11e-6    0.001      0.003      6.26e-2    8.02e-10
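The dependent variable of this table, the daily realized variance of intraday returns, can be illustrated with a minimal sketch. It assumes that the prices have already been sampled at a fixed 10-minute or 15-minute grid within one trading day and that returns are log returns; both the function name `realized_variance` and the use of log returns are assumptions for illustration, not details confirmed by the paper.

```python
import math

def realized_variance(prices):
    """Sum of squared intraday log returns for one trading day.

    prices: intraday prices on a fixed sampling grid (assumed input),
            e.g. one observation every 10 or 15 minutes.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices to form a return")
    # Log return over each sampling interval (an assumption; simple
    # returns would give nearly identical values at this frequency).
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return sum(r * r for r in returns)

# Toy example: the price moves up by 1% and then back to its starting level
rv = realized_variance([100.0, 101.0, 100.0])
```

A finer grid uses more intraday observations per day but is more exposed to microstructure noise, which is one common reason to report results at two sampling intervals.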
Table VI The effects of information-based trading on systematic risk and idiosyncratic risk
This table reports the results of four regression models estimated for individual stocks. The dependent variables are a stock's market beta on day t, estimated from an intraday market model, and its daily idiosyncratic risk. In the first two regressions, the explanatory variables are the expected numbers of trades due to liquidity needs, private information, and disputable public information, respectively. In the last two regressions, the scaled measures are the probabilities of privately informed trading and SOS trading. The table displays average regression coefficients, average t-statistics corrected for autocorrelation, the percentage of sample stocks whose regression coefficient is significantly positive or negative at the 5% level, and the average R2 and adjusted R2 of the individual regressions, across the first sample of 120 stocks, its three size-based groups, and the second sample of 451 stocks in the S&P 500 Index. Effect size measures the normalized total effect of an explanatory variable. For instance, the effect size of an expected-trade measure is calculated as a ratio of sums over the sample days, with the sum of absolute values of the dependent variable in the denominator. Average effect sizes across the sample stocks are also reported. The p-values for testing that the mean coefficient of the privately-informed-trading measure equals that of the SOS-trading measure are reported in Panels C and D.
Panel A: Regressions of daily beta and idiosyncratic risk sampled at a 15-minute frequency

                               Daily beta                                             Daily idiosyncratic risk
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.121      0.095      0.126      0.143      0.126      0.373      0.402      0.408      0.309      0.287
  Average t-statistic          2.290      1.825      2.321      2.724      2.370      6.264      7.037      6.872      4.883      4.528
  Coefficient > 0 significant  55.83%     45.00%     60.00%     62.50%     60.09%     95.83%     97.50%     100%       90.00%     84.92%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          12.3%      9.81%      12.7%      14.5%      12.7%      37.3%      40.2%      40.7%      30.9%      28.7%
Expected trades due to liquidity needs
  Average coefficient          1.70e-4    3.47e-7    -1.28e-4   1.57e-5    -4.41e-5   1.06e-5    2.48e-5    4.78e-6    2.06e-6    1.83e-6
  Average t-statistic          0.456      0.002      -0.550     0.605      -0.458     2.776      2.703      2.282      3.342      3.478
  Coefficient > 0 significant  5.00%      12.50%     0.00%      2.50%      2.22%      70.83%     70.00%     57.50%     85.00%     83.59%
  Coefficient < 0 significant  6.67%      7.50%      7.50%      5.00%      8.65%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          5.72%      5.36%      3.20%      3.46%      2.83%      24.9%      25.3%      21.1%      28.4%      30.3%
Expected trades due to private information
  Average coefficient          -4.62e-5   4.95e-4    -8.19e-7   -1.15e-5   1.01e-5    3.85e-6    9.04e-6    1.67e-6    8.43e-7    4.91e-7
  Average t-statistic          -0.251     0.901      -0.139     -0.204     0.494      0.701      0.958      0.435      0.711      0.418
  Coefficient > 0 significant  5.00%      27.50%     7.50%      2.50%      15.52%     10.83%     12.50%     7.50%      12.50%     9.53%
  Coefficient < 0 significant  7.50%      2.50%      10.00%     7.50%      4.21%      0.00%      0.00%      0.00%      0.00%      2.88%
  Average effect size          3.76%      9.03%      4.67%      2.73%      3.59%      2.92%      3.38%      2.33%      3.04%      2.62%
Expected trades due to disputable public information
  Average coefficient          9.60e-4    2.46e-3    3.34e-4    8.66e-5    6.58e-5    1.07e-5    1.59e-5    5.86e-6    2.47e-6    2.18e-6
  Average t-statistic          1.158      2.043      0.882      0.549      0.857      5.116      4.627      5.147      5.572      5.926
  Coefficient > 0 significant  25.83%     40.00%     22.50%     15.00%     19.51%     97.50%     95.00%     97.50%     100%       99.33%
  Coefficient < 0 significant  0.83%      2.50%      0.00%      0.00%      0.22%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          28.5%      49.1%      20.3%      16.2%      18.6%      18.5%      16.2%      17.9%      21.5%      22.2%
Average adjusted R2            3.81%      4.76%      2.91%      3.75%      3.13%      45.97%     41.32%     46.17%     50.41%     50.66%
Average R2                     4.58%      5.53%      3.69%      4.52%      3.91%      46.40%     41.79%     46.60%     50.81%     51.05%
Panel B: Regressions of daily beta and idiosyncratic risk sampled at a 10-minute frequency

                               Daily beta                                             Daily idiosyncratic risk
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.136      0.102      0.140      0.167      0.151      0.392      0.423      0.419      0.334      0.315
  Average t-statistic          2.603      2.162      2.606      3.040      2.819      6.683      7.581      7.206      5.263      5.081
  Coefficient > 0 significant  65.83%     57.50%     62.50%     77.50%     71.18%     94.17%     97.50%     95.00%     90.00%     88.03%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          13.9%      10.8%      14.1%      16.8%      15.2%      39.2%      42.3%      41.9%      33.4%      31.5%
Expected trades due to liquidity needs
  Average coefficient          1.03e-4    4.13e-4    -3.97e-5   -6.32e-5   -3.40e-5   1.10e-5    2.55e-5    5.33e-6    2.11e-6    1.87e-6
  Average t-statistic          -0.097     0.453      -0.379     -0.365     -0.542     3.031      2.837      2.542      3.714      3.785
  Coefficient > 0 significant  4.17%      10.00%     2.50%      0.00%      1.77%      72.50%     67.50%     65.00%     85.00%     85.37%
  Coefficient < 0 significant  4.17%      0.00%      7.50%      5.00%      11.97%     0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          3.81%      4.96%      3.31%      3.15%      2.69%      25.0%      24.6%      22.3%      28.0%      30.1%
Expected trades due to private information
  Average coefficient          1.66e-4    4.87e-4    1.82e-5    -6.35e-6   8.42e-6    3.35e-6    7.33e-6    2.01e-6    7.17e-7    4.77e-7
  Average t-statistic          0.382      1.329      0.013      -0.196     0.421      0.710      0.891      0.584      0.656      0.400
  Coefficient > 0 significant  20.00%     47.50%     7.50%      5.00%      16.63%     10.00%     7.50%      15.00%     7.50%      7.98%
  Coefficient < 0 significant  9.17%      7.50%      12.50%     7.50%      6.21%      0.00%      0.00%      0.00%      0.00%      3.77%
  Average effect size          6.11%      11.3%      3.92%      3.08%      3.18%      2.60%      2.89%      2.35%      2.57%      2.39%
Expected trades due to disputable public information
  Average coefficient          9.51e-4    2.04e-3    6.91e-4    1.23e-4    6.57e-5    1.17e-5    1.65e-5    5.56e-6    2.47e-6    2.12e-6
  Average t-statistic          1.870      2.753      1.929      0.928      0.947      5.508      4.900      5.509      6.114      6.478
  Coefficient > 0 significant  41.67%     57.50%     47.50%     20.00%     23.50%     97.50%     95.00%     97.50%     100%       99.33%
  Coefficient < 0 significant  0.83%      0.00%      0.00%      2.50%      1.33%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          38.2%      57.2%      36.7%      20.7%      18.5%      17.7%      15.7%      17.0%      20.3%      20.8%
Average adjusted R2            5.71%      7.16%      4.93%      5.04%      4.37%      49.16%     44.95%     48.57%     53.97%     54.03%
Average R2                     6.46%      7.90%      5.69%      5.80%      5.14%      49.57%     45.39%     48.98%     54.34%     54.40%
Panel C: Regressions of daily beta and idiosyncratic risk sampled at a 15-minute frequency

                               Daily beta                                             Daily idiosyncratic risk
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.124      0.098      0.129      0.143      0.128      0.409      0.428      0.436      0.365      0.344
  Average t-statistic          2.331      1.888      2.367      2.738      2.393      7.098      7.604      7.374      6.317      5.997
  Coefficient > 0 significant  58.33%     47.50%     62.50%     65.00%     60.53%     98.33%     97.50%     100%       97.50%     95.12%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          12.5%      10.2%      12.9%      14.46%     12.8%      40.9%      42.7%      43.5%      36.4%      34.3%
PIN
  Average coefficient          0.015      0.201      -0.132     -0.025     -0.124     0.005      0.007      0.003      0.005      0.004
  Average t-statistic          0.049      0.601      -0.399     -0.055     -0.386     1.619      2.126      1.276      1.456      1.150
  Coefficient > 0 significant  5.00%      12.50%     0.00%      2.50%      1.11%      40.00%     50.00%     32.50%     37.50%     25.72%
  Coefficient < 0 significant  5.83%      2.50%      10.00%     5.00%      6.21%      0.00%      0.00%      0.00%      0.00%      0.22%
  Average effect size          3.85%      5.73%      3.11%      2.72%      2.99%      4.13%      5.77%      3.55%      3.08%      2.95%
PSOS
  Average coefficient          0.192      0.430      0.016      0.131      0.098      0.012      0.013      0.011      0.013      0.014
  Average t-statistic          1.282      2.468      0.245      1.133      0.892      6.271      6.034      6.108      6.671      6.997
  Coefficient > 0 significant  34.17%     65.00%     7.50%      30.00%     25.50%     100%       100%       100%       100%       100%
  Coefficient < 0 significant  4.17%      0.00%      10.00%     2.50%      3.33%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          9.24%      17.3%      5.04%      5.36%      5.22%      23.9%      22.1%      22.6%      27.1%      28.0%
Average adjusted R2            3.38%      3.92%      2.43%      3.77%      2.87%      41.71%     37.66%     42.27%     45.20%     45.29%
Average R2                     3.96%      4.50%      3.02%      4.35%      3.45%      42.06%     38.04%     42.61%     45.53%     45.62%
Hypothesis test: the means of the regression coefficients of PIN and PSOS are equal.
p-value                        0.965      0.292      0.110      0.828      0.528      1.17e-12   1.85e-6    1.19e-4    8.13e-5    2.22e-11
Panel D: Regressions of daily beta and idiosyncratic risk sampled at a 10-minute frequency

                               Daily beta                                             Daily idiosyncratic risk
                               Entire 120 Small      Medium     Large      S&P 500    Entire 120 Small      Medium     Large      S&P 500
Lagged term
  Average coefficient          0.140      0.107      0.145      0.168      0.154      0.428      0.448      0.447      0.387      0.366
  Average t-statistic          2.649      2.211      2.699      3.036      2.837      7.563      8.394      7.818      6.478      6.280
  Coefficient > 0 significant  66.67%     57.50%     67.50%     75.00%     72.51%     99.17%     100%       100%       97.50%     94.90%
  Coefficient < 0 significant  0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          14.2%      11.2%      14.6%      16.9%      15.4%      42.7%      44.8%      44.6%      38.7%      36.5%
PIN
  Average coefficient          0.11       0.333      0.007      0.010      -0.130     0.005      0.006      0.004      0.005      0.004
  Average t-statistic          0.352      1.138      0.018      -0.101     -0.457     1.653      2.103      1.440      1.415      1.147
  Coefficient > 0 significant  10.83%     17.50%     12.50%     2.50%      1.55%      38.33%     55.00%     35.00%     25.00%     25.28%
  Coefficient < 0 significant  5.83%      0.00%      12.50%     5.00%      10.64%     0.00%      0.00%      0.00%      0.00%      0.22%
  Average effect size          4.85%      8.17%      3.92%      2.47%      2.74%      3.87%      5.26%      3.52%      2.83%      2.74%
PSOS
  Average coefficient          0.26       0.540      0.134      0.106      0.083      0.012      0.013      0.011      0.013      0.014
  Average t-statistic          1.961      3.665      1.267      0.952      0.859      6.455      6.143      6.187      7.033      7.479
  Coefficient > 0 significant  42.50%     72.50%     35.00%     20.00%     25.06%     100%       100%       100%       100%       100%
  Coefficient < 0 significant  0.83%      0.00%      0.00%      2.50%      5.76%      0.00%      0.00%      0.00%      0.00%      0.00%
  Average effect size          11.7%      23.4%      7.23%      4.50%      4.92%      23.0%      21.5%      21.7%      25.8%      26.8%
Average adjusted R2            5.10%      6.28%      4.14%      4.89%      4.04%      44.82%     41.20%     44.39%     48.87%     49.23%
Average R2                     5.67%      6.85%      4.71%      5.46%      4.61%      45.15%     41.55%     44.72%     49.18%     49.54%
Hypothesis test: the means of the regression coefficients of PIN and PSOS are equal.
p-value                        0.996      0.695      0.450      0.881      0.132      9.87e-13   7.47e-7    2.56e-4    3.03e-4    2.22e-9
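The two dependent variables of this table, the daily beta from an intraday market model and the daily idiosyncratic risk, can be illustrated with a minimal sketch. It assumes an ordinary least squares fit of intraday stock returns on intraday market returns within one day and takes idiosyncratic risk to be the sum of squared residuals of that fit. The function name `intraday_market_model` and this residual-based definition are assumptions for illustration; the paper's exact definition may differ.

```python
def intraday_market_model(stock_returns, market_returns):
    """OLS market model on one day's intraday returns.

    Returns (beta, idiosyncratic_risk), where idiosyncratic risk is
    assumed here to be the sum of squared regression residuals.
    """
    n = len(stock_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns))
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    if var_m == 0.0:
        raise ValueError("market returns have zero variance")
    beta = cov / var_m                 # OLS slope
    alpha = mean_s - beta * mean_m     # OLS intercept
    residuals = [s - alpha - beta * m
                 for s, m in zip(stock_returns, market_returns)]
    return beta, sum(e * e for e in residuals)

# Toy example: the stock moves exactly twice as much as the market
market = [0.01, -0.02, 0.015, 0.0]
stock = [2 * m + 0.001 for m in market]
beta, idio = intraday_market_model(stock, market)
```

Under this decomposition, total intraday variation splits into a market-driven part governed by beta and a residual part, which is the sense in which the table separates systematic from idiosyncratic risk.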