on more robust estimation of skewness and...
TRANSCRIPT
1
On More Robust Estimation of Skewness and Kurtosis: Simulation and
Application to the S&P500 Index
Tae-Hwan Kim
School of Economics, University of Nottingham University Park, Nottingham NG7 2RD, UK
(Phone) 44-115-951-5466 (Fax) 44-115-951-4159 [email protected]
Halbert White
Department of Economics, University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093-0508 (Phone) 858-534-3502 (Fax) 858-534-7040
September, 2003
Abstract: For both the academic and the financial communities it is a familiar stylized fact that stock
market returns have negative skewness and excess kurtosis. This stylized fact has been supported by a vast
collection of empirical studies. Given that the conventional measures of skewness and kurtosis are computed
as an average and that averages are not robust, we ask, “How useful are the measures of skewness and
kurtosis used in previous empirical studies?” To answer this question we provide a survey of robust
measures of skewness and kurtosis from the statistics literature and carry out extensive Monte Carlo
simulations that compare the conventional measures with the robust measures of our survey. An application
of the robust measures to daily S&P500 index data indicates that the stylized facts might have been accepted
too readily. We suggest that looking beyond the standard skewness and kurtosis measures can provide
deeper insight into market returns behaviour.
Keywords : Skewness, Kurtosis, Quantiles, Robustness.
2
1. Introduction
It has long been recognized that the behavior of stock market returns does not agree with the
frequently assumed normal distribution. This disagreement is often highlighted by showing the
large departures of the skewness and kurtosis of returns from normal distribution counterparts. For
both the academic and the financial communities it has become a firm and indisputable stylized
fact that stock market returns have negative skewness and excess kurtosis. This stylized fact has
been supported by a huge collection of empirical studies. Some recent papers on this issue include
Bates (1996), Jorion (1988), Hwang and Satchell (1999), and Harvey and Siddique (1999, 2000).
The role of higher moments has become increasingly important in the literature mainly because
the traditional measure of risk, variance (or standard deviation), has failed to capture fully the "true
risk" of the distribution of stock market returns. For example, if investors prefer right-skewed
portfolios, then more reward should be given to investors willing to invest in left-skewed portfolios
even though both portfolios have the same standard deviation. This suggests that the "true risk"
may be a multi-dimensional concept and that other measures of distributional shape such as higher
moments can be useful in obtaining a better description of multi-dimensional risk. In this context,
Harvey and Siddique (2000) proposed an asset pricing model that incorporates skewness and
Hwang and Satchell (1999) developed a CAPM for emerging markets taking into account
skewness and kurtosis.
Given this emerging interest in skewness and kurtosis in financial markets, one should ask the
following question: how useful are the measures of skewness and kurtosis used in previous
empirical studies? Practically all of the previous work concerning skewness and kurtosis in
financial markets has used the conventional measures of skewness and kurtosis (that is, the
standardized third and fourth sample moments or some variants of these). It is well known that the
sample mean (also its regression version, the least squares estimator) is very sensitive to outliers.
Since the conventional measures of skewness and kurtosis are essentially based on sample
averages, they are also sensitive to outliers. Moreover, the impact of outliers is greatly amplified in
the conventional measures of skewness and kurtosis due to the fact that they are raised to the third
and fourth powers.
In the statistics literature, a great deal of effort has been taken to overcome the non-robustness
of the conventional measures of location and dispersion (i.e. mean and variance), and some
attention has been paid to the non-robustness of the conventional measures of skewness and
3
kurtosis. In the finance literature, there has been some concern with non-robustness of conventional
measures of location and dispersion, but almost no attention to the non-robustness of conventional
measures of skewness and kurtosis. In this paper, we consider certain robust alternative measures
of skewness and kurtosis based on quantiles that have been previously developed in the statistics
literature, and we conduct extensive Monte Carlo simulations to evaluate and compare the
conventional measures of skewness and kurtosis and their robust counterparts. Our simulation
results demonstrate that the conventional measures are extremely sensitive to single outliers or
small groups of outliers, comparable to those observed in U.S. stock returns. An application of
robust measures to the daily S&P500 index indicates that the familiar stylized facts (negative
skewness and excess kurtosis in financial markets) may have been too readily accepted.
2. Review of Robust Measures of Skewness and Kurtosis
We consider a process Ntty ,...,2,1}{ = and assume that the ty ’s are independent and identically
distributed with a cumulative distribution function F . The conventional coefficients of skewness
and kurtosis for ty are given by:
,3
1
−=
σµty
ESK 34
1 −
−=
σµty
EKR ,
where )( tyE=µ and ,)( 22 µσ −= tyE and expectation E is taken with respect to F . Given the
data Ntty ,...,2,1}{ = , 1SK and 1KR are usually estimated by the sample averages
,ˆ
ˆ
1
31
1 ∑=
−∧
−=
N
t
tyTSK
σµ
∑=
−∧
−
−=
N
t
tyTKR
1
41
1 3ˆ
ˆσ
µ,
where ,ˆ1
1∑=
−=N
ttyTµ .)ˆ(ˆ
1
212 ∑=
− −=N
ttyT µσ
Due to the third and fourth power terms in 1SK and 1KR , the values of these measures can be
arbitrarily large, especially when there are one or more large outliers in the data. For this reason, it
can sometimes be difficult to give a sensible interpretation to large values of these measures simply
because we do not know whether the true values are indeed large or there exist some outliers. One
seemingly simple solution is to eliminate the outliers from the data. Two problems arise in this
approach. One is that the decision to eliminate outliers is taken usually after visually inspecting the
4
data; this can invalidate subsequent statistical inference. The other is that deciding which
observations are outliers can be somewhat arbitrary.
Hence, eliminating outliers manually is not as simple as it may appear, and it is desirable to
have non-subjective robust measures of skewness and kurtosis that are not too sensitive to outliers.
We turn now to a description of a number of more robust measures of skewness and kurtosis that
have been proposed in the statistics literature. It is interesting to note that among the robust
measures we discuss below, only one requires the second moment and all other measures do not
require any moments to exist.
2.1 Robust Measures of Skewness
Robust measures of location and dispersion are well known in the literature. For example, the
median can be used for location and the interquartile range for dispersion. Both the median and the
interquartile range are based on quantiles. Following this tradition, Bowley (1920) proposed a
coefficient of skewness based on quantiles:
,2
13
2132 QQ
QQQSK
−−+
=
where iQ is the thi quartile of ty : that is, ),25.0(11
−= FQ )5.0(12
−= FQ and ).75.0(13
−= FQ It
is easily seen that for any symmetric distribution, the Bowley coefficient of skewness is zero. The
denominator 13 QQ − re-scales the coefficient so that the maximum value for 2SK is 1,
representing extreme right skewness and the minimum value for 2SK is –1, representing extreme
left skewness.
The Bowley coefficient of skewness has been generalized by Hinkley (1975):
)()1(2)()1(
)(11
211
3 αααα
α−−
−−
−−−+−
=FF
QFFSK ,
for any α between 0 and 0.5. Note that the Bowley coefficient of skewness is a special case of
Hinkley's coefficient when α = 0.25. This measure, however, depends on the value of α and it is
not clear what value should be used for α . One way of removing this dependence is to integrate
out ,α as done in Groeneveld and Meeden (1984):
5
{ }{ }
.||
)()1(
2)()1(
2
2
5.0
0
11
5.0
0 211
3
QyEQ
dFF
dQFFSK
t −−
=
−−
−+−=
∫∫
−−
−−
µ
ααα
ααα
This measure is also zero for any symmetric distributions and is bounded by –1 and 1.
Noting that the denominator || 2QyE t − is a kind of dispersion measure, we observe that the
Pearson coefficient of skewness [Kendall and Stuart (1977)] is obtained by replacing the
denominator with the standard deviation as follows:
.24 σ
µ QSK
−=
Groeneveld and Meeden (1984) have put forward the following four properties that any
reasonable coefficient of skewness )( tyγ should satisfy: (i) for any ),0( ∞∈a and ),,( ∞−∞∈b
)( tyγ = )( bay t +γ ; (ii) if ty is symmetrically distributed, then )( tyγ = 0; (iii) )( tyγ− = )( ty−γ ;
(iv) if F and G are cumulative distribution functions of ty and tx , and GF c< , then )( tyγ ≤
)( txγ where c< is a skewness-ordering among distributions (see Zwet (1964) for the definition).
Groeneveld and Meeden proved that 1SK , 2SK and 3SK satisfy (i)-(iv), but 4SK satisfies (i)-(iii)
only.
2.2 Robust Measures of Kurtosis
Moors (1988) showed that the conventional measure of kurtosis ( 1KR ) can be interpreted as a
measure of the dispersion of a distribution around the two values σµ ± . Hence, 1KR can be large
when probability mass is concentrated either near the mean µ or in the tails of the distributions.
Based on this interpretation, Moors (1988) proposed a robust alternative to 1KR :
,)()(
26
1357
EEEEEE
−−+−
where iE is the thi octile: that is, )8/(1 iFEi−= for .7...,,2,1=i Moors
justified this estimator on the ground that the two terms, )( 57 EE − and )( 13 EE − , are large (small)
if relatively little (much) probability mass is concentrated in the neighbourhood of 6E and 2E ,
corresponding to large (small) dispersion around σµ ± . The denominator is a scaling factor
6
ensuring that the statistic is invariant under linear transformation. As we do for ,1KR we center the
Moors coefficient of kurtosis at the value for the standard normal distribution. It is easy to
calculate that ,15.171 −=−= EE ,68.062 −=−= EE 32.053 −=−= EE and 04 =E for )1,0(N
and therefore the Moors coefficient of kurtosis is 1.23. Hence, the centered coefficient is given by:
.23.1)()(
26
13572 −
−−+−
=EE
EEEEKR
While investigating how to test light-tailed distributions against heavy-tailed distributions, Hogg
(1972, 1974) found that the following measure of kurtosis performs better than the traditional
measure 1KR in detecting heavy-tailed distributions: ββ
αα
LULU
−−
where )( αα LU is the average of
the upper (lower) α quantiles defined as: ,)(1 1
1
1∫ −
−=αα α
dyyFU ,)(1
0
1∫ −=α
α αdyyFL for
).1,0(∈α According to Hogg’s simulation experiments, α = 0.05 and β = 0.5 gave the most
satisfactory results. Here, we adopt these values for α and β . For ),1,0(N we have
06.205.005.0 =−= LU and ,80.05.05.0 =−= LU implying that the Hogg coefficient with α = 0.05
and β = 0.5 is 2.59. Hence, the centered Hogg coefficient is given by:
.59.25.05.0
05.005.03 −
−−
=LULU
KR
Another interesting measure based on quantiles has been used in Crow and Siddiqui (1967),
which is given by )()1()()1(
11
11
ββαα
−−
−−
−−+−
FFFF
for ).1,0(, ∈βα Their choices for α and β are 0.025 and
0.25 respectively. For these values, we obtain 96.1)025.0()975.0( 11 =−= −− FF and
68.0)25.0()75.0( 11 −=−= −− FF for N(0,1) and the coefficient is 2.91. Hence, the centered
coefficient is:
.91.2)25.0()75.0(
)025.0()975.0(11
11
4 −−+
=−−
−−
FFFF
KR
7
3. Monte Carlo Simulations
In this section we conduct Monte Carlo simulations designed to investigate how robust the
alternative measures of skewness and kurtosis are in finite samples. The simulations were carried
out on a 700MHz PC using MATLAB. The random number generator used in the simulations is
that from the MATLAB Statistics Toolbox.
We choose three symmetric distributions and one non-symmetric distribution for our simulation
study. Our symmetric distributions are the standard normal distribution N(0,1), and the Student
−t distributions with 10 and 5 degrees of freedom [T-10, T-5]. These represent moderate, heavy
and very heavy tailed distributions. For the non-symmetric distribution we use the log-normal
distribution with µ = 1, σ = 0.4 (shifted by 2)4.0(5.01+− e so that the mean is zero) denoted Log-
N(1,0.4). For a range of values for N (50, 250, 500, 1000, 2500 and 5000), we generate Ntty ,...,2,1}{ =
using the four distributions and calculate the various measures of skewness and kurtosis discussed
in the previous section. The number of replications for each experiment is 1000. The true values of
the skewness and kurtosis measures for the various distributions are provided in Table 1. If our
statistics are consistent, then their Monte Carlo distributions should collapse around these values as
∞→N .
The simulation results are reported in Figures 1-8. Each figure is divided into two sections: A
and B. Figure A shows smoothed histograms of estimated coefficients of skewness or kurtosis for
the 5 different sample sizes and 4 data generating distributions. We use a kernel density method to
smooth the histograms. In order to provide additional information, Figure B displays box-plots for
the corresponding smoothed histograms. As usual, each box represents the lower quartile, median,
and upper quartile values. The whiskers are lines extending from each end of the box and their
length is chosen to be the same as the length of the corresponding box (i.e. the inter-quartile range).
The number at the end of each whisker is the number of observations beyond the end of the
whiskers.
As expected, the performance of 1SK (Figure 1) deteriorates as the distribution moves from
N(0,1) to T-10 and T-5. The sampling distributions tend to have large dispersion and the number of
observations outside the whiskers is noticeably increasing. For the lognormal case, 1SK is not a
good measure when N is small, which is indicated by the fact that the center of the box-plot is
quite different from the limiting value (1.32), although it does move towards the limiting value as
8
∞→N . These problems, however, are not present in the other robust measures, 2SK (Figure 2),
3SK (Figure 3), 4SK (Figure 4). That is, the sampling distributions are fairly stable and similar
across various distributions and also, for the lognormal case, the center of the box-plot stays near
the limiting value even for N = 50.
The performance of 1KR (Figure 5) is even worse than 1SK as the distribution moves from
N(0,1) to T-10 and T-5. For the T-5 case, the center of the box-plot is still far away from the true
value 6 even for N = 5000. The small sample bias of 1KR for Log-N(1,0.4) is also evident in the
box-plot. Other robust estimates ( 2KR and 4KR ) do not exhibit these problems at all. The
sampling distribution of 3KR indicates that there is a small finite sample bias when the number of
observations is less than 1000.
Next we add a single outlier to the same set of generated random numbers Ntty ,...,2,1}{ = and
calculate the same set of measures in order to see the impact of the single outlier on the various
measures. The outlier is constructed to occur at time )1,0(∈τ . In order to inject some realism into
the simulation study, we use the daily S&P500 index to calculate the size and location of the
outlier. The sample period is January 1, 1982 through June 29, 2001 with 5085. The largest outlier
(-20.41%) is caused by the 1987 stock market crash. The timing of the crash in this sample period
is τ = 0.3 (the location of the outlier in the sample divided by the total number of observations).
We calculate the 25th percentile of the sampling distribution of the S&P index. It turns out that the
25th percentile is -0.42. The size of the outlier relative to the 25th percentile is then given by
62.4842.0/41.20)25.0(/ 1][ =−−== −Fym Tτ . Therefore, we first generate random numbers
Ntty ,...,2,1}{ = and calculate the 25th percentile ).25.0(1−F The outlier is then )25.0(1−mF and we
replace the observation ]3.0[ Ty with the outlier.
The results are reported in Figures 9-16. The impact of one outlier on 1SK is clearly visible in
Figure 9. Note that the center of the sampling distributions of 1SK is moving toward to zero (the
true value for all symmetric distributions) once N is greater than 500, but even for N = 5000, the
center is far from zero. No whiskers from the box-plots contain the true value 0 in their upper tails.
The maximum impact occurs approximately when N = 500. In that case the center of the box-
plots is around between -13 and -10. In contrast, the outlier has no impact on 2SK at all. This is
expected since 2SK is based on quantiles whose values are not changed by a single observation.
9
On the other hand, we see the moving-box type of convergence for the other robust skewness
estimators ( 3SK and 4SK ). This is because these measures involve µ and σ , and the outlier must
have some impact on the sample mean and sample standard deviation. Even for N = 5000 the
medians of the sampling distributions of 3SK and 4SK for the symmetric generating distributions
are slightly less than zero. If one does not want to have any impact of a single outlier when
measuring skewness, then 2SK is preferable. However, if one wishes to take into account the
outlier in the calculation, but does not want to have as severe a distortion as appears in 1SK , then
3SK or 4SK might be a better measure.
The impact of the single outlier on 1KR is truly spectacular (Figure 13). The effect is
maximized around at N = 1000 in which the center of box-plots are between 150 and 300 for
various distributions when the true value should be between 0 and 6. Once N becomes larger than
1000, 1KR converges towards its true value, but even with N = 5000 the medians are still between
80 and 200. The results indicate that it may not be possible to attach any meaningful interpretation
to a large value of 1KR even when there are sufficiently many observations. The Moors coefficient
2KR , on the other hand, is not influenced by the outlier at all (Figure 14). For the same reason as
for 2SK , this is because 2KR is based on octiles, which are not affected by a single observation.
The centered Hogg coefficient 3KR is mildly influenced by the outlier, especially when the number
of observations is very small, but the influence disappears quickly as N increases (Figure 15). The
source of the influence is the terms 05.0L and 5.0L . As explained before, αL is the average of the
lower α percentile tails. Hence, when N is small (e.g. 50), the outlier becomes one of the
percentiles and enters the calculation of αL . On the other hand, the impact on 4KR is present only
when N = 50 (Figure 16). This is because the term )025.0(1−F in the formula is equal to the
outlier when the number of observations is very small. The same comment can be applied to the
choice of kurtosis measures: 2KR completely ignores a single outlier while 3KR and 4KR reflects
to some degree the presence of an outlier without much distortion.
If an outlier occurs only once and its size does not depend on the sample size (N ), then its
impact on any statistic will eventually disappear as the sample size goes to infinity. This must be
true for even 1SK or 1KR , despite their being severely degraded in finite samples. In reality,
however, we tend to find that large outliers recur through time: for example, the 1987 stock market
10
crash and the 1998 Asian crisis. One way of modelling this phenomenon is to use a mixture
distribution. Suppose that a process }{ ty is generated from ),( 11 σµD with probability p and
from ),( 122 γσσµ =D with probability p−1 . If the probability p is very close to one and 2σ is
fairly large compared to 1σ , then the process has recurring outliers with probability p−1 through
time. In our simulations we determine the values of p and γ , again using the daily S&P index.
We treat a change larger than 7% per day as an outlier. Over the sample period, there are six
observations whose absolute values are greater than 7%. Our estimate for p is then given by
9988.05085/5079ˆ ==p and p̂1− = 0.0012. The sample standard devia tion of the sample
without the six outliers is 0.94, and the sample standard deviation of the six outliers is 9.99. Hence,
the ratio is 63.1094.0/99.9 = . The sample mean of the six outliers is -7.322. Taking into account
these estimates, we choose the following set of parameter values for our simulations: ,01 =µ
11 =σ , ,72 −=µ 10=γ and .9988.0=p Hence, the random numbers are generated by
)1,0(9988.0 D + )10,7(0012.0 −D using the same four distributions for D . We present the true
values of the skewness and kurtosis measures for these mixture distributions in Table 2. We
provide a short discussion of how to obtain these values in the Appendix.
The simulation results are displayed in Figures 17-24. The behaviour of 1SK (Figure 17) is
quite different from that of 1SK with a single outlier in that the dispersion of sampling distributions
increases as the sample size becomes larger. This may seem surprising because we usually expect
the dispersion of a sampling distribution to shrink as N increases. In the single outlier case we
guaranteed the occurrence of the outlier in each sample regardless of the value of N . In the
mixture distribut ion case, on the other hand, when N is small, the sample may not have any
outliers due to the high value of .9988.0=p This may explain why the dispersion of the sampling
distribution of 1SK is small when N is small, and becomes larger as N increases. In most
distributions, the center of box-plots converges to the true value rather slowly: with N = 5000 it is
still far from the true value; -2.27 for N(0,1), -2.00 for T-10, -1.71 for T-5 and 0.52 for Log-
N(1,0.4). The other robust skewness measures ( ,2SK ,3SK 4SK ) are in general not influenced by
this type of outlier: the dispersion is quite small even for small N and there is no finite sample
bias. As for 1SK , the sampling distributions of 1KR (Figure 21) tend to have larger dispersion as
N approaches 5000. Two other measures ( ,2KR 4KR ) are robust to these recurring outliers and
11
their medians converge to the true limiting values reasonably quickly (Figures 22 and 24). The
Hogg coefficient 3KR (Figure 23) has a finite sample bias for small N , but this disappears once
the number of observations becomes larger than 500.
4. Application: Skewness and Kurtosis of the S&P500 Index
In this section we apply conventional and robust measures of skewness and kurtosis to the same
S&P500 index data described in the previous section. The sample period of January 1, 1982
through Jun 29, 2001 yields 5085 observations, and the unit is percent return.
First we compute the conventional measures of skewness and kurtosis using all observations.
For this sample period, we have 1SK = -2.39 and 1KR = 53.62, which is consistent with the
previous findings in the literature; i.e. negative skewness and excess kurtosis. Next we use robust
measures to estimate skewness and kurtosis. The results are displayed in the first column of Table
3. The outcomes are interesting in that all but one robust estimate support the opposite
characterization of skewness and kurtosis. All the robust skewness measures are pretty close to
zero and hence indicate that there is little skewness in the distribution of the S&P500 index. Given
that all robust kurtosis measure iKR ( i = 2,3,4) are centered by the values for N(0,1), positive
values ( 2KR = 0.28, 3KR = 0.77, 4KR = 1.16) indicate that there exists excess kurtosis, but this is
considerably more mild than usually thought.
Next we re-compute all the statistics after removing the single observation corresponding the
1987 stock market crash. The results are reported in the second column of Table 3. The values for
1SK and 1KR are dramatically reduced to 1SK = -0.26 and 1KR = 6.80, but the other robust
measures hardly change. This clearly shows that the single observation must be very influential in
the calculation of 1SK and 1KR , which is consistent with what we find in our simulations. On that
basis, we may argue that it is difficult to attach a meaningful interpretation to the value of 1KR
(53.62) calculated using all observations. Finally, as we have done in the simulations, we remove
the six observations whose absolute values are larger than 7%. The results are in the third column
of Table 3. Not only the conventional measures become substantially smaller ( 1SK = -0.04 and
1KR = 3.43) as in the case where the single crash observation is removed, but also they indicate
that there exist no negative skewness and the magnitude of kurtosis is not as large as previously
12
believed. The implication of all other robust measures are qualitatively the same; that is, no
negative skewness and quite mild kurtosis.
5. Conclusion
The use of robust measures of skewness and kurtosis reveals interesting evidence, which is in sharp
contrast to what heretofore has been firmly regarded as true in the finance literature. Rather than
arguing that we have obtained definite evidence to refute the long-believed stylized facts about
skewness and kurtosis, we hope that the current paper will serve as a starting point for further
constructive research on these important issues. We do propose however, that the standard
measures of skewness and kurtosis be viewed with skepticism, that the robust measures described
here be routinely computed, and, finally, that it may be more productive to think of the S&P500
index returns studied here as being better described as a mixture containing a predominant
component that is nearly symmetric with mild kurtosis and a relatively rare component that
generates highly extreme behaviour. Viewing financial markets in this way further suggests that
useful extensions of asset pricing models now embodying only skewness and kurtosis may be
obtained by accommodating mixtures similar to those discussed here.
13
Appendix
Everitt and Hand (1981) provided the first five central moments of a two-component univariate
normal mixture. Extending their results, we consider the following mixture distribution.
21 )1( gppgf −+=
where ig has mean iµ and variance ,2iσ and is assumed to have up to 4th moment. Let
∫= dxxxf )(µ and ∫ −= dxxfxv rr )()( µ . Then it can be proved that
21 )1( µµµ pp −+=
))(1()( 22
22
21
212 δσδσ +−++= ppv
)3)(1()3( 32
222
322
31
211
3113 δσδσδσδσ ++−+++= SpSpv
)64)3)((1()64)3(( 42
22
22
3222
422
41
21
21
3111
4113 δσδσδσδσδσδσ ++++−+++++= SKpSKpv
where µµδ −= ii , and iS and iK are the skewness and kurtosis of ig respectively. Moreover, let
1X and 2X be the random variables governed by 1g and 2g . In our simulation set-up, we have
222 µγ −= XX . Then it is easily seen that both skewness and kurtosis are invariant under this
linear transformation. Hence, we have 21 SS = and 21 KK = , and the values of 1S and 1K are in
Table 1. All central moments up to 4v can be now calculated in order to obtain the skewness and
kurtosis of the corresponding mixture distribution.
14
References
Bates, D.S. (1996), "Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in
Deutsche Mark Options," Review of Financial Studies, 9, 69-107.
Bowley, A.L. (1920), Elements of Statistics, New York: Charles Scribner's Sons.
Crow, E.L. and Siddiqui, M.M. (1967), "Robust Estimation of Location,'' Journal of the American
Statistical Association, 62, 353-389.
Groeneveld, R.A. and Meeden, G. (1984), "Measuring Skewness and Kurtosis,'' The Statistician,
33, 391-399.
Everitt, B.S. and Hand, D.J. (1981), Finite Mixture Distributions, London, New York: Chapman
and Hall.
Harvey, C.R. and Siddique, A. (1999), "Autoregressive Conditional Skewness," Journal of
Financial and Quantitative Analysis, 34, 465-487.
Harvey, C.R. and Siddique, A. (2000), "Conditional Skewness in Asset Pricing Tests," Journal of
Finance, LV, 1263-1295.
Hinkley, D.V. (1975), "On Power Transformations to Symmetry," Biometrika, 62, 101-111.
Hogg, R.V. (1972), "More Light on the Kurtosis and Related Statistics," Journal of the American
Statistical Association, 67, 422-424.
Hogg, R.V. (1974), "Adaptive Robust Procedures: A Partial Review and Some Suggestions for
Future Applications and Theory," Journal of The American Statistical Association, 69, 909-
923.
Hwang, S. and Satchell, S.E. (1999), "Modelling Emerging Market Risk Premia Using Higher
Moments," International Journal of Finance and Economics, 4, 271-296.
Jorion, P. (1988), "On Jump Processes in the Foreign Exchange and Stock Markets," Review of
Financial Studies, 1, 27-445.
Kendall, M.G. and Stuart, A. (1977), The Advanced Theory of Statistics, Vol. 1, London: Charles
Griffin and Co.
Moors, J. J. A. (1988), "A Quantile Alternative for Kurtosis," The Statistician, 37, 25-32.
Zwet, W.R. van (1964), "Convex Transformations of Random Variables," Math. Centrum,
Amsterdam.
15
Table 1. Values of skewness and kurtosis for various distributions
(no outlier and single outlier cases)
N(0,1) T-10 T-5 Log-N(1,0.4)
1SK 0 0 0 1.32
2SK 0 0 0 0.13
3SK 0 0 0 0.25
4SK 0 0 0 0.18
1KR 3 1 6 3.26
2KR 1.23 0.04 0.10 0.04
3KR 2.59 0.20 0.46 0.19
4KR 2.91 0.28 0.63 0.27
Table 2. Values of skewness and kurtosis for various distributions
(mixture distribution case)
N(0,1) T-10 T-5 Log-N(1,0.4)
1SK -2.27 -2.00 -1.71 0.52
2SK 0 0 0 0.14
3SK -0.01 -0.01 -0.01 0.24
4SK -0.01 -0.01 -0.01 0.17
1KR 52.76 57.33 101.47 49.18
2KR 0.01 0.04 0.10 0.40
3KR 0.09 0.29 0.53 0.27
4KR 0.01 0.28 0.66 0.27
16
Table 3. Skewness and kurtosis of the S&P500 Index
Using all observations Without the 1987
crash observation
Without the 6
observations ( ≥ 7%)
1SK -2.39 -0.26 -0.04
2SK 0.08 0.08 0.08
3SK 0.04 0.04 0.05
4SK 0.02 0.03 0.03
1KR 53.62 6.80 3.44
2KR 0.28 0.28 0.28
3KR 0.77 0.73 0.67
4KR 1.16 1.16 1.13
1
Figure 1A. Sampling distributions of 1SK (smoothed histograms): no outlier case
Figure 1B. Sampling distributions of 1SK (Box-plots): no outlier case
-2 0 20
5
10N(0,1)
N=5
0
-2 0 20
5
10T-10
-2 0 20
5
10T-5
-2 0 20
5
10Log-N(1,0.4)
-2 0 20
5
10
N=250
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
N=500
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
N=1
000
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
N=2500
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
N=5
000
-2 0 20
5
10
-2 0 20
5
10
-2 0 20
5
10
-2 -1 0 1 2 3
50
250
500
1000
2500
5000
N(0,1)
29 27
19 23
14 16
17 19
20 23
20 25
-2 -1 0 1 2 3
50
250
500
1000
2500
5000
T-10
29 38
52 39
19 32
31 36
35 21
26 21
-2 -1 0 1 2 3
50
250
500
1000
2500
5000
T-5
52 43
47 46
52 63
52 62
42 64
43 51
-2 -1 0 1 2 3
50
250
500
1000
2500
5000
Log-N(1,0.4)
3 62
3 61
11 72
7 46
8 51
7 47
2
Figure 2A. Sampling distributions of 2SK (Smoothed histograms): no outlier case
Figure 2B. Sampling distributions of 2SK (Box-plots): no outlier case
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
23 16
19 25
23 21
19 23
26 22
21 21
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
11 11
16 9
19 26
27 22
26 26
29 26
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
15 28
29 28
19 22
20 16
23 24
32 32
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
19 19
20 13
14 29
19 20
31 32
16 21
-0.5 0 0.50
10
20N(0,1)
N = 50
-0.5 0 0.50
10
20T-10
-0.5 0 0.50
10
20T-5
-0.5 0 0.50
10
20Log-N(1,0.4)
-0.5 0 0.50
10
20
N = 250
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N = 500
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N = 100
0
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N = 250
0
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N = 500
0
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
3
Figure 3A. Sampling distributions of 3SK (Smoothed histograms): no outlier case
Figure 3B. Sampling distributions of 3SK (Box-plots): no outlier case
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
15 18
24 21
18 32
17 27
29 25
11 14
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
16 17
18 20
9 23
25 35
16 22
9 16
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
29 16
20 18
23 23
22 21
28 38
22 20
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
25 8
27 22
26 22
23 19
16 25
22 30
-0.5 0 0.50
10
20
30N(0,1)
N = 50
-0.5 0 0.50
10
20
30T-10
-0.5 0 0.50
10
20
30T-5
-0.5 0 0.50
10
20
30Log-N(1,0.4)
-0.5 0 0.50
10
20
30
N = 250
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
N = 500
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
N = 100
0
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
N = 250
0
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
N = 500
0
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
-0.5 0 0.50
10
20
30
4
Figure 4A. Sampling distributions of 4SK (Smoothed histograms): no outlier case
Figure 4B. Sampling distributions of 4SK (Box-plots): no outlier case
-0.5 0 0.50
20
N(0,1)
N = 50
-0.5 0 0.50
20
T-10
-0.5 0 0.50
20
T-5
-0.5 0 0.50
20
Log-N(1,0.4)
-0.5 0 0.50
20
N = 250
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
N = 500
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
N = 100
0
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
N = 250
0
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
N = 500
0
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.50
20
-0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
16 19
23 21
19 34
18 27
29 26
11 13
-0.5 0 0.5
50
250
500
1000
2500
5000
T-10
17 20
16 20
9 22
26 35
16 22
9 17
-0.5 0 0.5
50
250
500
1000
2500
5000
T-5
23 16
20 17
22 20
22 20
28 37
22 17
-0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
27 8
28 17
23 20
19 18
16 29
21 30
5
Figure 5A. Sampling distributions of 1KR (Smoothed histograms): no outlier case
Figure 5B. Sampling distributions of 1KR (Box-plots): no outlier case
-2 0 2 4 6 80
5
N(0,1)
N=5
0
-2 0 2 4 6 80
5
T-10
-2 0 2 4 6 80
5
T-5
-2 0 2 4 6 80
5
Log-N(1,0.4)
-2 0 2 4 6 80
5
N=2
50
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
N=5
00
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
N=1
000
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
N=2
500
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
N=5
000
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 80
5
-2 0 2 4 6 8
50
250
500
1000
2500
5000
N(0,1)
0 78
5 54
3 33
10 29
13 30
17 23
-2 0 2 4 6 8
50
250
500
1000
2500
5000
T-10
0 73
1 93
0 78
2 80
4 76
2 57
-2 0 2 4 6 8
50
250
500
1000
2500
5000
T-5
0 101
0 92
0 124
0 122
0 93
0 93
-2 0 2 4 6 8
50
250
500
1000
2500
5000
Log-N(1,0.4)
0 104
0 90
0 100
0 74
0 82
1 79
6
Figure 6A. Sampling distributions of 2KR (Smoothed histograms): no outlier case
Figure 6B. Sampling distributions of 2KR (Box-plots): no outlier case
-1 0 10
5
10
15N(0,1)
N=50
-1 0 10
5
10
15T-10
-1 0 10
5
10
15T-5
-1 0 10
5
10
15Log-N(1,0.4)
-1 0 10
5
10
15
N=2
50
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
N=500
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
N=1000
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
N=2500
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
N=5000
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 0 10
5
10
15
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
4 41
10 29
13 28
10 29
22 31
32 31
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
8 42
13 27
10 26
14 26
22 34
14 28
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
0 39
5 30
8 19
14 26
24 26
20 22
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
8 47
10 30
15 30
18 35
14 24
13 25
7
Figure 7A. Sampling distributions of 3KR (Smoothed histograms): no outlier case
Figure 7B. Sampling distributions of 3KR (Box-plots): no outlier case
-1 0 10
10
N(0,1)
N=50
-1 0 10
10
T-10
-1 0 10
10
T-5
-1 0 10
10
Log-N(1,0.4)
-1 0 10
10
N=2
50
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=500
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=1
000
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=2
500
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=5
000
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
15 36
8 25
15 15
21 24
28 19
19 19
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
6 36
10 27
14 27
16 16
27 24
25 22
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
4 40
18 37
16 41
19 34
17 28
13 25
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
6 39
8 31
19 24
22 16
22 28
28 17
8
Figure 8A. Sampling distributions of 4KR (Smoothed histograms): no outlier case
Figure 8B. Sampling distributions of 4KR (Box-plots): no outlier case
-1 0 1 2
50
250
500
1000
2500
5000
N(0,1)
4 43
14 41
16 28
15 23
21 28
16 11
-1 0 1 2
50
250
500
1000
2500
5000
T-10
2 48
15 42
7 26
12 16
21 24
21 16
-1 0 1 2
50
250
500
1000
2500
5000
T-5
1 65
8 32
13 34
16 25
9 23
18 30
-1 0 1 2
50
250
500
1000
2500
5000
Log-N(1,0.4)
0 40
13 56
8 31
18 21
22 18
14 15
-1 0 1 20
5
N(0,1)
N=5
0
-1 0 1 20
5
T-10
-1 0 1 20
5
T-5
-1 0 1 20
5
Log-N(1,0.4)
-1 0 1 20
5
N=2
50
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=5
00
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=1
000
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=2
500
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=5
000
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
1
Figure 9A. Sampling distributions of 1SK (smoothed histograms): single outlier case
Figure 9B. Sampling distributions of 1SK (Box-plots): single outlier case
-15 -10 -50
1
2
N(0,1)
N=5
0
-15 -10 -50
1
2
T-10
-15 -10 -50
1
2
T-5
-15 -10 -50
1
2
Log-N(1,0.4)
-15 -10 -50
1
2
N=2
50
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
N=5
00
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
N=1
000
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
N=2
500
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
N=5
000
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -50
1
2
-15 -10 -5
50
250
500
1000
2500
5000
N(0,1)
0 103
3 40
22 42
15 19
15 25
20 16
-15 -10 -5
50
250
500
1000
2500
5000
T-10
0 83
17 52
10 32
15 25
28 26
19 14
-15 -10 -5
50
250
500
1000
2500
5000
T-5
0 102
5 34
13 35
13 20
19 36
32 44
-15 -10 -5
50
250
500
1000
2500
5000
Log-N(1,0.4)
0 87
7 40
7 26
12 24
23 34
29 34
2
Figure 10A. Sampling distributions of 2SK (Smoothed histograms): single outlier case
Figure 10B. Sampling distributions of 2SK (Box-plots): single outlier case
-0.5 0 0.50
10
20
N(0,1)
N=5
0
-0.5 0 0.50
10
20
T-10
-0.5 0 0.50
10
20
T-5
-0.5 0 0.50
10
20
Log-N(1,0.4)
-0.5 0 0.50
10
20
N=2
50
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
00
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=1
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=2
500
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.4 -0.2 0 0.2 0.4 0.6
50
250
500
1000
2500
5000
N(0,1)
21 14
18 22
24 19
21 27
26 22
22 21
-0.4 -0.2 0 0.2 0.4 0.6
50
250
500
1000
2500
5000
T-10
14 21
18 11
20 21
23 21
25 24
32 29
-0.4 -0.2 0 0.2 0.4 0.6
50
250
500
1000
2500
5000
T-5
24 37
25 21
19 21
22 18
26 28
33 33
-0.4 -0.2 0 0.2 0.4 0.6
50
250
500
1000
2500
5000
Log-N(1,0.4)
19 12
20 12
15 29
21 19
30 30
16 20
3
Figure 11A. Sampling distributions of 3SK (Smoothed histograms): single outlier case
Figure 11B. Sampling distributions of 3SK (Box-plots): single outlier case
-1 -0.5 0 0.50
10
20
30N(0,1)
N = 50
-1 -0.5 0 0.50
10
20
30T-10
-1 -0.5 0 0.50
10
20
30T-5
-1 -0.5 0 0.50
10
20
30Log-N(1,0.4)
-1 -0.5 0 0.50
10
20
30
N = 250
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
N = 50
0
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
N = 100
0
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
N = 250
0
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
N = 500
0
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.50
10
20
30
-1 -0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
16 44
25 28
14 31
18 25
29 27
11 10
-1 -0.5 0 0.5
50
250
500
1000
2500
5000
T-10
10 29
23 25
11 21
23 27
18 21
10 14
-1 -0.5 0 0.5
50
250
500
1000
2500
5000
T-5
17 39
16 24
20 30
18 20
28 38
22 19
-1 -0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
10 29
17 19
21 19
26 17
17 26
22 30
4
Figure 12A. Sampling distributions of 4SK (Smoothed histograms): single outlier case
Figure 12B. Sampling distributions of 4SK (Box-plots): single outlier case
-0.5 0 0.50
20
40
N(0,1)
N = 50
-0.5 0 0.50
20
40
T-10
-0.5 0 0.50
20
40
T-5
-0.5 0 0.50
20
40
Log-N(1,0.4)
-0.5 0 0.50
20
40
N = 25
0
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
N = 50
0
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
N = 10
00
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
N = 25
00
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
N = 50
00
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.50
20
40
-0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
26 43
29 32
21 34
19 26
29 27
11 10
-0.5 0 0.5
50
250
500
1000
2500
5000
T-10
19 36
19 27
11 21
23 28
17 21
9 15
-0.5 0 0.5
50
250
500
1000
2500
5000
T-5
18 40
20 27
22 31
18 20
29 38
22 18
-0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
12 39
13 31
19 17
18 16
18 30
21 30
5
Figure 13A. Sampling distributions of 1KR (Smoothed histograms): single outlier case
Figure 13B. Sampling distributions of 1KR (Box-plots): single outlier case
0 2000
0.1
0.2
N(0,1)
N=50
0 2000
0.1
0.2
T-10
0 2000
0.1
0.2
T-5
0 2000
0.1
0.2
Log-N(1,0.4)
0 2000
0.1
0.2
N=2
50
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
N=5
00
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
N=1
000
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
N=2
500
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
N=5
000
0 2000
0.1
0.2
0 2000
0.1
0.2
0 2000
0.1
0.2
0 50 100 150 200 250 300 350
50
250
500
1000
2500
5000
N(0,1)
102 0
37 3
42 24
15 18
24 15
16 24
0 50 100 150 200 250 300 350
50
250
500
1000
2500
5000
T-10
82 0
48 19
28 15
21 16
23 31
15 23
0 50 100 150 200 250 300 350
50
250
500
1000
2500
5000
T-5
99 0
31 5
27 15
8 14
18 23
14 32
0 50 100 150 200 250 300 350
50
250
500
1000
2500
5000
Log-N(1,0.4)
84 0
35 7
25 9
22 14
28 27
23 28
6
Figure 14A. Sampling distributions of 2KR (Smoothed histograms): single outlier case
Figure 14B. Sampling distribut ions of 2KR (Box-plots): single outlier case
-1 0 10
10
N(0,1)
N=5
0
-1 0 10
10
T-10
-1 0 10
10
T-5
-1 0 10
10
Log-N(1,0.4)
-1 0 10
10
N=2
50
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=5
00
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=1
000
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=2
500
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 0 10
10
N=5
000
-1 0 10
10
-1 0 10
10
-1 0 10
10
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
1 35
10 37
16 31
11 26
20 29
28 27
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
7 45
11 20
7 28
19 29
22 33
15 27
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
2 44
6 26
10 28
13 27
25 23
18 21
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
7 50
10 28
20 31
19 38
14 24
12 26
7
Figure 15A. Sampling distributions of 3KR (Smoothed histograms): single outlier case
Figure 15B. Sampling distributions of 3KR (Box-plots): single outlier case
0 2 40
5
10
N(0,1)
N=5
0
0 2 40
5
10
T-10
0 2 40
5
10
T-5
0 2 40
5
10
Log-N(1,0.4)
0 2 40
5
10
N=2
50
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
N=5
00
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
N=1
000
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
N=2
500
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
N=5
000
0 2 40
5
10
0 2 40
5
10
0 2 40
5
10
-1 0 1 2 3 4
50
250
500
1000
2500
5000
N(0,1)
48 8
35 37
19 28
24 31
29 22
22 19
-1 0 1 2 3 4
50
250
500
1000
2500
5000
T-10
40 0
16 22
16 15
16 15
28 23
24 26
-1 0 1 2 3 4
50
250
500
1000
2500
5000
T-5
47 13
15 23
11 36
17 26
17 31
12 23
-1 0 1 2 3 4
50
250
500
1000
2500
5000
Log-N(1,0.4)
38 16
11 18
20 34
19 18
21 25
26 14
8
Figure 16A. Sampling distributions of 4KR (Smoothed histograms): single outlier case
Figure 16B. Sampling distributions of 4KR (Box-plots): single outlier case
0 5 100
5
N(0,1)
N=5
0
0 5 100
5
T-10
0 5 100
5
T-5
0 5 100
5
Log-N(1,0.4)
0 5 100
5
N=2
50
0 5 100
5
0 5 100
5
0 5 100
5
0 5 100
5
N=5
00
0 5 100
5
0 5 100
5
0 5 100
5
0 5 100
5
N=1
000
0 5 100
5
0 5 100
5
0 5 100
5
0 5 100
5
N=2
500
0 5 100
5
0 5 100
5
0 5 100
5
0 5 100
5
N=5
000
0 5 100
5
0 5 100
5
0 5 100
5
-2 0 2 4 6 8 10
50
250
500
1000
2500
5000
N(0,1)
20 31
7 31
13 22
17 22
21 27
15 11
-2 0 2 4 6 8 10
50
250
500
1000
2500
5000
T-10
19 29
23 42
8 26
11 19
22 23
19 15
-2 0 2 4 6 8 10
50
250
500
1000
2500
5000
T-5
23 41
8 36
15 30
17 30
8 22
19 28
-2 0 2 4 6 8 10
50
250
500
1000
2500
5000
Log-N(1,0.4)
8 38
14 53
11 33
18 20
22 18
15 15
1
Figure 17A. Sampling distributions of 1SK (smoothed histograms): mixture distribution case
Figure 17B. Sampling distributions of 1SK (Box-plots): mixture distribution case
-6 -4 -2 0 2 40
1
N(0,1)
N=5
0
-6 -4 -2 0 2 40
1
T-10
-6 -4 -2 0 2 40
1
T-5
-6 -4 -2 0 2 40
1
Log-N(1,0.4)
-6 -4 -2 0 2 40
1
N=2
50
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
N=5
00
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
N=1
000
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
N=2
500
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
N=5
000
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 40
1
-6 -4 -2 0 2 4
50
250
500
1000
2500
5000
N(0,1)
54 29
174 29
194 38
136 10
63 3
44 3
-6 -4 -2 0 2 4
50
250
500
1000
2500
5000
T-10
53 32
139 43
199 38
164 28
99 15
66 8
-6 -4 -2 0 2 4
50
250
500
1000
2500
5000
T-5
70 60
132 42
157 40
124 38
104 30
93 25
-6 -4 -2 0 2 4
50
250
500
1000
2500
5000
Log-N(1,0.4)
46 59
140 82
194 42
82 39
22 47
27 69
2
Figure 18A. Sampling distributions of 2SK (Smoothed histograms): mixture distribution case
Figure 18B. Sampling distributions of 2SK (Box-plots): mixture distribution case
-0.5 0 0.50
10
20N(0,1)
N=5
0
-0.5 0 0.50
10
20T-10
-0.5 0 0.50
10
20T-5
-0.5 0 0.50
10
20Log-N(1,0.4)
-0.5 0 0.50
10
20
N=2
50
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
00
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=1
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=2
500
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
10 15
26 31
18 15
27 29
33 29
17 16
-0.5 0 0.5
50
250
500
1000
2500
5000
T-10
18 20
21 24
23 21
20 25
32 22
21 19
-0.5 0 0.5
50
250
500
1000
2500
5000
T-5
25 29
23 17
26 27
19 24
18 13
25 21
-0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
21 14
16 19
23 15
17 13
23 18
21 17
3
Figure 19A. Sampling distributions of 3SK (Smoothed histograms): mixture distribution case
Figure 19B. Sampling distributions of 3SK (Box-plots): mixture distribution case
-0.5 0 0.50
10
20N(0,1)
N=5
0
-0.5 0 0.50
10
20T-10
-0.5 0 0.50
10
20T-5
-0.5 0 0.50
10
20Log-N(1,0.4)
-0.5 0 0.50
10
20
N=2
50
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
00
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=1
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=2
500
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
21 17
20 31
29 31
12 19
16 16
21 25
-0.5 0 0.5
50
250
500
1000
2500
5000
T-10
28 27
26 26
14 28
12 24
24 12
20 17
-0.5 0 0.5
50
250
500
1000
2500
5000
T-5
14 13
17 23
21 26
23 23
27 30
16 20
-0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
21 13
24 24
21 20
24 23
31 31
27 16
4
Figure 20A. Sampling distributions of 4SK (Smoothed histograms): mixture distribution case
Figure 20B. Sampling distributions of 4SK (Box-plots): mixture distribution case
-0.5 0 0.50
10
20N(0,1)
N=5
0
-0.5 0 0.50
10
20T-10
-0.5 0 0.50
10
20T-5
-0.5 0 0.50
10
20Log-N(1,0.4)
-0.5 0 0.50
10
20
N=2
50
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
00
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=1
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=2
500
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
N=5
000
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.50
10
20
-0.5 0 0.5
50
250
500
1000
2500
5000
N(0,1)
27 27
16 17
22 20
24 14
11 12
9 24
-0.5 0 0.5
50
250
500
1000
2500
5000
T-10
19 23
21 25
20 19
13 26
16 24
26 19
-0.5 0 0.5
50
250
500
1000
2500
5000
T-5
20 30
8 19
18 30
12 23
20 17
22 20
-0.5 0 0.5
50
250
500
1000
2500
5000
Log-N(1,0.4)
38 12
46 14
37 12
26 9
34 12
23 12
5
Figure 21A. Sampling distributions of 1KR (Smoothed histograms): mixture distribution case
Figure 21B. Sampling distributions of 1KR (Box-plots): mixture distribution case
0 20 400
0.5N(0,1)
N=5
0
0 20 400
0.5T-10
0 20 400
0.5T-5
0 20 400
0.5Log-N(1,0.4)
0 20 400
0.5
N=2
50
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
N=5
00
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
N=1
000
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
N=2
500
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
N=5
000
0 20 400
0.5
0 20 400
0.5
0 20 400
0.5
0 50 100
50
250
500
1000
2500
5000
N(0,1)
0 90
0 190
0 201
0 150
0 74
0 78
0 50 100
50
250
500
1000
2500
5000
T-10
0 108
0 182
0 211
0 159
0 117
0 96
0 50 100
50
250
500
1000
2500
5000
T-5
0 119
0 160
0 177
0 170
0 120
0 125
0 50 100
50
250
500
1000
2500
5000
Log-N(1,0.4)
0 115
0 175
0 189
0 109
0 66
0 78
6
Figure 22A. Sampling distributions of 2KR (Smoothed histograms): mixture distribution case
Figure 22B. Sampling distributions of 2KR (Box-plots): mixture distribution case
-1 0 10
5
10
N(0,1)
N=5
0
-1 0 10
5
10
T-10
-1 0 10
5
10
T-5
-1 0 10
5
10
Log-N(1,0.4)
-1 0 10
5
10
N=2
50
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
N=5
00
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
N=1
000
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
N=2
500
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
N=5
000
-1 0 10
5
10
-1 0 10
5
10
-1 0 10
5
10
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
7 45
7 38
7 28
9 32
13 18
27 26
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
5 48
11 34
10 23
18 32
22 23
21 27
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
10 49
11 36
11 35
22 28
9 19
22 20
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
6 39
8 33
10 25
23 30
20 27
32 28
7
Figure 23A. Sampling distributions of 3KR (Smoothed histograms): mixture distribution case
Figure 23B. Sampling distributions of 3KR (Box-plots): mixture distribution case
-1 0 10
5
N(0,1)
N=5
0
-1 0 10
5
T-10
-1 0 10
5
T-5
-1 0 10
5
Log-N(1,0.4)
-1 0 10
5
N=2
50
-1 0 10
5
-1 0 10
5
-1 0 10
5
-1 0 10
5
N=5
00
-1 0 10
5
-1 0 10
5
-1 0 10
5
-1 0 10
5
N=1
000
-1 0 10
5
-1 0 10
5
-1 0 10
5
-1 0 10
5
N=2
500
-1 0 10
5
-1 0 10
5
-1 0 10
5
-1 0 10
5
N=5
000
-1 0 10
5
-1 0 10
5
-1 0 10
5
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
N(0,1)
9 51
9 82
0 74
1 63
4 40
13 44
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-10
13 61
6 88
4 76
3 46
13 46
4 39
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
T-5
3 48
3 53
4 50
9 51
7 37
15 36
-1 -0.5 0 0.5 1
50
250
500
1000
2500
5000
Log-N(1,0.4)
5 60
2 75
1 56
7 38
14 28
15 30
8
Figure 24A. Sampling distributions of 4KR (Smoothed histograms): mixture distribution case
Figure 24B. Sampling distributions of 4KR (Box-plots): mixture distribution case
-1 0 1 20
5
N(0,1)
N=5
0
-1 0 1 20
5
T-10
-1 0 1 20
5
T-5
-1 0 1 20
5
Log-N(1,0.4)
-1 0 1 20
5
N=2
50
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=5
00
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=1
000
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=2
500
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
N=5
000
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 20
5
-1 0 1 2
50
250
500
1000
2500
5000
N(0,1)
1 57
12 45
14 30
19 31
25 40
32 27
-1 0 1 2
50
250
500
1000
2500
5000
T-10
1 54
9 45
14 37
15 22
19 15
16 26
-1 0 1 2
50
250
500
1000
2500
5000
T-5
0 58
5 34
14 31
22 34
8 26
19 28
-1 0 1 2
50
250
500
1000
2500
5000
Log-N(1,0.4)
0 72
9 35
8 25
15 33
12 27
22 26