Discussion Paper: 2014-004
Business School
Discipline of Finance
Discussion Paper
2014-004
“The Four Horsemen:
Heavy-tails, Negative Skew, Volatility
Clustering, Asymmetric Dependence”
David Allen, Pembroke College, University of Cambridge
Stephen Satchell, Trinity College, Cambridge / University of Sydney Business School
WORKING PAPER
December 4th, 2013
ABSTRACT
In the wake of the worst financial crisis since the Great Depression, there has been a
proliferation of new risk management and portfolio construction approaches. These
approaches endeavour to capture the “stylised facts” of financial asset returns: heavy tails,
negative skew, volatility clustering and asymmetric dependence. Many approaches
capture two or three characteristics, while capturing all four in a scalable framework
remains elusive. We propose a novel approach that captures all four stylised
characteristics using EGARCH, the skewed-t copula and extreme-value theory. Using
eight data sets we show the approach is superior to eight benchmark models in both a
VaR forecasting and a dynamic portfolio rebalancing framework. The approach generates
significant economic value relative to the 1/N rule and the Gaussian approach. We also
find that accounting for asymmetric dependence leads to a consistent improvement in
VaR prediction and out-of-sample portfolio performance, including lower drawdowns.
I. Introduction
Since 2008, 465 banks have failed in the United States alone, amounting to $687 bn. in total assets¹.
The failures at systemically important institutions, in particular Bear Stearns, Lehman Brothers, AIG,
Fannie Mae and Freddie Mac, reverberated around the global economy, wiped 23.4% off the net
wealth of American households between 2007 and 2009 (Kennickell, 2011) and precipitated the Great
Recession. The Financial Crisis Inquiry Commission concluded that rather than being due to
exogenous unavoidable events, the “dramatic failures of corporate governance and risk
management at many systemically important financial institutions were a key cause of this crisis”. In
2012, JP Morgan suffered a $5.8 bn. loss on a single derivative trade amounting to 25% of the firm’s
annual profits. The potential for trading losses to erode shareholder wealth and to trigger systemic
events has galvanised renewed interest in approaches to risk management and portfolio construction.
In the current work we develop a multivariate model that captures the four stylised facts of financial
asset returns: heavy tails, negative skew, volatility clustering, and asymmetric tail dependence. We
find the approach produces superior VaR forecasts and adds significant economic value in out-of-
sample tests relative to eight benchmark methodologies. This finding is robust across eight data sets
including a range of asset classes. The majority of the uplift in performance derives from accounting
for stochastic volatility, in line with the empirical work of Fleming, Kirby, and Ostdiek (2001,
2003) and Kirby and Ostdiek (2012) and the analytical work of Allen, Lizieri and Satchell (2013).
From the viewpoint of the corporate entity it may appear uncontroversial that prudent risk
management increases the value of a firm and is worthwhile. However, as McNeil, Frey, and
Embrechts (2005) point out, from a corporate finance perspective it is by no means obvious that, in a
world with perfect capital markets, including no informational asymmetries, taxes, transaction
costs, or bankruptcy costs, risk management should enhance shareholder value. If shareholders
have access to perfect capital markets then they can undertake their own risk management
transactions and will not pay a premium for the corporation to do so on their behalf. The potential
irrelevance of corporate risk management follows from the Modigliani-Miller (1958) theorem on
¹ Federal Deposit Insurance Corporation, Failed Bank List
capital structure and the value of the firm. In practice informational asymmetries and inferior access
to capital markets limit the ability of shareholders to undertake risk-management positions. Further,
risk management makes bankruptcy less likely and less costly.
Indeed, sensible capital allocation and risk management are widely seen as the essential core
competencies of a financial institution (Allen and Santomero, 1998). If capital reserves are excessive,
then shareholder returns, business and consumer credit, and economic growth are inhibited. If capital
reserves are too low, there is an increased risk of bank failures, erosion of shareholder value and a
contraction in economic growth. The potential for trading losses to erode shareholder wealth and
imperil the very survival of financial entities has been a recurring theme over the last two decades. In
1995, Barings Bank, the oldest merchant bank in the United Kingdom, was ruined by a $1.79 bn. loss
betting on equity futures. In 1998, Long Term Capital Management suffered a $5.85 bn. loss on
interest rate and equity derivatives prompting the Federal Reserve to coordinate a bailout. In 2006,
Amaranth Advisors collapsed after sustaining a $6.69 bn. loss on natural gas futures. In 2008,
Morgan Stanley lost $8.67 bn. on credit default swaps relating to the subprime market. More recently,
in 2012, JP Morgan suffered a $5.8 bn. loss on a single derivative trade amounting to 25% of the
firm’s annual profits.
From a regulatory perspective, the recent transition to Basel 2.5 has magnified the importance of
quantifying risk accurately. The Basel 2.5 reforms are part of the Basel Committee on Banking
Supervision’s (BCBS) comprehensive response to the 2008 financial crisis. Under Basel 2, total
required market risk capital was given by the sum of a market value-at-risk (VaR) component and a
standardised specific risk measure. Under Basel 2.5, risk capital must also cover a stressed VaR
measure and two incremental risk charges relating to unsecuritised credit positions and correlation
trades. The BCBS quantitative impact study estimates that the adoption of Basel 2.5 will lead to a
three-fold increase in market risk regulatory capital². The “traffic-light” system that operated under
Basel II is preserved under Basel 2.5, such that an excessive number of VaR model violations triggers
increased capital requirements and increased regulatory scrutiny.
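The traffic-light logic can be sketched in a few lines; the zone boundaries and yellow-zone capital add-ons below follow the BCBS backtesting framework for a 99% one-day VaR over a 250-day window, and are included for illustration only.

```python
def basel_zone(exceptions: int) -> str:
    """Map the number of 99% VaR exceptions observed over 250 trading days
    to the Basel traffic-light zone."""
    if exceptions <= 4:
        return "green"
    if exceptions <= 9:
        return "yellow"
    return "red"

# Capital add-ons to the VaR multiplier in the yellow zone (green adds 0.00,
# red adds 1.00), as tabulated in the BCBS backtesting framework:
YELLOW_ADD_ON = {5: 0.40, 6: 0.50, 7: 0.65, 8: 0.75, 9: 0.85}
```

A model producing, say, seven exceptions in a year would thus land in the yellow zone and attract an add-on of 0.65 to its capital multiplier.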
It is being increasingly recognised that financial markets may be inherently unstable. For a long time
the neoclassical view of markets held sway. The efficient market hypothesis (EMH) of Fama (1965)
maintains that agents are rational and that prices reflect all available information instantaneously.
Under the EMH, the market is always in equilibrium and speculative asset bubbles are not possible.
The Great Moderation in the variability of GDP observed in the United States (Stock and Watson,
2002) and most OECD countries from the mid-1980s onwards was seen by many as evidence of the
efficiency and stability of markets and the success of monetarism (Bernanke, 2005, Summers, 2005).
Since the late 1970s behavioural evidence has been accumulating that suggests that humans are not
the hyper-rational computing machines (Kahneman and Tversky, 1979) of economic textbooks and
that markets may not be perfectly efficient (Shiller, 1981)³. The Great Recession has led to a
resurgence of the idea that free-market systems are inherently unstable. Alan Greenspan admitted that
he assumed that rational firms would not expose themselves to annihilation and that the “whole
intellectual edifice had collapsed”⁴.
² Basel Committee on Banking Supervision, Analysis of the trading book quantitative impact study, October 2009
³ See section 2 for an expanded list of references
⁴ House Committee on Oversight and Government Reform
The idea that disequilibrium may be the rule rather than the exception dates back to Schumpeter
(1928) and Fisher (1933). Fisher (1933), formerly an apostle of the neoclassical school, recanted his
beliefs during the Great Depression, arguing that it is as absurd to assume that markets will stay in
equilibrium as it is “to assume that the Atlantic Ocean can ever be without a wave”. The work of
Minsky (1982), long dismissed by the neoclassical school, is now being taken seriously. Minsky
(1982) argues that periods of consistent economic growth lead to upward revisions in expectations, a
greater willingness to borrow and lend, the extrapolation of growth rates and in time asset bubbles. It
is in this sense that Minsky (1982) argues that “stability is destabilising”. The long history of booms
and busts in free-market systems, including the South Sea Bubble, the Tulip mania, the 1980s Savings
and Loan crisis, and the early 2000s technology bubble, to name but a few, is certainly consistent with this
view. If markets are inherently unstable, and periods of tranquillity presage periods of turbulence, we
argue it is all the more important to understand the strengths and weaknesses of risk management and
portfolio construction approaches.
The remainder of this paper is organised as follows. In Section II, we provide a survey of the
literature. In Section III we introduce the GSEV model. Section IV describes our eight benchmark
portfolio construction models and discusses our data and methodology for evaluating VaR forecasts,
and our dynamic rebalancing and performance evaluation frameworks. Section V discusses our
findings, and Section VI concludes.
II. Literature Survey
Gaussian approaches to risk management have been roundly criticised in the literature. Li’s (2000)
Gaussian copula model for CDOs has been described as the “formula that felled Wall Street”⁵ and as
“instrumental in causing the unfathomable losses that brought the world financial system to its
knees”⁶. The mainstay of modern portfolio theory, the mean-variance model, has been criticised in
both the finance literature and the financial press. DeMiguel, Garlappi, and Uppal (2009) conclude
that there are “many miles to go” before the promised benefits of optimal portfolio choice can be
realised out of sample. Taleb (2009) describes mean-variance as “hot air” and a “quack remedy”.
Even the Babylonian Talmud has contributed to the debate with the sage advice that “one should
always divide his wealth into three parts: a third in land, a third in merchandise, and a third ready to
hand”⁷. The criticisms of the mean-variance approach centre on the alternative maintained hypotheses
that investors have mean-variance utility or that portfolio returns follow the normal distribution
(Cootner, 1964 and Lintner, 1972) and are independently and identically distributed. A careful
reading of Markowitz (1952) however reveals no mention of the Gaussian distribution. In fact
Markowitz and Usmen (1996a, 1996b), in Markowitz’s sole investigation of the return generating
process, conclude that the log returns of the S&P 500 are well described by a student-t distribution
with between four and five degrees of freedom. Nor did Markowitz assume that investors have mean-
variance utility. Rather, Levy and Markowitz (1979) find that the mean-variance approach serves as a
robust approximation to a wide range of utility functions, a finding that has been widely replicated
(Pulley, 1981, Kroll, Levy and Markowitz, 1984, and Simaan, 1993).
Mandelbrot (1963) finds that commodity return distributions are heavy-tailed, an observation that
has since been made in every major investment class including equities (Fama, 1963), fixed income
⁵ Jones, S., The formula that felled Wall St, Financial Times, April 24th, 2009
⁶ Salmon, F., Recipe for Disaster: The formula that killed Wall Street, Wired magazine, 2009
⁷ This quote appeared in DeMiguel, Garlappi, and Uppal (2009) and is attributed to Rabbi Issac bar Aha in the
Babylonian Talmud: Tractate Baba Mezi’a, folio 42a
(Amin and Kat, 2003), currencies (Westerfield, 1977), REITs (Lizieri, Satchell, and Zhang, 2007), and
hedge funds (Agarwal and Naik, 2004). Further, it is well established that financial assets are often
negatively skewed, where large declines are more common than large rallies (Kraus and Litzenberger,
1976, Beedles, 1979, Alles and Kling, 1994, Harvey and Siddique, 1999). Mandelbrot (1963) also
identifies volatility clustering, where “large changes tend to be followed by large changes, of either
sign, and small changes tend to be followed by small changes”. Again this finding is pervasive across
asset classes including equities (Fama, 1965), fixed income (Weiss, 1984) and foreign exchange
(Baillie and Bollerslev, 1989). It was later shown that financial assets exhibit a “leverage” effect
where negative innovations lead to larger upward revisions in conditional volatility than downward
revisions (Black, 1976, Christie, 1982, Glosten, Jagannathan and Runkle, 1993, Hansen and Lunde,
2005)⁸. It is generally acknowledged that the increase in financial leverage alone is insufficient to
account for the observed increase in volatility following market downturns (Bollerslev, 2009), and
that behavioural factors may be at work. We also capture this effect within our approach.
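The leverage effect can be made concrete with a single step of an EGARCH(1,1) log-variance recursion; the parameter values below are purely illustrative and are not estimates from this paper.

```python
import math

# Illustrative EGARCH(1,1) parameters; gamma < 0 produces the leverage effect.
omega, alpha, gamma, beta = -0.1, 0.1, -0.08, 0.95

def next_log_var(log_var: float, z: float) -> float:
    """log sigma^2_{t+1} = omega + beta*log sigma^2_t
                           + alpha*(|z_t| - E|z|) + gamma*z_t."""
    e_abs = math.sqrt(2.0 / math.pi)  # E|z| for a standard normal shock
    return omega + beta * log_var + alpha * (abs(z) - e_abs) + gamma * z

# Starting from the same variance, a -2 sigma shock raises next-period
# conditional variance by more than a +2 sigma shock of equal magnitude.
after_up = next_log_var(0.0, 2.0)
after_down = next_log_var(0.0, -2.0)
```

The gap between the two updates is exactly −4γ in log-variance terms, so the sign of γ alone governs the asymmetry.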
More recently, the asymmetric tail dependence between assets has been identified, aligning with the
market adage that “when the market crashes all correlations go to one”. Erb, Harvey and Viskanta
(1994) show that the correlations between the G7 country indices are higher in down markets than in
up markets. Karolyi and Stulz (1996) find that the dependence between U.S. and Japanese stocks
increases during large shocks. Ang and Bekaert (2002) show that the correlations between
international equity indices tend to increase during volatile periods. The same pattern is evident within
countries. Longin and Solnik (2001) and Ang and Chen (2002) show that the dependence between
individual stocks and the aggregate market index is significantly higher for downside moves than for
upside moves⁹. Patton (2004) finds evidence of asymmetric dependence for indices of U.S. large and
small-cap portfolios. Hong, Tu and Zhou (2007) provide a model-free test for asymmetric correlation
and conclude that there is strong evidence for asymmetries for the “size” and “momentum” portfolios.
Beine, Cosma and Vermeulen (2010) argue that financial liberalisation has significantly increased
left-tail comovement in international equities. Lower-tail dependence reduces the ability of an agent to
protect against downside risk. Moreover, the agent that does not account for tail dependence will
structurally underestimate downside risk. Despite the recent advances in the understanding of the
dependence structure between assets, linear correlation and dependence are often used
interchangeably. Correlation, however, is only appropriate for elliptical distributions. To take an absurd
example, consider points on the unit circle centred at the origin: the Pearson correlation coefficient
will equal zero despite perfect dependence.
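The unit-circle example is easy to verify numerically; a minimal sketch:

```python
import numpy as np

# Points on the unit circle: y is perfectly (though non-linearly) dependent
# on x, yet the Pearson correlation coefficient is zero.
theta = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
x, y = np.cos(theta), np.sin(theta)
r = np.corrcoef(x, y)[0, 1]
```

Up to floating-point noise, `r` is zero even though knowing `x` pins `y` down to one of two values.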
Collectively these distributional characteristics are often referred to as “stylised facts” of financial
markets¹⁰:
I. Leptokurtosis or “heavy tails”
II. Negative skew
III. Heteroskedasticity and “volatility clustering”
IV. Tail dependence
There is a growing body of evidence indicating that the accuracy of VaR estimates and the
performance of portfolio construction techniques can be improved by accounting for the respective
stylised facts. Lucas and Klaassen (1998) show that the failure to account for heavy tails leads to the
underestimation of the true risk by 25-30% and overly aggressive portfolio allocations at the 1% VaR
⁸ It is generally acknowledged that the asymmetry is, however, not present for currencies
⁹ In particular for value and small-capitalisation stocks
¹⁰ See for example Cont (2001) or McNeil, Frey and Embrechts (2005)
confidence level; at the 5% VaR level the Gaussian assumption leads to an overestimation of risk and
overly conservative allocations. McNeil, Frey and Embrechts (2005) show that the standard
unconditional Gaussian approach systematically results in too many VaR violations at the 1% level¹¹
for a collection of equity indices. Harvey, Liechty, Liechty, and Muller (2010) show that
incorporating the effect of skew into the investor’s utility function leads to significant increases in
expected utility. This result mirrors the earlier results of Prakash, Chang, and Pactwa (2003) using a
Polynomial Goal Programming approach. Accounting for stochastic volatility has also been shown to
significantly improve outcomes for investors. Fleming, Kirby, and Ostdiek (2001) show that
“volatility timing” leads to a significant uplift in investor welfare that exceeds typical active
management fees. There is also evidence to suggest that accounting for the asymmetric dependence
between assets can improve investor welfare. Kole, Koedijk, and Verbeek (2007) show that the
Gaussian copula results in a significant underestimation of joint extreme downward realisations and
overestimates potential diversification. Alcock and Hatherley (2009) show that accounting for
asymmetric dependence leads to significant gains in economic value.
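The sign flip between the 1% and 5% levels reported by Lucas and Klaassen (1998) is easy to reproduce. In this sketch a unit-variance Student-t with five degrees of freedom (chosen for illustration, echoing Markowitz and Usmen, 1996) stands in for the heavy-tailed return distribution:

```python
import math
from scipy import stats

# 1% and 5% quantiles of a unit-variance Student-t(5) versus the standard
# normal: the Gaussian understates the 1% loss and overstates the 5% loss.
nu = 5
scale = math.sqrt((nu - 2) / nu)  # rescales the t to unit variance
quantiles = {p: (stats.norm.ppf(p), scale * stats.t.ppf(p, df=nu))
             for p in (0.01, 0.05)}
```

At the 1% level the t quantile lies further in the left tail than the Gaussian one (roughly −2.61 versus −2.33), while at the 5% level the ordering reverses, matching the direction of the misestimation described above.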
The deficiencies of the Gaussian approach have been well recognised for some time. Alan Greenspan
writes:
From the point of view of the risk manager, inappropriate use of the normal distribution
leads to an understatement of risk, which must be balanced against the significant
advantage of simplification. From the central bank’s corner, the consequences are even
more serious because we often need to concentrate on the left tail of the distribution in
formulating lender of last resort policies. Improving the characterisation of the
distribution of extreme values is of paramount importance.
Joint Central Banks Research Conference, 1995
It is also well established that investors like positive skew and dislike excess kurtosis. Harvey and
Siddique (2000) put forward an asset pricing model that incorporates skewness and find that an
investor may be willing to accept a negative return for high positive skewness. The popularity of
lottery tickets is an example of this preference. Dittmar (2002) employs nonlinear pricing kernels to
demonstrate that higher moments drive out the explanatory power of the Fama-French factors. Hwang
and Satchell (1999) find evidence that emerging market returns are better explained by incorporating
co-skewness and co-kurtosis. At the investor level, Mitton and Vorkink (2007) look at the allocations
of 60,000 accounts and find that investors systematically under-diversify to achieve positively skewed
portfolios.
At first blush it makes sense to abandon the mean-variance approach in favour of a more sophisticated
approach that accounts for departures from the i.i.d. Gaussian assumption and allows for investor
preferences for moments higher than order two. The mean-variance approach is of course deeply
embedded in the financial industry. Fabozzi, Focardi and Jonas (2007) survey 38 medium and large-
sized equity investment managers in North America and Europe totalling $4.3 trillion in assets under
management. The authors find that 97% of managers use variance to measure risk and that 83% of
managers employ mean-variance optimisation. The EDHEC European Practices Survey, 2008,
examines the investment behaviour of 229 investment managers including fund management firms,
pension funds, private banks, investment banks, family offices and consultants. The survey finds that
57% of investment managers use the absolute return variance as the risk objective when performing
¹¹ Approximately three times too many between 1996 and 2003 for a composite portfolio of the S&P 500, the
FTSE 100, and Swiss Market indices.
portfolio optimisation. For managers that set relative risk objectives, 76% use the tracking error,
which is the square root of the variance of excess returns. Of the managers that perform optimisation
subject to value at risk (VaR) and expected shortfall constraints, the majority use the normal
distribution to derive their estimates, which of course yields the same set of portfolios as mean-
variance analysis. The mean-variance approach is also a well understood, tractable approach with
closed form expressions for optimal portfolios, expected utility and the decomposition of risk. The
mean-variance approach is also a key building block for other pillars of modern financial theory,
including the capital asset pricing model. Before we abandon the mean-variance approach in favour of
an ostensibly more appropriate alternative it makes sense to quantify the loss in investor welfare from
assuming returns are normally distributed.
A number of modelling approaches capture two or three of the stylised facts; however, the holy grail
of all four in a scalable framework remains elusive. Nystrom and Skoglund (2002)¹² capture heavy
tails, negative skew and volatility clustering, but not asymmetric dependence¹³. Xiong (2010)¹⁴
accounts for heavy tails and skew, but ignores volatility clustering and asymmetric dependence.
Sortino (2010)¹⁵ captures heavy tails, skew and asymmetric tail dependence, but does not account for
volatility clustering. Hu and Kercheval (2007)¹⁶ account for skew, asymmetric dependence, volatility
clustering and semi-heavy tails¹⁷. Patton (2004)¹⁸ and Viebig and Poddig (2010)¹⁹ develop models that
account for all four stylised facts; however, generalising the Archimedean copula approach they use to
higher dimensions requires restrictive assumptions. Filtered Historical Simulation (FHS), proposed
by Barone-Adesi, Bourgoin and Giannopoulos (1998), captures all four stylised characteristics,
although the approach is non-parametric. The approach we propose captures all four stylised facts and
can be employed in high dimensions.
There is a vast literature on VaR estimation, and we do not discuss all of the findings here. There is a
consensus that models that account for stochastic volatility outperform static models (Pritsker, 2001,
Berkowitz and O’Brien, 2002, McAleer and da Veiga, 2008, Skoglund, Erdman, and Chen, 2010) and
that models that employ non-Gaussian innovations tend to outperform Gaussian models (McAleer and
da Veiga, 2008). Further, there is a general consensus that single index approaches tend to outperform
portfolio approaches (Berkowitz and O’Brien, 2002, Brooks and Persand, 2003, Bauwens, Laurent,
and Rombouts, 2006, Christoffersen, 2009, and McAleer, 2009)²⁰. Interestingly, the arguably more
intuitive historical simulation and its variant, the filtered historical simulation approach, are currently
the most widely used methods at commercial banks (Christoffersen, 2006, Pérignon and Smith, 2010).
While there is an abundance of work showing the superiority of novel downside portfolio construction
techniques in-sample, there is a relative paucity of robust out-of-sample evidence. The techniques
proposed in the literature tend to have a large number of parameters, so it is of little surprise that in-
sample the techniques perform well. As Patton (2004) notes, a common finding in the point
forecasting literature is that complex models often provide worse forecasts than simple misspecified
models (Swanson and White, 1997, Stock and Watson, 1999). It is critical to evaluate the out-of-
¹² GARCH/EVT/t-copula approach
¹³ The t-copula is radially symmetric, leading to the same degree of upper and lower tail dependence
¹⁴ Truncated Lévy-flight distribution
¹⁵ Historical simulation
¹⁶ GARCH-multivariate skew-t
¹⁷ Misiorek and Weron, 2010
¹⁸ Time-varying copulas and moments up to the fourth order that are functions of exogenous variables
¹⁹ GARCH/EVT/Archimedean-copula approach
²⁰ The results of McAleer and da Veiga (2008), however, are mixed
sample evidence to ensure the benefits of a technique are robust to the increase in estimation error that
tends to accompany an increase in the number of parameters. Brandt, Santa-Clara, and Valkanov (2009) argue that
“extending the traditional approach beyond first and second moments, when the investor’s utility
function is not quadratic, is practically impossible because it requires modelling not only the
conditional skewness and kurtosis of each stock but also the numerous high-order cross-moments”.
The bulk of existing studies exclusively present in-sample portfolio performance evidence (Campbell,
Huisman, Koedijk, 2000, Consigli, 2002, Nystrom and Skoglund, 2002b, Prakash, Chang, and
Pactwa, 2003, Tokat, Rachev, and Schwartz, 2003, Morton, Popova and Popova, 2005, Jarrow and
Zhao, 2006, Jondeau and Rockinger, 2006, Harvey, Liechty, Liechty, and Muller, 2010, Viebig and
Poddig, 2010, Sortino, 2010). In this sense, the literature concentrates on the demonstration of
techniques rather than providing hard evidence of efficacy.
The number of studies that examine the performance of downside portfolio construction approaches
out-of-sample is surprisingly low. Out-of-sample analyses typically involve the estimation of complex
multivariate distributions each period over the back-test window. The estimation of these models
often involves Markov-chain Monte Carlo or the Expectations-Maximisation algorithm and can be
both computationally demanding and time consuming. Guastaroba, Mansini, and Speranza (2009)
evaluate the computational burden posed by several prominent scenario generation models. As an
example, the authors find that it takes 60 minutes to estimate and then simulate 10,000 times from a
multivariate GARCH model with student-t innovations²¹. If we are re-estimating the models daily or
even weekly it is easy to see that meaningful out-of-sample tests can be prohibitively time consuming
even for a short back-test. We speculate that this may explain the small number of out-of-sample
studies in this important area.
Of the out-of-sample analyses, Patton (2004) finds that accounting for asymmetry and skewness using
skewed-t marginals and the rotated Gumbel copula leads to a small improvement in realised utility
relative to 1/N. Patton, using a stationary bootstrap approach, demonstrates that for all levels of risk
aversion the utility of the Gumbel copula approach is statistically significantly larger than the utility
of the Gaussian copula approach. Adler and Kritzman (2007) employ full-scale optimisation in
conjunction with S-shaped and bi-linear utility. Full-scale optimisation maximises utility directly and
makes no parametric assumptions, mitigating the effect of estimation error. The authors conclude that
the in-sample superiority of full-scale optimisation prevails out of sample. In a related approach
Brandt, Santa-Clara, and Valkanov (2009) address the portfolio construction problem by modelling
the weight of each asset as a function of its characteristics. The idea is to massively reduce the
dimensionality of the problem by maximising expected utility directly as a function of loadings on
common factors. The advantage of the Brandt et al. (2009) approach is that it circumvents the
estimation of complex multivariate distributions thereby reducing estimation error. Further, the
approach is also utility function agnostic. The disadvantage of the approach is that it does not lead to
downside risk estimates such as VaR and CVaR. Brandt et al. (2009) find that the characteristic-based
approach leads to substantial uplifts in utility that persists out of sample.
Alcock and Hatherley (2009) employ Gaussian marginals in combination with the Clayton copula.
The authors find that accounting for asymmetric dependence leads to a significant uplift in average
returns and statistically superior returns in down-markets. Guastaroba, Mansini, and Speranza (2009)
compare the performance of the standard bootstrap, the block bootstrap, multivariate GARCH and
historical simulation. The authors conclude that the block bootstrap consistently outperforms all the
other scenario generation techniques. Martellini and Ziemann (2010) use a Taylor series approach to
²¹ For a FTSE 100 universe with a Pentium III processor and 1 GB of RAM
investigate the effect of incorporating investor preferences for moments higher than two. The authors
find that incorporating higher moments does not improve investor welfare unless sophisticated
shrinkage estimators of higher moments are employed. Xiong (2010) uses the truncated Lévy-flight
distribution to capture skew and heavy tails. The standard Lévy-stable distribution was used by
Mandelbrot (1963) and Fama (1965) to model asset prices. The Lévy-stable distribution, however,
has an infinite variance, violating common sense and creating problems for the expected utility
framework. The truncated Lévy-flight distribution obviates this shortcoming. Xiong (2010) shows the
approach marginally outperforms mean-variance during the 2008 financial crisis.
The out-of-sample evidence also lacks depth. It is notable that of the seven key out-of-sample studies
discussed above not a single one uses more than one data set. Because financial data are inherently
noisy and the extreme market conditions that provide stress tests of an approach are by definition rare,
we argue for employing as much high quality data as possible. By way of illustration, consider a top
quartile fund manager with an information ratio²², defined as the annual excess return divided by the
annual tracking error²³, of 0.5²⁴. Using the standard-error formula given in Lo (2002) it is
straightforward to show that no fewer than 18 years of data are required to reject the hypothesis that
the true information ratio is zero. The information ratio is of course a Gaussian-based metric.
Intuitively, when returns and the performance metrics employed are non-Gaussian it will take even
longer to establish statistical significance, particularly when the gains of a given technique may be
marginal. Employing multiple data sets also limits the scope for the “cherry picking” of results.
Further, of the seven studies, five pertained to equities while only one included bonds, and another
included hedge fund indices. A further concern is the absence of tests to determine whether the
supposed uplifts in performance are statistically significant. In order to assert that a given approach is
superior to an existing benchmark approach, establishing statistical significance would appear
essential. The lack of tests for statistical significance is probably due to the lack of closed form
solutions for non-Gaussian performance metrics.
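The 18-year figure above follows from Lo's (2002) IID standard error, SE(ÎR) ≈ √((1 + IR²/2)/T), assuming one annual observation per year and a two-sided 5% test; the critical value is an assumption of this sketch.

```python
import math

# Observations (in years) needed to reject IR = 0 at the two-sided 5% level
# for a true information ratio of 0.5, using the IID standard error
# SE(IR_hat) ~ sqrt((1 + IR**2 / 2) / T) from Lo (2002).
ir, z = 0.5, 1.96
t_years = math.ceil(z ** 2 * (1.0 + ir ** 2 / 2.0) / ir ** 2)
```

Solving z·SE(ÎR) ≤ IR for T gives T ≥ 1.96² × 1.125 / 0.25 ≈ 17.3, which rounds up to 18 years.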
To establish statistical significance we follow Patton (2004) and use the stationary bootstrap
procedure of Politis and Romano (1994). A final criticism of the downside portfolio construction
literature is the lack of work comparing the prominent approaches on a like for like basis across the
same investment problems. The empirical research examining downside portfolio construction
techniques lacks the thoroughness of the literature analysing the mean-variance approach. For
example, DeMiguel, Garlappi, and Uppal (2009) examine fourteen extensions of the mean-variance
approach, assessing statistical significance across seven data sets. Kritzman, Page and Turkington
(2010) compare the performance of the mean-variance approach to 1/N out of sample across thirteen
data sets using seven asset classes.
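For reference, one resample under the stationary bootstrap of Politis and Romano (1994) can be sketched as follows; the mean block length 1/p is a tuning parameter chosen here for illustration.

```python
import numpy as np

def stationary_bootstrap(x: np.ndarray, p: float, rng: np.random.Generator) -> np.ndarray:
    """Draw one stationary-bootstrap resample of x (Politis and Romano, 1994).

    Blocks start at uniformly random positions and have geometric lengths
    with mean 1/p; indices wrap around the end of the sample, which keeps
    the resampled series stationary.
    """
    n = len(x)
    idx = []
    while len(idx) < n:
        start = int(rng.integers(n))    # random block start
        length = int(rng.geometric(p))  # geometric block length, >= 1
        idx.extend((start + j) % n for j in range(length))
    return x[np.asarray(idx[:n])]

rng = np.random.default_rng(0)
series = np.arange(100, dtype=float)
sample = stationary_bootstrap(series, p=0.1, rng=rng)  # mean block length 10
```

Resampling in random-length blocks preserves the serial dependence of the original series, which is what makes the procedure suitable for testing performance differences on autocorrelated returns.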
In general, the literature suggests that incorporating non-Gaussian characteristics into portfolio
construction methods results in improvements in investor welfare. It appears that there are gains from
accounting for each of the stylised facts relative to the mean-variance approach. There is a lack of
consensus however, as to which portfolio construction approach should be preferred. It is also
difficult to infer from the literature the relative importance of accounting for each of the stylised facts.
We believe that we need to provide the same level of thoroughness described in the mean-variance
literature to help build consensus and move the discussion forward. This means comparing multiple
[22] The annual excess return divided by the tracking error.
[23] The tracking error is defined as the standard deviation of returns in excess of a benchmark.
[24] Grinold and Kahn (2000).
approaches out-of-sample across a large number of investment problems, with multiple asset classes,
and testing for statistical significance.
III. The GSEV Model
In this section, we introduce the GSEV model [25], comprised of ARMA/EGARCH, the skewed-t copula
and extreme-value theory. An approach that has gained popularity in the risk modelling literature is to
use an ARMA/GARCH filter to capture autocorrelation and stochastic volatility, and non-Gaussian
marginal distributions to accommodate asymmetry and heavy tails [26]. We break the multivariate
estimation problem down into three parts: the fitting of an ARMA/EGARCH process, the estimation
of the univariate marginals, and the modelling of the dependence structure. The advantages of this
approach are manifold. Filtering using ARMA/GARCH yields a series that is i.i.d., a prerequisite for
fitting a parametric distribution. The estimation procedure is also simplified and accelerated with a
minimal loss in efficiency (see for example the simulation studies of Joe, 2005 and Patton, 2006). By
fitting the dependence structure separately from the univariate marginals we also have the ability to
use a different distribution for the copula and the univariate marginals [27]. We now discuss the three
components: the type of ARMA/GARCH process, the non-Gaussian distribution, and the copula
function.
[25] We abbreviate the rather onerous ARMA/EGARCH-skewed-t copula-extreme-value theory model to the GSEV model.
[26] Key examples include Barone-Adesi, Bourgoin and Giannopoulos (1998) using non-parametric marginals, and McNeil and Frey (2000), Nystrom and Skoglund (2002), Kuester, Mittnik and Paolella (2006), Hu and Kercheval (2007), Viebig and Poddig (2010) and Hilal, Poon and Tawn (2011) using parametric marginals.
[27] It is also possible to use different univariate marginals for each asset class.
Figure 1 – Literature Venn diagram. Figure 1 shows a Venn diagram of the multivariate financial return literature, classifying studies by which of the four stylised facts they capture: heavy tails, tail dependence, volatility clustering and skew. Studies shown include Barone-Adesi, Bourgoin and Giannopoulos (1998), Xiong (2010), Harvey, Liechty, Liechty and Muller (2004), Nystrom and Skoglund (2002), Bonato (2003), Efron and Tibshirani (1993), Sortino (2010), Alcock and Hatherley (2009), Hu and Kercheval (2007) and RiskMetrics™ (JP Morgan).
A. Modelling Heteroskedasticity: ARMA/EGARCH Process
As we show in section IV, several of our data-sets display weak first-order autocorrelation. We
therefore apply an AR(1) to each asset as follows:
r_t = c + φ r_{t-1} + ε_t,    (1)
where r_t is the asset return, c is a constant, φ is the autoregressive coefficient and ε_t is the innovation.
We use exponential GARCH (EGARCH), developed by Nelson (1991). EGARCH is an
asymmetric GARCH process that models the logarithm of the conditional volatility rather than the
conditional variance, thereby obviating the need for parameter constraints. Alexander (2009) shows
that the EGARCH model provides a superior fit relative to competing asymmetric and symmetric
GARCH models [28].
Definition 3.1 EGARCH Process: An exponential GARCH(p,q) process where the innovation
distribution is Gaussian is given by
ln σ_t^2 = ω + Σ_{i=1}^{p} β_i ln σ_{t-i}^2 + Σ_{j=1}^{q} [ α_j ( |z_{t-j}| - E|z_{t-j}| ) + γ_j z_{t-j} ],
where z_t = ε_t / σ_t ~ N(0,1).
In addition to capturing heteroskedasticity, the EGARCH process generates heavy tails in the
unconditional distribution of returns. McNeil, Frey and Embrechts (2005) provide the kurtosis of X_t
for a standard GARCH(1,1) process as follows:
κ_X = κ_Z (1 - (α + β)^2) / (1 - (α + β)^2 - (κ_Z - 1) α^2),    (2)
where α refers to the ARCH coefficient, β refers to the GARCH coefficient, and κ_Z is the kurtosis of
the innovations. Thus, if κ_Z ≥ 3 and α > 0, for example using Gaussian or scaled student-t innovations,
the kurtosis of X_t will be strictly greater than the kurtosis of the innovations, κ_Z.
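As a quick numerical illustration of equation (2), the sketch below simulates a Gaussian GARCH(1,1) process and compares its sample kurtosis with the closed-form value. This is our own illustration, not part of the paper's estimation code; the parameter values (ω = 0.05, α = 0.10, β = 0.85) are assumed purely for demonstration.

```python
import numpy as np

def garch11_kurtosis(alpha, beta, kurt_z=3.0):
    """Closed-form kurtosis of a GARCH(1,1) process, equation (2)."""
    s = (alpha + beta) ** 2
    return kurt_z * (1.0 - s) / (1.0 - s - (kurt_z - 1.0) * alpha ** 2)

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate a Gaussian GARCH(1,1) return series."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    x = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        x[t] = np.sqrt(var) * z[t]
        var = omega + alpha * x[t] ** 2 + beta * var
    return x

x = simulate_garch11(200_000, omega=0.05, alpha=0.10, beta=0.85)
sample_kurt = np.mean(x ** 4) / np.mean(x ** 2) ** 2
print(garch11_kurtosis(0.10, 0.85))  # about 3.77: heavier-tailed than the N(0,1) innovations
print(sample_kurt > 3.0)
```

With α + β = 0.95, equation (2) gives a kurtosis of about 3.77, so even Gaussian innovations produce a heavy-tailed unconditional distribution.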
B. Modelling the tails: Extreme-Value Theory
There are a number of distributions, with varying levels of theoretical support that can accommodate
heavy tails and asymmetry including the log-normal, the Generalised Hyperbolic, and Levy-alpha
stable distributions. Rather than imposing an arbitrary distribution on the data we draw on extreme
value theory (EVT). Several studies show the benefits of combining a conditional volatility model
with EVT, including McNeil and Frey (2000), Nystrom and Skoglund (2002), Kuester, Mittnik
and Paolella (2006), Viebig and Poddig (2010) and Hilal, Poon and Tawn (2011). EVT provides the
theoretical basis for how the tails of all i.i.d. distributions behave asymptotically and is used to model
rare events in a variety of fields [29]. In hydrology, EVT is used to help determine the size that dam
walls need to be to withstand a 100-year flood. EVT can be thought of as the counterpart of the
central limit theorem (CLT).
[28] This finding however is not unanimous. Engle and Ng (1993) find that the GJR-GARCH model is superior to EGARCH. In unreported results we find that our conclusions are robust to the choice of asymmetric GARCH process.
[29] For example, predicting the likelihood of forest fires, or core-melt in nuclear power plants.
Definition 3.2 Central Limit Theorem: If X_1, ..., X_n are n independently and identically
distributed random variables with mean μ and standard deviation σ, and S_n = X_1 + ... + X_n, then
(S_n - nμ) / (σ√n) → N(0, 1) in distribution as n → ∞.
Whereas the CLT is concerned with the aggregation of fluctuations around a mean value, EVT deals
with the asymptotic behaviour of extreme departures from the mean [30]. The Fisher-Tippett (1928)
theorem gives the limiting distribution of a sequence of block-maxima.
Definition 3.3 Fisher-Tippett Theorem: If there exist two normalising constants, c_n > 0 and d_n, and a
non-degenerate distribution, H, such that
(M_n - d_n) / c_n → H in distribution, where M_n = max(X_1, ..., X_n),
then H converges to the generalised extreme value distribution (GEV) given by [31]
H(x) = exp{ -[1 + ξ(x - μ)/σ]^(-1/ξ) },  1 + ξ(x - μ)/σ > 0,
where μ, σ, and ξ are the location, scale and shape parameters respectively.
The generalised extreme-value distribution is generalised in that it subsumes the Weibull (ξ < 0),
Gumbel (ξ = 0) and Frechet (ξ > 0) distributions. The block-maxima of a set of data can be used to
calibrate the generalised extreme-value distribution. The approach however is wasteful of scarce data
and in practice has largely been superseded by methods based on threshold exceedances that use all
data that is designated as extreme.
Definition 3.4 Limiting Distribution of Threshold Exceedances: For a threshold, u, define the excess
distribution F_u(x) = P(X - u ≤ x | X > u). Then there exists a positive function β(u) such that, for all x ≥ 0,
lim_{u → x_F} sup | F_u(x) - G_{ξ,β(u)}(x) | = 0,
where x_F is the right endpoint of F.
The right-hand side of definition 3.4 is the familiar Generalised Pareto distribution.
Definition 3.5 Generalised Pareto Distribution: The Generalised Pareto distribution is defined
as follows:
G_{ξ,β}(x) = 1 - (1 + ξx/β)^(-1/ξ) for ξ ≠ 0, and G_{ξ,β}(x) = 1 - exp(-x/β) for ξ = 0,
[30] The CLT says that the mean of a sufficiently large number of independent random variables will be approximately normally distributed. The CLT does not however describe the behaviour of the tails of a distribution.
[31] Using the von Mises representation.
where ξ and β are the shape and scale parameters.
We follow McNeil and Frey (2000) and Nystrom and Skoglund (2002), and employ the GPD to
model the tails of the distribution, and a Gaussian smoothing kernel to model the body, defined as the
inner 90% of the density [32]. Intuitively, this piecewise approach enables us to model extreme events,
the key area of concern to the risk manager, more accurately than a standard maximum likelihood fit
of a single distribution, which places little weight on the tails. Further, modelling the lower and upper
tails independently allows for asymmetry.
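The piecewise construction above can be sketched as follows, here on simulated Student-t data rather than actual standardised residuals. The function name and the use of scipy's genpareto and gaussian_kde are our own choices for illustration, and the splice at the 5% thresholds is only approximately continuous in this simplified version.

```python
import numpy as np
from scipy import stats

def fit_semiparametric_cdf(z, tail_prob=0.05):
    """Piecewise c.d.f.: GPD for each tail, Gaussian kernel for the inner body.

    z: approximately i.i.d. standardised residuals.
    Returns a function mapping values to cumulative probabilities.
    """
    lo_u = np.quantile(z, tail_prob)        # lower threshold
    hi_u = np.quantile(z, 1.0 - tail_prob)  # upper threshold
    # Fit a GPD to the exceedances beyond each threshold (made positive).
    lo_xi, _, lo_beta = stats.genpareto.fit(lo_u - z[z < lo_u], floc=0.0)
    hi_xi, _, hi_beta = stats.genpareto.fit(z[z > hi_u] - hi_u, floc=0.0)
    kde = stats.gaussian_kde(z)             # smooths the empirical body

    def cdf(x):
        x = np.asarray(x, dtype=float)
        out = np.empty_like(x)
        lo, hi = x < lo_u, x > hi_u
        mid = ~lo & ~hi
        out[lo] = tail_prob * stats.genpareto.sf(lo_u - x[lo], lo_xi, scale=lo_beta)
        out[hi] = 1.0 - tail_prob * stats.genpareto.sf(x[hi] - hi_u, hi_xi, scale=hi_beta)
        out[mid] = np.array([kde.integrate_box_1d(-np.inf, v) for v in x[mid]])
        return out
    return cdf

rng = np.random.default_rng(1)
z = rng.standard_t(df=5, size=5000)   # stand-in for standardised residuals
cdf = fit_semiparametric_cdf(z)
grid = np.linspace(-8, 8, 201)
p = cdf(grid)
print(p.min() >= 0.0 and p.max() <= 1.0)  # valid probabilities
```

The lower GPD governs the region below the 5% quantile, the upper GPD the region above the 95% quantile, and the kernel the inner 90%, mirroring the construction described in the text.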
C. Modelling the dependence structure: The Skewed-t Copula
Having characterised the univariate densities of each asset’s returns, we model the dependence
structure using copulas. The use of copula functions to model asset returns has increased dramatically
in recent years [33]. Copulas have proven a valuable addition to the econometrician's toolbox because
they enable the researcher to model the dependence structure separately from the univariate densities
using the inference function for margins (IFM) method of Joe and Xu (1996). Multivariate
distributions that were extremely difficult and time consuming to fit can now be estimated rapidly. Joe
and Xu's (1996) method follows from Sklar's (1959) theorem, which says that if F is a multivariate
distribution function, and F_i denotes the i-th marginal distribution, then a copula, C, exists such that for
all x in R^d, F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)).
Definition 3.6 Sklar's (1959) Theorem: Let F be a joint distribution function with margins
F_1, ..., F_d. Then there exists a copula C such that, for all x in R^d,
F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)).
A parallel to the inference function for margins method can be seen in the estimation of CCC
(Bollerslev, 1990) and DCC (Engle, 2002) multivariate GARCH models. These approaches separate
the estimation of the univariate variances and the correlation matrix into two parts. In the same way, we
estimate the marginal distributions of each asset separately from the dependence structure. Before we
introduce the skewed-t copula, we define the lower and upper tail dependence between assets i and j as follows.
Definition 3.7 Tail dependence: The lower tail dependence, λ_L, and upper tail dependence, λ_U,
between two stochastic variables X and Y with cumulative distribution functions F_X and F_Y are given
by
λ_L = lim_{q → 0+} P( Y ≤ F_Y^{-1}(q) | X ≤ F_X^{-1}(q) ),
λ_U = lim_{q → 1-} P( Y > F_Y^{-1}(q) | X > F_X^{-1}(q) ).
[32] In this way we eliminate the stair-case pattern evident in empirical density functions.
[33] Mikosch (2005) notes that a Google search of the word "copula" in 2003 produced 10,000 hits. In 2013, the same search yields 1.63 million hits.
To capture the dependence patterns discussed in section II, we argue that the ideal copula should be
able to accommodate the following four characteristics:
I. Tail dependence: λ_L(i,j) > 0 and λ_U(i,j) > 0 for all assets i and j
II. Asymmetric tail dependence: λ_L(i,j) ≠ λ_U(i,j) for all i and j
III. Heterogeneous tail dependence: λ_L(i,j) ≠ λ_L(i,k) and λ_U(i,j) ≠ λ_U(i,k) for all assets i, j and k
IV. Scalable in high dimensions
The first requirement says that the copula should allow for asymptotic dependence. The Gaussian
copula, in contrast, is asymptotically independent (Sibuya, 1960). As returns become more extreme
they become uncorrelated, λ = 0 for all ρ < 1, a potentially hazardous characteristic when modelling
financial asset returns. Indeed, Li's (2000) Gaussian copula model for CDOs has been widely blamed
for the mispricing of derivative instruments and the massive losses taken by investment banks during
the 2008 financial crisis [34]. The t-copula accommodates tail dependence, as can be seen below.
Definition 3.8 Tail dependence of the t-copula: The tail dependence of the t-copula, where t_{ν+1}
represents the tail of the univariate t-distribution, ν is the degrees of freedom parameter, and ρ is the
correlation coefficient, is given by
λ_L = λ_U = 2 t_{ν+1}( -√( (ν + 1)(1 - ρ) / (1 + ρ) ) ).
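Definition 3.8 can be evaluated directly. The short sketch below (our illustration, assuming scipy; the parameter values are arbitrary) computes the coefficient for a few parameter choices, showing that tail dependence strengthens with ρ and weakens as ν grows.

```python
from math import sqrt
from scipy.stats import t as student_t

def t_copula_tail_dependence(nu, rho):
    """Coefficient of (upper = lower) tail dependence of the t-copula, definition 3.8."""
    return 2.0 * student_t.cdf(-sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho)), df=nu + 1.0)

# Tail dependence rises with correlation and falls as nu grows;
# the Gaussian copula is the nu -> infinity limit, with zero tail dependence.
for nu in (3, 10, 100):
    print(nu, round(t_copula_tail_dependence(nu, 0.5), 4))
```

This also makes the contrast with the Gaussian copula concrete: for any fixed ρ < 1 the coefficient shrinks towards zero as ν increases.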
The second requirement, asymmetric tail dependence, says that the upper and lower tail dependence
of a given asset pair should be able to differ. For example, in "bear" markets dependence tends to be
higher than in "bull" markets. As shown in definition 3.8, the t-copula is radially symmetric. Several
copulas provide radial asymmetry, including the Clayton and Gumbel copulas of the Archimedean
family.
Our third requirement, heterogeneous tail dependence, stipulates that the ideal copula should allow for
different dependence structures across asset pairs. For example, empirically the structure of tail
dependence is very different between developed equities and bonds than it is between developed
equity and emerging market equity, or between different pairs of hedge fund strategies (Viebig and
Poddig, 2010). Unfortunately, standard Archimedean copulas do not generalise readily to higher
dimensions and it is necessary to impose severe constraints on the dependence structure, tantamount
to forcing the off-diagonal terms in the correlation matrix to be equal. The multivariate Levy-Stable
distribution also allows for radial asymmetry; however, it imposes a homogeneous dependence
structure across asset pairs. The fourth requirement stipulates that, to be useful to practitioners, a
copula should be scalable in high dimensions. Again this is a problem with the Archimedean copulas.
The skewed-t copula is perhaps unique in that it fulfils all four requirements. Fitting the skewed-t
copula however is non-trivial. Smith, Gan and Kohn (2010) use a Bayesian Markov Chain Monte
Carlo approach, while Kollo and Pettere (2010) employ a GMM approach. We follow Sun, Rachev,
Stoyanov and Fabozzi (2008) and extract the copula from the multivariate skewed-t distribution. We
can extract the dependence structure from the multivariate distribution in conjunction with the
marginal distributions by Sklar’s theorem (1959).
We first need to discuss the skewed-t distribution. There is a bewildering array of variants of the
skewed-t distribution, including Hansen (1994), Fernandez and Steel (1998), Branco and Dey (2001),
Bauwens and Laurent (2002), Azzalini and Capitanio (2003), Jones and Faddy (2003), Sahu, Dey and
Branco (2003), and Patton (2004). Each of these distributions has polynomial upper and lower tails,
so while they can fit heavy-tailed data well, they cannot accommodate significant asymmetry (Aas
and Haff, 2006). We employ the skewed-t distribution of DeMarta and McNeil (2005), which derives
from the Generalised Hyperbolic (GH) distribution. Unique among skewed-t distributions, the GH
skewed-t distribution can accommodate significant asymmetry.
[34] Salmon (2009).
Definition 3.9 Generalised Multivariate Hyperbolic Distribution: The Generalised Multivariate
Hyperbolic distribution, where γ is a skewness vector, Σ is a dispersion matrix, μ is a location vector,
and χ and ψ are constants, is given by
f(x) = c · K_{λ-d/2}( √( (χ + Q(x))(ψ + γ'Σ^{-1}γ) ) ) · exp( (x - μ)'Σ^{-1}γ ) / ( √( (χ + Q(x))(ψ + γ'Σ^{-1}γ) ) )^{d/2-λ},
where Q(x) = (x - μ)'Σ^{-1}(x - μ), K_λ denotes the modified Bessel function of the third kind, and
the normalising constant is
c = (√(χψ))^{-λ} ψ^λ (ψ + γ'Σ^{-1}γ)^{d/2-λ} / ( (2π)^{d/2} |Σ|^{1/2} K_λ(√(χψ)) ).
The GH family incorporates several important distributions. If λ = (d+1)/2, we have the
hyperbolic distribution, and if λ = -1/2, we attain the normal-inverse Gaussian distribution. If
λ = -ν/2, χ = ν and ψ = 0, we obtain the GH skewed-t distribution following McNeil, Frey and Embrechts (2005).
Definition 3.13 Multivariate Skewed-t Distribution: The Multivariate Skewed-t distribution,
obtained in the limit λ = -ν/2, χ = ν, ψ → 0, is given by
f(x) = c · K_{(ν+d)/2}( √( (ν + Q(x)) γ'Σ^{-1}γ ) ) · exp( (x - μ)'Σ^{-1}γ ) / [ ( √( (ν + Q(x)) γ'Σ^{-1}γ ) )^{-(ν+d)/2} · (1 + Q(x)/ν)^{(ν+d)/2} ],
where the normalising constant is
c = 2^{1-(ν+d)/2} / ( Γ(ν/2) (πν)^{d/2} |Σ|^{1/2} ).
If γ = 0, we recover the familiar multivariate student-t distribution.
In order to simulate from the skewed-t copula, we require the covariance matrix, Σ. We can derive this
from the stochastic equation of the multivariate skewed-t distribution as follows:
X = μ + γW + √W A'Z,    (3)
where W is a scalar mixing variable, Z ~ N(0, I_d), and A'A = Σ.
The covariance matrix is then comprised of a skew-based component and a heteroskedastic Gaussian
component. For the skewed-t distribution, W ~ Ig(ν/2, ν/2), the inverse gamma distribution. Employing the expected values of the
first and second moments of the inverse gamma distribution we can estimate the covariance term as
follows:
Cov[X] = E[W] Σ + Var[W] γγ' = ν/(ν-2) Σ + ( 2ν² / ((ν-2)²(ν-4)) ) γγ',    (4)
so that Σ can be recovered from the sample covariance matrix and the estimated skewness vector γ.
We are now armed with the requisite components to extract the skewed-t copula from the multivariate
skewed-t distribution. We follow the algorithm of Sun, Rachev, Stoyanov and Fabozzi (2008) set out
below.
Algorithm 3.1 Skewed-student-t Copula Simulation
1. Estimate the parameters for the univariate skewed-t distributions using MLE [35]
2. Estimate the covariance matrix using equation (4)
3. Draw N independent d-dimensional vectors from the multivariate skewed-t distribution using
the stochastic representation given by equation (3). The result is an N-by-d matrix of simulations, X.
a. Draw N independent d-dimensional vectors, Z, from the multivariate
Gaussian distribution defined by N(0, I)
b. Draw N independent random numbers, W, from the inverse gamma
distribution defined by Ig(ν/2, ν/2)
c. Substitute Z and W into equation (3)
4. Generate the cumulative distribution functions for the univariate marginals in step 1 through
numerical integration [36]
5. Transform the simulations, X, to uniformly distributed variables using the cumulative
distribution functions in step 4
An alternative procedure to MLE is to use the expectation-maximisation (EM) algorithm [37]. In the case
of the univariate skewed-t distribution that we are using here, we have three free parameters (location,
scale and skewness), and a standard solving method such as quasi-Newton suffices [38].
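Steps 3a-3c of algorithm 3.1 can be sketched as below. This is a simplified illustration, not the authors' code: μ is set to zero, the parameter values are assumed for demonstration, and the numerically integrated marginal c.d.f.s of steps 4-5 are replaced by an empirical-rank (probability integral) transform.

```python
import numpy as np
from scipy import stats

def simulate_skew_t_copula(n, Sigma, gamma, nu, seed=0):
    """Draw n observations from the skewed-t copula via the stochastic
    representation X = mu + gamma*W + sqrt(W)*A'Z (equation (3)),
    with W ~ inverse gamma Ig(nu/2, nu/2) and Z ~ N(0, I), mu = 0.
    """
    rng = np.random.default_rng(seed)
    d = len(gamma)
    A = np.linalg.cholesky(np.asarray(Sigma)).T              # upper triangular, A'A = Sigma
    Z = rng.standard_normal((n, d))                          # step 3a
    W = stats.invgamma.rvs(nu / 2.0, scale=nu / 2.0, size=n,
                           random_state=rng)                 # step 3b
    X = gamma * W[:, None] + np.sqrt(W)[:, None] * (Z @ A)   # step 3c
    # Stand-in for steps 4-5: rank-transform each column to (0, 1).
    U = (np.argsort(np.argsort(X, axis=0), axis=0) + 1) / (n + 1.0)
    return U

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
U = simulate_skew_t_copula(10_000, Sigma, gamma=np.array([-0.4, -0.49]), nu=5)
print(U.shape, float(U.min()) > 0.0, float(U.max()) < 1.0)
```

With both skewness parameters negative, scatter plots of U concentrate in the lower-left corner, the lower-tail-dependent pattern described for figure 2.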
In each panel of figure 2, we show 10,000 simulations for different permutations of the skew parameters
β1 and β2 of the bivariate skewed-t copula using steps 1-5. The upper left-hand panel exhibits lower tail
dependence, while the lower right panel exhibits upper tail dependence. The copula in the centre is radially
symmetric and is identical to the standard t-copula. The remaining panels are not exchangeable and
show a range of tail behaviour indicative of the flexibility of the skewed-t copula.
[35] Note that we do not lose any flexibility in fixing the degrees of freedom parameter, ν.
[36] We are unaware of a tractable analytical solution for the c.d.f. of the skewed-t distribution.
[37] Liu (2012).
[38] McNeil, Frey and Embrechts (2005).
In summary, we employ EGARCH to capture time varying volatility and the leverage effect, extreme
value theory (EVT) to capture negative skew and heavy tails, and the skewed-t copula to capture
heterogeneous asymmetric tail dependence.
E. Applying the GSEV model to the S&P 500
In this subsection we provide a worked example of the proposed GSEV model. We employ daily data
for the S&P 500 index from 1/1983-1/2013. To proxy for a small-cap index we use the value-weighted
portfolio of the smallest quintile of firms in the CRSP universe [39][40]. The data display strong
departures from normality, failing the Jarque-Bera test at the 1% level. We apply the Jarque-Bera test
to annual non-overlapping sub-periods and find that in 100% of years, normality can be rejected for
both indices.
[39] Courtesy of K.R. French.
[40] Commercial small-cap indices did not start providing daily data until more recently; for example, the MSCI U.S.A. Small Cap Total Return index commenced in 1/2001.
[Figure 2 appears here: a 3-by-3 grid of scatter plots, one for each combination of β1, β2 in {-2, 0, 2}, with both axes showing probability on [0, 1].]
Figure 2 – Skewed-T Copula Simulation. Each panel of figure 2 shows 10,000 draws from the bivariate
skewed-t copula. β1 and β2 refer to the skew parameters of the two variables. Simulations were performed
with the degrees of freedom and correlation parameters held fixed.
Table I
Summary Statistics: S&P 500

                                 S&P 500 Index   Small Cap Index
Annualised Return (%)            10.7            9.7
Standard-deviation (%)           18.3            17.2
Skewness                         -0.6            -0.7
Kurtosis                         16.1            11.1
Jarque-Bera (p-value)            0.001           0.001
% Jarque-Bera failures           100             100
% stat. sign. skew               65              89
% stat. sign. kurtosis           100             92
Autocorrelation, lag 1           -0.0            0.1
Abs. return autocorrelation, lag 1  0.1          0.4
In figure 3, we show the sample autocorrelation function of the squared daily returns for the S&P 500
(panel A) and U.S. Small Cap (panel B) indices. The horizontal red lines delineate statistical
significance at the 5% level. The squared returns for both indices follow an autoregressive structure
consistent with volatility clustering. We then apply an ARMA/EGARCH model to each return series.
The maximum-likelihood estimates are given in table II. The leverage effect is negative and
statistically significant for both indices indicating that negative returns lead to larger increases in
volatility than positive returns of the same magnitude. Moreover, the EGARCH model provides a
superior fit to the standard GARCH model for both indices.
Table II – ARMA/EGARCH Parameters

S&P 500              GARCH                          EGARCH
                     Coeff.     Error    t-stat    Coeff.     Error    p-value
C                    0.07       0.01     7.25      0.04       0.01     0.00
AR(1)                0.02       0.01     1.21      0.02       0.01     0.07
Constant             0.01       0.00     11.71     0.00       0.00     0.01
GARCH                0.91       0.00     316.91    0.98       0.00     0.00
ARCH                 0.08       0.00     36.82     0.13       0.01     0.00
Leverage                                           -0.09      0.00     0.00
Log-likelihood       -10147.16                     -10043.10

Small Cap Index      GARCH                          EGARCH
                     Coeff.     Error    t-stat    Coeff.     Error    p-value
C                    0.05       0.01     5.78      0.04       0.01     0.00
AR(1)                0.21       0.01     15.79     0.22       0.01     0.00
Constant             0.01       0.00     11.71     -0.00      0.00     0.00
GARCH                0.84       0.00     292.51    0.97       0.00     0.00
ARCH                 0.15       0.00     70.47     0.25       0.01     0.00
Leverage                                           -0.07      0.00     0.00
Log-likelihood       -8537.08                      -8518.97

Table I provides the summary statistics for the S&P 500 index and a U.S. Small Cap index comprised of the value-weighted
portfolio of the smallest quintile of firms listed on the NYSE, AMEX or NASDAQ exchanges for the period 1/1983-
12/2012. The annualised return is calculated geometrically. The % of Jarque-Bera failures refers to the proportion of 260-
day sub-periods where normality is rejected at the 5% level. The % of statistically significant skewness and % of
statistically significant kurtosis refer to the proportion of sub-periods where the skewness and kurtosis are statistically
significant. The final two rows give the autocorrelation at one lag and the one-period autocorrelation of the absolute value
of returns.
Table II provides the ARMA/GARCH and ARMA/EGARCH(1,1) maximum-likelihood estimates for the S&P 500 and U.S.
Small Cap indices using daily returns between 1/1983-12/2012.
In figure 3, panels C and D, we show the autocorrelation function of the squared standardised
ARMA/EGARCH residuals. The process has removed the autoregressive structure in the squared
returns, and we then fit a generalised Pareto distribution to the lower 5% quantile of the standardised
residuals, again using maximum likelihood.
[Figure 3 appears here: four sample ACF plots for lags 0 to 20. Panel A: Sample ACF of squared daily returns for the S&P 500 Index: 1/1983-12/2012. Panel B: Sample ACF of squared daily returns for the U.S. Small Cap Index: 1/1983-12/2012. Panel C: Sample ACF of squared daily ARMA/EGARCH residuals for the S&P 500 Index: 1/1983-12/2012. Panel D: Sample ACF of squared daily ARMA/EGARCH residuals for the U.S. Small Cap Index: 1/1983-12/2012.]
Figure 3 – Autocorrelation Function of squared daily S&P 500 and Small Cap index returns. Figure 3 shows the
autocorrelations for the standardised residuals for the daily S&P 500 and Small Cap index returns between 1/1983 and
12/2012 for 0 to 20 lags. The horizontal lines delineate statistical significance at the 5% level. Standardised residuals
are obtained by dividing the EGARCH residuals by the EGARCH conditional standard-deviation estimates.
The estimated cumulative distribution function maps well onto the empirical c.d.f. for both indices.
The Kolmogorov–Smirnov test fails to reject the null hypothesis of a GPD in both cases at the 1%
level. The estimated coefficients are shown in table III.
In order to forecast portfolio risk and construct portfolios, we now need to combine the univariate
GPDs into a multivariate distribution. As detailed in section III B, we do this using the skewed-t
copula. To simulate the copula function, we follow algorithm 3.1 (Sun, Rachev, Stoyanov and
Fabozzi, 2008). The first stage uses maximum likelihood to estimate the parameters of the univariate
[Figure 4 appears here: Panel A plots the fitted and empirical cumulative distribution functions for the lower 5% quantile of standardised residuals for the S&P 500 Index: 1/1983-12/2012; Panel B does the same for the U.S. Small Cap Index: 1/1983-12/2012. Each panel plots probability against exceedance.]
                     Threshold   Shape (ξ)   Scale (β)   KS-test p-value
S&P 500 index        -1.66       0.218       0.494       0.88
US Small Cap index   -1.72       0.193       0.569       0.14
Table III – MLE estimates for GPD distribution for 5% quantile of daily S&P 500 and US Small Cap index
ARMA/EGARCH residuals. Table III provides the maximum likelihood estimates for the GPD for the ARMA/EGARCH
residuals estimated for the period 1/1983-12/2012. We also provide the p-value for the Kolmogorov–Smirnov goodness-of-
fit test.
Figure 4 – Cumulative Density Function for the 5% quantile using the Fitted Generalised Pareto Distribution and the
Empirical distributions for the ARMA/EGARCH residuals for the S&P 500 and US Small Cap indices. Figure 4
shows the cumulative density functions for the 5% quantile using the fitted GPD and empirical distributions for the
ARMA/EGARCH residuals for the period 1/1983-12/2012 for the S&P 500 and US Small Cap indices.
skewed-t distributions for a fixed degrees of freedom parameter, ν. We constrain the location parameter
to zero, which is a common assumption for short forecast horizons.
Definition 3.14 Univariate Skewed-t Distribution: The Univariate Skewed-t distribution, the d = 1
case of definition 3.13, is given by
f(x) = c · K_{(ν+1)/2}( √( (ν + q(x)) γ²/σ² ) ) · exp( (x - μ)γ/σ² ) / [ ( √( (ν + q(x)) γ²/σ² ) )^{-(ν+1)/2} · (1 + q(x)/ν)^{(ν+1)/2} ],
where q(x) = (x - μ)²/σ² and c = 2^{1-(ν+1)/2} / ( Γ(ν/2) √(πν) σ ).
We then estimate the covariance matrix of the multivariate skewed-t distribution using equation (4),
reproduced below:
Cov[X] = E[W] Σ + Var[W] γγ' = ν/(ν-2) Σ + ( 2ν² / ((ν-2)²(ν-4)) ) γγ'.    (4)
We then simulate the multivariate skewed-t distribution using the stochastic representation given by
equation (3):
X = μ + γW + √W A'Z,    (3)
where W ~ Ig(ν/2, ν/2), Z ~ N(0, I), and where A is an upper triangular matrix such that A'A = Σ.
To simulate the copula we transform the columns of the simulated return matrix to uniform
distributions by applying the c.d.f. of the skewed-t distribution with the fitted parameters.
We then take the bivariate uniform distributions and draw from the fitted GPDs to generate the
unconditional GSEV distribution. The final stage is to condition on volatility, using the unconditional
GSEV distribution as innovations for the ARMA/EGARCH process.
By way of illustration, at the onset of the global financial crisis in January 2008, the maximum
likelihood estimates for the asymmetry parameters for the S&P 500 and US Small Cap indices were
β1 = -0.4 and β2 = -0.49. Figure 5 shows the simulated bivariate skewed-t copula using these
parameters. Note the increase in dependence in the left-hand tail.
IV. Data and Methodology
Capturing the four stylised facts in a coherent, scalable framework is not straightforward. In the case
of the GSEV model, a practitioner requires advanced knowledge of GARCH processes, copula theory,
the multivariate skewed-t distribution and extreme value theory. It is therefore important to quantify
the likely benefits of the approach relative to more parsimonious approaches. We compare the GSEV
approach to portfolio construction models commonly used by practitioners or that are prominent in
the literature. Our empirical approach is made up of two parts. In section V A we compare the
performance of the value-at-risk (VaR) forecasts of the GSEV model to seven benchmark models. In
section V B we compare the performance of the GSEV model to eight benchmark models in a
dynamic portfolio rebalancing framework. To test the generality of our conclusions we utilise eight
data sets including five asset classes.
[Figure 5 appears here: scatter plots of copula draws, with both axes showing probability on [0, 1].]
Figure 5 – Simulated Bivariate Skewed-t Copula and GSEV model returns for S&P 500 and US Small Cap indices.
Panel A shows 10,000 draws from the bivariate skewed-t copula with the parameters estimated at 1/2008.
Panel B shows 10,000 draws from the conditional GSEV daily returns for the S&P 500 and the US Small Cap index
estimated at 1/2008.
A. Benchmark Models
The eight benchmark models are divided into unconditional and conditional methods. We discuss the
motivation for and the estimation of each of these models in detail in appendix A. Table IV provides
an overview. As well as serving as performance benchmarks, these particular models have been
selected to help elucidate the desirable characteristics of a risk modelling approach, for example
conditional or unconditional, parametric or non-parametric, symmetric or asymmetric dependence,
Gaussian or non-Gaussian marginal distributions.
Table IV – Benchmark Methodologies

Methodology                      Type                            Components                                          Key references
1/N                              Unconditional, heuristic        Equal portfolio weights                             Benartzi and Thaler (2001); DeMiguel, Garlappi and Uppal (2009)
Gaussian                         Unconditional, parametric       Gaussian distribution                               Bloomfield, Leftwich and Long (1977); Jobson and Korkie (1981)
Student-t                        Unconditional, parametric       Student-t distribution                              Lauprete, Samarov and Welsch (2001); Hu and Kercheval (2010)
Historical Simulation            Unconditional, non-parametric   Bootstrapped from historical distribution           Efron and Tibshirani (1993); Sortino (2010)
Gaussian Marginals/Clayton-Copula  Unconditional, parametric     Gaussian marginals and the Clayton copula           Alcock and Hatherley (2009)
EWMA                             Conditional, parametric         Exponentially-weighted covariance matrix            RiskMetrics Technical Document (1996)
Filtered Historical Simulation   Conditional, non-parametric     EGARCH simulation applied to bootstrapped returns   Barone-Adesi, Bourgoin and Giannopoulos (1998)
GGEV                             Conditional, parametric         EGARCH; Generalised Pareto distribution to fit tails; Gaussian copula   Nelson (1991); Nystrom and Skoglund (2002)
B. Risk Model Forecast Evaluation
The accurate quantification of tail-risk is essential to the efficient operation and indeed survival of
financial institutions. If capital reserves are excessive, then shareholder returns, business and
consumer credit, and economic growth are inhibited. If capital reserves are too low, there is an
increased risk of bank failures, erosion of shareholder value and a contraction in economic growth.
Further, under Basel III, internal VaR models that perform poorly trigger increased capital
requirements. To prepare the ground, we begin by defining the loss function.
Definition 4.1 Loss Function: The loss function is given by the change in value, V, of the
portfolio between time t and t+1:
L_{t+1} = -(V_{t+1} - V_t).
By convention, the loss function is usually expressed as a positive value and we are concerned with
the right-hand tail. VaR is defined as the loss that a portfolio will not exceed with a given probability,
α. Mathematically, VaR refers to the α-quantile of the loss distribution.
Definition 4.2 Value-at-Risk: The value-at-risk for confidence level α, where F_L
is the cumulative distribution function for the loss associated with the portfolio, is given by
VaR_α = inf{ l ∈ R : F_L(l) ≥ α }.
Table IV provides an overview of the eight benchmark methodologies and the GSEV model.
We re-estimate the respective VaR models daily and produce daily VaR forecasts for the equally
weighted portfolios for each asset class. Under Basel II a 99% confidence level is used to estimate the
10-day VaR. In addition to the 99% VaR level we evaluate the 99.5% VaR estimate over a one-day
horizon, as is common in the VaR validation literature (Skoglund, Erdman, and Chen 2010). The Basel
Committee on Banking Supervision is currently considering the standardisation of VaR model
estimation windows to between two and five years. For our estimation window, we follow McNeil,
Frey and Embrechts (2005) and use a 1000-day estimation window [41]. In the Gaussian case, VaR can
be estimated analytically.
Definition 4.3 Value at Risk for the Gaussian Distribution: The value-at-risk for the
Gaussian distribution, where α is the confidence level, Φ is the standard Gaussian cumulative
distribution function, and σ is the standard-deviation of L, is given by
VaR_α = μ + σ Φ^{-1}(α).
Similarly, if the return series is distributed as a student-t, we have the following relation.
Definition 4.4 Value at Risk for Student-t Distribution: The value-at-risk for the student-t
distribution, where α is the confidence level, ν is the degrees of freedom, t_ν is the cumulative
distribution function of the student-t, and σ is the standard-deviation of L, is given by
VaR_α = μ + σ √((ν-2)/ν) t_ν^{-1}(α).
To estimate the VaR of the non-elliptical models we use Monte Carlo simulation to generate 100,000
daily returns for each asset for each model, each day. We then estimate the VaR as the α-quantile of
the simulated return series.
Definition 4.5 Value at Risk for Discrete Data: The value-at-risk for discrete data, where F_L is the
cumulative density function of the simulated data, and α is the confidence level, is given by
VaR_α = inf{ l : F_L(l) ≥ α } = F_L^{-1}(α).
The process for generating the respective VaR forecasts is summarised below.
Algorithm 4.1 VaR Estimation
1. At time, t, estimate each risk model using data from t-1000 to t-1
2. Simulate 100,000 daily returns for each risk model
3. Estimate the VaR for each model as the α-quantile of the simulated data [42]
4. Step forward one business day and repeat
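The quantile step in algorithm 4.1 can be illustrated as follows. This sketch is ours, with assumed parameters (μ = 0, σ = 1.2); it checks the Monte Carlo α-quantile against the closed form of definition 4.3, which applies here because the simulated losses are Gaussian.

```python
import numpy as np
from statistics import NormalDist

def var_gaussian(mu, sigma, alpha):
    """Closed-form VaR for Gaussian losses (definition 4.3)."""
    return mu + sigma * NormalDist().inv_cdf(alpha)

def var_monte_carlo(losses, alpha):
    """VaR as the alpha-quantile of simulated losses (definition 4.5)."""
    return float(np.quantile(losses, alpha))

rng = np.random.default_rng(42)
# Stand-in for step 2 of algorithm 4.1: 100,000 simulated daily losses (in %).
losses = rng.normal(loc=0.0, scale=1.2, size=100_000)
var_cf = var_gaussian(0.0, 1.2, 0.99)
var_mc = var_monte_carlo(losses, 0.99)
print(round(var_cf, 3))            # 2.792
print(abs(var_mc - var_cf) < 0.1)  # True: the simulation agrees with the closed form
```

For non-elliptical models no closed form is available, which is why the quantile of the simulated series is used instead.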
In order to evaluate the quality of our VaR models, we first define an indicator variable of violations,
also known as exceedances, which takes the value one when the VaR estimate is breached.43 If the
model is well calibrated, the proportion of violations should approximate one minus the confidence level.
41 The 1000-day estimation window also ensures that we have sufficient data to estimate each of our nine models for all of our data sets.
42 For the Gaussian models we use the closed-form solution instead of relying on quantiles.
43 See "The Analytics of Risk Model Validation", Christodoulakis and Satchell (2008), for a review of novel VaR model evaluation techniques.
Definition 4.6 Indicator Variable of Violations: The indicator variable of violations $I_{t+1}$, where
$L_{t+1}$ is the loss in period $t+1$ and $\mathrm{VaR}_t^{\alpha}$ is the VaR estimate with confidence level $\alpha$ made at point $t$, is
given by

$I_{t+1} = \begin{cases} 1 & \text{if } L_{t+1} > \mathrm{VaR}_t^{\alpha} \\ 0 & \text{otherwise} \end{cases}$
We evaluate the quality of the VaR forecast using the unconditional coverage (UC), serial
independence (SI) and conditional coverage (CC) tests proposed by Christoffersen (1998). The
unconditional coverage (UC) test compares the theoretical proportion of violations, $p = 1 - \alpha$, with the
observed proportion of violations, $\hat{\pi}$. If the model produces too many violations, $\hat{\pi} > p$, and the
VaR model underestimates risk. If the model produces too few violations, $\hat{\pi} < p$, and the VaR
model overestimates risk. The likelihood ratio for unconditional coverage is defined as
follows.
Definition 4.7 Likelihood Ratio for Unconditional Coverage: The likelihood ratio for
unconditional coverage, where $p = 1-\alpha$ is the theoretical proportion of violations, $\hat{\pi} = n_1/(n_0+n_1)$ is the maximum
likelihood estimate of the violation probability, and $n_i$ is the number of observations with value $i$, is given by

$LR_{UC} = -2\ln\left[(1-p)^{n_0}\,p^{n_1}\right] + 2\ln\left[(1-\hat{\pi})^{n_0}\,\hat{\pi}^{n_1}\right] \sim \chi^2(1)$
Correct unconditional coverage is not, however, a sufficient condition for a well-behaved VaR
model; the violations should also be serially independent (SI). If the violations cluster together, the
firm’s capital reserves may prove insufficient to ward off the threat of bankruptcy. The likelihood
ratio for serial independence is defined as follows.
Definition 4.8 Likelihood Ratio for Serial Independence: The likelihood ratio for serial
independence, where $n_{ij}$ is the number of observations with the value $i$ followed by $j$, is given by

$LR_{SI} = -2\ln\left[(1-\hat{\pi})^{n_{00}+n_{10}}\,\hat{\pi}^{n_{01}+n_{11}}\right] + 2\ln\left[(1-\hat{\pi}_{01})^{n_{00}}\,\hat{\pi}_{01}^{n_{01}}\,(1-\hat{\pi}_{11})^{n_{10}}\,\hat{\pi}_{11}^{n_{11}}\right] \sim \chi^2(1)$

and

$\hat{\pi}_{01} = \frac{n_{01}}{n_{00}+n_{01}}$

and

$\hat{\pi}_{11} = \frac{n_{11}}{n_{10}+n_{11}}$
We then use Christoffersen's (1998) joint test of unconditional coverage and serial independence,
known as the conditional coverage test. The likelihood ratio of the conditional coverage statistic is
given by the sum of the unconditional coverage and serial independence likelihood ratios.

Definition 4.9 Likelihood Ratio for Conditional Coverage:

$LR_{CC} = LR_{UC} + LR_{SI} \sim \chi^2(2)$
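The three tests can be sketched as follows. This is an illustrative implementation, not the authors' code; `christoffersen_tests` is a hypothetical helper returning the p-value of each test.

```python
import numpy as np
from scipy.stats import chi2

def christoffersen_tests(violations, p):
    """Christoffersen (1998) unconditional coverage (UC), serial independence (SI)
    and conditional coverage (CC) tests. Returns the p-value of each test.
    violations: 0/1 sequence of VaR exceedances; p: theoretical violation rate."""
    v = np.asarray(violations, dtype=int)
    n1 = int(v.sum()); n0 = len(v) - n1
    pi = n1 / (n0 + n1)

    def ll(prob, misses, hits):
        # log-likelihood of `hits` violations and `misses` non-violations
        if prob <= 0.0 or prob >= 1.0:
            return 0.0  # degenerate case: the corresponding terms vanish
        return misses * np.log(1 - prob) + hits * np.log(prob)

    lr_uc = -2 * (ll(p, n0, n1) - ll(pi, n0, n1))

    # transition counts for the first-order Markov alternative
    seq = v.tolist()
    pairs = list(zip(seq[:-1], seq[1:]))
    n00 = pairs.count((0, 0)); n01 = pairs.count((0, 1))
    n10 = pairs.count((1, 0)); n11 = pairs.count((1, 1))
    pi01 = n01 / (n00 + n01) if (n00 + n01) else 0.0
    pi11 = n11 / (n10 + n11) if (n10 + n11) else 0.0
    pi_t = (n01 + n11) / len(pairs)  # violation rate over transitions
    lr_si = -2 * (ll(pi_t, n00 + n10, n01 + n11)
                  - ll(pi01, n00, n01) - ll(pi11, n10, n11))

    lr_cc = lr_uc + lr_si  # LR_CC = LR_UC + LR_SI, chi-squared with 2 d.o.f.
    return {"UC": chi2.sf(lr_uc, 1), "SI": chi2.sf(lr_si, 1), "CC": chi2.sf(lr_cc, 2)}
```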
C. Dynamic Portfolio Rebalancing Framework
We quantify the economic benefits of the GSEV model in a dynamic portfolio rebalancing framework
(Solnik, 1993, Fletcher, 1997). As in the preceding section we re-estimate each VaR model each day
for each data set. We then produce an optimal portfolio each day for each model for each dataset
using a rolling 1000 day window.
C1. Expected Returns and Estimation Error
A key goal of this paper is to quantify the economic value of a range of risk modelling approaches. It
is therefore important that our conclusions are not influenced by the estimation error inherent in
expected returns that can easily contaminate the conclusions of dynamic portfolio rebalancing
analyses. Chopra and Ziemba (1993) show that errors in expected returns are over ten times more
important than errors in variance and over twenty times more important than errors in covariance in
determining portfolio weights. Three approaches to estimating expected returns are commonplace in
the literature: the use of historical sample averages (Bloomfield, Leftwitch and Long, 1977, Jorion,
1991, Jagannathan and Ma, 2003, Duchin and Levy, 2009, DeMiguel Garlappi and Uppal, 2009),
predictive regressions based on fundamental variables44
(Solnik, 1993, Fletcher, 1997), and
equilibrium based models (Black and Litterman, 1992, and Pastor and Stambaugh, 1999, 2000). All
approaches however entail large amounts of estimation error that can easily confound the empirical
results. For example an equilibrium model may perform poorly leading to poor performance of the
out-of-sample portfolios even if the underlying risk model performs well. In effect, we have a dual
hypothesis problem.
Bayesian methods, first suggested by Zellner and Chetty (1965) and improved on by Jorion (1986), and
heuristic methods, including resampling (Michaud, 1998) and constraints (Frost and Savarino, 1988),
have been applied to mitigate the effect of estimation error. Again, if we were to use one of these
heuristic techniques, it would become difficult to disentangle the benefits of the heuristic from the benefits of
the portfolio construction approach. We take a different approach and assume identical expected
returns for each asset. This is consistent with a null prior and may build a level of conservatism into
our conclusions, in that we are implicitly assuming that agents have no forecasting skill.45
C2. CVaR and Coherent Measures of Risk
We now motivate our choice of conditional value-at-risk (CVaR), also known as expected shortfall, as
our primary risk measure. Artzner, Delbaen, Eber and Heath (1998) provide four axioms of a
coherent risk measure.
Definition 4.10 Axioms of a Coherent Risk Measure (Artzner, Delbaen, Eber and Heath, 1998)
A1 Translation Invariance: For all $X$ and all real numbers $a$, we have $\rho(X + a) = \rho(X) - a$
A2 Sub-additivity: $\rho(X_1 + X_2) \le \rho(X_1) + \rho(X_2)$ for all $X_1$ and $X_2$
A3 Positive Homogeneity: $\rho(\lambda X) = \lambda\,\rho(X)$ for all $X$ and all $\lambda \ge 0$
A4 Monotonicity: For all $X$ and $Y$ with $X \le Y$, we have $\rho(Y) \le \rho(X)$
Translation invariance means that adding a riskless amount to a portfolio reduces its risk by exactly
that amount. Subadditivity requires the risk measure of a portfolio of two assets to be no greater than the sum of the risk
measures of each asset, and hence rewards diversification. Positive homogeneity requires that the risk
measure is a linear function of the portfolio exposure. Monotonicity means that if a portfolio has a
higher return than a second portfolio in all states of the world, the risk of the second portfolio must be
at least as high. In the Artzner et al (1999) framework there is no requirement that risk pertain to negative
outcomes, running counter to one's intuitive grasp of risk. Indeed, Pedersen and Satchell (1998), in a
set of axioms that precedes Artzner et al (1999), propose that a basic property of a risk measure should
44 Campbell and Thompson (2008) provide an excellent overview of the explanatory power of a wide range of variables.
45 With regard to expected returns.
be a focus on the downside. The downside focus is consistent with Prospect Theory, the behavioural
studies of Kahneman and Tversky (1979), and common sense. We therefore require
that our risk measure has a downside focus in addition to the axioms of a coherent risk measure.
Under Basel II, the Committee on Banking Supervision advocates the use of VaR. Despite the
ubiquity of VaR, the approach does not satisfy the second axiom of a coherent risk measure because it
is not subadditive. Instances can occur where a portfolio of two assets has a higher VaR than the sum
of the VaRs of the individual assets, running counter to the principle of diversification. VaR is also
invariant to the size of losses beyond the VaR threshold. McNeil, Frey and Embrechts (2005) show
how an optimisation procedure can exploit this conceptual weakness of VaR and produce portfolios
that are highly risky and undiversified. From a practical point of view, using VaR to construct
portfolios leads to a non-smooth, non-convex, and difficult-to-solve optimisation problem. Despite
significant toil, efficient algorithms to solve VaR-based optimisations in all but low dimensions have
not been forthcoming (Kast, Luciano and Peccati, 1998; Basak and Shapiro, 2001; Puelz, 1999;
Emmer, Klüppelberg and Korn, 2000). The Basel Committee on Banking Supervision has recently
argued for the adoption of conditional value-at-risk (CVaR),46 also known as expected shortfall, over
VaR, due to the latter's "inability to capture tail risk".47 We define CVaR as follows.
Definition 4.11 Conditional Value at Risk: $\mathrm{CVaR}_\alpha$, where $f(w, y)$ is the loss function and $p(y)$ is
the underlying probability density function, is given by

$\mathrm{CVaR}_\alpha(w) = \frac{1}{1-\alpha}\int_{f(w,y)\,\ge\,\mathrm{VaR}_\alpha(w)} f(w,y)\,p(y)\,dy$
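An empirical counterpart of this definition, computed from simulated or historical returns, might look like the sketch below (not the authors' code); it reports losses as positive numbers and estimates CVaR as the mean loss beyond the empirical VaR threshold.

```python
import numpy as np

def cvar_empirical(returns, alpha=0.99):
    """Expected shortfall: average loss beyond the empirical (1 - alpha) return
    quantile, reported as a positive number."""
    r = np.asarray(returns)
    threshold = np.quantile(r, 1 - alpha)
    return -r[r <= threshold].mean()
```

For standard normal returns the 99% expected shortfall is about 2.67 standard deviations, versus a 99% VaR of about 2.33, illustrating how CVaR accounts for the size of losses beyond the VaR threshold.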
The advantages of CVaR are numerous. Firstly, unlike variance, CVaR is downside focussed. CVaR
is a coherent measure of risk and hence sub-additive (Acerbi and Tasche, 2002). CVaR is therefore
also a convex risk measure and amenable to optimisation. CVaR, unlike VaR, accounts for the size of
losses beyond the VaR threshold. CVaR is also intuitive and readily interpretable, capturing a
quantity of key interest to practitioners and regulators: if there is an extreme negative realisation, how large
is it likely to be? For these reasons, and owing to the recent Basel Committee endorsement, we employ
CVaR as our principal risk measure.
C3. Optimisation Problem
The standard approach in analyses of portfolio optimisation techniques is to maximise an objective
function that is a positive function of expected returns and a negative function of expected risk:

$\max_{w}\; w'\mu - \lambda\,\rho(w) \qquad (5)$

where $w$ is a vector of asset weights, $\lambda$ is the coefficient of risk aversion, and $\rho(\cdot)$ is a risk function.
We instead follow Leibowitz and Kogelman (1991), Lucas and Klaassen (1998), Campbell, Huisman
and Koedijk (2001), and Chekhlov, Uryasev, and Zabarankin (2005) and maximise the expected
return of the portfolio, subject to a constraint on risk, in our case CVaR.
46 The notion of expected shortfall has been familiar to insurance practitioners for several decades (Misiorek and Weron, 2010).
47 Fundamental Review of the Trading Book, Consultative Document, Basel Committee on Banking Supervision, May 2012.
$\max_{w}\; w'\mu \quad \text{subject to} \quad w'\iota = 1,\;\; \phi_\alpha(w) \le c \qquad (6)$

where $w$ is a vector of portfolio weights, $\iota$ is a vector of ones, and $\phi_\alpha(\cdot)$ is a function that estimates
the conditional value-at-risk. The origin of this approach traces back to Roy (1952) and the safety-first
principle. This approach is consistent with second-order stochastic dominance. We set the one-day
99% CVaR target to 2%, corresponding to the historical 99% one-day CVaR of the average pension
fund allocation invested 50% in equities and 50% in fixed-income and cash.48,49,50
There are several benefits of maximising returns subject to a CVaR constraint. First, this approach
aligns with the risk management practice at banks and hedge funds. For example, a portfolio manager
will in general attempt to maximise the returns he or she can generate in order to attract inflows and
earn a performance fee, while limiting the potential for a severe negative outcome that will lead to
outflows or the loss of the mandate. Second, constraining the CVaR of the portfolios also enables us
to see if the proposed model produces portfolios with the targeted CVaR out-of-sample. Third, we are
freed from the requirement to estimate the level of risk aversion. The expected utility maximisation
framework is consistent with our constrained approach and the efficient frontiers of the two
approaches coincide.
CVaR as defined in 4.11 is not a smooth function of the portfolio weights, leading to difficulties converging to the global
optimum portfolio. Rockafellar and Uryasev (2001) solve this problem by re-expressing the
optimisation problem as follows.

Definition 4.12 Rockafellar and Uryasev (2001) Conditional Value at Risk: The $\alpha$-CVaR,
$F_\alpha(w, \zeta)$, where $\zeta$ is the threshold level, $f(w, y)$ is the loss function, $p(y)$ is the underlying probability
density function, and $[z]^+ = z$ when $z \ge 0$ and $0$ when $z < 0$, is given by

$F_\alpha(w, \zeta) = \zeta + \frac{1}{1-\alpha}\int_{y}\left[f(w, y) - \zeta\right]^+ p(y)\,dy$
The discretised version is given below.

Definition 4.13 Conditional Value at Risk for Discrete Data: The conditional value-at-risk for discrete data,
where $q$ is the number of simulated observations $y_k$, is given by

$\tilde{F}_\alpha(w, \zeta) = \zeta + \frac{1}{q(1-\alpha)}\sum_{k=1}^{q}\left[f(w, y_k) - \zeta\right]^+$
Under this approach, VaR enters the optimisation as an additional variable, $\zeta$, and is determined
simultaneously with CVaR. In this way the dependence on VaR is eliminated and the problem is
smooth and convex. We use the Rockafellar and Uryasev (2001) approach to estimate the CVaR using
100,000 simulated daily returns for each benchmark model.51
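The Rockafellar-Uryasev linearisation turns the return-maximisation problem in (6) into a linear programme. The small-scale sketch below uses `scipy.optimize.linprog`; the function name, scenario counts and long-only bounds are our own illustrative assumptions, not the paper's specification.

```python
import numpy as np
from scipy.optimize import linprog

def max_return_cvar_lp(scenarios, mu, alpha=0.99, cvar_cap=0.02):
    """Maximise w'mu subject to CVaR_alpha(w) <= cvar_cap, long-only and fully
    invested, via the Rockafellar-Uryasev auxiliary-variable linearisation.
    scenarios: (q, n) array of simulated asset returns."""
    q, n = scenarios.shape
    k = 1.0 / (q * (1 - alpha))
    # decision variables: [w_1..w_n, zeta, u_1..u_q]
    c = np.concatenate([-mu, [0.0], np.zeros(q)])        # linprog minimises, so use -mu
    A_ub = np.zeros((q + 1, n + 1 + q))
    A_ub[:q, :n] = -scenarios                            # loss_k = -y_k'w ...
    A_ub[:q, n] = -1.0                                   # ... - zeta ...
    A_ub[:q, n + 1:] = -np.eye(q)                        # ... - u_k <= 0
    A_ub[q, n] = 1.0                                     # zeta + k * sum(u) <= cvar_cap
    A_ub[q, n + 1:] = k
    b_ub = np.concatenate([np.zeros(q), [cvar_cap]])
    A_eq = np.zeros((1, n + 1 + q)); A_eq[0, :n] = 1.0   # weights sum to one
    bounds = [(0, 1)] * n + [(None, None)] + [(0, None)] * q
    res = linprog(c, A_ub, b_ub, A_eq, [1.0], bounds=bounds, method="highs")
    return res.x[:n]
```

Because $\zeta$ (playing the role of VaR) is just another decision variable, the whole problem is a linear programme and scales to large scenario sets.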
Algorithm 4.2 Dynamic Portfolio Rebalancing
1. At time $t$, estimate each risk model using data from $t-1000$ to $t-1$
2. Simulate 100,000 daily returns for each asset for each risk model
3. Maximise the objective function in (6), where CVaR is estimated as in definition 4.12, for each model
4. Record portfolio returns and weights at $t+1$
5. Step forward one business day and repeat

48 Estimated based on a 50% weight in the S&P 500 index and a 50% weight in the Barclays Aggregate Government Bond index for the period 5/1990 to 10/2013.
49 Ibbotson and Kaplan (2000).
50 The performance measures we describe in section IV D are invariant to the level of the CVaR constraint.
51 For the two elliptical cases we instead use the closed-form solution for CVaR given in definitions 4.1 and 4.2.
D. Performance Evaluation
Our primary performance measure is the ratio of the mean historical return in excess of the risk-free
rate to the historical $\mathrm{CVaR}_{99}$. The Mean/$\mathrm{CVaR}_{99}$ ratio is consistent with our optimisation
problem in section IV C3, the axioms of a coherent risk measure, the downside requirement of
Pedersen and Satchell (1998), and Basel III.

Definition 4.14 Mean/$\mathrm{CVaR}_{99}$ Ratio: The Mean/$\mathrm{CVaR}_{99}$ ratio, where $\bar{r}$ is the mean historical
return, $r_f$ is the risk-free rate, and $\mathrm{CVaR}_{99}$ refers to the conditional value-at-risk with a confidence
level of 99%, is given by

$\mathrm{Mean/CVaR}_{99} = \frac{\bar{r} - r_f}{\mathrm{CVaR}_{99}}$
We now describe our approach for evaluating the robustness of our findings to alternative
performance measures. We define a performance measure as a ratio of reward to risk that is valid
with respect to a given utility function. In practice, investor preferences are heterogeneous and there is
no single utility function that is valid for all investors. Hence there is no single performance measure
that is appropriate for all investors, and it makes sense to evaluate performance through the prism of a
number of different measures. We restrict ourselves to measures that are consistent with the key
axioms of a coherent risk measure, are implied by expected utility theory, and have a decision-theoretic basis.
In section IV C2 we used the axioms of a coherent risk measure as provided by Artzner et al (1998) to
motivate the use of CVaR. The Artzner et al (1999) axioms do not require that a risk measure be
non-linear with respect to losses. For example, CVaR is invariant to the distribution of losses
beyond the VaR threshold and is determined purely by the mean exceedance. CVaR is thus a
risk-neutral risk measure. So while CVaR has several attractive properties, including coherence,
convexity, interpretability, and consistency with Basel III, it is not a panacea.
Generalised lower partial moments (LPM), proposed by Fishburn (1977) and Bawa (1977), provide
the basis for our two supplementary performance measures. LPM follow directly from the utility
function proposed in Fishburn (1977) and Bawa (1978).

Generalised LPM subsume several notable risk measures, including the probability of below-target
return ($n = 0$), expected regret ($n = 1$), and the denominator of the Sortino ratio ($n = 2$). LPM
are widely used in the investment industry, suggesting the metrics reflect the utility functions of
market participants.
Definition 4.15 Generalised Lower Partial Moments: The Generalised Lower Partial Moment,
$\mathrm{LPM}_n(\tau)$, where $\tau$ is the threshold return, $R$ is the portfolio return, and $n$ is the moment order, is defined
as follows.

$\mathrm{LPM}_n(\tau) = \mathrm{E}\left[\max(\tau - R,\, 0)^n\right] = \int_{-\infty}^{\tau}(\tau - R)^n\,dF(R)$
Generalised Lower Partial Moments (LPM) are not translation invariant. Barbosa and Ferreira (2004)
argue that non-observance of the translation invariance axiom is not a serious drawback because
increasing the weight of the risk-free asset in a risky portfolio will still lead to a decrease in risk.52
LPM satisfy the homogeneity axiom for $n = 1$ and the subadditivity axiom for $n \ge 1$. We therefore
restrict ourselves to LPM with $n = 1$ and $n = 2$. Setting the target return to zero also ensures that the
LPM-based metrics provide a complementary perspective on risk to Mean/CVaR, where the threshold
return is in the far left-hand tail. The first supplementary measure we use is the Mean/$\mathrm{LPM}_1$
ratio, defined as follows.

Definition 4.16 Mean/$\mathrm{LPM}_1$ Ratio: The Mean/$\mathrm{LPM}_1$ ratio, where $\bar{r}$ is the mean historical return,
$r_f$ is the risk-free rate, and $\mathrm{LPM}_1(0)$ refers to the first-order lower partial moment of returns with
respect to the origin, is given by

$\mathrm{Mean/LPM}_1 = \frac{\bar{r} - r_f}{\mathrm{LPM}_1(0)}$
It can easily be shown that the Mean/$\mathrm{LPM}_1$ ratio is equivalent to Keating and Shadwick's (2002)
Omega ratio minus one.53 The Omega ratio is equal to the probability-weighted gains above the target
return divided by the probability-weighted losses below the target return. Keating and Shadwick (2002) argue that in
this sense the Omega ratio summarises the entire probability density in a single number. Mean/$\mathrm{LPM}_1$,
like Mean/CVaR, is a risk-neutral measure of risk.
The second performance measure we use is the Mean/$\sqrt{\mathrm{LPM}_2}$ ratio, which is equivalent to the Sortino
ratio with a target return of $r_f$ in the numerator and zero in the denominator. $\mathrm{LPM}_2$ is convex in
losses and is therefore a risk-averse measure, and the Mean/$\sqrt{\mathrm{LPM}_2}$ ratio provides a complementary view
of risk to the Mean/CVaR and Mean/$\mathrm{LPM}_1$ measures.

Definition 4.17 Mean/$\sqrt{\mathrm{LPM}_2}$ Ratio: The Mean/$\sqrt{\mathrm{LPM}_2}$ ratio, where $\bar{r}$ is the mean historical return,
$r_f$ is the risk-free rate, and $\mathrm{LPM}_2(0)$ refers to the second-order lower partial moment of returns with
respect to the origin, is given by

$\mathrm{Mean}/\sqrt{\mathrm{LPM}_2} = \frac{\bar{r} - r_f}{\sqrt{\mathrm{LPM}_2(0)}}$
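The three performance ratios in definitions 4.14, 4.16 and 4.17 can be sketched as follows. These are illustrative helpers (not the authors' code), assuming a zero LPM threshold and an empirical CVaR estimator.

```python
import numpy as np

def lpm(returns, tau=0.0, n=2):
    """Generalised lower partial moment of order n about threshold tau (definition 4.15)."""
    shortfall = np.maximum(tau - np.asarray(returns), 0.0)
    return np.mean(shortfall ** n)

def mean_cvar_ratio(returns, rf=0.0, alpha=0.99):
    """Definition 4.14: mean excess return over empirical CVaR (a positive loss)."""
    r = np.asarray(returns)
    tail = r[r <= np.quantile(r, 1 - alpha)]
    return (r.mean() - rf) / -tail.mean()

def mean_lpm1_ratio(returns, rf=0.0):
    """Definition 4.16; with rf = 0 this equals the Omega ratio (zero threshold) minus one."""
    return (np.mean(returns) - rf) / lpm(returns, 0.0, 1)

def mean_lpm2_ratio(returns, rf=0.0):
    """Definition 4.17; a Sortino-style ratio with a zero threshold in the denominator."""
    return (np.mean(returns) - rf) / np.sqrt(lpm(returns, 0.0, 2))
```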
E. Establishing Statistical Significance
In the extant literature, there is almost a complete absence of studies that evaluate the statistical
significance of the difference in performance between non-Gaussian and benchmark portfolio
construction techniques. We attempt to fill this gap. For Gaussian performance ratios such as the
Sharpe ratio, it is trivial to evaluate the statistical significance of the difference between two strategies
52 Note, however, that LPM satisfy the "shift invariance" property of Pedersen and Satchell (1998).
53 Where the target return in the numerator and denominator is set to $\tau = 0$.
using the Jobson and Korkie (1981) t-statistic with Memmel’s (2003) correction and we do so in the
current work.
There is not, however, a similarly developed literature to evaluate the statistical significance of
non-Gaussian measures, such as those identified in section IV D. We employ a block bootstrap technique
to estimate the standard errors of our performance measures. It is necessary to use a block bootstrap to
account for the observed autocorrelation in portfolio returns. We select the block size, $b$, to ensure the
fastest possible rate of convergence given the observed level of dependence using the algorithm of
Politis and White (2004):

$\hat{b}_{opt} = \left(\frac{2\hat{G}^2}{\hat{D}}\right)^{1/3} N^{1/3}$

where

$\hat{G} = \sum_{k=-M}^{M}\lambda(k/M)\,|k|\,\hat{R}(k)$

and

$\hat{g}(w) = \sum_{k=-M}^{M}\lambda(k/M)\,\hat{R}(k)\cos(wk)$

and

$\hat{D} = \tfrac{4}{3}\,\hat{g}(0)^2$

with $\hat{R}(k)$ the sample autocovariance at lag $k$ and $\lambda(\cdot)$ a flat-top lag window.
We bootstrap from the observed strategy returns 10,000 times and compute 10,000 estimates for each
performance metric.
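A circular block bootstrap for the standard error of a performance metric might be sketched as follows (not the authors' code); the fixed block size here is illustrative, whereas the paper selects it with the Politis-White (2004) rule.

```python
import numpy as np

def circular_block_bootstrap_se(returns, statistic, block=20, n_boot=10_000, seed=0):
    """Standard error of a performance statistic under a circular block bootstrap.
    The fixed `block` is illustrative; in practice it comes from Politis-White (2004)."""
    rng = np.random.default_rng(seed)
    r = np.asarray(returns); n = len(r)
    doubled = np.concatenate([r, r])              # wrap-around, so every start is valid
    n_blocks = int(np.ceil(n / block))
    stats = np.empty(n_boot)
    for i in range(n_boot):
        starts = rng.integers(0, n, n_blocks)
        resample = np.concatenate([doubled[s:s + block] for s in starts])[:n]
        stats[i] = statistic(resample)
    return stats.std(ddof=1)
```

Resampling in blocks preserves the within-block serial dependence that an ordinary i.i.d. bootstrap would destroy.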
F. Quantifying Economic Value Added
In this subsection we discuss our methodology for quantifying the economic value added of a given
approach. Performance ratios allow for many insights. They do not, however, allow us to readily gauge
the uplift in investor welfare from a portfolio construction technique. We follow West, Edison, and Cho
(1993) and Fleming, Kirby and Ostdiek (2001) and estimate the value added, $\Delta$, as the fee that equates
the expected utility of the proposed approach with that of a benchmark approach, in our case the
unconstrained Gaussian model.54 To help ensure our results are robust to the choice of utility function,
we employ the mean-variance utility function and the power utility function. While the mean-variance
utility function is more widely used by practitioners, it does not of course account for higher
moments. Nevertheless, we include it for completeness. We use three levels of risk aversion

54 Fleming, Kirby and Ostdiek (2001), among others, refer to this metric as the performance fee. We use the term "value added" to avoid confusion with the terminology used in the mutual fund industry, where a performance fee is charged as a percentage of the return.
representing a conservative ($\gamma$ = 0.028), a moderate ($\gamma$ = 0.05), and an aggressive investor ($\gamma$ = 0.089).
These values are derived from the first-order conditions of the mean-variance investors using the
weights of an aggressive, moderate and conservative investor.55
Definition 4.18 Value Added – Mean-Variance Utility: The value added refers to the management
fee, $\Delta$, that solves the following equation, where $\gamma$ refers to the coefficient of risk aversion,
and $r_{p,t}$ and $r_{b,t}$ refer to the returns of the model and benchmark portfolios in period $t$:

$\sum_{t=1}^{T}\left[(r_{p,t} - \Delta) - \frac{\gamma}{2}(r_{p,t} - \Delta)^2\right] = \sum_{t=1}^{T}\left[r_{b,t} - \frac{\gamma}{2}\,r_{b,t}^2\right]$
Conversely, power utility does account for higher moments and is generally considered a more
plausible model of agent behaviour, exhibiting constant relative risk aversion. Again we use three
levels of risk aversion, $\gamma$ = 5, $\gamma$ = 10, and $\gamma$ = 15, consistent with estimates given in the literature.56

Definition 4.19 Value Added – Power Utility: The value added refers to the management fee, $\Delta$,
that solves the following equation, where $\gamma$ refers to the coefficient of risk aversion, and $r_{p,t}$ and $r_{b,t}$
refer to the returns of the model and benchmark portfolios in period $t$:

$\sum_{t=1}^{T}\frac{(1 + r_{p,t} - \Delta)^{1-\gamma}}{1-\gamma} = \sum_{t=1}^{T}\frac{(1 + r_{b,t})^{1-\gamma}}{1-\gamma}$
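Solving for the fee $\Delta$ in either definition is a one-dimensional root-finding problem. A sketch (not the authors' code), assuming a small daily fee, a bracketing interval of our own choosing, and the simple quadratic form of mean-variance utility:

```python
import numpy as np
from scipy.optimize import brentq

def value_added_power(r_model, r_bench, gamma=10.0):
    """Fee Delta equating realised power utility of model and benchmark portfolios
    (definition 4.19). The bracket [-0.01, 0.01] assumes a small daily fee."""
    r_m, r_b = np.asarray(r_model), np.asarray(r_bench)
    u = lambda r: np.mean((1.0 + r) ** (1.0 - gamma) / (1.0 - gamma))
    return brentq(lambda d: u(r_m - d) - u(r_b), -0.01, 0.01)

def value_added_mv(r_model, r_bench, gamma=0.05):
    """Fee Delta under a simple quadratic mean-variance utility (our assumed form
    of definition 4.18)."""
    r_m, r_b = np.asarray(r_model), np.asarray(r_bench)
    u = lambda r: np.mean(r - 0.5 * gamma * r ** 2)
    return brentq(lambda d: u(r_m - d) - u(r_b), -0.01, 0.01)
```

If the model portfolio beats the benchmark by a constant daily amount, either solver recovers that amount as the fee an investor would pay.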
G. Data
We employ eight data sets spanning all of the major asset classes, as summarised in table V. In the vast
majority of the related literature, one or perhaps two data sets are used per study to validate a given
portfolio construction technique. Because financial data are inherently noisy, and the extreme market
conditions that provide stress tests of an approach are by definition rare, we argue for employing as
much high-quality data as possible. By way of illustration, consider a top-quartile fund manager with
an information ratio57 of 0.5.58 Using the Jobson and Korkie (1981) t-statistic it can be shown that we
require no less than 26 years of data to reject the hypothesis that the true information ratio equals
zero. The information ratio is of course a Gaussian-based metric. Intuitively, when returns and the
performance metrics employed are non-Gaussian, it will take even longer to establish statistical
significance. Employing multiple data sets also limits the scope for "cherry picking".
Our data sets span multiple periods of market turbulence, including the 1987 stock-market crash, the
1997 Asian financial crisis, the Russian debt default and the demise of Long-Term Capital
Management in 1998, the dot-com crash in 2000, the 2001 terrorist attacks, and the 2007-8 financial
crisis. We limit ourselves to value-weighted portfolios and indices to help ensure that the strategies
are robust to trading costs. The first data set replicates the asset allocation problem of the institutional
investor, and is comprised of U.S. equities (S&P 500), international equities (MSCI EAFE), U.S.
corporate bonds (Barclays Corporate Bond index), a broad-based commodities index (GSCI), and
U.S. REITs (FTSE/EPRA U.S. REITs). The second, third, and fourth data sets include U.S. sectors at
three levels of granularity and have been selected to replicate the investment universe of an equity
55 Allen, Lizieri and Satchell (2012).
56 For example, Jondeau and Rockinger (2012).
57 The annual excess return divided by the tracking error.
58 Grinold and Kahn (2000).
portfolio manager; a typical structure for a fund management firm is for the portfolio manager to
oversee sector analysts and to allocate capital to each sector. The fifth data set includes the ten largest
U.S. equities that have been continuously listed since 1985, and has been selected to represent the
stock picker's problem. The sixth and seventh data sets have been chosen to replicate the investment
universe of a fund-of-funds manager that allocates between large-capitalisation and small-capitalisation funds,
and value and growth funds. Our eighth data set includes the three Fama-French factors (Fama and
French, 1993), the momentum factor used in the four-factor model of Carhart (1997), and the short-term
reversal factor (Jegadeesh, 1990) utilised by statistical arbitrage funds. This data set is analogous
to the opportunity set of a fund-of-hedge-funds manager allocating between different hedge fund
styles. All our data series are daily, including dividends where relevant.
Table V – Data Sets

| Data set | Description | Source | Period |
| 1. 5 Asset Classes | US Equities (S&P 500), EAFE Equities (MSCI), Corporate Bond Index (Barclays), Commodities (GSCI), US REITs (FTSE/EPRA) | Datastream, Bloomberg | 12/89-5/13 |
| 2. 5 US Industries | Value weighted | K.R. French | 12/89-5/13 |
| 3. 10 US Industries | Value weighted | K.R. French | 12/89-5/13 |
| 4. 30 US Industries | Value weighted | K.R. French | 12/89-5/13 |
| 5. 10 Equities | Largest 10 U.S. equities59 continuously listed from 31/12/1985 to 31/5/2013 | Factset, Ex-share database | 12/84-5/13 |
| 6. 5 Market Capitalisation Portfolios | Value weighted, quintiled by market capitalisation | K.R. French | |
| 7. 5 Book-to-Market Portfolios | Value weighted, quintiled by book-to-market ratio | K.R. French | |
| 8. Fama-French Factors | Market, Size, Value, Momentum, Short-term Reversal | K.R. French | 12/89-5/13 |
In table VI, we show the average summary statistics for each of the eight data sets.60 We divide the
data sets into overlapping 1000-day sub-periods and conduct Jarque-Bera tests for normality. All data
sets show marked departures from normality. In roughly 80% of sub-periods normality is rejected for
each asset class. Statistically significant excess kurtosis is more common than statistically significant
skew. In general, our data sets exhibit insignificant levels of first-order autocorrelation. There is
however pronounced autocorrelation in the absolute value of returns, consistent with
heteroskedasticity.

59 As at 31/12/1985.
60 The full summary statistics for each asset class are available from the authors on request.

Table V summarises the eight data sets used to validate the respective approaches to VaR estimation and portfolio construction.
Table VI
Average Summary Statistics: All Asset Classes

| | Asset Allocation Problem | 5 Industries | 10 Industries | 30 Industries | 10 Stocks | Size Portfolios | Value Portfolios | Fama-French Factors |
| Annualised Return | 5.9 | 11.6 | 11.5 | 11.0 | 10.8 | 11.1 | 11.9 | 9.0 |
| Standard-deviation | 20.5 | 19.5 | 20.3 | 22.9 | 30.6 | 18.2 | 18.1 | 12.4 |
| Skewness | -0.6 | -0.5 | -0.4 | -0.3 | -0.1 | -0.6 | -0.7 | -0.2 |
| Kurtosis | 16.3 | 15.5 | 14.4 | 12.5 | 20.3 | 12.1 | 17.6 | 18.0 |
| Maximum | 9.6 | 12.2 | 12.8 | 13.8 | 22.0 | 9.8 | 11.1 | 8.0 |
| Minimum | -10.0 | -17.5 | -17.4 | -17.7 | -25.6 | -13.9 | -17.4 | -9.8 |
| Jarque-Bera (p-value) | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| % of Jarque-Bera failures | 80 | 79 | 81 | 80 | 85 | 82 | 80 | 79 |
| % of stat. sig. skew | 43 | 46 | 46 | 46 | 50 | 57 | 46 | 57 |
| % of stat. sig. kurtosis | 82 | 79 | 81 | 80 | 82 | 82 | 79 | 78 |
| $\rho_1$ | 0.04 | 0.01 | 0.02 | 0.03 | -0.04 | 0.04 | -0.00 | 0.10 |
| $\rho_1^{|r|}$ | 0.21 | 0.17 | 0.18 | 0.20 | 0.23 | 0.28 | 0.18 | 0.25 |

V. The Benchmark Models
We quantify the degree of asymmetric dependence (AD) using the concept of exceedance
correlations.

Definition 4.20 Exceedance Correlations: The positive and negative exceedance correlations,
where $X$ and $Y$ are standardised random variables and $\delta$ is the exceedance threshold in standard deviations, are defined as

$\rho^{+}(\delta) = \mathrm{corr}(X, Y \mid X > \delta,\; Y > \delta)$
$\rho^{-}(\delta) = \mathrm{corr}(X, Y \mid X < -\delta,\; Y < -\delta)$
To determine whether or not the observed asymmetry is statistically significant, Hong, Tu and Zhou
(2007) propose the model-free J-test. The J-test is distributed as a $\chi^2(m)$, where $m$ is the number of
exceedance thresholds.

Definition 4.21 J-test for Asymmetric Correlation (Hong, Tu, and Zhou, 2007): The J-test
statistic, where $N$ is the number of data points, $\hat{\rho}^{+}$ and $\hat{\rho}^{-}$ are the vectors of upper and lower exceedance
correlation coefficients, and $\hat{\Omega}$ is an estimate of the covariance matrix of $\hat{\rho}^{+} - \hat{\rho}^{-}$, is given by

$J = N\left(\hat{\rho}^{+} - \hat{\rho}^{-}\right)'\,\hat{\Omega}^{-1}\left(\hat{\rho}^{+} - \hat{\rho}^{-}\right)$
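Empirical exceedance correlations in the sense of definition 4.20 might be computed as in the sketch below (not the authors' code); the minimum-observation guard is our own safeguard against thin tails.

```python
import numpy as np

def exceedance_correlations(x, y, delta=1.0):
    """rho+ and rho- of definition 4.20: correlations conditional on both series
    exceeding +/- delta standard deviations (series are standardised first)."""
    x = (np.asarray(x) - np.mean(x)) / np.std(x)
    y = (np.asarray(y) - np.mean(y)) / np.std(y)
    up = (x > delta) & (y > delta)
    dn = (x < -delta) & (y < -delta)
    # guard against near-empty tails (our own safeguard, not part of the definition)
    rho_up = np.corrcoef(x[up], y[up])[0, 1] if up.sum() > 2 else float("nan")
    rho_dn = np.corrcoef(x[dn], y[dn])[0, 1] if dn.sum() > 2 else float("nan")
    return rho_up, rho_dn
```

For an elliptical distribution the upper and lower exceedance correlations coincide in expectation, so a significant gap between them is evidence of asymmetric dependence.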
Table VI provides the average summary statistics for the eight data sets listed in table V. The time period is 1/1983-12/2012 for
all data sets except the asset allocation problem (12/1989-5/2013) and the stock selection data set (12/1983-3/2013).
The annualised returns are calculated geometrically. The % of Jarque-Bera failures refers to the proportion of 1000-day sub-periods
where normality is rejected at the 5% level. The % of statistically significant skewness and % of statistically
significant kurtosis refer to the proportion of sub-periods where the skewness and kurtosis are statistically significant at the
5% level. The mean positive (negative) exceedance ρ refers to the average exceedance correlation above 1.5 standard
deviations. The % stat. sig. net +ve exceedance ρ refers to the proportion of sub-periods where the net exceedance
correlation is statistically significant using the signed J-test. $\rho_1$ refers to the autocorrelation at one lag.
$\rho_1^{|r|}$ refers to the one-period autocorrelation of the absolute value of returns.

We use the signed J-test of Alcock and Hatherley (2009), which also allows us to identify whether the
data displays net upper or lower tail dependence.61 We calculate the signed J-statistic for non-overlapping 260-day sub-periods.
The mean "bear" (downside) and "bull" (upside) correlations are shown in table VII for
absolute returns in excess of one standard deviation.62 In six out of eight of the data sets the mean
bear correlation exceeds the mean bull correlation. In seven out of the eight data sets, statistically
significant downside asymmetric dependence is more common than upside asymmetric dependence.
In general, however, statistically significant asymmetric dependence is the exception rather than the
rule. The Fama-French data exhibit asymmetric dependence most frequently, approximately twice as
often as would be expected by chance alone. The finding that asymmetric dependence is not universal
is perhaps underappreciated. The fact that asymmetric dependence appears to be episodic does not
however mean that it is unimportant. It is well established that dependence tends to increase during
times of market stress coinciding with periods when investors value diversification the most. It is
plausible therefore that although asymmetric dependence is not an ever-present feature of financial
markets, investors may still benefit from accounting for it.
Table VII
Asymmetric Dependence: All Asset Classes

| | Asset Allocation Problem | 5 Industries | 10 Industries | 30 Industries | 10 Stocks | Size Portfolios | Value Portfolios | Fama-French Factors |
| Mean "bear" ρ | 0.13 | 0.67 | 0.57 | 0.48 | 0.30 | 0.81 | 0.79 | 0.16 |
| Mean "bull" ρ | 0.17 | 0.55 | 0.44 | 0.36 | 0.18 | 0.70 | 0.72 | 0.21 |
| % sign. - AD | 8.33 | 1.72 | 2.15 | 4.30 | 7.33 | 0.34 | 0.34 | 11.03 |
| % sign. + AD | 5.00 | 0.34 | 1.38 | 2.43 | 4.86 | 0.0 | 0.0 | 12.07 |
V. Results
We proceed as follows. In section V A we evaluate the model fit of the GSEV approach. In section V
B we present the VaR forecasting results for the eight models. In section V C we discuss the out-of-
sample performance of the respective models and quantify the uplift in investor welfare.
A. Evaluating Model Goodness of Fit
It is important to ensure that the marginal density functions are consistent with the data, as an
inappropriate specification of the marginal distributions will lead to a misspecification of the copula
function. We evaluate the fit of the Gaussian, student-t, and GSEV distributions to the univariate
ARMA/EGARCH residuals using the two-sample Kolmogorov-Smirnov and Cramér-von Mises
tests.63 We perform the tests each month on overlapping sub-periods of 1000 days each for each asset.
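A two-sample Kolmogorov-Smirnov check of this kind can be sketched with `scipy.stats.ks_2samp`; the heavy-tailed residuals below are a synthetic stand-in, not the paper's ARMA/EGARCH residuals.

```python
import numpy as np
from scipy.stats import ks_2samp

# synthetic stand-in for standardised residuals: heavy-tailed t(4) scaled to
# unit variance, compared against a large Gaussian reference sample
rng = np.random.default_rng(0)
residuals = rng.standard_t(df=4, size=1000) / np.sqrt(2.0)
reference = rng.standard_normal(100_000)
stat, p_value = ks_2samp(residuals, reference)
print(f"KS statistic {stat:.3f}, p-value {p_value:.3f}")
```

The same call, with the fitted candidate distribution simulated in place of `reference`, gives the rejection rates reported in tables VIII and IX.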
Table VIII shows the proportion of sub-periods in which each distribution is rejected for each of the data
sets. Both the two-sample Kolmogorov-Smirnov and Cramér-von Mises tests indicate that normality
can be rejected approximately half the time. This is a reduction from what we saw in table VI
61 Hong, Tu and Zhou (2007) find that their choice of exceedance thresholds has reasonable finite-sample properties. We find that this choice leads to unstable estimates owing to our shorter estimation window. We therefore use thresholds consistent with Alcock and Hatherley (2009).
62 Theoretically we would prefer to use a higher standard-deviation threshold; however, a higher threshold would lead to sampling issues.
63 Evaluating the relative model fit using, for example, Akaike's information criterion is inappropriate as we are using a smoothing kernel for the body of the distribution for the GSEV model.
Table VII provides mean exceedance correlations for each asset class. The upside and downside exceedance correlations are
estimated using returns in excess of one standard deviation from the mean using non-overlapping 260-day sub-periods.
Statistical significance is determined using the signed J-test at the 5% level.
Discussion Paper: 2014-004
36
indicating that the ARMA/EGARCH process is accounting for a portion of the non-normality of the
data consistent with the discussion in section III64
. The student-t distribution while an improvement
over the Gaussian distribution is still rejected in 17% of cases. The GSEV approach however is
rejected less than 1% of the time. The low level of rejections is not a surprise given the use of a
smoothing kernel for the body of the data.
Table VIII Goodness of Fit and Model Comparison: All Data Sets
Kolmogorov–Smirnov Cramer von-Mises
Gaussian Student-t GSEV Gaussian Student-t GSEV
Asset Allocation Problem 48% 22% 0% 42% 17% 0%
5 Industries 40% 8% 0% 37% 5% 0%
10 Industries 34% 7% 0% 35% 3% 0%
30 Industries 86% 31% 1% 89% 26% 1%
10 Stocks 34% 7% 0% 35% 3% 0%
Size Portfolios 57% 36% 0% 55% 31% 0%
Value Portfolios 46% 16% 0% 44% 8% 0%
Fama-French Factors 42% 9% 0% 43% 5% 0%
Mean 48% 17% 0% 47% 12% 0%
Given our focus on downside risk, we also perform goodness-of-fit tests on the lower 5% quantile of
the univariate residuals. This can be seen as viewing the lower 5% quantile as a complete distribution.
In the case of the GSEV model, the lower 5% quantile is identical to the generalised Pareto
distribution. The results are shown in table IX. In general the rate of rejection for the tail is higher
than for the entire distribution. The student-t improves on the Gaussian distribution, while the GSEV
model has the lowest rate of rejection for each data set, in the region of 2-10%. The GSEV model
therefore appears to fit the body and the lower tail of the data well in sample.
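The tail-modelling idea behind the GSEV marginals can be illustrated as follows. This sketch fits a generalised Pareto distribution to threshold exceedances by the method of moments (a simple stand-in for the maximum-likelihood fit a full implementation would use) and reads off a quantile from the fitted tail; the loss series is hypothetical.

```python
import random
import statistics

def gpd_fit_mom(excesses):
    """Method-of-moments estimates of the generalised Pareto
    shape (xi) and scale (beta) from threshold excesses."""
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)
    xi = 0.5 * (1.0 - m * m / v)
    beta = 0.5 * m * (m * m / v + 1.0)
    return xi, beta

def evt_quantile(losses, q=0.99, threshold_q=0.95):
    """q-quantile of the loss distribution, with the tail beyond the
    empirical threshold_q quantile modelled by the fitted GPD."""
    srt = sorted(losses)
    u = srt[int(threshold_q * len(srt))]        # tail threshold
    excesses = [x - u for x in srt if x > u]    # exceedances over u
    xi, beta = gpd_fit_mom(excesses)
    n, n_u = len(srt), len(excesses)
    # Standard EVT tail quantile formula (assumes xi != 0).
    return u + (beta / xi) * ((n / n_u * (1.0 - q)) ** (-xi) - 1.0)

# Hypothetical losses: a crude mixture of normal scales as a
# stand-in for a volatility-clustered return series.
random.seed(1)
losses = [abs(random.gauss(0.0, 1.0 + 0.4 * (i % 4))) for i in range(4000)]
var99 = evt_quantile(losses, q=0.99)
```

Viewing the lower 5% quantile "as a complete distribution", as above, is what allows a separate goodness-of-fit test on the tail alone.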
Table IX Goodness of Fit and Model Comparison: All Data Sets
Kolmogorov–Smirnov Cramer von-Mises
Gaussian Student-t GSEV Gaussian Student-t GSEV
Asset Allocation Problem 62% 46% 7% 70% 50% 5%
5 Industries 51% 30% 9% 55% 32% 6%
10 Industries 52% 33% 8% 53% 32% 6%
30 Industries 48% 33% 8% 50% 35% 7%
10 Stocks 63% 46% 10% 61% 46% 10%
Size Portfolios 69% 65% 3% 74% 66% 2%
Value Portfolios 55% 44% 3% 60% 46% 2%
Fama-French Factors 53% 32% 6% 55% 34% 4%
Mean 57% 41% 7% 60% 42% 5%
[64] This is also consistent with Cont (2001).
Table VIII shows the average proportion of 1000-day sub-periods in which the GARCH residuals fail the two-sample Kolmogorov–Smirnov and two-sample Cramér–von Mises tests for the Gaussian, student-t, and GSEV distributions for each of the data sets.
Table IX shows the average proportion of 1000-day sub-periods in which the lower 5% quantile of the GARCH residuals fails the two-sample Kolmogorov–Smirnov and two-sample Cramér–von Mises tests for the Gaussian, student-t, and GSEV distributions for each of the data sets.
B1. VaR Prediction – 99% confidence level
We now turn to the evaluation of the VaR predictions. Fitting the data well in-sample is of little
consequence if we fail to forecast risk accurately out-of-sample. Table X shows the proportion of VaR
violations, the proportion of consecutive violations, and the p-values for the unconditional coverage,
serial independence and conditional coverage tests at the 99% confidence level. If the models perform
perfectly, violations should occur 1% of the time, or once every 100 days. Across all data sets, the Gaussian model produces approximately 130% more violations than it should. The student-t model and the exponentially weighted moving average (EWMA) improve on this; however, violations still occur approximately twice as often as they should. The historical simulation (HS) and Clayton copula (CC) models provide further improvement, reducing the proportion of violations to 1.64% and 1.65% respectively.
This makes sense given the arguably more appropriate dependence structure that these models
facilitate. Note that the Clayton copula model is only applicable when there is positive dependence
and it is hence omitted when this condition is not continuously met. On average, the final three
models, filtered historical simulation (FHS), the EGARCH/Gaussian copula/EVT model (GGEV), and the GSEV model yield 1.30%, 1.38%, and 1.24% violations respectively. We have included the GGEV model in order to gauge the benefit of accounting for asymmetric tail dependence through the skewed-t copula instead of the Gaussian copula, which is asymptotically independent. The GSEV model has the highest average p-value and the fewest significant unconditional coverage test statistics. As discussed in section IV, the correct number of violations is not a sufficient condition for
a well-behaved VaR model. It is also desirable for the violations to be independent. For the
unconditional approaches, the Gaussian, student-t, and historical simulation, we see a high proportion
of consecutive violations relative to the conditional approaches, EWMA, FHS, GGEV and GSEV.
From a practical perspective consecutive violations translate to successive demands on capital that
may be difficult to meet. For seven of the eight data sets, the FHS, GGEV and GSEV models
pass the serial independence test. The final test of conditional coverage incorporates both
unconditional coverage and serial independence. We see that the FHS, GGEV and GSEV have the
highest p-values, with the GSEV approach having the highest p-value on average and for six out of
the eight data sets.
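The three backtests reported in table X can be computed from a 0/1 violation sequence. Below is a minimal sketch of the Kupiec (1995) unconditional coverage test and the Christoffersen (1998) serial-independence test; the example sequences are hypothetical, chosen to show the same violation rate with and without clustering.

```python
import math

def kupiec_uc(violations, p=0.01):
    """Kupiec unconditional coverage LR statistic (chi-squared, 1 df
    under H0): do violations occur at the promised rate p?"""
    n, x = len(violations), sum(violations)
    pi = x / n
    if pi in (0.0, 1.0):
        return float("inf")  # degenerate sample: reject by convention
    ll_null = x * math.log(p) + (n - x) * math.log(1.0 - p)
    ll_alt = x * math.log(pi) + (n - x) * math.log(1.0 - pi)
    return -2.0 * (ll_null - ll_alt)

def christoffersen_ind(violations):
    """Christoffersen serial-independence LR statistic (chi-squared,
    1 df under H0): is a violation more likely right after another?
    Assumes the sequence contains at least one violation."""
    n00 = n01 = n10 = n11 = 0
    for a, b in zip(violations, violations[1:]):
        n00 += (a == 0 and b == 0)
        n01 += (a == 0 and b == 1)
        n10 += (a == 1 and b == 0)
        n11 += (a == 1 and b == 1)
    pi01, pi11 = n01 / (n00 + n01), n11 / (n10 + n11)
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)

    def ll(prob, hits, misses):
        out = 0.0
        if hits:
            out += hits * math.log(prob)
        if misses:
            out += misses * math.log(1.0 - prob)
        return out

    ll_null = ll(pi, n01 + n11, n00 + n10)
    ll_alt = ll(pi01, n01, n00) + ll(pi11, n11, n10)
    return -2.0 * (ll_null - ll_alt)

# Hypothetical sequences: identical 1% violation rates, but one is
# well spread out and the other heavily clustered.
spread = [1 if i % 100 == 0 else 0 for i in range(1000)]
clustered = [1] * 10 + [0] * 990
lr_uc = kupiec_uc(spread, p=0.01)       # ~0: coverage is exact
lr_ind = christoffersen_ind(clustered)  # large: clearly dependent
```

The conditional coverage statistic is the sum of the two, distributed chi-squared with two degrees of freedom under the null, which is why a model must pass both components to pass the joint test.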
Table X VaR Forecast Evaluation: 1/N Portfolios, All Data Sets
β=99% | Gaussian | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
Asset Allocation
Violations 3.03% 1.97% 2.06% 2.24% 1.54% 1.62% 1.40%
UC: p-value 0.00 0.00 0.00 0.00 0.02 0.01 0.07
Cons. viol 13.0% 15.6% 12.8% 1.9% 2.8% 2.6% 3.0%
SI: p-value 0.00 0.00 0.00 0.86 0.60 0.67 0.50
CC: p-value 0.00 0.00 0.00 0.00 0.05 0.02 0.15
5 Industries
Violations 2.15% 1.90% 1.67% 1.86% 1.99% 1.26% 1.27% 1.13%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.04 0.03 0.28
Cons. viol 9.3% 9.7% 9.2% 9.9% 4.6% 1.2% 2.4% 1.4%
SI: p-value 0.00 0.00 0.00 0.00 0.06 0.97 0.41 0.86
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.13 0.07 0.55
10 Industries
Violations 2.16% 2.04% 1.72% 1.52% 2.02% 1.26% 1.32% 1.10%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.04 0.01 0.41
Cons. viol 9.9% 9.8% 9.8% 11.1% 5.3% 1.2% 1.2% 1.4%
SI: p-value 0.00 0.00 0.00 0.00 0.02 0.97 0.90 0.82
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.13 0.05 0.69
30 Industries
Violations 2.35% 2.38% 1.69% 1.41% 2.22% 1.38% 1.49% 1.30%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02
Cons. viol 9.8% 9.0% 8.2% 8.7% 4.1% 2.2% 1.0% 2.4%
SI: p-value 0.00 0.00 0.00 0.00 0.16 0.53 0.69 0.44
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.05
10 Stocks
Violations 1.86% 1.55% 1.41% 0.57% 1.54% 1.23% 1.26% 1.27%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.09 0.05 0.04
Cons. viol 6.1% 7.4% 7.0% 14.3% 1.1% 1.3% 1.3% 1.3%
SI: p-value 0.01 0.00 0.00 0.00 0.67 0.95 1.00 1.00
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.23 0.15 0.12
Size Portfolios
Violations 2.39% 2.29% 1.47% 2.32% 2.25% 1.23% 1.07% 1.03%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.08 0.55 0.82
Cons. viol 11.5% 11.4% 9.4% 11.3% 5.4% 2.5% 1.4% 1.5%
SI: p-value 0.00 0.00 0.00 0.00 0.02 0.36 0.78 0.72
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.14 0.81 0.92
Value Portfolios
Violations 2.26% 2.14% 1.66% 2.20% 1.98% 1.34% 1.32% 1.25%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.05
Cons. viol 10.2% 9.4% 10.2% 9.8% 5.5% 2.3% 3.5% 3.7%
SI: p-value 0.00 0.00 0.00 0.00 0.02 0.48 0.14 0.10
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.03 0.02 0.04
Fama-French Factors
Violations 1.95% 1.63% 1.43% 1.72% 1.18% 1.66% 1.40%
UC: p-value 0.00 0.00 0.00 0.00 0.15 0.00 0.00
Cons. viol 17.3% 21.7% 18.3% 7.1% 5.2% 5.6% 6.6%
SI: p-value 0.00 0.00 0.00 0.00 0.02 0.01 0.00
CC: p-value 0.00 0.00 0.00 0.00 0.02 0.00 0.00
Averages
Violations 2.27% 1.99% 1.64% 1.65% 2.00% 1.30% 1.38% 1.24%
UC: p-value - 0.00 0.00 0.00 0.00 0.05 0.08 0.21
Cons. viol 10.91% 11.73% 10.60% 10.84% 4.39% 2.34% 2.37% 2.65%
SI: p-value 0.00 0.00 0.00 0.00 0.23 0.61 0.57 0.56
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.09 0.14 0.31
Table X shows the summary statistics for the VaR forecasting models for the equally weighted portfolio for all eight data sets for
the period 1983-2012. From the 1000th day onwards, each model is re-estimated each day using a 1000-day estimation window.
Table X provides the proportion of violations using a 99% confidence level, the proportion of consecutive violations, and the p-
values of the unconditional, serial-independence, and conditional coverage tests.
B2. VaR Prediction – 99.5% confidence level
The results for β=99.5% are shown in table XI. If the models perform perfectly, we expect violations 0.5% of the time, or once every 200 days. It is again apparent that the elliptical models in general underestimate tail risk. For the Gaussian model there are on average 260% too many violations across all data sets. Similarly, the student-t and EWMA models yield 160% and 180% too many violations. The FHS and GGEV models produce 50% and 70% too many violations, whereas the GSEV model produces only 20% too many. The unconditional coverage p-value of the GSEV model is the highest of all the models for six of the eight data sets. We can reject the hypothesis of serial independence for all of the unconditional models for all of the data sets, but for only one of the data sets for the FHS, GGEV and GSEV models. The FHS, GGEV and GSEV models have the highest
conditional coverage p-values on average, with the GSEV approach having the highest p-value on
average and for six out of the eight data sets. While the performance of the GSEV and GGEV models
is similar at the 99% confidence level, the GSEV model outperforms the GGEV model consistently at
the 99.5% level. The GGEV model produces 66% too many violations; the GSEV model, only 20%. Further, the unconditional coverage p-values of the GGEV model fall below 5% in six of the eight data sets, but in only one data set for the GSEV model. The only difference between the GGEV model and the GSEV model is the use of the Gaussian copula, which is asymptotically independent, instead of the skewed-t copula, which accommodates heterogeneous asymmetric tail dependence. The superior performance of the GSEV model is indicative of the importance of accounting for asymmetric tail dependence as we look further into the tails.
Table XI VaR Forecast Evaluation: 1/N Portfolios, All Data Sets
β=99.5% | Gaussian | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
Asset Allocation
Violations 2.11% 1.32% 1.18% 1.45% 0.92% 1.01% 0.48%
UC: p-value 0.00 0.00 0.00 0.00 0.01 0.00 0.90
Consecutive viol 14.6% 13.3% 7.4% 2.9% 4.5% 4.2% 8.3%
SI: p-value 0.00 0.00 0.04 0.53 0.21 0.25 0.05
CC: p-value 0.00 0.00 0.00 0.00 0.02 0.01 0.15
5 Industries
Violations 1.79% 1.20% 0.95% 1.43% 1.43% 0.81% 0.75% 0.61%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.21
Consecutive viol 9.4% 10.3% 9.7% 11.8% 4.3% 1.9% 2.0% 2.5%
SI: p-value 0.00 0.00 0.00 0.00 0.06 0.46 0.39 0.25
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.23
10 Industries
Violations 1.81% 1.46% 0.98% 1.17% 1.49% 0.81% 0.81% 0.64%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.11
Consecutive viol 11.0% 10.5% 6.3% 11.8% 4.1% 1.9% 1.9% 2.4%
SI: p-value 0.00 0.00 0.00 0.00 0.07 0.46 0.46 0.28
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.16
30 Industries
Violations 1.86% 1.69% 0.97% 0.98% 1.46% 0.83% 0.86% 0.58%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.36
Consecutive viol 7.4% 8.2% 7.9% 9.4% 3.2% 1.9% 1.8% 2.6%
SI: p-value 0.00 0.00 0.00 0.00 0.23 0.47 0.51 0.22
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.31
10 Stocks
Violations 1.47% 1.05% 0.77% 0.36% 1.05% 0.54% 0.75% 0.59%
UC: p-value 0.00 0.00 0.01 0.10 0.00 0.67 0.01 0.34
Consecutive viol 6.7% 7.8% 4.3% 9.1% 1.5% 2.9% 2.1% 2.7%
SI: p-value 0.00 0.00 0.05 0.00 0.72 0.19 0.38 0.22
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.38 0.02 0.30
Size Portfolios
Violations 1.89% 1.37% 0.86% 1.83% 1.52% 0.69% 0.66% 0.58%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.04 0.08 0.36
Consecutive viol 11.4% 9.0% 7.1% 10.9% 4.0% 2.2% 2.3% 2.6%
SI: p-value 0.00 0.00 0.00 0.00 0.09 0.32 0.29 0.22
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.07 0.13 0.31
Value Portfolios
Violations 1.86% 1.45% 0.92% 1.85% 1.43% 0.80% 0.65% 0.63%
UC: p-value 0.00 0.00 0.00 0.00 0.00 0.00 0.11 0.15
Consecutive viol 10.7% 9.6% 6.7% 10.8% 3.2% 1.9% 2.4% 2.4%
SI: p-value 0.00 0.00 0.00 0.00 0.21 0.44 0.28 0.26
CC: p-value 0.00 0.00 0.00 0.00 0.00 0.01 0.15 0.19
Fama-French Factors
Violations 1.60% 0.92% 0.72% 1.22% 0.66% 1.18% 0.69%
UC: p-value 0.00 0.00 0.02 0.00 0.08 0.00 0.04
Consecutive viol 17.3% 21.7% 23.4% 7.6% 2.3% 7.8% 2.2%
SI: p-value 0.00 0.00 0.00 0.00 0.29 0.00 0.32
CC: p-value 0.00 0.00 0.00 0.00 0.12 0.00 0.07
Averages
Violations 1.80% 1.30% 0.92% 1.27% 1.38% 0.76% 0.83% 0.60%
UC: p-value 0.00 0.00 0.00 0.02 0.00 0.10 0.03 0.31
Cons. viol 11.07% 11.29% 9.09% 10.65% 3.87% 2.45% 3.06% 3.23%
SI: p-value 0.00 0.00 0.01 0.00 0.24 0.35 0.32 0.23
CC: p-value - 0.00 0.00 0.00 0.00 0.08 0.04 0.22
Table XI shows the summary statistics for the eight VaR forecasting models for the equally weighted portfolio for all eight
data sets for the period 1983-2012. From the 1000th day onwards, each model is re-estimated each day using a 1000-day
estimation window. Table XI provides the proportion of violations using a 99.5% confidence level, the proportion of
consecutive violations, and the p-values of the unconditional, serial-independence, and conditional coverage tests.
Consistent with prior literature we have shown that conditional approaches outperform unconditional
approaches. We have also demonstrated the benefits of using extreme-value theory to produce more
accurate VaR predictions. Finally, we have highlighted the importance of accounting for asymmetric
dependence patterns between assets where the skewed-t copula outperforms the Gaussian copula. The
GSEV approach incorporates all of these elements to yield reliable VaR forecasts at the 99% and
99.5% confidence levels. Out of the models surveyed here, the GSEV model appears to be the most
promising.
C. Dynamic Portfolio Rebalancing
We now present the results for the nine portfolio construction models for the eight data sets. Tables XII and XIII summarise the performance of the models for each data set. Table XIV shows the
average summary statistics across all eight data sets.
C1. Risk Characteristics
For every data set, the out-of-sample CVaRs of the Gaussian, student-t, HS, and EWMA models deviate significantly from the targeted 2% level. In contrast, the FHS, GGEV and GSEV models generate CVaRs that are quite close to the target level, with the GSEV model producing the smallest CVaR error[65] on average. The GSEV approach produces the smallest CVaR error for seven of the eight data sets. This mirrors the findings in section V B. The kurtosis of the conditional models is also significantly lower across all of the data sets. The FHS model produces the smallest kurtosis level on average. The conditional models (EWMA, FHS, GGEV and GSEV) also provide a reliably lower maximum drawdown, defined as the peak-to-trough return, a metric commonly used by practitioners. The GSEV model produces the maximum drawdown with the smallest absolute value for all eight investment problems[66].
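The risk statistics in tables XII-XIV can be computed from a daily return series along the following lines. This is a generic empirical sketch (realised CVaR as the mean of the worst 1% of returns, drawdown from the cumulative wealth path, and risk error relative to the -2% CVaR target), not the paper's exact code.

```python
def cvar(returns, beta=0.99):
    """Realised CVaR: the average of the worst (1 - beta) fraction
    of returns (a negative number for a loss-making tail)."""
    k = max(1, int(round(len(returns) * (1.0 - beta))))
    tail = sorted(returns)[:k]
    return sum(tail) / len(tail)

def max_drawdown(returns):
    """Maximum peak-to-trough decline of the cumulative wealth path."""
    wealth, peak, mdd = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        mdd = min(mdd, wealth / peak - 1.0)
    return mdd

def risk_error(returns, target=-0.02, beta=0.99):
    """Percentage deviation of realised CVaR from the CVaR target,
    as reported in the 'Risk Error (%)' rows."""
    return abs(cvar(returns, beta) / target - 1.0)

# Hypothetical daily returns for illustration.
rets = [0.01, -0.03, 0.002, -0.05] * 25
realised_cvar = cvar(rets)           # worst 1% of 100 observations
err = risk_error(rets)               # distance from the -2% target
```

A strategy that hit the 2% CVaR target exactly would have a risk error of zero; the 1/N portfolios' errors above 100% correspond to realised CVaRs more than twice the target.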
C2. Sharpe Ratios
The unconditional approaches, 1/N, the Gaussian, student-t, historical simulation, and the Clayton copula, generate similar Sharpe ratios to each other. The average Sharpe ratios of the 1/N strategy and the Gaussian strategy are almost identical at 0.51 and 0.52, as shown in table XIV. This mirrors the results of DeMiguel, Garlappi and Uppal (2009) for constrained minimum-variance. The relatively unimpressive performance of the Clayton copula model runs counter to Alcock and Hatherley (2009), who show a substantial performance uplift relative to the Gaussian case. The Alcock and Hatherley (2009) work considers triplets of industry indices. Given that the standard Clayton copula imposes an identical dependence structure across asset pairs and has a single dependence parameter, it is conceivable that the performance of the approach degrades as the number of assets and the complexity of the problem increase. The GSEV approach generates the highest average Sharpe ratio, 1.13, across all data sets. Using the Jobson and Korkie (1981) t-statistic, the Sharpe ratio of the GSEV model is statistically different from that of the Gaussian model for each data set at the 1% level. On
average the uplift in the Sharpe ratio is 125%. The finding that accounting for non-Gaussian
characteristics is beneficial even for an investor that is only concerned with the first two moments is
consistent with the analytical findings of Allen, Lizieri, and Satchell (2013).
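As an illustration of the Sharpe-ratio comparison, here is one common form of the Jobson and Korkie (1981) statistic with Memmel's (2003) correction; under the null of equal Sharpe ratios it is asymptotically standard normal. The return series below are simulated placeholders, not the paper's strategies.

```python
import math
import random
import statistics

def _corr(x, y):
    """Sample correlation of two equally long series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

def jobson_korkie_z(r1, r2):
    """Memmel-corrected Jobson-Korkie statistic for the difference of
    two Sharpe ratios measured over the same T periods."""
    T = len(r1)
    s1 = statistics.mean(r1) / statistics.stdev(r1)
    s2 = statistics.mean(r2) / statistics.stdev(r2)
    rho = _corr(r1, r2)
    var = (2.0 * (1.0 - rho)
           + 0.5 * (s1 * s1 + s2 * s2 - 2.0 * s1 * s2 * rho * rho)) / T
    return (s1 - s2) / math.sqrt(var)

# Simulated daily returns: strategy "b" adds a small edge to "a".
random.seed(2)
a = [random.gauss(0.0003, 0.01) for _ in range(2000)]
b = [x + random.gauss(0.0004, 0.004) for x in a]
z = jobson_korkie_z(b, a)  # positive when b's Sharpe ratio is higher
```

Because the two strategies are evaluated on the same periods, the correlation term matters: highly correlated strategies need a smaller Sharpe gap to reach significance.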
C3. Downside Performance Metrics
[65] Defined as the realised CVaR divided by the target CVaR.
[66] In two cases the difference in the maximum drawdown is negligible.
The Sharpe ratio is of course a Gaussian-based metric and does not account for higher moments. We
have defined the Mean/CVaR ratio as the average excess return divided by the 99% expected
shortfall. The average Mean/CVaR ratio of the GSEV model across our eight data sets is 5.02, versus 1.73 for the Gaussian model, and the GSEV model outperforms the Gaussian model for every data set. In addition, the improvement is statistically significant at the 1% level in all cases. Moreover, the 182% average increase in Mean/CVaR is even larger than the increase in the Sharpe ratio. The GSEV approach also dominates the other benchmark models. In six of the eight data sets the GSEV approach produces the highest Mean/CVaR ratio. The GGEV model, which uses the asymptotically independent Gaussian copula, produces the highest Mean/CVaR ratio in only two of the data sets. This provides further evidence of the importance of accounting for asymmetric dependence. As we noted in table VII, the
two investment problems with the most frequent statistically significant asymmetric dependence are
the Asset Allocation and the Fama-French problems. It is perhaps of no surprise that these are the two
data sets where the GSEV approach adds the most value relative to the GGEV approach. For the
Fama-French data set, the increase in the Mean/CVaR ratio relative to the GGEV model is 13% and
statistically significant.
In general, the results for our two alternative downside performance metrics, the Mean/LPM1 and the Mean/LPM2, tell the same story. The GSEV model generates higher downside-risk-adjusted returns than the Gaussian, student-t, historical simulation, Clayton copula, and EWMA models. The average Mean/LPM1 and Mean/LPM2 ratios of the Gaussian and GSEV models are 24.51 and 55.05, and 10.08 and 25.64 respectively. The results of the lower-partial-moment measures mirror the Mean/CVaR results, giving us confidence that our findings are robust to the chosen performance metric.
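The lower-partial-moment ratios can be sketched as follows, assuming the common definition (the LPM of order n as the average n-th power shortfall below the target, with upside observations counting as zero); the paper's exact annualisation may differ.

```python
import statistics

def lpm(returns, order=1, target=0.0):
    """Lower partial moment: average shortfall below the target,
    raised to the given order (upside observations count as zero)."""
    n = len(returns)
    return sum((target - r) ** order for r in returns if r < target) / n

def mean_lpm_ratio(excess_returns, order=1, target=0.0):
    """Mean excess return per unit of downside risk of the given order."""
    return statistics.mean(excess_returns) / lpm(excess_returns, order, target)

# Hypothetical excess returns with one large loss.
rets = [0.02, 0.01, -0.01, 0.015, -0.04, 0.03]
ratio1 = mean_lpm_ratio(rets, order=1)
ratio2 = mean_lpm_ratio(rets, order=2)
```

Because the order-2 moment squares shortfalls, Mean/LPM2 penalises the single -4% loss far more heavily than Mean/LPM1 does, which is why the two metrics can rank heavy-tailed strategies differently.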
Table XII – Dynamic Portfolio Simulation Results
1/N | Gaussian | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
Asset Allocation
Annualised return 4.8% 8.9% 2.9% 4.7% 17.0% 25.1% 24.9% 23.9%
Standard-dev. 16.5% 14.8% 11.7% 9.7% 14.3% 12.2% 12.3% 11.2%
Kurtosis 9.2 23.0 32.4 13.3 3.0 2.2 2.2 2.6
CVaR 99% -4.7% -4.8% -4.0% -2.9% -3.3% -2.5% -2.5% -2.3%
Risk Error (%) 133% 141% 98% 45% 67% 23% 26% 16%
Max. DD -52% -72% -64% -53% -50% -39% -39% -37%
Sharpe ratio 0.36 0.65 0.30** 0.52 1.16* 1.89** 1.86** 1.96**
Mean/CVaR 99% 1.30 2.00 0.89 1.75 5.00** 9.38** 9.12** 9.53**
Mean/LPM1 19.45 38.97 18.87 30.65 57.57** 98.83** 97.28** 103.9**
Mean/LPM2 8.21 13.51 6.14 11.25 26.20** 45.89** 45.23** 47.87**
5 Industries
Annualised return 6.6% 5.4% 4.6% 4.2% 5.4% 7.1% 8.8% 9.1% 8.8%
Standard-dev. 18.3% 13.6% 11.2% 10.2% 14.3% 13.6% 11.2% 11.3% 11.0%
Kurtosis 16.7 45.7 65.1 30.6 60.8 5.2 6.9 7.7 8.3
CVaR 99% -4.7% -3.5% -2.9% -2.6% -3.6% -3.1% -2.5% -2.6% -2.5%
Risk Error (%) 136% 74% 45% 30% 82% 57% 27% 29% 25%
Max. DD -51% -34% -29% -28% -38% -33% -27% -26% -26%
Sharpe ratio 0.44 0.45 0.45 0.45 0.44 0.57 0.80** 0.82** 0.81**
Mean/CVaR 99% 1.71 1.78 1.76 1.77 1.73 2.48** 3.56** 3.63** 3.60**
Mean/LPM1 22.79 23.38 24.15** 23.11 23.01 26.67** 38.39** 39.76** 39.22**
Mean/LPM2 9.97 10.02 9.95 10.00 9.72 12.81** 18.27** 18.80** 18.62**
10 Industries
Annualised return 6.7% 5.9% 5.0% 4.9% 5.8% 9.2% 10.4% 10.8% 10.3%
Standard-dev. 17.8% 13.7% 11.8% 10.8% 14.1% 14.1% 11.8% 11.9% 11.4%
Kurtosis 18.8 31.1 33.4 26.9 29.2 4.3 5.3 6.6 7.2
CVaR 99% -4.7% -3.4% -3.0% -2.7% -3.5% -3.1% -2.6% -2.6% -2.5%
Risk Error (%) 135% 70% 50% 37% 77% 57% 29% 31% 24%
Max. DD -52% -35% -32% -28% -36% -33% -27% -28% -23%
Sharpe ratio 0.45 0.48 0.46 0.49 0.47 0.69 0.89** 0.92** 0.91**
Mean/CVaR 99% 1.71 1.96 1.84 1.94 1.88 3.13** 4.11** 4.16** 4.19**
Mean/LPM1 23.47 24.95 24.52 25.42* 24.49 32.88** 43.25** 44.70** 44.41**
Mean/LPM2 10.14 10.88 10.50 10.97 10.69 15.88** 20.73** 21.18** 21.16**
30 Industries
Annualised return 4.6% 3.9% 3.6% 3.5% 4.1% 7.7% 8.5% 9.1% 8.2%
Standard-dev. 18.7% 13.8% 12.7% 11.3% 14.0% 15.9% 12.5% 12.8% 12.3%
Kurtosis 16.9 36.7 49.1 30.8 29.4 4.4 4.0 6.3 7.0
CVaR 99% -5.1% -3.5% -3.3% -2.9% -3.5% -3.6% -2.8% -2.9% -2.8%
Risk Error (%) 156% 75% 63% 44% 77% 80% 38% 45% 41%
Max. DD -60% -41% -36% -36% -34% -40% -30% -31% -30%
Sharpe ratio 0.33 0.34 0.34 0.36 0.35 0.54 0.71** 0.74** 0.70**
Mean/CVaR 99% 1.22 1.37 1.33 1.41 1.40 2.40** 3.25** 3.27** 3.07**
Mean/LPM1 17.49 17.80 18.20* 18.37** 18.36* 25.58** 34.32** 35.80** 33.92**
Mean/LPM2 7.46 7.65 7.61 7.93** 7.99* 12.33** 16.47** 16.90** 15.99**
Table XII shows the summary statistics for the eight portfolio construction methodologies for the Asset Allocation, 5
Industries, 10 industries, and 30 Industries data sets for the period 12/1986-12/2012. The annualised return is calculated
geometrically. The standard deviation is annualised. Kurtosis refers to excess kurtosis. CVaR 99% refers to the
conditional value at risk at the 99% confidence level. Risk error (%) refers to the percentage difference in the realised
CVaR 99% and the CVaR 99% target of 2%. The Sharpe ratio is calculated arithmetically and is annualised. Mean/CVaR
99% refers to the ratio of the annualised arithmetic excess return divided by the realised CVaR 99%. Mean/LPM1 refers
to the ratio of the annualised arithmetic excess return divided by the first lower partial moment with a return target of
zero. Mean/LPM2 refers to the ratio of the annualised arithmetic excess return divided by the second lower partial
moment with a return target of zero.
Table XIII – Dynamic Portfolio Simulation Results
1/N | Gaussian | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
10 Stocks
Annualised return 10.5% 9.5% 7.5% 8.0% 6.2% 13.1% 9.0% 10.0% 9.7%
Standard-dev. 18.9% 12.6% 10.6% 11.3% 11.9% 15.6% 12.4% 12.0% 11.6%
Kurtosis 7.7 7.0 7.6 4.1 6.8 3.8 2.1 2.4 2.4
CVaR 99% -4.6% -2.9% -2.5% -2.6% -2.7% -3.3% -2.6% -2.5% -2.4%
Risk Error (%) 130% 45% 26% 30% 34% 63% 29% 24% 19%
Max. DD -56% -32% -26% -34% -27% -32% -32% -27% -26%
Sharpe ratio 0.62 0.78 0.72 0.73 0.56 0.86 0.75 0.84 0.85
Mean/CVaR 99% 0.98 1.30 1.18 1.23 0.96 1.59* 1.39 1.58** 1.60**
Mean/LPM1 11.88 14.84 14.03 13.72 10.60 15.51 13.45 15.14 15.23
Mean/LPM2 5.54 7.02 6.50 6.58 5.13 7.74* 6.81 7.63** 7.68**
Size Portfolios
Annualised return 5.9% 4.5% 3.8% 3.2% 3.6% 8.2% 12.1% 12.1% 12.0%
Standard-dev. 18.4% 14.2% 12.9% 10.0% 15.1% 13.8% 10.8% 10.7% 10.6%
Kurtosis 10.8 33.9 45.5 17.5 49.4 7.3 6.1 6.9 7.2
CVaR 99% -4.8% -4.1% -3.8% -2.7% -4.5% -3.5% -2.5% -2.5% -2.4%
Risk Error (%) 141% 104% 91% 35% 124% 76% 25% 24% 22%
Max. DD -56% -47% -49% -37% -54% -43% -30% -29% -29%
Sharpe ratio 0.40 0.38 0.35 0.36 0.31 0.63* 1.10** 1.12** 1.12**
Mean/CVaR 99% 1.53 1.32 1.18 1.37 1.06 2.50** 4.80** 4.85** 4.86**
Mean/LPM1 20.66 20.28 19.26 18.71 17.30 30.81** 54.58** 55.55** 55.97**
Mean/LPM2 8.95 8.03 7.33 7.95 6.53 13.84** 25.07** 25.37** 25.49**
Value Portfolios
Annualised return 6.5% 6.6% 5.8% 4.5% 6.6% 8.6% 7.2% 7.2% 7.2%
Standard-dev. 18.2% 13.8% 11.9% 10.1% 14.4% 13.5% 11.1% 11.1% 11.0%
Kurtosis 18.1 41.4 47.9 39.8 47.3 5.7 5.7 6.3 5.9
CVaR 99% -4.9% -3.6% -3.1% -2.7% -3.8% -3.1% -2.6% -2.6% -2.5%
Risk Error (%) 144% 79% 56% 33% 88% 54% 29% 28% 26%
Max. DD -56% -40% -30% -33% -41% -28% -28% -28% -28%
Sharpe ratio 0.43 0.53 0.53 0.48 0.51 0.67 0.68 0.68 0.69
Mean/CVaR 99% 1.63 2.05 2.03 1.83 1.97 2.98** 2.94** 2.97** 3.00**
Mean/LPM1 23.07 27.83 28.32** 25.30 27.39 32.17** 32.40** 32.65** 32.93**
Mean/LPM2 9.81 11.70 11.65 10.54 11.38 15.24** 15.27** 15.39** 15.55**
Fama-French Factors
Annualised return 5.5% 6.5% 5.3% 12.1% 21.5% 25.3% 27.1% 26.2%
Standard-dev. 5.4% 14.7% 10.7% 13.1% 14.9% 13.3% 13.8% 12.1%
Kurtosis 18.2 25.6 74.3 8.5 4.4 2.6 4.3 3.5
CVaR 99% -1.5% -3.6% -2.7% -3.0% -3.0% -2.5% -2.7% -2.3%
Risk Error (%) 25% 80% 37% 49% 48% 25% 36% 16%
Max. DD -14% -41% -35% -39% -33% -38% -38% -36%
Sharpe ratio 1.00** 0.50 0.53 0.93** 1.38** 1.76** 1.80** 1.98**
Mean/CVaR 99% 3.65** 2.05 2.07 4.13** 6.98** 9.40** 9.16** 10.3**
Mean/LPM1 62.82** 27.96 33.80** 51.86** 77.27** 99.59** 103.4** 114.7**
Mean/LPM2 23.85** 11.78 12.64 23.22** 36.01** 47.15** 47.40** 52.77**
Table XIII shows the summary statistics for the eight portfolio construction methodologies for the 10 Stocks, 5 Size Portfolios,
5 Value Portfolios, and Fama-French factors for the period 12/1986-12/2012. The annualised return is calculated
geometrically. The standard deviation is annualised. Kurtosis refers to excess kurtosis. CVaR 99% refers to the conditional
value at risk at the 99% confidence level. Risk error (%) refers to the percentage difference in the realised CVaR 99% and the
CVaR 99% target of 2%. The Sharpe ratio is calculated arithmetically and is annualised. Mean/CVaR 99% refers to the ratio
of the annualised arithmetic excess return divided by the realised CVaR 99%. Mean/LPM1 refers to the ratio of the annualised
arithmetic excess return divided by the first lower partial moment with a return target of zero. Mean/LPM2 refers to the ratio
of the annualised arithmetic excess return divided by the second lower partial moment with a return target of zero.
Table XIV – Dynamic Portfolio Simulation Results
1/N | Gaussian | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
Annualised return 6.4% 6.4% 4.8% 5.6% 5.3% 11.5% 13.3% 13.8% 13.3%
Standard-dev. 16.5% 13.9% 11.7% 10.8% 14.0% 14.5% 11.9% 12.0% 11.4%
Kurtosis 14.5 30.6 44.4 21.4 37.2 4.8 4.4 5.3 5.5
CVaR 99% -4.4% -3.7% -3.2% -2.8% -3.6% -3.3% -2.6% -2.6% -2.5%
Risk Error (%) 125% 83% 58% 38% 80% 63% 28% 30% 24%
Max. DD -50% -43% -38% -36% -38% -36% -31% -31% -29%
Sharpe ratio 0.51 0.52 0.46 0.54 0.44 0.82 1.08 1.10 1.13
Mean/CVaR 99% 1.72 1.73 1.54 1.93 1.51 3.39 4.86 4.85 5.02
Mean/LPM1 25.21 24.51 22.65 25.90 20.20 37.31 51.86 53.05 55.05
Mean/LPM2 10.50 10.08 9.05 11.06 8.58 17.51 24.46 24.74 25.64
D. Economic Value Added
In the previous section we have shown that the GSEV approach produces portfolios with lower risk
errors, lower drawdowns and statistically significant uplifts in the Sharpe ratio and the downside-risk
performance metrics. We now quantify the economic significance of the uplift. Table XV provides the
value added for all the models relative to the Gaussian model based on the mean-variance utility
assumption. The three levels of risk aversion represent aggressive, moderate and conservative
investors and are derived by solving the first-order conditions given observed portfolio weights[67]. The
1/N strategy and the Clayton copula model result in a loss in economic value relative to the Gaussian
case for all three investors. On average the student-t model does not add value, while the historical
simulation adds value for the moderate and conservative investors. The exponentially weighted
moving average model that underlies the popular RiskMetrics™ system adds in excess of 4% p.a. for
all levels of risk aversion. Filtered historical simulation adds an average of 7.4% across all three
levels. It is the GGEV and GSEV models however that add the most value. Interestingly, on average,
the GGEV model generates a marginally higher uplift of 7.8% relative to 7.7% for the GSEV model.
Given that typical mutual fund fees are 1-2%, this is highly significant, indicating that managers who implement either model may be able to charge substantially higher management fees.
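In the mean-variance case the equating fee has a closed form: subtracting a constant fee from every period's return shifts the mean but not the variance, so the fee is simply the utility difference. A minimal sketch (the return series and the risk-aversion value lam=3.0 are illustrative placeholders, not the paper's calibrated λ values):

```python
import statistics

def mv_utility(returns, lam):
    """Mean-variance utility of a per-period return series."""
    return statistics.mean(returns) - 0.5 * lam * statistics.variance(returns)

def mv_value_added(model_rets, benchmark_rets, lam):
    """Fee Delta such that U(model - Delta) = U(benchmark): a constant
    fee only shifts the mean, so Delta is the utility difference."""
    return mv_utility(model_rets, lam) - mv_utility(benchmark_rets, lam)

# Hypothetical series: the model beats the benchmark by exactly 1%
# per period with identical variance, so the equating fee is 1%.
bench = [0.00, 0.02, -0.01, 0.03, -0.02]
model = [r + 0.01 for r in bench]
delta = mv_value_added(model, bench, lam=3.0)
```

Reported against the Gaussian benchmark, as in table XV, a positive Delta is the annual fee an investor would pay to switch to the candidate model.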
In table XVI, we show the economic value added for the power utility investor. Power utility is seen
as a more plausible model of investor preferences than mean-variance utility. Unlike the mean-
variance investor, the power utility investor displays constant relative risk aversion and rewards
positive skew and penalises excess kurtosis. The 1/N rule and the Clayton copula model again
subtract value. The average value added of the student-t and historical simulation models increases relative to the mean-variance case, from -0.1% to 1.4% and from 1.1% to 3.5% respectively. This makes sense given that these are heavy-tailed models and the power utility function rewards distributions with less tail risk.
The exponentially weighted moving average model adds a similar amount of value on average under mean-variance and power utility. The filtered historical simulation, GGEV and GSEV models, however, add significantly more value under power utility (on average 2 percentage points more), indicating that these models are being rewarded for accounting for higher moments. In contrast to the value added under mean-variance utility, under power utility the GSEV model generates more economic value than the GGEV model. This should not be surprising given the power utility function's sensitivity to higher moments. The average economic value added by the GSEV approach across the three levels of risk aversion is 10% p.a. This is highly significant, suggesting that the model deserves further attention.
[67] Allen, Lizieri and Satchell (2013)
Table XIV shows the average summary statistics for the eight portfolio construction methodologies across all eight data sets for the period 12/1986-12/2012. The annualised return is calculated geometrically. The standard deviation is annualised. Kurtosis refers to excess kurtosis. CVaR 99% refers to the conditional value at risk at the 99% confidence level. Risk error (%) refers to the percentage difference in the realised CVaR 99% and the CVaR 99% target of 2%. The Sharpe ratio is calculated arithmetically and is annualised. Mean/CVaR 99% refers to the ratio of the annualised arithmetic excess return divided by the realised CVaR 99%. Mean/LPM1 refers to the ratio of the annualised arithmetic excess return divided by the first lower partial moment with a return target of zero. Mean/LPM2 refers to the ratio of the annualised arithmetic excess return divided by the second lower partial moment with a return target of zero.
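Under power utility the equating fee no longer has a closed form, but it is easily found numerically because expected utility is monotonically decreasing in the fee. A sketch with CRRA utility and bisection, using γ=5 as in table XVI; the return series are again hypothetical.

```python
def crra_expected_utility(returns, gamma, fee=0.0):
    """Average realised CRRA utility of end-of-period wealth (1 + r - fee)."""
    return sum((1.0 + r - fee) ** (1.0 - gamma) / (1.0 - gamma)
               for r in returns) / len(returns)

def power_value_added(model_rets, bench_rets, gamma=5.0, lo=-0.5, hi=0.5):
    """Fee equating the expected power utility of model and benchmark,
    found by bisection (utility is decreasing in the fee)."""
    target = crra_expected_utility(bench_rets, gamma)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if crra_expected_utility(model_rets, gamma, mid) > target:
            lo = mid   # model still preferred: the fee can be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical series: a uniform 1% per-period edge with an unchanged
# return shape implies an equating fee of 1%.
bench = [0.00, 0.02, -0.01, 0.03, -0.02]
model = [r + 0.01 for r in bench]
fee = power_value_added(model, bench, gamma=5.0)
```

Unlike the mean-variance fee, this fee responds to skewness and kurtosis of the return distribution, which is why the higher-moment-aware models gain under power utility.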
Table XV – Economic Value Added: Mean-Variance Utility
1/N | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV
Mean-Variance Utility: λ=.0278
Asset Allocation -4.3% -4.9% -2.7% 7.3% 14.5% 14.3% 13.8%
5 Industries -0.4% -0.2% -0.4% -0.3% 1.6% 3.6% 4.0% 3.7%
10 Industries -0.5% -0.4% -0.3% -0.2% 3.0% 4.6% 4.9% 4.6%
30 Industries -0.8% 0.0% 0.2% 0.1% 3.0% 4.7% 5.1% 4.5%
10 Stocks -6.2% -1.5% -1.1% -2.9% 2.4% -0.4% 0.5% 0.3%
Size Portfolios 0.0% -0.4% -0.3% -1.0% 3.6% 7.8% 7.9% 7.8%
Value Portfolios -1.4% -0.3% -1.2% -0.1% 1.9% 1.2% 1.2% 1.2%
Fama-French 0.8% -0.2% 5.6% 13.2% 16.7% 17.9% 17.6%
Average -1.6% -1.0% 0.0% -0.7% 4.5% 6.6% 7.0% 6.7%
Mean-Variance
Utility: λ=.0278
Asset
Allocation -5.0% -3.9% -1.2% 7.4% 15.3% 15.1% 14.9%
5 Industries -2.1% 0.6% 0.6% -0.7% 1.6% 4.4% 4.7% 4.5%
10 Industries -2.0% 0.2% 0.6% -0.3% 2.8% 5.2% 5.5% 5.3%
30 Industries -2.7% 0.4% 1.0% 0.1% 2.2% 5.1% 5.4% 4.9%
10 Stocks -3.4% -0.9% -0.8% -2.7% 1.4% -0.3% 0.7% 0.6%
Size Portfolios -1.6% 0.0% 0.9% -1.4% 3.7% 8.8% 8.9% 8.9%
Value Portfolios -3.1% 0.3% -0.1% -0.3% 2.0% 2.0% 2.0% 2.1%
Fama-French 3.0% 1.0% 6.1% 13.2% 17.2% 18.3% 18.5%
Average -2.1% -0.3% 0.9% -0.9% 4.3% 7.2% 7.6% 7.5%
Mean-Variance
Utility: λ=.0278
Asset
Allocation -6.1% -2.1% 1.6% 7.8% 16.9% 16.7% 17.0%
5 Industries -5.4% 1.9% 2.4% -1.4% 1.6% 5.7% 6.0% 5.9%
10 Industries -4.9% 1.2% 2.2% -0.5% 2.6% 6.3% 6.6% 6.6%
30 Industries -6.2% 1.1% 2.4% 0.0% 0.9% 5.9% 6.1% 5.8%
10 Stocks -7.8% 0.1% -0.1% -2.3% -0.5% -0.2% 1.0% 1.2%
Size Portfolios -4.7% 0.8% 3.2% -2.0% 4.0% 10.7% 10.8% 10.8%
Value Portfolios -6.2% 1.3% 1.8% -0.7% 2.2% 3.4% 3.5% 3.6%
Fama-French 7.1% 3.3% 7.1% 13.1% 18.1% 18.9% 20.1%
Average -4.3% 1.0% 2.6% -1.2% 4.0% 8.3% 8.7% 8.9%
Grand Average -2.7% -0.1% 1.1% -0.9% 4.2% 7.4% 7.8% 7.7%
Table XV shows the value added of the given model relative to the Gaussian model for a mean-variance investor.
The value added is defined as the annual fee that equates the expected mean-variance utility of the given model
and the Gaussian model.
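The fee calculation in the note above can be sketched numerically. The following is a minimal sketch in the spirit of Fleming, Kirby and Ostdiek (2001): the quadratic per-period utility, the simulated return series and the function names are illustrative assumptions rather than the paper’s actual data or estimation code.

```python
# Sketch: the annual fee delta that equates average realised mean-variance
# utility under two strategies. Return series are simulated placeholders.
import numpy as np
from scipy.optimize import brentq

def mv_utility(r, lam):
    """Per-period quadratic (mean-variance) utility."""
    return r - 0.5 * lam * r ** 2

def performance_fee(r_model, r_benchmark, lam=0.0278, periods_per_year=252):
    """Annualised fee delta solving mean U(r_model - delta) = mean U(r_benchmark)."""
    gap = lambda d: (mv_utility(r_model - d, lam).mean()
                     - mv_utility(r_benchmark, lam).mean())
    return brentq(gap, -0.05, 0.05) * periods_per_year

rng = np.random.default_rng(0)
r_gauss = rng.normal(0.0002, 0.01, 2520)   # hypothetical Gaussian-model daily returns
r_model = r_gauss + 0.0001                 # hypothetical model adds 1bp per day
fee = performance_fee(r_model, r_gauss)
```

With these illustrative inputs the solver recovers a fee close to the assumed 1 bp per day advantage, i.e. roughly 2.5% per annum.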
Table XVI – Economic Value Added: Power Utility

Power Utility: γ=5

| Data set | 1/N | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV |
|---|---|---|---|---|---|---|---|---|
| Asset Allocation | 8.6% | -3.9% | -1.2% | — | 7.6% | 15.5% | 15.3% | 15.0% |
| 5 Industries | -1.4% | 0.5% | 0.6% | -0.4% | 1.8% | 4.5% | 4.8% | 4.6% |
| 10 Industries | -2.3% | 0.1% | 0.5% | -0.3% | 2.9% | 5.3% | 5.6% | 5.3% |
| 30 Industries | -2.5% | 0.4% | 1.0% | 0.1% | 2.4% | 5.2% | 5.5% | 5.0% |
| 10 Stocks | -3.1% | -1.0% | -0.8% | -2.7% | 1.5% | -0.3% | 0.7% | 0.6% |
| Size Portfolios | -1.4% | 0.0% | 1.0% | -1.5% | 3.8% | 8.9% | 9.0% | 9.0% |
| Value Portfolios | -2.9% | 0.3% | -0.1% | -0.4% | 2.2% | 2.1% | 2.1% | 2.2% |
| Fama-French | 2.9% | 0.9% | 6.1% | — | 13.3% | 17.2% | 18.3% | 18.5% |
| Average | -0.3% | -0.3% | 0.9% | -0.9% | 4.4% | 7.3% | 7.7% | 7.5% |

Power Utility: γ=10

| Data set | 1/N | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV |
|---|---|---|---|---|---|---|---|---|
| Asset Allocation | -5.4% | -1.5% | 2.7% | — | 8.6% | 18.0% | 17.8% | 18.2% |
| 5 Industries | -5.4% | 2.3% | 3.3% | -1.4% | 2.5% | 6.8% | 7.1% | 7.0% |
| 10 Industries | -6.2% | 1.6% | 2.7% | -0.4% | 3.1% | 7.0% | 7.2% | 7.3% |
| 30 Industries | -3.2% | 1.2% | 3.0% | 0.2% | 1.3% | 6.7% | 6.8% | 6.6% |
| 10 Stocks | -8.3% | 0.2% | 0.0% | -2.2% | -0.7% | -0.1% | 1.1% | 1.2% |
| Size Portfolios | -4.5% | 1.0% | 4.3% | -2.8% | 4.7% | 11.8% | 11.9% | 12.0% |
| Value Portfolios | -6.4% | 1.7% | 2.7% | -1.1% | 3.0% | 4.5% | 4.6% | 4.7% |
| Fama-French | 8.0% | 3.7% | 7.6% | — | 13.7% | 18.7% | 19.5% | 20.8% |
| Average | -3.9% | 1.3% | 3.3% | -1.3% | 4.5% | 9.2% | 9.5% | 9.7% |

Power Utility: γ=15

| Data set | 1/N | Student-t | Historical Simulation | Clayton Copula/Gaussian Marginals | Exp. Weighted Moving Average | Filtered Historical Simulation | GGEV | GSEV |
|---|---|---|---|---|---|---|---|---|
| Asset Allocation | -5.6% | 1.3% | 7.4% | — | 10.3% | 21.4% | 21.1% | 22.2% |
| 5 Industries | -8.8% | 4.8% | 7.3% | -3.6% | 4.5% | 10.5% | 10.7% | 10.9% |
| 10 Industries | -9.2% | 3.5% | 5.4% | -0.6% | 3.9% | 9.5% | 9.6% | 10.0% |
| 30 Industries | -11.1% | 2.3% | 5.6% | 0.7% | 1.1% | 9.2% | 9.0% | 9.1% |
| 10 Stocks | -13.7% | 1.4% | 0.8% | -1.7% | -3.1% | 0.1% | 1.5% | 1.9% |
| Size Portfolios | -7.2% | 2.1% | 8.5% | -5.0% | 6.4% | 15.7% | 15.9% | 16.0% |
| Value Portfolios | -9.9% | 3.8% | 6.5% | -2.3% | 5.0% | 8.2% | 8.3% | 8.5% |
| Fama-French | 13.9% | 6.8% | 9.7% | — | 14.7% | 20.9% | 21.2% | 23.7% |
| Average | -6.4% | 3.2% | 6.4% | -2.1% | 5.4% | 11.9% | 12.2% | 12.8% |
| Grand Average | -3.5% | 1.4% | 3.5% | -1.4% | 4.8% | 9.5% | 9.8% | 10.0% |
Table XVI shows the value added of the given model relative to the Gaussian model for a power utility investor.
The value added is defined as the annual fee that equates the expected power utility of the given model and the
Gaussian model.
VI. Conclusions.
The massive losses at systemically important institutions during the 2008 global financial crisis and
the economic sequelae have reaffirmed the importance of robust risk management and portfolio
construction approaches. Financial assets deviate markedly from the idealised Gaussian distribution,
exhibiting heavy tails, volatility clustering, asymmetry and complex patterns of dependence. Up until
this point, there has been almost a complete absence of scalable techniques that can deal with all four
phenomena. We have proposed the GSEV model that incorporates ARMA/EGARCH, extreme-value
theory and the skewed-t copula. EGARCH captures heteroskedasticity and the leverage effect, while
extreme-value theory provides a robust theoretical framework for the asymptotic behaviour of the
tails. The skewed-t copula is perhaps unique among copulas in that it can accommodate
heterogeneous asymmetric dependence in high dimensions. The GSEV model produces superior VaR
risk forecasts to a range of benchmark methodologies commonly cited in the literature. The GSEV
model also outperforms the benchmark methodologies in an out-of-sample dynamic rebalancing
framework. The approach generates higher risk adjusted returns, lower drawdowns and more accurate
out-of-sample risk levels. The approach also outperforms the GGEV variant that employs the
Gaussian copula, lending support to the skewed-t copula and to the importance of accounting for
asymmetric dependence. We also show that the GSEV approach generates significant economic value
for the investor, far exceeding typical active management fees.
Appendix A – Benchmark Models
A. Unconditional Models
A1. 1/N
The first benchmark model we use is the so-called naïve 1/N portfolio, which weights each asset
equally. The 1/N benchmark is relevant for four reasons. Humans have an innate behavioural
tendency to equal weight the options presented to them. Benartzi and Thaler (2001) find that many
investors equally weight the investment choices they are presented with.68 It appears few are immune
from this bias towards equal weighting. Markowitz himself, when probed about how he allocated his
retirement investments in his TIAA-CREF account, confessed: “I should have computed the historic
covariances of the asset classes and drawn an efficient frontier. Instead…I split my contributions
fifty-fifty between bonds and equities.”69 The 1/N rule is not dependent on return or risk expectations and
is therefore devoid of the estimation risk that potentially contaminates the more complex approaches.
The 1/N rule is also easy for investors to apply, and thus a viable alternative in practice. Finally, the
1/N approach is pervasive in the literature.
A2. Gaussian
The second benchmark model we use is a portfolio constructed with the Gaussian distribution. This
serves as the key benchmark against which the alternative approaches are evaluated. Because we
68 See Huberman and Jiang (2006) for a more recent example.
69 Zweig (1998).
construct all portfolios, except for the 1/N rule, to have a CVaR of 2%, and because we use equal
return forecasts across assets, this is equivalent to the minimum variance portfolio scaled to the
targeted risk level. Despite, or perhaps because of, the fact that the minimum-variance portfolio does
not attempt to account for expected returns, it has been shown to perform well in out-of-sample
simulations. Clarke, de Silva and Thorley (2011) show that for the period 1967-2009, the
minimum variance portfolio has generated higher returns than the cap-weighted US equity market
with a 50% higher Sharpe ratio. The strong performance of the minimum-variance strategy prompted
MSCI to launch the Global Minimum Volatility Indices in 2008.
The multivariate-Gaussian approach is heavily used by leading risk-model vendors and practitioners.
Portfolio variance is estimated using the standard formula

\sigma_p^2 = w'\Sigma w,

where w is the vector of portfolio weights and Σ is the covariance matrix of asset returns. CVaR is
then estimated analytically using the following standard relation.

Definition A1 Conditional Value at Risk for the Gaussian Distribution: The conditional value-at-risk
for the Gaussian distribution, where α is the confidence level, φ is the standard Gaussian density
function, Φ⁻¹ is the standard Gaussian quantile function, and σ_p is the standard deviation of
portfolio returns, is given by

\mathrm{CVaR}_\alpha = \sigma_p \, \frac{\phi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}.
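Definition A1 is easy to check numerically. The sketch below, with an illustrative σ and a zero mean, compares the analytic expression with a Monte Carlo estimate of the expected loss beyond VaR.

```python
# Analytic Gaussian CVaR versus a Monte Carlo estimate; sigma and the
# confidence level are illustrative.
import numpy as np
from scipy.stats import norm

def gaussian_cvar(sigma, alpha=0.99):
    """CVaR (expected shortfall) of N(0, sigma^2) returns at level alpha."""
    z = norm.ppf(alpha)
    return sigma * norm.pdf(z) / (1.0 - alpha)

sigma = 0.01
analytic = gaussian_cvar(sigma)

rng = np.random.default_rng(1)
losses = -rng.normal(0.0, sigma, 1_000_000)
var_99 = np.quantile(losses, 0.99)
mc = losses[losses >= var_99].mean()   # mean loss beyond the 99% VaR
```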
A3. Multivariate Student-t
The Student-t distribution can accommodate heavy tails and has been widely used to capture the tails
of financial data. Indeed, the use of the distribution was supported by Markowitz and Usmen (1996a,
1996b). There are several formulations of the multivariate Student-t distribution.70 The Student-t
distribution, unlike the Gaussian distribution, allows for tail dependence; however, the dependence
structure is symmetric between the upper and lower tails, which in general is unrealistic for financial
assets. We employ the Generalised Hyperbolic multivariate Student-t distribution in keeping with our
use of the Generalised Hyperbolic skewed-t copula in section III.
Definition A2 Multivariate Student-t Distribution: The density of the multivariate Student-t
distribution, where ν is the degrees of freedom, d is the number of dimensions, μ is a vector of means,
and Σ is a covariance matrix, is given by

f(x) = \frac{\Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)(\nu\pi)^{d/2}\,|\Sigma|^{1/2}} \left[1 + \frac{(x-\mu)'\Sigma^{-1}(x-\mu)}{\nu}\right]^{-\frac{\nu+d}{2}}.

We use the expectation conditional maximisation (ECM) algorithm of Liu and Rubin (1998) to
estimate the parameters of the model. As in the Gaussian case, the CVaR can be estimated using
standard formulae.
Definition A3 Conditional Value at Risk for the Student-t Distribution: The conditional value-at-risk
for the Student-t distribution, where α is the confidence level, ν is the degrees of freedom, f_ν is the
density function and t_ν⁻¹ the quantile function of the Student-t, is given by

\mathrm{CVaR}_\alpha = \frac{f_\nu\left(t_\nu^{-1}(\alpha)\right)}{1-\alpha} \cdot \frac{\nu + \left(t_\nu^{-1}(\alpha)\right)^2}{\nu - 1}.
70 See Nadarajah and Kotz (2004) for an extensive survey.
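The analytic Student-t CVaR of Definition A3 can likewise be checked by simulation; the degrees of freedom and confidence level below are illustrative.

```python
# Analytic CVaR of a standard Student-t versus a Monte Carlo estimate.
import numpy as np
from scipy.stats import t as student_t

def t_cvar(nu, alpha=0.99):
    """CVaR of a standard Student-t with nu degrees of freedom at level alpha."""
    q = student_t.ppf(alpha, nu)
    return student_t.pdf(q, nu) * (nu + q ** 2) / ((nu - 1) * (1.0 - alpha))

nu = 6.0
analytic = t_cvar(nu)

rng = np.random.default_rng(2)
losses = -rng.standard_t(nu, size=2_000_000)
var_99 = np.quantile(losses, 0.99)
mc = losses[losses >= var_99].mean()   # mean loss beyond the 99% VaR
```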
A4. Historical Simulation
Historical simulation and its variant filtered historical simulation are currently the most widely used
methods for estimating VaR at commercial banks (Pritsker, 2006, Berkowitz, Christoffersen, and
Pelletier, 2008, Pérignon and Smith, 2010). Under standard historical simulation, the VaR and CVaR
estimates are generated by bootstrapping from the historical portfolio returns generated using the
current portfolio weights. There is no need to specify the univariate marginals or the multivariate
dependence structure, and the potential for model risk is attenuated. Further, the approach is fast and
intuitive. However, the approach is unable to adapt to shifts in the level of volatility or to produce
single-period returns outside of what has been observed historically. It is also well known that
historical simulation tends to perform poorly as we move further out into the tails, and will perform
poorly if data is scarce. Historical simulation is not assumption free, as is sometimes suggested: the
implicit assumptions are that the standardised distribution is stationary and the dependence structure
is persistent.
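The procedure can be sketched in a few lines; the return matrix and weights below are illustrative stand-ins for the paper’s data sets.

```python
# Historical-simulation VaR/CVaR sketch: bootstrap the portfolio return
# series implied by the current weights.
import numpy as np

def hs_var_cvar(returns, weights, alpha=0.99, n_boot=10_000, seed=0):
    """Bootstrap VaR and CVaR of the portfolio loss at level alpha."""
    rng = np.random.default_rng(seed)
    port = returns @ weights                        # historical portfolio returns
    idx = rng.integers(0, len(port), size=n_boot)   # sample with replacement
    losses = -port[idx]
    var = np.quantile(losses, alpha)
    cvar = losses[losses >= var].mean()
    return var, cvar

rng = np.random.default_rng(3)
rets = rng.normal(0.0, 0.01, size=(1000, 4))        # 1000 days, 4 assets (illustrative)
w = np.full(4, 0.25)
var99, cvar99 = hs_var_cvar(rets, w)
```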
A5. Gaussian/Clayton Copula Model - Alcock and Hatherley (2009)
Alcock and Hatherley (2009), among others,71 use the Clayton copula to capture lower-tail dependence,
and demonstrate economically significant gains from accounting for asymmetric dependence. The
Clayton copula was developed to model the incidence of disease within families (Clayton, 1978). The
Archimedean family of copulas, to which the Clayton copula belongs, has been formulated to have
convenient mathematical properties and is amenable to maximum likelihood estimation and
simulation. The key downside of Archimedean copulas in general is the lack of generalisability to
dimensions greater than two. Nested and hierarchical approaches have been proposed to remedy this
shortcoming; however, it is not obvious that these techniques provide definitive advantages. We
follow Alcock and Hatherley (2009), using Gaussian marginals and imposing a symmetric dependence
structure across assets. We use the inference functions for margins approach, estimating the mean and
variance of the marginal distributions via maximum likelihood estimation, and then deriving the
copula parameter, α.
The n-variate Clayton copula is given by

C(u_1,\ldots,u_n) = \left(\sum_{i=1}^{n} u_i^{-\alpha} - n + 1\right)^{-1/\alpha},  (7)

where α > 0 governs the strength of lower-tail dependence. The Clayton copula density is given by

c(u_1,\ldots,u_n) = \frac{\Gamma(1/\alpha + n)}{\Gamma(1/\alpha)}\,\alpha^{n} \left(\prod_{i=1}^{n} u_i\right)^{-(\alpha+1)} \left(\sum_{i=1}^{n} u_i^{-\alpha} - n + 1\right)^{-(n + 1/\alpha)},  (8)

where Γ is the Euler gamma function.
71 Rodrigues (2003), Hurlimann (2004), Viebig and Poddig (2010).
We maximise (8) to yield an estimate of α at each rebalance point for each data set.72 To simulate the
Clayton copula, we use the Marshall-Olkin (1988) method. The approach is fast and scalable.
Algorithm A1 Simulation of Clayton Copula: Marshall-Olkin (1988) method
1. Draw an independent gamma variate for each simulation: X ~ Gamma(1/α, 1).
2. Draw n i.i.d. realisations from the uniform distribution: V_1, …, V_n ~ U(0, 1).
3. Generate Clayton copula uniforms U_i for i = 1, …, n using U_i = (1 − ln V_i / X)^{−1/α}.
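A minimal sketch of Algorithm A1, with an illustrative α; for the Clayton copula the resulting draws should exhibit pronounced lower-tail dependence.

```python
# Marshall-Olkin (1988) sampler for the n-variate Clayton copula.
import numpy as np

def clayton_sample(alpha, n_dim, n_draws, seed=0):
    """Draw uniforms with Clayton dependence (alpha > 0)."""
    rng = np.random.default_rng(seed)
    # Frailty variable: one gamma draw per simulation (step 1)
    x = rng.gamma(shape=1.0 / alpha, scale=1.0, size=(n_draws, 1))
    v = rng.uniform(size=(n_draws, n_dim))          # step 2
    return (1.0 - np.log(v) / x) ** (-1.0 / alpha)  # step 3

u = clayton_sample(alpha=2.0, n_dim=2, n_draws=100_000)
```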
In each panel of figure 9 we show 10,000 draws from the bivariate Gaussian copula with ρ = 0.75
(left-hand panel), and the Clayton copula with α = 2 (middle panel) and α = 6 (right-hand panel).
B. Conditional Models
B1. Exponentially Weighted Moving Average (EWMA)
In 1989, JP Morgan developed the RiskMetrics™ model in order to quantify the firm’s market risk.
The approach uses an exponentially weighted covariance matrix to give more weight to recent
observations and to dampen the effect of observations falling out of the estimation window. The
exponentially weighted covariance matrix is given by

\Sigma_t = (1-\lambda)\sum_{i=1}^{\infty} \lambda^{i-1}\, r_{t-i}\, r_{t-i}',  (9)

where λ ∈ (0, 1) is the decay factor and r_t is the vector of asset returns. The EWMA approach can be
seen as a special case of a GARCH process without a mean-reversion term as follows. For a single
asset, the GARCH(1,1) conditional variance is

\sigma_t^2 = \omega + a\, r_{t-1}^2 + b\, \sigma_{t-1}^2.  (10)

Now if ω = 0, a = 1 − λ and b = λ, equation (10) reduces to

\sigma_t^2 = (1-\lambda)\, r_{t-1}^2 + \lambda\, \sigma_{t-1}^2,
72 The standard Clayton copula can only accommodate positive dependence. We therefore ensure positive
dependence before we use MLE by estimating Kendall’s tau.
[Figure 9: 10,000 simulated draws from the Gaussian copula with ρ = 0.75 (left), and the Clayton copula with α = 2 (middle) and α = 6 (right).]
which is equivalent to the recursive form of (9).
The EWMA approach assumes that returns are conditionally Gaussian. The simplicity of the EWMA
approach has contributed to the widespread employment of the technique through the RiskMetrics
software and by other risk model vendors. We employ a decay factor of λ = 0.94 for daily data, in line
with the RiskMetrics Technical Document and Fleming, Kirby and Ostdiek (2001).
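The EWMA variance recursion can be sketched as follows, using the RiskMetrics daily decay factor λ = 0.94; the return series is an illustrative assumption.

```python
# Recursive univariate EWMA variance: s2_t = lam*s2_{t-1} + (1-lam)*r_{t-1}^2.
import numpy as np

def ewma_variance(returns, lam=0.94):
    """EWMA conditional variance series for a single return series."""
    s2 = np.empty_like(returns)
    s2[0] = returns[0] ** 2            # simple initialisation with the first squared return
    for t in range(1, len(returns)):
        s2[t] = lam * s2[t - 1] + (1.0 - lam) * returns[t - 1] ** 2
    return s2

rng = np.random.default_rng(4)
r = rng.normal(0.0, 0.01, 1000)        # illustrative daily returns
vol = np.sqrt(ewma_variance(r))
```

The multivariate version applies the same recursion to the outer products r_{t−1} r_{t−1}′, yielding the exponentially weighted covariance matrix of equation (9).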
B2. Filtered Historical Simulation
Filtered historical simulation (FHS), as proposed by Barone-Adesi, Bourgoin and Giannopoulos (1998),
bootstraps from the historical standardised residuals, and scales by conditional volatility using a
GARCH model. FHS is a rare example of an approach that can produce stochastic volatility, skew,
heavy tails and tail dependence in high dimensions. The approach does not impose a parametric
distribution on the return innovations. Nor is it necessary to model the multivariate dependence
structure. Unlike standard historical simulation the approach is conditional and adapts to changes in
volatility. Again, the approach is not assumption free, implicitly assuming that the standardised
distribution is stationary and the dependence structure is persistent.
Algorithm A2 Filtered Historical Simulation
1. Fit a GARCH process, yielding residuals ε_t and conditional volatility estimates σ_t.
2. Standardise the residuals as follows: z_t = ε_t / σ_t.
3. Bootstrap 10,000 times from the standardised residuals to yield simulated vectors z*_τ,
where τ = 1, …, T days.
4. Scale the bootstrapped residuals using the GARCH volatility estimates: r*_τ = z*_τ σ*_τ,
where r*_τ is the simulated return in period τ, and σ*_τ is the simulated volatility
estimate in period τ using the updated GARCH process.73
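The four steps can be sketched as follows. To keep the sketch self-contained, an EWMA filter stands in for the fitted GARCH volatilities; in the paper proper, a GARCH model supplies the conditional volatilities and the next-period forecast.

```python
# Filtered historical simulation sketch: standardise by a volatility filter,
# bootstrap the standardised residuals, rescale to current volatility.
import numpy as np

def fhs_cvar(returns, alpha=0.99, lam=0.94, n_boot=10_000, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Volatility filter (a GARCH fit in the paper; EWMA here for brevity)
    s2 = np.empty_like(returns)
    s2[0] = returns.var()
    for t in range(1, len(returns)):
        s2[t] = lam * s2[t - 1] + (1.0 - lam) * returns[t - 1] ** 2
    sigma = np.sqrt(s2)
    z = returns / sigma                         # 2. standardised residuals
    sigma_next = np.sqrt(lam * s2[-1] + (1.0 - lam) * returns[-1] ** 2)
    z_star = rng.choice(z, size=n_boot)         # 3. bootstrap the residuals
    losses = -(z_star * sigma_next)             # 4. rescale to current volatility
    var = np.quantile(losses, alpha)
    return losses[losses >= var].mean()         # CVaR estimate

rng = np.random.default_rng(5)
r = rng.normal(0.0, 0.01, 2000)                 # illustrative daily returns
cvar99 = fhs_cvar(r)
```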
B3 GGEV Model
Finally, we evaluate the performance of a variant of the GSEV model that employs the Gaussian
copula instead of the skewed-t copula. The motivation is to shed light on the importance of accounting
for tail dependence through the skewed-t copula in the GSEV model.
73 Note that in the first simulated period we use the one-step-ahead forecast σ*_1, which is
estimated from the final estimated residual in the GARCH process.
References
Aas, K, and I Haff, 2006, The generalized hyperbolic skew student's t-distribution, Journal of
Financial Econometrics 4, 275-309.
Acerbi, C., and D. Tasche, 2002, On the coherence of expected shortfall, Journal of Banking
& Finance 26, 1487-1503.
Adler, Timothy, and Mark Kritzman, 2007, Mean–variance versus full-scale optimisation: In
and out of sample, Journal of Asset Management 7, 302-311.
Agarwal, V., and N. Y. Naik, 2004, Risks and portfolio decisions involving hedge funds,
Review of Financial Studies 17, 63-98.
Alcock, J, and A Hatherley, 2009, Asymmetric dependence between domestic equity indices
and its effect on portfolio construction, Australian Actuarial Journal 15.
Alles, L. A., and J. L. Kling, 1994, Regularities in the variation of skewness in asset returns,
Journal of Financial Research 17, 427-438.
Amin, G. S., and H. M. Kat, 2003, Stocks, bonds, and hedge funds - not a free lunch, Journal
of Portfolio Management 29, 113
Ang, A., and G. Bekaert, 2002, International asset allocation with regime shifts, Review of
Financial Studies 15, 1137-1187.
Ang, A., and J. Chen, 2002, Asymmetric correlations of equity portfolios, Journal of
Financial Economics 63.
Artzner, P., F. Delbaen, J. M. Eber, and D. Heath, 1999, Coherent measures of risk,
Mathematical Finance 9, 203-228.
Azzalini, A., and A. Capitanio, 2003, Distributions generated by perturbation of symmetry
with emphasis on a multivariate skew t-distribution, Journal of the Royal Statistical
Society Series B-Statistical Methodology 65, 367-389.
Baillie, R. T., and T. Bollerslev, 1989, The message in daily exchange-rates - a conditional-
variance tale, Journal of Business & Economic Statistics 7, 297-305.
Barone-Adesi, G, F Bourgoin, and K Giannopoulos, 1998, Don't look back, Risk 11.
Bauwens, L., S. Laurent, and J. V. K. Rombouts, 2006, Multivariate garch models: A survey,
Journal of Applied Econometrics 21, 79-109.
Bawa, V. S., and E. B. Lindenberg, 1977, Capital-market equilibrium in a mean-lower partial
moment framework, Journal of Financial Economics 5, 189-200.
Beedles, W. L., 1979, Asymmetry of market returns, Journal of Financial and Quantitative
Analysis 14, 653-660.
Beine, Michel, Antonio Cosma, and Robert Vermeulen, 2010, The dark side of global
integration: Increasing tail dependence, Journal of Banking & Finance 34.
Benartzi, S., and R. H. Thaler, 2001, Naive diversification strategies in defined contribution
saving plans, American Economic Review 91, 79-98.
Berkowitz, J., and J. O'Brien, 2002, How accurate are value-at-risk models at commercial
banks?, Journal of Finance 57, 1093-1111.
Berkowitz, Jeremy, Peter Christoffersen, and Denis Pelletier, 2011, Evaluating value-at-risk
models with desk-level data, Management Science 57, 2213-2227.
Black, F, 1976, Studies of stock price changes, Proceedings of the 1976 Meeting of the
American Statistical Association, Business and Economic Statistics Section.
Black, F, and R Litterman, 1992, Global portfolio optimization, Financial Analysts Journal
28-43.
Bloomfield, T, R Leftwich, and JB Long, 1977, Portfolio strategies and performance, Journal
of Financial Economics 5, 201-218.
Bollerslev, T, 2010. Glossary to ARCH (GARCH*) (Oxford University Press, Oxford Scholarship
Online).
Bollerslev, T., 1990, Modeling the coherence in short-run nominal exchange-rates - a
multivariate generalized arch model, Review of Economics and Statistics 72, 498-505.
Boudt, Kris, Brian Peterson, and Christophe Croux, 2008, Estimation and decomposition of
downside risk for portfolios with non-normal returns, Journal of Risk 11, 79-103.
Branco, M. D., and D. K. Dey, 2001, A general class of multivariate skew-elliptical
distributions, Journal of Multivariate Analysis 79, 99-113.
Brooks, C., and G. Persand, 2003, Volatility forecasting for risk management, Journal of
Forecasting 22, 1-22.
Campbell, R., R. Huisman, and K. Koedijk, 2001, Optimal portfolio selection in a value-at-
risk framework, Journal of Banking & Finance 25, 1789-1804.
Carhart, M. M., 1997, On persistence in mutual fund performance, Journal of Finance 52,
57-82.
Chekhlov, A, S Uryasev, and M Zabarakin, 2005, Drawdown measure in portfolio
optimization, International Journal of Theoretical and Applied Finance 8.
Chopra, V. K., and W. T. Ziemba, 1993, The effect of errors in means, variances, and
covariances on optimal portfolio choice, Journal of Portfolio Management 19, 6-11.
Christie, A. A., 1982, The stochastic-behavior of common-stock variances - value, leverage
and interest-rate effects, Journal of Financial Economics 10, 407-432.
Christoffersen, P, 2009. Value-at-risk models (Heidelberg: Springer, Berlin).
Christoffersen, P. F., 1998, Evaluating interval forecasts, International Economic Review 39,
841-862.
Clarke, Roger, Harindra de Silva, and Steven Thorley, 2011, Minimum-variance portfolio
composition, Journal of Portfolio Management 37, 31-45.
Cootner, P, 1964, The random character of stock market prices, M.I.T. Press.
Demarta, S., and A. J. McNeil, 2005, The t copula and related copulas, International
Statistical Review 73, 111-129.
DeMiguel, Victor, Lorenzo Garlappi, and Raman Uppal, 2009, Optimal versus naive
diversification: How inefficient is the 1/N portfolio strategy?, Review of Financial
Studies 22, 1915-1953.
Duchin, Ran, and Haim Levy, 2009, Markowitz versus the Talmudic portfolio diversification
strategies, Journal of Portfolio Management 35, 71+.
Efron, B, and R Tibshirani, 1993. An introduction to the bootstrap (Chapman and Hall, New
York).
Emmer, S., C. Kluppelberg, and R. Korn, 2001, Optimal portfolios with bounded capital at
risk, Mathematical Finance 11, 365-384.
Engle, R., 2002, Dynamic conditional correlation: A simple class of multivariate generalized
autoregressive conditional heteroskedasticity models, Journal of Business &
Economic Statistics 20, 339-350.
Erb, C, C Harvey, and T Viskanta, 1994, Forecasting international equity correlations,
Financial Analysts Journal November-December.
Fama, E. F., 1963, Mandelbrot and the stable paretian hypothesis, Journal of Business 36,
420-429.
Fama, E. F., 1965, The behavior of stock-market prices, Journal of Business 38, 34-105.
Fama, E. F., and K. R. French, 1993, Common risk-factors in the returns on stocks and
bonds, Journal of Financial Economics 33, 3-56.
Fernandez, C., and M. F. J. Steel, 1998, On bayesian modeling of fat tails and skewness,
Journal of the American Statistical Association 93, 359-371.
Fishburn, P. C., 1977, Mean-risk analysis with risk associated with below-target returns,
American Economic Review 67, 116-126.
Fisher, R, and L Tippett, 1928, Limiting forms of the frequency distribution of the largest and
smallest member of a sample, Proceedings of the Cambridge Philosophical Society.
Fleming, J., C. Kirby, and B. Ostdiek, 2001, The economic value of volatility timing, Journal
of Finance 56, 329-352.
Fleming, J., C. Kirby, and B. Ostdiek, 2003, The economic value of volatility timing using
"Realized" Volatility, Journal of Financial Economics 67, 473-509.
Fletcher, J, 1997, An examination of alternative estimators of expected returns in mean-
variance analysis, Journal of Financial Research 20, 129-143.
Frost, P. A., and J. E. Savarino, 1988, For better performance - constrain portfolio weights,
Journal of Portfolio Management 15, 29
Glosten, L. R., R. Jagannathan, and D. E. Runkle, 1993, On the relation between the expected
value and the volatility of the nominal excess return on stocks, Journal of Finance 48,
1779-1801.
Hansen, B. E., 1994, Autoregressive conditional density-estimation, International Economic
Review 35, 705-730.
Hansen, P. R., and A. Lunde, 2005, A forecast comparison of volatility models: Does
anything beat a GARCH(1,1)?, Journal of Applied Econometrics 20, 873-889.
Harvey, C. R., and A. Siddique, 1999, Autoregressive conditional skewness, Journal of
Financial and Quantitative Analysis 34, 465-487.
Hilal, S., S. H. Poon, and J. Tawn, 2011, Hedging the black swan: Conditional
heteroskedasticity and tail dependence in S&P 500 and VIX, Journal of Banking &
Finance 35, 2374-2387.
Hlawatsch, Stefan, and Peter Reichling, 2010, A framework for loss given default validation
of retail portfolios, Journal of Risk Model Validation 4, 23-48.
Hong, Yongmiao, Jun Tu, and Guofu Zhou, 2007, Asymmetries in stock returns: Statistical
tests and economic evaluation, Review of Financial Studies 20, 1547-1581.
Hu, W, and A Kercheval, 2007, Risk management with generalized hyperbolic distributions,
Proceedings of the Fourth International Conference on Financial Engineering and
Applications.
Hu, Wenbo, and Alec N. Kercheval, 2010, Portfolio optimization for student t and skewed t
returns, Quantitative Finance 10, 91-105.
Huberman, G., and W. Jiang, 2006, Offering versus choice in 401(k) plans: Equity exposure
and number of funds, Journal of Finance 61, 763-801.
Ibbotson, R, and P Kaplan, 2000, Does asset allocation explain 40, 90, or 100 percent of
performance?, Financial Analysts Journal January/February.
Jagannathan, R., and T. S. Ma, 2003, Risk reduction in large portfolios: Why imposing the
wrong constraints helps, Journal of Finance 58, 1651-1683.
Jegadeesh, N., 1990, Evidence of predictable behavior of security returns, Journal of Finance
45, 881-898.
Jobson, J. D., and B. M. Korkie, 1981, Performance hypothesis-testing with the sharpe and
treynor measures, Journal of Finance 36, 889-908.
Joe, H, and J Xu, 1996, The estimation method of inference functions for margins for
multivariate models, Technical Report no. 166, Department of Statistics, University of
British Columbia.
Jondeau, E., and M. Rockinger, 2012, On the importance of time variability in higher
moments for asset allocation, Journal of Financial Econometrics 10, 84-123.
Jones, M. C., and M. J. Faddy, 2003, A skew extension of the t-distribution, with
applications, Journal of the Royal Statistical Society Series B-Statistical Methodology
65, 159-174.
Jones, S, 2009, The formula that felled wall st, Financial Times.
Jorion, P., 1986, Bayes-stein estimation for portfolio analysis, Journal of Financial and
Quantitative Analysis 21, 279-292.
Jorion, P., 1991, Bayesian and CAPM estimators of the means - implications for portfolio
selection, Journal of Banking & Finance 15, 717-727.
Kahneman, D., and A. Tversky, 1979, Prospect theory: An analysis of decision under risk,
Econometrica 47, 263-291.
Karolyi, G. A., and R. M. Stulz, 1996, Why do markets move together? An investigation of
U.S.-Japan stock return comovements, Journal of Finance 51.
Keating, C, and W Shadwick, 2002, An introduction to omega, (The Finance Development
Centre).
Kennickell, A, 2011, Tossed and turned: Wealth dynamics of U.S. households, 2007-2009,
Finance and Economics Discussion Series 2011-51 (Federal Reserve Board,
Washington D.C.).
Kirby, C., and B. Ostdiek, 2012, It's all in the timing: Simple active portfolio strategies that
outperform naive diversification, Journal of Financial and Quantitative Analysis 47,
437-467.
Kollo, T, and G Pettere, 2010, Parameter estimation and application of the multivariate skew
t-copula, in P Jaworski, ed.: Copula theory and its applications, lecture notes in
statistics 198.
Kraus, A., and R. H. Litzenberger, 1976, Skewness preference and valuation of risk assets,
Journal of Finance 31, 1085-1100.
Kroll, Y., H. Levy, and H. M. Markowitz, 1984, Mean-variance versus direct utility
maximization, Journal of Finance 39, 47-61.
Kuester, K, S Mittnik, and M Paolella, 2006, Value-at-risk prediction: A comparison of
alternative strategies, Journal of Financial Econometrics 4.
Lauprete, G. J., A. M. Samarov, and R. E. Welsch, 2002, Robust portfolio optimization,
Metrika 55, 139-149.
Levy, H., and H. M. Markowitz, 1979, Approximating expected utility by a function of mean
and variance, American Economic Review 69, 308-317.
Li, D, 2000, On default correlation: A copula function approach, Journal of Fixed Income 9,
43-54.
Lintner, J, 1972, Equilibrium in a random walk and lognormal security market, Harvard
Institute of Economic Research.
Lizieri, Colin, Stephen Satchell, and Qi Zhang, 2007, The underlying return-generating
factors for REIT returns: An application of independent component analysis, Real
Estate Economics 35, 569-598.
Longin, F., and B. Solnik, 2001, Extreme correlation of international equity markets, Journal
of Finance 56, 649-676.
Luciano, E., and R. Kast, 2001, A value at risk approach to background risk, Geneva Papers
on Risk and Insurance Theory 26, 91-115.
Mandelbrot, B., 1963, The variation of certain speculative prices, Journal of Business 36,
394-419.
Markowitz, H. M., and N. Usmen, 1996, The likelihood of various stock market return
distributions .1. Principles of inference, Journal of Risk and Uncertainty 13, 207-219.
Markowitz, H. M., and N. Usmen, 1996, The likelihood of various stock market return
distributions .2. Empirical results, Journal of Risk and Uncertainty 13, 221-247.
Markowitz, Harry, 1952, Portfolio selection, Journal of Finance 7, 77-91.
Marshall, A. W., and I. Olkin, 1988, Families of multivariate distributions, Journal of the
American Statistical Association 83, 834-841.
Martellini, Lionel, and Volker Ziemann, 2010, Improved estimates of higher-order
comoments and implications for portfolio selection, Review of Financial Studies 23,
1467-1502.
McAleer, Michael, 2009, The ten commandments for optimizing value-at-risk and daily
capital charges, Journal of Economic Surveys 23, 831-849.
McAleer, Michael, and Bernardo Da Veiga, 2008, Single-index and portfolio models for
forecasting value-at-risk thresholds, Journal of Forecasting 27, 217-235.
McNeil, A, and R Frey, 2000, Estimation of tail related risk measures for heteroskedastic
time series: An extreme value approach, Journal of Empirical Finance 7.
McNeil, A, R Frey, and P Embrechts, 2005. Quantitative risk management: Concepts,
techniques and tools (Princeton University Press).
Memmel, Christoph, 2003, Performance hypothesis testing with the sharpe ratio, Finance
Letters 1.
Michaud, R, 1998. Efficient asset management (Harvard Business School Press, New York).
Mikosch, T, 2005, Copulas: Tales and facts, Working Paper (Laboratory of Actuarial
Mathematics, University of Copenhagen).
Misiorek, A, and Rafal Weron, 2010, Heavy-tailed distributions in VaR calculations, Hugo
Steinhaus Centre Research Report, University of Technology, Poland.
Nadarajah, S., and S. Kotz, 2004, Multitude of bivariate t distributions, Statistics 38, 527-539.
Nelson, D. B., 1991, Conditional heteroskedasticity in asset returns - a new approach,
Econometrica 59, 347-370.
Nystrom, K, and J Skoglund, 2002, A framework for scenario based risk estimation, Working
Paper (SAS Institute and the University of Umea).
Pastor, L., and R. F. Stambaugh, 1999, Costs of equity capital and model mispricing, Journal
of Finance 54.
Pastor, L., and R. F. Stambaugh, 2000, Comparing asset pricing models: An investment
perspective, Journal of Financial Economics 56.
Patton, A, 2004, On the out-of-sample importance of skewness and asymmetric dependence
for asset allocation, Journal of Financial of Econometrics 2.
Pedersen, Christian, and S Satchell, 1998, Choosing the right risk measure: A survey, in S.B.
Dahiya, ed.: The current state of economic science (Spellbound Publications).
Perignon, Christophe, and Daniel R. Smith, 2008, A new approach to comparing VaR
estimation methods, Journal of Derivatives 16, 54-66.
Politis, D, and H White, 2004, Automatic block-length selection for the dependent bootstrap,
Econometric Reviews 23.
Pritsker, M., 2006, The hidden dangers of historical simulation, Journal of Banking &
Finance 30, 561-582.
Pulley, L. B., 1981, A general mean-variance approximation to expected utility for short
holding periods, Journal of Financial and Quantitative Analysis 16, 361-373.
Sahu, S. K., D. K. Dey, and M. D. Branco, 2003, A new class of multivariate skew
distributions with applications to bayesian regression models, Canadian Journal of
Statistics-Revue Canadienne De Statistique 31, 129-150.
Salmon, F, 2009, Recipe for disaster: The formula that killed Wall Street, Wired Magazine.
Sibuya, M., 1960, Bivariate extreme statistics .1, Annals of the Institute of Statistical
Mathematics 11.
Simaan, Y., 1993, What is the opportunity cost of mean-variance investment strategies,
Management Science 39, 578-587.
Sklar, A, 1959, Fonctions de répartition à n dimensions et leurs marges, Publ. Inst. Statist. Univ.
Paris 8, 229-231.
Skoglund, Jimmy, Donald Erdman, and Wei Chen, 2010, The performance of value-at-risk
models during the crisis, Journal of Risk Model Validation 4, 3-21.
Smith, Michael S., Quan Gan, and Robert J. Kohn, 2012, Modelling dependence using skew t
copulas: Bayesian inference and applications, Journal of Applied Econometrics 27,
500-522.
Solnik, B., 1993, The performance of international asset allocation strategies using
conditioning information, Journal of Empirical Finance 1, 33-55.
Sortino, F. A., 2010. The Sortino Framework for Constructing Portfolios (Elsevier Science).
Stock, J., and M. Watson, 1999, A comparison of linear and nonlinear univariate models for
forecasting macroeconomic time series, in R. Engle and H. White, eds.: Cointegration
and Causality: A Festschrift in Honor of Clive W. J. Granger (Oxford University Press,
Oxford).
Sun, W., S. Rachev, S. V. Stoyanov, and F. J. Fabozzi, 2008, Multivariate skewed Student's t
copula in the analysis of nonlinear and asymmetric dependence in the German equity
market, Studies in Nonlinear Dynamics and Econometrics 12, 36.
Swanson, N. R., and H. White, 1997, A model selection approach to real-time
macroeconomic forecasting using linear models and artificial neural networks, Review
of Economics and Statistics 79, 540-550.
Taleb, N., 2007. The Black Swan (Random House).
Uryasev, S., and R. T. Rockafellar, 2001, Conditional value-at-risk: Optimization approach,
Stochastic Optimization: Algorithms and Applications 54, 411-435.
Viebig, Jan, and Thorsten Poddig, 2010, Modeling extreme returns and asymmetric
dependence structures of hedge fund strategies using extreme value theory and copula
theory, Journal of Risk 13.
Von Puelz, A., 2001, Value-at-risk based portfolio optimization, Stochastic Optimization:
Algorithms and Applications 54, 279-302.
Weiss, A., 1984, ARMA models with ARCH errors, Journal of Time Series Analysis 5, 129-143.
Welch, Ivo, and Amit Goyal, 2008, A comprehensive look at the empirical performance of
equity premium prediction, Review of Financial Studies 21, 1455-1508.
West, K. D., H. J. Edison, and D. Cho, 1993, A utility-based comparison of some models of
exchange-rate volatility, Journal of International Economics 35, 23-45.
Westerfield, J. M., 1977, Examination of foreign-exchange risk under fixed and floating rate
regimes, Journal of International Economics 7, 181-200.
Xiong, James X., and Thomas M. Idzorek, 2011, The impact of skewness and fat tails on the
asset allocation decision, Financial Analysts Journal 67, 23-35.
Zakoian, J. M., 1994, Threshold heteroskedastic models, Journal of Economic Dynamics &
Control 18, 931-955.
Zellner, A., and V. K. Chetty, 1965, Prediction and decision problems in regression models
from the Bayesian point of view, Journal of the American Statistical Association 60.
Zweig, J., 1998, Five lessons from America's top pension fund, Money.