
BACKTESTING GENERAL SPECTRAL RISK MEASURES

WITH APPLICATION TO EXPECTED SHORTFALL

NICK COSTANZINO AND MIKE CURRAN

Abstract. In this note, we present a simple, practical and easily implementable coverage test to backtest any spectral risk measure. Our test gives a single decision at a specified confidence level and is perfectly consistent with the binomial test for VaR. Particular attention is given to the special case of Expected Shortfall.

Contents

1. Background and Motivation
2. VaR and Spectral Risk Measures
3. Deriving the Backtest Statistic and Coverage Test
4. Application to Expected Shortfall
5. Conclusions
Acknowledgements
References

1. Background and Motivation

Let $\{t_i\}_{i=0}^{N}$ be a sequence of historical trading days and $\{L_i\}_{i=1}^{N}$ the corresponding realized trading losses. One way to assess the accuracy of the calculation of VaR for those trading days is to backtest VaR using the following coverage test.

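For each trading day $i = 1, \dots, N$, let $\mathrm{VaR}_i(\alpha)$ denote the VaR at level $\alpha$ and $X^{(i)}_{\mathrm{VaR}}(\alpha) := \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(\alpha)\}} \in \{0, 1\}$ denote the VaR failure indicator. We define the VaR failure rate $X^{N}_{\mathrm{VaR}}(\alpha) \in [0, 1]$ for level $\alpha \in [0, 1]$ over $N$ trading days as

$$
X^{N}_{\mathrm{VaR}}(\alpha) := \frac{1}{N} \sum_{i=1}^{N} X^{(i)}_{\mathrm{VaR}}(\alpha) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(\alpha)\}}. \tag{1.1}
$$

Date: February 21, 2015.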

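Hence $X^{N}_{\mathrm{VaR}}$ is simply the average number of VaR breaches at level $\alpha$ over $N$ trading days. By appealing to the Central Limit Theorem, $X^{N}_{\mathrm{VaR}}$ is approximately normal under the null hypothesis with expected value $\alpha$ and variance $\alpha(1 - \alpha)/N$. Thus $X^{N}_{\mathrm{VaR}}$ admits a Z-test with Z-score

$$
Z^{N}_{\mathrm{VaR}}(\alpha) = \sqrt{N} \left( \frac{\hat{X}^{N}_{\mathrm{VaR}}(\alpha) - \alpha}{\sqrt{\alpha(1 - \alpha)}} \right) \tag{1.2}
$$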

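where $\hat{X}^{N}_{\mathrm{VaR}}$ is the realized (empirical) VaR failure rate (1.1) over the $N$ trading days. We can then define a one-sided or two-sided Z-test through the standard normal cumulative distribution function $\Phi$. For instance, the one-sided test uses the p-value $\Phi(Z^{N}_{\mathrm{VaR}}(\alpha))$ (or $1 - \Phi(Z^{N}_{\mathrm{VaR}}(\alpha))$, depending on the tail being tested) and the two-sided test uses $2\,(1 - \Phi(|Z^{N}_{\mathrm{VaR}}(\alpha)|))$.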
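The test (1.1)-(1.2) can be sketched in a few lines of Python. This is a minimal illustration, assuming the realized values and the VaR forecasts are available as plain lists; the function names and the sample data are hypothetical, and the two-sided p-value is the standard $2(1 - \Phi(|Z|))$.

```python
import math
from statistics import NormalDist

def var_coverage_z(pnl, var_forecasts, alpha):
    """Return (failure_rate, z_score) for the VaR coverage test (1.1)-(1.2)."""
    if len(pnl) != len(var_forecasts) or len(pnl) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(pnl)
    # Failure indicator 1{L_i <= VaR_i(alpha)}, in the paper's sign convention.
    breaches = sum(1 for l, v in zip(pnl, var_forecasts) if l <= v)
    rate = breaches / n
    z = math.sqrt(n) * (rate - alpha) / math.sqrt(alpha * (1.0 - alpha))
    return rate, z

def two_sided_p_value(z):
    """Standard two-sided p-value of a Z-score."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical data: 8 breaches in 250 days against a 1% VaR.
realized = [-1.0] * 8 + [1.0] * 242
forecasts = [0.0] * 250
rate, z = var_coverage_z(realized, forecasts, alpha=0.01)
p_value = two_sided_p_value(z)
print(f"failure rate={rate:.3f}, Z={z:.2f}, p={p_value:.4f}")
```

With eight breaches where 2.5 are expected, the Z-score is large and the two-sided test rejects at the usual confidence levels.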

In this note we develop an extension of the VaR coverage backtest to any Spectral Risk Measure, in particular to Expected Shortfall. Similar to the VaR coverage test, which serves as the basis for the Basel Committee's Traffic Light test, our coverage test does not test independence (for more on VaR and backtesting methodologies cf. [9, 16]). As we shall see in Section 3, the coverage test for Spectral Risk Measures essentially amounts to a joint test of a continuum of weighted VaR quantiles and gives a single decision at a fixed confidence level. The key to the method is to show that the Spectral Measure Failure Rate defined by (3.2) is asymptotically normal under the null hypothesis and therefore admits a formal Z-test as in Lemma 3.5.

The motivation for developing a coverage test for Expected Shortfall and other Spectral Risk Measures is that ES is gaining traction as a risk measure and could soon replace VaR as the workhorse for risk management. This stems from regulators' desire to better capture tail risk, since by definition VaR does not provide any insight into the potential losses in the event of a loss in excess of the VaR level. For instance, in its Fundamental Review of the Trading Book [13], the Basel Committee proposes to replace VaR with ES, which captures tail risk better. In addition, since ES is a coherent risk measure [3], it allows for more straightforward allocation of capital to the underlying books of exposure and is hence reasonable to use for determining standardized capital requirements.


Despite the potentially prominent role of Expected Shortfall in risk management, accepted methods to backtest it are still elusive. Worse, there have been growing concerns that ES is in fact not even backtestable and thus should not be adopted as a risk measure. The claims that ES is not backtestable range from mainstream articles [10, 11] to research articles [12]. The claim can be traced back to observations that ES lacks certain mathematical properties such as elicitability [14]. While it is true that if a risk measure is elicitable there is a particular way to backtest it via the scoring function, lack of elicitability need not imply lack of backtestability. For example, as noted in [7], backtesting methods for VaR that are used in practice do not use any structure from elicitability. In fact, Acerbi & Szekely [2] recently made a strong argument that elicitability has nothing to do with backtesting at all, but rather only with model selection. Hence, the mathematical property of elicitability may not be as important as thought in the practical implementation of a backtesting algorithm.

In the midst of claims that ES is not backtestable, some backtesting approaches have nonetheless been proposed. These include the censored Gaussian approach introduced by Berkowitz [8], the functional delta approach used by Kerkhof and Melenberg [17] and the saddle-point technique introduced by Wong [18] and later extended by Graham and Pal [15]. As with every statistical method, each of these approaches has its strengths and weaknesses, and our method should be seen as complementary to them. In particular, our test gives a single decision at a specified confidence level.

2. VaR and Spectral Risk Measures

Before deriving our coverage test in Section 3, we review the basic definitions of VaR and Spectral Risk Measures for completeness.

Definition 2.1. (Value-at-Risk) Suppose $X$ is a random variable with cumulative distribution function $F_X$. The Value-at-Risk (VaR) of $X$ at confidence level $\alpha \in (0, 1)$ is given by

$$
\mathrm{VaR}(\alpha) := \inf\{ z \in \mathbb{R} : F_X(z) \ge \alpha \}. \tag{2.1}
$$
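As an illustration of the generalized inverse in (2.1), the following sketch evaluates it for an empirical distribution. Using the empirical CDF of a sample in place of $F_X$ is an assumption made here for the example, and the function name `empirical_var` is hypothetical.

```python
import math

def empirical_var(sample, alpha):
    """Smallest z with empirical CDF F(z) >= alpha, i.e. (2.1) with F_X replaced
    by the empirical distribution of `sample`."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    xs = sorted(sample)
    n = len(xs)
    # The k-th order statistic satisfies F(x_(k)) >= k/n, so we need the
    # smallest k with k/n >= alpha; the small offset guards against round-off.
    k = max(1, math.ceil(alpha * n - 1e-12))
    return xs[k - 1]

sample = [5, 1, 3, 2, 4]
print(empirical_var(sample, 0.4))   # F(2) = 0.4 >= 0.4
print(empirical_var(sample, 0.41))  # F(2) = 0.4 < 0.41, so the next value is taken
```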

VaR has been criticized as a risk measure in that it is not subadditive and does not take into account the severity of losses beyond level $\alpha$. In [4, 5] Artzner, Delbaen, Eber, and Heath outline some basic properties which any good risk measure should possess and call such measures Coherent. One class of such measures is the class of Spectral Risk Measures, which can be thought of as weighting VaR by a spectrum $\phi$ having particular properties.


Definition 2.2. (Admissible Risk Spectrum) We say $\phi \in L^1([0, 1])$ is an admissible risk spectrum if

i. $\phi$ is non-negative,
ii. $\phi$ is non-increasing,
iii. $\|\phi\|_1 = 1$.
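The three properties can be checked approximately on a grid. The sketch below is only a numerical sanity check under assumed grid size and tolerance, not a proof of admissibility; the helper `is_admissible_spectrum` and both example spectra are illustrative.

```python
def is_admissible_spectrum(phi, n_grid=10_000, tol=1e-3):
    """Grid-based check of properties i-iii of Definition 2.2 (approximate only)."""
    h = 1.0 / n_grid
    values = [phi((k + 0.5) * h) for k in range(n_grid)]    # midpoint samples
    non_negative = all(v >= 0.0 for v in values)
    non_increasing = all(a >= b - 1e-12 for a, b in zip(values, values[1:]))
    unit_mass = abs(sum(values) * h - 1.0) < tol            # midpoint rule for ||phi||_1
    return non_negative and non_increasing and unit_mass

alpha = 0.05
phi_flat_tail = lambda p: (1.0 / alpha) if p <= alpha else 0.0  # constant on [0, alpha]
phi_increasing = lambda p: 2.0 * p                              # integrates to 1 but increases

print(is_admissible_spectrum(phi_flat_tail))
print(is_admissible_spectrum(phi_increasing))
```

The second spectrum has unit mass and is non-negative, so it fails only because of property ii.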

Definition 2.3. (Spectral Risk Measure) Suppose $X$ is a random variable with cumulative distribution function $F_X$ and $\phi$ is an admissible risk spectrum. Then we say that $M_\phi$ defined by

$$
M_\phi := \int_0^1 \phi(p)\, \mathrm{VaR}(p)\, dp \tag{2.2}
$$

is a Spectral Risk Measure with risk spectrum $\phi$.

Note that $M_\phi$ depends on the distribution of $X$ through VaR (2.1), even though we suppress this dependence in our notation.
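To make the dependence on the distribution concrete, here is a hedged sketch of a plug-in estimate of (2.2) from a sample. It assumes the empirical quantile function as a stand-in for VaR and takes the cumulative spectrum `phi_cdf` (the integral of $\phi$ over $[0, p]$) as input; both the function name and the data are illustrative.

```python
def spectral_risk_measure(sample, phi_cdf):
    """Plug-in estimate of M_phi = int_0^1 phi(p) VaR(p) dp from a sample.

    phi_cdf(p) is the integral of phi over [0, p]. The empirical VaR(p) equals
    the k-th order statistic on ((k-1)/n, k/n], so the integral reduces to a
    weighted sum of order statistics.
    """
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (phi_cdf(k / n) - phi_cdf((k - 1) / n))
               for k, x in enumerate(xs, start=1))

alpha = 0.2
flat_tail_cdf = lambda p: min(p, alpha) / alpha   # spectrum constant on [0, alpha]
sample = [3.0, -4.0, 1.0, -2.0, 0.5, 2.0, -1.0, 4.0, 0.0, 1.5]
m_phi = spectral_risk_measure(sample, flat_tail_cdf)
print(m_phi)  # average of the two smallest observations, -4.0 and -2.0
```

For a spectrum that is constant on $[0, \alpha]$ and zero elsewhere, the estimate is the average of the lowest $\alpha n$ observations.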

Remark 2.4. If $\phi(p) = \mathrm{Dirac}_\alpha(p)$, then $M_{\mathrm{Dirac}_\alpha} = \mathrm{VaR}(\alpha)$. However, $\mathrm{Dirac}_\alpha(p)$ violates both Properties ii and iii in Definition 2.2 and is thus not an admissible risk spectrum. This explains why VaR is not a spectral risk measure.

In [1], C. Acerbi proved that Spectral Risk Measures (2.2) are Coherent. These properties have made Spectral Risk Measures, and ES in particular, a favourite of regulators. For instance, ES is the measure that is proposed to replace VaR under Basel III. For ES to be widely adopted, reasonable backtesting methods must be developed. We now construct the coverage backtest.

3. Deriving the Backtest Statistic and Coverage Test

In analogy with the VaR failure rate $X^{N}_{\mathrm{VaR}}$ described in the Introduction, we define a Spectral Risk Measure failure rate $X^{N}_{SR}$ which will serve as the test statistic for our coverage test for Spectral Risk Measures.

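Definition 3.1. (Spectral Risk Measure Failure Rate) For an admissible spectrum $\phi$, let $X^{(i)}_{SR} \in [0, 1]$ be defined by

$$
X^{(i)}_{SR}(\phi) = \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp. \tag{3.1}
$$

We define the Spectral Risk Measure failure rate $X^{N}_{SR} \in [0, 1]$ for admissible risk spectrum $\phi$ as

$$
X^{N}_{SR}(\phi) := \frac{1}{N} \sum_{i=1}^{N} X^{(i)}_{SR}(\phi) = \frac{1}{N} \sum_{i=1}^{N} \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp. \tag{3.2}
$$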

Remark 3.2. In contrast to VaR, where failure is a discrete event with value either zero or one (i.e. $X^{(i)}_{\mathrm{VaR}} := \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(\alpha)\}} \in \{0, 1\}$), the corresponding failure for Spectral Measures is a continuous variable with value between zero and one (i.e. $X^{(i)}_{SR} := \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \in [0, 1]$) and depends on the severity of the failure.

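To understand this better, we may alternatively write the Spectral Risk Measure Failure Rate as

$$
\begin{aligned}
X^{N}_{SR}(\phi) &:= \frac{1}{N} \sum_{i=1}^{N} \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \\
&= \frac{1}{N} \sum_{i=1}^{N} \int_0^1 \phi(p)\, \mathbf{1}_{\{\mathrm{VaR}_i^{-1}(L_i) \le p\}}\, dp \\
&= \frac{1}{N} \sum_{i=1}^{N} \int_{\mathrm{VaR}_i^{-1}(L_i)}^{1} \phi(p)\, dp \\
&= 1 - \frac{1}{N} \sum_{i=1}^{N} \Phi(\mathrm{VaR}_i^{-1}(L_i))
\end{aligned}
\tag{3.3}
$$

where $\Phi' = \phi$ with $\Phi(0) = 0$, so that $\Phi(1) = \|\phi\|_1 = 1$. In (3.3), $\Phi$ denotes the antiderivative of the risk spectrum rather than the standard normal distribution function.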
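The identity (3.3) can be checked numerically for a single day. The sketch below compares the integral form (3.1) with the closed form $1 - \Phi(u_i)$, where $u_i = \mathrm{VaR}_i^{-1}(L_i)$ and $\Phi$ is the antiderivative of $\phi$ vanishing at zero. The standard normal forecast distribution, the sample values and the helper names are all assumptions made for illustration.

```python
from statistics import NormalDist

def failure_by_integration(u, phi, n_grid=20_000):
    """Midpoint-rule value of int_0^1 phi(p) 1{u <= p} dp, as in (3.1)."""
    h = 1.0 / n_grid
    return sum(phi((k + 0.5) * h) * h
               for k in range(n_grid) if u <= (k + 0.5) * h)

def failure_by_cdf(u, phi_cdf):
    """Closed form 1 - Phi(u) from (3.3); phi_cdf integrates phi over [0, u]."""
    return 1.0 - phi_cdf(u)

alpha = 0.05
phi = lambda p: (1.0 / alpha) if p <= alpha else 0.0   # spectrum constant on [0, alpha]
phi_cdf = lambda p: min(p, alpha) / alpha

# Hypothetical forecast: the same standard normal distribution on each day.
forecast = NormalDist(0.0, 1.0)
realized = [-2.5, -1.7, 0.3]
u = [forecast.cdf(x) for x in realized]                # u_i = VaR_i^{-1}(L_i)
x_int = [failure_by_integration(ui, phi) for ui in u]
x_cdf = [failure_by_cdf(ui, phi_cdf) for ui in u]
for a, b in zip(x_int, x_cdf):
    print(f"integral={a:.4f}  closed form={b:.4f}")
```

A realized value deep in the tail gives a failure close to one, a mild breach gives a small positive failure, and a value outside the tail gives exactly zero.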

Definition 3.3. (Null Hypothesis for Coverage Test) The null hypothesis for the Spectral Risk Measure Coverage Test is

$$
H_0 : \ X^{(i)}_{SR} \perp X^{(j)}_{SR} \ \ \forall\, i \ne j, \quad \text{and} \quad \mathbb{P}[L_i \le \mathrm{VaR}_i(p)] = p \ \ \forall\, p \in \operatorname{supp} \phi. \tag{3.4}
$$

The first hypothesis essentially means that $\{X^{(i)}_{SR}\}_{i=1}^{N}$ are i.i.d. Since the alternative hypothesis is the complement of the null, our coverage test cannot differentiate between rejection due to the i.i.d. hypothesis and rejection due to the distribution hypothesis. This is a typical shortcoming of classical coverage tests, including the standard VaR coverage test.

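Proposition 3.4. (Mean and Variance of $X^{N}_{SR}$ under $H_0$) Consider the random variable $X^{N}_{SR} \in [0, 1]$ defined by (3.2). Then under the null hypothesis,

$$
\mu_\phi := \mathbb{E}[X^{N}_{SR}(\phi)] = \int_0^1 \phi(p)\, p\, dp \tag{3.5}
$$

and

$$
\sigma^2_\phi := \mathbb{V}[X^{N}_{SR}(\phi)] = \frac{1}{N} \left( 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp - \left( \int_0^1 \phi(p)\, p\, dp \right)^2 \right). \tag{3.6}
$$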

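Proof. To prove (3.5) we first consider a single trading day failure $X^{(i)}_{SR}$ and compute

$$
\begin{aligned}
\mathbb{E}[X^{(i)}_{SR}(\phi)] &= \mathbb{E}\left[ \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \right] \\
&= \int_0^1 \phi(p)\, \mathbb{E}[\mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}]\, dp \\
&= \int_0^1 \phi(p)\, \mathbb{P}[L_i \le \mathrm{VaR}_i(p)]\, dp \\
&= \int_0^1 \phi(p)\, p\, dp.
\end{aligned}
\tag{3.7}
$$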

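Thus

$$
\begin{aligned}
\mathbb{E}[X^{N}_{SR}(\phi)] &:= \mathbb{E}\left[ \frac{1}{N} \sum_{i=1}^{N} X^{(i)}_{SR}(\phi) \right] \\
&= \frac{1}{N} \sum_{i=1}^{N} \mathbb{E}\left[ X^{(i)}_{SR}(\phi) \right] \\
&= \frac{1}{N} \sum_{i=1}^{N} \int_0^1 \phi(p)\, p\, dp \\
&= \int_0^1 \phi(p)\, p\, dp.
\end{aligned}
\tag{3.8}
$$

Hence the expected value of the average over $N$ trading days is equal to the expected value of a single trading day.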

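To prove (3.6) we again first consider a single trading day failure $X^{(i)}_{SR}$ and compute

$$
\begin{aligned}
\mathbb{E}[(X^{(i)}_{SR}(\phi))^2] &= \mathbb{E}\left[ \left( \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \right)^2 \right] \\
&= \mathbb{E}\left[ \int_0^1 \phi(p)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \int_0^1 \phi(q)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(q)\}}\, dq \right] \\
&= 2\, \mathbb{E}\left[ \int_0^1 \int_0^p \phi(p)\phi(q)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}} \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(q)\}}\, dq\, dp \right] \\
&= 2\, \mathbb{E}\left[ \int_0^1 \int_0^p \phi(p)\phi(q)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\} \cap \{L_i \le \mathrm{VaR}_i(q)\}}\, dq\, dp \right] \\
&= 2\, \mathbb{E}\left[ \int_0^1 \int_0^p \phi(p)\phi(q)\, \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(q)\}}\, dq\, dp \right] \qquad \text{(since } q \le p \text{)} \\
&= 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, \mathbb{P}[L_i \le \mathrm{VaR}_i(q)]\, dq\, dp \\
&= 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp.
\end{aligned}
\tag{3.9}
$$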

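Using (3.7) and (3.9) we then have

$$
\begin{aligned}
\mathbb{V}[X^{(i)}_{SR}(\phi)] &:= \mathbb{E}[(X^{(i)}_{SR}(\phi))^2] - \mathbb{E}[X^{(i)}_{SR}(\phi)]^2 \\
&= 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp - \left( \int_0^1 \phi(p)\, p\, dp \right)^2.
\end{aligned}
\tag{3.10}
$$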

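Finally,

$$
\begin{aligned}
\mathbb{V}[X^{N}_{SR}(\phi)] &= \mathbb{V}\left[ \frac{1}{N} \sum_{i=1}^{N} X^{(i)}_{SR}(\phi) \right] \\
&= \frac{1}{N^2} \left( \sum_{i=1}^{N} \mathbb{V}[X^{(i)}_{SR}(\phi)] + \sum_{i \ne j} \underbrace{\operatorname{cov}\left( X^{(i)}_{SR}(\phi), X^{(j)}_{SR}(\phi) \right)}_{= 0 \text{ under } H_0 \text{ (3.4)}} \right) \\
&= \frac{1}{N^2} \sum_{i=1}^{N} \left( 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp - \left( \int_0^1 \phi(p)\, p\, dp \right)^2 \right) \\
&= \frac{1}{N} \left( 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp - \left( \int_0^1 \phi(p)\, p\, dp \right)^2 \right).
\end{aligned}
\tag{3.11}
$$

□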
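For a spectrum without a convenient closed form, the null mean and variance of Proposition 3.4 can be approximated numerically. The following is a rough sketch using a midpoint rule with an assumed grid size; the function `spectral_moments` is illustrative, and the example spectrum (constant on $[0, \alpha]$) is chosen because its moments, $\alpha/2$ and $\alpha(4 - 3\alpha)/(12N)$, are easy to verify by hand.

```python
def spectral_moments(phi, n_days, n_grid=20_000):
    """Midpoint-rule approximation of mu_phi (3.5) and sigma_phi^2 (3.6)."""
    h = 1.0 / n_grid
    running = 0.0   # approximates int_0^p phi(q) q dq up to the current cell
    second = 0.0    # accumulates 2 * int_0^1 phi(p) (int_0^p phi(q) q dq) dp
    for k in range(n_grid):
        p = (k + 0.5) * h
        w = phi(p) * h
        inner = running + 0.5 * w * p   # inner integral up to the cell midpoint
        second += 2.0 * w * inner
        running += w * p
    mu = running                        # int_0^1 phi(p) p dp
    return mu, (second - mu * mu) / n_days

alpha, n_days = 0.05, 250
phi = lambda p: (1.0 / alpha) if p <= alpha else 0.0
mu, var = spectral_moments(phi, n_days)
print(mu, alpha / 2.0)
print(var, alpha * (4.0 - 3.0 * alpha) / (12.0 * n_days))
```

The grid is aligned with the jump of this spectrum at $\alpha$; for spectra with jumps elsewhere a finer grid may be needed.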

Lemma 3.5. ($X^{N}_{SR}$ admits a Z-test) The Spectral Measure Failure Rate $X^{N}_{SR}$ is asymptotically normal under the null hypothesis and therefore admits a Z-test.


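Proof. Under the null hypothesis, the sequence $\{X^{(i)}_{SR}\}_{i=1}^{N}$ is i.i.d. with finite mean $\mu_\phi$ given by (3.5) and finite single-day variance $s^2_\phi := N \sigma^2_\phi$ given by (3.10), which does not depend on $N$ because $\sigma^2_\phi$ in (3.6) carries the factor $1/N$. Thus by the Lindeberg-Lévy Central Limit Theorem

$$
\sqrt{N} \left( X^{N}_{SR}(\phi) - \mu_\phi \right) \xrightarrow[N \to \infty]{D} \mathcal{N}(0, s^2_\phi). \tag{3.12}
$$

In fact, the convergence of the distribution functions is uniform, so that

$$
\lim_{N \to +\infty} \left\| \mathbb{P}\left[ \sqrt{N} \left( X^{N}_{SR} - \mu_\phi \right) \le \xi \right] - \Phi(\xi / s_\phi) \right\|_{L^\infty_\xi} = 0, \tag{3.13}
$$

where $\Phi$ here denotes the standard normal cumulative distribution function. Therefore, for large enough $N$, $X^{N}_{SR}$ is approximately normal and admits a Z-test. □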
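A small Monte Carlo experiment can illustrate the lemma. The sketch below assumes that under the null hypothesis the values $u_i = \mathrm{VaR}_i^{-1}(L_i)$ are i.i.d. uniform on $[0, 1]$, and it uses a spectrum that is constant on $[0, \alpha]$, for which the per-day failure is $\max(0, 1 - u_i/\alpha)$. The seed, sample sizes and function names are arbitrary choices for illustration.

```python
import random

def tail_failure(u, alpha):
    """Per-day failure (1/alpha) * int_0^alpha 1{u <= p} dp = max(0, 1 - u/alpha)."""
    return max(0.0, 1.0 - u / alpha)

def simulate_failure_rates(alpha, n_days, n_paths, seed=0):
    """Simulate the failure rate X^N over n_paths independent backtest windows."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_paths):
        total = sum(tail_failure(rng.random(), alpha) for _ in range(n_days))
        rates.append(total / n_days)
    return rates

alpha, n_days = 0.05, 250
rates = simulate_failure_rates(alpha, n_days, n_paths=2000)
sim_mean = sum(rates) / len(rates)
sim_var = sum((r - sim_mean) ** 2 for r in rates) / (len(rates) - 1)
theo_mean = alpha / 2.0
theo_var = alpha * (4.0 - 3.0 * alpha) / (12.0 * n_days)
print(f"mean: simulated {sim_mean:.5f}, theoretical {theo_mean:.5f}")
print(f"variance: simulated {sim_var:.3e}, theoretical {theo_var:.3e}")
```

The simulated mean and variance should land close to the theoretical values, although the finite-sample distribution remains somewhat skewed for small $\alpha$ and moderate $N$.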

As usual, it would be interesting to get precise error bounds on the difference between the distribution of $X^{N}_{SR}$ for finite $N$ and its limiting distribution. This would allow us to control the error in some measure and let us choose $N$ to within some tolerance. For instance, since the third moment of $X^{N}_{SR}$ is uniformly bounded, we can appeal to the Berry-Esseen theorem to give pointwise bounds on the difference between the CDF of $X^{N}_{SR}$ and the limiting CDF. However, we do not pursue that direction here. We finally arrive at our main theorem.

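Theorem 3.6. (Coverage Test for Spectral Risk Measures) Let $\mu_\phi$ and $\sigma_\phi$ be the mean and standard deviation of the Spectral Measure Failure Rate $X^{N}_{SR}$ (3.2) under the null hypothesis, given by

$$
\mu_\phi = \int_0^1 \phi(p)\, p\, dp \tag{3.14}
$$

$$
\sigma_\phi = \frac{1}{\sqrt{N}} \sqrt{ 2 \int_0^1 \int_0^p \phi(p)\phi(q)\, q\, dq\, dp - \left( \int_0^1 \phi(p)\, p\, dp \right)^2 }. \tag{3.15}
$$

Then the Z-score $Z^{N}_{SR}$ defined by

$$
Z^{N}_{SR}(\phi) := \frac{\hat{X}^{N}_{SR}(\phi) - \mu_\phi}{\sigma_\phi} \tag{3.16}
$$

defines a Z-test for $X^{N}_{SR}$ and a Coverage Test for $M_\phi$.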

Proof. By Lemma 3.5 the test statistic $X^{N}_{SR}$ admits a Z-test with standard Z-score. By Proposition 3.4 the mean and variance of $X^{N}_{SR}$ under the null hypothesis are given by (3.5) and (3.6) respectively, leading to (3.16). □
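Putting the pieces together, a coverage test for a general spectrum might look like the sketch below. It is not the authors' implementation: it takes the values $u_i = \mathrm{VaR}_i^{-1}(L_i)$ and the cumulative spectrum as inputs, and it evaluates the null moments through the equivalent representation $X^{(i)} = 1 - \Phi_\phi(u_i)$ with $u_i$ uniform on $[0, 1]$ (an assumption about how $H_0$ is used), rather than through the double integral in (3.15). All names and data are hypothetical.

```python
import math
from statistics import NormalDist

def spectral_coverage_test(pit_values, phi_cdf, n_grid=20_000):
    """Z-score (3.16) and two-sided p-value from values u_i = VaR_i^{-1}(L_i).

    phi_cdf(p) is the integral of the risk spectrum over [0, p].
    """
    n = len(pit_values)
    # Realized failure rate via the closed form X_i = 1 - phi_cdf(u_i).
    x_hat = sum(1.0 - phi_cdf(u) for u in pit_values) / n
    # Null moments: with u uniform on [0, 1], E[X] = int (1 - phi_cdf(u)) du
    # and E[X^2] = int (1 - phi_cdf(u))^2 du (midpoint rule on n_grid cells).
    h = 1.0 / n_grid
    xs = [1.0 - phi_cdf((k + 0.5) * h) for k in range(n_grid)]
    mu = sum(xs) * h
    second = sum(x * x for x in xs) * h
    sigma = math.sqrt((second - mu * mu) / n)
    z = (x_hat - mu) / sigma
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.05
phi_cdf = lambda p: min(p, alpha) / alpha          # spectrum constant on [0, alpha]
# Hypothetical backtest window: 10 tail events among 250 days.
pit_values = [0.0125] * 10 + [0.5] * 240
z, p_value = spectral_coverage_test(pit_values, phi_cdf)
print(f"Z={z:.3f}, two-sided p-value={p_value:.3f}")
```

In this made-up window the realized failure rate is 0.03 against a null mean of 0.025, which is well within one standard deviation, so the test does not reject.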

4. Application to Expected Shortfall

Expected Shortfall is a particular Spectral Risk Measure introduced in [4, 5] which has gained popularity since being introduced as a possible measure to replace VaR. It is given by the average VaR over the tail levels $p \in [0, \alpha]$,

$$
\mathrm{ES}(\alpha) := \frac{1}{\alpha} \int_0^{\alpha} \mathrm{VaR}(p)\, dp. \tag{4.1}
$$

In order to use the Coverage Test in Theorem 3.6 to backtest ES, we need to write (4.1) as a Spectral Risk Measure with a specific choice of risk spectrum $\phi_{ES}$.

Definition 4.1. (Expected Shortfall) The Expected Shortfall is a special case of a Spectral Risk Measure with risk spectrum $\phi$ given by

$$
\phi_{ES}(p) := \frac{1}{\alpha} \mathbf{1}_{\{0 \le p \le \alpha\}}. \tag{4.2}
$$

Thus, the ES risk measure (4.1) can be written as

$$
M_{ES}(\alpha) = \int_0^{\alpha} \frac{\mathrm{VaR}(p)}{\alpha}\, dp. \tag{4.3}
$$

Since $\phi_{ES}$ is an admissible risk spectrum, $M_{ES}$ enjoys all the mathematical properties of Spectral Risk Measures.

To derive the coverage test for ES, we recall Definition 3.1 and define the Expected Shortfall Failure Rate $X^{N}_{ES}$ as

$$
X^{N}_{ES}(\alpha) := \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\alpha} \int_0^{\alpha} \mathbf{1}_{\{L_i \le \mathrm{VaR}_i(p)\}}\, dp \tag{4.4}
$$

which is simply (3.2) with $\phi = \phi_{ES}$ (4.2). Hence by Theorem 3.6, $X^{N}_{ES}$ is asymptotically normal and admits a Z-test. To calculate the Z-score we need to calculate the mean and variance of $X^{N}_{ES}$ under the null hypothesis. To do this we substitute the Expected Shortfall risk spectrum $\phi_{ES}$ (4.2) into (3.5) and (3.6) and obtain

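$$
\begin{aligned}
\mu_{ES}(\alpha) &= \int_0^1 \phi_{ES}(p)\, p\, dp \\
&= \frac{1}{\alpha} \int_0^{\alpha} p\, dp \\
&= \frac{\alpha}{2}
\end{aligned}
\tag{4.5}
$$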

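and

$$
\begin{aligned}
\sigma^2_{ES}(\alpha) &= \frac{1}{N} \left( 2 \int_0^1 \int_0^p \phi_{ES}(p)\phi_{ES}(q)\, q\, dq\, dp - \mu^2_{ES} \right) \\
&= \frac{1}{N} \left( \frac{2}{\alpha^2} \int_0^{\alpha} \int_0^p q\, dq\, dp - \mu^2_{ES} \right) \\
&= \frac{1}{N} \left( \frac{2}{\alpha^2} \int_0^{\alpha} \frac{1}{2} p^2\, dp - \frac{1}{4} \alpha^2 \right) \\
&= \frac{\alpha}{N} \left( \frac{4 - 3\alpha}{12} \right).
\end{aligned}
\tag{4.6}
$$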
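As a quick numerical orientation, assume illustrative values $\alpha = 0.025$ and $N = 250$ (these numbers are chosen here for the example and do not come from the text). The null mean $\alpha/2$ and null variance $\alpha(4 - 3\alpha)/(12N)$ then evaluate to

$$
\mu_{ES} = \frac{0.025}{2} = 0.0125, \qquad
\sigma^2_{ES} = \frac{0.025}{250} \cdot \frac{4 - 0.075}{12} = \frac{0.098125}{3000} \approx 3.27 \times 10^{-5}, \qquad
\sigma_{ES} \approx 5.72 \times 10^{-3},
$$

so that $\mu_{ES} \pm 1.96\, \sigma_{ES} \approx [0.0013,\ 0.0237]$ is the range of realized failure rates that a two-sided test at the 95% level would not reject.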

Given $\mu_{ES}$ and $\sigma^2_{ES}$ we can use Theorem 3.6 to compute the Z-score which forms the basis for our Coverage Test. For completeness we write this as a Corollary to Theorem 3.6.

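Corollary 4.2. (Coverage Test for Expected Shortfall) The Expected Shortfall measure $M_{ES}$ (4.1) admits a Z-test with Z-score

$$
Z^{N}_{ES}(\alpha) = \frac{\hat{X}^{N}_{ES}(\alpha) - \mu_{ES}(\alpha)}{\sigma_{ES}(\alpha)} = \sqrt{3N} \left( \frac{2 \hat{X}^{N}_{ES}(\alpha) - \alpha}{\sqrt{\alpha(4 - 3\alpha)}} \right). \tag{4.7}
$$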
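The corollary translates into a very short routine. The sketch below assumes the backtester can evaluate $u_i = \mathrm{VaR}_i^{-1}(L_i)$, the forecast distribution function at the realized value, for each day; with the ES spectrum the per-day failure in (4.4) then reduces to $\max(0, 1 - u_i/\alpha)$. The function name and the sample window are hypothetical.

```python
import math

def es_coverage_z(pit_values, alpha):
    """Z-score (4.7) for the ES coverage test from values u_i = VaR_i^{-1}(L_i)."""
    n = len(pit_values)
    # Per-day failure (1/alpha) * int_0^alpha 1{u <= p} dp = max(0, 1 - u/alpha).
    x_hat = sum(max(0.0, 1.0 - u / alpha) for u in pit_values) / n
    return (math.sqrt(3.0 * n) * (2.0 * x_hat - alpha)
            / math.sqrt(alpha * (4.0 - 3.0 * alpha)))

# Hypothetical window: 6 tail events among 250 days at alpha = 2.5%.
alpha = 0.025
pit_values = [0.005] * 6 + [0.5] * 244
z = es_coverage_z(pit_values, alpha)
print(f"Z = {z:.3f}")
```

Each tail event here contributes 0.8 rather than 1 to the failure count, which is how the statistic reflects severity as well as frequency.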

This is clearly a test of the $\alpha$-tail in quantile space rather than dollar space. We define a mapping between the two spaces in a subsequent paper.

5. Conclusions

In this note we presented a simple coverage test for any Spectral Risk Measure, including Expected Shortfall. The test gives a single decision at a specified confidence level and is complementary to other testing methods, most notably the saddle-point techniques in [15, 18]. It would be interesting to compare the backtesting performance of our coverage test with the methods mentioned in the Introduction.

Acknowledgements

The authors would like to thank Carlo Acerbi (MSCI) and Janos Pal (BMO) for insightful discussions on backtesting risk measures, as well as the anonymous referee for a critical reading of the manuscript.


References

[1] C. Acerbi, Spectral measures of risk: A coherent representation of subjective risk aversion, Journal of Banking and Finance, Vol. 26, (2002), 1505-1518.

[2] C. Acerbi & B. Szekely, Backtesting Expected Shortfall, to appear in Risk Magazine, 2014.

[3] C. Acerbi & D. Tasche, On the coherence of Expected Shortfall, Journal of Banking and Finance, Vol. 26, No. 7, (2002), 1487-1503.

[4] P. Artzner, F. Delbaen, J.M. Eber, & D. Heath, Thinking Coherently, Risk, Vol. 10, No. 11, (1997), 68-71.

[5] P. Artzner, F. Delbaen, J.M. Eber, & D. Heath, Coherent Measures of Risk, Mathematical Finance, Vol. 9, No. 3, (1999), 203-228.

[6] Basle Committee on Banking Supervision, Supervisory Framework for the use of "Backtesting" in Conjunction with the Internal Models Approach to Market Risk Capital Requirements, January 1996.

[7] F. Bellini & V. Bignozzi, Elicitable Risk Measures, Preprint, December 2013.

[8] J. Berkowitz, Testing Density Forecasts, with Applications to Risk Management, Journal of Business and Economic Statistics, Vol. 19, No. 4, (2001), 465-474.

[9] S.D. Campbell, A Review of Backtesting and Backtesting Procedures, Finance and Economics Discussion Series, Federal Reserve Board, Washington, D.C., 2005.

[10] L. Carver, Mooted VaR substitute cannot be back-tested, says top quant, Risk, March 8, 2013.

[11] L. Carver, Back-testing expected shortfall: mission impossible?, Risk, October 17, 2014.

[12] J.M. Chen, Measuring market risks under the Basel Accord: VaR, stressed VaR, and expected shortfall, Aestimatio, The IEB International Journal of Finance, Vol. 8, (2014), 184-201.

[13] Bank for International Settlements, Fundamental review of the trading book: A revised market risk framework, Consultative Document, October 2013.

[14] T. Gneiting, Making and Evaluating Point Forecasts, SSRN Preprint, 2010.

[15] A. Graham & J. Pal, Backtesting value-at-risk tail losses on a dynamic portfolio, Journal of Risk Model Validation, Vol. 8, No. 2, 2014.

[16] P. Jorion, Value at Risk: The New Benchmark for Managing Financial Risk, 3rd Edition, McGraw-Hill, 2007.

[17] J. Kerkhof & B. Melenberg, Backtesting for Risk-Based Regulatory Capital, Journal of Banking and Finance, Vol. 28, No. 8, (2004), 1845-1865.

[18] W.K. Wong, Backtesting trading risk of commercial banks using expected shortfall, Journal of Banking and Finance, Vol. 32, No. 7, (2008), 1404-1415.


Risklab Toronto, University of Toronto, 1 Spadina Crescent, Toronto,ON, M5S 3G3


Bank of Montreal, 100 King St W, Toronto, ON, M5X 1A1
