European Journal of Operational Research 227 (2013) 190–198
Innovative Applications of O.R.
Multistage optimization of option portfolio using higher order coherent risk measures
Yassine Matmoura (Ecole Centrale, Paris, France), Spiridon Penev ⇑ (The University of New South Wales, Sydney, Australia)
Article history: Received 6 January 2012; Accepted 11 December 2012; Available online 21 December 2012
Keywords: Coherent risk measures; Duality; Average value-at-risk; Monte Carlo simulation; Kusuoka measure; Stochastic programming
http://dx.doi.org/10.1016/j.ejor.2012.12.013
⇑ Corresponding author. Address: Department of Statistics, School of Mathematics and Statistics, 2052 Sydney, NSW, Australia. Tel.: +61 (0)2 9385 7023; fax: +61 (0)2 9385 7123.
E-mail address: [email protected] (S. Penev).
Choosing a suitable risk measure to optimize an option portfolio's performance represents a significant challenge. This paper illustrates the advantages of higher order coherent risk measures for evaluating the evolution of option risk. It discusses the detailed implementation of the resulting dynamic risk optimization problem using stochastic programming. We propose an algorithmic procedure to optimize an option portfolio based on minimization of conditional higher order coherent risk measures. Illustrative examples demonstrate some advantages in the performance of the portfolio's levels when higher order coherent risk measures are used in the risk optimization criterion.
Crown Copyright © 2012 Published by Elsevier B.V. All rights reserved.
1. Introduction

Portfolio optimization is the process of selecting an allocation of wealth to financial instruments held to meet certain pre-defined criteria. It is prudent to take risk into account when evaluating the optimal portfolio; risk assessment and evaluation is in fact critical. The uncertainty in risk assessment is expressed mathematically using a random variable, a random vector or a random process, and the risk is a functional (numerical surrogate) related to it. Choosing the appropriate numerical surrogate is what choosing the suitable (optimal) risk measure is about (Rockafellar et al., 2008). In order to apply risk assessment methodology in practice, the inference aspect using data has to be addressed. In the context of the risk measures of relevance to financial applications, this aspect is still only vaguely discussed, especially in dynamic settings that describe the evolution of financial data in time. The relatively slow progress on the front of inference about dynamic stochastic risk could perhaps be explained by the fact that the resulting inference problems are cast as difficult and non-standard optimization problems under constraints. These constraints make the model better correspond to reality but make the optimization challenging (Shapiro et al., 2009). Further difficulties arise when trying to construct confidence regions for the risk. It is known how to proceed when the estimator of the risk measure results from an unconstrained optimization of a certain function, as in the case of parametric maximum likelihood estimators, M-estimators and the like. However, when we are dealing with non-explicitly defined estimators resulting from an involved constrained optimization problem, a new methodology concerning the asymptotic distribution of these estimators is required.
Suppose we want to invest an amount $A_0$ in $n$ assets. Denote by $x_i$, $i = 1,2,\ldots,n$, the portion of capital to be invested in each asset. Suppose, further, each asset has a return rate $R_i$, $i = 1,2,\ldots,n$. Of course, the return rates are unknown in advance and can be considered to be random. A decision is to be made on the choice of $x_i$, $i = 1,2,\ldots,n$, based on our beliefs about $R_i$, $i = 1,2,\ldots,n$. A spontaneous choice for an investor is to select the wealth allocation by maximizing the expected return rate. Such a decision leads to concentrating the investment in the assets with the highest expected returns. This solution is very simplistic since it does not take unfavorable events into account. It leaves out all considerations regarding the risk of loss if the most profitable assets (those with the largest expected return rates) start following a dropping path.
The pressing question to be answered first is how the risk should be defined. Recently, there have been attempts to formulate universally acceptable conditions of consistency that a risk measure should satisfy. The pioneering paper Artzner et al. (1999) introduced the concept of a coherent risk measure, which is meant to satisfy a set of properties: translation invariance, sub-additivity, positive homogeneity, and monotonicity. Later, Föllmer and Schied (2002) introduced a more general class of risk measures, called convex risk measures. In their definition, sub-additivity is replaced by the convexity property.
There is only a slight difference between the two definitions and very often, as will also be the case in this article, convex risk measures will be used as a counterpart of coherent risk measures. Formally, for random variables $X, Y$, the coherency of the risk $g$ (with values on the real axis) requires:

- convexity: $g(\lambda X + (1-\lambda)Y) \le \lambda g(X) + (1-\lambda)g(Y)$, $\lambda \in [0,1]$;
- monotonicity: if $X \le Y$ then $g(X) \ge g(Y)$;
- positive homogeneity: if $k \ge 0$ then $g(kX) = k\,g(X)$;
- translation invariance: if $m \in \mathbb{R}$ then $g(Y + m) = g(Y) - m$.
Here the function $g(X)$ represents a "safe equivalent" that offsets the random variable and is designed to highlight the most important risk information in $X$.
The convexity reflects the widely held view that diversification does not increase risk. Consequently, new convex coherent measures were introduced.
This line of research leads to new mathematical problems in convex analysis, optimization and statistics (see the papers of Rockafellar and co-authors in Rockafellar and Uryasev (2002) and Rockafellar et al. (2008)). Research problems of interest arise when trying to optimize coherent risk measures (or their data-based counterparts) under practically relevant restrictions on the random vectors involved in their description. Regarding practical applications, the real interest is focussed on dynamic risk measures. These need to be considered on a filtered probability space and to satisfy versions of the above four properties for suitably defined stopping times in a time-consistent manner. Despite recent advances in the theory of such measures, e.g., Acciaio and Penner (2011), Bion-Nadal (2009) and Ruszczynski (2010), the issue of statistical inference about such dynamic measures is lagging behind.
A way to deal with both performance and risk is based on utility theory. A utility function is a transformation of random outcomes which gives information on outcomes based on risk averse modeling. A precursor of this theory was developed in Markowitz (1952, 1987). His formulation of the investment decision making process led to the by now commonly named "mean–variance" model for portfolio optimization. This model obviously suffers from non-robustness, and various versions of robust optimization have been proposed (Ben-Tal and Nemirovski, 1999). When applied in a portfolio optimization context, these versions seem to significantly immunize the portfolio against uncertainty in the asset returns. Further derivative insurance guarantees, recently proposed in Zymler et al. (2011), allow for a better trade-off between weak and strong guarantees of the worst-case portfolio return. However, the resulting portfolio optimization problems are still similar to the Markowitz model. They cannot therefore overcome one of its main shortcomings, namely the fact that the variance as a measure of risk penalizes equally an excess over the mean and a shortfall of the same magnitude.
More non-symmetric risk measures are better suited for financial applications. A popular risk measure of this type is the Value at Risk VaR($\alpha$), i.e., simply the $\alpha$-quantile of the loss distribution. However, its non-convexity makes it a non-coherent risk measure. An alternative to VaR is the Average Value at Risk (AVaR) (Acerbi and Tasche, 2002; Rockafellar and Uryasev, 2002). When the underlying probability space on which the random variable $X$ is defined is atom-less, AVaR is a coherent risk measure, and even when the probability space is atomic, AVaR can become coherent after some continuity adjustment. Hence it is only a slight abuse to say that AVaR is a coherent risk measure, and as such it is preferred in algorithmic stochastic optimization.
The AVaR plays a central role in the theory of coherent risk measures due to the fact that it is a building block in the construction of every law invariant coherent risk measure via the Kusuoka representation. In particular, the main object of our study in this paper, the higher moment dual risk measures, allow for such a representation, too. These are tail risk measures, first introduced in Krokhmal (2007), with their Kusuoka representation derived in Dentcheva et al. (2010).
In order to define a framework for algorithmic portfolio optimization using these measures, we will introduce multistage risk averse optimisation using stochastic programming. A related approach has been used recently in Topaloglou et al. (2011), where the authors use it to optimize the risk of a portfolio comprising options and forwards. However, the risk measure used there is AVaR (in fact they minimize the conditional value at risk (CVaR), but it is well known that these two measures are very similar, and for return distributions without point masses they actually coincide). Our goal in this paper is to formalize the use of higher order dual risk measures as a criterion in option portfolio optimization and to demonstrate on a given data set the advantage of using these risk measures.
The plan of the paper is as follows. In Section 2, we will recall and reflect on properties of higher moment dual risk measures. We will then introduce our framework of algorithmic portfolio optimisation using stochastic programming in Section 3. In Section 4, we will describe how to estimate financial risk measures for options and will optimize a portfolio of vanilla options on S&P500 stocks using a higher order coherent risk measure, the mean–variance model, and AVaR (in their dynamic formulations), in order to illustrate the advantages of the higher order coherent risk measures in comparison to the other two. To the best of our knowledge, for a portfolio of derivatives, such a comparison has not been made in the literature. In Section 5 we discuss numerical implementation issues and illustrative examples. Section 6 concludes.
2. Higher order dual risk measures
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $p \in [1,\infty)$. We denote by $\bar{\mathbb{R}}$ the extended real line. Consider random variables $X, Y \in L^p(\Omega,\mathcal{F},P)$ and a coherent risk measure $g(\cdot): L^p(\Omega,\mathcal{F},P) \to \bar{\mathbb{R}}$. In the definition of coherent risk measure given in Section 1, random outcomes represent a reward. If we choose the random variables $X, Y$ to represent a loss, then the requirements of monotonicity and translation invariance in the definition should clearly be modified as follows:

- monotonicity: if $X \le Y$ then $g(X) \le g(Y)$,
- translation invariance: if $m \in \mathbb{R}$ then $g(Y + m) = g(Y) + m$.
Of particular importance for the results in this paper is a special class of coherent risk measures, namely the higher order tail risk measures. They have been investigated in Cheridito and Li (2009), Krokhmal (2007), and Dentcheva et al. (2010). In Krokhmal (2007) the name higher order risk measures was introduced for a set of measures that are in fact a special case of a more general family considered in Cheridito and Li (2009). The dual representation of these measures (termed the Kusuoka representation) was obtained in Dentcheva et al. (2010).
The higher order risk measures, to be introduced below, are believed to be advantageous in comparison to many preceding coherent risk measures with fixed tail cutoff points, precisely because their cutoff point is adjustable to the level of confidence $\alpha \in (0,1)$. In addition, values of $p > 1$ allow one to incorporate information from the higher $p$th order moments of the variables, which leads to a more flexible risk assessment.
Some examples of portfolio optimization models were shown in Krokhmal (2007). He also demonstrated some advantages of these measures in comparison to the classical mean–variance optimization model of Markowitz. Our goal in this paper is to demonstrate similar advantages in dynamic settings and in application to derivative portfolio optimization. The examples include data from the S&P500 Index.
For a random variable $X \in L^1(\Omega,\mathcal{F},P)$, we denote the cumulative distribution function (cdf) and the higher order cdfs as
$$F_X(\eta) = P[X \le \eta] = F_X^{(1)}(\eta), \qquad F_X^{(k)}(\eta) = \int_{-\infty}^{\eta} F_X^{(k-1)}(\alpha)\, d\alpha \quad \text{for } \eta \in \mathbb{R},\ k \ge 2. \tag{1}$$
The left-continuous inverse of the cumulative distribution function is defined as follows:
$$F_X^{(-1)}(\alpha) = \inf\{\eta : F_X(\eta) \ge \alpha\} \quad \text{for } 0 < \alpha < 1.$$
The absolute Lorenz function $F_X^{(-2)}(\cdot): \mathbb{R} \to \bar{\mathbb{R}}$ is defined as the cumulative quantile:
$$F_X^{(-2)}(\alpha) = \int_0^\alpha F_X^{(-1)}(t)\, dt \quad \text{for } 0 < \alpha \le 1. \tag{2}$$
We define $F_X^{(-2)}(0) = 0$ and observe that $F_X^{(-2)}(1) = E[X]$. Similarly to $F_X^{(2)}(\cdot)$, the function $F_X^{(-2)}(\cdot)$ is well defined for any random variable $X \in L^1(\Omega,\mathcal{F},P)$. By construction, it is convex. Very often, in the econometric literature, the normed variant, the relative Lorenz curve $\alpha \mapsto F_X^{(-2)}(\alpha)/E[X]$, is considered instead of the Lorenz curve. Because of the norming at $\alpha = 1$, the relative Lorenz function is a convex-shaped cumulative distribution function.
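For a finite sample, the absolute Lorenz function (2) reduces to partial averages of the order statistics. The following is a small illustrative sketch (not from the paper), evaluating the empirical $F_X^{(-2)}$ on a grid:

```python
import numpy as np

def absolute_lorenz(x, alphas):
    """Empirical absolute Lorenz function F_X^{(-2)}(alpha): the integral of
    the sample quantile function from 0 to alpha, cf. (2).  For a sorted
    sample this is a running partial sum scaled by the sample size, with
    linear interpolation between the grid points k/n."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    csum = np.concatenate([[0.0], np.cumsum(x)]) / n   # values at k/n, k = 0..n
    grid = np.arange(n + 1) / n
    return np.interp(alphas, grid, csum)

x = np.array([1.0, 2.0, 3.0, 4.0])
vals = absolute_lorenz(x, np.array([0.0, 0.25, 0.5, 1.0]))
# F_X^{(-2)}(1) equals the sample mean, here 2.5
```

The convexity of $F_X^{(-2)}$ is visible in the increasing increments of `vals`.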
Popular measures of risk are VaR, which at level $\alpha$ is defined as $\mathrm{VaR}_\alpha(X) = -F_X^{(-1)}(\alpha)$, and the AVaR, which at level $\alpha$ is defined as
$$\mathrm{AVaR}_\alpha(X) = -\frac{1}{\alpha} F_X^{(-2)}(\alpha) = \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_t(X)\, dt.$$
The AVaR is well known to be a coherent risk measure (for a demonstration see for example Krokhmal (2007)).
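As a sanity check on these definitions, both quantities are easy to estimate from a Monte Carlo sample. The following is an illustrative sketch (not from the paper); $X$ is a return, so large losses sit in the left tail:

```python
import numpy as np

def var_avar(x, alpha):
    """Sample VaR_alpha and AVaR_alpha of a return sample x.

    VaR_alpha(X) = -F_X^{(-1)}(alpha) is the negated sample alpha-quantile;
    AVaR_alpha(X) = (1/alpha) * integral_0^alpha VaR_t(X) dt is approximated
    by averaging over the worst alpha-fraction of outcomes."""
    x = np.sort(np.asarray(x, dtype=float))      # ascending: worst returns first
    n = len(x)
    k = max(int(np.ceil(alpha * n)), 1)          # number of tail observations
    var = -x[k - 1]                              # negated sample alpha-quantile
    avar = -x[:k].mean()                         # negated mean of the worst tail
    return var, avar

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, 100_000)          # illustrative return sample
var05, avar05 = var_avar(returns, 0.05)
print(var05, avar05)   # for N(0,1): roughly 1.645 and 2.06
```

Note that the sample AVaR always dominates the sample VaR at the same level, consistent with the integral representation above.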
A fundamental result in the theory of coherent measures of risk is the Kusuoka theorem (Kusuoka, 2001):

For a non-atomic space $\Omega$, every law invariant, finite-valued coherent measure of risk on $L^\infty(\Omega,\mathcal{F},P)$ can be represented as follows:
$$g(X) = \sup_{\mu \in \mathcal{M}} \int_0^1 \mathrm{AVaR}_\alpha(X)\, \mu(d\alpha), \tag{3}$$
with some convex set $\mathcal{M} \subset \mathcal{P}((0,1])$, the set of probability measures on $(0,1]$.
Kusuoka's result underscores the importance of AVaR in the construction of law invariant coherent risk measures.
We define $g(X)$ to be a Kusuoka measure of risk if there exists a convex set $\mathcal{M}$ in the set $\mathcal{P}((0,1])$ such that for all such $X$ we have
$$g(X) = \sup_{\mu \in \mathcal{M}} \int_0^1 \mathrm{AVaR}_\alpha(X)\, \mu(d\alpha).$$
Kusuoka's fundamental result above in fact states that, under general conditions, a coherent measure of risk on $L^\infty(\Omega,\mathcal{F},P)$ is a Kusuoka measure. This result was also extended to $L^p$ spaces ($p \ge 1$) (Shapiro et al., 2009). It can be used in two ways: to show a Kusuoka representation for a measure that is known to be coherent or, by choosing the set $\mathcal{M}$, to generate a new coherent risk measure.
In order to describe the intimate relation of the Kusuoka measures of risk with the higher order risk measures, we note the following Fenchel duality result from Ogryczak and Ruszczynski (1999).

We start by defining $F_X^{(2)}(\cdot)$:
$$F_X^{(2)}(\eta) = \int_{-\infty}^{\eta} F_X(\alpha)\, d\alpha = E[(\eta - X)_+] \quad \text{for } \eta \in \mathbb{R}. \tag{4}$$
Recall that for a function $f: \mathbb{R}^n \to \bar{\mathbb{R}}$, its Fenchel conjugate is $f^*: \mathbb{R}^n \to \bar{\mathbb{R}}$, $f^*(s) = \sup_x \{\langle s, x \rangle - f(x)\}$.

The crucial non-trivial fact is that not only are both $F_X^{(2)}(\cdot)$ and $F_X^{(-2)}(\cdot)$ convex functions, but they are also Fenchel conjugates of each other, that is,
$$F_X^{(-2)} = \left[ F_X^{(2)} \right]^* \quad \text{and} \quad F_X^{(2)} = \left[ F_X^{(-2)} \right]^* \tag{5}$$
holds. In particular, we get the representation of $\mathrm{AVaR}_\alpha(X)$:
$$\mathrm{AVaR}_\alpha(X) = -\frac{1}{\alpha} \left[ F_X^{(2)} \right]^*(\alpha) = -\frac{1}{\alpha} \sup_{\eta \in \mathbb{R}} \left\{ \eta\alpha - F_X^{(2)}(\eta) \right\} = \inf_{\eta \in \mathbb{R}} \left\{ \frac{1}{\alpha} E[(\eta - X)_+] - \eta \right\}. \tag{6}$$
If we set $c = 1/\alpha$ and take $\mathcal{M}$ to consist of a single measure $\mu$ that puts its whole mass on a fixed $\alpha \in (0,1]$, then $\mathrm{AVaR}_\alpha(X)$ is obviously a Kusuoka measure of risk. The above observations also suggest that, by replacing the one-norm in (6) by a $p$-norm ($p > 1$), we can more generally consider the following higher moment measures of risk:
$$g_{c,p}(X) = \inf_{\eta \in \mathbb{R}} \left\{ c\, \|(\eta - X)_+\|_p - \eta \right\}, \quad p > 1,\ c > 1. \tag{7}$$
In (7) the choice of the constant $c$ would generally depend on a chosen $\alpha \in (0,1)$ and, typically, $c = \frac{1}{\alpha}$. Although not obvious, it has been demonstrated in Krokhmal (2007) that the risk measures in (7) are coherent, i.e., they satisfy the four conditions listed in the Introduction.
These higher order measures have an advantage in comparison to predecessors of the type
$$-EX + b\, \|(EX - X)_+\|_p,$$
where the tail cutoff point $EX$ is fixed. For the new measures, the cutoff is adjustable via $\alpha$; hence they are called tail risk measures.
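Since the objective in (7) is convex in the cutoff variable, a sample version of $g_{c,p}$ can be computed with a one-dimensional solver. The following is a minimal illustrative sketch (not the authors' implementation), taking $c = 1/\alpha$ as suggested above:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def hmcr(x, alpha=0.05, p=2.0):
    """Sample version of the higher order risk measure (7):
    g_{c,p}(X) = inf_eta { c * ||(eta - X)_+||_p - eta }, with c = 1/alpha.
    The objective is convex in eta, so a bounded scalar solver suffices."""
    x = np.asarray(x, dtype=float)
    c = 1.0 / alpha

    def objective(eta):
        shortfall = np.maximum(eta - x, 0.0)
        return c * (shortfall ** p).mean() ** (1.0 / p) - eta

    # the minimizer lies within the range of the sample
    res = minimize_scalar(objective, bounds=(x.min(), x.max()), method="bounded")
    return res.fun, res.x          # risk value and the optimal cutoff eta_X

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 50_000)
risk_p2, eta_p2 = hmcr(x, alpha=0.05, p=2.0)
risk_p1 = hmcr(x, alpha=0.05, p=1.001)[0]   # p near 1 approaches AVaR_alpha
```

Because $\|\cdot\|_p$ is non-decreasing in $p$, the computed risk is non-decreasing in $p$ as well, illustrating how higher moments tighten the assessment.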
Let $p \in (1,\infty)$, let $q$ satisfy $1/p + 1/q = 1$, and choose a constant $c \ge 1$. Consider the risk measure $g_{c,p}(X)$. The main result of Dentcheva et al. (2010) states that $g_{c,p}(X)$ has the Kusuoka representation
$$g_{c,p}(X) = \sup_{\mu \in \mathcal{M}_q} \int_0^1 \mathrm{AVaR}_\alpha(X)\, \mu(d\alpha). \tag{8}$$
In particular, when $p = 2$ we get
$$g_{c,2}(X) = -\eta_X + \frac{\left\| \left( \eta_X - F_X^{(-1)} \right)_+ \right\|_2^2}{\left\| \left( \eta_X - F_X^{(-1)} \right)_+ \right\|_1},$$
where $\eta_X$ is the optimal solution in (7). If $Y = \left( \eta_X - F_X^{(-1)} \right)_+$ then
$$g_{c,2}(X) = E(Y) - \eta_X + \frac{\mathrm{Var}(Y)}{E(Y)}.$$
The last term on the right hand side is the Fano factor (Fano, 1947), which is itself an important measure of the noise-to-signal ratio used in Statistics. We note that $\eta_X$ may not represent a quantile of $X$, so that the above risk measures are of a novel type.
3. Dynamic risk optimization using stochastic programming
The risk measure is a theoretical construction involving a functional defined on a random variable. The random variable itself is typically a transformation of a random vector. (For example, the portfolio's value is a weighted combination of the values of the separate stocks in the portfolio. Optimizing the risk means finding the most appropriate weights.) Typically, there are some constraints involved in this optimization. One obvious constraint is that the sum of the weights should be equal to one. Given data, the theoretically defined functional can be replaced by an estimate, and the optimization of this estimate with respect to the constraints is the ultimate goal. However, rarely would it be the case that independent and identically distributed observations of the random variable exist. Typically only one observation (the portfolio's value) will be available at each point in time. Hence the dynamics of the portfolio's value is represented by a time series. Of course, more structure should be imposed on the time series model to allow for a tractable solution.
The proper framework in which to perform the optimization is a dynamic setting. This section describes the successive steps in such an analysis.
3.1. From 2-stage to multistage risk averse optimization
Recalling the motivating problem from Section 1, we have an amount $A_0$ to be invested in $n$ assets, where we need to satisfy both the requirement of a minimum expected return and that of a safe investment. The respective optimization problem can be formulated as follows:
$$\begin{aligned}
\min_x\ & g\left( \sum_{i=1}^n x_i R_i \right), \\
\text{subject to}\ & E\left[ \sum_{i=1}^n x_i R_i \right] \ge c, & \text{(i)} \\
& \sum_{i=1}^n x_i = A_0, & \text{(ii)} \\
& x \in C, & \text{(iii)}
\end{aligned} \tag{9}$$
with $g(\cdot)$ being the risk measure, $c \ge 0$, $x = [x_1, x_2, \ldots, x_n]'$, and $C$ a set to be defined in each instance. (Henceforth $'$ denotes transposition.)

The first constraint (i) symbolises the requirement for a minimum expected return; the second constraint (ii) represents the distribution of the whole amount $A_0$. In this setting, the constraint (iii) with $C = \mathbb{R}^n_+$ would imply that only long positions are allowed.
Problem (9) represents a one-stage risk averse optimization. We note that in this formulation a static evaluation of the risk is to be performed, hence it is typically easy to solve the resulting optimization problem. Next we would like to extend it to a dynamic problem by including many steps via a stochastic process formalism and by defining a new risk evaluation that is adjusted to the resulting filtration.
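For $g = \mathrm{AVaR}_\alpha$, representation (6) turns the sample version of problem (9) into a linear program (Rockafellar and Uryasev, 2002). The sketch below is illustrative only (the scenario matrix and asset parameters are invented), assuming long-only positions:

```python
import numpy as np
from scipy.optimize import linprog

def min_avar_portfolio(R, alpha=0.05, c_min=0.0, A0=1.0):
    """Sample-based version of problem (9) with g = AVaR_alpha, solved as the
    Rockafellar-Uryasev linear program.  R is a J x n matrix of simulated
    return-rate scenarios; long-only positions (C = R^n_+) are assumed.

    Variables: weights x (n), cutoff eta (1), tail excesses u (J).
    minimize  eta + (1/(alpha*J)) * sum_j u_j
    s.t.      u_j >= -x'R_j - eta, u_j >= 0        (AVaR linearization)
              (1/J) * sum_j x'R_j >= c_min         (minimum expected return)
              sum_i x_i = A0, x >= 0."""
    J, n = R.shape
    cost = np.concatenate([np.zeros(n), [1.0], np.full(J, 1.0 / (alpha * J))])
    A_ub = np.hstack([-R, -np.ones((J, 1)), -np.eye(J)])      # -x'R_j - eta - u_j <= 0
    b_ub = np.zeros(J)
    A_ub = np.vstack([A_ub, np.concatenate([-R.mean(axis=0), [0.0], np.zeros(J)])])
    b_ub = np.append(b_ub, -c_min)                            # expected-return constraint
    A_eq = np.concatenate([np.ones(n), [0.0], np.zeros(J)]).reshape(1, -1)
    bounds = [(0, None)] * n + [(None, None)] + [(0, None)] * J
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[A0], bounds=bounds)
    return res.x[:n], res.fun     # optimal weights and the minimized AVaR

rng = np.random.default_rng(2)
R = rng.normal([0.001, 0.002], [0.01, 0.03], size=(2000, 2))  # two hypothetical assets
w, risk = min_avar_portfolio(R, alpha=0.05)
```

The same scenario-based reduction is reused at every stage of the dynamic formulation introduced next.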
Let $\Omega$ be a sample space equipped with two $\sigma$-algebras $\mathcal{F}_1 \subset \mathcal{F}_2$ and a probability measure $P$ on $(\Omega,\mathcal{F}_2)$. A functional $g: L^p(\Omega,\mathcal{F}_2,P) \to L^p(\Omega,\mathcal{F}_1,P)$ is called a conditional risk mapping if it satisfies:

1. Convexity: for all $a \in [0,1]$ and all $Z, Z' \in L^p(\Omega,\mathcal{F}_2,P)$: $g(aZ + (1-a)Z') \le a\, g(Z) + (1-a)\, g(Z')$.
2. Monotonicity: for all $Z, Z' \in L^p(\Omega,\mathcal{F}_2,P)$: if $Z \ge Z'$ then $g(Z) \le g(Z')$.
3. Translation invariance: for all $(Z,Y) \in L^p(\Omega,\mathcal{F}_2,P) \times L^p(\Omega,\mathcal{F}_1,P)$: $g(Z + Y) = g(Z) - Y$.
4. Positive homogeneity: for all $a \ge 0$ and $Z \in L^p(\Omega,\mathcal{F}_2,P)$: $g(aZ) = a\, g(Z)$.
We observe immediately that if we choose trivially $\mathcal{F}_1 = \{\emptyset, \Omega\}$, we recover the static coherent risk measure definition. For more details about conditional risk mappings we refer to Shapiro et al. (2009, Section 6.7.2).
For $\alpha \in (0,1)$ and for each $\omega \in \Omega$, we can now define a random variable called the conditional AVaR as follows:
$$\mathrm{AVaR}_\alpha(X \mid \mathcal{F}_1)(\omega) = \inf_{Y(\omega) \in L^p(\Omega,\mathcal{F}_1,P)} \left\{ Y(\omega) + \frac{1}{\alpha}\, E\left[ (-X - Y)_+ \mid \mathcal{F}_1 \right](\omega) \right\}. \tag{10}$$
Then, similarly to the approach in Section 2, we can define a higher order moment coherent conditional risk measure for all $\alpha \in (0,1)$, $p > 1$ as follows:
$$g(X \mid \mathcal{F}_1)(\omega) = \inf_{Y(\omega) \in L^p(\Omega,\mathcal{F}_1,P)} \left\{ Y(\omega) + \frac{1}{\alpha}\, E\left[ (-X - Y)_+^p \mid \mathcal{F}_1 \right]^{1/p}(\omega) \right\}. \tag{11}$$
To deal with multistage modeling on $L^p(\Omega,\mathcal{F},P)$, let us consider a filtration $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots \subset \mathcal{F}_T$ with $\mathcal{F}_T = \mathcal{F}$ and $\mathcal{F}_1 = \{\emptyset, \Omega\}$. Denote, for $j = 1,2,\ldots,T$, $\xi_j = L^p(\Omega,\mathcal{F}_j,P)$. Then we denote a selected family of conditional risk mappings satisfying conditions (1)-(4) above:
$$g_{t+1|\mathcal{F}_t}: \xi_{t+1} \to \xi_t, \quad t = 1,2,\ldots,T-1.$$
The composition $g_{2|\mathcal{F}_1} \circ \cdots \circ g_{T|\mathcal{F}_{T-1}}: \xi_T \to \mathbb{R}$ is a coherent risk measure (see Shapiro et al. (2009), Section 6.7.3). We can now formulate a multistage risk averse optimization problem. We note that in this formulation, as opposed to the static formulation (9) above, we use superscripts $x^t$, $t = 1,2,\ldots,T$, to indicate the time evolution as follows:
$$\begin{aligned}
\min_{x^1,x^2,\ldots,x^T}\ & \Big\{ Z_1(x^1) + g_{2|\mathcal{F}_1}\big[ Z_2(x^2(\omega),\omega) + \cdots + g_{T-1|\mathcal{F}_{T-2}}\big[ Z_{T-1}(x^{T-1}(\omega),\omega) + g_{T|\mathcal{F}_{T-1}}\big[ Z_T(x^T(\omega),\omega) \big] \big] \cdots \big] \Big\}, \\
\text{subject to}\ & x^1 \in H_1 = \mathbb{R}^n, \quad x^t(\omega) \in H_t(x^{t-1}(\omega),\omega),\ t = 2,\ldots,T,
\end{aligned} \tag{12}$$
with $Z_t$ such that $Z_t(x^t,\cdot) \in \xi_t$ and $H_t \subset \mathbb{R}^{n_{t-1}} \times \Omega$.
Problem (12) represents a very general formulation of the multistage risk averse optimization problem. The advantage of this general formulation is that it can be specialized to any kind of product, from correlated spot assets to derivative products. In the next subsection, we will specialize this formulation for the purpose of option portfolio selection.
3.2. Numerical implementation for option portfolio selection
Denote by $R_i^t$, $t = 1,2,\ldots,T$, the random process of rates of return for the $i$th asset ($i = 1,2,\ldots,n$). Hence $R^t = [R_1^t, R_2^t, \ldots, R_n^t]'$ is the random vector of rates of return of all assets in the portfolio at time $t$. We denote the $\sigma$-algebra $\mathcal{F}_t = \sigma(R^1, \ldots, R^t)$. The random process $(x^t)$, adapted to the filtration $\mathcal{F}_t$, is $x^t = [x_1^t, x_2^t, \ldots, x_n^t]'$ and represents the random vector of weights associated with the $n$ assets at time $t$. Then the (random) return of the portfolio at time $t$ is $Z_t(\omega) = \sum_{i=1}^n x_i^t(\omega) R_i^t(\omega) = (x^t)' R^t(\omega)$. The price process, that is, the $\mathcal{F}_t$-predictable random process registering the value of the portfolio, will be denoted by $(b_t)$, $t = 1,2,\ldots,T$. For a specific realization $r^t$ at time $t$ of $R^t$, and for a specific selection $\tilde{x}^t$ of $x^t$, we have
$$b_{t+1} = b_t + \sum_{i=1}^n \tilde{x}_i^t r_i^t,$$
with an initial value $b_0 = A_0$. We arrive at the following nested formulation of the portfolio selection problem:
$$\begin{aligned}
\min_{(x^t)}\ & \bar{g}(Z_T) = g_1\big[ \ldots g_{T-1|\mathcal{F}_{T-2}}\big[ g_{T|\mathcal{F}_{T-1}}[Z_T] \big] \ldots \big], \\
\text{subject to}\ & E\left[ \sum_{i=1}^n x_i^t R_i^t \,\Big|\, \mathcal{F}_{t-1} \right] \ge c, \\
& \sum_{i=1}^n x_i^t = b_t, \\
& x^t \in C_t, \quad t = 1,2,\ldots,T,
\end{aligned} \tag{13}$$
with $C_t$ a subset of $\mathbb{R}^n$ (usually $C_t = \mathbb{R}^n_+$) and $\bar{g}(\cdot) = g_1 \circ g_{2|\mathcal{F}_1} \circ \cdots \circ g_{T|\mathcal{F}_{T-1}}(\cdot)$. Note that $\bar{g}(\cdot)$ is directly a coherent risk measure if the conditional risks are defined as the respective conditional expectations (Shapiro et al., 2009, p. 325).
Next we introduce a general algorithm to implement the above optimization process. It includes the following steps:

1. At $t = 0$ we know the vector of initial prices of the assets $S^0 = [S_1^0, \ldots, S_n^0]'$ and we choose $b_0 = 1$.
2. For each fixed $t = 1$ to $t = T - 1$:
(a) From the information on previous realizations (up to moment $(t-1)$) and using the assumptions on the asset price evolution, we get the distribution of $R^t$.
(b) From the distribution of $R^t$ we draw an appropriately scaled sample of size $J$:
$$R_J^t = \begin{pmatrix} R_{1,1}^t & \ldots & R_{1,J}^t \\ \vdots & \ddots & \vdots \\ R_{n,1}^t & \ldots & R_{n,J}^t \end{pmatrix},$$
where $J$ denotes the number of simulated price evolutions.
(c) Using the generated samples in $R_J^t$ we replace the associated expected values by their empirical estimators and solve the resulting optimization problems:
$$\begin{aligned}
\min_{(x^t)}\ & g_{t|\mathcal{F}_{t-1}}\left( (x^t)' R_J^t \right), \\
\text{subject to}\ & \frac{1}{J} \sum_{j=1}^J (x^t)' R_J^t e_j \ge c, \\
& \sum_{i=1}^n x_i^t = b_t, \\
& x^t \in C_t, \quad t = 1,2,\ldots,T.
\end{aligned} \tag{14}$$
Here we have denoted by $e_j$ the unit-length vector $[0,0,\ldots,1,\ldots,0]'$ with a one in the $j$th position.
(d) We calculate the portfolio value at the next moment in time via $b_{t+1} = b_t + (\tilde{x}^t)' r^t$.
3. For $t = T$: now that $b = (b_0, b_1, \ldots, b_T)$ and $\tilde{x} = (\tilde{x}^1, \ldots, \tilde{x}^{T-1})$ are available, we can analyze the price variation (delta) of the portfolio, and the substitution of the optimal values of the optimization problem demonstrates the risk evolution.
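The steps above can be sketched as a rolling loop. This is illustrative scaffolding only, not the authors' implementation: the scenario generator and the "realized" return are placeholders, and a crude random search over long-only weights stands in for a proper solver of problem (14):

```python
import numpy as np

def multistage_rebalance(sample_returns, risk, T, n, b0=1.0, c_min=0.0):
    """Illustrative skeleton of the procedure in Section 3.2.
    sample_returns(t, history) -> n x J matrix R_J^t of simulated return rates;
    risk(portfolio_returns)    -> scalar risk estimate (e.g. sample AVaR)."""
    rng = np.random.default_rng(0)
    b, history, weights = [b0], [], []
    for t in range(1, T):
        RJ = sample_returns(t, history)              # steps (a)-(b)
        best_x, best_r = None, np.inf
        for _ in range(500):                         # step (c), crude search
            x = rng.dirichlet(np.ones(n)) * b[-1]    # x >= 0, sum x = b_t
            port = x @ RJ                            # J simulated portfolio returns
            if port.mean() >= c_min and risk(port) < best_r:
                best_x, best_r = x, risk(port)
        if best_x is None:                           # fallback: equal weights
            best_x = np.full(n, b[-1] / n)
        realized = RJ[:, 0]                          # stand-in for the realized r^t
        b.append(b[-1] + best_x @ realized)          # step (d)
        weights.append(best_x)
        history.append(realized)
    return np.array(b), weights

def sample_avar(z, alpha=0.05):
    z = np.sort(z)
    k = max(int(np.ceil(alpha * len(z))), 1)
    return -z[:k].mean()

sim = lambda t, h: np.random.default_rng(t).normal(0.001, 0.02, size=(3, 400))
b, ws = multistage_rebalance(sim, sample_avar, T=4, n=3)
```

In practice step (c) would be solved exactly, e.g. by the linear or conic program associated with the chosen conditional risk measure.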
4. Option portfolio optimization using higher order coherent risk measures
We note that the algorithmic method in Section 3.2 is presented in a very general form, in which we did not require specific assumptions on the price evolutions or on the kind of assets in the portfolio. In this section, we will specialize the method when applying it to a portfolio of options on stocks. This will allow us to evaluate the distribution of $R^t$. We then apply the method by varying the choice of the risk measures in order to illustrate the advantage of using higher order coherent risk measures.

At each step of the optimization algorithm we need to find a distribution of the rates of return of the options to substitute in the estimation of the risk. The risk is to be optimized by choosing the optimal weights (the $x^t$ values) that represent the portfolio re-balancing. Further, we need the variation of the option prices at each point in time. We achieve this by simulating many asset paths between every two time points. For each path we calculate the option price associated with the two dates and derive a distribution for the variation of the option price between these days, and for the return.

We now introduce a suitable option pricing model first, then we describe scenario simulations for the underlying assets in order to derive the return-rate distribution. Finally, we outline our approach to estimating the risk.
4.1. Option pricing model
There are many option pricing methods in the literature, with different levels of sophistication. Here, we will be using a relatively simple and not overly computationally intensive method, yet one which is flexible enough to evaluate the implied volatilities and correlations between the assets. The method uses Monte Carlo simulations based on a multivariate GARCH(1,1) model to estimate the time-varying covariance matrices. As the time for the Monte Carlo calculations in this model grows only linearly with the number of assets, the added advantage is that the method can be applied with a large number of options. The detailed presentation of the quasi-likelihood method to estimate the time-varying covariance matrices is in Ledoit et al. (2003). We only summarize the steps in a minimalist way in order to be able to describe the numerical procedure we are employing.
Let $u_i^t$ be the realization of the rate of return of asset $i$ at time $t$. We assume that for all $i = 1,2,\ldots,n$ and $t = 1,2,\ldots,T$: $E[U_i^t \mid \mathcal{F}_{t-1}] = 0$. The GARCH-style model for the covariance matrices is defined by
$$\begin{aligned}
\mathrm{Cov}\left( U_i^t, U_j^t \mid \mathcal{F}_{t-1} \right) &=: \mathrm{Cov}_{ij}^t = c_{ij} + a_{ij}\, u_i^{t-1} u_j^{t-1} + b_{ij}\, \mathrm{Cov}_{ij}^{t-1}, \quad i \ne j, \\
\mathrm{Var}\left( U_i^t \mid \mathcal{F}_{t-1} \right) &=: \left( \sigma_i^t \right)^2 = \sigma_i^2 + a_{ii} \left( u_i^{t-1} \right)^2 + b_{ii} \left( \sigma_i^{t-1} \right)^2.
\end{aligned} \tag{15}$$
The interpretation of (15) is that the time-varying covariance $\mathrm{Cov}_{ij}^t$ between the $i$th and $j$th assets at time $t$ depends on a long-term covariance between these two assets and also on the history at time $t-1$. Similar is the interpretation of the time-varying variance in the second equation in (15). To ensure covariance stationarity, the coefficients $a_{ij}, b_{ij}, a_{ii}, b_{ii}$ satisfy the restrictions
$$0 < a_{ij} + b_{ij} < 1, \quad 0 < a_{ii} + b_{ii} < 1, \quad a_{ij} > 0,\ b_{ij} > 0,\ a_{ii} > 0,\ b_{ii} > 0.$$
We assume multivariate normality: $U^t = [U_1^t, U_2^t, \ldots, U_n^t]' \sim N(0, \Sigma_t)$ with
$$\Sigma_t = \begin{pmatrix} \left( \sigma_1^t \right)^2 & \ldots & \mathrm{Cov}_{1,n}^t \\ \vdots & \ddots & \vdots \\ \mathrm{Cov}_{n,1}^t & \ldots & \left( \sigma_n^t \right)^2 \end{pmatrix}.$$
Then the quasi-likelihood procedure for consistent estimation of the diagonal coefficients of $\Sigma_t$ and of $a_{ii}, b_{ii}, \sigma_i^2$ is well known (Campbell et al., 1997, Section 12.2). In Ledoit et al. (2003), a simple quasi-likelihood method is suggested to find consistent estimators of the off-diagonal coefficients by restricting attention each time to the $2 \times 2$ submatrices corresponding to the variables $U_i$ and $U_j$. For inference purposes, we follow a standard approach whereby, if $L$ is a certain period of time used to estimate the above parameters using daily returns, then the estimators obtained are considered "valid" for the next time period of length $L$.
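Once the coefficients are estimated, recursion (15) can be iterated directly. The sketch below is illustrative: the coefficient matrices and initial covariance are invented, and the quasi-likelihood fitting step itself is not shown:

```python
import numpy as np

def garch_cov_path(u, c, a, b, cov0):
    """Iterate the GARCH(1,1)-style covariance recursion (15),
    Cov^t_{ij} = c_{ij} + a_{ij} u^{t-1}_i u^{t-1}_j + b_{ij} Cov^{t-1}_{ij},
    applied elementwise; the diagonal reproduces the variance equation.

    u: (T x n) observed return realizations; c, a, b: (n x n) coefficient
    matrices with 0 < a + b < 1 elementwise; cov0: initial covariance."""
    T, n = u.shape
    covs = [np.asarray(cov0, dtype=float)]
    for t in range(1, T + 1):
        outer = np.outer(u[t - 1], u[t - 1])   # u^{t-1}_i * u^{t-1}_j
        covs.append(c + a * outer + b * covs[-1])
    return covs                                 # Cov^0, ..., Cov^T

n = 2
c = np.full((n, n), 1e-6); np.fill_diagonal(c, 2e-6)
a = np.full((n, n), 0.05); b = np.full((n, n), 0.90)
rng = np.random.default_rng(3)
u = rng.normal(0.0, 0.01, size=(250, n))        # one year of synthetic daily returns
covs = garch_cov_path(u, c, a, b, cov0=np.cov(u.T))
long_run = c / (1 - a - b)                      # implied long-term covariance level
```

Since $a + b = 0.95 < 1$ here, the stationarity restrictions hold and the path mean-reverts toward `long_run`.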
To simplify the treatment, we assume, for now, that all options in the portfolio have the same exercise date $T$, and that we are dealing with European call and put options only. Let $K = [K_1, K_2, \ldots, K_n]'$ be the vector containing the $n$ strike prices and $r$ be the risk-free rate. Then at time $t < T$ the option price vector is
$$f(t) = \begin{pmatrix} f_1(t) \\ \vdots \\ f_n(t) \end{pmatrix} = e^{-r(T-t)} \begin{pmatrix} E\left[ \max\left( \pm\left( S_1^T - K_1 \right), 0 \right) \mid S_1^t \right] \\ \vdots \\ E\left[ \max\left( \pm\left( S_n^T - K_n \right), 0 \right) \mid S_n^t \right] \end{pmatrix}, \tag{16}$$
where we use $(+)$ for a call option and $(-)$ for a put option, and $S_i^t$ denotes the price of the underlying asset at time $t$. When evaluating the option, we assume that the stock price follows
$$S_i^T = S_i^t \exp\left\{ \left( r - \frac{1}{2} \sum_{j=1}^n \widetilde{\Sigma}{}_T^t(i,j) \right)(T-t) + \sqrt{T-t}\, U_i \right\}, \tag{17}$$
with $U = [U_1, U_2, \ldots, U_n]' \sim N\big( 0, \widetilde{\Sigma}{}_T^t \big)$, where each element of the matrix $\widetilde{\Sigma}{}_T^t$ is calculated as
$$\widetilde{\Sigma}{}_T^t(i,j) = (T-t)\left\{ \frac{c_{ij}}{1 - a_{ij} - b_{ij}} + \frac{1 - (a_{ij} + b_{ij})^{T-t}}{-\log(a_{ij} + b_{ij})\,(T-t)} \left( \Sigma_t(i,j) - \frac{c_{ij}}{1 - a_{ij} - b_{ij}} \right) \right\}$$
(we refer to Hull (2009, p. 479)). The above assumption about the stock price evolution generalizes the assumptions of the Black–Scholes model in that the variance is no longer considered constant and the assets are allowed to be correlated. Having simulated many random vectors $U^k = [U_1^k, U_2^k, \ldots, U_n^k]'$, $k = 1,2,\ldots,B$, we calculate for each of them the resulting price vector $f^k(t)$ and then take the arithmetic mean of the $B$ simulated vectors to evaluate $f(t)$. We chose $B = 10{,}000$ in our simulations and we used the Cholesky transform of $\widetilde{\Sigma}{}_T^t$ and the antithetic variable method (Hull, 2009, p. 425) to reduce the variance.
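A simplified sketch of this Monte Carlo pricing step, assuming uncorrelated assets with known constant per-period variances (the paper instead uses the GARCH-implied matrix $\widetilde{\Sigma}{}_T^t$ and its Cholesky transform); the strikes and rates below are invented:

```python
import numpy as np

def mc_option_prices(S_t, K, r, tau, sigma2, is_call, B=10_000, seed=0):
    """Monte Carlo prices of European options in the spirit of (16)-(17),
    simplified to independent assets with constant variances sigma2 (an
    assumption of this sketch).  Antithetic variates (U and -U) are used
    to reduce the variance of the estimate."""
    rng = np.random.default_rng(seed)
    n = len(S_t)
    Z = rng.normal(size=(B // 2, n)) * np.sqrt(sigma2)
    U = np.vstack([Z, -Z])                                  # antithetic pairs
    # risk-neutral terminal prices, cf. (17)
    ST = S_t * np.exp((r - 0.5 * sigma2) * tau + np.sqrt(tau) * U)
    sign = np.where(is_call, 1.0, -1.0)                     # (+) call, (-) put
    payoff = np.maximum(sign * (ST - K), 0.0)
    return np.exp(-r * tau) * payoff.mean(axis=0)           # discounted mean

S_t = np.array([100.0, 100.0]); K = np.array([100.0, 100.0])
prices = mc_option_prices(S_t, K, r=0.05, tau=1.0,
                          sigma2=np.array([0.04, 0.04]),
                          is_call=np.array([True, False]), B=200_000)
```

With these parameters the estimates should fall near the Black–Scholes values (about 10.45 for the call and 5.57 for the put), which is a convenient check on the simulation.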
4.2. Evolution scenarios
Since the returns distribution between $t$ and $t+1$ is difficult to evaluate explicitly, we adopt simulations at this step, too. We simulate the evolutions of the underlying stocks between $t$ and $t+1$.

The classical assumption is that stock increments follow Brownian motion, which leads to the standard log-normal model used in option pricing. However, when historical data on the underlying assets is available, we are in a position to perform more accurate modeling by estimating the density of the daily returns non-parametrically using a kernel density estimation method. The advantage of this approach is that we do not have to make any assumptions on the returns distribution.
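A minimal sketch of this step using a Gaussian kernel density estimate (scipy's `gaussian_kde`); the historical returns below are synthetic stand-ins, and the bandwidth is left at scipy's default:

```python
import numpy as np
from scipy.stats import gaussian_kde

def simulate_next_prices(hist_returns, S_now, J=1000, seed=0):
    """Simulate J next-period prices for one asset by resampling daily
    returns from a Gaussian kernel density estimate fitted to historical
    data, avoiding a parametric (e.g. log-normal) assumption."""
    kde = gaussian_kde(hist_returns)          # nonparametric return density
    r_sim = kde.resample(J, seed=seed)[0]     # J simulated daily returns
    return S_now * (1.0 + r_sim)

rng = np.random.default_rng(4)
hist = rng.standard_t(df=4, size=1000) * 0.01   # heavy-tailed synthetic daily returns
S_next = simulate_next_prices(hist, S_now=100.0, J=5000)
```

Resampling from the kernel estimate preserves features such as heavy tails that a log-normal model would smooth away.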
For the duration of the multistage portfolio optimization we choose the same $T$ as for the options' life. This means that the end of the algorithmic optimization coincides with the end of the options' life. We divide the interval $[0,T]$ into equidistant points with step $\Delta t = \frac{T}{M}$. In our simulations, we use $\Delta t = 5$ (that is, a week of trading). At each step $k\Delta t$, we know $S^{l\Delta t}$, $l = 1,2,\ldots,k$, and the previous returns $r^{l\Delta t}$, $l = 1,2,\ldots,(k-1)$. Now we generate the $n \times J$ matrix
$$\widetilde{S}^{(k+1)\Delta t} = \begin{pmatrix} S_1^{(k+1)\Delta t,\,1} & \ldots & S_1^{(k+1)\Delta t,\,J} \\ \vdots & \ddots & \vdots \\ S_n^{(k+1)\Delta t,\,1} & \ldots & S_n^{(k+1)\Delta t,\,J} \end{pmatrix}$$
consisting of J columns of simulated prices at time (k + 1)Δt. The price simulations are done under the kernel density estimator obtained from the historical data. Then for each column S^{(k+1)Δt, j} we calculate the associated option prices to get the matrix
\[
g^{(k+1)}
\begin{pmatrix}
E\big[\max\big(S_T^1-K_1,\,0\big)\,\big|\,S_1^{(k+1)\Delta t,\,1}\big] & \cdots & E\big[\max\big(S_T^1-K_1,\,0\big)\,\big|\,S_1^{(k+1)\Delta t,\,J}\big]\\
\vdots & & \vdots\\
E\big[\max\big(S_T^n-K_n,\,0\big)\,\big|\,S_n^{(k+1)\Delta t,\,1}\big] & \cdots & E\big[\max\big(S_T^n-K_n,\,0\big)\,\big|\,S_n^{(k+1)\Delta t,\,J}\big]
\end{pmatrix},
\]
with the scalar factor g^{(k+1)} = e^{−r(T−(k+1)Δt)}. Each column of the matrix will be denoted f̃((k + 1)Δt)_j, j = 1, 2, ..., J. At any moment kΔt in time, we know (have observed) the vector f(kΔt) ∈ R^n; thus we have a sample of return rates for the options at that time. This helps us to calculate the variation of the option price:
\[
\Delta\widetilde{f}_{ij}(k\Delta t) = \widetilde{f}_i((k+1)\Delta t)_j - f_i(k\Delta t),\qquad i = 1, 2, \dots, n,\; j = 1, 2, \dots, J. \qquad (18)
\]
The sample of values (18) constitutes the scenario of evolution be-tween kDt and (k + 1)Dt.
Finally, we are in a position to evaluate the sample of rates of return
\[
R_{ij}^{k\Delta t} = \frac{\widetilde{f}_i((k+1)\Delta t)_j - f_i(k\Delta t)}{f_i(k\Delta t)},\qquad 1\le j\le J,\; 1\le i\le n.
\]
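In code, this return matrix is a one-line broadcast (a trivial sketch; the function name is ours):

```python
import numpy as np

def scenario_returns(f_now, f_next):
    """Scenario return matrix R^{k.dt}: f_now holds the n observed
    option prices at k.dt, f_next the n x J simulated prices at
    (k+1).dt; broadcasting divides each row by its current price."""
    f_now = np.asarray(f_now, dtype=float)[:, None]   # shape (n, 1)
    f_next = np.asarray(f_next, dtype=float)          # shape (n, J)
    return (f_next - f_now) / f_now
```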
4.3. Risk evaluation
In accordance with (10) and (11), we have for Z ∈ L_p(Ω, F_t, P) and α ∈ (0, 1) the conditional versions
\[
\mathrm{CAVaR}_{\alpha}(Z) = \inf_{Y\in L_p(\Omega,\mathcal{F}_{t-1},P)}\left\{Y + \frac{1}{\alpha}\,E\big[(-Z-Y)_{+}\,\big|\,\mathcal{F}_{t-1}\big]\right\}
\]
and, for p > 1, the conditional higher moment coherent risk
\[
\mathrm{CHMCR}_{\alpha}(Z) = \inf_{Y\in L_p(\Omega,\mathcal{F}_{t-1},P)}\left\{Y + \frac{1}{\alpha}\,E\big[(-Z-Y)_{+}^{p}\,\big|\,\mathcal{F}_{t-1}\big]^{1/p}\right\}.
\]
Suppose we are at time kΔt (hence we know r^{(k−1)Δt} and S^{kΔt}). As we have seen, an estimator of R^{(k+1)Δt} is obtained using \((R_{ij}^{k\Delta t})\), i = 1, 2, ..., n, j = 1, 2, ..., J. The corresponding risks can be estimated as follows:
\[
\widehat{\mathrm{CAVaR}}_{\alpha}(Z^{k\Delta t}) = \inf_{\eta\in\mathbb{R}}\left\{\eta + \frac{1}{\alpha}\,\frac{1}{J}\sum_{j=1}^{J}\Big(-\eta-\sum_{i=1}^{n}x_i^{k\Delta t}R_{ij}^{k\Delta t}\Big)_{+}\right\},\qquad(19)
\]
\[
\widehat{\mathrm{CHMCR}}_{\alpha}(Z^{k\Delta t}) = \inf_{\eta\in\mathbb{R}}\left\{\eta + \frac{1}{\alpha}\,\frac{1}{J^{1/p}}\left(\sum_{j=1}^{J}\Big(-\eta-\sum_{i=1}^{n}x_i^{k\Delta t}R_{ij}^{k\Delta t}\Big)_{+}^{p}\right)^{1/p}\right\}.\qquad(20)
\]
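Both estimators are one-dimensional minimizations in η and can be evaluated numerically. The following sketch is our own (assuming SciPy's `minimize_scalar` is available); it covers both cases, with p = 1 giving the AVaR estimator (19).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def chmcr_hat(weights, R, alpha, p=1):
    """Sample estimator (19)/(20): minimize over eta the empirical
    higher-moment coherent risk of the portfolio return sum_i x_i R_ij.
    R is the n x J scenario return matrix."""
    port = np.asarray(R, dtype=float).T @ np.asarray(weights, dtype=float)
    J = port.size

    def objective(eta):
        excess = np.maximum(-eta - port, 0.0)
        return eta + np.sum(excess ** p) ** (1.0 / p) / (alpha * J ** (1.0 / p))

    lo, hi = -port.max() - 1.0, -port.min() + 1.0    # bracket the minimizer
    res = minimize_scalar(objective, bounds=(lo, hi), method="bounded")
    return res.fun
```

For example, with a single option whose scenario returns are (−0.1, 0, 0.1, 0.2) and α = 0.25, the p = 1 value is the worst-quartile loss, 0.1.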
5. Numerical implementation issues and illustrative examples
When p > 1, the optimization at each step k = 1, 2, ..., M − 1 is a convex optimization problem with convex constraints, hence it has a unique solution. The case p = 1 is in fact a linear programming problem and can in principle be solved with the simplex method, although the general procedure for p > 1 can also be applied to this particular case. We illustrate this for the case of (20); (19) is treated similarly. It is easily seen to be equivalent to:
\[
\min_{\eta,\,d,\,x}\ \left\{\eta + \frac{1}{\alpha}\,\frac{1}{J^{1/p}}\left(\sum_{j=1}^{J}d_j^{\,p}\right)^{1/p}\right\},
\]
subject to
\[
\sum_{i=1}^{n}x_i^{k\Delta t} = b(t),\qquad
\frac{1}{J}\sum_{j=1}^{J}\sum_{i=1}^{n}R_{ij}^{k\Delta t}x_i^{k\Delta t}\ \ge\ c,
\]
\[
-\eta-\sum_{i=1}^{n}R_{ij}^{k\Delta t}x_i^{k\Delta t}\ \le\ d_j,\qquad d_j\ \ge\ 0,\qquad j = 1,\dots,J,
\]
\[
x_i^{k\Delta t}\ \ge\ \lambda,\qquad i = 1,\dots,n.\qquad(21)
\]
Here we have denoted by d the vector in which the components d_j, j = 1, 2, ..., J, are stacked. If we only allow long positions, we set λ = 0; otherwise, λ can take a negative value depending on the investor's choice.
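For the p = 1 special case, which is a linear program, here is a minimal sketch of (21) using SciPy's `linprog` (an assumed substitute for the paper's CVX/simplex setup; the variable ordering [η, d, x] and the function name are ours):

```python
import numpy as np
from scipy.optimize import linprog

def avar_portfolio_lp(R, b, c, alpha, lam=0.0):
    """LP form of (21) for p = 1 (AVaR). R is the n x J matrix of
    scenario returns; decision variables are [eta, d_1..d_J, x_1..x_n]."""
    R = np.asarray(R, dtype=float)
    n, J = R.shape
    cost = np.concatenate(([1.0], np.full(J, 1.0 / (alpha * J)), np.zeros(n)))
    # scenario constraints: -eta - sum_i R_ij x_i - d_j <= 0
    A_ub = np.zeros((J + 1, 1 + J + n))
    for j in range(J):
        A_ub[j, 0] = -1.0
        A_ub[j, 1 + j] = -1.0
        A_ub[j, 1 + J:] = -R[:, j]
    b_ub = np.zeros(J + 1)
    # mean-return constraint: (1/J) sum_{ij} R_ij x_i >= c
    A_ub[J, 1 + J:] = -R.mean(axis=1)
    b_ub[J] = -c
    A_eq = np.zeros((1, 1 + J + n))
    A_eq[0, 1 + J:] = 1.0                              # budget constraint
    bounds = [(None, None)] + [(0, None)] * J + [(lam, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[b],
                  bounds=bounds)
    return res.fun, res.x[1 + J:]
```

With two assets and two equally likely scenarios, returns R = [[0.1, 0.1], [0.0, 0.2]] and α = 0.5, the optimizer puts all weight on the riskless first asset, for an AVaR of −0.1.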
We also note that for p = 2 the problem is a second-order cone programming problem. In this paper we use the CVX package to solve it. CVX is freely downloadable Matlab software for specifying and solving convex programs (Grant and Boyd, 2011), and hence can be used for any choice of p > 1 in our case.
To illustrate the advantages of the higher order coherent risk measures, we use stocks that are components of the S&P 500 index. We build options on n = 10 of these stocks and combine these options into a portfolio to be optimized. The sample
Fig. 1. Weekly rebalanced portfolio levels from 2003/11/04 to 2005/09/22. Risk-averse optimized portfolio evolutions. (Curves: AVaR, SMCRM, Mean–Var and TMCRM optimized portfolios, euro options.)
is composed by using the close prices of S&P 500 stocks from 4 January 2000 to 22 September 2005, which gives 1500 quotations for each stock. Initialization is done using quotations from 4 January 2000 to 3 November 2003. Using 1000 daily returns, we estimate the multivariate GARCH(1,1) coefficients and the covariance matrix (15). Next we calculate the 995 weekly returns. Using the remaining data from 4 November 2003 to 22 September 2005, we launch out-of-sample simulations. At this stage data arrive as continuously new quotations. Each week, the sample of 995 weekly returns and the sample of 999 daily returns are refreshed by deleting the five "oldest" and adding five new returns, and portfolio rebalancing is performed using these samples. Thus at each loop of the algorithm (i.e., every week) the previous 995 weekly returns are used to estimate the return distributions of the stocks, and the previous 999 daily returns are used to estimate the parameters of the GARCH(1,1) model. To illustrate the procedure numerically, we choose a basket composed of in-the-money calls defined at the date 4 November 2003, with strikes K = 0.8 × S_0. In our
example, we have T = 2005/09/22. We choose c to coincide with the risk-free rate and, as a result, the portfolio performs at least as well as a risk-free asset. For each risk measure we choose a level of confidence so as to have a fair comparison. We require these levels to satisfy
\[
\alpha(\mathrm{AVaR}) < \alpha(\mathrm{SMCRM}) < \alpha(\mathrm{TMCRM}),
\]
with SMCRM denoting the second moment coherent risk measure (p = 2) and TMCRM denoting the third moment coherent risk measure (p = 3). These choices are imposed by the definition of the higher order risk measures; for a discussion, we refer to Example 2.2 in Krokhmal (2007) (noting that the (1 − α) of his notation is denoted α in this paper). We choose α(AVaR) = 0.01, α(SMCRM) = 0.07 and α(TMCRM) = 0.1. Finally, we choose λ(t) = −b(t)/n, with n being the number of assets in the portfolio. In other words, we do allow short positions, as long as they are limited to the opposite of the portfolio's value.

Fig. 2. Weekly rebalanced portfolio risk evolutions from 2003/11/04 to 2005/09/22. Risk evolution of risk-averse optimized portfolios (10 euro Call In The Money S&P 500 assets). (Curves: AVaR, SMCRM, Mean–Var, TMCRM and VaR (CVaR dual value) optimized portfolios.)
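The weekly refresh of the fixed-length return windows described above (drop the five oldest observations, append the five newest) can be sketched with a bounded deque; the stored values here are placeholders, not real returns.

```python
from collections import deque

# A maxlen-bounded deque implements the rolling window: appending on
# the right automatically evicts the oldest entries on the left.
weekly_returns = deque(range(995), maxlen=995)    # placeholder history
new_week = [995, 996, 997, 998, 999]              # five new weekly returns
weekly_returns.extend(new_week)                   # five oldest are dropped
```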
Fig. 3. Weekly rebalanced portfolio delta evolution from 2003/11/04 to 2005/09/22. Delta evolution of optimized portfolios (10 euro Call In The Money S&P 500 assets). (Curves: AVaR, SMCRM, Mean–Var and TMCRM optimized portfolios, euro options.)
Fig. 1 represents the evolution of the portfolio levels. Every portfolio starts with the value 1 and follows a path driven by the risk optimization methodology. At the end of the 2 years of out-of-sample simulations, the best performing portfolio is the one based on TMCRM (p = 3) optimization; second best is the result obtained via SMCRM (p = 2) optimization, third best is the AVaR optimization (p = 1), and the worst result is obtained when the optimization uses the Markowitz model.
Fig. 2 shows the risk evolution of each portfolio. We can verify that, at virtually every point in time, the SMCRM, TMCRM and AVaR results are about equal and uniformly higher than those obtained when using VaR. It can also be seen that the SMCRM and TMCRM results are very volatile.
Fig. 3 reports the delta evolution of each portfolio. Delta has been calculated using the Monte Carlo simulations employed to calculate the option prices; the percentage variation of the underlying was set at 1%. We can see that delta grows very strongly from the 80th week for the SMCRM, TMCRM and AVaR portfolios. Moreover, delta is very volatile for every portfolio.
These results highlight the limitations of the Markowitz model when dealing with non-cash products; complications appear even with simple vanilla options. Simple minimization of variance is not sufficient to reduce future portfolio losses. The higher order coherent risk measures, by focusing on the tail behavior of the losses, use the information provided by the past more efficiently. However, these portfolios exhibit larger variations, which are difficult to predict. An investor who wants a more stable investment could, in addition, apply a global variance-reduction method to these portfolios, dividing every weight applied to the different options by its realized volatility. In this way the global volatility can be kept under control.
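This variance-reduction overlay can be sketched as follows (our own illustrative helper; the name and the renormalization choice are assumptions, and the sketch assumes long-only weights):

```python
import numpy as np

def vol_scaled_weights(weights, realized_vol):
    """Divide each option weight by its realized volatility, then
    rescale so the total allocation is unchanged."""
    w = np.asarray(weights, dtype=float) / np.asarray(realized_vol, dtype=float)
    return w * (np.sum(weights) / np.sum(w))
```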
We also observe heavy dependence on delta. One way to reduce this dependence is to take an opposite position in the index at a level determined by delta. This method may reduce performance, but will considerably reduce the dependence on market moves.
6. Conclusion
This paper developed an application of higher order risk measures to the optimization of an option portfolio. Starting with a discussion of coherent risk measures in static and dynamic settings, we discussed the family of higher order coherent risk measures through their general definition. This family can be used to estimate the risk of any process that follows uncertain paths. To deal with a general portfolio optimization we introduced multistage risk-averse optimization using stochastic programming. The latter method is a generalization of the well-known two-stage optimization, which minimizes a risk functional over portfolio weights while imposing a minimum expected return. An important advantage of this optimization is that it can be applied to any instrument for which a return distribution can be obtained. We chose to apply the method to vanilla European options. As the members of the S&P 500 index are strongly correlated, we took these correlations and the time dependence into account through the use of the flexible multivariate GARCH(1,1) covariance model in option pricing. After obtaining a sample of underlying paths between two instants, we were able to obtain the options' weekly returns. These evolution scenarios were implemented in the multistage optimization using each of the Markowitz, AVaR and HMCRM estimators. The results obtained with European calls show a clear efficiency of HMCRM when estimating option risk. The methodology outlined in the paper can be extended to any kind of option. Different option pricing methods have to be used to obtain the option returns, especially for path-dependent options, but the algorithmic method is generally applicable. An investor who uses American options can, for example, use a tree method to estimate the variation of the option prices.
There are further avenues to be explored. The global performance of risk optimization strategies could be improved by adding more layers of risk control. One possibility is to add delta hedging with a daily frequency to reduce the market exposure of the portfolio. Vega and gamma control can also be added by de-leveraging and leveraging the global exposure on a weekly or monthly basis. One more important avenue is to study, again using the higher order coherent risk measures instead of the more traditional ones, the risk evolution of a hedging strategy during the life cycle of the options.
The effects demonstrated in Section 5 are data specific. Replicating the empirical part of the study using newer data and different time spans, varying the choice of the constant c in (7), changing the subset of stocks from the S&P index, and other variations of the data analysis are possible. These activities are left as a future avenue of research. However, the presented case study indicates the great potential and the promising performance of the higher order coherent risk measures as a criterion in optimizing the performance of option portfolios.
Acknowledgements
The authors are very grateful to the editor and to the threeexternal referees for their valuable input on an earlier submission.Their very useful comments and suggestions helped us to improveboth the style and the presentation of the paper.
References
Acciaio, B., Penner, I., 2011. Dynamic risk measures. In: Di Nunno, G., Øksendal, B.(Eds.), Advanced Mathematical Methods for Finance. Springer, Heidelberg, NewYork, pp. 1–34.
Acerbi, C., Tasche, D., 2002. On the coherence of expected shortfall. Journal ofBanking and Finance 26, 1487–1503.
Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent measures of risk.Mathematical Finance 9, 203–228.
Ben-Tal, A., Nemirovski, A., 1999. Robust solutions of uncertain linear programs.Operations Research Letters 25, 1–13.
Bion-Nadal, J., 2009. Time consistent dynamic risk processes. Stochastic Processesand their Applications 119, 633–654.
Campbell, J., Lo, A., MacKinlay, A.C., 1997. The Econometrics of Financial Markets.Princeton University Press, Princeton.
Cheridito, P., Li, T.H., 2009. Risk measures on Orlicz hearts. Mathematical Finance19, 189–214.
Dentcheva, D., Penev, S., Ruszczynski, A., 2010. Kusuoka representation of higherorder dual risk measures. Annals of Operations Research 181, 325–335.
Grant, M., Boyd, S., 2011. CVX: Matlab software for disciplined convexprogramming, version 1.21. <http://cvxr.com/cvx/>.
Föllmer, H., Schied, A., 2002. Convex measures of risk and trading constraints.Finance and Stochastics 6, 429–447.
Krokhmal, P., 2007. Higher moment coherent risk measures. Quantitative Finance 7,373–387.
Hull, J., 2009. Options, Futures and Other Derivatives. Pearson Education, NewJersey.
Kusuoka, S., 2001. On law invariant coherent risk measures. Advances inMathematical Economics 3, 83–95.
Ledoit, O., Santa-Clara, P., Wolf, M., 2003. Flexible multivariate GARCH modelingwith an application to international stock markets. The Review of Economicsand Statistics 85, 735–747.
Markowitz, H.M., 1952. Portfolio selection. Journal of Finance 7, 77–91.
Markowitz, H.M., 1987. Mean–Variance Analysis in Portfolio Choice and Capital Markets. Blackwell, Oxford.
Ogryczak, W., Ruszczynski, A., 1999. From stochastic dominance to mean-risk models: semideviations and risk measures. European Journal of Operational Research 116, 33–50.
Rockafellar, R., Uryasev, S., Zabarankin, M., 2008. Risk tuning with generalized linearregression. Mathematics of Operations Research 33, 712–729.
Rockafellar, R., Uryasev, S., 2002. Conditional value-at-risk for general lossdistributions. Journal of Banking & Finance 26, 1443–1471.
Ruszczynski, A., 2010. Risk-averse dynamic programming for Markov decisionprocesses. Mathematical Programming 125, 235–261.
Shapiro, A., Dentcheva, D., Ruszczynski, A., 2009. Lectures on StochasticProgramming: Modelling and Theory. SIAM.
Topaloglou, N., Vladimirou, H., Zenios, S., 2011. Optimizing international portfolioswith options and forwards. Journal of Banking & Finance 35, 3188–3201.
Zymler, S., Rustem, B., Kuhn, D., 2011. Robust portfolio optimization with derivativeinsurance guarantees. European Journal of Operational Research 210, 410–424.