selected topics from portfolio optimization

105
Selected Topics from Portfolio Optimization Including some topics on Stochastic Optimization Lecture Notes Winter 2020 Alois Pichler Faculty of Mathematics DRAFT Version as of May 28, 2021

Upload: others

Post on 09-Dec-2021

16 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Selected Topics from Portfolio Optimization

Selected Topics fromPortfolio Optimization

Including some topics on

Stochastic Optimization

Lecture Notes

Winter 2020

Alois Pichler

Faculty of Mathematics

DRAFTVersion as of May 28, 2021

Page 2: Selected Topics from Portfolio Optimization

2

rough draft: do not distribute

Page 3: Selected Topics from Portfolio Optimization

Preface and Acknowledgment

Le silence รฉternel de ces espacesinfinis mโ€™effraie.

Blaise Pascal, Pensรฉes,Fragment Transition no 7 / 8

The purpose of these lecture notes is to facilitate the content of the lecture and the course.From experience it is helpful and recommended to attend and follow the lectures in addition.These lecture notes do not cover the lectures completely.

I am indebted to Prof. Georg Ch. Pflug for numerous discussions in the area and signifi-cant support over years.

Please report mistakes, errors, violations of copyright, improvements or necessary com-pletions.

Content: https://www.tu-chemnitz.de/mathematik/studium/module/2013/M16.pdf

3

Page 4: Selected Topics from Portfolio Optimization

4

rough draft: do not distribute

Page 5: Selected Topics from Portfolio Optimization

Contents

1 Historical Milestones in Portfolio Optimization 91.1 In Banking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.2 In Insurance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Bibliography 10

2 Introduction and Classification of Stochastic Programs 132.1 Relations and Connections to Portfolio Optimization: Markowitz . . . . . . . . 132.2 Alternative Formulations of the Markowitz Problem . . . . . . . . . . . . . . . . 132.3 Involving Risk Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3.1 Risk Neutral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3.2 Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3.3 Robust Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.3.4 Distributionally Robust Optimization . . . . . . . . . . . . . . . . . . . . 15

2.4 Probabilistic Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.5 Stochastic Dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.6 On General Difficulties In Stochastic Optimization . . . . . . . . . . . . . . . . . 15

3 The Markowitz Model 173.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Empirical Problem Formulation And Variables . . . . . . . . . . . . . . . . . . . 173.3 The Empirical/ Discrete Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.4 The First Moment: Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203.5 The Second Moment: Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.6 The Non-Empirical Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.7 The Capital Asset Pricing Model (CAPM) . . . . . . . . . . . . . . . . . . . . . 23

3.7.1 The Mean-Variance Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . 253.7.2 Tangency portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.7.3 The Two Fund Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.8 Markowitz Portfolio Including a Risk Free Asset . . . . . . . . . . . . . . . . . . 293.9 One Fund Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.9.1 Capital Asset Pricing Model (CAPM) . . . . . . . . . . . . . . . . . . . . 313.9.2 On systematic and specific risk . . . . . . . . . . . . . . . . . . . . . . . 333.9.3 Sharpe ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.10 Alternative Formulations of the Markowitz Problem . . . . . . . . . . . . . . . . 333.11 Principal Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343.12 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

5

Page 6: Selected Topics from Portfolio Optimization

6 CONTENTS

4 Value-at-Risk 374.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.2 How about adding risk? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.3 Properties of the Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.4 Profit versus loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5 Axiomatic Treatment of Risk 43

6 Examples of Coherent Risk Functionals 456.1 Mean Semi-Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456.2 Average Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466.3 Entropic Value-at-Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.4 Spectral Risk Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486.5 Kusuokaโ€™s Representation of Law Invariant Risk Measures . . . . . . . . . . . 506.6 Application in Insurance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

7 Portfolio Optimization Problems Involving Risk Measures 537.1 Integrated Risk Management Formulation . . . . . . . . . . . . . . . . . . . . . 537.2 Markowitz Type Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537.3 Alternative Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

8 Expected Utility Theory 578.1 Examples of utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578.2 Arrowโ€“Pratt measure of absolute risk aversion . . . . . . . . . . . . . . . . . . 578.3 Example: St. Petersburg Paradox1 . . . . . . . . . . . . . . . . . . . . . . . . . 588.4 Preferences and utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . 59

9 Stochastic Orderings 619.1 Stochastic Dominance of First Order . . . . . . . . . . . . . . . . . . . . . . . . 619.2 Stochastic Dominance of Second Order . . . . . . . . . . . . . . . . . . . . . . 629.3 Portfolio Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

10 Arbitrage 6710.1 Type A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6710.2 Type B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

11 The Flowergirl Problem2 7111.1 The Flowergirl problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7111.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

12 Duality For Convex Risk Measures 73

1By Ruben Schlotter2Also: Newsboy, or Newsvendor problem

rough draft: do not distribute

Page 7: Selected Topics from Portfolio Optimization

CONTENTS 7

13 Stochastic Optimization: Terms, and Definitions, and the Deterministic Equiva-lent 7513.1 Expected Value of Perfect Information (EVPI) and Value of Stochastic Solution

(VSS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.2 The Farmer Ted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.3 The Risk-Neutral Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7513.4 Glossary/ Concept/ Definitions: . . . . . . . . . . . . . . . . . . . . . . . . . . . 7613.5 KKT for (13.2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7613.6 Deterministic Equivalent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7713.7 L-Shaped Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7713.8 Farkasโ€™ Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7813.9 L-Shaped Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7813.10Variants of the Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

14 Co- and Antimonotonicity 8114.1 Rearrangements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8114.2 Comonotonicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8214.3 Integration of Random Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 8414.4 Copula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8414.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

15 Convexity 8715.1 Properties of Convex Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 8715.2 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8915.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

16 Sample Average Approximation (SAA) 9316.1 SAA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

16.1.1 Pointwise LLN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9316.1.2 Pointwise and Functional CLT . . . . . . . . . . . . . . . . . . . . . . . . 94

16.2 The ฮ”-method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

17 Weak Topology of Measures 9717.1 General Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9717.2 The Wasserstein Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9817.3 The Real Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

18 Topologies For Set-Valued Convergence 10118.1 Topological features of Minkowski addition . . . . . . . . . . . . . . . . . . . . . 101

18.1.1 Topological features of convex sets . . . . . . . . . . . . . . . . . . . . . 10118.2 Preliminaries and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

18.2.1 Convexity, and Conjugate Duality . . . . . . . . . . . . . . . . . . . . . . 10218.2.2 Pompeiuโ€“Hausdorff Distance . . . . . . . . . . . . . . . . . . . . . . . . 103

18.3 Local description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

Version: May 28, 2021

Page 8: Selected Topics from Portfolio Optimization

8 CONTENTS

rough draft: do not distribute

Page 9: Selected Topics from Portfolio Optimization

1Historical Milestones in Portfolio Optimization

Probability is the foundation of banking.

Francis Ysidro Edgeworth, 1845โ€“1926,Anglo-Irish philosopher and political

economist. Edgeworth [1888]

1.1 IN BANKING

โŠฒ 1938: Bond Duration, Edgeworth [1888]

โŠฒ 1952: Markowitz mean-variance framework

โŠฒ 1963: Sharpโ€™s capital asset pricing model

โŠฒ 1966: Multiple factor models

โŠฒ 1973: Black & Scholes option pricing model, the โ€œGreeksโ€

โŠฒ 1988: Risk weighted assets for banks

โŠฒ 1993: Value-at-Risk

โŠฒ 1994: Risk Metrics

โŠฒ 1997: Credit Metrics

โŠฒ 1998: Integration of credit and market risk

โŠฒ 1998: Risk Budgeting, the Basel Rules

โŠฒ 2007: Basel II

โŠฒ 2017: Basel III

1.2 IN INSURANCE

The natural business of insurance companies is concerned with Risk.

โŠฒ Pricing of individual Contracts

โŠฒ Reserving in the Portfolio

โŠฒ The Cramรฉr-Lundberg model

โŠฒ Solvability

9

Page 10: Selected Topics from Portfolio Optimization

10 HISTORICAL MILESTONES IN PORTFOLIO OPTIMIZATION

โŠฒ Solvency II

โŠฒ US and Canada Insurance Supervisory: Conditional Tail Expectation

rough draft: do not distribute

Page 11: Selected Topics from Portfolio Optimization

Bibliography

P. Artzner, F. Delbaen, and D. Heath. Thinking coherently. Risk, 10:68โ€“71, 1997. 43

P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent Measures of Risk. MathematicalFinance, 9:203โ€“228, 1999. doi:10.1111/1467-9965.00068. 43

A. Ben-Tal and A. Nemirovski. Lectures on modern convex optimization. SIAM Series onOptimization. 2001. 15

R. I. Bot, S.-M. Grad, and G. Wanka. Duality in Vector Optimization. Springer, 2009.doi:10.1007/978-3-642-02886-1. 87

C. Castaing and M. Valadier. Convex Analysis and Measurable Multifunctions. Number580 in Lecture Notes in Mathematics. Springer, 1977. doi:10.1007/BFb0087685. URLhttps://books.google.com/books?id=Fev0CAAAQBAJ. 103

G. Cornuejols and R. Tรผtรผncรผ. Optimization Methods in Finance. Cambridge UniversityPress (CUP), 2006. doi:10.1017/cbo9780511753886. 67

D. Denneberg. Non-additive measure and integral, volume 27. Springer Science & BusinessMedia, 1994. doi:10.1007/978-94-017-2434-0. 82

D. Dentcheva and A. Ruszczynski. Portfolio optimization with risk control by stochastic dom-inance constraints. In G. Infanger, editor, Stochastic Programming, volume 150 of Interna-tional series in Operations Research & Management Science, chapter 9, pages 189โ€“211.Springer Science+Business Media, LLC, 2011. doi:10.1007/978-1-4419-1642-6. 65

F. Y. Edgeworth. The mathematical theory of banking. Journal of the Royal Statistical Society,51(1):113โ€“127, 1888. 9

H. Fรถllmer and A. Schied. Stochastic Finance: An Introduction in Discrete Time. de GruyterStudies in Mathematics 27. Berlin, Boston: De Gruyter, 2004. ISBN 978-3-11-046345-3.doi:10.1515/9783110218053. URL http://books.google.com/books?id=cL-bZSOrqWoC.51

C. Hess. Set-valued integration and set-valued probability theory: An overview. In E. Pap,editor, Handbook of Measure Theory, volume I, II of Handbook of Measure Theory, chap-ter 14, pages 617โ€“673. Elsevier, 2002. doi:10.1016/B978-044450263-6/50015-4. 103

A. Mรผller and D. Stoyan. Comparison methods for stochastic models and risks. Wiley seriesin probability and statistics. Wiley, Chichester, 2002. ISBN 978-0-471-49446-1. URL https://books.google.com/books?id=a8uPRWteCeUC. 64

G. Ch. Pflug and A. Pichler. Multistage Stochastic Optimization. Springer Series inOperations Research and Financial Engineering. Springer, 2014. ISBN 978-3-319-08842-6. doi:10.1007/978-3-319-08843-3. URL https://books.google.com/books?id=q_VWBQAAQBAJ. 71, 98

11

Page 12: Selected Topics from Portfolio Optimization

12 BIBLIOGRAPHY

G. Ch. Pflug and W. Rรถmisch. Modeling, Measuring and Managing Risk. World Scientific,River Edge, NJ, 2007. doi:10.1142/9789812708724. 17, 39, 51, 72

S. T. Rachev and L. Rรผschendorf. Mass Transportation Problems Volume I: Theory, VolumeII: Applications, volume XXV of Probability and its applications. Springer, New York, 1998.doi:10.1007/b98893. 99

R. T. Rockafellar. Conjugate Duality and Optimization, volume 16. CBMS-NSF Regional Con-ference Series in Applied Mathematics. 16. Philadelphia, Pa.: SIAM, Society for Industrialand Applied Mathematics. VI, 74 p., 1974. doi:10.1137/1.9781611970524. 103

R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer Nature SwitzerlandAG, 1997. doi:10.1007/978-3-642-02431-3. URL https://books.google.com/books?id=w-NdOE5fD8AC. 103

A. Shapiro. Time consistency of dynamic risk measures. Operations Research Letters, 40(6):436โ€“439, 2012. doi:10.1016/j.orl.2012.08.007. 51

A. Shapiro, D. Dentcheva, and A. Ruszczynski. Lectures on Stochastic Programming. MOS-SIAM Series on Optimization. SIAM, second edition, 2014. doi:10.1137/1.9780898718751.51, 93

A. E. van Heerwaarden and R. Kaas. The Dutch premium principle. Insurance: Mathematicsand Economics, 11:223โ€“230, 1992. doi:10.1016/0167-6687(92)90049-H. 51

rough draft: do not distribute

Page 13: Selected Topics from Portfolio Optimization

2Introduction and Classification of Stochastic Programs

Universitรคten sind gefรคhrlicher alsHandgranaten.

Ruhollah Chomeini, 1902โ€“1989

We employ the usual axioms in probability theory and denote a probability space by

(ฮฉ, F , ๐‘ƒ).

Typically, we denote random variables mapping to a state space ฮž by

b : ฮฉโ†’ ฮž

(or sometimes also ๐‘Œ : ฮฉโ†’ R).

2.1 RELATIONS AND CONNECTIONS TO PORTFOLIO OPTIMIZATION: MARKOWITZ

See Markowitz, Section 3 below for details.

Definition 2.1. A portfolio ๐‘ฅโˆ— โˆˆ R๐ฝ (with ๐ฝ indicating the number of stocks) is efficient if itsolves

minimize in ๐‘ฅโˆˆR๐ฝ var ๐‘ฅ>b (2.1)subject to E ๐‘ฅ>b โ‰ฅ `,

1> ๐‘ฅ โ‰ค 1,(๐‘ฅ โ‰ฅ 0).

2.2 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM

Instead of Markowitz (2.1) one might consider the problem

maximize E ๐‘ฅ>b (2.2)subjet to var ๐‘ฅ>b โ‰ค ๐‘ž,

1> ๐‘ฅ โ‰ค 1,(๐‘ฅ โ‰ฅ 0).

13

Page 14: Selected Topics from Portfolio Optimization

14 INTRODUCTION AND CLASSIFICATION OF STOCHASTIC PROGRAMS

StochasticsProbability Theory

Optimization

Portfolio Optimization

Figure 2.1: Wide intersections: in theory and (economic) practice

2.3 INVOLVING RISK FUNCTIONALS

2.3.1 Risk Neutral

... is about the expectation, as

minimizein x E๐‘„(๐‘ฅ, b)subject to E๐บ๐‘– (๐‘ฅ, b) โ‰ค 0, ๐‘– = 1, . . . , ๐‘˜

This may be reformulated by employing the risk functional

R(๐‘„(๐‘ฅ, ยท)

)B Eb ๐‘„(๐‘ฅ, b).

2.3.2 Utility Functions

A function ๐‘ข : Rโ†’ R is a utility function if it satisfies some model-design properties in addition.Optimization problems involving utility function generally read

minimizein x E ๐‘ข(๐‘„(๐‘ฅ, b)

)subject to E๐บ๐‘– (๐‘ฅ, b) โ‰ค 0, ๐‘– = 1, . . . , ๐‘˜

or

minimizein x R(๐‘„(๐‘ฅ, b)

)subject to R

(๐บ๐‘– (๐‘ฅ, b)

)โ‰ค 0, ๐‘– = 1, . . . , ๐‘˜

where we might want to put

R(๐‘„(๐‘ฅ, ยท)

)B E ๐‘ข

(๐‘„(๐‘ฅ, b)

).

rough draft: do not distribute

Page 15: Selected Topics from Portfolio Optimization

2.4 PROBABILISTIC CONSTRAINTS 15

2.3.3 Robust Optimization

Robust optimization considers the problem (Ben-Tal and Nemirovski [2001])

minimizein x maxb โˆˆฮž

๐‘„(๐‘ฅ, b)

subject to maxb โˆˆฮž

๐บ๐‘– (๐‘ฅ, b) โ‰ค 0, ๐‘– = 1, . . . , ๐‘˜

Note, that there is no probability measure

R(๐‘„(๐‘ฅ, ยท)

)B ess sup๐‘„(๐‘ฅ, b)

and the problem is basically about the support of the probability measure.

2.3.4 Distributionally Robust Optimization

Distributionally robust optimization involves the probability measure instead,

minimizein x max๐‘ƒโˆˆ๐’ซR๐‘ƒ

(๐‘„(๐‘ฅ, b)

),

subject to R๐‘ƒ

(๐บ๐‘– (๐‘ฅ, b)

)โ‰ค 0, ๐‘– = 1, . . . , ๐‘˜,

where we indicate the probability measure ๐‘ƒ โˆˆ ๐’ซ explicitly.

2.4 PROBABILISTIC CONSTRAINTS

This is about the problem

minimize (in ๐‘ฅ) R(๐‘„(๐‘ฅ, b)

)subject to ๐‘ƒ

(๐บ๐‘– (๐‘ฅ, b) โ‰ค 0

)โ‰ฅ ๐›ผ, ๐‘– = 1, . . . , ๐‘˜

Example 2.2 (Economic example). Call Center

2.5 STOCHASTIC DOMINANCE

minimizein x E ๐‘ข(๐‘„(๐‘ฅ, b)

)(2.3)

subject to ๐บ๐‘– (๐‘ฅ, b) < ๐‘Œ, ๐‘– = 1, . . . , ๐‘˜

for some (stochastic) order relation <.

2.6 ON GENERAL DIFFICULTIES IN STOCHASTIC OPTIMIZATION

sample ofobservations

estimation/

model selectionprobability

modelscenario

generationscenariomodel

Problem 2.3 (Accuracy). How large/ small do we need Y > 0?

Version: May 28, 2021

Page 16: Selected Topics from Portfolio Optimization

16 INTRODUCTION AND CLASSIFICATION OF STOCHASTIC PROGRAMS

rough draft: do not distribute

Page 17: Selected Topics from Portfolio Optimization

3The Markowitz Model

The trend is your friend.

Bรถrsenweisheit

This section follows Pflug and Rรถmisch [2007, Section 4].The model of Markowitz1 is historically the first model to determine the decomposition of

an optimal portfolio.

3.1 INTRODUCTION

For portfolio optimization people often use the simple model provided by Harry Markowitzwhich involves the variance. Note that it is a significant drawback of the Markowitz modelthat positive deviations (profits โ€“ this is what the investor wants) and negative deviations(losses โ€“ this is what the investor tries to avoid) are treated exactly the same way. So theMarkowitz model is of historical interest (it was the first model on asset allocation with theobjective to reduce the variance) but it violates some natural objectives of an investor.

Extensions of the problem described at the end of this section avoid this downside.

3.2 EMPIRICAL PROBLEM FORMULATION AND VARIABLES

(i) ๐ฝ is the number of stocks considered (๐ฝ = 5 in the example which Table 3.1 displays);

(ii) each stock ๐‘— โˆˆ {1, . . . , ๐ฝ} is observed at ๐‘› + 1 consecutive times ๐‘ก0, . . . , ๐‘ก๐‘› (๐‘› = 12 inTable 3.1);

(iii) the price observed of stock ๐‘— at time ๐‘ก is ๐‘† ๐‘—๐‘ก ;

(iv) b๐‘– =(b1๐‘–, b2

๐‘–, . . . b๐ฝ

๐‘–

)> collects the annualized returns of all ๐ฝ stocks; note that b ๐‘—๐‘–= ๐‘’>

๐‘—b๐‘–;

(v) ๐‘ฅ ๐‘— represents the fraction of cash invested in stock ๐‘— , ๐‘— โˆˆ {1, . . . , ๐ฝ}; we set ๐‘ฅ B(๐‘ฅ1, ๐‘ฅ2, . . . , ๐‘ฅ๐ฝ )>, ๐‘ฅ is the allocation vector;

(vi) The budget constraint: the total amount of cash to be invested is not more than thebudget available. 1C is the default value (or 1๐‘šC, say): the budget constraint thusreads ๐‘ฅ> 1 โ‰ค 1C, where 1 = (1, . . . , 1)>;

(vii) Short-selling constraint: occasionally we do not allow short-selling (i.e., negative po-sitions), then the constraints ๐‘ฅ โ‰ฅ 0 has to be added. ๐‘ฅ โ‰ฅ 0 is understood as ๐‘ฅ ๐‘— โ‰ฅ 0,๐‘— โˆˆ {1, . . . , ๐ฝ}, for each stock.

To solve the problem one needs to specify the probability measure ๐‘ƒ which is used to com-pute the expectation E and the variance var.

1Harry Max Markowitz. 1927. Nobel Memorial Price in Economic Sciences in 1990

17

Page 18: Selected Topics from Portfolio Optimization

18 THE MARKOWITZ MODEL

๐‘†๐‘ก/C DAX RWE gold oil US-$/ C

๐‘ก0 January 9798.11 12.870 981.49 34.9 0.9228๐‘ก1 February 9495.40 10.540 1030.49 33.08 0.9197๐‘ก2 March 9965.51 11.375 1144.73 33.77 0.8787๐‘ก3 April 10038.97 13.045 1137.17 37.04 0.8729๐‘ก4 May 10262.74 11.765 1192.35 43.71 0.8983๐‘ก5 June 9680.09 14.190 1121.85 45.81 0.9005๐‘ก6 July 10337.50 15.905 1222.25 46.04 0.8949๐‘ก7 August 10592.69 14.665 1244.67 39.78 0.8962๐‘ก8 September 10511.02 15.335 1204.53 43.35 0.8896๐‘ก9 October 10665.01 14.460 1212.10 46.15 0.9107๐‘ก10 November 10640.30 11.860 1180.53 44.95 0.9444๐‘ก11 December 11481.06 11.815 1082.17 47.72 0.9509

๐‘ก12 January 11599.01 11.800 1081.43 52.69 0.9494

Table 3.1: Prices ๐‘† ๐‘—๐‘ก๐‘–

observed in 2016. www.investing.com

Jan 2016 Apr 2016 Jul 2016 Oct 2016 Jan 20170.8

0.9

1

1.1

1.2

1.3

1.4

1.5

1.6

DAX

RWE

gold

oil

Figure 3.1: Prices of Table 3.1

rough draft: do not distribute

Page 19: Selected Topics from Portfolio Optimization

3.3 THE EMPIRICAL/ DISCRETE MODEL 19

3.3 THE EMPIRICAL/ DISCRETE MODEL

Empirical models extract the probability model from historic observations.For this observe a stock at ๐‘› + 1 successive times (๐‘ก๐‘–)๐‘›๐‘–=0 and collect the prices ๐‘†๐‘ก๐‘– .

Definition 3.1. Its annualized return during the time-period [๐‘ก๐‘–โˆ’1, ๐‘ก๐‘–] is2

b๐‘– B1

๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1ln

๐‘†๐‘ก๐‘–

๐‘†๐‘ก๐‘–โˆ’1.

Define the weights (probabilities) ๐‘๐‘– B ๐‘ก๐‘–โˆ’๐‘ก๐‘–โˆ’1๐‘ก๐‘›โˆ’๐‘ก0 , a random variable b : ฮฉ โ†’ R and a proba-

bility measure with

๐‘ƒ(b = b๐‘–) B ๐‘๐‘– =๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1๐‘ก๐‘› โˆ’ ๐‘ก0

and we set

๐‘> B (๐‘1, . . . , ๐‘๐‘›) B(๐‘ก1 โˆ’ ๐‘ก0๐‘ก๐‘› โˆ’ ๐‘ก0

,๐‘ก2 โˆ’ ๐‘ก1๐‘ก๐‘› โˆ’ ๐‘ก0

, . . . ,๐‘ก๐‘› โˆ’ ๐‘ก๐‘›โˆ’1๐‘ก๐‘› โˆ’ ๐‘ก0

).

Remark 3.2. Note thatโˆ‘๐‘›

๐‘–=1 ๐‘๐‘– = ๐‘> ยท 1 =

โˆ‘๐‘›๐‘–=1

๐‘ก๐‘–โˆ’๐‘ก๐‘–โˆ’1๐‘ก๐‘›โˆ’๐‘ก0 = 1, thus

1๐‘ก๐‘› โˆ’ ๐‘ก0

ln๐‘†๐‘ก๐‘›

๐‘†๐‘ก0๏ธธ ๏ธท๏ธท ๏ธธannual return

=1

๐‘ก๐‘› โˆ’ ๐‘ก0

๐‘›โˆ‘๐‘–=1ln

๐‘†๐‘ก๐‘–

๐‘†๐‘ก๐‘–โˆ’1(3.1)

=

๐‘›โˆ‘๐‘–=1

๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1๐‘ก๐‘› โˆ’ ๐‘ก0๏ธธ ๏ธท๏ธท ๏ธธ

๐‘๐‘–

ยท 1๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1

ln๐‘†๐‘ก๐‘–

๐‘†๐‘ก๐‘–โˆ’1๏ธธ ๏ธท๏ธท ๏ธธannualized return per period

=

๐‘›โˆ‘๐‘–=1

๐‘๐‘– ยท b๐‘–๏ธธ ๏ธท๏ธท ๏ธธaverage of returns

= E๐‘ƒ b. (3.2)

Based on this observation it follows that the annual return for the entire period is the expectedvalue of the annualized returns of successive periods (cf. Table 3.2).

Definition 3.3 (The first moment). The expected return is ๐‘Ÿ B E b.

Obviously, one may observe all ๐ฝ stocks in parallel, at the same time. So put

b๐‘—

๐‘–B

1๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1

ln๐‘†๐‘—๐‘ก๐‘–

๐‘†๐‘—๐‘ก๐‘–โˆ’1

, ๐‘– = 1, . . . , ๐‘›, ๐‘— = 1, . . . , ๐ฝ

and set b๐‘– B(b1๐‘–, . . . , b๐ฝ

๐‘–

). Collect all observations in the ๐‘› ร— ๐ฝ-matrix

ฮž B(b๐‘—

๐‘–

) ๐‘—=1:๐ฝ๐‘–=1:๐‘›

=

(1

๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1ln

๐‘†๐‘—๐‘ก๐‘–

๐‘†๐‘—๐‘ก๐‘–โˆ’1

) ๐‘—=1:๐ฝ

๐‘–=1:๐‘›

(cf. Table 3.2). For the random return vector we have that

๐‘ƒ

(b =

(b1๐‘– , . . . , b

๐ฝ๐‘–

))= ๐‘ƒ (b = ฮž๐‘–) = ๐‘๐‘–

2Here and always: logarithmus naturalis with basis ๐‘’ = 2.718 . . .

Version: May 28, 2021

Page 20: Selected Topics from Portfolio Optimization

20 THE MARKOWITZ MODEL

annualized monthly returns b ๐‘—๐‘ก DAX RWE gold oil US-$/ C

๐‘1 = 1/12 or ๐‘1 = 31/365 -37.7% -239.7% 58.5% -64.3% -4.0%๐‘2 = 1/12 or ๐‘2 = 28/365 58.0% 91.5% 126.2% 24.8% -54.7%๐‘3 = 1/12 or ๐‘3 = 31/365 8.8% 164.4% -8.0% 110.9% -7.9%๐‘4 = 1/12 or ๐‘4 = 30/365 26.5% -123.9% 56.9% 198.7% 34.4%๐‘5 = 1/12 or ๐‘5 = 31/365 -70.1% 224.9% -73.1% 56.3% 2.9%๐‘6 = 1/12 or ๐‘6 = 30/365 78.8% 136.9% 102.9% 6.0% -7.5%๐‘7 = 1/12 or ๐‘7 = 31/365 29.3% -97.4% 21.8% -175.4% 1.7%๐‘8 = 1/12 or ๐‘8 = 31/365 -9.3% 53.6% -39.3% 103.1% -8.9%๐‘9 = 1/12 or ๐‘9 = 30/365 17.5% -70.5% 7.5% 75.1% 28.1%๐‘10 = 1/12 or ๐‘10 = 31/365 -2.8% -237.9% -31.7% -31.6% 43.6%๐‘11 = 1/12 or ๐‘11 = 30/365 91.3% -4.6% -104.4% 71.8% 8.2%๐‘12 = 1/12 or ๐‘12 = 31/365 12.3% -1.5% -0.8% 118.9% -1.9%

average of monthly returns, (3.2) 16.9% -8.7% 9.7% 41.2% 2.8%annual return, (3.1) 16.9% -8.7% 9.7% 41.2% 2.8%

Table 3.2: Matrix ฮž, collecting the returns b ๐‘—๐‘–, cf. Table 3.1

where ฮž๐‘– is the ๐‘–th row in the matrix ฮž). Note that the return of stock ๐‘— rewrites for theempirical probability measure

๐‘ƒ(ยท) =๐‘›โˆ‘๐‘–=1

๐‘๐‘– ยท ๐›ฟ( b 1๐‘– ,..., b ๐ฝ๐‘– ) (ยท),

i.e., each stock is a random variable with ๐‘ƒ

(b ๐‘— = b

๐‘—

๐‘–

)= ๐‘๐‘–, independently of ๐‘— . For every ๐‘—

thus

E b ๐‘— =

๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–= ๐‘>b ๐‘— ,

where

๐‘> = (๐‘1, . . . , ๐‘๐‘›) =(๐‘ก1 โˆ’ ๐‘ก0๐‘ก๐‘› โˆ’ ๐‘ก0

,๐‘ก2 โˆ’ ๐‘ก1๐‘ก๐‘› โˆ’ ๐‘ก0

, . . . ,๐‘ก๐‘› โˆ’ ๐‘ก๐‘›โˆ’1๐‘ก๐‘› โˆ’ ๐‘ก0

).

Remark 3.4. It follows from the Taylor series expansion

ln(1 + ๐‘ฅ) = ๐‘ฅ โˆ’ 12๐‘ฅ2 + 1

3๐‘ฅ3 โˆ’ . . .

for small ๐‘ฅ that

b๐‘—

๐‘–=

1๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1

ln๐‘†๐‘—๐‘ก๐‘–

๐‘†๐‘—๐‘ก๐‘–โˆ’1

=1

๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1ln

(1 +

๐‘†๐‘—๐‘ก๐‘–

๐‘†๐‘—๐‘ก๐‘–โˆ’1

โˆ’ 1)โ‰ˆ 1๐‘ก๐‘– โˆ’ ๐‘ก๐‘–โˆ’1

(๐‘†๐‘—๐‘ก๐‘–

๐‘†๐‘—๐‘ก๐‘–โˆ’1

โˆ’ 1).

3.4 THE FIRST MOMENT: RETURN

Suppose an amount of ๐‘ฅ ๐‘— is invested in the stock ๐‘— . Then the total return of the investment is

๐‘ฅ>b =๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘—b

๐‘— .

rough draft: do not distribute

Page 21: Selected Topics from Portfolio Optimization

3.5 THE SECOND MOMENT: RISK 21

DAX RWE gold oil US-$/ C

return ๐‘Ÿ ๐‘— = E ๐‘’>๐‘—b 16.9% -8.7% 9.7% 41.2% 2.8%

variance ฮฃ ๐‘— ๐‘— = var(๐‘’>๐‘—b

)19.2% 208.8% 42.8% 88.9% 5.9%

Table 3.3: Return and variance (cf. Table 3.1)

Lemma 3.5. The expected return is ๐‘Ÿ B E b = ๐‘>ฮž.

The return observed of the portfolio in period ๐‘– isโˆ‘๐ฝ

๐‘—=1 b๐‘—

๐‘–๐‘ฅ ๐‘— = (ฮž ยท ๐‘ฅ)๐‘– (the ๐‘–th line in the

matrix ฮž ยท ๐‘ฅ). Markowith adds the constraint

E ๐‘ฅ>b =๐‘›โˆ‘๐‘–=1

๐‘๐‘–ยฉยญยซ

๐ฝโˆ‘๐‘—=1๐‘ฅ๐‘—

๐‘–b ๐‘—

ยชยฎยฌ = ๐‘>ฮž๐‘ฅ = ๐‘Ÿ>๐‘ฅ โ‰ฅ `,

which means, that a minimum return ` is required.

Remark 3.6. Suppose the total cash invested in stock ๐‘– is ๐ถ๐‘– (i.e., ๐ถ ๐‘—/๐‘† ๐‘—

0 is the number ofshares of stock ๐‘—) with total initial cash ๐ถ0 =

โˆ‘๐ฝ๐‘—=1๐ถ

๐‘— , then it is natural to define the fraction

๐‘ฅ ๐‘— B๐ถ ๐‘—โˆ‘๐ฝ๐‘—=1๐ถ

๐‘—so that

โˆ‘๐ฝ๐‘—=1 ๐‘ฅ ๐‘— = 1. The total portfolio value at time ๐‘ก then is ๐ถ๐‘ก = ๐ถ0 ยท

โˆ‘๐ฝ๐‘—=1 ๐‘ฅ ๐‘—

๐‘†๐‘—๐‘ก

๐‘†๐‘—

0.

Note, however, thatโˆ‘๐ฝ

๐‘—=1 ๐‘ฅ ๐‘—โˆ‘๐ผ

๐‘–=1 ๐‘†๐‘—๐‘ก๐‘–= ๐ถ๐‘ก๐ผ , but

โˆ‘๐ฝ๐‘—=1 ๐‘ฅ ๐‘—

โˆ‘๐ผ๐‘–=1 b

๐‘ก๐‘–๐‘—โ‰ 

๐ถ๐‘ก๐‘–

๐ถ๐‘ก0.

3.5 THE SECOND MOMENT: RISK

The covariance is

var ๐‘ฅ>b = E(๐‘ฅ>b โˆ’ E ๐‘ฅ>b

)2= E

(๐‘ฅ>b

)2 โˆ’ (E ๐‘ฅ>b

)2=

=

๐‘›โˆ‘๐‘–=1

๐‘๐‘–ยฉยญยซ

๐ฝโˆ‘๐‘—=1b๐‘—

๐‘–๐‘ฅ ๐‘—

ยชยฎยฌ2

โˆ’(

๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–๐‘ฅ ๐‘—

)2=

๐‘›โˆ‘๐‘–=1

๐‘๐‘–ยฉยญยซ

๐ฝโˆ‘๐‘—=1

๐ฝโˆ‘๐‘—โ€ฒ=1

b๐‘—

๐‘–๐‘ฅ ๐‘— ยท b ๐‘—

โ€ฒ

๐‘–๐‘ฅ ๐‘—โ€ฒ

ยชยฎยฌ โˆ’๐‘›โˆ‘๐‘–=1

๐‘๐‘–ยฉยญยซ

๐ฝโˆ‘๐‘—=1b๐‘—

๐‘–๐‘ฅ ๐‘—

ยชยฎยฌ ยท๐‘›โˆ‘

๐‘–โ€ฒ=1๐‘๐‘–โ€ฒ

ยฉยญยซ๐ฝโˆ‘

๐‘—โ€ฒ=1b๐‘—โ€ฒ

๐‘–โ€ฒ ๐‘ฅ ๐‘—โ€ฒยชยฎยฌ

=

๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘—

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—โ€ฒ ยท๐‘›โˆ‘๐‘–=1

๐‘๐‘–

(b๐‘—

๐‘–b๐‘—โ€ฒ

๐‘–

)โˆ’

๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘—

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—โ€ฒ ยท๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–

๐‘›โˆ‘๐‘–โ€ฒ=1

๐‘๐‘–โ€ฒb๐‘—โ€ฒ

๐‘–โ€ฒ

=

๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘—

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—โ€ฒ ยท(

๐‘›โˆ‘๐‘–=1

๐‘๐‘– ยท b ๐‘—๐‘– b๐‘—โ€ฒ

๐‘–โˆ’

๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–ยท

๐‘›โˆ‘๐‘–โ€ฒ=1

๐‘๐‘–โ€ฒb๐‘—โ€ฒ

๐‘–โ€ฒ

)๏ธธ ๏ธท๏ธท ๏ธธ

=:ฮฃ ๐‘—, ๐‘—โ€ฒ

and thus

var ๐‘ฅ>b =๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘—

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—โ€ฒฮฃ ๐‘— , ๐‘—โ€ฒ = ๐‘ฅ>ฮฃ๐‘ฅ,

Version: May 28, 2021

Page 22: Selected Topics from Portfolio Optimization

22 THE MARKOWITZ MODEL

19.2% 4.1% 7.9% 1.3% -1.9%4.1% 208.8% -7.4% 44.5% -18.9%7.9% -7.4% 42.8% -10.3% -6.7%1.3% 44.5% -10.3% 88.9% 2.9%

-1.9% -18.9% -6.7% 2.9% 5.9%

(a) Covariance matrix ฮฃ of returns in Table 3.1, cf. also Ta-ble 3.3

5.72 -0.04 -1.01 -0.20 0.69-0.04 1.03 0.74 -0.57 4.41-1.01 0.74 3.59 -0.14 6.22-0.20 -0.57 -0.14 1.49 -2.780.69 4.41 6.22 -2.78 39.83

(b) The inverse ฮฃโˆ’1

Table 3.4: Covariance matrix ฮฃ of returns in Table 3.1 and its inverse

where ฮฃ is the covariance matrix (aka. variance-covariance matrix) with entries

ฮฃ ๐‘— , ๐‘—โ€ฒ =

๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–b๐‘—โ€ฒ

๐‘–โˆ’

๐‘›โˆ‘๐‘–=1

๐‘๐‘–b๐‘—

๐‘–ยท

๐‘›โˆ‘๐‘–โ€ฒ=1

๐‘๐‘–โ€ฒb๐‘—โ€ฒ

๐‘–โ€ฒ

= E b ๐‘—b ๐‘—โ€ฒ โˆ’ E b ๐‘— ยท E b ๐‘—โ€ฒ = cov

(b ๐‘— , b ๐‘—

โ€ฒ).

Remark 3.7 (Besselโ€™s correction). For the empirical measure ๐‘๐‘– = 1/๐‘›, the entries of thevarianceโ€“covariance matrix are

ฮฃ ๐‘— , ๐‘—โ€ฒ = cov(b ๐‘— , b ๐‘—

โ€ฒ)=1๐‘›

๐‘›โˆ‘๐‘–=1

(b๐‘—

๐‘–โˆ’ b ๐‘—

) (b๐‘—โ€ฒ

๐‘–โˆ’ b ๐‘—

โ€ฒ), where b

๐‘—B

๐‘›โˆ‘๐‘–=1

b๐‘—

๐‘–.

Besselโ€™s correction replaces this quantity by ฮฃ ๐‘— , ๐‘—โ€ฒ =1

๐‘›โˆ’1โˆ‘๐‘›

๐‘–=1

(b๐‘—

๐‘–โˆ’ b ๐‘—

) (b๐‘—โ€ฒ

๐‘–โˆ’ b ๐‘—

โ€ฒ).

Example 3.8. Cf. Table 3.4.

3.6 THE NON-EMPIRICAL FORMULATION

Here, the random variable is b : ฮฉโ†’ R๐ฝ . We define ๐‘Ÿ B E ๐‘ฅ>b and observe that

var ๐‘ฅ>b = E(๐‘ฅ>b

)2 โˆ’ (E ๐‘ฅ>b

)2= E

ยฉยญยซ๐ฝโˆ‘๐‘—=1b ๐‘—๐‘ฅ ๐‘—

ยชยฎยฌ2

โˆ’ ยฉยญยซ๐ฝโˆ‘๐‘—=1๐‘ฅ ๐‘— E b

๐‘—ยชยฎยฌ2

=

๐ฝโˆ‘๐‘—=1

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—๐‘ฅ ๐‘—โ€ฒ E(b ๐‘—b ๐‘—

โ€ฒ)โˆ’

๐ฝโˆ‘๐‘—=1

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—๐‘ฅ ๐‘—โ€ฒ(E b ๐‘—

) (E b ๐‘—

โ€ฒ)

=

๐ฝโˆ‘๐‘—=1

๐ฝโˆ‘๐‘—โ€ฒ=1

๐‘ฅ ๐‘—๐‘ฅ ๐‘—โ€ฒ(E

(b ๐‘—b ๐‘—

โ€ฒ)โˆ’

(E b ๐‘—

) (E b ๐‘—

โ€ฒ))

๏ธธ ๏ธท๏ธท ๏ธธcov( b ๐‘— , b ๐‘—โ€ฒ)

= ๐‘ฅ> cov(b) ๐‘ฅ.

Definition 3.9. The covariance matrix3 is

ฮฃ B cov(b) = E(b ยท b>

)โˆ’ (E b) ยท (E b)> .

3The covariance cov is a.k.a. variance matrix.

rough draft: do not distribute

Page 23: Selected Topics from Portfolio Optimization

3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 23

Figure 3.2: Harry Markowitz (1927) explains the CAPM and the mean variance plot. NobelMemorial Prize in Economic Sciences (1990)

Remark 3.10. Note, thatฮฃ = ฮž> ยท diag(๐‘) ยท ฮž โˆ’ ๐‘>ฮž๏ธธ๏ธท๏ธท๏ธธ

๐‘Ÿ

ยท ฮž>๐‘๏ธธ๏ธท๏ธท๏ธธ๐‘Ÿ>

and ฮฃ is symmetric.

3.7 THE CAPITAL ASSET PRICING MODEL (CAPM)

Markowitz considers the problem

minimize (in ๐‘ฅ โˆˆ R๐ฝ ) var ๐‘ฅ>b (3.3)subject to E ๐‘ฅ>b โ‰ฅ `,

1> ๐‘ฅ โ‰ค 1C,(๐‘ฅ โ‰ฅ 0)

The Markowitz problem (3.3) is quadratic, with linear constraints: ๐ฝ varialbes, 2 constraints.

Definition 3.11. A portfolio ๐‘ฅโˆ— โˆˆ R๐ฝ is efficient if it solves (3.3).

Expressed by matrices the Markowitz problem (3.3) is

minimize in ๐‘ฅโˆˆR๐ฝ ๐‘ฅ>ฮฃ๐‘ฅ

subject to ๐‘ฅ>๐‘Ÿ โ‰ฅ `,๐‘ฅ> 1 โ‰ค 1,(๐‘ฅ โ‰ฅ 0).

Theorem 3.12. The efficient Markowitz portfolio is given by

๐‘ฅโˆ—(`) = `(๐‘

๐‘‘ฮฃโˆ’1๐‘Ÿ โˆ’ ๐‘

๐‘‘ฮฃโˆ’1 1

)โˆ’ ๐‘๐‘‘ฮฃโˆ’1๐‘Ÿ + ๐‘Ž

๐‘‘ฮฃโˆ’1 1, (3.4)

where ๐‘Ž B ๐‘Ÿ>ฮฃโˆ’1๐‘Ÿ, ๐‘ B ๐‘Ÿ>ฮฃโˆ’1 1, ๐‘ B 1> ฮฃโˆ’1 1 and ๐‘‘ B ๐‘Ž๐‘ โˆ’ ๐‘2 are auxiliary quantities.

Version: May 28, 2021

Page 24: Selected Topics from Portfolio Optimization

24 THE MARKOWITZ MODEL

y

g(x,y) = c

f(x,y) = d1

f(x,y) = d2

x

f(x,y) = d3

Figure 3.3: Illustration of Lagrange multipliers, contour lines of ๐‘“

Remark 3.13. The units of the auxiliary are [๐‘Ž] = interest2variance , [๐‘] = interest

variance , [๐‘] = 1variance and

[ฮฃ] = variance.

Proof. Differentiate the Lagrangian4

๐ฟ (๐‘ฅ;_, ๐›พ) B 12๐‘ฅ>ฮฃ๐‘ฅ โˆ’ _(๐‘Ÿ>๐‘ฅ โˆ’ `) โˆ’ ๐›พ(1> ๐‘ฅ โˆ’ 1)

to get the necessary conditions for optimality,

0 =๐œ•๐ฟ

๐œ•๐‘ฅ=12๐‘ฅ>ฮฃ + 1

2๐‘ฅ>ฮฃ โˆ’ _๐‘Ÿ> โˆ’ ๐›พ 1>, (3.5)

0 =๐œ•๐ฟ

๐œ•_= โˆ’๐‘Ÿ>๐‘ฅ + `, (3.6)

0 =๐œ•๐ฟ

๐œ•๐›พ= โˆ’1> ๐‘ฅ + 1. (3.7)

It follows from (3.5) that๐‘ฅโˆ— = _ฮฃโˆ’1๐‘Ÿ + ๐›พฮฃโˆ’1 1 . (3.8)

To determine the shadow prices _ and ๐›พ we employ (3.6) and (3.7), i.e.,

` = ๐‘Ÿ>๐‘ฅโˆ— = _๐‘Ÿ>ฮฃโˆ’1๐‘Ÿ + ๐›พ๐‘Ÿ>ฮฃโˆ’1 1 and (3.9)

1 = 1> ๐‘ฅโˆ— = _ 1> ฮฃโˆ’1๐‘Ÿ + ๐›พ 1> ฮฃโˆ’1 1 .

We may rewrite these latter equations as a usual matrix equation,(๐‘Ž ๐‘

๐‘ ๐‘

) (_

๐›พ

)=

(`

1

)(3.10)

with solutions _โˆ— = `๐‘โˆ’๐‘๐‘Ž๐‘โˆ’๐‘2 and ๐›พโˆ— = ๐‘Žโˆ’`๐‘

๐‘Ž๐‘โˆ’๐‘2 . Substitute them in (3.8) to get the assertion (3.4) ofthe theorem, i.e., the efficient portfolio. ๏ฟฝ

4We could choose ๐‘ฅ>ฮฃ๐‘ฅ orโˆš๐‘ฅ>ฮฃ๐‘ฅ equally well in the Lagrangian function ๐ฟ.

rough draft: do not distribute

Page 25: Selected Topics from Portfolio Optimization

3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 25

Corollary 3.14. Note from (3.9) that

E ๐‘ฅโˆ—>b = ๐‘ฅโˆ—>b = `

and

var(๐‘ฅโˆ—>b

)= ๐‘ฅโˆ—>ฮฃ๐‘ฅโˆ— =

`2๐‘ โˆ’ 2`๐‘ + ๐‘Ž๐‘Ž๐‘ โˆ’ ๐‘2

, (3.11)

cov(๐‘ฅโˆ—>b, b๐‘–

)= ๐‘ฅโˆ—>ฮฃ๐‘’๐‘– =

`๐‘ โˆ’ ๐‘๐‘Ž๐‘ โˆ’ ๐‘2

๐‘Ÿ๐‘– +๐‘Ž โˆ’ `๐‘๐‘Ž๐‘ โˆ’ ๐‘2

.

Proof. We have

var(๐‘ฅโˆ—>b

)= ๐‘ฅโˆ—>ฮฃ๐‘ฅโˆ— =

(_ฮฃโˆ’1๐‘Ÿ + ๐›พฮฃโˆ’1 1

)>ฮฃ๐‘ฅโˆ—

= _๐‘Ÿ>๐‘ฅโˆ— + ๐›พ 1> ๐‘ฅโˆ— = _` + ๐›พ

=`๐‘ โˆ’ ๐‘๐‘Ž๐‘ โˆ’ ๐‘2

` + ๐‘Ž โˆ’ `๐‘๐‘Ž๐‘ โˆ’ ๐‘2

= `2๐‘

๐‘Ž๐‘ โˆ’ ๐‘2โˆ’ 2` ๐‘

๐‘Ž๐‘ โˆ’ ๐‘2+ ๐‘Ž

๐‘Ž๐‘ โˆ’ ๐‘2=: ๐œŽ2(`).

๏ฟฝ

Corollary 3.15. It holds that ๐‘Ž๐‘ > ๐‘2.

Proof. The matrix ฮฃ is positive definite as it is a covariance matrix, and so is its inverse. Itthus holds that ๐‘ = 1> ฮฃโˆ’1 1 > 0. The variance is also positive for every ` > 0, so it followsfrom (3.11) that ๐‘Ž๐‘ โˆ’ ๐‘2 > 0. ๏ฟฝ

3.7.1 The Mean-Variance Plot

This section studies the Markowitz problem as a function of the parameter `.

Corollary 3.16 (Mean-variance). Set ๐œŽ2 B var๐‘Œ๐‘ฅโˆ— , then we have (for ๐œŽ > 1โˆš๐‘) that

`(๐œŽ) = ๐‘ +โˆš(๐‘Ž๐‘ โˆ’ ๐‘2) (๐‘๐œŽ2 โˆ’ 1)

๐‘=๐‘

๐‘+

โˆš(๐‘Ž โˆ’ ๐‘

2

๐‘

) (๐œŽ2 โˆ’ 1

๐‘

). (3.12)

Proof. Solve (3.11) for var๐‘Œ๐‘ฅ = ๐œŽ2 using the quadratic formula. ๏ฟฝ

Figure 3.4 graphs the relation (3.12), i.e., the mean and the variance of efficient portfolios.

Corollary 3.17. It holds that

`(๐œŽ) โ‰ค ๐‘๐‘+ ๐œŽ

โˆš๐‘Ž โˆ’ ๐‘

2

๐‘(3.13)

and every efficient portfolio satisfies ๐œŽ โ‰ฅ 1โˆš๐‘

and ` โ‰ฅ ๐‘๐‘.

Proof. This is immediate from (3.12); cf. also Figure 3.4. ๏ฟฝ

Version: May 28, 2021

Page 26: Selected Topics from Portfolio Optimization

26 THE MARKOWITZ MODEL

0 5 ยท 10โˆ’2 0.1 0.15 0.2 0.25 0.3

5 ยท 10โˆ’2

0.1

0.15

0.2

risk, ๐œŽ

retu

rn,`

Figure 3.4: The mean-variance plot (3.12), the efficient frontier and asymptotic (3.13)

Remark 3.18. The Markowitz portfolio with smallest variance which does not include a riskfree asset is given for ` = ๐‘

๐‘(differentiate (3.11) with respect to `) and this portfolio thus is of

particular interest. In particular, note that its variance, by (3.11), is

๐œŽ2min = var ๐‘ฅโˆ—(๐‘

๐‘

)>b =

(๐‘๐‘

)2๐‘ โˆ’ 2๐‘

๐‘๐‘ + ๐‘Ž

๐‘Ž ๐‘ โˆ’ ๐‘2=1๐‘

๐‘2 โˆ’ 2๐‘2 + ๐‘Ž ๐‘๐‘Ž ๐‘ โˆ’ ๐‘2

=1๐‘.

Exercise 3.1. The portfolio with smallest-variance in our data is ` = ๐‘๐‘= 1.7666.3 = 2.66%, the

corresponding standard deviation, which cannot be improved, is ๐œŽ = 1โˆš๐‘= 1โˆš

1> ฮฃโˆ’1 1= 12.3%;

cf. Figure 3.4 and Figure 3.5.

3.7.2 Tangency portfolio

For some fixed risk free rate ๐‘Ÿ0 we study the particular reward

`๐‘š B๐‘Ž โˆ’ ๐‘Ÿ0 ๐‘๐‘ โˆ’ ๐‘Ÿ0 ๐‘

. (3.14)

Lemma 3.19. The variance corresponding to the reward `๐‘š is

๐œŽ2๐‘š B ๐œŽ2(`๐‘š) =๐‘Ž โˆ’ 2๐‘Ÿ0 ๐‘ + ๐‘Ÿ20 ๐‘(๐‘ โˆ’ ๐‘Ÿ0 ๐‘)2

. (3.15)

Proof. From (3.11) it follows that

๐œŽ2(`๐‘š) =๐‘`2๐‘š โˆ’ 2๐‘`๐‘š + ๐‘Ž

๐‘Ž๐‘ โˆ’ ๐‘2

=1

(๐‘ โˆ’ ๐‘Ÿ0๐‘)2๐‘(๐‘Ž โˆ’ ๐‘Ÿ0๐‘)2 + 2๐‘(๐‘Ž โˆ’ ๐‘Ÿ0๐‘) (๐‘ โˆ’ ๐‘Ÿ0๐‘) + ๐‘Ž(๐‘ โˆ’ ๐‘Ÿ0๐‘)2

๐‘Ž๐‘ โˆ’ ๐‘2

= ยท ยท ยท =๐‘Ž โˆ’ 2๐‘Ÿ0 ๐‘ + ๐‘Ÿ20 ๐‘(๐‘ โˆ’ ๐‘Ÿ0 ๐‘)2

rough draft: do not distribute

Page 27: Selected Topics from Portfolio Optimization

3.7 THE CAPITAL ASSET PRICING MODEL (CAPM) 27

after some annoying, but elementary algebra. ๏ฟฝ

Definition 3.20 (Sharpe ratio). The Sharpe ratio of the portfolio with reward `๐‘š is

๐‘ ๐‘š B`๐‘š โˆ’ ๐‘Ÿ0๐œŽ๐‘š

. (3.16)

Remark 3.21. Note first that `๐‘šโˆ’๐‘Ÿ0๐‘โˆ’๐‘Ÿ0 ๐‘ = ๐œŽ2๐‘š. It follows that

๐‘ โˆ’ ๐‘Ÿ0 ๐‘ =`๐‘š โˆ’ ๐‘Ÿ0๐œŽ2๐‘š

=(3.16)

๐‘ ๐‘š

๐œŽ๐‘š

(3.17)

and with (3.14) that๐‘Ž โˆ’ ๐‘Ÿ0 ๐‘ =

(3.14)`๐‘š (๐‘ โˆ’ ๐‘Ÿ0 ๐‘) =

(3.17)

`๐‘š ยท ๐‘ ๐‘š๐œŽ๐‘š

.

From (3.15) we deduce further that

๐‘Ž โˆ’ 2๐‘Ÿ0 ๐‘ + ๐‘Ÿ20 ๐‘ =(3.15)

๐œŽ2๐‘š (๐‘ โˆ’ ๐‘Ÿ0 ๐‘)2 =(3.17)

๐‘ 2๐‘š. (3.18)

Definition 3.22 (Market portfolio, tangency portfolio). The market portfolio is

๐‘ฅ๐‘š B ๐‘ฅโˆ—(`๐‘š) =๐œŽ๐‘š

๐‘ ๐‘šยท ฮฃโˆ’1

(๐‘Ÿ โˆ’ ๐‘Ÿ0 ยท 1

). (3.19)

Remark 3.23. It follows according (3.4) is

๐‘ฅ๐‘š = ๐‘ฅโˆ—(`๐‘š) =`๐‘š๐‘ โˆ’ ๐‘๐‘Ž๐‘ โˆ’ ๐‘2

ฮฃโˆ’1๐‘Ÿ โˆ’ `๐‘š๐‘ โˆ’ ๐‘Ž๐‘Ž๐‘ โˆ’ ๐‘2

ฮฃโˆ’1 1

=1

๐‘ โˆ’ ๐‘Ÿ0๐‘ฮฃโˆ’1๐‘Ÿ โˆ’ ๐‘Ÿ0

๐‘ โˆ’ ๐‘Ÿ0๐‘ฮฃโˆ’1 1

and with (3.17) thus (3.19).

Remark 3.24. For the line ๐‘ก (๐œŽ) B ๐‘Ÿ0 + ๐œŽ ยท `๐‘šโˆ’๐‘Ÿ0๐œŽ๐‘šit holds that

๐‘ก (๐œŽ๐‘š) = `(๐œŽ๐‘š) and๐‘ก โ€ฒ(๐œŽ๐‘š) = `โ€ฒ(๐œŽ๐‘š) (cf. (3.12)).

The line ๐‘ก thus is the tangent drawn from the point of the risk-free asset ๐‘ก (0) = ๐‘Ÿ0 to thefeasible region for risky assets. The tangent line is called capital market line. The portfoliowith decomposition (3.19) is also called the most efficient portfolio, it has the highest reward-to-volatility ratio.

3.7.3 The Two Fund Theorem

Note that ` is a model-parameter in the Markowitz model (2.1). We now compare efficientportfolios for different returns `.

Theorem 3.25 (Two fund theorem). If ๐‘ฅโˆ—1 and ๐‘ฅโˆ—2 are different efficient portfolios (for different`s), then every efficient portfolio can be obtained as an affine combination of these two.

Version: May 28, 2021

Page 28: Selected Topics from Portfolio Optimization

28 THE MARKOWITZ MODEL

0.05 0.1 0.15 0.2

return ยต-0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

allo

ca

tio

n

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

sta

nd

ard

de

via

tio

n ฯƒ

DAX

RWE

gold

oil

Figure 3.5: Asset allocation according to Markowitz for varying return `

Proof. Observe from (3.1) and (3.9) that

๐‘ฅโˆ—(`) = `๐‘ โˆ’ ๐‘๐‘Ž๐‘ โˆ’ ๐‘2

ฮฃโˆ’1๐‘Ÿ + ๐‘Ž โˆ’ `๐‘๐‘Ž๐‘ โˆ’ ๐‘2

ฮฃโˆ’1 1

=

(โˆ’ ๐‘

๐‘Ž๐‘ โˆ’ ๐‘2ฮฃโˆ’1๐‘Ÿ + ๐‘Ž

๐‘Ž๐‘ โˆ’ ๐‘2ฮฃโˆ’1 1

)+ `

(๐‘

๐‘Ž๐‘ โˆ’ ๐‘2ฮฃโˆ’1๐‘Ÿ โˆ’ ๐‘

๐‘Ž๐‘ โˆ’ ๐‘2ฮฃโˆ’1 1

).

๏ฟฝ

Exercise 3.2. The result for our data is (cf. Figure 3.5)

๐‘ฅโˆ—(`) =

ยฉยญยญยญยญยญยซ2.7%10.5%14.3%โˆ’7.8%80.2%

ยชยฎยฎยฎยฎยฎยฌ+ `

ยฉยญยญยญยญยญยซ+1.90โˆ’0.80โˆ’0.05+1.68โˆ’2.73

ยชยฎยฎยฎยฎยฎยฌ;

the auxiliary quantities are ๐‘Ž = 0.40, ๐‘ = 1.76 and ๐‘ = 66.31.

rough draft: do not distribute

Page 29: Selected Topics from Portfolio Optimization

3.8 MARKOWITZ PORTFOLIO INCLUDING A RISK FREE ASSET 29

3.8 MARKOWITZ PORTFOLIO INCLUDING A RISK FREE ASSET

Set ๐‘Ÿ B E b, i.e., ๐‘Ÿ ๐‘— B E b ๐‘— , ๐‘— = 1, . . . , ๐ฝ. Further, let ๐‘Ÿ0 be the return of the risk-free asset.We set

๐‘Ÿ =

ยฉยญยญยญยญยซ๐‘Ÿ0๐‘Ÿ1...

๐‘Ÿ๐ฝ

ยชยฎยฎยฎยฎยฌ=

ยฉยญยญยญยซ๐‘Ÿ0

๐‘Ÿ

ยชยฎยฎยฎยฌ with ๐‘Ÿ =ยฉยญยญยซ๐‘Ÿ1...

๐‘Ÿ๐ฝ

ยชยฎยฎยฌ and ๐‘ฅ =ยฉยญยญยญยญยซ๐‘ฅ0๐‘ฅ1...

๐‘ฅ๐ฝ

ยชยฎยฎยฎยฎยฌ=

ยฉยญยญยญยซ๐‘ฅ0

๐‘ฅ

ยชยฎยฎยฎยฌ with ๐‘ฅ =ยฉยญยญยซ๐‘ฅ1...

๐‘ฅ๐ฝ

ยชยฎยฎยฌ .Note that ๐‘†0๐‘ก = ๐‘†

00 ยท๐‘’

๐‘Ÿ0๐‘ก and the annualized return b0๐‘–= 1

๐‘ก๐‘–+1โˆ’๐‘ก๐‘– ln๐‘†0๐‘ก+1

๐‘†0๐‘ก+1

= ๐‘Ÿ0 is the constant risk-free

interest rate. The risk free asset is not correlated with other assets. The covariance matrix

ฮฃ B

(0 00 ฮฃ

)thus is not invertible. Consequently, the results from the previous section do not apply.

Theorem 3.26. The Markowitz portfolio is given by

๐‘ฅโˆ—(`) =(๐‘ฅโˆ—0(`)๐‘ฅโˆ—(`)

)=

(`๐‘šโˆ’``๐‘šโˆ’๐‘Ÿ0

`โˆ’๐‘Ÿ0`๐‘šโˆ’๐‘Ÿ0 ยท ๐‘ฅ๐‘š

), (3.20)

where ๐‘ฅ๐‘š (cf. (3.19)) is the tangency portfolio (market portfolio).

Proof. Differentiate the Lagrangian (cf. Figure 3.3 for illustration)

๐ฟ (๐‘ฅ;_, ๐›พ) B 12๐‘ฅ>ฮฃ๐‘ฅ โˆ’ _(๐‘Ÿ>๐‘ฅ โˆ’ `) โˆ’ ๐›พ(1> ๐‘ฅ โˆ’ 1)

to get the necessary conditions for optimality,

0 =๐œ•๐ฟ

๐œ•๐‘ฅ= ฮฃ๐‘ฅ โˆ’ _๐‘Ÿ> โˆ’ ๐›พ 1>, (3.21)

0 =๐œ•๐ฟ

๐œ•๐‘ฅ0= โˆ’_๐‘Ÿ0 โˆ’ ๐›พ, (3.22)

0 =๐œ•๐ฟ

๐œ•_= ๐‘Ÿ>๐‘ฅ โˆ’ ` = ๐‘Ÿ0๐‘ฅ0 + ๐‘Ÿ>๐‘ฅ โˆ’ `, (3.23)

0 =๐œ•๐ฟ

๐œ•๐›พ= 1> ๐‘ฅ โˆ’ 1 = ๐‘ฅ0 + 1> ๐‘ฅ โˆ’ 1, (3.24)

We get from (3.21) that ๐‘ฅโˆ— = _ฮฃโˆ’1๐‘Ÿ + ๐›พฮฃโˆ’1 1. Substitute ๐‘ฅโˆ— in (3.23) and (3.24), and aftercollecting terms in (3.22)โ€“(3.24) one finds (cf. (3.10))

ยฉยญยซ0 ๐‘Ÿ0 1๐‘Ÿ0 ๐‘Ž ๐‘

1 ๐‘ ๐‘

ยชยฎยฌ ยฉยญยซ๐‘ฅ0_

๐›พ

ยชยฎยฌ =ยฉยญยซ0`

1

ยชยฎยฌ .This linear matrix equation has explicit solution

ยฉยญยซ๐‘ฅโˆ—0_โˆ—

๐›พโˆ—

ยชยฎยฌ =1

๐‘Ž โˆ’ 2๐‘๐‘Ÿ0 + ๐‘๐‘Ÿ20๏ธธ ๏ธท๏ธท ๏ธธ=๐‘ 2๐‘š, cf. (3.18)

ยฉยญยซ๐‘Ž โˆ’ ๐‘๐‘Ÿ0 + `(๐‘๐‘Ÿ0 โˆ’ ๐‘)

โˆ’๐‘Ÿ0 + `๐‘Ÿ20 โˆ’ ๐‘Ÿ0`

ยชยฎยฌ . (3.25)

Version: May 28, 2021

Page 30: Selected Topics from Portfolio Optimization

30 THE MARKOWITZ MODEL

So we finally get

๐‘ฅโˆ— =

(๐‘ฅโˆ—0๐‘ฅโˆ—

)=

(1๐‘ 2๐‘š(๐‘Ž โˆ’ ๐‘๐‘Ÿ0 + `(๐‘๐‘Ÿ0 โˆ’ ๐‘))_โˆ—ฮฃโˆ’1๐‘Ÿ + ๐›พโˆ—ฮฃโˆ’1 1

)=1๐‘ 2๐‘š

(๐‘ 2๐‘š โˆ’ (๐‘ โˆ’ ๐‘Ÿ0๐‘) (` โˆ’ ๐‘Ÿ0)(` โˆ’ ๐‘Ÿ0)

(ฮฃโˆ’1๐‘Ÿ โˆ’ ๐‘Ÿ0ฮฃโˆ’1 1

) )from (3.24); cf. Exercise 3.3. ๏ฟฝ

Corollary 3.27. The variance (standard deviation, resp.) of the portfolio corresponding to `is

var(๐‘ฅโˆ—(`)>b

)=

(` โˆ’ ๐‘Ÿ0๐‘ ๐‘š

)2(๐œŽ(`) = |` โˆ’ ๐‘Ÿ0 |

๐‘ ๐‘š, resp.).

Proof. Recall the special structure of ฮฃ. It thus follows from (3.20) that

var(๐‘ฅโˆ—(`)>b

)= ๐‘ฅโˆ—(`)>ฮฃ๐‘ฅโˆ—(`)

=(` โˆ’ ๐‘Ÿ0)2

๐‘ 4๐‘š(๐‘Ÿ โˆ’ ๐‘Ÿ0 1)> ฮฃโˆ’1ฮฃฮฃโˆ’1 (๐‘Ÿ โˆ’ ๐‘Ÿ0 1)

=(` โˆ’ ๐‘Ÿ0)2

๐‘ 4๐‘š

(๐‘Ÿ>ฮฃโˆ’1๐‘Ÿ โˆ’ 2๐‘Ÿ0 1> ฮฃโˆ’1๐‘Ÿ + ๐‘Ÿ20 1

> ฮฃโˆ’1 1)

=(` โˆ’ ๐‘Ÿ0)2

๐‘ 4๐‘š

(๐‘Ž โˆ’ 2๐‘Ÿ0๐‘ + ๐‘Ÿ20๐‘

)=

(` โˆ’ ๐‘Ÿ0๐‘ ๐‘š

)2, (3.26)

which is the assertion. ๏ฟฝ

Remark 3.28. Note, that the portfolio with smallest variance is attained here for หœ = ๐‘Ÿ0, thecorresponding variance by (3.26) is zero , i.e, there is no risk. This is in significant contrastto Remark 3.18.

3.9 ONE FUND THEOREM

Theorem 3.29 (One fund theorem, Tobin5-separation, market portfolio). Every efficient port-folio is the affine combination of

(i) a portfolio without a risk-free asset (the market portfolio), and

(ii) the risk free asset.

Definition 3.30. The portfolio allocation in (i) is called market portfolio.

Proof. Choose

(i) ` B ๐‘Ÿ0, then the portfolio in (3.20) is ๐‘ฅโˆ—(๐‘Ÿ0) =ยฉยญยญยญยญยซ10...

0

ยชยฎยฎยฎยฎยฌ; this portfolio does not involve stocks

and thus is completely free of risk, i.e., it consists of the risk-free asset solely;

5Tobin-Separation, 1918โ€“2002, American economist

rough draft: do not distribute

Page 31: Selected Topics from Portfolio Optimization

3.9 ONE FUND THEOREM 31

(ii) For the market portfolio, recall (cf. the tangency portfolio (3.14))

`๐‘š B `๐‘ก =๐‘Ž โˆ’ ๐‘Ÿ0 ๐‘๐‘ โˆ’ ๐‘Ÿ0 ๐‘

and ๐‘ฅ๐‘š B ๐‘ฅ๐‘ก . Then, by (3.19) and (3.20),

๐‘ฅโˆ—(`๐‘š) =(0

๐‘ฅโˆ—(`๐‘ก )

)=

(0

๐œŽ๐‘š

๐‘ ๐‘šฮฃโˆ’1 (๐‘Ÿ โˆ’ ๐‘Ÿ0 ยท 1)

)=

(0๐‘ฅ๐‘š

),

which means that the portfolio ๐‘ฅโˆ—(`๐‘š) is free from risk-free assets, i.e., does not containthe risk free asset.

The assertion follows, as every portfolio is a linear combination of both portfolios by the TwoFund Theorem, Theorem 3.25. Explicitly, the optimal portfolio (cf. (3.20)) is

๐‘ฅโˆ—(`) = `๐‘š โˆ’ ``๐‘š โˆ’ ๐‘Ÿ0

(10

)+ ` โˆ’ ๐‘Ÿ0`๐‘š โˆ’ ๐‘Ÿ0

(0

๐œŽ๐‘š

๐‘ ๐‘šฮฃโˆ’1

(๐‘Ÿ โˆ’ ๐‘Ÿ0 1

) ) =`๐‘š โˆ’ ``๐‘š โˆ’ ๐‘Ÿ0

(10

)+ ` โˆ’ ๐‘Ÿ0`๐‘š โˆ’ ๐‘Ÿ0

(0๐‘ฅ๐‘š

).

๏ฟฝ

Remark 3.31. The tangency portfolio coincides with the market portfolio, ๐‘ฅ๐‘š = ๐‘ฅ๐‘ก .

Example 3.32. For ๐‘Ÿ0 B 2%, the optimal portfolios for our data are given according the onefund theorem as

๐‘ฅโˆ—(`) = 83.3% โˆ’ `81.3%

ยฉยญยญยญยญยญยญยญยซ

100%00000

ยชยฎยฎยฎยฎยฎยฎยฎยฌ๏ธธ ๏ธท๏ธท ๏ธธno stocks

+` โˆ’ 2%81.3%

ยฉยญยญยญยญยญยญยญยซ

0160.7%โˆ’56.0%10.3%132.2%โˆ’147.3%

ยชยฎยฎยฎยฎยฎยฎยฎยฌ๏ธธ ๏ธท๏ธท ๏ธธno cash

;

cf. Figure 3.6.

3.9.1 Capital Asset Pricing Model (CAPM)

Recall that the market (or tangency) portfolio

๐‘ฅ๐‘š = ๐‘ฅ๐‘ก =๐œŽ๐‘š

๐‘ ๐‘šฮฃโˆ’1

(๐‘Ÿ โˆ’ ๐‘Ÿ0 ยท 1

)has expectation (cf. (3.14))

E ๐‘ฅ>๐‘šb = `๐‘š

and variance (cf. (3.15))var

(๐‘ฅ>๐‘šb

)= ๐œŽ2๐‘š.

Remark 3.33 (Covariance of the market portfolio). The covariance of the market portfoliowith asset ๐‘— is

cov(๐‘’>๐‘— b, ๐‘ฅ

>๐‘šb

)= ๐‘’>๐‘— ฮฃ๐‘ฅ๐‘š = ๐‘’>๐‘— ฮฃ ยท

๐œŽ๐‘š

๐‘ ๐‘šฮฃโˆ’1 (๐‘Ÿ โˆ’ ๐‘Ÿ0 ยท 1) =

๐œŽ๐‘š

๐‘ ๐‘š

(๐‘Ÿ ๐‘— โˆ’ ๐‘Ÿ0),

where ๐‘Ÿ = E b and ๐‘Ÿ ๐‘— = E b ๐‘— .

Version: May 28, 2021

Page 32: Selected Topics from Portfolio Optimization

32 THE MARKOWITZ MODEL

0.05 0.1 0.15 0.2

ยต

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

allo

ca

tio

n

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

sta

nd

ard

de

via

tio

n ฯƒ

DAX

RWE

gold

oil

cash

Figure 3.6: Markowitz portfolio including a risk free asset (cash) with return ๐‘Ÿ0 = 2% forvarying return `

It follows from the definition of the Sharpe ratio (3.16) that

๐›ฝ ๐‘— Bcov

(๐‘ฅ>๐‘šb, ๐‘’

>๐‘—b

)var (๐‘ฅ>๐‘šb)

=

๐œŽ๐‘š

๐‘ ๐‘š

(๐‘Ÿ ๐‘— โˆ’ ๐‘Ÿ0)๐œŽ2๐‘š

=๐‘Ÿ ๐‘— โˆ’ ๐‘Ÿ0๐‘ ๐‘š ๐œŽ๐‘š

=๐‘Ÿ ๐‘— โˆ’ ๐‘Ÿ0`๐‘š โˆ’ ๐‘Ÿ0

,

i.e.,

๐‘Ÿ ๐‘— = ๐‘Ÿ0 + ๐›ฝ ๐‘— ยท (`๐‘š โˆ’ ๐‘Ÿ0) . (3.27)

The quantity ๐›ฝ ๐‘— is the sensitivity of the expected excess asset returns to the expected excessmarket returns

The relation (3.27) is the core of the capital asset pricing model (CAPM). The graphof (3.27),

๐›ฝ โ†ฆโ†’ ๐‘Ÿ0 + ๐›ฝ (`๐‘š โˆ’ ๐‘Ÿ0)

is also called security market line (SML in the `-๐›ฝ-diagram).

Remark 3.34. For the market portfolio ๐‘ฅ๐‘š it holds that ๐‘ฅ>๐‘š๐›ฝ = ๐›ฝ๐‘š = 1.Indeed, with (3.27),

`๐‘š = ๐‘ฅ>๐‘š๐‘Ÿ = ๐‘Ÿ0๐‘ฅ>๐‘š 1+๐‘ฅ>๐‘š๐›ฝ ยท (`๐‘š โˆ’ ๐‘Ÿ0) = ๐‘Ÿ0 + ๐‘ฅ>๐‘š๐›ฝ ยท (`๐‘š โˆ’ ๐‘Ÿ0)

and thus the assertion.

rough draft: do not distribute

Page 33: Selected Topics from Portfolio Optimization

3.10 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM 33

3.9.2 On systematic and specific risk

Observe that the correlation of asset ๐‘— with the market is defined as ๐œŒ ๐‘— ,๐‘š Bcov

(๐‘’>๐‘—b , b>๐‘ฅ>๐‘š b

)โˆšvar(๐‘ฅ>๐‘š b) ยทvar

(๐‘’>๐‘—b

)so that

๐›ฝ ๐‘— = ๐œŒ ๐‘— ,๐‘š ยท๐œŽ๐‘—

๐œŽ๐‘š

.

It follows that

๐œŽ๐‘— = ๐œŒ ๐‘— ,๐‘š ๐œŽ๐‘— +(1 โˆ’ ๐œŒ ๐‘— ,๐‘š

)๐œŽ๐‘— = ๐›ฝ ๐‘— ๐œŽ๐‘š๏ธธ๏ธท๏ธท๏ธธ

systematic

+(1 โˆ’ ๐›ฝ ๐‘—

๐œŽ๐‘š

๐œŽ๐‘—

)๐œŽ๐‘—๏ธธ ๏ธท๏ธท ๏ธธ

specific

.

โŠฒ The systematic risk 6 is also called aggregate or undiversifiable risk ;

โŠฒ the specific risk 7 is also called unsystematic, residual or idiosyncratic risk.

3.9.3 Sharpe ratio

Note that E ๐‘’>๐‘—b = ๐‘Ÿ ๐‘— and var ๐‘’>

๐‘—b = ๐‘’>

๐‘—ฮฃ๐‘’ ๐‘— = ฮฃ ๐‘— ๐‘— .

Definition 3.35. Then quantity๐‘Ÿ ๐‘— โˆ’ ๐‘Ÿ0โˆš

ฮฃ ๐‘— ๐‘—

is the Sharpe ratio of asset ๐‘— .8

It holds that

๐›ฝ ๐‘— =cov

(b>๐‘’ ๐‘— , b>๐‘ฅโˆ—๐‘š

)var (b>๐‘ฅโˆ—๐‘š)

=corr

(b>๐‘’ ๐‘— , b>๐‘ฅโˆ—๐‘š

)๐œŽ๐‘š

โˆšฮฃ ๐‘— ๐‘—

๐œŽ2๐‘š=

โˆšฮฃ ๐‘— ๐‘—

๐œŽ๐‘š

๐œŒ ๐‘— ,๐‘š.

The security market line (SML) is

SML : ๐›ฝ โ†ฆโ†’ ๐‘Ÿ0 + ๐›ฝ ยท (`๐‘š โˆ’ ๐‘Ÿ0) .

Note, from (3.27), that

SML(๐›ฝ ๐‘—) = ๐‘Ÿ ๐‘— ,SML(0) = ๐‘Ÿ0 andSML(1) = ๐‘Ÿ๐‘š.

3.10 ALTERNATIVE FORMULATIONS OF THE MARKOWITZ PROBLEM

Instead of Markowitz (3.3) one may consider the problem

maximize ๐‘Ÿ>๐‘ฅsubjet to ๐‘ฅ>ฮฃ๐‘ฅ โ‰ค ๐‘ž,

1> ๐‘ฅ โ‰ค 1,(๐‘ฅ โ‰ฅ 0)

6systematisches Risiko (dt.)7unsystematisches, spezifisches, diversifizierbares Risiko (dt.)8William Sharpe, 1934, Nobel memorial Price in Economic Sciences (1990)

Version: May 28, 2021

Page 34: Selected Topics from Portfolio Optimization

34 THE MARKOWITZ MODEL

Eigenvalue 2.254 0.773 0.440 0.165 0.024Variance explained 61.7% 21.2% 12.0% 4.5% 0.6%

(a) Eigenvalues and percentages of explained variance

PC1 PC2 PC3

DAX -0.02 -0.34 0.31RWE -0.95 0.30 0.05gold 0.05 -0.24 -0.91oil -0.31 0.91 0.26FX 0.08 0.14 0.13

(b) The first 3 principal components ex-plain 94.9%

Table 3.5: Principal component analysis

Proposition 3.36 (Utility maximization). The explicit solution of (^ > 0)

maximize E ๐‘ฅ>b โˆ’ ^2

var ๐‘ฅ>b (3.28)

subjet to 1> ๐‘ฅ โ‰ค 1,(๐‘ฅ โ‰ฅ 0)

is

๐‘ฅ =1^ฮฃโˆ’1

(๐‘Ÿ + ^ โˆ’ 1

> ฮฃโˆ’1๐‘Ÿ

1> ฮฃโˆ’1 11

).

Proof. The first order conditions for the Lagrangian ๐ฟ (๐‘ฅ;_) B ๐‘ฅ>๐‘Ÿ โˆ’ ^2 ๐‘ฅ>ฮฃ๐‘ฅ + _ (1> ๐‘ฅ โˆ’ 1) are

0 = ๐‘Ÿ โˆ’ ^ ฮฃ ๐‘ฅ + _ 1 and1 = 1> ๐‘ฅ.

from which follows that ๐‘ฅ = 1^ฮฃโˆ’1(๐‘Ÿ + _ 1). Further, 1 = 1> ๐‘ฅ = 1

^1> ฮฃโˆ’1(๐‘Ÿ + _ 1), i.e., _ =

^โˆ’1> ฮฃโˆ’1๐‘Ÿ1> ฮฃโˆ’1 1

. Hence the result. ๏ฟฝ

Remark 3.37. The portfolios of (3.28) and (3.3) coincide for ^ = ๐‘‘๐‘ `โˆ’๐‘ . In this case, ` = ๐‘‘+๐‘ ^

๐‘ ^

and ๐œŽ2 = ๐‘‘+^2๐‘ ^2

.

3.11 PRINCIPAL COMPONENTS

Table 3.5 collects the eigenvalues and principal components of the covariance matrix ฮฃ forthe three components according the Karhunenโ€“Loรจve decomposition. The first three princi-pal components explain 95 % of the data.

3.12 PROBLEMS

Exercise 3.3. Verify (3.25) and (3.20).

rough draft: do not distribute

Page 35: Selected Topics from Portfolio Optimization

3.12 PROBLEMS 35

Stocks:return ` ๐‘†1 ๐‘†2 ๐‘†3 ๐‘†4 ๐‘†5

0% 2.8% 10.5% 14.3% โˆ’7.8% 80.2%15% 31.2% โˆ’1.5% 13.6% 17.4% 39.3%

5%

(a) Markowitz portfolio

Stocks:return ` ๐‘†1 ๐‘†2 ๐‘†3

2% โˆ’4% โˆ’2% 106%14% 12% 6% 82%

(b) Markowitz portfolio

Table 3.6: Markowitz portfolios for various `

Exercise 3.4. The following portfolios (asset allocations, Table 3.6a) are efficient (in thesense of Markowitz). Give the Markowitz portfolio for ` = 5%?

Exercise 3.5. Is there a risk free asset among ๐‘†1, . . . ๐‘†5 in Table 3.6a?

Exercise 3.6. Give two pros and two cons for Markowtzโ€™s model.

Exercise 3.7. The portfolios in Table 3.6b are efficient. What is the risk free rate?

Exercise 3.8. Give the portfolio in Table 3.6b which does not contain a risk free asset.

Exercise 3.9. Verify Remark 3.37.

Version: May 28, 2021

Page 36: Selected Topics from Portfolio Optimization

36 THE MARKOWITZ MODEL

rough draft: do not distribute

Page 37: Selected Topics from Portfolio Optimization

4Value-at-Risk

Never catch a falling knife.

investment strategy

4.1 DEFINITIONS

Definition 4.1 (Cumulative distribution function, cdf). Let ๐‘Œ : ฮฉโ†’ R be a real-valued randomvariable. The cumulative distribution function (cdf, or just distribution function) is1

๐น๐‘Œ (๐‘ฅ) B ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฅ). (4.1)

Definition 4.2. The Value-at-Risk at (confidence, or risk) level ๐›ผ โˆˆ [0, 1] is2

V@R๐›ผ (๐‘Œ ) B ๐นโˆ’1๐‘Œ (๐›ผ) = inf {๐‘ฅ : ๐‘ƒ (๐‘Œ โ‰ค ๐‘ฅ) โ‰ฅ ๐›ผ} . (4.2)

The Value-at-Risk is also called the quantile funciton ๐‘ž๐›ผ (๐‘Œ ) B V@R๐›ผ (๐‘Œ ) or generalizedinverse.

Example 4.3. Cf. Figur 4.1 and Table 4.1 or Figure 9.1.

4.2 HOW ABOUT ADDING RISK?

Fact. Consider the random variables (cf. Table 4.2) for which

V@R20%(๐‘‹) + V@R20%(๐‘Œ ) = 2 + 0 = 2 < 4 = V@R20%(๐‘‹ + ๐‘Œ ),

1Note that ๐น๐‘Œ (ยท) is cร dlร g, i.e., continue ร  droite, limite ร  gauche: right continuous with left limits2Note, that inf โˆ… = +โˆž.

๐‘– 1 2 3 4 5

๐‘ฆ๐‘– 3 7 โˆ’3 8 โˆ’5๐‘ƒ (๐‘Œ = ๐‘ฆ๐‘–) 1/5 1/5 1/5 1/5 1/5

(a) Observations

๐‘ฆ โˆ’5 โˆ’3 3 7 8 3.9๐น๐‘Œ (๐‘ฆ) 1/5 2/5 3/5 4/5 5/5 3/5

(b) Cumulative distribution function

๐›ผ 1/5 2/5 3/5 4/5 5/5V@R๐›ผ (๐‘Œ ) โˆ’5 โˆ’3 3 7 8

(c) Value-at-Risk

Table 4.1: Value-at-Risk

37

Page 38: Selected Topics from Portfolio Optimization

38 VALUE-AT-RISK

๐‘ฅ0

1

๐‘๐‘ƒ(๐‘Œ โ‰ค ๐‘ž)

๐‘ž V@R๐›ผ (๐‘Œ )

๐›ผ

๐‘ฆโ€ฒ

๐‘ƒ(๐‘Œ = ๐‘ž)

(a) Cumulative distribution function, cdf

๐นโˆ’1๐‘Œ(๐›ผ)

๐›ผ0 1

๐‘ž = V@R๐‘ (๐‘Œ )

๐‘ฆโ€ฒ

๐›ผ

โˆ’V@R1โˆ’๐‘ (โˆ’๐‘Œ )

๐‘ = ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฆโ€ฒ)

๐‘ƒ(๐‘Œ = ๐‘ž)

(b) Quantile function, the generalized inverse of Figure 4.1a

Figure 4.1: Cumulative distribution and its corresponding quantile function

Figure 4.2: Deutsche Bank, annual report 2014, Values-at-Risk

rough draft: do not distribute

Page 39: Selected Topics from Portfolio Optimization

4.2 HOW ABOUT ADDING RISK? 39

๐‘ƒ (๐‘‹ = ๐‘ฅ๐‘– , ๐‘Œ = ๐‘ฆ๐‘–) 1/3 1/3 1/3๐‘‹ 2 4 5๐‘Œ 7 0 6

๐‘‹ + ๐‘Œ 9 4 11

Table 4.2: Counterexample

butV@R40%(๐‘‹) + V@R40%(๐‘Œ ) = 4 + 6 = 10 > 9 = V@R40%(๐‘‹ + ๐‘Œ ).

Lemma 4.4 (Cf. Figur 4.1). It holds that

(i) ๐นโˆ’1๐‘Œ(๐›ผ) โ‰ค ๐‘ฅ if and only if ๐›ผ โ‰ค ๐น๐‘Œ (๐‘ฅ) (Galois inequalities);

(ii) ๐น๐‘Œ is continuous from right (upper semi-continuous);

(iii) ๐นโˆ’1๐‘Œ

is continuous from left (lower semi-continuous);

(iv) ๐นโˆ’1๐‘Œ(๐น๐‘Œ (๐‘ฅ)) โ‰ค ๐‘ฅ for all ๐‘ฅ โˆˆ R and ๐น๐‘Œ

(๐นโˆ’1๐‘Œ(๐›ผ)

)โ‰ฅ ๐›ผ for all ๐›ผ โˆˆ (0, 1);

(v) ๐นโˆ’1๐‘Œ

(๐น๐‘Œ

(๐นโˆ’1๐‘Œ(๐‘ฆ)

) )= ๐นโˆ’1

๐‘Œ(๐‘ฆ) and ๐น๐‘Œ

(๐นโˆ’1๐‘Œ(๐น๐‘Œ (๐‘ฆ))

)= ๐น (๐‘ฆ).

Remark 4.5 (Quantile transform). Let ๐‘ˆ be uniformly distributed, i.e., ๐‘ƒ(๐‘ˆ โ‰ค ๐‘ข) = ๐‘ข for every๐‘ข โˆˆ (0, 1). The random variables ๐‘Œ and ๐นโˆ’1

๐‘Œ(๐‘ˆ) share the same distribution.

Proof. ๐‘ƒ(๐นโˆ’1๐‘Œ(๐‘ˆ) โ‰ค ๐‘ฆ

)= ๐‘ƒ (๐‘ˆ โ‰ค ๐น๐‘Œ (๐‘ฆ)) = ๐น๐‘Œ (๐‘ฆ), the assertion. ๏ฟฝ

The converse does not hold true, i.e., ๐น๐‘Œ (๐‘Œ ) is not necessarily uniformly distributed. How-ever, we have the following:

Lemma 4.6 (The generalized quantile transform, Pflug and Rรถmisch [2007, Proposition 1.3]).Let ๐‘ˆ be uniform and independent from ๐‘Œ . Then

๐น (๐‘Œ,๐‘ˆ) B (1 โˆ’๐‘ˆ) ยท ๐น (๐‘Œโˆ’) +๐‘ˆ ยท ๐น (๐‘Œ ) (4.3)

is uniformly [0, 1] and๐นโˆ’1๐‘Œ

(๐น (๐‘Œ,๐‘ˆ)

)= ๐‘Œ almost surely,

where ๐น (๐‘ฅโˆ’) B lim๐‘ฅโ€ฒโ†—๐‘ฅ ๐น (๐‘ฅ โ€ฒ).

Proof. For ๐‘ โˆˆ (0, 1) fixed let ๐‘ฆ๐‘ satisfy ๐น๐‘Œ (๐‘ฆ๐‘โˆ’) โ‰ค ๐‘ โ‰ค ๐น (๐‘ฆ๐‘). Then

๐‘ƒ(๐น (๐‘Œ,๐‘ˆ) โ‰ค ๐‘ |๐‘Œ

)=

1 if ๐‘Œ < ๐‘ฆ๐‘

๐‘โˆ’๐น (๐‘ฆ๐‘โˆ’)๐น (๐‘ฆ๐‘)โˆ’๐น (๐‘ฆ๐‘โˆ’) if ๐‘Œ = ๐‘ฆ๐‘

0 if ๐‘Œ > ๐‘ฆ๐‘

and thus ๐‘ƒ(๐น (๐‘Œ,๐‘ˆ) โ‰ค ๐‘

)= ๐น (๐‘ฆ๐‘โˆ’) +

(๐น (๐‘ฆ๐‘) โˆ’ ๐น (๐‘ฆ๐‘โˆ’)

) ๐‘โˆ’๐น (๐‘ฆ๐‘โˆ’)๐น (๐‘ฆ๐‘)โˆ’๐น (๐‘ฆ๐‘โˆ’) = ๐‘, i.e., ๐น (๐‘Œ,๐‘ˆ) is

uniformly distributed.Conditional on {๐‘Œ = ๐‘ฆ} it holds that ๐น (๐‘Œ,๐‘ˆ) โˆˆ

[๐นโˆ’1(๐‘ฆโˆ’), ๐นโˆ’1(๐‘ฆ)

]. But ๐นโˆ’1(๐‘ข) = ๐‘ฆ for every

๐‘ข โˆˆ[๐นโˆ’1(๐‘ฆโˆ’), ๐นโˆ’1(๐‘ฆ)

]and thus the assertion. ๏ฟฝ

Version: May 28, 2021

Page 40: Selected Topics from Portfolio Optimization

40 VALUE-AT-RISK

4.3 PROPERTIES OF THE VALUE-AT-RISK

Nice properties (cf. Lemma 4.4)

(i) Homogeneity: it holds that V@R๐›ผ (_๐‘Œ ) = _ ยท V@R๐›ผ (๐‘Œ ) for _ โ‰ฅ0;3

๐น_๐‘Œ (ยท) = ๐น๐‘Œ ( ยท/_) .

(ii) Cash-invariance: V@R๐›ผ (๐‘Œ + ๐‘) = ๐‘ + V@R๐›ผ (๐‘Œ ), where ๐‘ โˆˆ R is a constant. Note alsothat

๐น๐‘Œ+๐‘ (ยท) = ๐น๐‘Œ (ยท โˆ’ ๐‘) .

(iii) Law-invariance: if ๐‘‹ and ๐‘Œ share the same law, i.e., ๐‘ƒ(๐‘‹ โ‰ค ๐‘ง) = ๐‘ƒ(๐‘Œ โ‰ค ๐‘ง) for all ๐‘ง โˆˆ R,then V@R๐›ผ (๐‘‹) = V@R๐›ผ (๐‘Œ ) (note ๐‘‹ and ๐‘Œ may have the same law, even if ๐‘‹ (๐œ”) โ‰  ๐‘Œ (๐œ”)for all ๐œ” โˆˆ ฮฉ and even if ๐‘‹ : ฮฉโ†’ R and ๐‘Œ : ฮฉโ€ฒโ†’ R);

(iv) ๐น๐‘Œ(๐นโˆ’1๐‘Œ(๐‘)

)โ‰ฅ ๐‘, with equality, if ๐‘ is in the range of ๐น๐‘Œ , equivalently, if ๐นโˆ’1

๐‘Œ(๐‘) is a point

of continuity of ๐น๐‘Œ ;

(v) ๐นโˆ’1๐‘Œ(๐น๐‘Œ (๐‘ข)) โ‰ค ๐‘ข, with equality, if ๐‘ข is in the range of ๐นโˆ’1

๐‘Œ, equivalently, if ๐น๐‘Œ (๐‘ข) is a point

of continuity of ๐นโˆ’1๐‘Œ

;

(vi) comonotonic additive (cf. Section 14), i.e., V@R๐›ผ (๐‘‹ + ๐‘Œ ) = V@R๐›ผ (๐‘‹) + V@R๐›ผ (๐‘Œ ), pro-vided that ๐‘‹ and ๐‘Œ are comonotonic.

4.4 PROFIT VERSUS LOSS

For an illustration see Figure 4.3.

Lemma 4.7 (Profit vs. loss, cf. Figure 4.3). It holds that

V@R๐›ผ (๐‘Œ ) โ‰ค โˆ’V@R1โˆ’๐›ผ (โˆ’๐‘Œ ) (4.4)

with equality if ๐น๐‘Œ (๐‘ฆ + โ„Ž) > ๐น๐‘Œ (๐‘ฆ) for โ„Ž > 0 at ๐‘ฆ = V@R๐›ผ (๐‘Œ ).

Proof. First,

โˆ’V@R1โˆ’๐›ผ (โˆ’๐‘Œ ) = โˆ’ inf {๐‘ฆ : ๐‘ƒ(โˆ’๐‘Œ โ‰ค ๐‘ฆ) โ‰ฅ 1 โˆ’ ๐›ผ}= sup {โˆ’๐‘ฆ : ๐‘ƒ(๐‘Œ โ‰ฅ โˆ’๐‘ฆ) โ‰ฅ 1 โˆ’ ๐›ผ}= sup {๐‘ฆ : ๐‘ƒ(๐‘Œ โ‰ฅ ๐‘ฆ) โ‰ฅ 1 โˆ’ ๐›ผ}= sup {๐‘ฆ : 1 โˆ’ ๐‘ƒ(๐‘Œ โ‰ฅ ๐‘ฆ) โ‰ค ๐›ผ}= sup {๐‘ฆ : ๐‘ƒ(๐‘Œ < ๐‘ฆ) โ‰ค ๐›ผ} (4.5)

Now observe that{๐‘ฆ : ๐‘ƒ(๐‘Œ < ๐‘ฆ) โ‰ค ๐›ผ} ยคโˆช {๐‘ฆ : ๐‘ƒ(๐‘Œ < ๐‘ฆ) > ๐›ผ}

3Throughout this lecture shall investigate positively homogeneous acceptability functionals.

rough draft: do not distribute

Page 41: Selected Topics from Portfolio Optimization

4.4 PROFIT VERSUS LOSS 41

โˆ’AV@R95%(โˆ’๐‘Œ ) = A(๐‘Œ )V@R5%(๐‘Œ )

0

E๐‘Œ

loss profit

๐‘Œ

V@R95%(โˆ’๐‘Œ ) = โˆ’V@R5%(๐‘Œ )AV@R95%(โˆ’๐‘Œ )

0

โˆ’E๐‘Œ

profit loss

โˆ’๐‘Œ

Figure 4.3: Profit versus loss

Version: May 28, 2021

Page 42: Selected Topics from Portfolio Optimization

42 VALUE-AT-RISK

๐‘†1 ๐‘†2 ๐‘†3

ฮž: 5% โˆ’4% โˆ’2%8% 2% 0%4% 1% 0%โˆ’9% 0% โˆ’10%

Table 4.3: Annualized, logarithmic returns

and disjoint intervals. It follows that

โˆ’V@R1โˆ’๐›ผ (โˆ’๐‘Œ ) = inf {๐‘ฆ : ๐‘ƒ(๐‘Œ < ๐‘ฆ) > ๐›ผ}โ‰ฅ inf {๐‘ฆ : ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฆ) > ๐›ผ}โ‰ฅ inf {๐‘ฆ : ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฆ) โ‰ฅ ๐›ผ} ,

the assertion.Now set ๐‘ฆ B V@R๐›ผ (๐‘Œ ). As the cdf ๐‘ฆ โ†ฆโ†’ ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฆ) is right-continuous it follows that ๐‘ฆ is

feasible for (4.2) and (4.5), i.e., ๐‘ƒ(๐‘Œ < ๐‘ฆ) โ‰ค ๐›ผ โ‰ค ๐‘ƒ(๐‘Œ โ‰ค ๐‘ฆ). Hence the result. ๏ฟฝ

Problem 4.8. A Markowitz-like formulation involving the Value-at-Risk is

maximize(in ๐‘ฅ = (๐‘ฅ1, . . . ๐‘ฅ๐‘†) )

1๐‘

โˆ‘๐‘๐‘›=1 ๐‘ฅ

>b๐‘› = ๐‘ฅ>b

subject to V@R5% (๐‘ฅ>b) โ‰ฅ โˆ’2$ 5% worst profits > โˆ’2$๐‘ฅ1 + ยท ยท ยท + ๐‘ฅ๐‘† โ‰ค 1.000$ budget constraint(๐‘ฅ โ‰ฅ 0) shortselling allowed / not allowed

or, with (4.4),

maximize(in ๐‘ฅ = (๐‘ฅ1, . . . ๐‘ฅ๐‘†) )

1๐‘

โˆ‘๐‘๐‘›=1 ๐‘ฅ

>b๐‘› = ๐‘ฅ>b

subject to V@R95% (โˆ’๐‘ฅ>b) โ‰ค 2$ 95% of all losses < 2$๐‘ฅ1 + ยท ยท ยท + ๐‘ฅ๐‘† โ‰ค 1.000$ budget constraint(๐‘ฅ โ‰ฅ 0) shortselling allowed / not allowed

4.5 PROBLEMS

Exercise 4.1. Is Problem 4.8 always feasible? Where is the Risk? Which statistics areinvolved?

Exercise 4.2. Is Problem 4.8 clever? โ€“ Downsides? How can one obtain higher returns?

Exercise 4.3. The matrix ฮž in Table 4.3 contains logarithmic, annualized returns of 3 sharesat the end of 4 quarters. You are invested with ๐‘ฅ = (40%, 30%, 30%). What is the Value-at-Risk at risk-level ๐›ผ = 30%, ๐›ผ = 70% of your returns?

rough draft: do not distribute

Page 43: Selected Topics from Portfolio Optimization

5Axiomatic Treatment of Risk

If in trouble, double.

Bรถrsenweisheit

Definition 5.1 (Artzner et al. [1999, 1997]). A positively homogeneous risk measure, aka.risk functional or coherent risk measure is a mapping R : ๐ฟ ๐‘ โ†’ R โˆช {โˆž} with the followingproperties:

(i) MONOTONICITY: R (๐‘Œ1) โ‰ค R (๐‘Œ2) whenever ๐‘Œ1 โ‰ค ๐‘Œ2 almost surely;

(ii) CONVEXITY: R ((1 โˆ’ _)๐‘Œ0 + _๐‘Œ1) โ‰ค (1 โˆ’ _) R (๐‘Œ0) + _R (๐‘Œ1) for 0 โ‰ค _ โ‰ค 1;

(iii) TRANSLATION EQUIVARIANCE:1 R(๐‘Œ + ๐‘) = R(๐‘Œ ) + ๐‘ if ๐‘ โˆˆ R;

(iv) POSITIVE HOMOGENEITY: R(_๐‘Œ ) = _R(๐‘Œ ) whenever _ > 0.

Throughout this lecture shall investigate positively homogeneous risk functionals.

Remark 5.2. In the present context ๐‘Œ is associated with loss. In the literature the mapping

๐œŒ : ๐‘Œ โ†ฆโ†’ R(โˆ’๐‘Œ )

is often called coherent risk measure instead, when ๐‘Œ is associated with a reward rather thana loss: Whereas R is more frequent in an actuarial (insurance) context, ๐œŒ is typically used ina banking context.

The term acceptability functional (or Utility Function, cf. Figure 4.3) is frequently used forthe concave mapping

A : ๐‘Œ โ†ฆโ†’ โˆ’R (โˆ’๐‘Œ ) . (5.1)

Example 5.3 (Simple Examples of risk measures). The functionals

R(๐‘Œ ) B E๐‘Œ

andR(๐‘Œ ) B ess sup๐‘Œ

are risk measures.

Example 5.4. The functional R(๐‘Œ ) B E๐‘Œ๐‘ is a risk functional, provided that ๐‘ โ‰ฅ 0 andE ๐‘ = 1.

1In an economic or monetary environment this is often called CASH INVARIANCE instead.

43

Page 44: Selected Topics from Portfolio Optimization

44 AXIOMATIC TREATMENT OF RISK

Proof. By translation equivariance we have that

R(๐‘Œ ) + ๐‘ = R(๐‘Œ + ๐‘ 1) = E(๐‘Œ + ๐‘ 1)๐‘ = E๐‘Œ + ๐‘E ๐‘,

hence E ๐‘ = 1.Further, we have for all ๐‘Œ1 โ‰ค ๐‘Œ2 that E๐‘Œ1๐‘ = R(๐‘Œ1) โ‰ค R(๐‘Œ1) = E๐‘Œ2๐‘, i.e., E ๐‘๐‘Œ โ‰ฅ 0 for all

๐‘Œ โ‰ฅ 0. This can only hold true for ๐‘ โ‰ฅ 0. ๏ฟฝ

Theorem 5.5. Suppose a risk functional R(ยท) is well defined on ๐ฟโˆž and satisfies (i) and (iii)in Definition 5.1. Then R is Lipschitz-continuous with respect to โ€–ยทโ€–โˆž; the Lipschitz constantis 1.

Proof. For ๐‘‹ and ๐‘Œ โˆˆ ๐ฟโˆž it is true that ๐‘‹ โˆ’ ๐‘Œ โ‰ค โ€–๐‘‹ โˆ’ ๐‘Œ โ€–โˆž and hence ๐‘‹ โ‰ค โ€–๐‘Œ โˆ’ ๐‘‹ โ€–โˆž + ๐‘Œ . Bymonotonicity and translation equivariance thus

R(๐‘‹) โ‰ค R(๐‘Œ + โ€–๐‘Œ โˆ’ ๐‘‹ โ€–โˆž) = โ€–๐‘Œ โˆ’ ๐‘‹ โ€–โˆž + R(๐‘Œ )

and thusR(๐‘‹) โˆ’ R(๐‘Œ ) โ‰ค โ€–๐‘Œ โˆ’ ๐‘‹ โ€–โˆž ;

interchanging the role of ๐‘‹ and ๐‘Œ reveals the result. ๏ฟฝ

Lemma 5.6. If R1 and R2 are risk measures, then so are

โŠฒ 12R1 +

12R2 and

โŠฒ max {R1,R2}.

By the Fenchelโ€“Moreau theorem (see below), no other risk measures are possible.

rough draft: do not distribute

Page 45: Selected Topics from Portfolio Optimization

6Examples of Coherent Risk Functionals

Buy on rumors, sell on facts.

Bรถrsenweisheit

6.1 MEAN SEMI-DEVIATION

The semi-deviation risk measure is1

R(๐‘Œ ) B E๐‘Œ + ๐›ฝ ยท โ€– (๐‘Œ โˆ’ E๐‘Œ )+โ€– ๐‘ , (6.1)

where ๐›ฝ โˆˆ [0, 1] and ๐‘ โ‰ฅ 1.

Proposition 6.1. The semi-deviation (6.1) is a risk measure.

Proof. Convexity (ii), translation equivariance (iii) and homogeneity (iv) are evident. To showmonotonicity (i) assume that ๐‘‹ โ‰ค ๐‘Œ . By Jensens inequality we have that (๐‘ฅ + ๐‘ฆ)+ โ‰ค ๐‘ฅ+ + ๐‘ฆ+and it follows that

(๐‘‹ โˆ’ E ๐‘‹)+ =(๐‘Œ โˆ’ E๐‘Œ + (๐‘‹ โˆ’ ๐‘Œ โˆ’ E(๐‘‹ โˆ’ ๐‘Œ ))

)+

โ‰ค (๐‘Œ โˆ’ E๐‘Œ )+ +(๐‘‹ โˆ’ ๐‘Œ โˆ’ E(๐‘‹ โˆ’ ๐‘Œ )

)+.

Now we have that ๐‘ฅ+ = ๐‘ฅ + (โˆ’๐‘ฅ)+ (cf. Footnote 1) and thus further

(๐‘‹ โˆ’ E ๐‘‹)+ โ‰ค (๐‘Œ โˆ’ E๐‘Œ )+ +(๐‘‹ โˆ’ ๐‘Œ โˆ’ E(๐‘‹ โˆ’ ๐‘Œ )

)+

(๐‘Œ โˆ’ ๐‘‹ โˆ’ E(๐‘Œ โˆ’ ๐‘‹)

)+.

Notice recall that ๐‘Œ โˆ’ ๐‘‹ โ‰ฅ 0 and further that E ๐‘‹ โ‰ค E๐‘Œ , and thus(๐‘Œ โˆ’ ๐‘‹ โˆ’E(๐‘Œ โˆ’ ๐‘‹)

)+ โ‰ค ๐‘Œ โˆ’ ๐‘‹.

Consequently

(๐‘‹ โˆ’ E ๐‘‹)+ โ‰ค (๐‘Œ โˆ’ E๐‘Œ )+ + ๐‘‹ โˆ’ ๐‘Œ โˆ’ E(๐‘‹ โˆ’ ๐‘Œ ) + ๐‘Œ โˆ’ ๐‘‹= (๐‘Œ โˆ’ E๐‘Œ )+ + E(๐‘Œ โˆ’ ๐‘‹).

It follows thatE ๐‘‹ + (๐‘‹ โˆ’ E ๐‘‹)+ โ‰ค E๐‘Œ + (๐‘Œ โˆ’ E๐‘Œ )+.

Now multiply the latter inequality with the density ๐‘ with ๐‘ โ‰ฅ 0, E ๐‘ = 1 and โ€–๐‘ โ€–๐‘ž โ‰ค 1 toobtain

E ๐‘‹ + โ€–(๐‘‹ โˆ’ E ๐‘‹)+โ€– ๐‘ โ‰ค E๐‘Œ + โ€–(๐‘Œ โˆ’ E๐‘Œ )+โ€– ๐‘by Hรถlderโ€™s inequality. Multiplying this inequality with ๐›ฝ and adding (1โˆ’ ๐›ฝ) times the inequalityE ๐‘‹ โ‰ค E๐‘Œ finally gives monotonicity (i). ๏ฟฝ

1๐‘ฅ+ B max {0, ๐‘ฅ}. Note, that ๐‘ฅ+ โˆ’ (โˆ’๐‘ฅ)+ = ๐‘ฅ.

45

Page 46: Selected Topics from Portfolio Optimization

46 EXAMPLES OF COHERENT RISK FUNCTIONALS

6.2 AVERAGE VALUE-AT-RISK

The most important and prominent acceptability functional satisfying all axioms of the Defini-tion is the Average Value-at-Risk.

Definition 6.2. The Average Value-at-Risk 2 at level ๐›ผ โˆˆ [0, 1] is given by

AV@R๐›ผ (๐‘Œ ) B11 โˆ’ ๐›ผ

โˆซ 1

๐›ผ

V@R๐‘ (๐‘Œ ) d๐‘ =11 โˆ’ ๐›ผ

โˆซ 1

๐›ผ

๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข (0 โ‰ค ๐›ผ < 1)

andAV@R1 (๐‘Œ ) B ess sup๐‘Œ .

Proposition 6.3. Representations of the Average Value-at-Risk include(cf. Footnote 1 on thepreceding page)

AV@R๐›ผ (๐‘Œ ) =11 โˆ’ ๐›ผ

โˆซ 1

๐›ผ

๐นโˆ’1๐‘Œ (๐‘) d๐‘ (6.2)

= inf๐‘žโˆˆR

๐‘ž + 11 โˆ’ ๐›ผ E (๐‘Œ โˆ’ ๐‘ž)+ (6.3)

= sup{E ๐‘Œ๐‘ : 0 โ‰ค ๐‘ โ‰ค (1 โˆ’ ๐›ผ)โˆ’1 , E ๐‘ = 1

}(6.4)

= sup{E๐‘„ ๐‘Œ :

d๐‘„d๐‘ƒโ‰ค 11 โˆ’ ๐›ผ

}=

cf. Remark 6.4E

(๐‘Œ | ๐‘Œ > V@R๐›ผ (๐‘Œ )

). (6.5)

The ๐›ผ-quantile ๐‘žโˆ— = ๐นโˆ’1๐‘Œ(๐›ผ) minimizes (6.3).

Remark 6.4. Equation (6.5) is correct, provided that ๐‘ƒ (๐‘Œ > V@R๐›ผ (๐‘Œ )) = 1โˆ’๐›ผ, i.e., ๐‘ƒ (๐‘Œ โ‰ค V@R๐›ผ (๐‘Œ )) =๐›ผ, or ๐น๐‘Œ

(๐นโˆ’1๐‘Œ(๐›ผ

)= ๐›ผ (cf. Lemma 4.4 (iv)). In this case (6.5) follows from (6.2).

Proof. Differentiate the objective in (6.3) with respect to ๐‘ž to get the necessary condition ofoptimality

0 = 1 โˆ’ 11 โˆ’ ๐›ผ E1{๐‘ž<๐‘Œ } = 1 โˆ’

11 โˆ’ ๐›ผ (1 โˆ’ ๐‘ƒ (๐‘Œ โ‰ค ๐‘ž)) ,

i.e., ๐‘ƒ(๐‘Œ โ‰ค ๐‘ž) = ๐›ผ, so that ๐‘žโˆ— = ๐นโˆ’1๐‘Œ(๐›ผ) = V@R๐›ผ (๐‘Œ ). The objective in (6.3) hence is

๐‘žโˆ— + 11 โˆ’ ๐›ผ E (๐‘Œ โˆ’ ๐‘ž

โˆ—)+ = ๐‘žโˆ— +11 โˆ’ ๐›ผ

โˆซ 1

0

(๐นโˆ’1๐‘Œ (๐‘ข) โˆ’ ๐‘žโˆ—

)+d๐‘ข

= ๐นโˆ’1๐‘Œ (๐›ผ) +11 โˆ’ ๐›ผ

โˆซ 1

๐›ผ

๐นโˆ’1๐‘Œ (๐‘ข) โˆ’ ๐นโˆ’1๐‘Œ (๐›ผ) d๐‘ข

=11 โˆ’ ๐›ผ

โˆซ 1

๐›ผ

๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข = AV@R๐›ผ (๐‘Œ ).

2TheโŠฒ Average Value-at-Risk, or conditional Value-at-Risk,

is sometimes also calledโŠฒ Conditional Tail Expectation (CTE)โŠฒ expected shortfall,โŠฒ tail value-at-risk or newlyโŠฒ super-quantile.

rough draft: do not distribute

Page 47: Selected Topics from Portfolio Optimization

6.2 AVERAGE VALUE-AT-RISK 47

See Lemma (6.7) below for the remaining assertion. ๏ฟฝ

Lemma 6.5. It holds that

(i) AV@R0(๐‘Œ ) = E๐‘Œ ;

(ii) V@R๐›ผ (๐‘Œ ) โ‰ค AV@R๐›ผ (๐‘Œ );

(iii) AV@R๐›ผโ€ฒ (๐‘Œ ) โ‰ค AV@R๐›ผ (๐‘Œ ), provided that ๐›ผโ€ฒ โ‰ค ๐›ผ, and particularly

(iv) E๐‘Œ๏ธธ๏ธท๏ธท๏ธธrisk neutral

โ‰ค AV@R๐›ผ (๐‘Œ )๏ธธ ๏ธท๏ธท ๏ธธrisk averse

โ‰ค ess sup๐‘Œ๏ธธ ๏ธท๏ธท ๏ธธcompletely risk averse

.

Proof. Indeed, substitute ๐‘ข โ† ๐น๐‘Œ (๐‘ฆ) and we have AV@R0(๐‘Œ ) =โˆซ 10 ๐น

โˆ’1๐‘Œ(๐‘ข) d๐‘ข =

โˆซR๐‘ฆ d๐น๐‘Œ (๐‘ฆ) =

E๐‘Œ . In case a density is available then d๐น๐‘Œ (๐‘ฆ) = ๐‘“๐‘Œ (๐‘ฆ) d๐‘ฆ and thus AV@R0(๐‘Œ ) =โˆซR๐‘ฆ ๐‘“๐‘Œ (๐‘ฆ) d๐‘ฆ.

For the proof of (iii) see the more general Proposition 6.12 below. ๏ฟฝ

Proposition 6.6. The Average Value-at-Risk is a risk functional according the axioms ofDefinition 5.1.

Proof. Monotonicity, translation equivariance and positive homogeneity are evident by (6.2)and (6.3).

As for convexity let ๐‘žโˆ—0 (๐‘žโˆ—1, resp.) be optimal in (6.3) for the random variable ๐‘Œ0 (๐‘Œ1, resp.).Set ๐‘Œ_ B (1 โˆ’ _)๐‘Œ0 + _๐‘Œ1 and ๐‘ž_ B (1 โˆ’ _)๐‘žโˆ—0 + _๐‘ž

โˆ—1. Then

AV@R๐›ผ

((1 โˆ’ _)๐‘Œ0 + _๐‘Œ1

)โ‰ค ๐‘ž_ +

11 โˆ’ ๐›ผ E (๐‘Œ_ โˆ’ ๐‘ž_)+

= (1 โˆ’ _)๐‘žโˆ—0 + _๐‘žโˆ—1 +

11 โˆ’ ๐›ผ E

((1 โˆ’ _)

(๐‘Œ0 โˆ’ ๐‘žโˆ—0

)+ _

(๐‘Œ1 โˆ’ ๐‘žโˆ—1

) )+

โ‰ค (1 โˆ’ _)๐‘žโˆ—0 + _๐‘žโˆ—1 +1 โˆ’ _1 โˆ’ ๐›ผ E

(๐‘Œ0 โˆ’ ๐‘žโˆ—0

)+ +

_

1 โˆ’ ๐›ผ E(๐‘Œ1 โˆ’ ๐‘žโˆ—1

)+ (6.6)

= (1 โˆ’ _) AV@R๐›ผ (๐‘Œ0) + _ AV@R๐›ผ (๐‘Œ1),

where we have used Jensenโ€™s inequality in (6.6) for the convex function ๐‘ฆ โ†ฆโ†’ (๐‘ฆ โˆ’ ๐‘ž)+; thusthe assertion. ๏ฟฝ

Lemma 6.7. It holds that

AV@R๐›ผ (๐‘Œ ) = max{E๐‘Œ๐‘ : 0 โ‰ค ๐‘ โ‰ค 1

1 โˆ’ ๐›ผ , E ๐‘ = 1}= min

๐‘โˆˆR

{๐‘ + 11 โˆ’ ๐›ผ E (๐‘Œ โˆ’ ๐‘)+

}. (6.7)

Proof. We provide a prove of the statement for discrete random variables based on duality.Recall first that the linear problems

minimize (in ๐‘ฅ) ๐‘>๐‘ฅsubject to ๐ด1๐‘ฅ = ๐‘1

๐ด2๐‘ฅ โ‰ฅ ๐‘2๐‘ฅ โ‰ฅ 0

and

maximize (in _,`) _>๐‘1 + `>๐‘2subject to _>๐ด1 + `>๐ด2 โ‰ค ๐‘>

` โ‰ฅ 0

Version: May 28, 2021

Page 48: Selected Topics from Portfolio Optimization

48 EXAMPLES OF COHERENT RISK FUNCTIONALS

are dual to each other. We rewrite the initial problem (6.7)

โˆ’minimize (in ๐‘) โˆ‘๐‘– โˆ’๐‘๐‘–๐‘Œ๐‘–๐‘๐‘–

subject toโˆ‘

๐‘– ๐‘๐‘–๐‘๐‘– = 1โˆ’๐‘๐‘–๐‘๐‘– โ‰ฅ โˆ’๐‘๐‘– 11โˆ’๐›ผ๐‘๐‘– โ‰ฅ 0

with ๐‘๐‘– = โˆ’๐‘๐‘–๐‘Œ๐‘–, ๐ด1,๐‘– = ๐‘๐‘–, ๐‘1 = 1, ๐ด2,๐‘– = โˆ’๐‘๐‘– and ๐‘2,๐‘– = โˆ’ ๐‘๐‘–1โˆ’๐›ผ . Inserting in the dual gives

โˆ’maximize (in _, `) _ โˆ’โˆ‘๐‘–

๐‘๐‘–1โˆ’๐›ผ `๐‘–

subject to _๐‘๐‘– โˆ’ ๐‘๐‘–`๐‘– โ‰ค โˆ’๐‘๐‘–๐‘Œ๐‘– ,`๐‘– โ‰ฅ 0.

(6.8)

Now note that the latter two inequalities are `๐‘– โ‰ฅ 0 and `๐‘– โ‰ฅ _ + ๐‘Œ๐‘–. The maximum in (6.8) isattained for `๐‘– = max {0, _ + ๐‘Œ๐‘–}. Hence (6.8) rewrites as

โˆ’maximize_ = _ โˆ’ 11 โˆ’ ๐›ผ

โˆ‘๐‘–

๐‘๐‘– (๐‘Œ๐‘– + _)+ = minimize_ = โˆ’_ + 11 โˆ’ ๐›ผ

โˆ‘๐‘–

๐‘๐‘– (๐‘Œ๐‘– + _)+ ,

from which the assertion follows. ๏ฟฝ

6.3 ENTROPIC VALUE-AT-RISK

Definition 6.8. The Entropic Value-at-Risk at risk level ๐›ผ โˆˆ [0, 1) is

EV@R๐›ผ (๐‘Œ ) = inf๐‘ก>0

1๐‘กlog

11 โˆ’ ๐›ผ E ๐‘’

๐‘ก๐‘Œ .

It is a risk measure satisfying all conditions (i)โ€“(iv) in Definition 5.1.

6.4 SPECTRAL RISK MEASURES

Definition 6.9. Spectral risk measures3 are

R๐œŽ (๐‘Œ ) Bโˆซ 1

0๐œŽ(๐‘ข)๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข

for some spectral function ๐œŽ : [0, 1] โ†’ R; occasionally, the function ๐œŽ(ยท) is also called spec-trum.

Proposition 6.10. Spectral functions ๐œŽ : [0, 1) โ†’ Rโ‰ฅ0 necessarily satisfy

(i)โˆซ 10 ๐œŽ(๐‘ข) d๐‘ข = 1 (by translation equivariance),

(ii) ๐œŽ(ยท) โ‰ฅ 0 (by monotonicity) and

(iii) ๐œŽ(ยท) is nondecreasing (by convexity).

Proof. See Exercise 6.3. ๏ฟฝ

3Also: distortion risk funtionals

rough draft: do not distribute

Page 49: Selected Topics from Portfolio Optimization

6.5 KUSUOKAโ€™S REPRESENTATION OF LAW INVARIANT RISK MEASURES 49

Remark 6.11. The Average Value-at-Risk is a spectral risk measure for the spectrum

๐œŽ(๐‘ข) B{0 if ๐‘ข < ๐›ผ,11โˆ’๐›ผ if ๐‘ข โ‰ฅ ๐›ผ.

Proposition 6.12. Suppose thatโˆซ 1๐›ผ๐œŽ1(๐‘ข) d๐‘ข โ‰ค

โˆซ 1๐›ผ๐œŽ2(๐‘ข) d๐‘ข for all ๐›ผ โˆˆ (0, 1), then R๐œŽ1 (๐‘Œ ) โ‰ค

R๐œŽ2 (๐‘Œ ).

Proof. We verify the statement for ๐‘Œ bounded. Set ฮฃ๐‘– (๐‘ข) Bโˆซ 1๐‘ข๐œŽ๐‘– (๐‘) d๐‘. Then

R๐œŽ1 (๐‘Œ ) =โˆซ 1

0๐œŽ1(๐‘ข)๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข = โˆ’

โˆซ 1

0๐นโˆ’1๐‘Œ (๐‘ข) dฮฃ1(๐‘ข),

and by integration by parts

R๐œŽ1 (๐‘Œ ) = โˆ’๐นโˆ’1๐‘Œ (๐‘ข)ฮฃ1(๐‘ข)๏ฟฝ๏ฟฝ1๐‘ข=0 +

โˆซ 1

0ฮฃ1(๐‘ข) d๐นโˆ’1๐‘Œ (๐‘ข) = ๐นโˆ’1๐‘Œ (0) +

โˆซ 1

0ฮฃ1(๐‘ข) d๐นโˆ’1๐‘Œ (๐‘ข).

Note, that ฮฃ1(ยท) โ‰ค ฮฃ2(ยท) by assumption and ๐นโˆ’1๐‘Œ(ยท) is an increasing function, thus

R๐œŽ1 (๐‘Œ ) = ๐นโˆ’1๐‘Œ (0) +โˆซ 1

0ฮฃ1(๐‘ข) d๐นโˆ’1๐‘Œ (๐‘ข) โ‰ค ๐นโˆ’1๐‘Œ (0) +

โˆซ 1

0ฮฃ2(๐‘ข) d๐นโˆ’1๐‘Œ (๐‘ข) = R๐œŽ2 (๐‘Œ ).

๏ฟฝ

Lemma 6.13. R` (๐‘Œ ) Bโˆซ 10 AV@R๐›ผ (๐‘Œ ) ` (d๐›ผ) is a spectral risk measure, provided that

โŠฒโˆซ 10 ` (d๐›ผ) = 1 (to ensure translation equivariance) and

โŠฒ `(ยท) โ‰ฅ 0 (to ensure monotonicity).

Proof. Indeed,

R` (๐‘Œ ) =โˆซ 1

0AV@R๐›ผ (๐‘Œ ) ` (d๐›ผ) =

โˆซ 1

0V@R๐›ผ (๐‘Œ ) ๐œŽ (๐›ผ) d๐›ผ,

where ๐œŽ (๐‘) =โˆซ ๐‘

0` (d๐›ผ)1โˆ’๐›ผ is the spectrum. ๏ฟฝ

Lemma 6.14. For ๐‘Œ โ‰ฅ 0 a.s. we have the representation

R๐œŽ (๐‘Œ ) =โˆซ โˆž

0ฮฃ (๐น๐‘Œ (๐‘ž)) d๐‘ž (if ๐‘Œ โ‰ฅ 0 a.s.),

where ฮฃ (๐›ผ) =โˆซ 1๐›ผ๐œŽ (๐‘) d๐‘ is the negative antiderivative.

Version: May 28, 2021

Page 50: Selected Topics from Portfolio Optimization

50 EXAMPLES OF COHERENT RISK FUNCTIONALS

Figure 6.1: AV@R๐›ผโ€™s are extreme points, so there is some Choquet-Bishop-de Leeuw Rep-resentation for A (Krein-Milman Theorem).

6.5 KUSUOKAโ€™S REPRESENTATION OF LAW INVARIANT RISK MEASURES

A supremum of Choquet representations.

Theorem 6.15. Suppose R is a law invariant Risk measure. Then it has the Kusuoka-representation

R (๐‘Œ ) = sup`โˆˆM

โˆซ 1

0AV@R๐›ผ (๐‘Œ ) ` (d๐›ผ)

(M is a set of positive measures on [0, 1]).

Proof. We have that R(๐‘Œ ) = sup๐‘ E๐‘Œ๐‘ โˆ’ Rโˆ—(๐‘) = sup๐‘ โˆˆZ E๐‘Œ๐‘ as Rโˆ—(๐‘) โˆˆ {0,โˆž}. For ๐‘ โˆˆ Zgiven, let ๐‘Œ โ€ฒ have the same distribution as ๐‘Œ so that ๐‘Œ โ€ฒ and ๐‘ are comonotone. Then

R(๐‘Œ ) = R(๐‘Œ โ€ฒ) = sup๐‘ โˆˆZ

E๐‘Œ๐‘ โˆ’ Rโˆ—(๐‘) = sup๐‘ โˆˆZ

โˆซ 1

0๐นโˆ’1๐‘Œ (๐‘ข)๐นโˆ’1๐‘ (๐‘ข) d๐‘ข = sup

๐œŽโˆˆฮฃ

โˆซ 1

0๐œŽ(๐‘ข)๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข,

where ฮฃ = {๐น๐‘ (ยท) : ๐‘ โˆˆ Z} collects all distribution functions of Z. Hence the result. ๏ฟฝ

Proposition 6.16. For any law invariant risk functional R it holds that

E๐‘Œ โ‰ค R(๐‘Œ ).

Proof. Consider the functional R๐œŽ (ยท) first. Find ๏ฟฝ๏ฟฝ such that ๐œŽ(๐‘ข) โ‰ค 1 whenever ๐‘ข โ‰ค ๏ฟฝ๏ฟฝ and๐œŽ(๐‘ข) โ‰ฅ 1 for ๐‘ข โ‰ฅ ๏ฟฝ๏ฟฝ. Note as well that

โˆซ ๏ฟฝ๏ฟฝ

0 1 โˆ’ ๐œŽ(๐‘ข) d๐‘ข =โˆซ 1๏ฟฝ๏ฟฝ๐œŽ(๐‘ข) โˆ’ 1 d๐‘ข, as

(โˆซ ๏ฟฝ๏ฟฝ

0 +โˆซ 1๏ฟฝ๏ฟฝ

)๐œŽ(๐‘ข) d๐‘ข =(โˆซ ๏ฟฝ๏ฟฝ

0 +โˆซ 1๏ฟฝ๏ฟฝ

)1 d๐‘ข = 1. Thenโˆซ ๏ฟฝ๏ฟฝ

0(1 โˆ’ ๐œŽ(๐‘ข))๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข โ‰ค

โˆซ ๏ฟฝ๏ฟฝ

0(1 โˆ’ ๐œŽ(๐‘ข))๐นโˆ’1๐‘Œ (๏ฟฝ๏ฟฝ) d๐‘ข

=

โˆซ 1

๏ฟฝ๏ฟฝ

(๐œŽ(๐‘ข) โˆ’ 1)๐นโˆ’1๐‘Œ (๏ฟฝ๏ฟฝ) d๐‘ข โ‰คโˆซ 1

๏ฟฝ๏ฟฝ

(๐œŽ(๐‘ข) โˆ’ 1)๐นโˆ’1๐‘Œ (๐‘ข) d๐‘ข.

The assertion follows from Kusuokaโ€™s representation. ๏ฟฝ

Proposition 6.17. For any law invariant risk measure R and sub-sigma algebra G we havethat

R(E(๐‘Œ | G)

)โ‰ค R (๐‘Œ ) .

rough draft: do not distribute

Page 51: Selected Topics from Portfolio Optimization

6.6 APPLICATION IN INSURANCE 51

Proof. Note that ๐‘ฅ โ†ฆโ†’ (๐‘ฅ โˆ’ ๐‘ž)+ is convex. Thus, by the conditional Jensen inequality

(E(๐‘Œ | G) โˆ’ ๐‘ž)+ โ‰ค E ((๐‘Œ โˆ’ ๐‘ž)+ |G) .

It follows that

AV@R๐›ผ

(E(๐‘Œ |G)

)= min

๐‘žโˆˆR๐‘ž + 11 โˆ’ ๐›ผ E ((E๐‘Œ | G) โˆ’ ๐‘ž)+

โ‰ค min๐‘žโˆˆR

๐‘ž + 11 โˆ’ ๐›ผ EE ((๐‘Œ โˆ’ ๐‘ž)+ | G)

= min๐‘žโˆˆR

๐‘ž + 11 โˆ’ ๐›ผ E ((๐‘Œ โˆ’ ๐‘ž)+) = AV@R๐›ผ (๐‘Œ ).

The assertion follows form Kusuokaโ€™s representation. ๏ฟฝ

Remark 6.18. For the Average Value-at-Risk we have that AV@R๐›ผ (E(๐‘Œ |F )) โ‰ค AV@R๐›ผ (๐‘Œ ),where F is a sub-sigma algebra, and thus R (E(๐‘Œ |F )) โ‰ค R (๐‘Œ ) for law invariant risk function-als by Kusuokaโ€™s theorem.

Proposition 6.19 (Cf. Fรถllmer and Schied [2004, Theorem 4.67]). AV@R๐›ผ (ยท) is the smallestlaw invariant coherent risk functional dominating V@R๐›ผ (ยท) (cf. Lemma 6.5 (ii) and Exer-cise 6.4).

Proof. By translation equivariance and for ๐‘Œ โˆˆ ๐ฟโˆž we may assume that ๐‘Œ > 0. Set ๐ด B{๐‘Œ > V@R๐›ผ (๐‘Œ )} and consider ๐‘‹ B ๐‘Œ ยท 1

๐ด{+E(๐‘Œ |๐ด) ยท 1๐ด. Notice, that ๐‘‹ = E

(๐‘Œ

๏ฟฝ๏ฟฝ๐‘Œ ยท 1๐ด{

)and

V@R๐›ผ (๐‘‹) = E(๐‘Œ |๐ด). Suppose the coherent risk functional R(ยท) dominates V@R๐›ผ (ยท), then

R(๐‘Œ ) โ‰ฅ R(๐‘‹) โ‰ฅ V@R๐›ผ (๐‘‹) = E(๐‘Œ |๐ด) = AV@R๐›ผ (๐‘Œ )

by (6.5). ๏ฟฝ

6.6 APPLICATION IN INSURANCE

The Dutch Premium Principle

A comprehensive list of Kusuoka representations for important risk functionals is providedin [Pflug and Rรถmisch, 2007]. A compelling example is the absolute semi-deviation riskmeasure (the Dutch premium principle, introduced in [van Heerwaarden and Kaas, 1992])for some fixed \ โˆˆ [0, 1],

R\ (๐‘Œ ) B E [๐‘Œ + \ ยท (๐‘Œ โˆ’ E๐‘Œ )+] ,

assigning an additional loading of \ to any loss ๐ฟ exceeding the net-premium price E ๐ฟ. ItsKusuoka representation is

R\ (๐‘Œ ) = sup0โ‰ค`โ‰ค1

(1 โˆ’ \ ยท `)E๐‘Œ + \ ` ยท AV@R1โˆ’` (๐‘Œ )

(cf. Shapiro [2012], Shapiro et al. [2014]).

Theorem 6.20. R\ (ยท) is a convex risk functional.

Version: May 28, 2021

Page 52: Selected Topics from Portfolio Optimization

52 EXAMPLES OF COHERENT RISK FUNCTIONALS

Proof. We verify that R\ (ยท) is convex. Indeed, define ๐‘Œ_ B (1 โˆ’ _)๐‘Œ0 + _๐‘Œ1. Then

(๐‘Œ_ โˆ’ E๐‘Œ_)+ = ((1 โˆ’ _) (๐‘Œ0 โˆ’ E๐‘Œ0) + _ (๐‘Œ1 โˆ’ E๐‘Œ1))+ โ‰ค (1 โˆ’ _) (๐‘Œ0 โˆ’ E๐‘Œ0)+ + _ (๐‘Œ1 โˆ’ E๐‘Œ1)+

and hence

E [๐‘Œ_ + \ ยท (๐‘Œ_ โˆ’ E๐‘Œ_)+] โ‰ค (1 โˆ’ _)E๐‘Œ0 + _E๐‘Œ1 + \ (1 โˆ’ _) (๐‘Œ0 โˆ’ E๐‘Œ0)+ + _ (๐‘Œ1 โˆ’ E๐‘Œ1)+= (1 โˆ’ _) (E๐‘Œ0 + \ (๐‘Œ0 โˆ’ E๐‘Œ0)+) + _ (E๐‘Œ1 + \ (๐‘Œ1 โˆ’ E๐‘Œ1)+)= (1 โˆ’ _)R\ (๐‘Œ0) + _R\ (๐‘Œ1) ,

i.e., R\ is convex. ๏ฟฝ

6.7 PROBLEMS

Exercise 6.1. Compute the Average Value-at-Risk for the returns given in Table 9.2 for ๐›ผ =

20%, 40%, 60% and ๐›ผ = 80%.

Exercise 6.2 (Cf. Exercise 4.3). The matrix ฮž in Table 4.3 contains logarithmic, annualizedreturns of 3 shares at the end of 4 quarters.

(i) Compute the Average Value-at-Risk for the risk levels ๐›ผ = 20%, 40%, 60% and ๐›ผ = 80%for each asset.

(ii) You are invested with ๐‘ฅ = (40%, 30%, 30%). What is the Value-at-Risk of your returnsat the above risk-levels?

Exercise 6.3. Verify Proposition 6.10. Hint: try the random variables 1[_,1] (๐‘ˆ) to showmonotonicity, and ๐‘Œ0 B 1[๐‘ข,1] (๐‘ˆ) and ๐‘Œ1 B 1[๐‘ขโˆ’ฮ”,1โˆ’ฮ”] (๐‘ˆ) for convexity.

Exercise 6.4. Give a risk functional R and a random variable ๐‘Œ โˆˆ ๐ฟโˆž so that R(๐‘Œ ) > E๐‘Œ .

rough draft: do not distribute

Page 53: Selected Topics from Portfolio Optimization

7Portfolio Optimization Problems Involving Risk Measures

. . . : daรŸ keines von ihnen verlorengehe.

Edith Stein, ESGA, Band 1

7.1 INTEGRATED RISK MANAGEMENT FORMULATION

The portfolio optimization problem we want to consider here for simplicity and introduction is(cf. Figure 4.3 and (iv) in Theorem 9.10)

maximizein ๐‘ฅ โˆ’R (โˆ’๐‘ฅ>b) = A (๐‘ฅ>b)

subject to ๐‘ฅ> 1 โ‰ค 1C,๐‘ฅ โ‰ฅ 0,

where A(ยท) B โˆ’R(โˆ’ ยท) is an acceptability functional, cf. (5.1), Remark 5.2. The problem is no-tably unbounded without shortselling constraints. Equivalent is the formulation (cf. Figure 4.3,again)

minimizein ๐‘ฅ R (โˆ’๐‘ฅ>b)

subject to ๐‘ฅ> 1 โ‰ค 1C,๐‘ฅ โ‰ฅ 0,

which is apparently a convex problem formulation.Typical risk functionals are R(๐‘Œ ) B (1 โˆ’ ๐›พ)E๐‘Œ + ๐›พ AV@R๐›ผ (๐‘Œ ).

7.2 MARKOWITZ TYPE FORMULATION

Recall from Figure 4.3 that E๐‘Œ + R(โˆ’๐‘Œ ) (โ‰ฅ 0) is a one-sided deviation from the mean, whichcan be interpreted as risk. The formulation

๐‘ฃ(`) B minimizein ๐‘ฅ E ๐‘ฅ>b + R

(โˆ’๐‘ฅ>b

)(7.1)

subject to E ๐‘ฅ>b โ‰ฅ `,๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0)

specifies a convex problem in ๐‘ฅ, as the objective (i.e., R) is convex, the constraints are evenlinear. The function ๐‘ฃ is nondecreasing and it holds that 0 โ‰ค E๐‘Œ + R (โˆ’๐‘Œ ), i.e., ๐‘ฃ(`) โ‰ฅ 0.

53

Page 54: Selected Topics from Portfolio Optimization

54 PORTFOLIO OPTIMIZATION PROBLEMS INVOLVING RISK MEASURES

Let ๐‘ฅ` denote the optimal diversification in (7.1). For _ โˆˆ (0, 1) set `_ B (1 โˆ’ _)`0 + _`1and ๐‘ฅ`_ B (1 โˆ’ _)๐‘ฅ`0 + _๐‘ฅ`1 . By linearity, ๐‘ฅ`_ is feasible for ๐‘ฃ

(`_

)and we have from convexity

of R that

๐‘ฃ(`_

)โ‰ค E ๐‘ฅ>`_b + R

(โˆ’๐‘ฅ>`_b

)โ‰ค (1 โˆ’ _)๐‘ฃ(`0) + _๐‘ฃ(`1),

the function ๐‘ฃ(ยท) thus is convex. This gives rise to the efficient frontier ` โ†ฆโ†’(๐‘ฃ(`)`

)(which is

concave) and an accordant tangency portfolio.

Example 7.1. Consider R(๐‘Œ ) B (1โˆ’๐›พ)E๐‘Œ +๐›พ AV@R๐›ผ (๐‘Œ ), then the problem of integrated riskmanagement is (cf. Figure 4.3, and again)

minimizein ๐‘ฅ ๐›พ ยท E ๐‘ฅ>b + ๐›พ ยท AV@R๐›ผ (โˆ’๐‘ฅ>b)

subject to E ๐‘ฅ>b โ‰ฅ `,๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0) ,

(7.2)

with parameters ๐›ผ, ๐›พ โˆˆ (0, 1).Recall that AV@R๐›ผ (๐‘Œ ) = min๐‘žโˆˆR

{๐‘ž + 1

1โˆ’๐›ผ E(๐‘Œ โˆ’ ๐‘ž)+}. The discrete model problem (7.2)

thus may be rewritten as

minimizein ๐‘ฅ, ๐‘ž ๐›พ ยท ๐‘>ฮž๐‘ฅ + ๐›พ ยท

(๐‘ž + 1

1โˆ’๐›ผ ๐‘> (โˆ’ฮž๐‘ฅ โˆ’ ๐‘ž)+

)subject to ๐‘>ฮž๐‘ฅ โ‰ฅ `,

๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0) .

To eliminate the nonlinear expression (ยท)+ define the slack variable ๐‘ง๐‘– B (โˆ’๐‘ž โˆ’ ๐‘ฅ>b๐‘–)+ (๐‘– =1, . . . , ๐‘›) and note that ๐‘ง๐‘– โ‰ฅ 0 and โˆ’๐‘ž โˆ’ ๐‘ฅ>b๐‘– โ‰ค ๐‘ง๐‘–. So we get the linear program

minimizein ๐‘ฅ, ๐‘ง, ๐‘ž ๐›พ ยท ๐‘>ฮž๐‘ฅ + ๐›พ ยท ๐‘ž + ๐›พ

1โˆ’๐›ผ ๐‘>๐‘ง

subject to โˆ’๐‘ž โˆ’ ๐‘ฅ>b๐‘– โ‰ค ๐‘ง๐‘– (๐‘– = 1, . . . ๐‘›) ,๐‘>ฮž๐‘ฅ โ‰ฅ `,๐‘ฅ> 1 โ‰ค 1C,๐‘ง โ‰ฅ 0, (๐‘ฅ โ‰ฅ 0) ,

or re-written in matrix-form

minimizein ๐‘ฅ, ๐‘ง, ๐‘ž โˆ’๐›พ๐‘>ฮž๐‘ฅ + ๐›พ ยท ๐‘ž + ๐›พ

1โˆ’๐›ผ ๐‘>๐‘ง

subject to ยฉยญยซโˆ’ฮž โˆ’๐ผ๐‘› โˆ’1๐‘›1>๐‘†

0 . . . 0 0โˆ’๐‘>ฮž 0 0

ยชยฎยฌ ยฉยญยซ๐‘ฅ

๐‘ง

๐‘ž

ยชยฎยฌ โ‰ค ยฉยญยซ01โˆ’`

ยชยฎยฌ ,๐‘ง โ‰ฅ 0, (๐‘ฅ โ‰ฅ 0) .

rough draft: do not distribute

Page 55: Selected Topics from Portfolio Optimization

7.3 ALTERNATIVE FORMULATION 55

Example 7.2. Consider R(๐‘Œ ) B EV@R๐›ผ (๐‘Œ ), then the problem of integrated risk managementis

minimizein ๐‘ฅ, ๐‘ก E ๐‘ฅ>b + ๐‘ก log 1

1โˆ’๐›ผ E ๐‘’โˆ’๐‘ฅ> b/๐‘ก

subject to E ๐‘ฅ>b โ‰ฅ `,๐‘ฅ> 1 โ‰ค 1C,๐‘ก > 0, (๐‘ฅ โ‰ฅ 0) .

(7.3)

7.3 ALTERNATIVE FORMULATION

The formulation

๏ฟฝ๏ฟฝ(๐‘) B maximizein ๐‘ฅ E ๐‘ฅ>b

subject to โˆ’ R(โˆ’๐‘ฅ>b

)โ‰ฅ ๐‘,

๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0)

specifies a convex problem in ๐‘ฅ โˆˆ R๐‘† as well (the objective is linear, the constraints convex).It holds that ๐‘ โ‰ค โˆ’R (โˆ’๐‘ฅ>b) โ‰ค E ๐‘ฅ>b and thus ๏ฟฝ๏ฟฝ(๐‘) โ‰ฅ ๐‘. The function ๏ฟฝ๏ฟฝ(๐‘) is concave and the

frontier ๐‘ โ†ฆโ†’(๐‘

๏ฟฝ๏ฟฝ(๐‘)

)(or ๐‘ โ†ฆโ†’

(๐‘

๏ฟฝ๏ฟฝ(๐‘) โˆ’ ๐‘

)) is a concave efficient frontier, which again gives rise

for a tangency portfolio.

Version: May 28, 2021

Page 56: Selected Topics from Portfolio Optimization

56 PORTFOLIO OPTIMIZATION PROBLEMS INVOLVING RISK MEASURES

rough draft: do not distribute

Page 57: Selected Topics from Portfolio Optimization

8Expected Utility Theory

Buy on bad news, sell on good news.

Bรถrsenweisheit

The concept of utility functions dates back to Oskar Morgenstern1 and John von Neu-mann,2 expected utilities to Kenneth Arrow3 and John W. Pratt.4

Preference is given to ๐‘Œ over ๐‘‹, if

E ๐‘ข(๐‘‹) โ‰ค E ๐‘ข(๐‘Œ ). (8.1)

8.1 EXAMPLES OF UTILITY FUNCTIONS

The exponential utility for ๐›พ โ‰ฅ 0 is defined as

๐‘ข(๐‘ฅ) = 1 โˆ’ ๐‘’โˆ’๐›พ๐‘ฅ . (8.2)

For ๐›ผ > 0, ๐›ผ โ‰  1 the polynomial utility functions are defined as

๐‘ข(๐‘ฅ) ={

๐‘ฅ1โˆ’๐›ผ

1โˆ’๐›ผ ๐‘ฅ โ‰ฅ 0,โˆ’โˆž ๐‘ฅ < 0;

(8.3)

they are sometimes also termed power utility functions,

๐‘ข(๐‘ฅ) ={

๐‘ฅ^

^๐‘ฅ โ‰ฅ 0,

โˆ’โˆž ๐‘ฅ < 0;(8.4)

and in case of ๐›ผ = 1,

๐‘ข(๐‘ฅ) ={log ๐‘ฅ ๐‘ฅ โ‰ฅ 0,โˆ’โˆž ๐‘ฅ < 0.

Definition 8.1 (HARA utilities). Hyperbolic risk aversion (HARA) utilities are๐‘ˆ (๐‘ค) = 1โˆ’๐›พ๐›พ

(๐‘Ž๐‘ค1โˆ’๐›พ + ๐‘

)๐›พ.

8.2 ARROWโ€“PRATT MEASURE OF ABSOLUTE RISK AVERSION

Definition 8.2. The local risk aversion coefficient at ๐‘ (cf. Arrowโ€“Pratt measure of absoluterisk-aversion (ARA), also known as the coefficient of absolute risk), is

๐ด(๐‘) = โˆ’๐‘ขโ€ฒโ€ฒ(๐‘)๐‘ขโ€ฒ(๐‘) ,

11902 (in Gรถrlitz) โ€“ 197721903 โ€“ 195831921โ€“2017, Nobel memorial price in economic sciences in 197241931

57

Page 58: Selected Topics from Portfolio Optimization

58 EXPECTED UTILITY THEORY

the coefficient for relative risk aversion is

๐‘…(๐‘) = โˆ’๐‘ ยท ๐‘ขโ€ฒโ€ฒ(๐‘)

๐‘ขโ€ฒ(๐‘) .

For a motivation consider the Taylor-series expansion ๐‘ข(๐‘ฆ) โ‰ˆ ๐‘ข(๐‘ฅ)+(๐‘ฆโˆ’๐‘ฅ)๐‘ขโ€ฒ(๐‘ฅ)+ (๐‘ฆโˆ’๐‘ฅ)2

2 ๐‘ขโ€ฒโ€ฒ(๐‘ฅ).At ๐‘ฅ = E๐‘Œ , ๐‘ฆ = ๐‘Œ and after taking expectations we obtain

E ๐‘ข(๐‘Œ ) โ‰ˆ ๐‘ข(E๐‘Œ ) + E(๐‘Œ โˆ’ E๐‘Œ )๐‘ขโ€ฒ(E๐‘Œ ) + E(๐‘Œ โˆ’ E๐‘Œ )2

2๐‘ขโ€ฒโ€ฒ(E๐‘Œ ) = ๐‘ข(E๐‘Œ ) + var๐‘Œ

2๐‘ขโ€ฒโ€ฒ(E๐‘Œ ). (8.5)

Now apply a Taylor-series expansion to the inverse ๐‘ขโˆ’1(๐‘ฅ) โ‰ˆ ๐‘ขโˆ’1(๐‘ฆ) + ๐‘ฅโˆ’๐‘ฆ๐‘ขโ€ฒ (๐‘ขโˆ’1 (๐‘ฆ)) with ๐‘ฅ = E ๐‘ข(๐‘Œ )

and ๐‘ฆ = ๐‘ข(E๐‘Œ ) to get

๐‘ขโˆ’1(E ๐‘ข(๐‘Œ )) โ‰ˆ E๐‘Œ + E ๐‘ข(๐‘Œ ) โˆ’ ๐‘ข(E๐‘Œ )๐‘ขโ€ฒ(E๐‘Œ ) โ‰ˆ E๐‘Œ + ๐‘ขโ€ฒโ€ฒ(E๐‘Œ )

2๐‘ขโ€ฒ(E๐‘Œ )๏ธธ ๏ธท๏ธท ๏ธธArrowโ€“Pratt at E๐‘Œ

ยท var๐‘Œ,

by (8.5).

Example 8.3. The risk aversion coefficient is โˆ’๐›พ (thus constant) for the utility function (8.2),while ๐ด(๐‘) = ๐›ผ

๐‘and ๐‘…(๐‘) = ๐›ผ for the utility given in (8.3).

Example 8.4. Consider ๐‘ข(๐‘ฅ) = log ๐‘ฅ, then ๐ด(๐‘) = โˆ’๐‘ขโ€ฒโ€ฒ (๐‘)๐‘ขโ€ฒ (๐‘) =

1๐‘.

8.3 EXAMPLE: ST. PETERSBURG PARADOX5

Consider the following game. A fair coin is tossed until heads appears for the first time andsuppose this happens at the Nth toss. The player will then get 2๐‘โˆ’1 euros. What is the fairamount a player should pay in order to play the game?

The fee is given by the expected payout. By definition of the geometric distribution:

๐‘ƒ(๐‘ = ๐‘˜) = 2โˆ’๐‘˜ ๐‘˜ = 1, 2, . . .

Therefore the expected payout is

E 2๐‘โˆ’1 =โˆžโˆ‘๐‘˜=12โˆ’๐‘˜2๐‘˜โˆ’1 =

โˆžโˆ‘๐‘˜=1

12= โˆž.

This result obviously does not make sense. Several approaches were developed byN. Bernoulli and G. Cramer. Insetad of calculating the expected payout E

[2๐‘โˆ’1

], ๐‘ =

๐‘ขโˆ’1(E

[๐‘ข(2๐‘โˆ’1)

] )with ๐‘ข(๐‘ฅ) = log(๐‘ฅ) or ๐‘ข(๐‘ฅ) =

โˆš๐‘ฅ is calculated. In the case of ๐‘ข(๐‘ฅ) = log(๐‘ฅ), it

follows that

E log 2๐‘โˆ’1 =โˆžโˆ‘๐‘˜=12โˆ’๐‘˜ (๐‘˜ โˆ’ 1) log 2 = log(2)

โˆžโˆ‘๐‘˜=0

๐‘˜ ยท 2โˆ’๐‘˜

=log(2)4

โˆžโˆ‘๐‘˜=0

๐‘˜2๐‘˜โˆ’1 =log 24

1(1 โˆ’ 12

)2 = log 2

5by ruben schlotter

rough draft: do not distribute

Page 59: Selected Topics from Portfolio Optimization

8.4 PREFERENCES AND UTILITY FUNCTIONS 59

and ๐‘ = ๐‘’log 2 = 2.In case of ๐‘ข(๐‘ฅ) =

โˆš๐‘ฅ,

Eโˆš2๐‘โˆ’1 =

โˆžโˆ‘๐‘˜=12

๐‘˜โˆ’12 2โˆ’๐‘˜ =

1โˆš2

โˆžโˆ‘๐‘˜=1

(โˆš22

) ๐‘˜=1โˆš2

โˆš22

1 โˆ’โˆš22

=1

2 โˆ’โˆš2

and therefore ๐‘ =(2 โˆ’โˆš2)โˆ’2โ‰ˆ 2.914.

The expected payout was weighted with ๐‘ข(ยท) which yields a finite value. The weightingwith ๐‘ข can be interpreted as giving less importance to very high payouts which have smallprobabilities of occurring.

Remark. Note the shape of both log(๐‘ฅ) andโˆš๐‘ฅ. Such functions are called utility functions.

8.4 PREFERENCES AND UTILITY FUNCTIONS

The main aim is to

โŠฒ model decisions under uncertainty

โŠฒ compare random payouts (lotteries)

Definition 8.5. A function ๐น : Rโ†’ [0, 1] is called (cumulative) distribution function on R, if

โŠฒ F is monotone increasing and right continuous.

โŠฒ lim{๐‘ฅโ†’โˆ’โˆž ๐น (๐‘ฅ) = 0 and lim{๐‘ฅโ†’โˆž ๐น (๐‘ฅ) = 1

LetM be the set of all distribution functions onR. We define a relation onM. A preferenceonM is a relation 4 such that

โŠฒ ๐น 4 ๐น for all ๐น โˆˆ M

โŠฒ (๐น 4 ๐บ) โˆง (๐บ 4 ๐ป) implies that ๐น 4 ๐ป for all ๐น, ๐บ, ๐ป โˆˆ M

โŠฒ Either (๐น 4 ๐บ) or (๐บ 4 ๐น)

๐น 4 ๐บ is interpreted as ๐บ is preferred over ๐น. ๐น and ๐บ are called equivalent, denoted by๐น โˆผ๐บ if๐น 4 ๐บ and๐บ 4 ๐น. The preference4satisfies the continuity axiom if for all ๐น, ๐บ, ๐ป โˆˆ Msuch that ๐น 4 ๐บ 4 ๐ป there exists an ๐›ผ โˆˆ [0, 1] with

(1 โˆ’ ๐›ผ)๐น + ๐›ผ๐ป โˆผ ๐บ.

The preference4satisfies the independence of irrelevant alternatives axiom, if for all ๐น, ๐บ, ๐ป โˆˆM and all ๐›ผ โˆˆ [0, 1]

๐น 4 ๐บ โ‡โ‡’ (1 โˆ’ ๐›ผ)๐น + ๐›ผ๐ป 4 (1 โˆ’ ๐›ผ)๐บ + ๐›ผ๐ป

Remark. The continuity axiom means that โ€œgoodโ€ and โ€œbadโ€ risks can be pooled into anaverage one.

Version: May 28, 2021

Page 60: Selected Topics from Portfolio Optimization

60 EXPECTED UTILITY THEORY

A preference4has a numerical representation if there is a mapping ๐‘ˆ : M โ†’ [โˆ’โˆž,โˆž) ,such that

๐น 4 ๐บ โ‡โ‡’ ๐‘ˆ (๐น) โ‰ค ๐‘ˆ (๐บ)

This representation has a von Neumann-Morgenstern representation if there exists an-other function ๐‘ข : Rโ†’ [โˆ’โˆž,โˆž) such that for all ๐‘‹ with (cumulative) distribution function ๐น

๐‘ˆ (๐น) = E ๐‘ข(๐‘‹)

Theorem 8.6. Let 4 be a preference onM. Then the following are equivalent

(i) 4satisfies the continuity and the independence axiom

(ii) 4 has a vNM representation

Definition 8.7. ๐‘ข : Rโ†’ [โˆ’โˆž,โˆž) is called Bernoulli-utility function if ๐‘ข is monotone increasingand strictly concave.

Theorem 8.8. Let 4be a preference with a vNM-representation. Then ๐‘ข is a Bernoulli utilityfunction if and only if

(i) ๐‘ 4 ๐‘‘ โ‡โ‡’ ๐‘ โ‰ค ๐‘‘ for all ๐‘, ๐‘‘ โˆˆ R

(ii) ๐‘‹ 4 E ๐‘‹

Let๐‘ข be a Bernoulli utility function (wrt. .: (4๐‘ข) and ๐‘‹ have finite expectation then definethe certainty equivalent of ๐‘‹ by

๐‘ B ๐‘(๐‘‹, ๐‘ข) โˆˆ R, such that ๐‘ โˆผ๐‘ข ๐‘‹

In the introductory example the certainty equivalent of the payout ๐‘ with respect to 2different Bernoulli utility functions was calculated.

rough draft: do not distribute

Page 61: Selected Topics from Portfolio Optimization

9Stochastic Orderings

Ich bin so glรผcklich, ich habe meinenPosten verloren. Mein Chef ist nรคmlichin Konkurs gegangen. Mich bringtniemand mehr in ein Bankhaus.

Arnold Schรถnberg, 1874โ€“1951, anDavid Josef Bach

A particular utility function ๐‘ข(ยท) is occasionally considered as artifact which specifies avery particular and individual personal preference. Different investors might employ verydifferent utility functions to express their individual preference.

Some concepts of stochastic orderings robustify decisions by replacing a single utilityfunction by a class of functions, so that (8.1) holds for all of them.

9.1 STOCHASTIC DOMINANCE OF FIRST ORDER

Definition 9.1. For R-valued random variables ๐‘‹ and ๐‘Œ we say that ๐‘‹ is dominated by ๐‘Œ infirst order stochastic dominance,

๐‘‹ 4(1) ๐‘Œ or ๐‘‹ 4๐น๐‘†๐ท ๐‘Œ,

if (8.1) holds for all ๐‘ข โˆˆ U๐น๐‘†๐ท B {๐‘ข : Rโ†’ R nondecreasing} for which the integrals exists.

Remark 9.2. Not that the function ๐‘ข(๐‘ฅ) B ๐‘ฅ in nondecreasing, so that E ๐‘‹ โ‰ค E๐‘Œ whenever๐‘‹ 4(1) ๐‘Œ .

Remark 9.3. The relation๐‘‹ โ‰ค ๐‘Œ almost surely

(cf. Definition 5.1 (i)) is occasionally referred to as stochastic dominance of order 0 anddenoted ๐‘‹ 4(0) ๐‘Œ .

Theorem 9.4. The following are equivalent:

(i) ๐‘‹ 4(1) ๐‘Œ ,

(ii) ๐น๐‘‹ (ยท) โ‰ฅ ๐น๐‘Œ (ยท), i.e., ๐‘ƒ(๐‘‹ โ‰ค ๐‘ง) โ‰ฅ ๐‘ƒ(๐‘Œ โ‰ค ๐‘ง) for all ๐‘ง โˆˆ R and

(iii) ๐นโˆ’1๐‘‹(ยท) โ‰ค ๐นโˆ’1

๐‘Œ(ยท), i.e., V@R๐›ผ (๐‘‹) โ‰ค V@R๐›ผ (๐‘Œ ) for all ๐›ผ โˆˆ (0, 1).

Proof. The function ๐‘ข๐‘ง (๐‘ฅ) B 1(๐‘ง,โˆž) (๐‘ฅ) is nondecreasing and thus ๐‘ข๐‘ง โˆˆ U๐น๐‘†๐ท. Note thatE ๐‘ข๐‘ง (๐‘‹) = ๐‘ƒ(๐‘‹ > ๐‘ง), thus (8.1) is equivalent to ๐น๐‘‹ (๐‘ง) = ๐‘ƒ(๐‘‹ โ‰ค ๐‘ง) = 1โˆ’๐‘ƒ(๐‘‹ > ๐‘ง) = 1โˆ’E ๐‘ข๐‘ง (๐‘‹) โ‰ฅ1 โˆ’ E ๐‘ข๐‘ง (๐‘Œ ) = 1 โˆ’ ๐‘ƒ(๐‘Œ > ๐‘ง) = ๐‘ƒ(๐‘Œ โ‰ค ๐‘ง) = ๐น๐‘Œ (๐‘ง), so that (ii) follows from (i).

As for the converse note that every nondecreasing function ๐‘ข(ยท) may be approximated bya simple step function ๐‘ข๐‘› (ยท) =

โˆ‘๐‘›๐‘–=1 ๐›ผ๐‘– 1(๐‘ง๐‘– ,โˆž) (ยท) with ๐›ผ๐‘– > 0 so that |E ๐‘ข(๐‘‹) โˆ’ E ๐‘ข๐‘› (๐‘‹) | < Y and

61

Page 62: Selected Topics from Portfolio Optimization

62 STOCHASTIC ORDERINGS

probabilities 40% 20% 40%

๐‘Œ0 = ๐‘‹ 2 4 4๐‘Œ1 4 4 2

12 (๐‘Œ0 + ๐‘Œ1) 3 4 3

(a) The relation 4(1) is not convex, cf. Re-mark 9.5. Note that ๐‘Œ0 โ‰  ๐‘Œ1, but ๐น๐‘Œ0 = ๐น๐‘Œ1 .

probabilities 40% 20% 10% 30%

๐‘‹ 0 2 3 3๐‘Œ 1 1 1 4

(b) ๐‘‹ $(1) ๐‘Œ , but ๐‘‹ 4(2) ๐‘Œ , cf. Figure 9.1

Table 9.1: Counterexamples

|E ๐‘ข(๐‘Œ ) โˆ’ E ๐‘ข๐‘› (๐‘Œ ) | < Y. With (ii) it follows that E ๐‘ข๐‘› (๐‘‹) =โˆ‘๐‘›

๐‘–=1 ๐›ผ๐‘– ๐‘ƒ(๐‘‹ > ๐‘ง๐‘–) โ‰คโˆ‘๐‘›

๐‘–=1 ๐›ผ๐‘– ๐‘ƒ(๐‘Œ >

๐‘ง๐‘–) = E ๐‘ข๐‘› (๐‘Œ ), so that stochastic dominance in first order follows.It is evident that (ii) and (iii) are equivalent. ๏ฟฝ

Remark 9.5. Exercise 9.5 (Table 9.1a) demonstrates that the order 4(1) is not convex, i.e.,the sets

{๐‘Œ : ๐‘‹ 4(1) ๐‘Œ

}and

{๐‘Œ : ๐‘Œ 4(1) ๐‘‹

}are not convex.

9.2 STOCHASTIC DOMINANCE OF SECOND ORDER

Definition 9.6. For R-valued random variables ๐‘‹ and ๐‘Œ we say that ๐‘‹ is dominated by ๐‘Œ insecond order stochastic dominance,

๐‘‹ 4(2) ๐‘Œ or ๐‘‹ 4๐‘†๐‘†๐ท ๐‘Œ,

if (8.1) holds for all ๐‘ข โˆˆ U๐‘†๐‘†๐ท B {๐‘ข : Rโ†’ R nondecreasing and concave}, for which theintegrals exists.

Remark 9.7. As in Remark 9.2 we have that that E ๐‘‹ โ‰ค E๐‘Œ whenever ๐‘‹ 4(2) ๐‘Œ .

Remark 9.8. By Jensenโ€™s inequality ๐‘ข(E๐‘Œ ) โ‰ฅ E ๐‘ข(๐‘Œ ). This can be read as possessing (i.e.,not investing) the amount ๐‘ข(E๐‘Œ ) is given preference to investing.

Lemma 9.9. The set{๐‘Œ : ๐‘‹ 4(2) ๐‘Œ

}is convex (cf. Remark 9.5).

Proof. Let ๐‘‹ 4(2) ๐‘Œ0, ๐‘‹ 4(2) ๐‘Œ1 and ๐‘ข โˆˆ U๐‘†๐‘†๐ท be chosen. Define ๐‘Œ_ B (1 โˆ’ _)๐‘Œ0 + _๐‘Œ1. ByJensenโ€™s inequality it holds that ๐‘ข

((1 โˆ’ _)๐‘Œ0 + _๐‘Œ1

)โ‰ฅ (1 โˆ’ _)๐‘ข(๐‘Œ0) + _ ๐‘ข(๐‘Œ1) and thus

E ๐‘ข(๐‘Œ_) = E ๐‘ข((1 โˆ’ _)๐‘Œ0 + _๐‘Œ1

)โ‰ฅ

Jensenโ€™s inequality(1 โˆ’ _)E ๐‘ข(๐‘Œ0) + _E ๐‘ข(๐‘Œ1)

โ‰ฅ (1 โˆ’ _)E ๐‘ข(๐‘‹) + _E ๐‘ข(๐‘‹) = E ๐‘ข(๐‘‹)

by Jensenโ€™s inequality and hence ๐‘‹ 4(2) ๐‘Œ_. ๏ฟฝ

Theorem 9.10. The following are equivalent:

(i) ๐‘‹ 4(2) ๐‘Œ ,

(ii)โˆซ ๐‘ž

โˆ’โˆž ๐น๐‘‹ (๐‘ง) d๐‘ง โ‰ฅโˆซ ๐‘ž

โˆ’โˆž ๐น๐‘Œ (๐‘ง) d๐‘ง for all ๐‘ž โˆˆ R and

(iii)โˆซ ๐›ผ

0 ๐นโˆ’1๐‘‹(๐‘ข) d๐‘ข โ‰ค

โˆซ ๐›ผ

0 ๐นโˆ’1๐‘Œ(๐‘ข) d๐‘ข (this is what is called the absolute Lorentz function) for

๐›ผ โˆˆ (0, 1),

rough draft: do not distribute

Page 63: Selected Topics from Portfolio Optimization

9.2 STOCHASTIC DOMINANCE OF SECOND ORDER 63

0 1 2 3 40

0.2

0.4

0.6

0.8

1

๐‘ฅ

๐น๐‘‹ (๐‘ฅ)๐น๐‘Œ (๐‘ฅ)

(a) Cumulative distribution function, Theorem 9.4(ii)

0 0.2 0.4 0.6 0.8 10

1

2

3

4

๐›ผ

V@R๐›ผ (๐‘‹)V@R๐›ผ (๐‘Œ )

(b) Generalized inverse, V@R, Theorem 9.4(iii)

0 1 2 3 40

0.5

1

1.5

2

2.5

๐‘ฅ

โˆซ ๐‘ฅ

โˆ’โˆž ๐น๐‘‹ (๐‘ฆ) d๐‘ฆโˆซ ๐‘ฅ

โˆ’โˆž ๐น๐‘Œ (๐‘ฆ) d๐‘ฆ

(c) Integrated cdf, Theorem 9.10(ii)

0 0.2 0.4 0.6 0.8 10

0.5

1

1.5

๐›ผ

โˆซ ๐›ผ

0 V@R๐‘ข (๐‘‹) d๐‘ขโˆซ ๐›ผ

0 V@R๐‘ข (๐‘Œ ) d๐‘ข

(d) Integrated V@R, Theorem 9.10(iii)

Figure 9.1: Random variables ๐‘‹ (blue) and ๐‘Œ (orange) from Table 9.1b

Version: May 28, 2021

Page 64: Selected Topics from Portfolio Optimization

64 STOCHASTIC ORDERINGS

(iv) โˆ’AV@R๐›ผ (โˆ’๐‘‹) โ‰ค โˆ’AV@R๐›ผ (โˆ’๐‘Œ ) for all ๐›ผ โˆˆ (0, 1).1

Proof. By Riemannโ€“Stieltjes integration by parts we have thatโˆซ ๐‘ž

โˆ’โˆž๐น๐‘‹ (๐‘ฅ) d๐‘ฅ = ๐‘ฅ ยท ๐น๐‘‹ (๐‘ฅ) |๐‘ž๐‘ฅ=โˆ’โˆž โˆ’

โˆซ ๐‘ž

โˆ’โˆž๐‘ฅ d๐น๐‘‹ (๐‘ฅ) =

โˆซ ๐‘ž

โˆ’โˆž๐‘ž โˆ’ ๐‘ฅ d๐น๐‘‹ (๐‘ฅ) (9.1)

=

โˆซ โˆž

โˆ’โˆž(๐‘ž โˆ’ ๐‘ฅ)+ d๐น๐‘‹ (๐‘ฅ) = โˆ’E ๐‘ข๐‘ž (๐‘‹),

where๐‘ข๐‘ž (๐‘ฅ) B โˆ’(๐‘ž โˆ’ ๐‘ฅ)+.

The function ๐‘ข๐‘ž (ยท) is nondecreasing and concave, hence it follows from (8.1) that E ๐‘ข๐‘ž (๐‘‹) โ‰คE ๐‘ข๐‘ž (๐‘Œ ) and thus the assertion (ii).

A nondecreasing and concave function ๐‘ข(ยท) โˆˆ U๐‘†๐‘†๐ท can be approximated by ๐‘ข๐‘› (๐‘ฅ) Bโˆ‘๐‘›๐‘–=1 ๐›ผ๐‘– ยท ๐‘ข๐‘ž๐‘– (๐‘ฅ) where ๐›ผ๐‘– > 0 so that |E ๐‘ข(๐‘‹) โˆ’ E ๐‘ข๐‘› (๐‘‹) | < Y and |E ๐‘ข(๐‘Œ ) โˆ’ E ๐‘ข๐‘› (๐‘Œ ) | < Y (see

Mรผller and Stoyan [2002] for details). Assertion (i) then follows by combining (9.1) and (ii).Define

๐บ๐‘‹ (๐‘ž) Bโˆซ ๐‘ž

โˆ’โˆž๐น๐‘‹ (๐‘ฅ) d๐‘ฅ and ๐บโˆ’1๐‘‹ (๐›ผ) B

โˆซ ๐›ผ

0๐นโˆ’1๐‘‹ (๐‘) d๐‘.

Recall Youngโ€™s inequality (15.7), i.e.,

๐‘ž ๐›ผ โ‰คโˆซ ๐‘ž

0๐น๐‘‹ (๐‘ฅ) d๐‘ฅ๏ธธ ๏ธท๏ธท ๏ธธ๐บ๐‘‹ (๐‘ž)

+โˆซ ๐›ผ

0๐นโˆ’1๐‘‹ (๐‘) d๐‘๏ธธ ๏ธท๏ธท ๏ธธ๐บโˆ’1

๐‘‹(๐›ผ)

.

From (ii) we deduce that ๐‘ž ๐›ผ โˆ’ ๐บ๐‘‹ (๐‘ž) โ‰ค ๐‘ž ๐›ผ โˆ’ ๐บ๐‘Œ (๐‘ž) and thus

๐บโˆ’1๐‘‹ (๐›ผ) = sup๐‘žโˆˆR{๐‘ž ๐›ผ โˆ’ ๐บ๐‘‹ (๐‘ž)} โ‰ค sup

๐‘žโˆˆR{๐‘ž ๐›ผ โˆ’ ๐บ๐‘Œ (๐‘ž)} = ๐บโˆ’1๐‘Œ (๐›ผ).

The converse follows by the same reasoning.As for (iv) recall that E ๐‘ข๐‘ž (๐‘‹) โ‰ค E ๐‘ข๐‘ž (๐‘Œ ) is equivalent to E(๐‘žโˆ’๐‘‹)+ โ‰ฅ E(๐‘žโˆ’๐‘Œ )+, which holds

true for every ๐‘ž โˆˆ R. Thus ๐‘ž + 11โˆ’๐›ผ E(โˆ’๐‘‹ โˆ’ ๐‘ž)+ โ‰ฅ ๐‘ž +

11โˆ’๐›ผ E(โˆ’๐‘Œ โˆ’ ๐‘ž)+, from which assertion (iv)

follows after taking the infimum with respect to ๐‘ž โˆˆ R.As for the converse define ๐‘žโˆ—๐›ผ B โˆ’V@R๐›ผ (โˆ’๐‘Œ ) and recall that AV@R๐›ผ (โˆ’๐‘Œ ) = โˆ’๐‘žโˆ—๐›ผ +

11โˆ’๐›ผ E(โˆ’๐‘Œ + ๐‘ž

โˆ—๐›ผ)+. It follows that

โˆ’๐‘žโˆ—๐›ผ +11 โˆ’ ๐›ผ E(โˆ’๐‘‹ + ๐‘ž

โˆ—๐›ผ)+ โ‰ฅ AV@R๐›ผ (โˆ’๐‘‹) โ‰ฅ AV@R๐›ผ (โˆ’๐‘Œ ) = โˆ’๐‘žโˆ—๐›ผ +

11 โˆ’ ๐›ผ E

(โˆ’๐‘Œ + ๐‘žโˆ—๐›ผ

)+ ,

and thus E ๐‘ข๐‘žโˆ—๐›ผ (๐‘‹) โ‰ค E ๐‘ข๐‘žโˆ—๐›ผ (๐‘Œ ). The assertion follows now, as ๐‘žโˆ—๐›ผ can be adjusted for every๐›ผ โˆˆ (0, 1). ๏ฟฝ

Remark 9.11. It is evident that ๐‘‹ 4(1) ๐‘Œ implies ๐‘‹ 4(2) ๐‘Œ , but the converse is not true: cf.Exercise 9.6 (Table 9.1b).

1Recall from Remark 5.2 that A(๐‘Œ ) B โˆ’AV@R๐›ผ (โˆ’๐‘Œ ) is an acceptability functional.

rough draft: do not distribute

Page 65: Selected Topics from Portfolio Optimization

9.3 PORTFOLIO OPTIMIZATION 65

9.3 PORTFOLIO OPTIMIZATION

Let ๐‘‹ be a random variable understood as a benchmark (cf. (2.3) in the introduction). ByLemma 9.9, the problem

maximizein ๐‘ฅ E ๐‘ฅ>b

subject to ๐‘‹ 4(2) ๐‘ฅ>b

๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0)

(9.2)

is a convex optimization problem.The relation ๐‘‹ 4๐‘†๐‘†๐ท ๐‘ฅ>b in (9.2) can be restated as

maximizein ๐‘ฅ E ๐‘ฅ>b

subject to โˆ’AV@R๐›ผ (โˆ’๐‘‹) โ‰ค โˆ’AV@R๐›ผ (โˆ’๐‘ฅ>b) for all ๐›ผ โˆˆ (0, 1),๐‘ฅ> 1 โ‰ค 1C,(๐‘ฅ โ‰ฅ 0)

(9.3)

so that the problem has infinity many (uncountably many) constraints. We refer to Dentchevaand Ruszczynski [2011] for a discussion and numerical implementation schemes.

9.4 PROBLEMS

Exercise 9.1. Show that

(1 โˆ’ ๐›ผ) AV@R๐›ผ (๐‘Œ ) โˆ’ ๐›ผ AV@R1โˆ’๐›ผ (โˆ’๐‘Œ ) = E๐‘Œ .

Exercise 9.2. Compute the preferences E ๐‘ข๐›พ (๐‘‹) โ‰ค E ๐‘ข๐›พ (๐‘Œ ) for ๐‘ข๐›พ (๐‘ฅ) B ๐‘ฅ๐›พ

๐›พwith ๐›พ = 0.5 and

๐›พ = 0.6 for the random variable specified in Table 4.2.

Exercise 9.3. Compute the preferences E ๐‘ข_(๐‘‹) for ๐‘ข_(๐‘ฅ) B 1 โˆ’ ๐‘’โˆ’_๐‘ฅ with selections of _ toget different preferences (again Table 4.2).

Exercise 9.4. Show by using Theorem 9.4 that we have ๐‘‹ $(1) ๐‘Œ and ๐‘‹ $(2) ๐‘Œ for the randomvariable in Table 4.2.

Exercise 9.5. Consider the random variables in Table 9.1a and verify that the set{๐‘Œ : ๐‘‹ 4(1) ๐‘Œ

}is not convex.

Exercise 9.6. Verify for the random variables in Table 9.1b that ๐‘‹ $(1) ๐‘Œ , but ๐‘‹ 4(2) ๐‘Œ .

Exercise 9.7. Which portfolio is preferable in Table 9.2 if employing the utility function ๐‘ข(๐‘ฅ) =๐‘ฅ^ for ^ โˆˆ (0, 1)?

Version: May 28, 2021

Page 66: Selected Topics from Portfolio Optimization

66 STOCHASTIC ORDERINGS

probabilities 30% 70%

return ๐‘Œ1 10% 25%return ๐‘Œ2 25% 10%

Table 9.2: Return of different portfolios ๐‘Œ1 and ๐‘Œ2

rough draft: do not distribute

Page 67: Selected Topics from Portfolio Optimization

10Arbitrage

On ne peut vivre de frigidaires, depolitique, de bilans et de mots croisรฉs,voyez-vous! On ne peut plus vivre sanspoรฉsie, couleur ni amour.

Antoine de Saint-Exupรฉry,Lettre au gรฉnรฉral X, 30 juillet 1944

This section follows Cornuejols and Tรผtรผncรผ [2006, Chapter 4].

Definition 10.1. Arbitrage is a trading strategy,

Type A: that has a positive cash flow and no risk of a later loss;

Type B: that requires no initial cash input, has no risk of a loss, and a positive probability ofmaking profits in the future: ๐‘‰0 = 0, ๐‘ƒ(๐‘‰๐‘ก โ‰ฅ 0) = 1 and ๐‘ƒ(๐‘‰๐‘ก > 0) > 0, where ๐‘‰๐‘ก is theportfolio value at time ๐‘ก.

10.1 TYPE A

Consider the exchange rates in Table 10.1. Note, that converting any (!) currency forwardsand backwards will result in a loss; for example

1 EUR = 1.12 US$ = 1.12 โˆ— 0.892 EUR = 0.99904 EUR < 1 EUR,

etc.Exchanging a sequence of currencies is a loss as well (in general), e.g.,

1 EUR = 121 JPY= 121 โˆ— 0.00701 GBP= 121 โˆ— 0.00701 โˆ— 1.286 US$= 121 โˆ— 0.00701 โˆ— 1.286 โˆ— 0.892 EUR = 0.97 EUR < 1 EUR. (10.1)

to: EUR US$ GBP JPY1 EUR = 1.12 0.87 121.01 US$ = 0.892 0.777 110.71 GBP = 1.149 1.286 142.61 JPY = 0.0082 0.00900 0.00701

Table 10.1: Exchange rates

67

Page 68: Selected Topics from Portfolio Optimization

68 ARBITRAGE

However, the table allows for arbitrage (a free lunch of 1.76%), for example by converting

1 EUR = 1.12 US$= 1.12 โˆ— 0.777 GBP= 1.12 โˆ— 0.777 โˆ— 142.6 JPY= 1.12 โˆ— 0.777 โˆ— 142.6 โˆ— 0.0082 EUR = 1.0176 EUR > 1 EUR. (10.2)

Investing 1C thus will result in a profit of EUR 0.0176 without risk. Note that (10.2) is actuallythe reverse order of (10.1).

How can one detect an opportunity for arbitrage?

Define the variables

๐ธ๐ท, etc.: quantity of EUR (i.e., number of EUR banknotes) changed to US$, etc.

๐ด: quantity of EUR generated by arbitrage.

Then we may consider the optimization problem

maximize ๐ด

subject to 0.892๐ท๐ธ โˆ’ ๐ธ๐ท + 1.149๐‘ƒ๐ธ โˆ’ ๐ธ๐‘ƒ + 0.0082๐‘Œ๐ธ โˆ’ ๐ธ๐‘Œ โ‰ฅ ๐ด, (10.3)1.12๐ธ๐ท โˆ’ ๐ท๐ธ + 1.286๐‘ƒ๐ท โˆ’ ๐ท๐‘ƒ + 0.009๐‘Œ๐ท โˆ’ ๐ท๐‘Œ โ‰ฅ 0,0.87๐ธ๐‘ƒ โˆ’ ๐‘ƒ๐ธ + 0.777๐ท๐‘ƒ โˆ’ ๐‘ƒ๐ท + 0.00701๐‘Œ๐‘ƒ โˆ’ ๐‘ƒ๐‘Œ โ‰ฅ 0,121๐ธ๐‘Œ โˆ’ ๐‘Œ๐ธ + 110.7๐ท๐‘Œ โˆ’ ๐‘Œ๐ท + 142.6๐‘ƒ๐‘Œ โˆ’ ๐‘Œ๐‘ƒ โ‰ฅ 0, (10.4)๐ธ๐ท + ๐ธ๐‘Œ + ๐ธ๐‘ƒ โ‰ค 1000 EUR, (10.5)๐ธ๐ท โ‰ฅ 0, ๐ท๐ธ โ‰ฅ 0, etc. and ๐ด โ‰ฅ 0.

The constraints (10.3)โ€“(10.4) are balance equations (conservation equations1) for EUR, USD,GBP and JPY (resp.), while the budget constraint (10.5) limits the total, initial amount avail-able. (0, . . . , 0) is always a feasible solution with profit ๐ด = 0 (convert nothing, no arbitrage).We have found an arbitrage opportunity, if our solver returns a solution with objective ๐ด > 0.

Indeed, this is possible here as ๐ธ๐ท = 1, ๐ท๐‘ƒ = 1.12, ๐‘ƒ๐‘Œ = 1.12 โˆ— 0.777 = 0.87024, ๐‘Œ๐ธ =

1.12 โˆ— 0.777 โˆ— 142.6 = 124.09 (all other are ๐ธ๐‘ƒ = ๐ธ๐‘Œ = ๐ท๐ธ = ยท ยท ยท = 0) and ๐ด = 0.0176 > 0(cf. (10.2)) is feasible and indeed the optimal solution (up to scaling with 1000, cf. (10.5)).

10.2 TYPE B

Consider as series of options ๐‘– = 1, . . . ๐‘› with payoff ฮจ๐‘– (ยท), written on one single underlyingwith random terminal price ๐‘†. By investing the amount of ๐‘ฅ๐‘– in each option (we allow short-selling, i.e., ๐‘ฅ๐‘– โ‰ค 0 or ๐‘ฅ๐‘– โ‰ฅ 0), we obtain the random payoff

ฮจ๐‘ฅ (๐‘†) =๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘– ยท ฮจ๐‘– (๐‘†).

1Erhaltungsgleichungen, in German

rough draft: do not distribute

Page 69: Selected Topics from Portfolio Optimization

10.2 TYPE B 69

For call and put options with strike ๐พ๐‘–, the function ฮจ๐‘ฅ (ยท) is piecewise linear with kinks at thestrikes ๐พ๐‘– and thus everywhere nonnegative (thus generating arbitrage) in the range of theunderlying ๐‘† โˆˆ [0,โˆž), if

ฮจ๐‘ฅ (0) โ‰ฅ 0, ฮจ๐‘ฅ (๐พ ๐‘—) โ‰ฅ 0 for all ๐‘— = 1, . . . , ๐‘› and ฮจโ€ฒ๐‘ฅ (๐พ๐‘š๐‘Ž๐‘ฅ) โ‰ฅ 0,

where ๐พ๐‘š๐‘Ž๐‘ฅ B max ๐‘—=1,...๐‘› ๐พ ๐‘— is the largest of all strikes.Assume the price of option ๐‘– is ๐‘๐‘– and solve the linear problem

minimize๐‘ฅโˆˆR๐‘›

๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘–๐‘๐‘– (10.6)

subject to๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘– ยท ฮจ๐‘– (0) โ‰ฅ 0,

๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘– ยท ฮจ๐‘– (๐พ ๐‘—) โ‰ฅ 0 for ๐‘— = 1, . . . , ๐‘› and

๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘– ยท(ฮจ๐‘– (๐พ๐‘š๐‘Ž๐‘ฅ + 1) โˆ’ ฮจ๐‘– (๐พ๐‘š๐‘Ž๐‘ฅ)

)โ‰ฅ 0.

Then there is type B arbitrage, if the objective (10.6) โ‰ค 0 or unbounded; no arbitrage ispossible, if (10.6) > 0.

Example 10.2. How should one modify problem (10.6) to incorporate interest, for examplebecause the options are exercised in 1 year, e.g.?

Version: May 28, 2021

Page 70: Selected Topics from Portfolio Optimization

70 ARBITRAGE

rough draft: do not distribute

Page 71: Selected Topics from Portfolio Optimization

11The Flowergirl Problem1

Sell in May and go away.

investment strategy

11.1 THE FLOWERGIRL PROBLEM

Example 11.1 (The flowergirl problem, cf. [Pflug and Pichler, 2014]). A flowergirl has todecide how many flowers she orders from the wholesaler.

โŠฒ She

โ€“ buys for the price ๐‘ per flower and

โ€“ sells them for a price ๐‘  > ๐‘.

โ€“ The random demand is b.

โ€“ If the demand is higher than the available stock, she may procure additional flow-ers for an extra price ๐‘’ > ๐‘.

โ€“ Unsold flowers may be returned for a price of ๐‘Ÿ < ๐‘

โŠฒ What is the optimal order quantity ๐‘ฅโˆ—, if the expected profit should be maximized?

We formulate the profit as negative costs (expenses minus revenues) are

total costs B initial purchase = ๐‘ ๐‘ฅ

โˆ’ revenue from sales โˆ’ ๐‘  b+ extra procurement costs + ๐‘’ [b โˆ’ ๐‘ฅ]+โˆ’ revenue from returns. โˆ’ ๐‘Ÿ [b โˆ’ ๐‘ฅ]โˆ’;

here, [๐‘Ž]+ = max {๐‘Ž, 0} is the positive part of ๐‘Ž and [๐‘Ž]โˆ’ = max {โˆ’๐‘Ž, 0} is the negative part of๐‘Ž. Since ๐‘Ž = [๐‘Ž]+ โˆ’ [๐‘Ž]โˆ’, the cost function may be rewritten as

๐‘„(๐‘ฅ, b) = (๐‘ โˆ’ ๐‘Ÿ) ๐‘ฅ โˆ’ (๐‘  โˆ’ ๐‘Ÿ) b + (๐‘’ โˆ’ ๐‘Ÿ) [b โˆ’ ๐‘ฅ]+

= (๐‘ โˆ’ ๐‘Ÿ){๐‘ฅ + 11 โˆ’ ๐‘’โˆ’๐‘

๐‘’โˆ’๐‘Ÿ[b โˆ’ ๐‘ฅ]+

}โˆ’ (๐‘  โˆ’ ๐‘Ÿ) b.

Since AV@R๐›ผ (๐‘Œ ) = min{๐‘ฅ + 1

1โˆ’๐›ผ E[๐‘Œ โˆ’ ๐‘ฅ]+ : ๐‘ฅ โˆˆ R}

(see Average Value-at-Risk below) it fol-lows that

min๐‘ฅโˆˆR

E๐‘„(๐‘ฅ, b) = (๐‘ โˆ’ ๐‘Ÿ) AV@R๐›ผ (b) โˆ’ (๐‘  โˆ’ ๐‘Ÿ) E b,

1Also: Newsboy, or Newsvendor problem

71

Page 72: Selected Topics from Portfolio Optimization

72 THE FLOWERGIRL PROBLEM2

where ๐›ผ B ๐‘’โˆ’๐‘๐‘’โˆ’๐‘Ÿ .

To determine the order quantity consider the function

๐‘ฅ โ†ฆโ†’ (๐‘ โˆ’ ๐‘Ÿ){๐‘ฅ + 11 โˆ’ ๐‘’โˆ’๐‘

๐‘’โˆ’๐‘ŸE[b โˆ’ ๐‘ฅ]+

}โˆ’ (๐‘  โˆ’ ๐‘Ÿ)E b (11.1)

and its derivative

0 = (๐‘ โˆ’ ๐‘Ÿ){1 โˆ’ 11 โˆ’ ๐›ผ E1b>๐‘ฅ

}= (๐‘ โˆ’ ๐‘Ÿ)

{1 โˆ’ 11 โˆ’ ๐›ผ๐‘ƒ (b > ๐‘ฅ)

}.

Note next that this is equivalent to ๐›ผ = ๐‘ƒ(b โ‰ค ๐‘ฅ). The optimal procurement quantity of theflowergirl thus has the explicit expression

๐‘ฅโˆ— = V@R๐›ผ (b) = V@R ๐‘’โˆ’๐‘๐‘’โˆ’๐‘Ÿ(b) = ๐นโˆ’1b

(๐‘’ โˆ’ ๐‘๐‘’ โˆ’ ๐‘Ÿ

)(see, e.g., Pflug and Rรถmisch [2007, page 56]).

11.2 PROBLEMS

Exercise 11.1. Show that (11.1) is convex.

rough draft: do not distribute

Page 73: Selected Topics from Portfolio Optimization

12Duality For Convex Risk Measures

Risk measures, as introduced in Section 5, are convex. They hence have a representationR(๐‘Œ ) = sup {๐‘ฅโˆ—(๐‘ง) โˆ’ Rโˆ—(๐‘งโˆ—) : ๐‘งโˆ— โˆˆ ๐‘‹โˆ—}. To this end we specify the domain, its dual and the innerproduct ๐‘ฅโˆ—(๐‘ง).

Typical candidates for the domain of risk measures are ๐ฟ ๐‘ spaces, particularly ๐ฟโˆž. Recallthat the duals are ๐ฟ๐‘ž. Note further that the canonical inner product for the ๐ฟ ๐‘ โˆ’ ๐ฟ๐‘ž-duality is

๐ฟ ๐‘ ร— ๐ฟ๐‘ž 3 (๐‘Œ, ๐‘) โ†ฆโ†’ E๐‘Œ๐‘.

Proposition 12.1. Suppose that the risk measure R : ๐ฟ ๐‘ โ†’ R โˆช {โˆž} is positively homoge-neous. Then Rโˆ—(๐‘) โˆˆ {0,โˆž}.

Proof. Note that

Rโˆ—(๐‘) = sup๐‘Œ โˆˆ๐ฟ๐‘

E๐‘Œ๐‘ โˆ’ R(๐‘Œ )

= sup_โˆˆR

_ ยท sup๐‘Œ โˆˆ๐ฟ๐‘

(E๐‘Œ๐‘ โˆ’ R(๐‘Œ )) โˆˆ {0,โˆž} .

๏ฟฝ

Proposition 12.2. Suppose that the risk measure R : ๐ฟ ๐‘ โ†’ Rโˆช{โˆž} is translation equivariant.Then Rโˆ—(๐‘) = โˆž unless E ๐‘ = 1.

Proof.

Rโˆ—(๐‘) = sup๐‘Œ โˆˆ๐ฟ๐‘

E๐‘Œ๐‘ โˆ’ R(๐‘Œ )

= sup๐‘Œ โˆˆ๐ฟ๐‘

sup๐‘โˆˆR(E(๐‘Œ + ๐‘)๐‘ โˆ’ R(๐‘Œ + ๐‘)) = sup

๐‘Œ โˆˆ๐ฟ๐‘

E(๐‘Œ )๐‘ โˆ’ R(๐‘Œ ) + sup๐‘โˆˆR

๐‘ (E ๐‘ โˆ’ 1) ,

from which the assertion follows. ๏ฟฝ

Proposition 12.3. Suppose that the risk measure R : ๐ฟ ๐‘ โ†’ R โˆช {โˆž} is monotone. ThenRโˆ—(๐‘) = โˆž unless ๐‘ โ‰ฅ 0 almost surely.

Proof. Suppose that ๐‘ƒ(๐‘ โ‰ค 0) > 0. Set ๐ด B {๐‘ โ‰ค 0} and ๐‘Œ0 B 1๐ด. Note, that โˆ’๐‘Œ0 โ‰ค 0, henceR(โˆ’๐‘Œ0) โ‰ค R(0) and

Rโˆ—(๐‘) = sup๐‘Œ โˆˆ๐ฟ๐‘

E๐‘Œ๐‘ โˆ’ R(๐‘Œ )

โ‰ฅ sup_<0(E_๐‘Œ0๐‘ โˆ’ R(_๐‘Œ0)) โ‰ฅ sup

_<0E_ 1๐ด ๐‘ โˆ’ R(0) = โˆž.

Hence, ๐‘ โ‰ฅ 0 a.s. ๏ฟฝ

Definition 12.4. The support function of a set Z is ๐‘ Z (๐‘Œ ) B sup๐‘ โˆˆZ E๐‘Œ๐‘.

Theorem 12.5. Define Z B {๐‘ : Rโˆ—(๐‘) < โˆž}.

73

Page 74: Selected Topics from Portfolio Optimization

74 DUALITY FOR CONVEX RISK MEASURES

rough draft: do not distribute

Page 75: Selected Topics from Portfolio Optimization

13Stochastic Optimization: Terms, and Definitions, and theDeterministic Equivalent

13.1 EXPECTED VALUE OF PERFECT INFORMATION (EVPI) AND VALUE OF

STOCHASTIC SOLUTION (VSS)

The expected value of perfect information (EVPI) is the price that one would be willing to payin order to gain access to perfect information, that is to say the difference SP โ€“ wait-and-see,

E[min๐‘ฅ๐‘“ (๐‘ฅ, b)

]๏ธธ ๏ธท๏ธท ๏ธธ

wait-and-see

โ‰ค min๐‘ฅE ๐‘“ (๐‘ฅ, b)๏ธธ ๏ธท๏ธท ๏ธธSP

โ‰ค E ๐‘“ (๐‘ฅ, b). (13.1)

Both inequalities hold always.For ๐‘“ (๐‘ฅ, ยท) concave Jensenโ€™s inequality continues the sequence of inequalities above with

E ๐‘“ (๐‘ฅ, b) โ‰ค ๐‘“ (๐‘ฅ,E b) ,

drawing some attention to the strategy ๐‘ฅ0 โˆˆ argmin ๐‘“ (๐‘ฅ,E b) (if this exists at all): Giventhis reference strategy ๐‘ฅ0 the distance RHS โ€“ SP in (13.1) is called Value of the StochasticSolution (VSS). However, in a general context ๐‘“ (๐‘ฅ, ยท) are rather convex and no comparisonof ๐‘“ (๐‘ฅ0,E b) with (SP) is possible in (13.1) for this case (counter-example: Farmer Ted).

13.2 THE FARMER TED

See Jeffโ€™s lecture, http://homepages.cae.wisc.edu/โˆผlinderot/classes/ie495/lecture2.pdf.

13.3 THE RISK-NEUTRAL PROBLEM

Two-stage stochastic linear program with fixed recourse:

minimize ๐‘>๐‘ฅ + Eb

[๐‘ž>b๐‘ฆ b

]= ๐‘ (๐‘ฅ) + EQ (๐‘ฅ, ยท)

(in ๐‘ฅ and ๐‘ฆ)subject to ๐ด๐‘ฅ = ๐‘ 1st stage constraints

๐‘‡b ๐‘ฅ +๐‘Š๐‘ฆ b = โ„Žb for a.e. b โˆˆ ฮž 2nd stage constraints๐‘ฅ โˆˆ ๐‘‹, ๐‘ฆ b โˆˆ ๐‘Œ for a.e. b โˆˆ ฮž

(13.2)

Note, that minimization is done over a deterministic ๐‘ฅ and a random variable ๐‘ฆ.

75

Page 76: Selected Topics from Portfolio Optimization

76STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC

EQUIVALENT

13.4 GLOSSARY/ CONCEPT/ DEFINITIONS:

โŠฒ(๐‘ฅ, ๐‘ฆ b

)โ†ฆโ†’ ๐‘>๐‘ฅ + Eb

[๐‘ž>๐‘ฆ b

]is the objective function;

โ€“ ๐‘ฅ is called here-and-now decision (solution), 1st stage decision;

โ€“ the (optimal) random variable ๐‘ฆ b is called wait-and-see decision (solution), 2๐‘›๐‘‘stage decision or recourse action;

โŠฒ ๐‘ (deterministic) costs;

โŠฒ ๐‘ž: vector of recourse costs, which is sometimes considered random as well;

โŠฒ ๐‘Š is the recourse matrix. Fixed recourse is given, if โ€“ as in (13.2) โ€“ ๐‘Š = ๐‘Š (b) (i.e., thematrix is deterministic/ nonrandom);

โŠฒ ๐‘‡b are sometimes called technology matrices;

โŠฒ ๐‘Œ : feasible set of recourse actions;

โŠฒ the function

๐‘ฃ๐‘ž (๐‘ง) B{min๐‘ฆโˆˆ๐‘Œ {๐‘ž>๐‘ฆ : ๐‘Š๐‘ฆ = ๐‘ง} if feasible+โˆž else

is called second stage value function or recourse (penalty) function;

โŠฒ then defineQ (๐‘ฅ, b) B ๐‘ฃ

(โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

)= min

๐‘ฆโˆˆ๐‘Œ

{๐‘ž>๐‘ฆ : ๐‘Š๐‘ฆ = โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

},

(notice: ๐‘ฅ โ†ฆโ†’ Q (๐‘ฅ, b) is lsc.)

โŠฒ andQ (๐‘ฅ) B Eb [Q (๐‘ฅ, b)] = Eb

[๐‘ฃ(โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

) ]is called expected value function, or expected minimum recourse function.

โŠฒ A recourse is relatively complete if ๐ด๐‘ฅ = ๐‘, ๐‘ฅ โ‰ฅ 0 implies Q (๐‘ฅ, b) < โˆž for a.e. b โˆˆ ฮž.

โŠฒ A recourse is complete if โˆ€๐‘ง : ๐‘ฃ (๐‘ง) < โˆž (i.e., there always exists a feasible recourseaction, โˆ€๐‘ง โˆƒ๐‘ฆ : ๐‘Š๐‘ฆ = ๐‘ง). As a consequence, Q (๐‘ฅ, b) < โˆž.

13.5 KKT FOR (13.2)

โŠฒ ๐‘ฃ is a LP itself and consequently, from duality,

๐‘ฃ (๐‘ง) = min๐‘ฆโ‰ฅ0

{๐‘ž>๐‘ฆ : ๐‘Š๐‘ฆ = ๐‘ง

}= max

_

{_>๐‘ง : _>๐‘Š โ‰ค ๐‘ž>

}(13.3)

where additionally _โˆ—> โˆˆ ๐œ•๐‘ฃ (๐‘ง) (_โˆ— being the optimal (argmax) solution of the dual prob-lem).

โŠฒ From the chain rule, ๐œ•๐‘ฅQ (๐‘ฅ, b) = ๐œ•๐‘ฅ๐‘ฃ(โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

)3 โˆ’_โˆ—>

b๐‘‡b .

rough draft: do not distribute

Page 77: Selected Topics from Portfolio Optimization

13.6 DETERMINISTIC EQUIVALENT 77

โŠฒ Suppose further the probability space is discrete, that is P =โˆ‘

b โˆˆฮž ๐‘ b ยท๐›ฟb (๐‘ b B P [{b}]),then

Q (๐‘ฅ) = Eb [Q (๐‘ฅ, b)] =โˆ‘b โˆˆฮž

๐‘ bQ (๐‘ฅ, b) . (13.4)

For ๐‘ข>bB โˆ’_โˆ—>

b๐‘‡b โˆˆ ๐œ•๐‘ฅQ (๐‘ฅ, b) thus ๐‘ข> B Eb

[๐‘ข>b

]=

โˆ‘b โˆˆฮž ๐‘ b๐‘ข

>bโˆˆ ๐œ•Q (๐‘ฅ).

KKT, applied to the problem (13.2):๐‘ฅโˆ— is an optimal solution of (13.2) iff โˆƒ_โˆ—, `โˆ— โ‰ฅ 0 st.

(i) 0 โˆˆ ๐‘> + ๐œ•Q (๐‘ฅโˆ—) + _โˆ—>๐ด โˆ’ `โˆ—>,

(ii) `โˆ—>๐‘ฅโˆ— = 0.

13.6 DETERMINISTIC EQUIVALENT

Given the situation, that ฮž consists of finitely many (๐‘† B |ฮž|) atoms, Equation (13.2) can bereformulated in its deterministic equivalent, i.e.

minimize ๐‘>๐‘ฅ+ ๐‘ b1๐‘ž>๐‘ฆ b1+ ๐‘ b2๐‘ž

>๐‘ฆ b2+ . . . +๐‘ b๐‘†๐‘ž>๐‘ฆ b๐‘†

(in x and y)subject to ๐ด๐‘ฅ = ๐‘

๐‘‡b1๐‘ฅ +๐‘Š๐‘ฆ b1 +0 . . . +0 = โ„Žb1

๐‘‡b2๐‘ฅ +0 +๐‘Š๐‘ฆ b2. . .

... = โ„Žb2...

.... . .

. . . +0...

...

๐‘‡b๐‘†๐‘ฅ +0 . . . +0 +๐‘Š๐‘ฆ๐‘† = โ„Žb๐‘†๐‘ฅ โˆˆ ๐‘‹ ๐‘ฆ b1 โˆˆ ๐‘Œ ๐‘ฆ b2 โˆˆ ๐‘Œ ๐‘ฆ b๐‘† โˆˆ ๐‘Œ

(13.5)

where ๐‘ b B P [{b}].NB: (13.5) is a big LP (often too big, indeed), but linear and sparse. The size increases,

as the number of atoms (scenarios) ๐‘† increases. However, we expect that a lot of theseconstraints are redundant and we want to exploit this presumption.

13.7 L-SHAPED METHOD

i.e., Benderโ€™s decomposition, applied to (13.5).Let _โˆ—

b(๐‘ฅ)> โˆˆ argmax_

{_>

(โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

): _>๐‘Š โ‰ค ๐‘ž

}be an optimal, dual solution to the re-

course problem in scenario b, then ๐‘ข (๐‘ฅ)> B โˆ’โˆ‘b ๐‘ b_

โˆ—>b(๐‘ฅ) ๐‘‡b โˆˆ ๐œ•Q (๐‘ฅ). Thus,

Q (๐‘ฅ) + ๐‘ข (๐‘ฅ)> (๐‘ฅ โˆ’ ๐‘ฅ) โ‰ค Q (๐‘ฅ) ,

hence ๐‘ฅ โ†ฆโ†’ Q (๐‘ฅ) + ๐‘ข (๐‘ฅ)> (๐‘ฅ โˆ’ ๐‘ฅ) is a supporting hyperplane, supporting Q from below. So isthe bundle,

Q๐ฟ (๐‘ฅ) B max๐‘™โˆˆ๐ฟQ (๐‘ฅ๐‘™) + ๐‘ข>๐‘™ (๐‘ฅ โˆ’ ๐‘ฅ๐‘™) โ‰ค Q (๐‘ฅ)

(where ๐‘ข๐‘™ B ๐‘ข (๐‘ฅ๐‘™)).

Version: May 28, 2021

Page 78: Selected Topics from Portfolio Optimization

78STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC

EQUIVALENT

13.8 FARKASโ€™ LEMMA

Lemma 13.1 (A Theorem on the Alternative). Exactly one of these following two statementsholds true:

โŠฒ There exists ๐‘ฆ such that ๐‘Š๐‘ฆ = ๐‘ง and ๐‘ฆ โ‰ฅ 0;

โŠฒ There exists ๐œŽ such that ๐œŽ>๐‘Š โ‰ค 0 and ๐œŽ>๐‘ง > 0.

13.9 L-SHAPED ALGORITHM.

The initial problem (13.2) can be restated equivalently, getting rid of the random variable inthe objective function at the same time, as

(13.2) โ‡โ‡’

minimize (in x) ๐‘>๐‘ฅ + Q (๐‘ฅ)subject to ๐ด๐‘ฅ = ๐‘,

๐‘ฅ โˆˆ ๐‘‹โ‡โ‡’ min

๐‘ฅโˆˆ๐‘‹

{๐‘>๐‘ฅ + Q (๐‘ฅ) : ๐ด๐‘ฅ = ๐‘

}โ‡โ‡’

minimize (in x, \) ๐‘>๐‘ฅ + \subject to ๐ด๐‘ฅ = ๐‘,

Q (๐‘ฅ) โ‰ค \,๐‘ฅ โˆˆ ๐‘‹

Algorithm 13.1 is based on the previous observation of supporting hyperplanes and the latter,equivalent formulation:

13.10 VARIANTS OF THE ALGORITHM.

โŠฒ Multicut: In view of linearity of (13.4) and (13.7) we may equally well start with the set

B B {(๐‘ฅ, \1, . . . \๐‘†) : ๐‘ฅ โ‰ฅ 0, ๐ด๐‘ฅ = ๐‘, \๐‘– โ‰ฅ \0}

and solve the problem

min

{๐‘>๐‘ฅ +

โˆ‘b

๐‘ b \ b : (๐‘ฅ, \1, . . . , \๐‘†) โˆˆ B}

instead of (13.6) โ€“ the problem is said to be separable into scenario sub-problems.

โ€“ The abort criterion reads: Q (๐‘ฅ) โ‰ค โˆ‘b ๐‘ b \ b?

โ€“ The feasibility cut (singular!) remains unchanged, as they do not involve \s.โ€“ The new optimality cuts (plural!) read

B โ† B โˆฉโ‹‚

b : Q( ๏ฟฝ๏ฟฝ, b )>\b

{(๐‘ฅ, \1, . . . \ b . . . , \๐‘†

): \ b โ‰ฅ Q (๐‘ฅ, b) + ๏ฟฝ๏ฟฝ>b (๐‘ฅ โˆ’ ๐‘ฅ)

}.

โŠฒ Chunked multicut: The same idea as multicut, but with a few b-clusters instead of theentire ฮž: ฮž = {b1, . . . b๐‘†} = ยค

โ‹ƒ๐ถ

๐‘˜=1๐‘†๐‘˜ . Define Q [๐‘†๐‘˜ ] (๐‘ฅ) Bโˆ‘

b โˆˆ๐‘†๐‘˜ ๐‘ bQ (๐‘ฅ, b) and proceedas for the multicut version.

rough draft: do not distribute

Page 79: Selected Topics from Portfolio Optimization

13.10 VARIANTS OF THE ALGORITHM. 79

Algorithm 13.1 L-Shaped Method

(i) Find \0 such that \0 โ‰ค Q (๐‘ฅ) (for all ๐‘ฅ) and define

B B {(๐‘ฅ, \) : ๐‘ฅ โ‰ฅ 0, ๐ด๐‘ฅ = ๐‘, \ โ‰ฅ \0} .

(B โ€“ to some extent โ€“ characterizes the epi-graph of the approximate Q๐ฟ and we haveB โŠ‡ epiQ; if not available then choose \0 B โˆ’โˆž.)

(ii) Solve the problemmin

{๐‘>๐‘ฅ + \ : (๐‘ฅ, \) โˆˆ B

}(13.6)

and call the solution found(๐‘ฅ, \

).

(iii) Compute Q (๐‘ฅ).

(a) If Q (๐‘ฅ) โ‰ค \ < โˆž, then ๐‘ฅ is the best (i.e., minimal) solution of our original problem.

(b) Optimality cut : if \ < Q (๐‘ฅ) < โˆž, then put

๏ฟฝ๏ฟฝ> B โˆ’โˆ‘b โˆˆฮž

๐‘ b_โˆ—b (๐‘ฅ)

> ๐‘‡b โˆˆ ๐œ•Q (๐‘ฅ) (13.7)

and send the additional hyperplane, that is

B โ† B โˆฉ{(๐‘ฅ, \) : \ โ‰ฅ Q (๐‘ฅ) + ๏ฟฝ๏ฟฝ> (๐‘ฅ โˆ’ ๐‘ฅ)

}.

Continue with step (ii).

(c) Feasibility cut : It holds that Q (๐‘ฅ) = โˆž. In this case there exists b, such that๐‘ฃ

(โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ

)= โˆž, i.e.,

{๐‘ž>๐‘ฆ : ๐‘Š๐‘ฆ = โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ

}= {}, i.e.

{๐‘ฆ : ๐‘Š๐‘ฆ = โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ

}= {}.

Hence, โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ is not feasible for ๐‘ฃ and thus ๐‘ฅ is not feasible for Q. Thus, byFarkasโ€™ lemma (Lemma 13.1), find an appropriate ๏ฟฝ๏ฟฝ, such that ๏ฟฝ๏ฟฝ>๐‘Š โ‰ค 0 and๏ฟฝ๏ฟฝ>

(โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ

)> 0. To exclude this particular ๐‘ฅ for the future send the additional

conditionB โ† B โˆฉ

{(๐‘ฅ, \) : ๏ฟฝ๏ฟฝ>

(โ„Ž b โˆ’ ๐‘‡b ๐‘ฅ

)โ‰ค 0

}and continue with step (ii).

Version: May 28, 2021

Page 80: Selected Topics from Portfolio Optimization

80STOCHASTIC OPTIMIZATION: TERMS, AND DEFINITIONS, AND THE DETERMINISTIC

EQUIVALENT

โŠฒ Numerical experiments show that the algorithm is sometimes flipping around withoutimproving the solution significantly. In order to stop this misbehavior search for animproved local solution, by modifying the objective function as follows:

โ€“ Regularization

min

{๐‘>๐‘ฅ +

โˆ‘๐‘˜

๐‘๐‘†๐‘˜ \๐‘˜ : (๐‘ฅ, \1, . . . , \๐ถ) โˆˆ B, ๐‘ฅ โˆ’ ๐‘ฅ๐‘– โ‰ค ฮ”๐‘–

}โ€“ Regularized decomposition method

min

{๐‘>๐‘ฅ +

โˆ‘๐‘˜

๐‘๐‘†๐‘˜ \๐‘˜ + ๐‘ฅ โˆ’ ๐‘ฅ๐‘– 22๐œŒ

: (๐‘ฅ, \1, . . . , \๐ถ) โˆˆ B}

Additional controls on ฮ” and ๐œŒ are available here to (intuitively) enforce a direc-tion of (significantly) good descent. The same techniques as for (unconstrained),nonlinear optimization apply here.

โŠฒ Parallelizing: ๐‘ข>bB โˆ’_โˆ—>

b๐‘‡b โˆˆ ๐œ•Q (๐‘ฅ, b) has to be evaluated, which is an LP for all b โˆˆ ฮž

(or chunks). This work can be parallelized, reducing (optimistically) the total time bythe factor 1

# Computers .

โŠฒ Computing _โˆ—>bโˆˆ ๐œ•๐‘ฃ

(โ„Žb โˆ’ ๐‘‡b ๐‘ฅ

)is almost the same LP for all b โˆˆ ฮž, but without involving

๐‘† as a dimension: the constrains of the dual function stay unchanged, only the objectivefunction varies (if ๐‘ž is nonrandom). The idea of bunching is to avoid all those evalua-tions and instead build a basis of representative directions such that _โˆ—> = ๐‘ž>

๐ต๐‘Šโˆ’1

๐ต(cf.

(13.3)). This is particularly advantageous if dimฮž ๏ฟฝ dim ๐‘ž. The statement is based onthe following

โ€“ Theorem: Let ๐‘ฅ be optimal for (LPโ€™),

* then there is a basis such that ๐‘ž>๐‘โ‰ฅ ๐‘ž>

๐ต๐‘Šโˆ’1

๐ต๐‘Š๐‘ for the appropriate decompo-

sition ๐‘Š = (๐‘Š๐ต,๐‘Š๐‘ ) etc.;

* moreover, _โˆ—> B ๐‘ž>๐ต๐‘Šโˆ’1

๐ตis optimal for (DPโ€™).

rough draft: do not distribute

Page 81: Selected Topics from Portfolio Optimization

14Co- and Antimonotonicity

14.1 REARRANGEMENTS

Theorem 14.1 (Generalized Chebyshevโ€™s sum inequality). Let ๐‘๐‘– โ‰ฅ 0 withโˆ‘๐‘›

๐‘–=1 ๐‘๐‘– = 1. Then,for ๐‘ฅ1 โ‰ค ๐‘ฅ2 ยท ยท ยท โ‰ค ๐‘ฅ๐‘› and ๐‘ฆ1 โ‰ค ๐‘ฆ2 ยท ยท ยท โ‰ค ๐‘ฆ๐‘› it holds that(

๐‘›โˆ‘๐‘–=1

๐‘๐‘–๐‘ฅ๐‘–

)ยท(

๐‘›โˆ‘๐‘–=1

๐‘๐‘–๐‘ฆ๐‘–

)โ‰ค

๐‘›โˆ‘๐‘–=1

๐‘๐‘–๐‘ฅ๐‘–๐‘ฆ๐‘– .

Proof. Note that๐‘›โˆ‘๐‘—=1

๐‘›โˆ‘๐‘˜=1

๐‘ ๐‘— ๐‘๐‘˜ (๐‘ฅ ๐‘— โˆ’ ๐‘ฅ๐‘˜) (๐‘ฆ ๐‘— โˆ’ ๐‘ฆ๐‘˜)๏ธธ ๏ธท๏ธท ๏ธธโ‰ฅ0

โ‰ฅ 0,

as the components are increasing. Hence, by expanding,

0 โ‰ค๐‘›โˆ‘๐‘—=1

๐‘›โˆ‘๐‘˜=1

๐‘ ๐‘— ๐‘๐‘˜๐‘ฅ ๐‘— ๐‘ฆ ๐‘— โˆ’ ๐‘ ๐‘— ๐‘๐‘˜๐‘ฅ ๐‘— ๐‘ฆ๐‘˜ โˆ’ ๐‘ ๐‘— ๐‘๐‘˜๐‘ฅ๐‘˜ ๐‘ฆ ๐‘— + ๐‘ ๐‘— ๐‘๐‘˜๐‘ฅ๐‘˜ ๐‘ฆ๐‘˜ = 2๐‘›โˆ‘๐‘—=1

๐‘ ๐‘—๐‘ฅ ๐‘— ๐‘ฆ ๐‘— โˆ’ 2๐‘›โˆ‘๐‘—=1

๐‘ ๐‘—๐‘ฅ ๐‘—

๐‘›โˆ‘๐‘˜=1

๐‘๐‘˜ ๐‘ฆ๐‘˜ ,

from which the result is immediate. ๏ฟฝ

Theorem 14.2 (Chebyshevโ€™s sum inequality, the continuous version). Let ๐‘“ , ๐‘” : [0, 1] โ†’ R benondecreasing. Then it holds thatโˆซ 1

0๐‘“ (๐‘ฅ) d๐‘ฅ ยท

โˆซ 1

0๐‘”(๐‘ฅ) d๐‘ฅ โ‰ค

โˆซ 1

0๐‘“ (๐‘ฅ)๐‘”(๐‘ฅ) d๐‘ฅ.

Theorem 14.3 (The rearrangement inequality). Let ๐‘ฅ1 โ‰ค ๐‘ฅ2 ยท ยท ยท โ‰ค ๐‘ฅ๐‘› and ๐‘ฆ1 โ‰ค ๐‘ฆ2 ยท ยท ยท โ‰ค ๐‘ฆ๐‘›.Then

๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘›+1โˆ’๐‘–๐‘ฆ๐‘– โ‰ค๐‘›โˆ‘๐‘–=1

๐‘ฅ๐œŽ (๐‘–) ๐‘ฆ๐‘– โ‰ค๐‘›โˆ‘๐‘–=1

๐‘ฅ๐‘–๐‘ฆ๐‘– (14.1)

for every permutation ๐œŽ : {1, . . . , ๐‘›} โ†’ {1, . . . , ๐‘›}.Proof. Suppose the permutation ๐œŽ maximizing (14.1) were not the identity. Then find thesmallest ๐‘— so that ๐œŽ( ๐‘—) โ‰  ๐‘— . Note, that ๐œŽ( ๐‘—) > ๐‘— and there is ๐‘˜ > ๐‘— so that ๐œŽ( ๐‘—) = ๐‘˜. Now

๐‘— < ๐‘˜ =โ‡’ ๐‘ฆ ๐‘— โ‰ค ๐‘ฆ๐‘˜ and ๐‘— < ๐œŽ( ๐‘—) =โ‡’ ๐‘ฅ ๐‘— โ‰ค ๐‘ฅ๐œŽ ( ๐‘—)and thus 0 โ‰ค

(๐‘ฅ๐œŽ ( ๐‘—) โˆ’ ๐‘ฅ ๐‘—

) (๐‘ฆ๐‘˜ โˆ’ ๐‘ฆ ๐‘—

), i.e.,

๐‘ฅ๐œŽ ( ๐‘—) ๐‘ฆ ๐‘— + ๐‘ฅ ๐‘— ๐‘ฆ๐‘˜ โ‰ค ๐‘ฅ ๐‘— ๐‘ฆ ๐‘— + ๐‘ฅ๐œŽ ( ๐‘—) ๐‘ฆ๐‘˜ . (14.2)

Define the permutation exchanging the values ๐œŽ( ๐‘—) and ๐œŽ(๐‘˜), i.e., ๐œ(๐‘–) B

๐‘– for ๐‘– โˆˆ {1, . . . , ๐‘—}๐œŽ( ๐‘—) if ๐‘– = ๐‘˜๐œŽ(๐‘–) else

and

observe that the right hand side of (14.2) is better for ๐œ(ยท) and ๐œ( ๐‘—) = ๐‘— . It follows that ๐œŽ(ยท) isthe identity. ๏ฟฝ

81

Page 82: Selected Topics from Portfolio Optimization

82 CO- AND ANTIMONOTONICITY

14.2 COMONOTONICITY

Definition 14.4. The random variables ๐‘‹๐‘–, ๐‘– = 1, . . . , ๐‘› are comonotonic (aka. nondecreas-ing), if (

๐‘‹๐‘– (๐œ”) โˆ’ ๐‘‹๐‘– (๏ฟฝ๏ฟฝ))ยท(๐‘‹ ๐‘— (๐œ”) โˆ’ ๐‘‹ ๐‘— (๏ฟฝ๏ฟฝ)

)โ‰ฅ 0 for all ๐œ”, ๏ฟฝ๏ฟฝ โˆˆ ๐‘ and ๐‘–, ๐‘— โ‰ค ๐‘› (14.3)

and anti-monotone, if(๐‘‹๐‘– (๐œ”) โˆ’ ๐‘‹๐‘– (๏ฟฝ๏ฟฝ)

)ยท(๐‘‹ ๐‘— (๐œ”) โˆ’ ๐‘‹ ๐‘— (๏ฟฝ๏ฟฝ)

)โ‰ค 0 for all ๐œ”, ๏ฟฝ๏ฟฝ โˆˆ ๐‘ and ๐‘–, ๐‘— โ‰ค ๐‘›,

where ๐‘ƒ(๐‘) = 1.

Theorem 14.5 (Cf. Denneberg [1994]). Let ๐‘‹ and ๐‘Œ be R-valued random variables. Thefollowing are equivalent:

(i) ๐‘‹ and ๐‘Œ are comonotonic;

(ii) there exists an R-valued random variable ๐‘ and nondecreasing functions ๐‘ฃ, ๐‘ค : Rโ†’ R

so that๐‘‹ = ๐‘ฃ(๐‘) and ๐‘Œ = ๐‘ค(๐‘);

(iii) there are nondecreasing functions ๐‘ฃ, ๐‘ค : Rโ†’ R so that

๐‘‹ = ๐‘ฃ(๐‘‹ + ๐‘Œ ) and ๐‘Œ = ๐‘ค(๐‘‹ + ๐‘Œ ).

Remark. The subsequent proof verifies that ๐‘ฃ and ๐‘ค are monotone on the range of ๐‘.

Proof. (๐‘–) =โ‡’ (๐‘–๐‘–๐‘–): Define ๐‘ B ๐‘‹ + ๐‘Œ . For ๐‘ง = ๐‘ (๐œ”) define ๐‘ฃ(๐‘ง) B ๐‘‹ (๐œ”) and ๐‘ค(๐‘ง) B ๐‘Œ (๐œ”).To see that ๐‘ฃ(ยท) and ๐‘ค(ยท) are well-defined choose ๐œ”1, ๐œ”2 โˆˆ ๐‘โˆ’1({๐‘ง}), then ๐‘‹ (๐œ”1) + ๐‘Œ (๐œ”1) =๐‘ (๐œ”1) = ๐‘ง = ๐‘ (๐œ”2) = ๐‘‹ (๐œ”2) + ๐‘Œ (๐œ”2), and thus

๐‘‹ (๐œ”1) โˆ’ ๐‘‹ (๐œ”2) = โˆ’(๐‘Œ (๐œ”1) โˆ’ ๐‘Œ (๐œ”2)

). (14.4)

If ๐‘‹ (๐œ”1) โˆ’ ๐‘‹ (๐œ”2) โ‰ค 0, then ๐‘Œ (๐œ”1) โˆ’ ๐‘Œ (๐œ”2) โ‰ฅ 0 by (14.4) and ๐‘Œ (๐œ”1) โˆ’ ๐‘Œ (๐œ”2) โ‰ค 0 bycomonotonicity, thus ๐‘Œ (๐œ”1) โˆ’ ๐‘Œ (๐œ”2) = 0 and consequently ๐‘‹ (๐œ”1) = ๐‘‹ (๐œ”2) by (14.4). Hence,๐‘ฃ(ยท) and ๐‘ค(ยท) are well-defined.

To see that ๐‘ฃ(ยท) and ๐‘ค(ยท) are monotonic pick ๐œ”1, ๐œ”2 โˆˆ ฮฉ with ๐‘ง1 B ๐‘ (๐œ”1) โ‰ค ๐‘ (๐œ”2) =: ๐‘ง2,then we find

๐‘‹ (๐œ”1) โˆ’ ๐‘‹ (๐œ”2) โ‰ค โˆ’(๐‘Œ (๐œ”1) โˆ’ ๐‘Œ (๐œ”2)

). (14.5)

If ๐‘‹ (๐œ”1) > ๐‘‹ (๐œ”2), then ๐‘Œ (๐œ”1) < ๐‘Œ (๐œ”2) by (14.5), which is in contrast to our assumption oncomonotonicity and hence ๐‘‹ (๐œ”1) โ‰ค ๐‘‹ (๐œ”2); similarly we find that ๐‘Œ (๐œ”1) โ‰ค ๐‘Œ (๐œ”2). It followsthat

๐‘ฃ(๐‘ง1) = ๐‘‹ (๐œ”1) โ‰ค ๐‘‹ (๐œ”2) = ๐‘ฃ(๐‘ง2) and๐‘ค(๐‘ง1) = ๐‘Œ (๐œ”1) โ‰ค ๐‘Œ (๐œ”2) = ๐‘ค(๐‘ง2),

i.e., ๐‘ฃ(ยท) and ๐‘ค(ยท) are nondecreasing.We shall verify next that ๐‘ฃ(ยท) and ๐‘ค(ยท) are Lipschitz with constant 1. Note that we have

๐‘ฃ(๐‘ง) + ๐‘ค(๐‘ง) = ๐‘‹ (๐œ”) + ๐‘Œ (๐œ”) = ๐‘ (๐œ”) = ๐‘ง.

rough draft: do not distribute

Page 83: Selected Topics from Portfolio Optimization

14.2 COMONOTONICITY 83

If ๐‘ง1 โ‰ค ๐‘ง2, then

๐‘ง2 โˆ’ ๐‘ง1 = ๐‘ฃ(๐‘ง2) + ๐‘ค(๐‘ง2) โˆ’ ๐‘ฃ(๐‘ง1) โˆ’ ๐‘ค(๐‘ง1) โ‰ฅ ๐‘ฃ(๐‘ง2) โˆ’ ๐‘ฃ(๐‘ง1) = |๐‘ฃ(๐‘ง2) โˆ’ ๐‘ฃ(๐‘ง1) |

by monotonicity of ๐‘ค(ยท); if ๐‘ง1 โ‰ฅ ๐‘ง2, then

๐‘ง1 โˆ’ ๐‘ง2 = ๐‘ฃ(๐‘ง1) + ๐‘ค(๐‘ง1) โˆ’ ๐‘ฃ(๐‘ง2) โˆ’ ๐‘ค(๐‘ง2) โ‰ฅ ๐‘ฃ(๐‘ง1) โˆ’ ๐‘ฃ(๐‘ง2) = |๐‘ฃ(๐‘ง2) โˆ’ ๐‘ฃ(๐‘ง1) | ,

i.e., ๐‘ฃ(ยท) and ๐‘ค(๐‘ง) = ๐‘ง โˆ’ ๐‘ฃ(๐‘ง) are both Lipschitz on ๐‘ (ฮฉ).Finally note that a Lipschitz function ๐‘“ (ยท) with Lipschitz constant ๐ฟ can be extended to the

entire domain by setting ๐‘“ (๐‘ง) B inf๐‘ก โˆˆ๐‘ (ฮฉ) ๐‘“ (๐‘ก) + ๐ฟ |๐‘ก โˆ’ ๐‘ง | while keeping the Lipschitz constant๐ฟ (cf. Exercise 14.1).(๐‘–๐‘–๐‘–) =โ‡’ (๐‘–๐‘–) is evident by choosing the random variable ๐‘ B ๐‘‹ + ๐‘Œ .(๐‘–๐‘–) =โ‡’ (๐‘–): Let ๐œ”1, ๐œ”2 โˆˆ ฮฉ. To prove (i) we may assume that ๐‘‹ (๐œ”1) > ๐‘‹ (๐œ”2), i.e., by

assumption, ๐‘ฃ(๐‘ (๐œ”1)) > ๐‘ฃ(๐‘ (๐œ”2)). As ๐‘ฃ(ยท) is monotone we deduce that ๐‘ (๐œ”1) โ‰ฅ ๐‘ (๐œ”2). As๐‘ค(ยท) is monotone as well it follows that ๐‘Œ (๐œ”1) = ๐‘ค(๐‘ (๐œ”1)) > ๐‘ค(๐‘ (๐œ”2)) = ๐‘Œ (๐œ”2) and hence (i),i.e., ๐‘‹ and ๐‘Œ are comonotonic. ๏ฟฝ

Corollary 14.6. The random variables ๐‘‹๐‘– are comonotonic iff

(๐‘‹1, . . . ๐‘‹๐‘›) โˆผ(๐นโˆ’1๐‘‹1 (๐‘ˆ), . . . , ๐น

โˆ’1๐‘‹๐‘›(๐‘ˆ)

)for one uniform random variable ๐‘ˆ.

Proof. It is evident that(๐นโˆ’1๐‘‹1(๐‘ˆ), . . . , ๐นโˆ’1

๐‘‹๐‘›(๐‘ˆ)

)are comonotonic, as ๐นโˆ’1

๐‘‹๐‘–(ยท) are nondecreasing

and ๐นโˆ’1๐‘‹๐‘–(๐‘ˆ) โˆผ ๐‘‹๐‘–.

As for the converse apply Theorem 14.5 (ii) and (4.3). Then ๐‘‹ = ๐นโˆ’1๐‘‹(๐‘ˆ) = ๐‘ข โ—ฆ ๐นโˆ’1

๐‘(๐‘ˆ). ๏ฟฝ

Proposition 14.7 (Upper Frรฉchet1 bound). If ๐‘‹๐‘– are pairwise comonotonic, then

๐น๐‘‹1,...๐‘‹๐‘›(๐‘ฅ1, . . . , ๐‘ฅ๐‘›) = min

๐‘–=1,...๐‘›๐น๐‘‹๐‘–(๐‘ฅ๐‘–).

Proof. By Theorem 14.5 (ii) we have

๐น๐‘‹1,๐‘‹2 (๐‘ฅ1, ๐‘ฅ2) = ๐‘ƒ(๐‘‹1 โ‰ค ๐‘ฅ1, ๐‘‹2 โ‰ค ๐‘ฅ2) = ๐‘ƒ(๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1, ๐‘ค(๐‘) โ‰ค ๐‘ฅ2).

As ๐‘ฃ(ยท) and ๐‘ค(ยท) are monotone we have either {๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1} โŠ‚ {๐‘ค(๐‘) โ‰ค ๐‘ฅ2} or {๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1} โŠƒ{๐‘ค(๐‘) โ‰ค ๐‘ฅ2}.

If {๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1} โŠ‚ {๐‘ค(๐‘) โ‰ค ๐‘ฅ2}, then

๐น๐‘‹1,๐‘‹2 (๐‘ฅ1, ๐‘ฅ2) = ๐‘ƒ(๐‘‹1 โ‰ค ๐‘ฅ1, ๐‘‹2 โ‰ค ๐‘ฅ2) = ๐‘ƒ(๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1) = ๐น๐‘‹1 (๐‘ฅ1) โ‰ค ๐น๐‘‹2 (๐‘ฅ2),

and if {๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1} โŠƒ {๐‘ค(๐‘) โ‰ค ๐‘ฅ2}, then

๐น๐‘‹1,๐‘‹2 (๐‘ฅ1, ๐‘ฅ2) = ๐‘ƒ(๐‘ฃ(๐‘) โ‰ค ๐‘ฅ1, ๐‘ค(๐‘) โ‰ค ๐‘ฅ2) = ๐‘ƒ(๐‘ค(๐‘) โ‰ค ๐‘ฅ2) = ๐น๐‘‹2 (๐‘ฅ2) โ‰ค ๐น๐‘‹1 (๐‘ฅ1).

The assertion follows. ๏ฟฝ

Remark 14.8. It is always true that ๐น๐‘‹1,...๐‘‹๐‘›(๐‘ฅ1, . . . , ๐‘ฅ๐‘›) โ‰ค min๐‘–=1,...๐‘› ๐น๐‘‹๐‘–

(๐‘ฅ๐‘–). For comonotonicrandom variables, however, the upper Frรฉchet bound is attained.

1Maurice Renรฉ Frรฉchet, 1878โ€“1973

Version: May 28, 2021

Page 84: Selected Topics from Portfolio Optimization

84 CO- AND ANTIMONOTONICITY

๐น (๐‘ฅ + ฮ”๐‘ฅ, ๐‘ฆ + ฮ”๐‘ฆ)

๐น (๐‘ฅ + ฮ”๐‘ฅ, ๐‘ฆ)

๐น (๐‘ฅ, ๐‘ฆ + ฮ”๐‘ฆ)

๐น (๐‘ฅ, ๐‘ฆ)

Figure 14.1: Probability of an area

Corollary 14.9. Let ๐‘Œ and ๐‘ be comonotonic. Then E๐‘Œ ยท E ๐‘ โ‰ค E๐‘Œ๐‘.

Proof. Integrate (14.3) with respect to ๐‘ƒ(d๐œ”) โŠ— ๐‘ƒ(d๐œ”โ€ฒ) and proceed as in Chebyshevsโ€™s suminequality, Theorem 14.1. ๏ฟฝ

Corollary 14.10. The covariance cov( ๏ฟฝ๏ฟฝ,๐‘Œ ) among all random variables with ๏ฟฝ๏ฟฝ โˆผ ๐‘‹ and ๐‘Œ โˆผ ๐‘Œis maximal, if ๏ฟฝ๏ฟฝ and ๐‘Œ are comonotonic.

14.3 INTEGRATION OF RANDOM VECTORS

For R-valued random variables we have

E ๐‘”(๐‘‹) =โˆซฮฉ

๐‘”(๐œ”)๐‘ƒ(d๐œ”) =โˆซ โˆž

โˆ’โˆž๐‘”(๐‘ฅ) d๐น๐‘‹ (๐‘ฅ) =

โˆซ โˆž

โˆ’โˆž๐‘”(๐‘ฅ) ๐‘“๐‘‹ (๐‘ฅ) d๐‘ฅ,

where the latter is only possible if the derivative (density) d๐น๐‘‹ (๐‘ฅ) = ๐‘“๐‘‹ (๐‘ฅ) d๐‘ฅ exists.How do these formulae generalize for higher dimensions?

E ๐‘”(๐‘‹,๐‘Œ ) =โˆซฮฉ

๐‘”(๐‘‹ (๐œ”), ๐‘Œ (๐œ”)

)๐‘ƒ(d๐œ”) =

โˆฌR2๐‘”(๐‘ฅ, ๐‘ฆ) d2๐น๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ) =

โˆฌR2๐‘”(๐‘ฅ, ๐‘ฆ) ๐‘“๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ) d๐‘ฅd๐‘ฆ.

(14.6)To this end observe that

๐‘ƒ(๐‘‹ โˆˆ [๐‘ฅ, ๐‘ฅ + ฮ”๐‘ฅ), ๐‘Œ โˆˆ [๐‘ฆ, ๐‘ฆ + ฮ”๐‘ฅ)

)(14.7)

= ๐น๐‘‹,๐‘Œ (๐‘ฅ + ฮ”๐‘ฅ, ๐‘ฆ + ฮ”๐‘ฆ) โˆ’ ๐น๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ + ฮ”๐‘ฆ) โˆ’ ๐น๐‘‹,๐‘Œ (๐‘ฅ + ฮ”๐‘ฅ, ๐‘ฆ) + ๐น๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ)

and it is thus evident what d2๐น๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ) in (14.6) has to stand for (cf. Figure 14.1).Generalizations to random vectors in R๐‘› are obvious, the general form for (14.7), however,

involves 2๐‘› evaluations of ๐น๐‘‹1,...๐‘‹๐‘›(๐‘ฅ1, . . . , ๐‘ฅ๐‘›).

14.4 COPULA

Definition 14.11. The copula function of a random vector (๐‘‹1, . . . , ๐‘‹๐‘›) is the cdf ๐ถ : [0, 1]๐‘› โ†’[0, 1] on [0, 1]๐‘š expressing the joint distribution function by all marginal distributions, i.e.,

๐น๐‘‹1,...,๐‘‹๐‘›(๐‘ฅ1, . . . , ๐‘ฅ๐‘›) = ๐ถ

(๐น๐‘‹1 (๐‘ฅ1), . . . ๐น๐‘‹๐‘›

(๐‘ฅ๐‘›)).

Remark 14.12. Note, that the copula in dimension 1 is trivial, as ๐ถ (๐‘ข) = ๐‘ข.

Remark 14.13 (Independence copula). The independence copula

๐ถ (๐‘ข1, . . . ๐‘ข๐‘›) = ๐‘ข1 ยท . . . ๐‘ข๐‘›

governs independent random variables ๐‘‹1, . . . ๐‘‹๐‘›.

rough draft: do not distribute

Page 85: Selected Topics from Portfolio Optimization

14.5 PROBLEMS 85

Lemma 14.14. Copulas functions can be assumed to be uniformly continuous; more pre-cisely, it holds that

๐ถ (๐‘ข1, . . . ๐‘ข๐‘›) โˆ’ ๐ถ (๐‘ฃ1, . . . ๐‘ฃ๐‘›) โ‰ค |๐‘ฃ1 โˆ’ ๐‘ข1 | + ยท ยท ยท + |๐‘ฃ๐‘› โˆ’ ๐‘ข๐‘› | .

Proof. Just observe that

๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ2, ๐‘Œ โ‰ค ๐‘ฆ2) โˆ’ ๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ1, ๐‘Œ โ‰ค ๐‘ฆ1) โ‰ค |๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ2, ๐‘Œ โ‰ค ๐‘ฆ2) โˆ’ ๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ1, ๐‘Œ โ‰ค ๐‘ฆ2) |+ |๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ1, ๐‘Œ โ‰ค ๐‘ฆ2) โˆ’ ๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ1, ๐‘Œ โ‰ค ๐‘ฆ1) |

โ‰ค |๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ2) โˆ’ ๐‘ƒ (๐‘‹ โ‰ค ๐‘ฅ1) | + |๐‘ƒ (๐‘Œ โ‰ค ๐‘ฆ2) โˆ’ ๐‘ƒ (๐‘Œ โ‰ค ๐‘ฆ1) |

and thus๐ถ (๐‘ข1, ๐‘ข2) โˆ’ ๐ถ (๐‘ฃ1, ๐‘ฃ2) โ‰ค |๐‘ข1 โˆ’ ๐‘ฃ1 | + |๐‘ข2 โˆ’ ๐‘ฃ2 | .

This generalizes to higher dimensions. ๏ฟฝ

Lemma 14.15. It holds that

E ๐‘”(๐‘‹1, . . . ๐‘‹๐‘›) =โˆซ 1

0ยท ยท ยท

โˆซ 1

0๐‘”

(๐นโˆ’1๐‘‹1 (๐‘ข1), . . . ๐น

โˆ’1๐‘‹1(๐‘ข1)

)d๐‘›๐ถ (๐‘ข1, . . . ๐‘ข๐‘›).

Proof. By (14.6) and substituting the marginals ๐‘ฅ๐‘– โ† ๐นโˆ’1๐‘‹๐‘–(๐‘ข๐‘–) we have

E ๐‘”(๐‘‹1, . . . ๐‘‹๐‘›) =โˆซ โˆž

โˆ’โˆžยท ยท ยท

โˆซ โˆž

โˆ’โˆž๐‘”(๐‘ฅ1, . . . ๐‘ฅ๐‘›) d๐‘›๐น๐‘‹1,...๐‘‹๐‘›

(๐‘ฅ1, . . . ๐‘ฅ๐‘›)

=

โˆซ 1

0ยท ยท ยท

โˆซ 1

0๐‘”

(๐นโˆ’1๐‘‹1 (๐‘ข1), . . . ๐น

โˆ’1๐‘‹1(๐‘ข1)

)d๐‘›๐น๐‘‹1,...,๐‘‹๐‘›

(๐นโˆ’1๐‘‹1 (๐‘ข1), . . . ๐น

โˆ’1๐‘‹1(๐‘ข1)

)=

โˆซ 1

0ยท ยท ยท

โˆซ 1

0๐‘”

(๐นโˆ’1๐‘‹1 (๐‘ข1), . . . ๐น

โˆ’1๐‘‹1(๐‘ข1)

)d๐‘›๐ถ (๐‘ข1, . . . ๐‘ข1) .

๏ฟฝ

Example 14.16. The copula for comonotonic random variables is๐ถ (๐‘ข1, . . . , ๐‘ข๐‘›) = min๐‘–=1,...๐‘› ๐‘ข๐‘– .

Lemma 14.17. For comonotonic random variables ๐‘‹๐‘–, ๐‘– = 1, . . . , ๐‘›, we have

E ๐‘”(๐‘‹1, . . . , ๐‘‹๐‘›) =โˆซ 1

0๐‘”

(๐นโˆ’1๐‘‹1 (๐‘ข), . . . ๐น

โˆ’1๐‘‹๐‘›(๐‘ข)

)d๐‘ข.

14.5 PROBLEMS

Exercise 14.1 (McShaneโ€™s Lemma on Lipschitz extensions). Let (๐‘, ๐‘‘) be a set equippedwith a metric and let ๐‘“ : ๐‘ˆ โ†’ R be Lipschitz with Lipschitz constant ๐ฟ, where ๐‘ˆ โŠ‚ ๐‘. Then๐‘“ (๐‘ง) B inf๐‘ขโˆˆ๐‘ˆ ๐‘“ (๐‘ข) + ๐ฟ๐‘‘ (๐‘ข, ๐‘ง) is well-defined for ๐‘ง โˆˆ ๐‘, ๐‘“ (๐‘ข) = ๐‘“ (๐‘ข) for ๐‘ข โˆˆ ๐‘ˆ and ๐‘“ : ๐‘ โ†’ R

has Lipschitz constant ๐ฟ.Kirszbraunโ€™s theorem provides an extension for vector-valued functions, although the as-

sertion for general Lipschitz functions is false.

Version: May 28, 2021

Page 86: Selected Topics from Portfolio Optimization

86 CO- AND ANTIMONOTONICITY

rough draft: do not distribute

Page 87: Selected Topics from Portfolio Optimization

15Convexity

Some parts follow a lecture by Mete Soner, but the content can be found in many elementarytextbooks on convex analysis, for example in Bot et al. [2009].

In what follows ๐‘‹ is a real topological vector space.

15.1 PROPERTIES OF CONVEX FUNCTIONS

We consider functions to the extended reals, ๐‘“ : ๐‘‹ โ†’ R โˆช {ยฑโˆž}.

Definition 15.1. The domain of ๐‘“ is dom ๐‘“ B { ๐‘“ < โˆž}. ๐‘“ is proper, if its domain is not empty.

Definition 15.2. We shall say that ๐‘“ : ๐‘‹ โ†’ R โˆช {ยฑโˆž} is lower semicontinuous (lsc.) if thesets { ๐‘“ > _} are open for all _ โˆˆ R. The sets { ๐‘“ โ‰ค _} are called lower levelsets, sublevel setsor trenches.

Lemma 15.3. The following are equivalent.

(i) ๐‘“ is lsc. at ๐‘ฅ0 โˆˆ ๐‘‹;

(ii) for every Y > 0 there exists a neighborhood so that ๐‘“ (๐‘ฅ) > ๐‘“ (๐‘ฅ0) โˆ’ Y for every ๐‘ฅ โˆˆ ๐‘ˆ;

(iii) then lim inf๐‘ฅโ†’๐‘ฅ0 ๐‘“ (๐‘ฅ) โ‰ฅ ๐‘“ (๐‘ฅ0).

Lemma 15.4. The following are equivalent.

(i) ๐‘“ is lsc.;

(ii) the epigraph epi ๐‘“ B {(๐‘ฅ, ๐›ผ) โˆˆ ๐‘‹ ร— R : ๐›ผ โ‰ฅ ๐‘“ (๐‘ฅ)} is closed in ๐‘‹ ร— R;

(iii) the level sets { ๐‘“ โ‰ค _} are closed for all _ โˆˆ R.

Example 15.5. ๐›ฟ๐ด B

{0 ๐‘ฅ โˆˆ ๐ด+โˆž else

is lsc., iff ๐ด is closed.

Definition 15.6. We shall call ๐‘“ : ๐‘‹ โ†’ R

โŠฒ convex, if ๐‘“((1 โˆ’ _)๐‘ฅ + _๐‘ฆ

)โ‰ค (1 โˆ’ _) ๐‘“ (๐‘ฅ) + _ ๐‘“ (๐‘ฆ) for _ โˆˆ [0, 1],

โŠฒ concave, if ๐‘“((1 โˆ’ _)๐‘ฅ + _๐‘ฆ

)โ‰ฅ (1 โˆ’ _) ๐‘“ (๐‘ฅ) + _ ๐‘“ (๐‘ฆ) for _ โˆˆ [0, 1], and

โŠฒ affine, if ๐‘“((1 โˆ’ _)๐‘ฅ + _๐‘ฆ

)= (1 โˆ’ _) ๐‘“ (๐‘ฅ) + _ ๐‘“ (๐‘ฆ) for all _ โˆˆ R.

Remark 15.7. Note, that ๐‘“ is affine iff ๐‘“ (๐‘ฅ) = ๐‘‘ + ๐‘ฅโˆ—(๐‘ฅ) for some linear ๐‘ฅโˆ—. Indeed, for ๐‘“ affinewrite ๐‘“ (๐‘ฅ) = ๐‘Ž(0) + ๐‘“ (๐‘ฅ) โˆ’ ๐‘Ž(0) and show that ๐‘“ (๐‘ฅ) โˆ’ ๐‘Ž(0) is linear.

87

Page 88: Selected Topics from Portfolio Optimization

88 CONVEXITY

Lemma 15.8. If ๐‘“ is lsc. and convex, then

๐‘“ (๐‘ฅ) = sup๐‘Ž ( ยท) โ‰ค ๐‘“ ( ยท)

๐‘Ž(๐‘ฅ) for all ๐‘ฅ โˆˆ ๐‘‹,

where ๐‘Ž is affine and ๐‘Ž(ยท) โ‰ค ๐‘“ (ยท) iff ๐‘Ž(๐‘ฅ) โ‰ค ๐‘“ (๐‘ฅ) for all ๐‘ฅ โˆˆ ๐‘‹.

Proof. Note first that sup๐‘Ž ( ยท) โ‰ค ๐‘“ ( ยท) ๐‘Ž(๐‘ฅ) is convex and lsc.Conversely, consider

๐‘€ B {(๐‘ฅโˆ—, ๐›ผ) โˆˆ ๐‘‹โˆ— ร— R : ๐‘ฅโˆ—(๐‘ฅ) + ๐›ผ โ‰ค ๐‘“ (๐‘ฅ) for all ๐‘ฅ โˆˆ ๐‘‹} .

We show that ๐‘€ is not empty.If ๐‘“ โ‰ก +โˆž, the always (๐‘ฅโˆ—, ๐›ผ) โˆˆ ๐‘€, hence ๐‘€ โ‰  โˆ….Otherwise, there is ๐‘ฆ โˆˆ ๐‘‹ so that ๐‘“ (๐‘ฆ) โˆˆ R. Then epi ๐‘“ โ‰  โˆ… and (๐‘ฆ, ๐‘“ (๐‘ฆ) โˆ’ 1) โˆ‰ epi ๐‘“ . As

๐‘“ is lsc., it follows from Lemma 15.4 that epi ๐‘“ is closed and convex. By the Hahnโ€“Banachtheorem there exists (๐‘ฅโˆ—, ๐›ผ) โˆˆ ๐‘‹โˆ— ร— R so that

๐‘ฅโˆ—(๐‘ฆ) + ๐›ผ( ๐‘“ (๐‘ฆ) โˆ’ 1) < ๐‘ฅโˆ—(๐‘ฅ) + ๐›ผ๐‘Ÿ for all (๐‘ฅ, ๐‘Ÿ) โˆˆ epi ๐‘“ . (15.1)

As (๐‘ฆ, ๐‘“ (๐‘ฆ)) โˆˆ epi ๐‘“ it follows that ๐›ผ > 0 and by rescaling (๐‘ฅโˆ—, ๐‘), we may assume that ๐›ผ = 1and we get ๐‘ฅโˆ—(๐‘ฆ โˆ’ ๐‘ฅ) + ๐‘“ (๐‘ฆ) โˆ’ 1 < ๐‘Ÿ for all (๐‘ฅ, ๐‘Ÿ) โˆˆ epi ๐‘“ . For ๐‘ฅ โˆˆ dom ๐‘“ we have that(๐‘ฅ, ๐‘“ (๐‘ฅ)) โˆˆ epi ๐‘“ , and thus ๐‘ฅโˆ—(๐‘ฆ โˆ’ ๐‘ฅ) + ๐‘“ (๐‘ฆ) โˆ’ 1 < ๐‘“ (๐‘ฅ), which actually holds for all ๐‘ฅ โˆˆ ๐‘‹.Consequently, the function ๐‘ฅ โ†ฆโ†’ โˆ’๐‘ฅโˆ—(๐‘ฅ) + ๐‘ฅโˆ—(๐‘ฆ) + ๐‘“ (๐‘ฆ) โˆ’ 1 is a minorant and ๐‘€ โ‰  โˆ….

We thus have๐‘“ (๐‘ฅ) โ‰ฅ sup {๐‘ฅโˆ—(๐‘ฅ) + ๐›ผ : (๐‘ฅโˆ—, ๐›ผ) โˆˆ ๐‘€} ,

it remains to be shown that equality holds.Assume there were ๐‘ฅ โˆˆ ๐‘‹ and ๐‘Ÿ โˆˆ R such that

๐‘“ (๐‘ฅ) > ๐‘Ÿ > sup {๐‘ฅโˆ—(๐‘ฅ) + ๐›ผ : (๐‘ฅโˆ—, ๐›ผ) โˆˆ ๐‘€} . (15.2)

Then (๐‘ฅ, ๐‘Ÿ) โˆ‰ epi ๐‘“ . Again, by Hahnโ€“Banach theorem, there are (๐‘ฅโˆ—, ๏ฟฝ๏ฟฝ) โˆˆ ๐‘‹โˆ— ร— R and Y > 0such that

๐‘ฅโˆ—(๐‘ฅ) + ๏ฟฝ๏ฟฝ๐‘Ÿ > ๐‘ฅโˆ—(๐‘ฅ) + ๏ฟฝ๏ฟฝ๐‘Ÿ + Y for all (๐‘ฅ, ๐‘Ÿ) โˆˆ epi ๐‘“ . (15.3)

It follows that ๏ฟฝ๏ฟฝ โ‰ฅ 0, as (๐‘ฅ, ๐‘Ÿ โ€ฒ) โˆˆ epi ๐‘“ for every ๐‘Ÿ โ€ฒ โ‰ฅ ๐‘Ÿ.Assume that ๐‘“ (๐‘ฅ) โˆˆ R. Then ๏ฟฝ๏ฟฝ (๐‘Ÿ โˆ’ ๐‘Ÿ) > Y by (15.3), thus ๏ฟฝ๏ฟฝ > 0. It follows from (15.3) that

๐‘“ (๐‘ฅ) > 1๏ฟฝ๏ฟฝ๐‘ฅโˆ—(๐‘ฅ โˆ’ ๐‘ฅ) + ๐‘Ÿ + Y

๏ฟฝ๏ฟฝ. (15.4)

Hence, ๐‘ฅ โ†ฆโ†’ 1๏ฟฝ๏ฟฝ๐‘ฅโˆ—(๐‘ฅ โˆ’ ๐‘ฅ) + ๐‘Ÿ + Y

๏ฟฝ๏ฟฝis a minorant of ๐‘“ (ยท) which evaluates to ๐‘Ÿ + Y

๏ฟฝ๏ฟฝat ๐‘ฅ, so that we

get from (15.2) that ๐‘“ (๐‘ฅ) > ๐‘Ÿ > ๐‘Ÿ + Y๏ฟฝ๏ฟฝ

, which is a contradiction.Hence, ๐‘“ (๐‘ฅ) = โˆž, i.e., ๐‘ฅ โˆ‰ dom ๐‘‹. As the domain of ๐‘‹ is convex, we may separate ๐‘ฅ from

dom ๐‘“ , i.e., there is ๐‘ฅโˆ— so that ๐‘ฅโˆ—(๐‘ฅ โˆ’ ๐‘ฅ) > Y > 0 (i.e., we may choose ๏ฟฝ๏ฟฝ = 0 in (15.3)).Consider the function ๐‘งโˆ—(๐‘ฅ) B โˆ’๐‘ฅโˆ—(๐‘ฅ โˆ’ ๐‘ฅ) + Y. By (15.3) we get that ๐‘งโˆ—(๐‘ฅ) โ‰ค 0 for every

๐‘ฅ โˆˆ dom ๐‘“ . As ๐‘€ โ‰  โˆ…, there are ๐‘ฆโˆ— โˆˆ ๐‘‹โˆ— and ๐›ฝ โˆˆ R such that ๐‘ฆโˆ—(๐‘ฅ) + ๐›ฝ โ‰ค ๐‘“ (๐‘ฅ) for all ๐‘ฅ โˆˆ ๐‘‹. Itfollows from (15.2) that ๐‘Ÿ > ๐‘ฆโˆ—(๐‘ฅ) + ๐›ฝ, so ๐›พ B 1

Y(๐‘Ÿ โˆ’ ๐‘ฆโˆ—(๐‘ฅ) โˆ’ ๐›ฝ) > 0.

The function ๐‘Ž(๐‘ฅ) B ๐‘ฆโˆ—(๐‘ฅ) โˆ’ ๐›พ๐‘ฅโˆ—(๐‘ฅ) + ๐›พ๐‘ฅโˆ—(๐‘ฅ) + ๐›ฝ + ๐›พY is affine. For ๐‘ฅ โˆˆ dom ๐‘“ we have that

๐‘Ž(๐‘ฅ) = ๐‘ฆโˆ—(๐‘ฅ) + ๐›ฝ + ๐›พยฉยญยญยญยซโˆ’๐‘ฅ

โˆ—(๐‘ฅ โˆ’ ๐‘ฅ) + Y๏ธธ ๏ธท๏ธท ๏ธธ๐‘งโˆ— (๐‘ฅ)

ยชยฎยฎยฎยฌ โ‰ค ๐‘ฆโˆ—(๐‘ฅ) + ๐›ฝ โ‰ค ๐‘“ (๐‘ฅ). Hence ๐‘Ž(ยท) is an affine minorant of ๐‘“

and for ๐‘ฅ = ๐‘ฅ one gets ๐‘Ž(๐‘ฅ) B ๐‘ฆโˆ—(๐‘ฅ) โˆ’ ๐›พ๐‘ฅโˆ—(๐‘ฅ) + ๐›พ๐‘ฅโˆ—(๐‘ฅ) + ๐›ฝ + ๐›พY = ๐‘Ÿ, which contradicts (15.4).Hence, ๐‘“ is the pointwise supremum of affine functions. ๏ฟฝ

rough draft: do not distribute

Page 89: Selected Topics from Portfolio Optimization

15.2 DUALITY 89

15.2 DUALITY

Definition 15.9 (Legendreโ€“Fenchel1 transformation). The convex conjugate function is

๐‘“ โˆ— : ๐‘‹โˆ— โ†’ R โˆช {โˆž}๐‘“ โˆ— (๐‘ฅโˆ—) B sup

๐‘ฅโˆˆ๐‘‹๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘“ (๐‘ฅ) (15.5)

and the bi-conjugate is

๐‘“ โˆ—โˆ— : ๐‘‹ โ†’ R โˆช {โˆž}๐‘“ โˆ—โˆ—(๐‘ฅ) B sup

๐‘ฅโˆ—โˆˆ๐‘‹โˆ—๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘“ โˆ—(๐‘ฅโˆ—).

Remark 15.10 (Fenchelโ€™s inequality, or Fenchelโ€“Young2 inequality). By (15.5) it holds that

๐‘ฅโˆ—(๐‘ฅ) โ‰ค ๐‘“ (๐‘ฅ) + ๐‘“ โˆ—(๐‘ฅโˆ—) (15.6)

for all ๐‘ฅ โˆˆ ๐‘‹ and ๐‘ฅโˆ— โˆˆ ๐‘‹โˆ—.

Lemma 15.11. We have that ๐‘“ โ‰ค ๐‘” implies that ๐‘“ โˆ— โ‰ฅ ๐‘”โˆ—.

Proof. Cf. Exercise 15.1. ๏ฟฝ

Lemma 15.12. If ๐‘Ž(๐‘ฅ) = ๐‘‘ + ๐‘ฆโˆ—(๐‘ฅ) is affine linear, then ๐‘Žโˆ—(๐‘ฅโˆ—) ={โˆ’๐‘‘ if ๐‘ฅโˆ— = ๐‘ฆโˆ—

+โˆž else.and ๐‘Žโˆ—โˆ—(๐‘ฅ) =

๐‘Ž(๐‘ฅ).

Proof. Observe that

๐‘Žโˆ—(๐‘ฅโˆ—) = sup๐‘ฅโˆˆ๐‘‹

๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘‘ โˆ’ ๐‘ฆโˆ—(๐‘ฅ) ={โˆ’๐‘‘ if ๐‘ฅโˆ— = ๐‘ฆโˆ—

+โˆž else.

Further,

๐‘Žโˆ—โˆ—(๐‘ฅ) = sup๐‘ฅโˆ—โˆˆ๐‘‹โˆ—

๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘Žโˆ—(๐‘ฅโˆ—) = sup{๐‘ฆโˆ—(๐‘ฅ) + ๐‘‘๏ธธ ๏ธท๏ธท ๏ธธ

๐‘ฅโˆ—=๐‘ฆโˆ—

, ๐‘ฅโˆ—(๐‘ฅ) โˆ’ โˆž๏ธธ ๏ธท๏ธท ๏ธธ๐‘ฅโˆ—โ‰ ๐‘ฆโˆ—

}= ๐‘ฆโˆ—(๐‘ฅ) + ๐‘‘ = ๐‘Ž(๐‘ฅ)

for every ๐‘ฅ โˆˆ ๐‘‹, i.e., ๐‘Ž = ๐‘Žโˆ—โˆ—. ๏ฟฝ

Example 15.13. On ๐‘‹ = R let ๐‘“ (๐‘ฅ) B 1๐‘|๐‘ฅ |๐‘, then ๐‘“ โˆ—(๐‘ฆ) = 1

๐‘ž|๐‘ฆ |๐‘ž, where 1

๐‘+ 1

๐‘ž= 1.

Indeed, the maximum is attained at 0 = dd๐‘ฅ ๐‘ฅ๐‘ฆ โˆ’

1๐‘|๐‘ฅ |๐‘ = ๐‘ฆ โˆ’ ๐‘ฅ๐‘โˆ’1, so ๐‘ฅโˆ— = ๐‘ฆ

1๐‘โˆ’1 and thus

๐‘“ โˆ—(๐‘ฆ) = ๐‘ฅโˆ—๐‘ฆ โˆ’ 1๐‘|๐‘ฅโˆ— |๐‘ = ๐‘ฆ

1๐‘โˆ’1 ๐‘ฆ โˆ’ 1

๐‘๐‘ฆ

๐‘

๐‘โˆ’1 =๐‘ โˆ’ 1๐‘

๐‘ฆ๐‘

๐‘โˆ’1 =1๐‘ž๐‘ฆ๐‘ž .

Remark 15.14. ๐‘“ โˆ— and ๐‘“ โˆ—โˆ— are lsc.

1Werner Fenchel, 1905โ€“19882William Henry Young, 1863โ€“1942

Version: May 28, 2021

Page 90: Selected Topics from Portfolio Optimization

90 CONVEXITY

Theorem 15.15 (Fenchelโ€“Moreau Theorem, Rockafellar). Let ๐‘‹ be a Banach space. Let๐‘“ : ๐‘‹ โ†’ R be a proper, extended real valued lsc. and convex, function. Then ๐‘“ = ๐‘“ โˆ—โˆ—, where

๐‘“ โˆ—โˆ—(๐‘ฅ) B sup๐‘ฅโˆ—โˆˆ๐‘‹โˆ—

๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘“ โˆ—(๐‘ฅโˆ—).

Proof. By Fenchelโ€™s inequality (15.6) we have that that ๐‘“ (๐‘ฅ) โ‰ฅ ๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘“ โˆ—(๐‘ฅโˆ—), and thus

๐‘“ (๐‘ฅ) โ‰ฅ sup๐‘ฅโˆ—โˆˆ๐‘‹โˆ—

๐‘ฅโˆ—(๐‘ฅ) โˆ’ ๐‘“ โˆ—(๐‘ฅโˆ—) = ๐‘“ โˆ—โˆ—(๐‘ฅ),

i.e., ๐‘“ โ‰ฅ ๐‘“ โˆ—โˆ—.Let ๐‘Ž be affine so that ๐‘Ž โ‰ค ๐‘“ . Then ๐‘Žโˆ— โ‰ฅ ๐‘“ โˆ— and ๐‘Žโˆ—โˆ— โ‰ค ๐‘“ โˆ—โˆ—. Now by Lemma 15.12 we have

that ๐‘Ž = ๐‘Žโˆ—โˆ—, hence๐‘“ (๐‘ฅ) = sup

๐‘Žโ‰ค ๐‘“๐‘Ž(๐‘ฅ) โ‰ค sup

๐‘Žโ‰ค ๐‘“ โˆ—โˆ—๐‘Ž(๐‘ฅ) = ๐‘“ โˆ—โˆ—(๐‘ฅ),

which is the converse inequality. ๏ฟฝ

Corollary 15.16 (The bipolar theorem). The polar cone is

๐ถโ—ฆ B {๐‘ฆ โˆˆ ๐‘‹โˆ— : ๐‘ฆ(๐‘) โ‰ค 0 for all ๐‘ โˆˆ ๐ถ} .

Let ๐ถ be a cone, then ๐ถโ—ฆโ—ฆ B (๐ถโ—ฆ)โ—ฆ = conv {_๐‘ : _ โ‰ฅ 0, ๐‘ โˆˆ ๐ถ}.

Proof. Consider the indicator function ๐‘“ (๐‘) B ๐›ฟ๐ถ (๐‘) B{0 if ๐‘ โˆˆ ๐ถ,+โˆž else.

Then ๐›ฟโˆ—๐ถ(๐‘ฆ) = sup๐‘โˆˆ๐ถ ๐‘ฆ(๐‘)

and ๐›ฟโˆ—โˆ—๐ถ(๐‘) = ๐›ฟ๐ถโ—ฆโ—ฆ (๐‘) iff ๐ถ = ๐ถโ—ฆโ—ฆ. ๏ฟฝ

Corollary 15.17 (Youngโ€™s inequality). For ๐‘”(ยท) strictly increasing it holds that

๐‘ฅ๐‘ฆ โ‰คโˆซ ๐‘ฅ

0๐‘”(๐‘ข) d๐‘ข +

โˆซ ๐‘ฆ

0๐‘”โˆ’1(๐‘ฃ) d๐‘ฃ,

where ๐‘ฅ > 0 and ๐‘ฆ โˆˆ [0, ๐‘”(๐‘ฅ)]; particularlyโˆซ ๐‘ฆ

0๐‘”โˆ’1(๐‘ฃ) d๐‘ฃ = sup

๐‘ฅ

{๐‘ฅ๐‘ฆ โˆ’

โˆซ ๐‘ฅ

0๐‘”(๐‘ข) d๐‘ข

}and

โˆซ ๐‘ฅ

0๐‘”(๐‘ข) d๐‘ข = sup

๐‘ฆ

{๐‘ฅ๐‘ฆ โˆ’

โˆซ ๐‘ฆ

0๐‘”โˆ’1(๐‘ฃ) d๐‘ฃ

}. (15.7)

Proof. Set ๐‘“ (๐‘ฅ) Bโˆซ ๐‘ฅ

0 ๐‘”(๐‘ข) d๐‘ข, then 0 = dd๐‘ฅ ๐‘ฅ๐‘ฆ โˆ’

โˆซ ๐‘ฅ

0 ๐‘”(๐‘ข) d๐‘ข = ๐‘ฆ โˆ’ ๐‘”(๐‘ฅ), i.e., ๐‘ฅโˆ— = ๐‘”โˆ’1(๐‘ฆ). Hence

๐‘“ โˆ—(๐‘ฆ) = ๐‘ฆ๐‘”โˆ’1(๐‘ฆ) โˆ’โˆซ ๐‘”โˆ’1 (๐‘ฆ)0 ๐‘”(๐‘ข) d๐‘ข =

โˆซ ๐‘ฆ

0 ๐‘”โˆ’1(๐‘ฃ) d๐‘ฃ, and hence the assertion. ๏ฟฝ

15.3 PROBLEMS

Exercise 15.1. Verify Lemma 15.11.

Exercise 15.2. For a family ๐‘“ ] it holds that (inf ] ๐‘“ ])โˆ— (๐‘ฅโˆ—) = sup ] ๐‘“ โˆ—] (๐‘ฅโˆ—), but(sup ] ๐‘“ ]

)โˆ— (๐‘ฅโˆ—) โ‰คinf ] ๐‘“ โˆ—] (๐‘ฅโˆ—).

Exercise 15.3. Show that ((1 โˆ’ _) ๐‘“0 + _ ๐‘“1)โˆ— โ‰ค (1 โˆ’ _) ๐‘“ โˆ—0 + _ ๐‘“โˆ—1 for _ โˆˆ [0, 1].

Exercise 15.4. Set ๐‘”(๐‘ฅ) B ๐›ผ + ๐›ฝ ยท ๐‘ฅ + ๐›พ ๐‘“ (_๐‘ฅ + ๐›ฟ), then ๐‘”โˆ—(๐‘ฅโˆ—) = โˆ’๐›ผโˆ’ ๐›ฟ ๐‘ฅโˆ—โˆ’๐›ฝ_+ ๐›พ ๐‘“ โˆ—

(๐‘ฅโˆ—โˆ’๐›ฝ_๐›พ

), where

_ โ‰  0 and ๐›พ >0.

rough draft: do not distribute

Page 91: Selected Topics from Portfolio Optimization

15.3 PROBLEMS 91

Exercise 15.5 (Infimal convolution). Define the infimal convolution

( ๐‘“๏ฟฝ๐‘”) (๐‘ฅ) B inf { ๐‘“ (๐‘ฅ โˆ’ ๐‘ฆ) + ๐‘”(๐‘ฆ) : ๐‘ฆ โˆˆ R๐‘›}

and more generally, ( ๐‘“1๏ฟฝ . . .๏ฟฝ ๐‘“๐‘š) (๐‘ฅ) B inf{โˆ‘๐‘š

๐‘–=1 ๐‘“๐‘– (๐‘ฅ๐‘–) :โˆ‘๐‘š

๐‘–=1 = ๐‘ฅ}. Then, for ๐‘“๐‘– proper, con-

vex and lsc., ( ๐‘“1๏ฟฝ . . .๏ฟฝ ๐‘“๐‘š)โˆ— = ๐‘“ โˆ—1 + ยท ยท ยท + ๐‘“โˆ—๐‘š.

Exercise 15.6. Show that the conjugate of ๐‘“ (๐‘ฅ) = ๐‘’๐‘ฅ is ๐‘“ โˆ—(๐‘ฅโˆ—) =

๐‘ฅโˆ— log ๐‘ฅโˆ— โˆ’ ๐‘ฅโˆ— if ๐‘ฅโˆ— > 00 if ๐‘ฅโˆ— = 0+โˆž if ๐‘ฅโˆ— < 0

.

Version: May 28, 2021

Page 92: Selected Topics from Portfolio Optimization

92 CONVEXITY

rough draft: do not distribute

Page 93: Selected Topics from Portfolio Optimization

16Sample Average Approximation (SAA)

This chapter is based on Shapiro et al. [2014]

16.1 SAA

Let ๐‘‹ โŠ‚ R๐‘› be closed and ๐‘‹ โ‰  โˆ…. Consider the problem ๐œ—โˆ— B min๐‘ฅโˆˆ๐‘‹ E ๐น (๐‘ฅ, b)๏ธธ ๏ธท๏ธท ๏ธธ=: ๐‘“ (๐‘ฅ)

which we

compare with ๏ฟฝ๏ฟฝ๐‘ B min๐‘ฅโˆˆ๐‘‹1๐‘

๐‘โˆ‘๐‘–=1

๐น (๐‘ฅ, b ๐‘—)๏ธธ ๏ธท๏ธท ๏ธธ=: ๐‘“๐‘ (๐‘ฅ)

for the empirical measure ๐‘ƒ๐‘ = 1๐‘

โˆ‘๐‘๐‘–=1 ๐›ฟb๐‘– and iid

samples b๐‘–.

16.1.1 Pointwise LLN

Suppose that E ๐น (๐‘ฅ, b) < โˆž.

Lemma 16.1. The following hold true:

(i) E ๐‘“๐‘ (๐‘ฅ) = ๐‘“ (๐‘ฅ), i.e., ๐‘“๐‘ (๐‘ฅ) is an unbiased estimator for ๐‘“ (๐‘ฅ);

(ii) (LLN) For every ๐‘ฅ โˆˆ ๐‘‹ it holds that ๐‘“๐‘ (๐‘ฅ) โ†’ ๐‘“ (๐‘ฅ), as ๐‘ โ†’โˆž with probability 1.

Proposition 16.2. The estimator ๏ฟฝ๏ฟฝ๐‘ is not necessarily consistent, it holds in general that

lim sup๐‘โ†’โˆž

๏ฟฝ๏ฟฝ๐‘ โ‰ค ๐œ—โˆ—.

Proof. We have that ๏ฟฝ๏ฟฝ๐‘ โ‰ค ๐‘“๐‘ (๐‘ฅ) for every ๐‘ฅ โˆˆ ๐‘‹, thus

lim sup๐‘โ†’โˆž

๏ฟฝ๏ฟฝ๐‘ โ‰ค lim๐‘โ†’โˆž

๐‘“๐‘ (๐‘ฅ) = ๐‘“ (๐‘ฅ)

by the Law of Large Numbers (ii). Thus

lim sup๐‘โ†’โˆž

๏ฟฝ๏ฟฝ๐‘ โ‰ค inf ๐‘“ (๐‘ฅ) = ๐œ—โˆ—.

๏ฟฝ

Proposition 16.3. The estimator ๏ฟฝ๏ฟฝ๐‘ is downside biased, it holds that E ๏ฟฝ๏ฟฝ๐‘+1 โ‰ค E ๏ฟฝ๏ฟฝ๐‘ โ‰ค ๐œ—โˆ—.Proof. It holds that

E ๏ฟฝ๏ฟฝ๐‘ = E ๐‘“๐‘ (๐‘ฅ) = Emin๐‘ฅโˆˆ๐‘‹

1๐‘

๐‘โˆ‘๐‘—=1

๐น (๐‘ฅ, b ๐‘—) โ‰ค min๐‘ฅโˆˆ๐‘‹

E1๐‘

๐‘โˆ‘๐‘—=1

๐น (๐‘ฅ, b ๐‘—)

= min๐‘ฅโˆˆ๐‘‹

1๐‘

๐‘โˆ‘๐‘—=1E ๐น (๐‘ฅ, b ๐‘—) = min

๐‘ฅโˆˆ๐‘‹๐‘“ (๐‘ฅ) = ๐œ—โˆ—.

93

Page 94: Selected Topics from Portfolio Optimization

94 SAMPLE AVERAGE APPROXIMATION (SAA)

Further, note that ๐‘“๐‘+1(๐‘ฅ) = 1๐‘+1

โˆ‘๐‘+1๐‘–=1 ๐น (๐‘ฅ, b๐‘–) = 1

๐‘+1โˆ‘๐‘+1

๐‘–=11๐‘

โˆ‘๐‘—โ‰ ๐‘– ๐น (๐‘ฅ, b ๐‘—), thus

E ๏ฟฝ๏ฟฝ๐‘+1 = Emin๐‘ฅโˆˆ๐‘‹

๐‘“๐‘+1(๐‘ฅ) = Emin๐‘ฅโˆˆ๐‘‹

1๐‘ + 1

๐‘+1โˆ‘๐‘–=1

1๐‘

โˆ‘๐‘—โ‰ ๐‘–

๐น (๐‘ฅ, b ๐‘—)

โ‰ฅ E 1๐‘ + 1

๐‘+1โˆ‘๐‘–=1min๐‘ฅโˆˆ๐‘‹

1๐‘

โˆ‘๐‘—โ‰ ๐‘–

๐น (๐‘ฅ, b ๐‘—)๏ธธ ๏ธท๏ธท ๏ธธ๏ฟฝ๏ฟฝ๐‘

= E1

๐‘ + 1

๐‘+1โˆ‘๐‘–=1

๏ฟฝ๏ฟฝ๐‘ = E ๏ฟฝ๏ฟฝ๐‘ ,

thus the assertion. ๏ฟฝ

16.1.2 Pointwise and Functional CLT

Suppose here that

(i) ๐œŽ(๐‘ฅ)2 B var ๐น (๐‘ฅ,ฮž) < โˆž and

(ii) |๐น (๐‘ฅ, b) โˆ’ ๐น (๐‘ฅ โ€ฒ, b) | โ‰ค ๐ถ (b) โ€–๐‘ฅ โˆ’ ๐‘ฅ โ€ฒโ€– a.s. and E๐ถ (b)2 < โˆž.

Lemma 16.4. ๐‘“ (ยท) is Lipschitz with constant E๐ถ (b).

Proof. It follows from (ii) by taking expectations that ๐‘“ (๐‘ฅ) โˆ’ ๐‘“ (๐‘ฅ โ€ฒ) = E ๐น (๐‘ฅ, b) โˆ’ E ๐น (๐‘ฅ โ€ฒ, b) โ‰คE๐ถ โ€–๐‘ฅ โˆ’ ๐‘ฅ โ€ฒโ€–, from which the assertion follows. ๏ฟฝ

We have that โˆš๐‘ ( ๐‘“๐‘ (๐‘ฅ) โˆ’ ๐‘“ (๐‘ฅ))

Dโˆ’โˆ’โ†’ ๐‘Œ (๐‘ฅ) โˆผ N(0, ๐œŽ(๐‘ฅ)2

).

More generally,

โˆš๐‘ ( ๐‘“๐‘ (๐‘ฅ1) โˆ’ ๐‘“ (๐‘ฅ1), . . . ๐‘“๐‘ (๐‘ฅ๐‘›) โˆ’ ๐‘“ (๐‘ฅ๐‘›))

Dโˆ’โˆ’โ†’ ๐‘Œ (๐‘ฅ) โˆผ N (0, ฮฃ) ,

where ฮฃ = cov (๐น (๐‘ฅ๐‘– , b), ๐น (๐‘ฅ๐‘– , b))๐‘›๐‘–, ๐‘—=1.In a functional way,

โˆš๐‘ ( ๐‘“๐‘ (ยท) โˆ’ ๐‘“ (ยท))

Dโˆ’โˆ’โ†’ ๐‘Œ : ฮฉโ†’ ๐ถ (๐‘‹),

where ๐‘Œ : ฮฉโ†’ ๐ถ (๐‘‹) is called a random element in ๐ถ (๐‘‹).

Theorem 16.5. If (i) and (ii), then

(i) ๏ฟฝ๏ฟฝ๐‘ = inf๐‘ฅ ๐‘“๐‘ (๐‘ฅ) + ๐‘œ(๐‘โˆ’1/2

)and

(ii) ๐‘1/2(๏ฟฝ๏ฟฝ๐‘ โˆ’ ๐œ—โˆ—

) Dโˆ’โˆ’โ†’ inf๐‘ โˆˆ๐‘† ๐‘Œ (๐‘ ), where ๐‘† = argmin๐‘ฅโˆˆ๐‘‹ ๐‘“ (๐‘ฅ) โŠ‚ ๐‘‹.

Proof. The proof uses the ฮ”-method described in Section 16.2 below for finite dimensions.๏ฟฝ

Remark 16.6. We obtain from (ii) that ๏ฟฝ๏ฟฝ๐‘ = ๐œ—โˆ— + ๐‘โˆ’1/2 inf๐‘ โˆˆ๐‘† ๐‘Œ (๐‘ ) + ๐‘œ(๐‘โˆ’1/2

). For ๐‘  = {๐‘ฅโˆ—} we

have that inf๐‘ โˆˆ๐‘† ๐‘Œ (๐‘ ) = ๐‘Œ (๐‘ฅโˆ—) โˆผ ๐‘(0, ๐œŽ(๐‘ฅโˆ—)2

)and hence E ๏ฟฝ๏ฟฝ๐‘ = ๐œ—โˆ— + ๐‘œ

(๐‘โˆ’1/2

).

However, convergence is slower, in general, if ๐‘† consists of more than 1 point.

rough draft: do not distribute

Page 95: Selected Topics from Portfolio Optimization

16.2 THE ฮ”-METHOD 95

16.2 THE ฮ”-METHOD

Proposition 16.7. Let ๐‘Œ๐‘ โˆˆ R๐‘‘ be random vectors with, ๐‘Œ๐‘ โ†’ ` โˆˆ R๐‘‘ in probability and

R 3 ๐œ๐‘ โ†— โˆž deterministic numbers such that ๐œ๐‘ (๐‘Œ๐‘ โˆ’ `)Dโˆ’โˆ’โ†’ ๐‘Œ . Further, let ๐บ : R๐‘‘ โ†’ R๐‘› be

differentiable at `. Then ๐œ๐‘ (๐บ (๐‘Œ๐‘ ) โˆ’ ๐บ (`))Dโˆ’โˆ’โ†’ ๐ฝ ยท ๐‘Œ , where ๐ฝ = โˆ‡๐บ (`) is the ๐‘› ร— ๐‘‘ Jacobian

matrix at `.

Proof. Notice, that ๐บ (๐‘ฆ) โˆ’ ๐บ (`) = ๐ฝ (๐‘ฆ โˆ’ `) + ๐‘Ÿ (๐‘ฆ), where ๐‘Ÿ (๐‘ฆ) = ๐‘œ(โ€–๐‘ฆ โˆ’ `โ€–), so that wehave ๐œ๐‘ (๐บ (๐‘Œ๐‘ ) โˆ’ ๐บ (`)) = ๐ฝ ๐œ๐‘ (๐‘Œ๐‘ โˆ’ `)๏ธธ ๏ธท๏ธท ๏ธธ

Dโˆ’โˆ’โ†’๐‘Œ

+๐œ๐‘ ๐‘Ÿ (๐‘Œ๐‘ ). We have that ๐œ๐‘ (๐‘Œ๐‘ โˆ’ `) = O(1) (as it

converges in distribution), hence โ€–๐‘Œ๐‘ โˆ’ `โ€– = O(1) and thus ๐‘Ÿ (๐‘Œ๐‘ ) = ๐‘œ (โ€–๐‘Œ๐‘ โˆ’ `โ€–) = ๐‘œ(๐œโˆ’1๐‘

).

Thus the result. ๏ฟฝ

Claim 16.8. For ๐‘1/2 (๐‘Œ๐‘ โˆ’ `)Dโˆ’โˆ’โ†’ N (0, ฮฃ) we have particularly that ๐‘1/2 (๐บ (๐‘Œ๐‘ ) โˆ’ ๐บ (`))

Dโˆ’โˆ’โ†’N (0, ๐ฝฮฃ๐ฝ>).

Version: May 28, 2021

Page 96: Selected Topics from Portfolio Optimization

96 SAMPLE AVERAGE APPROXIMATION (SAA)

rough draft: do not distribute

Page 97: Selected Topics from Portfolio Optimization

17Weak Topology of Measures

17.1 GENERAL CHARACTERISTICS

Definition 17.1. Let (๐‘‹, ๐‘‘) be a metric space. The weak topology of probability measures ischaracterized byโˆซ

๐‘‹

โ„Ž d๐‘ƒ๐‘› โˆ’โˆ’โˆ’โˆ’โ†’๐‘›โ†’โˆž

โˆซ๐‘‹

โ„Ž d๐‘ƒ for all bounded and continuous functions โ„Ž : ๐‘‹ โ†’ R.

Theorem 17.2 (Riesz representation theorem). For any continuous linear functional ๐œ“ : ๐ถ0(๐‘‹) โ†’R (the bounded functions vanishing at infinity on a locally compact Hausdorff space ๐‘‹) thereis a regular, countatbly additive measure ` on the Borels so that

๐œ“(โ„Ž) =โˆซ๐‘‹

โ„Ž d` for all โ„Ž โˆˆ ๐ถ0(๐‘‹).

Definition 17.3. Let `, a be (probability) measures. The Lรฉvyโ€“Prokhorov metric is

๐œ‹(`, a) B inf {Y > 0: `(๐ด) โ‰ค a(๐ดY) + Y and a(๐ด) โ‰ค `(๐ดY) + Y for all ๐ด โˆˆ โ„ฌ(๐‘‹)} , (17.1)

where ๐ดY Bโ‹ƒ

๐‘Žโˆˆ๐ด ๐ตY (๐‘Ž) is the Y-fattening (or Y-enlargement) of ๐ด โŠ‚ ๐‘‹.

Remark 17.4. Notice, that ๐œ‹(๐‘ƒ,๐‘„) โ‰ค 1 for probability measures.

Definition 17.5. A collection ๐‘€ โŠ‚ P(๐‘‹) of probability measures on (๐‘‹, ๐‘‘) is tight iff for everyY > 0 there is a compact set ๐พY โŠ‚ ๐‘‹ so that

`(๐พY) > 1 โˆ’ Y for all ๐‘ƒ โˆˆ ๐‘€.

Example 17.6. The collection {๐›ฟ๐‘› : ๐‘› = 1, 2, . . . } on R is not tight, while{๐›ฟ1/๐‘› : ๐‘› = 1, 2, . . .

}is.

Example 17.7. A collection of Gaussian measures {N (`๐‘– , ฮฃ๐‘–) : ๐‘– โˆˆ ๐ผ} is tight, if {`๐‘– : ๐‘– โˆˆ ๐ผ}and {ฮฃ๐‘– : ๐‘– โˆˆ ๐ผ} are uniformly bounded.

Theorem 17.8 (Prokhorovโ€™s theorem). The following hold true:

(i) The metric ๐œ‹ in (17.1) metrizes the weak topology of measures.

(ii) A set K โŠ‚ P(๐‘‹) is tight iff K is sequentially compact.

(iii) If {๐‘ƒ๐‘› : ๐‘› = 1, 2, . . . } โŠ‚ P(R๐‘‘

)is tight, then there is a subsequence and a measure

๐‘ƒ โˆˆ P(R๐‘‘

)so that ๐‘ƒ๐‘› โ†’ ๐‘ƒ weakly.

97

Page 98: Selected Topics from Portfolio Optimization

98 WEAK TOPOLOGY OF MEASURES

Properties

(i) If (๐‘‹, ๐‘‘) is separable, convergence of measures in the Lรฉvyโ€“Prokhorov metric is equiv-alent to weak convergence of measures. Thus, ๐œ‹ is a metrization of the topology ofweak convergence on P(๐‘‹).

(ii) The metric space (P(๐‘‹), ๐œ‹) is separable if and only if (๐‘‹, ๐‘‘) is separable.

(iii) If (P(๐‘‹), ๐œ‹) is complete then (๐‘‹, ๐‘‘) is complete. If all the measures in P(๐‘‹) haveseparable support, then the converse implication also holds: if (๐‘‹, ๐‘‘) is complete then(P(๐‘‹), ๐œ‹) is complete.

(iv) If (๐‘‹, ๐‘‘) is separable and complete, a subset K โŠ† P(๐‘‹) is relatively compact if and onlyif its ๐œ‹-closure is ๐œ‹-compact.

17.2 THE WASSERSTEIN DISTANCE

Some points here follow [Pflug and Pichler, 2014].

Definition 17.9 (Optimal transportation cost). Given two probability spaces (ฮž, F , ๐‘ƒ) and(ฮž, F , ๏ฟฝ๏ฟฝ

), the Wasserstein distance of order ๐‘Ÿ โ‰ฅ 1 (optimal transportation costs) is

d๐‘Ÿ(๐‘ƒ, ๏ฟฝ๏ฟฝ

)= inf

๐œ‹

(โˆฌฮžร—ฮž

๐‘‘(b, b

)๐‘Ÿ๐œ‹

(๐‘‘b, ๐‘‘b

) )1/๐‘Ÿ, (17.2)

where the infimum is taken over all (bivariate) probability measures ๐œ‹ on ฮž ร— ฮž having themarginals ๐‘ƒ and ๏ฟฝ๏ฟฝ, that is

๐œ‹(๐ด ร— ฮž

)= ๐‘ƒ(๐ด) and ๐œ‹ (ฮž ร— ๐ต) = ๏ฟฝ๏ฟฝ(๐ต) (17.3)

for all measurable sets ๐ด โˆˆ F and ๐ต โˆˆ F . The optimal measure ๐œ‹ is called the optimaltransport plan.

Remark 17.10. Occasionally, the Wasserstein distance is also considered for a (convex)function ๐‘(๐‘ฅ, ๐‘ฆ) instead of the distance ๐‘‘ (๐‘ฅ, ๐‘ฆ)๐‘Ÿ .

Proposition 17.11 (Embedding). It holds that

d๐‘Ÿ(๐‘ƒ, ๐›ฟb0

)๐‘Ÿ=

โˆซฮž

d (b, b0)๐‘Ÿ ๐‘ƒ (db) ,

and the mapping

๐‘– : (ฮž, d) โ†’ (P๐‘Ÿ (ฮž; d) , d๐‘Ÿ ) ,b โ†ฆโ†’ ๐›ฟb (ยท)

assigning to each point b โˆˆ ฮž its point measure ๐›ฟb located on b (Dirac measure1) is anisometric embedding for all 1 โ‰ค ๐‘Ÿ < โˆž ((ฮž, d) โ†ฉโ†’ P๐‘Ÿ (ฮž; d)).

1 ๐›ฟb (๐ด) B 1๐ด(b) ={1 if b โˆˆ ๐ด0 if b โˆ‰ ๐ด

is the usual Dirac measure.

rough draft: do not distribute

Page 99: Selected Topics from Portfolio Optimization

17.3 THE REAL LINE 99

Proof. There is just one single measure with marginals ๐‘ƒ and ๐›ฟb0 , which is the transport plan๐œ‹ = ๐‘ƒ โŠ— ๐›ฟb0 . Hence

d๐‘Ÿ(๐‘ƒ, ๐›ฟb0

)๐‘Ÿ=

โˆซฮž

โˆซฮž

d(b, b

)๐‘Ÿ๐›ฟb0

(๐‘‘b

)๐‘ƒ (๐‘‘b) =

โˆซฮž

d (b, b0)๐‘Ÿ ๐‘ƒ (๐‘‘b) ,

the first assertion.For the particular choice ๐‘ƒ = ๐›ฟb0 the latter formula simplifies to

d๐‘Ÿ(๐›ฟ b0 , ๐›ฟb0

)๐‘Ÿ=

โˆซฮž

d (b, b0)๐‘Ÿ ๐›ฟ b0 (๐‘‘b) = d(b0, b0

)๐‘Ÿ,

and hence b โ†ฆโ†’ ๐›ฟb is an isometry. ๏ฟฝ

Notice that if d is inherited by b, then d๐‘Ÿ (๐‘ƒ, ๐›ฟb0)๐‘Ÿ =โˆซฮžโ€–b โˆ’ b0โ€–๐‘Ÿ ๐‘ƒ(๐‘‘b).

17.3 THE REAL LINE

Theorem 17.12 (Cf. Rachev and Rรผschendorf [1998, Theorem 2.18]). Let ๐‘ƒ and ๏ฟฝ๏ฟฝ be prob-ability measures on the real line with โ€œcdfโ€ ๐น (๐‘ฅ) B ๐‘ƒ ((โˆ’โˆž, ๐‘ฅ]) and ๐บ (๐‘ฅ) B ๏ฟฝ๏ฟฝ ((โˆ’โˆž, ๐‘ฅ]) . Let๐œ‹ be the measure on R2 with cdf. ๐ป (๐‘ฅ, ๐‘ฆ) B min {๐น (๐‘ฅ), ๐บ (๐‘ฆ)}. Then ๐œ‹ is optimal for the Kan-torovich transportation problem between ๐‘ƒ and ๏ฟฝ๏ฟฝ for every cost function ๐‘(๐‘ฅ, ๐‘ฆ) = ๐‘(๐‘ฅ โˆ’ ๐‘ฆ)where ๐‘(ยท) is convex. Further,

d๐‘ (๐‘ƒ, ๏ฟฝ๏ฟฝ) =โˆซ 1

0๐‘

(๐นโˆ’1(๐‘ข) โˆ’ ๐บโˆ’1(๐‘ข)

)d๐‘ข.

Corollary 17.13. For the cost function ๐‘(ยท) = |ยท| we have further that d(๐‘ƒ, ๏ฟฝ๏ฟฝ) =โˆซ 10

๏ฟฝ๏ฟฝ๐นโˆ’1(๐‘ข) โˆ’ ๐บโˆ’1(๐‘ข)๏ฟฝ๏ฟฝ d๐‘ข =โˆซ โˆžโˆ’โˆž |๐น (๐‘ฅ) โˆ’ ๐บ (๐‘ฅ) | d๐‘ฅ.

Remark 17.14.

(i) If ๐บ does not give mass to points, then one may define ๐‘‡ B ๐บโˆ’1 โ—ฆ ๐น and it holds thatโˆซ ๐‘ฅ

โˆ’โˆžd๐‘ƒ = ๐น (๐‘ฅ) = ๐บ (๐‘‡ (๐‘ฅ)) =

โˆซ ๐‘‡ (๐‘ฅ)

โˆ’โˆžd๏ฟฝ๏ฟฝ (17.4)

The transport map ๐‘‡ is a monotone rearrangement of ๐‘ƒ to ๏ฟฝ๏ฟฝ.

(ii) Suppose ๐‘ƒ and ๏ฟฝ๏ฟฝ have the densities ๐‘“ = ๐น โ€ฒand ๐‘” = ๐บ โ€ฒ. Then differentiating (17.4) gives

๐‘“ (๐‘ฅ) = ๐‘”(๐‘‡ (๐‘ฅ)) ยท ๐‘‡ โ€ฒ(๐‘ฅ).

Version: May 28, 2021

Page 100: Selected Topics from Portfolio Optimization

100 WEAK TOPOLOGY OF MEASURES

rough draft: do not distribute

Page 101: Selected Topics from Portfolio Optimization

18Topologies For Set-Valued Convergence

18.1 TOPOLOGICAL FEATURES OF MINKOWSKI ADDITION

Theorem (Topological properties of Minkowski addition in locally compact vector spaces).Let ๐ด and ๐ต be sets.

(i)โ—ฆ๐ด + ๐ต is open and

โ—ฆ๐ด + ๐ต โŠ‚ (๐ด + ๐ต)โ—ฆ โŠ‚ ๐ด + ๐ต. If ๐ด or ๐ต is open, then ๐ด + ๐ต = (๐ด + ๐ต)โ—ฆ;

(ii) ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ โŠ‚ ๐ด + ๐ต. If ๐ด or ๐ต is bounded, then ๐ด + ๐ต = ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ;

(iii) For ๐ด and ๐ต closed and ๐ด (or ๐ต) bounded, ๐œ• (๐ด + ๐ต) โŠ‚ ๐œ•๐ด + ๐œ•๐ต.

Proof. Let ๐‘ฅ โˆˆโ—ฆ๐ด + ๐ต have the composition ๐‘ฅ = ๐‘Ž + ๐‘ with ๐‘Ž โˆˆ ๐ต๐‘Ÿ (๐‘Ž) โŠ‚

โ—ฆ๐ด for some ๐‘Ÿ > 0 and

๐‘ โˆˆ ๐ต. Then ๐‘ฅ โˆˆ ๐ต๐‘Ÿ (๐‘ฅ) = ๐ต๐‘Ÿ (๐‘Ž + ๐‘) = ๐ต๐‘Ÿ (๐‘Ž) + {๐‘} โŠ‚โ—ฆ๐ด + ๐ต, so

โ—ฆ๐ด + ๐ต is open. As

โ—ฆ๐ด + ๐ต โŠ‚ ๐ด + ๐ต

andโ—ฆ๐ด + ๐ต open it is immediate that

โ—ฆ๐ด + ๐ต โŠ‚ (๐ด + ๐ต)โ—ฆ, which is ((i)) (the rest being obvious).

Let ๐‘Ž โˆˆ ๏ฟฝ๏ฟฝ and ๐‘ โˆˆ ๏ฟฝ๏ฟฝ. Choose ๐‘Ž๐‘˜ โˆˆ ๐ด with ๐‘Ž๐‘˜ โ†’ ๐‘Ž โˆˆ ๏ฟฝ๏ฟฝ and ๐‘๐‘˜ โˆˆ ๐ต with ๐‘๐‘˜ โ†’ ๐‘ โˆˆ ๏ฟฝ๏ฟฝ.Obviously ๐‘Ž๐‘˜ + ๐‘๐‘˜ โˆˆ ๐ด + ๐ต and thus ๐‘Ž + ๐‘ โˆˆ ๐ด + ๐ต, whence ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ โŠ‚ ๐ด + ๐ต.

As for the converse let ๐‘ฅ โˆˆ ๐ด + ๐ต, so there is a sequence ๐‘ฅ๐‘˜ = ๐‘Ž๐‘˜ + ๐‘๐‘˜ with ๐‘Ž๐‘˜ โˆˆ ๐ด and๐‘๐‘˜ โˆˆ ๐ต and ๐‘ฅ๐‘˜ โ†’ ๐‘ฅ. Assume (wlog.) ๐ด bounded, thus there is a subsequence such that๐‘Ž๐‘˜ โ†’ ๐‘Ž โˆˆ ๏ฟฝ๏ฟฝ, and thus ๐‘๐‘˜ = ๐‘ฅ๐‘˜ โˆ’ ๐‘Ž๐‘˜ โ†’ ๐‘ฅ โˆ’ ๐‘Ž converges as well with ๐‘๐‘˜ โ†’ ๐‘ฅ โˆ’ ๐‘Ž =: ๐‘ โˆˆ ๏ฟฝ๏ฟฝ. Thatis ๐‘ฅ = ๐‘Ž + ๐‘ โˆˆ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ and thus ๐ด + ๐ต โŠ‚ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ.

Observe first that ๐œ• (๐ด + ๐ต) โŠ‚ ๐ด + ๐ต โŠ‚ ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ = ๐ด + ๐ต. Suppose that ๐‘ฅ โˆˆ ๐œ• (๐ด + ๐ต) can be

written as ๐‘ฅ = ๐‘Ž + ๐‘ for some ๐‘Ž โˆˆโ—ฆ๐ด and ๐‘ โˆˆ ๐ต. Then ๐‘ฅ = ๐‘Ž + ๐‘ โˆˆ

โ—ฆ๐ด + ๐ต โŠ‚ (๐ด + ๐ต)โ—ฆ, whence

๐‘ฅ โˆ‰ ๐œ• (๐ด + ๐ต). This is a contradiction, so ๐‘Ž โˆ‰โ—ฆ๐ด, that is ๐‘Ž โˆˆ ๐œ•๐ด. By similar reasoning (๐ด and ๐ต

reversed) we find that ๐‘ โˆˆ ๐œ•๐ต as well, which is the desired assertion. ๏ฟฝ

The assertion bounded in (ii) may not be dropped: To see this consider the closed sets๐ด B R ร— {0} โŠ‚ R2 and ๐ต B

{(๐‘ฅ, eโˆ’๐‘ฅ2

): ๐‘ฅ โˆˆ R

}. ๐ด + ๐ต is open though, and ๏ฟฝ๏ฟฝ + ๏ฟฝ๏ฟฝ ๐ด + ๐ต.

18.1.1 Topological features of convex sets

Theorem (Topological properties of convex sets).

โŠฒ If ๐ด is open, then conv ๐ด is open;1

โŠฒ If ๐ด is bounded, then conv ๐ด is bounded;

โŠฒ If ๐ด is closed and bounded, then conv ๐ด is closed and bounded;

โŠฒ conv (๐ด + ๐ต) = conv ๐ด + conv ๐ต.

1The statement if ๐ด is closed, then conv ๐ด is closed is wrong (why?).

101

Page 102: Selected Topics from Portfolio Optimization

102 TOPOLOGIES FOR SET-VALUED CONVERGENCE

Proof. Let ๐‘Ž โˆˆ conv ๐ด have a representation ๐‘Ž =โˆ‘๐‘›

๐‘–=1 _๐‘–๐‘Ž๐‘– with ๐‘Ž๐‘– โˆˆ ๐ด, whence ๐‘Ž โˆˆ โˆ‘๐‘›๐‘–=1 _๐‘–๐ด โŠ‚

conv ๐ด. As ๐ด is open it follows from Theorem (18.1) ((i)) thatโˆ‘๐‘›

๐‘–=1 _๐‘–๐ด is open, whence conv ๐ดis open.

Boundedness is obvious.Let ๐‘Ž โˆˆ conv ๐ด. Then there is a sequence ๐‘Ž๐‘˜ =

โˆ‘๐‘‘+1๐‘–=1 _

(๐‘˜)๐‘–๐‘Ž(๐‘˜)๐‘–โˆˆ conv ๐ด with ๐‘Ž๐‘˜ โ†’ ๐‘Ž (here

we use Carathรฉodoryโ€™s theorem; the statement is wrong in non-finite dimensions). By pickingsubsequences we may assume that ๐‘Ž (๐‘˜)1 converges, ๐‘Ž (๐‘˜)2 converges, etc. and finally ๐‘Ž (๐‘‘+1)

๐‘˜,

and moreover all _ (๐‘˜)๐‘–

. ๏ฟฝ

18.2 PRELIMINARIES AND DEFINITIONS

In a vector space ๐‘‹ the Minkowski sum (also known as dilation) of two sets ๐ด and ๐ต is๐ด + ๐ต B {๐‘Ž + ๐‘ : ๐‘Ž โˆˆ ๐ด, ๐‘ โˆˆ ๐ต}, and the product with a scalar ๐‘ is ๐‘ ยท ๐ด B {๐‘ ยท ๐‘Ž : ๐‘Ž โˆˆ ๐ด}.

18.2.1 Convexity, and Conjugate Duality

The support function of a set ๐ด โŠ‚ ๐‘‹ is

๐‘ ๐ด (๐‘ฅโˆ—) B sup๐‘Žโˆˆ๐ด

๐‘ฅโˆ— (๐‘Ž) , (18.1)

where ๐‘ฅโˆ— โˆˆ ๐‘‹โˆ— is from the dual of a Banach space (๐‘‹, โ€–ยทโ€–) with norm โ€–ยทโ€–; its dual we willdenote as (๐‘‹โˆ—, โ€–ยทโ€–), as no confusion with denoting the norm in the dual again by โ€–ยทโ€– will bepossible anyway.

Remark 18.1 (A collection of properties). Important properties of the support function include

(i) ๐‘ ๐ด โ‰ค ๐‘ ๐ต whenever ๐ด โŠ‚ ๐ต (more specifically, ๐‘ ๐ด (๐‘ฅโˆ—) โ‰ค ๐‘ ๐ด (๐‘ฅโˆ—) for all ๐‘ฅโˆ—),

(ii) ๐‘ _ยท๐ด (๐‘ฅโˆ—) = ๐‘ ๐ด (_ ยท ๐‘ฅโˆ—) = _ ยท ๐‘ ๐ด (๐‘ฅโˆ—) for _ > 0 (positive homogeneity) and ๐‘ ๐ด (0) = 0,

(iii) ๐‘ ๐ด+๐ต = ๐‘ ๐ด + ๐‘ ๐ต,

(iv) ๐‘ conv ๐ด = ๐‘ ๐ด2 and

(v) ๐‘ ๐ด is convex, that is ๐‘ ๐ด((1 โˆ’ _) ๐‘ฅโˆ—0 + _๐‘ฅ

โˆ—1)โ‰ค (1 โˆ’ _) ๐‘ ๐ด

(๐‘ฅโˆ—0

)+ _๐‘ ๐ด

(๐‘ฅโˆ—1

)whenever 0 โ‰ค _ โ‰ค 1.

By employing the indicator function of the set ๐ด, I๐ด (๐‘Ž) B{0 if ๐‘Ž โˆˆ ๐ดโˆž else , it is immediate

that๐‘ ๐ด (๐‘ฅโˆ—) = sup

๐‘ฅโˆˆ๐‘‹๐‘ฅโˆ— (๐‘ฅ) โˆ’ I๐ด (๐‘ฅ) ,

where the supremum ranges over all ๐‘ฅ โˆˆ ๐‘‹ โŠƒ ๐ด now. The support function itself thus is theusual convex conjugate function of I๐ด, which we denote ๐‘ ๐ด = Iโˆ—

๐ด. The bi-conjugate function

of I๐ด (the conjugate of ๐‘ ๐ด) is the function

๐‘ โˆ—๐ด (๐‘Ž) B sup๐‘ฅโˆ—โˆˆ๐‘‹โˆ—

๐‘ฅโˆ— (๐‘Ž) โˆ’ ๐‘ ๐ด (๐‘ฅโˆ—) ={0 if ๐‘ฅโˆ— (๐‘Ž) โ‰ค ๐‘ conv ๐ด (๐‘ฅ

โˆ—) for all ๐‘ฅโˆ— โˆˆ ๐‘‹โˆ—โˆž else,

2conv ๐ด B {โˆ‘๐‘– _๐‘–๐‘Ž๐‘– : _๐‘– โ‰ฅ 0,โˆ‘๐‘– _๐‘– = 1 and ๐‘Ž๐‘– โˆˆ ๐ด} is the convex hull of ๐ด.

rough draft: do not distribute

Page 103: Selected Topics from Portfolio Optimization

18.2 PRELIMINARIES AND DEFINITIONS 103

and by the Rockafellar-Fenchel-Moreau-duality Theorem (cf. Rockafellar [1974]) one furtherinfers that ๐‘ โˆ—

๐ด= Iconv ๐ด.

This also reveals the relation

conv ๐ด ={๐‘ โˆ—๐ด < โˆž

}=

โ‹‚๐‘ฅโˆ—โˆˆ๐‘‹โˆ—

{๐‘Ž : ๐‘ฅโˆ— (๐‘Ž) โ‰ค ๐‘ ๐ด (๐‘ฅโˆ—)} =โ‹‚

๐‘ฅโˆ—โˆˆ๐‘‹โˆ—{๐‘ฅโˆ— โ‰ค ๐‘ ๐ด (๐‘ฅโˆ—)} ,

from which follows that the correspondence ๐ด โ†ฆโ†’ ๐‘ ๐ด, restricted to convex, compact sets๐ด โˆˆ C, is one-to-one (injective).

18.2.2 Pompeiuโ€“Hausdorff Distance

Having addition and multiplication available for sets an adequate and fitting notion of dis-tance is useful. For this define the distance from ๐‘Ž to a set ๐ต as ๐‘‘ (๐‘Ž, ๐ต) B inf๐‘โˆˆ๐ต ๐‘‘ (๐‘Ž, ๐‘),where ๐‘‘ is the distance function. The deviation of the set ๐ด from the set ๐ต is D (๐ด, ๐ต) Bsup๐‘Žโˆˆ๐ด ๐‘‘ (๐‘Ž, ๐ต),3 and the Pompeiu-Hausdorff distance is H (๐ด, ๐ต) B max {D (๐ด, ๐ต) , D (๐ต, ๐ด)}(cf. Rockafellar and Wets [1997]). Note that D (๐ด, ๐ต) = 0 iff ๐ด is contained in the topologi-cal closure, ๐ด โŠ‚ ๏ฟฝ๏ฟฝ, and H (๐ด, ๐ต) = 0 iff ๏ฟฝ๏ฟฝ = ๏ฟฝ๏ฟฝ; moreover H (๐ด, ๐ต) = H

(๐ด, ๐ต

)and obviously

H (๐ด, ๐ต) โ‰ค sup๐‘Žโˆˆ๐ด,๐‘โˆˆ๐ต ๐‘‘ (๐‘Ž, ๐‘).In a normed space with ๐‘‘ (๐‘Ž, ๐‘) = โ€–๐‘ โˆ’ ๐‘Žโ€– it is enough to consider the boundaries, as we

have in addition that H (๐ด, ๐ต) = H (๐œ•๐ด, ๐œ•๐ต) if ๐ด and ๐ต are (sequentially) compact (i.e., ๐ด and๐ต are relatively compact); moreover

H (๐ด, ๐ต) = โ€–๐‘ โˆ’ ๐‘Žโ€– (18.2)

for some ๐‘Ž โˆˆ ๐œ•๐ด and ๐‘ โˆˆ ๐œ•๐ต in this situation.

Lemma 18.2. The deviation D and the Pompeiu-Hausdorff distance H satisfy the triangleinequality, D (๐ด,๐ถ) โ‰ค D (๐ด, ๐ต) + D (๐ต,๐ถ) and H (๐ด,๐ถ) โ‰ค H (๐ด, ๐ต) + H (๐ต,๐ถ).(C, H), where C is the set of all nonempty, compact and convex subsets of ๐‘‹, is a Polish

space (i.e. a complete, separable and metric space), provided that (๐‘‹, ๐‘‘) is Polish.

Proof. See, e.g., Castaing and Valadier [1977]. ๏ฟฝ

The concept of the Hausdorff distance and the support functions introduced above linkas follows to a nice ensemble: in a normed space (๐‘‘ (๐‘Ž, ๐‘) = โ€–๐‘ โˆ’ ๐‘Žโ€–) the deviation D,using Minkowski addition, rewrites as D (๐ด,๐ถ) = inf {๐‘Ÿ > 0: ๐ด โŠ‚ ๐ถ + ๐‘Ÿ ยท ๐ต๐‘‹ } where ๐ต๐‘‹ =

{๐‘ฅ : โ€–๐‘ฅโ€– โ‰ค 1} is the unit ball and ๐ถ๐‘Ÿ B ๐ถ + ๐‘Ÿ ยท ๐ต๐‘‹ is the ๐‘Ÿ-fattening of ๐ถ. If ๐ด and ๐ถ areconvex, then D (๐ด,๐ถ) = inf

{๐‘Ÿ > 0: ๐‘ ๐ด โ‰ค ๐‘ ๐ถ + ๐‘Ÿ ยท ๐‘ ๐ต๐‘‹

}, where โ€œโ‰คโ€ is the usual โ€œโ‰คโ€-comparison

of functions (๐‘ ๐ด โ‰ค ๐‘ ๐ถ + ๐‘Ÿ ยท ๐‘ ๐ต๐‘‹iff ๐‘ ๐ด (๐‘ฅโˆ—) โ‰ค ๐‘ ๐ถ (๐‘ฅโˆ—) + ๐‘Ÿ ยท ๐‘ ๐ต๐‘‹

(๐‘ฅโˆ—) for all ๐‘ฅโˆ— โˆˆ ๐‘‹โˆ—). As ๐‘ ๐ต๐‘‹(๐‘ฅโˆ—) =

sup๐‘โˆˆ๐ต๐‘‹๐‘ฅโˆ— (๐‘) = โ€–๐‘ฅโˆ—โ€–, the norm of ๐‘ฅโˆ— in the dual (๐‘‹โˆ—, โ€–.โ€–) by the Hahn-Banach Theorem, this

simplifies further and it follows for general sets that

D (conv ๐ด, conv๐ถ) = inf{๐‘Ÿ > 0: ๐‘ ๐ด โˆ’ ๐‘ ๐ถ โ‰ค ๐‘Ÿ ยท ๐‘ ๐ต๐‘‹

}= supโ€–๐‘ฅโˆ— โ€–โ‰ค1

๐‘ ๐ด (๐‘ฅโˆ—) โˆ’ ๐‘ ๐ถ (๐‘ฅโˆ—) ,

and the Pompei-Hausdorff distance thus is

H (conv ๐ด, conv๐ถ) = supโ€–๐‘ฅโˆ— โ€–โ‰ค1

|๐‘ ๐ด (๐‘ฅโˆ—) โˆ’ ๐‘ ๐ถ (๐‘ฅโˆ—) | (18.3)

3in some references Hess [2002] also called excess of ๐ด over ๐ต.

Version: May 28, 2021

Page 104: Selected Topics from Portfolio Optimization

104 TOPOLOGIES FOR SET-VALUED CONVERGENCE

in terms of seminorms. These observations convincingly relate the Pompeiu-Hausdorff dis-tance with Minkowski addition of convex sets.

It follows from the preceding discussion and remarks that for relatively compact sets ๐ด

and ๐ถ there are ๐‘Ž โˆˆ ๐œ•๐ด, ๐‘ โˆˆ ๐œ•๐ถ and โ€–๐‘ฅโˆ—โ€– โ‰ค 1 such that D (๐ด,๐ถ) = โ€–๐‘ โˆ’ ๐‘Žโ€– = ๐‘ฅโˆ— (๐‘Ž โˆ’ ๐‘). ๐‘ฅโˆ— isan outer normal for both sets, conv ๐ด and conv๐ถ.

18.3 LOCAL DESCRIPTION

The sub-differential of a real-valued function ๐‘“ : ๐‘‹โˆ— โ†’ R at a point ๐‘ฅโˆ— โˆˆ ๐‘‹โˆ— is the set 4

๐œ• ๐‘“ (๐‘ฅโˆ—) B {๐‘ข โˆˆ ๐‘‹ : ๐‘“ (๐‘งโˆ—) โˆ’ ๐‘“ (๐‘ฅโˆ—) โ‰ฅ ๐‘งโˆ— (๐‘ข) โˆ’ ๐‘ฅโˆ— (๐‘ข) for all ๐‘งโˆ— โˆˆ ๐‘‹โˆ—} โŠ‚ ๐‘‹.

Notably ๐œ• ๐‘“ (๐‘ฅโˆ—) is a subset of ๐‘‹, so ๐œ• ๐‘“ is a set-valued mapping which is expressed by writing

๐œ• ๐‘“ : ๐‘‹โˆ— โ‡’ ๐‘‹

๐‘ฅโˆ— โ†ฆโ†’ ๐œ• ๐‘“ (๐‘ฅโˆ—) .

The symbolโ‡’ ๐‘‹ indicates that the outcomes are subsets โ€“ a collection of elements โ€“ of ๐‘‹.With the sub-differential at hand we may add the following standard characterization of

the support function ๐‘ ๐ด of a set ๐ด, which will turn out useful for our purpose:

Lemma 18.3. The support function ๐‘ ๐ด has the sub-differential ๐œ•๐‘ ๐ด (๐‘ฅโˆ—) = argmaxconv ๐ด ๐‘ฅโˆ—.5

Moreover ๐œ•๐‘ ๐ด (๐‘ฅโˆ—) โŠ‚ ๐œ•๐ด.

Proof. With ๐‘ข โˆˆ argmaxconv ๐ด ๐‘ฅโˆ— โŠ‚ conv ๐ด, for any ๐‘งโˆ— โˆˆ ๐‘‹โˆ— we have that ๐‘ ๐ด (๐‘งโˆ—) โ‰ฅ ๐‘งโˆ— (๐‘ข) =๐‘ ๐ด (๐‘ฅโˆ—) + ๐‘งโˆ— (๐‘ข) โˆ’ ๐‘ฅโˆ— (๐‘ข) and hence ๐‘ข โˆˆ ๐œ•๐‘ ๐ด (๐‘ฅโˆ—).

Conversely, with ๐‘Ž โˆˆ ๐œ•๐‘ ๐ด (๐‘ฅโˆ—) we have that ๐‘ ๐ด (๐‘งโˆ—) โˆ’ ๐‘ ๐ด (๐‘ฅโˆ—) โ‰ฅ ๐‘งโˆ— (๐‘Ž) โˆ’ ๐‘ฅโˆ— (๐‘Ž) or

๐‘ฅโˆ— (๐‘Ž) โ‰ฅ ๐‘ ๐ด (๐‘ฅโˆ—) + ๐‘งโˆ— (๐‘Ž) โˆ’ ๐‘ ๐ด (๐‘งโˆ—) (18.4)

for all ๐‘งโˆ—. For the particular choice ๐‘งโˆ— = 0 we find that ๐‘ฅโˆ— (๐‘Ž) โ‰ฅ ๐‘ ๐ด (๐‘ฅโˆ—) and it remains to showthat ๐‘Ž โˆˆ conv ๐ด. Suppose that ๐‘Ž โˆ‰ conv ๐ด, then โ€“ by the Hahn-Banach Theorem โ€“ there is a ๐‘งโˆ—

such that ๐‘งโˆ— (๐‘Ž) > sup{๐‘งโˆ— (๐‘Žโ€ฒ) : ๐‘Žโ€ฒ โˆˆ conv ๐ด

}= ๐‘ ๐ด (๐‘งโˆ—). This same equation holds for multiples

_ ยท ๐‘งโˆ— (_ > 0), hence (18.4) cannot hold in general; thus, ๐‘Ž โˆˆ conv ๐ด.The second statement is obvious. ๏ฟฝ

4note, that ๐œ• ๐‘“ (๐‘ฅโˆ—) โŠ‚ ๐‘‹ is a subset in the pre-dual ๐‘‹ rather than ๐‘‹โˆ—โˆ—.5We shall abbreviate the argument of the maximum of a function ๐‘“ restricted to ๐ท by argmax๐ท ๐‘“ B

argmax { ๐‘“ (๐‘ฅ) : ๐‘ฅ โˆˆ ๐ท}.

rough draft: do not distribute

Page 105: Selected Topics from Portfolio Optimization

Index

CCapital Asset Pricing Model (CAPM), 32capital market line, CML, 27

DDirac measure, 98distribution function

cumulative, cdf, 37

Mmarginal, 98

Pportfolio

market, 30most efficient, 27tangency, 27

Rrisk

aggregate, 33idiosyncratic, 33residual, 33systematic, 33undiversifiable, 33unsystematic, 33

Ssecurity market line (SML), 32Sharpe ratio, 33

105