Research Collection Doctoral Thesis Robust methods for the SABR model and related processes Analysis, asymptotics and numerics Author(s): Horvath, Blanka N. Publication Date: 2015 Permanent Link: https://doi.org/10.3929/ethz-a-010604629 Rights / License: In Copyright - Non-Commercial Use Permitted This page was generated automatically upon download from the ETH Zurich Research Collection . For more information please consult the Terms of use . ETH Library



DISS. ETH NO. 22945

ROBUST METHODS FOR THE SABR MODEL
AND RELATED PROCESSES:

ANALYSIS, ASYMPTOTICS AND NUMERICS

A dissertation submitted to

ETH ZURICH

for the degree of
Doctor of Sciences

presented by

BLANKA NORA HORVATH

Dipl. Math. University of Bonn
M. Econ. The University of Hong Kong

born August 22, 1985
citizen of Hungary

accepted on the recommendation of

Prof. Dr. Josef Teichmann, examiner
Prof. Dr. Johannes Muhle-Karbe, co-examiner
Prof. Dr. Peter Friz, co-examiner

2015


To Maria and Béla.


Abstract

This thesis is dedicated to the study of the stochastic alpha beta rho (SABR) model and related stochastic processes, from both a theoretical and a practical perspective. During its 14 years of existence, the SABR model has become the industry standard and is now ubiquitous in interest rate modelling. Its popularity arose from a tractable asymptotic expansion for implied volatility, derived by heat kernel methods, which strengthened the connection between geometry and finance. In recent years markets have moved to historically low rates, for which this expansion is prone to yield inconsistent prices, thereby inducing distortions and arbitrage into modelling. As SABR is deeply embedded in the markets, there is an undisputed need for uniform pricing methods, suitable for the SABR model, which eliminate the observed irregularities. Since the emergence of the aforementioned problems, the so-called arbitrage issue has been addressed in numerous approaches. Despite several excellent contributions to this matter in the past years, to date there does not seem to be a consensus about a method for adjusting the SABR formula, or about the reasons for the potential appearance of inconsistent prices under this model using different pricing tools. The aim of this thesis is to investigate the properties of the model from this perspective and to propose some effective solutions to the arbitrage problem.

An analysis of the SABR model for near-zero (positive or negative) rates calls for a more general functional analytic framework than the one that Riemannian geometry and heat kernel expansions can provide. This in turn can be held accountable for the breakdown of the asymptotic formula in this region. While several available methods aim for exact approximation of the absolutely continuous part of the SABR distribution, we confirm that the asymptotics of the implied volatility for extreme strikes (for which the inconsistencies typically appear) can be fully characterized by the mass at zero, regardless of the absolutely continuous part of the distribution.

Accordingly, we also propose a finite element method, tailored to the specific degeneracy of the model at the origin, in order to evaluate option prices under the SABR model. Finally we prove convergence and derive error estimates for the proposed numerical scheme.


Kurzfassung (Summary)

This thesis is devoted to the study of the stochastic alpha beta rho (SABR) model and related stochastic processes. In the 14 years of its existence, the SABR model has become one of the most widely used models and is today the industry standard in interest rate markets. A tractable asymptotic formula for the implied volatility contributed substantially to the popularity of the model. This asymptotic formula was derived from an expansion of the heat kernel on a suitably chosen manifold. In the following years this intensified the use of methods from Riemannian geometry in mathematical finance. Recent developments in financial markets showed a marked and persistent shift of interest rates into a historically low range, in which the SABR model can generate inconsistent option prices. The resulting so-called arbitrage problem and the associated market distortions left no doubt that a correction of the SABR formula is necessary. Since then the SABR arbitrage problem has been taken up in many ways. Nevertheless, to date there is no uniformly established explanation of, or solution to, this problem. The starting point of this thesis is the well-known observation that standard heat kernel expansions cannot be applied near zero. Classical heat kernel methods approximate the absolutely continuous part of the distribution, whereas the asymptotics of the implied volatility near zero can be fully characterized by the singular part of the distribution. We show that the problem can be solved in a more general functional analytic framework, namely by means of the theory of non-symmetric Dirichlet forms. In this framework we develop a finite element method to approximate option prices under the SABR model numerically, and we prove convergence rates.


Acknowledgements

First and foremost I would like to express my gratitude to my advisors Josef Teichmann and Johannes Muhle-Karbe for their trust in me and in this PhD project, for supporting, accompanying and mentoring me in various ways throughout the past years even beyond purely mathematical contents, and to Peter Friz for accepting the task of acting as my co-examiner.

I would also like to thank Paul Embrechts and the Group 3 for providing such an excellent working atmosphere at ETH Zürich during these years. Within the ETH working group my special thanks go to Leif Döring and Oleg Reichmann for the collaboration and the fruitful mathematical discussions which helped deepen my understanding in several perspectives. I am deeply grateful to my co-authors Archil Gulisashvili and Antoine Jacquier for sharing their enthusiasm with me and always providing me with an ample source of inspiration and encouragement.

I greatly appreciate all my discussions of mathematical and non-mathematical nature with Philipp Harms, Martin Herdegen, Sebastian Herrmann, Marcel Nutz and Robbin Tops. Moreover I would like to express my thanks to those friends and colleagues at ETH with whom I had the pleasure to share an office: Ozan Agdogan, Benjamin Bernard, Christoph Czichowsky, Florian Leisch, Lavinia Perez-Ostafe, Christian Reichlin, Anja Richter, Winslow Strong, and Danijel Zivoi. And likewise to Denise Künzli, Patricia Malzacher-Lienhard, Martin Vollenweider, Sonja Cox, Christa Cuchiero, Philipp Dörsek, Nicoletta Gabrielli, Selim Gökay, Lukas Gonon, Georg Grafendorfer, Chong Liu, Ren Liu, David Prömel, Max Reppen, Sara Svaluto-Ferro, Robert Salzmann, Annina Saluz, Dejan Velusek, Mirjana Vukelja and Marcus Wunsch, and all those current and former colleagues who made my years at ETH memorable in so many ways and to all who participated in keeping the tradition of the Sola relay race alive and running in the past years... you know who you are.

I would particularly like to thank Anton Bovier and Evangelia Petrou for encouraging me to commence my PhD studies at ETH Zürich and my friends from Bonn, Hong-Kong, Budapest and Aleppo for the ever since ongoing conversations, Boryana Dimitrova, Tristan Buckmaster, Robert Kucharczyk, Nikolai Nowaczyk, Lara Skuppin, Rolf Mengert, Nico Keller and Lydia Göbel.

And last but most certainly not least am I deeply indebted to Elias, my friends and my family for their patience and understanding particularly in the last months and for all support I have gotten.


Contents

I Introduction
  1 The SABR model
  2 Overview of the thesis

II Geometry and Time Change: Functional Analytic (Ir-)Regularity Properties of SABR-type Processes
  1 Introduction
  2 The time change as a change of geometry
  3 The semigroup point of view
    3.1 Banach spaces and (generalized) Feller properties
    3.2 Hilbert spaces and (symmetric) Dirichlet forms
  4 Asymptotics
    4.1 Large-time asymptotics
    4.2 Short-time asymptotics and a generalized distance
  5 Proofs
  6 Reminder on some properties of diffusions
    6.1 Scalar diffusions: Speed measure, local time and time change
    6.2 Symmetric Dirichlet forms: closability
    6.3 Closed forms, Dirichlet forms and their generators
    6.4 Diffusions as Dirichlet forms and the intrinsic metric

III Mass at Zero in the SABR Model and Small-Strike Implied Volatility Expansions
  1 Introduction
  2 Mass at zero in the uncorrelated SABR model
    2.1 The decomposition formula for the mass
    2.2 Small-time asymptotics
    2.3 Large-time asymptotics
    2.4 Representations of the density of the integrated variance
  3 Mass at zero for the correlated SABR model
    3.1 SABR geometry and geometry preserving mappings
    3.2 Application: Large-time behavior of the mass
  4 Implied volatility and small-strike expansions
  5 Proofs of Section 2

IV Dirichlet Forms and Finite Element Methods for the SABR Model
  1 Introduction
  2 Preliminaries and problem formulation
    2.1 General analytic setup
    2.2 Analytic setting for the SABR model
    2.3 Well-posedness of the variational pricing equations and a non-symmetric SABR Dirichlet form
  3 Discretization
    3.1 Space discretization and the semidiscrete problem
    3.2 Time discretization and the fully discrete scheme
  4 Error estimates
    4.1 Approximation estimates
    4.2 Discretization error and convergence of the finite element method
  5 Reminder on some properties of the considered function spaces
    5.1 Weighted Sobolev spaces
    5.2 Bochner spaces
    5.3 Tensor products of Hilbert spaces
  6 Non-symmetric Dirichlet forms
  7 Further Proofs
    7.1 Approximation estimates for CEV: Alternative

V Portfolio Choice with Ambiguous Drifts and Volatilities
  1 Introduction
  2 Benchmark case without ambiguity aversion
  3 Setting with ambiguity aversion
  4 Main result
  5 The parameters Ψ1 and Ψ2
  6 Illustration of the optimal strategy
  7 Proof of the main result
  8 Discussion of the ambiguity parameters

A Martingale problems
  1 Well-posedness of the martingale problem for SABR

B Some Background on Analysis on Manifolds
  1 The Laplace-Beltrami operator and the Riemannian volume form
    1.1 The SABR Laplace-Beltrami operators
  2 Heat equation and heat kernel on manifolds
    2.1 Operator semigroup and self-adjointness
  3 Transformation of the heat kernel under isometry
    3.1 Polar coordinates and the Eikonal equation


Chapter I

Introduction

1 The SABR model

The SABR articles, Managing Smile Risk [89] and Probability Distribution in the SABR model of Stochastic Volatility [90], are undoubtedly a landmark for the practice of pricing interest rate derivatives with stochastic volatility. These two articles made a great impact by establishing a realistic model for fixed income markets, which came along with a tractable and easy-to-implement asymptotic expansion for implied volatility. The first one [89], which appeared in 2003 in Wilmott Magazine, introduced the SABR model (1.1) and made an intriguingly convincing case for its advantages over local volatility models (which were prevalently used at the time) to closely fit the shape and realistically predict the dynamics of the implied volatility surface. Singular perturbation methods were used to arrive at a tractable asymptotic expansion to approximate the Black implied volatility for an asset following SABR dynamics. Furthermore, the influence of each parameter of the model on the shape and dynamics of the implied volatility smile is explained in a clear and concise way. The model, or more precisely the asymptotic formula itself, soon became the benchmark in interest rate markets [10, 16, 17, 137, 147] and is today deeply embedded in the markets.

The second SABR article [90] has been available as a preprint since 2004. It suggested a geometric viewpoint and a justification for the singular perturbation expansion by drawing a precise link between the so-called normal (β = ρ = 0) SABR model and Brownian motion on the hyperbolic plane, as well as between the general SABR model and Brownian motion on a suitably chosen manifold (the SABR plane). This provided an insightful means to explain the original Hagan expansion by Varadhan's short time asymptotic formula (for Brownian motion on the SABR manifold) combined with a simpler, regular perturbation. The geometric viewpoint of [90] and its aesthetic appeal further promoted the integration of stochastic analysis on manifolds and heat kernel expansions into the context of mathematical finance [75, 74, 115, 114, 142]. It soon became one of the most widely circulated preprints in this context and was recently published (June 2015) in Large Deviations and Asymptotic Methods in Finance. We will elaborate on these statements in more detail in the following sections and chapters.


The model: Definition, nomenclature and model parameters

In the first article, the SABR model [89, Formulas 2.15 a-c] is introduced as

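$$
\begin{aligned}
dF_t &= \alpha_t F_t^{\beta}\, dW_t, & F_0 &= f > 0,\\
d\alpha_t &= \nu \alpha_t\, dZ_t, & \alpha_0 &= \alpha > 0,\\
d\langle Z, W\rangle_t &= \rho\, dt,
\end{aligned}
\tag{1.1}
$$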

where ν ≥ 0, ρ ∈ [−1, 1], β ∈ [0, 1], and W and Z are correlated Brownian motions on a probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$. The form (1.1) explains the acronym SABR (stochastic alpha beta rho) model in terms of the parameters of the model and the initial value of the volatility. The acronym SABR is less evident from the representation used in the second article [90, Formulas (1)-(3)], which describes the dynamics more generally by

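$$
\begin{aligned}
dF_t &= \Sigma_t C(F_t)\, dW_t, & F_0 &= F > 0,\\
d\Sigma_t &= \nu \Sigma_t\, dZ_t, & \Sigma_0 &= \Sigma > 0,\\
d\langle Z, W\rangle_t &= \rho\, dt,
\end{aligned}
\tag{1.2}
$$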

for a smooth positive non-decreasing function C with C(−x) = −C(x), x > 0. Since in [90] the focus is on the geometric viewpoint on SABR and not on the model parameters, no confusion is possible. Recently, another nomenclature took over

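$$
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\, dW_t, & X_0 &= x > 0,\\
dY_t &= \alpha Y_t\, dZ_t, & Y_0 &= y > 0,\\
d\langle Z, W\rangle_t &= \rho\, dt,
\end{aligned}
\tag{1.3}
$$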

according to which the parameter α governs the volatility of the volatility Y. While today there is no general consensus about the appearance of the model across the vast related literature, there is a broad agreement on the properties of the SABR model, the role and influence of the model parameters and the importance of the model itself. The literature related to SABR is not only broad but also dynamic. Therefore, claiming to give a full and comprehensive review of all compelling articles, contributions and discussions related to SABR circulating among academics and practitioners and in various influential online platforms and forums would not do justice to this topic. In the following, we include a brief reminder of some relevant properties of the SABR model and attempt a snapshot of different approaches in the literature, which are relevant for our discussion in the following chapters.
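To make the dynamics (1.3) concrete, the following is a minimal simulation sketch, not a scheme taken from the thesis. It assumes an Euler step for X with absorption at zero, an exact lognormal update for Y, and a hypothetical function name `simulate_sabr`; all numerical values are illustrative. Euler schemes for SABR are known to carry a simulation bias, so this is only meant to show how the parameters (x, y, α, β, ρ) enter.

```python
import numpy as np


def simulate_sabr(x0, y0, alpha, beta, rho, T, n_steps=200, n_paths=10_000, seed=0):
    """Simulate terminal values (X_T, Y_T) of the SABR model in notation (1.3).

    Assumptions of this sketch: Euler step for X, absorption of X at zero,
    exact lognormal step for Y. Not a bias-free scheme.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    y = np.full(n_paths, float(y0))
    for _ in range(n_steps):
        z = rng.standard_normal(n_paths)       # drives the volatility Y
        z_perp = rng.standard_normal(n_paths)  # independent component
        dZ = np.sqrt(dt) * z
        # W is built so that d<Z, W>_t = rho dt
        dW = np.sqrt(dt) * (rho * z + np.sqrt(1.0 - rho**2) * z_perp)
        alive = x > 0.0
        x_next = x + y * x**beta * dW           # Euler step for dX = Y X^beta dW
        # paths that hit or cross zero are absorbed there
        x = np.where(alive, np.maximum(x_next, 0.0), 0.0)
        # exact solution of dY = alpha Y dZ over one step
        y = y * np.exp(alpha * dZ - 0.5 * alpha**2 * dt)
    return x, y
```

For example, `x_T, y_T = simulate_sabr(0.05, 0.03, 0.4, 0.5, -0.25, T=1.0)` returns arrays of terminal values, and `np.mean(x_T == 0.0)` gives a crude estimate of the fraction of absorbed paths under these assumptions.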

Brief reminder on SABR and related literature

Today market prices are quoted in terms of implied volatility. The popularity of the SABR model arose from its ability to capture the shape and dynamics (when the current value x of the asset changes) of the implied volatility smile observed in the market. The tractable asymptotic expansion and the clearly identified role of each model parameter on the shape and dynamics of the volatility surface, presented in [89], further added to its appeal. Recall that for a forward contract $F := F(t; T, T_{\mathrm{set}})$ (with value $x$ at $t = 0$) with exercise date and settlement date $T < T_{\mathrm{set}}$ this (Black) implied volatility $\sigma^{\mathrm{imp}}_{K,T}$ is the unique solution to the equation

$$
C_B(0, T, T_{\mathrm{set}}, x, K, \sigma) = D(T_{\mathrm{set}})\, C(K, T)
$$

where $C(K,T)$ is the market price of a call option with strike $K$ and maturity $T$, $D(T_{\mathrm{set}})$ denotes the discount factor¹ and $C_B$ is the Black formula² for a European call option, see [89, (2.16a)] and [63, (2.6)]. The implied volatility formula [89, (A.65)] (see also [140]) for the SABR model (1.3) reads

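$$
\sigma^{\mathrm{imp}}_{K,T}(x, y, \alpha, \beta, \rho)
= \frac{y \log(x/K)}{\left(\frac{x^{1-\beta} - K^{1-\beta}}{1-\beta}\right)}
\left(\frac{\zeta}{\chi(\zeta)}\right)
\times \left\{ 1 + \left[
\frac{\frac{2\beta(\beta-1)}{x_{\mathrm{av}}^{2}} - \left(\frac{\beta}{x_{\mathrm{av}}}\right)^{2} + \frac{1}{x_{\mathrm{av}}^{2}}}{24}\, y^{2} x_{\mathrm{av}}^{2\beta}
+ \frac{1}{4}\, \rho \alpha y\, \frac{\beta}{x_{\mathrm{av}}}\, x_{\mathrm{av}}^{\beta}
+ \frac{2 - 3\rho^{2}}{24}\, \alpha^{2}
\right] T + \ldots \right\},
\tag{1.4}
$$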

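where we used the abbreviations

$$
\chi(\zeta) := \log\left(\frac{\sqrt{1 - 2\rho\zeta + \zeta^{2}} - \rho + \zeta}{1 - \rho}\right)
$$

as well as

$$
x_{\mathrm{av}} := \sqrt{xK}, \qquad
\gamma_1 := \frac{\beta}{x_{\mathrm{av}}}, \qquad
\gamma_2 := \frac{\beta(\beta-1)}{x_{\mathrm{av}}^{2}}
\qquad \text{and} \qquad
\zeta := \frac{\alpha}{y}\, \frac{x - K}{x_{\mathrm{av}}^{\beta}}.
$$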

Figure I.1: The SABR implied volatility surface generated by the formula (1.4) for the parameters (y, α, β, ρ) = (0.3, 0.45, 0.5, −0.25) as a function of log-moneyness log(x/K) and time to maturity T.
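The expansion (1.4) can be evaluated directly. The sketch below is one possible transcription into Python in the notation of (1.3), truncated after the term of order T. The function name `sabr_implied_vol`, the tolerances, and the limiting branches for K = x, β = 1 and ζ → 0 are assumptions added here to avoid 0/0 expressions; they are not part of the formula as stated, and |ρ| < 1 is assumed.

```python
import math


def sabr_implied_vol(K, T, x, y, alpha, beta, rho):
    """Black implied volatility from expansion (1.4), truncated after the O(T) term."""
    x_av = math.sqrt(x * K)
    gamma1 = beta / x_av
    gamma2 = beta * (beta - 1.0) / x_av**2

    # y log(x/K) / ((x^(1-beta) - K^(1-beta)) / (1-beta)), with its limits
    if abs(x - K) < 1e-12 * x:
        prefactor = y * x ** (beta - 1.0)   # limit K -> x (assumed branch)
    elif abs(1.0 - beta) < 1e-12:
        prefactor = y                       # limit beta -> 1 (assumed branch)
    else:
        prefactor = (y * math.log(x / K) * (1.0 - beta)
                     / (x ** (1.0 - beta) - K ** (1.0 - beta)))

    zeta = alpha / y * (x - K) / x_av**beta
    if abs(zeta) < 1e-8:
        ratio = 1.0                         # zeta / chi(zeta) -> 1 as zeta -> 0
    else:
        chi = math.log(
            (math.sqrt(1.0 - 2.0 * rho * zeta + zeta**2) - rho + zeta) / (1.0 - rho)
        )
        ratio = zeta / chi

    # square bracket in (1.4)
    bracket = (
        (2.0 * gamma2 - gamma1**2 + 1.0 / x_av**2) / 24.0 * y**2 * x_av ** (2.0 * beta)
        + 0.25 * rho * alpha * y * gamma1 * x_av**beta
        + (2.0 - 3.0 * rho**2) / 24.0 * alpha**2
    )
    return prefactor * ratio * (1.0 + bracket * T)
```

A quick sanity check of the transcription: for β = 1 and α = 0 the model is lognormal with constant volatility y, and the function returns y for every strike, as the first term of the bracket then vanishes identically.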

The SABR formula (1.4) uses a set of five input values coming from the model: the initial values and the model parameters, which are (x, y, α, β, ρ) in representation (1.3) of the SABR model. While the initial rate x can be directly observed in the market, the other parameters (y, α, β, ρ) are subject to calibration (and choice³). Typical values of these parameters before 2007 were between 0.05 and 0.06 for x, and (depending on the maturity of the contract) y was calibrated to between 0.02 and 0.04, α to between 0.2 and 0.7, and ρ to around −0.25, while the parameter β was often chosen at level 0.5; see [147, Table 3.1] and also [89, Tables 3.1 and 3.2] for more details. Today, values of x have moved to lower base rates, which are now typically close to zero and can even be negative. At the same time markets have become more volatile, leading to higher values of y, see [14, 16, 17]. In such market environments certain anomalies and challenges arise, as discussed in more detail below.

¹ That is, the value today of one unit of currency delivered on date $T_{\mathrm{set}}$.
² $C_B(0, T, T_{\mathrm{set}}, x, K, \sigma) = D(T_{\mathrm{set}})\,\bigl(x\,\Phi(d_1) - K\,\Phi(d_2)\bigr)$, where $d_{1,2} = \dfrac{\log(x/K) \pm \frac{1}{2}\sigma^{2} T}{\sigma\sqrt{T}}$.
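The Black formula recalled in the footnote, and the definition of the implied volatility as the unique root of the pricing equation, can be illustrated numerically. The following sketch assumes the standard Black formula with the ½σ²T term, takes the discounted price as input, and inverts it by bisection. The function names, the bracketing interval and the tolerance are choices made for this illustration only.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_call(x, K, T, sigma, discount=1.0):
    """Black price D * (x Phi(d1) - K Phi(d2)) of a European call on a forward x."""
    s = sigma * math.sqrt(T)
    if s <= 0.0:
        return discount * max(x - K, 0.0)   # zero-volatility limit
    d1 = (math.log(x / K) + 0.5 * s * s) / s
    d2 = d1 - s
    return discount * (x * norm_cdf(d1) - K * norm_cdf(d2))


def black_implied_vol(price, x, K, T, discount=1.0, lo=1e-8, hi=5.0, tol=1e-12):
    """Solve black_call(x, K, T, sigma, discount) = price for sigma by bisection.

    Relies on the Black price being strictly increasing in sigma.
    """
    if not black_call(x, K, T, lo, discount) <= price <= black_call(x, K, T, hi, discount):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(x, K, T, mid, discount) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Bisection is used here only because it is simple and cannot diverge; a Newton iteration on the vega would converge faster but needs safeguarding for strikes far from the forward.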

The key ingredient is the constant elasticity of variance (CEV) parameter β, which determines the general dynamics of the volatility smile. Together with the parameter ρ, it also determines the shape and steepness of the smile. Furthermore, an increase in the initial volatility y results in an upwards shift of the smile [89, 147]. In [89, Figures 3.1 and 3.2] the dependence of the dynamics of the smile on the parameter β is examined. While a high value β ∼ 1 results in nearly flat shifts of the implied volatility as x varies, the slope of these shifts is more pronounced for lower values of β. The discussion in [89, Section 3] concludes that the exponent β can be chosen according to “aesthetic” considerations, normally resulting in the choices β = 1 (stochastic lognormal), β = 1/2 (stochastic CIR) or β = 0 (stochastic normal); see also [17] for a further discussion on typical choices of β, [147, Section 3.6] for potential complications in calibration and [77] for smart parameters.

Indeed, if the volatility of volatility is set to α = 0, SABR reduces to the CEV model [105, Section 6.4], for which computations are tractable, and in fact the asymptotic formula in [89, 90] is an expansion in terms of the ratio $\alpha^{2} T$, where T refers to the time (to maturity) scale. This allows one to interpret the volatility of volatility as a scaling factor of time, cf. [90], see also Chapter II below.

We refer to the original articles [89, 90] and to the renowned monographs [72, 114, 137, 147] for further details and a clear and in-depth discussion of the above concepts and statements.

Today SABR is ubiquitous in the markets. The widely used asymptotic formula loses accuracy by its very nature for long-dated derivatives, or when the volatility of volatility α is large. However, in the low interest rate and high volatility environment we are facing today, the implied volatility obtained by the expansion which made SABR popular in the first place can yield a negative density function for the process X in (1.3), therefore exhibiting arbitrage.
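One common way to inspect the density implied by a volatility formula is to differentiate call prices twice in the strike. The sketch below does this with a central finite difference applied to Black prices computed from the expansion (1.4). It is an illustrative diagnostic under stated assumptions, not a method from the thesis: the function names and the bump size are hypothetical, prices are undiscounted, and the compact version of (1.4) used here assumes K ≠ x, β < 1, |ρ| < 1 and a positive value of the expansion.

```python
import math


def black_call(x, K, T, sigma):
    """Undiscounted Black call price x Phi(d1) - K Phi(d2)."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(x / K) + 0.5 * s * s) / s
    d2 = d1 - s
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return x * cdf(d1) - K * cdf(d2)


def hagan_vol(K, T, x, y, alpha, beta, rho):
    """Compact form of (1.4); assumes K != x, beta < 1 and |rho| < 1."""
    x_av = math.sqrt(x * K)
    prefactor = (y * math.log(x / K) * (1.0 - beta)
                 / (x ** (1.0 - beta) - K ** (1.0 - beta)))
    zeta = alpha / y * (x - K) / x_av**beta
    if abs(zeta) < 1e-8:
        ratio = 1.0
    else:
        chi = math.log(
            (math.sqrt(1.0 - 2.0 * rho * zeta + zeta**2) - rho + zeta) / (1.0 - rho)
        )
        ratio = zeta / chi
    bracket = (
        (2.0 * beta * (beta - 1.0) - beta**2 + 1.0) / (24.0 * x_av**2)
        * y**2 * x_av ** (2.0 * beta)
        + 0.25 * rho * alpha * y * beta * x_av ** (beta - 1.0)
        + (2.0 - 3.0 * rho**2) / 24.0 * alpha**2
    )
    return prefactor * ratio * (1.0 + bracket * T)


def implied_density(K, T, x, y, alpha, beta, rho, rel_bump=1e-3):
    """Second central difference in strike of call prices built from (1.4)."""
    h = rel_bump * K
    call = lambda k: black_call(x, k, T, hagan_vol(k, T, x, y, alpha, beta, rho))
    return (call(K + h) - 2.0 * call(K) + call(K - h)) / h**2
```

Scanning `implied_density` over a grid of small strikes for a long maturity is a simple way to look for the sign changes described above; whether and where they occur depends on the parameters, and no particular values are claimed here.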

³ The model is over-determined, see e.g. [147].

The anomaly of negative densities and the problem of possible arbitrage have become more prominent in today's low interest-rate environments. The phenomenon of near zero interest rates (and even negative rates in some parts of fixed-income markets) could be persistently observed in Switzerland in the past years. It also reached the Euro, the US dollar and numerous other currencies and is unlikely to disappear wherever central banks are holding down interest rates in an attempt to spur growth. The anomalies that SABR encounters have therefore initiated extensive research, and the arbitrage issue has been addressed by several leading experts, among them Hagan et al., who proposed [88] (January 2013) a modification of SABR which was named arbitrage-free SABR.

There exist several refinements to Hagan's original asymptotic formula. Based on general results on stochastic volatility models obtained by Berestycki et al. [26], Obłój proposed [140] (March 2008) to fine tune the smile by suggesting a correction in the leading order to the asymptotic formula, thereby reducing the problem of negative densities. Later on, inspired by the geometric viewpoint [90, 115] which suggested the application of heat kernel expansion techniques, the asymptotic formula was further refined by Paulot [142] (June 2009), who provided a second order term. At the same time Islah [102] (October 2009) derived the exact probability density (for the absolutely continuous part of the distribution) in the uncorrelated SABR model by applying a time change, and so did Antonov and Spector [15] (March 2012) independently. Their advanced analytics [15] furthermore include an approximation of the correlated case. However, at that time the expressions for the uncorrelated case were already fairly involved, and the obtained formulas containing triple integrals made computations slow and intractable. Antonov, Konikov and Spector soon resolved the tractability issue [16] (August 2013) by proposing effective simplifications of the involved exact formulae in the uncorrelated case and approximations in the correlated case. This allowed SABR to spread its wings into further regions of extreme strikes. While the analysis in [16] concerns the absolutely continuous part of the SABR distribution, Doust [54] (Spring 2012) points out the importance of the probability mass at the origin, which can accumulate to significant values when imposing absorbing boundary conditions on the process at zero. Monte Carlo approximations are provided to quantify the mass, and an upper bound on the maturities is derived beyond which the accumulation of mass is so large that the Hagan formula cannot be expected to be valid. The connection was made by Gulisashvili, Horvath and Jacquier in [86] (February 2015), which eliminates the arbitrage issue for the uncorrelated and the normal SABR models by providing asymptotic approximations to the singular part of the SABR distribution and to the implied volatility. The results obtained there are based on model independent results of De Marco, Hillairet and Jacquier [46] (October 2013) and of Gulisashvili [85], where asymptotic formulae for the implied volatility were derived in the presence of a positive mass.

Further contributions addressing the SABR model in numerous different perspectives and contexts can be found in [12, 64, 65, 101, 111, 115, 141, 145]. In the context of asymptotic refinements of the Hagan formula one should single out some alternative perspectives to the ones described in the previous paragraph. An approach of calculating local times for the SABR model was suggested by Benhamou and Croissant [25] (September 2007). A similar path was taken by Balland and Tran [17] (June 2013), who found that a normal base model is better suited than Black-Scholes for the asymptotic approximation of implied volatility for SABR and proclaimed that SABR goes normal. In the meanwhile Andreasen and Huge [11] (January 2013) proposed an expansion of the forward volatility together with an extension (ZABR) of the SABR model to include a CEV parameter in the volatility process. These approaches provide fixes to the aforementioned anomalies by proposing modifications of the original SABR model or of the base model for the implied volatility expansion, but do not give a full account of the appearance of the irregularities at interest rates near zero in the original model. In the latter refinement results, asymptotic techniques are typically combined with numerical methods for efficient approximation.

The most popular numerical approximation methods which have been considered so far for the SABR model or closely related models fall into the following classes.

Probabilistic methods comprise path simulation of the process combined with suitable (quasi-) Monte Carlo approximation, see [14, 40] for the SABR model and [6, 5, 38, 43, 100] for related models. Difficulties of Euler methods with respect to simulation bias in the context of SABR are discussed in [40].

Splitting methods, where the infinitesimal generator of the process is decomposed into suitable operators for which the pricing equations can be computed more efficiently, provide a powerful tool in terms of computational efficiency for sufficiently regular processes. Such methods are considered in [21] for a model closely related to SABR (see also [20]), and in [53] for a large class of models. However, the applicability of corresponding convergence results to the SABR model itself is not fully resolved.

Among fully deterministic PDE methods, the most notable are finite difference methods and finite element methods. Finite difference methods were considered in [11, 116] for the modification [88] of the SABR model. For the SABR model (1.3) itself, proving convergence of finite difference schemes is not obvious due to the degeneracies of the model in the critical region near the origin. In fact, standard convergence results for usual finite difference schemes do not apply here. Furthermore, in the absence of an established benchmark solution in the critical region, empirical convergence rates may not be very informative, as the “benchmark solution” may exhibit the same numerical errors as the tested one.

Finite element methods were described in the context of mathematical finance in [172], see also the references therein. In the recent textbook [96] finite element methods have been applied to a large class of financial models and provide a robust and flexible framework to handle the stochastic finesses of models with degeneracies of CEV-type. Although this numerical method is tailor-made for CEV and SABR-type models, finite element methods have not appeared in the context of the SABR model in the related literature so far. For a broad review of simulation schemes used for the SABR model (1.3) in specific parameter regimes, see [122] and references therein.

To summarize, to date there is no numerical scheme with a proven convergence rate for the original SABR model (1.3) which is applicable for general parameters.


2 Overview of the thesis

The central aims of this thesis are to shed light on the irregularities of the SABR model appearing at near-zero interest rates and to establish a suitable analytic setting for the model. In the following chapters we identify functional analytic properties of the model which can be held accountable for the breakdown of the asymptotic formulae considered so far (Chapter II). We provide effective solutions to the irregularities near the origin by proposing methods robust enough to handle the degeneracies of the SABR model along the two general lines of research described in the previous paragraph: asymptotic techniques and numerical approximation.

Tractable asymptotic expansions are derived for the implied volatility which eliminate the arbitrage issue for the uncorrelated and the normal SABR models (Chapter III). We then derive a finite element approximation scheme for the SABR model under mild assumptions on the parameters, which are easily fulfilled in practical scenarios. Furthermore, we prove convergence and provide an error analysis for our finite element scheme (Chapter IV). Finally, we return to the benchmark Black-Scholes case and turn from pricing to portfolio optimization under the assumption that the parameters of the process are unknown and subject to estimation errors. We derive a setting for portfolio optimization which smoothly interpolates between the extreme scenarios of full information and full uncertainty. In this setup we provide explicit formulae for the optimal strategies as well as asymptotic formulae to identify the effect of the respective parameters on the strategy (Chapter V). The thesis and the interdependencies of the chapters are organized as illustrated in the diagram below. Each chapter is nevertheless written in a self-contained way, and necessary results from other chapters are always recalled.

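[Diagram of chapter dependencies. First row: I. Introduction. Second row: V. Portfolio Choice under Ambiguity; II. Functional Analytic Properties. Third row: III. Asymptotics for Mass and Volatility; IV. Finite Element Methods. Arrows lead from each row to the next.]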


Content and contributions of this work

The geometric viewpoint on the SABR model proposed in [90] plays an important role in this thesis. The fundamental contribution of the second SABR article was a derivation of a short-time asymptotic expansion for the probability density of the SABR process. Hagan, Lesniewski and Woodward introduced the so-called SABR plane (a manifold which is isometric to the hyperbolic plane) and observed that the SABR process coincides with Brownian motion on this manifold up to a lower order (drift) term. They derived the density for Brownian motion on this manifold from the hyperbolic heat kernel, for which an explicit solution is available.

The connection to Varadhan’s formula was established for the leading order term of the expansion of the SABR probability density, and the contribution of the first order (drift) term was identified in terms of a regular perturbation. This opened the door to heat kernel methods, which allow one to determine further higher order terms of the Hagan expansion. These have led to better approximations but did not fully eliminate the anomalies that the model exhibits.

Indeed, the irregularities of the SABR formula appearing near the origin are unlikely to be explained by deriving further higher order terms of the SABR formula via heat kernel techniques.

The way to understanding the anomalies of the SABR formula leads rather to an analysis of the leading order term of the expansion. In this work we identify reasons for this, which are twofold:

(i) Varadhan’s formula relies on the assumption of ellipticity of the infinitesimal generator of the considered process. This assumption is violated in the case of SABR for all β ∈ [0, 1], since for these parameters the infinitesimal generator is degenerate at zero and at the same time zero lies in the state space of the process.

Although Varadhan-type short-time asymptotics are available beyond the uniformly elliptic setup (see [22, 23, 50, 51] for the hypoelliptic case), it is not evident whether such short-time asymptotic results can be established for SABR, due to the nature of its degeneracies.

In the case β ∈ [1/2, 1) the origin (X = 0) is naturally absorbing for the SABR process, while this behavior is not reflected in a purely Riemannian analysis of geodesic distances near X = 0. This gives a strong indication that for these parameters Varadhan-type short-time asymptotics will break down in this region. In fact, such a conclusion was made in [56] for a diffusion exhibiting degeneracies similar to those of the SABR process, see (II.4.2).

(ii) Although the ellipticity assumption is not violated in the case β = 0, the asymptotic formula for the SABR probability density derived in [90] can become erroneous if one poses Dirichlet boundary conditions on the forward process (X) at the origin. This is due to the accumulation of probability mass at zero, while the formula of [90] is derived from the hyperbolic heat kernel, which is supported on the whole upper half-plane.


In Chapter II we examine such irregularities of the SABR model and related processes and also prove certain regularity properties of the SABR semigroup on suitable Hilbert and Banach spaces. We propose a geometry in the language of Dirichlet forms, which is better suited to describe the behavior of the process near the origin than classical Riemannian geometry. With a view to this, we derive such symmetric Dirichlet forms for the uncorrelated and for the normal SABR models, which allow for a natural extension of the notion of geodesic distances to the setting of SABR-type degeneracies. For general parameters this symmetry property breaks down.

Despite the potential breakdown of Varadhan-type short-time asymptotics for SABR, it is possible to derive tractable asymptotic formulae for its implied volatility which are well adapted to SABR-type degeneracies. Approximations of the implied volatility in terms of asymptotic expansions are available, not only for small and large maturities, but also for extreme strikes. Roger Lee’s celebrated Moment Formula [119] relates the behavior of the implied volatility for small strikes to the price of a European Put option. This model-independent result was subsequently refined by Benaim and Friz [24] and Gulisashvili [84]. In [46, 85] De Marco, Hillairet and Jacquier as well as Gulisashvili showed that when the underlying distribution has an atom at zero, the small-strike asymptotic behavior of the implied volatility is solely determined by this mass, irrespective of the distribution of the process on (0,∞).

The results in [46, 85] provide an explanation why the anomalies and the arbitrage issue could not be fully eliminated at extreme strikes in [16, 15, 102]. In these results the distribution of the uncorrelated SABR model was considered on the absolutely continuous part of the distribution only.

Our asymptotic formulae in Chapter III provide the singular counterpart to the results of [16]. While an abstract version of short-time asymptotic results in the spirit of Varadhan is available for certain diffusions exhibiting degeneracies (II.4.2) similar to those of the SABR process, their implications with a view to asymptotic formulae for the SABR-implied volatility at extreme strikes are not obvious. In our short-time asymptotic expansion of the uncorrelated case we do not aim for Varadhan-type asymptotics, but instead take an inverse Laplace transform approach after decoupling the SABR process into a CEV part and an independent time change. We derive short-time asymptotic formulae for the mass at zero for the uncorrelated SABR model as well as large-time formulae for the uncorrelated and normal SABR models, and infer the respective wing asymptotics for SABR from [46]. Our results eliminate the arbitrage issue near the origin for the uncorrelated and normal SABR models, while tractability of the formulae, which contributed considerably to the popularity of SABR in the first place, is preserved in the presence of mass.


In Chapter IV we turn to general parameter regimes for which symmetry of Dirichlet forms associated to SABR cannot be expected. We derive a suitable analytic setup on which a non-symmetric Dirichlet form can be associated with the SABR model. That is, we construct a Gelfand triple of suitably chosen Sobolev spaces with singular weights, consisting of the domain of the Dirichlet form, its dual space, and the pivotal Hilbert space. In this setting, the SABR-Dirichlet form captures the information encoded in the corresponding Kolmogorov pricing equations in variational form. We show well-posedness of the variational formulation of the SABR-pricing equations for vanilla and barrier options on this triple and present a concrete finite element discretization scheme based on a (weighted) multiresolution wavelet approximation in space and a θ-scheme in time, and provide an error analysis for the finite element discretization. Our pricing method is valid both in moderate interest rate environments and in the currently prevalent low interest rate regimes and is consistently applicable under very mild assumptions on parameter configurations of the process, which are easily met in all practical scenarios.

Finally, in Chapter V we follow a different strand of research and consider an investor who maximizes the present value of her future expected returns penalized for variances, with ambiguity about both drift rates and volatilities of the underlying risky asset. We derive a setting for portfolio optimization which smoothly interpolates between the extreme scenarios of full information and full uncertainty. Specifically, we do not use a classical worst-case approach, but instead penalize less plausible models far from a Black-Scholes reference point. Explicit formulas for optimal strategies and their performance are obtained. An example calibrated to empirical data illustrates that for portfolio choice, the effect of the volatility uncertainty is typically dominated by the much larger uncertainty about future expected returns.


Chapter II

Geometry and Time Change: Functional Analytic (Ir-)Regularity Properties of SABR-type Processes

1 Introduction

The SABR model is closely related to Brownian motion on a suitably chosen two-dimensional manifold. We review a well-known time change that partially decouples the equations and suggest a perspective on the time change as a transformation of the geometry underlying the state space of the processes. As an application we present a simple method to prove the Feller-Dynkin property of the SABR process, for which several usual methods do not apply. Aided by a geometric viewpoint we prove generalized Feller properties on Banach spaces containing functions with unbounded growth and conclude strong continuity of the SABR semigroup thereon.

The lack of (hypo-)ellipticity in SABR-type processes, due to the degeneracy of the CEV processes near zero, suggests turning to Dirichlet forms for the analysis of these processes rather than using the currently prevalent language of classical Riemannian geometry. Motivated by this, we construct Hilbert spaces (weighted L²-spaces) on which we can associate a strongly continuous symmetric semigroup to SABR under certain restrictions on its parameters, derive the associated symmetric Dirichlet forms and determine all classes of parameter configurations for which such a construction is possible.

Furthermore, we characterize the large-time behavior of the considered processes and comment on their short-time asymptotic behavior, on their intrinsic geometry in the framework of Dirichlet forms and on the transformation of the latter under the time change.


We consider the SABR stochastic differential equation in the form

\[
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\, dW_t \\
dY_t &= \alpha Y_t\, dZ_t \\
d\langle Z, W\rangle_t &= \rho\, dt \\
X_0 &= x, \quad Y_0 = y, \quad t \ge 0
\end{aligned}
\tag{1.1}
\]

with α ≥ 0, 0 ≤ β ≤ 1 and ρ ∈ [−1, 1] and state space D := R≥0 × R>0. The SABR model was designed to fit the shape and dynamics of the implied volatility smiles and skews. The key parameter is the constant elasticity of variance (CEV) parameter β, which influences the general shape and dynamics of the smile, while the parameter α governs the volatility of the stochastic volatility. In fact, if α = 0, the model reduces to a variant of the CEV model [105, Section 6.4]. We study the applications of a time change that partially decouples the equations and allows us to relate SABR to a CEV model running on a stochastic clock.
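To make the roles of the parameters in (1.1) concrete, the following is a minimal simulation sketch. It is not the method of this thesis: the function name `simulate_sabr`, the Euler step for X, the absorption of X at zero and the exact lognormal step for Y are all illustrative assumptions, and Euler-type schemes are known to be biased for SABR (see the discussion of [40] above).

```python
import math
import random


def simulate_sabr(x0, y0, alpha, beta, rho, horizon, n_steps, rng):
    """Simulate one path of (1.1) on [0, horizon] and return (X_T, Y_T).

    Illustrative scheme (an assumption, not the thesis' method):
    Euler step for X with absorption at zero, exact lognormal step for Y.
    """
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    x, y = x0, y0
    for _ in range(n_steps):
        # Correlated Gaussian increments with d<Z, W>_t = rho dt.
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        dw = sqrt_dt * g1
        dz = sqrt_dt * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        if x > 0.0:
            # Euler step for dX = Y X^beta dW; a path that reaches zero stays there.
            x = max(x + y * x ** beta * dw, 0.0)
        # Exact step for the geometric Brownian motion dY = alpha Y dZ.
        y *= math.exp(alpha * dz - 0.5 * alpha * alpha * dt)
    return x, y


if __name__ == "__main__":
    rng = random.Random(42)
    print(simulate_sabr(0.05, 0.3, 0.4, 0.5, -0.3, 1.0, 200, rng))
```

Absorption at zero is one possible boundary condition, chosen here for simplicity. For α = 0 the volatility stays constant and the sketch reduces to an Euler scheme for a CEV process.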

The first time change applied to SABR appeared in the original work [89] and was referred to as a rescaling. Indeed, regarding the volatility of volatility as a scaling parameter of time was a key observation leading, via singular perturbation, to the remarkable asymptotic formulas (the so-called Hagan expansion), which allow for a clear interpretation of how each of the model parameters in the formula affects the shape of the implied volatility smile. Later on, a further time change was proposed for the SABR model in [15, 102], where not only the volatility of volatility, but the entire volatility path is absorbed into the (now random) rescaling of time. Time change arguments of this type appear in different contexts in the finance literature. Such arguments are used for instance in [98] to give a direct proof of the martingality of the SABR model and to study the asymptotic behavior of lognormal models. They also appear in [15, 40, 64] in the context of the SABR model, and in [170] and [110] in a more general setup.

The geometric viewpoint on the SABR model was put forward in [90, 115], followed by the monograph [114], drawing a precise link between the so-called normal SABR model (β = ρ = 0) and Brownian motion on the hyperbolic plane. It provides an insightful means to explain the original Hagan expansion by an isometry [90, Equation (52)] to Brownian motion on a suitably chosen manifold (the SABR plane, cf. [90]) combined with a simpler, regular perturbation. The geometric insight further promoted the integration of stochastic analysis on manifolds and heat kernel expansions [59, 99] into the context of mathematical finance. It also initiated refinements of the initial Hagan expansion, which appeared to lack accuracy in the vicinity of the origin, by refining the leading order [26, 140] and providing a second order term [142]. As it turns out, although Riemannian geometry allows for a reliable interpretation of the absolutely continuous part of the SABR distribution (when X ∈ (0,∞)), it cannot provide the machinery to handle process (1.1) for the values β ∈ (0, 1) of the CEV parameter at the origin. As a result, Riemannian heat kernel expansion techniques may break down


already in leading order near the singularity at the origin. Under the assumption of hypoellipticity, short-time asymptotic results are available [22, 23, 50, 51, 45] beyond the Riemannian setup. These results are, however, not available for SABR, due to the lack of hypoellipticity of the CEV part of the distribution. For non-hypoelliptic diffusions exhibiting such degeneracies as the CEV and SABR processes, the setting of Dirichlet forms [161, 164, 57] provides a suitable machinery for such short-time asymptotic expansions.

We review here the time change of [15, 102] and suggest a perspective on the time change as a transformation of the underlying geometry of the state space of the processes. As applications (of either perspective) we present regularity properties and certain irregularities of the SABR model and related processes with respect to strong continuity on suitable Banach and Hilbert spaces and to their short- and large-time behavior.

As a first application of the time change we propose a simple method to prove the Feller-Dynkin property. This method readily applies to other stochastic models as well. Although the invariance of the Feller-Dynkin property under time change is well studied in different contexts, the setting of SABR dynamics is a particularly interesting example, since several available results, such as [108, Theorems 21.11 and 23.16], [32, Theorem 4.1] and [134, Theorem 1], do not apply here: the Brownian motions in (1.1) are not independent, the function in the multiplicative perturbation resulting from the time change is unbounded and so are the coefficients of the SDE (1.1). In the case of unbounded coefficients, Feller properties were derived in [31] under rather weak assumptions on the symbol of the generator. These, however, are not met for the SABR generator. Feller(-Dynkin) processes and Feller semigroups have a rich theory of interest in its own right, see for instance [32, 36, 47], but Feller-Dynkin properties are also of interest for numerical simulation. For instance, in [30] sample path properties and Monte Carlo methods are studied. Moreover, the general convergence theory developed in Hansen, Ostermann [91] provides a framework which allows for the derivation of splitting schemes with optimal convergence orders for unbounded operators. The applicability of their results to the SABR model requires an analytic setting in which the SABR semigroup is a strongly continuous contraction semigroup on a suitable Banach space, for which Feller properties are needed. From a financial perspective it is desirable to consider Banach spaces which include functions with unbounded growth. For this we derive so-called generalized Feller properties (cf. [53, 152]) by relating the geometry of the state space of (1.1) to that of the hyperbolic upper half-plane.

Having determined Banach spaces on which the SABR semigroup is strongly continuous, we turn to the regularity of the SABR semigroup on Hilbert spaces: we construct weighted L²-spaces on which we can associate a strongly continuous symmetric semigroup to SABR and derive the associated symmetric Dirichlet forms. Furthermore, we determine all classes of parameter configurations


for which symmetric Dirichlet forms can be associated with the SABR model. As a byproduct we derive symmetric Dirichlet forms for the CEV model and for the time-changed processes.

A further application of the time change is a characterization of the large-time behavior of the SABR process and of Brownian motion on the SABR plane. To study the large-time behavior of the SABR process (Section 1) we write the SABR process as a time change of a simpler process, the CEV process, for which the asymptotic behavior is known. If the time change does not level off (i.e. reaches infinity) in finite time, then the CEV process reaches its t → ∞ limit and hence the process of interest hits zero. Otherwise, the process of interest has a non-trivial limit behavior since it is the position of the CEV process at a finite random time. In [98] similar results are derived. We pursue this line of argumentation for general values of the parameter β in the uncorrelated SABR model and for Brownian motion on the SABR plane. The perspective of the time change as a transformation of the underlying geometry allows us to relate the large-time behavior of the latter to correlated Euclidean Brownian motions.

We emphasize the scope of applications of Dirichlet forms by commenting on the short-time behavior (Section 2) of a diffusion closely related to the CEV process, for which Varadhan-type asymptotics fail in the parameter regime β ∈ [1/2, 1), see [56]. The geometry induced by the respective Dirichlet forms corresponding to the processes (cf. [163, 162]) seems to reflect the behavior of this process near the singularities better than classical Riemannian geometry. This geometry is a rather general extension of the latter, in which the scope of sub-Riemannian geometry can also be naturally embedded, see [161, Section 1]. While in Riemannian geometry the distance between two points is determined by the squared gradient along a (length-minimizing) geodesic curve connecting them, in the Dirichlet geometry the squared gradient is replaced by the so-called energy measure. Finally, we conclude the chapter by calculating the energy measures for the SABR and CEV Dirichlet forms and observing that the time change transforms the “Dirichlet geometry” analogously to the Riemannian case.


2 The time change as a change of geometry

For notational simplicity we set α = 1 from now on, without loss of generality. A crucial observation for the time change arguments we study here is that the generator of the SABR model (1.1) factorizes as

\[
A f(x, y) = \frac{y^2}{2}\left(x^{2\beta}\,\partial^2_{x,x} + 2\rho\, x^{\beta}\,\partial^2_{x,y} + \partial^2_{y,y}\right) f(x, y) = y^2 \tilde{A} f(x, y), \qquad f \in C_c^{\infty}(D), \tag{2.1}
\]

for an operator $\tilde{A}$, where $y^2$ acts as a multiplicative perturbation. The martingale problem for the operator $\tilde{A}$ is solved by the law of a process with dynamics

\[
\begin{aligned}
d\tilde{X}_t &= \tilde{X}_t^{\beta}\, dW_t \\
d\tilde{Y}_t &= dZ_t \\
d\langle Z, W\rangle_t &= \rho\, dt \\
\tilde{X}_0 &= x, \quad \tilde{Y}_0 = y, \quad t \ge 0.
\end{aligned}
\tag{2.2}
\]

The multiplicative perturbation (2.1) of generators suggests (cf. [32, Theorem 4.1] and [19, Theorem VI.3.7]) a close relationship between (1.1) and (2.2):

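Theorem 2.1 (Random time change for SABR). Let the law of the process $(\tilde{X}_t, \tilde{Y}_t)_{t \ge 0}$ be a solution of (2.2). Then the following processes coincide in law with the SABR process $(X_t, Y_t)_{t \ge 0}$:

\[
\begin{pmatrix} \tilde{X}_{\int_0^t (Y_s)^2\, ds} \\[2pt] \tilde{Y}_{\int_0^t (Y_s)^2\, ds} \end{pmatrix}
=
\begin{pmatrix} \tilde{X}_{\tau_t^{-1}} \\[2pt] \tilde{Y}_{\tau_t^{-1}} \end{pmatrix},
\qquad t \ge 0, \tag{2.3}
\]

where the time change $\tau_t := \int_0^t \tilde{Y}_s^{-2}\, ds$ is defined up to the hitting time $t < T_0^{\tilde{Y}} = \inf\{t : \tilde{Y}_t = 0\}$. In particular, the law of (2.3) is a solution of the martingale problem (5.1) for A in (2.1).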
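The equality of the two time indices in (2.3) can be made explicit by a short computation. The following is a sketch under the assumption that the second coordinate of (2.2), written $\tilde{Y}$ here, stays strictly positive on the time interval considered, so that $\tau$ is strictly increasing and differentiable, and with $Y_t := \tilde{Y}_{\tau_t^{-1}}$ denoting the time-changed process:

\[
\tau_t = \int_0^t \tilde{Y}_s^{-2}\, ds
\quad\Longrightarrow\quad
\frac{d}{dt}\,\tau_t^{-1} = \frac{1}{\tau'\!\left(\tau_t^{-1}\right)} = \tilde{Y}_{\tau_t^{-1}}^{2} = Y_t^2
\quad\Longrightarrow\quad
\tau_t^{-1} = \int_0^t Y_s^2\, ds .
\]

In particular, $Y$ is a time-changed Brownian motion with quadratic variation $\langle Y \rangle_t = \tau_t^{-1} = \int_0^t Y_s^2\, ds$, which is consistent with the volatility equation $dY_t = Y_t\, dZ_t$ of (1.1) for α = 1.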

There is a clear advantage in studying SABR from the point of view of Theorem 2.1. In (2.2) the full coupling in the coefficients is removed and put into the time change. An important feature of the SABR time change is that it only involves the second coordinate processes $Y$ and $\tilde{Y}$. It is nothing but the time change of a Brownian motion into a geometric Brownian motion, and therefore some important calculations can be handled. Furthermore, the time change is a continuous additive functional of Brownian motion and as such it has a representation as a (unique) mixture of Brownian local times [108, Theorem 22.25]. This suggests a further perspective on the time change as a transformation of the underlying geometry, cf. Appendix 1.2. See also [7] on a related matter.

Time change and the underlying geometry of the process

In a certain sense, the time change changes the geometry underlying the state space S of the model. We shall give a brief motivation for this statement here. It


comes as no surprise that the volatility (the instantaneous variance) of a diffusion determines the geometry of the process, cf. [75, 73, 87, 99, 114, 115], see also [3]. If the instantaneous covariance matrix of the diffusion is non-degenerate on S, then its inverse determines the coefficients of a Riemannian metric on the state space of the process. For example, for the SABR process (1.1) with β = 0 and β = 1 (the normal and lognormal SABR models¹) this manifold and Riemannian metric are

\[
\begin{aligned}
S_0 &:= \mathbb{R} \times (0,\infty) \quad\text{and}\quad g_0(x, y) := \frac{1}{1-\rho^2}\left(\frac{dx^2}{y^2} - \frac{2\rho\, dx\, dy}{y^2} + \frac{dy^2}{y^2}\right), && (x, y) \in S_0, \\
S_1 &:= (0,\infty)^2 \quad\text{and}\quad g_1(x, y) := \frac{1}{1-\rho^2}\left(\frac{dx^2}{y^2 x^2} - \frac{2\rho\, dx\, dy}{y^2 x} + \frac{dy^2}{y^2}\right), && (x, y) \in S_1,
\end{aligned}
\tag{2.4}
\]

respectively². Indeed, the metric determined this way is the “intrinsic metric” for such (non-degenerate) diffusions: the short-time asymptotic behavior of their transition density p is described in the leading order by Varadhan’s formula

\[
\lim_{t \to 0} t \log p_t(x, y) = -\frac{d(x, y)^2}{2}, \qquad x, y \in S, \tag{2.5}
\]

where the distance d(·, ·) appearing on the right hand side of (2.5) is precisely the Riemannian distance induced by the Riemannian metric g(·, ·) constructed above. Moreover, the infinitesimal generator of a diffusion of the above type coincides in leading order with the Laplace operator³ of the respective manifold. The heat equation of the manifold induced by this Laplace operator is solved by the corresponding heat kernel, which determines the law of the Brownian motion of the manifold, cf. [99]. Naturally, the short-time asymptotics of this heat kernel coincide in leading order with those of the transition density p in (2.5), as Varadhan’s formula also applies to the Laplace operator of the manifold. For a complete expansion of the heat kernel see [99] and the Minakshisundaram-Pleijel recursion formulas, see [39, Section VI.3]. If the geometry is flat, the distance appearing in (2.5) is the Euclidean distance and the transition density coincides with the well-known Gaussian kernel in the leading order. Correspondingly, the geometry of the process (2.2) for the parameters β = 0, β = 1 is determined as above; this manifold and Riemannian metric are

\[
\begin{aligned}
\tilde{S}_0 &:= \mathbb{R}^2, \quad \tilde{g}_0(x, y) := \frac{1}{1-\rho^2}\left(dx^2 - 2\rho\, dx\, dy + dy^2\right), && (x, y) \in \tilde{S}_0, \\
\tilde{S}_1 &:= (0,\infty) \times \mathbb{R}, \quad \tilde{g}_1(x, y) := \frac{1}{1-\rho^2}\left(\frac{1}{x^2}\, dx^2 - \frac{2\rho}{x}\, dx\, dy + dy^2\right), && (x, y) \in \tilde{S}_1,
\end{aligned}
\tag{2.6}
\]

respectively⁴. The transition from one geometry to the other is induced, in accord with Theorem 2.1, by the time change. In fact, the multiplicative perturbation $y^2$ in (2.1) is the inverse of the density of the hyperbolic volume element

¹ Note that we have not posed any boundary conditions at x = 0 on the process (1.1) for now.

² Note that for ρ = 0, the Riemannian manifold $(S_0, g_0)$ is the hyperbolic space $\mathbb{H}^2$, see [90, Section 3.1], and the case β = 1 is related to $\mathbb{H}^3$, cf. [114, pages 176-178].

³ More commonly: Laplace-Beltrami operator, see Appendix B below or [81, Equation (3.45), page 68] for a precise definition.

⁴ Note that the geometry of $(\tilde{S}_0, \tilde{g}_0)$ is flat, i.e. Euclidean.


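$(1/y^2\, dx\, dy)$, and it is easy to see that in the uncorrelated case we indeed pass by the time change from a Euclidean geometry $(\tilde{S}_0, \tilde{g}_0)$ to the geometry of the hyperbolic plane $(S_0, g_0)$.

An analogous statement to Theorem 2.1 holds for the so-called Brownian motion on the SABR plane (henceforth SABR-Brownian motion), cf. [90, Section 3.2]. This process was first considered in [90] and is characterized as a stochastic process whose law solves the martingale problem (see (2.9) below) corresponding to the Laplace-Beltrami operator $\Delta_g$ of the so-called SABR manifold⁵ (S, g), where the manifold and metric tensor are

\[
S := (0,\infty)^2 \quad\text{and}\quad g(x, y) := \frac{1}{1-\rho^2}\left(\frac{dx^2}{y^2 x^{2\beta}} - \frac{2\rho\, dx\, dy}{y^2 x^{\beta}} + \frac{dy^2}{y^2}\right), \qquad (x, y) \in S. \tag{2.7}
\]

The martingale problems corresponding to the operators $\Delta_g$ and $\Delta_{\tilde{g}}$ are

\[
f(\bar{X}_t, Y_t) - f(\bar{X}_0, Y_0) - \int_0^t \Delta_g f(\bar{X}_s, Y_s)\, ds, \quad\text{resp.}\quad
f(\tilde{\bar{X}}_t, \tilde{Y}_t) - f(\tilde{\bar{X}}_0, \tilde{Y}_0) - \int_0^t \Delta_{\tilde{g}} f(\tilde{\bar{X}}_s, \tilde{Y}_s)\, ds,
\]

where the operators $\Delta_g$ and $\Delta_{\tilde{g}}$ have (in orthogonal coordinates) the following representation

\[
\Delta_g f = \frac{\beta}{2}\, y^2 x^{2\beta-1}\, \frac{\partial f}{\partial x} + A f
= y^2\left(\frac{\beta}{2}\, x^{2\beta-1}\, \frac{\partial f}{\partial x} + \tilde{A} f\right)
=: y^2 \Delta_{\tilde{g}} f, \qquad f \in C_0^{\infty}(D). \tag{2.8}
\]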
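The first-order term in (2.8) can be checked directly from the metric (2.7). The computation below is a sketch under two assumptions that are not stated explicitly in the text: that |ρ| < 1, and that $\Delta_g$ in (2.8) stands for one half of the Laplace-Beltrami operator of (S, g), which is the normalization under which its second-order part equals A from (2.1). With

\[
g^{-1} = y^2 \begin{pmatrix} x^{2\beta} & \rho\, x^{\beta} \\ \rho\, x^{\beta} & 1 \end{pmatrix},
\qquad
\sqrt{\det g} = \frac{1}{\sqrt{1-\rho^2}\; y^2 x^{\beta}},
\]

the products $\sqrt{\det g}\, g^{xx} = x^{\beta}/\sqrt{1-\rho^2}$, $\sqrt{\det g}\, g^{xy} = \rho/\sqrt{1-\rho^2}$ and $\sqrt{\det g}\, g^{yy} = x^{-\beta}/\sqrt{1-\rho^2}$ do not depend on y, and only the first depends on x. Hence

\[
\frac{1}{2}\,\frac{1}{\sqrt{\det g}} \sum_{i,j} \partial_i\!\left(\sqrt{\det g}\; g^{ij}\, \partial_j f\right)
= A f + \frac{1}{2}\, \sqrt{1-\rho^2}\; y^2 x^{\beta}\; \partial_x\!\left(\frac{x^{\beta}}{\sqrt{1-\rho^2}}\right) \partial_x f
= A f + \frac{\beta}{2}\, y^2 x^{2\beta-1}\, \partial_x f ,
\]

which agrees with the left-hand side of (2.8).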

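Theorem 2.2 (Random time change for SABR-Brownian motion). Consider the processes

\[
\begin{aligned}
d\bar{X}_t &= Y_t \bar{X}_t^{\beta}\, dW_t + \tfrac{\beta}{2}\, Y_t^2 \bar{X}_t^{2\beta-1}\, dt \\
dY_t &= \alpha Y_t\, dZ_t \\
d\langle Z, W\rangle_t &= \rho\, dt \\
\bar{X}_0 &= x, \quad Y_0 = y, \quad t \ge 0,
\end{aligned}
\qquad\text{resp.}\qquad
\begin{aligned}
d\tilde{\bar{X}}_t &= \tilde{\bar{X}}_t^{\beta}\, dW_t + \tfrac{\beta}{2}\, \tilde{\bar{X}}_t^{2\beta-1}\, dt \\
d\tilde{Y}_t &= dZ_t \\
d\langle Z, W\rangle_t &= \rho\, dt \\
\tilde{\bar{X}}_0 &= x, \quad \tilde{Y}_0 = y, \quad t \ge 0.
\end{aligned}
\tag{2.9}
\]

Then $(\bar{X}_t, Y_t)_{t \ge 0}$ and the time-changed process $(\tilde{\bar{X}}_{\tau_t^{-1}}, \tilde{Y}_{\tau_t^{-1}})_{t \ge 0}$ with the time change $\tau_t = \int_0^t \tilde{Y}_s^{-2}\, ds$ for $t < T_0^{\tilde{Y}} = \inf\{t : \tilde{Y}_t = 0\}$ coincide in law. Furthermore, the law of $(\bar{X}_t, Y_t)_{t \ge 0}$ resp. of $(\tilde{\bar{X}}_t, \tilde{Y}_t)_{t \ge 0}$ solves the martingale problem to $\Delta_g$ resp. to $\Delta_{\tilde{g}}$.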

⁵ Note that here the manifold is only a subset of the state space of the process, as the axis {(x, y) : x = 0} is excluded. On this set the instantaneous covariance matrix degenerates and the Riemannian metric is not defined.
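Varadhan’s formula (2.5) can be checked numerically in the simplest flat case, one-dimensional Brownian motion, where the transition density is the Gaussian kernel and d is the Euclidean distance. The following is a minimal sketch; the function names are illustrative and not taken from the thesis, and the example covers only Euclidean Brownian motion, not the SABR process itself.

```python
import math


def log_gaussian_density(t, x, y):
    """Logarithm of the transition density of 1D Brownian motion from x to y at time t."""
    return -((x - y) ** 2) / (2.0 * t) - 0.5 * math.log(2.0 * math.pi * t)


def varadhan_lhs(t, x, y):
    """The quantity t * log p_t(x, y) from (2.5), before taking the limit t -> 0."""
    return t * log_gaussian_density(t, x, y)


if __name__ == "__main__":
    x, y = 0.0, 1.0
    for t in (1e-1, 1e-3, 1e-6):
        # The values approach -d(x, y)^2 / 2 = -0.5 as t decreases.
        print(t, varadhan_lhs(t, x, y))
```

Working with the logarithm of the density avoids numerical underflow of the Gaussian kernel for small t. The correction term $-\tfrac{t}{2}\log(2\pi t)$ vanishes as t → 0, which is why only the squared distance survives in the limit.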


3 The semigroup point of view

Let us turn to applications of the SABR time change. We start with the regularity of the transition semigroup (3.1) of the SABR process (1.1). For the problem of pricing contingent claims on a forward, suppose that the stochastic process $X = (X_t)$, t ≥ 0, modeling the forward follows SABR dynamics. If f denotes the payoff function of a financial contract, the valuation of the fair price of this contract reduces to the computation of $\mathbb{E}^{(x,y)}[f(X_t, Y_t)]$, t ∈ [0, T], under some risk-neutral (martingale) measure, where (x, y) is the initial value of the forward and volatility. For a suitable set of admissible payoffs f ∈ B, these expectations form a semigroup

\[
P_t f(x, y) = \mathbb{E}^{(x,y)}[f(X_t, Y_t)], \qquad t \ge 0, \tag{3.1}
\]

of bounded linear operators.

1 Banach spaces and (generalized) Feller properties

We speak about Feller properties of the semigroup (3.1) if it satisfies:

(F1) (Semigroup properties) $P_0 = \mathrm{Id}$, and $P_{t+s} = P_t P_s$ for all t, s ≥ 0,

(F2) (Continuity properties) for all f ∈ B and x ∈ X, $\lim_{t \to 0+} P_t f(x) = f(x)$,

(F3) (Positive contraction properties) $\|P_t\|_{L(B)} \le 1$ for all t ≥ 0, and $P_t$ is positive for all t ≥ 0, that is, for any f ∈ B, f ≥ 0 implies $P_t f \ge 0$,

with the set of admissible payoffs being a suitable Banach space B. A semigroup with the properties (F1)-(F3) that is invariant on its domain B, i.e.

Pt maps B into itself (F, FD, FG)

is referred to as a Feller semigroup (F), a Feller-Dynkin semigroup (FD) or aGeneralized-Feller semigroup (FF), depending on the underlying invariant Banachspace B. We recall the Feller properties (F ) and (FD). The transition semigroup(Pt)t≥0 has the Feller property (F ), if it acts on the Banach space BF = (Cb, ||·||∞),that is if

Pt maps Cb = f : R+ × R+ → R | f continuous and bounded into itself. (F)

The transition semigroup (Pt)t≥0 has the Feller-Dynkin property (FD) if it acts on the Banach space BFD = (C∞, || · ||∞), that is, if

Pt maps C∞ = {f : R+ × R+ → R | f continuous, lim|(x,y)|→∞ f(x, y) = 0} into itself. (FD)

Note that strong continuity of the semigroup Pt (which is the required analytic setting in [91]) is an immediate consequence of the Feller-Dynkin property (FD) [108, p. 369] but not of the weaker Feller property (F). While the Feller property (F) is a direct consequence of the well-posedness of the martingale problem, the Feller-Dynkin property is not always straightforward to verify.


Lemma 3.1. The SABR semigroup (3.1) for (1.1) and the heat semigroup (3.1) corresponding to SABR-Brownian motion (2.9) (henceforth SABR-heat semigroup) satisfy the Feller property (F).

Theorem 3.2. The SABR semigroup (3.1) for (1.1) and the SABR-heat semigroup both satisfy the Feller-Dynkin property (FD).

1.1 The generalized Feller property and weighted spaces

In [53] a theoretical framework is constructed in which the applicability of splitting and cubature methods is extended to the more realistic case of unbounded function spaces. We suggest a framework of so-called weighted spaces for the SABR model, on which Feller-like properties (properties (F1)-(F3) and (FF)) hold on an appropriate invariant Banach space B. This setting is a natural generalization of the Feller-Dynkin property (FD) and is suitable for pricing options with unbounded payoff. That is, the set of payoffs under consideration, which was C∞ for the Feller-Dynkin property, is now extended to an appropriate Banach space Bψ, which includes functions whose growth is controlled by some admissible function ψ (see Definition 3.3 below). For this, we first recall the framework of [53]. We denote by D the state space of the stochastic process under consideration.

Definition 3.3 (Admissible weight functions and weighted spaces). On a completely regular Hausdorff space D, a function ψ : D → (0,∞) is an admissible weight function if the sub-level sets

KR := {x ∈ D : ψ(x) ≤ R} (3.2)

are compact for all R > 0. A pair (D, ψ), where D is a completely regular Hausdorff space and ψ an admissible weight function, is called a weighted space.

Note that admissibility renders weight functions ψ lower semi-continuous and bounded from below by some δ > 0, cf. [53].

Lemma 3.4. Consider the set

\[
B^{\psi}(D) := \Big\{ f : D \to \mathbb{R} \;:\; \sup_{x \in D} \psi(x)^{-1} \|f(x)\|_{\mathbb{R}} < \infty \Big\},
\quad \text{with} \quad
\|f\|_{\psi} := \sup_{x \in D} \psi(x)^{-1} \|f(x)\|_{\mathbb{R}} .
\]

Then the pair (Bψ(D), || · ||ψ) is a Banach space and Cb(D) ⊂ Bψ(D), where (Cb(D), || · ||∞) denotes the space of bounded continuous functions endowed with the supremum norm, see [53, Section 2.1].

Definition 3.5. For a weighted space (D, ψ) we define the function space Bψ as the closure

\[ B^{\psi} := \overline{C_b(D)}^{\,\|\cdot\|_{\psi}} \tag{3.3} \]

of the set of bounded continuous functions in Bψ(D) under the norm || · ||ψ. We refer to elements of the space Bψ as functions with growth controlled by ψ.


The spaces {Bψ : ψ admissible} in Definition 3.5 above coincide with the spaces constructed in [152, equation (2.2)], where the authors study well-posedness of martingale problems in the sense of Stroock and Varadhan in an SPDE setting. A key feature of Bψ spaces is that they allow for a complete characterization of their respective dual spaces Bψ(D)∗ via a Riesz representation, which was proved for Bψ-spaces in [53, Theorem 2.5] and [152, Theorem 5.1] independently. We stated above that the considered Banach spaces are a generalization of the space C∞(D) to ones which include functions of unbounded growth. In fact, the following criterion given in [53] restores the vanishing-at-infinity property for the functions |f|/ψ weighted by admissible functions ψ.

Proposition 3.6. Consider the sets KR := {x ∈ D : ψ(x) ≤ R} for positive real R and any f : D → R. Then f ∈ Bψ if and only if f|KR ∈ C(KR) for all R > 0 and

\[ \lim_{R \to \infty} \; \sup_{x \in D \setminus K_R} \psi(x)^{-1} |f(x)| = 0 . \]

Proof. See [53, Theorem 2.7]
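The weighted norm and the decay criterion of Proposition 3.6 can be explored numerically on a toy example. The sketch below is not taken from [53]: it assumes the hypothetical weighted space D = R with ψ(x) = 1 + x² (whose sub-level sets are compact intervals), approximates suprema on a finite grid, and compares a function of linear growth with one that grows as fast as ψ.

```python
def psi(x):
    """Hypothetical admissible weight on D = R; {psi <= R} is a compact interval."""
    return 1.0 + x * x


def weighted_sup(f, points):
    """Grid approximation of the weighted norm sup_x |f(x)| / psi(x)."""
    return max(abs(f(x)) / psi(x) for x in points)


def tail_sup(f, R, points):
    """Grid approximation of the supremum over the complement of K_R = {psi <= R}."""
    outside = [x for x in points if psi(x) > R]
    return max(abs(f(x)) / psi(x) for x in outside)


def linear(x):
    return x          # unbounded, linear growth


def quadratic(x):
    return x * x      # grows exactly as fast as psi


# The grid [-1000, 1000] with step 0.01 truncates D; this is an approximation only.
grid = [k / 100.0 for k in range(-100000, 100001)]
levels = (10.0, 100.0, 1000.0)

norm_linear = weighted_sup(linear, grid)
tails_linear = [tail_sup(linear, R, grid) for R in levels]
tails_quadratic = [tail_sup(quadratic, R, grid) for R in levels]

print("weighted norm of x:", norm_linear)
print("tails of x:", tails_linear)
print("tails of x^2:", tails_quadratic)
```

On this grid the tails of the linear function shrink as R grows, while those of x² stay close to one. This suggests that x² has finite weighted norm but fails the criterion of Proposition 3.6, which is the distinction between Bψ(D) and the closure Bψ.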

Definition 3.7 (Generalized Feller Property). Consider the family (Pt)t≥0 of bounded linear operators on a weighted space Bψ(D). The semigroup has the generalized Feller property (cf. [53, Section 3]) if (F1), (F2) and the following properties are satisfied:

(F3) for any f ∈ Bψ(X), f ≥ 0 implies Ptf ≥ 0.

(F4) there exist a constant C ∈ R and ε > 0 such that ||Pt||L(Bψ(D)) ≤ C for all t ∈ [0, ε].

It is immediate that (F3) covers the positivity statement of the classical positive contraction property stated earlier. The crucial property is (F4), which yields the contraction property as well as the invariance (FF) of the domain Bψ under the semigroup action. As stated above, the applicability of convergence theorems requires an analytic setting in which the SABR semigroup is a strongly continuous contraction semigroup on a suitable Banach space B. The strong continuity of Pt on Bψ is stated in the following theorem.

Theorem 3.8. Let (Pt)t≥0 be a generalized Feller semigroup on the Banach space Bψ(D). Then (Pt)t≥0 is a strongly continuous semigroup on (Bψ, || · ||ψ).

Proof. See [53, Theorem 3.2].

The following theorem and corollary provide a characterization of suitable Banach spaces Bψ on which the semigroups of the SABR processes (1.1) and (2.9) are generalized Feller semigroups (and hence, by Theorem 3.8, strongly continuous) and which contain payoff functions of polynomial growth.


Theorem 3.9 (Generalized Feller properties for the SABR-heat and SABR semigroups). Consider the SABR-heat semigroup (resp. the SABR semigroup)

Ptf(x, y) := E(x,y)[f(Xt, Yt)], t ≥ 0, (x, y) ∈ D, f ∈ Bψc,n(D), (3.4)

where (X, Y) is the process (2.9) (resp. the process (1.1)), for a family of functions ψc,n := Ln(n+1) ∘ rc, n ∈ N, where Ln(n+1) denote Legendre polynomials of order n, n ∈ N. For c ∈ [0,∞), let rc : (0,∞)² → (0,∞) denote the family of functions

\[
r_c(x, y) := \frac{1 + y^2}{2y} + \frac{\left( \frac{x^{1-\beta}}{1-\beta} - \rho y - c \right)^2}{(1 - \rho^2)\, 2y}, \qquad (x, y) \in (0,\infty)^2. \tag{3.5}
\]

(i) The functions ψc,n, n ∈ N, c ∈ [0,∞), are admissible weight functions (cf. Definition 3.3).

(ii) The SABR-heat semigroup is a generalized Feller semigroup on the Banach spaces Bψc,n(D) for any (c, n) ∈ [1,∞) × N and any configuration of SABR parameters (ρ, β) ∈ (−1, 1) × [0, 1). The same statement holds for (c, n) ∈ [0, 1) × N in the parameter regime (ρ, β) ∈ (0, 1) × [0, 1) and, as long as c > |ρ|, also for the parameters (ρ, β) ∈ (−1, 0] × [0, 1). Otherwise⁶ the statement holds under the restriction (ρ, β) ∈ (−1, 0) × {(2m−1)/(2m), m ∈ N}.

(iii) The function ψ : (0,∞)² → (0,∞) defined for any β ∈ [0, 1) as

\[ \psi(x, y) := y + 2x^{1-\beta} + \frac{x^{2-2\beta}}{y}, \qquad (x, y) \in (0,\infty)^2, \tag{3.6} \]

is an admissible weight function, and for any (β, ρ) ∈ [0, 1) × (−1, 1) the SABR semigroup is a generalized Feller semigroup (and hence strongly continuous) on the space Bψ(D), where ψ is the function (3.6) with β matching the SABR parameter.

(iv) Furthermore, the SABR semigroup is a generalized Feller semigroup on the spaces Bψ0,n(D), n ∈ N, for any configuration of SABR parameters (ρ, β) ∈ (−1, 0) × ({0} ∪ {(2m−1)/(2m), m ∈ N}).

Corollary 3.10. For any β ∈ [0, 1) there exists an N ∈ N large enough such that the Banach spaces Bψn(D), for n ∈ N, contain functions of linear growth. Hence for any β ∈ [0, 1) one can find N ∈ N such that the SABR (resp. the SABR-heat) semigroup is a strongly continuous contraction semigroup on a space Bψn(D) which contains payoff functions of European call contracts.

For example, if in the SABR model ρ < 0 and one believes that the market behaves like β = 1/2, then the space Bψ(D) with ψ as in (3.6) as well as any of the spaces Bψc,n(D) with c = 0 and n ≥ 1 are suitable. For β = 3/4, one can choose any Bψc,n(D) with c = 0 and n ≥ 2.
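The building blocks of these weight functions are easy to evaluate. The sketch below encodes rc and ψ as we read (3.5) and (3.6) from the text; the function names are ours. In the special case β = 0, ρ = 0, c = 0 the expression for rc reduces algebraically to (1 + x² + y²)/(2y), which is the hyperbolic cosine of the distance between (x, y) and (0, 1) in the upper half-plane with metric (dx² + dy²)/y². The code checks this identity numerically, together with the factorisation ψ(x, y) = (y + x^(1−β))²/y.

```python
import math


def r_c(x, y, beta, rho, c):
    """r_c as read from (3.5):
    (1 + y^2)/(2y) + (x^(1-beta)/(1-beta) - rho*y - c)^2 / ((1 - rho^2) * 2y)."""
    shifted = x ** (1.0 - beta) / (1.0 - beta) - rho * y - c
    return (1.0 + y * y) / (2.0 * y) + shifted * shifted / ((1.0 - rho * rho) * 2.0 * y)


def cosh_hyperbolic_distance(p, q):
    """cosh of the distance in the upper half-plane with metric (dx^2 + dy^2)/y^2."""
    (x1, y1), (x2, y2) = p, q
    return 1.0 + ((x1 - x2) ** 2 + (y1 - y2) ** 2) / (2.0 * y1 * y2)


def psi_36(x, y, beta):
    """Weight function as read from (3.6): y + 2 x^(1-beta) + x^(2-2 beta)/y."""
    return y + 2.0 * x ** (1.0 - beta) + x ** (2.0 - 2.0 * beta) / y


for (x, y) in [(0.3, 0.7), (2.0, 1.0), (5.0, 0.2)]:
    gap = r_c(x, y, 0.0, 0.0, 0.0) - cosh_hyperbolic_distance((x, y), (0.0, 1.0))
    square = (y + math.sqrt(x)) ** 2 / y
    print((x, y), "gap to cosh d:", gap, "psi for beta = 1/2:", psi_36(x, y, 0.5), "vs", square)
```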

⁶ That is, c ≤ |ρ| with ρ < 0.


2 Hilbert Spaces and Dirichlet forms

In the previous section, regularity properties of the semigroups (3.1) corresponding to SABR-type processes were investigated on the Banach spaces BF = (Cb, || · ||∞) and BFD = (C∞, || · ||∞) as well as BFG = (Bψ, || · ||ψ). Now the underlying spaces will be Hilbert spaces (H, || · ||H), endowed with a norm induced by the scalar product, || · ||H = (·, ·)H^{1/2}. Therefore the Riesz representation property, which was an essential feature of the Banach spaces constructed in Section 1, is now encoded in the Hilbert space structure. In this setting we can also speak about symmetry properties of the semigroup.

Definition 3.11. A family (Pt)t≥0 of operators with domain H is a strongly continuous symmetric semigroup on (H, (·, ·)H) if it satisfies

(S1) (Semigroup properties) P0 = Id and Pt+s = PtPs for all s, t > 0,

(S2) (Strong continuity) limt→0 ||Ptf − f||H = 0 for all f ∈ H,

(S3) (Contraction property) ||Ptf||H ≤ ||f||H for all f ∈ H and all t > 0,

and

(Ptf, g)H = (f, Ptg)H for all f, g ∈ H, and Pt maps H into itself. (S)

The set of symmetric semigroups will be denoted by Π. Furthermore, the set⁷ of closed (symmetric bilinear) forms on H is denoted by F and is defined as follows, cf. [67, Section 1.3].

Definition 3.12. A symmetric form on (H, (·, ·)H) is a pair (E, D(E)), where E is a non-negative definite symmetric bilinear form with dense domain D(E) ⊂ H. That is, E is a symmetric form if D(E) ⊂ H is a dense linear subspace and E satisfies

(B1) (Nonnegativity) E : D(E) × D(E) → R and E(f, f) ≥ 0 for all f ∈ D(E),

(B2) (Symmetry) E(f, g) = E(g, f) and E(f, f) ≥ 0 for all f, g ∈ D(E),

(B3) (Bilinearity) E(αf, g) = αE(f, g) as well as E(f1 + f2, g) = E(f1, g) + E(f2, g) for all f, g, f1, f2 ∈ D(E) and α ∈ R.

A closed form is a symmetric form on (H, (·, ·)H) such that the pair (D(E), || · ||E1) with norm ||f||E1 := √(E(f, f) + ||f||²H), f ∈ D(E), is a Hilbert space. That is, if

(B4) (Completeness) the space D(E) is complete with respect to the norm || · ||E1.

A Dirichlet form is a closed form on (H, (·, ·)H) which satisfies

(B5) (Markovianity) for all u ∈ D(E), u+ ∧ 1 ∈ D(E) and E(u+ u+ ∧ 1, u+ ∧ 1) ≤ E(u, u).

⁷ The correspondence between the sets Π and F is the one described in [67, Section 1.3], for example via Riesz representation for the generator of the semigroup.


Definition 3.13. Let D(A) ⊂ H be a dense subset of the Hilbert space H. A linear operator A : D(A) → H is self-adjoint if A = A∗ and D(A) = D(A∗), where A∗ denotes the adjoint operator: (Af, g)H = (f, A∗g)H for all f ∈ D(A), g ∈ D(A∗), and

D(A∗) = {g ∈ H : ∃ v ∈ H s.th. ∀ f ∈ D(A), (Af, g)H = (f, v)H}. (3.7)

A self-adjoint operator A is called negative if spec(A) ⊂ (−∞, 0]. The set of negative self-adjoint operators is denoted by A.

Proposition 3.14. Let F, A and Π be the sets of closed forms, of negative self-adjoint operators and of symmetric semigroups on the Hilbert space H as in Definitions 3.12, 3.13 and 3.11 respectively. Then the following diagram is commutative and the maps Φ, Θ and Ψ are bijective:

\[
\begin{array}{ccc}
F & \xrightarrow{\;\Phi\;} & A \\
{\scriptstyle \Psi}\,\big\uparrow & \nearrow\,{\scriptstyle \Theta} & \\
\Pi & &
\end{array}
\tag{3.8}
\]

See [34] for details and for the construction of the maps (3.8). In Appendix 3 we recall the mappings Φ and Φ−1. If (E, D(E)) is a Dirichlet form, we call (Φ(E), Φ(D(E))) the generator of the Dirichlet form. Below we construct Dirichlet forms (see Theorem 3.17) whose generators correspond to the infinitesimal generators (2.1) and (2.8) of the SABR model and of the SABR-Brownian motion on a suitable domain Φ(D(E)).
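The correspondence between forms, generators and semigroups can be made tangible in finite dimensions. The following toy example is our own and not part of the construction in [34]: on H = R² with the Euclidean inner product we take the generator of a symmetric two-state Markov chain with a hypothetical jump rate, write its semigroup in closed form, and define the form through the common convention E(f, g) = −(Af, g), which makes E non-negative. The sign and normalisation conventions of the source may differ.

```python
import math

A_RATE = 0.7  # hypothetical jump rate of a symmetric two-state chain


def generator(f):
    """Negative self-adjoint operator A on H = R^2 (eigenvalues 0 and -2a)."""
    f1, f2 = f
    return (A_RATE * (f2 - f1), A_RATE * (f1 - f2))


def semigroup(t, f):
    """P_t = exp(tA) in closed form: the mean is preserved, the difference decays."""
    f1, f2 = f
    mean, diff = 0.5 * (f1 + f2), 0.5 * (f1 - f2)
    decay = math.exp(-2.0 * A_RATE * t)
    return (mean + decay * diff, mean - decay * diff)


def inner(f, g):
    """Euclidean inner product on R^2."""
    return f[0] * g[0] + f[1] * g[1]


def form(f, g):
    """Form associated with A via E(f, g) = -(Af, g) = a (f1 - f2)(g1 - g2)."""
    return -inner(generator(f), g)


f, g = (1.0, -2.0), (0.5, 3.0)
print("symmetry of P_t:", inner(semigroup(1.1, f), g), inner(f, semigroup(1.1, g)))
print("symmetry of E:  ", form(f, g), form(g, f))
print("contraction:    ", inner(semigroup(1.1, f), semigroup(1.1, f)), "<=", inner(f, f))
```

In this toy setting each of the three objects determines the other two: A is recovered as the derivative of Pt at t = 0, and E from A by pairing with the inner product.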

Theorem 3.15 (Dirichlet forms for SABR: General parameters). The only parameter configurations of the SABR model (1.1) such that there exist a Hilbert space Hm := L2(S, m(x, y)dxdy), for a dxdy-a.s. positive Borel function m : S → [0,∞), and a bilinear form

\[ \mathcal{E}_m(f_1, f_2) := \frac{1}{2} \int_S \Gamma(f_1, f_2)\, m(x, y)\, dx\, dy, \]

for a symmetric operator Γ, such that

(Af1, f2)Hm = Em(f1, f2), for f1, f2 ∈ C∞0(S),

is satisfied by the SABR infinitesimal generator A in (2.1) are the following cases:

(i) β = 0, ρ ∈ [0, 1): this case is covered in (iii) of Theorem 3.17, and it holds furthermore that Af = ∆gf for all f ∈ C∞0(S). Note in particular that in the special case ρ = 0, ∆g is the Laplace-Beltrami operator of the hyperbolic plane, cf. [90].

(ii) β ∈ [0, 1], ρ = 0: this case is covered in (iv) of Theorem 3.17. Note that in this case it holds that Af = ∆Υµf for all f ∈ C∞0(S), where ∆Υµ denotes the Laplace-Beltrami operator of the weighted manifold (S, g, Υµ), where g is as in (2.7), µ = √det(g) and Υ(x, y) = x^{−β}, see (5.49) and also [81, Definition 3.17].


(iii) β ∈ [0, 1], ρ ∈ (0, 1), ν = 0: this (univariate) case is covered in Lemma 3.18.

(iv) β = 1, ρ ∈ (0, 1): In this case the weighted space on which the symmetry of the generator is established has the weight function

\[ m(x, y) = \frac{1}{y^2\, x^{1 + 1/(1-\rho^2)}} \exp\left( \frac{\rho}{\sqrt{1-\rho^2}\; y} \right). \tag{3.9} \]

For all other parameter configurations of the SABR model (1.1), corresponding Dirichlet forms fail to satisfy the symmetry property (B2) for any m(x, y)dxdy.

Corollary 3.16. It is a consequence of Theorem 3.15 above and of the one-to-one correspondence between closed forms and symmetric semigroups stated in Proposition 3.14 that the cases (i)-(iv) above are the only configurations of SABR parameters for which a symmetric semigroup can be associated with the SABR model on a Hilbert space Hm = L2(S, m(x, y)dxdy) as above.

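Theorem 3.17 (Symmetric Dirichlet forms for SABR-Brownian motion and uncorrelated SABR). We consider the domain S := R × (0,∞) and the Hilbert spaces Hj := L2(S, mj(x, y)dxdy), j = 0, 1, for the measures

\[ m_j(x, y)\, dx\, dy := \frac{1}{\rho\, x^{\beta(1+j)}\, y^2}\, dx\, dy, \qquad j = 0, 1, \]

and on Hj the following bilinear forms

\[ \mathcal{E}_{m_j}(f_1, f_2) := \frac{1}{2} \int_S \Gamma(f_1, f_2)\, m_j(x, y)\, dx\, dy, \tag{3.10} \]

where the integrand is defined as

\[ \Gamma(f_1, f_2) := y^2 \left( x^{2\beta}\, \partial_x f_1\, \partial_x f_2 + 2\rho\, x^{\beta}\, \partial_x f_1\, \partial_y f_2 + \partial_y f_1\, \partial_y f_2 \right). \tag{3.11} \]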

Then the following statements hold:

(i) The pair (Emj, C∞0(S)) is a symmetric form on Hj for j = 0, 1 and closable⁸ for j = 0 for all β ∈ [0, 1) and for j = 1 for all β ∈ [0, 1/2).

(ii) The pair (Emj, D(Emj)) is a Dirichlet form on Hj for j = 0 for all β ∈ [0, 1) and for j = 1 for all β ∈ [0, 1/2), with the domain

\[
D(\mathcal{E}_{m_j}) := \Big\{ u \in H_j,\ \mathcal{B}(S)\text{-measurable, s.th. } \forall\, u_y, u_x\ \exists \text{ abs. cont. version, s.th. } \Gamma(u, u) \in L^1(S, m_j(x, y)\, dx\, dy) \Big\}, \tag{3.12}
\]

where uy := (x ↦ u(x, y)), ux := (y ↦ u(x, y)).

(iii) The generator Φ(Em0) (see Appendix 3) satisfies

(Φ(Em0)f1, f2)Hm0 = (∆gf1, f2)Hm0 = Em0(f1, f2), for f1, f2 ∈ C∞0(S),

⁸ See Sections 2 and 1.1 for the definition and implications of closability.


where ∆g is the Laplace-Beltrami operator in (2.8)⁹. For β = 0 in particular

(Φ(Em0)f1, f2)Hm0 = (Af1, f2)Hm0 = Em0(f1, f2), f1, f2 ∈ C∞0(S), (3.13)

where A denotes the SABR infinitesimal generator (2.1).

(iv) Furthermore, for ρ = 0 the SABR infinitesimal generator A in (2.1) and the generator Φ(Em1) satisfy

(Φ(Em1)f1, f2)Hm1 = (Af1, f2)Hm1 = Em1(f1, f2), for f1, f2 ∈ C∞0(S).

Lemma 3.18 (Symmetric Dirichlet form for CEV). Consider the Hilbert space

Hβ := L2(R,mβ(x)dx)

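with the speed measure mβ(x)dx := 1/x^{2β} dx, and on Hβ the following bilinear form

\[ \mathcal{E}_c(f_1, f_2) := \frac{1}{2} \int_{\mathbb{R}} \Gamma_c(f_1, f_2)\, m_\beta(x)\, dx, \tag{3.14} \]

where the integrand is defined as

\[ \Gamma_c(f_1, f_2) := \sigma\, x^{2\beta}\, \partial_x f_1\, \partial_x f_2. \tag{3.15} \]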

Then the following statements hold:

(i) The pair (Ec, C∞0(R)) is a symmetric form on Hβ for β ∈ [0, 1] and closable for β ∈ [0, 1/2).

(ii) The pair (Ec, D(Ec)) is a Dirichlet form on Hβ for β ∈ [0, 1/2) with the domain

\[
D(\mathcal{E}_c) := \Big\{ u \in H_\beta,\ \mathcal{B}(\mathbb{R})\text{-measurable, s.th. } \exists \text{ abs. cont. version for which } \Gamma_c(u, u) \in L^1(\mathbb{R}, m_\beta(x)\, dx) \Big\}. \tag{3.16}
\]

(iii) It holds for all β ∈ [0, 1] that (σ²x^{2β}∂xxf1, f2)H = Ec(f1, f2), for f1, f2 ∈ C∞0((0,∞)).
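The role of the weight 1/x^{2β} can be seen from a short integration by parts, sketched here for f1, f2 ∈ C∞0((0,∞)) so that all boundary terms vanish. Constants and sign conventions are left aside and may differ from the normalisation used in the lemma:

```latex
\[
\int_0^\infty \big( x^{2\beta}\, \partial_{xx} f_1 \big)\, f_2\, \frac{dx}{x^{2\beta}}
= \int_0^\infty \partial_{xx} f_1\, f_2\, dx
= -\int_0^\infty \partial_x f_1\, \partial_x f_2\, dx
= -\int_0^\infty x^{2\beta}\, \partial_x f_1\, \partial_x f_2\, \frac{dx}{x^{2\beta}} .
\]
% With Lebesgue measure instead of the weighted measure, a first-order term appears:
\[
\int_0^\infty \big( x^{2\beta}\, \partial_{xx} f_1 \big)\, f_2\, dx
= -\int_0^\infty x^{2\beta}\, \partial_x f_1\, \partial_x f_2\, dx
  \;-\; 2\beta \int_0^\infty x^{2\beta - 1}\, \partial_x f_1\, f_2\, dx .
\]
```

The right-hand side of the first identity is symmetric in f1 and f2, whereas the additional term in the second identity is not (unless β = 0). This is the elementary reason why the weighted space is the natural setting for a symmetric form.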

Remark 3.19. Note that in [96] further (non-symmetric) bilinear forms for the CEV model are considered on the larger spaces L2((0,∞), 1/x^β dx) and L2((0,∞), dx). Note also that for β ∈ [0, 1], the CEV manifold is ((0,∞), gc), with the Riemannian metric gc(x, x) := σ²x^{2β} dx ⊗ dx. In particular, the distance between points a, b > 0 remains finite as a → 0 for all β ∈ [0, 1), but the limit becomes infinite for β = 1. Note also that, although the weights mβ ensure symmetry for any β, for β ∈ [1/2, 1] the induced measure is no longer a Radon measure and therefore available closability results (see Section 2) are no longer applicable.
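The statement about distances to the origin can be checked with a closed-form computation. The sketch below assumes that the length element is dx/(σ x^β), that is, that the metric is taken as the inverse of the squared diffusion coefficient σ²x^{2β}, in line with the description of the intrinsic metric in the uniformly elliptic case. The function name and the default σ = 1 are our own choices.

```python
import math


def cev_distance(a, b, beta, sigma=1.0):
    """Length of the interval [a, b], 0 < a <= b, for the line element dx / (sigma * x^beta).

    Assumption: the metric is the inverse of the squared diffusion coefficient.
    """
    if beta == 1.0:
        # the integral of dx / x is logarithmic and diverges as a -> 0
        return math.log(b / a) / sigma
    return (b ** (1.0 - beta) - a ** (1.0 - beta)) / (sigma * (1.0 - beta))


for beta in (0.0, 0.5, 0.9, 1.0):
    distances = [cev_distance(a, 1.0, beta) for a in (1e-2, 1e-4, 1e-8)]
    print("beta =", beta, "distances from a to 1:", [round(d, 3) for d in distances])
```

For β < 1 the distances stay below 1/(σ(1 − β)) as a decreases, while for β = 1 they grow like log(1/a).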

⁹ Note however that D(Φ(Em0)) ≠ W^2_0(S, µ(x, y)dxdy). The latter is the domain of the Dirichlet-Laplace-Beltrami operator on S.


Analogous statements to Theorem 3.17 hold for the processes (2.2), (2.9).

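Theorem 3.20 (Symmetric Dirichlet forms for the time changed processes). We now consider the Hilbert spaces Hj := L2(R2, mj(x, y)dxdy), j = 0, 1, for the measures

\[ m_j(x, y)\, dx\, dy := \frac{1}{\rho\, x^{\beta(1+j)}}\, dx\, dy, \qquad j = 0, 1, \]

and on Hj the following bilinear forms

\[ \mathcal{E}_{m_j}(f_1, f_2) := \frac{1}{2} \int_{\mathbb{R}^2} \Gamma(f_1, f_2)\, m_j(x, y)\, dx\, dy, \tag{3.17} \]

where the integrand is defined as

\[ \Gamma(f_1, f_2) := x^{2\beta}\, \partial_x f_1\, \partial_x f_2 + 2\rho\, x^{\beta}\, \partial_x f_1\, \partial_y f_2 + \partial_y f_1\, \partial_y f_2. \tag{3.18} \]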

Then the following statements hold:

(i) The pair (Emj, C∞0(R2)) is a symmetric form on Hj for j = 0, 1 and closable for j = 0 for all β ∈ [0, 1) and for j = 1 for all β ∈ [0, 1/2).

(ii) The pair (Emj, D(Emj)) is a Dirichlet form on Hj for j = 0 for all β ∈ [0, 1) and for j = 1 for all β ∈ [0, 1/2), with the domain

\[
D(\mathcal{E}_{m_j}) := \Big\{ u \in H_j,\ \mathcal{B}(\mathbb{R}^2)\text{-measurable, s.th. } \forall\, u_y, u_x\ \exists \text{ abs. cont. version, s.th. } \Gamma(u, u) \in L^1(S, m_j(x, y)\, dx\, dy) \Big\}, \tag{3.19}
\]

where uy := (x ↦ u(x, y)), ux := (y ↦ u(x, y)).

(iii) The generator Φ(Em0) satisfies

(Φ(Em0)f1, f2)Hm0 = (∆gf1, f2)Hm0 = Em0(f1, f2), for f1, f2 ∈ C∞0(S),

where ∆g is as in (2.8). For β = 0 it holds in particular that

(Φ(Em0)f1, f2)Hm0 = (Af1, f2)Hm0 = Em0(f1, f2), f1, f2 ∈ C∞0(S), (3.20)

where A is as in (2.1).

(iv) Furthermore, A in (2.1) for ρ = 0 and the generator Φ(Em1) satisfy

(Φ(Em1)f1, f2)Hm1 = (Af1, f2)Hm1 = Em1(f1, f2), for f1, f2 ∈ C∞0(S).

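Remark 3.21. The infinitesimal generator of a diffusion which solves the SDE

\[
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\, dW_t, \\
dY_t &= Y_t\, dZ_t - \rho\beta\, Y_t^2 X_t^{\beta-1}\, dt, \\
d\langle W, Z\rangle_t &= \rho\, dt,
\end{aligned}
\qquad \text{resp.} \qquad
\begin{aligned}
dX_t &= X_t^{\beta}\, dW_t, \\
dY_t &= dZ_t - \rho\beta\, X_t^{\beta-1}\, dt, \\
d\langle W, Z\rangle_t &= \rho\, dt,
\end{aligned}
\tag{3.21}
\]

coincides on C∞0(S) with the generators of the Dirichlet forms Φ(Em1), resp. Φ(Em1), in (iv) of Theorem 3.17 resp. Theorem 3.20 for any β ∈ [0, 1] and ρ ∈ (0, 1). Indeed, for ρ = 0 in (3.21), the uncorrelated SABR model (1.1) resp. the process (2.2) is obtained.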


4 Asymptotics

A direct application of the SABR time change is a characterization of its large-time behavior. In [98] similar asymptotic conclusions are derived in a log-normal setting, and in [98, Example 5.2] a special case of the SABR model is presented (β = 0, ρ = 1), where the process a.s. has a non-trivial limit.

Figure II.1: Large-time behavior of the SABR model on the time horizon T = 100 years. The sample paths of the process are either absorbed at zero or they level off at a positive value. The image shows M = 50 sample paths of the X-coordinate of the SABR process (1.1) for the parameters y = 0.3, α = 0.45, ρ = −0.25, β = 0.5. Sample paths are generated by an explicit Euler scheme with absorbing boundary conditions at zero.

1 Large-time asymptotics

The SABR process and the SABR-Brownian motion have a non-trivial large-time behavior, and the time change gives insight into the sample-path behavior of the model. The second coordinate process (Yt)t≥0 of (1.1) (resp. of (2.9)) is a driftless geometric Brownian motion and as such converges almost surely to 0. For the first coordinate two scenarios are possible: either the geometric Brownian motion Y stays long enough over some threshold so that the CEV process X (resp. X) “has time” to hit zero; or Y gets small quickly enough so that the fluctuations of X level off and X (resp. X) converges to a non-zero limit. The next theorem shows that both happen with positive probability, both for the (uncorrelated) SABR model (1.1) and for the SABR-Brownian motion (2.9):

Theorem 4.1. Let (X, Y) denote the uncorrelated SABR model (i.e. we set ρ = 0 in (1.1)) and let (X, Y) denote the SABR-Brownian motion (2.9) with β ∈ [0, 1) and ρ ∈ (−1, 1). Then in both cases the limit

limt→∞ (Xt, Yt) =: (X∞, Y∞) resp. limt→∞ (Xt, Yt) =: (X∞, Y∞) (4.1)


exists a.s. and it holds that 0 < P(X∞ > 0) < 1 resp. 0 < P(X∞ > 0) < 1.
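The dichotomy can be observed in a simulation in the spirit of Figure II.1. The following sketch uses an explicit Euler step for X with absorption at zero and an exact log-normal step for Y. The volatility parameters follow the figure caption, which uses a correlated model (set rho=0.0 for the uncorrelated case of the theorem). The initial forward x0, the step size and the number of paths are hypothetical choices, since the source does not state them here.

```python
import math
import random


def simulate_terminal_values(x0=0.2, y0=0.3, alpha=0.45, rho=-0.25, beta=0.5,
                             horizon=100.0, dt=0.05, n_paths=100, seed=1):
    """Terminal values X_T of Euler paths of dX = Y X^beta dW, dY = alpha Y dZ.

    Paths that reach zero (or would step below it) are absorbed at zero.
    x0, dt, n_paths and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    sqrt_dt = math.sqrt(dt)
    rho_bar = math.sqrt(1.0 - rho * rho)
    terminal = []
    for _ in range(n_paths):
        x, y = x0, y0
        for _ in range(n_steps):
            g1 = rng.gauss(0.0, 1.0)
            g2 = rng.gauss(0.0, 1.0)
            x = x + y * x ** beta * sqrt_dt * g1
            if x <= 0.0:
                x = 0.0   # absorbing boundary: the path stays at zero
                break
            dz = sqrt_dt * (rho * g1 + rho_bar * g2)
            y = y * math.exp(alpha * dz - 0.5 * alpha * alpha * dt)
        terminal.append(x)
    return terminal


values = simulate_terminal_values()
absorbed = sum(1 for v in values if v == 0.0)
print("absorbed paths:", absorbed, "of", len(values))
print("largest surviving terminal value:", max(values))
```

The empirical fraction of surviving paths is only a finite-horizon, discretised proxy for P(X∞ > 0); it illustrates the statement of the theorem and does not replace the proof.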

The time change reduces the characterization of the limiting behavior of the process (1.1) with β = 0, ρ ∈ (−1, 1) on a hyperbolic plane to determining the hitting time of the coordinate axis of two correlated Brownian motions in the first quadrant of the (Euclidean) plane. Therefore, a lower bound for the probability P(X∞ > 0) for the SABR-Brownian motion (2.9) follows from [28].

From a financial perspective, such time change constructions can be used to investigate whether the price process has the potential to hit zero in finite time. Such properties have implications for option prices in the limits of extreme strikes, as already remarked in [98]. Indeed, the Moment Formula of [119] relates the behavior of the implied volatility for extreme strikes to the price of a Put (resp. Call) option. This model-independent result was refined in [24, 84] and extended in [46, 85] to the case where the price process has positive mass at zero. Based on these insights, arbitrage-free wing asymptotics were derived for the SABR model in [86], where an asymptotic approximation of the mass at zero is available for the regimes ρ = 0 and β = 0.

2 Short-time asymptotics and a generalized distance

In the non-degenerate (i.e. uniformly elliptic) case a unique geometry and intrinsic metric is determined by the diffusion, which describes the short-time asymptotic behavior (2.5) of the diffusion at leading order. Similar statements can be made beyond the uniformly elliptic case in the realm of the geometry of Dirichlet forms, see [56, 97, 164], and in fact it was shown in [138] that the small-time asymptotic behavior of a symmetric elliptic diffusion on a Lipschitz manifold is determined by the intrinsic metric defined in terms of the associated Dirichlet form. Moreover, a converse statement is also true: in [162] the question whether a diffusion¹⁰ is determined by its intrinsic metric is answered affirmatively (for diffusions with continuous coefficients). In the uniformly elliptic case, the intrinsic metric d(x, y) is the Riemannian distance resulting from the Riemannian metric associated with the inverse of the matrix of coefficients of the highest order term of the operator:

\[
d^2(x, y) = \inf\Big\{ \int_0^1 \langle \dot\gamma^i(t), \dot\gamma^j(t) \rangle\, g_{i,j}(\gamma(t))\, dt \;:\; \gamma \in C^1([0,1]),\ \gamma(0) = x,\ \gamma(1) = y \Big\},
\]

see [163, Section 5.5.4], where the term in the integral is the squared gradient of a parametrised curve. Note that the Riemannian distance is invariant under reparametrisation and it is conventional to parametrise to arc length of the curve (cf. [118, Section 6]). In this case, the squared gradient has unit length, and hence the Eikonal equation (which is the starting point of the analysis in [26]) is satisfied along the whole curve.

¹⁰ See Appendix 4 for a reminder of the definition of a diffusion in the more general context of Dirichlet forms.


Let us turn to those cases where the instantaneous covariance matrix and the infinitesimal generator are degenerate. The intrinsic metric in a subelliptic case is discussed in [106], [163, Theorem 4.3], and in [87] the distance in the Heston model is studied specifically. The asymptotic relation (2.5) is true beyond the uniformly elliptic setup, see for example [22, 23, 45, 50, 51] for short-time asymptotics in the hypoelliptic case. For diffusions with other types of degeneracies, similar asymptotic results and a suitable generalization of the intrinsic metric for Dirichlet forms are discussed in [97, 56, 163]. Results of [56] indicate that for large classes of degenerate elliptic diffusions the asymptotic relation (2.5) fails, and difficulties often arise from non-ergodic behavior. In a series of articles (e.g. [57, 58, 56]) second-order operators of the form

\[ A = -\sum_{i,j=1}^{d} \partial_i\, \xi_{i,j}\, \partial_j \]

are considered on Rd with bounded real symmetric measurable coefficients such that ξi,j ≥ 0 almost everywhere, hence allowing for the possibility that (ξi,j)i,j is degenerate. The setup includes a univariate operator

\[ \partial_x \left( \frac{x^{2\beta}}{(1 + x^2)^{\beta}} \right) \partial_x, \tag{4.2} \]

for β ≥ 0, which approximates for x ∼ 0 the generator of the CEV model. It is shown in [56] that for β ∈ [0, 1/2) the validity of the asymptotic relation (2.5) prevails for (4.2) and for β ≥ 1/2 it fails. In the latter case, the origin is naturally absorbing, and hence acts as an impenetrable boundary although the Riemannian distance d(−x, x) is finite for any x > 0, see [56]. Indeed, it is well known that a similar phenomenon holds for the CEV model: the origin is naturally absorbing for β ≥ 1/2, see [15], although the Riemannian distance to the origin is finite, see Remark 3.19. In [56] a more general metric is proposed, which better reflects the behavior of the diffusion at zero. In Appendix 4 we include a precise formulation of the intrinsic metric considered in [56] for the operator (4.2) for easy reference. The approach taken in [56] was introduced in [97], where a set-theoretic version of Varadhan’s formula is proven for open sets A, B ⊂ Rd,

\[ \lim_{t \to 0} t \log p_t(A, B) = -\frac{d^2(A, B)}{2}, \tag{4.3} \]

where the set-theoretic distance d(A, B) is, as usual, the infimum of d(x, y), x ∈ A, y ∈ B, for a suitable intrinsic metric d(x, y) on Rd, determined by the Dirichlet form. Under certain regularity assumptions on the latter (see [161, Section 1] for details) the intrinsic metric can be written in the following form

\[ d(x, y) = \sup\Big\{ u(x) - u(y) \;:\; u \in D_{loc}(\mathcal{E}) \cap C(X),\ \frac{d\Gamma(u, u)}{dm} \le 1 \Big\}, \]

where the squared gradient is generalized to the so-called energy measure (see carré du champ operators in Appendix 2 for the definition and the monographs [34, 67] for a comprehensive discussion thereof). That is, the energy measure Γ


is absolutely continuous with respect to the speed measure m, and the density dΓ(u, u)/dm (z) should be interpreted as the square of the gradient of u at z ∈ X, cf. [161].
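As a sanity check of what a Varadhan-type relation asserts, consider the textbook case of standard Brownian motion on R, which is not one of the models treated here. Its transition density is Gaussian and the intrinsic metric is the Euclidean distance, so t log pt(x, y) can be evaluated in closed form and compared with −d(x, y)²/2 as t decreases. The points and time steps below are arbitrary illustrative values.

```python
import math


def t_log_heat_kernel(t, x, y):
    """t * log p_t(x, y) for p_t(x, y) = exp(-(x - y)^2 / (2t)) / sqrt(2 pi t)."""
    return -0.5 * (x - y) ** 2 - 0.5 * t * math.log(2.0 * math.pi * t)


x, y = 0.0, 1.5
target = -0.5 * (x - y) ** 2   # -d(x, y)^2 / 2 with the Euclidean distance d
times = (1e-1, 1e-2, 1e-4, 1e-6)
errors = [abs(t_log_heat_kernel(t, x, y) - target) for t in times]

for t, err in zip(times, errors):
    print("t =", t, " |t log p_t + d^2/2| =", err)
```

The error equals (t/2)|log(2πt)| and vanishes as t → 0. In degenerate situations such as β ≥ 1/2 above, it is this kind of convergence towards the Riemannian distance that can fail.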

Corollary 4.2 (to Theorems 3.17 and 3.20 and Lemma 3.18). The energy measures for the SABR and CEV Dirichlet forms (3.10) and (3.14) as well as the time-changed Dirichlet form (3.17) are the operators (3.11), (3.15) and (3.18) respectively. On S (resp. (0,∞)) these energy measures in fact coincide with the squared gradient of the respective (weighted) manifolds, where the Riemannian metric is as in (2.7) for SABR and as in Remark 3.19 for CEV.

See Theorem 6.19 in Appendix 2 for a proof of the statements about the energy measure and [81, pages 108-109: “Second Proof”] to compare with the squared gradient on a weighted manifold. Note that in the time-changed forms the energy measures, and hence the intrinsic metric (Dirichlet distances), change accordingly as predicted.

Remark 4.3. Results in the spirit of [56] leading to Varadhan-type asymptotics often rely on the assumption that the topology induced by the intrinsic metric is equivalent to the original topology of the underlying space X and that all balls Br(x) of radius r > 0, x ∈ X, are relatively compact, cf. [56, Section 2, Condition L]. Note that if the topologies coincide (which is the case for the Riemannian distance, cf. [117, Proposition 11.20]), the fact that all balls are relatively compact is equivalent to the completeness of the metric space, cf. [161], which implies in the case of Riemannian manifolds (by the Hopf-Rinow theorem, cf. [118, Theorem 6.13]) that the manifold has no boundary. This is the case for the manifold (S1, g1) considered in (2.6), but not the case for (S, g) in (2.7) for β ∈ [0, 1), since the Riemannian distance to the coordinate axis {(x, y) ∈ R2 : x = 0} is infinite for β = 1, but finite for β ∈ [0, 1).

5 Proofs

Proposition 5.1. Imposing absorbing boundary conditions at X = 0, the SABR martingale problem is well-posed for any β ∈ [0, 1]. That is, the process

\[ M_t^f := f(X_t, Y_t) - f(X_0, Y_0) - \int_0^t A f(X_s, Y_s)\, ds, \qquad f \in C_c^\infty(D), \tag{5.1} \]

with A as in (2.1) is a martingale for any smooth function with compact support, and uniqueness in law holds for solutions of (1.1) for any initial value (x, y) ∈ D. Furthermore, the solutions P(x,y), x, y ≥ 0, form a (strong) Markov family, and even pathwise uniqueness holds.

Proof of Proposition 5.1. From Itô’s formula it follows for SABR that (5.1) is a local martingale¹¹. The statement about pathwise uniqueness follows by local

¹¹ In fact, by [61, Chapter 4.8, p. 228], this is sufficient for the subsequent time change arguments.


Lipschitz continuity of the coefficients in (1.1) away from the origin. Well-posedness of the martingale problem and uniqueness are then a consequence of the Yamada-Watanabe theorem, see for instance [108, Lemma 21.17]. Finally, the strong Markov property follows from the well-posedness of the martingale problem [108, Theorem 21.11, p. 421].

Remark 5.2. In fact, for β ∈ [1/2, 1], the SABR martingale problem is well-posed without imposing any boundary conditions at zero, which essentially follows from the well-posedness of the CEV martingale problem for these parameters, see [14]. For the parameters β ∈ [0, 1/2), uniqueness holds by local Lipschitz continuity of the coefficients on R>0 × R>0, and imposing absorbing boundary conditions at X = 0, uniqueness holds on the whole state space R≥0 × R>0.

Proof of Theorems 2.1 and 2.2. The statement of Theorems 2.1 and 2.2 follows directly from the Dambis-Dubins-Schwarz Theorem [110, Theorem 1] or [108, Proposition 21.13], and it is immediate that the time change t ↦ ∫_0^t (Ys)² ds is nothing but a time change of a Brownian motion Y started at y > 0 to a geometric Brownian motion Y.

Remark 5.3. We remark here briefly that the corresponding time change, which transforms a Brownian motion $(W_t)_{t \ge 0}$ started at $x > 0$ into a CEV process with parameter $\beta \in [0, 1]$, reads $t \mapsto \int_0^t |W_s|^{2\beta}\, ds$ with corresponding stopping time $\tau^\beta_t = \inf\{u : (\int_0^{\cdot} |W_s|^{2\beta}\, ds)_u \ge t\}$. Note also that a similar scaling can be induced following [110, Theorem 2]. That is, we replace the Brownian motion $Z$ by a symmetric $\alpha$-stable process, see [110, Definition 1]. The corresponding time change is $t \mapsto \int_0^t |Y_s|^{\alpha}\, ds$, and for $\alpha = 2$ the symmetric $\alpha$-stable Lévy motion in [110, Theorem 2] is a Brownian motion.

Proof of Lemma 3.1. The martingale problem is well-posed, so the SABR process is a (strong) Markov process and the corresponding semigroup $P_t f(x, y) = E_{x,y}[f(X_t, Y_t)]$, $t \ge 0$, automatically satisfies a version of the Feller property with weak continuity properties. It is a consequence of the well-posedness and the general theory of martingale problems that $P_t$ satisfies the (simple) Feller property: the semigroup properties (F) are implied by the Chapman-Kolmogorov equations of the Markov process, the pointwise continuity (F) is a consequence of the continuity of paths, and the property (F) that for any fixed $t \ge 0$ the operator $P_t$ maps $C_b(D)$ back into $C_b(D)$ is equivalent (cf. [108, Lemma 19.3]) to convergence in distribution $(X^{x'}_t, Y^{y'}_t) \to (X^{x}_t, Y^{y}_t)$ as $(x', y') \to (x, y)$, which is implied by uniqueness in law for solutions of (1.1), cf. [108, Theorem 21.9].

Note that if the coefficients of the SDE were bounded, then well-posedness of the martingale problem would readily yield the Feller-Dynkin property, cf. [108, Theorem 21.11]. Since the coefficients of the SDE (1.1) are unbounded, the Feller-Dynkin property needs to be proven separately.


Proof of the Feller-Dynkin property for SABR and SABR-Brownian motion

We use the SABR time change in order to prove the Feller-Dynkin property of SABR. By the time change one relates SABR to a CEV model running on a stochastic clock, which in turn allows us to derive the Feller-Dynkin property for SABR from Feller's classification of the boundary at infinity for the CEV process. The property which distinguishes a Feller-Dynkin process (FD) from a simple Feller process (F) is the requirement that for any $f \in C_\infty[0, \infty)$ and any $t \ge 0$ the following convergence property is satisfied:

\[ \lim_{x \to \infty} E_x\big[f(X_t)\big] = 0. \tag{5.2} \]

Hence, the Feller-Dynkin property for SABR is established by the following proposition:

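Proposition 5.4. Let $\tilde{X}$ be the CEV process (5.3) with parameters $\beta \in [0, 1]$ and $\sigma > 0$ on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$, resp. the process^12 $\hat{X}$ corresponding to the first coordinate of SABR-Brownian motion (2.9):

\[
\begin{aligned}
d\tilde{X}_t &= \sigma \tilde{X}_t^{\beta}\, dW_t, & t > 0,\ \tilde{X}_0 = x > 0, \\
\text{resp.} \quad d\hat{X}_t &= \sigma \hat{X}_t^{\beta}\, dW_t + \sigma^2 \beta \hat{X}_t^{2\beta - 1}\, dt, & t > 0,\ \hat{X}_0 = x > 0.
\end{aligned}
\tag{5.3}
\]

Furthermore, for any $t \ge 0$ and $y \ge 0$, let $(\tau^{y}_t)_{t \ge 0,\, y \in [0, \infty)}$ be a family of $\mathbb{P}$-a.s. finite $(\mathcal{F}_t)_{t \ge 0}$-stopping times, such that for any $t \ge 0$

\[ \tau^{\tilde{y}}_t \xrightarrow{\ \mathbb{P}\ } \tau^{y}_t, \qquad \text{as } \tilde{y} \to y. \tag{5.4} \]

Then for any $f \in C_\infty[0, \infty)$ and $y \in [0, \infty)$ the following convergence statements hold:

\[ \lim_{x \to \infty,\ \tilde{y} \to y} E_x\big[f(\tilde{X}_{\tau^{\tilde{y}}_t})\big] = 0 \qquad \text{resp.} \qquad \lim_{x \to \infty,\ \tilde{y} \to y} E_x\big[f(\hat{X}_{\tau^{\tilde{y}}_t})\big] = 0. \tag{5.5} \]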

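Proof. For notational simplicity we only consider $\tilde{X}$ here. The statement (5.5) follows if for any $\varepsilon > 0$ and $r > 0$ there exist constants $N := N(\varepsilon, r) > 0$ and $\delta := \delta(\varepsilon, r)$ such that

\[ P_x\big[\tilde{X}_{\tau^{\tilde{y}}_t} \le r\big] < \varepsilon \qquad \text{for all } x \ge N,\ \tilde{y} \in B_y(\delta). \tag{5.6} \]

Let us first consider $\tau^{y}_t$ deterministic, say $\tau^{y}_t = t < \infty$. Then indeed, for any $f \in C_\infty[0, \infty)$ and any $\varepsilon > 0$, let $\bar{f} := \sup_{s \in [0, \infty)} f(s)$ and let $[0, r] \subset [0, \infty)$ denote the compact set such that $|f(s)| < \varepsilon$ for $s \notin [0, r]$. Then

\[ E_x\big[f(\tilde{X}_t)\big] = E_x\big[f(\tilde{X}_t) 1_{\{\tilde{X}_t > r\}}\big] + E_x\big[f(\tilde{X}_t) 1_{\{\tilde{X}_t \le r\}}\big] \le \varepsilon + \bar{f}\, P_x\big[\tilde{X}_t \le r\big]. \]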

12 The process $\hat{X}$ corresponds to a Stratonovich-SDE version of the CEV process $\tilde{X}$, cf. [144, Chapter 5, Example 2, p. 284].


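Now let $\varepsilon > 0$ be arbitrary but fixed. Since $\tau^{\tilde{y}}_t$, $\tau^{y}_t$ are a.s. finite, there exists a $T > 0$ such that

\[ P\big[\tau^{y}_t > T\big] < \frac{\varepsilon}{3}, \tag{5.7} \]

and since $\tau^{\tilde{y}}_t$, $\tau^{y}_t$ are stopping times, for any $\delta > 0$ the sets $\{|\tau^{\tilde{y}}_t - \tau^{y}_t| > \delta\}$ are measurable. By (5.4) there exists a $\delta > 0$ such that

\[ P\big[|\tau^{\tilde{y}}_t - \tau^{y}_t| > \delta\big] < \frac{\varepsilon}{3}. \tag{5.8} \]

Then for any $r > 0$ there exists an $N(\varepsilon, r, T) > 0$ such that

\[
\begin{aligned}
P_x\big[\tilde{X}_{\tau^{\tilde{y}}_t} \le r\big]
&= P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{y}_t \le T\}\big] + P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{y}_t > T\}\big] \\
&\overset{(5.7)}{<} P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{y}_t \le T\} \cap \{|\tau^{\tilde{y}}_t - \tau^{y}_t| \le \delta\}\big] \\
&\qquad + P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{y}_t \le T\} \cap \{|\tau^{\tilde{y}}_t - \tau^{y}_t| > \delta\}\big] + \frac{\varepsilon}{3} \\
&\overset{(5.8)}{<} P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{y}_t \le T\} \cap \{|\tau^{\tilde{y}}_t - \tau^{y}_t| \le \delta\}\big] + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} \\
&\le P_x\big[\{\tilde{X}_{\tau^{\tilde{y}}_t} \le r\} \cap \{\tau^{\tilde{y}}_t \le T + \delta\}\big] + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} \\
&\le P_x\big[\exists\, t \in [0, T + \delta] : \tilde{X}_t \in [0, r]\big] + \frac{\varepsilon}{3} + \frac{\varepsilon}{3}.
\end{aligned}
\tag{5.9}
\]

The last probability in (5.9) coincides with the probability of hitting $[0, r]$ before $T + \delta$:

\[ P_x\big[\exists\, t \in [0, T + \delta] : \tilde{X}_t \in [0, r]\big] = P_x\big[T^{\tilde{X}}_{[0,r]} \le T + \delta\big]. \]

Hence, (5.6) follows if for any $\varepsilon$, $r$, $\tilde{T} := T + \delta > 0$ there exists a constant $N(\varepsilon, r, \tilde{T}) > 0$ such that

\[ P_x\big[T^{\tilde{X}}_{[0,r]} \le \tilde{T}\big] < \frac{\varepsilon}{3} \tag{5.10} \]

for all $x \ge N(\varepsilon, r, \tilde{T})$. For regular diffusions on an interval $[a, b] \subset \mathbb{R}$, Feller's boundary classification [108, p. 461, eq. (21)] provides a sufficient condition for (5.10): Observe that the value of $P_x[T^{\tilde{X}}_{[0,r]} \le T]$ is clearly non-decreasing in $T$ and, by the strong Markov property, non-increasing in $x$. Hence if the right boundary point ($b = \infty$) is not of entrance type^13, then it holds in particular for any finite $T > 0$ that

\[ \lim_{x \to \infty} P_x\big[T^{\tilde{X}}_{[0,r]} \le T\big] = \inf_{x > r} P_x\big[T^{\tilde{X}}_{[0,r]} \le T\big] = 0, \tag{5.11} \]

where the first equality holds by monotonicity in $x$, the second by monotonicity in $T$.

13 That is, by [108, p. 461], if

\[ \lim_{T \to \infty} \Big( \inf_{x > r} P_x\big[T^{\tilde{X}}_{[0,r]} \le T\big] \Big) = 0, \qquad r > 0. \]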


Lemma 5.5. Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ be a stochastic basis and let $\tilde{X}$ denote the CEV process on the interval $[a, b] = [0, \infty]$ with parameters $\beta \in [0, 1]$ and $\sigma > 0$, resp. $\hat{X}$ its Stratonovich version as in (5.3). Then for any $\beta \in [0, 1]$ the right endpoint $b = \infty$ is not an entrance boundary for $\tilde{X}$ and $\hat{X}$.

Proof. For the CEV process it is well known that the right endpoint $b = \infty$ is not an entrance boundary for any $\beta \in [0, 1]$. We recall the argument from [108, Theorem 23.13 (iii)] here. Note that the speed measure $\nu$ of the process $\tilde{X}$ has density $\sigma(x)^{-2} = x^{-2\beta}$, $x \ge 0$ ([108, p. 458]). Then the claim follows from Feller's boundary classification ([108, Theorem 23.12]) and from

\[ \int_1^\infty x\, \nu(dx) = \int_1^\infty x^{1 - 2\beta}\, dx = \begin{cases} \infty & \text{if } \beta \le 1, \\ \dfrac{1}{2\beta - 2} & \text{if } \beta > 1. \end{cases} \]

For the process $\hat{X}$ in (5.3) the function $p(x) = \frac{x^{\beta + 1}}{\beta + 1}$ is a scale function (see [108, Theorem 23.7, p. 456]). In fact, the process $S_t := p(\hat{X}_t)$, $t \ge 0$, satisfies

\[ dS_t = c\, S_t^{2\beta/(\beta + 1)}\, dW_t, \qquad t > 0, \quad S_0 = p(x), \tag{5.12} \]

for $c := \frac{(\beta + 1)^{2\beta/(\beta + 1)}}{2}$, and the arguments from [108, Theorem 23.13 (iii)] apply to $S$. Furthermore, $2\beta/(\beta + 1) \in [0, 1]$ whenever $\beta \in [0, 1]$, and $b = \infty$ is not entrance.
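The integral test used in this proof can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it evaluates the closed form of the truncated integral $\int_1^M x^{1-2\beta}\,dx$ for growing cutoffs $M$ and checks that the transformed exponent $2\beta/(\beta+1)$ stays in $[0, 1]$. The function names `tail_integral` and `transformed_exponent` are hypothetical helpers introduced only for this illustration.

```python
import math

def tail_integral(beta, upper):
    """Closed form of the truncated integral int_1^upper x^(1 - 2*beta) dx."""
    exponent = 2.0 - 2.0 * beta
    if exponent == 0.0:          # beta = 1: the integrand is 1/x
        return math.log(upper)
    return (upper ** exponent - 1.0) / exponent

def transformed_exponent(beta):
    """CEV-type exponent 2*beta/(beta + 1) of the process S in (5.12)."""
    return 2.0 * beta / (beta + 1.0)

for beta in (0.0, 0.5, 1.0):
    # For beta <= 1 the truncated integrals keep growing with the cutoff.
    print(beta, [tail_integral(beta, 10.0 ** k) for k in (2, 4, 6)],
          transformed_exponent(beta))
```

For $\beta > 1$ (outside the SABR parameter range) the same closed form converges to $1/(2\beta - 2)$ as the cutoff grows, in line with the second case of the displayed integral.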

Proof of the Generalized Feller property for SABR and SABR-Brownian motion

According to the following Lemma 5.6, a key ingredient for the characterization of the Banach spaces $B^{\psi}$ in Theorem 3.9 is to construct admissible weight functions $\psi$ which are sub-eigenfunctions (that is, functions which satisfy (5.13)) of the respective infinitesimal generators $\Delta_g$ of (2.9) resp. $A$ of (1.1).

Lemma 5.6 (Reduction to Sub-Eigenspaces). Let $A$ denote the infinitesimal generator of the semigroup $(P_t)_{t \ge 0}$ and $\psi$ an admissible weight function. Then property (F4) follows if there exists a constant $\lambda > 0$ such that

\[ A\psi(x) \le \lambda \psi(x) \qquad \text{for all } x \in D. \tag{5.13} \]

Proof. Assume that (5.13) holds. Then there is an $\varepsilon > 0$ such that

\[ P_t \psi(x) \le \psi(x) + \lambda \int_0^t P_s \psi(x)\, ds, \qquad \text{for any } t \in [0, \varepsilon]. \tag{5.14} \]

Then Gronwall's inequality yields that

\[ |P_t \psi(x)| \le \tilde{\lambda}\, \psi(x), \qquad \text{for any } t \in [0, \varepsilon],\ x \in D. \tag{5.15} \]

The definition of the norm $\|\cdot\|_{\psi}$ yields with $\lambda' := \|f\|_{\psi}$ the inequality

\[ f(x) \le \lambda' \psi(x) \qquad \text{for all } x \in D. \]

Hence, by positivity (F3) of the semigroup and by linearity, the following estimate

\[ P_t f(x) \le \lambda' P_t \psi(x) \le \hat{\lambda}\, \psi(x), \qquad \text{for any } x \in D,\ t \in [0, \varepsilon], \tag{5.16} \]

holds, where we used (5.15) in the second inequality, setting $\hat{\lambda} := \lambda' \tilde{\lambda}$.
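The Gronwall step leading from (5.14) to (5.15) can be written out as follows. This is a sketch under the assumption that $t \mapsto P_t\psi(x)$ is nonnegative and locally integrable on $[0, \varepsilon]$; the explicit constant $e^{\lambda \varepsilon}$ is our reading of the constant appearing in (5.15), not a statement taken from the text.

```latex
\begin{align*}
u(t) := P_t\psi(x) &\le \psi(x) + \lambda \int_0^t u(s)\, ds,
  && t \in [0, \varepsilon], \\
\Longrightarrow \qquad u(t) &\le \psi(x)\, e^{\lambda t}
  \le e^{\lambda \varepsilon}\, \psi(x),
  && t \in [0, \varepsilon].
\end{align*}
```

In particular the constant on the right-hand side depends only on $\lambda$ and $\varepsilon$, and not on the point $x \in D$.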

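Lemma 5.7 (An admissible weight function $\psi$: The ad-hoc approach). The function

\[ \psi : S \longrightarrow (0, \infty), \qquad \psi(x, y) := y + 2x^{1 - \beta} + \frac{x^{2 - 2\beta}}{y} \]

is a sub-eigenfunction of the SABR infinitesimal generator $A$, that is,

\[ y^2 \Big( x^{2\beta} \frac{\partial^2}{\partial x^2} + 2\rho x^{\beta} \frac{\partial^2}{\partial x \partial y} + \frac{\partial^2}{\partial y^2} \Big) \psi(x, y) \le 2\psi(x, y) \qquad \text{holds for all } (x, y) \in S. \]

Proof. The derivatives of $\psi$ are

\[
\begin{aligned}
\partial_{xx} \psi(x, y) &= (2 - 2\beta)(1 - 2\beta) \frac{x^{-2\beta}}{y} - 2\beta(1 - \beta) x^{-1 - \beta}, \\
\partial_{xy} \psi(x, y) &= -(2 - 2\beta) \frac{x^{1 - 2\beta}}{y^2}, \\
\partial_{yy} \psi(x, y) &= \frac{2 x^{2 - 2\beta}}{y^3}.
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
A\psi(x, y) &= (2 - 2\beta)(1 - 2\beta)\, y - 2\beta(1 - \beta) x^{\beta - 1} y^2 - 2\rho(2 - 2\beta) x^{1 - \beta} + \frac{2 x^{2 - 2\beta}}{y} \\
&\le 2y + 4x^{1 - \beta} + 2\frac{x^{2 - 2\beta}}{y} = 2\psi(x, y).
\end{aligned}
\]

The claimed inequality follows from the estimates

\[
\begin{aligned}
(2 - 2\beta)(1 - 2\beta) &\le 2 && \text{for all } \beta \in [0, 1], \\
-2\beta(1 - \beta) x^{\beta - 1} y^2 &\le 0 && \text{for all } (x, y) \in \mathbb{R}_{\ge 0} \times \mathbb{R}_{+} \text{ and all } \beta \in [0, 1], \\
-2\rho(2 - 2\beta)\, x^{1 - \beta} &\le 4\, x^{1 - \beta} && \text{for all } x \in \mathbb{R}_{\ge 0}, \text{ all } \beta \in [0, 1] \text{ and all } \rho \in [-1, 1].
\end{aligned}
\]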
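The sub-eigenfunction inequality of Lemma 5.7 can be checked numerically. The sketch below is an illustration only: it evaluates the closed form of $A\psi$ from the proof on a small parameter grid (the grid values are arbitrary choices of ours), compares it with a central finite-difference approximation of $y^2(x^{2\beta}\partial_{xx} + 2\rho x^{\beta}\partial_{xy} + \partial_{yy})\psi$, and records the largest ratio $A\psi/\psi$. All function names are hypothetical.

```python
def psi(x, y, beta):
    """Weight function psi(x, y) = y + 2 x^(1-beta) + x^(2-2beta) / y."""
    return y + 2.0 * x ** (1.0 - beta) + x ** (2.0 - 2.0 * beta) / y

def generator_closed_form(x, y, beta, rho):
    """A psi as computed in the proof of Lemma 5.7."""
    return ((2 - 2 * beta) * (1 - 2 * beta) * y
            - 2 * beta * (1 - beta) * x ** (beta - 1) * y ** 2
            - 2 * rho * (2 - 2 * beta) * x ** (1 - beta)
            + 2 * x ** (2 - 2 * beta) / y)

def generator_finite_diff(x, y, beta, rho, h=1e-3):
    """Central differences for y^2 (x^(2b) d_xx + 2 rho x^b d_xy + d_yy) psi."""
    f = lambda a, b: psi(a, b, beta)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return y ** 2 * (x ** (2 * beta) * fxx + 2 * rho * x ** beta * fxy + fyy)

# Arbitrary illustration grid over (x, y, beta, rho).
grid = [(x, y, b, r)
        for x in (0.5, 1.0, 2.0, 5.0) for y in (0.5, 1.0, 2.0)
        for b in (0.0, 0.25, 0.5, 0.75, 1.0) for r in (-0.9, 0.0, 0.9)]
worst_ratio = max(generator_closed_form(*p) / psi(p[0], p[1], p[2]) for p in grid)
print("largest A psi / psi on the grid:", worst_ratio)
```

A grid check of this kind does not replace the three estimates in the proof; it only confirms that the closed form agrees with the differential operator and that the bound by $2\psi$ is respected at the sampled points.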

The ad-hoc approach yields a suitable weight function for all SABR parameters. However, this function grows only at a rate $x^{2 - 2\beta}$, and therefore with this choice of $\psi$, for parameters $\beta > \tfrac{1}{2}$ the space $B^{\psi}(D)$ does not include the payoff function of a European call option. Therefore one might want to extend this approach to higher exponents of the following form:

Remark 5.8 (Extension to higher exponents of the ad-hoc approach). Let us now observe the above type of candidates for sub-eigenfunctions, while varying the exponents of $x, y$:

\[ \psi(x, y) = C_1 \frac{x^{n - m\beta}}{y^{h}} + C_2 x^{k - l\beta} + C_3 y^{i} x^{j}, \]

for some constants $C_1, C_2, C_3 \in \mathbb{R}$, where $h, i, j, k, l, m, n \in \mathbb{R}_{\ge 0}$ are such that $n - m\beta > 1$ or $j > 1$, with the aim to determine conditions on the constants $h, i, j, k, l, m, n \in \mathbb{R}_{\ge 0}$ such that there exists $\lambda > 0$ for which $A\psi(x, y) \le \lambda \psi(x, y)$ is satisfied for all $x, y$.

Instead we take a geometric approach, which yields more insight into the general structure of the above type of sub-eigenspaces: We will determine true eigenspaces of the hyperbolic Laplace-Beltrami operator and construct from these suitable eigenspaces of the generator of the SABR-heat semigroup and also sub-eigenspaces of the above type for arbitrarily high (integer) exponents for the SABR infinitesimal generator. Indeed, in [90] a local isometry is introduced from the SABR-plane to the Poincaré-plane:

\[
\begin{aligned}
\varphi : (S, g) &\longrightarrow (\mathbb{H}, h), \\
(x, y) &\longmapsto (\hat{x}, \hat{y}) := \Big( \frac{x^{1 - \beta}}{\bar{\rho}(1 - \beta)} - \frac{\rho y}{\bar{\rho}},\ y \Big),
\end{aligned}
\tag{5.17}
\]

where $\bar{\rho} = \sqrt{1 - \rho^2}$.

Lemma 5.9 (Radial eigenspaces on the Poincaré plane). Radial eigenfunctions of the hyperbolic Laplace-Beltrami operator $\Delta_h$ are well known. They are of the form

\[ L_\lambda(r_Z(z)) = L_\lambda(\cosh(d(Z, z))) = L_\lambda\Big( 1 + \frac{(x - X)^2 + (y - Y)^2}{2yY} \Big), \qquad z \in \mathbb{H}^2, \]

where $Z = (X, Y) \in \mathbb{H}$ is an arbitrary fixed reference point, $d$ denotes the hyperbolic distance

\[ d(Z, z) = \operatorname{arcosh}\Big( 1 + \frac{(x - X)^2 + (y - Y)^2}{2yY} \Big), \qquad Z, z \in \mathbb{H}, \tag{5.18} \]

and $L_\lambda$ is a Legendre polynomial with $\lambda = -n(n + 1)$ for some $n \in \mathbb{N}$.

Proof. The hyperbolic Laplace-Beltrami operator $\Delta_h$ has the following representation in polar coordinates around the reference point $Z$:

\[ \Delta_h = \frac{\partial^2}{\partial t^2} + \coth(t) \frac{\partial}{\partial t} + \frac{1}{\sinh(t)^2} \Delta_{S^1}, \tag{5.19} \]

where $t(z) := d(Z, z)$, for $z \in \mathbb{H}^2$, denotes the hyperbolic distance from $Z$, $S^1$ denotes the unit circle about $Z$, and $\Delta_{S^1}$ the Laplace-Beltrami operator on the unit circle; see [81, pages 82 and 275]. When $\Delta_h$ is applied to radial functions $f(d(Z, \cdot))$, constant on the spheres $\{z \in \mathbb{H}^2 : d(Z, z) = t\}$ of fixed hyperbolic radius $t$ around $Z$, the spherical component $\Delta_{S^1}$ vanishes. Hence, radial eigenfunctions of $\Delta_h$ are, for appropriate $\lambda \in \mathbb{R}$, solutions of the equation

\[ \frac{\partial^2}{\partial t^2} f(t) + \coth(t) \frac{\partial}{\partial t} f(t) = \lambda f(t). \tag{5.20} \]

On a hyperbolic-cosine scale $r = \cosh(t)$, and with $f(t) = L_\lambda(r)$, the equation (5.20) is transformed to Legendre's differential equation

\[ (1 - r^2) \frac{\partial^2}{\partial r^2} L_\lambda(r) - 2r \frac{\partial}{\partial r} L_\lambda(r) = \lambda L_\lambda(r). \tag{5.21} \]

For $\lambda = -n(n + 1)$, $n \in \mathbb{N}$, Legendre polynomials of degree $n$ are solutions to Legendre's differential equation (5.21). Hence, radial eigenfunctions of the hyperbolic Laplace-Beltrami operator $\Delta_h$ with eigenvalue $\lambda$ (solutions to the equation (5.20)) are of the form

\[ L_\lambda(r_Z(z)) = L_\lambda(\cosh(d(Z, z))) = L_\lambda\Big( 1 + \frac{(x - X)^2 + (y - Y)^2}{2yY} \Big), \qquad z \in \mathbb{H}^2, \]

where $Z = (X, Y) \in \mathbb{H}$ is an arbitrary fixed reference point, $d$ denotes the hyperbolic distance

\[ d(Z, z) = \operatorname{arcosh}\Big( 1 + \frac{(x - X)^2 + (y - Y)^2}{2yY} \Big), \qquad Z, z \in \mathbb{H}, \tag{5.22} \]

and $L_\lambda$ is a Legendre polynomial with $\lambda = -n(n + 1)$ for some $n \in \mathbb{N}$.

Example 5.10. The simplest example is the hyperbolic distance itself, on a hyperbolic-cosine scale. Choosing the reference point $Z = (0, 1)$, the function

\[ r_0(z) := \cosh d(z, Z) = 1 + \frac{x^2 + (y - 1)^2}{2y}, \qquad z \in \mathbb{H}^2, \]

is an eigenfunction of $\Delta_h$ satisfying $\Delta_h r_0 = 2 r_0$.
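The identity $\Delta_h r_0 = 2 r_0$ of Example 5.10 can be verified numerically. The sketch below assumes the standard upper half-plane form $\Delta_h = y^2(\partial_{xx} + \partial_{yy})$ of the hyperbolic Laplace-Beltrami operator (an assumption about the normalization, which is not restated in this section) and approximates the second derivatives by central differences; the helper names are hypothetical.

```python
def r0(x, y):
    """cosh of the hyperbolic distance to Z = (0, 1), as in Example 5.10."""
    return 1.0 + (x * x + (y - 1.0) ** 2) / (2.0 * y)

def hyperbolic_laplacian(f, x, y, h=1e-3):
    """Central-difference approximation of y^2 (f_xx + f_yy) (assumed form)."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h ** 2
    return y * y * (fxx + fyy)

# Arbitrary sample points in the upper half-plane.
points = [(x, y) for x in (-2.0, 0.0, 1.5) for y in (0.5, 1.0, 3.0)]
for x, y in points:
    print((x, y), hyperbolic_laplacian(r0, x, y), 2.0 * r0(x, y))
```

The two printed columns agree up to discretization error, which is the eigenvalue relation with eigenvalue 2 stated in the example.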

The local isometry property of (5.17) and an application of [81, Lemma 3.27] (see (5.27) below) allow us to construct eigenfunctions of the Laplace-Beltrami operator $\Delta_g$ of the SABR plane from eigenfunctions of the Laplace-Beltrami operator $\Delta_h$ of the hyperbolic plane.

Lemma 5.11 (Implied eigenspaces on the SABR-plane). For any radial eigenfunction $\psi$ with eigenvalue $\lambda$ of the Laplace-Beltrami operator $\Delta_h$ on the Poincaré-plane $(\mathbb{H}, h)$, the pullback $\varphi^{*}\psi$ under the SABR-isometry (5.17) (i.e. the composition $\psi \circ \varphi : S \to \mathbb{R}$) is a radial eigenfunction of the Laplace-Beltrami operator $\Delta_g$ of the SABR-plane $(S, g)$, with $S = \{(x, y) : x, y > 0\}$, to the same eigenvalue. Therefore, radial eigenfunctions of $\Delta_g$ are of the form

\[ L_\lambda\big(\cosh \delta(z, Z)\big) = L_\lambda\big(\cosh d(\varphi(z), \varphi(Z))\big), \qquad z \in S, \tag{5.23} \]

for some fixed $Z = (X, Y) \in S$, where $L_\lambda$ denote Legendre polynomials with $\lambda = -n(n + 1)$ for some $n \in \mathbb{N}$, and $\delta$ is the Riemannian distance function on the SABR-plane. Explicitly,

\[ \cosh d(\varphi(z), \varphi(Z)) = 1 + \frac{\big( \tfrac{1}{1 - \beta}(X^{1 - \beta} - x^{1 - \beta}) - \rho(Y - y) \big)^2}{(1 - \rho^2)\, 2Yy} + \frac{(Y - y)^2}{2Yy}. \tag{5.24} \]

Proof. Let $\hat{Z} := \varphi(Z)$, $\hat{z} := \varphi(z) \in \mathbb{H}^2$. Then from the construction of the distance function

\[ \delta(Z, z) = d(\varphi(Z), \varphi(z)), \qquad Z, z \in S, \tag{5.25} \]

on the SABR-plane it is immediate that for any $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$

\[ f(d(\hat{Z}, \hat{z})) = f(d(\varphi(Z), \varphi(z))) = f(\delta(Z, z)), \qquad Z, z \in S, \tag{5.26} \]

and the form (5.24) follows from (5.22) and (5.17) directly. Furthermore, by Lemma 5.9,

\[ \psi_\lambda := L_\lambda \circ \cosh \circ\, d \]

is an eigenfunction of $\Delta_h$ to the eigenvalue $\lambda$; hence by Lemma 5.9 and by [81, Lemma 3.27], the pullback $\varphi^{*}\psi_\lambda$ is an eigenfunction of $\Delta_g$ to the same eigenvalue:

\[
\begin{aligned}
\Delta_g(\varphi^{*}\psi_\lambda)(z) &= \Delta_g L_\lambda(\delta(Z, z)) = \Delta_h L_\lambda(d(\hat{Z}, \hat{z})) \\
&= \Delta_h \psi_\lambda(\hat{z}) = \lambda \psi_\lambda(\hat{z}) = \lambda (\varphi^{*}\psi_\lambda)(z).
\end{aligned}
\tag{5.27}
\]
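That the explicit expression (5.24) is the hyperbolic formula (5.22) composed with the map (5.17) can be confirmed numerically. The following sketch assumes $\bar{\rho} = \sqrt{1 - \rho^2}$ in (5.17) and $\beta < 1$; the function names and the sample points are illustrative choices of ours.

```python
import math

def phi(x, y, beta, rho):
    """Map (5.17), assuming rho_bar = sqrt(1 - rho^2) and beta < 1."""
    rho_bar = math.sqrt(1.0 - rho * rho)
    return ((x ** (1.0 - beta) / (1.0 - beta) - rho * y) / rho_bar, y)

def cosh_hyperbolic(p, q):
    """cosh of the hyperbolic distance between p and q, cf. (5.22)."""
    (x, y), (X, Y) = p, q
    return 1.0 + ((x - X) ** 2 + (y - Y) ** 2) / (2.0 * y * Y)

def cosh_sabr(z, Z, beta, rho):
    """Right-hand side of (5.24) in SABR coordinates."""
    (x, y), (X, Y) = z, Z
    num = ((X ** (1 - beta) - x ** (1 - beta)) / (1 - beta) - rho * (Y - y)) ** 2
    return 1.0 + num / ((1 - rho * rho) * 2 * Y * y) + (Y - y) ** 2 / (2 * Y * y)

# Arbitrary sample configurations (z, Z, beta, rho).
samples = [((0.7, 0.4), (1.3, 1.0), 0.5, -0.3),
           ((2.0, 1.5), (0.2, 0.8), 0.0, 0.6),
           ((1.1, 2.2), (3.0, 0.3), 0.9, 0.0)]
for z, Z, beta, rho in samples:
    lhs = cosh_hyperbolic(phi(*z, beta, rho), phi(*Z, beta, rho))
    print(lhs, cosh_sabr(z, Z, beta, rho))
```

Both columns coincide up to rounding, and both are at least 1, as expected for the hyperbolic cosine of a distance.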

Proof of Theorem 3.9. Note that (3.5) is nothing but the function (5.24) with reference point $\varphi(Z) = (\tfrac{1}{\bar{\rho}^2}, 1) \in \mathbb{H}$. Hence the functions (3.5) are eigenfunctions of $\Delta_g$, which is an immediate consequence of Lemma 5.11. For $c \in [1, \infty)$ the reference point $\varphi(Z) = (c/\bar{\rho}, 1) \in \mathbb{H}^2$ can always be realized as the image under $\varphi$ of a point $Z \in S = (0, \infty) \times (0, \infty)$ (that is, there exists $Z = (X, Y) \in S$ such that $\varphi(X, Y) = (c/\bar{\rho}, 1)$): By the definition of the map $\varphi$ in (5.17),

\[ (X, Y) = \Big( \big((1 - \beta)(c + \rho)\big)^{1/(1 - \beta)},\ 1 \Big) \quad \Rightarrow \quad \varphi(X, Y) = (c/\bar{\rho},\ 1), \tag{5.28} \]

and indeed for $c \in [1, \infty)$ the value of $c + \rho$ is always positive. The same statement holds whenever $\rho \in (0, 1)$ or when $\rho \in (-1, 0)$ but $c > |\rho|$. If $\rho \in (-1, 0)$ and $c \le |\rho|$, then it holds that $X > 0$ whenever the following condition on the parameter $\beta$ is satisfied: $\beta \in \{0\} \cup \{\tfrac{2m - 1}{2m},\ m \in \mathbb{N}\}$. In this case, it is ensured that $1/(1 - \beta) = 2m$ for some $m \in \mathbb{N}$ and hence $X = ((1 - \beta)(c + \rho))^{1/(1 - \beta)} > 0$, and hence $(X, Y) \in S$ although $c + \rho < 0$.

Furthermore, the eigenfunctions $\psi$ are admissible weight functions if the range is bounded from below. The range of the function $r_Z(\cdot) \equiv \cosh(\delta(Z, \cdot))$ is in $(1, \infty)$, cf. (5.24); hence composition with the Legendre polynomials $L_{n(n+1)}(r_Z(\cdot))$ yields functions with range in $(1, \infty)$ by the properties of Legendre polynomials, cf. [1, Section 8]. Having constructed admissible weight functions which are eigenfunctions of the operator, it follows from Lemma 5.6 that the heat semigroup (3.4) has the generalized Feller property (FF). The strong continuity of the SABR-heat semigroup (3.4) on the Banach spaces $B^{\psi_n}$, $n \in \mathbb{N}$, is then a consequence of Theorem 3.8. This completes the proof of all statements of Theorem 3.9 about the SABR-heat semigroup. Now for the SABR semigroup, recall that the following relationship holds between $\Delta_g$ and the generator $A$ of the SABR model:

\[ Af = \Delta_g f - \beta y^2 x^{2\beta - 1} \frac{\partial f}{\partial x}, \qquad f \in C_c^\infty(D). \tag{5.29} \]


Therefore, if an eigenfunction $\psi_\lambda$ of the Laplace-Beltrami operator allows for sufficient control with respect to the first order term in (5.29), then $\psi_\lambda$ is also a sub-eigenfunction of the SABR infinitesimal generator:

\[ \Delta_g \psi_\lambda = \lambda \psi_\lambda \quad \text{and} \quad A\psi_\lambda \le \hat{\lambda} \psi_\lambda \qquad \text{for some } \lambda, \hat{\lambda} \in \mathbb{R}. \tag{5.30} \]

Such a sufficient control is guaranteed by the following drift condition: There exists a constant $\tilde{\lambda} \in \mathbb{R}$ such that

\[ 0 \le \beta y^2 x^{2\beta - 1} \frac{\partial \psi_\lambda}{\partial x} + \tilde{\lambda} \psi_\lambda \qquad \text{for all } x, y > 0. \tag{5.31} \]

For any $\psi_\lambda$ which satisfies (5.31), the statement (5.30) holds for any $\hat{\lambda} \ge (\lambda + \tilde{\lambda})$, where $\lambda$ is an eigenvalue of $\Delta_g$ and $\tilde{\lambda} \in \mathbb{R}$ is the constant from the drift condition (5.31). Hence an eigenfunction of $\Delta_g$ which satisfies (5.31) is a sub-eigenfunction of $A$:

\[ A\psi_\lambda = \Delta_g \psi_\lambda - \beta y^2 x^{2\beta - 1} \frac{\partial \psi_\lambda}{\partial x} \le \hat{\lambda} \psi_\lambda. \]

For such (sub-)eigenfunctions the statement of Theorem 3.9 about the generalized Feller property (FF) of the SABR semigroup (3.1) follows from Lemma 5.6 analogously as above, and strong continuity of the SABR semigroup on the Banach spaces $B^{\psi_n}$, $n \in \mathbb{N}$, is again a consequence of Theorem 3.8. It remains to find eigenfunctions which satisfy (5.31) for some $\tilde{\lambda} \in \mathbb{R}$. For the functions $\psi_{c,n} := L_{n(n+1)} \circ r_c$, where $L_{n(n+1)}$ denote Legendre polynomials of order $n(n + 1)$, $n \in \mathbb{N}$, the drift part in (5.29) is of the form

\[
\begin{aligned}
\beta y^2 x^{2\beta - 1} \partial_x \psi_{c,n} &= \beta y^2 x^{2\beta - 1} \big( \partial_r L_{n(n+1)} \cdot \partial_x r_c \big) \\
&= \partial_r L_{n(n+1)}(r_c)\, \frac{\beta}{1 - \rho^2} \Big( \frac{y}{1 - \beta} - x^{\beta - 1} y (\rho y + c) \Big).
\end{aligned}
\tag{5.32}
\]

The derivatives of the Legendre polynomials^14 for $n = 0, 1, 2$ are clearly nonnegative: $\partial_r L_0(r_c) \equiv 0$, $\partial_r L_2(r_c) \equiv 1$ and $\partial_r L_6(r_c) \equiv 3 r_c$, which is positive for all $x > 0$, $y > 0$ by construction. In these cases $\psi_{c,n}$ satisfies the drift condition (5.31) with $\tilde{\lambda} = 0$, since if $\rho < 0$ and $c = 0$ the expression in (5.32) is clearly nonnegative for all $x > 0$, $y > 0$ and $\beta \in [0, 1)$. The restriction on $\beta$ stated in (iii) of Theorem 3.9 is a consequence of the choice $c = 0$ and follows from the arguments presented above. This line of argument can be generalized to Legendre polynomials of arbitrary order. Indeed, the derivatives of the Legendre polynomials can become negative, but this happens only for $|r_c| \in [0, 1)$, and for $r_c \in [1, \infty)$ they are always positive. On the other hand, the range of (3.5) is $r_c(x, y) \in [1, \infty)$ for all $x, y$ and all SABR parameters $\beta, \rho$. The latter statement can be read off the equation^15 (3.5), noting $\frac{1 + y^2}{2y} \ge 1$ for all $y > 0$, while the second term in (3.5) is always nonnegative.

14 Recall that $L_0(r_c) \equiv 1$, $L_2(r_c) \equiv r_c$ and $L_6(r_c) \equiv \tfrac{1}{2}(3 r_c^2 - 1)$. Generally, it holds for the derivatives that $(r_c^2 - 1)\, \partial_r L_{n(n+1)}(r_c) = n \big( r_c L_{n(n+1)}(r_c) - L_{(n-1)n}(r_c) \big)$, which can be negative for values of $r_c$ in $[0, 1)$, but all derivatives are nonnegative on $r_c \in [1, \infty)$.

15 It can be seen more directly from the representation (5.24), where we recall that (3.5) is the function (5.24) for the following choice of $Z$: $\varphi(Z) = (\tfrac{1}{\bar{\rho}^2}, 1)$.
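The sign properties of Legendre polynomials used above can be illustrated with a short computation. The sketch below is an illustration under our reading that $L_{n(n+1)}$ is the classical Legendre polynomial $P_n$ of degree $n$ (so $L_0 = P_0$, $L_2 = P_1$, $L_6 = P_2$); it evaluates $P_n$ by Bonnet's recursion and the derivative through the identity recalled in footnote 14. The function names are hypothetical.

```python
def legendre(n, r):
    """P_n(r) via Bonnet's recursion (k+1) P_{k+1} = (2k+1) r P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, r
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * r * p - k * p_prev) / (k + 1)
    return p

def legendre_derivative(n, r):
    """P_n'(r) for r != +-1 from (r^2 - 1) P_n' = n (r P_n - P_{n-1})."""
    if n == 0:
        return 0.0
    return n * (r * legendre(n, r) - legendre(n - 1, r)) / (r * r - 1.0)

for n in range(4):
    # Values and derivatives at a few points r >= 1 and one point inside [0, 1).
    print(n, [(legendre(n, r), legendre_derivative(n, r)) for r in (0.2, 1.5, 3.0)])
```

On the sampled points with $r > 1$ the values are at least 1 and the derivatives are nonnegative, while inside $[0, 1)$ a derivative such as $P_3'(0.2)$ is negative, which matches the dichotomy described in the text.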


Proof of Corollary 3.10. Note that whenever $n \in \mathbb{N}$ is large enough such that $\beta \le \tfrac{2n - 1}{2n}$, then the sub-eigenfunctions

\[ \psi_n(x, y) = L_{n(n+1)}(r_c(x, y)), \qquad \text{for } (x, y) \in S, \tag{5.33} \]

have at least linear growth in $x$. Recall that in (5.33), the function $r_c$ denotes (3.5), $L_{n(n+1)}$ is a Legendre polynomial of order $n$, and $c(n)$ is a finite constant depending on $n$, for $n \in \mathbb{N}$.

Proof of Theorem 4.1. Recall that for a geometric Brownian motion both boundary points $0$ and $\infty$ are natural and hence $P[S_t = 0] = 0$ for any $t > 0$. Moreover, for the geometric Brownian motion $Y$ from (1.1) we have $\lim_{t \to \infty} Y_t = 0$, $P$-a.s. In particular, by a slight abuse of notation, $Y_\infty = 0$ $P$-a.s., $Y_t > 0$ for all $t \in [0, \infty)$, and $\inf\{t > 0 : Y_t = 0\} = \infty$. Therefore,

\[ 0 = Y_\infty = \tilde{Y}_{\tau(\infty)} = \tilde{Y}_{\tau_1} = \tilde{Y}_{\tau_0}, \tag{5.34} \]

where the random times $\tau_0$ and $\tau_1$ are as in [61, Chapter 6, p. 307]:

\[ \tau_0 := \inf\big\{u : (\tilde{Y}_u)^2 = 0\big\}, \qquad \tau_1 := \inf\Big\{u : \int_0^u (\tilde{Y}_s)^{-2}\, ds = \infty\Big\}. \tag{5.35} \]

Note that for almost every sample path $\tau_0 = \tau_1$ and $\tilde{Y}(\tau_0) = 0$ when $\tau_0 < \infty$; since $Y_t > 0$ for all $t \in [0, \infty)$, $\tau_1$ is the first time when $\tilde{Y}$ hits zero, which justifies the last equality in (5.34). In fact, since $\tilde{Y}$ is a Brownian motion started at $y > 0$, $\tau_0$, i.e. the first hitting time $T^{\tilde{Y}}_0$ of zero, is finite with positive probability. Since $X$ is a non-negative martingale, $\lim_{t \to \infty} X_t = X_\infty$ exists almost surely. Now decomposing $X$ via time change, the first component becomes a CEV process $\tilde{X}$ (on a stochastic clock), and as such it hits zero in finite time for any $\beta < 1$ [155, Theorem 51.2]. There are three cases which can occur. For any $\beta \in (0, 1)$, the CEV process $\tilde{X}$ and the Brownian motion $\tilde{Y}$ started at $y$ reach zero a.s.

Case 1: $T^{\tilde{X}}_0 < T^{\tilde{Y}}_0$. In this case, the process $\tilde{X}$ hits zero and after hitting zero remains zero until $T^{\tilde{Y}}_0$, which is the end of our time consideration.

Case 2: $T^{\tilde{X}}_0 > T^{\tilde{Y}}_0$. In this case, the process $\tilde{Y}$ hits zero first and ends our time consideration, while $\tilde{X}$ is still positive (and finite). Then the dynamics after $T^{\tilde{Y}}_0$ become $dX_t = 0 \cdot X_t^{\beta}\, dW_t = 0$.

Case 3: $T^{\tilde{X}}_0 = T^{\tilde{Y}}_0$. In this case, $\tilde{X}$ approaches zero as $\tilde{Y}$ does, and therefore $X$ asymptotically approaches zero for $t \to \infty$.

To show that $P(X_\infty > 0) \in (0, 1)$ we use the SABR time change: By continuity,

\[ \lim_{t \to \infty} X_t = \lim_{t \to \infty} \tilde{X}_{\tau^{-1}_t} = \tilde{X}_{\tau^{-1}_\infty}. \]

Since also the first hitting time $T^{\tilde{X}}_0$ of $\tilde{X}$ is absolutely continuous with strictly positive density on $[0, \infty)$, it follows that

\[ P(X_\infty > 0) = P(\tilde{X}_{\tau^{-1}_\infty} > 0) = P(\tau^{-1}_\infty < T^{\tilde{X}}_0) \in (0, 1). \tag{5.36} \]


Note that the proof of the statement about $(\hat{X}, \hat{Y})$ follows by analogous arguments for (1.1) and (2.2) with $\beta = 0$ and general $\rho \in [-1, 1]$, since the first coordinate process $\hat{X}$ in (5.3) has an explicit solution [144, Example 2, p. 284] of power form in $\beta$,

\[ \hat{X}_t = \big( x^{1 - \beta} + \sigma^2 (1 - \beta) W_t \big)^{1/(1 - \beta)}, \tag{5.37} \]

and therefore the expression (5.37) takes the value zero if and only if it vanishes for $\beta = 0$. The time change (a change of the underlying geometry) reduces the problem of determining the limiting behavior of the process (1.1) with $\beta = 0$, $\rho \in [-1, 1]$, which is a (possibly correlated) Brownian motion on a hyperbolic plane, to determining the hitting time of the coordinate axes of two correlated Brownian motions in the first quadrant of the (Euclidean) plane, which is available in [28, equation (5.8)].

Proof of Lemma 3.18. The statement follows immediately by partially integrating (3.14) and applying Theorem 6.7 to obtain the closability on $C_0^\infty((0, \infty))$; the domain of closedness follows from Lemma 6.10.

Proof of Theorems 3.15, 3.17 and 3.20. Let us denote the coefficients of the SABR infinitesimal generator (2.1) by the following matrix

\[ \xi : \mathbb{R}^2 \longrightarrow \mathbb{R}^{2 \times 2}, \qquad (x, y) \longmapsto \begin{pmatrix} y^2 x^{2\beta} & y^2 \rho x^{\beta} \\ y^2 \rho x^{\beta} & y^2 \end{pmatrix}, \tag{5.38} \]

and let us denote by

\[ \mathcal{E}_m(u, v) = \frac{1}{2} \int_S \sum_{i,j} \xi_{ij}(x)\, \partial_i u(x)\, \partial_j v(x)\, m(x)\, dx \]

the SABR bilinear forms appearing in Theorem 3.15 and in (3.10) of Theorem 3.17 for short. The essential statements in these theorems are the closability of the bilinear form (assertion (i) of Theorems 3.17 and 3.20) and the specification of the domain of closedness (assertion (ii) of Theorems 3.17 and 3.20), which can be concluded from [33, Proposition 1] or [153, Section 4], as we will demonstrate here. We included their statements in the Appendix (Section 6.16) for easy reference. The Borel function $\xi$ from (5.38) clearly satisfies Condition (HG2) (see Section 6.16) for all $\beta \in [0, 1]$ and all $\rho \in (-1, 1)$. Then if there exists a Borel function $m : \mathbb{R}^2 \to \mathbb{R}_{+}$ which satisfies Condition (HG1) (see Section 6.16), then Proposition 6.19 is applicable, i.e. the bilinear form $\mathcal{E}_m$ is closable on $C_0^\infty(S)$ (as a consequence of [153, Section 4]), and it follows from [33, Proposition 1] that the pair $(\mathcal{E}_m, D(\mathcal{E}_m))$ corresponding to this choice of speed measure $m$ is a Dirichlet form on the Hilbert space $L^2(m\, dx)$. In particular, if $m$ satisfies Condition (HG1) (see Appendix 6.15), then $(\mathcal{E}_m, D(\mathcal{E}_m))$ is a closed form, and its generator $\Phi(\mathcal{E}_m)$ (see Appendix 3 for the construction) is a negative self-adjoint operator $A_m$, such that $\mathcal{E}_m(u, v) = (A_m u, v)_{L^2(m\, dx)}$ for all $v \in D(\mathcal{E}_m)$ and $u \in D(\Phi(\mathcal{E}_m))$. Indeed, by partial integration (Green's formulas) we get

\[
\begin{aligned}
\mathcal{E}_m(u, v) &= -\frac{1}{2} \sum_{i,j} \int_{\mathbb{R}^2} \partial_j \big( \xi_{ij}(x)\, \partial_i u(x)\, m(x) \big) v(x)\, dx \\
&= -\frac{1}{2} \int_{\mathbb{R}^2} \sum_{i,j} \Big( \partial_j \xi_{ij}(x)\, \partial_i u(x) + \xi_{ij}(x)\, \partial_j \partial_i u(x) + \xi_{ij}(x)\, \partial_i u(x) \frac{\partial_j m(x)}{m(x)} \Big) v(x)\, m(x)\, dx \\
&= (A_m u, v)_{L^2(m\, dx)},
\end{aligned}
\]

from which we can derive the following simple no-drift condition: The generator $A_m$ of the Dirichlet form $\mathcal{E}_m$ has no lower order terms if and only if

\[ \sum_i \sum_j \Big( \partial_j \xi_{ij}(x) + \xi_{ij}(x) \frac{\partial_j m(x)}{m(x)} \Big) \partial_i u(x) = 0, \tag{5.39} \]

for any $u \in D(\Phi(\mathcal{E}_m))$. If the matrix $\xi$ is the matrix of SABR coefficients (5.38), then the no-drift condition (5.39) implies that the generator $A_m$ of the Dirichlet form $\mathcal{E}_m$ coincides with the SABR infinitesimal generator (2.1) on their common domain. Since $\mathcal{E}_m$ is closable on $C_0^\infty(S)$ in $H_m$, this common domain will always include the dense subset $C_0^\infty(S)$. Inserting the values (5.38) into (5.39) yields the following differential equations for $m$:

\[ x^{2\beta - 1} y^2 \Big( 2\beta + x \frac{\partial_x m(x, y)}{m(x, y)} \Big) + \rho x^{\beta} y \Big( 2 + y \frac{\partial_y m(x, y)}{m(x, y)} \Big) = 0, \tag{5.40} \]

\[ \rho x^{\beta - 1} y^2 \Big( \beta + x \frac{\partial_x m(x, y)}{m(x, y)} \Big) + y \Big( 2 + y \frac{\partial_y m(x, y)}{m(x, y)} \Big) = 0. \tag{5.41} \]

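Assume that $m$ satisfies (5.40) and (5.41). Let us define the auxiliary function

\[ f(x, y) := \log(m(x, y)). \]

This function satisfies

\[
\begin{aligned}
x^{2\beta - 1} y^2 (2\beta + x \partial_x f) + \rho x^{\beta} y (2 + y \partial_y f) &= 0, \\
\rho x^{\beta - 1} y^2 (\beta + x \partial_x f) + y (2 + y \partial_y f) &= 0.
\end{aligned}
\tag{5.42}
\]

Multiplying the second equation in (5.42) by $\rho x^{\beta}$ and equating to the first yields

\[ x^{2\beta - 1} y^2 (2\beta + x \partial_x f) = \rho^2 x^{2\beta - 1} y^2 (\beta + x \partial_x f). \]

This yields for $x \ne 0 \ne y$

\[ x \partial_x f = \frac{(\rho^2 - 2)\beta}{1 - \rho^2} =: a. \]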


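This implies for some function $g$ the identity

\[ f(x, y) = a \log(x) + g(y). \tag{5.43} \]

Similarly, multiplying the first equation in (5.42) by $\rho$ and the second by $x^{\beta}$ yields

\[ \rho x^{2\beta - 1} y^2 \beta + \rho^2 x^{\beta} y (2 + y \partial_y f) = y x^{\beta} (2 + y \partial_y f). \]

This yields for $x \ne 0 \ne y$

\[ y \partial_y f = -2 + \frac{\rho \beta x^{\beta - 1} y}{1 - \rho^2}. \]

Solving this we obtain for some function $h$ the identity

\[ f(x, y) = -2 \log(y) + \frac{\rho \beta x^{\beta - 1} y}{1 - \rho^2} + h(x). \tag{5.44} \]

Thus we know

\[ a \log(x) + g(y) = -2 \log(y) + \frac{\rho \beta x^{\beta - 1} y}{1 - \rho^2} + h(x), \qquad \forall x \ne 0 \ne y. \tag{5.45} \]

Setting $x = 1$ in (5.45) yields

\[ g(y) = -2 \log(y) + \frac{\rho \beta y}{1 - \rho^2} + h(1), \qquad \forall y \ne 0, \tag{5.46} \]

and setting $y = 1$ in (5.45) implies

\[ a \log(x) + g(1) = \frac{\rho \beta x^{\beta - 1}}{1 - \rho^2} + h(x), \qquad \forall x \ne 0. \tag{5.47} \]

Setting $y = 1$ in (5.46) and plugging into (5.47) yields

\[ a \log(x) + \frac{\rho \beta}{1 - \rho^2} + h(1) = \frac{\rho \beta x^{\beta - 1}}{1 - \rho^2} + h(x), \qquad \forall x \ne 0, \]

or

\[ h(x) = a \log(x) + \frac{\rho \beta}{1 - \rho^2} + h(1) - \frac{\rho \beta x^{\beta - 1}}{1 - \rho^2}, \qquad \forall x \ne 0. \tag{5.48} \]

Substituting the equations for $g$, $h$ given by (5.46), (5.48) into (5.45) we end up with

\[ \frac{\rho \beta y}{1 - \rho^2} + h(1) = \frac{\rho \beta x^{\beta - 1} y}{1 - \rho^2} + \frac{\rho \beta}{1 - \rho^2} + h(1) - \frac{\rho \beta x^{\beta - 1}}{1 - \rho^2}, \qquad \forall x \ne 0 \ne y, \]

or after simplifying

\[ y + x^{\beta - 1} = x^{\beta - 1} y + 1, \qquad \forall x \ne 0 \ne y. \]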


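This is only true for $\beta = 1$, $\beta = 0$ or $\rho = 0$. Now for the last statements about the generator $\Psi(\mathcal{E}_m)$ of the Dirichlet form consider the following operator:

\[ \Delta_{\Upsilon\mu} = \frac{1}{\Upsilon\mu} \sum_i \partial_{x_i} \Big( \sum_j \Upsilon \mu\, g^{ij} \frac{\partial}{\partial x_j} \Big). \tag{5.49} \]

In the case of SABR coefficients one has $\mu(x, y) = \frac{1}{\bar{\rho} x^{\beta} y^2}$, and (5.49) reads

\[
\begin{aligned}
\Delta_{\Upsilon\mu} &= \frac{x^{\beta} y^2}{\Upsilon(x, y)} \Big( \partial_x \Upsilon(x, y)\, x^{\beta} \partial_x + \Upsilon(x, y)\, \beta x^{\beta - 1} \partial_x + \Upsilon(x, y)\, x^{\beta} \partial^2_{xx} + \partial_x \Upsilon(x, y)\, \rho\, \partial_y + \Upsilon(x, y)\, \rho\, \partial^2_{xy} \Big) \\
&\quad + \frac{x^{\beta} y^2}{\Upsilon(x, y)} \Big( \partial_y \Upsilon(x, y)\, \rho\, \partial_x + \Upsilon(x, y)\, \rho\, \partial^2_{yx} + \frac{\partial_y \Upsilon(x, y)}{x^{\beta}} \partial_y + \frac{\Upsilon(x, y)}{x^{\beta}} \partial^2_{yy} \Big).
\end{aligned}
\]

Choosing $\Upsilon \equiv 1$ yields the Laplace-Beltrami operator $\Delta_g$ in (2.8), and choosing $\Upsilon(x, y) = x^{-\beta}$, the corresponding operator becomes

\[
\begin{aligned}
\Delta_{\Upsilon\mu} &= y^2 \big( x^{2\beta} \partial^2_{xx} + 2\rho x^{\beta} \partial^2_{xy} + \partial^2_{yy} \big) - y^2 \rho \beta x^{\beta - 1} \partial_y \\
&= A - y^2 \rho \beta x^{\beta - 1} \partial_y = y^2 \big( \tilde{A} - \rho \beta x^{\beta - 1} \partial_y \big) =: y^2 \tilde{\Delta}_{\Upsilon\mu}.
\end{aligned}
\]

By construction, the operator $\Delta_{x^{-\beta}\mu}$ is symmetric on $L^2\big(\tfrac{1}{x^{2\beta} y^2}\, dx\big)$, and the time-changed operator $\tilde{\Delta}_{x^{-\beta}\mu}$ is symmetric on $L^2\big(\tfrac{1}{x^{2\beta}}\, dx\, dy\big)$. Note that for $\rho = 0$ the weighted Laplace-Beltrami operators coincide with the SABR (and time-changed SABR) generators, $\Delta_{x^{-\beta}\mu} = A$ and $\tilde{\Delta}_{x^{-\beta}\mu} = \tilde{A}$, in (2.1) on their common domain, which includes the dense subset $C_0^\infty(S)$. The proof of Theorem 3.20 follows by the same arguments as presented above.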
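The compatibility argument behind the no-drift condition can be illustrated numerically. In the sketch below, which is our own illustration and not part of the proof, the system (5.42) is solved pointwise for the partial derivatives of $f = \log m$, which is always possible for $|\rho| < 1$. The obstruction is then that these two expressions are the partial derivatives of a single function only if their mixed partials agree. The function names are hypothetical.

```python
def log_gradient(x, y, beta, rho):
    """(d_x f, d_y f) solved algebraically from (5.42), with f = log m."""
    a = (rho ** 2 - 2.0) * beta / (1.0 - rho ** 2)
    return a / x, -2.0 / y + rho * beta * x ** (beta - 1.0) / (1.0 - rho ** 2)

def residuals(x, y, beta, rho):
    """Left-hand sides of the two equations in (5.42) for that gradient."""
    fx, fy = log_gradient(x, y, beta, rho)
    first = (x ** (2 * beta - 1) * y ** 2 * (2 * beta + x * fx)
             + rho * x ** beta * y * (2 + y * fy))
    second = (rho * x ** (beta - 1) * y ** 2 * (beta + x * fx)
              + y * (2 + y * fy))
    return first, second

def mixed_partial_gap(x, beta, rho):
    """d_x(d_y f) - d_y(d_x f); a true gradient needs this to vanish."""
    return rho * beta * (beta - 1.0) * x ** (beta - 2.0) / (1.0 - rho ** 2)

print(residuals(1.7, 0.6, 0.5, 0.3), mixed_partial_gap(1.7, 0.5, 0.3))
print(residuals(1.7, 0.6, 0.5, 0.0), mixed_partial_gap(1.7, 0.5, 0.0))
```

The gap is proportional to $\rho\beta(\beta - 1)$, so it vanishes exactly in the cases $\rho = 0$, $\beta = 0$ or $\beta = 1$ singled out above. For $\rho = 0$ the gradient integrates to $m(x, y) = x^{-2\beta} y^{-2}$ up to a constant, in line with the weight $\tfrac{1}{x^{2\beta} y^2}$ appearing in the symmetry statement.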

Remark 5.12. If $(S, g)$ is a Riemannian manifold with volume element $\mu$ (that is, $\mu = \sqrt{\det g}$) and $\Upsilon : S \to [0, \infty)$ is smooth and non-vanishing on $S$, then the triplet $(S, g, \Upsilon\mu)$ forms a weighted manifold, cf. [81, Definition 3.17]. In this case, the operator (5.49) is the Dirichlet Laplace-Beltrami operator of the weighted manifold $(S, g, \Upsilon\mu)$, cf. [81, Equation (3.45), p. 68], and in particular $\Delta_{\Upsilon\mu}$ is self-adjoint ([81, Theorem 4.6]) on the weighted Sobolev space $D(\Delta_{\Upsilon\mu}) = W^2_0(S, \Upsilon\mu)$, see [81, page 104].


6 Reminder on some properties of diffusions

1 Scalar diffusions: Speed measure, local time and time change

We recall some basic definitions and facts on scalar diffusions from [29, Chapter II] and [108, Chapter 23] for convenience, to motivate corresponding concepts for Dirichlet forms and to give a justification of our statements about the role of local time in the time change in a simple setting. In addition to motivating the analysis for the SABR model, the statements on scalar diffusions discussed in this section readily apply to the CEV model. We consider a scalar diffusion on an interval $I = [l, r]$ with $-\infty \le l$ and $r \le \infty$, that is, a time-homogeneous Markov process $X$ with lifetime $\zeta := \inf\{t : X_t \notin I\}$ taking values in $I$, together with its canonical filtration $\mathcal{F}^c$ and the probability measure $P^x$ associated with $X$ when the process is started at $x \in I$, such that for all $x \in I$

\[
\begin{aligned}
& t \mapsto X_t(\omega) \text{ is continuous on } [0, \zeta) \quad P^x\text{-a.s.}, \\
& E^x\big[\eta \circ \theta_T \,\big|\, \mathcal{F}^c_{T+}\big] = E^{X_T}[\eta] \quad P^x\text{-a.s.},
\end{aligned}
\tag{6.1}
\]

where $\eta$ is a bounded $\mathcal{F}^c_\infty$-measurable random variable, $T$ is an $\mathcal{F}^c_\infty$-stopping time and $\theta$ is a shift operator. A scalar diffusion has three basic characteristics: speed measure, scale function and killing measure.

Definition 6.1 (Speed measure m). The speed measure is a measure on the Borel sets B(I) such that 0 < m((a, b)) < ∞ for l < a < b < r. For every t > 0 and x ∈ I the measure $A \mapsto P_t(x,A)$, A ∈ B(I), is absolutely continuous with respect to m:

\[
P_t(x,A) = \int_A p(t,x,y)\, m(dy), \tag{6.2}
\]

where the density p in (6.2) can be taken to be positive, jointly continuous in all variables and symmetric, p(t, x, y) = p(t, y, x). See Section 1.2 for a motivation of the name speed measure in terms of local time.

Definition 6.2 (Killing measure k). The killing measure k is a measure on B(I) such that k((a, b)) < ∞ for l < a < b < r. It is associated with the distribution of the location of the process at its lifetime:

\[
P^x\big(X_{\zeta-} \in A;\ \zeta < t\big) = \int_0^t \int_A p(s,x,y)\, k(dy)\, ds, \qquad A \in B(I). \tag{6.3}
\]

Definition 6.3 (Scale function s). The scale function s is an increasing continuous function from I to R such that if k((a, b)) = 0 for an interval (a, b) ⊂ I, then for a ≤ x ≤ b

\[
P^x(T_a < T_b) = 1 - P^x(T_b < T_a) = \frac{s(b)-s(x)}{s(b)-s(a)}, \tag{6.4}
\]

where $T_a$ and $T_b$ denote the first hitting times of a and b respectively. X is on a natural scale if s(x) = x. In this case X is a local martingale, cf. [29, p. II.4].
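The exit probability (6.4) can be checked on a toy example. The following minimal sketch is ours and not part of the cited references: it assumes a symmetric simple random walk on the grid {0, ..., n} as a discrete stand-in for a diffusion on natural scale, and the helper names `scale_exit_probability` and `hit_lower_first_random_walk` are hypothetical. It solves the discrete harmonic equation for the probability of hitting 0 before n and compares it with (6.4) for s(x) = x.

```python
def scale_exit_probability(s, a, b, x):
    """Right-hand side of (6.4): probability of hitting a before b from x."""
    return (s(b) - s(x)) / (s(b) - s(a))


def hit_lower_first_random_walk(n):
    """h[i] = P(walk started at i hits 0 before n), symmetric simple random walk.

    Solves h[i] = (h[i-1] + h[i+1]) / 2 with h[0] = 1 and h[n] = 0
    by the Thomas algorithm for tridiagonal systems (requires n >= 2).
    """
    m = n - 1                      # number of interior grid points
    diag = [1.0] * m
    off = -0.5                     # sub- and super-diagonal entries
    rhs = [0.0] * m
    rhs[0] = 0.5                   # contribution of the boundary value h[0] = 1
    for i in range(1, m):          # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    h = [0.0] * m
    h[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1): # back substitution
        h[i] = (rhs[i] - off * h[i + 1]) / diag[i]
    return [1.0] + h + [0.0]


if __name__ == "__main__":
    n = 10
    h = hit_lower_first_random_walk(n)
    for i in (2, 5, 8):
        print(i, h[i], scale_exit_probability(lambda x: x, 0, n, i))
```

On natural scale both columns agree, which is the discrete analogue of the statement that the exit probabilities are linear in the starting point.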


1.1 Beurling-Deny-LeJan Formulae

Among strong Markov processes on a domain U ⊂ Rd, d ≥ 1, with continuous sample paths there is a special class (Hunt processes, see [67, Appendix A.2] for a precise definition) for which there is a well-known correspondence with symmetric Dirichlet forms on Hilbert spaces L2(U,m), where m is a positive Radon measure with supp(m) = Rd, the speed measure. In this case (i.e. if the jump measure is vanishing), every symmetric Markovian form E on L2(U,m) with $D(\mathcal{E}) = C_0^\infty(U)$ which is closable in L2(U,m) can be expressed uniquely (cf. [67, Theorem 3.2.3] and [67, Example 1.2.1]) as

\[
\mathcal{E}(f_1,f_2) = \sum_{i,j=1}^{d} \int_U (\partial_i f_1)(\partial_j f_2)\, \mu_{i,j}(dx) + \int_U f_1 f_2\, k(dx), \tag{6.5}
\]

where $(\mu_{i,j})_{i,j}$ is a non-negative definite matrix of Radon measures on U, i.e. for any ξ ∈ Rd and any compact set K ⊂ U

\[
\sum_{i,j=1}^{d} \xi_i \xi_j\, \mu_{i,j}(K) \ge 0 \quad\text{and}\quad \mu_{i,j}(K) = \mu_{j,i}(K) \ \text{ for all } 1 \le i,j \le d,
\]

and k is a positive Radon measure, the killing measure. The crucial property for the reverse conclusion (i.e. whether it is possible to construct an associated Hunt process to (6.5)) is the closability of the form, see Appendix 2. Closability in the univariate case is completely solved (cf. [153, Section 1]); see Theorem 6.7 for closability conditions for k = 0, and Remark 6.9 for the corresponding conditions for scalar diffusions. For time changes of Dirichlet forms by additive functionals see [67, Chapter 5 and Section 6.2], and for their closability under a change of speed measure see [153, Section 5].

1.2 Random Time change via local times

For the local time of a diffusion we recall [29, p. II.13].

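Definition 6.4 (Local time). For a regular diffusion X, a family of random variables

\[
\{L(t,x) : x \in I,\ t \ge 0\} \tag{6.6}
\]

is called local time of X if

\[
\begin{aligned}
&\int_0^t I_A(X_s)\, ds = \int_A I_A(x)\, L(t,x)\, m(dx), \quad \text{a.s.},\ A \in B(I), && \text{(i)}\\
&L(t,x) = \lim_{\varepsilon \to 0} \frac{\mathrm{Leb}\big(\{s < t : x-\varepsilon < X_s < x+\varepsilon\}\big)}{m\big((x-\varepsilon, x+\varepsilon)\big)}, \quad \text{a.s.}, && \text{(ii)}\\
&L(t,x,\omega) = L(s,x,\omega) + L(t-s,x,\theta_s(\omega)), \quad \text{a.s.}, && \text{(iii)}
\end{aligned}
\]

where in (iii), s < t and θ denotes the shift operator.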


The following connection between local time and the transition density p in (6.2) holds here:

\[
E^x[L(t,y)] = \int_0^t p(s,x,y)\, ds.
\]

Now consider a diffusion X on a natural scale (recall: s(x) = x for the scale function) and let m denote the speed measure of X. Furthermore, let W be a Brownian motion on R with X0 = W0. The following construction of a random time change via local times (cf. [29, p. II.16]) enables us to obtain a version of X from W. For this let us introduce

\[
L^{(m)}(t) := \int_{\mathbb{R}} L(t,y)\, m(dy), \tag{6.7}
\]

where L denotes the local time of W, and let us assume m(dx) = 2m(x)dx. In this case we have

\[
L^{(m)}(t) = \int_0^t m(W_u)\, du. \tag{6.8}
\]

$L^{(m)}$ constitutes an additive functional of W (see [29, p. II.21] for further reference), is non-decreasing and induces a random measure on the time axis. We set

\[
\tau^{(m)}(t) = \inf\{u : L^{(m)}(u) > t\}. \tag{6.9}
\]

If the boundaries l, r are absorbing, then the random time change of the Brownian motion W based on $L^{(m)}$ coincides in law with X:

\[
W^{(m)} := \{W_{\tau^{(m)}(t)} : t > 0\} \sim X. \tag{6.10}
\]

Furthermore, the local times of $W^{(m)}$ and X with respect to the speed measure m also coincide in law:

\[
\{L^X(t,x) : t \ge 0,\ x \in I\} \sim \{L(\tau^{(m)}(t), x) : t \ge 0,\ x \in I\}, \tag{6.11}
\]

where $L^X(t,x)$ denotes the local time of X with respect to the speed measure m. From the definition of $L^{(m)}$ it is evident that if a Brownian particle is moving at time t in a region where the speed measure m takes large values, then $L^{(m)}$ is increasing rapidly. By (6.9) the time change behaves as $\tau^{(m)}(t+h) \approx \tau^{(m)}(t)$ even for large h > 0, and therefore the increment $W_{\tau^{(m)}(t+h)} - W_{\tau^{(m)}(t)}$ is "small", and hence the increment $X_{t+h} - X_t$ as well. Due to this property m has been given the name speed measure. An extreme case is obtained at the points with positive m-mass (so-called sticky points, cf. [29, p. II.7] for the definition), in the vicinity of which X will "spend much time", cf. [29, p. II.16]. See also [108, Theorem 23.9] on this matter.
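The mechanics of the time change (6.7)-(6.10) can be sketched on a time grid. The code below is a minimal illustration under our own assumptions: the additive functional is read as the integral of m(W_u) over [0, t] and approximated by a left-point Riemann sum, the path is given on a uniform grid of step `dt`, and the function names are hypothetical. It is not a scheme from the cited references and makes no claim about discretisation error.

```python
def additive_functional(path, dt, m):
    """Left-point Riemann sum for L^(m)(t_k) = int_0^{t_k} m(W_u) du on the grid."""
    values = [0.0]
    for w in path[:-1]:
        values.append(values[-1] + m(w) * dt)
    return values


def inverse_clock_index(values, t):
    """Grid index of tau^(m)(t) = inf{u : L^(m)(u) > t}; None if not reached."""
    for k, v in enumerate(values):
        if v > t:
            return k
    return None


def time_changed_value(path, values, t):
    """Grid approximation of W_{tau^(m)(t)}, i.e. of X_t in (6.10)."""
    k = inverse_clock_index(values, t)
    return None if k is None else path[k]


if __name__ == "__main__":
    dt = 0.01
    path = [0.0] * 101                      # placeholder path; any sampled path works
    slow = additive_functional(path, dt, lambda w: 4.0)
    fast = additive_functional(path, dt, lambda w: 1.0)
    # A larger density m makes the inverse clock advance more slowly.
    print(inverse_clock_index(slow, 0.505) * dt, inverse_clock_index(fast, 0.505) * dt)
```

With a constant density the inverse clock is simply t divided by that constant, which makes the heuristic explicit: where m is large, a long stretch of t corresponds to a short stretch of Brownian time, so the time-changed process moves little.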

Remark 6.5 (Geometric perspective on the time change). By the time change (6.10) above, one changes the time that a Brownian particle spends in a small neighborhood Ux(ε) of a point x on the plane (perspective of random time change). An alternative perspective on this is to define a new geometry (intrinsic metric or "energy" metric) on the plane in a way to match the speed of the particle: points where the particle moves slowly are "far away" (the surface is stretched) and points where the particle moves quickly are "close" (the surface is compressed) in the energy metric. This is very well illustrated by comparing the time-changed process in Theorem 2.1 in the normal SABR model (β = 0, ρ = 0) with the geometry (S0, g0) of the hyperbolic plane: the time-changed Brownian particle moves slowly where Y is small (hence the speed measure is large). Indeed, points on the hyperbolic plane near the horizontal axis are infinitely far in the hyperbolic distance from interior points.
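To make the last statement of the remark concrete, the sketch below evaluates the hyperbolic distance numerically. It assumes the standard Poincaré upper half-plane model with curvature −1 and ignores any rescaling by the SABR parameters; the function name is ours.

```python
import math


def hyperbolic_distance(p, q):
    """Distance in the upper half-plane {(x, y) : y > 0} with metric (dx^2 + dy^2) / y^2."""
    (x1, y1), (x2, y2) = p, q
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))


if __name__ == "__main__":
    interior = (0.0, 1.0)
    for y in (1e-1, 1e-3, 1e-6):
        # The distance to the interior point grows like |log y| as y -> 0.
        print(y, hyperbolic_distance(interior, (0.0, y)))
```

Along a vertical line the distance between heights y1 and y2 equals |log(y2/y1)|, so it diverges as one of the points approaches the horizontal axis.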

2 Symmetric Dirichlet forms: closability

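Definition 6.6 (Closability). A form (E, D(E)) is called closable if it has a closed extension. This is equivalent to the condition

\[
u_n \in D(\mathcal{E}),\quad \mathcal{E}(u_n - u_m, u_n - u_m) \xrightarrow[n,m\to\infty]{} 0,\quad (u_n,u_n) \xrightarrow[n\to\infty]{} 0 \ \Longrightarrow\ \mathcal{E}(u_n,u_n) \xrightarrow[n\to\infty]{} 0. \tag{6.12}
\]

A sufficient condition for (6.12) is

\[
u_n \in D(\mathcal{E}),\quad (u_n,u_n) \xrightarrow[n\to\infty]{} 0 \ \Longrightarrow\ \mathcal{E}(u_n,v) \xrightarrow[n\to\infty]{} 0 \quad \forall v \in D(\mathcal{E}). \tag{6.13}
\]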

2.1 Univariate case

In general, not all non-negative definite symmetric bilinear forms on $C_0^\infty(\mathbb{R})$ are closed or even closable. In the one-dimensional case, [67, Chap. 3 §3.1 (3), p. 105] and [4, Section 2, p. 405] give a precise condition for closability of a form

\[
\mathcal{E}(u,v) = \int_{-\infty}^{\infty} u'(x)\, v'(x)\, \nu(dx), \qquad D(\mathcal{E}) = C_0^\infty(\mathbb{R}), \tag{6.14}
\]

on a Hilbert space L2(R,m) for a positive Radon measure m in terms of the singular set of ν.

Theorem 6.7 (Hamza condition for closability). The form (6.14) is closable in the Hilbert space L2(R,m) for a positive Radon measure m if and only if the following conditions are satisfied:

(i) ν is absolutely continuous (i.e. ν(dx) = ρ(x)dx).

(ii) The density function ρ vanishes a.e. on its singular set.

Proof. See [67, Chap. 3 §3.1 (3), p. 105] and [4, Section 2, p. 405].


Definition 6.8 (Regular and singular sets, univariate case). Given a Borel-measurable function ρ : R → R+, a point x ∈ R is a regular point of ρ if there exists an ε > 0 such that

\[
\int_{x-\varepsilon}^{x+\varepsilon} \frac{1}{|\rho(y)|}\, dy < \infty, \tag{6.15}
\]

and the regular set of ρ is defined as the (open) set of regular points in R, denoted by R(ρ). The complement is called the singular set and is denoted by S(ρ) := R \ R(ρ).
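As a quick illustration (a worked example of ours, not taken from the cited references), consider the power weight ρ(x) = |x|^α with α ≥ 0. Every point x ≠ 0 is regular, because ρ is bounded away from zero on a small neighborhood of x, so only the origin needs to be examined:

```latex
\[
\int_{-\varepsilon}^{\varepsilon} \frac{dy}{|y|^{\alpha}}
  = \begin{cases}
      \dfrac{2\,\varepsilon^{1-\alpha}}{1-\alpha} < \infty, & 0 \le \alpha < 1,\\[2mm]
      \infty, & \alpha \ge 1,
    \end{cases}
\qquad\Longrightarrow\qquad
R(\rho) = \begin{cases}
      \mathbb{R}, & 0 \le \alpha < 1,\\
      \mathbb{R}\setminus\{0\}, & \alpha \ge 1,
    \end{cases}
\qquad
S(\rho) = \mathbb{R}\setminus R(\rho).
\]
```

In both cases S(ρ) is a Lebesgue null set (empty or equal to {0}), so the requirement in Theorem 6.7 (ii) that ρ vanish a.e. on its singular set holds trivially for power weights.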

Remark 6.9. Note that if (i) is fulfilled and the weight ρ : R → R≥0 in (6.14) is Borel-measurable and such that ρ = 0 ds-a.e. on the singular set S(ρ) and ρ > 0 ds-a.e. on the regular set R(ρ), then $L^2(R(\rho), \rho\, ds) \subset L^1_{\mathrm{loc}}(R(\rho), ds)$ continuously. In particular, this is the case if ρ is a power-type weight (or, more generally, a Muckenhaupt weight). Note also that the Hamza condition coincides with the Engelbert and Schmidt condition [108, Theorem 23.1] for scalar diffusions. There, the corresponding conclusion is that weak existence for the SDE holds if (ii) of Theorem 6.7 holds, and uniqueness in law holds for every initial distribution if and only if the density function vanishes nowhere else but on the singular set S(ρ).

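Lemma 6.10 (Domain of closedness). Let ρ be as in Theorem 6.7. Consider the set

\[
D(\mathcal{E}) = \Big\{ u \in L^2(\mathbb{R}, \rho\, ds) : \exists \text{ an abs. cont. version of } u \text{ on } R(\rho), \text{ s.th. } \frac{du(s)}{ds} \in L^2(\mathbb{R}, \rho\, ds) \Big\} \tag{6.16}
\]

together with the bilinear form

\[
\mathcal{E}(u,v) = \int_{\mathbb{R}} \frac{du(s)}{ds}\, \frac{dv(s)}{ds}\, \rho(s)\, ds.
\]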

Then (E , D(E)) is a closed form on the Hilbert space L2(R, ρds).

Proof. See [4, Section 2], in particular Condition (H) and Theorem 2.2 (i).

Remark 6.11. The function u has an absolutely continuous ds-version if there exists an absolutely continuous function $\tilde{u}$ such that $u = \tilde{u}$ ds-a.e., so that the (classical) derivative of $\tilde{u}$ exists almost everywhere. See for example [174, Chapter 2.1, p. 44], see also Proposition 6.20 below.

2.2 Multivariate case

The following theorem gives conditions for closability of a non-negative definite symmetric bilinear form E in the multivariate case and also specifies the domain D(E) where it is a Dirichlet form (i.e. closed). The theorem can be found in [33, Proposition 1] or [153, Section 4]. For r ∈ N, we denote by B(Rr) the Borel sigma-field on Rr and by λr the Lebesgue measure on (Rr, B(Rr)). When no confusion is possible, we denote it by dx.


Definition 6.12 (Regular and singular sets, multivariate case). For any B(Rd)-measurable function ρ : Rd → R+, d ∈ N, let R(ρ) denote the regular set of ρ, i.e. the largest open set in Rd on which $\rho^{-1}$ is locally integrable:

\[
\int_K \frac{1}{|\rho(x)|}\, dx < \infty \qquad \forall K \subset R(\rho),
\]

where K ⊂ R(ρ) denotes a compact set in R(ρ). Similarly to the univariate case, the complement is called the singular set and is denoted by S(ρ) := Rd \ R(ρ).

Definition 6.13. Let u : Rd → R be any B(Rd)-measurable function, where d ∈ N, d > 0. For any i ∈ {1, . . . , d} with corresponding $\bar{x} := (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d) \in \mathbb{R}^{d-1}$, consider the function $u^{(i)}_{\bar{x}}$ defined by $u^{(i)}_{\bar{x}} : \mathbb{R} \to \mathbb{R}$, $s \mapsto u((\bar{x}, s)_i)$, where $(\bar{x}, s)_i := (x_1, \ldots, x_{i-1}, s, x_{i+1}, \ldots, x_d)$.

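Definition 6.14 (The Bilinear Form and its Domain). Let the positive Borel function m : Rd → R+ denote the (density of the) speed measure, and consider an Rd×d-valued symmetric Borel function

\[
\xi : \mathbb{R}^d \longrightarrow \mathbb{R}^{d\times d}, \qquad x := (x_1, \ldots, x_d) \longmapsto (\xi_{ij}(x))_{1\le i,j\le d}. \tag{6.17}
\]

• Consider the following symmetric bilinear form on L2(m dx):

\[
\mathcal{E}(u,v) = \frac{1}{2} \int_{\mathbb{R}^d} \sum_{i,j} \xi_{ij}(x)\, \partial_i u(x)\, \partial_j v(x)\, m(x)\, dx. \tag{6.18}
\]

• Define the domain of E as the set

\[
\begin{aligned}
D(\mathcal{E}) = \Big\{ & u \in L^2(m\,dx),\ B(\mathbb{R}^d)\text{-measurable} : \text{for all } i \in \{1,\ldots,d\} \text{ and } \lambda_{d-1}\text{-almost all } \bar{x} \in \mathbb{R}^{d-1},\\
& u^{(i)}_{\bar{x}} \text{ has an absolutely continuous}^{16} \text{ version } \tilde{u}^{(i)}_{\bar{x}} \text{ on } R(m^{(i)}_{\bar{x}}), \text{ s.th. }
\sum_{i,j} \xi_{ij} \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \in L^1(m\,dx) \Big\},
\quad \text{where } \frac{\partial u}{\partial x_i} := \frac{d\tilde{u}^{(i)}_{\bar{x}}}{ds}.
\end{aligned}
\]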

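Condition 6.15 (HG1). $m^{(i)}_{\bar{x}} = 0$, $\lambda_1$-a.e. on $\mathbb{R} \setminus R(m^{(i)}_{\bar{x}})$, for any i ∈ {1, · · · , d} and $\lambda_{d-1}$-almost all $\bar{x} \in \{ y \in \mathbb{R}^{d-1} : \int_{\mathbb{R}} m^{(i)}_{y}(s)\, ds > 0 \}$.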

Condition 6.16 (HG2). There exists an open set O ⊂ Rd such that λd(Rd \ O) = 0 and ξ is locally elliptic on O, in the sense that for any compact subset K in O there exists a positive constant cK such that

\[
\forall x \in K,\ \forall c = (c_1, \cdots, c_d) \in \mathbb{R}^d: \qquad \sum_{i,j=1}^{d} \xi_{ij}(x)\, c_i c_j \ge c_K |c|^2.
\]

16 See for example: [4, page 406].


Definition 6.17 (Carré du Champ operator). Let E be a Dirichlet form on L2(X,m) with domain D(E). We say that E admits a carré du champ operator (also called square field operator or energy measure) if the following property holds:

(R) There exists a subspace H of D(E) ∩ L∞(X), dense in D(E), such that for all f ∈ H there exists $\tilde{f} \in L^1(X)$ such that for all h ∈ D(E) ∩ L∞(X)

\[
2\mathcal{E}(fh, f) - \mathcal{E}(h, f^2) = \int_X h \tilde{f}\, dm.
\]

For f, g ∈ H, with $\widetilde{(f+g)}$ and $\widetilde{(f-g)}$ in L1(X) as above, we define by polarisation a form Γ : D(E) × D(E) → L1,

\[
\Gamma(f,g) := \frac{1}{4}\Big( \widetilde{(f+g)} - \widetilde{(f-g)} \Big). \tag{6.19}
\]

We call Γ the carré du champ operator associated with E.

Remark 6.18 (Characterizing property of the carré du champ operator). If (R) is satisfied, then the form Γ defined in (6.19) is the unique positive symmetric continuous bilinear form Γ : D(E) × D(E) → L1 with the characterizing property that for all f, g, h ∈ D(E) ∩ L∞(X)

\[
\mathcal{E}(fh, g) + \mathcal{E}(gh, f) - \mathcal{E}(h, fg) = \int h\, \Gamma(f,g)\, dm. \tag{6.20}
\]

Note that for f = g we are in the situation of (R); hence, for the $\tilde{f}$ there, we can use the notation $\tilde{f} = \Gamma(f,f)$. See [34, Proposition 4.1.3].

Theorem 6.19 (Conditions for closability and domain of closedness, Röckner-Wielens and Bouleau-Denis). Let E denote the bilinear form with domain D(E) in (6.18) satisfying Conditions 6.15 and 6.16. Then the pair $(\mathcal{E}, C_0^\infty(\mathbb{R}^d))$ is closable on L2(m dx) and (E, D(E)) is a Dirichlet form on L2(m dx) which admits a carré du champ operator Γ given by

\[
\Gamma[u,v] = \sum_{i,j} \xi_{ij}\, \partial_i u\, \partial_j v, \qquad \text{for } u, v \in D(\mathcal{E}).
\]

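Proof. The proofs can be found in [153, Section 4] for the former statement and [33, Proposition 1] for the latter. We briefly elaborate on the energy measure: as remarked in (6.20), the carré du champ operator Γ[u, u] is characterized by the identity $2\mathcal{E}(uv,u) - \mathcal{E}(u^2,v) = \int_{\mathbb{R}^d} \Gamma[u,u](x)\, v(x)\, m(x)\, dx$ for any u, v ∈ D(E). Indeed,

\[
\begin{aligned}
2\mathcal{E}(uv,u) - \mathcal{E}(u^2,v)
&= \int_{\mathbb{R}^d} \sum_{i,j} \xi_{ij}(x)\, \partial_i(uv)\, \partial_j(u)\, m(x)\, dx - \frac{1}{2} \int_{\mathbb{R}^d} \sum_{i,j} \xi_{ij}(x)\, \partial_i(u^2)\, \partial_j(v)\, m(x)\, dx\\
&= \int_{\mathbb{R}^d} \sum_{i,j} \xi_{ij}(x)\, \partial_i u(x)\, \partial_j u(x)\, v(x)\, m(x)\, dx,
\end{aligned}
\]

where the last step follows from the product rule and the symmetry of $\xi_{ij}$.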

Proposition 6.20 (Characterization of the domain of closedness in terms of Sobolev spaces). Let O ⊆ Rn be an open subset and suppose u ∈ Lp(O), p ≥ 1. Then u ∈ W 1,p(O) if and only if u has a representative that is absolutely continuous on almost all line segments in O parallel to the coordinate axes and whose (classical) partial derivatives belong to Lp(O).

Proof. See [174, Chapter 2.1, Theorem 2.1.4, p.44].


3 Closed forms, Dirichlet forms and their generators

Proposition 6.21. Let F and A be the sets of closed forms, of negative self-adjoint operators and of symmetric semigroups on the Hilbert space H as in Definitions 3.12 and 3.13 respectively. Then the map Φ : F → A is bijective. If (E, D(E)) is a Dirichlet form, that is, a closed and Markovian form, then we refer to Φ(E) as the generator of the Dirichlet form. See [34] for full details.

Φ : F → A. If E ∈ F is a closed form in H, then the generator A = Φ(E) of E is defined by

\[
\begin{aligned}
D(A) &= \{ f \in D(\mathcal{E}) \mid \exists g \in H \text{ s.th. } \forall h \in D(\mathcal{E}):\ \mathcal{E}(f,h) = (g,h)_H \},\\
Af &= g,
\end{aligned}
\tag{6.21}
\]

and the image A of E is in fact self-adjoint and negative, hence A ∈ A.

Remark 6.22. We can give a sufficient condition for the existence of an element g as in (6.21). Let $g^*$ denote the functional

\[
g^* : D(\mathcal{E}) \longrightarrow \mathbb{R}, \qquad h \mapsto \mathcal{E}(f,h).
\]

If $g^*$ is a linear functional on D(E) such that

\[
\|g^*\| = \sup\{ |g^*(h)| : \|h\|_H \le 1 \} < \infty,
\]

then there exists an element g as in (6.21): since D(E) ⊂ H is a linear subspace, there exists by the Hahn-Banach theorem an extension $\tilde{g}^* : H \longrightarrow \mathbb{R}$ of $g^*$ to H with $\|\tilde{g}^*\| = \|g^*\|$. The Riesz representation theorem in turn yields the unique element g ∈ H such that $\mathcal{E}(f,h) = (g,h)_H$ for all h ∈ D(E). Hence

\[
D(A) = \{ f \in D(\mathcal{E}) \mid h \mapsto \mathcal{E}(f,h) \text{ is a linear functional on } D(\mathcal{E}), \text{ bounded in } H \}. \tag{6.22}
\]

Note that the linear functional $g^*$ on H belongs to $H^*$ if and only if there exists a g in H such that $\mathcal{E}(f,h) = (g,h)_H$ for all h ∈ D(E), see [2, Theorem 1.11].

$\Phi^{-1}$ : A → F. Indeed, the map Φ is bijective: given A ∈ A we define

\[
D(\mathcal{E}) = D(\sqrt{-A}) \quad\text{and}\quad \mathcal{E}(f) = \|\sqrt{-A} f\|_H^2, \quad f \in D(\sqrt{-A}). \tag{6.23}
\]

Remark 6.23. Since the operators A ∈ A are self-adjoint and negative, the square root $\sqrt{-A}$ of −A exists. Furthermore, $\sqrt{-A}$ itself is a self-adjoint operator on H: it is a consequence of the spectral theorem applied to the Borel function $\varphi(\cdot) = \sqrt{\cdot}$ that for any densely defined self-adjoint operator A on the Hilbert space H, if spec(A) ⊂ [0,∞) for the spectrum of A, then there exists a non-negative definite operator X such that $X^2 = A$. See also [81, Appendix A.5].


4 Diffusions as Dirichlet forms and the intrinsic metric

Let X be a connected locally compact separable metric space with a positive Radon measure m with supp(m) = X. Consider the Hilbert space H = L2(X,m) of real-valued functions which are square-integrable with respect to m. Let E be a Dirichlet form on L2(X,m), i.e. a closed and Markovian, non-negative definite symmetric bilinear form E on a dense subspace D(E) ⊂ L2(X,m).

Definition 6.24 (Regular Dirichlet Form). A Dirichlet form E in the Hilbert space H is called regular if there is a subset of D(E) ∩ Cc(X) which is a core of E, i.e. which is dense in D(E) with respect to the natural norm $\varphi \mapsto (\mathcal{E}(\varphi) + \|\varphi\|_H)^{1/2}$, and which is dense in C0(X) with respect to the supremum norm.

Definition 6.25 (Strongly Local Quadratic Form). Let E be any positive quadratic form on a Hilbert space H. We call E strongly local if E(ψ, ϕ) = 0 for all ϕ, ψ ∈ D(E) with supp(ψ) and supp(ϕ) compact and ψ constant on a neighborhood of supp(ϕ).

Definition 6.26 (Diffusion). We call a strongly local regular Dirichlet form a diffusion.

Definition 6.27 (Normal Contraction). A map F : Rn → R is a contraction on Rn if it satisfies

\[
|F(x) - F(y)| \le \sum_{i=1}^{n} |x_i - y_i| \tag{6.24}
\]

for all x, y ∈ Rn. F is a normal contraction (denoted by $F \in T^n_0$) if F(0) = 0.

Lemma 6.28. Let E be a Dirichlet form on L2(X,m) with domain D(E). Let ϕ ∈ D(E) ∩ L∞(X) be a non-negative function, $\psi_1 \in D(\mathcal{E}) \cap L^\infty(X)$ and $\psi_2 = F \circ \psi_1$ for a normal contraction$^{17}$ $F \in T^1_0$. Then

\[
0 \le \mathcal{E}(\psi_2\varphi, \psi_2) - \tfrac{1}{2}\mathcal{E}(\psi_2^2, \varphi) \le \mathcal{E}(\psi_1\varphi, \psi_1) - \tfrac{1}{2}\mathcal{E}(\psi_1^2, \varphi) \le \|\varphi\|_{L^\infty}\, \mathcal{E}(\psi_1).
\]

Proof. The statement follows from Propositions 2.3.3 and 4.1.1 of [34].

In particular, $\mathcal{E}(\psi_1 h, \psi_1) - \tfrac{1}{2}\mathcal{E}(\psi_1^2, h) < \infty$. This ensures that the following map is well defined.

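Definition 6.29. Let E be a Dirichlet form on L2(X,m). For ψ ∈ D(E) ∩ L∞(X), we define the map

\[
I^{(\mathcal{E})}_\psi : D(\mathcal{E}) \cap L^\infty(X) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto \mathcal{E}(\psi\varphi, \psi) - \tfrac{1}{2}\mathcal{E}(\psi^2, \varphi). \tag{6.25}
\]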

Definition 6.30 (D(E)loc). Let E be a diffusion. Define D(E)loc as the vector space of equivalence classes of measurable functions ψ : X → C such that for every compact subset K ⊂ X there exists a $\tilde{\psi} \in D(\mathcal{E})$ with $\psi|_K = \tilde{\psi}|_K$.

17 See Definition 6.27 above, cf. [34, Def. 2.3.2].


The extension of I from D(E) ∩ L∞(X) to D(E)loc ∩ L∞(X) for diffusions is defined as follows.

Definition 6.31 (I). Let E be a diffusion. Let us write $I_\psi(\varphi) = I^{(\mathcal{E})}_\psi(\varphi)$ for shorter notation when E is fixed. Let $L^\infty_c(X)$ denote the bounded functions on X with compact support. Then for any ψ ∈ D(E)loc ∩ L∞(X) define the map

\[
I_\psi : D(\mathcal{E}) \cap L^\infty_c(X) \longrightarrow \mathbb{R}, \qquad \varphi \longmapsto I_\psi(\varphi).
\]

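Definition 6.32. For any ψ ∈ D(E)loc ∩ L∞(X) we define

\[
|||I_\psi||| := \sup\{ |I_\psi(\varphi)| : \varphi \in D(\mathcal{E}) \cap L^\infty_c(X),\ \|\varphi\|_{L^1(X)} \le 1 \}. \tag{6.26}
\]

Definition 6.33 ($d_\psi(A,B)$). Let ψ ∈ L∞(X) be an arbitrary bounded function and A, B ⊂ X measurable sets. We define a distance function, taking values in (−∞,∞], by

\[
\begin{aligned}
d_\psi(A,B) :=&\ \sup\{ M \in \mathbb{R} : \psi(x) - \psi(y) \ge M \text{ for } \lambda\text{-a.e. } x \in A,\ \lambda\text{-a.e. } y \in B \}\\
=&\ \operatorname*{ess\,inf}_{x\in A} \psi(x) - \operatorname*{ess\,sup}_{y\in B} \psi(y),
\end{aligned}
\tag{6.27}
\]

where $\operatorname*{ess\,sup}_{y\in B} \psi(y) = \inf\{ m \in \mathbb{R} : \lambda(\{ y \in B : \psi(y) > m \}) = 0 \}$, and where λ denotes the Lebesgue measure.

Definition 6.34 (The ter Elst intrinsic metric $d^{(\mathcal{E})}(A,B)$ induced by the Dirichlet form). The set-theoretic distance function for measurable sets A, B ⊂ X which appears in the generalized version of Varadhan's formula is

\[
d^{(\mathcal{E})}(A,B) = \sup\{ d_\psi(A,B) : \psi \in D_0(\mathcal{E}) \}, \tag{6.28}
\]

where the set $D_0(\mathcal{E})$ is defined as

\[
D_0(\mathcal{E}) = \{ \psi \in D(\mathcal{E})_{\mathrm{loc}} \cap L^\infty(X) : |||I_\psi||| \le 1 \}. \tag{6.29}
\]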


Chapter III

Mass at Zero and Small-Strike Implied Volatility Expansions in the SABR Model

1 Introduction

We study the probability mass at the origin in the SABR stochastic volatility model, and derive several tractable expressions for it, in particular when time becomes small or large. In the uncorrelated case, saddlepoint expansions allow for (semi) closed-form asymptotic formulae. As an application, and the original motivation for this work, we derive small-strike expansions for the implied volatility when the maturity becomes short or large. These formulae, by definition arbitrage free, allow us to quantify the impact of the mass at zero on currently used implied volatility expansions. In particular we discuss how much those expansions become erroneous.

We consider the SABR model of Hagan, Kumar, Lesniewski and Woodward in [89, 90] in the form

\[
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\, dW_t, & X_0 &= x_0 > 0,\\
dY_t &= \nu Y_t\, dZ_t, & Y_0 &= y_0 > 0,\\
d\langle Z, W\rangle_t &= \rho\, dt,
\end{aligned}
\tag{1.1}
\]

where ν > 0, ρ ∈ (−1, 1), β ∈ (0, 1), and W and Z are two correlated Brownian motions on a given filtered probability space (Ω, F, (Ft)t≥0, P). It is well known that the popularity of SABR arose from a tractable asymptotic expansion of the implied volatility (derived in [89]), and from its ability to capture the observed volatility smile; calibration therefore being made easier using the aforementioned expansion. In today's low interest rate and high volatility environment, the implied volatility obtained by this very expansion can however yield a negative density function for the process X in (1.1), therefore exhibiting arbitrage. The asymptotic formula happens to lose accuracy in such market environments where the CEV exponent (the parameter β) is chosen close to zero: the parameter β governs the dynamics of the smile, and small values are usually chosen on markets where the forward rate is close to zero and for long-dated options [17, 89], which are precisely those environments where the original asymptotic formula fails.

Indeed, it comes as no surprise that the Hagan formula, which is an asymptotic expansion for small values of ν2T, breaks down for large maturities, but it is well known that the reasons for the inconsistencies of the SABR formula are subtler than that. What we highlight here is that the mass at zero can be held accountable for the irregularities in this case as well. While standard numerical methods proved reliable when the process remains strictly positive, computing the probability mass of the SABR model at the origin is a more delicate issue. Due to the singularity at the origin, usual regularity assumptions ensuring stability of numerical techniques (finite differences or Monte Carlo) are violated at this point, and a rigorous error analysis for these methods is not (yet) available. In addition, producing reference values becomes computationally intensive for short time scales.

Since much of the popularity of the SABR model is due to the tractability of its asymptotic formula, one should aim at preserving it while taking into account the mass at zero. The parameter sets ρ = 0 or β = 0 are the most tractable, and in fact (as observed in Chapter IV) the only ones where certain advantageous regularity properties of the SABR process can be expected. We therefore concentrate here on the singular part of the distribution for these regimes, that is, we study the probability P(XT = 0) and provide tractable formulae and asymptotic approximations. The relevance of these parameter configurations is emphasized by recent results [13], which suggest a so-called 'mixture' SABR approach (a combination of the ρ = 0 and β = 0 cases) to handle negative interest rates in an arbitrage-free way. From a modelling perspective, one may question the relevance of an absorbing boundary condition at zero in a financial context, where negative rates can actually occur. In fact, from a stochastic analysis perspective, when β = 0 there is no need to impose such a boundary condition. Remarkably however, as pointed out in [13], even in market conditions where interest rates become negative, the historical evolution of interest rates suggests that their dynamics follow processes whose probability distributions exhibit a singularity at the origin$^{1}$, which makes the computation of the mass at zero rates relevant for these market scenarios as well.

A further application is a direct approximation of the left wing of the implied volatility smile. In order to understand the small-strike behaviour of the SABR smile it is essential to determine the probability mass at the origin: asymptotic approximations of the implied volatility are available, not only for small and large maturities, but also for extreme strikes. Roger Lee's celebrated Moment Formula [119] relates the behaviour of the implied volatility IT (K) for small strikes K to the price of a Put option on X. This model-independent result was subsequently refined by Benaim and Friz [24] and Gulisashvili [84]. In [46, 85] De Marco et al. and Gulisashvili showed that when the underlying distribution has an atom at zero, the small-strike behaviour of the implied volatility is solely determined by this mass, irrespective of the distribution of the process on (0,∞). Indeed, in our approximation of the implied volatility via the probability mass, we find good agreement with Call option prices in [16].

1 That is, rates 'stick' to zero for certain periods of time, see [13] for more details.

This chapter is organised as follows: in Section 2 we derive explicit formulae for the mass at zero P(XT = 0) in the SABR model for finite time as well as for large times in the uncorrelated case. Under this assumption, it is possible to decompose the distribution into a CEV component and an independent stochastic time change. Such time change techniques have been applied to the SABR model in the uncorrelated case in [15, 40, 65, 102] to determine the exact distribution of the absolutely continuous part of the distribution on (0,∞). Therefore, our formulae complement these by providing the singular part of the distribution (see [98, 170] for more details about time change techniques in stochastic volatility models). In Section 2 and Section 3, we derive asymptotic expansions of the atom at the origin in the short-time and long-time cases.

In Section 3 we study a drift-extended version of (1.1), which we refer to as the Brownian motion on the SABR plane (see (3.1)), and study its dynamics under different configurations of the SABR parameters. Of particular interest are the zero correlation case, which we study in Section 2, and the case β = 0 (and general ρ), which is prevalently chosen on markets when the current asymptotic formula breaks down. Besides, when β = 0, the original SABR dynamics (1.1) coincide with those of the Brownian motion on the SABR plane, and we propose several space transformations to translate the properties of one parameter configuration to another, thus deriving an explicit formula for the mass at zero for large times when β = 0 in the correlated case.

In Section 4, we use the results of the previous sections to determine the left wing of the SABR implied volatility. Using the formulae provided in [46, 85], we highlight the fact that some of the widely used expansions exhibit arbitrage in the left wing, and we propose a way to regularise them in this arbitrageable region.

Preliminaries and Notations: the process X in (1.1) is a martingale [107, Remark 2]. If we consider X on the state space [0,∞), then the origin, which can be attained, has to be absorbing (see [104, Chapter III, Lemma 3.6]). For a given real-valued stochastic process X (with continuous paths) and a real number z, we define by $\tau^X_z := \inf\{t \ge 0 : X_t = z\}$ the first hitting time of X at level z. For convenience, we shall use the (now fairly standard) notation $\bar{\rho} := \sqrt{1-\rho^2}$. For two functions f and g, we shall write f(z) ∼ g(z) as z tends to zero whenever $\lim_{z\to 0} f(z)/g(z) = 1$.

2 Mass at zero in the uncorrelated SABR model

1 The decomposition formula for the mass

In the case where the correlation coefficient ρ is null, the mass at the origin can be computed semi-explicitly. Conditioning on the path of the volatility process Y, the resulting process X satisfies the CEV stochastic differential equation

\[
dX_t = Y_t X_t^{\beta}\, dW_t, \qquad X_0 = x_0,
\]

where Y is a deterministic time-dependent volatility coefficient, and represents, for fixed ω ∈ Ω, a realisation of the paths of Y. Consider now the simple CEV equation $d\tilde{X}_t = \tilde{X}_t^{\beta}\, dW_t$ starting from x0, and set

\[
G_t := \frac{X_t^{2(1-\beta)}}{(1-\beta)^2} \qquad\text{and}\qquad \tilde{G}_t := \frac{\tilde{X}_t^{2(1-\beta)}}{(1-\beta)^2}.
\]

Then $G_t = Z_{\int_0^t Y_s^2\, ds}$, where Z is a Bessel process satisfying the SDE [102, Subsection 1.1]

\[
dZ_t = \frac{1-2\beta}{1-\beta}\, dt + 2\sqrt{|Z_t|}\, dW_t, \qquad Z_0 = \frac{x_0^{2(1-\beta)}}{(1-\beta)^2}.
\]
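The Itô computation behind the claim that the transformed CEV process solves this SDE can be written out as follows. This is a sketch in our own notation: $\tilde{X}$ denotes the solution of the simple CEV equation $d\tilde{X}_t = \tilde{X}_t^{\beta}\, dW_t$, $f(x) = x^{2(1-\beta)}/(1-\beta)^2$, and the identity is understood up to the first hitting time of zero.

```latex
\[
f'(x) = \frac{2\,x^{1-2\beta}}{1-\beta}, \qquad
f''(x) = \frac{2(1-2\beta)\,x^{-2\beta}}{1-\beta}, \qquad
d\langle \tilde{X} \rangle_t = \tilde{X}_t^{2\beta}\, dt,
\]
\[
d f(\tilde{X}_t)
  = f'(\tilde{X}_t)\, d\tilde{X}_t + \tfrac{1}{2} f''(\tilde{X}_t)\, d\langle \tilde{X} \rangle_t
  = \frac{2\,\tilde{X}_t^{1-\beta}}{1-\beta}\, dW_t + \frac{1-2\beta}{1-\beta}\, dt
  = 2\sqrt{f(\tilde{X}_t)}\, dW_t + \frac{1-2\beta}{1-\beta}\, dt .
\]
```

The last equality uses $\sqrt{f(x)} = x^{1-\beta}/(1-\beta)$ for x ≥ 0 and β < 1. The drift coefficient (1 − 2β)/(1 − β) plays the role of the dimension parameter of the squared Bessel-type SDE.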

By Itô's formula, the process $\tilde{G}$ solves the same SDE, so that $Z = \tilde{G}$, and therefore $X_t = \tilde{X}_{\int_0^t Y_s^2\, ds}$ for all t ≥ 0. It follows that X can be obtained from $\tilde{X}$ using the stochastic time change

\[
t \mapsto \int_0^t Y_s^2\, ds. \tag{2.1}
\]

Since this time change process is independent of $\tilde{X}$, one can decompose the mass at zero of the SABR model into the mass of the CEV component at zero and the density of the time change:

\[
P(X_t = 0) = \int_0^\infty P\big(\tilde{X}_r = 0\big)\, P\Big( \int_0^t Y_s^2\, ds \in dr \Big), \tag{2.2}
\]

where the mass at zero in the CEV model is given by (see [46] or [105])

\[
P\big(\tilde{X}_r = 0\big) = 1 - \Gamma\Big( \frac{1}{2(1-\beta)},\ \frac{x_0^{2(1-\beta)}}{2r(\beta-1)^2} \Big), \tag{2.3}
\]

with Γ the (normalised) lower incomplete Gamma function $\Gamma(v,z) \equiv \Gamma(v)^{-1} \int_0^z u^{v-1} e^{-u}\, du$.
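Formulas (2.2) and (2.3) lend themselves to a simple numerical illustration. The sketch below is ours and uses only the Python standard library. The normalised lower incomplete Gamma function is evaluated by its power series with a crude cutoff for large arguments, which is adequate only when β is not too close to 1. The outer integral in (2.2) is replaced by a plain Monte Carlo average over simulated values of the integrated variance, with Y sampled exactly on a grid and the time integral approximated by a left-point Riemann sum. All function names and default parameters are hypothetical, and no error analysis is claimed.

```python
import math
import random


def reg_lower_gamma(a, z, tol=1e-15, max_terms=100000):
    """Normalised lower incomplete Gamma function Gamma(a, z) of (2.3), via its power series."""
    if z <= 0.0:
        return 0.0
    if z > 600.0:
        return 1.0                      # crude cutoff, assumes a is moderate
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    return min(1.0, total * math.exp(-z + a * math.log(z) - math.lgamma(a)))


def cev_mass_at_zero(x0, r, beta):
    """Formula (2.3): mass at zero of the simple CEV process at time r (absorbing origin)."""
    if r <= 0.0:
        return 0.0
    a = 1.0 / (2.0 * (1.0 - beta))
    z = x0 ** (2.0 * (1.0 - beta)) / (2.0 * r * (1.0 - beta) ** 2)
    return 1.0 - reg_lower_gamma(a, z)


def integrated_variance(y0, nu, t, n_steps, rng):
    """Left-point Riemann sum for int_0^t Y_s^2 ds, Y_s = y0 * exp(nu * Z_s - nu^2 * s / 2)."""
    dt = t / n_steps
    z, total = 0.0, 0.0
    for k in range(n_steps):
        y = y0 * math.exp(nu * z - 0.5 * nu * nu * k * dt)
        total += y * y * dt
        z += math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return total


def sabr_mass_at_zero_mc(x0, y0, nu, beta, t, n_paths=2000, n_steps=200, seed=0):
    """Monte Carlo version of (2.2) for rho = 0: average (2.3) over the time change."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_paths):
        acc += cev_mass_at_zero(x0, integrated_variance(y0, nu, t, n_steps, rng), beta)
    return acc / n_paths


if __name__ == "__main__":
    print(cev_mass_at_zero(1.0, 1.0, 0.0), math.erfc(1.0 / math.sqrt(2.0)))
    print(sabr_mass_at_zero_mc(x0=0.05, y0=0.2, nu=0.5, beta=0.3, t=5.0))
```

For β = 0 the CEV mass reduces to the probability that a Brownian motion started at x0 hits zero before time r, which gives a closed-form check against the complementary error function. For very small ν the time change is nearly deterministic, and the Monte Carlo average collapses to (2.3) evaluated at r = y0² t.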

Remark 2.1. If β ∈ [1/2, 1) in (1.1), the origin is naturally absorbing, and the mass at zero is given by (2.3). When β ∈ [0, 1/2), the solution to (1.1) is not unique, and a boundary condition at the origin has to be imposed. Should one consider the origin to be reflecting, the transition density would then become norm preserving, and no mass at the origin would be present. However, it is easy to see that there is an arbitrage opportunity if the origin is reflecting. Formula (2.3) carries over to the case β ∈ [0, 1/2) when the origin is assumed to be absorbing, which we shall always consider from now on. This is of course in line with [104, Chapter III, Lemma 3.6], mentioned above, which states that the origin has to be absorbing for a non-negative supermartingale.

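Since for each s ≥ 0, $Y_s$ is lognormally distributed, we can write

\[
\mathbb{P}\left(\int_0^t Y_s^2\,ds \in dr\right)
= \mathbb{P}\left(\int_0^t e^{2\nu Z^{(-\nu/2)}_s}\,ds \in d\tilde{r}\right), \tag{2.4}
\]

where $\tilde{r} := r/y_0^2$ and $Z^{(-\nu/2)}_s := Z_s - \tfrac{1}{2}\nu s$. In [29, Formula 1.10.4, page 264], the density of the above functional is given by (see also Section 4.3 below for details)

\[
\mathbb{P}\left(\int_0^t e^{2\nu Z^{(-\nu/2)}_s}\,ds \in d\tilde{r}\right)
= \frac{2^{1/4}\sqrt{\nu}}{\tilde{r}^{3/4}}
\exp\left(-\frac{\nu^2 t}{8} - \frac{1}{4\nu^2\tilde{r}}\right)
m_{2\nu^2 t}\left(-\frac{3}{4},\ \frac{1}{4\nu^2\tilde{r}}\right) d\tilde{r}, \tag{2.5}
\]

where the function m is defined as ([29, page 645]):

\[
m_y(\mu, z) \equiv
\frac{8 z^{3/2}\,\Gamma\left(\mu+\frac{3}{2}\right) e^{\pi^2/(4y)}}{\pi\sqrt{2\pi y}}
\int_0^\infty e^{-z\cosh(2u) - u^2/y}\,
M\left(-\mu, \frac{3}{2}, 2z\sinh(u)^2\right)\sinh(2u)\sin\left(\frac{\pi u}{y}\right) du, \tag{2.6}
\]

and where the Kummer function M reads

\[
M(a, b, x) \equiv 1 + \sum_{k=1}^{\infty}
\frac{a(a+1)\cdots(a+k-1)\,x^k}{b(b+1)\cdots(b+k-1)\,k!}. \tag{2.7}
\]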
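The series (2.7) can be summed term by term. The following minimal sketch (the function name `kummer_m` is hypothetical, and plain truncated summation is an assumption that is only adequate for moderate arguments) also checks the Kummer transformation M(a, b, z) = e^z M(b − a, b, −z), which is used later in Section 4.3.

```python
import math

def kummer_m(a, b, x, tol=1e-16, max_terms=100_000):
    """Truncated series (2.7) for the Kummer function M(a, b, x).

    Sketch only: direct summation, reliable for moderate |x|; for large
    negative x the alternating series suffers from cancellation.
    """
    term = 1.0
    total = 1.0
    for k in range(1, max_terms):
        # ratio of consecutive terms: (a+k-1) x / ((b+k-1) k)
        term *= (a + k - 1.0) / (b + k - 1.0) * x / k
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

# M(a, a, x) = exp(x), and the Kummer transformation with a = 3/4, b = 3/2
exp_check = kummer_m(1.0, 1.0, 0.7)
lhs = kummer_m(0.75, 1.5, -2.0)
rhs = math.exp(-2.0) * kummer_m(0.75, 1.5, 2.0)
```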


2 Small-time asymptotics

We now study the behavior of the mass at zero P(Xt = 0) as time becomes small. The main challenge is to provide a short-time asymptotic formula for the density of the time change process, for which standard expansion techniques are not applicable. The additive functional arising from the density of an integral over the exponential of Brownian motion often appears in the pricing of Asian options and is of interest on its own. This density is notoriously difficult to evaluate in small time, due to a highly oscillating factor connected to the Hartman-Watson distribution [130, 131] and [83, Section 4.6]. These numerical issues are discussed in [18], and Gerhold [78] used saddlepoint methods to provide short-time estimates. Because of the time change and the complexity of the Kummer function (in the integrand), small-time asymptotics of the mass at zero cannot be estimated directly. Instead, we use an inverse Laplace transform approach, inspired by [78], to provide small-time asymptotic estimates for the density of the time change. In Section 4 below, we provide several alternative representations for this density, and relate them to the existing literature. We recall the mapping y = 2ν²t, and we shall alternate between the two notations without ambiguity in order to simplify some formulations.

Remark 2.2. Let $\varpi := 1/y$. The function m has the form

\[
m_{\varpi}(\cdot) = c_{\varpi}\int_0^\infty e^{-u^2\varpi} f_{\varpi}(u)\,du,
\]

for some $c_{\varpi}$ and $f_{\varpi}$. One might be tempted to use a standard Laplace method here to determine the asymptotic behavior as $\varpi$ tends to infinity. However, at the saddlepoint $u^* = 0$, attained at the left boundary of the integration domain, all the derivatives of the function $f_{\varpi}$ (appearing as coefficients of the expansion) are null, and hence the method does not apply.

We now formulate one of the main results of this chapter, which characterises the small-time asymptotic behavior of the mass at zero in the uncorrelated SABR model. For every r, y > 0, let $u_y$ denote the largest (positive) solution to the equation

\[
2\mu - 1 + 4uy + 2\log(z/2)\sqrt{u} - \sqrt{u}\log(u) = 0, \tag{2.8}
\]

with $z := \frac{y_0^2}{4\nu^2 r}$. Clearly, $u_y$ depends on r, but we shall omit this dependence in the notation. Set

\[
M_y := \frac{\log(u_y)}{16\,u_y^{3/2}} - \frac{\alpha}{8\,u_y^{3/2}} + \frac{1-2\mu}{8\,u_y^{2}}. \tag{2.9}
\]
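Since the saddlepoint $u_y$ in (2.8) has no closed form, it has to be located numerically. The sketch below is one possible way to do so, not the procedure used in the thesis: it brackets the largest root by a geometric downward scan followed by bisection. The function names, the scan factor `shrink` and the sample parameter values are all assumptions made for illustration.

```python
import math

def saddle_lhs(u, mu, y, z):
    """Left-hand side of equation (2.8) as a function of u."""
    return 2.0 * mu - 1.0 + 4.0 * u * y + math.sqrt(u) * (2.0 * math.log(z / 2.0) - math.log(u))

def saddle_lhs_prime(u, mu, y, z):
    """Derivative in u of the left-hand side of (2.8)."""
    return 4.0 * y + (2.0 * math.log(z / 2.0) - math.log(u) - 2.0) / (2.0 * math.sqrt(u))

def largest_saddle_root(mu, y, z, shrink=0.99, u_min=1e-12):
    """Largest positive root of (2.8), by downward scan plus bisection.

    The left-hand side is convex for u > (z/2)^2, so once it is positive
    and increasing there, no root lies further to the right.
    A coarse scan could in principle step over two very close roots.
    """
    u_hi = max(1.0, (z / 2.0) ** 2)
    while saddle_lhs(u_hi, mu, y, z) <= 0.0 or saddle_lhs_prime(u_hi, mu, y, z) <= 0.0:
        u_hi *= 2.0
    u_lo = u_hi * shrink
    while saddle_lhs(u_lo, mu, y, z) > 0.0:
        u_hi = u_lo
        u_lo *= shrink
        if u_lo < u_min:
            raise ValueError("no sign change found: equation (2.8) may have no positive root")
    for _ in range(200):
        mid = 0.5 * (u_lo + u_hi)
        if saddle_lhs(mid, mu, y, z) > 0.0:
            u_hi = mid
        else:
            u_lo = mid
    return 0.5 * (u_lo + u_hi)

# Illustrative values only: mu = -3/4 as in (2.5), y = 0.1, z = 1
u_y = largest_saddle_root(-0.75, 0.1, 1.0)
```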

Proposition 2.3. For t → 0, the mass of the atom at zero in the uncorrelated SABR model can be approximated by the following asymptotic formula

\[
\mathbb{P}(X_t = 0) \sim \frac{y_0^{3/2}\,e^{5/4}}{2^{7/4}\sqrt{\nu\pi}}
\exp\left(-\frac{\nu^2 t}{8}\right)
\int_0^\infty \exp\left\{\frac{\log(u_y)}{2}\left(\mu - \frac{1}{2}\right) - u_y y + \sqrt{u_y}\right\}
\frac{g(r)}{\sqrt{M_y}}\,dr,
\]

where

\[
g(r) \equiv \mathbb{P}\big(\widetilde{X}_r = 0\big)\,\frac{1}{r^{5/4}}\exp\left(-\frac{y_0^2}{4\nu^2 r}\right).
\]

Remark 2.4. Our numerical experiments underline the validity of this statement. A justification of Proposition 2.3 lies in (2.2) together with the following assertion. The related problem of determining uniform estimates for (2.10) is subject to future research.

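Proposition 2.5. As y (equivalently t) tends to zero, we have (recall that y = 2ν²t)

\[
\mathbb{P}\left(\int_0^t Y_s^2\,ds \in dr\right) \sim
\frac{y_0^{3/2}\,e^{5/4}}{r^{5/4}\,2^{7/4}\sqrt{\nu\pi}}
\exp\left(-\frac{y}{16} - \frac{y_0^2}{4\nu^2 r}\right)
\exp\left[\frac{\log(u_y)}{2}\left(\mu - \frac{1}{2}\right) - u_y y + \sqrt{u_y}\right]
\frac{dr}{\sqrt{M_y}}. \tag{2.10}
\]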

The proof relies on the following proposition, proved in Section 5.

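Proposition 2.6. As y tends to zero, the function $m_y$ in (2.6) satisfies

\[
m_y(\mu, z) \sim \frac{e^{\frac{1}{2}-\mu}\sqrt{z}}{2\pi}
\exp\left[\frac{\log(u_y)}{2}\left(\mu - \frac{1}{2}\right) - u_y y + \sqrt{u_y}\right]
\sqrt{\frac{\pi}{M_y}}.
\]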

Remark 2.7. The proof of the proposition uses saddlepoint analysis. The saddlepoint $u_y$ is the solution to the equation (2.8), but does not admit a closed-form expression; however, as seen in the proof, it is possible to expand it as y tends to zero to obtain

\[
m_y(\mu, z) = \frac{\sqrt{z}\,|\log(y)|}{2\sqrt{\pi}}
\exp\left\{-\frac{\log(y)^2}{4y} + \frac{|\log(y)|}{2y}
+ \left(\frac{1}{2} - \mu\right)\left[1 - \log\left(\frac{|\log(y)|}{2y}\right)\right]\right\}
\times\left[y^{-3/2} + O\left(y^{3/2}\right)\right],
\]

but numerical computations show that this estimate is not very accurate.

3 Large-time asymptotics

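We now concentrate on the large-time behavior of the mass at zero in the uncorrelated SABR model. From [29, Formula 1.8.4, page 612], the formula

\[
\mathbb{P}\left(\int_0^\infty \exp\left(2\nu Z^{(-\nu/2)}_s\right) ds \in d\tilde{r}\right)
= \frac{\tilde{r}^{-3/2}}{\nu\sqrt{2\pi}}\exp\left(-\frac{1}{2\nu^2\tilde{r}}\right) d\tilde{r}
\]

holds, so that the decomposition (2.2) together with (2.4) implies that the mass at zero reads (recall that $\tilde{r} = r/y_0^2$)

\[
\begin{aligned}
\mathbb{P}_\infty := \lim_{t\uparrow\infty}\mathbb{P}(X_t = 0)
&= \frac{y_0}{\nu\sqrt{2\pi}}\int_0^\infty
\left[1 - \Gamma\left(\frac{1}{2(1-\beta)}, \frac{x_0^{2(1-\beta)}}{2r(\beta-1)^2}\right)\right]
r^{-3/2}\exp\left(-\frac{y_0^2}{2\nu^2 r}\right) dr \\
&= 1 - \frac{y_0}{\nu\sqrt{2\pi}}\int_0^\infty
\Gamma\left(\frac{1}{2(1-\beta)}, \frac{x_0^{2(1-\beta)}}{2r(\beta-1)^2}\right)
r^{-3/2}\exp\left(-\frac{y_0^2}{2\nu^2 r}\right) dr.
\end{aligned} \tag{2.11}
\]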


In the case where β = 0 (= ρ), the SABR model (1.1) reduces to a Brownian motion on the hyperbolic plane (up to a deterministic time change, see Section 1), and a simple computation shows that (2.11) simplifies to

\[
\mathbb{P}_\infty\big|_{\beta=0} = 1 - \frac{2}{\pi}\arctan\left(\frac{\nu x_0}{y_0}\right).
\]

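When β ≠ 0, the integral in (2.11) does not have a closed-form expression. Expanding the exponential factor for small $y_0$, we can however write, for any n ∈ N, the nth-order approximation

\[
\begin{aligned}
\mathbb{P}^{(n)}_\infty
&:= \int_0^\infty
\left[1 - \Gamma\left(\frac{1}{2(1-\beta)}, \frac{x_0^{2(1-\beta)}}{2r(\beta-1)^2}\right)\right]
\frac{y_0}{\nu r^{3/2}\sqrt{2\pi}}
\sum_{k=0}^{n}\frac{1}{k!}\left(-\frac{y_0^2}{2\nu^2 r}\right)^{k} dr \\
&= \sum_{k=0}^{n}\frac{y_0^{2k+1}}{k!\,\nu\sqrt{2\pi}}\left(-\frac{1}{2\nu^2}\right)^{k}
\int_0^\infty
\left[1 - \Gamma\left(\frac{1}{2(1-\beta)}, \frac{x_0^{2(1-\beta)}}{2r(\beta-1)^2}\right)\right]
r^{-(k+3/2)}\,dr \\
&= \frac{2y_0(1-\beta)}{\Gamma\left(\frac{1}{2(1-\beta)}\right)\nu\sqrt{\pi}\,x_0^{1-\beta}}
\sum_{k=0}^{n}\frac{(-1)^k}{k!}
\left(\frac{y_0^2(\beta-1)^2}{\nu^2 x_0^{2(1-\beta)}}\right)^{k}
\frac{\Gamma\left(k + 1 + \frac{\beta}{2-2\beta}\right)}{1+2k}.
\end{aligned}
\]

Note in particular that

\[
\mathbb{P}^{(0)}_\infty =
\frac{2\,\Gamma\left(1 + \frac{\beta}{2-2\beta}\right)}{\Gamma\left(\frac{1}{2-2\beta}\right)}\,
\frac{y_0(1-\beta)}{\nu\sqrt{\pi}\,x_0^{1-\beta}}. \tag{2.12}
\]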

When r tends to infinity, the integrand clearly converges to zero fast enough. Using the properties of Gamma functions in [1, Chapter 6], the asymptotic behavior

\[
1 - \Gamma\left(a, \frac{1}{r}\right) \sim \frac{r^{1-a}\exp(-1/r)}{\Gamma(a)}
\]

holds as r tends to zero, ensuring that the integral is well defined for all n ∈ N. Theorem 2.8 below shows how well (and when) the sequence $\mathbb{P}^{(n)}_\infty$ approximates the mass at zero $\mathbb{P}_\infty$. Using the Taylor formula with Lagrange's form of the remainder, we obtain

\[
\exp\left(-\frac{y_0^2}{2\nu^2 r}\right)
= \sum_{k=0}^{n}(-1)^k\frac{1}{k!}\left(\frac{y_0^2}{2\nu^2 r}\right)^{k}
+ \frac{(-1)^{n+1}}{(n+1)!}\,e^{-\theta}\left(\frac{y_0^2}{2\nu^2 r}\right)^{n+1},
\]

for some $\theta \in (0, y_0^2/(2\nu^2 r))$. Therefore,

\[
\left|\exp\left(-\frac{y_0^2}{2\nu^2 r}\right)
- \sum_{k=0}^{n}(-1)^k\frac{1}{k!}\left(\frac{y_0^2}{2\nu^2 r}\right)^{k}\right|
\le \frac{1}{(n+1)!}\left(\frac{y_0^2}{2\nu^2 r}\right)^{n+1}. \tag{2.13}
\]

For any n ≥ 0, set

\[
b_n := \frac{2y_0(1-\beta)}{\Gamma\left(\frac{1}{2(1-\beta)}\right)\nu\sqrt{\pi}\,x_0^{1-\beta}}
\left(\frac{y_0^2(\beta-1)^2}{\nu^2 x_0^{2(1-\beta)}}\right)^{n}
\frac{\Gamma\left(n + 1 + \frac{\beta}{2-2\beta}\right)}{n!\,(1+2n)}, \tag{2.14}
\]

so that from (2.13), (2.14), and the definitions of $\mathbb{P}_\infty$ and $\mathbb{P}^{(n)}_\infty$, it follows that

\[
\mathbb{P}^{(n)}_\infty = \sum_{k=0}^{n}(-1)^k b_k, \qquad n \ge 0, \tag{2.15}
\]

and

\[
\left|\mathbb{P}_\infty - \mathbb{P}^{(n)}_\infty\right| \le b_{n+1}, \qquad n \ge 0. \tag{2.16}
\]

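Theorem 2.8. The following statements hold for the sequence $\mathbb{P}^{(n)}_\infty$ in (2.15):

(i) If $y_0^2(\beta-1)^2 > \nu^2 x_0^{2(1-\beta)}$, or $y_0^2(\beta-1)^2 = \nu^2 x_0^{2(1-\beta)}$ and $\frac{2}{3} \le \beta < 1$, then the sequence $(\mathbb{P}^{(n)}_\infty)_{n\ge 0}$ diverges, and hence cannot be an approximation to the mass at zero $\mathbb{P}_\infty$;

(ii) If $y_0^2(\beta-1)^2 < \nu^2 x_0^{2(1-\beta)}$, or $y_0^2(\beta-1)^2 = \nu^2 x_0^{2(1-\beta)}$ and $0 \le \beta < \frac{2}{3}$, then

\[
\mathbb{P}_\infty = \mathbb{P}^{(n)}_\infty
+ O\left(n^{-1+\frac{\beta}{2-2\beta}}
\exp\left(-n\log\left(\frac{\nu^2 x_0^{2(1-\beta)}}{y_0^2(\beta-1)^2}\right)\right)\right),
\qquad\text{as } n\to\infty. \tag{2.17}
\]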

Remark 2.9. Recall that x0 denotes the initial value of the stock price or interest rate. For all practical and sensible values of the parameters, condition (ii) in the theorem is always in force.

Proof. From (2.14), Stirling's formula for the Gamma function yields

\[
b_k \sim \frac{y_0(1-\beta)}{\Gamma\left(\frac{1}{2(1-\beta)}\right)\nu\sqrt{\pi}\,x_0^{1-\beta}}\,
k^{-1+\frac{\beta}{2-2\beta}}
\left(\frac{y_0^2(\beta-1)^2}{\nu^2 x_0^{2(1-\beta)}}\right)^{k}, \tag{2.18}
\]

as k tends to infinity. From (2.15), if the conditions of Theorem 2.8(i) hold, then the general term of the series $\sum_{k=0}^{\infty}(-1)^k b_k$ does not tend to zero, and hence the sequence $(\mathbb{P}^{(n)}_\infty)_{n\ge 0}$ diverges. On the other hand, if the conditions of Theorem 2.8(ii) hold, then (2.16) and (2.18) imply (2.17), which completes the proof of Theorem 2.8.

For practical purposes, depending on whether the conditions for convergence of the sequence $(\mathbb{P}^{(n)}_\infty)_{n\ge 0}$ in Theorem 2.8 are satisfied, it may or may not be useful to use the integral form (2.11) directly. Consider the following values: (y0, ν, β, x0) = (0.1, 1.0, 0.2, 0.2), for which convergence is ensured. The exact mass at zero in this case is P∞ = 20.833%. Using Theorem 2.8, the table below reports the error obtained with the sequence $(\mathbb{P}^{(n)}_\infty)_{n\ge 0}$:

                    n = 0      n = 1      n = 2      n = 3      n = 4      n = 5
|P∞ − P∞^(n)|       6.43E-3    3.41E-4    2.13E-05   1.43E-06   1.01E-07   7.29E-09
Comp. time (s)      6.8E-05    8.6E-05    1.3E-4     1.9E-4     2.2E-4     2.6E-4
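The partial sums (2.15) are straightforward to evaluate. The sketch below is an illustration and not the code used to produce the table above; the function names are hypothetical, and the log-Gamma evaluation of the ratio Γ(k + 1 + β/(2 − 2β))/k! is an implementation choice made here to avoid overflow for large k.

```python
import math

def series_terms(y0, nu, beta, x0, n):
    """Coefficients b_0, ..., b_n of (2.14)."""
    a = 1.0 / (2.0 * (1.0 - beta))
    c = beta / (2.0 - 2.0 * beta)
    prefactor = 2.0 * y0 * (1.0 - beta) / (
        math.gamma(a) * nu * math.sqrt(math.pi) * x0 ** (1.0 - beta))
    q = (y0 * (1.0 - beta)) ** 2 / (nu ** 2 * x0 ** (2.0 * (1.0 - beta)))
    return [
        prefactor * q ** k
        * math.exp(math.lgamma(k + 1.0 + c) - math.lgamma(k + 1.0)) / (1.0 + 2.0 * k)
        for k in range(n + 1)
    ]

def mass_at_zero_series(y0, nu, beta, x0, n):
    """Partial sum (2.15): alternating sum of the b_k up to order n."""
    return sum((-1) ** k * b for k, b in enumerate(series_terms(y0, nu, beta, x0, n)))

# Parameters of the example in the text
p0 = mass_at_zero_series(0.1, 1.0, 0.2, 0.2, 0)   # leading-order term (2.12)
p5 = mass_at_zero_series(0.1, 1.0, 0.2, 0.2, 5)
```

For β = 0 the same partial sums reproduce the Taylor series of the closed-form expression 1 − (2/π) arctan(νx0/y0), which gives a simple consistency check.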


and the table below computes the integral (2.11) using the Python scipy toolpack for quadrature; the integral is truncated at some arbitrary value R > 0:

              R = 20     R = 40     R = 60     R = 80     R = 100    R = 120
Abs. error    2.33E-4    1.07E-4    6.77E-05   4.90E-05   3.81E-05   3.11E-05
Comp. time    7.6E-3     7.9E-3     8.9E-3     9.2E-3     9.6E-3     9.9E-3

These results suggest that convergence of the series expansion is extremely fast. In particular, even the leading-order term (2.12), with n = 0, yields a very accurate result, which allows for a simple interpretation of the impact of each parameter of the model on the large-time mass at the origin.

Remark 2.10. One could in principle compare these values with Monte Carlo simulations. However, as far as we are aware, no rate of convergence for such schemes has yet been proved for the SABR model, so one may question numbers generated by simulation. In addition, it is known [40] that in the critical region around zero, Monte Carlo methods are prone to a simulation bias. Nevertheless, for a comparison with the above results we included some corresponding values for the mass generated by a Monte Carlo algorithm in Section 3.1 below.

3.1 Large-time numerics

We provide below some numerics of the large-time mass at zero derived in (2.11). In particular, we observe the influence of the parameter β (Figure III.1) as well as that of the starting point x0 (Figure III.2) of the uncorrelated SDE (1.1). As β tends to one (from below), the mass at zero is diminishing, even for arbitrarily small values of x0. Likewise, as the initial value x0 increases, the mass at the origin decreases even for β = 0. We shall further comment on the importance of the mass at the origin in financial modelling in Section 4 below.

Figure III.1: Influence of β on the large-time mass at zero in the uncorrelated SABR model with (y0, ν) = (0.015, 0.6) (left) and (y0, ν) = (0.1, 1) (right).


Figure III.2: Influence of the initial value x0 on the large-time mass at zero with (y0, ν) = (0.015, 0.6) (left) and (y0, ν) = (0.1, 1) (right). This gives a numerical interpretation of ‘feeling the boundary’: as we start the diffusion far enough from the origin, the mass at zero becomes small.

With due caution with respect to their validity (cf. Remark 2.10), we include here for a comparison a sample of values for the mass at zero obtained by Monte Carlo simulations with M = 1000 and M = 2000 paths, and the corresponding computation times, for different time horizons T. The obtained values suggest that the ‘large-time’ regime is already achieved for maturities equal to 15 years. An explanation for this phenomenon is provided in Chapter IV, see also [52, Section 4]. As in Section 3 above, we used the parameters (y0, ν, β, x0) = (0.1, 1.0, 0.2, 0.2), for which the exact mass at zero is P∞ = 20.833%.

                       T = 10    T = 15    T = 20    T = 30    T = 50    T = 100
MC mass (M = 1000)     0.1889    0.2020    0.2100    0.2110    0.2050    0.2090
Comp. time (in s)      3.222     3.185     3.221     3.179     3.163     3.178

                       T = 10    T = 15    T = 20    T = 30    T = 50    T = 100
MC mass (M = 2000)     0.2100    0.2075    0.2050    0.2100    0.2065    0.2175
Comp. time (in s)      6.4437    6.8386    6.4805    7.6308    6.6587    6.372
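The time-change decomposition (2.2) also suggests a conditional Monte Carlo estimator: simulate only the volatility path, accumulate the integrated variance, and average the CEV mass at zero evaluated at that random time. The sketch below is not the Monte Carlo algorithm behind the tables above (which is not described in this section). It is restricted to β = 0, where (2.3) reduces to a complementary error function, and the function name, the Euler time grid and the default sample sizes are all assumptions.

```python
import math
import random

def mc_mass_at_zero_beta0(y0, nu, x0, T, n_paths=1000, n_steps=500, seed=0):
    """Conditional Monte Carlo estimate of P(X_T = 0) for beta = 0, rho = 0.

    For each path, the integrated variance int_0^T Y_s^2 ds is approximated
    by a left-point Riemann sum along an exactly simulated lognormal path,
    and the absorbed-Brownian-motion mass erfc(x0 / sqrt(2 A)) is averaged.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        log_y = math.log(y0)
        integrated_var = 0.0
        for _ in range(n_steps):
            integrated_var += math.exp(2.0 * log_y) * dt
            # exact lognormal step: d log Y = -nu^2/2 dt + nu dZ
            log_y += -0.5 * nu * nu * dt + nu * sqrt_dt * rng.gauss(0.0, 1.0)
        total += math.erfc(x0 / math.sqrt(2.0 * integrated_var))
    return total / n_paths

# Compare with the closed-form large-time limit for beta = 0
estimate = mc_mass_at_zero_beta0(0.1, 1.0, 0.2, 20.0)
limit = 1.0 - 2.0 / math.pi * math.atan(1.0 * 0.2 / 0.1)
```

Because the CEV component is integrated out analytically, the estimator avoids simulating X near the origin, which is where Remark 2.10 locates the simulation bias; the discretisation error of the integrated variance remains.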


4 Representations of the density of the integrated variance

We are interested here in computing tractable expressions for the density of the random variable $\int_0^t Y_s^2\,ds$ in the SABR model (1.1). When deriving the asymptotic behavior of the mass at zero above, we used the representation (2.5). There exist alternative representations for this density in the literature, and this section aims at clarifying the links between them, as well as providing some related formulae of independent interest. In particular, following the approach by Dufresne, we provide an alternative formula for the density (see Theorem 2.13), as a double integral, from which it however seems more involved to derive asymptotic behaviors. In order to set the notations, consider the following drifted version of the model:

\[
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\,dW_t, & X_0 &= x_0 \in \mathbb{R},\\
dY_t &= \left(\mu + \frac{1}{2}\right) Y_t\,dt + Y_t\,dZ_t, & Y_0 &= y_0 > 0,
\end{aligned} \tag{2.19}
\]

for some µ ∈ R, and we shall, unless otherwise stated, assume that y0 = 1. Let further $Y^{(\mu)}$ denote the integrated variance process

\[
Y^{(\mu)}_t := \int_0^t Y_s^2\,ds = y_0^2\int_0^t e^{2Z^{(\mu)}_s}\,ds, \tag{2.20}
\]

where we recall that $Z^{(\mu)}_s := Z_s + \mu s$. Finally, for any t ≥ 0, we shall denote by $\varphi^{(\mu)}_t$ and $\lambda^{(\mu)}_t$ the probability density functions, on (0,∞), of the random variables $Y^{(\mu)}_t$ and $1/(2Y^{(\mu)}_t)$. We first start with a ‘toy’ version, for which computations are simpler, before extending them to the standard case.

4.1 The ‘toy’ SABR model

Definition 2.11. The ‘toy’ SABR model is the unique strong solution to (2.19) with µ = 0, y0 = 1.

Define the function Ψ : (0,∞) × R → R by

\[
\Psi(t, w) \equiv \frac{1}{\sqrt{(1+2w)t}}
\exp\left(-\frac{\left(\log\left[(2w)^{1/2} + (2w+1)^{1/2}\right]\right)^2}{2t}\right). \tag{2.21}
\]

The next lemma derives an expression for the density $\varphi^{(0)}_t$ of the random variable $Y^{(0)}_t$.

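Lemma 2.12. For all t, x, R > 0,

\[
\varphi^{(0)}_t\left(\frac{1}{x}\right)
= \frac{x^{3/2}}{2I\pi}\int_{R-I\infty}^{R+I\infty}\Psi(t, z)\,e^{xz}\,dz
= \frac{x^{3/2}}{2\pi}\,e^{Rx}\int_{-\infty}^{\infty}\Psi(t, R+Is)\,e^{Ixs}\,ds.
\]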

Proof. Recall Bougerol's identity [83, Equation 4.9]: let Z and $Z^{\perp}$ be two independent Brownian motions and let $Y^{(0)}$ be the process defined in (2.20). Then $\sinh(Z_t)$ and $Z^{\perp}_{Y^{(0)}_t}$ have the same law for each t ≥ 0. It follows that, for any z ∈ R,

\[
\begin{aligned}
\frac{1}{\sqrt{2\pi t(1+z^2)}}\exp\left(-\frac{(\operatorname{arcsinh}(z))^2}{2t}\right)
&= \frac{1}{\sqrt{2\pi}}\int_0^\infty y^{-1/2}\exp\left(-\frac{z^2}{2y}\right)\varphi^{(0)}_t(y)\,dy \\
&= \frac{1}{\sqrt{2\pi}}\int_0^\infty u^{-3/2}\,\varphi^{(0)}_t\left(\frac{1}{u}\right)\exp\left(-\frac{z^2 u}{2}\right) du,
\end{aligned}
\]

with the substitution $y = u^{-1}$. Next, using $w = \frac{1}{2}z^2$ and the identity $\operatorname{arcsinh}(a) = \log\left(a + \sqrt{a^2+1}\right)$, we obtain, for all t > 0 and $w \in \mathbb{C}_+$,

\[
\int_0^\infty u^{-3/2}\,\varphi^{(0)}_t\left(\frac{1}{u}\right) e^{-wu}\,du = \Psi(t, w), \tag{2.22}
\]

where the function Ψ is defined in (2.21). The lemma then follows from applying the Mellin inverse formula to the Laplace transform (2.22). Note that, from (2.21), the function Ψ(t, ·) clearly admits an analytic extension to $\mathbb{C}_+$ for every t > 0. Therefore, we can extend formula (2.22) to $\mathbb{C}_+$, where we use the principal branches of the functions $w \mapsto \sqrt{w}$ and $w \mapsto \log(w)$.
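The step from Bougerol's identity to (2.22) rests on the substitution w = z²/2, under which Ψ(t, w)/√(2π) coincides with the density of sinh(Z_t) at z. A small numerical sanity check of this identification for real z > 0 can be sketched as follows; the function names are hypothetical, and the formula for Ψ assumes the reading of (2.21) with arguments √(2w) and √(2w + 1) inside the logarithm.

```python
import math

def psi(t, w):
    """Function Psi of (2.21) for real w >= 0."""
    log_term = math.log(math.sqrt(2.0 * w) + math.sqrt(2.0 * w + 1.0))
    return math.exp(-log_term ** 2 / (2.0 * t)) / math.sqrt((1.0 + 2.0 * w) * t)

def sinh_bm_density(t, z):
    """Density at z of sinh(Z_t), with Z a standard Brownian motion."""
    return math.exp(-math.asinh(z) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t * (1.0 + z * z))

# With w = z^2 / 2, Psi(t, w) / sqrt(2 pi) should match the density of sinh(Z_t)
t, z = 0.5, 1.3
gap = abs(psi(t, 0.5 * z * z) / math.sqrt(2.0 * math.pi) - sinh_bm_density(t, z))
```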

4.2 Density of the integrated variance

We now return to the standard SABR model defined in (1.1), and we let ν = 1 and y0 = 1 for computational simplicity. This then corresponds to (2.20) with µ = −1/2, and the corresponding time change is now $Y^{(-1/2)}_t$. The following theorem gives a representation formula for its density (recall that the function Ψ is given by (2.21)).

Theorem 2.13. For all t > 0, x > 0, and R > 0, with the function Ψ defined in (2.21),

\[
\varphi^{(-1/2)}_t(x) = \frac{(2t)^{-1/4}\,e^{-t/8}}{2\pi\,\Gamma(1/4)\,x^2}
\exp\left(-\frac{1}{2tx^2}\right)
\int_1^\infty \frac{\exp\left(\frac{1}{2tx^2u} - \frac{R}{x\sqrt{u}}\right)}{u^{1/4}(u-1)^{3/4}}
\left[\int_{\mathbb{R}}\Psi(t, R+Is)\exp\left(\frac{Is}{x\sqrt{u}}\right) ds\right] du.
\]

The theorem follows directly, by the Mellin inversion of the Laplace transform, from Lemma 2.15 below. Before stating the lemma, though, we need to recall the notion of a fractional integral:

Definition 2.14. Let α ≥ 0 and f be a non-negative Lebesgue measurable function on (0,∞). Then the fractional integral of order α of the function f is defined, for all σ > 0, as

\[
J_\alpha f(\sigma) := \frac{1}{\Gamma(\alpha)}\int_\sigma^\infty (\tau-\sigma)^{\alpha-1} f(\tau)\,d\tau.
\]
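To make Definition 2.14 concrete, the following sketch evaluates the fractional integral numerically for α > 0. The substitution τ = σ + v^{1/α}, which removes the integrable singularity at τ = σ and turns the integral into Γ(α + 1)^{-1} ∫_0^∞ f(σ + v^{1/α}) dv, is a choice made here for illustration, as are the function name, the truncation level `v_max` and the midpoint rule.

```python
import math

def fractional_integral(alpha, f, sigma, v_max=60.0, n=200_000):
    """Midpoint-rule sketch of J_alpha f(sigma) from Definition 2.14 (alpha > 0).

    After tau = sigma + v**(1/alpha), the weight (tau - sigma)**(alpha - 1) dtau
    becomes dv / alpha, so J_alpha f(sigma) = (1 / Gamma(alpha + 1)) * int_0^inf f(...) dv.
    The integral is truncated at v_max, which assumes f decays fast enough.
    """
    h = v_max / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        total += f(sigma + v ** (1.0 / alpha))
    return total * h / math.gamma(alpha + 1.0)

# For f(tau) = exp(-tau), a direct computation gives J_alpha f(sigma) = exp(-sigma)
value = fractional_integral(0.75, lambda tau: math.exp(-tau), 1.0)
```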


Lemma 2.15. For all ξ > 0, t > 0,

\[
\int_0^\infty u^{-9/4}\,\varphi^{(-1/2)}_t\left(\frac{1}{u}\right) e^{-\xi u}\,du
= \frac{e^{-t/8}}{(1+2\xi)^{1/4}}\,J_{3/4}\Psi(t,\cdot)(\xi).
\]

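Proof of Lemma 2.15. Dufresne's recurrence formula (see [55] or [83, Section 5.4.5]) allows us to relate the volatility densities for different drifts in the volatility equation in the SABR model; applying [83, Formula (5.92)] with r = −1/4, µ = −1/2 and s = −w < 0, we obtain

\[
e^{t/8}\int_0^\infty \frac{\lambda^{(-1/2)}_t(x)}{x^{1/4}}\,e^{-wx}\,dx
= (1+w)^{-1/4}\int_0^\infty \frac{\lambda^{(0)}_t(x)}{x^{1/4}}\,e^{-wx}\,dx. \tag{2.23}
\]

Clearly, $\lambda^{(-1/2)}_t(x) \equiv \frac{1}{2x^2}\varphi^{(-1/2)}_t\left(\frac{1}{2x}\right)$ and $\lambda^{(0)}_t(x) \equiv \frac{1}{2x^2}\varphi^{(0)}_t\left(\frac{1}{2x}\right)$, so that (2.23) reads

\[
e^{t/8}\int_0^\infty \frac{\varphi^{(-1/2)}_t\left(\frac{1}{2x}\right)}{x^{9/4}}\,e^{-wx}\,dx
= (1+w)^{-1/4}\int_0^\infty \frac{\varphi^{(0)}_t\left(\frac{1}{2x}\right)}{x^{9/4}}\,e^{-wx}\,dx.
\]

Substituting 2x = u and ξ = w/2, we obtain

\[
e^{t/8}\int_0^\infty \frac{\varphi^{(-1/2)}_t(1/u)}{u^{9/4}}\,e^{-\xi u}\,du
= (1+2\xi)^{-1/4}\int_0^\infty \frac{\varphi^{(0)}_t(1/u)}{u^{9/4}}\,e^{-\xi u}\,du. \tag{2.24}
\]

We now simplify the integral on the right-hand side of (2.24), using (2.22) and (2.21). Let $\mathcal{L}f(w) \equiv \int_0^\infty f(u)e^{-wu}\,du$ on (0,+∞) denote the Laplace transform of the function f. The formula

\[
J_\alpha(\mathcal{L}f)(w) = \int_0^\infty u^{-\alpha} f(u)\,e^{-wu}\,du,
\]

valid for any w > 0, is immediate from Definition 2.14, so that (2.22) yields

\[
\int_0^\infty \frac{\varphi^{(0)}_t(1/u)}{u^{9/4}}\,e^{-\xi u}\,du
= \int_0^\infty u^{-3/4}\,u^{-3/2}\,\varphi^{(0)}_t\left(\frac{1}{u}\right) e^{-\xi u}\,du
= J_{3/4}\left(\Psi(t,\cdot)\right)(\xi), \tag{2.25}
\]

with Ψ defined in (2.21), and the lemma follows from (2.24) and (2.25).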

Proof of Theorem 2.13. We only sketch the proof of the theorem. Dufresne's recurrence formula [83, Theorem 5.25] with r = −1/4 and β = 0 yields

\[
\varphi^{(-1/2)}_t(\sqrt{y}) = \frac{(2t)^{-1/4}\,e^{-t/8}}{\Gamma(1/4)\,y}
\exp\left(-\frac{1}{2ty}\right)
\int_y^\infty \frac{\sqrt{\tau}}{(\tau-y)^{3/4}}\exp\left(\frac{1}{2t\tau}\right)
\varphi^{(0)}_t(\sqrt{\tau})\,d\tau. \tag{2.26}
\]

Furthermore, Lemma 2.12 implies

\[
\varphi^{(0)}_t(\sqrt{\tau}) = \frac{\exp\left(-R/\sqrt{\tau}\right)}{2\pi\tau^{3/4}}
\int_{-\infty}^{\infty}\Psi(t, R+Is)\exp\left(\frac{Is}{\sqrt{\tau}}\right) ds,
\]

so that, taking this and (2.26) into account, we obtain

\[
\varphi^{(-1/2)}_t(\sqrt{y}) = \frac{(2t)^{-1/4}}{2\pi\,\Gamma(1/4)\,y}
\exp\left(-\frac{t}{8} - \frac{1}{2ty}\right)
\int_y^\infty \frac{\tau^{-1/4}}{(\tau-y)^{3/4}}
\exp\left(\frac{1}{2t\tau} - \frac{R}{\sqrt{\tau}}\right)
\left[\int_{\mathbb{R}}\Psi(t, R+Is)\exp\left(\frac{Is}{\sqrt{\tau}}\right) ds\right] d\tau.
\]

The change of variables τ = yu implies

\[
\varphi^{(-1/2)}_t(\sqrt{y}) = \frac{e^{-t/8}\exp\left(-\frac{1}{2ty}\right)}{2\pi(2t)^{1/4}\,\Gamma(1/4)\,y}
\int_1^\infty \frac{\exp\left(\frac{1}{2tyu} - \frac{R}{\sqrt{yu}}\right)}{u^{1/4}(u-1)^{3/4}}
\left[\int_{\mathbb{R}}\Psi(t, R+Is)\exp\left(\frac{Is}{\sqrt{yu}}\right) ds\right] du,
\]

and the theorem then follows from the mapping $\sqrt{y} \mapsto x$.

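Remark 2.16. If Lemma 2.12 holds also for R = 0, then the formula simplifies to

\[
\varphi^{(-1/2)}_t(x) = \frac{(2t)^{-1/4}\,e^{-t/8}}{2\pi\,\Gamma(1/4)\,x^2}
\exp\left(-\frac{1}{2tx^2}\right)
\int_1^\infty \left(\frac{\exp\left(\frac{1}{2tx^2u}\right)}{u^{1/4}(u-1)^{3/4}}
\int_{\mathbb{R}}\Psi(t, Is)\exp\left(\frac{Is}{x\sqrt{u}}\right) ds\right) du.
\]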

4.3 Justification of the Borodin-Salminen formula for $\varphi^{(-1/2)}_t$ via Dufresne's formula

Using some results of [55], and assuming for simplicity that y0 = 1, we provide a justification for the representation (2.5) for the density $\varphi^{(\mu)}_t$ of $Y^{(\mu)}_t$, for any µ ∈ R:

Proposition 2.17. The representation (2.5) holds.

Proof. Recall that $\lambda^{(\mu)}_t$ denotes the density of $1/(2Y^{(\mu)}_t)$. Therefore $\varphi^{(\mu)}_t$ clearly satisfies

\[
\varphi^{(\mu)}_t(x) = \frac{1}{2x^2}\,\lambda^{(\mu)}_t\left(\frac{1}{2x}\right), \qquad\text{for all } x > 0. \tag{2.27}
\]

Dufresne [55, Theorem 4.2] showed that $\lambda^{(\mu)}_t$ admits the following closed-form representation:

\[
\lambda^{(\mu)}_t(x) = \exp\left(-\frac{\mu^2 t}{2}\right) p^{(\mu)}_t(x), \tag{2.28}
\]

where

\[
p^{(\mu)}_t(x) \equiv 2^{-\mu}x^{-\frac{\mu+1}{2}}\int_{\mathbb{R}} e^{-x\cosh^2(y)}\,q(y,t)
\cos\left[\frac{\pi}{2}\left(\frac{y}{t} - \mu\right)\right] H_\mu\left(\sqrt{x}\sinh(y)\right) dy,
\]

$H_\mu$ is the Hermite function of order µ, and $q(y,t) \equiv \frac{1}{\pi\sqrt{2t}}\exp\left(\frac{\pi^2}{8t} - \frac{y^2}{2t}\right)\cosh(y)$. When µ ≠ −2, −4, ..., Dufresne [55, Equation (4.13)] proved the following equivalent formulation:

\[
p^{(\mu)}_t(x) = \frac{2\,\Gamma(1+\mu/2)}{\Gamma(3/2)}\,x^{-\mu/2}
\int_0^\infty e^{-x\cosh^2(y)}\,q(y,t)\sinh(y)\sin\left(\frac{\pi y}{2t}\right)
M\left(\frac{1-\mu}{2}, \frac{3}{2}; x\sinh^2(y)\right) dy, \tag{2.29}
\]

where M is the confluent hypergeometric function (Kummer function), also denoted by ${}_1F_1$. We shall call this formula ‘Dufresne's formula’. Since $\Gamma(3/2) = \sqrt{\pi}/2$, the formulae (2.27), (2.28), and (2.29) with µ = −1/2 yield, after simplifications,

\[
\varphi^{(-1/2)}_t(x) = \frac{2^{1/4}\,\Gamma(3/4)}{\pi^{3/2}\,t^{1/2}}\,x^{-9/4}
\exp\left(\frac{\pi^2}{8t} - \frac{t}{8} - \frac{1}{2x}\right)
\int_0^\infty \exp\left(-\frac{y^2}{2t}\right)\sinh(y)\cosh(y)\sin\left(\frac{\pi y}{2t}\right)
M\left(\frac{3}{4}, \frac{3}{2}; -\frac{\sinh^2(y)}{2x}\right) dy. \tag{2.30}
\]

From [1, Formula 13.1.27], the identity $M(a, b; z) = e^{z}M(b-a, b; -z)$ holds, and hence

\[
\begin{aligned}
M\left(\frac{3}{4}, \frac{3}{2}; -\frac{\sinh^2(y)}{2x}\right)
&= \exp\left(-\frac{\sinh^2(y)}{2x}\right) M\left(\frac{3}{4}, \frac{3}{2}; \frac{\sinh^2(y)}{2x}\right) \\
&= e^{1/(4x)}\exp\left(-\frac{\cosh(2y)}{4x}\right) M\left(\frac{3}{4}, \frac{3}{2}; \frac{\sinh^2(y)}{2x}\right),
\end{aligned}
\]

so that (2.30) simplifies to

\[
\varphi^{(-1/2)}_t(x) = \frac{2^{1/4}\,\Gamma(3/4)}{\pi^{3/2}\sqrt{t}}
\exp\left(\frac{\pi^2}{8t} - \frac{t}{8} - \frac{1}{4x}\right) x^{-9/4}
\int_0^\infty \exp\left(-\frac{\cosh(2y)}{4x}\right)\exp\left(-\frac{y^2}{2t}\right)
\sinh(y)\cosh(y)\sin\left(\frac{\pi y}{2t}\right)
M\left(\frac{3}{4}, \frac{3}{2}; \frac{\sinh^2(y)}{2x}\right) dy,
\]

which yields (2.5)-(2.6) with µ = −3/4.


4.4 Alternative representation of the SABR-density and Yor's formula

We now provide an alternative representation for the quantity

\[
\mathbb{P}\left(\int_0^t e^{2\nu Z^{(-\nu/2)}_s}\,ds \in d\tilde{r}\right)
\]

appearing in (2.5) in terms of the modified Bessel function of the second kind K (see [1, Section 9.6]). This representation is derived from the results obtained by Yor in [173] (see also [130]), and we use here the formulation given in [83, Section 4.7].

Proposition 2.18. For all t > 0 and y > 0,

\[
\varphi^{(-1/2)}_t(y) = \frac{y_0^2}{2\pi^{3/2}\nu^3}\,t^{-\frac{1}{2}}\,y^{-2}
\exp\left(\frac{\pi^2}{2\nu^2 t} - \frac{\nu^2 t}{8} - \frac{y_0^2}{2\nu^2 y}\right)
\int_0^\infty \sinh(u)\sqrt{\cosh(u)}\;e^{\frac{y_0^2}{4\nu^2 y}\cosh(u)^2}\,
K_{1/4}\left(\frac{y_0^2}{4\nu^2 y}\cosh(u)^2\right)
\exp\left(-\frac{u^2}{2\nu^2 t}\right)\sin\left(\frac{\pi u}{\nu^2 t}\right) du.
\]

Proof. An explicit formula for the joint density $\mu_t$ of the random variables

\[
Y^{(-1/2)}_t = \int_0^t Y_s^2\,ds
\qquad\text{and}\qquad
Y_t = y_0\exp\left(\nu Z^{(-\nu/2)}_t\right)
\]

can be found in [83, Formula (4.84)]. It follows from this formula that

\[
\mu_t(y, z) = \frac{y_0^{3/2}\exp\left(-\frac{\nu^2 t}{8} + \frac{\pi^2}{2\nu^2 t} - \frac{y_0^2+z^2}{2\nu^2 y}\right)}{\sqrt{2zt}\,\pi^{3/2}\nu^3 y^2}
\int_0^\infty \exp\left(-\frac{u^2}{2\nu^2 t} - \frac{y_0 z}{\nu^2 y}\cosh(u)\right)\sinh(u)\sin\left(\frac{\pi u}{\nu^2 t}\right) du.
\]

Integrating out the variable z, we get the following expression for the density $\varphi^{(-1/2)}_t$ of $Y^{(-1/2)}_t$:

\[
\varphi^{(-1/2)}_t(y) = \frac{y_0^{3/2}\exp\left(-\frac{\nu^2 t}{8} + \frac{\pi^2}{2\nu^2 t} - \frac{y_0^2}{2\nu^2 y}\right)}{\sqrt{2t}\,\pi^{3/2}\nu^3 y^2}
\int_0^\infty \exp\left(-\frac{u^2}{2\nu^2 t}\right)\sin\left(\frac{\pi u}{\nu^2 t}\right)\sinh(u)
\left[\int_0^\infty \exp\left(-\frac{z^2}{2\nu^2 y} - \frac{y_0 z}{\nu^2 y}\cosh(u)\right)\frac{dz}{\sqrt{z}}\right] du. \tag{2.31}
\]

We call (2.31) ‘Yor's formula’, and we now simplify the last integral, which we denote J. Clearly,

\[
J = 2\int_0^\infty \exp\left(-\frac{1}{2\nu^2 y}x^4 - \frac{y_0\cosh(u)}{\nu^2 y}x^2\right) dx.
\]

Applying the formula [80, 3.323 (3)]

\[
\int_0^\infty \exp\left(-\beta^2x^4 - 2\gamma^2x^2\right) dx
= 2^{-3/2}\,\frac{\gamma}{\beta}\exp\left(\frac{\gamma^4}{2\beta^2}\right) K_{1/4}\left(\frac{\gamma^4}{2\beta^2}\right),
\]

which holds for all β and γ with $|\arg\beta| < \frac{\pi}{4}$ and $|\arg\gamma| < \frac{\pi}{4}$, with $\beta = 1/(\nu\sqrt{2y})$ and $\gamma = \sqrt{y_0\cosh(u)}/(\nu\sqrt{2y})$, we obtain

\[
J = \sqrt{\frac{y_0\cosh(u)}{2}}\exp\left(\frac{y_0^2}{4\nu^2 y}\cosh(u)^2\right)
K_{1/4}\left(\frac{y_0^2}{4\nu^2 y}\cosh(u)^2\right),
\]

and the proposition follows from this and (2.31).

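It is also possible to represent the density $\varphi^{(-1/2)}_t$ of $Y^{(-1/2)}_t$ by an integral formula using the Kummer function of the second kind U [1, Section 13, Equation 13.1.3]:

Lemma 2.19. For all t > 0 and y > 0,

\[
\varphi^{(-1/2)}_t(y) = \frac{y_0^{5/2}}{2^{5/4}\pi\nu^{7/2}}\,t^{-1/2}\,y^{-9/4}
\exp\left(\frac{\pi^2}{2\nu^2 t} - \frac{\nu^2 t}{8} - \frac{y_0^2}{2\nu^2 y}\right)
\int_0^\infty U\left(\frac{3}{4}, \frac{3}{2}, \frac{y_0^2}{2\nu^2 y}\cosh(u)^2\right)
\sinh(u)\cosh(u)\exp\left(-\frac{u^2}{2\nu^2 t}\right)\sin\left(\frac{\pi u}{\nu^2 t}\right) du.
\]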

Proof. Recall that [1, Equation 13.6.21]

\[
U(a, 2a, x) = \frac{1}{\sqrt{\pi}}\exp\left(\frac{x}{2}\right) x^{\frac{1}{2}-a}\,K_{a-\frac{1}{2}}\left(\frac{x}{2}\right),
\]

for all a > 0 and x > 0. Therefore,

\[
K_{1/4}\left(\frac{y_0^2}{4\nu^2 y}\cosh(u)^2\right)
= \sqrt{\pi}\exp\left(-\frac{y_0^2}{4\nu^2 y}\cosh(u)^2\right)
\left[\frac{y_0^2}{2\nu^2 y}\cosh(u)^2\right]^{1/4}
U\left(\frac{3}{4}, \frac{3}{2}, \frac{y_0^2}{2\nu^2 y}\cosh(u)^2\right),
\]

and the lemma follows from Proposition 2.18.

Remark 2.20. Dufresne's formula (2.29) and Yor's formula (2.31) yield different integral representations for the density $\varphi^{(-1/2)}_t$. It is not obvious, however, how to show directly that they are the same, and we refer the reader to some hints in [130, Section 4.3].


3 Mass at zero for the correlated SABR model

The present section deals with the behavior of the mass at zero in the correlated (ρ ≠ 0) SABR model, and in the associated model for the Brownian motion in the SABR plane. We shall consider the following natural perturbation of the SABR model:

\[
\begin{aligned}
dX_t &= Y_t X_t^{\beta}\,dW_t + \frac{\beta}{2}\,Y_t^2 X_t^{2\beta-1}\,dt, & X_0 &= x_0 > 0,\\
dY_t &= \nu Y_t\,dZ_t, & Y_0 &= y_0 > 0,\\
d\langle Z, W\rangle_t &= \rho\,dt,
\end{aligned} \tag{3.1}
\]

with ν > 0, ρ ∈ (−1, 1), β ∈ [0, 1). Note that the additional drift only appears as a higher-order term in the perturbation expansion for the kernel derived in the original paper by Hagan et al. [90].

Note that the behavior of the drift in the first stochastic differential equation and its implications for the mass at zero of the modified model are significantly different in the cases 0 < β < 1/2 and 1/2 < β < 1: in the former case the drift explodes when the process X approaches zero, while it vanishes in the latter case. In the case β = 1/2, the drift does not depend on the process X. The model in (3.1) is a modification of the SABR model which characterises a Brownian motion in a suitably chosen Riemannian manifold with boundary (the SABR plane [90, Subsection 3.2]), see Lemma 3.3 below.

Of particular interest are two special cases of (3.1): the uncorrelated case ρ = 0 and the β = 0 case. Computing the mass at zero for (3.1) in the uncorrelated case sheds light on the influence of the drift, by a comparison with the results obtained in Section 2. In the case β = 0, the drift in (3.1) vanishes, and the model coincides with the original SABR model (1.1) with β = 0. In fact, according to [17] and [90], in the prevalence of low interest rates, the choice β = 0 is essential. Note that up to a deterministic time change, we can assume ν = 1, which we shall tacitly do without loss of generality from now on if not stated otherwise.


1 SABR geometry and geometry preserving mappings

We first exhibit a set of mappings allowing one to translate the properties of one SABR model to another. Let $\mathbb{H} = \{x + Iy : x \in \mathbb{R},\ y > 0\}$ and $\mathbb{H}_+ := \{x + Iy : (x, y) \in (0,\infty)^2\}$. $(\mathbb{H}, h)$ will denote the classical Poincaré plane with its associated Riemannian metric [81, Section 3.9], and $(S, g)$ the general SABR plane (generated by (3.1)). As mentioned above, of particular interest are the two cases of the uncorrelated SABR plane, denoted by $(U, u)$, and of $(S^0, g^0)$, the general SABR plane with β = 0. Note that only U and S exhibit a drift. We also denote by $S^0_+ := S^0 \cap \{x + Iy : (x, y) \in (0,\infty)^2\}$. The following tensors h, g, $g^0$ and u generate Riemannian metrics on their respective spaces (with $\bar\rho := \sqrt{1-\rho^2}$):

\[
h(x, y) = \frac{dx^2}{y^2} + \frac{dy^2}{y^2}, \qquad (x, y) \in \mathbb{R}\times(0,\infty), \tag{3.2}
\]

\[
g(x, y) = \frac{dx^2}{\bar\rho^2 y^2 x^{2\beta}} - \frac{2\rho\,dx\,dy}{\bar\rho^2 y^2 x^{\beta}} + \frac{dy^2}{\bar\rho^2 y^2}, \qquad (x, y) \in (0,\infty)^2, \tag{3.3}
\]

\[
g^0(x, y) = \frac{dx^2}{\bar\rho^2 y^2} - \frac{2\rho\,dx\,dy}{\bar\rho^2 y^2} + \frac{dy^2}{\bar\rho^2 y^2}, \qquad (x, y) \in \mathbb{R}\times(0,\infty), \tag{3.4}
\]

\[
u(x, y) = \frac{dx^2}{y^2 x^{2\beta}} + \frac{dy^2}{y^2}, \qquad (x, y) \in (0,\infty)^2, \tag{3.5}
\]

and the following diagram summarises the different relations between the mappings and the spaces (we also include the corresponding coordinate notations):

[Commutative diagram relating the spaces S, S^0, U and H (each with coordinates (x, y)) through the maps φ_0^0, φ_0, ϕ^0 and χ defined in (3.6).]

Regarding the mapping notations, subscripts are related to the correlation parameter (φ0 ‘annihilates’ ρ), whereas superscripts 0 indicate that the map ‘annuls’ the parameter β; the map χ reintroduces this parameter.


The mappings between these spaces are defined as follows:

\[
\begin{aligned}
\phi_0^0 &: S \to \mathbb{H}, & (x, y) &\mapsto \left(\frac{x^{1-\beta}}{\bar\rho(1-\beta)} - \frac{\rho y}{\bar\rho},\ y\right),\\
\varphi^0 &: S \to S^0, & (x, y) &\mapsto \left(\frac{x^{1-\beta}}{1-\beta},\ y\right),\\
\phi_0 &: S \to U, & (x, y) &\mapsto \left((1-\beta)^{\frac{1}{1-\beta}}\left(\frac{x^{1-\beta}}{\bar\rho(1-\beta)} - \frac{\rho y}{\bar\rho}\right)^{\frac{1}{1-\beta}},\ y\right), \quad \rho \le 0,\\
\phi_0 &: S^0_+ \to \mathbb{H}, & (x, y) &\mapsto \left(\frac{x - \rho y}{\bar\rho},\ y\right),\\
\chi &: S^0_+ \to S, & (x, y) &\mapsto \left((1-\beta)^{\frac{1}{1-\beta}}\,x^{\frac{1}{1-\beta}},\ y\right),\\
\chi &: \mathbb{H}_+ \to U, & (x, y) &\mapsto \left((1-\beta)^{\frac{1}{1-\beta}}\,x^{\frac{1}{1-\beta}},\ y\right),\\
\varphi^0 &: U \to \mathbb{H}_+, & (x, y) &\mapsto \left(\frac{x^{1-\beta}}{1-\beta},\ y\right).
\end{aligned} \tag{3.6}
\]

From now on, if not indicated otherwise, we restrict the domains of the above maps to the first quadrant $(0,\infty)^2$, which, when considering compositions, imposes restrictions on the parameters in order to ensure that images also belong to this set (for example the restriction ρ ∈ (−1, 0] needs to be imposed for the composition $\phi_0^0 \circ \chi$). While the map $\phi_0$ can be extended to the whole upper half-plane $\mathbb{R}\times(0,\infty)$, thus describing an asset with negative value, the maps $\varphi^0$, $\phi_0^0$ and χ are in general not meaningful there. They can be extended to x = 0 though, and are non-differentiable there. The following theorem gathers the properties of all these maps:

Theorem 3.1. The diagram is commutative and all the mappings in (3.6) are local isometries on their respective spaces:

• the maps $\varphi^0$ and χ (resp. their counterparts between U and $\mathbb{H}_+$) on $(0,\infty)^2$ are onto and inverse to one another;

• the compositions $\phi_0 \circ \varphi^0$ and $\varphi^0 \circ \phi_0$ coincide with $\phi_0^0$;

• it holds that $\chi \circ \phi_0^0 = \phi_0$ and $\phi_0^0 \circ \chi = \phi_0$, and the latter is well defined for ρ ∈ (−1, 0];

• the map $\varphi^0$ on S (resp. on U) transforms Brownian motion on (S, g) (resp. (U, u)) into the SABR model (1.1) with β = 0 (resp. ρ = β = 0), which in turn is transformed back to Brownian motion on its original space by the corresponding map χ;

• the map $\phi_0$ (resp. the extension of $\phi_0$) transforms Brownian motion on (S, g) (resp. $(S^0, g^0)$) into its uncorrelated version on (U, u) (resp. $(\mathbb{H}, h)$).
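The first items of Theorem 3.1 are elementary to verify numerically. The sketch below implements three of the maps from (3.6) together with the map χ and checks that $\varphi^0$ and χ are inverse to one another and that composing $\varphi^0$ with the map from $S^0$ to the Poincaré plane reproduces $\phi_0^0$. The Python function names are hypothetical, and the code assumes that the constant appearing in the denominators of (3.6) is $\bar\rho = \sqrt{1-\rho^2}$.

```python
import math

def rho_bar(rho):
    """Assumed normalising constant sqrt(1 - rho^2)."""
    return math.sqrt(1.0 - rho * rho)

def varphi_sup0(x, y, beta):
    """Map from S to S^0: removes the parameter beta."""
    return (x ** (1.0 - beta) / (1.0 - beta), y)

def chi(x, y, beta):
    """Map from S^0_+ to S: reintroduces the parameter beta."""
    return (((1.0 - beta) * x) ** (1.0 / (1.0 - beta)), y)

def phi_sub0_on_s0(x, y, rho):
    """Map from S^0 to the Poincare plane: removes the correlation rho."""
    return ((x - rho * y) / rho_bar(rho), y)

def phi_sub0_sup0(x, y, beta, rho):
    """Map from S to the Poincare plane: removes both beta and rho."""
    rb = rho_bar(rho)
    return (x ** (1.0 - beta) / (rb * (1.0 - beta)) - rho * y / rb, y)

# Illustrative point and parameters
x, y, beta, rho = 0.2, 0.1, 0.2, -0.3
round_trip = chi(*varphi_sup0(x, y, beta), beta)
composed = phi_sub0_on_s0(*varphi_sup0(x, y, beta), rho)
direct = phi_sub0_sup0(x, y, beta, rho)
```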

Remark 3.2. The map φ00 was first considered in [90], where it was observed

there that it is a local isometry mapping Brownian motion on (S, g) to a Brownianmotion on the hyperbolic half-plane (H, h).

Proof. The first three items follows from simple computation; the remaining state-ments follow from Lemmas (3.3),(3.5), (3.6) and (3.7) below.
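The "simple computation" behind the first items of Theorem 3.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it assumes $\bar\rho = \sqrt{1-\rho^2}$, uses hypothetical function names for the maps in (3.6), and evaluates them at one sample point in the first quadrant with $\rho \le 0$.

```python
import math

def rho_bar(rho):
    # Assumption: rho_bar denotes sqrt(1 - rho^2).
    return math.sqrt(1.0 - rho * rho)

def varphi0(x, y, beta):
    # (x, y) -> (x^(1-beta)/(1-beta), y); used for S -> S^0 and for U -> H_+.
    return x ** (1.0 - beta) / (1.0 - beta), y

def chi(x, y, beta):
    # (x, y) -> ((1-beta)^(1/(1-beta)) x^(1/(1-beta)), y).
    return ((1.0 - beta) * x) ** (1.0 / (1.0 - beta)), y

def phi_S0_to_H(x, y, rho):
    # (x, y) -> ((x - rho*y)/rho_bar, y).
    return (x - rho * y) / rho_bar(rho), y

def phi_S_to_U(x, y, beta, rho):
    # The map S -> U in (3.6); the base is positive for rho <= 0 on (0, inf)^2.
    base = (x ** (1.0 - beta) / (1.0 - beta) - rho * y) / rho_bar(rho)
    return ((1.0 - beta) * base) ** (1.0 / (1.0 - beta)), y

x, y, beta, rho = 0.8, 0.3, 0.4, -0.5

# chi and varphi0 are inverse to one another on the first quadrant.
round_trip = chi(*varphi0(x, y, beta), beta)

# Commutativity: both routes from S to H give the same point.
route_via_S0 = phi_S0_to_H(*varphi0(x, y, beta), rho)
route_via_U = varphi0(*phi_S_to_U(x, y, beta, rho), beta)
```

Both routes should agree with the direct map $(x,y) \mapsto \big(x^{1-\beta}/(\bar\rho(1-\beta)) - \rho y/\bar\rho,\, y\big)$ up to floating-point error; a single sample point is of course only a sanity check, not a proof.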

Lemma 3.3. The process $(X,Y)$ with dynamics (3.1) coincides in law with Brownian motion on the manifold $(S,g)$. We define the process $(\widetilde X,\widetilde Y)$ pathwise by applying to $(X,Y)$ the space transformation $\varphi_0 : S \to S^0$, i.e. by setting

$(\widetilde X_t, \widetilde Y_t) := \left(\dfrac{X_t^{1-\beta}}{1-\beta},\, Y_t\right)$, for all $t \ge 0$. (3.7)

Then $(\widetilde X,\widetilde Y)$ is a SABR process with $\beta = 0$. Furthermore, the process (3.7) coincides in law with Brownian motion on the manifold $(S^0,g^0)$, to which we refer as the correlated hyperbolic plane.

Proof. The statement that (3.1) has the same law as Brownian motion on $(S,g)$ is verified by computing the infinitesimal generator of (3.1), which coincides with the Laplace-Beltrami operator $\frac12\Delta_g$ on a manifold with metric tensor $g(x,y)$ (see (1.1) and (1.1) in Appendix B 1.1 for more detail). The second statement is straightforward from Itô's formula, which transforms the system (3.1) into

$d\widetilde X_t = \widetilde Y_t\,dW_t$, $\quad \widetilde X_0 = \widetilde x_0 := x_0^{1-\beta}/(1-\beta)$,
$d\widetilde Y_t = \nu \widetilde Y_t\,dZ_t$, $\quad \widetilde Y_0 = y_0$,
$d\langle W,Z\rangle_t = \rho\,dt$. (3.8)

It is easy to see that (3.8) has SABR dynamics (1.1) with parameters $\beta = 0$ and $\rho \in (-1,1)$. The generator of (3.8) coincides with the Laplace-Beltrami operator $\frac12\Delta_{g^0}$ of the respective manifold, which yields the last statement.

Remark 3.4. Since $\beta \in [0,1)$, we have

$P_\infty = \mathbb{P}(X_t = 0 \text{ for some } t \in (0,\infty)) = \mathbb{P}(\widetilde X_t = 0 \text{ for some } t \in (0,\infty))$. (3.9)

Note that the map $\varphi_0 : S \to S^0$ is applied in the proof of Theorem 3.10 below.


Lemma 3.5. The map $\phi_0 : S^0 \to H$ is a global isometry and transforms the SABR model (1.1) with $\beta = 0$ into a Brownian motion on $(H,h)$. Furthermore, the heat (or transition) kernel of the solution of the system (3.8) is available in closed form:

$\dfrac{1}{\bar\rho}\,K^h_{\phi_0(x,y)}\big(s, \phi_0(x,y)\big)$, for $s > 0$, $(x,y) \in S^0$,

where $K^h_{(x,y)}(s,\cdot)$ denotes the hyperbolic heat kernel at $(x,y) \in H$, for which a closed-form expression as well as short- and large-time asymptotics are known (see [81, Equation (9.35)] and [90, Appendix]).

Proof. The following shows that $\phi_0$ is in fact a global isometry: $\phi_0$ is onto and invertible on $S^0$ and, for any $(x,y) \in S^0$, its Jacobian

$\nabla\phi_0(x,y) = \begin{pmatrix} 1/\bar\rho & -\rho/\bar\rho \\ 0 & 1 \end{pmatrix}$

is independent of $x$ and does not explode at $x = 0$. Furthermore, for any $(x,y) \in S^0$,

$(\phi_0^* h)(x,y) = \phi_0^*\left(\dfrac{dx^2 + dy^2}{y^2}\right) = \dfrac{1}{y^2}\left(\dfrac{dx}{\bar\rho} - \dfrac{\rho\,dy}{\bar\rho}\right)^2 + \dfrac{(dy)^2}{y^2} = g^0(x,y)$.

The last statement follows from Lemma II.3.5 together with $\det(\nabla\phi_0(\cdot)) = 1/\bar\rho \ne 0$. One can easily verify by Itô's lemma that the dynamics (3.8) for general $\rho \in (-1,1)$ are transformed into (3.8) for $\rho = 0$ under the map $\phi_0$.
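The pullback computation in the proof of Lemma 3.5 amounts to the matrix identity $J^\top h\, J = g^0$, with $J$ the Jacobian of $\phi_0$. Below is a small sketch of that check in plain Python; it assumes $\bar\rho = \sqrt{1-\rho^2}$ and takes $g^0 = (dx^2 - 2\rho\,dx\,dy + dy^2)/(\bar\rho^2y^2)$, which is what expanding the square in the proof gives. The function names are illustrative only.

```python
import math

def pullback_of_hyperbolic_metric(y, rho):
    # J[k][i] = d(phi_0^k)/d(x^i) for phi_0(x, y) = ((x - rho*y)/rho_bar, y).
    rb = math.sqrt(1.0 - rho * rho)  # assumption: rho_bar = sqrt(1 - rho^2)
    J = [[1.0 / rb, -rho / rb], [0.0, 1.0]]
    # Hyperbolic metric h = (dx^2 + dy^2)/y^2; phi_0 leaves y unchanged.
    h = [[1.0 / y**2, 0.0], [0.0, 1.0 / y**2]]
    # (phi_0^* h)_{ij} = sum_{k,l} J[k][i] * h[k][l] * J[l][j]
    return [[sum(J[k][i] * h[k][l] * J[l][j] for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

def metric_g0(y, rho):
    # g^0 = (dx^2 - 2 rho dx dy + dy^2) / (rho_bar^2 y^2), as a symmetric matrix.
    c = 1.0 / ((1.0 - rho * rho) * y**2)
    return [[c, -rho * c], [-rho * c, c]]

lhs = pullback_of_hyperbolic_metric(0.3, -0.5)
rhs = metric_g0(0.3, -0.5)
```

Since the Jacobian does not depend on $x$, the comparison only involves $y$ and $\rho$.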

Lemma 3.6. The map $\chi : S^0_+ \to S$ (resp. $\chi : H_+ \to U$) is a local isometry between $(S^0_+,g^0)$ and $(S,g)$ (resp. $(H_+,h)$ and $(U,u)$) and transforms the Brownian motion on the hyperbolic plane $(S^0_+,g^0)$ (resp. $(H_+,h)$), whose dynamics are described by (3.8), into a Brownian motion on the general SABR plane $(S,g)$ (resp. $(U,u)$), satisfying (3.1).

Proof. For a local isometry between $(S^0_+,g^0)$ and $(S,g)$ (resp. $(H,h)$ and $(U,u)$), it holds that for any $(x,y) \in S^0$ (resp. $(x,y) \in H$) there exists a small open neighbourhood $U_{(x,y)} \subset S^0$ (resp. $U_{(x,y)} \subset H$) such that the restriction $\chi|_{U_{(x,y)}}$ is an isometry onto its image. In particular, it satisfies the pullback relation

$(\chi^* g)(x,y) = \chi^*\left(\dfrac{dx^2}{\bar\rho^2x^{2\beta}y^2} - \dfrac{2\rho\,dx\,dy}{\bar\rho^2x^{\beta}y^2} + \dfrac{dy^2}{\bar\rho^2y^2}\right) = \dfrac{dx^2 - 2\rho\,dx\,dy + dy^2}{\bar\rho^2y^2} = g^0(x,y)$,

respectively, for zero correlation,

$(\chi^* u)(x,y) = \chi^*\left(\dfrac{dx^2}{x^{2\beta}y^2} + \dfrac{dy^2}{y^2}\right) = \dfrac{dx^2 + dy^2}{y^2} = h(x,y)$.


For any $(x,y) \in S^0$ (resp. $(x,y) \in H$), the Jacobian of $\chi : S^0_+ \to S$ (resp. $\chi : H_+ \to U$) reads

$\nabla\chi(x,y) = \begin{pmatrix} (1-\beta)^{\frac{\beta}{1-\beta}}\, x^{\frac{\beta}{1-\beta}} & 0 \\ 0 & 1 \end{pmatrix}$

in both cases, hence the local pullback property is clearly satisfied by both maps. The last statement follows by Itô's lemma.

We now verify that $\phi_0 : S \to U$ is a 'geometry-preserving' map from the general SABR plane $(S,g)$ into the uncorrelated SABR plane $(U,u)$, which of course reduces to the identity map when $\rho = 0$, and to $\phi_0 : S^0_+ \to H$ when $\beta = 0$.

Lemma 3.7. For any $\rho \in (-1,0]$ and any $(x,y) \in S$, the space transformation $\phi_0 : S \longrightarrow U$ in (3.6) is a local isometry between $(S,g)$ and $(U,u)$.

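Proof. Write $(\tilde x,\tilde y) := \phi_0(x,y)$. The statement follows directly from the fact that the map $\phi_0$ and its partial derivatives

$\partial_x\tilde x(x,y) = \dfrac{x^{-\beta}}{\bar\rho}\,(1-\beta)^{\frac{\beta}{1-\beta}}\left(\dfrac{x^{1-\beta}}{\bar\rho(1-\beta)} - \dfrac{\rho y}{\bar\rho}\right)^{\beta/(1-\beta)}$,

$\partial_y\tilde x(x,y) = -\dfrac{\rho}{\bar\rho}\,(1-\beta)^{\frac{\beta}{1-\beta}}\left(\dfrac{x^{1-\beta}}{\bar\rho(1-\beta)} - \dfrac{\rho y}{\bar\rho}\right)^{\beta/(1-\beta)}$,

$\partial_x\tilde y(x,y) = 0$, $\quad \partial_y\tilde y(x,y) = 1$, (3.10)

satisfy the following system of differential equations implied by the local pullback property $(\phi_0^*u)(x,y) = g(x,y)$, for any $(x,y) \in S$, $(\tilde x,\tilde y) \in U$, for the Riemannian metrics $g$ and $u$:

$\dfrac{(\partial_x\tilde x)^2}{\tilde x^{2\beta}\tilde y^2} + \dfrac{(\partial_x\tilde y)^2}{\tilde y^2} = \dfrac{1}{\bar\rho^2y^2x^{2\beta}}$,

$\dfrac{2(\partial_x\tilde x\,\partial_y\tilde x)}{\tilde x^{2\beta}\tilde y^2} + \dfrac{2(\partial_x\tilde y\,\partial_y\tilde y)}{\tilde y^2} = \dfrac{-2\rho}{\bar\rho^2y^2x^{\beta}}$,

$\dfrac{(\partial_y\tilde x)^2}{\tilde x^{2\beta}\tilde y^2} + \dfrac{(\partial_y\tilde y)^2}{\tilde y^2} = \dfrac{1}{\bar\rho^2y^2}$.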
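The three pullback equations in the proof of Lemma 3.7 can be checked numerically without using the closed-form derivatives (3.10), by differentiating the map with central finite differences. The sketch below is illustrative only; it assumes $\bar\rho = \sqrt{1-\rho^2}$ and the form of the map $S \to U$ given in (3.6), and the helper names are hypothetical.

```python
import math

def phi_S_to_U(x, y, beta, rho):
    # Map S -> U from (3.6); assumption: rho_bar = sqrt(1 - rho^2).
    rb = math.sqrt(1.0 - rho * rho)
    base = (x ** (1.0 - beta) / (1.0 - beta) - rho * y) / rb
    return ((1.0 - beta) * base) ** (1.0 / (1.0 - beta)), y

def pullback_coefficients(x, y, beta, rho, eps=1e-6):
    """Left-hand sides of the three pullback equations, via finite differences."""
    xt, yt = phi_S_to_U(x, y, beta, rho)
    dx_xt = (phi_S_to_U(x + eps, y, beta, rho)[0]
             - phi_S_to_U(x - eps, y, beta, rho)[0]) / (2 * eps)
    dy_xt = (phi_S_to_U(x, y + eps, beta, rho)[0]
             - phi_S_to_U(x, y - eps, beta, rho)[0]) / (2 * eps)
    dx_yt, dy_yt = 0.0, 1.0  # the second coordinate is left unchanged
    w = 1.0 / (xt ** (2 * beta) * yt**2)
    first = dx_xt**2 * w + dx_yt**2 / yt**2
    second = 2 * dx_xt * dy_xt * w + 2 * dx_yt * dy_yt / yt**2
    third = dy_xt**2 * w + dy_yt**2 / yt**2
    return first, second, third

def metric_g_coefficients(x, y, beta, rho):
    """Right-hand sides: coefficients of dx^2, dx dy and dy^2 in g."""
    rb2 = 1.0 - rho * rho
    return (1.0 / (rb2 * y**2 * x ** (2 * beta)),
            -2.0 * rho / (rb2 * y**2 * x**beta),
            1.0 / (rb2 * y**2))

lhs = pullback_coefficients(0.8, 0.3, 0.4, -0.5)
rhs = metric_g_coefficients(0.8, 0.3, 0.4, -0.5)
```

Agreement at a few sample points with $\rho \le 0$ is consistent with the lemma; the step size `eps` is an arbitrary choice.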

As an application of Lemma 3.7 it may be possible to relate the absolutely continuous part of the distribution of Brownian motion on the uncorrelated SABR plane $(U,u)$ and that of Brownian motion on the general SABR plane $(S,g)$ via the relation (II.3.4) of the heat kernels [159]; this can be performed following similar steps as in [90], but care is needed, as discussed below.


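Lemma 3.8. Let $K^g$ and $K^u$ denote the fundamental solutions (with respect to Lebesgue measure) of the heat equations corresponding to the metrics $g$ and $u$. Then, for any $z = (x,y) \in S$,

$K^g_Z(s,z) = \dfrac{(1-\beta)^{\frac{\beta}{1-\beta}}}{\bar\rho\,x^{\beta}}\left(\dfrac{x^{1-\beta}}{\bar\rho(1-\beta)} - \dfrac{\rho y}{\bar\rho}\right)^{\frac{\beta}{1-\beta}} K^u_{\phi_0(Z)}\big(s,\phi_0(z)\big)$. (3.11)

When $\beta = 1/2$, the formulae simplify to

$\phi_0(x,y) \equiv \left(\dfrac{1}{\bar\rho^2}\left(x - \sqrt{x}\,\rho y + \dfrac{\rho^2y^2}{4}\right),\, y\right)$,

and $\det(\nabla\phi_0(x,y)) = \left(1 - \dfrac{\rho y}{2\sqrt{x}}\right)\Big/\bar\rho^2$, for all $(x,y) \in S$.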

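Proof. The statement follows from Lemma II.3.5: the Radon-Nikodym derivatives are $\dfrac{dz}{d\mu_g(z)} = \bar\rho\,y^2x^{\beta}$ and $\dfrac{dz}{d\mu_u(z)} = y^2x^{\beta}$, with $\mu_g$ and $\mu_u$ the Riemannian volume elements on $S$ and $U$ (Definition II.3.2 in Appendix 3), and the Jacobian of $\phi_0$ at $z = (x,y) \in S$ is as in (3.10), so that

$\det\big(\nabla\phi_0(x,y)\big) = \dfrac{(1-\beta)^{\frac{\beta}{1-\beta}}}{\bar\rho\,x^{\beta}}\left(\dfrac{x^{1-\beta}}{\bar\rho(1-\beta)} - \dfrac{\rho y}{\bar\rho}\right)^{\frac{\beta}{1-\beta}}$.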
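The simplification at $\beta = 1/2$ stated in Lemma 3.8 can be compared against the general expressions numerically. The following sketch assumes $\bar\rho = \sqrt{1-\rho^2}$, so that the simplified formulae carry the factor $1/\bar\rho^2 = 1/(1-\rho^2)$; function names are hypothetical.

```python
import math

def phi_general(x, y, beta, rho):
    # First coordinate of the map S -> U in (3.6).
    rb = math.sqrt(1.0 - rho * rho)  # assumption: rho_bar = sqrt(1 - rho^2)
    base = (x ** (1.0 - beta) / (1.0 - beta) - rho * y) / rb
    return ((1.0 - beta) * base) ** (1.0 / (1.0 - beta))

def det_general(x, y, beta, rho):
    # Jacobian determinant as in the proof of Lemma 3.8.
    rb = math.sqrt(1.0 - rho * rho)
    base = (x ** (1.0 - beta) / (1.0 - beta) - rho * y) / rb
    e = beta / (1.0 - beta)
    return (1.0 - beta) ** e / (rb * x**beta) * base**e

def phi_half(x, y, rho):
    # Simplified first coordinate for beta = 1/2.
    return (x - math.sqrt(x) * rho * y + rho**2 * y**2 / 4.0) / (1.0 - rho * rho)

def det_half(x, y, rho):
    # Simplified determinant for beta = 1/2.
    return (1.0 - rho * y / (2.0 * math.sqrt(x))) / (1.0 - rho * rho)

x, y, rho = 0.8, 0.3, -0.5
pair_phi = (phi_general(x, y, 0.5, rho), phi_half(x, y, rho))
pair_det = (det_general(x, y, 0.5, rho), det_half(x, y, rho))
```

The two entries of each pair should coincide up to rounding.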

Remark 3.9. Such a relation of heat kernels relies on the property II.3.1 of Laplace-Beltrami operators, which is not meaningful for (1.1) at $x = 0$ for general $\beta$.

A statement relating the heat kernels might not hold true in the vicinity of the origin. Although in the case of exploding Jacobians the relation (II.3.4) of 'kernels' formally indicates that the map under consideration induces an atom, it does not allow for an exact computation.

Note further that non-differentiability of the maps at $x = 0$ may induce a local time at this point, which we do not investigate further for the maps $\varphi_0$ and $\chi$ (in both their correlated and uncorrelated versions), as we imposed Dirichlet boundary conditions at $x = 0$. Such issues might, however, be of importance for the map $\phi_0$ introduced in (3.6) above and for the map $\phi_0^0$ considered in [90].

A statement similar to Lemma 3.8 above was made in [90], relating $K^g$ to the hyperbolic heat kernel $K^h$; in their analysis, the determinant was

$\det(\nabla\phi_0^0(x,y)) \equiv x^{-\beta}/\bar\rho$.


2 Application: Large-time behavior of the mass

We now compute the large-time limit of the mass at zero in the modified SABR model (3.1), with correlation:

$P_\infty := \lim_{t\uparrow\infty}\mathbb{P}(X_t = 0)$. (3.12)

The computation of the mass (Theorem 3.10 below) follows the work of Hobson [98] on time changes. We apply such a technique to progress from the Brownian motion on the correlated hyperbolic plane (3.8) to a correlated Brownian motion on the Euclidean plane. The joint distribution of hitting times of zero of two (correlated) Brownian motions without drift was first established by Iyengar [103], and refined by Metzler [132] (see also [28] for further results on hitting times of correlated Brownian motions).

We also borrow some ideas from Chapter II, where Hobson's construction for the normal SABR model [98, Example 5.2] is extended to (3.1) for general $\beta \in [0,1]$. This indeed follows from the observation that stochastic time change methods, going back to Volkonskii [171], can still be applied to the Brownian motion on the SABR plane. In order to formulate our next statement, we introduce several auxiliary parameters (see [132]):

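$a_1 := \dfrac{x_0^{1-\beta}}{1-\beta}$, $\quad a_2 := \dfrac{y_0}{\nu}$, $\quad r_0 := \sqrt{\dfrac{a_1^2 + a_2^2 - 2\rho a_1a_2}{\bar\rho^2}}$,

$\alpha := \begin{cases} \pi + \arctan(-\bar\rho/\rho), & \text{if } \rho > 0,\\ \dfrac{\pi}{2}, & \text{if } \rho = 0,\\ \arctan(-\bar\rho/\rho), & \text{if } \rho < 0, \end{cases}$

$\theta_0 := \begin{cases} \pi + \arctan\left(\dfrac{a_2\bar\rho}{a_1 - \rho a_2}\right), & \text{if } a_1 < \rho a_2,\\ \dfrac{\pi}{2}, & \text{if } a_1 = \rho a_2,\\ \arctan\left(\dfrac{a_2\bar\rho}{a_1 - \rho a_2}\right), & \text{if } a_1 > \rho a_2. \end{cases}$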

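Theorem 3.10. For the modified SABR model (3.1), the limit (3.12) of the mass at zero satisfies $P_\infty = \int_0^\infty dt\int_0^t f(s,t)\,ds$, where for any $s < t$,

$f(s,t) = \dfrac{\pi\sin(\alpha)}{2\alpha^2(t-s)\sqrt{s\,(t - s\cos^2(\alpha))}}\exp\left(-\dfrac{r_0^2}{2s}\,\dfrac{t - s\cos(2\alpha)}{2t - s(1+\cos(2\alpha))}\right) \times \sum_{n=1}^{\infty} n\sin\left(\dfrac{n\pi(\alpha-\theta_0)}{\alpha}\right) I_{\frac{n\pi}{2\alpha}}\left(\dfrac{r_0^2}{2s}\,\dfrac{t-s}{2t - s(1+\cos(2\alpha))}\right)$,

where $I_z$ denotes the modified Bessel function of the first kind of order $z$ (see [29, Page 638]).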


Remark 3.11. Note that when $\beta = 0$, the model (3.1) exactly corresponds to the original SABR model (1.1) with $\beta = 0$. In Theorem 3.10 above, $a_1$ is then equal to the starting point $x_0$.

Proof. Recalling the process $\widetilde X$ in (3.7) and the SDE (3.8), we wish to apply [98, Theorem 3.1] to (3.8). Consider the system of SDEs

$d\widehat X_t = d\widehat W_t$, $\quad \widehat X_0 = \widetilde x_0$,
$d\widehat Y_t = \nu\,d\widehat Z_t$, $\quad \widehat Y_0 = y_0$,
$d\langle\widehat W,\widehat Z\rangle_t = \rho\,dt$, (3.13)

where $(\widehat W,\widehat Z)$ is a two-dimensional standard Brownian motion. With the time-change process

$\tau(t) := \inf\left\{u \ge 0 : \int_0^u \widehat Y_s^{-2}\,ds \ge t\right\}$, (3.14)

Theorem 3.1 in [98] implies that

$\widetilde X_t = \widehat X_{\tau(t)}$ and $\widetilde Y_t = \widehat Y_{\tau(t)}$, (3.15)

for all $t \ge 0$. In addition, the transformation (3.7) gives, for all $t \ge 0$,

$X_t = \left(x_0^{1-\beta} + (1-\beta)\widehat W_{\tau(t)}\right)^{1/(1-\beta)}$.

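Let now $\varepsilon$ denote the explosion time of (3.13), namely the first time that either $\widehat X$ or $\widehat Y$ hits zero. It is also the first time that the process $\widehat W$ hits the level $-\widetilde x_0$ or that $\widehat Z$ hits $-y_0/\nu$. Set

$\Gamma_t := \int_0^t \widehat Y_s^{-2}\,ds$ and $\zeta := \lim_{t\uparrow\varepsilon}\Gamma_t$.

The process $\Gamma$ is strictly increasing and continuous, so that its inverse $\Gamma^{-1}$ is well defined, and clearly the time-change process (3.14) satisfies $\tau = \Gamma^{-1}$. Consider a new filtration $\mathcal{G}$ and two processes $W$ and $Z$ defined, for each $t \ge 0$, by $\mathcal{G}_t := \mathcal{F}_{\tau(t)}$,

$W_t := \int_0^{\tau(t)}\dfrac{d\widehat W_s}{\widehat Y_s}$ and $Z_t := \int_0^{\tau(t)}\dfrac{d\widehat Z_s}{\widehat Y_s}$.

Up to time $\zeta$, $W$ and $Z$ are $\mathcal{G}$-adapted Brownian motions, and the system $(W,Z,\widetilde X,\widetilde Y)$ is a weak solution to (3.8). It is therefore clear that

$\mathbb{P}\left(\tau_0^{\widehat X} \in ds,\ \tau_0^{\widehat Y} \in dt\right) = \mathbb{P}\left(\tau_{-\widetilde x_0}^{\widehat W} \in ds,\ \tau_{-y_0/\nu}^{\widehat Z} \in dt\right)$.

Moreover, it follows from [132, Equation 3.2] with $\vec\mu = \vec 0$, $\vec x_0 = (\widetilde x_0, y_0)$, and

$\sigma = \begin{pmatrix}\bar\rho & \rho\\ 0 & \nu\end{pmatrix}$,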


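that $\mathbb{P}\left(\tau_0^{\widehat X} \in ds,\ \tau_0^{\widehat Y} \in dt\right) = f(s,t)\,ds\,dt$, where the function $f$ is defined in Theorem 3.10, so that

$\mathbb{P}\left(\tau_0^{\widehat X} < \tau_0^{\widehat Y}\right) = \int_0^\infty dt\int_0^t f(s,t)\,ds$. (3.16)

Reversing the arguments presented in Chapter II (see also [52, 98]), the probability $\mathbb{P}(\tau_0^{\widehat X} < \tau_0^{\widehat Y})$ coincides with the probability that the process $\widetilde X$ hits zero over the time horizon $[0,\infty)$. Indeed, through (3.15), the time change (3.14) converts the Brownian motion $\widehat Y$ into a geometric Brownian motion $\widetilde Y$ started at $y_0 > 0$, so that the (a.s. finite) point $\tau_0^{\widehat Y}$ is mapped to $\tau_0^{\widetilde Y} = \infty$. Therefore the process $\widehat X$ considered over $[0,\tau_0^{\widehat Y})$ corresponds to the time-changed process $\widetilde X$ considered over $[0,\infty)$ and, using (3.15), we obtain

$\mathbb{P}\left(\tau_0^{\widehat X} < \tau_0^{\widehat Y}\right) = \mathbb{P}\left(\tau_0^{\widetilde X} < \tau_0^{\widetilde Y}\right) = \mathbb{P}\left(\tau_0^{\widetilde X} < \infty\right) = \mathbb{P}\left(\widetilde X_t = 0 \text{ for some } t \in (0,\infty)\right)$,

and Theorem 3.10 follows from (3.9) and (3.16).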

Remark 3.12. For the normal SABR model ($\beta = 0$) in (1.1), Hobson [98, Example 5.2] found the following explicit formula for the price process $X$:

$X_t = \dfrac{\rho}{\nu}\left(Y_{\tau(t)} - y_0\right) + \rho^2Z_{\tau(t)}$, for all $t \ge 0$,

where the process $Y$ and the Brownian motion $Z$ are the same as in (3.13), and $\tau$ is defined in (3.14).

Remark 3.13. For $\beta = 1$, the SDEs of the SABR and the modified SABR models read

$dX_t = X_t\left(Y_t\,dW_t\right)$, $X_0 = x_0$, and $dX_t = X_t\left(Y_t\,dW_t + \dfrac12Y_t^2\,dt\right)$, $X_0 = x_0$,

respectively, and, by the Doléans-Dade formula [151, Section IX-2], the solutions to these equations are exponential functionals, and therefore do not exhibit mass at the origin.

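Remark 3.14. In the uncorrelated case $\rho = 0$, the expressions in Theorem 3.10 simplify to $\alpha = \dfrac{\pi}{2}$, $\theta_0 = \arctan\left(\dfrac{a_2}{a_1}\right)$, $r_0 = \sqrt{a_1^2 + a_2^2}$, and

$f(s,t) = \dfrac{2}{\pi(t-s)\sqrt{st}}\exp\left(-\dfrac{r_0^2(t+s)}{4st}\right)\sum_{n=1}^{\infty} n\sin\left(2n\left(\dfrac{\pi}{2} - \theta_0\right)\right) I_n\left(\dfrac{r_0^2(t-s)}{4st}\right)$.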


4 Implied volatility and small-strike expansions

In this section, we show how the results above on the mass at zero in the SABR model can be used to infer information about the corresponding implied volatility smile. We recall that the implied volatility is simply the Black-Scholes volatility parameter that matches observed (or computed) European option prices. It obviously depends on strikes and maturities, and we refer the reader to [72] for more details. As noted in Remark 3.9, one possible explanation for the inaccuracies of the 'classical' implied volatility asymptotic expansion [90] in the vicinity of zero lies in the breakdown of the commutativity of Laplace-Beltrami operators (cf. (II.3.1) and (II.3.4)), which is used in their proof, when passing from the hyperbolic heat kernel to the heat kernel on the general SABR plane. In the case where $\beta = \rho = 0$, the infinitesimal generator corresponding to the SDE (1.1) is uniformly elliptic and the heat kernel is known. We plot below (Figure 4) the implied volatility expansion derived in [140], a slightly refined version of the one in [90], and highlight the fact that it can yield arbitrage. As explained in [76], the density of the log stock price $\log(X)$ (or log forward rate) can be expressed directly in terms of the implied volatility (see [76, Proof of Lemma 2.2]), and negative densities obviously yield arbitrage opportunities.

Figure III.3: Density (right) of the log process $\log(X)$ obtained from the implied volatility expansion [140] (left) with $(\nu,\beta,\rho,x_0,y_0,T) = (0, 1, 0.6, 0.05, 0.5, 1.2)$. The mass at zero, computed using (2.2), is equal to 4.5%.

This anomaly can in principle be fixed if one accounts for the accumulation of mass at zero due to the Dirichlet boundary condition. Let us recall a few (model-independent) results regarding small-strike asymptotics of the implied volatility. For any strike $K > 0$ and maturity $T > 0$, let us denote by $I_T(K)$ the implied volatility. In the presence of strictly positive mass at zero, the small-strike tail of the implied volatility satisfies [119]:

$\limsup_{K\downarrow 0}\dfrac{I_T(K)}{\sqrt{|\log K|}} = \sqrt{\dfrac{2}{T}}$. (4.1)


This behavior was recently refined by De Marco et al. and Gulisashvili independently [46, 85]. Assuming that there exists $\varepsilon > 0$ such that $\mathbb{P}(X_T \le K) - \mathbb{P}(X_T = 0) = O(K^{\varepsilon})$ as $K$ tends to zero, De Marco et al. [46] derive the following small-strike asymptotic formula:

$I_T(K) = \sqrt{\dfrac{2|\log K|}{T}} + \dfrac{N^{-1}(m_T)}{\sqrt{T}} + \dfrac{(N^{-1}(m_T))^2}{2\sqrt{2T|\log K|}} + \Phi(K)$, (4.2)

where $m_T := \mathbb{P}(X_T = 0)$ is the mass at the origin, $N$ the Gaussian cumulative distribution function, and $\Phi : (-\infty,0) \to \mathbb{R}$ a function satisfying

$\limsup_{K\downarrow 0}\sqrt{2T|\log K|}\,|\Phi(K)| \le 1$.
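The explicit part of (4.2) is straightforward to evaluate. The sketch below drops the remainder $\Phi(K)$ and uses the standard-library `statistics.NormalDist` for $N^{-1}$; the function name and the `n_terms` switch are illustrative choices, not notation from the text.

```python
import math
from statistics import NormalDist

def small_strike_implied_vol(K, T, mass, n_terms=3):
    """Leading terms of (4.2), without the remainder Phi(K).

    K: strike in (0, 1); T: maturity; mass: P(X_T = 0), strictly in (0, 1).
    n_terms selects how many of the three explicit terms are kept.
    """
    log_k = abs(math.log(K))
    q = NormalDist().inv_cdf(mass)          # N^{-1}(m_T)
    vol = math.sqrt(2.0 * log_k / T)        # leading term, cf. (4.1)
    if n_terms >= 2:
        vol += q / math.sqrt(T)
    if n_terms >= 3:
        vol += q * q / (2.0 * math.sqrt(2.0 * T * log_k))
    return vol

# Rescaled smile k -> I_T(e^k) sqrt(T/|k|) at a very small strike.
K, T, m = 1e-300, 10.0, 0.283
rescaled = small_strike_implied_vol(K, T, m) * math.sqrt(T / abs(math.log(K)))
```

As $K \downarrow 0$ the rescaled quantity approaches $\sqrt2$, consistent with (4.1); the convergence is slow because the correction terms decay only like $|\log K|^{-1/2}$.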

Gulisashvili [85] obtained an alternative formulation (removing the assumption on the decay of the probability of $X_T$ near zero); however, since we only wish here to highlight the inaccuracy of Hagan's (or Obłój's) expansion in low-strike regimes, we omit a precise formulation of his result and refer the interested reader to this paper for full details. In Figure 4 below, we visually quantify how 'wrong' Hagan's expansion is for small strikes in the presence of a mass at the origin. We plot the function $k \in \mathbb{R} \mapsto I_T(e^k)\sqrt{T/|k|}$ which, from (4.2), has to be bounded by $\sqrt2$ in order to avoid arbitrage, and compare it to the first and second order of (4.2), using (2.11) to compute the (large-time) mass at zero. We consider two parameter sets, one for which the large-time mass is small, and the other which yields a large mass at the origin. As the mass becomes small, Hagan's (or Obłój's) approximation becomes more accurate. This holds in particular as the parameter $\beta$ gets close to one, as indicated in Section 3.1 above. In the limit $\beta = 1$, the mass vanishes.

Figure III.4: The black line marks the level $\sqrt2$. The parameters are $(\nu,\beta,\rho,x_0,y_0,T) = (0.3, 0, 0, 0.35, 0.05, 10)$ for the left plot, and $(\nu,\beta,\rho,x_0,y_0,T) = (0.6, 0.6, 0, 0.08, 0.015, 10)$ for the right graph. Obłój's implied volatility expansion clearly violates this upper bound in both cases. The large-time mass is equal to 28.3% for the left plot and 3.1% for the right one.


Comparison with Antonov-Konikov-Spector formulae

In the uncorrelated case $\rho = 0$, Antonov, Konikov and Spector [16] derived a double integral formula for the price of a Call option in the SABR model. This expression reads as follows:

$\mathbb{E}(X_T - K)^+ = (X_0 - K)^+ + \dfrac{2\sqrt{X_0K}}{\pi}\left\{\int_{s_-}^{s_+}\dfrac{\sin(\eta\varphi(s))}{\sinh(s)}\,G(\nu^2T,s)\,ds + \sin(\eta\pi)\int_{s_+}^{\infty}\dfrac{\exp(-\eta\psi(s))}{\sinh(s)}\,G(\nu^2T,s)\,ds\right\}$,

where $\eta := 1/|2(\beta-1)|$, $q := K^{1-\beta}/(1-\beta)$, $q_0 := X_0^{1-\beta}/(1-\beta)$, $s_\pm := \operatorname{arcsinh}\left(\nu|q \pm q_0|/y_0\right)$,

$\varphi(s) := 2\arctan\sqrt{\dfrac{\sinh(s)^2 - \sinh(s_-)^2}{\sinh(s_+)^2 - \sinh(s)^2}}$, and $\psi(s) := 2\,\operatorname{arctanh}\sqrt{\dfrac{\sinh(s)^2 - \sinh(s_+)^2}{\sinh(s)^2 - \sinh(s_-)^2}}$.

The function $G$ is defined as the following integral:

$G(t,s) := \dfrac{2\exp(-t/8)}{t^{3/2}\sqrt{\pi}}\int_s^{\infty} u\sqrt{\cosh(u) - \cosh(s)}\,\exp\left(-\dfrac{u^2}{2t}\right)du$.

In Figure 4 below, we compare the implied volatility smile obtained from the formula of Antonov et al. (computing the double integral and numerically inverting the Black-Scholes formula) with the closed-form tail formula (4.2), using the large-maturity mass at zero computed from (2.11). Following Antonov et al., we consider a maturity of $T = 20$ years and the following set of parameters:

$\nu = 0.8$, $\beta = 0.1$, $\rho = 0$, $x_0 = 0.1$, $y_0 = 0.15$, $T = 20$,

which implies that the large-maturity mass at zero (2.11) is roughly equal to 63%.


5 Proofs of Section 2

Proof of Proposition 2.6

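We adapt here Gerhold's proof in [78], which is based on an inverse Laplace transform approach, to our case. From [29, Page 645], the Laplace transform of the function $m_y$ has a closed-form representation, namely, whenever $\mu > -3/2$ and $z > 0$,

$m_y(\mu,z) = \mathcal{L}_u^{-1}\left(\dfrac{\Gamma(\mu + \frac12 + \sqrt u)}{\Gamma(1 + 2\sqrt u)}\,M_{-\mu,\sqrt u}(2z)\right)$,

where the Whittaker function $M_{n,m}$ is related to the Kummer function $M(\cdot,\cdot,\cdot)$ via the identity

$M_{n,m}(x) \equiv x^{m+1/2}e^{-x/2}M\left(m - n + \dfrac12,\ 2m+1,\ x\right)$.

Therefore, we can write, for some $R \in \mathbb{R}$,

$m_y(\mu,z) = \dfrac{e^{-z}}{2\pi\mathrm{i}}\int_{R-\mathrm{i}\infty}^{R+\mathrm{i}\infty} e^{uy}\,\dfrac{\Gamma(\mu + \frac12 + \sqrt u)}{\Gamma(1 + 2\sqrt u)}\,(2z)^{\frac12+\sqrt u}\,M\left(\mu + \dfrac12 + \sqrt u,\ 1 + 2\sqrt u,\ 2z\right)du$. (5.1)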

Since we wish to determine the behavior of $m_y$ as $y$ (equivalently, $t$) tends to zero, we need to understand the limit of the integrand as $u$ tends to infinity. The following asymptotic relations hold uniformly in $v$, as $v = \sqrt u$ tends to infinity:

$\Gamma(1+v) = \sqrt{2\pi}\,e^{-v}v^{v+1/2}\left[1 + O(v^{-1})\right]$ and $M\left(\mu + \dfrac12 + v,\ 1+2v,\ 2z\right) \sim e^{z}$. (5.2)

The first one is standard [135, Section 3.5]. As for the second one, the representation (2.7) yields

$M\left(\dfrac12 + v + \mu,\ 1+2v,\ 2z\right) = \sum_{k=0}^{\infty}\gamma_k\dfrac{(2z)^k}{k!}$,

where

$\gamma_k := \dfrac{(\mu + v + \frac12)\cdots(\mu + v + k - \frac12)}{(1+2v)(2+2v)\cdots(k+2v)}$,

for $k \ge 0$. Now, clearly $|\gamma_k| \le 2^{-k}$ and $\gamma_k \sim 2^{-k}$ as $v$ tends to infinity. Note furthermore that from [1, Formula 13.6.3], we have

$M\left(\dfrac12 + v + \mu,\ 1+2v,\ 2z\right) \le M\left(\dfrac12 + v,\ 1+2v,\ 2z\right) = \Gamma(1+v)\,e^{z}\left(\dfrac12z\right)^{-v}I_v(z)$,

where again $I_v$ denotes the modified Bessel function of the first kind [29, Page 638], so that, using (5.2) and [78, Equation 9], we have, uniformly in $v$,

$\left|M\left(\dfrac12 + v + \mu,\ 1+2v,\ 2z\right)\right| \le \Gamma(1+v)\,e^{z}\left(\dfrac z2\right)^{-v}I_v(z) = e^{z}\left(1 + O(v^{-1})\right)$.
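The second relation in (5.2) can be illustrated numerically by summing the series $\sum_k \gamma_k (2z)^k/k!$ directly for a large value of $v$. The sketch below is a plain-Python illustration with a hypothetical helper name and an arbitrary truncation level; it is not meant as a robust evaluation of the Kummer function.

```python
import math

def kummer_series(mu, v, z, terms=80):
    """Truncated series for M(mu + 1/2 + v, 1 + 2v, 2z) = sum_k gamma_k (2z)^k / k!."""
    a = mu + v + 0.5
    b = 1.0 + 2.0 * v
    x = 2.0 * z
    total, term = 0.0, 1.0          # term holds gamma_k * x^k / k!, starting at k = 0
    for k in range(terms):
        total += term
        # gamma_{k+1} = gamma_k * (a + k) / (b + k); the factor x / (k + 1) updates x^k / k!
        term *= (a + k) / (b + k) * x / (k + 1)
    return total

approx_large_v = kummer_series(mu=0.3, v=1.0e6, z=1.2)
limit_value = math.exp(1.2)
```

For $v = 10^6$ the series should agree with $e^{z}$ to several digits, while for moderate $v$ the two differ visibly, which matches the $O(v^{-1})$ correction.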


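Therefore the integrand in (5.1) reads, as $u$ tends to infinity,

$\Phi(u,y,z) \equiv e^{uy}\,\dfrac{\Gamma(\mu + \frac12 + \sqrt u)}{\Gamma(1 + 2\sqrt u)}\,(2z)^{\frac12+\sqrt u}\,M\left(\mu + \dfrac12 + \sqrt u,\ 1 + 2\sqrt u,\ 2z\right)$
$\sim e^{v^2y + v + z}\,z^{v+\frac12}\,v^{\mu - v - \frac12}\,2^{-v}$
$= \exp\left[v^2y + \alpha v + \left(\mu - \dfrac12 - v\right)\log(v) + z + \dfrac12\log(z)\right] =: \exp\left(\psi_y(u) + z + \dfrac12\log(z)\right)$,

where $\alpha := 1 + \log(z) - \log(2) \in \mathbb{R}$, and where the function $\psi_y$ is defined by

$\psi_y(u) \equiv uy - \dfrac12\sqrt u\log(u) + \alpha\sqrt u + \dfrac12\left(\mu - \dfrac12\right)\log(u)$. (5.3)

For $y > 0$ small enough, the saddlepoint equation $\partial_u\psi_y(u) = 0$, or

$2\mu - 1 + 4uy + 2(\alpha - 1)\sqrt u - \sqrt u\log(u) = 0$

(namely (2.8)), admits a solution $u_y > 0$. This saddlepoint equation can be rewritten as

$y = \dfrac{\log(u_y)}{4\sqrt{u_y}} + \dfrac{1-\alpha}{2\sqrt{u_y}} - \dfrac{\mu - 1/2}{2u_y}$. (5.4)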
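The saddlepoint equation (5.4) has no closed-form solution, but its largest root is easy to locate numerically. The sketch below uses bisection on a logarithmic scale and assumes that the right-hand side of (5.4) is decreasing on the chosen bracket; this holds, for instance, for $\alpha = 1$ and $\mu = 1/2$ on $[e^2,\infty)$, but the bracket is a hypothetical default that should be adapted to other parameters.

```python
import math

def saddle_rhs(u, alpha, mu):
    # Right-hand side of (5.4) as a function of u.
    s = math.sqrt(u)
    return math.log(u) / (4.0 * s) + (1.0 - alpha) / (2.0 * s) - (mu - 0.5) / (2.0 * u)

def largest_saddlepoint(y, alpha, mu, u_lo=math.e**2, u_hi=1e30, iterations=200):
    """Solve saddle_rhs(u) = y on [u_lo, u_hi], assuming the function decreases there."""
    if not (saddle_rhs(u_lo, alpha, mu) > y > saddle_rhs(u_hi, alpha, mu)):
        raise ValueError("bracket does not contain a root for this y")
    for _ in range(iterations):
        mid = math.sqrt(u_lo * u_hi)        # geometric midpoint: u spans many decades
        if saddle_rhs(mid, alpha, mu) > y:
            u_lo = mid
        else:
            u_hi = mid
    return 0.5 * (u_lo + u_hi)

y = 0.01
u_y = largest_saddlepoint(y, alpha=1.0, mu=0.5)
leading_order = math.log(y) ** 2 / (4.0 * y * y)   # leading term of the bootstrapped expansion
```

For $y = 0.01$ the numerical root and the leading-order term $\log(y)^2/(4y^2)$ have the same order of magnitude but differ by a factor of about $1.5$, a reminder that the logarithmic corrections decay slowly.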

Remark 5.1. Note that the saddlepoint equation also reads

$y = \dfrac{\log(u_0)}{2\sqrt{2u_0}} - \dfrac{\rho}{\sqrt{2u_0}} - \dfrac{4\mu - 2}{4u_0}$,

where $\rho := \log(z/\sqrt2)$ and $u_0 := 2u_y$, which is reminiscent of that of [78]. In fact, the saddlepoint equation above does not admit a unique solution; in order for the latter to be continuous (as a function of $y$), one should take the largest solution.

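Following [78], we can therefore deform the contour of integration in (5.1) around the saddlepoint $u_y$ to obtain, as $y$ tends to zero:

$m_y(\mu,z) = \dfrac{e^{-z}}{2\pi\mathrm{i}}\int_{R-\mathrm{i}\infty}^{R+\mathrm{i}\infty}\Phi(u,y,z)\,du \sim \dfrac{\sqrt z}{2\pi\mathrm{i}}\int_{R-\mathrm{i}\infty}^{R+\mathrm{i}\infty}e^{\psi_y(u)}\,du \sim \dfrac{\sqrt z}{2\pi\mathrm{i}}\int_{u_y-\mathrm{i}\infty}^{u_y+\mathrm{i}\infty}e^{\psi_y(u)}\,du$. (5.5)

Let $\lambda$ denote the real integration variable, so that $u = u_y + \mathrm{i}\lambda$. Around the saddlepoint ($\lambda = 0$), we have the uniform Taylor series expansions:

$\sqrt u = \sqrt{u_y} + \dfrac{\mathrm{i}\lambda}{2\sqrt{u_y}} + \dfrac{\lambda^2}{8u_y^{3/2}} + O\left(\dfrac{\lambda^3}{u_y^{5/2}}\right)$,

$\log u = \log u_y + \dfrac{\mathrm{i}\lambda}{u_y} + \dfrac{\lambda^2}{2u_y^2} + O\left(\dfrac{\lambda^3}{u_y^3}\right)$,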


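and

$\sqrt u\log u = \sqrt{u_y}\log u_y + \dfrac{(2 + \log(u_y))\,\mathrm{i}\lambda}{2\sqrt{u_y}} + \dfrac{\log(u_y)\,\lambda^2}{8u_y^{3/2}} + O\left(\dfrac{(1 + \log(u_y))\,\lambda^3}{u_y^{5/2}}\right)$, (5.6)

so that

$\psi_y(u) = u_yy + \sqrt{u_y}\left(\alpha - \dfrac{\log(u_y)}{2}\right) - \dfrac{\log(u_y)}{4} + \dfrac{\mu\log(u_y)}{2} - M_y\lambda^2 + O\left[\dfrac{\lambda^3(1 + \log(u_y))}{u_y^{5/2}}\right]$, (5.7)

where the coefficients in front of $\lambda$ cancel out thanks to the saddlepoint equation, and where

$M_y := \dfrac{\log(u_y)}{16u_y^{3/2}} - \dfrac{\alpha}{8u_y^{3/2}} + \dfrac{1 - 2\mu}{8u_y^2}$,

as defined in Proposition 2.6. Note that by bootstrapping (see Section 5 for details), the expansion

$u_y = \dfrac{\log(y)^2}{4y^2}\left[1 - \dfrac{2\log\log(1/y)}{\log(y)} + \dfrac{\log(z^2)}{\log(y)} + o\left(\dfrac{1}{\log(y)}\right)\right]$ (5.8)

holds for the saddlepoint as $y$ tends to zero, and implies

$M_y = \dfrac{y^3}{\log(y)^2}\left[1 + O\left(\dfrac{\log|\log(y)|}{\log(y)}\right)\right]$. (5.9)

Now,

$\int_{-h}^{h}e^{-M_y\lambda^2}\,d\lambda = \dfrac{1}{\sqrt{2M_y}}\int_{-h\sqrt{2M_y}}^{h\sqrt{2M_y}}e^{-\omega^2/2}\,d\omega \sim \dfrac{1}{\sqrt{2M_y}}\int_{\mathbb{R}}e^{-\omega^2/2}\,d\omega = \sqrt{\dfrac{\pi}{M_y}} \sim \dfrac{\sqrt\pi\,|\log(y)|}{y^{3/2}}$, (5.10)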


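and we can therefore write (5.5) as

$m_y(\mu,z) \sim \dfrac{\sqrt z}{2\pi\mathrm{i}}\int_{u_y-\mathrm{i}h}^{u_y+\mathrm{i}h}e^{\psi_y(u)}\,du$

$\sim \dfrac{\sqrt z}{2\pi}\exp\left[u_yy + \sqrt{u_y}\left(\alpha - \dfrac{\log(u_y)}{2}\right) - \dfrac{\log(u_y)}{4} + \dfrac{\mu\log(u_y)}{2}\right]\int_{-h}^{h}e^{-M_y\lambda^2}\,d\lambda$

$\sim \dfrac{\sqrt z}{2\pi}\exp\left[u_yy + \sqrt{u_y}\left(\alpha - \dfrac{\log(u_y)}{2}\right) - \dfrac{\log(u_y)}{4} + \dfrac{\mu\log(u_y)}{2}\right]\dfrac{\sqrt\pi\,|\log(y)|}{y^{3/2}}$

$= \dfrac{\sqrt z}{2\pi}\exp\left[\left(\dfrac12 - \mu\right) + \dfrac{\log(u_y)}{2}\left(\mu - \dfrac12\right) - u_yy + \sqrt{u_y}\right]\dfrac{\sqrt\pi\,|\log(y)|}{y^{3/2}}$

$= \dfrac{\sqrt z}{2\sqrt\pi}\exp\left(\dfrac12 - \mu\right)\dfrac{|\log(y)|}{y^{3/2}}\,u_y^{\frac12(\mu - \frac12)}\exp\left(-u_yy + \sqrt{u_y}\right)$

$= \dfrac{\sqrt z}{2\sqrt\pi}\,|\log(y)|\exp\left[-\dfrac{\log(y)^2}{4y} + \dfrac12\dfrac{|\log(y)|}{y} + \left(\dfrac12 - \mu\right)\left(1 - \dfrac12\log\left(\dfrac{\log(y)^2}{4y^2}\right)\right)\right]\left(\dfrac{1}{y^{3/2}} + O\left(y^{3/2}\right)\right)$, (5.11)

where we used the saddlepoint equation (5.4) in the fourth line.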

It now remains to prove that one can indeed neglect the tails of the integration domain, where $\Im(u) = \lambda \ge h$. The analysis is similar to that of [78, Section 3], and we only outline the main arguments here. First, specify a choice $h := \log(y)^2/y^{3/2}$ of integration bounds accounting for the main contribution to the integral $\int_{u_y-\mathrm{i}\infty}^{u_y+\mathrm{i}\infty}\exp(\psi_y(u))\,du$, with $\psi_y$ defined in (5.3) and where $u_y$ denotes the saddlepoint in (5.4). By symmetry, it is clearly sufficient to consider only one side of the tails, and we shall therefore focus on the positive one, $\int_{u_y+\mathrm{i}h}^{u_y+\mathrm{i}\infty}e^{\psi_y(u)}\,du$. The analysis is then split into the inner tail $h \le \lambda < e^{\log(1/t)^2/4}$ and the outer tail $\lambda \ge e^{\log(1/t)^2/4}$. Similarly to [78, Equation (10)], the estimate

$\int_{u_y+\mathrm{i}h}^{u_y+\mathrm{i}\infty}e^{\psi_y(u)}\,du \sim 2\exp\left\{u_yt + \dfrac18\log(y)^2 - \exp\left(\dfrac{\log(y)^2}{8}\right)\right\}$

then prevails for the outer tail. Furthermore, for any real number $B$, [78, Lemma 1] remains valid for the behavior of the real part of $\sqrt u\log(u) + B\sqrt u$ with respect to $|\Im(u)|$. This allows one to bound the inner tail from above by the value of the integrand at $\lambda = h$, where $-M_y\lambda^2|_{\lambda=h} \sim -\frac12\log(y)^2$, multiplied by the length of the integration path, which is of order $e^{\log(1/t)^2/4}$; the relative error is therefore of order $\exp\left(-\frac14\log(y)^2 + o(\log(y)^2)\right)$.

The final part of the error analysis in the expansion (5.11) follows from estimates analogous to [78, Table 1], and the total (both tails) error resulting from the


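completion to a Gaussian integral is

$\dfrac{2}{\sqrt{2M_y}}\int_{h\sqrt{2M_y}}^{\infty}\exp\left(-\dfrac12\omega^2\right)d\omega \sim \dfrac{2}{\sqrt{2M_y}}\left.\dfrac{\exp(-\frac12\omega^2)}{\omega}\right|_{\omega = h\sqrt{M_y}} = \exp\left(-\dfrac12\log(t)^2 + o(\log(t))\right)$. (5.12)

The error $O\left(\lambda^3/u_y^{5/2}\right)$ from the local expansion (5.7) for $\psi_y$ is of order

$O\left(\log(y)^2\sqrt y\right)$, (5.13)

which is immediate from the bootstrapping expansions (5.9) and (5.8) for $M_y$ and $u_y$ and from the choice of $h$:

$\dfrac{\lambda^3}{u_y^{5/2}} \le C\,\dfrac{\log(u_y)}{u_y^{3/2}}\,\dfrac{1}{u_y}\,\dfrac{\log(y)^2}{y^{3/2}} \sim C\log(y)^2\sqrt y$.

Hence the total relative error is dominated by the error (5.13) from the local expansion if $M_y$ is not expanded, and by the relative error (5.9) of $M_y$ if one considers its bootstrapping expansion.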

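Justification of the expansion for My

We define

$\widetilde M_y := \dfrac{\log(u_y)}{16u_y^{3/2}} - \dfrac{\alpha}{8u_y^{3/2}}$.

The term $\dfrac{1-2\mu}{8u_y^2}$ in the definition (2.9) of $M_y$ is of higher order; therefore we can henceforth work with the simpler expression $\widetilde M_y$ in the bootstrapping expansion and the error analysis instead of $M_y$. With $\alpha = \rho + 1 - \frac12\log(2)$ and $u \equiv u_y/2$, the approximation of the saddlepoint simplifies to

$\widetilde M_y = \dfrac{\log(u_y)}{16u_y^{3/2}} - \dfrac{\alpha}{8u_y^{3/2}} = \dfrac{\log(u_y)}{16u_y^{3/2}} - \dfrac{\rho + 1}{8u_y^{3/2}} + \dfrac{\log(2)}{16u_y^{3/2}} = \dfrac14\left(\dfrac{\sqrt2\log(u)}{16u^{3/2}} - \dfrac{\sqrt2(\rho + 1 - \log(2))}{8u^{3/2}}\right)$. (5.14)

Thus $\widetilde M_y$ is, up to constants, of the same form as [78, Equation (12)]. By bootstrapping,

$\widetilde M_y \sim \dfrac{y^3}{\log(y)^2}\left[1 + O\left(\dfrac{\log|\log(y)|}{\log(y)}\right)\right]$.

Indeed, the saddlepoint equation (2.8)

$y = \dfrac{\log(2u_y)}{2\sqrt{2(2u_y)}} - \dfrac{\rho}{\sqrt{2(2u_y)}} - \dfrac{4\mu - 2}{4(2u_y)}$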


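when setting $c(u) \equiv \left(\dfrac{\log(\sqrt u)}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt u}\right)$, $\rho = \log\left(\dfrac{z}{\sqrt2}\right)$ and $k := 4\mu - 2$, and with $u_0 = 2u_y$, becomes $\sqrt{u_0} \equiv y^{-1}c(u_0)$, where

$\dfrac{\log(\sqrt{u_0})}{\sqrt2} = \dfrac{\log(c(u_0)) + \log\left(\frac1y\right)}{\sqrt2}$.

Hence, bootstrapping as in [78] yields

$u_0 = \dfrac{1}{y^2}\left(\dfrac{\log(1/y)}{\sqrt2} + \dfrac{\log(c(u_0))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right)^2$

$= \dfrac{1}{y^2}\left(\left(\dfrac{\log(1/y)}{\sqrt2}\right)^2 + 2\left(\dfrac{\log(1/y)}{\sqrt2}\right)\left(\dfrac{\log(c(u_0))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right) + \left(\dfrac{\log(c(u_0))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right)^2\right)$

$= \dfrac{(\log(1/y))^2}{2y^2}\left[1 + \left(\dfrac{2\sqrt2}{\log(1/y)}\right)\left(\dfrac{\log(c(u_0))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right) + \dfrac{2}{(\log(1/y))^2}\left(\dfrac{\log(c(u_0))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right)^2\right]$.

Now expanding around $\log(1/y)$,

$\dfrac{\log(c(u_0))}{\sqrt2} \sim \dfrac{\log(\log(1/y))}{\sqrt2} - \dfrac{\log(2)}{2\sqrt2} + \dfrac{\log\left(c(u_0) - \rho + \frac{k}{2\sqrt{2u_0}}\right)}{\sqrt2\log(1/y)}$,

and using the fact that both

$-\dfrac{2\sqrt2}{\log(y)}\,\dfrac{k}{4\sqrt u} - \dfrac{\log\left(c(u) - \rho + \frac{k}{2\sqrt{2u_0}}\right)}{\sqrt2\log(y)}$ and $\dfrac{2}{\log(y)^2}\left(\dfrac{\log(c(u_0)) - \rho}{\sqrt2} + \dfrac{k}{4\sqrt{u_0}}\right)^2$

are of order $o(1/\log(y))$, we obtain, by collecting terms,

$2u_y = \dfrac{\log(y)^2}{2y^2}\left(1 - \dfrac{2\log(-\log(y))}{\log(y)} + \dfrac{2\rho + \log(2)}{\log(y)} + o\left(\dfrac{1}{\log(y)}\right)\right)$.

Similarly,

$u_0^{3/2} = \dfrac{1}{y^3}\left[-\dfrac{\log(y)}{\sqrt2} + \dfrac{\log(c(u))}{\sqrt2} - \dfrac{\rho}{\sqrt2} + \dfrac{k}{4\sqrt u}\right]^3 \sim -\dfrac{\log(y)^3}{2\sqrt2\,y^3}\left[1 - \dfrac{\log(-\log(y))}{\log(y)} + \dfrac{2\rho + \log(2)}{2\log(y)} + o\left(\dfrac{1}{\log(y)}\right)\right]^3$ (5.15)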


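hence $u_y^{3/2} \sim \dfrac{(\log(1/y))^3}{8y^3}$; further,

$\log(u_0) = -2\left(\log(y) - \log(c(u))\right) \sim -2\log(y) + 2\log(-\log(y)) - \log(2) - \dfrac{2\log\left(c(u) - \rho + \frac{k}{2\sqrt{2u}}\right)}{\log(y)}$,

so that, by bootstrapping, we also recover the form of [78, Equation (13)] at $u = \frac12u_y$:

$\widetilde M_y = \dfrac14\left[\dfrac{\sqrt2\log(u)}{16u^{3/2}} - \dfrac{\sqrt2(\rho + 1 - \log(2))}{8u^{3/2}}\right] = \dfrac{y^3}{\log(y)^2}\left[1 + O\left(\dfrac{\log(-\log(y))}{\log(y)}\right)\right]$.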


Chapter IV

Dirichlet Forms and Finite Element Methods for the SABR Model

1 Introduction

We propose a deterministic numerical method for pricing vanilla options under the SABR stochastic volatility model, based on a finite element discretization of the Kolmogorov pricing equations via non-symmetric Dirichlet forms. Our pricing method is valid both in moderate interest rate environments and in the currently prevalent low interest rate regimes, and it is consistently applicable under very mild assumptions on parameter configurations of the process, which are easily met in all practical scenarios. The parabolic Kolmogorov pricing equations for the SABR model are degenerate at the origin, yielding nonstandard partial differential equations, for which conventional pricing methods (designed for non-degenerate parabolic equations) break down.

We derive here the appropriate analytic setup to handle the degeneracy of the model at the origin. That is, we construct an evolution triple of suitably chosen Sobolev spaces with singular weights, consisting of the domain of the SABR-Dirichlet form, its dual space, and the pivotal Hilbert space. We show well-posedness of the variational formulation of the SABR pricing equations for vanilla and barrier options on this triple. Furthermore, we present a concrete finite element discretization scheme based on a (weighted) multiresolution wavelet approximation in space and a θ-scheme in time, and provide an error analysis for the finite element discretization.


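The SABR model of Hagan et al. [89, 90] is considered here in the form

$$\begin{aligned} dX_t &= Y_t X_t^{\beta}\,dW_t, & X_0 &= x_0 > 0,\\ dY_t &= \nu Y_t\,dZ_t, & Y_0 &= y_0 > 0,\\ d\langle Z, W\rangle_t &= \rho\,dt, & 0 &\le t \le T < \infty, \end{aligned} \tag{1.1}$$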

with the parameters $\nu > 0$, $\beta \in (0,1)$ and $\rho \in (-1,1)$, where $W$ and $Z$ are $\rho$-correlated Brownian motions on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$. The process $(X, Y)$ takes values in the state space $D = [0,\infty)\times(0,\infty)$ and describes the dynamics of a forward rate $X$ with stochastic volatility $Y$, started from the initial values $x_0 > 0$ and $y_0 > 0$. The first pricing formula for the SABR model, proposed in [89, 90], is based on an expansion of the Black-Scholes implied volatility for an asset driven by (1.1). For low-strike options, such as in low interest rate and high volatility environments, this formula can yield a nonsensical negative density function for the process $X$ in (1.1), which leads to arbitrage opportunities. As the problem of negative densities and arbitrage became more prevalent, it has been addressed by different approaches, for example in [15, 16, 17, 54, 88], some of which suggest modifications of the SABR model or of its implied volatility expansion. Suggesting suitable modifications to the original model is an intricate challenge, since the Hagan expansion is deeply embedded in the market and fits market prices closely in moderate interest rate environments. Any model that deviates from its prices in those regimes may be deemed uncompetitive. This makes pricing techniques desirable that are applicable to the original model in all market environments.

In the present chapter we propose a numerical pricing method for the (original) SABR model (1.1) under very mild assumptions on the parameters, which are satisfied in most practical scenarios. It is consistently applicable in all market environments and allows for the derivation of convergence rates for the numerical approximations of option prices.

The most popular numerical approximation methods considered so far for the SABR model (or closely related models) fall into the following classes. Probabilistic methods combine path simulation of the process with suitable (quasi-)Monte Carlo approximation; see [40] for the SABR model and [6, 5, 38, 43, 100] for related models. Furthermore, some difficulties of Euler methods in the context of SABR are discussed in [40]. Splitting methods, where the infinitesimal generator of the process is decomposed into suitable operators for which the pricing equations can be computed more efficiently, provide a powerful tool in terms of computational efficiency for sufficiently regular processes. Such methods are considered in [21] for a model closely related to SABR (see also [20]) and in [53] for a large class of models. However, the applicability of the corresponding convergence results to the SABR model itself is not fully resolved. Among fully deterministic PDE methods, the most notable are finite difference methods, which were considered in [11, 116] for the modification [88] of the SABR model (1.1), and finite element methods, which were described in the context of mathematical finance in [172]. In the recent textbook [96], finite element methods are applied to a large class of financial models, including the closely related process (1.3), and provide a robust and flexible framework to handle their stochastic finesses. Finite element approximation methods have not appeared so far in the literature on the SABR model. For a broad review of simulation schemes used for the SABR model (1.1) in specific parameter regimes, see also [122] and the references therein.

Standard theory provides convergence of the above methods if the considered model satisfies certain (method-specific) regularity conditions. In the case of the SABR model, however, obtaining convergence rates for these methods is non-standard: the degeneracy of the SABR Kolmogorov equation at the origin violates the assumptions needed in conventional finite difference methods and, for a range of parameters, also those of ad hoc (i.e. unweighted) finite element methods. Path simulation of the SABR process also requires nonstandard techniques, due to the degeneracy of the diffusion (1.1) at zero. Nonstandard techniques often become necessary for the numerical simulation of a stochastic differential equation when the drift $b$ and the diffusion coefficient $\sigma$ do not satisfy the global Lipschitz condition

$$|b(x) - b(y)| + |\sigma(x) - \sigma(y)| \le C\,|x - y|, \tag{1.2}$$

for $x, y \in \mathbb{R}^n$ and a constant $C > 0$ independent of $x$ and $y$, cf. [150]. The degeneracy of the SABR model (1.1) at $X = 0$ originates from the failure of condition (1.2) for the CEV process $X$, described for parameters $\alpha > 0$ and $\beta \in [0,1]$ by the equation

$$dX_t = \alpha X_t^{\beta}\,dW_t, \qquad X_0 = x_0 > 0, \quad 0 \le t \le T < \infty. \tag{1.3}$$
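As a small numerical illustration of why (1.2) fails for the CEV coefficient, the following Python sketch evaluates the difference quotient of $\sigma(x) = x^{\beta}$ between $0$ and points approaching $0$. The function names and the normalization $\alpha = 1$ are assumptions made for this illustration only. The quotient equals $x^{\beta - 1}$, so for $\beta < 1$ it grows without bound and no global constant $C$ can exist.

```python
def lipschitz_quotient(beta: float, x: float, y: float = 0.0) -> float:
    """Difference quotient |x^beta - y^beta| / |x - y| of sigma(x) = x^beta."""
    return abs(x ** beta - y ** beta) / abs(x - y)


def quotients_near_zero(beta: float, exponents=range(1, 9)):
    """Evaluate the quotient between 0 and x = 10^(-k) for increasing k."""
    return [lipschitz_quotient(beta, 10.0 ** (-k)) for k in exponents]


if __name__ == "__main__":
    # For beta < 1 the last quotient is large; for beta = 1 it stays equal to 1.
    for beta in (0.3, 0.5, 0.9, 1.0):
        print(beta, quotients_near_zero(beta)[-1])
```

For $\beta = 1$ the coefficient is linear and the quotient is constant, which is consistent with the degeneracy being tied to $\beta < 1$.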

Although the exact distribution of the CEV process is available [35], simulation of the full SABR model based on it can in many cases become involved and expensive. In fact, exact formulas decomposing the SABR distribution into a CEV part and a volatility part are only available in restricted parameter regimes; see [15, 64, 102] for the absolutely continuous part and [86] for the singular part of the distribution.

A simple space transformation (see (1.5) below) makes some numerical approximation results for the CIR model (perhaps the best-understood degenerate diffusion) applicable to certain parameter regimes of the SABR process. The CIR process

$$dS_t = (\delta - \gamma S_t)\,dt + a\sqrt{S_t}\,dW_t, \qquad S_0 = s_0 > 0, \quad 0 \le t \le T < \infty, \tag{1.4}$$

with $a > 0$ and $\delta \ge 0$ reduces, for the parameters $\gamma = 0$ and $a = 2$, to a squared Bessel process of dimension $\delta$ on the positive real line. The connection to CEV is then made via

$$\varphi : \mathbb{R}_{\ge 0} \longrightarrow \mathbb{R}_{\ge 0}, \qquad s \longmapsto \frac{1}{1 - \delta/2}\, s^{1 - \frac{\delta}{2}}, \qquad \text{where } \beta = \frac{1-\delta}{2-\delta}, \text{ for } \delta \ne 2, \tag{1.5}$$


that is, assuming absorbing boundary conditions at zero, the law of $S$ in (1.4) (for the parameters $\gamma = 0$, $a = 2$) under the space transformation $\varphi$ in (1.5) coincides with the law of $X$ in (1.3). Recent results in this direction, exploring probabilistic approximation methods for diffusions where the global Lipschitz continuity (1.2) is violated, can be found for example in [38, 43] and [100]; see also [9, 49] and the references therein. Establishing strong convergence rates in the case $2\delta < a^2$, where the boundary is accessible as in [100], is particularly difficult, since in this case the coefficients of the SDE (1.4) are neither globally nor locally Lipschitz continuous on the state space. Yet, these convergence results do not cover the parameter range of the SABR model. Further approximation schemes, which apply to CIR processes with accessible boundary, are presented in [6, 5], where both strong and weak convergence of the approximation are studied. The weak error analysis of Talay and Tubaro [166] yields second order convergence of the schemes proposed in [6, 5], which covers the parameters $0 < \beta < \frac{1}{2}$, but the results do not directly carry over to the case $\frac{1}{2} \le \beta < 1$.
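The correspondence between the squared Bessel process and the CEV process under (1.5) can be checked with a short Itô computation. The sketch below is a hedged illustration and not part of the cited results: it approximates $\varphi'$ and $\varphi''$ by central finite differences and verifies that the drift of $\varphi(S)$ vanishes and that its diffusion coefficient has the CEV form $\alpha\,\varphi(s)^{\beta}$. The function names, the step size `h`, and the constant $\alpha = 2\,(1-\delta/2)^{\beta}$ (obtained here from the same Itô computation) are assumptions of this example, which is meant for $\delta < 2$ only.

```python
import math


def beta_from_delta(delta: float) -> float:
    """Exponent beta = (1 - delta) / (2 - delta) from (1.5); requires delta != 2."""
    return (1.0 - delta) / (2.0 - delta)


def phi(s: float, delta: float) -> float:
    """Space transformation (1.5): s -> s^(1 - delta/2) / (1 - delta/2)."""
    p = 1.0 - delta / 2.0
    return s ** p / p


def transformed_coefficients(s: float, delta: float, h: float = 1e-3):
    """Drift and diffusion of phi(S) by Ito's formula, for dS = delta dt + 2 sqrt(S) dW.

    The derivatives of phi are approximated by central finite differences.
    """
    d1 = (phi(s + h, delta) - phi(s - h, delta)) / (2.0 * h)
    d2 = (phi(s + h, delta) - 2.0 * phi(s, delta) + phi(s - h, delta)) / h ** 2
    drift = delta * d1 + 0.5 * (2.0 * math.sqrt(s)) ** 2 * d2
    diffusion = 2.0 * math.sqrt(s) * d1
    return drift, diffusion


def cev_diffusion(x: float, delta: float) -> float:
    """CEV coefficient alpha * x^beta with alpha = 2 * p^beta, p = 1 - delta/2 (assumed)."""
    p = 1.0 - delta / 2.0
    beta = beta_from_delta(delta)
    return 2.0 * p ** beta * x ** beta
```

For instance, `transformed_coefficients(1.0, 0.5)` can be compared with `cev_diffusion(phi(1.0, 0.5), 0.5)`; the dimension $\delta = 0.5$ corresponds to $\beta = 1/3$.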

Here, we turn to a fully deterministic numerical method based on discretizations of the Kolmogorov partial differential equations using finite elements. We derive the appropriate analytic setup to handle the degeneracy of the model at the origin. That is, we construct a suitable evolution triple of Sobolev spaces with singular weights on which well-posedness of the variational formulation of the SABR pricing equations holds. The proposed method for space discretization is based on the Dirichlet form corresponding to the SABR stochastic differential equation. Specifically, using the Dirichlet form we recast the Kolmogorov pricing equations in weak (variational) form and show well-posedness of the latter. We use the weighted multiresolution (wavelet) Galerkin discretization of [27] in the state space to approximate variational solutions of the SABR Kolmogorov pricing equations for financial contracts (vanilla and barrier options). For the time discretization of the semigroup generated by the process (1.1) we propose a $\theta$-scheme. We derive approximation estimates tailored to our weighted setup, measuring the error between the true solution of the pricing equations and its projection to the discretization spaces. Based on these, we conclude error estimates akin to [143] for our fully discrete scheme. Under appropriate regularity assumptions on the payoff we obtain the full convergence rate for our finite element approximation. The advantage of the presented method is that it allows for consistent pricing under very mild parameter assumptions on the SABR process, and it is robustly applicable in moderate interest rate environments as well as in the current low interest rate regimes. Furthermore, the proposed discretization can be applied to compute prices of compound options or multi-period contracts without substantial modifications of the numerical methodology, cf. [150].

The chapter is organized as follows. Section 2 is devoted to the formulation of the SABR pricing problem in the appropriate analytic setting and to an outline of the general idea of the variational analysis underlying the proposed finite element method for the SABR model. In Section 1 we introduce some notation and review existing results which will be used for our finite element discretization. In Section 2 we introduce the SABR Dirichlet form and cast the variational formulation of the SABR pricing equation in a suitable setting. We then propose a Gelfand triplet of spaces for our finite element discretization, consisting of a space $\mathcal{V}$ of admissible functions (the domain of the SABR Dirichlet form), its dual space $\mathcal{V}^*$, and a pivotal Hilbert space $\mathcal{H}$ containing the domain of the Dirichlet form. In Section 3 we prove well-posedness of the SABR pricing problem, which yields the existence of a unique weak solution to the variational formulation of the Kolmogorov partial differential equations on these spaces. In Section 3 we present the finite element discretization of the weak solution of the equation examined in the previous sections. Section 1 is devoted to the space discretization, which is carried out through a spline wavelet discretization of the spaces $\mathcal{V}$ and $\mathcal{H}$. We review the multiresolution spline wavelet analysis of [96, 143] (in the unweighted case) to discretize the volatility dimension. The forward dimension (the CEV part) is more delicate, due to its degeneracy at zero. In this case we apply the weighted multiresolution norm equivalences proven in [27], which are suited to this degeneracy. We pass from the univariate case to the bivariate case by constructing tensor products of the discretized spaces in each dimension, as outlined in [96]. Finally, we specify the mass and stiffness matrices involved in the space discretization. In Section 2 we present the fully discrete scheme by applying a $\theta$-scheme in the time stepping. We follow [143] to conclude that the stability of the $\theta$-scheme continues to hold in the present setting of weighted Sobolev spaces. In Section 4 we derive error estimates for our finite element discretization. In Section 1, approximation estimates of the projection to our discretization spaces are established based on multiresolution (weighted) norm equivalences. We state our estimates under different regularity assumptions on the solution of our pricing equations. We use these estimates in Section 2 to derive convergence rates for our finite element discretization and conclude that, under some regularity assumptions on the payoff, examined in the previous section, we obtain the full convergence rate.

We remark here that the approximation and error estimates presented in Section 4 for the SABR model readily yield the corresponding approximation and error estimates for the CEV model as a direct corollary. The well-posedness of the variational formulation of the CEV pricing equations has been studied in [96, 150]; however, to the best of our knowledge, a presentation of the full error analysis thereof was not available in the corresponding literature prior to this article.

2 Preliminaries and problem formulation

Preliminaries and Notations: the process $X$ in (1.1) is a martingale, see [107, Remark 2], [98, Theorem 5.1] and [121, Theorem 3.1]. For two norms on a space $V$, the notation $\|\cdot\|_{V_1} \approx \|\cdot\|_{V_2}$ indicates that the norms are equivalent on $V$. Function spaces of bivariate functions will be denoted in calligraphic type ($\mathcal{V}, \mathcal{H}, \dots$) and spaces of univariate functions by ($V, H, \dots$) accordingly. For a domain $G \subset \mathbb{R}^2$ (resp. interval $I \subset \mathbb{R}$) we denote by $L^1_{loc}(G)$ (resp. $L^1_{loc}(I)$) the locally integrable functions on $G$ (resp. $I$), and by $C_0^\infty(G)$ (resp. $C_0^\infty(I)$) the smooth functions with compact support. Derivatives with respect to time will be denoted by $\dot{u}, \ddot{u}, \dots$ to ease notation.

1 General analytic setup

We first recall here the analytic setup of finite element methods. In a general parabolic setup (see for example [128, Section 2.3] or [143, Section 2]) one considers on the finite time interval $J = (0,T)$, $T > 0$, a parabolic evolution problem of the following form: find $u \in C^{1,2}(J,\mathbb{R}^2) \cap C^0(\bar{J},\mathbb{R}^2)$ which solves

$$\begin{aligned} \dot{u}(t,z) - Au(t,z) &= g(t,z), && t \in J,\ z \in \mathbb{R}^2,\\ u(0,z) &= u_0(z), && z \in \mathbb{R}^2. \end{aligned} \tag{2.1}$$

Here, the function $g$ is referred to as the right-hand side of (2.1) or the forcing term (cf. [96, p. 27]). In a financial context, evolution equations of the type (2.1) often arise as Kolmogorov equations, and the forcing term often appears as $g = -ru$, where $r \in C^0(\mathbb{R}^2;\mathbb{R}_{\ge 0})$ denotes a deterministic risk-free interest rate. In this case, the operator $A$ is the infinitesimal generator of a suitable price process $(Z_t)_{t\in[0,T]}$, and the solution $u \in C^{1,2}(J,\mathbb{R}^2)\cap C^0(\bar{J},\mathbb{R}^2)$ of (2.1) can be represented¹ as

$$u(t,z) = \mathbb{E}_t\!\left[e^{-\int_t^T r(Z_s)\,ds}\,u_0(Z_T^z)\right], \qquad t \in J,\ z \in \mathbb{R}^2, \tag{2.2}$$

under a (possibly non-unique) martingale measure; see [96, Theorem 8.1.3] for details.

For the discretization, the domain $\mathbb{R}^2$ in (2.1) will be localized to a bounded domain $G \subset \mathbb{R}^2$ with Lipschitz boundary, cf. [143, 150]. In what follows, all localization domains $G$ will be rectangular and denote the range of admissible values which can be taken by the price (and volatility) process, see (2.14) below. The operator $A$ is a linear second order operator and possibly degenerate (i.e. non-elliptic) at the boundary. The domain $D(A)$ of the operator $A$ is equipped with a norm $\|\cdot\|_{\mathcal{V}}$, and the completion of $D(A)$ under this norm will be denoted by $\mathcal{V}$. Furthermore, we denote by $\mathcal{H}$ a separable Hilbert space, the so-called pivot space, containing $\mathcal{V}$, such that $\mathcal{V} \hookrightarrow \mathcal{H}$ is a dense embedding. For degenerate diffusions, the pivot space $\mathcal{H}$ could be a weighted $L^2$ space over $G$ (such as the space $\mathcal{H}$ in (2.14) below), cf. [150]. The inner product $(\cdot,\cdot)_{\mathcal{H}}$ of $\mathcal{H}$ is extended to a duality pairing $(\cdot,\cdot)_{\mathcal{V}^*\times\mathcal{V}}$ on $\mathcal{V}^*\times\mathcal{V}$, where $\mathcal{V}^*$ denotes the dual space of $\mathcal{V}$, equipped with the corresponding dual norm $\|\cdot\|_{\mathcal{V}^*}$. Identifying the Hilbert space $\mathcal{H}$ with its dual $\mathcal{H}^*$, we obtain the so-called Gelfand triplet

$$\mathcal{V} \hookrightarrow \mathcal{H} \cong \mathcal{H}^* \hookrightarrow \mathcal{V}^*, \tag{2.3}$$

where $\hookrightarrow$ denotes a dense embedding. Applying the duality pairing $(\cdot,\cdot)_{\mathcal{V}^*\times\mathcal{V}}$ and noting that $A \in \mathcal{L}(\mathcal{V};\mathcal{V}^*)$, we can associate with the operator $A$ in (2.1) a bilinear form $a(\cdot,\cdot) : \mathcal{V}\times\mathcal{V} \to \mathbb{R}$ by setting

$$a(u,v) := -(Au, v)_{\mathcal{V}^*\times\mathcal{V}}, \qquad u, v \in \mathcal{V}. \tag{2.4}$$

¹ Where $\tilde{u}(t,z) := u(T-t,z)$, $t \in J$, $z \in \mathbb{R}^2$, solves the Kolmogorov backward equation associated with (2.1) and (2.2).

Note that we assume neither that the operator $A$ in (2.1) is self-adjoint nor that the bilinear form (2.4) is symmetric.

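Definition 2.1. The form $a(\cdot,\cdot)$ is called continuous on $\mathcal{V}$ if there exists a constant $0 < C_1 < \infty$ such that

$$\forall u, v \in \mathcal{V} : \quad |a(u,v)| \le C_1\,\|u\|_{\mathcal{V}}\,\|v\|_{\mathcal{V}}. \tag{2.5}$$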

Definition 2.2. The form $a(\cdot,\cdot)$ is called coercive on $\mathcal{V}$ if there exists a constant $C_2$ such that $0 < C_2 \le C_1$ and

$$\forall u \in \mathcal{V} : \quad |a(u,u)| \ge C_2\,\|u\|^2_{\mathcal{V}}. \tag{2.6}$$

Remark 2.3 (Equivalent norm on $\mathcal{V}$). The continuity and coercivity properties (cf. (2.5) and (2.6)) of the bilinear form $a(\cdot,\cdot)$ allow us to define for $u \in \mathcal{V}$ a so-called energy norm $\|\cdot\|_a$ by

$$\|u\|_a := a(u,u)^{1/2} \approx \|u\|_{\mathcal{V}}. \tag{2.7}$$

That is, the energy norm induced by the bilinear form is equivalent to the norm $\|\cdot\|_{\mathcal{V}}$ on $\mathcal{V}$.

Definition 2.4 (Gårding inequality). The form (2.4) is said to satisfy the Gårding inequality on $\mathcal{V}$ if there exist constants $C_2 > 0$ and $C_3 \ge 0$ such that

$$\forall u \in \mathcal{V} : \quad a(u,u) \ge C_2\,\|u\|^2_{\mathcal{V}} - C_3\,\|u\|^2_{\mathcal{H}}. \tag{2.8}$$

Remark 2.5 (Reduction to Gårding inequality). If the form (2.4) satisfies the (weaker) Gårding inequality (2.8), one can arrive at the coercivity property (2.6) by the substitution $v := e^{-C_3 t}u$. In case (2.8) is fulfilled for the operator $A$, equation (2.1) implies that $v$ satisfies (2.6), and the operator $A + C_3 I$ is coercive:

$$\dot{v}(t,z) + (A + C_3 I)\,v(t,z) = e^{-C_3 t}\,g(t,z) \quad \text{in } t \in J,\ z \in \mathbb{R}^2. \tag{2.9}$$
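A finite-dimensional analogue may help to see why the Gårding inequality suffices. In the sketch below (an illustration under simplifying assumptions, not the SABR form itself) we take $\mathcal{V} = \mathcal{H} = \mathbb{R}^2$ with the Euclidean norm and a hypothetical non-symmetric matrix `M` defining $a(u,v) = v^{\top} M u$. This form is not coercive, but adding the shift $C_3\,(u,v)_{\mathcal{H}}$ makes it coercive, which is the effect of the exponential substitution at the level of bilinear forms.

```python
import math
import random

# Hypothetical non-symmetric matrix; its symmetric part has eigenvalues -1 and 1.5.
M = [[1.0, 3.0], [-1.0, -0.5]]


def form(u, v, shift=0.0):
    """Bilinear form a(u, v) = v^T M u + shift * (u, v) on R^2."""
    Mu = [M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1]]
    return v[0] * Mu[0] + v[1] * Mu[1] + shift * (u[0] * v[0] + u[1] * v[1])


def smallest_symmetric_eigenvalue(shift=0.0):
    """Smallest eigenvalue of the symmetric part of M + shift * I (2x2 closed form)."""
    a = M[0][0] + shift
    d = M[1][1] + shift
    b = 0.5 * (M[0][1] + M[1][0])
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    return mean - radius


def satisfies_coercivity(shift, c2, trials=1000, seed=0):
    """Check a(u, u) + shift * |u|^2 >= c2 * |u|^2 on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        norm_sq = u[0] ** 2 + u[1] ** 2
        if form(u, u, shift) < c2 * norm_sq - 1e-12:
            return False
    return True
```

In this toy setting the Gårding inequality holds with $C_2 = 0.5$ and $C_3 = 1.5$, since the smallest eigenvalue of the symmetric part is $-1 = C_2 - C_3$.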

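The time derivative of a function $u$ in the appropriate Bochner space is understood in the weak sense: for $u \in L^2(J;\mathcal{V}) \cap H^1(J;\mathcal{V}^*)$, its weak derivative $\dot{u} \in L^2(J;\mathcal{V}^*)$ is defined by the relation

$$\int_J (\dot{u}(t), v)_{\mathcal{V}^*\times\mathcal{V}}\,\varphi(t)\,dt = -\int_J (u(t), v)_{\mathcal{V}^*\times\mathcal{V}}\,\dot{\varphi}(t)\,dt, \tag{2.10}$$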

for every $v \in \mathcal{V}$ and $\varphi \in C_0^\infty(J)$; see [96, p. 30-31] for reference and Section 2 for more details on the above Bochner spaces. With these preparations one can formulate the variational (or weak) framework corresponding to the evolution problem (2.1). The motivation for passing to the weak formulation is that it is often not possible to find a classical solution to the original problem, i.e. there may exist no function $u \in C^{1,2}(J,\mathbb{R}^2)\cap C^0(\bar{J},\mathbb{R}^2)$ of sufficient regularity which solves (2.1). In such cases one passes to a variational reformulation of the original problem:


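Definition 2.6 (Variational formulation). Consider a Gelfand triplet $\mathcal{V} \hookrightarrow \mathcal{H} \cong \mathcal{H}^* \hookrightarrow \mathcal{V}^*$ as in (2.3). Let the initial condition $u_0 \in \mathcal{H}$ lie in the pivotal Hilbert space, and let the right-hand side $g \in L^2(J;\mathcal{V}^*)$ of (2.1) lie in the Bochner space corresponding to $\mathcal{V}^*$, where $J = (0,T)$ with $T > 0$. The weak formulation of (2.1) is as follows: find $u \in L^2(J;\mathcal{V})\cap H^1(J;\mathcal{V}^*)$ such that $u(0) = u_0$ and, for every $v \in \mathcal{V}$ and $\varphi \in C_0^\infty(J)$,

$$-\int_J (u(t), v)_{\mathcal{H}}\,\dot{\varphi}(t)\,dt + \int_J a(u(t),v)\,\varphi(t)\,dt = \int_J (g(t), v)_{\mathcal{V}^*\times\mathcal{V}}\,\varphi(t)\,dt. \tag{2.11}$$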

The variational formulation allows us to consider "solutions" $u$ of (2.1) with less regularity than classical solutions of the original problem, which solve the related problem (2.11). Solutions of the variational problem (2.11) are referred to as weak solutions of the original problem (2.1), since whenever they are sufficiently smooth they coincide with the solutions of the corresponding original problem (2.1). In this (possibly degenerate) parabolic setup, the following general theorem holds.

Theorem 2.7. Let $\mathcal{V}$ and $\mathcal{H}$ be separable Hilbert spaces with a continuous dense embedding $\mathcal{V} \hookrightarrow \mathcal{H}$. Furthermore, let $a : \mathcal{V}\times\mathcal{V} \to \mathbb{R}$ be a bilinear form satisfying the inequalities (2.5) and (2.8). Then the corresponding abstract variational parabolic problem (see Definition 2.6 above) has a unique solution in $L^2(J;\mathcal{V})\cap H^1(J;\mathcal{V}^*)$.

Proof. See [120, Theorem 4.1].

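Theorem 2.8. Assume that $A \in \mathcal{L}(\mathcal{V},\mathcal{V}^*)$ and consider a bilinear form $a(\cdot,\cdot) : \mathcal{V}\times\mathcal{V} \to \mathbb{R}$ associated with $A$ via (2.4). If $a(\cdot,\cdot)$ satisfies the properties (2.5) and (2.6) (resp. (2.8)), then

(a) $A \in \mathcal{L}(\mathcal{V},\mathcal{V}^*)$ is an isomorphism with $\|A\|_{\mathcal{L}(\mathcal{V},\mathcal{V}^*)} \le C_1$ and $\|A^{-1}\|_{\mathcal{L}(\mathcal{V}^*,\mathcal{V})} \le \frac{1}{C_2}$.

(b) $-A$ is the infinitesimal generator of a bounded analytic $C_0$-semigroup $(P_t)_{t\ge 0}$ in $\mathcal{V}^*$.

(c) For $u_0 \in \mathcal{H}$ and $g \in L^2(J;\mathcal{V}^*)$, the unique (cf. Theorem 2.7) variational solution of the pricing equation (2.1) (resp. (2.9)) can be represented as

$$u(t) = P_t u_0 + \int_0^t P_{t-s}\,g(s)\,ds. \tag{2.12}$$

Moreover, the following a priori estimate holds:

$$\|u\|_{L^2(J;\mathcal{V})} + \|\dot{u}\|_{L^2(J;\mathcal{V}^*)} + \|u\|_{C(\bar{J};\mathcal{H})} \le C\left(\|g\|_{L^2(J;\mathcal{H})} + \|u_0\|_{\mathcal{H}}\right). \tag{2.13}$$

Proof. See [128, Theorem 2.3], [143, Section 2] and also [120].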
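To make the abstract setting concrete, the following sketch applies the same recipe (variational formulation, Galerkin discretization in space, $\theta$-scheme in time) to the simplest non-degenerate model problem, the heat equation $\dot{u} - \partial_{xx}u = 0$ on $(0,1)$ with homogeneous Dirichlet conditions and $u_0(x) = \sin(\pi x)$. It is an illustration only: it uses piecewise linear hat functions on a uniform mesh rather than the weighted wavelet basis of this chapter, and all function names are hypothetical.

```python
import math


def solve_sym_tridiagonal(diag, off, rhs):
    """Solve T x = rhs for the symmetric Toeplitz tridiagonal matrix T with
    constant diagonal `diag` and off-diagonal `off` (Thomas algorithm)."""
    n = len(rhs)
    c = [0.0] * n  # eliminated super-diagonal
    d = [0.0] * n  # eliminated right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def apply_sym_tridiagonal(diag, off, x):
    """Return T x for the same kind of tridiagonal matrix."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(diag * x[i] + off * (left + right))
    return y


def theta_scheme_heat(n, steps, T, theta=0.5):
    """Hat-function Galerkin method plus theta-scheme for the model problem.
    Returns the interior nodes and the nodal values at time T."""
    h = 1.0 / (n + 1)
    k = T / steps
    mass_diag, mass_off = 2.0 * h / 3.0, h / 6.0   # entries (phi_i, phi_j)_H
    stiff_diag, stiff_off = 2.0 / h, -1.0 / h      # entries a(phi_j, phi_i)
    lhs = (mass_diag + theta * k * stiff_diag, mass_off + theta * k * stiff_off)
    rhs = (mass_diag - (1.0 - theta) * k * stiff_diag,
           mass_off - (1.0 - theta) * k * stiff_off)
    nodes = [(i + 1) * h for i in range(n)]
    u = [math.sin(math.pi * x) for x in nodes]     # nodal interpolant of u0
    for _ in range(steps):
        b = apply_sym_tridiagonal(rhs[0], rhs[1], u)
        u = solve_sym_tridiagonal(lhs[0], lhs[1], b)
    return nodes, u


def max_error(n=49, steps=100, T=0.1, theta=0.5):
    """Maximal nodal error against the exact solution exp(-pi^2 t) sin(pi x)."""
    nodes, u = theta_scheme_heat(n, steps, T, theta)
    decay = math.exp(-math.pi ** 2 * T)
    return max(abs(ui - decay * math.sin(math.pi * x)) for x, ui in zip(nodes, u))
```

Calling `max_error()` uses the Crank-Nicolson choice $\theta = 1/2$, while `max_error(theta=1.0)` gives the implicit Euler scheme. The degenerate SABR case requires the weighted spaces introduced below, which this sketch does not attempt to reproduce.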


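In our analytic setup, the spaces $\mathcal{V}$ and $\mathcal{H}$ will typically be weighted Sobolev spaces

$$\begin{aligned} \mathcal{H} &:= L^2_\omega(G) = \{u|_G \in L^1_{loc}(\mathbb{R}^2) \mid \omega u \in L^2(\mathbb{R}^2)\},\\ \mathcal{V} &:= H^1_\omega(G) = \{u|_G \in L^1_{loc}(\mathbb{R}^2) \mid \omega u,\ \omega_1\partial_1 u,\ \omega_2\partial_2 u \in L^2(\mathbb{R}^2);\ u|_{\mathbb{R}^2\setminus G} = 0\}, \end{aligned} \tag{2.14}$$

where $G \subset \mathbb{R}^2$ is a bounded domain of the form $(0,R_x)\times(-R_y,R_y)$ for some real constants $R_x, R_y > 0$, with appropriate singular weight functions² $\omega, \omega_1, \omega_2$ (see [143, Section 2], [128, Section 2.2] for further reference on admissibility criteria for the weight functions).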

Remark 2.9. For call and put options, the error made by truncating the domain to $G \subset \mathbb{R}^2$ corresponds to approximating the option prices by knock-out barrier options, up to the first hitting time of the boundary $\partial G$; see [96, Section 4.6 and Chapter 6]. A probabilistic argument to estimate the localization error was suggested by Cont and Voltchkova in [42]. In the case of the SABR model, the probability that the first hitting time of $R_x$ resp. $R_y$ occurs before $T$ converges to zero as $R_x, R_y \to \infty$. The lower boundary, however, cannot be truncated to a domain $(\varepsilon, R_x)$ for a positive $\varepsilon$ without possibly introducing a considerable localization error, since the SABR process accumulates positive mass at zero for every $T > 0$ whenever $\beta < 1$. For details see [86], where the mass at zero is calculated for several relevant parameter configurations.

2 Analytic setting for the SABR model

In this section we establish the triplet $\mathcal{V} \subset \mathcal{H} \subset \mathcal{V}^*$ of spaces, tailored to the SABR process, on which we cast the Kolmogorov pricing equations in variational form (cf. (2.11)). We then proceed to show the well-posedness of these pricing equations on this triplet, i.e. that inequalities (2.5) and (2.8) are fulfilled. We conclude the section by proving that the bilinear form resulting from the weak formulation of the pricing equations is indeed a non-symmetric Dirichlet form corresponding to the (unique) law of the SABR process. Fix a time horizon $[0,T]$ and set $\tilde{Y} = \log(Y)$, so that the SDE (1.1) becomes

$$\begin{aligned} dX_t &= X_t^{\beta}\exp(\tilde{Y}_t)\,dW_t, & X_0 &= x_0 > 0,\\ d\tilde{Y}_t &= \nu\,dZ_t - \frac{\nu^2}{2}\,dt, & \tilde{Y}_0 &= \log y_0,\ y_0 > 0,\\ d\langle W, Z\rangle_t &= \rho\,dt. \end{aligned} \tag{2.15}$$
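The change of variables leading to (2.15) can be illustrated on a single simulated path. The sketch below is a hedged example with hypothetical function names and parameter values: it drives an Euler scheme for $dY_t = \nu Y_t\,dZ_t$ and the closed-form expression $\tilde{Y}_T = \log y_0 + \nu Z_T - \frac{\nu^2}{2}T$ with the same Brownian increments, so that the Euler value of $Y_T$ can be compared with $\exp(\tilde{Y}_T)$.

```python
import math
import random


def simulate_volatility(y0, nu, T, n_steps, seed=0):
    """Return (Euler value of Y_T, exp(Ytilde_T)) computed on one Brownian path."""
    rng = random.Random(seed)
    dt = T / n_steps
    y_euler = y0
    z = 0.0  # running value of the Brownian motion Z
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        y_euler += nu * y_euler * dz  # Euler step for dY = nu * Y dZ
        z += dz
    # Log-volatility from (2.15): Ytilde_T = log(y0) + nu * Z_T - nu^2 * T / 2.
    log_y = math.log(y0) + nu * z - 0.5 * nu ** 2 * T
    return y_euler, math.exp(log_y)
```

The log-volatility has constant coefficients, so it can be sampled without discretization error, while the Euler value for $Y$ only approximates it for small time steps.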

The solution to (1.1) is uniquely defined once boundary conditions at $X = 0$ are imposed. It is standard to impose absorbing boundary conditions to ensure the martingale property of the process, and indeed, for the parameters $\beta \in [0.5, 1]$ this is the only choice. For the parameters $\beta \in [0, 0.5)$ one could alternatively choose reflecting boundary conditions (as suggested in [14]) in order to accommodate market conditions where interest rates can become negative. The value at time $t \in J = (0,T)$ of a European-type contract³ on (2.15) with payoff $u_0$ is

$$u(t,z) = \mathbb{E}\left[u_0(Z_\tau^z)\right], \qquad t \in J, \tag{2.16}$$

where $\tau := T - t$ and $Z_\tau^z := (X_\tau, \tilde{Y}_\tau)$ is the process started at $z := (x,y) \in \mathbb{R}_{\ge 0}\times\mathbb{R}$, with $(x,y) = (X_t, \tilde{Y}_t)$ $\mathbb{P}$-a.s. Then for $u \in C^{1,2}(J;\mathbb{R}_{\ge 0}\times\mathbb{R}) \cap C^0(\bar{J};\mathbb{R}_{\ge 0}\times\mathbb{R})$ the Kolmogorov pricing equation corresponding to (2.15) is

$$\begin{aligned} \dot{u}(t,z) - Au(t,z) &= 0 && \text{in } J\times\mathbb{R}_{\ge 0}\times\mathbb{R},\\ u(0,z) &= u_0(z) && \text{in } \mathbb{R}_{\ge 0}\times\mathbb{R}, \end{aligned} \tag{2.17}$$

where the infinitesimal generator $A$ of (2.15) reads

$$Af = \frac{x^{2\beta}e^{2y}}{2}\,\partial_{xx}f + \rho\nu x^{\beta}e^{y}\,\partial_{xy}f + \frac{1}{2}\nu^2\,\partial_{yy}f - \frac{1}{2}\nu^2\,\partial_{y}f \quad \text{for } f \in C_0^2(D) \subset D(A). \tag{2.18}$$

From now on we drop the tilde in the logarithmic volatility for notational convenience.

² See Appendix 1 for more details on the notation and properties of the weights $\omega, \omega_1, \omega_2$.

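Definition 2.10. For any $\mu \in [\max\{-1,-2\beta\},\, 1-2\beta]$, $\beta \in [0,1]$, let $G := [0,R_x)\times(-R_y,R_y) \subset \mathbb{R}_+\times\mathbb{R}$, $R_x, R_y > 0$, be an open subset. On $G$ we define the weighted space⁴

$$\mathcal{H} := L^2(G, x^{\mu/2}) = \{u : G \to \mathbb{R} \text{ measurable} \mid \|u\|_{L^2(G,x^{\mu/2})} < \infty\} \tag{2.19}$$

with $\|u\|^2_{L^2(G,x^{\mu/2})} := (u,u)_{\mathcal{H}}$ for the bilinear form

$$(u,v)_{\mathcal{H}} := \int_G u(x,y)\,v(x,y)\,x^{\mu}\,dx\,dy, \qquad u, v \in \mathcal{H}. \tag{2.20}$$

Remark 2.11. Note that $(\mathcal{H}, (\cdot,\cdot)_{\mathcal{H}})$ is a Hilbert space, see Appendix 1, Lemmas 5.5 and 5.3.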

A possible choice for the weight is $\mu = -\beta$ for all $\beta \in [0,1]$. Alternatively, one can distinguish the cases $\beta \in [\frac{1}{2},1)$ and $\beta \in [0,\frac{1}{2})\cup\{1\}$ and choose $\mu = -\beta$ for $\beta \in [\frac{1}{2},1)$, but $\mu = 0$ for $\beta \in [0,\frac{1}{2})\cup\{1\}$. Distinguishing the above parameter regimes has the advantage that we preserve the classical setting of an unweighted $L^2(G)$-space for the parameters $\beta \in [0,\frac{1}{2})\cup\{1\}$. The latter choice furthermore highlights that our analytic setting consistently extends the univariate CEV case to the bivariate SABR case: see [96, Section 4.5] and Remark 2.16 for the analytic setting for CEV.
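The admissible range of the weight exponent in Definition 2.10 is easy to tabulate. The following sketch, with hypothetical helper names, computes the interval $[\max\{-1,-2\beta\},\, 1-2\beta]$ for a given $\beta$ and checks whether a candidate $\mu$ lies in it; it is meant only as a reading aid for the parameter condition.

```python
def admissible_mu_interval(beta: float):
    """Interval [max(-1, -2*beta), 1 - 2*beta] of admissible exponents mu."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return max(-1.0, -2.0 * beta), 1.0 - 2.0 * beta


def is_admissible(mu: float, beta: float, tol: float = 1e-12) -> bool:
    """True if mu lies in the admissible interval for the given beta."""
    lower, upper = admissible_mu_interval(beta)
    return lower - tol <= mu <= upper + tol


if __name__ == "__main__":
    for beta in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(beta, admissible_mu_interval(beta), is_admissible(-beta, beta))
```

The choice $\mu = -\beta$ passes this check for every $\beta \in [0,1]$, whereas the unweighted choice $\mu = 0$ passes it only when $\beta \le \frac{1}{2}$.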

Definition 2.12. Set $G := [0,R_x)\times(-R_y,R_y) \subset \mathbb{R}_+\times\mathbb{R}$, $R_x, R_y > 0$, the domain of interest. For the first coordinate we consider

$$L^2([0,R_x), x^{\mu/2}) = \{u \in L^2([0,R_x)) \text{ with } \|x^{\mu/2}u\|_{L^2} < \infty\}.$$

On $L^2([0,R_x), x^{\mu/2})$ we define the space

$$V_x := \overline{C_0^\infty([0,R_x))}^{\|\cdot\|_{V_x}}, \quad \text{where} \quad \|u\|^2_{V_x} := \|x^{\beta+\mu/2}\partial_x u\|^2_{L^2(0,R_x)} + \|x^{\mu/2}u\|^2_{L^2(0,R_x)}, \quad u \in C_0^\infty([0,R_x)).$$

For the second coordinate we consider on $L^2(-R_y,R_y)$ the space

$$V_y := H^1(-R_y,R_y), \quad \text{where} \quad \|u\|^2_{V_y} := \|\partial_y u\|^2_{L^2(-R_y,R_y)} + \|u\|^2_{L^2(-R_y,R_y)}, \quad u \in H^1(-R_y,R_y).$$

We define for the bivariate case

$$\mathcal{V} := \left(V_x \otimes L^2(-R_y,R_y)\right) \cap \left(L^2([0,R_x), x^{\mu/2}) \otimes V_y\right). \tag{2.21}$$

The dual space will be denoted by $\mathcal{V}^*$ and equipped with the usual dual norm

$$\|v\|_{\mathcal{V}^*} = \sup_{u \in \mathcal{V}} \frac{(v,u)_{\mathcal{V}^*\times\mathcal{V}}}{\|u\|_{\mathcal{V}}}, \qquad v \in \mathcal{V}^*. \tag{2.22}$$

³ We assume for notational simplicity zero risk-free interest rates here.
⁴ See Section (5.1) for a reminder.

Remark 2.13. Note that the norm on the space (2.21) is by construction⁵ equivalent to

$$\|u\|^2_{\mathcal{V}} \approx \|x^{\beta+\mu/2}\partial_x u\|^2_{L^2(G)} + \|x^{\mu/2}\partial_y u\|^2_{L^2(G)} + \|x^{\mu/2}u\|^2_{L^2(G)}, \qquad u \in \mathcal{V}. \tag{2.23}$$

Lemma 2.14. For any $\mu \in [\max\{-1,-2\beta\},\, 1-2\beta]$ the spaces $\mathcal{H}$, $\mathcal{V}$ and $\mathcal{V}^*$ in Definitions 2.10 and 2.12 form a Gelfand triplet with dense embeddings

$$\mathcal{V} \hookrightarrow \mathcal{H} \cong \mathcal{H}^* \hookrightarrow \mathcal{V}^*. \tag{2.24}$$

Proof. Since all weights are contained in the $A_2$ Muckenhoupt class⁶, it holds in particular that all weights are in $L^1_{loc}(G)$, and therefore $C_0^\infty(G)$ is indeed a subset of both the weighted Sobolev space⁷ $W^{1,2}(G; x^{\mu/2}; x^{\mu/2+\beta}, x^{\mu/2})$ and of $L^2(G, x^{\mu/2})$, and by construction $\overline{C_0^\infty(G)}^{\|\cdot\|_{\mathcal{V}}} \subset \mathcal{V}$. Therefore, it suffices to show $\overline{C_0^\infty(G)}^{\|\cdot\|_{\mathcal{V}}} \hookrightarrow \mathcal{H}$. For this, one can take a route via symmetric Dirichlet forms: we define a symmetric bilinear form

$$\begin{aligned} \mathcal{E}(u,v) &:= \iint_G x^{2\beta+\mu}e^{2y}\,\partial_x u\,\partial_x v\,dx\,dy + \iint_G x^{\mu}\,\partial_y u\,\partial_y v\,dx\,dy,\\ D(\mathcal{E}) &:= C_0^\infty(G). \end{aligned} \tag{2.25}$$

It follows from [153, Corollary 3.5] or [33, Proposition 1] that for any open $G \subset \mathbb{R}_{\ge 0}\times\mathbb{R}$, the bilinear form (2.25) is closable on the Hilbert space $L^2(G, x^{\mu/2})$. Closability yields a dense subspace of $L^2(G, x^{\mu/2})$ containing $C_0^\infty(G)$, on which $(\mathcal{E}(\cdot,\cdot) + \|\cdot\|^2_{\mathcal{H}})^{1/2}$-Cauchy sequences converge. This norm coincides with the norm $\|\cdot\|_{\mathcal{V}}$ in (2.23) by the construction (2.25) of $\mathcal{E}$.

⁵ See Appendix 3.1 for details.
⁶ See Appendix 1, in particular Lemma 5.4.
⁷ See Definition 5.2 in the Appendix for the notation.


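We define the SABR bilinear form by the relation (2.4). Noting that $A \in \mathcal{L}(\mathcal{V},\mathcal{V}^*)$, and extending the inner product $(\cdot,\cdot)_{\mathcal{H}}$ of the Hilbert space to the dual pairing $(\cdot,\cdot)_{\mathcal{V}^*\times\mathcal{V}}$, we set

$$a(u,v) := -(Au,v)_{\mathcal{V}^*\times\mathcal{V}}, \qquad u, v \in \mathcal{V},$$

where the operator $A$ acts on $\mathcal{V}$ in the weak sense. The SABR bilinear form is therefore

$$\begin{aligned} a(u,v) ={}& \frac{1}{2}\iint_G x^{2\beta+\mu}e^{2y}\,\partial_x u\,\partial_x v\,dx\,dy + \frac{2\beta+\mu}{2}\iint_G x^{2\beta+\mu-1}e^{2y}\,\partial_x u\,v\,dx\,dy\\ &+ \rho\nu\iint_G x^{\beta+\mu}e^{y}\,\partial_x u\,\partial_y v\,dx\,dy + \rho\nu\iint_G x^{\beta+\mu}e^{y}\,\partial_x u\,v\,dx\,dy\\ &+ \frac{\nu^2}{2}\iint_G x^{\mu}\,\partial_y u\,\partial_y v\,dx\,dy + \frac{\nu^2}{2}\iint_G x^{\mu}\,\partial_y u\,v\,dx\,dy, \qquad u, v \in \mathcal{V}, \end{aligned} \tag{2.26}$$

which is obtained from (2.4), by the divergence theorem together with $\mathcal{V} \subset L^1_{loc}(G)$, when $Au$, $u \in \mathcal{V}$, are interpreted as weak derivatives⁸.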

Definition 2.15 (Variational formulation of the SABR pricing equation). Let $\mathcal{V}$, $\mathcal{V}^*$ and $\mathcal{H}$ (resp. $L^2(G, x^{\mu/2})$) be as in Definitions 2.12 and 2.10, and let the bilinear form $a(\cdot,\cdot)$ on $\mathcal{V}$ be as in (2.26). Furthermore, let $u_0 \in \mathcal{H}$ (resp. $u_0 \in L^2(G, x^{\mu/2})$) and consider for a $T > 0$ the finite interval $J = (0,T)$. Then the variational formulation of the SABR pricing problem reads as follows: find $u \in L^2(J;\mathcal{V}) \cap H^1(J;\mathcal{V}^*)$ such that $u(0) = u_0$ and, for every $v \in \mathcal{V}$ and $\varphi \in C_0^\infty(J)$,

$$-\int_J (u(t), v)_{\mathcal{H}}\,\dot{\varphi}(t)\,dt + \int_J a(u(t),v)\,\varphi(t)\,dt = \int_J (g(t), v)_{\mathcal{V}^*\times\mathcal{V}}\,\varphi(t)\,dt. \tag{2.27}$$

Remark 2.16 (The CEV case: $\nu = 0$). In [96, 150] a corresponding analytic setup is studied for the univariate case: for the CEV model, the Gelfand triplet $V \subset H \cong H^* \subset V^*$ in [96, 150] consists of the weighted spaces

$$H := L^2((0,R), x^{\mu/2}) \quad \text{and} \quad V := \overline{C_0^\infty([0,R))}^{\|\cdot\|_V} \ \text{ with } \ \|u\|^2_V := \|x^{\beta+\mu/2}\partial_x u\|^2_{L^2(0,R)} + \|x^{\mu/2}u\|^2_{L^2(0,R)},$$

as well as the dual $V^*$, where the parameter $\mu \in [\max\{-1,-2\beta\},\, 1-2\beta]$ is chosen as in Definitions 2.10 and 2.12. Indeed, setting $\nu = 0$, the SABR process (1.1) (resp. (2.15)) with trivial volatility process reduces to the CEV model, and the state space $G$ reduces to $[0,R_x) \subset \mathbb{R}$. Accordingly, for $\nu = 0$ the spaces $\mathcal{H}$, $\mathcal{V}$, $\mathcal{V}^*$ in Definitions 2.10 and 2.12 coincide with the spaces $H$, $V$ and $V^*$ above. Also, the SABR bilinear form (2.26) reduces to the corresponding CEV bilinear form. Hence, our analytic setup extends the univariate setup of the CEV model consistently to the bivariate SABR case. See [150, Equation (21)], [96, Equations (4.30) and (4.33)] as well as [96, page 62] for the definitions⁹ of $V, H, V^*$ and [96, Equation (4.28)] for the CEV bilinear form.

⁸ Multiplying $-Au$ with $v \in C_0^\infty$ and integrating gives (2.4); partial integration (cf. (2.10)) then yields (2.26).

3 Well-posedness of the variational pricing equations and SABR Dirichlet forms

In this section we show that the triplet of spaces $\mathcal{V} \subset \mathcal{H} \cong \mathcal{H}^* \subset \mathcal{V}^*$ (Definitions 2.10 and 2.12) is tailored to the degeneracy of the infinitesimal generator (2.18) at zero. That is, in this setting the variational formulation of the SABR pricing equation has a unique solution in $\mathcal{V}$ (a consequence of Theorem 2.7 and well-posedness, cf. Theorem 2.17), and hence the family $P_t u_0 = \mathbb{E}[u_0(Z_t)]$, $t \ge 0$, is a strongly continuous contraction semigroup on the Hilbert space $\mathcal{H}$, cf. Theorems 2.8 and 2.17. Furthermore, the SABR bilinear form (2.26) is a (non-symmetric) Dirichlet form with domain $\mathcal{V}$ on the Hilbert space $\mathcal{H}$ (Theorem 2.22).

The key is the well-posedness result, Theorem 2.17. It is a direct consequence of Theorem 2.7 combined with Lemmas 2.18 and 2.19 below, which establish continuity and the Gårding inequality for the SABR Dirichlet form (2.26) on this triplet.

Theorem 2.17 (Well-posedness of the SABR pricing equation). For every configuration $(\beta, |\rho|, \nu) \in [0,1] \times [0,1] \times \mathbb{R}_+$ of the SABR parameters satisfying the condition $|\rho|\nu^2 < 2$, and for any $x_0, y_0 > 0$, the weak formulation (2.27) of the pricing equation (2.17) corresponding to the SABR model (2.17) admits a unique solution $u \in L^2(J, V) \cap H^1(J, V^*)$ for any forcing term¹⁰ $g \in L^2(J, V^*)$ and any $u_0 \in H$. The unique variational solution of the pricing equation can be represented as

\[
u(t,z) = P_t u_0(z) + \int_0^t P_{t-s}\, g(s)\, ds, \qquad t \ge 0,\ z \in G, \tag{2.28}
\]

for a strongly continuous semigroup $(P_t)_{t \ge 0}$ on H with the infinitesimal generator A in (2.18). Furthermore, the norm $\|\cdot\|_a^2 := a(\cdot,\cdot)$ induced by the bilinear form (2.26) is equivalent to the norm on V, that is, $\|\cdot\|_a \approx \|\cdot\|_V$.

Lemma 2.18. The bilinear form (2.26) is continuous: there exists a constant $C_1 > 0$ such that

\[
|a(u,v)| \le C_1 \|u\|_V \|v\|_V \qquad \text{for all } u, v \in V. \tag{2.29}
\]

Proof. The continuity statement (2.29) is a direct consequence of the following estimates:

9 Note that V, H, V∗ are presented here in a form which is adjusted to our current notation.
10 See equation (2.1) in Section 1.


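(1)
\[
\frac12 \iint_G x^{2\beta+\mu} e^{2y}\, \partial_x u\, \partial_x v \,dx\,dy
\le \frac12 \Big( \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \|x^{\beta+\mu/2} e^{y} \partial_x v\|^2_{L^2(G)} \Big),
\]

(2) By the Cauchy-Schwarz inequality,
\[
\frac{2\beta+\mu}{2} \iint_G x^{2\beta+\mu-1} e^{2y}\, \partial_x u\, v \,dx\,dy
\le \frac{2\beta+\mu}{2} \Big( \iint_G x^{2\beta+\mu-2} e^{2y} v^2 \,dx\,dy \Big)^{1/2} \Big( \iint_G x^{2\beta+\mu} e^{2y} (\partial_x u)^2 \,dx\,dy \Big)^{1/2},
\]
and an upper bound for the latter expression is derived from Hardy's inequality:
\[
\le \frac{2\beta+\mu}{2}\, \frac{2}{|2\beta+\mu-1|}\, \|x^{\beta+\mu/2} e^{y} \partial_x v\|_{L^2(G)}\, \|x^{\beta+\mu/2} e^{y} \partial_x u\|_{L^2(G)}
\le \frac{2\beta+\mu}{2}\, \frac{2}{|2\beta+\mu-1|} \Big( \|x^{\beta+\mu/2} e^{y} \partial_x v\|^2_{L^2(G)} + \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} \Big),
\]

(3)
\[
\rho\nu \iint_G x^{\beta+\mu} e^{y}\, \partial_x u\, \partial_y u \,dx\,dy
\le |\rho\nu| \Big( \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \frac{\nu^2}{2} \|x^{\mu/2} \partial_y u\|^2_{L^2(G)} \Big),
\]

(4)
\[
\rho\nu \iint_G x^{\beta+\mu} e^{y}\, \partial_x u\, \partial_y u \,dx\,dy
\le |\rho\nu| \Big( \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \frac{\nu^2}{2} \|x^{\mu/2} u\|^2_{L^2(G)} \Big),
\]

(5)
\[
\frac{\nu^2}{2} \iint_G x^{\mu}\, \partial_y u\, \partial_y v \,dx\,dy
\le \frac{\nu^2}{2} \Big( \|x^{\mu/2} \partial_y u\|^2_{L^2(G)} + \|x^{\mu/2} \partial_y v\|_{L^2(G)} \Big),
\]

(6)
\[
\frac{\nu^2}{2} \iint_G x^{\mu}\, \partial_y u\, v \,dx\,dy \le 0
\le \frac{\nu^2}{2} \Big( \|x^{\mu/2} \partial_y u\|^2_{L^2(G)} + \|x^{\mu/2} v\|_{L^2(G)} \Big).
\]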

Lemma 2.19. The bilinear form (2.26) satisfies the Gårding inequality, i.e. there exist constants $C_2 > 0$ and $C_3 \ge 0$ such that

\[
a(u,u) \ge C_2 \|u\|^2_V - C_3 \|u\|^2_H \qquad \text{for all } u \in V. \tag{2.30}
\]

Proof. The Gårding inequality (2.30) is obtained from the following estimates:

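(1)
\[
\frac12 \iint_G x^{2\beta+\mu} e^{2y}\, \partial_x u\, \partial_x u \,dx\,dy = \frac12 \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)}.
\]

(2) Integration by parts with $(\partial_x u)\,u = \frac12 \partial_x(u^2)$ yields
\[
\frac{2\beta+\mu}{2} \iint_G x^{2\beta+\mu-1} e^{2y}\, \frac12 \partial_x(u^2) \,dx\,dy
= -\frac{2\beta+\mu}{2}\,(2\beta+\mu-1) \iint_G x^{2\beta+\mu-2} e^{2y} u^2 \,dx\,dy.
\]
The last term is non-negative if and only if $\mu \in [\max\{-1,-2\beta\},\, 1-2\beta]$, $\beta \in [0,1]$.

(3)
\[
\rho\nu \iint_G x^{\beta+\mu} e^{y}\, \partial_x u\, \partial_y u \,dx\,dy
\ge -\frac{|\rho|\nu^3}{4} \Big( \frac{1}{\delta} \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \delta \|x^{\mu/2} \partial_y u\|^2_{L^2(G)} \Big)
\]
for a constant $\delta > 0$.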


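(4)
\[
\rho\nu \iint_G x^{\beta+\mu} e^{y}\, \partial_x u\, \partial_y u \,dx\,dy
\ge -\frac{|\rho|\nu^3}{4} \Big( \varepsilon \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \frac{1}{\varepsilon} \|x^{\mu/2} u\|^2_{L^2(G)} \Big)
\]
for a constant $\varepsilon > 0$.

(5)
\[
\frac{\nu^2}{2} \iint_G x^{\mu}\, \partial_y u\, \partial_y u \,dx\,dy = \frac{\nu^2}{2} \|x^{\mu/2} \partial_y u\|^2_{L^2(G)}.
\]

(6)
\[
\frac{\nu^2}{2} \iint_G x^{\mu}\, \partial_y u\, u \,dx\,dy = \frac12 \nu^2 \big( x^{\mu/2} \partial_y u,\ x^{\mu/2} u \big) = 0
\]
by the (Dirichlet) boundary conditions.

Hence,
\[
a(u,u) \ge \Big( \frac12 - \frac{|\rho|\nu^3}{4\delta} - \frac{|\rho|\nu^3 \varepsilon}{4} \Big) \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2}
+ \Big( \frac{\nu^2}{2} - \frac{|\rho|\nu^3 \delta}{4} \Big) \|x^{\mu/2} \partial_y u\|^2_{L^2}
- \frac{|\rho|\nu^3}{4\varepsilon} \|x^{\mu/2} u\|^2_{L^2},
\]
which yields the inequality (2.30)
\[
a(u,u) \ge C_2 \Big( \|x^{\beta+\mu/2} e^{y} \partial_x u\|^2_{L^2(G)} + \|x^{\mu/2} \partial_y u\|^2_{L^2(G)} + \|x^{\mu/2} u\|^2_{L^2(G)} \Big) - C_3 \|x^{\mu/2} u\|^2_{L^2(G)}
= C_2 \|u\|^2_V - C_3 \|u\|^2_H,
\]
with
\[
C_2 := \min\Big\{ \frac{\nu^2}{2} - \frac{|\rho|\nu^3\delta}{4},\ \frac12 - \frac{|\rho|\nu^3}{4\delta} - \frac{|\rho|\nu^3\varepsilon}{4} \Big\}
\quad\text{and}\quad
C_3 := C_2 + \frac{|\rho|\nu^3}{4\varepsilon}.
\]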

It remains to verify that the constants δ and ε can in fact be chosen such that C2 > 0 and C3 > 0 hold simultaneously.

Lemma 2.20. For every configuration (β, |ρ|, ν) ∈ [0, 1] × [0, 1] × R+ of the SABR parameters which satisfies the condition |ρ|ν² < 2, there exist constants ε > 0 and δ > 0 such that C2 > 0.

Remark 2.21 (Discussion of the parameter restrictions). Note that the case ρ = 0, ν > 0 readily yields C2 > 0, and that ν = 0 yields the CEV model, as discussed in Remark 2.16. Furthermore, the condition on the parameters in Lemma 2.20 is satisfied for any |ρ| ∈ (0, 1] if, for example, 0 < ν < √2. The latter condition on ν is fulfilled in most practical scenarios, as usual values of this parameter are well below √2: the volatility of volatility typically calibrates to values around ν = 0.2, 0.4, 0.6, see for example [89, 140, 147].

Proof of Lemma 2.20. For any parameter configuration with $|\rho|\nu^2 < 2$ one can choose the constant δ in such a way that

\[
\frac{2}{|\rho|\nu} > \delta > 0, \tag{2.31}
\]

and the constant ε accordingly, such that

\[
\frac{2}{\nu^3|\rho|} - \frac{1}{\delta} > \varepsilon > 0. \tag{2.32}
\]


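If the inequalities (2.31) and (2.32) are satisfied, then C2 > 0 follows. It remains to verify that (2.32) poses no contradiction to (2.31). A positive ε satisfying (2.32) exists only if the left-hand side of (2.32) is positive, which gives a lower bound on δ. Together with (2.31), the bounds on δ are

\[
\delta \in \Big( \frac{|\rho|\nu^3}{2},\ \frac{2}{|\rho|\nu} \Big). \tag{2.33}
\]

Indeed, if $|\rho|\nu^2 < 2$, then the set in (2.33) is nonempty, and there exists an ε satisfying (2.32).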
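The choice of δ and ε in the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `garding_constants`, the midpoint choice for δ, and the choice of ε as half of its upper bound are illustrative assumptions. It covers only the case ρ ≠ 0, ν > 0, since the remaining cases are treated in Remark 2.21.

```python
def garding_constants(rho, nu):
    """Pick delta and eps as in (2.31)-(2.33); return (delta, eps, C2, C3).

    Hypothetical helper for illustration; covers rho != 0 and nu > 0 only.
    """
    r = abs(rho)
    if r * nu <= 0:
        raise ValueError("sketch covers rho != 0 and nu > 0 only")
    if r * nu**2 >= 2:
        raise ValueError("requires |rho| * nu**2 < 2")
    lower = r * nu**3 / 2            # lower end of the interval (2.33)
    upper = 2 / (r * nu)             # upper end of the interval (2.33)
    delta = 0.5 * (lower + upper)    # midpoint: an arbitrary admissible choice
    eps = 0.5 * (2 / (nu**3 * r) - 1 / delta)   # half the upper bound in (2.32)
    C2 = min(nu**2 / 2 - r * nu**3 * delta / 4,
             0.5 - r * nu**3 / (4 * delta) - r * nu**3 * eps / 4)
    C3 = C2 + r * nu**3 / (4 * eps)
    return delta, eps, C2, C3
```

For instance, `garding_constants(-0.5, 0.6)` returns a strictly positive `C2`, while a configuration with |ρ|ν² ≥ 2 is rejected.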

Theorem 2.22 (A non-symmetric SABR Dirichlet form). Let the bilinear form a(·, ·) be as in (2.26) and its domain V as in (2.21). Then the pair (a(·, ·), V) is a (non-symmetric) Dirichlet form on the Hilbert space (H, (·, ·)H) in (2.19), for every parameter configuration (β, |ρ|, ν) ∈ [0, 1] × [0, 1] × R+ with |ρ|ν² < 2, and for any µ ∈ [max{−1, −2β}, 1 − 2β].

Proof. The crucial statement in the above theorem is that the pair (a, V) is a coercive closed form on the Hilbert spaces H and $L^2(G, x^{\mu/2})$ for any µ ∈ [max{−1, −2β}, 1 − 2β], where a denotes the SABR bilinear form (2.26) and V is the space (2.21). Closability of $(a, C_0^\infty(G))$ in H and $L^2(G, x^{\mu/2})$ is inherited from the closability of the auxiliary symmetric form E in (2.25) via the equivalence of norms $a(\cdot,\cdot) \approx \|\cdot\|^2_V$, see [123, Section 3, Proposition 3.5]. The latter equivalence is a direct consequence of the continuity property (2.29) and the Gårding inequality (2.30), which were proven for the form a(·, ·) on the triplets V ⊂ H ⊂ V∗ as well as $V \subset L^2(G, x^{\mu/2}) \subset V^*$ in Section 3. Hence, in the inequality (2.29) the norm $\|\cdot\|_V$ on the right-hand side can be replaced by $(a(\cdot,\cdot))^{1/2}$, yielding the strong (resp. weak) sector condition (see (6.1) and Remark 6.3) for the SABR bilinear form: there exists a $C_1 > 0$ such that for all u, v ∈ V

\[
|a(u,v)| \le C_1\, (a(u,u))^{1/2}\, (a(v,v))^{1/2}
\]

resp.

\[
|a_1(u,v)| \le C_1\, (a_1(u,u))^{1/2}\, (a_1(v,v))^{1/2}
\]

for $a_1(u,v) := a(u,v) + (u,v)_H$, u, v ∈ V, cf. [123, Equations (2.4) resp. (2.3)]. Hence, (a(·, ·), V) is a coercive closed form on (H, (·, ·)H) and on $(L^2(G, x^{\mu/2}), (\cdot,\cdot)_H)$, in the sense of [123, Definition 2.4], see Definition 6.2 below. The remaining contraction properties required for the coercive closed form (a(·, ·), V) to be a Dirichlet form (cf. Definition 6.4) follow via Theorem 6.7 from the respective contraction properties of the unique (cf. Theorem 6.5) semigroups on (H, (·, ·)H) and $(L^2(G, x^{\mu/2}), (\cdot,\cdot)_H)$ associated to (a(·, ·), V). Alternatively, the contraction properties in Definition 6.2 can be shown for the bilinear form (2.26) directly: note that for any u ∈ V it holds that $u^+ \wedge 1 \in V$ (since derivatives are taken in the weak sense), and the contraction properties are, by non-negativity of the form, equivalent to

\[
a(u + u^+ \wedge 1,\ u - u^+ \wedge 1) \ge 0 \quad\text{if and only if}\quad a(u^+ \wedge 1,\ u - u^+ \wedge 1) \ge 0,
\]
\[
a(u - u^+ \wedge 1,\ u + u^+ \wedge 1) \ge 0 \quad\text{if and only if}\quad a(u - u^+ \wedge 1,\ u^+ \wedge 1) \ge 0.
\]

Since the functions $u^+ \wedge 1$ and $u - u^+ \wedge 1$ have disjoint supports, the assertion $a(u^+ \wedge 1,\ u - u^+ \wedge 1) = 0$ follows directly from the construction (2.26) of the


bilinear form, yielding sub-Markovianity. Hence, the form a(·, ·) is a non-symmetric Dirichlet form (as a(u, v) ≠ a(v, u) in general). For the sake of completeness, we included in Section 6 the properties of non-symmetric Dirichlet forms which are involved in Theorem 2.22. For a comprehensive treatment of symmetric and non-symmetric Dirichlet forms, see the monographs [34, 67] and [123], respectively.

Remark 2.23. We remark moreover that the form (2.26) is a Dirichlet form corresponding to the unique law of the SABR process. This is a consequence of the well-posedness of the SABR martingale problem (cf. [108, pages 212-214]), which follows from [108, Theorem 21.7] and [108, Lemma 21.17] by pathwise uniqueness of the solutions of (1.1) when imposing Dirichlet boundary conditions at zero.

3 Discretization

In this section we derive a suitable discretization in space and time for the variational formulation of the SABR pricing equations. We propose a multiresolution approximation inspired by the (unweighted) wavelet discretization in [143, Section 3.4]. To accommodate the current setting, we shall rely on the weighted multiresolution analysis established in [27, Sections 5.2 and 5.3]; for further reference see also [150].

1 Space discretization and the semidiscrete problem

Given $u_0 \in V$ and $g \in L^2(J; V^*)$, first choose an approximation $u_{(0,L)} \in V_L$ of the initial data $u_0$, where $V_L \subset V$ is a finite-dimensional subspace. Then the semi-discrete problem reads as follows: find $u_L \in H^1(J; V_L)$ such that

\[
u_L(0) = u_{(0,L)}, \tag{3.1}
\]
\[
\big(\tfrac{d}{dt} u_L, v_L\big)_H + a(u_L, v_L) = (g(t), v_L)_{V^* \times V}, \qquad \forall v_L \in V_L. \tag{3.2}
\]

We will assume $u_{(0,L)} = P_L u_0$, where $P_L : V \to V_L$, $u \mapsto u_L$ is an appropriate projector (see equation (3.17) below). The semi-discrete problem is an initial value problem for $N = \dim V_L$ ordinary differential equations

\[
\mathbf{M} \tfrac{d}{dt} \mathbf{u}(t) + \mathbf{A}\, \mathbf{u}(t) = \mathbf{g}(t), \qquad \mathbf{u}(0) = \mathbf{u}_0, \tag{3.3}
\]

where $\mathbf{u}(t)$ denotes the coefficient vector of $u_L(t)$ for t ≥ 0, and $\mathbf{M}$ and $\mathbf{A}$ denote the mass and stiffness matrix, respectively, with respect to some basis of the discretization space $V_L$, which we will construct in the subsequent sections.

1.1 Discretization spaces

Recall that the bivariate pricing equation (2.17) is parabolic with degenerate elliptic operator A. To accommodate the degeneracy of A, we introduced the


weighted spaces H and V with weights which are singular at the boundary x = 0 of the domain G (cf. Definitions 2.10 and 2.12); these spaces underlie the well-posedness of the variational formulation of the pricing equations. To construct finite element approximation spaces for V and H we first establish univariate approximations in each dimension separately. The approximation spaces to the (weighted) univariate spaces Vx, Vy and Hx, Hy are then assembled to obtain the bivariate approximation spaces to V and H. From now on we will restrict ourselves, without loss of generality, to the unit interval I = (0, 1) if not stated otherwise. Our multiresolution analysis on I ⊂ R consists of a nested family of spaces

V 0 ⊂ V 1 ⊂ . . . ⊂ V NL+1 = VL ⊂ . . . ⊂ L2(I, ω) =: H(I, ω), (3.4)

with $\bigcup_{l \in \mathbb{N}} V^l = L^2(I, \omega)$, where $L^2(I, \omega)$ denotes a (possibly weighted) space of square integrable functions on I with weight function ω. The nested family (3.4) is constructed as follows. Let $T_0$ be a given partition of the unit interval [0, 1] and let L ∈ N denote the discretization level. We assume that for any L > 0 the family $T_0, \dots, T_L$ is such that for each 0 < l < L the partition $T_l$ is obtained from $T_{l-1}$ by bisection of each subinterval, so that $T_l$ has $N^l = C 2^l$ subintervals, where C denotes the number of intervals on level 0. Hence, on the refinement level L of the discretization we have the mesh width $C 2^{-L}$ for some C > 0. We define $V^l$ as the space of piecewise linear functions on the triangulation $T_l$, with zero values on the boundary. For any $l \in \{0, \dots, L\}$ the dimension of the space $V^l$ is

\[
\dim V^l = C\, 2^l =: N^l, \quad 0 \le l < L, \qquad \dim V^L = C\, 2^L =: N \tag{3.5}
\]

for a constant C > 0. Furthermore, the codimension of each level is

\[
M^l := N^{l+1} - N^l, \qquad 0 \le l < L. \tag{3.6}
\]

1.2 Wavelets for L2-spaces on an interval

For the univariate approximation spaces of the Sobolev spaces¹¹ $H_x := L^2(I, \omega^x_0)$, $H_y := L^2(I)$ as well as $V_x := H^1(I, \omega^x_1)$, $V_y := H^1(I)$ on the interval I, we recapitulate basic concepts and definitions of (bi-)orthogonal, compactly supported wavelets from [27, Section 2], [150, Section 4] and [149, Section 6.2]. We consider two-parameter wavelet systems $\{\psi_{l,k}\}$, $l = 0, \dots, \infty$, $k \in M^l$, of compactly supported functions $\psi_{l,k}$. Here the first index, l, denotes the “level” of refinement resp. resolution: wavelet functions $\psi_{l,k}$ with large values of the level index are well-localized in the sense that $\operatorname{diam} \operatorname{supp}(\psi_{l,k}) = O(2^{-l})$. The second index, $k \in M^l$, measures the localization of the wavelet $\psi_{l,k}$ within the interval I at scale l and ranges in the index set $M^l$. In order to achieve maximal flexibility in the construction of wavelet systems (which can be used to satisfy other requirements, such as minimizing their support size or the size of constants in norm equivalences, see

11 For the notation of the weights, recall (2.14) and Section 1.


(3.11), (3.12), cf. [149, Section 6.2]), we propose wavelet bases which are biorthogonal in $L^2(I)$. These consist of a primal wavelet system $\{\psi_{l,k}\}$, $l = 0, \dots, \infty$, $k \in M^l$, which is a Riesz basis of $L^2(I)$ (and which enters explicitly in the space discretizations), and a corresponding dual wavelet system $\{\tilde\psi_{l,k}\}$, $l = 0, \dots, \infty$, $k \in M^l$, which is not used explicitly in the algorithms, see [149, Section 6.2]. The primal wavelet bases $\psi_{k,l}$ span the finite-dimensional spaces

\[
V^l = \operatorname{span}\{\psi_{i,j} \mid 0 \le i \le l,\ 1 \le j \le M^l\}, \qquad l > 0, \tag{3.7}
\]

that is, $V^l = V^{l-1} \oplus W^l$ inductively, where $W^l = \operatorname{span}\{\psi_{l,1}, \dots, \psi_{l,M^l}\}$. Hence,

\[
V_L = \bigoplus_{0 \le l < L} W^l, \qquad \text{where } W^l := \operatorname{span}\{\psi_{l,1}, \dots, \psi_{l,M^l}\}, \tag{3.8}
\]

and dual spaces are defined analogously via the dual wavelet system

\[
\tilde V^L := \bigoplus_{0 \le l < L} \tilde W^l, \qquad \text{where } \tilde W^l := \operatorname{span}\{\tilde\psi_{l,1}, \dots, \tilde\psi_{l,M^l}\}.
\]

Furthermore, the spaces $\bigoplus_{l=0}^\infty W^l$ and $\bigoplus_{l=0}^\infty \tilde W^l$ are assumed to be dense in $L^2(I)$. To construct such spaces, we consider a multiresolution basis $\{\psi_{k,l}\}_{(k,l)}$ of $L^2(I, \omega)$ with the following properties:

(a) The basis functions $\psi$, $\tilde\psi$ are biorthogonal in $L^2(I)$, that is
\[
\int_0^1 \psi_{k,l}(x)\, \tilde\psi_{k',l'}(x)\, dx = \delta_{k,k'}\, \delta_{l,l'}.
\]

(b) The wavelets $\psi_{k,l}$ and their duals $\tilde\psi_{k,l}$ are local with respect to the corresponding scale and normalized, that is
\[
\operatorname{diam} \operatorname{supp}(\psi_{k,l}) = C_\psi\, 2^{-l} \quad\text{as well as}\quad \|\psi_{k,l}\|_{L^1} = C'_\psi\, 2^{-l/2}
\]
holds, where the constants $C_\psi$, $C'_\psi$ may depend on the “mother” wavelet.

(c) The primal wavelets satisfy a vanishing moment condition
\[
\int_0^1 \psi_{k,l}(x)\, x^\alpha\, dx = 0, \qquad \text{for } \alpha = 0, \dots, p,
\]
where p denotes the polynomial order of the wavelets, see [27, Equations (2.2), (2.4)] as well as [149, Equations (6.7), (6.8)]. The dual wavelets, except the ones at the endpoints, satisfy
\[
\int_0^1 \tilde\psi_{k,l}(x)\, x^\alpha\, dx = 0, \qquad \text{for } \alpha = 0, \dots, p+1,
\]


while at the end points the dual wavelets satisfy only
\[
\int_0^1 \tilde\psi_{k,l}(x)\, x^\alpha\, dx = 0, \qquad \text{for } \alpha = 1, \dots, p+1.
\]
This third condition implies that the wavelets satisfy the zero Dirichlet condition, namely $\psi_{k,l}(0) = \psi_{k,l}(1) = 0$, cf. [149, Equation (6.8)].

In the stock price dimension, the weighted spaces Hx and Vx as well as their wavelet bases further satisfy (cf. [27, Assumptions 3.1 and 3.2]):

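(w1) The weight function ω(x) belongs to $W^{1,\infty}((\delta, 1))$ for every δ > 0 and satisfies
\[
C_\omega^{-1} \le \frac{\omega(x)}{x^\alpha} \le C_\omega, \qquad C_\omega^{-1} \le \frac{\omega'(x)}{x^{\alpha-1}} \le C_\omega
\]
for some α ∈ R and a constant $C_\omega > 0$ which only depends on the weight function.

(w2) The boundary wavelets $\psi_{k,l}(x)$ and dual wavelets $\tilde\psi_{k,l}(x)$ are indexed by the sets
\[
\nabla^L_l = \{ k \in \mathbb{N}_0 :\ \gamma - 1 \le k \le 2^l - 1,\ 0 \in \operatorname{supp} \psi_{k,l} \},
\]
\[
\tilde\nabla^L_l = \{ k \in \mathbb{N}_0 :\ \tilde\gamma - 1 \le k \le 2^l - 1,\ 0 \in \operatorname{supp} \tilde\psi_{k,l} \},
\]
and the boundary wavelets and their duals satisfy the conditions
\[
|\psi_{k,l}(x)| \le C_\psi\, 2^{l/2} (2^l x)^{\gamma}, \qquad |(\psi_{k,l})'(x)| \le C_\psi\, 2^{3l/2} (2^l x)^{\gamma-1}, \qquad \gamma \in \mathbb{N}_0,\ k \in \nabla^L_l,
\]
\[
|\tilde\psi_{k,l}(x)| \le C_\psi\, 2^{l/2} (2^l x)^{\tilde\gamma}, \qquad |(\tilde\psi_{k,l})'(x)| \le C_\psi\, 2^{3l/2} (2^l x)^{\tilde\gamma-1}, \qquad \tilde\gamma \in \mathbb{N}_0,\ k \in \tilde\nabla^L_l,
\]
where $\gamma, \tilde\gamma \in \mathbb{N}_0$ are parameters such that $\alpha + \gamma > -\frac12$ and $-\alpha + \tilde\gamma > -\frac12$ are fulfilled for the parameter α in (w1). We refer to [44] for explicit constructions.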

It follows that any v ∈ V has a representation as a series and any $v_L \in V_L$ as a linear combination

\[
v_L = \sum_{l=0}^{L} \sum_{j=1}^{M^l} v^l_j\, \psi_{l,j}, \quad v_L \in V_L; \qquad v = \sum_{l=0}^{\infty} \sum_{j=1}^{M^l} v^l_j\, \psi_{l,j}, \quad v \in V, \tag{3.9}
\]

where $v^l_j = (v, \tilde\psi_{l,j})_{L^2(I)}$, cf. [27, Equation (2.9)]. Approximations of functions in V are obtained by truncating the wavelet expansion, cf. [143, Equation (3.13)] or [96, p. 161].

Definition 3.1 (Projection Operator). For a subspace V of a (possibly weighted) Hilbert space $L^2(I, \omega)$ over an interval I (cf. (3.4)), the projection to the univariate finite element discretization space $P_L : V \to V_L$ is defined by truncating the wavelet expansion at refinement level L:

\[
P_L v := \sum_{l=0}^{L} \sum_{j=1}^{M^l} v^l_j\, \psi_{l,j}, \qquad v \in V. \tag{3.10}
\]
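The truncation in (3.10) can be illustrated with a much simpler basis than the one used here. The sketch below swaps in orthonormal Haar wavelets on (0, 1), which are not the biorthogonal, boundary-adapted wavelets of this section and do not satisfy the Dirichlet condition; the function names `haar` and `projection_error` and the midpoint sampling on $2^J$ cells are assumptions made for illustration only. It shows how the error of the truncated expansion decreases as the level L grows.

```python
import numpy as np

def haar(l, k, x):
    """Haar wavelet psi_{l,k}(x) = 2^{l/2} psi(2^l x - k) on [0, 1)."""
    t = 2.0**l * x - k
    up = ((t >= 0.0) & (t < 0.5)).astype(float)
    down = ((t >= 0.5) & (t < 1.0)).astype(float)
    return 2.0**(l / 2.0) * (up - down)

def projection_error(v, L, J=8):
    """Discrete L2 error of the Haar expansion of v truncated at level L <= J - 1.

    Inner products are approximated by averages over 2^J cell midpoints.
    """
    x = (np.arange(2**J) + 0.5) / 2**J
    vx = v(x)
    vL = np.full_like(vx, vx.mean())        # contribution of the constant function
    for l in range(L + 1):
        for k in range(2**l):
            psi = haar(l, k, x)
            vL += np.mean(vx * psi) * psi   # coefficient (v, psi_{l,k}) times psi_{l,k}
    return np.sqrt(np.mean((vx - vL)**2))

# Truncation error for v(x) = x (1 - x) at increasing levels
errors = [projection_error(lambda x: x * (1.0 - x), L) for L in (1, 3, 5, 7)]
```

On the sampled grid the error decreases with L and vanishes up to rounding once L = J − 1, where the truncated system spans all piecewise constants on the grid.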


1.3 Norm equivalences

Wavelet norm equivalences are akin to the classical Parseval relation in Fourier analysis (see [96, Section 12.1.2]), which allows one to express Sobolev norms of a function in terms of sums of its Fourier coefficients. Norm equivalences are relevant for the construction of the mass and stiffness matrices and for approximation estimates in the error analysis, cf. [27, Equations (2.5) and (2.6)]. In the unweighted univariate case there hold the standard norm¹² equivalences

\[
\|u\|^2_{H^s_0(I)} \approx \sum_{l=0}^{\infty} \sum_{k=0}^{2^l} 2^{2ls}\, |u^l_k|^2. \tag{3.11}
\]

For the stock price dimension Vx we need norm equivalences in weighted spaces, for which the requirements (w1) and (w2) of Section 1.2 (cf. [27, Section 3]) are posed on the wavelets and their duals.

Theorem 3.2 (Weighted norm equivalences). Under the assumptions of Section 1.2 on the wavelet basis of the discretization spaces, there holds for any $u \in L^2(I, \omega)$ the norm equivalence of its $L^2(I, \omega)$-norm and of the discrete $l^2_\omega$-norm of its coefficients with respect to the wavelet basis:

\[
\|u\|^2_{L^2(I,\omega)} \approx \sum_{l=0}^{\infty} \sum_{k=0}^{M^l} \omega^2(2^{-l}k)\, |u^l_k|^2. \tag{3.12}
\]

Proof. See [27, Theorem 3.3], and also [27, Theorem 5.1].

1.4 Bivariate setting

For the bivariate case we set, similarly as above, without loss of generality G = (0, 1) × (0, 1). The Hilbert spaces $H^t(G, \omega)$, t = 0, 1, 2, can be constructed from $H^t_x(I, \omega^x_t)$ and $H^t_y(I)$ via tensor products, see Appendix 3 for explicit constructions. We define the two-dimensional discretization spaces as the tensor product of the univariate discretization spaces for the x and y coordinates,

\[
V_L := V_L^x \otimes V_L^y. \tag{3.13}
\]

Therefore, it holds similarly to (3.7) that

\[
V_L = \operatorname{span}\{ \psi_{\mathbf{l},\mathbf{k}} :\ 0 \le l_x, l_y \le L,\ 1 \le k_x \le M^{l_x},\ 1 \le k_y \le M^{l_y} \}, \tag{3.14}
\]

where $\psi_{\mathbf{l},\mathbf{k}}(x,y) = \psi_{(l_x,l_y),(k_x,k_y)} := \psi_{l_x,k_x}(x)\, \psi_{l_y,k_y}(y)$ for (x, y) ∈ (0, 1) × (0, 1). Recall from (3.7) that $W^{l_x} = \operatorname{span}\{\psi_{l_x,1}, \dots, \psi_{l_x,M^{l_x}}\}$ and $W^{l_y} = \operatorname{span}\{\psi_{l_y,1}, \dots, \psi_{l_y,M^{l_y}}\}$

12 With a view to error analysis, it is conventional (cf. [143, Section 3.1] and [96, Section 3.6.1]) to consider functions in V which have additional regularity: in the unweighted univariate case one considers the classical Sobolev spaces $H^t_0(I)$, t = 0, 1, 2, where the subscript denotes Dirichlet boundary conditions.


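are the corresponding complement spaces. Hence it follows directly from (3.8) with (3.13) that in the bivariate case

\[
V_L = \Big( \bigoplus_{0 \le l_x \le L} W^{l_x} \Big) \otimes \Big( \bigoplus_{0 \le l_y \le L} W^{l_y} \Big) = \bigoplus_{0 \le l_x, l_y \le L} W^{l_x} \otimes W^{l_y}. \tag{3.15}
\]

Every element $u \in L^2(G)$ has¹³ a series representation, cf. [96, Equation (13.6)],

\[
u = \sum_{l_x, l_y = 0}^{\infty}\ \sum_{k_x=1}^{M^{l_x}}\ \sum_{k_y=1}^{M^{l_y}} u^{\mathbf{l}}_{\mathbf{k}}\, \psi_{\mathbf{l},\mathbf{k}}, \qquad \text{where } \mathbf{l}, \mathbf{k} = (l_x, l_y), (k_x, k_y), \tag{3.16}
\]

and the projection operator $P_L : V \to V_L$, $u \mapsto P_L u =: u_L$ is defined as in (3.10) by truncating the above series representation at level L,

\[
u_L = \sum_{l_x, l_y = 0}^{L}\ \sum_{k_x=1}^{M^{l_x}}\ \sum_{k_y=1}^{M^{l_y}} u^{\mathbf{l}}_{\mathbf{k}}\, \psi_{\mathbf{l},\mathbf{k}}, \qquad u \in V,\ \mathbf{l}, \mathbf{k} = (l_x, l_y), (k_x, k_y). \tag{3.17}
\]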

It is immediate from the underlying tensor product structure and from the norm equivalences (3.11) of the one-dimensional case that in the bivariate case there holds the norm equivalence

\[
\|u\|^2_{H^s(G)} \approx \sum_{l_x, l_y = 0}^{\infty}\ \sum_{k_x}^{2^{l_x}}\ \sum_{k_y}^{2^{l_y}} \big( 2^{2 s_x l_x} + 2^{2 s_y l_y} \big)\, |u^{\mathbf{l}}_{\mathbf{k}}|^2, \tag{3.18}
\]

for the Sobolev spaces $H^k_0(G)$, k = 0, 1, 2, where $\mathbf{l}, \mathbf{k} = (l_x, l_y), (k_x, k_y)$, as in (3.9), and $s = (s_x, s_y)$, cf. [96, Equation 13.8]. For notational simplicity we stated here the unweighted version of the bivariate norm equivalence. Note, however, that passing from the univariate to the bivariate case is a direct consequence of the tensor product structure (see Section 3) and is valid in unweighted and weighted Sobolev spaces alike.

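Now we are in a position to calculate the matrices $\mathbf{M}^x$, $\mathbf{B}^x$ and $\mathbf{S}^x$ as well as $\mathbf{M}^y$, $\mathbf{B}^y$ and $\mathbf{S}^y$ as building blocks of the mass and stiffness matrices $\mathbf{M}$ and $\mathbf{A}$ appearing in the semi-discrete problem (3.3) with respect to the constructed wavelet basis $\{\psi_{\mathbf{l},\mathbf{k}}\}$: for the y-coordinate, the matrices in the appropriate weighted $L^2_\omega$-norm read

\[
\mathbf{M}^y_{\omega_y^2} := \left( \int_0^1 \frac{\omega_y^2(y)\, \psi_{l_y,k_y}(y)\, \psi_{l'_y,k'_y}(y)}{\omega(2^{-l_y}k_y)\, \omega(2^{-l'_y}k'_y)}\, dy \right)_{0 \le l'_y, l_y \le L;\ 0 \le k'_y \le 2^{l'_y},\ 0 \le k_y \le 2^{l_y}}
\]
\[
\mathbf{S}^y_{\omega_y^2} := \left( \int_0^1 \frac{\omega_y^2(y)\, \psi'_{l_y,k_y}(y)\, \psi'_{l'_y,k'_y}(y)}{\omega(2^{-l_y}k_y)\, \omega(2^{-l'_y}k'_y)}\, dy \right)_{0 \le l'_y, l_y \le L;\ 0 \le k'_y \le 2^{l'_y},\ 0 \le k_y \le 2^{l_y}}
\]
\[
\mathbf{B}^y_{\omega_y^2} := \left( \int_0^1 \frac{\omega_y^2(y)\, \psi'_{l_y,k_y}(y)\, \psi_{l'_y,k'_y}(y)}{\omega(2^{-l_y}k_y)\, \omega(2^{-l'_y}k'_y)}\, dy \right)_{0 \le l'_y, l_y \le L;\ 0 \le k'_y \le 2^{l'_y},\ 0 \le k_y \le 2^{l_y}}, \tag{3.19}
\]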

13 Note that $H = L^2(G, x^{\mu/2}) \subset L^2(G)$.


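and for the x-coordinate, the corresponding matrices are

\[
\mathbf{M}^x_{\omega_x^2} := \left( \int_0^1 \frac{\omega_x^2(x)\, \psi_{l_x,k_x}(x)\, \psi_{l'_x,k'_x}(x)}{\omega(2^{-l_x}k_x)\, \omega(2^{-l'_x}k'_x)}\, dx \right)_{0 \le l'_x, l_x \le L;\ 0 \le k'_x \le 2^{l'_x},\ 0 \le k_x \le 2^{l_x}}
\]
\[
\mathbf{S}^x_{\omega_x^2} := \left( \int_0^1 \frac{\omega_x^2(x)\, \psi'_{l_x,k_x}(x)\, \psi'_{l'_x,k'_x}(x)}{\omega(2^{-l_x}k_x)\, \omega(2^{-l'_x}k'_x)}\, dx \right)_{0 \le l'_x, l_x \le L;\ 0 \le k'_x \le 2^{l'_x},\ 0 \le k_x \le 2^{l_x}}
\]
\[
\mathbf{B}^x_{\omega_x^2} := \left( \int_0^1 \frac{\omega_x^2(x)\, \psi'_{l_x,k_x}(x)\, \psi_{l'_x,k'_x}(x)}{\omega(2^{-l_x}k_x)\, \omega(2^{-l'_x}k'_x)}\, dx \right)_{0 \le l'_x, l_x \le L;\ 0 \le k'_x \le 2^{l'_x},\ 0 \le k_x \le 2^{l_x}}, \tag{3.20}
\]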

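see [27, Equation (3.1)]. Hence, the stiffness matrix $\mathbf{A}$ in the semi-discrete problem (3.3) is

\[
\mathbf{A} = \big( \mathbf{A}_{(\mathbf{l}',\mathbf{k}'),(\mathbf{l},\mathbf{k})} \big)_{0 \le l'_x, l_x \le L;\ 0 \le k'_x \le 2^{l'_x},\ 0 \le k_x \le 2^{l_x}} := \big( a(\psi_{\mathbf{l},\mathbf{k}}, \psi_{\mathbf{l}',\mathbf{k}'}) \big)_{0 \le l'_x, l_x \le L;\ 0 \le k'_x \le 2^{l'_x},\ 0 \le k_x \le 2^{l_x}}, \tag{3.21}
\]

where a(·, ·) denotes the SABR bilinear form (2.26). With respect to the weighted multiresolution basis $\psi_{\mathbf{l},\mathbf{k}}$, $\psi_{\mathbf{l}',\mathbf{k}'}$ defined in this section, the mass matrix reads

\[
\mathbf{M} = \mathbf{M}^x_{x^\mu} \otimes \mathbf{M}^y_{1}, \tag{3.22}
\]

and the stiffness matrix $\mathbf{A}$ takes the form

\[
\mathbf{A} = \Big( Q_{xx}\, \mathbf{S}^x_{x^{2\beta+\mu}} \otimes \mathbf{M}^y_{e^{2y}} + Q_{xy}\, \mathbf{B}^x_{x^{\beta+\mu}} \otimes \mathbf{B}^y_{e^{y}} + Q_{yy}\, \mathbf{M}^x_{x^{\mu}} \otimes \mathbf{S}^y_{1} \Big)
+ \Big( c_{x_1}\, \mathbf{B}^x_{x^{2\beta+\mu-1}} \otimes \mathbf{M}^y_{e^{2y}} + c_{x_2}\, \mathbf{B}^x_{x^{\beta+\mu}} \otimes \mathbf{M}^y_{e^{y}} + c_{y}\, \mathbf{M}^x_{x^{\mu}} \otimes \mathbf{B}^y_{1} \Big), \tag{3.23}
\]

where the coefficients in (3.23) are

\[
(Q_{xx}, Q_{xy}, Q_{yy}) = \Big( \frac12,\ \rho\nu,\ \frac{\nu^2}{2} \Big)
\quad\text{and}\quad
(c_{x_1}, c_{x_2}, c_y) = \Big( \frac{2\beta+\mu}{2},\ \rho\nu,\ \frac{\nu^2}{2} \Big).
\]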
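The tensor-product structure of (3.22) and (3.23) maps directly onto Kronecker products. The following is a minimal sketch under stated assumptions: the univariate weighted matrices are taken as already computed and are passed in as dictionaries, and the function name `assemble_sabr` as well as the dictionary keys (for example `"S_2b_mu"` for the x-stiffness matrix with weight $x^{2\beta+\mu}$) are hypothetical labels chosen for this illustration.

```python
import numpy as np

def assemble_sabr(x, y, beta, mu, rho, nu):
    """Assemble mass and stiffness matrices following (3.22) and (3.23).

    x, y: dicts of precomputed univariate matrices (hypothetical key names).
    """
    Qxx, Qxy, Qyy = 0.5, rho * nu, 0.5 * nu**2
    cx1, cx2, cy = 0.5 * (2.0 * beta + mu), rho * nu, 0.5 * nu**2

    M = np.kron(x["M_mu"], y["M_1"])                      # (3.22)

    second_order = (Qxx * np.kron(x["S_2b_mu"], y["M_e2y"])
                    + Qxy * np.kron(x["B_b_mu"], y["B_ey"])
                    + Qyy * np.kron(x["M_mu"], y["S_1"]))
    first_order = (cx1 * np.kron(x["B_2b_mu_m1"], y["M_e2y"])
                   + cx2 * np.kron(x["B_b_mu"], y["M_ey"])
                   + cy * np.kron(x["M_mu"], y["B_1"]))
    return M, second_order + first_order                  # (3.23)
```

Keeping the univariate factors separate until this step means that only one-dimensional integrals have to be evaluated; for large L one would typically avoid forming the full Kronecker products and apply the factors directly, which this sketch does not attempt.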

2 Time discretization and the fully discrete scheme

In this section we define a θ-scheme for our time discretization to introduce the fully discrete scheme of our finite element method. Furthermore, we verify (see Proposition 3.4 below), following [143], that the stability of the θ-scheme remains valid in the setting of weighted spaces. For this, we introduce the following dual norm for the approximation spaces:

\[
\|f\|_* := \sup_{v_L \in V_L} \frac{(f, v_L)_{V^* \times V}}{\|v_L\|_V}, \qquad f \in V_L^*. \tag{3.24}
\]

Furthermore, for T < ∞ and M ∈ N we consider the following uniform time step and time mesh:

\[
k := \frac{T}{M}, \qquad t_m = mk, \quad m = 0, \dots, M. \tag{3.25}
\]

The θ-scheme for the time discretization and the fully discrete scheme are described as follows:


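Definition 3.3 (θ-scheme and the fully discrete scheme). Given the initial data $u^0_L := u_{(0,L)} = P_L u_0$ for the projector in (3.17), for m = 0, …, M − 1 find $u^{m+1}_L \in V_L$ such that for all $v_L \in V_L$:

\[
\frac{1}{k} \big( u^{m+1}_L - u^m_L,\ v_L \big)_{V^* \times V} + a\big( \theta u^{m+1}_L + (1-\theta) u^m_L,\ v_L \big) = \big( \theta g(t_{m+1}) + (1-\theta) g(t_m),\ v_L \big)_{V^* \times V}. \tag{3.26}
\]

Hence, the fully discrete finite element scheme for the SABR model reads

\[
\Big( \frac{1}{k} \mathbf{M} + \theta \mathbf{A} \Big) \mathbf{u}^{m+1} = \frac{1}{k} \mathbf{M} \mathbf{u}^m - (1-\theta) \mathbf{A} \mathbf{u}^m + \mathbf{g}^{m+\theta}, \qquad m = 0, 1, \dots, M-1, \tag{3.27}
\]

where $\mathbf{M}$ denotes the mass matrix (3.22), $\mathbf{A}$ the stiffness matrix (3.23), and $\mathbf{u}^m$ is the coefficient vector of $u^m_L$ with respect to the basis of $V_L$.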
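The recursion (3.27) amounts to one linear solve per time step. The sketch below is a minimal dense-matrix illustration, assuming that the load vector at the intermediate time is formed as θ g(t_{m+1}) + (1 − θ) g(t_m), in line with (3.26); the function name `theta_scheme` and its signature are hypothetical, and a practical implementation would use sparse matrices and reuse a factorization of the left-hand side.

```python
import numpy as np

def theta_scheme(M, A, g, u0, T, steps, theta=0.5):
    """March the fully discrete scheme (3.27) from t = 0 to t = T.

    M, A: mass and stiffness matrices; g: callable t -> load vector;
    u0: initial coefficient vector; steps: number of time steps.
    """
    k = T / steps
    lhs = M / k + theta * A                   # (1/k) M + theta A
    rhs_mat = M / k - (1.0 - theta) * A       # (1/k) M - (1 - theta) A
    u = np.asarray(u0, dtype=float)
    for m in range(steps):
        t0, t1 = m * k, (m + 1) * k
        g_mid = theta * g(t1) + (1.0 - theta) * g(t0)   # g^{m + theta}
        u = np.linalg.solve(lhs, rhs_mat @ u + g_mid)
    return u
```

With `M` and `A` both equal to the 1×1 identity, zero forcing and θ = 1/2, the scheme approximates the scalar decay u′ = −u, which gives a quick sanity check against exp(−T).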

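Proposition 3.4 (Stability of the θ-scheme). For $\frac12 \le \theta \le 1$ let the constants $C_1$ and $C_2$ satisfy

\[
0 < C_1 < 2, \qquad C_2 \ge \frac{1}{2 - C_1}, \tag{3.28}
\]

and for $0 \le \theta < \frac12$ denote $\lambda_A = \sup_{v_L \in V_L} \frac{\|v_L\|^2_H}{\|v_L\|^2_*}$, and let the constants $C_1$ and $C_2$ be such that

\[
0 < C_1 < 2 - \sigma, \qquad C_2 \ge \frac{1 + (4 - C_1)\sigma}{2 - \sigma - C_1}, \qquad \text{where } \sigma := k(1 - 2\theta)\lambda_A < 2. \tag{3.29}
\]

Then the sequence $\{u^m_L\}_{m=0}^M$ of solutions of the θ-scheme (3.26) satisfies the stability estimate

\[
\|u^M_L\|^2_H + C_1 k \sum_{m=0}^{M-1} \|u^{m+\theta}_k\|^2_H \le \|u^0_L\|^2_H + C_2 k \sum_{m=0}^{M-1} \|g^{m+\theta}\|^2_*. \tag{3.30}
\]

The proof of this proposition is analogous to the one in [143] and is delegated to Section 5.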
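The admissibility conditions (3.28) and (3.29) are easy to check for given discretization parameters. The following sketch returns the smallest admissible C2 for a chosen C1; the function name `stability_constant_C2` is hypothetical, and the value of λ_A is assumed to be supplied by the user, since computing it requires the discretization itself.

```python
def stability_constant_C2(theta, k, lam_A, C1):
    """Smallest C2 allowed by (3.28) (theta >= 1/2) or (3.29) (theta < 1/2).

    Raises ValueError if theta, the time step k or C1 violates the conditions.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if theta >= 0.5:
        if not 0.0 < C1 < 2.0:
            raise ValueError("need 0 < C1 < 2")
        return 1.0 / (2.0 - C1)
    sigma = k * (1.0 - 2.0 * theta) * lam_A
    if sigma >= 2.0:
        raise ValueError("time step too large: sigma >= 2")
    if not 0.0 < C1 < 2.0 - sigma:
        raise ValueError("need 0 < C1 < 2 - sigma")
    return (1.0 + (4.0 - C1) * sigma) / (2.0 - sigma - C1)
```

For θ < 1/2 the condition σ < 2 acts as a time-step restriction k < 2 / ((1 − 2θ) λ_A), whereas for θ ≥ 1/2 no restriction on k appears.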

4 Error estimates

Let $V_L$, the finite-dimensional approximation space of the solution space V, be as in Section 3. Furthermore, consider $u^m(x) := u(t_m, x)$, with $t_m$, m = 0, …, M, as in (3.25) and $u^m_L$ as in the fully discrete scheme (3.26). In this section we estimate the error

\[
e^m_L := u^m(x) - u^m_L(x) = (u^m - P_L u^m) + (P_L u^m - u^m_L) =: \eta^m + \xi^m_L, \tag{4.1}
\]

for the time points $t_m$, m = 0, …, M, where $P_L : V \to V_L$ denotes the projection (3.10) onto the finite element space via truncated wavelet expansion. In Section 1 we derive estimates for the error $\eta^m$, m = 0, …, M (approximation estimates). Section 2 is devoted to deriving from the results of Section 1 the corresponding error estimates for $\xi^m_L$, m = 0, …, M, and the convergence of the fully discrete scheme.


1 Approximation estimates

The crucial ingredient of our error analysis is the derivation of approximation estimates which measure the error $\eta^m = u^m - P_L u^m$ between the true solution of the pricing equation at times $t_m$, m = 0, …, M, and its projection to the discretization space in a suitably chosen norm. In the unweighted case, such approximation estimates (as in (4.2) below) in the usual $H^k(I)$ (resp. $H^k(G)$) norm, k = 0, 1, 2, are standard, see [96, Jackson-type estimates, p. 163]. With a view to the ensuing error analysis we derive here (see Section 1.2 below) analogous estimates for weighted Sobolev norms which are suited to the Gelfand triple V ⊂ H ⊂ V∗ constructed in Section 3 and the corresponding discretization spaces in Section 1.

1.1 Approximation estimates in the unweighted case

In the unweighted univariate case it is well known that for all $u \in H^l(I)$ with l = 0, 1, 2 there exists an element $u_L$ in the corresponding discretization space $V_L$ such that $u_L = P_L u$ and, for k = 0, 1 and l = 0, 1, 2, l ≥ k, it holds that

\[
\|u - P_L u\|_{H^k(I)} \le C\, 2^{-(l-k)L}\, \|u\|_{H^l(I)}, \tag{4.2}
\]

where $P_L$ denotes the projector (3.10). The existence of such an element is provided by the norm equivalences (3.11). The same estimates hold in the two-dimensional case G = I × I, see [96, Theorem 13.1.2]; in particular, the approximation rate $(2^{-L})^{(l-k)}$ only depends on the discretization level $2^{-L}$ and is independent of the dimension of the domain G. Analogously to the univariate case, for $k_x = 0, 1$ and $l_x = 0, 1, 2$, $l_x \ge k_x$, as well as for $k_y = 0, 1$ and $l_y = 0, 1, 2$, $l_y \ge k_y$, it holds that

\[
\|u - P_L u\|_{H^k(G)} \le
\begin{cases}
C\, 2^{-(l-k)_* L}\, \|u\|_{H^l(G)}, & \text{if } k \ne 0 \text{ or } l_x, l_y \ne 2, \\
C\, 2^{-(l-k)_* L}\, L^{1/2}\, \|u\|_{H^l(G)}, & \text{else,}
\end{cases} \tag{4.3}
\]

where we denote $(l-k)_* := \min\{l_x - k_x,\ l_y - k_y\}$ and where $P_L$ is the projection operator (3.17). The estimate (4.3) is a direct consequence of (4.2) and the tensor product construction (3.13). Analogous arguments apply as above for the bivariate norm equivalences (3.18), cf. [96, Chapter 13].

1.2 Approximation estimates in the weighted case

In the weighted setting, the order of approximation may depend on the norms of the weighted Sobolev spaces in which we measure the error. In accord with usual conventions¹⁴ we consider functions in V with additional regularity for our error analysis in the weighted setting. For this purpose we consider weighted Sobolev spaces up to second order, $H^k(G, \omega)$, k = 0, 1, 2, where the weight ω is yet to be chosen suitably for our setting. We aim to establish approximation estimates

14 Cf. [143, Section 3.1] and [96, Section 3.6.1], and see also the unweighted case above.


of the following form: for any $u \in H^k(G, \omega)$, k = 0, 1, 2, there exists a constant C > 0 such that an estimate of the following type holds:

\[
\|u - P_L u\|_{H^k(G,\omega)} \le
\begin{cases}
C\, 2^{-c_\omega (l-k)_* L}\, \|u\|_{H^l(G,\omega)}, & \text{for } k \ne 0 \text{ or } l_x, l_y \ne 2, \\
C\, 2^{-c_\omega (l-k)_* L}\, L^{1/2}\, \|u\|_{H^l(G,\omega)}, & \text{else,}
\end{cases} \tag{4.4}
\]

for a constant $c_\omega \in \mathbb{R}_+$, which may depend on the choice of the weights in $H^k(G, \omega)$, k = 0, 1, 2, where $(l-k)_* = \min\{l_x - k_x,\ l_y - k_y\}$ as in (4.3). In the following, we will first prove the one-dimensional version of the approximation property in weighted spaces. More specifically, we will show statements analogous to (4.2) on the weighted Sobolev spaces $H^k(I, x^{\mu/2})$, k = 0, 1, 2, in the x coordinate. For this, we pass from a function $u(t, x, y) \in L^2(J, V)$, t ∈ J, (x, y) ∈ G, to $u_y(t, x) \in L^2(J, V)$, t ∈ J, x ∈ I, for y ∈ I (see Appendix 3). Note that for the y coordinate the unweighted setting prevails and the estimate (4.2) is valid. We then proceed to the bivariate case G = I × I by constructing tensor products of univariate multiresolution finite element spaces¹⁵. Analogously to the unweighted case (cf. equation (4.3)), the minimum of the obtained one-dimensional estimates then yields the estimates (4.4) for the bivariate case $H^k(G, \omega)$, k = 0, 1, 2 (see Section 1.1 above and [96, Section 13.1]).

Definition 4.1. Consider an interval I = (0, R), R > 0, and the weighted Sobolev spaces

\[
H^k_j(I, x^{\mu/2}) := \{ u : I \to \mathbb{R} \text{ measurable} :\ D^a u \in L^2(I, x^{\mu/2 + a\beta j}),\ a \le k \}, \tag{4.5}
\]

for k = 0, 1, 2 and j = 0, 1, with the norm

\[
\|u\|^2_{H^k_j(I, x^{\mu/2})} := \sum_{a \le k} \int_I |D^a u(x)|^2\, x^{\mu + a\beta j}\, dx, \qquad k = 0, 1, 2. \tag{4.6}
\]

To ease notation we shall henceforth denote $H^k_{j=0}$ by $H^k$ and $H^k_{j=1}$ by $H^k_1$, for k = 0, 1, 2.
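For a function whose derivatives are known in closed form, the norm (4.6) can be approximated by elementary quadrature. The sketch below uses a plain midpoint rule and takes the weight exponent µ + aβj exactly as printed in (4.6); the function name `weighted_sobolev_norm_sq` and the way derivatives are passed in (a list of callables, one per order) are assumptions made for this illustration.

```python
import numpy as np

def weighted_sobolev_norm_sq(derivs, mu, beta, j, R=1.0, n=200_000):
    """Midpoint-rule approximation of the squared norm (4.6) on I = (0, R).

    derivs[a] evaluates the a-th derivative D^a u; len(derivs) - 1 plays the role of k.
    """
    h = R / n
    x = (np.arange(n) + 0.5) * h                  # cell midpoints avoid x = 0
    total = 0.0
    for a, d in enumerate(derivs):
        weight = x ** (mu + a * beta * j)         # exponent as printed in (4.6)
        total += np.sum(np.asarray(d(x)) ** 2 * weight) * h
    return total

# Example: u(x) = x on (0, 1) with mu = 0 and beta = 1/2
u = lambda x: x
du = lambda x: np.ones_like(x)
norm_j0 = weighted_sobolev_norm_sq([u, du], mu=0.0, beta=0.5, j=0)   # exact value 1/3 + 1
norm_j1 = weighted_sobolev_norm_sq([u, du], mu=0.0, beta=0.5, j=1)   # exact value 1/3 + 2/3
```

The example shows how the choice j = 1 down-weights the derivative near the origin, which is the mechanism that relaxes the integrability requirement in the spaces $H^k_1$.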

Remark 4.2. Note that for the spaces H and V in Remark 2.16 and for the spaces $H^k_j$, k = 0, 1, 2, j = 0, 1, in (4.5) it holds that $H = H^0((0,R), x^{\mu/2}) = H^0_1((0,R), x^{\mu/2})$, and the weighted space V satisfies $V = H^1_1((0,R), x^{\mu/2})$ and $V \supset H^1((0,R), x^{\mu/2})$ as well as the estimate

\[
\|v\|^2_V \le C_R\, \|v\|^2_{H^1((0,R), x^{\mu/2})}, \qquad v \in V, \tag{4.7}
\]

for any finite R > 0 and a positive constant $C_R > 0$.

Proposition 4.3. The projection operator in (3.10) satisfies for k = 0, 1 and l = 0, 1, 2, l ≥ k the estimate

\[
\|u - P_L u\|_{H^k(I, x^{\mu/2})} \le C\, 2^{-(l-k)L}\, \|u\|_{H^l(I, x^{\mu/2})}, \qquad u \in H^2(I, x^{\mu/2}). \tag{4.8}
\]

15 See Appendix 3.1 for an explicit construction.


It is immediate from Remark 4.2 that the approximation estimates (4.8) readily apply to the CEV model. Furthermore, combining the estimate for the $x$ coordinate with the unweighted estimate (4.2) for the $y$ coordinate and taking tensor products (see Appendix 3.1 for an explicit construction of the bivariate spaces) yields that the approximation estimate (4.3) remains valid in the (weighted) bivariate case. In contrast, when the error is measured in the norms $\|u\|_{H^k_{j=1}(I,x^{\mu/2})}$, $k = 0, 1, 2$, the approximation in the $x$ coordinate dominates the $y$ coordinate, see Remark 4.4.

Remark 4.4. If we consider $j = 1$ in Definition 4.1, we do not assume any additional integrability requirements on our solution up to first order derivatives. In this case we obtain (weaker) approximation estimates in which the order of approximation depends on the parameter $\beta$: the projection operator $P_L : V \to V_L$ in (3.10) satisfies, for $k = 0, 1$ and $l = 0, 1, 2$, $l \ge k$, the estimate

\[
\|u - P_L u\|_{H^k_1(I,x^{\mu/2})} \le C\, 2^{-(1-\beta)(l-k)L}\, \|u\|_{H^l_1(I,x^{\mu/2})}, \qquad u \in H^2_1(I, x^{\mu/2}). \tag{4.9}
\]

A proof of this remark is deferred to Section 1.

Proof of Proposition 4.3. Let $u \in V$ and consider $P_L u \in V_L = \mathcal{V}_L$. Then it is immediate from (3.9) and (3.10) that

\[
u - P_L(u) = \sum_{l=L+1}^{\infty} \sum_{j=1}^{2^l} u^l_j\, \psi_{l,j}.
\]

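It directly follows from the norm equivalence (3.11) and the scaling of the wavelet basis elements to unit norm in $L^2(I)$ that the derivatives satisfy

\[
\begin{aligned}
\|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2})} &= C \sum_{l=L+1}^{\infty} 2^{2l} \sum_{k=0}^{2^l} \omega^2(2^{-l}k)\, |u^l_k|^2 \\
&\ge C\, 2^{2L} \sum_{l=L+1}^{\infty} \sum_{k=0}^{2^l} \omega^2(2^{-l}k)\, |u^l_k|^2 \\
&= C\, 2^{2L}\, \|u - P_L u\|^2_{L^2(I,x^{\mu/2})}.
\end{aligned} \tag{4.10}
\]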

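Now, relabelling $1/C$ as $C$ and recalling that we have set $h = 2^{-2L}$, the following relation is obtained:

\[
\begin{aligned}
\|u - P_L u\|^2_{L^2(I,x^{\mu/2})} &\le C h\, \|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2})} \le C h\, \|u'\|^2_{L^2(I,x^{\mu/2})} \\
&\le C h \left( \|u\|^2_{L^2(I,x^{\mu/2})} + \|u'\|^2_{L^2(I,x^{\mu/2})} \right) = C h\, \|u\|^2_{H^1(I,x^{\mu/2})}.
\end{aligned} \tag{4.11}
\]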

Analogously, replacing $(u - P_L u)'$ by $(u - P_L u)''$ and $(u - P_L u)$ by $(u - P_L u)'$ in equations (4.10) and (4.11), one obtains

\[
\|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2})} \le C h\, \|(u - P_L u)''\|^2_{L^2(I,x^{\mu/2})} \le C h\, \|u''\|^2_{L^2(I,x^{\mu/2})}, \tag{4.12}
\]


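and adding up (4.11) and (4.12) yields:

\[
\begin{aligned}
\|u - P_L u\|^2_{H^1(I,x^{\mu/2})} &= \|u - P_L u\|^2_{L^2(I,x^{\mu/2})} + \|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2})} \\
&\le C h \sum_{k=0}^{2} \|u^{(k)}\|^2_{L^2(I,x^{\mu/2})} = C h\, \|u\|^2_{H^2(I,x^{\mu/2})}.
\end{aligned} \tag{4.13}
\]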

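Finally, concatenating (4.11) and (4.12) directly yields, with $h = 2^{-2L}$:

\[
\begin{aligned}
\|u - P_L u\|^2_{L^2(I,x^{\mu/2})} &\le C h\, \|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2})} \\
&\le C h^2\, \|(u - P_L u)''\|^2_{L^2(I,x^{\mu/2})} \\
&\le C h^2\, \|u''\|^2_{L^2(I,x^{\mu/2})} \\
&\le C h^2\, \|u\|^2_{H^2(I,x^{\mu/2})}.
\end{aligned} \tag{4.14}
\]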

2 Discretization error and convergence of the scheme

In this section we apply the estimates of Section 1 to derive estimates on the discretization error (cf. equation (4.1)) and to conclude the convergence of the proposed finite element approximation of the variational solution of the SABR pricing equations. For the estimates of the discretization error we follow the (unweighted) analysis of [143, Section 5]. The corresponding proofs carry over with minor modifications and are provided in Section 7, adapted to our setting and notation, for easy reference.

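Lemma 4.5. For $u \in C^1(J; H^2(G, a))$, the errors $\xi^m_L$ are the solutions of the θ-scheme: Given $\xi^0_L := P_L u_0 - u^0_L$, for $m = 0, \ldots, M-1$ find $\xi^{m+1}_L \in V_L$ such that for all $v_L \in V_L$:

\[
\tfrac{1}{k}\,(\xi^{m+1}_L - \xi^m_L, v_L)_{\mathcal{V}\times\mathcal{V}^*} + a(\theta \xi^{m+1}_L + (1-\theta)\xi^m_L, v_L) =: (r^m, v_L)_{\mathcal{V}\times\mathcal{V}^*}. \tag{4.15}
\]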

The proof of the above Lemma relies on the observation that the errors $\xi$ satisfy the same PDE as the solutions $u$; therefore the θ-schemes for $\xi$ and for $u$ are analogous.¹⁶ The following corollary is a direct consequence of Lemma 4.5 and of the stability of the θ-scheme, established in Proposition 3.4.

Corollary 4.6. There exist constants $C_1$ and $C_2$, independent of the discretization level $L$ and the time mesh $k$, such that the solutions of (4.15) satisfy the estimate

\[
\|\xi^M_L\|^2_{\mathcal{H}} + C_1 k \sum_{m=0}^{M-1} \|\xi^{m+\theta}_L\|^2_a \le \|\xi^0_L\|^2_{\mathcal{H}} + C_2 k \sum_{m=0}^{M-1} \|r^m\|^2_*, \tag{4.16}
\]

where for any $f \in V_L^*$ we set $\|f\|_* := \sup_{v_L \in V_L} \frac{(f, v_L)_{\mathcal{V}\times\mathcal{V}^*}}{\|v_L\|_a}$, cf. (3.24), and where $r^m$, $m = 0, \ldots, M-1$, denote the weak residuals defined in equation (4.15).

¹⁶ See Section 7 for details, where we also include for completeness a reminder of the proof of [143, Lemma 5.1], adapted to our notation, which carries over directly to the present situation.
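To make the time stepping behind (4.15) concrete, the following is a minimal sketch of a θ-scheme for a generic finite-dimensional system $M\dot u + Au = g$. It is an illustration under stated assumptions, not the implementation used in this chapter: the names `theta_scheme`, `M_mat`, `A_mat` and the dense solver are hypothetical choices, and the matrices stand in for the mass and stiffness matrices of some Galerkin basis.

```python
import numpy as np

def theta_scheme(M_mat, A_mat, g, u0, T, n_steps, theta):
    """March (1/k) M (u^{m+1} - u^m) + A (theta u^{m+1} + (1-theta) u^m) = g^{m+theta}."""
    k = T / n_steps
    u = np.asarray(u0, dtype=float)
    # Rearranged: (M + k*theta*A) u^{m+1} = (M - k*(1-theta)*A) u^m + k * g^{m+theta}
    lhs = M_mat + k * theta * A_mat
    rhs_mat = M_mat - k * (1.0 - theta) * A_mat
    for m in range(n_steps):
        t0, t1 = m * k, (m + 1) * k
        g_theta = theta * g(t1) + (1.0 - theta) * g(t0)
        u = np.linalg.solve(lhs, rhs_mat @ u + k * g_theta)
    return u

# Scalar sanity check: u' + u = 0, one Crank-Nicolson step of size k = 1.
one = np.array([[1.0]])
u_end = theta_scheme(one, one, lambda t: np.zeros(1), [1.0], T=1.0, n_steps=1, theta=0.5)
```

In the scalar check the update factor is $(1 - k/2)/(1 + k/2) = 1/3$, which is what a single step should reproduce.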


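The weak residual $r^m$, $m = 0, \ldots, M-1$, in (4.15) can be decomposed into the following parts

\[
(r^m, v_L)_{\mathcal{V}\times\mathcal{V}^*} := (r^m_1, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r^m_2, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r^m_3, v_L)_{\mathcal{V}\times\mathcal{V}^*},
\]

where the components $r^m_1$, $r^m_2$ and $r^m_3$, $m = 0, \ldots, M$, are defined as

\[
\begin{aligned}
(r^m_1, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= \big( \tfrac{1}{k}(u^{m+1} - u^m) - \dot u^{m+\theta},\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*}, \\
(r^m_2, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= \big( \tfrac{1}{k}(P_L u^{m+1} - P_L u^m) - \tfrac{1}{k}(u^{m+1} - u^m),\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*}, \\
(r^m_3, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= a(P_L u^{m+\theta} - u^{m+\theta},\, v_L).
\end{aligned} \tag{4.17}
\]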

Using this decomposition facilitates the following estimates for the residuals.

Lemma 4.7 (Norm estimates for the residuals). Consider the weak residuals $r^m$ of the θ-scheme (4.15) for $m = 0, \ldots, M-1$. Furthermore, assume that

\[
u \in C^1(J; H^2_j(G, x^{\mu/2})) \cap C^3(J; H^2_j(G, x^{\mu/2})), \qquad j = 0, 1,
\]

where $H^k_j(G, x^{\mu/2})$, $k = 0, 1, 2$, $j = 0, 1$, are the spaces in (5.13). Then there holds the estimate

\[
\|r^m\|_* \le C
\begin{cases}
k^{1/2} \left( \int_{t_m}^{t_{m+1}} \|\ddot u\|^2_*\, ds \right)^{1/2}, & \theta \in [0, 1], \\[4pt]
k^{3/2} \left( \int_{t_m}^{t_{m+1}} \|\dddot u\|^2_*\, ds \right)^{1/2}, & \theta = \tfrac{1}{2},
\end{cases}
\; + \; 2^{-L} \left( \frac{C}{k^{1/2}} \left( \int_{t_m}^{t_{m+1}} \|\dot u\|^2_{H^1_j(G,x^{\mu/2})}\, ds \right)^{1/2} + C\, \|u^{m+\theta}\|_{H^2_j(G,x^{\mu/2})} \right). \tag{4.18}
\]

A proof of this Lemma is provided in Section 7. With these preparations, we are in a position to prove the main result of this section:

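Theorem 4.8 (Convergence of the finite element approximation: SABR). Assume that

\[
u \in C^1(J; H^2_j(G, x^{\mu/2})) \cap C^3(J; H^2_j(G, x^{\mu/2})), \qquad j = 0, 1,
\]

where $H^k_j(G, x^{\mu/2})$, $k = 0, 1, 2$, $j = 0, 1$, are the spaces in (5.13), and assume further that the approximation $u_{(0,L)} \in V_L$ of the initial data is quasi-optimal, that is

\[
\|\xi^0_L\|^2_{\mathcal{H}} = \|u_0 - u_{(0,L)}\|^2_{\mathcal{H}} \le C\, 2^{-2L}\, \|u_0\|^2_{\mathcal{H}}. \tag{4.19}
\]

Let $u^m(z) = u(t_m, z)$, $z \in G$, for $t_m$, $m = 0, \ldots, M$, be as in (3.25), let $u^m_L$ denote the solution of the fully discrete scheme (3.26), and let the approximation space $V_L$ be as in Section 3. Then the following error bounds hold:

\[
\begin{aligned}
\|u^M - u^M_L\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|u^{m+\theta} - u^{m+\theta}_L\|^2_a
&\le C\, 2^{-j(1-\beta)2L} \max_{0 \le t \le T} \|u(t)\|^2_{H^2_j} + 2^{-j(1-\beta)2L} \int_0^T \|\dot u(s)\|^2_{H^1_j}\, ds \\
&\quad + C
\begin{cases}
k^2 \int_0^T \|\ddot u(s)\|^2_*\, ds, & 0 \le \theta \le 1, \\[4pt]
k^4 \int_0^T \|\dddot u(s)\|^2_*\, ds, & \theta = \tfrac{1}{2},
\end{cases}
\end{aligned}
\]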


for $j = 0, 1$, where $u^{m+\theta}_L = \theta u^{m+1}_L + (1-\theta) u^m_L$ and $u^{m+\theta} = \theta u^{m+1} + (1-\theta) u^m$.
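The time-discretization part of the bound scales like $k^2$ for general θ and like $k^4$ for $\theta = 1/2$ in the squared norm, that is, first and second order in the norm itself. The sketch below illustrates this on the scalar test problem $\dot u = -u$, $u(0) = 1$; this test problem and the helper names `theta_step_error` and `observed_order` are assumptions made for illustration and play no role in the proof.

```python
import math

def theta_step_error(k, theta, T=1.0):
    """Error at time T of the theta-scheme applied to u' = -u, u(0) = 1."""
    n = round(T / k)
    # One step multiplies u by (1 - (1-theta) k) / (1 + theta k).
    amp = (1.0 - (1.0 - theta) * k) / (1.0 + theta * k)
    return abs(amp ** n - math.exp(-T))

def observed_order(theta, k=1.0 / 64):
    """Empirical convergence order obtained by halving the step size."""
    return math.log2(theta_step_error(k, theta) / theta_step_error(k / 2, theta))

order_implicit_euler = observed_order(1.0)   # expected to be close to 1
order_crank_nicolson = observed_order(0.5)   # expected to be close to 2
```

The observed orders only reflect the temporal error of this toy problem; the spatial terms $2^{-j(1-\beta)2L}$ of the theorem are not modelled here.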

Remark 4.9 (Convergence of the finite element approximation: CEV). The same estimates hold for the CEV model when we replace in Theorem 4.8 the spaces $H^k_j(G, x^{\mu/2})$, $k = 0, 1, 2$, $j = 0, 1$, and the corresponding norms by the spaces $H^k_j(I, x^{\mu/2})$, $k = 0, 1, 2$, $j = 0, 1$, in (4.5) and the norms (4.6).

Proof of Theorem 4.8 and Remark 4.9. For the proofs we follow [96, Theorem 3.6.5] and [143, Theorem 5.4] with the appropriate adjustments. For brevity we prove both cases $H_{j=0}$ and $H_{j=1}$ together, and denote the generic order of convergence by $2^{-2Lc_\omega} := 2^{-2L\,j(1-\beta)}$ as in (4.4), where $c_\omega = 1$ for $H_{j=0}$ and $c_\omega = (1-\beta)$ for $H_{j=1}$. By Corollary 4.6,

\[
\|e^M_L\|^2_{\mathcal{H}} = \|u^M(x) - u^M_L(x)\|^2_{\mathcal{H}} = \|\eta^M + \xi^M_L\|^2_{\mathcal{H}}
\le 2 \Big( \|\eta^M\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|\eta^{m+\theta}\|^2_a + \|\xi^M_L\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|\xi^{m+\theta}_L\|^2_a \Big).
\]

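This yields

\[
\begin{aligned}
\|e^M_L\|^2_{\mathcal{H}} + C_1 k \sum_{m=0}^{M-1} \|e^{m+\theta}\|^2_a
&\le 2 \Big( \|\eta^M\|^2_{\mathcal{H}} + C_1 k \sum_{m=0}^{M-1} \|\eta^{m+\theta}\|^2_a + \|\xi^M_L\|^2_{\mathcal{H}} + C_1 k \sum_{m=0}^{M-1} \|\xi^{m+\theta}_L\|^2_a \Big) \\
&\le C \Big( \|\eta^M\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|\eta^{m+\theta}\|^2_a + \|\xi^0_L\|^2_{\mathcal{H}} + C_2 k \sum_{m=0}^{M-1} \|r^m\|^2_* \Big),
\end{aligned}
\]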

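and by Lemma 4.7 one can further estimate the last terms to obtain

\[
\begin{aligned}
& C \Big( \|\eta^M\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|\eta^{m+\theta}\|^2_a + \|\xi^0_L\|^2_{\mathcal{H}} + C_2 k \sum_{m=0}^{M-1} \|r^m\|^2_* \Big) \\
&\quad \le C \Bigg( \|\eta^M\|^2_{\mathcal{H}} + k \sum_{m=0}^{M-1} \|\eta^{m+\theta}\|^2_a + \|\xi^0_L\|^2_{\mathcal{H}} \\
&\qquad\quad + C_3 \sum_{m=0}^{M-1} (2^{-L})^{2c_\omega} \Bigg( \int_{t_m}^{t_{m+1}} \|\dot u\|^2_{H^1_j}\, ds + \|u^{m+\theta}\|^2_{H^2_j} +
\begin{cases}
k^2 \int_{t_m}^{t_{m+1}} \|\ddot u\|^2_*\, ds, & \theta \in [0, 1], \\[4pt]
k^4 \int_{t_m}^{t_{m+1}} \|\dddot u\|^2_*\, ds, & \theta = \tfrac{1}{2}
\end{cases}
\Bigg) \Bigg) \\
&\quad \le C \|\xi^0_L\|^2_{\mathcal{H}} + C (2^{-L})^{2c_\omega} \int_0^T \|\dot u\|^2_{\mathcal{V}}\, ds + C \max_{0 \le t \le T} (2^{-L})^{2c_\omega} \|u(t)\|^2_{H^2_j}
+ C
\begin{cases}
k^2 \int_0^T \|\ddot u\|^2_*\, ds, & \theta \in [0, 1], \\[4pt]
k^4 \int_0^T \|\dddot u\|^2_*\, ds, & \theta = \tfrac{1}{2},
\end{cases}
\end{aligned}
\]

where the last step follows from $\|\cdot\|_a \approx \|\cdot\|_{\mathcal{V}} \le C_G \|\cdot\|_{H^1_j(G,x^{\mu/2})}$ for some $C_G > 0$ (cf. Remark 5.8) and the approximation estimates (4.8) resp. (4.9). Finally, quasi-optimality (4.19) of the initial data yields the statement of the Theorem.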


5 Reminder on some properties of the considered function spaces

1 Weighted Sobolev spaces

For the sake of completeness, we include a reminder on the weighted Sobolev spaces considered in this article and some of their relevant properties, and refer the reader to the monographs [112] and [37] for full details. By a weight, we shall mean a locally integrable function $\omega$ on $\mathbb{R}^2$ such that $\omega(x) > 0$ a.e. Every weight $\omega$ gives rise to a measure on the measurable subsets of $\mathbb{R}^2$ via integration. This measure will also be denoted by $\omega$. Thus for measurable sets $E \subset \mathbb{R}^2$, we set $\omega(E) = \int_E \omega(x)\, dx$.

Definition 5.1 (Weighted $L^2$-space). Let $\omega$ be a weight and let $G \subset \mathbb{R}^2$ be open. We define $L^2(G,\omega)$ as the set of measurable functions $u$ on $G$ such that

\[
\|u\|^2_{L^2(G,\omega)} = \int_G |u(x)|^2\, \omega(x)\, dx < \infty. \tag{5.1}
\]

Definition 5.2 (Weighted Sobolev space). Let $k \in \mathbb{N}$ and let $\omega = \{\omega_a = \omega_a(x),\ x \in G,\ |a| \le k\}$ be a given family of weight functions on an open set $G \subset \mathbb{R}^2$. We denote by $W^k(G,\omega)$ the set of all functions $u \in L^2(G,\omega)$ for which the weak derivatives $D^a u$, with $|a| \le k$, belong to $L^2(G,\omega_a)$. The weighted Sobolev space $W^k(G,\omega)$ is a normed linear space if equipped with the norm

\[
\|u\|^2_{W^k(G,\omega)} = \sum_{|a| \le k} \int_G |D^a u(x)|^2\, \omega_a(x)\, dx. \tag{5.2}
\]

Remark 5.3. If $\omega_a \in L^1_{loc}(G)$ then $C^\infty_0(G)$ is a subset of $W^k(G,\omega)$, and we can introduce the space $W^k_0(G,\omega)$ as the closure of $C^\infty_0(G)$ with respect to the norm of $W^k(G,\omega)$, see also [37, 113].

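The class of $A_p$ weights was introduced by B. Muckenhoupt (cf. [136]). A weight $\omega$ is in $A_p$ if there exists a positive constant $C$ such that for every ball $B \subset \mathbb{R}^2$

\[
\left( \frac{1}{|B|} \int_B \omega\, dx \right) \left( \frac{1}{|B|} \int_B \omega^{-1/(p-1)}\, dx \right)^{p-1} \le C. \tag{5.3}
\]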

Lemma 5.4. The weight $\omega(x) := |x|^\alpha$, $x \in \mathbb{R}^2$, is in $A_p$ if and only if $-2 < \alpha < 2(p-1)$. Moreover, $\omega(x) := e^{\lambda\varphi(x)}$ is in $A_2$ for $\varphi \in W^1(G)$ and $\lambda$ sufficiently small.

Proof. See Corollary 4.4 in [168] and Corollary 2.18 in [68].
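As a worked illustration of condition (5.3), consider a power weight $\omega(x) = |x|^\alpha$ on $\mathbb{R}^2$ (the exponent symbol $\alpha$ is our notation here) and, as a simplifying assumption, only balls $B = B(0,R)$ centred at the origin. This does not replace the cited proofs, which treat arbitrary balls; it merely shows where the two-sided restriction on the exponent comes from.

```latex
% Polar coordinates on B = B(0,R) \subset \mathbb{R}^2, |B| = \pi R^2.
\[
\frac{1}{|B|}\int_B |x|^{\alpha}\,dx
  = \frac{2\pi}{\pi R^2}\int_0^R r^{\alpha+1}\,dr
  = \frac{2R^{\alpha}}{\alpha+2}
  \qquad (\alpha > -2),
\]
\[
\left(\frac{1}{|B|}\int_B |x|^{-\alpha/(p-1)}\,dx\right)^{p-1}
  = \left(\frac{2}{2-\alpha/(p-1)}\right)^{p-1} R^{-\alpha}
  \qquad (\alpha < 2(p-1)),
\]
% The powers of R cancel in the product, so the bound is uniform in R:
\[
\left(\frac{1}{|B|}\int_B \omega\,dx\right)
\left(\frac{1}{|B|}\int_B \omega^{-1/(p-1)}\,dx\right)^{p-1}
  = \frac{2}{\alpha+2}\left(\frac{2}{2-\alpha/(p-1)}\right)^{p-1}.
\]
```

Both integrals are finite exactly when $-2 < \alpha < 2(p-1)$, which matches the range of admissible exponents for power weights in two dimensions.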

Lemma 5.5. If $\omega \in A_p$ then, since $\omega^{-1/(p-1)}$ is locally integrable, we have $L^p(G,\omega) \subset L^1_{loc}(G)$ for every open set $G \subset \mathbb{R}^n$. It thus makes sense to talk about weak derivatives of functions in $L^p(G,\omega)$.


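The weighted Sobolev space $W^{k,p}(G,\omega)$ is the set of functions $u \in L^p(G,\omega)$ with weak derivatives $D^\alpha u \in L^p(G,\omega)$, $|\alpha| \le k$. The norm of $u$ in $W^{k,p}(G,\omega)$ is given by

\[
\|u\|^p_{W^{k,p}(G,\omega)} = \sum_{|\alpha| \le k} \int_G |D^\alpha u(x)|^p\, \omega(x)\, dx.
\]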

Lemma 5.6. The following statements hold for Muckenhoupt weights:

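(i) If $\omega \in A_p$ then $C^\infty(G)$ is dense in $W^{k,p}(G,\omega)$.

(ii) If $\omega \in A_p$ then we have the weighted Poincaré inequality, that is: Let $1 < p < \infty$ and $\omega \in A_p$. Then there are positive constants $C$ and $\delta$ such that for all Lipschitz continuous functions $\varphi$ defined on a ball $B = B(x_0, R)$, and for all $1 \le \theta \le n/(n-1) + \delta$,

\[
\left( \frac{1}{\omega(B)} \int_B |\varphi(x) - \varphi_B|^{\theta p}\, \omega(x)\, dx \right)^{\frac{1}{\theta p}}
\le C R \left( \frac{1}{\omega(B)} \int_B |\nabla\varphi(x)|^p\, \omega(x)\, dx \right)^{\frac{1}{p}},
\]

where $\varphi_B := \frac{1}{\omega(B)} \int_B \varphi\, \omega\, dx$.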

Proof. See [169, Corollary 2.1.6 ] and [62, Theorem 1.5].

2 Bochner spaces

Let $H$ denote any Hilbert space with norm $\|\cdot\|_H$. For a finite $T > 0$ consider $J := (0, T)$. Then the Bochner space $L^2(J; H)$ is defined as

\[
L^2(J; H) := \{\, u : J \to H \text{ measurable},\ \|u\|_{L^2(J;H)} < \infty \,\},
\]

with norm defined by

\[
\|u\|_{L^2(J;H)} := \left( \int_J \|u(t)\|^2_H\, dt \right)^{1/2}, \qquad u \in L^2(J; H). \tag{5.4}
\]

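We call $\dot u$ the weak time-derivative of a function $u \in L^2(J; H)$ if

\[
\int_J \dot u(t)\, \varphi(t)\, dt = - \int_J u(t)\, \dot\varphi(t)\, dt, \qquad \forall \varphi \in C^1_0(J). \tag{5.5}
\]

With the above definitions at hand, we define for $J = (0, T)$, $T > 0$, the Bochner space

\[
H^1(J; H) := \{\, u \in L^2(J; H) : \dot u \in L^2(J; H) \,\},
\]

equipped with the norm

\[
\|u\|_{H^1(J;H)} := \left( \int_J \|u\|^2_H + \|\dot u\|^2_H\, dt \right)^{1/2}. \tag{5.6}
\]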

Furthermore, for any $n \in \mathbb{N}_0$, we define

\[
C^n(J; H) := \{\, u : J \to H,\ u \in C^n \text{ with respect to } t \,\}. \tag{5.7}
\]

See [96, Sections 2.1 and 3.1].


3 Tensor products of Hilbert spaces

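Let $I := (0, 1)$ denote the unit interval and $G := I \times I$. As shown in [96, Section 13.1], the Hilbert spaces $H^k(G)$, $k = 0, 1, 2$, can be constructed from $H^k(I)$ via the tensor product structure:

\[
L^2(G) \cong L^2(I) \otimes L^2(I), \tag{5.8}
\]
\[
H^1(G) \cong \big( H^1(I) \otimes L^2(I) \big) \cap \big( L^2(I) \otimes H^1(I) \big), \tag{5.9}
\]
\[
H^2(G) \cong \big( H^2(I) \otimes L^2(I) \big) \cap \big( H^1(I) \otimes H^1(I) \big) \cap \big( L^2(I) \otimes H^2(I) \big). \tag{5.10}
\]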

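Recall from [148, Chapter II.4] that the inner products on each of the tensor Hilbert spaces are defined as

\[
\langle u_1 \otimes u_2, v_1 \otimes v_2 \rangle_{H_1 \otimes H_2} := \langle u_1, v_1 \rangle_{H_1} \langle u_2, v_2 \rangle_{H_2}, \tag{5.11}
\]

for $u_1, v_1 \in H_1$ and $u_2, v_2 \in H_2$, where $H_1$, $H_2$ stand for generic Hilbert spaces (say, any of the factors of the tensor products involved in (5.8), (5.9) or (5.10) above). The inner products on the intersection spaces $H^k(G)$, $k = 0, 1, 2$, in (5.8), (5.9) and (5.10) induced by this construction yield norms equivalent to the usual norms on these spaces. Explicitly, for any $u \equiv u_x \otimes u_y$ and $v \equiv v_x \otimes v_y \in H^k(G)$ they are given by

\[
\begin{aligned}
\langle u, v \rangle_{L^2(G)} &= \langle u_x, v_x \rangle_{L^2(I)} \langle u_y, v_y \rangle_{L^2(I)}, \\
\langle u, v \rangle_{H^1(G)} &\approx \langle u_x, v_x \rangle_{H^1(I)} \langle u_y, v_y \rangle_{L^2(I)} + \langle u_x, v_x \rangle_{L^2(I)} \langle u_y, v_y \rangle_{H^1(I)}, \\
\langle u, v \rangle_{H^2(G)} &\approx \langle u_x, v_x \rangle_{H^2(I)} \langle u_y, v_y \rangle_{L^2(I)} + \langle u_x, v_x \rangle_{H^1(I)} \langle u_y, v_y \rangle_{H^1(I)} + \langle u_x, v_x \rangle_{L^2(I)} \langle u_y, v_y \rangle_{H^2(I)}.
\end{aligned}
\]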
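The product rule (5.11) for elementary tensors can be checked in a finite-dimensional toy setting. In the sketch below, vectors in $\mathbb{R}^n$ stand in for elements of the Hilbert spaces and the Frobenius inner product of outer products plays the role of the tensor inner product; the function name `tensor_inner` and the random test data are assumptions made for illustration only.

```python
import numpy as np

def tensor_inner(u1, u2, v1, v2):
    """Frobenius inner product of the rank-one arrays u1 (x) u2 and v1 (x) v2."""
    return float(np.sum(np.outer(u1, u2) * np.outer(v1, v2)))

rng = np.random.default_rng(0)
u1, v1 = rng.standard_normal(5), rng.standard_normal(5)
u2, v2 = rng.standard_normal(7), rng.standard_normal(7)

lhs = tensor_inner(u1, u2, v1, v2)          # <u1 (x) u2, v1 (x) v2>
rhs = float(np.dot(u1, v1) * np.dot(u2, v2))  # <u1, v1> <u2, v2>
```

Both quantities agree up to rounding, which is the discrete counterpart of (5.11).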

Furthermore, to justify $u \equiv u_x \otimes u_y$ and $v \equiv v_x \otimes v_y \in L^2(G)$ for $u_x, v_x \in L^2(I)$ and $u_y, v_y \in L^2(I)$ we recall the following [148, Theorem II.10 (c), Chapter II.4]:

Theorem 5.7. Let $(M_1, \mu_1)$ and $(M_2, \mu_2)$ be measure spaces such that $L^2(M_1, \mu_1)$ and $L^2(M_2, \mu_2)$ are separable. Then there is a unique isomorphism such that

\[
L^2(M_1 \times M_2, d\mu_1 \otimes d\mu_2) \longrightarrow L^2(M_1, d\mu_1; L^2(M_2, d\mu_2)), \qquad f(x, y) \longmapsto (x \mapsto f(x, \cdot)). \tag{5.12}
\]

3.1 Explicit construction of the bivariate spaces in Section 1.2

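Consider our domain of interest $G = (0, R_x) \times (-R_y, R_y)$, $R_x, R_y > 0$, and the weighted Sobolev spaces

\[
H^k_j(G, x^{\mu/2}) := \{\, u : G \to \mathbb{R} \text{ measurable} : \partial^{|a|}_a u \in L^2(G, x^{\mu/2 + a_x \beta j}),\ |a| \le k \,\}, \qquad k = 0, 1, 2, \tag{5.13}
\]

for $j = 0, 1$, and a multiindex $a$ with $|a| = a_x + a_y$, where $a_x$ denotes the number of derivatives in direction $x$ and $a_y$ in direction $y$. The respective norms in $H^k_j(G, x^{\mu/2})$ for $j = 0$ are defined by

\[
\begin{aligned}
\|u\|^2_{H^0_{j=0}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2}, \\
\|u\|^2_{H^1_{j=0}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2} + \|x^{\mu/2} \partial_y u\|^2_{L^2} + \|x^{\mu/2} \partial_x u\|^2_{L^2}, \\
\|u\|^2_{H^2_{j=0}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2} + \|x^{\mu/2} \partial_y u\|^2_{L^2} + \|x^{\mu/2} \partial_x u\|^2_{L^2} \\
&\quad + \|x^{\mu/2} \partial_{yy} u\|^2_{L^2} + \|x^{\mu/2} \partial_{xy} u\|^2_{L^2} + \|x^{\mu/2} \partial_{xx} u\|^2_{L^2},
\end{aligned} \tag{5.14}
\]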


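and for $j = 1$ by

\[
\begin{aligned}
\|u\|^2_{H^0_{j=1}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2}, \\
\|u\|^2_{H^1_{j=1}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2} + \|x^{\mu/2} \partial_y u\|^2_{L^2} + \|x^{\beta+\mu/2} \partial_x u\|^2_{L^2}, \\
\|u\|^2_{H^2_{j=1}(G,x^{\mu/2})} &:= \|x^{\mu/2} u\|^2_{L^2} + \|x^{\mu/2} \partial_y u\|^2_{L^2} + \|x^{\beta+\mu/2} \partial_x u\|^2_{L^2} \\
&\quad + \|x^{\mu/2} \partial_{yy} u\|^2_{L^2} + \|x^{\beta+\mu/2} \partial_{xy} u\|^2_{L^2} + \|x^{2\beta+\mu/2} \partial_{xx} u\|^2_{L^2}.
\end{aligned} \tag{5.15}
\]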

Remark 5.8. Note that, similarly to Remark 4.2 in the univariate case, for the spaces $\mathcal{H}$ and $\mathcal{V}$ in Definitions 2.10 and 2.12 and for the spaces $H^k_j(G)$, $k = 0, 1, 2$, $j = 0, 1$, with norms as in (5.14) and (5.15), it holds that $\mathcal{H} = H^0_{j=0}(G, x^{\mu/2}) = H^0_{j=1}(G, x^{\mu/2})$, and the weighted space $\mathcal{V}$ satisfies $\mathcal{V} = H^1_{j=1}(G, x^{\mu/2})$ and $\mathcal{V} \supset H^1_{j=0}(G, x^{\mu/2})$, and there holds the estimate

\[
\|v\|^2_{\mathcal{V}} \le C_G\, \|v\|^2_{H^1_{j=0}(G,x^{\mu/2})}, \qquad v \in \mathcal{V}, \tag{5.16}
\]

on any bounded domain $G$, where the constant $C_G > 0$ depends only on $G$.

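Lemma 5.9. The spaces in (5.13), $k = 0, 1, 2$, $j = 0, 1$, can be constructed as tensor products of the spaces (4.5) and the usual (unweighted) Sobolev spaces $H^k(I)$, $k = 0, 1, 2$, via (5.8), (5.9) and (5.10) as follows:

\[
\begin{aligned}
H^0_j(G, x^{\mu/2}) &\cong H^0_j(I, x^{\mu/2}) \otimes H^0(I), \\
H^1_j(G, x^{\mu/2}) &\cong \big( H^1_j(I, x^{\mu/2}) \otimes H^0(I) \big) \cap \big( H^0_j(I, x^{\mu/2}) \otimes H^1(I) \big), \\
H^2_j(G, x^{\mu/2}) &\cong \big( H^2_j(I, x^{\mu/2}) \otimes H^0(I) \big) \cap \big( H^1_j(I, x^{\mu/2}) \otimes H^1(I) \big) \cap \big( H^0_j(I, x^{\mu/2}) \otimes H^2(I) \big).
\end{aligned}
\]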

6 Non-symmetric Dirichlet forms

Definition 6.1 (Symmetric closed form). A pair $(\mathcal{E}, D(\mathcal{E}))$ is called a symmetric closed form on the Hilbert space $(H, (\cdot,\cdot)_H)$ if $D(\mathcal{E})$ is a dense linear subspace of $H$ and $\mathcal{E} : D(\mathcal{E}) \times D(\mathcal{E}) \to \mathbb{R}$ is a non-negative definite symmetric bilinear form which is closed on $H$. That is, $D(\mathcal{E})$ is a complete metric space with respect to the norm $\mathcal{E}_1(\cdot,\cdot)^{1/2} := (\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_H)^{1/2}$.

Definition 6.2 (Coercive closed form). A pair $(\mathcal{E}, D(\mathcal{E}))$ is called a coercive closed form on the Hilbert space $H$ if $D(\mathcal{E})$ is a dense linear subspace of $H$, and $\mathcal{E} : D(\mathcal{E}) \times D(\mathcal{E}) \to \mathbb{R}$ is a bilinear form such that the following two conditions hold:

• Its symmetric part $\tilde{\mathcal{E}}(u, v) := \tfrac{1}{2}\big(\mathcal{E}(u, v) + \mathcal{E}(v, u)\big)$ is a symmetric closed form on $H$.

• The pair $(\mathcal{E}, D(\mathcal{E}))$ satisfies the so-called weak sector condition: there exists a continuity constant $K > 0$ such that

\[
|\mathcal{E}_1(u, v)| \le K\, \mathcal{E}_1(u, u)^{1/2}\, \mathcal{E}_1(v, v)^{1/2} \quad \text{for all } u, v \in D(\mathcal{E}). \tag{6.1}
\]


Remark 6.3. Recall that the continuity property 2.5 (also considered in Section 3),

\[
|\mathcal{E}(u, v)| \le K\, \mathcal{E}(u, u)^{1/2}\, \mathcal{E}(v, v)^{1/2} \quad \text{for all } u, v \in D(\mathcal{E}),
\]

implies the weak sector condition (6.1) above.

Definition 6.4 (Dirichlet form). Consider a Hilbert space $(H, (\cdot,\cdot)_H)$ of the form $H = L^2(E, m)$, where $(E, m)$ is a measure space. A coercive closed form $(\mathcal{E}, D(\mathcal{E}))$ on $L^2(E, m)$ is called a (non-symmetric) Dirichlet form if for all $u \in D(\mathcal{E})$ one has the contraction properties

\[
u^+ \wedge 1 \in D(\mathcal{E}) \quad \text{and} \quad \mathcal{E}(u + u^+ \wedge 1,\, u - u^+ \wedge 1) \ge 0 \quad \text{and} \quad \mathcal{E}(u - u^+ \wedge 1,\, u + u^+ \wedge 1) \ge 0, \tag{6.2}
\]

where for any $u, v : E \to \mathbb{R}$ we have set

\[
u \wedge v := \inf(u, v), \quad u \vee v := \sup(u, v), \quad u^+ := u \vee 0, \quad u^- := -(u \wedge 0). \tag{6.3}
\]

A coercive closed form satisfying one of the two inequalities in (6.2) is called a $\tfrac{1}{2}$-Dirichlet form.

If $(\mathcal{E}, D(\mathcal{E}))$ is in addition symmetric, that is, $\mathcal{E} = \tilde{\mathcal{E}}$, where $\tilde{\mathcal{E}}$ denotes the symmetric part of $\mathcal{E}$ (recall $\tilde{\mathcal{E}}(u, v) := \tfrac{1}{2}(\mathcal{E}(u, v) + \mathcal{E}(v, u))$), then $(\mathcal{E}, D(\mathcal{E}))$ is called a symmetric Dirichlet form. In the latter case, the contraction property in condition (6.2) reduces to

\[
\mathcal{E}(u^+ \wedge 1,\, u^+ \wedge 1) \le \mathcal{E}(u, u). \tag{6.4}
\]

See [123, Section 4, Def. 4.5].
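The symmetric contraction property (6.4) can be illustrated in a simple discrete setting. The sketch below uses, as an assumed stand-in for a symmetric Dirichlet form, the discrete energy $\mathcal{E}(u,u) = \sum_i (u_{i+1} - u_i)^2$ of a vector $u$; the names `unit_contraction` and `discrete_energy` and the random data are hypothetical and serve only to make the notation $u^+ \wedge 1$ of (6.3) tangible.

```python
import numpy as np

def unit_contraction(u):
    """u^+ ∧ 1 in the notation of (6.3): clip values to the interval [0, 1]."""
    return np.minimum(np.maximum(u, 0.0), 1.0)

def discrete_energy(u):
    """Toy symmetric form E(u, u) = sum of squared increments of u."""
    return float(np.sum(np.diff(u) ** 2))

rng = np.random.default_rng(1)
u = 3.0 * rng.standard_normal(50)
energy_before = discrete_energy(u)
energy_after = discrete_energy(unit_contraction(u))
```

Since clipping to $[0,1]$ never increases the distance between two values, each squared increment can only shrink, so `energy_after` does not exceed `energy_before`.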

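Theorem 6.5. Let $(\mathcal{E}, D(\mathcal{E}))$ be a coercive closed form on a Hilbert space $(H, (\cdot,\cdot)_H)$ with continuity constant $K > 0$. Define the domain

\[
D(A) := \{\, u \in D(\mathcal{E}) \mid v \mapsto \mathcal{E}(u, v) \text{ is continuous w.r.t. } (\cdot,\cdot)^{1/2}_H \text{ on } D(\mathcal{E}) \,\}. \tag{6.5}
\]

For any $u \in D(A)$, let $Au$ denote the unique element in $H$ such that

\[
(-Au, v) = \mathcal{E}(u, v) \quad \text{for all } v \in D(\mathcal{E}). \tag{6.6}
\]

Then $A$ is the generator of the unique strongly continuous contraction resolvent $(G_\alpha)_{\alpha>0}$ on $H$ which satisfies

\[
G_\alpha(H) \subset D(\mathcal{E}) \quad \text{and} \quad \mathcal{E}(G_\alpha f, u) + \alpha (G_\alpha f, u)_H = (f, u)_H \quad \text{for all } f \in H,\ u \in D(\mathcal{E}),\ \alpha > 0. \tag{6.7}
\]

Furthermore, since $(\mathcal{E}, D(\mathcal{E}))$ is a coercive closed form on $(H, (\cdot,\cdot)_H)$, there exists a further unique strongly continuous contraction resolvent $(\hat G_\alpha)_{\alpha>0}$ on $H$, which satisfies

\[
\hat G_\alpha(H) \subset D(\mathcal{E}) \quad \text{and} \quad (f, u)_H = \mathcal{E}(u, \hat G_\alpha f) + \alpha (u, \hat G_\alpha f)_H \quad \text{for all } f \in H,\ u \in D(\mathcal{E}),\ \alpha > 0. \tag{6.8}
\]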


In particular, $\hat G_\alpha$ is the adjoint of $G_\alpha$ for all $\alpha > 0$. That is,

\[
(G_\alpha f, g)_H = (f, \hat G_\alpha g)_H \quad \text{for all } f, g \in H, \tag{6.9}
\]

and similarly, for the (unique) strongly continuous contraction semigroups $(P_t)_{t \ge 0}$, $(\hat P_t)_{t \ge 0}$ corresponding to $(G_\alpha)_{\alpha>0}$ and $(\hat G_\alpha)_{\alpha>0}$ respectively, it holds that

\[
(P_t f, g)_H = (f, \hat P_t g)_H \quad \text{for all } f, g \in H,\ t \ge 0. \tag{6.10}
\]

Proof. See [123, Theorem 2.8, Corollary 2.10 and Proposition 2.16].

Definition 6.6 (Contraction Properties). Let $(H, (\cdot,\cdot)_H)$ be a Hilbert space where $H = L^2(E, m)$, and where $(E, m)$ is a measure space. For any $f, g \in L^2(E, m)$ we write $f \le g$ or $f < g$ for any $m$-classes $f, g$ of functions on $E$, if the respective inequality holds $m$-a.e. for corresponding representatives.

(i) Let $G$ be a bounded linear operator on $L^2(E, m)$ with domain $D(G) = L^2(E, m)$. Then $G$ is called sub-Markovian if for all $f \in L^2(E, m)$ the condition $0 \le f \le 1$ implies $0 \le Gf \le 1$.

(ii) A strongly continuous contraction semigroup $(P_t)_{t \ge 0}$ resp. resolvent $(G_\alpha)_{\alpha>0}$ is called sub-Markovian if all $P_t$, $t \ge 0$, resp. $\alpha G_\alpha$, $\alpha > 0$, are sub-Markovian.

(iii) A closed, densely defined operator $A$ on $(L^2(E, m), (\cdot,\cdot)_H)$ is called a Dirichlet operator if $(Au, (u - 1)^+)_H \le 0$ for all $u \in D(A)$.

See [123, Section 4, Def. 4.1].

Theorem 6.7. Consider a Hilbert space $(H, (\cdot,\cdot)_H)$ of the form $H = L^2(E, m)$, where $(E, m)$ is a measure space. Let $(G_\alpha)_{\alpha>0}$ be a strongly continuous contraction resolvent on $(L^2(E, m), (\cdot,\cdot)_H)$ with corresponding generator $A$ and semigroup $(P_t)_{t \ge 0}$. Furthermore, let $(\mathcal{E}, D(\mathcal{E}))$ be a coercive closed form on $L^2(E, m)$ with continuity constant $K > 0$ and corresponding resolvent $(G_\alpha)_{\alpha>0}$. Then the following are equivalent:

(i) $(P_t)_{t \ge 0}$ is sub-Markovian.

(ii) $A$ is a Dirichlet operator.

(iii) $(G_\alpha)_{\alpha>0}$ is sub-Markovian.

(iv) For all $u \in D(\mathcal{E})$, $u^+ \wedge 1 \in D(\mathcal{E})$ and $\mathcal{E}(u + u^+ \wedge 1,\, u - u^+ \wedge 1) \ge 0$, that is, $(\mathcal{E}, D(\mathcal{E}))$ is a $\tfrac{1}{2}$-Dirichlet form.

If in the above statements the operators $(G_\alpha)_{\alpha>0}$ (resp. $(P_t)_{t \ge 0}$ and $A$) are replaced by their adjoints $(\hat G_\alpha)_{\alpha>0}$ (resp. $(\hat P_t)_{t \ge 0}$ and $\hat A$), then the analogous equivalences hold, where in (iv) the entries of $\mathcal{E}$ are interchanged. Hence, if (iii) (resp. (ii) or (i)) holds both for $(G_\alpha)_{\alpha>0}$ (resp. $A$ or $(P_t)_{t \ge 0}$) and its adjoint $(\hat G_\alpha)_{\alpha>0}$ (resp. $\hat A$ or $(\hat P_t)_{t \ge 0}$), then the coercive closed form $(\mathcal{E}, D(\mathcal{E}))$ is a (non-symmetric) Dirichlet form.

Proof. See [123, Section 4, Prop. 4.3 and Theorem 4.4].


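7 Further Proofs

Proof of Proposition 3.4. We recall and slightly modify the proof given in [143, Proposition 4.1] to accommodate the situation at hand. Set

\[
X_m := \|u^m_L\|^2_{\mathcal{H}} - \|u^{m+1}_L\|^2_{\mathcal{H}} + C_2 k \|g^{m+\theta}\|^2_* - C_1 k \|u^{m+\theta}_L\|^2_a. \tag{7.1}
\]

It suffices to show $X_m \ge 0$ for $m = 0, \ldots, M-1$; then $\sum_{m=0}^{M-1} X_m \ge 0$ readily yields (3.30). In the remainder of the proof we use the following shorthand notation:

\[
\begin{aligned}
\vartheta &:= u^{m+1}_L - u^m_L, \\
u^{m+\theta}_L &:= \theta u^{m+1}_L + (1-\theta) u^m_L = \tfrac{1}{2}(u^{m+1}_L + u^m_L) + (\theta - \tfrac{1}{2})\vartheta, \\
g^{m+\theta} &:= \theta g(t_{m+1}) + (1-\theta) g(t_m).
\end{aligned} \tag{7.2}
\]

It holds that

\[
\|u^{m+1}_L\|^2_{\mathcal{H}} - \|u^m_L\|^2_{\mathcal{H}} = (u^{m+1}_L - u^m_L,\, u^{m+1}_L + u^m_L)_{\mathcal{H}} = 2(\vartheta, u^{m+\theta}_L)_{\mathcal{H}} - (2\theta - 1)(\vartheta, \vartheta)_{\mathcal{H}}. \tag{7.3}
\]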

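On the other hand, by the definition of the θ-scheme (3.26),

\[
2(\vartheta, u^{m+\theta}_L) = 2k \big( -\|u^{m+\theta}_L\|^2_a + (g^{m+\theta}, u^{m+\theta}_L) \big) \le 2k \big( -\|u^{m+\theta}_L\|^2_a + \|g^{m+\theta}\|_* \|u^{m+\theta}_L\|_a \big). \tag{7.4}
\]

Hence, with (7.3) and (7.4),

\[
\|u^m_L\|^2_{\mathcal{H}} - \|u^{m+1}_L\|^2_{\mathcal{H}} \ge (2\theta - 1)\|\vartheta\|^2_{\mathcal{H}} + 2k \big( \|u^{m+\theta}_L\|^2_a - \|g^{m+\theta}\|_* \|u^{m+\theta}_L\|_a \big), \tag{7.5}
\]

and adding $C_2 k \|g^{m+\theta}\|^2_* - C_1 k \|u^{m+\theta}_L\|^2_a$ on both sides yields, by the definition (7.1) of $X_m$,

\[
X_m \ge (2\theta - 1)\|\vartheta\|^2_{\mathcal{H}} + 2k \big( \|u^{m+\theta}_L\|^2_a - \|g^{m+\theta}\|_* \|u^{m+\theta}_L\|_a \big) + C_2 k \|g^{m+\theta}\|^2_* - C_1 k \|u^{m+\theta}_L\|^2_a. \tag{7.6}
\]

In case $\tfrac{1}{2} \le \theta \le 1$, the right hand side is non-negative whenever

\[
0 < C_1 < 2, \qquad C_2 \ge \frac{1}{2 - C_1}. \tag{7.7}
\]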

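In case $0 \le \theta \le \tfrac{1}{2}$, once again from the θ-scheme (3.26),

\[
(\vartheta, v_L)_{\mathcal{H}} = k(-A u^{m+\theta}_L + g^{m+\theta}, v_L)_{\mathcal{H}},
\]

which now yields, by the definition of the dual norm (3.24) and of $\lambda_A$, the estimate

\[
\|\vartheta\|_{\mathcal{H}} \le \lambda_A^{1/2} \|\vartheta\|_* \le \lambda_A^{1/2} k \big( \|A u^{m+\theta}_L\|_* + \|g^{m+\theta}\|_* \big), \tag{7.8}
\]

where the first term on the right hand side satisfies $\|A u^{m+\theta}_L\|_* = \|u^{m+\theta}_L\|_a$. This equality holds by the estimates $(A u^{m+\theta}_L, v_L)_{\mathcal{H}} = a(u^{m+\theta}_L, v_L) \le \|u^{m+\theta}_L\|_a \|v_L\|_a$ and $\|A u^{m+\theta}_L\|_* \ge \|u^{m+\theta}_L\|_a$, where the latter inequality is obtained by setting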


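$v_L := u^{m+\theta}_L$, and the former inequality by polarisation.¹⁷ Hence, from (7.8) we conclude

\[
\|\vartheta\|^2_{\mathcal{H}} \le \lambda_A k^2 \big( \|u^{m+\theta}_L\|_a + \|g^{m+\theta}\|_* \big)^2,
\]

and therefore we have proved $X_m \ge 0$, $m = 0, \ldots, M-1$, since the above together with (7.6) yields

\[
X_m \ge k(2 - C_1 - \sigma)\|u^{m+\theta}_L\|^2_a - 2k(1 + \sigma)\|g^{m+\theta}\|_* \|u^{m+\theta}_L\|_a + k(C_2 - \sigma)\|g^{m+\theta}\|^2_*.
\]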

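Proof of Lemma 4.5. The weak residual in (4.15) can be decomposed into

\[
(r^m, v_L)_{\mathcal{V}\times\mathcal{V}^*} := (r^m_1, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r^m_2, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r^m_3, v_L)_{\mathcal{V}\times\mathcal{V}^*},
\]

where the components $r^m_1$, $r^m_2$ and $r^m_3$, $m = 0, \ldots, M$, are

\[
\begin{aligned}
(r^m_1, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= \big( \tfrac{1}{k}(u^{m+1} - u^m) - \dot u^{m+\theta},\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*}, \\
(r^m_2, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= \big( \tfrac{1}{k}(P_L u^{m+1} - P_L u^m) - \tfrac{1}{k}(u^{m+1} - u^m),\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*}, \\
(r^m_3, v_L)_{\mathcal{V}\times\mathcal{V}^*} &:= a(P_L u^{m+\theta} - u^{m+\theta},\, v_L).
\end{aligned} \tag{7.9}
\]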

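The statement of the Lemma is then essentially a consequence of the fact that the errors $\xi$ satisfy the same PDE as the functions $u$: the variational formulation (2.11) implies

\[
(\dot u^{m+\theta}, v)_{\mathcal{V}\times\mathcal{V}^*} + a(u^{m+\theta}, v) = (g^{m+\theta}, v)_{\mathcal{V}\times\mathcal{V}^*} \qquad \forall v \in \mathcal{V}. \tag{7.10}
\]

Using the definition (4.1) of $\xi$, we rewrite (4.15) in the form

\[
\begin{aligned}
& \tfrac{1}{k}(\xi^{m+1}_L - \xi^m_L, v_L)_{\mathcal{V}\times\mathcal{V}^*} + a(\theta \xi^{m+1}_L + (1-\theta)\xi^m_L, v_L) \\
&\quad = \tfrac{1}{k}\big( (P_L u^{m+1} - u^{m+1}_L) - (P_L u^m - u^m_L),\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta} - u^{m+\theta}_L, v_L) \\
&\quad = \Big( \tfrac{P_L u^{m+1} - P_L u^m}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta}, v_L) - \Big( \tfrac{1}{k}(u^{m+1}_L - u^m_L, v_L)_{\mathcal{V}\times\mathcal{V}^*} + a(u^{m+\theta}_L, v_L) \Big),
\end{aligned}
\]

where we used $u^{m+\theta} := \theta u^{m+1} + (1-\theta) u^m$ and the linearity of the projector:

\[
\theta P_L u^{m+1} + (1-\theta) P_L u^m = P_L u^{m+\theta}.
\]

Furthermore, by the θ-scheme (3.26) for $u$, and by (7.10),

\[
\begin{aligned}
& \Big( \tfrac{P_L u^{m+1} - P_L u^m}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta}, v_L) - \Big( \Big( \tfrac{u^{m+1}_L - u^m_L}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(u^{m+\theta}_L, v_L) \Big) \\
&\quad = \Big( \tfrac{P_L u^{m+1} - P_L u^m}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta}, v_L) - (g^{m+\theta}, v_L)_{\mathcal{V}\times\mathcal{V}^*} \\
&\quad = \Big( \tfrac{P_L u^{m+1} - P_L u^m}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta}, v_L) - a(u^{m+\theta}, v_L) - (\dot u^{m+\theta}, v_L)_{\mathcal{V}\times\mathcal{V}^*} \\
&\quad = \Big( \tfrac{P_L u^{m+1} - P_L u^m}{k} - \tfrac{u^{m+1} - u^m}{k},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} + a(P_L u^{m+\theta} - u^{m+\theta}, v_L) + \Big( \tfrac{u^{m+1} - u^m}{k} - \dot u^{m+\theta},\, v_L \Big)_{\mathcal{V}\times\mathcal{V}^*} \\
&\quad = (r_2, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r_3, v_L)_{\mathcal{V}\times\mathcal{V}^*} + (r_1, v_L)_{\mathcal{V}\times\mathcal{V}^*}.
\end{aligned}
\]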

¹⁷ For the square of the expression $a(u^{m+\theta}_L, v_L)$ there holds the inequality

\[
a(u, v)^2 \le \Big( \tfrac{1}{4}\big( a(u+v, u+v) + a(u-v, u-v) \big) \Big)^2 \le \Big( \tfrac{1}{2}\big( a(u, u) + a(v, v) \big) \Big)^2 \le a(u, u)\, a(v, v).
\]


Proof of Lemma 4.7. We adapt [96, Section 3.6.2] and [143, Lemma 5.3] to the situation at hand and confirm that the estimates of [143, Lemma 5.3] carry over to the weighted case. As in the classical (unweighted) case, the statement of the Lemma follows from $\|r^m\|^2_* \le \|r^m_1\|^2_* + \|r^m_2\|^2_* + \|r^m_3\|^2_*$ and the corresponding norm estimates for the decomposition (7.9). The estimate of the residual $r_1$ is

\[
\|r_1\|_* = \big\| \tfrac{1}{k}(u^{m+1} - u^m) - \dot u^{m+\theta} \big\|_*
\le \frac{1}{k} \int_{t_m}^{t_{m+1}} |s - (1-\theta)t_{m+1} - \theta t_m|\, \|\ddot u\|_*\, ds
\le C_\theta\, k^{1/2} \left( \int_{t_m}^{t_{m+1}} \|\ddot u\|^2_*\, ds \right)^{1/2}. \tag{7.11}
\]

In case $\theta = \tfrac{1}{2}$, partial integration yields the refined estimate

\[
\|r_1\|_* = \big\| \tfrac{1}{k}(u^{m+1} - u^m) - \dot u^{m+\theta} \big\|_*
\le \frac{1}{2k} \int_{t_m}^{t_{m+1}} |(t_{m+1} - s)(t_m - s)|\, \|\dddot u\|_*\, ds
\le C\, k^{3/2} \left( \int_{t_m}^{t_{m+1}} \|\dddot u\|^2_*\, ds \right)^{1/2}. \tag{7.12}
\]

The norm of the residual $r_2$ is bounded by

\[
\|r_2\|_* \le 2^{-L}\, \frac{C}{k^{1/2}} \left( \int_{t_m}^{t_{m+1}} \|\dot u\|^2_{\mathcal{V}}\, ds \right)^{1/2}. \tag{7.13}
\]

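The bound (7.13) follows from (3.24) and from the estimate

\[
\begin{aligned}
|(r_2, v_L)_{\mathcal{V}\times\mathcal{V}^*}| &= \big| \big( \tfrac{1}{k}(P_L u^{m+1} - P_L u^m) - \tfrac{1}{k}(u^{m+1} - u^m),\, v_L \big)_{\mathcal{V}\times\mathcal{V}^*} \big| \\
&\le C \big\| \tfrac{1}{k}(P_L u^{m+1} - P_L u^m) - \tfrac{1}{k}(u^{m+1} - u^m) \big\|_* \|v_L\|_a \\
&\le \frac{C}{k} \Big\| (I - P_L) \int_{t_m}^{t_{m+1}} \dot u(s)\, ds \Big\|_* \|v_L\|_a \\
&\le \frac{C}{k^{1/2}} \left( \int_{t_m}^{t_{m+1}} \|\dot u - P_L \dot u\|^2_{\mathcal{H}}\, ds \right)^{1/2} \|v_L\|_a \\
&\overset{(4.4)}{\le} \frac{C}{k^{1/2}}\, (2^{-L})^{c_a} \left( \int_{t_m}^{t_{m+1}} \|\dot u\|^2_{H^1_j}\, ds \right)^{1/2} \|v_L\|_a,
\end{aligned} \tag{7.14}
\]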

where the last step follows from the approximation property (4.8) resp. (4.9) and the estimate (5.16). The norm of the residual $r_3$ allows for the upper bound

\[
\|r_3\|_* \le C (2^{-L})^{c_a}\, \|u^{m+\theta}\|_{H^2_j}. \tag{7.15}
\]

The estimate (7.15) follows from (3.24) and from

\[
|(r_3, v_L)_{\mathcal{V}\times\mathcal{V}^*}| = |a(P_L u^{m+\theta} - u^{m+\theta}, v_L)| \le \|P_L u^{m+\theta} - u^{m+\theta}\|_a \|v_L\|_a \overset{(4.4)}{\le} C (2^{-L})^{c_a}\, \|u^{m+\theta}\|_{H^2_j} \|v_L\|_a. \tag{7.16}
\]

The second step follows from a simple polarisation argument,

\[
|(r_3, v_L)_{\mathcal{V}\times\mathcal{V}^*}|^2 \le \big| \tfrac{1}{4}\big( a(u+v, u+v) + a(u-v, u-v) \big) \big|^2 \le \big| \tfrac{1}{2}\big( a(u, u) + a(v, v) \big) \big|^2 \le |a(u, u)\, a(v, v)|,
\]

and in the last step we used $\|\cdot\|_a \approx \|\cdot\|_{\mathcal{V}}$, which holds by well-posedness, cf. Theorem 2.17 and (2.7), together with the approximation property (4.8) resp. (4.9).


1 Approximation estimates for CEV: Alternative

If we do not assume any additional integrability requirements on our solution up to first order derivatives, then, as stated in Remark 4.4, we obtain approximation estimates in which the order of approximation depends on the parameter $\beta$. The proof of the estimates (4.9) in Remark 4.4 is similar to that of Proposition 4.3 and is included here:

Proof of Remark 4.4. For $j = 1$ and $\mu \in [\max\{-1, -2\beta\},\, 1 - 2\beta]$ the inequalities in (4.10) become

\[
\begin{aligned}
\|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2+\beta})} &= C \sum_{l=L+1}^{\infty} 2^{2l} \sum_{k=0}^{2^l} (2^{-l}k)^{\mu+2\beta}\, |u^l_k|^2 \\
&\ge C\, 2^{2L(1-\beta)} \sum_{l=L+1}^{\infty} \sum_{k=0}^{2^l} 2^{2l(\beta - (\mu/2+\beta))}\, k^{\mu+2\beta}\, |u^l_k|^2 \\
&\ge C\, 2^{2L(1-\beta)} \sum_{l=L+1}^{\infty} \sum_{k=0}^{2^l} (2^{-l}k)^{\mu}\, |u^l_k|^2 = C\, 2^{2L(1-\beta)}\, \|u - P_L u\|^2_{L^2(I,x^{\mu/2})}.
\end{aligned} \tag{7.17}
\]

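Similarly, after replacing $(u - P_L u)'$ by $(u - P_L u)''$ and $(u - P_L u)$ by $(u - P_L u)'$, (7.17) reads

\[
\begin{aligned}
\|(u - P_L u)''\|^2_{L^2(I,x^{\mu/2+2\beta})} &= C \sum_{l=L+1}^{\infty} 2^{4l} \sum_{k=0}^{2^l} (2^{-l}k)^{\mu+4\beta}\, |u^l_k|^2 \\
&\ge C\, 2^{2L(1-\beta)} \sum_{l=L+1}^{\infty} 2^{2l} \sum_{k=0}^{2^l} 2^{2l(\beta - (\mu/2+2\beta))}\, k^{\mu+2\beta}\, |u^l_k|^2 \\
&\ge C\, 2^{2L(1-\beta)} \sum_{l=L+1}^{\infty} 2^{2l} \sum_{k=0}^{2^l} (2^{-l}k)^{\mu+2\beta}\, |u^l_k|^2 = C\, 2^{2L(1-\beta)}\, \|(u - P_L u)'\|^2_{L^2(I,x^{\mu/2+\beta})}.
\end{aligned} \tag{7.18}
\]

Finally, combining (7.17) and (7.18) yields the last estimate in (4.9).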


Chapter V

Portfolio Choice with Ambiguous Drifts and Volatilities

1 Introduction

Portfolio choice models crucially depend on the dynamics of the underlying risky assets. Yet, these have to be estimated from historical time series, often with considerable estimation errors. If one ignores this and simply plugs in the point estimates, this leads to “extreme positions in the assets... [that] deliver abysmal out-of-sample performance” [69]. As a result, there has been intense research on robust optimization criteria that lead to a satisfactory performance in a wider class of models. The extant literature typically follows one of the following two paradigms: the “worst-case approach” of Gilboa and Schmeidler [157, 79] or settings with “smooth ambiguity aversion”, pioneered by Hansen and Sargent [92, 93].

Both approaches consider a whole class of alternative probabilistic models. In the worst-case approach, all of these models are treated equally, in that one maximizes the worst-case performance with respect to all models under consideration. In this setting, uncertainty about expected returns has been studied by various authors, cf., e.g., [165, 146, 156, 95] as well as many more recent studies. Recently, volatility uncertainty has also started to receive increasing attention [48, 129, 139, 167, 60, 66]. By its very definition, the worst-case paradigm leads to very conservative results, because even the most unfavorable model, which may be very unlikely but is still conceivable, is treated in the same way as the statistical point estimate. In particular, if one cannot rule out with absolute certainty that the risky assets under consideration are local martingales, then it is optimal not to invest at all.

Smooth ambiguity aversion is more flexible, in that it interpolates between the worst-case approach and the classical setting with one given model. Here, there is no longer a dichotomy of plausible and implausible models. Instead, one still computes the worst case over all models, but penalizes them according to their distance from a “reference model” (the statistical point estimate, say). Infinite ambiguity aversion corresponds to a vanishing penalty and therefore to the worst-case approach. Conversely, with zero ambiguity aversion, one recovers the classical setting without model risk because every deviation from the reference model incurs an infinite penalty. Intermediate values of the ambiguity parameter interpolate smoothly between these two extreme cases, leading to a non-trivial “ambiguity-return tradeoff”. If uncertainty about expected returns is penalized in terms of relative entropy, smooth ambiguity aversion turns out to be quite tractable, cf., e.g., [124, 158, 125, 94].

In the present study, we contribute to this second line of research in several ways by proposing and analyzing a simple tractable continuous-time model for uncertainty about both expected returns and volatilities. To wit, we combine a local¹ mean-variance criterion as in [109, 127, 71, 126, 41, 82] with smooth ambiguity aversion concerning both model parameters.

If the reference model has constant expected excess returns and volatilities, the resulting myopic criterion readily yields explicit solutions. If only expected returns are ambiguous, a variant of the results of Maenhout [124, 125] and Hernandez-Hernandez and Schied [94] obtains: ambiguity aversion simply compounds agents’ “effective” risk aversion.² Small additional volatility uncertainty reduces the optimal risky exposure even further. Furthermore, our result confirms that when the model parameters can be estimated with high accuracy, the optimal strategy remains close to the Merton proportion even for investors with high ambiguity aversion levels.

These results allow us to discriminate between the relative contributions of the different estimation errors. As pointed out by Garlappi, Uppal, and Wang [69], “ambiguity aversion is a general property of preferences, but ambiguity is problem specific”. Indeed, the ambiguity aversion parameters measuring the relative importance of uncertainty about drift rates and volatilities should not be the same, but should instead reflect the accuracy of the respective statistical estimators in terms of their squared absolute error.

As a simple benchmark example, we propose to use a universal constant Ψ ∈ [0, 1] to describe the agent’s level of aversion against ambiguity in general. This constant determines the level of the confidence interval of the respective estimator. This allows us to distinguish between different levels of ambiguity about each parameter. With this specification, we can quantify the relative importance of drift and volatility uncertainty. For parameters fitted to historical time series, the impact of the volatility uncertainty turns out to be minuscule. This suggests that while volatility uncertainty may be a crucial issue for the pricing and risk management of options, it had apparently remained neglected by the literature on portfolio choice for good reason.

¹ The use of a local criterion instead of a power utility function as in [125] allows us to treat investors with constant relative risk aversion without changing the HJB equation in an ad-hoc manner. Put differently, our setting provides a consistent model that produces the policies proposed by [125].

² A more involved result linking smooth ambiguity aversion to stochastic differential utilities has been obtained by Skiadas [160].

The remainder of this chapter is organized as follows. In Section 2, we briefly recall the benchmark case without ambiguity aversion for easy reference. Subsequently, we introduce our setting for uncertainty about both expected returns and volatilities. Section 4 contains our main result, characterizing the optimal policy in this setting. This is illustrated in Section 5 for a model calibrated to time-series data. All proofs are in the appendix.

2 Benchmark case without ambiguity aversion

To set the scene, we first briefly recall the case without ambiguity aversion. To this end, fix a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t\in[0,T]}, P)$ supporting a (discounted) risky asset with constant expected excess returns µ > 0 and volatilities σ > 0:

$$\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t. \tag{2.1}$$

As in [41, 82], we consider an investor who chooses her risky weight³ π to maximize the present value of her expected excess returns, penalized for the corresponding variances. As the returns of her portfolio are given by $\pi_t\mu\,dt + \pi_t\sigma\,dW_t$, this leads to the following criterion:⁴

$$E_P\left[\int_0^T \left(\pi_t\mu - \frac{\gamma}{2}\pi_t^2\sigma^2\right)dt\right] \to \max! \tag{2.2}$$

Here, γ > 0 weighs the importance of future expected returns against the corresponding variances, thereby measuring the investor’s risk aversion. The optimal strategy for (2.2) is readily determined by pointwise maximization of the integrand as the constant Merton proportion:⁵

$$\pi_t = \frac{\mu}{\gamma\sigma^2}, \quad t\in[0,T]. \tag{2.3}$$

Indeed, this strategy is the unique maximizer of (2.2), with corresponding utility given by the total squared Sharpe ratio accumulated over the investment horizon under consideration:

$$0 \le \int_0^T \frac{\mu^2}{2\gamma\sigma^2}\,dt.$$
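The pointwise maximization behind (2.3) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the parameter values (µ, σ) = (0.05, 0.2), γ = 5, T = 1 are borrowed from the illustration in Section 6. It compares the closed-form Merton proportion with a brute-force search over a grid of candidate weights.

```python
# Pointwise maximization of the local mean-variance integrand in (2.2).
# Function names and the grid search are illustrative assumptions.

def integrand(pi, mu, sigma, gamma):
    """Local mean-variance integrand pi*mu - (gamma/2) * pi^2 * sigma^2."""
    return pi * mu - 0.5 * gamma * pi**2 * sigma**2

def merton_proportion(mu, sigma, gamma):
    """Closed-form maximizer mu / (gamma * sigma^2) from (2.3)."""
    return mu / (gamma * sigma**2)

mu, sigma, gamma, T = 0.05, 0.2, 5.0, 1.0
pi_star = merton_proportion(mu, sigma, gamma)

# Brute-force check on a grid of candidate weights in [0, 1].
grid = [k / 1000 for k in range(1001)]
pi_grid = max(grid, key=lambda p: integrand(p, mu, sigma, gamma))

# Value of the criterion for the constant optimal weight over [0, T];
# it should agree with T * mu^2 / (2 * gamma * sigma^2).
value = T * integrand(pi_star, mu, sigma, gamma)
```

With these inputs both `pi_star` and `pi_grid` should be close to 0.25, the Merton proportion quoted in Section 6.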

³ That is, the fraction of her wealth invested in the risky asset, rather than the numeraire.

⁴ As is customary, we set the value of this criterion equal to −∞ if its negative part is not integrable.

⁵ Whence, γ corresponds to constant relative risk aversion. Alternatively, one can also consider settings where (2.1) describes the absolute returns $dS_t$ of the risky asset rather than their relative counterparts $dS_t/S_t$ and absolute portfolio returns are maximized in (2.2) over different choices of the number of shares held. In this case, the optimal number of risky shares is $\mu/(\gamma\sigma^2)$, so that γ describes a constant absolute risk aversion. See [109, 127, 126, 70] for models of this type.


3 Setting with ambiguity aversion

Let us now pass to a setting with uncertainty about both the asset’s expected returns and volatilities. To wit, we consider as alternatives to the benchmark model (µ, σ) the class of all progressively measurable drift rates and volatilities.⁶ As we want the information flow in our model to correspond to the observation of the risky asset, we use a weak formulation, where these scenarios are implemented as (possibly singular) measures on the canonical path space.

To make this precise, let $\Omega = C([0,T];\mathbb{R})$ be the canonical space endowed with the topology of uniform convergence. Denote by $\mathcal{F}$ the Borel σ-algebra on Ω and write $S$ and $\mathbb{F}$ for the canonical process and the canonical filtration, respectively. Moreover, write $\mathcal{P}$ for the set of probability measures $P$ on $(\Omega,\mathcal{F})$ such that, for some progressively measurable processes $(\mu^P_t)_{t\in[0,T]}$, $(\sigma^P_t)_{t\in[0,T]}$:

$$\left(S_t - \int_0^t S_s\mu^P_s\,ds\right)_{t\in[0,T]} \text{ is a } P\text{-local martingale and } \langle S\rangle_t = \int_0^t (S_s\sigma^P_s)^2\,ds. \tag{3.1}$$

Each probability measure $P \in \mathcal{P}$ is a possible scenario for the dynamics of the risky asset. Among those, we single out a constant pair $(\bar\mu, \bar\sigma)$. The corresponding scenario $\bar P$ (sometimes denoted by $P^{\bar\mu,\bar\sigma}$) serves as the “reference model” that we deem most plausible, for example, because it matches empirical estimates.

We now want to extend the local mean-variance criterion (2.2) to take into account ambiguity about both drift rates and volatilities. Here, the idea is that models “far” from the reference model are less plausible, and therefore should receive a smaller weight in the investor’s considerations. To make this precise, it remains to choose a measure for this “distance” between the different models. With uncertainty about the expected return only, this is typically done by means of the relative entropy (e.g., [92, 94, 124]). To wit, each model $P$ (let us denote it here by $P^{\mu,\bar\sigma}$) with expected returns $(\mu_t)_{t\in[0,T]}$ rather than $\bar\mu$ then incurs a penalty proportional to

$$E_{\bar P}\left[\frac{dP^{\mu,\bar\sigma}}{d\bar P}\log\left(\frac{dP^{\mu,\bar\sigma}}{d\bar P}\right)\right] = E_{P^{\mu,\bar\sigma}}\left[\int_0^T \frac{(\bar\mu-\mu_t)^2}{2\bar\sigma^2}\,dt\right].$$

The corresponding robust criterion in turn reads as

$$\inf_{P^{\mu,\bar\sigma}\in\mathcal{P}} E_{P^{\mu,\bar\sigma}}\left[\int_0^T \left(\phi_t\mu_t - \frac{\gamma}{2}\phi_t^2\bar\sigma^2 + \frac{1}{2\Psi_1}\frac{(\bar\mu-\mu_t)^2}{\bar\sigma^2}\right)dt\right], \tag{3.2}$$

where Ψ1 ≥ 0 weighs the relative importance of ambiguity and risk aversion. This criterion can be interpreted as a game the investor plays against a fictitious adversary (a malevolent “nature”), who tries to choose a particularly unfavorable environment, but has to pay a penalty for implausible choices. The advantage of this criterion is that the optimal strategy can still be determined by pointwise optimization. The relative entropy is only finite for absolutely continuous measures, so that the corresponding distance of scenarios across volatilities is infinite.

⁶ Note, in particular, that these alternative scenarios are not restricted to constant coefficient models as in (2.1).


We circumvent this problem by starting directly from (3.2) and imposing an additional penalty for the mean-squared deviations of the instantaneous variance from its reference value:

$$\inf_{P^{\mu,\sigma}\in\mathcal{P}} E_{P^{\mu,\sigma}}\left[\int_0^T \left(\phi_t\mu_t - \frac{\gamma}{2}\phi_t^2\sigma_t^2 + \frac{1}{2\Psi_1}\frac{(\bar\mu-\mu_t)^2}{\bar\sigma^2} + \frac{1}{2\Psi_2}(\bar\sigma^2-\sigma_t^2)^2\right)dt\right]. \tag{3.3}$$

Here, Ψ1, Ψ2 describe the importance of the two ambiguity terms relative to risk aversion. The choice of these parameters is discussed in Section 5.

4 Main result

With our specification from (3.3), the local utility maximization problem with drift and volatility uncertainty can still be solved in closed form:

Theorem 4.1. The robust local utility criterion (3.3) has a unique maximizer, given by

$$\hat\pi = \frac{24^{1/3}\,\bar\sigma^2(\gamma+\Psi_1)\Psi_2 - 2^{1/3}D^{2/3}}{6^{2/3}\,\gamma\,\Psi_2\,D^{1/3}}, \qquad D := \sqrt{3\Psi_2^3\big(4\bar\sigma^6(\gamma+\Psi_1)^3 + 27\gamma^2\bar\mu^2\Psi_2\big)} - 9\gamma\bar\mu\Psi_2^2. \tag{4.1}$$

The proof of this result is delegated to Section 7. To better understand the comparative statics of the above formula, consider the case Ψ2 ∼ 0 where aversion against volatility uncertainty is small compared to aversion against uncertainty about expected returns.⁷ A Taylor expansion of the explicit formula (4.1) around Ψ2 = 0 yields

$$\hat\pi = \frac{\bar\mu}{(\gamma+\Psi_1)\bar\sigma^2} - \frac{\gamma^2\bar\mu^3}{(\gamma+\Psi_1)^4\bar\sigma^8}\Psi_2 + o(\Psi_2). \tag{4.2}$$

Here, the first term corresponds to the optimal strategy without volatility uncertainty.⁸ Similarly as in [94, 124], ambiguity aversion simply compounds risk aversion here, in that uncertainty about expected returns leads to a higher “effective” risk aversion. The second term in the above expansion is the leading-order correction for small volatility uncertainty. It shows that risky exposures are further reduced to account for the additional model risk.

We remark here that Theorem 4.1 also holds for models with time-dependent but deterministic parameters $(\mu_t, \sigma_t)_{t\in[0,T]}$. An extension of a variant of the asymptotic formula (4.2) to general diffusion models is the subject of ongoing work.

⁷ This is a reasonable approximation because expected returns are much harder to estimate than volatilities, compare Section 5. That is, the investor’s aversion against both ambiguities may be similar, but the absolute amount of ambiguity about the drifts is much bigger than for the volatilities.

⁸ As a sanity check, note that, in particular, we recover the optimal strategy (2.3) without any model uncertainty for Ψ1 = 0 and Ψ2 = 0.
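The quality of the expansion (4.2) for small Ψ2 can be probed numerically. The sketch below is only an illustration under stated assumptions: it takes the first-order condition (7.4) exactly as stated in Section 7, solves it by bisection (a stand-in for the closed-form root), and compares the result with the two leading terms of (4.2). The function names and the values Ψ1 = 0.15, Ψ2 = 10⁻⁴ are hypothetical.

```python
# Compare a numerical root of the first-order condition (7.4), taken as
# stated in the text, with the first-order expansion (4.2) in psi2.
# All names and parameter values are illustrative assumptions.

def foc(pi, mu, sigma, gamma, psi1, psi2):
    """Right-hand side of (7.4): decreasing in pi for pi >= 0."""
    return mu - (psi1 + gamma) * sigma**2 * pi - gamma**2 * psi2 * pi**3

def solve_foc(mu, sigma, gamma, psi1, psi2, iterations=200):
    """Bisection on [0, mu / ((gamma + psi1) * sigma^2)], which brackets the root."""
    lo, hi = 0.0, mu / ((gamma + psi1) * sigma**2)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc(mid, mu, sigma, gamma, psi1, psi2) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expansion(mu, sigma, gamma, psi1, psi2):
    """Leading term and first-order correction from (4.2)."""
    leading = mu / ((gamma + psi1) * sigma**2)
    correction = gamma**2 * mu**3 / ((gamma + psi1)**4 * sigma**8) * psi2
    return leading - correction

mu, sigma, gamma, psi1, psi2 = 0.05, 0.2, 5.0, 0.15, 1e-4
pi_numeric = solve_foc(mu, sigma, gamma, psi1, psi2)
pi_approx = expansion(mu, sigma, gamma, psi1, psi2)
gap = abs(pi_numeric - pi_approx)  # expected to be of second order in psi2
```

Bisection is used here because the left-hand side of the cubic is monotone on the bracketing interval, so no root-selection logic is needed.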


5 The parameters Ψ1 and Ψ2

We interpret the parameters Ψ1, Ψ2 ∈ [0, ∞] as ambiguity (aversion) for the drift and volatility parameters in terms of the accuracy of the estimation of the respective parameters: the more accurately one can estimate the parameter from the data at hand, the lower the ambiguity about that parameter.

To smoothly interpolate between the extreme scenarios of full information and full uncertainty, our approach is that the investor forms her belief $\bar\mu, \bar\sigma$ about the parameters µ and σ through observations of (2.1). Denoting by N the total number of observations, the reference model $\bar P$ is then determined using appropriate estimators $\bar\sigma^2 = \hat\sigma_N^2$ and $\bar\mu = \hat\mu_N$. We then characterize the parameters Ψ1 and Ψ2 through the accuracy of the estimates $\hat\mu_N$ and $\hat\sigma_N$, where we leave one parameter Ψ (the ambiguity aversion level) to be chosen by the investor. To fix ideas, we assume that observations $s_{t_1},\dots,s_{t_N}$ of (2.1) are equally spaced (that is, $t_i - t_{i-1} = \delta$ for all $i = 1,\dots,N$) over the time horizon [0, T] and that $N = Tn$, where $n := 1/\delta$ is the number of observations in a unit of time.

The accuracy of the estimates $\hat\mu_N$ and $\hat\sigma_N$ is determined by their absolute error,⁹ which in turn is determined by the standard deviation and the chosen confidence level 1 − α of the estimators. We let the investor choose the level Ψ := 1 − α of the confidence interval according to her ambiguity aversion level. Given n and T, we set the ambiguity parameters as the squared absolute error (for an N = nT-sample and confidence level Ψ) of the standard estimators of the parameters of a normal distribution. See Section 8 for a justification of this choice of Ψ1 and Ψ2.

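$$\Psi_1(\Psi,\bar P,T,n)^{1/2} = \frac{\bar\sigma}{\sqrt{T}}\,t\!\left(Tn-1,\tfrac{1+\Psi}{2}\right) \quad\text{and}\quad \Psi_2(\Psi,\bar P,T,n)^{1/2} = \frac{(Tn-1)\bar\sigma^2}{2T}\left(\frac{1}{\chi\!\left(Tn-1,\frac{1-\Psi}{2}\right)} - \frac{1}{\chi\!\left(Tn-1,\frac{1+\Psi}{2}\right)}\right). \tag{5.1}$$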

The investor can regulate the parameters Ψ1 and Ψ2 through the choice of the level Ψ: for Ψ ∼ 0, (5.1) yields the median of the Student t distribution, hence Ψ1 vanishes, and Ψ2 is also null as the upper and lower boundaries of the interval coincide. On the other hand, a high ambiguity aversion level Ψ ∼ 1 renders large confidence intervals as soon as there is any ambiguity about the true value of the parameter. Hence, the intermediate case enables us to interpolate smoothly between the extreme scenarios (Ψ1 ∼ 0, Ψ2 ∼ 0 and Ψ1 ∼ ∞, Ψ2 ∼ ∞) described above.

⁹ I.e., half of the length of the confidence interval.
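To get a feeling for the size of the drift ambiguity parameter, the first formula in (5.1) can be evaluated approximately with the Python standard library alone. This is a rough sketch under a loudly stated assumption: the Student t quantile t(Tn − 1, (1 + Ψ)/2) is replaced by the standard normal quantile, which Section 8 describes as the limit for frequent observations. The function name is hypothetical, and the substitution is not suitable for very few observations (such as n = 2).

```python
from math import sqrt
from statistics import NormalDist

def psi1_normal_approx(psi, sigma, T):
    """Approximate Psi_1 from (5.1): squared half-width of the drift confidence interval.

    Assumption: the Student t quantile is replaced by the standard normal
    quantile, which is only reasonable for a large number of observations.
    Valid for 0 <= psi < 1 (the quantile is undefined at probability 1).
    """
    z = NormalDist().inv_cdf((1.0 + psi) / 2.0)
    return (sigma / sqrt(T) * z) ** 2

# Illustrative values: sigma = 0.2 as in the figures, confidence level 0.95.
psi1_one_year = psi1_normal_approx(0.95, 0.2, 1.0)
psi1_twenty_years = psi1_normal_approx(0.95, 0.2, 20.0)
psi1_no_aversion = psi1_normal_approx(0.0, 0.2, 1.0)
```

In this approximation Ψ1 does not depend on the observation frequency n at all and shrinks like 1/T, which mirrors the remark that a longer observation horizon, rather than more frequent sampling, is what sharpens the drift estimate.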


Example

We estimate the relative mean µ and variance σ² in

$$\frac{dS_t}{S_t} = (\mu + r_t)\,dt + \sigma\,dW_t, \tag{5.2}$$

where we account for the time value of money by considering the interest rates $r_{t_i}$ (we take 13-week treasury bills) prevailing at time $t_i$ for $t_0 < t_1 < \dots < t_N$. We estimate µ and σ via daily ($t_i - t_{i-1} = \delta$) observations $s_0, s_{t_1},\dots,s_{t_N}$ of closing rates of the S&P 500 index over twenty years, starting on Dec. 15th, 1994. Hence, T = 20 and $n = 1/\delta \approx 250$, yielding N ≈ 5000 observations. We obtain

$$\hat\sigma_N \approx 0.1931 \quad\text{and}\quad \hat\mu_N \approx 0.0655. \tag{5.3}$$

6 Illustration of the optimal strategy

We now examine the influence of the parameters Ψ1, Ψ2 and γ on the optimal strategy (4.1) for parameters (µ, σ) around the above values. Figure V.1 below illustrates how the risk aversion parameter γ affects the optimal strategy (4.1). The images confirm that ambiguity aversion compounds agents’ “effective” risk aversion and underline a variant of the results of Maenhout [124, 125] and Hernandez-Hernandez and Schied [94]. A small additional volatility uncertainty further reduces the optimal risky exposure. This only becomes visible (right image in Figure V.1) when the estimation of the volatility is inaccurate due to a particularly low number of observations n = 2 of the asset. For more frequent observations, the volatility uncertainty is negligible (cf. Section 5, and Figure V.3 in particular) and the effect of Ψ2 on the optimal strategy is not visible.

Figure V.1: Optimal strategies (on the vertical axis) for µ = 0.1, σ = 0.2 and T = 1, as a function of Ψ ∈ [0, 1] (on the horizontal axis), illustrated for γ = 3 (dotted), γ = 4 (small-dashed), γ = 5 (large-dashed) and γ = 6 (solid line), with observation frequencies n = 252 (left image) and n = 2 (right image).

The parameter Ψ as well as the observation horizon T and the observation frequency n influence the worst-case-optimal strategy via the levels of the ambiguity parameters Ψ1 and Ψ2 in (5.1). For an ambiguity aversion level Ψ ∼ 0, both Ψ1 and Ψ2 vanish and the maximization problem corresponds to the benchmark case with the optimizer (2.3). Indeed, for the values (µ, σ) = (0.05, 0.2) and γ = 5, by (2.3) the investor chooses the Merton proportion π = 0.25 for the risky assets, see Figure V.2 at Ψ ∼ 0.

Figure V.2: Optimal strategies for (µ, σ) = (0.05, 0.2) and γ = 5 as a function of the ambiguity aversion level Ψ ∈ [0, 1]. Left image: quarterly n = 4 (dotted), monthly n = 12 (small-dashed), weekly n = 52 (large-dashed) and daily n = 252 (solid line) observations over a horizon of one year, T = 1. Right image: time horizons T = 1 (dotted), T = 2 (small-dashed), T = 5 (large-dashed) and T = 20 (solid line) for daily observations.

7 Proof of the main result

To ease notation, set

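$$L_{(\bar\mu,\bar\sigma)}(\mu,\sigma;\pi) = \mu\pi - \frac{\gamma}{2}\sigma^2\pi^2 + \frac{(\mu-\bar\mu)^2}{2\Psi_1\bar\sigma^2} + \frac{(\sigma^2-\bar\sigma^2)^2}{2\Psi_2}. \tag{7.1}$$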

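Lemma 7.1. For each fixed portfolio weight $(\pi_t)_{t\in[0,T]}$, the pointwise minimizer of (7.1) is given by

$$(\mu(\pi_t), \sigma^2(\pi_t)) = \left(\bar\mu - \Psi_1\bar\sigma^2\pi_t,\; \bar\sigma^2 + \frac{\gamma\Psi_2}{2}\pi_t^2\right), \quad t\in[0,T]. \tag{7.2}$$

Proof. Fix $(\omega,t) \in \Omega\times[0,T]$ and consider the first-order conditions

$$0 = \partial_x L(x,y;\pi_t(\omega)) \quad\text{and}\quad 0 = \partial_y L(x,y;\pi_t(\omega)),$$

where x stands for the drift rate and y for the instantaneous variance. These hold if and only if $x = \mu(\pi_t(\omega))$ and $y = \sigma^2(\pi_t(\omega))$, respectively. The corresponding Hessian matrix,

$$\begin{pmatrix} 1/(\Psi_1\bar\sigma^2) & 0\\ 0 & 1/\Psi_2 \end{pmatrix},$$

is positive definite, whence $(x,y)\mapsto L(x,y;\pi_t(\omega))$ is strictly convex and $\mu(\pi_t(\omega)), \sigma^2(\pi_t(\omega))$ are its unique global minimizers.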
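For readers who want the intermediate step, the two first-order conditions can be written out explicitly. The display below is a sketch in which $\bar\mu$ and $\bar\sigma^2$ denote the reference values appearing in the penalty terms of (7.1) (the bars are our labelling), $x$ is the drift rate, $y$ the instantaneous variance and $\pi$ abbreviates $\pi_t(\omega)$:

```latex
\begin{align*}
\partial_x L(x,y;\pi) &= \pi + \frac{x-\bar\mu}{\Psi_1\bar\sigma^2} = 0
  &&\Longleftrightarrow\quad x = \bar\mu - \Psi_1\bar\sigma^2\,\pi,\\
\partial_y L(x,y;\pi) &= -\frac{\gamma}{2}\,\pi^2 + \frac{y-\bar\sigma^2}{\Psi_2} = 0
  &&\Longleftrightarrow\quad y = \bar\sigma^2 + \frac{\gamma\Psi_2}{2}\,\pi^2.
\end{align*}
```

Both conditions are linear in the respective unknown, which is why the minimizer (7.2) is available in closed form.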


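For any portfolio weight $(\pi_t)_{t\in[0,T]}$, evaluated with respect to the pointwise worst-case model from Lemma 7.1, we have

$$L_{(\bar\mu,\bar\sigma)}(\mu(\pi_t),\sigma(\pi_t);\pi_t) = \bar\mu\pi_t - \frac{(\Psi_1+\gamma)\bar\sigma^2}{2}\pi_t^2 - \frac{\gamma^2\Psi_2}{8}\pi_t^4. \tag{7.3}$$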
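The substitution leading to (7.3) is easy to get wrong by a constant factor, so a quick numerical cross-check may be useful. The following sketch is an illustration only: the function names and all parameter values are hypothetical, and `var` denotes the instantaneous variance σ². It evaluates (7.1) at the worst-case pair (7.2) and compares the result with the right-hand side of (7.3).

```python
# Numerical cross-check: (7.1) evaluated at the minimizer (7.2) equals (7.3).
# Names and parameter values are illustrative assumptions.

def penalized_integrand(mu, var, pi, mu_ref, var_ref, gamma, psi1, psi2):
    """Integrand (7.1); `var` is the instantaneous variance sigma^2."""
    return (mu * pi - 0.5 * gamma * var * pi**2
            + (mu - mu_ref) ** 2 / (2.0 * psi1 * var_ref)
            + (var - var_ref) ** 2 / (2.0 * psi2))

def worst_case(pi, mu_ref, var_ref, gamma, psi1, psi2):
    """Pointwise minimizer (7.2): worst-case drift and variance for weight pi."""
    return mu_ref - psi1 * var_ref * pi, var_ref + 0.5 * gamma * psi2 * pi**2

def reduced_integrand(pi, mu_ref, var_ref, gamma, psi1, psi2):
    """Right-hand side of (7.3)."""
    return (mu_ref * pi - 0.5 * (psi1 + gamma) * var_ref * pi**2
            - gamma**2 * psi2 / 8.0 * pi**4)

params = dict(mu_ref=0.05, var_ref=0.04, gamma=5.0, psi1=0.15, psi2=0.01)
pi = 0.3
mu_wc, var_wc = worst_case(pi, **params)
at_minimum = penalized_integrand(mu_wc, var_wc, pi, **params)
gap = at_minimum - reduced_integrand(pi, **params)  # should vanish up to rounding
```

Perturbing `mu_wc` or `var_wc` slightly and re-evaluating `penalized_integrand` gives a larger value, in line with the strict convexity used in the proof of Lemma 7.1.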

Remark 7.2. In particular, if the risky weight $\pi_t$ is uniformly bounded, then this also holds for the process $L_{(\bar\mu,\bar\sigma)}(\mu(\pi),\sigma^2(\pi);\pi)$. In view of Lemma 7.1, this yields a uniform lower bound for $L_{(\bar\mu,\bar\sigma)}(\mu,\sigma^2;\pi)$ for any other choice of the processes µ and σ². As a result, the robust criterion (3.3) is well defined with a value greater than −∞.

Next, optimize pointwise with respect to the strategy in the corresponding worst-case scenario:

Lemma 7.3. The unique maximizer of (7.3) is given by the constant

$$\hat\pi = \frac{24^{1/3}\,\bar\sigma^2(\gamma+\Psi_1)\Psi_2 - 2^{1/3}D^{2/3}}{6^{2/3}\,\gamma\,\Psi_2\,D^{1/3}}, \qquad D := \sqrt{3\Psi_2^3\big(4\bar\sigma^6(\gamma+\Psi_1)^3 + 27\gamma^2\bar\mu^2\Psi_2\big)} - 9\gamma\bar\mu\Psi_2^2.$$

Proof. The first-order condition for (7.3) is

$$0 = \bar\mu - (\Psi_1+\gamma)\bar\sigma^2\pi_t - \gamma^2\Psi_2\pi_t^3. \tag{7.4}$$

The corresponding second derivative is strictly negative, so that there is at most one solution, which necessarily is a global maximum. One now readily verifies (most conveniently using a software package such as Mathematica) that the constant $\hat\pi$ solves (7.4), completing the proof.

With these preparations, we can now prove our main result, Theorem 4.1:

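Lemma 7.4. We have

$$\sup_{(\pi_t)_{t\in[0,T]}}\inf_{P\in\mathcal{P}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right] = T\,\hat L_{(\bar\mu,\bar\sigma)} = \inf_{P\in\mathcal{P}}\sup_{(\pi_t)_{t\in[0,T]}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right],$$

where $\hat L_{(\bar\mu,\bar\sigma)} := L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\hat\pi)$ with $\mu(\cdot), \sigma^2(\cdot)$ as in Lemma 7.1 and $\hat\pi$ as in Lemma 7.3, respectively. In particular, $\hat\pi$ is optimal for the robust criterion (3.3).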


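Proof. Consider any scenario $P$ with associated drift rate $\mu^P$ and volatility $\sigma^P$. In view of the pointwise minimality of $(\mu(\cdot),\sigma^2(\cdot))$ established in Lemma 7.1, we have

$$L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\hat\pi) \ge L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\hat\pi), \quad t\in[0,T].$$

By the monotonicity of the integral and since $\hat\pi$ and in turn $\mu(\hat\pi), \sigma(\hat\pi)$ as well as $L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma^2(\hat\pi);\hat\pi)$ are constant, it follows that

$$E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\hat\pi)\,dt\right] \ge E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\hat\pi)\,dt\right] = E_{\hat P}\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\hat\pi)\,dt\right] = T\,L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\hat\pi).$$

As the scenario $P$ was arbitrary, this implies

$$\sup_{(\pi_t)_{t\in[0,T]}}\inf_{P\in\mathcal{P}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right] \ge \inf_{P\in\mathcal{P}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\hat\pi)\,dt\right] \ge T\,\hat L_{(\bar\mu,\bar\sigma)}.$$

For the converse inequality, use the definition of the infimum and the pointwise maximality of $\hat\pi$ established in Lemma 7.3 to obtain

$$\inf_{P\in\mathcal{P}}\sup_{(\pi_t)_{t\in[0,T]}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right] \le \sup_{(\pi_t)_{t\in[0,T]}} E_{\hat P}\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu(\hat\pi),\sigma(\hat\pi);\pi_t)\,dt\right] \le T\,\hat L_{(\bar\mu,\bar\sigma)}$$

for any portfolio $(\pi_t)_{t\in[0,T]}$, where $\hat P$ denotes the scenario corresponding to the drift rate $\mu(\hat\pi)$ and volatility $\sigma^2(\hat\pi)$. Finally, since

$$\inf_{P\in\mathcal{P}}\sup_{(\pi_t)_{t\in[0,T]}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right] \ge \sup_{(\pi_t)_{t\in[0,T]}}\inf_{P\in\mathcal{P}} E_P\left[\int_0^T L_{(\bar\mu,\bar\sigma)}(\mu^P_t,\sigma^P_t;\pi_t)\,dt\right]$$

in general, all inequalities above are in fact equalities. This completes the proof.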

8 Discussion of the ambiguity parameters

We briefly justify the choice of the ambiguity parameters (5.1) in Section 5. Let us fix an estimation procedure as in [154]. Set $X_t := \log(S_t)$; then $X_t \sim \nu t + \sigma W_t$, where $\nu = \mu - \sigma^2/2$. Define $X_j := \log(S_{j\delta}) - \log(S_{(j-1)\delta})$. Noting that $T = \delta N$, we have

$$\frac{1}{\delta N}\sum_{j=1}^N X_j \sim N\!\left(\nu, \frac{\sigma^2}{T}\right).$$

Furthermore, we consider

$$\frac{\bar X_N}{\delta} := \frac{1}{\delta N}\sum_{j=1}^N X_j \quad\text{and}\quad \frac{\hat\sigma_N^2}{T} := \frac{1}{N-1}\sum_{j=1}^N\left(\frac{X_j}{\delta} - \frac{\bar X_N}{\delta}\right)^2,$$

the standard estimators for ν and $\sigma^2/T$. Fixing a level 1 − α, which corresponds to the probability that the true values of ν and of σ are within the confidence bounds, the length of the corresponding confidence interval around $\bar X_N/\delta$ is

$$2\left(t\!\left(N-1, 1-\tfrac{\alpha}{2}\right)\frac{\hat\sigma_N}{\sqrt{T}}\right).$$

Note that the length of this confidence interval is independent of the level of ν, hence of µ. Similarly, the length of the confidence interval around $\hat\sigma_N^2/T$ is

$$\frac{(N-1)\hat\sigma_N^2/T}{\chi\!\left(N-1,\frac{\alpha}{2}\right)} - \frac{(N-1)\hat\sigma_N^2/T}{\chi\!\left(N-1,1-\frac{\alpha}{2}\right)}.$$

Interpolation between extreme scenarios

Notwithstanding the possibility that we are restating the obvious, we recall some well-known facts to highlight how our choice of ambiguity parameters (5.1), which are a composition of the two inputs of estimation error and ambiguity aversion of the investor, interpolates smoothly between extreme scenarios, and we provide some quick illustrations.

One extreme scenario arises, for example, if the investor has vanishing aversion Ψ2 ∼ 0 against ambiguity in the volatility parameter. In this case, any deviation of σ from $\bar\sigma$ would incur in (3.3) an “infinite penalty” $\frac{1}{2\Psi_2}(\bar\sigma^2-\sigma_t^2)^2$ on the malevolent nature, and hence, whenever Ψ2 ∼ 0, one recovers (3.2). By the same reasoning, when both ambiguity parameters vanish, Ψ1 ∼ 0, Ψ2 ∼ 0, one recovers the classical setting without model uncertainty.

In the other extreme, when the aversion of the investor against ambiguity in a parameter is infinite, the respective penalties $\frac{1}{2\Psi_1}\frac{(\bar\mu-\mu_t)^2}{\bar\sigma^2}$ and $\frac{1}{2\Psi_2}(\bar\sigma^2-\sigma_t^2)^2$ vanish, and the malevolent nature faces no cost for any deviation of the realized parameters µ, σ from their reference values $\bar\mu, \bar\sigma$. In this case (3.3) corresponds to the worst-case approach. If the investor has no information about the distribution of the parameter within the predefined bounds, or if she deems every possible outcome equally likely, the parameters are Ψ2 ∼ ∞ and Ψ1 ∼ ∞, respectively.

It is well known that the higher the observation frequency, the better the estimate for σ, whereas the estimate for µ does not improve considerably. The absolute error of the drift estimator decreases as the observation frequency n increases, owing to the increasing number of degrees of freedom of the t-distribution; for frequent observations (n ≥ 52) it approaches the corresponding quantile of a normal distribution. The absolute error of the volatility, on the other hand, decreases rapidly with increasing frequency of observations. Daily observations (n = 252) readily yield a nearly vanishing absolute error even for high levels of the ambiguity aversion Ψ, as highlighted in Figure V.3.

Evidently, the longer the observation horizon [0, T], the better the estimate for both parameters µ and σ in (5.1), and hence (keeping the ambiguity aversion Ψ unchanged) the smaller the parameters Ψ1 and Ψ2. For realistic observation frequencies, the accuracy of the estimation of σ is by orders of magnitude better than the accuracy of the drift estimator, underlining the asymptotic expansion (4.2) for small Ψ2.

Figure V.3: Effect of the chosen level, or ambiguity aversion, Ψ ∈ [0, 1] (on the horizontal axis) on the ambiguity parameters Ψ1 (on the vertical axis, left) and Ψ2 (on the vertical axis, right), for a horizon T = 1 and σ = 0.2. The number of observations n within a year is quarterly n = 4 (dotted), monthly n = 12 (small-dashed), weekly n = 52 (large-dashed) and daily n = 252 (solid line).

Figure V.4: Effect of the ambiguity aversion level Ψ ∈ [0, 1] (on the horizontal axis) on the ambiguity parameters Ψ1 (vertical axis, left) and Ψ2 (vertical axis, right), for σ = 0.2 and n = 4, across different observation horizons T = 1 (dotted), T = 2 (small-dashed), T = 5 (large-dashed) and T = 20 (solid line) in units of one year.


Appendix A

Martingale problems

1 Well-posedness of the martingale problem for SABR

Definition 1.1 (Stochastic Differential Equations and Martingale Problems). Consider a stochastic differential equation

$$dX^i_t := \sigma^i_j(X_t)\,dB^j_t + b^i(X_t)\,dt, \quad\text{or more explicitly}\quad X^i_t := X^i_0 + \sum_{j=1}^r\int_0^t \sigma^i_j(X_s)\,dB^j_s + \int_0^t b^i(X_s)\,ds \quad\text{for } i = 1,\dots,d, \tag{1.1}$$

where $(B^1,\dots,B^r)$ is a Brownian motion in $\mathbb{R}^r$ with respect to a filtration $\mathbb{F}$ and

$$\int_0^t |||a(x)||| + \|b(x)\|\,dt < \infty, \tag{1.2}$$

with $a^{i,j}(x) := \sigma^i_k(x)\sigma^j_k(x)$, where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$ and $|||\cdot|||$ denotes the norm in the d × d matrices. Here, the solution $(X^1,\dots,X^d)$ (provided it exists) is a continuous semimartingale in $\mathbb{R}^d$.

Recall that, given a filtered probability space $(\Omega,\mathbb{F},P)$, $\mathbb{F} = (\mathcal{F}_t)_{t\ge 0}$, and a Brownian motion $B$ as well as an $\mathcal{F}_0$-measurable random vector ξ on this probability space, an adapted process $X$ is a strong solution of (1.1) if $X_0 = \xi$ a.s. and it satisfies (1.1). A weak solution of (1.1), given the initial distribution µ, consists of a triple $(\Omega,\mathbb{F},P)$ and an $\mathbb{F}$-Brownian motion $B$ on this space together with an $\mathbb{F}$-adapted process $X$ satisfying (1.1), such that $P\circ X_0^{-1} = \mu$. Uniqueness in law holds for (1.1) with the initial distribution µ if the corresponding weak solutions $X$ have the same distribution. Pathwise uniqueness holds for (1.1) with initial distribution µ if, on a common filtered probability space with a given Brownian motion $B$, for any two solutions $X$ and $Y$ with $X_0 = Y_0$ a.s. (with distribution µ), we have that $X = Y$ a.s., cf. [108, page 212-214].

Weak solutions of the equation (1.1) can also be characterized through the corresponding martingale problem: Define the process

$$M^f_t := f(X_t) - f(X_0) - \int_0^t \mathcal{A}f(X_s)\,ds, \quad t\ge 0,\ f\in C_0^\infty(\mathbb{R}^d), \tag{1.3}$$

where $C_0^\infty(\mathbb{R}^d)$ denotes the class of $C^\infty$ functions with compact support, and $\mathcal{A}$ is the second-order linear operator

$$\mathcal{A}f(x) := \sum_{i,j=1}^d \frac{1}{2}a^{i,j}(x)\,\partial_i\partial_j f(x) + \sum_{i=1}^d b^i(x)\,\partial_i f(x), \quad f\in C_0^\infty(\mathbb{R}^d),\ x\in\mathbb{R}^d. \tag{1.4}$$

A continuous process $X$ in $\mathbb{R}^d$, $d\in\mathbb{N}$ (or its distribution $P$) solves the (local) martingale problem for (1.3) if the process $M^f$ in (1.3), (1.4) is a (local) martingale for every $f\in C_0^\infty(\mathbb{R}^d)$. When the coefficients a and b are bounded, it is clearly equivalent for $M^f$ to be a local martingale or a true martingale, and the local martingale problem turns into a martingale problem [108, page 218].

The (local) martingale problem for (1.1) with initial distribution µ (or local martingale problem for $(\mathcal{A},\mu)$) is said to be well posed if it has exactly one solution $P_\mu$. If this is true for all initial distributions µ (by mixing it is sufficient if it is true for all initial distributions $\delta_x$), then the (local) martingale problem for $\mathcal{A}$ is said to be well posed [61, page 182, Chapter 4.4].

Theorem 1.2 (Martingale Problem, Existence: Stroock-Varadhan). For any distribution $P$ on the path space $C(\mathbb{R}_+,\mathbb{R}^d)$ it holds that (1.1) has a weak solution with distribution $P$ if and only if $P$ solves the local martingale problem (1.3) for all $f\in C_0^\infty(\mathbb{R}^d)$.

Proof. See [108, Theorem 21.7].

Lemma 1.3 (Strong existence and pathwise uniqueness, Yamada and Watanabe). Assume that weak existence and pathwise uniqueness hold for solutions to equation (1.1) with initial distribution µ. Then even strong existence and uniqueness in law hold for such solutions.

Proof. See [108, Lemma 21.17].

Theorem 1.4 (Pathwise uniqueness, Skorohod, Yamada and Watanabe). Let $w(\sigma,\cdot)$ denote the modulus of continuity of σ and let σ and b in (1.1) be bounded, measurable functions on $\mathbb{R}$, where

$$\int_0^\varepsilon (w(\sigma,h))^{-2}\,dh = \infty, \quad \varepsilon > 0, \tag{1.5}$$

and either b is Lipschitz continuous or σ ≠ 0. Then pathwise uniqueness holds for equation (1.1).

Proof. See [108, Theorem 23.3].

Remark 1.5. In case of the CEV model, (1.5) is fulfilled for $\beta \ge \frac12$ but not for $\beta\in[0,\frac12)$. However, imposing absorbing boundary conditions at zero, pathwise uniqueness holds for the latter parameter regime as well.
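The threshold β = 1/2 can be made plausible by a short computation. As a sketch, assume that the CEV diffusion coefficient is σ(x) = x^β with β ∈ (0, 1] and look only at small h, so that the modulus of continuity is w(σ, h) = h^β (the boundedness required in Theorem 1.4 is ignored here, since only the behaviour near zero matters for the integral):

```latex
\int_0^\varepsilon \big(w(\sigma,h)\big)^{-2}\,dh
  = \int_0^\varepsilon h^{-2\beta}\,dh
  = \begin{cases}
      \dfrac{\varepsilon^{1-2\beta}}{1-2\beta} < \infty, & 0 < \beta < \tfrac12,\\[1ex]
      \infty, & \tfrac12 \le \beta \le 1.
    \end{cases}
```

The integral therefore diverges, as required by (1.5), exactly when 2β ≥ 1.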


Theorem 1.6 (Measurability and mixtures, Stroock and Varadhan). Suppose that for any $x\in\mathbb{R}^d$, the local martingale problem for (1.3) with initial distribution $\delta_x$ has a unique solution $P_x$. Then $(P_x)$ is a kernel from $\mathbb{R}^d$ to $C(\mathbb{R}_+,\mathbb{R}^d)$, and for every initial distribution µ, the associated local martingale problem has the unique solution $P_\mu = \int_0^\infty P_x\,\mu(dx)$.

Proof. See [108, Theorem 21.10]

Theorem 1.7 (Martingale Problem and the Strong Markov Property). Let a and b be measurable functions on $\mathbb{R}^d$ such that for every $x\in\mathbb{R}^d$ the local martingale problem for (1.3) with initial distribution $\delta_x$ has a unique solution $P_x$. Then the family $(P_x)$ satisfies the strong Markov property.

Proof. See [108, Theorem 21.11, p. 421].

[Figure: diagram relating the statements "There exists a solution to the m.p. at y", "There is at most one solution to the m.p. at y", "both fulfilled for each y", "The martingale problem is well-posed", "SDE has a weak solution", "Uniqueness in law", "Pathwise uniqueness", "together imply" and "The process is a time-homogeneous strong Markov process", with arrows labelled Thm 1.2, Lem 1.3, Thm 1.4, Thm 1.6 and Thm 1.7.]


Appendix B

Some Background on Analysis on Manifolds

1 The Laplace-Beltrami operator and the Riemannian volume form

For easy reference, we recall here some well-known definitions and facts on analysis on manifolds which were used in the previous chapters. Full details can be found in [8, 39, 81, 118]. In order to introduce the Laplace-Beltrami operator, which generalizes the Laplace operator to the setting of manifolds, we briefly recall some related definitions and facts and introduce some notation, which will be used in the following sections.

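Definition 1.1. Let $M_1$ and $M_2$ be smooth¹ manifolds and let

\[ \phi : M_2 \longrightarrow M_1 \]

be a $C^k$-diffeomorphism (for $k \ge 1$) between $M_2$ and $M_1$. Then for any $f \in C^1(M_1)$, the map $\phi$ induces a pullback $\phi^* f \in C^1(M_2)$ by composition

\[ \phi^* : C^1(M_1) \longrightarrow C^1(M_2), \qquad f \longmapsto \phi^* f := f \circ \phi, \]

and a pushforward $\phi_* : TM_2 \longrightarrow TM_1$ on the tangent bundle by

\[ \phi_* : T_z(M_2) \longrightarrow T_{\phi(z)}(M_1), \qquad X \longmapsto \phi_* X \]

for any $z \in M_2$, where the action of $\phi_* X$ on a smooth map $f : M_1 \longrightarrow \mathbb{R}$ is $(\phi_* X) f := X(f \circ \phi) = X(\phi^* f)$.
The pushforward of tangent vectors $\phi_* X_2$ induces a pullback of covectors $\phi^* \omega_1$ as follows: for any $z \in M_2$ let $T_z^*(M_2)$ denote the dual space of $T_z(M_2)$. Then

\[ \phi^* : T^*_{\phi(z)}(M_1) \longrightarrow T^*_z(M_2), \qquad \omega_1 \longmapsto \phi^* \omega_1, \]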

¹ If not stated otherwise, smooth means $C^\infty$ throughout this section.


where the action of $\phi^* \omega_1$ on any $X_2 \in T_z(M_2)$ is given by $(\phi^* \omega_1)(X_2) = \omega_1(\phi_* X_2)$.

Definition 1.2 (Differential Forms). Let $M$ be a smooth manifold and let ${}^k\Omega^l(M)$ denote the vector space of differential $k$-forms of class $C^l$, $k, l \in \mathbb{N}$, i.e. the vector space of $C^l$-smooth sections of the bundle

\[ \Lambda^k M = \coprod_{z \in M} \Lambda^k(T_z M), \]

where for any vector space $V$, $\Lambda^k(V) \subset T^k(V)$ denotes the subspace of alternating tensors ($k$-covectors) on $V$. A $k$-covector $T(X_1, \dots, X_k) \in T^k(V)$ is alternating if for any $i, j \in \{1, \dots, k\}$

\[ T(X_1, \dots, X_i, \dots, X_j, \dots, X_k) = -T(X_1, \dots, X_j, \dots, X_i, \dots, X_k). \]

$\Lambda^k M$ is a smooth sub-vector-bundle of $T^k(M)$ of rank $\binom{n}{k}$ over $M$, where $n = \dim(M)$.

We will mainly devote our attention to the cases $k = 0, 1, 2$. Note that ${}^0\Omega^l(M) = C^l(M)$ consists of functions of class $C^l$.

Definition 1.3. Let $(M, g)$ be a smooth Riemannian manifold² of dimension $n$. There exists a unique smooth nonvanishing $n$-form $\mu \in {}^n\Omega(M)$, the Riemannian volume form, also called the Riemannian volume element³, such that

\[ \mu(E_1, \dots, E_n) = 1, \]

whenever $(E_1, \dots, E_n)$ is an $n$-tuple of vector fields

\[ E_i : M \longrightarrow TM, \qquad z \longmapsto E_i(z) \in T_z M, \qquad i \in \{1, \dots, n\}, \]

with the property that at each point $z \in M$ the tuple $(E_1(z), \dots, E_n(z))$ forms an orthonormal basis of the tangent space $T_z M$ with respect to the Riemannian metric:

\[ \langle E_i(z), E_j(z) \rangle_{g(z)} = \delta_i^j, \qquad i, j \in \{1, \dots, n\}. \]

Proof. See [117, Proposition 13.22. page 342] for the proof of the existence anduniqueness statement.

Remark 1.4. Let $d\mu_g$ denote the volume element of a smooth Riemannian manifold $(M, g)$ and let $(U, \varphi)$, $\varphi(z) = (x^1, \dots, x^n)$, be any smooth coordinate chart. Then $d\mu_g$ has the following local coordinate representation with respect to $\varphi$:

\[ d\mu_g = \sqrt{\det(g_{ij})}\; dx^1 \wedge \dots \wedge dx^n, \]

where $g_{ij}$ are the components of the Riemannian metric in these coordinates.

² We tacitly assume $M$ to be oriented throughout this section.
³ In dimension $n = 2$ also called the area element.


Example 1.5. Hyperbolic volume element in orthogonal coordinates $z = (x, y)$:

\[ d\mu_h(z) = \frac{1}{y^2}\,dx\,dy. \tag{1.1} \]

Hyperbolic volume element in polar coordinates.

dµh(z) = sinh(t)dαdξ (1.2)
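As a quick consistency check of (1.1) against Remark 1.4, the following sketch assumes the standard Poincaré half-plane metric $h = y^{-2}(dx^2 + dy^2)$ on $\{y > 0\}$; the metric itself is not restated in this appendix, so this is an assumption.

```latex
\[
  (h_{ij}) = \begin{pmatrix} y^{-2} & 0 \\ 0 & y^{-2} \end{pmatrix},
  \qquad
  \det(h_{ij}) = y^{-4},
  \qquad
  d\mu_h = \sqrt{\det(h_{ij})}\;dx \wedge dy = \frac{1}{y^{2}}\,dx \wedge dy .
\]
```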

Lemma 1.6 (Volume Element as Riemannian Measure). Let $(M, g)$ be a smooth Riemannian manifold of dimension $n$ with Riemannian volume form $\mu_g$ and let $\Lambda(M)$ denote the σ-algebra of all measurable sets⁴ in $M$. Then $\mu_g$ is the unique measure on $\Lambda(M)$ such that for any coordinate chart $(U, \varphi)$, $\varphi = (x^1, \dots, x^n)$,

\[ d\mu_g(z) = \sqrt{\det(g_{ij})}\; dx^1 \dots dx^n \tag{1.3} \]

for the Lebesgue measure $dx^1 \dots dx^n$, in the sense that if $A \in \Lambda(U)$ is a measurable set⁵ in the chart domain $U$, then

\[ \mu_g(A) = \int_{\varphi(U)} I_{\varphi(A)} \sqrt{\det(g_{ij})}\; dx^1 \dots dx^n \tag{1.4} \]

for the Lebesgue measure $dx^1 \dots dx^n$ in $\varphi(U) \subset \mathbb{R}^n$. Furthermore, the measure $\mu_g$ is complete and $\mu_g(K) < \infty$ for any compact set $K \subset M$.

Proof. See [81, page 60-62]. An essential point is that the value of the integral (1.4) remains invariant under a change of charts, and therefore (1.3) can be uniquely extended from $\Lambda(U)$ to all measurable sets on $M$.

Remark 1.7. The above implies in particular that within a coordinate chart $\varphi(U)$, the measure $\mu_g$ is absolutely continuous with respect to the Lebesgue measure, with Radon-Nikodým derivative $\sqrt{\det(g_{ij}(\,\cdot\,))}$. In fact, the two measures are even equivalent, since $\det(g_{ij}(\,\cdot\,)) > 0$ on all of $\varphi(U)$ by positive definiteness of the Riemannian metric tensor $g$.

Lemma 1.8. (a) The pair $(M, \mu_g)$ forms a measure space, on which the notions of measurable and integrable functions are well defined, as well as the Lebesgue function spaces $L^p(M, \mu_g)$. In the case $p = 2$ and $f, g \in L^2(M, \mu_g)$,

\[ (f, g)_{L^2} = \int_M f g \, d\mu_g \]

defines a scalar product, hence $H = L^2(M, \mu_g)$ is a Hilbert space.

(b) Similarly, the space $\vec{L}^2(M, \mu_g)$ contains equivalence classes of measurable vector fields $v$ which satisfy $\langle v, v \rangle_g^{1/2} \in L^2(M, \mu_g)$. Equipped with the inner product

\[ (v, w)_{\vec{L}^2(M, \mu_g)} := \int_M \langle v, w \rangle_g \, d\mu_g, \]

$\vec{L}^2(M, \mu_g)$ is a Hilbert space.

⁴ A set $E \subset M$ is measurable, i.e. $E \in \Lambda(M)$, if for any chart $U$, $E \cap U \in \Lambda(U)$.
⁵ $\Lambda(U)$ denotes the Lebesgue measurable sets in the chart domain $U$.


Corollary 1.9. Since the Riemannian measure $\mu_g$ is finite on compact sets, any continuous function with compact support is integrable against $\mu_g$.
Furthermore, if $h \in C(M)$ and

\[ \int_M h(z) f(z) \, d\mu_g(z) = 0 \]

for all test functions $f \in C_0^\infty(M)$, then $h \equiv 0$.

Proof. See [81, Lemma 3.13, page 63].

Definition 1.10 (Exterior Derivative). Let $M$ be a smooth manifold. Then there exist unique linear maps

\[ d : {}^k\Omega(M) \longrightarrow {}^{k+1}\Omega(M) \]

defined for each integer $k \ge 0$, satisfying the following three conditions:

(a) If $f$ is a smooth real-valued function, then $df$ is the differential of $f$,

\[ df(X) = X(f). \]

(b) If ${}^k\omega \in {}^k\Omega(M)$ and ${}^l\eta \in {}^l\Omega(M)$, then

\[ d({}^k\omega \wedge {}^l\eta) = d({}^k\omega) \wedge {}^l\eta + (-1)^k \, {}^k\omega \wedge d({}^l\eta). \]

(c) $d \circ d = 0$.

Proof. See [117, Theorem 12.14, page 306].

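Definition 1.11 (Gradient). Let $(M, g)$ be a smooth $n$-dimensional Riemannian manifold and let $f : M \longrightarrow \mathbb{R}$ be a smooth function (i.e. $f \in C^\infty(M)$) on $M$. Furthermore, let $\mathcal{V}(M)$ resp. ${}^1\Omega(M)$ denote the set of smooth vector fields resp. smooth 1-forms on $M$. Then the gradient is defined by the commutativity of the following diagram:

\[ C^\infty(M) \xrightarrow{\ \mathrm{grad}_g\ } \mathcal{V}(M) \xrightarrow[\ \cong\ ]{\ \theta_g\ } {}^1\Omega(M), \qquad \theta_g \circ \mathrm{grad}_g = d, \]

that is, for any $f \in C^\infty(M)$ its gradient is defined as

\[ \mathrm{grad}_g(f) := (\theta_g)^{-1} df, \tag{1.5} \]

where the map

\[ \theta_g : \mathcal{V}(M) \longrightarrow {}^1\Omega(M), \qquad X(z) \longmapsto \langle X(z), \cdot \rangle_{g(z)} \tag{1.6} \]

denotes the Riesz isomorphism. Here the exterior derivative

\[ d : C^\infty(M) \longrightarrow {}^1\Omega(M), \qquad f \longmapsto df \]

is the differential of $f$, characterized by the action $df(X) := X(f)$ on smooth vector fields $X \in \mathcal{V}(M)$.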


Remark 1.12. For a smooth Riemannian $n$-manifold $(M, g)$ and a smooth function $f \in C^\infty(M)$ the gradient $\mathrm{grad}_g(f)$ can be characterized as follows: it is the unique vector field $Y \in \mathcal{V}(M)$ such that

\[ \langle Y, X \rangle_g = X(f) \tag{1.7} \]

for all $X \in \mathcal{V}(M)$. This characterization allows for a coordinate representation within coordinate charts $(U, \varphi)$, $\varphi = (x^1, \dots, x^n)$.

Lemma 1.13. For any differentiable function $f$, the $i$-th coordinate of its gradient at any $z \in U$ with respect to the basis $\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}$ of $T_z M$ is

\[ \bigl(\mathrm{grad}_g(f)\bigr)^i = \sum_{j=1}^n g^{ij} \frac{\partial f}{\partial x^j}, \tag{1.8} \]

where $g^{ij}$ denotes the components of the inverse matrix $(g^{ij}) := (g^{-1})_{ij}$ of the Riemannian metric $g$ in these coordinates.
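To illustrate (1.8) together with the characterization (1.7), here is a short worked example. It assumes the Poincaré half-plane metric $h = y^{-2}(dx^2 + dy^2)$, which is not restated in this appendix.

```latex
\[
  (h^{ij}) = \begin{pmatrix} y^{2} & 0 \\ 0 & y^{2} \end{pmatrix}
  \quad\Longrightarrow\quad
  \mathrm{grad}_h f = y^{2}\,\frac{\partial f}{\partial x}\,\frac{\partial}{\partial x}
                    + y^{2}\,\frac{\partial f}{\partial y}\,\frac{\partial}{\partial y},
\]
\[
  \langle \mathrm{grad}_h f, X \rangle_h
  = \frac{1}{y^{2}}\Bigl( y^{2}\,\frac{\partial f}{\partial x}\,X^{1}
                        + y^{2}\,\frac{\partial f}{\partial y}\,X^{2} \Bigr)
  = X^{1}\frac{\partial f}{\partial x} + X^{2}\frac{\partial f}{\partial y}
  = X(f).
\]
```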

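Definition 1.14 (Divergence Operator). Let $M$ be a smooth manifold, and let $\mathcal{V}(M)$ denote the set of smooth⁶ vector fields and $C^\infty(M)$ the set of smooth functions on $M$. Then the divergence operator is defined by the commutativity of the following diagram:

\[
\begin{array}{ccc}
\mathcal{V}(M) & \xrightarrow{\ \mathrm{div}_g\ } & C(M) \\
\big\downarrow {\scriptstyle \lrcorner\, \mu_g} & & \big\downarrow {\scriptstyle \bullet\, \mu_g} \\
{}^1\Omega(M) & \xrightarrow{\ \ d\ \ } & {}^2\Omega(M)
\end{array}
\]

that is,

\[ \mathrm{div}_g : \mathcal{V}(M) \longrightarrow C(M), \qquad X \longmapsto (\bullet\, \mu_g)^{-1}\, d(X \lrcorner\, \mu_g), \tag{1.9} \]

or equivalently

\[ d(X \lrcorner\, \mu_g) = (\mathrm{div}_g X)\, \mu_g, \]

where $\mu_g$ denotes the Riemannian volume element on $(M, g)$, and the maps $\bullet$ and $\lrcorner$ are defined as follows: for a ${}^2\omega(\cdot, \cdot) \in {}^2\Omega(M)$

\[ \bullet\, {}^2\omega : C(M) \longrightarrow {}^2\Omega(M), \qquad f \longmapsto f\, {}^2\omega, \]

and for a ${}^2\omega(\cdot, \cdot) \in {}^2\Omega(M)$

\[ \lrcorner\, {}^2\omega : \mathcal{V}(M) \longrightarrow {}^1\Omega(M), \qquad X \longmapsto {}^2\omega(X, \cdot), \]

⁶ $X \in \mathcal{V}(M)$ if for every $z \in M$ there is a chart $(\phi, U)$ about $z$ such that the pushforward $\phi_* X \in \mathcal{V}(\phi(U))$.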


furthermore,

\[ d : {}^1\Omega(M) \longrightarrow {}^2\Omega(M), \qquad {}^1\omega(\cdot) \longmapsto (d\, {}^1\omega)(\cdot, \cdot) \]

denotes the exterior derivative introduced in Definition 1.10.

One can alternatively define the divergence with the help of the gradient through the divergence theorem as follows.

Proposition 1.15 (Divergence Theorem). For any $C^\infty$-vector field $X$ on a Riemannian manifold $(M, g)$ there is a unique smooth function $\mathrm{div}_g X$ on $M$ such that the following identity holds:

\[ \int_M (\mathrm{div} X)\, f \, d\mu_g = -\int_M \langle X, \mathrm{grad}_g f \rangle_g \, d\mu_g \tag{1.10} \]

for all $f \in C_0^\infty(M)$.

The uniqueness statement relies on Corollary 1.9, which immediately implies

\[ (\mathrm{div} X)_1 = (\mathrm{div} X)_2 \]

for any two functions satisfying (1.10). One of the advantages of the divergence theorem is that it delivers a local coordinate representation of $\mathrm{div} X$.

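Lemma 1.16 (Coordinate Representation of the Divergence Operator). In a coordinate chart $(U, \varphi)$, $\varphi = (x^1, \dots, x^n)$, we have

\[ X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i}, \]

and given the local coordinate representations of $\mathrm{grad} f$ and $d\mu_g$, we find

\[ \int_M \langle X, \mathrm{grad} f \rangle_g \, d\mu_g = -\int_M \Bigl( \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i} \bigl( X^i \sqrt{\det g} \bigr) \Bigr) f \, d\mu_g \]

for any $f \in C_0^\infty$, and therefore

\[ \mathrm{div} X = \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i} \bigl( X^i \sqrt{\det g} \bigr), \tag{1.11} \]

where summation over the repeated index $i$ is understood.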

Proof. See: [81, p. 64-65.]
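As a worked instance of (1.11), the following sketch again assumes the Poincaré half-plane metric $h = y^{-2}(dx^2 + dy^2)$, for which $\sqrt{\det h} = y^{-2}$ as in (1.1).

```latex
\[
  \mathrm{div}_h X
  = y^{2}\Bigl[ \frac{\partial}{\partial x}\bigl( X^{1} y^{-2} \bigr)
              + \frac{\partial}{\partial y}\bigl( X^{2} y^{-2} \bigr) \Bigr]
  = \frac{\partial X^{1}}{\partial x} + \frac{\partial X^{2}}{\partial y}
    - \frac{2}{y}\,X^{2}.
\]
```

The last term is the correction to the Euclidean divergence caused by the $y$-dependence of the volume element.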

Definition 1.17 (Laplace-Beltrami Operator). Let $(M, g)$ be a $C^{k+2}$-smooth⁷ Riemannian manifold for an arbitrary $k \in \mathbb{N} \cup \{\infty\}$, and let $\mathcal{V}^{k+1}(M)$ denote the set of $C^{k+1}$-smooth⁸ vector fields and $C^k(M)$ the set of $k$-times continuously differentiable⁹ functions on $M$. Then we define the Laplace-Beltrami operator on $(M, g)$ as the composition

\[ \Delta_g f = \mathrm{div}_g(\mathrm{grad}_g f) \]

of the gradient and divergence operators, i.e. through the commutativity of the following diagram:

\[ C^{k+2}(M) \xrightarrow{\ \mathrm{grad}_g\ } \mathcal{V}^{k+1}(M) \xrightarrow{\ \mathrm{div}_g\ } C^k(M), \qquad \Delta_g = \mathrm{div}_g \circ \mathrm{grad}_g. \]

⁷ We require the chart transitions to be $C^{k+1}$-smooth.
⁸ $X \in \mathcal{V}^{k+1}(M)$ if for every $z \in M$ there is a chart $(U, \varphi)$ about $z$ such that the pushforward $\varphi_* X \in \mathcal{V}^{k+1}(\varphi(U))$, i.e. a $C^{k+1}$-smooth vector field on $\varphi(U) \subset \mathbb{R}^n$.

Lemma 1.18 (Coordinate Representation). Combining the above coordinate representations of the gradient and divergence operators, we obtain a representation of the Laplace-Beltrami operator in terms of a given coordinate chart:

\[ \Delta_g = \sum_{i=1}^n \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i} \Bigl( \sqrt{\det g} \sum_{j=1}^n g^{ij} \frac{\partial}{\partial x^j} \Bigr). \tag{1.12} \]

Of course, the representation of the Laplace-Beltrami operator in coordinates depends on the choice of coordinate system. For example, in orthogonal coordinates the above expression (1.12) simplifies to

\[ \Delta_g = \frac{1}{\sqrt{\det g}} \sum_{i=1}^n \frac{\partial}{\partial x^i} \Bigl( \sqrt{\det g}\; g^{ii} \frac{\partial}{\partial x^i} \Bigr). \]

For a representation in polar coordinates see Section 1.
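The coordinate formula (1.12) can be checked numerically. The sketch below is an illustration only: it evaluates (1.12) for $n = 2$ with nested central differences and compares the result with $y^2(f_{xx} + f_{yy})$, which is what (1.12) gives for the Poincaré half-plane metric $y^{-2}(dx^2 + dy^2)$. The metric, the test function and all function names are assumptions made for this example.

```python
import math


def laplace_beltrami(f, g_inv, sqrt_det_g, x, y, h=1e-3):
    """Evaluate the coordinate formula (1.12) for n = 2 by central differences."""

    def flux(i, px, py):
        # i-th component of sqrt(det g) * sum_j g^{ij} * df/dx^j at (px, py)
        df = ((f(px + h, py) - f(px - h, py)) / (2 * h),
              (f(px, py + h) - f(px, py - h)) / (2 * h))
        g = g_inv(px, py)
        return sqrt_det_g(px, py) * (g[i][0] * df[0] + g[i][1] * df[1])

    div = ((flux(0, x + h, y) - flux(0, x - h, y)) / (2 * h)
           + (flux(1, x, y + h) - flux(1, x, y - h)) / (2 * h))
    return div / sqrt_det_g(x, y)


# Assumed example metric: Poincare half-plane, g = y^-2 (dx^2 + dy^2).
def hyp_g_inv(x, y):
    return ((y * y, 0.0), (0.0, y * y))


def hyp_sqrt_det(x, y):
    return 1.0 / (y * y)


def f(x, y):
    return math.sin(x) * y ** 3


def hyp_laplacian_exact(x, y):
    # y^2 (f_xx + f_yy) for f = sin(x) y^3
    return y * y * (-math.sin(x) * y ** 3 + 6.0 * math.sin(x) * y)


numeric = laplace_beltrami(f, hyp_g_inv, hyp_sqrt_det, 0.7, 1.3)
exact = hyp_laplacian_exact(0.7, 1.3)
print(abs(numeric - exact))  # small discretisation error
```

The same routine accepts any inverse metric and volume density, so it can be reused for non-diagonal metrics as well.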

Finally we recall that the Laplace-Beltrami operator $\Delta_g$ on a Riemannian manifold satisfies Green's formula.

Proposition 1.19 (Green's Formula). If $u$ and $v$ are smooth functions on a smooth Riemannian manifold $M$ and one of them has compact support, then

\[ \int_M u\, \Delta_g(v) \, d\mu_g = -\int_M \langle \mathrm{grad}(u), \mathrm{grad}(v) \rangle_g \, d\mu_g = \int_M \Delta_g(u)\, v \, d\mu_g. \tag{1.13} \]

Proof. The above statement follows from the Divergence theorem (cf. Proposition 1.15). Note that the equality (1.10) in the Divergence theorem still holds if $u \in C^\infty$ is not necessarily compactly supported and $X$ is a compactly supported $C^\infty$ vector field on $M$; then set $X = \mathrm{grad}(v)$. See [81, Theorem 3.16, p. 67].

⁹ Differentiability after precomposition with a chart, i.e. of $f \circ \varphi^{-1} : \varphi(U) \subset \mathbb{R}^n \longrightarrow \mathbb{R}$.


1 The SABR Laplace-Beltrami operators

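The generators of the Brownian motions on $(S, g)$ resp. $(U, u)$ (defined in Section 3.1) are defined on their respective spaces (excluding $x = 0$ for $\beta \neq 0$) and read

\[ \Delta_g f = y^2 \Bigl( \beta x^{2\beta - 1} \frac{\partial f}{\partial x} + x^{2\beta} \frac{\partial^2 f}{\partial x^2} + 2\rho x^\beta \frac{\partial^2 f}{\partial x\, \partial y} + \frac{\partial^2 f}{\partial y^2} \Bigr), \qquad f \in C^{k+2}(S), \]

\[ \Delta_u f = y^2 \Bigl( \beta x^{2\beta - 1} \frac{\partial f}{\partial x} + x^{2\beta} \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \Bigr), \qquad f \in C^{k+2}(U), \]

while the infinitesimal generators of the original SABR model (IV.1.1) are

\[ A f = y^2 \Bigl( x^{2\beta} \frac{\partial^2 f}{\partial x^2} + 2\rho x^\beta \frac{\partial^2 f}{\partial x\, \partial y} + \frac{\partial^2 f}{\partial y^2} \Bigr), \qquad f \in C^{k+2}(S), \]

\[ A_{\rho = 0} f = y^2 \Bigl( x^{2\beta} \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \Bigr), \qquad f \in C^{k+2}(U). \]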

Note that for β = 0 the operators ∆g and A (resp. ∆u and Aρ=0) coincide.
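The relation between $\Delta_g$ and $A$ can be checked numerically against the coordinate formula (1.12). The sketch below is hedged: the metric of Section 3.1 is not reproduced in this appendix, so the code assumes an inverse metric $(g^{ij}) = y^2 \begin{pmatrix} x^{2\beta} & \rho x^\beta \\ \rho x^\beta & 1 \end{pmatrix}$ with $\sqrt{\det g} = \bigl(\sqrt{1-\rho^2}\, x^\beta y^2\bigr)^{-1}$. The parameter values, the test function and all names are likewise illustrative assumptions.

```python
import math

BETA, RHO = 0.6, -0.3  # illustrative parameter values (assumption)


def g_inv(x, y):
    # Assumed inverse SABR metric: y^2 [[x^(2b), rho x^b], [rho x^b, 1]]
    xb = x ** BETA
    return ((y * y * xb * xb, y * y * RHO * xb), (y * y * RHO * xb, y * y))


def sqrt_det_g(x, y):
    # Square root of det g for the assumed metric
    return 1.0 / (math.sqrt(1.0 - RHO ** 2) * x ** BETA * y * y)


def laplace_beltrami(f, x, y, h=1e-3):
    """Formula (1.12) for n = 2, evaluated with nested central differences."""

    def flux(i, px, py):
        df = ((f(px + h, py) - f(px - h, py)) / (2 * h),
              (f(px, py + h) - f(px, py - h)) / (2 * h))
        g = g_inv(px, py)
        return sqrt_det_g(px, py) * (g[i][0] * df[0] + g[i][1] * df[1])

    div = ((flux(0, x + h, y) - flux(0, x - h, y)) / (2 * h)
           + (flux(1, x, y + h) - flux(1, x, y - h)) / (2 * h))
    return div / sqrt_det_g(x, y)


def f(x, y):
    return math.sin(x) * math.exp(-y)


def closed_forms(x, y):
    """Return (Delta_g f, A f) from the closed-form expressions in the text."""
    e = math.exp(-y)
    fx, fxx = math.cos(x) * e, -math.sin(x) * e
    fxy, fyy = -math.cos(x) * e, math.sin(x) * e
    a_f = y * y * (x ** (2 * BETA) * fxx + 2 * RHO * x ** BETA * fxy + fyy)
    drift = y * y * BETA * x ** (2 * BETA - 1) * fx  # first-order term of Delta_g
    return a_f + drift, a_f


delta_g_f, a_f = closed_forms(1.2, 0.8)
print(abs(laplace_beltrami(f, 1.2, 0.8) - delta_g_f))  # small discretisation error
```

Under these assumptions the two operators differ only by the first-order term $\beta y^2 x^{2\beta-1}\,\partial f/\partial x$, which vanishes for $\beta = 0$.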

2 Heat equation and heat kernel on manifolds

Definition 2.1 (Fundamental Solution). Let $(M, g)$ be a smooth Riemannian manifold. A fundamental solution at $Z \in M$ of the heat equation on $(M, g)$ is a smooth function

\[ p_Z : (0, \infty) \times M \longrightarrow \mathbb{R}, \qquad (t, z) \mapsto p_Z(t, z), \]

which satisfies the following conditions:

i) $p_Z$ solves the heat equation on $(M, g)$, i.e.

\[ \Delta_g p_Z(\cdot, \cdot) = \frac{\partial}{\partial t} p_Z(\cdot, \cdot), \]

where $\Delta_g$ denotes the Laplace-Beltrami operator on $(M, g)$ acting on the space variable $z$,

ii) and $\lim_{t \downarrow 0} p_Z(t, \cdot) = \delta_Z$, where $\delta_Z$ denotes the Dirac delta distribution¹⁰ at $Z \in M$. The convergence is to be understood in distributions on $M$ in the sense that

\[ \lim_{t \downarrow 0} \int_M p_Z(t, z) f(z) \, d\mu_g(z) = f(Z) \]

for all test functions $f \in C_0^\infty(M)$, where $d\mu_g(z)$ stands for the Riemannian volume element at $z \in M$.

If in addition $p_Z \ge 0$ and, for all $t > 0$,

\[ \int_M p_Z(t, z) \, d\mu_g(z) \le 1, \]

then $p_Z$ is called a regular fundamental solution.

¹⁰ Cf. Definition 2.10.


1 Operator semigroup and self-adjointness

Definition 2.2. Let $H$ be a Hilbert space. Given an operator $L$ and a vector $f$ in $H$, the Cauchy problem associated to $L$ and $f$ is to find a path $u : (0, \infty) \longrightarrow H$ such that

\[ \frac{du}{dt} = -Lu, \quad t > 0, \qquad u|_{t=0} = f. \tag{2.1} \]

The above is short notation for the requirement that the following conditions are satisfied:

(a) $u(t)$ is strongly differentiable for all $t > 0$ with derivative $\frac{du}{dt} \in H$.

(b) For any $t > 0$, $u(t) \in \mathrm{dom}\, L$ and satisfies $\frac{du}{dt} = -Lu$.

(c) $\lim_{t \to 0} u(t) = f$, where the convergence is in the norm of $H$.

Example 2.3 (The $L^2(M)$-Cauchy Problem). Let $(M, g)$ be a smooth Riemannian manifold and let $\mu_g$ denote the associated Riemannian volume element. Recall from Lemma 1.8 that $H = L^2(M, \mu_g)$ is a Hilbert space. Then for an $f \in L^2(M, \mu_g)$, the $L^2(M)$-Cauchy problem associated to $f$ and the Laplace-Beltrami operator¹¹, i.e. $L = -\Delta_g$, is the problem of finding a path of functions $u(t, z)$ on $(0, \infty) \times M$ such that $u(t, \cdot) \in L^2(M, \mu_g)$ for any $t > 0$ and such that (2.1) is satisfied in the sense of Definition 2.2.

Remark 2.4. The system (2.1) is reminiscent of a linear ordinary differential equation, which suggests guessing the solution path to be

\[ u(t) = e^{-tL} f \]

in some sense that is yet to be specified. Trying to represent $e^{-tL}$ as an exponential series

\[ e^A = \mathrm{id} + A + \frac{A^2}{2} + \frac{A^3}{3!} + \dots, \qquad A = -tL, \]

is in general not helpful, see [81, p. 112-117] for full details. One of the reasons is that in general the set of functions $f \in L^2(M, \mu_g)$ for which all powers $A^k f$ exist in $L^2(M, \mu_g)$ and for which the series also converges is too small. However, if $L$ is a self-adjoint operator, one can apply spectral theory to define $e^{-tL}$. This is the aim of Propositions 2.6 and 2.8 below.

We briefly introduce the notation and concepts that are necessary to formulate Proposition 2.8, and refer to the excellent Appendix A.5 of [81, p. 444-455] for details.

¹¹ For this it is necessary to define an appropriate extension of the operator $\Delta_g$, the weak Laplacian. See [81, p. 99].


Definition 2.5. Let $H$ be a Hilbert space with scalar product $(\cdot, \cdot)$, and let $A$ be a densely defined linear operator on $H$ with domain $\mathrm{dom}(A) = D$.

(a) Consider the subspace

\[ D^* = \{ y \in H : \exists!\, a \in H \text{ such that } (Ax, y) = (x, a) \text{ for all } x \in D \}; \]

then the adjoint operator on $\mathrm{dom}(A^*) = D^*$ is defined by $A^* y = a$. This yields by construction the identity

\[ (Ax, y) = (x, A^* y) \quad \text{for all } x \in D,\ y \in D^*. \tag{2.2} \]

(b) An operator $A$ is symmetric if

\[ (Ax, y) = (x, Ay) \quad \text{for all } x, y \in D. \tag{2.3} \]

Then $D \subset D^*$ is always true, and both are dense in $H$. If in addition $D = D^*$ holds, then we call $A$ self-adjoint.

(c) The operator $A$ is non-negative definite if

\[ (Ax, x) \ge 0 \quad \text{for all } x \in D. \tag{2.4} \]

Proposition 2.6 (Spectral Theorem for Unbounded Operators). Let $A$ be a self-adjoint operator on a real Hilbert space $H$ and let $\mathrm{spec}(A)$ denote its spectrum. Then there exists a unique spectral resolution, i.e. a family $\{E_\lambda\}_{\lambda \in \mathbb{R}}$ of projectors, such that the following decomposition holds:

\[ A = \int_{\mathrm{spec}(A)} \lambda \, dE_\lambda. \]

Furthermore, for any Borel function $\varphi$ defined on $\mathrm{spec}(A)$, the operator $\varphi(A)$ can be defined by

\[ \varphi(A) := \int_{\mathrm{spec}(A)} \varphi(\lambda) \, dE_\lambda, \]

and its domain $\mathrm{dom}(\varphi(A))$ is dense in $H$.

Proof. See [81, Appendix A.5.4, p.452-454].

It holds furthermore, that if ϕ is bounded on spec(A), then the operator ϕ(A)is bounded and therefore uniquely extends to an operator defined on all H.

Definition 2.7 (Heat Semigroup). Let $L$ be a self-adjoint, non-negative definite operator, i.e. $\mathrm{spec}(L) \subset [0, \infty)$. Then the family $\{P_t\}_{t \ge 0}$ defined by

\[ P_t := e^{-tL} = \int_{\mathrm{spec}(L)} e^{-t\lambda} \, dE_\lambda = \int_0^\infty e^{-t\lambda} \, dE_\lambda \tag{2.5} \]

is called the heat semigroup associated with $L$.


The following theorem is used not only in the derivation of the hyperbolic heat kernel presented in [90], but also in Section 2.4 in solving the original initial value problem.

Proposition 2.8. For any non-negative definite, self-adjoint operator $L$ in a Hilbert space $H$ and for any given $f \in H$, there is a unique solution of the Cauchy problem corresponding to $L$ and $f$, which is given by the path

\[ u(t) = P_t f, \]

where $P_t$ denotes the heat semigroup (2.5). In particular the family $\{P_t\}_{t \ge 0}$ satisfies the semigroup identity $P_t P_s = P_{t+s}$ for all $t, s \ge 0$, and the mapping $t \mapsto P_t$ is strongly continuous¹² on $[0, \infty)$ with $P_0 = \mathrm{Id}$.
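A finite-dimensional toy model may help to see (2.5) and Proposition 2.8 at work. The sketch below is an illustration, not part of the cited theory: it takes $H = \mathbb{R}^N$ and, as an assumed example operator, the non-negative graph Laplacian $L = 2I - S - S^{\top}$ of a cycle with $N$ nodes, whose spectral resolution is given explicitly by Fourier modes with eigenvalues $\lambda_k = 2 - 2\cos(2\pi k / N)$. All names are hypothetical.

```python
import math

N = 8  # number of nodes on the cycle (illustrative choice)


def generator():
    """L = 2I - S - S^T, the non-negative graph Laplacian of the N-cycle."""
    L = [[0.0] * N for _ in range(N)]
    for j in range(N):
        L[j][j] = 2.0
        L[j][(j + 1) % N] -= 1.0
        L[j][(j - 1) % N] -= 1.0
    return L


def heat_semigroup(t):
    """P_t = sum_k exp(-t * lambda_k) E_k, using the Fourier eigenbasis of L."""
    lam = [2.0 - 2.0 * math.cos(2.0 * math.pi * k / N) for k in range(N)]
    return [[sum(math.exp(-t * lam[k]) * math.cos(2.0 * math.pi * k * (j - l) / N)
                 for k in range(N)) / N
             for l in range(N)]
            for j in range(N)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(N)) for i in range(N)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(N) for j in range(N))


# Semigroup identity P_t P_s = P_{t+s} and initial condition P_0 = Id.
identity = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
print(max_abs_diff(matmul(heat_semigroup(0.3), heat_semigroup(0.5)), heat_semigroup(0.8)))
print(max_abs_diff(heat_semigroup(0.0), identity))
```

In this setting $u(t) = P_t f$ solves $du/dt = -Lu$ with $u(0) = f$, which can be confirmed by a difference quotient in $t$.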

Remark 2.9. In Appendix A.1, the authors of [90] show symmetry and nonnegativity for the Laplace-Beltrami operator $\Delta_h$ of the Poincaré half-plane. In order to establish self-adjointness, the domain of definition still has to be specified for $\Delta_h$ in the Hilbert space $H = L^2(\mathbb{H}^2, d\mu_h)$ given in (104) of [90].
Note that we also chose not to fix a specific domain for $\Delta_g$ when introducing the Laplace-Beltrami operator of a Riemannian manifold $(M, g)$ in Definition 1.17. But in any case, already the smallest choice of domain (i.e. $\mathrm{dom}(\Delta_g) = C_0^\infty(M)$) is a dense subspace of $L^2(M, d\mu_g)$, and Green's formula (Proposition 1.19) yields symmetry of $\Delta_g$, as

\[ (\Delta_g u, v)_{L^2} = (u, \Delta_g v)_{L^2} \quad \text{for all } u, v \in C_0^\infty(M). \]

Thus if we set $A := \Delta_g|_{C_0^\infty(M)}$, then $A$ is a densely defined symmetric operator on the Hilbert space $H = L^2(M, d\mu_g)$, with $\mathrm{dom}(A) \subset \mathrm{dom}(A^*)$, where the inclusion is strict¹³. If $B$ is a self-adjoint extension¹⁴ of $A$, then

\[ \mathrm{dom}(A) \subset \mathrm{dom}(B) = \mathrm{dom}(B^*) \subset \mathrm{dom}(A^*). \]

Hence, the problem of constructing a self-adjoint extension of $\Delta_g|_{C_0^\infty(M)}$ amounts to an appropriate choice of $\mathrm{dom}(B)$, and specifying the domain of definition for $\Delta_g$ will involve an extension of the operator to an appropriate subspace of $L^2(M, d\mu_g)$, which for $C^{k+2}$-smooth functions coincides with the operator introduced in Definition 1.17. (See [81, p. 103-104].)

Definition 2.10. (Distributional Laplacian and Weak Laplacian)

(a) For any smooth manifold $M$ the space of test functions $\mathcal{D}(M)$ consists of the set $C_0^\infty(M)$, equipped with the appropriate topology¹⁵.

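¹² For any $t \ge 0$ and $f \in H$, $\lim_{s \to t} P_s f = P_t f$, where the convergence is in the norm of $H$.
¹³ Recall that $\mathrm{dom}(A^*) = \{ u \in L^2(M) : \exists f \in L^2(M) \text{ such that } (Av, u)_{L^2} = (v, f)_{L^2}\ \forall v \in \mathrm{dom}(A) \} = \{ u \in L^2(M) : Au \in L^2 \}$ contains functions which are not compactly supported.
¹⁴ That is, $B$ is a self-adjoint operator with $\mathrm{dom}(A) \subset \mathrm{dom}(B)$ and $B|_{\mathrm{dom}(A)} = A$.
¹⁵ This is the topology of the following convergence: $\varphi_k \xrightarrow{\mathcal{D}} \varphi$ if
(a) in any chart $U$ and for any multiindex $\alpha$, $\partial^\alpha \varphi_k \rightrightarrows \partial^\alpha \varphi$ (uniformly) as $k \to \infty$,
(b) and all supports $\mathrm{supp}\, \varphi_k$ are contained in a compact subset of $M$.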


The space $\mathcal{D}'(M)$ of distributions on $M$ is the set of continuous linear functionals on the space of test functions $\mathcal{D}(M)$. We denote the value of a distribution $u \in \mathcal{D}'(M)$ at $\varphi \in \mathcal{D}(M)$ by $(u, \varphi)$, and the convergence in distribution by $u_k \xrightarrow{\mathcal{D}'} u$, as usual, if $(u_k, \varphi) \to (u, \varphi)$ for all $\varphi \in \mathcal{D}(M)$.

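Remark 2.11. We can identify functions on $M$ with distributions similarly to the situation of distributions in $\mathbb{R}^n$: if $u \in L^1_{\mathrm{loc}}(M, \mu_g)$, then by the rule

\[ (u, \varphi) = \int_M u \varphi \, d\mu_g \quad \text{for any } \varphi \in \mathcal{D}(M) \tag{2.6} \]

we associate a distribution in $\mathcal{D}'(M)$ with $u$. By Corollary 1.9, the mapping (2.6) is an injection $L^1_{\mathrm{loc}}(M, \mu_g) \hookrightarrow \mathcal{D}'(M)$.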

(b) For any distribution $u \in \mathcal{D}'(M)$ we define its distributional Laplacian $\Delta_g u|_{\mathcal{D}'(M)}$ by the identity

\[ (\Delta_g u, \varphi) = (u, \Delta_g \varphi) \quad \forall \varphi \in \mathcal{D}(M). \tag{2.7} \]

The right hand side again defines a continuous linear functional on $\mathcal{D}(M)$. This defines a legitimate extension of $\Delta_g$ to $\mathcal{D}'(M)$, since for smooth functions requirement (2.7) is satisfied by Green's formula (Proposition 1.19).

(c) If in the above situation it even holds that $u \in L^2_{\mathrm{loc}}(M, \mu_g) \subset \mathcal{D}'(M)$, and if also $\Delta_g|_{\mathcal{D}'(M)}(u) \in L^2_{\mathrm{loc}}(M, \mu_g)$ for the distributional Laplacian, then the latter is called the weak Laplacian of $u$, and is denoted by $\Delta_g|_{L^2_{\mathrm{loc}}(M, \mu_g)}(u)$.

Definition 2.12 (Sobolev Spaces on a Manifold). Let $(M, g)$ be a smooth Riemannian manifold. We briefly¹⁶ recall the following spaces:

(a) Analogously to the situation above, let $\vec{\mathcal{D}}$ denote the space of smooth vector fields on $M$ with compact support with the appropriate topology. Then elements of its dual space $\vec{\mathcal{D}}'$ are distributional vector fields.

(b) For any $u \in \vec{\mathcal{D}}'(M)$ we define its distributional gradient $\mathrm{grad}_g|_{\vec{\mathcal{D}}'(M)}(u)$ by the identity

\[ (\mathrm{grad}_g u, \psi)_{\vec{L}^2(M, \mu_g)} = -(u, \mathrm{div}_g \psi)_{\vec{L}^2(M, \mu_g)} \quad \forall \psi \in \vec{\mathcal{D}}(M). \tag{2.8} \]

For smooth vector fields, (2.8) is satisfied by the Divergence theorem (Proposition 1.15).

(c) If $u \in L^2_{\mathrm{loc}}(M, \mu_g)$ and $\mathrm{grad}_g|_{\vec{\mathcal{D}}'(M)}(u) \in \vec{L}^2_{\mathrm{loc}}(M, \mu_g)$, then we call the latter the weak gradient of $u$, denoted by $\mathrm{grad}_g|_{\vec{L}^2_{\mathrm{loc}}}(u)$.

16For details see [81, Chapter 4].


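(d) The first Sobolev space is the Hilbert space

\[ W^1(M, \mu_g) := \{ u \in L^2(M, \mu_g) : \mathrm{grad}_g|_{\vec{L}^2_{\mathrm{loc}}}(u) \in \vec{L}^2(M, \mu_g) \} \]

with inner product

\[ (u, v)_{W^1} = \int_M u v \, d\mu_g + \int_M \langle \mathrm{grad}_g u, \mathrm{grad}_g v \rangle_g \, d\mu_g. \]

Furthermore, $W_0^1(M, \mu_g)$ is the closure of $\mathcal{D}(M)$ in $W^1(M, \mu_g)$ and

\[ W_0^2(M, \mu_g) := \{ u \in W_0^1(M) : \Delta_g|_{L^2_{\mathrm{loc}}}(u) \in L^2(M) \}. \]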

Proposition 2.13 (Self-adjoint Extension). On a Riemannian manifold $(M, g)$ the operator $L = -\Delta_g|_{W_0^2}$ is a self-adjoint non-negative definite operator. Furthermore, it is the unique self-adjoint extension of $-\Delta_g|_{\mathcal{D}}$ whose domain is contained in $W_0^1$. In particular, it holds that

\[ \mathrm{dom}(L) = \{ u \in W_0^1(M) : \Delta_g|_{L^2_{\mathrm{loc}}}(u) \in L^2(M) \} = \mathrm{dom}(L^*). \]

Proof. Both the symmetry of $L = -\Delta_g|_{W_0^2}$ and the nonnegativity statement follow from an appropriate version of Green's formula, which states¹⁷ that for all $u \in W_0^1(M, \mu_g)$ and all $v \in W^2(M, \mu_g)$ we have

\[ \int_M u \, \Delta_g|_{\mathcal{D}'}(v) \, d\mu_g = -\int_M \langle \mathrm{grad}_g|_{\vec{\mathcal{D}}'}(u), \mathrm{grad}_g|_{\vec{\mathcal{D}}'}(v) \rangle_g \, d\mu_g. \tag{2.9} \]

This applies in particular if $u, v \in W_0^2(M)$ and yields

\[ (L(u), v)_{L^2(M, \mu_g)} := \int_M L(u)\, v \, d\mu_g = \int_M u\, L(v) \, d\mu_g =: (u, L(v))_{L^2(M, \mu_g)}, \]

hence $L = -\Delta_g|_{W_0^2}$ is a symmetric operator. Non-negativity of $L$ follows by choosing $u = v$ in (2.9). For the proof of the self-adjointness statement see [81, Theorem 4.6, p. 107-108].

Corollary 2.14. As a result of Proposition 2.13, Proposition 2.8 applies to the self-adjoint extension $L = -\Delta_g|_{W_0^2}$ of the Laplace-Beltrami operator. We call the semigroup of operators $(P_t)_{t \ge 0}$ corresponding to the heat equation on $L^2(M)$ the heat semigroup.

¹⁷ See [81, Lemma 4.4, p. 104].


1.1 The Heat Kernel

Lemma 2.15. Let $(M, g)$ be a smooth Riemannian manifold. For any $x \in M$ and any $t > 0$ there exists a unique function $p_{t,x} \in L^2(M, \mu_g)$ such that for all $f \in L^2(M, \mu_g)$,

\[ P_t f(x) = \int_M p_{t,x}(z) f(z) \, d\mu_g(z). \tag{2.10} \]

Proof. The result essentially follows from the Riesz representation theorem. See [81, Theorem 7.7, p. 191] for details.

Definition 2.16 (Heat Kernel). Let $(M, g)$ be a smooth Riemannian manifold. We define the heat kernel on $M$ by setting

\[ p_t(x, y) := \int_M p_{t/2,x}(z)\, p_{t/2,y}(z) \, d\mu_g(z) \quad \text{for any } t > 0 \text{ and } x, y \in M, \]

where $p_{t/2,x}(z)$ and $p_{t/2,y}(z)$ are as in (2.10). The above function is also called the integral kernel of the heat semigroup.

Remark 2.17. pt(x, y) is defined at all points y ∈M , as opposed to (2.10).

Proposition 2.18 (Properties of the Heat Kernel). On any smooth Riemannian manifold $(M, g)$, the heat kernel satisfies the following properties.

(a) Symmetry: $p_t(x, y) = p_t(y, x)$ for all $x, y \in M$ and $t > 0$.

(b) For any $f \in L^2(M, \mu_g)$, for all $x \in M$ and $t > 0$,

\[ P_t f(x) = \int_M p_t(x, z) f(z) \, d\mu_g(z). \]

(c) $p_t(x, y) \ge 0$ for all $x, y \in M$ and $t > 0$, and

\[ \int_M p_t(x, z) \, d\mu_g(z) \le 1 \]

for all $x \in M$ and $t > 0$.

(d) The semigroup identity, or Chapman-Kolmogorov equations: for all $x, y \in M$ and $t, s > 0$,

\[ p_{t+s}(x, y) = \int_M p_s(x, z)\, p_t(z, y) \, d\mu_g(z). \]

(e) For any $y \in M$, the function $u(t, x) := p_t(x, y)$ is $C^\infty$-smooth in $(0, \infty) \times M$ and satisfies the heat equation

\[ \frac{\partial u}{\partial t} = \Delta_g u. \]


(f) For any $f \in C_0^\infty(M)$,

\[ \int_M p_t(x, z) f(z) \, d\mu_g(z) \to f(x) \quad \text{as } t \to 0, \]

where the convergence is in $C^\infty(M)$.

Proof. See [99, Theorem 4.1.1] and [81, Theorem 7.13].
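Properties (a), (c) and (d) can be checked numerically in the simplest case. The sketch below assumes the flat manifold $M = \mathbb{R}$ with $\Delta_g = d^2/dx^2$, for which the heat kernel of $\partial_t u = \Delta_g u$ is the Gaussian $p_t(x, y) = (4\pi t)^{-1/2} e^{-(x-y)^2/(4t)}$. The quadrature routine and all names are illustrative choices, not taken from the references.

```python
import math


def p(t, x, y):
    """Heat kernel of du/dt = u_xx on the real line (Euclidean example)."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def integrate(fn, a=-30.0, b=30.0, n=6000):
    """Composite trapezoidal rule; the truncated domain is an approximation."""
    h = (b - a) / n
    return h * (0.5 * fn(a) + sum(fn(a + i * h) for i in range(1, n)) + 0.5 * fn(b))


def total_mass(t, x):
    # Left-hand side of property (c); equals 1 on a stochastically complete manifold.
    return integrate(lambda z: p(t, x, z))


def chapman_kolmogorov(s, t, x, y):
    # Right-hand side of property (d).
    return integrate(lambda z: p(s, x, z) * p(t, z, y))


print(abs(chapman_kolmogorov(0.5, 0.3, 0.2, -1.1) - p(0.8, 0.2, -1.1)))
print(abs(total_mass(0.4, 0.3) - 1.0))
```

Both printed differences are at the level of quadrature and rounding error.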

Definition 2.19 (Stochastic Completeness). We call a smooth Riemannian manifold $(M, g)$ stochastically complete¹⁸ if for all $(t, x) \in (0, \infty) \times M$

\[ \int_M p_t(x, z) \, d\mu_g(z) = 1. \]

Lemma 2.20. Let $(M, g)$ be a stochastically complete smooth Riemannian manifold. If $u(t, z) := p_Z(t, z)$ is a regular fundamental solution of the heat equation on $M$ at a point $Z \in M$, then $u(t, z) \equiv p_t(z, Z)$, i.e. the two are identical as functions of $t$ and $z$.

Remark 2.21. Some authors¹⁹ use the terms fundamental solution of the hyperbolic heat equation and hyperbolic heat kernel interchangeably. We will abide by that, while keeping in mind that this requires a verification of stochastic completeness.

3 Transformation of the heat kernel under isometry

Proposition 3.1. Let $k \in \mathbb{N} \cup \{\infty\}$ be arbitrary, $M_1$, $M_2$ two $C^{k+2}$-manifolds and $\phi : M_2 \to M_1$ a $C^{k+2}$-diffeomorphism which is an isometry between $(M_2, g_2)$ and $(M_1, g_1)$. The Laplace-Beltrami operator $\Delta_{g_i}$, $i \in \{1, 2\}$, commutes with $\phi$ in the sense that the equation

\[ \Delta_{g_2}(\phi^*(f)) = \phi^*(\Delta_{g_1}(f)) \tag{3.1} \]

holds for any $f \in C^{k+2}(M_1)$.

Proof. A proof of this statement is given for example in [81, Lemma 3.27].
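Identity (3.1) can be illustrated numerically. The sketch below assumes the Poincaré half-plane with Laplace-Beltrami operator $\Delta_h = y^2(\partial_x^2 + \partial_y^2)$ and takes $M_1 = M_2$ with the map $\phi(x, y) = (\lambda x + a, \lambda y)$, $\lambda > 0$, which preserves the metric $y^{-2}(dx^2 + dy^2)$. The constants, the test function and all names are assumptions made for this example.

```python
import math


def hyp_laplacian(F, x, y, h=1e-3):
    """Delta_h F = y^2 (F_xx + F_yy) on the half-plane, by central differences."""
    fxx = (F(x + h, y) - 2.0 * F(x, y) + F(x - h, y)) / (h * h)
    fyy = (F(x, y + h) - 2.0 * F(x, y) + F(x, y - h)) / (h * h)
    return y * y * (fxx + fyy)


LAM, SHIFT = 2.5, -0.4  # illustrative dilation factor and horizontal shift


def phi(x, y):
    # z -> LAM * z + SHIFT, an isometry of the metric y^-2 (dx^2 + dy^2)
    return (LAM * x + SHIFT, LAM * y)


def f(x, y):
    return math.sin(x) * y ** 1.5


def pullback(F):
    """phi^* F = F o phi."""
    return lambda x, y: F(*phi(x, y))


def lhs(x, y):
    # Delta_h applied to the pulled-back function phi^* f
    return hyp_laplacian(pullback(f), x, y)


def rhs(x, y):
    # Pullback of Delta_h f, i.e. (Delta_h f) evaluated at phi(x, y)
    return hyp_laplacian(f, *phi(x, y))


print(abs(lhs(0.3, 1.1) - rhs(0.3, 1.1)))  # small discretisation error
```

For a map that is not an isometry, for instance $(x, y) \mapsto (x, 2y)$, the two sides would in general differ.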

Definition 3.2. Let $(M, g)$ be a smooth Riemannian manifold and $Z \in M$. The smooth function $p_Z : (0, \infty) \times M \to \mathbb{R}$ is a fundamental solution at $Z$ of the heat equation on $(M, g)$ if it satisfies the following conditions:

(i) $p_Z$ solves the heat equation on $(M, g)$: $\Delta_g p_Z = \partial_t p_Z$, where $\Delta_g$ denotes the Laplace-Beltrami operator on $(M, g)$;

¹⁸ See [99, Section 4.2].
¹⁹ See for example [39, Chapter VI].


(ii) $\lim_{t \downarrow 0} p_Z(t, \cdot) = \delta_Z$, where $\delta_Z$ denotes the Dirac measure at $Z \in M$:

\[ \lim_{t \downarrow 0} \int_M p_Z(t, z) f(z) \, \mu_g(dz) = f(Z) \]

for all test functions $f \in C_0^\infty(M)$, where $\mu_g(dz)$ stands for the Riemannian volume element at $z \in M$.

The fundamental solution $p_Z$ is said to be regular if furthermore $p_Z \ge 0$ and $\int_M p_Z(t, z) \, \mu_g(dz) \le 1$.

Proposition 3.3. Let $k \in \mathbb{N}_0 \cup \{\infty\}$, $\phi : (M_2, g_2) \longrightarrow (M_1, g_1)$ a $C^{k+2}$-smooth isometry, and $p^{g_1}_{Z_1}$ the fundamental solution at $Z_1$ of the heat equation on $(M_1, g_1)$. Furthermore, let $Z_2 \in M_2$ be such that $\phi(Z_2) = Z_1$. Then the map $(t, z) \mapsto p^{g_1}_{\phi(Z_2)}(t, \phi(z))$ is the (unique) fundamental solution at $Z_2$ of the heat equation on $(M_2, g_2)$.

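Proof. Property (i) in Definition 3.2 holds for the above map, which follows from Proposition 3.1 and especially from (3.1). The operator $\Delta_{g_2}$ acts only on the space variable $z \in M_2$ and not on $Z_2 \in M_2$, so that, writing $\tilde{z} = \phi(z)$ and $\tilde{Z} = \phi(Z_2)$,

\[ \Delta_{g_2}\bigl( p^{g_1}_{\phi(Z_2)}(t, \phi(z)) \bigr) = \phi^*\bigl( \Delta_{g_1}( p^{g_1}_{\tilde{Z}}(t, \tilde{z}) ) \bigr) = \phi^*\Bigl( \frac{\partial}{\partial t} ( p^{g_1}_{\tilde{Z}}(t, \tilde{z}) ) \Bigr) = \frac{\partial}{\partial t} \bigl( p^{g_1}_{\phi(Z_2)}(t, \phi(z)) \bigr), \tag{3.2} \]

where the first equality follows from (3.1). Property (ii) of Definition 3.2 is a consequence of the substitution rule. Let $f \in C_0^\infty(M_1)$ be a test function and $\tilde{f} := \phi^* f$; since $\phi$ is a diffeomorphism, every test function on $M_2$ is of this form. Given that $\phi$ is an isometry, so is $\phi^{-1}$, and the pullback $(\phi^{-1})^* \mu_{g_2}(d\cdot)$ coincides with the volume form on $(M_1, g_1)$. Then

\[
\begin{aligned}
\lim_{t \downarrow 0} \int_{M_2} p^{g_1}_{\phi(Z_2)}(t, \phi(z))\, \tilde{f}(z) \, \mu_{g_2}(dz)
&= \lim_{t \downarrow 0} \int_{M_2} p^{g_1}_{\phi(Z_2)}(t, \phi(z))\, f(\phi(z)) \, \mu_{g_2}(dz) \\
&= \lim_{t \downarrow 0} \int_{M_1} p^{g_1}_{\phi(Z_2)}(t, \tilde{z})\, f(\tilde{z}) \, \bigl( (\phi^{-1})^* \mu_{g_2} \bigr)(d\tilde{z}) \\
&= \lim_{t \downarrow 0} \int_{M_1} p^{g_1}_{\phi(Z_2)}(t, \tilde{z})\, f(\tilde{z}) \, \mu_{g_1}(d\tilde{z}) \\
&= f(\phi(Z_2)) = \tilde{f}(Z_2).
\end{aligned}
\]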

Remark 3.4. Note that the fundamental solutions in Proposition 3.3 are expressed with respect to the Riemannian volume form. In terms of integration with respect to the Lebesgue measure we make a slight modification of the above statement.

Let the Riemannian volume form be given in orthogonal coordinates, and let $K^u$ and $K^g$ denote the fundamental solutions (in terms of Lebesgue) of the heat equations corresponding to the Riemannian metrics $u$ and $g$, in the sense that the Radon-Nikodym derivatives with respect to the Lebesgue measure are already incorporated into the expressions for $K^u$ and $K^g$: if $p^g_Z(s, z)$ (resp. $p^u_Z(s, z)$) denote the fundamental solutions as in Proposition 3.3, then, for any test function $f$,

\[ \int_S f(z)\, K^g_Z(s, z) \, dz = \int_S f(z)\, p^g_Z(s, z)\, \frac{dz}{\mu_g(dz)}\, \mu_g(dz) = \int_S f(z)\, p^g_Z(s, z)\, \frac{\mu_g(dz)}{\sqrt{\det(g)}}. \]

The following lemma follows directly from Proposition 3.3.

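Lemma 3.5. Let $K^{g_1}$ and $K^{g_2}$ denote the fundamental solutions (in terms of Lebesgue) of the heat equations corresponding to the metrics $g_1$ and $g_2$:

$$\frac{\partial K^{g_1}}{\partial s} = \frac{1}{2} \Delta_{g_1} K^{g_1}, \quad K^{g_1}_Z(0, z) = \delta(z - Z), \qquad \text{and} \qquad \frac{\partial K^{g_2}}{\partial s} = \frac{1}{2} \Delta_{g_2} K^{g_2}, \quad K^{g_2}_Z(0, z) = \delta(z - Z). \tag{3.3}$$

If $\varphi$ is an isometry, then

$$K^{g_1}_Z(s, z) = \det(\nabla \varphi(Z))\, K^{g_2}_{\varphi(Z)}(s, \varphi(z)). \tag{3.4}$$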

1 Polar coordinates and the Eikonal equation

It was used in Theorem 3.9 of Chapter II that, for $n$-dimensional manifolds $M$, the Laplace-Beltrami operator $\Delta_M$ can be decomposed into a partial derivative with respect to a radial component and a spherical Laplace-Beltrami operator $\Delta_{S^{n-1}}$ on a sphere of dimension $n - 1$.

Here we briefly elaborate on this fact. For constant curvature space forms, not only can the Laplace-Beltrami operator be decomposed in the above way, but its action can also be determined explicitly in this representation. In particular, if it is known that the solution of the Cauchy problem corresponding to the heat equation on $M$ is radial at all times (which for SABR follows from (3.4) above), then the latter reduces to a second-order differential equation with only one space dimension.

Note also that the Minakshisundaram-Pleijel recursion formulas$^{20}$, which are used in the explicit construction of heat kernels, rely on the representation of the Riemannian metric and of the Laplace-Beltrami operator in spherical coordinates.

Furthermore, we would like to recall from [118, Theorem 6.8] the Gauss lemma, to motivate that the geodesic distance satisfies the Eikonal equation, which was used in [26] and recalled in Section 2 of Chapter II. See the original monograph [118] for full details.

Definition 3.6. (a) If $v$ is a unit vector in $T_zM$ and $\varepsilon > 0$ is sufficiently small, then the diffeomorphic image of the ball $B_\varepsilon(0) \subset T_zM$ under $\exp_z$ is called a geodesic ball $U_z := \exp_z(B_\varepsilon(0))$ in $M$. Also, if the boundary $\partial B_\varepsilon(0)$ is contained in the normal neighborhood $U_0$ (see [118, Section 6]), then its image $\exp_z(\partial B_\varepsilon(0))$ is called a geodesic sphere in $M$.

$^{20}$ See [39, Chapter VI.3].


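(b) If $U_0 \subset T_zM$ is a normal neighborhood and $U_z = \exp_z(U_0)$, then, given any orthonormal basis $\{E_i\}$ of $T_zM$, we define a coordinate chart on $U_z \subset M$ by composing the inverse of $\exp_z$ with the inverse of the invertible map

$$E : \mathbb{R}^n \to T_zM, \qquad (x^1, \dots, x^n) \mapsto \sum_{i=1}^n x^i E_i.$$

We obtain the Riemannian normal coordinate chart $(U_z, \phi)$ on $M$:

$$\phi := E^{-1} \circ \exp_z^{-1} : U_z \to \mathbb{R}^n. \tag{3.5}$$

(c) For a point $\tilde z \in U_z$ with $\phi(\tilde z) = (x^1, \dots, x^n) \in \mathbb{R}^n$ we can locally define the Riemannian radial distance function as the Euclidean norm of its coordinates,

$$r(\phi(\tilde z)) := \Big(\sum_{i=1}^n (x^i)^2\Big)^{1/2}, \tag{3.6}$$

(d) and the unit radial vector field $\frac{\partial}{\partial r}$ by

$$\frac{\partial}{\partial r} := \sum_{i=1}^n \frac{x^i}{r}\, \frac{\partial}{\partial x^i},$$

where $\big(\frac{\partial}{\partial x^1}\big|_z, \dots, \frac{\partial}{\partial x^n}\big|_z\big)$ is the basis$^{21}$ of coordinate vectors on $T_zM$ induced by the normal coordinate chart $(U_z, \phi)$.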

Remark 3.7. Let $(U_z, \phi)$, $\phi = (x^1, \dots, x^n)$, be any normal coordinate chart centered at $z$. At any point $\tilde z \in \exp_z(U_0) \setminus \{z\}$, the vector $\frac{\partial}{\partial r}\big|_{\tilde z}$ is the velocity vector of the unit speed geodesic from $z$ to $\tilde z$.

If $v = \sum_{i=1}^n v^i \frac{\partial}{\partial x^i}$ is a vector in $T_zM$ such that $\gamma_v : I_{\gamma_v} \to M$ is a radial geodesic emanating from $\gamma_v(0) = z$, with $\dot\gamma_v(0) = v$ and passing through $\gamma_v(t) = \tilde z$ at some time $t \in I_{\gamma_v}$, then for sufficiently small $t \ge 0$ the coordinate representation of $\gamma_v$ is

$$\phi(\gamma_v(t)) = (t v^1, \dots, t v^n). \tag{3.7}$$

Then for any $\tilde z \in \exp_z(\partial B_\varepsilon(0))$ we have $d_g(z, \tilde z) = r(\phi(\tilde z)) = \varepsilon$, where $d_g(\cdot, \cdot)$ denotes the geodesic distance on $(M, g)$. Therefore, the set $S(z; \varepsilon) := \{y \in M : d_g(z, y) = \varepsilon\}$ in fact coincides with the geodesic sphere $\exp_z(\partial B_\varepsilon(0))$. Most importantly, the exponential map satisfies the following lemma of Gauss:

Lemma 3.8 (Gauss). Let $(M, g)$ be a Riemannian manifold and let $U_0 \subset T_zM$ be such that $\exp_z : U_0 \to \exp_z(U_0)$ is a ($C^1$-)diffeomorphism. Then the radial geodesics emanating from $z$ are all orthogonal to the geodesic spheres in $U_z$.
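To make the statement concrete, here is a minimal Python sketch on the unit sphere $S^2 \subset \mathbb{R}^3$, chosen here purely as an illustration (it is not an example from the text). It uses the standard closed form $\exp_z(tv) = \cos(t)\, z + \sin(t)\, v$ for a unit tangent vector $v$ orthogonal to $z$, checks that the point at parameter $\varepsilon$ lies at geodesic distance $\varepsilon$ from $z$, and checks that the radial velocity is orthogonal to the tangent of the geodesic sphere. All names are hypothetical.

```python
import math

def dot(a, b):
    """Euclidean inner product, which induces the round metric on S^2."""
    return sum(x * y for x, y in zip(a, b))

def exp_map(z, v, t):
    """exp_z(t v) on the unit sphere for a unit tangent vector v orthogonal to z."""
    return tuple(math.cos(t) * zi + math.sin(t) * vi for zi, vi in zip(z, v))

def radial_velocity(z, v, t):
    """Velocity of the unit speed geodesic t -> exp_z(t v)."""
    return tuple(-math.sin(t) * zi + math.cos(t) * vi for zi, vi in zip(z, v))

z = (0.0, 0.0, 1.0)                      # center (north pole)
e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)  # orthonormal basis of T_z S^2
eps, theta = 0.8, 1.1                    # radius and direction angle

v = tuple(math.cos(theta) * a + math.sin(theta) * b for a, b in zip(e1, e2))
point = exp_map(z, v, eps)

# Geodesic distance on S^2 is the angle between the two unit vectors.
distance = math.acos(dot(z, point))

# Tangent to the geodesic sphere: derivative of exp_z(eps * v(theta)) in theta.
sphere_tangent = tuple(
    math.sin(eps) * (-math.sin(theta) * a + math.cos(theta) * b)
    for a, b in zip(e1, e2)
)
gauss_inner_product = dot(radial_velocity(z, v, eps), sphere_tangent)
```

Here `distance` reproduces `eps` and `gauss_inner_product` vanishes up to rounding, which is the orthogonality asserted by the lemma in this special case.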

$^{21}$ This basis is by construction orthonormal.


Being a diffeomorphism on $U_0$, the (restricted) exponential map

$$\exp_z : U_0 \to U_z, \qquad vt \mapsto \exp_z(vt) = \gamma_v(t),$$

induces a pushforward map on the tangent spaces,

$$(\exp_z)_* : T_{vt}U_0 \to T_{\exp_z(vt)}M.$$

There is a natural identification of $T_{vt}(T_zM)$ and $T_zM$; here we denote the unique element in $T_{vt}(T_zM)$ corresponding to a $w \in T_zM$ by $\tilde w$, and vice versa. Then by the Gauss lemma, for any vector $\tilde w \in T_{vt}(T_zM)$ with the property that the corresponding $w \in T_zM$ is $g$-orthogonal to $vt$ at $T_zM$, that is $\langle vt, w \rangle_{g(z)} = 0$, the image under the pushforward map remains $g$-orthogonal,

$$\langle (\exp_z)_* vt, (\exp_z)_* \tilde w \rangle_{g(\exp_z(vt))} = 0,$$

at $T_{\exp_z(vt)}M$. The statement of the Gauss lemma also holds locally in the case of pseudo-Riemannian metrics, such as the Minkowski metric in the hyperboloid model of hyperbolic space. See [133, p. 289].

Remark 3.9. [Riemannian Distance and the Eikonal Equation] It is a consequence of the Gauss lemma that if $r$ denotes the radial distance function on $U_z$, then

$$\operatorname{grad}(r) = \frac{\partial}{\partial r}. \tag{3.8}$$

Together with the fact that $\frac{\partial}{\partial r}$ is the unit radial vector field, it follows that

$$\langle \operatorname{grad}(r), \operatorname{grad}(r) \rangle_g = 1 \quad \text{for all } \tilde z \in U_z \setminus \{z\}, \tag{3.9}$$

and therefore the Riemannian distance function satisfies the Eikonal equation (3.9) locally on $U_z \setminus \{z\}$.

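Proof. The statement (3.8) is verified by showing that for any $\tilde z \in U_z \setminus \{z\}$ and any $X(\tilde z) \in T_{\tilde z}M$

$$\Big\langle X(\tilde z), \frac{\partial}{\partial r}(\tilde z) \Big\rangle_{g(\tilde z)} = dr(X(\tilde z)) \quad (:= X(\tilde z)(r)).$$

The idea is to decompose $X(\tilde z) = X^\perp(\tilde z) + X^\top(\tilde z)$, where $X^\perp(\tilde z) = \alpha \frac{\partial}{\partial r}(\tilde z)$ for an $\alpha > 0$, satisfying $dr(X^\perp(\tilde z)) = \alpha$, and $X^\top(\tilde z)$ is tangent to the geodesic sphere about $z$ passing through $\tilde z$ and therefore satisfies $dr(X^\top(\tilde z)) = 0$. The analogous statements for the left-hand side of the equation follow from the Gauss lemma, which establishes the equality. See [118, p. 103, Corollary 6.9].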


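If $E_1, \dots, E_n$ are chosen to be the usual Euclidean polar coordinates on $T_zM$, then the coordinate representation (3.7) becomes

$$\phi(\gamma_v(t)) = (t, \xi(t)),$$

for some $(n - 1)$-dimensional function $\xi$. As a consequence of the Gauss lemma it furthermore holds that the Euclidean polar coordinates on $T_zM$ induce geodesic polar coordinates $r, \theta_1, \dots, \theta_{n-1}$ on $U_z$, where $r$ is again the radial distance function. The coordinate vectors $\frac{\partial}{\partial r}, \frac{\partial}{\partial \theta_j}$, $j \in \{1, \dots, n - 1\}$, corresponding to the geodesic polar coordinates $r, \theta_j$ satisfy

$$\Big\langle \frac{\partial}{\partial r}, \frac{\partial}{\partial r} \Big\rangle = 1 \quad \text{and} \quad \Big\langle \frac{\partial}{\partial r}, \frac{\partial}{\partial \theta_j} \Big\rangle = 0, \quad j \in \{1, \dots, n - 1\},$$

and by the property ([117, p. 277, Ex. 11.13]) of the Riemannian metric $g$ this yields the matrix representation

$$(g_{i,j})_{i,j \in \{1, \dots, n\}} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & * & \cdots & * \end{pmatrix}. \tag{3.10}$$

For further details and implications see [118, p. 180, Proposition 10.9].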

Corollary 3.10 (Polar Coordinates). Let $(M, g)$ be a Riemannian manifold of dimension $n$, such that the Riemannian metric has a representation in the form (3.10). Then the Riemannian measure $\mu_g$ is given in polar coordinates by

$$d\mu_g = \psi(r)^{n-1}\, dr\, d\theta, \tag{3.11}$$

where $d\theta$ stands for the Riemannian measure on the $(n - 1)$-sphere $S^{n-1}$, and the Laplace-Beltrami operator on $(M, g)$ has the form

$$\Delta_g = \frac{\partial^2}{\partial r^2} + \Big(\frac{d}{dr} \log\big(\psi(r)^{n-1}\big)\Big) \frac{\partial}{\partial r} + \frac{1}{\psi(r)^2}\, \Delta_{S^{n-1}}, \tag{3.12}$$

where $\Delta_{S^{n-1}}$ denotes the Laplace-Beltrami operator on the $(n - 1)$-sphere $S^{n-1}$.

Example 3.11. As a consequence of the above, we obtain in particular the following polar-coordinate representations for the respective Laplace-Beltrami operators of the spaces $\mathbb{R}^n$, $S^n$ and $\mathbb{H}^n$.

In $\mathbb{R}^n$ we have $\psi(r) = r$, thus (3.11) becomes $d\mu = r^{n-1}\, dr\, d\theta$, and (3.12) has the form

$$\Delta_{\mathbb{R}^n} = \frac{\partial^2}{\partial r^2} + \frac{n - 1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2}\, \Delta_{S^{n-1}}. \tag{3.13}$$

In $S^n$ we have $\psi(r) = \sin(r)$, thus (3.11) becomes $d\mu = \sin(r)^{n-1}\, dr\, d\theta$, and (3.12) has the form

$$\Delta_{S^n} = \frac{\partial^2}{\partial r^2} + (n - 1) \cot(r) \frac{\partial}{\partial r} + \frac{1}{\sin(r)^2}\, \Delta_{S^{n-1}}. \tag{3.14}$$

In $\mathbb{H}^n$ we have $\psi(r) = \sinh(r)$, thus (3.11) becomes $d\mu = \sinh(r)^{n-1}\, dr\, d\theta$, and (3.12) has the form

$$\Delta_{\mathbb{H}^n} = \frac{\partial^2}{\partial r^2} + (n - 1) \coth(r) \frac{\partial}{\partial r} + \frac{1}{\sinh(r)^2}\, \Delta_{S^{n-1}}. \tag{3.15}$$

See [81, Example 3.23, p. 82].
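For radial functions the spherical term drops out, and the three operators above reduce to $u'' + (n - 1)\frac{\psi'(r)}{\psi(r)}\, u'$. The following Python sketch, offered as an illustration with hypothetical names and a finite-difference step chosen here, evaluates this radial part and tests it on functions whose Laplacians are known: $r^2$ on $\mathbb{R}^3$ (Laplacian $2n = 6$), and $r/\sinh(r)$ on $\mathbb{H}^3$ and $r/\sin(r)$ on $S^3$, which are radial eigenfunctions with eigenvalues $-1$ and $+1$ respectively.

```python
import math

def radial_laplacian(u, psi, n, r, h=1e-4):
    """Radial part of (3.12): u''(r) + (n - 1) * psi'(r) / psi(r) * u'(r).

    Derivatives are approximated by central finite differences with step h.
    """
    du = (u(r + h) - u(r - h)) / (2.0 * h)
    d2u = (u(r + h) - 2.0 * u(r) + u(r - h)) / h ** 2
    dpsi = (psi(r + h) - psi(r - h)) / (2.0 * h)
    return d2u + (n - 1) * dpsi / psi(r) * du

r0 = 0.7

# Euclidean case (3.13): the Laplacian of r^2 on R^3 equals 2 * 3 = 6.
euclidean = radial_laplacian(lambda r: r ** 2, lambda r: r, 3, r0)

# Hyperbolic case (3.15): u = r / sinh(r) satisfies Laplacian u = -u on H^3.
u_hyp = lambda r: r / math.sinh(r)
hyperbolic_residual = radial_laplacian(u_hyp, math.sinh, 3, r0) + u_hyp(r0)

# Spherical case (3.14): u = r / sin(r) satisfies Laplacian u = +u on S^3.
u_sph = lambda r: r / math.sin(r)
spherical_residual = radial_laplacian(u_sph, math.sin, 3, r0) - u_sph(r0)
```

Both residuals vanish up to discretization error. The factor $r/\sinh(r)$ is the same one that appears in the classical closed-form heat kernel on $\mathbb{H}^3$, which is one reason the radial reduction is convenient in constant curvature.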


Bibliography

[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables. Washington, D.C.: Dover Publications, 1964.

[2] R. A. Adams. Sobolev Spaces. Academic Press, 1975.

[3] Y. Aït Sahalia. “Closed form likelihood expansions for multivariate diffusions”. In: The Annals of Statistics 88.(2) (2008), pp. 906–937.

[4] S. Albeverio and M. Röckner. “Classical Dirichlet forms on topological vector spaces – closability and a Cameron-Martin formula”. In: Journal of Functional Analysis 88 (1990), pp. 395–436.

[5] A. Alfonsi. “High order discretization schemes for the CIR process: application to affine term structure and Heston models”. In: Mathematics of Computation 79 (2010), pp. 209–237.

[6] A. Alfonsi. “On the discretization schemes for the CIR (and Bessel squared) processes”. In: Monte Carlo Methods and Applications 11.4 (2005), pp. 355–384.

[7] L. Alili and J.-C. Gruet. “An explanation of a generalized Bougerol’s identity in terms of hyperbolic Brownian motion”. In: Exponential Functionals and Principal Values related to Brownian Motion. A collection of research papers. Biblioteca de la Revista Matematica Iberoamericana. MR1648653 (1997).

[8] H. Amann and J. Escher. Analysis III. Birkhäuser Basel, 2008.

[9] L. B. G. Andersen. “Simple and efficient simulation of the Heston stochastic volatility model”. In: Journal of Computational Finance 11.3 (2008), pp. 1–42.

[10] L. B. G. Andersen and V. V. Piterbarg. “Moment explosions in stochastic volatility models”. In: Finance and Stochastics 11.1 (2007), pp. 29–50.

[11] J. Andreasen and B. Huge. “ZABR – Expansions for the masses”. Preprint, SSRN//1980726. 2011.

[12] J. Andreasen and B. Huge. “ZABR – Expansions for the masses”. In: Risk Magazine January (2013).

[13] A. Antonov, M. Konikov and M. Spector. “Mixing SABR Models for Negative Rates”. Preprint, SSRN//2653682. 2015.


[14] A. Antonov, M. Konikov and M. Spector. “The free boundary SABR: natural extension to negative rates”. Preprint, SSRN//2557046. 2015.

[15] A. Antonov and M. Spector. “Advanced Analytics for the SABR Model”. Preprint, SSRN//2026350. 2012.

[16] A. Antonov, M. Konikov and M. Spector. “SABR spreads its wings”. In: Risk Magazine August (2013).

[17] P. Balland and Q. Tran. “SABR goes normal”. In: Risk Magazine June (2013), pp. 76–81.

[18] P. Barrieu, A. Rouault and M. Yor. “A study of the Hartman-Watson distribution motivated by numerical problems related to Asian option pricing”. In: Journal of Applied Probability 41 (2004), pp. 1049–1058.

[19] R. P. Bass. Diffusions and Elliptic Operators. Probability and its Applications. Springer.

[20] C. Bayer, P. Friz and R. Loeffen. “Semi-closed form cubature and applications to financial diffusion models”. In: Quantitative Finance 13.5 (2013), pp. 769–782.

[21] C. Bayer, J. Gatheral and M. Karlsmark. “Fast Ninomiya–Victoir calibration of the double-mean-reverting model”. In: Quantitative Finance 13.11 (2013), pp. 1813–1829.

[22] G. Ben Arous. “Développement asymptotique du noyau de la chaleur hypoelliptique hors du cut-locus”. In: Annales Scientifiques de l’Ecole Normale Supérieure 4.21 (1988), pp. 307–331.

[23] G. Ben Arous. “Methods de Laplace et de la phase stationnaire sur l’espace de Wiener”. In: Stochastics 25 (1988), pp. 125–153.

[24] S. Benaim and P. Friz. “Regular variation and smile asymptotics”. In: Mathematical Finance 19 (2009), pp. 1–12.

[25] E. Benhamou and O. Croissant. “Local time for the SABR model: Connection with the ’complex’ Black Scholes and application to CMS and spread options”. SSRN//1064461. 2007.

[26] H. Berestycki, J. Busca and I. Florent. “Computing the implied volatility in stochastic volatility models”. In: Communications on Pure and Applied Mathematics 57 (2004), pp. 1352–1373.

[27] S. Beuchler, R. Schneider and C. Schwab. “Multiresolution weighted norm equivalences and applications”. In: Numerical Mathematics 98 (2004), pp. 67–97.

[28] J. Blath, L. Döring and A. Etheridge. “On the moments and the interface of the symbiotic branching model”. In: The Annals of Probability 39 (2011), pp. 252–290.

[29] A. N. Borodin and P. Salminen. Handbook of Brownian Motion – Facts and Formulae. 2nd. Basel: Birkhäuser, 2002.


[30] B. Böttcher. “Feller processes: The next generation in modeling. Brownian motion, Lévy processes and beyond”. In: PLoS ONE 5 (2010), e15102.

[31] B. Böttcher. “On the construction of Feller processes with unbounded coefficients”. In: Electronic Communications in Probability 16 (2011), pp. 545–555.

[32] B. Böttcher, R. Schilling and J. Wang. Lévy Matters III. Vol. 2099. Lecture Notes in Mathematics. Springer, 2013.

[33] N. Bouleau and N. Denis. “Energy image density property and the lent particle method for Poisson measures”. In: Journal of Functional Analysis 257 (2009), pp. 1144–1174.

[34] N. Bouleau and F. Hirsch. Dirichlet Forms and Analysis on Wiener Space. De Gruyter, 1991.

[35] D. R. Brecher and A. E. Lindsay. “Results on the CEV process, past and present”. Preprint, SSRN//1567864. 2010.

[36] J. A. van Casteren. Markov Processes, Feller Semigroups and Evolution Equations. Vol. 12. Series on Concrete and Applicable Mathematics. World Scientific Press, 2011.

[37] A. C. Cavalheiro. “Weighted Sobolev spaces and degenerate elliptic equations”. In: Boletim da Sociedade Paranaense de Matemática 26.1-2 (2008), pp. 117–132. doi: 10.5269/bspm.v26i1-2.7415.

[38] J.-F. Chassagneux, A. Jacquier and I. Mihaylov. “An explicit Euler scheme with strong rate of convergence for financial SDEs with non-Lipschitz coefficients”. Preprint, arXiv:1405.356. 2014.

[39] I. Chavel. Eigenvalues in Riemannian Geometry. Vol. 115. Academic Press, 1984.

[40] B. Chen, C. W. Oosterlee and H. Van Der Weide. “A low-bias simulation scheme for the SABR stochastic volatility model”. In: International Journal of Theoretical and Applied Finance 15.02 (2012), p. 1250016.

[41] P. Collin-Dufresne, K. Daniel, C. Moallemi and M. Saglam. “Strategic asset allocation with predictable returns and transaction costs”. Preprint. 2012.

[42] R. Cont and E. Voltchkova. “A finite difference scheme for option pricing in jump diffusion and exponential Lévy models”. In: SIAM Journal on Numerical Analysis 43.4 (2005), pp. 1596–1626.

[43] S. Cox, M. Hutzenthaler and A. Jentzen. “Local Lipschitz continuity in the initial value and strong completeness for nonlinear stochastic differential equations”. Preprint, arXiv:1309.5595. 2013.

[44] W. Dahmen, A. Kunoth and K. Urban. “Biorthogonal spline wavelet in the interval - stability and moment conditions”. In: Applied and Computational Harmonic Analysis 6.132196 (1999).


[45] S. De Marco and P. Friz. “Varadhan’s formula, conditioned diffusions, and local volatilities”. Preprint, arXiv:1311.1545. 2013.

[46] S. De Marco, C. Hillairet and A. Jacquier. “Shapes of implied volatility with positive mass at zero”. Preprint, arXiv:1310.1020. 2013.

[47] M. Demuth and J. A. van Casteren. Stochastic Spectral Theory for Selfadjoint Feller Operators. Birkhäuser Basel, 2000.

[48] L. Denis and M. Kervarec. “Utility functions and optimal investment in non-dominated models”. Preprint. 2007.

[49] S. Dereich, A. Neuenkirch and L. Szpruch. “An Euler-type method for the strong approximation of the Cox–Ingersoll–Ross process”. In: Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. The Royal Society. 2011, rspa20110505.

[50] J. Deuschel, P. Friz, A. Jacquier and S. Violante. “Marginal density expansions for diffusions and stochastic volatility, Part I: Theoretical foundations”. In: Communications on Pure and Applied Mathematics 67.1 (2014), pp. 40–82.

[51] J. Deuschel, P. Friz, A. Jacquier and S. Violante. “Marginal density expansions for diffusions and stochastic volatility, Part II: Applications”. In: Communications on Pure and Applied Mathematics 67.2 (2014), pp. 321–350.

[52] L. Döring, B. Horvath and J. Teichmann. “Geometry and time change: Functional analytic (ir-)regularity properties of SABR-type processes”. Preprint. 2015.

[53] P. Dörsek and J. Teichmann. “A semigroup point of view on splitting schemes for stochastic (partial) differential equations”. Preprint, arXiv:1011.2651.

[54] P. Doust. “No-arbitrage SABR”. In: The Journal of Computational Finance 15.3 (2012), p. 3.

[55] D. Dufresne. “The integral of geometric Brownian motion”. In: Advances in Applied Probability 33.1 (2001), pp. 223–241.

[56] A. F. M. ter Elst, D. W. Robinson and A. Sikora. “Small time asymptotics of diffusion processes”. In: Journal of Evolution Equations 7 (2007), pp. 79–112.

[57] A. F. M. ter Elst, D. W. Robinson, A. Sikora and Y. Zhu. Dirichlet forms and degenerate elliptic operators. The Philippe Clément Festschrift. Vol. 168. Operator Theory: Advances and Applications. Birkhäuser Basel, 2006, pp. 73–95.

[58] A. F. M. ter Elst, D. W. Robinson, A. Sikora and Y. Zhu. “Second-order operators with degenerate coefficients”. In: Proceedings of the London Mathematical Society 3 (2007), pp. 299–328.


[59] D. Elworthy. “Geometric aspects of diffusions on manifolds”. In: École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87. Springer, 1988, pp. 277–425.

[60] L. G. Epstein and S. Ji. “Ambiguous volatility, possibility and utility in continuous time”. In: Journal of Mathematical Economics 50 (2014), pp. 269–282.

[61] S. Ethier and T. Kurtz. Markov processes: characterization and convergence. Wiley series in probability and mathematical statistics. Probability and mathematical statistics. Wiley, 1986.

[62] E. B. Fabes, C. E. Kenig and R. P. Serapioni. “The local regularity of solutions of degenerate elliptic equations”. In: Communications in Statistics-Theory and Methods 7.1 (1982), pp. 77–116.

[63] D. Filipovic. Term-Structure Models: A Graduate Course. Springer Finance. Springer, 2009.

[64] M. Forde and A. Pogudin. “The large-maturity smile for the SABR and CEV-Heston models”. In: International Journal of Theoretical and Applied Finance 16.8 (2013), pp. 0219–0249.

[65] M. Forde and H. Zhang. “Sharp tail estimates for the correlated SABR model”. Preprint, 2014.

[66] J.-P. Fouque, C. S. Pun and H. Y. Wong. “Portfolio optimization with ambiguous correlation and stochastic volatilities”. Preprint. 2014.

[67] M. Fukushima, Y. Oshima and M. Takeda. Dirichlet Forms and Symmetric Markov Processes. De Gruyter, 1994.

[68] J. García-Cuerva and J. R. De Francia. Weighted Norm Inequalities and Related Topics. Elsevier, 2011.

[69] L. Garlappi, R. Uppal and T. Wang. “Portfolio selection with parameter and model uncertainty: A multi-prior approach”. In: Review of Financial Studies 20.1 (2007), pp. 41–81.

[70] N. Gârleanu and L. H. Pedersen. “Dynamic portfolio choice with frictions”. Preprint. 2014.

[71] N. Gârleanu and L. H. Pedersen. “Dynamic trading with predictable returns and transaction costs”. In: Journal of Finance 68.6 (2013), pp. 2309–2340.

[72] J. Gatheral. The Volatility Surface: a Practitioner’s Guide. Wiley, 2006.

[73] J. Gatheral, E. Hsu, P. Lawrence, C. Ouyang and T.-H. Wang. “Asymptotics of implied volatility in local volatility models”. In: Mathematical Finance 22.4 (2012), pp. 591–620.


[74] J. Gatheral and T.-H. Wang. “Implied volatility from local volatility: A path integral approach”. In: Large Deviations and Asymptotic Methods in Finance (Editors: P. Friz, J. Gatheral, A. Gulisashvili, A. Jacquier, J. Teichmann), Springer Proceedings in Mathematics and Statistics 110 (2015), pp. 247–272.

[75] J. Gatheral and T.-H. Wang. “The heat kernel most likely path approximation”. In: International Journal of Theoretical and Applied Finance 15.1 (2012), p. 1250001.

[76] J. Gatheral and A. Jacquier. “Arbitrage-free SVI volatility surfaces”. In: Quantitative Finance 14.1 (2014), pp. 59–71.

[77] P. Gauthier and P.-Y. H. Rivaille. “Fitting the smile, smart parameters for SABR and Heston”. In: Smart Parameters for SABR and Heston (October 30, 2009) (2009).

[78] S. Gerhold. “The Hartman-Watson distribution revisited: asymptotics for pricing Asian options”. In: Journal of Applied Probability 48.3 (2011), pp. 892–899.

[79] I. Gilboa and D. Schmeidler. “Maxmin expected utility with non-unique prior”. In: Journal of Mathematical Economics 18.2 (1989), pp. 141–153.

[80] I. Gradshteyn and I. Ryzhik. Table of integrals, series and products. Academic Press. 1965.

[81] A. Grigor’yan. Heat Kernel and Analysis on Manifolds. American Mathematical Society, 2009.

[82] P. Guasoni and E. Mayerhofer. “The limits of leverage”. Preprint. 2014.

[83] A. Gulisashvili. Analytically tractable stochastic stock price models. Springer Science & Business Media, 2012.

[84] A. Gulisashvili. “Asymptotic formulas with error estimates for call pricing functions and the implied volatility at extreme strikes”. In: SIAM Journal on Financial Mathematics 1 (2010), pp. 609–641.

[85] A. Gulisashvili. “Left-wing asymptotics of the implied volatility in the presence of atoms”. In: International Journal of Theoretical and Applied Finance 18.2 (2015), p. 1550013.

[86] A. Gulisashvili, B. Horvath and A. Jacquier. “Mass at zero and small-strike implied volatility expansion in the SABR model”. Preprint, arXiv:1502.03254. 2015.

[87] A. Gulisashvili and P. Laurence. “The Heston Riemannian distance function”. In: Journal de Mathématiques Pures et Appliquées 101 (2014), pp. 303–329.

[88] P. Hagan, D. Kumar, A. Lesniewski and D. Woodward. “Arbitrage-free SABR”. In: Wilmott Magazine 69 (2014), pp. 60–75.

[89] P. Hagan, D. Kumar, A. Lesniewski and D. Woodward. “Managing smile risk”. In: Wilmott Magazine September (2002), pp. 84–108.


[90] P. Hagan, A. Lesniewski and D. Woodward. “Probability distribution in the SABR model of stochastic volatility”. In: Large Deviations and Asymptotic Methods in Finance (Editors: P. Friz, J. Gatheral, A. Gulisashvili, A. Jacquier, J. Teichmann), Springer Proceedings in Mathematics and Statistics 110 (2015), pp. 1–36.

[91] E. Hansen and A. Ostermann. “Exponential splitting for unbounded operators”. In: Mathematics of Computation 78.267 (2009), pp. 1485–1496.

[92] L. P. Hansen and T. J. Sargent. “Robust control and model uncertainty”. In: American Economic Review 91.2 (2001), pp. 60–66.

[93] L. P. Hansen and T. J. Sargent. Robustness. Princeton, N.J.: Princeton University Press, 2008.

[94] D. Hernández-Hernández and A. Schied. “A control approach to robust utility maximization with logarithmic utility and time-consistent penalties”. In: Stochastic Process and their Applications 117.8 (2007), pp. 980–1000.

[95] D. Hernández-Hernández and A. Schied. “Robust utility maximization in a stochastic factor model”. In: Statis. Decisions 24.1 (2006), pp. 109–125.

[96] N. Hilber, O. Reichmann, C. Schwab and C. Winter. Computational Methods for Quantitative Finance: Finite Element Methods for Derivative Pricing. Springer Finance. Springer, 2013.

[97] M. Hino and J. A. Ramírez. “Small-time Gaussian behavior of symmetric diffusion semi-groups”. In: The Annals of Probability 31.3 (2003), pp. 1254–1295.

[98] D. Hobson. “Comparison Results for Stochastic Volatility Models via Coupling”. In: Finance and Stochastics 14 (2010), pp. 129–152.

[99] E. P. Hsu. Stochastic analysis on manifolds. Vol. 38. American Mathematical Society, 2002.

[100] M. Hutzenthaler, A. Jentzen and M. Noll. “Strong convergence rates and temporal regularity for Cox-Ingersoll-Ross processes and Bessel processes with accessible boundaries”. Preprint, arXiv:1403.6385. 2014.

[101] O. Islah. “Heun solutions to the SABR model”. Preprint, SSRN//1742942.

[102] O. Islah. “Solving SABR in exact form and unifying it with LIBOR market model”. Preprint, SSRN//1489428. 2009.

[103] S. Iyengar. “Hitting lines with two-dimensional Brownian motion”. In: SIAM Journal on Applied Mathematics 45.6 (1985), pp. 983–989.

[104] J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes. 2nd. Berlin: Springer, 2003.

[105] M. Jeanblanc, M. Yor and M. Chesney. Mathematical Methods for Financial Markets. Springer Finance. Springer, 2009.


[106] D. Jerison and A. Sánchez-Calle. “Subelliptic, second order differential operators”. In: Complex Analysis III, Lecture Notes in Mathematics, Springer 1277 (1987), pp. 46–77.

[107] B. Jourdain. “Loss of martingality in asset price models with lognormal stochastic volatility”. In: International Journal of Theoretical and Applied Finance 13 (2004), pp. 767–787.

[108] O. Kallenberg. Foundations of Modern Probability. Springer, 2002.

[109] J. Kallsen. “Derivative pricing based on local utility maximization”. In: Finance and Stochastics 6.1 (2002), pp. 115–140.

[110] J. Kallsen and A. Shiryaev. “Time Change Representation of Stochastic Integrals”. In: Theory of Probability and Its Applications 46 (2002), pp. 522–528.

[111] J. Kienitz and M. Wittke. “Option valuation in multivariate SMM/SABR models (with an application to the CMS spread)”. In: SABR Models (with an Application to the CMS Spread) (June 30, 2010) (2010).

[112] A. Kufner. Weighted Sobolev spaces. Vol. 31. John Wiley & Sons Incorporated, 1985.

[113] A. Kufner and B. Opic. “How to define reasonably weighted Sobolev spaces”. In: Commentationes Mathematicae Universitatis Carolinae 25.3 (1984), pp. 537–554.

[114] P.-H. Labordère. Analysis Geometry and Modelling in Finance: Advanced Methods in Option Pricing. Chapman & Hall/CRC, 2008.

[115] P.-H. Labordère. “Unifying the BGM and SABR Models: A Short Ride in Hyperbolic Geometry”. In: Large Deviations and Asymptotic Methods in Finance (Editors: P. Friz, J. Gatheral, A. Gulisashvili, A. Jacquier, J. Teichmann), Springer Proceedings in Mathematics and Statistics 110 (2015), pp. 71–88.

[116] F. Le Floc’h and G. J. Kennedy. “Finite difference techniques for arbitrage free SABR”. Preprint, SSRN//2402001.

[117] J. M. Lee. Introduction to Smooth Manifolds. Graduate Texts in Mathematics. Springer, 2003.

[118] J. M. Lee. Riemannian Manifolds: An Introduction to Curvature. Graduate Texts in Mathematics. Springer, 1997.

[119] R. W. Lee. “The moment formula for implied volatility at extreme strikes”. In: Mathematical Finance 14 (2004), pp. 469–480.

[120] J.-L. Lions and E. Magens. “Problèmes aux limites non homogènes et applications”. In: Travaux et Recherches Mathématiques 1 (1968).

[121] P.-L. Lions and M. Musiela. “Correlations and bounds for stochastic volatility models”. In: Annales de l’Institut Henri Poincaré (C) Non Linear Analysis 24.1 (2007), pp. 1–16.


[122] R. Lord. “Fifty shades of SABR simulation”. Available at: http://www.rogerlord.com. 2014.

[123] Z.-M. Ma and M. Röckner. Introduction to the theory of (non-symmetric)Dirichlet forms. Springer Science & Business Media, 2012.

[124] P. J. Maenhout. “Robust portfolio rules and asset pricing”. In: Review ofFinancial Studies 17.4 (2004), pp. 951–983.

[125] P. J. Maenhout. “Robust portfolio rules and detection-error probabilitiesfor a mean-reverting risk premium”. In: Journal of Economic Theory 128.1(2006), pp. 136–163.

[126] R. Martin. “Optimal trading under proportional transaction costs”. In:Risk Magazine August (2014).

[127] R. Martin and T. Schöneborn. “Mean reversion pays, but costs”. In: RiskMagazine February (2012), pp. 96–101.

[128] M. Matache, T. von Petersdorff and C. Schwab. “Fast Deterministic Pri-cing of Options on Lévy Driven Assets”. In:M2AN Mathematical Modellingand Numerical Analysis 38(1) (2004), pp. 37–71.

[129] A. Matoussi, D. Possamai and C. Zhou. “Robust utility maximization innondominated models with 2BSDE: the uncertain volatility model”. In:Mathematical Finance to appear (2013).

[130] H. Matsumoto and M. Yor. “Exponential Functionals of Brownian MotionI: Probability laws at fixed time.” In: Probability Surveys 2 (2005), pp. 312–347.

[131] H. Matsumoto and M. Yor. “Exponential functionals of Brownian mo-tion, II: Some related diffusion processes”. In: Probability Surveys 2 (2005),pp. 348–384.

[132] A. Metzler. “On the first passage problem for correlated Brownian motion”.In: Statistics & probability letters 80.5 (2010), pp. 277–284.

[133] P. W. Michor. Topics in Differential Geometry. Graduate Studies in Math-ematics. American Mathematical Society, 2008.

[134] A. Mijatović and M. Pistorius. “On additive time-changes of Feller pro-cesses”. In: Progress in Analysis and its Applications Proceedings of the7th International Isaac Congress, Imperial College London UK (2009),pp. 431–437.

[135] P. D. Miller. Applied Asymptotic Analysis. Vol. 75. Graduate Studies inMathematics. American Mathematical Society, 2006.

[136] B. Muckenhoupt. “Weighted norm inequalities for the Hardy maximalfunction”. In: Transactions of the American Mathematical Society (1972),pp. 207–226.

[137] M. Musiela and M. Rutkowski. Martingale Methods in Financial Modelling. Vol. 36. Springer Science & Business Media, 2006.

[138] J. R. Norris. “Small time asymptotics for heat kernels with measurable coefficients”. In: C.R. Acad. Sci. Paris, Serie I. 322 (1996), pp. 339–344.

[139] M. Nutz. “A quasi-sure approach to the control of non-Markovian stochastic differential equations”. In: Electronic Journal of Probability 17.23 (2012), pp. 1–23.

[140] J. Obłój. “Fine-tune your Smile: Correction to Hagan et al.” In: Wilmott Magazine May (2008).

[141] H. Park. “Efficient valuation method for the SABR model”. In: Preprint, SSRN//2304951 (2013).

[142] L. Paulot. “Asymptotic implied volatility at the second order with application to the SABR model”. In: Large Deviations and Asymptotic Methods in Finance (Editors: P. Friz, J. Gatheral, A. Gulisashvili, A. Jacquier, J. Teichmann), Springer Proceedings in Mathematics and Statistics 110 (2015), pp. 37–70.

[143] T. von Petersdorff and C. Schwab. “Wavelet Discretizations of Parabolic Integrodifferential Equations”. In: SIAM Journal on Numerical Analysis 41 (2003), pp. 159–180.

[144] P. Protter. Stochastic Integration and Differential Equations. 2nd. New York: Springer, 2004.

[145] W. Qi. “Analytical Solutions of the SABR Stochastic Volatility Model”. PhD thesis. Columbia University, 2012.

[146] M.-C. Quenez. “Optimal portfolio in a multiple-priors model”. In: Seminar on Stochastic Analysis, Random Fields and Applications IV. Vol. 58. Progress in Probability. Basel: Birkhäuser, 2004, pp. 291–321.

[147] R. Rebonato, K. McKay and R. White. The SABR/LIBOR Market Model: Pricing, calibration and hedging for complex interest-rate derivatives. John Wiley & Sons, 2011.

[148] M. Reed and B. Simon. Methods of modern mathematical physics: Functional analysis. Vol. 1. Gulf Professional Publishing, 1980.

[149] O. Reichmann. “Numerical Option Pricing Beyond Lévy”. PhD thesis. ETH Zürich, 2012.

[150] O. Reichmann and C. Schwab. “Wavelet solution of degenerate Kolmogoroff forward equations for exotic contracts in finance”. In: Recent Developments in Computational Finance: Foundations, Algorithms and Applications (Editors: Th. Gerstner and P. Kloeden) 14 (2013).

[151] D. Revuz and M. Yor. Continuous Martingales and Brownian Motion. A series of comprehensive studies in mathematics. Springer, 1999.

[152] M. Röckner and Z. Sobol. “Kolmogorov equations in infinite dimensions: Well-posedness and regularity of solutions, with applications to stochastic generalized Burgers equations”. In: The Annals of Probability 34.2 (2006), pp. 663–727.

[153] M. Röckner and N. Wielens. “Dirichlet forms-closability and change of speed measure”. In: Res. Notes in Math. Infinite-dimensional analysis and stochastic processes (Bielefeld, 1983) 124 (1985), pp. 119–144.

[154] L. C. G. Rogers. “Stochastic Financial Models”. 2012.

[155] L. C. G. Rogers and D. Williams. Diffusions, Markov Processes and Martingales: Volume 2, Itô Calculus. Cambridge Mathematical Library. Cambridge University Press, 2000.

[156] A. Schied. “Optimal investments for robust utility functionals in complete market models”. In: Mathematics of Operations Research 30.3 (2005), pp. 750–764.

[157] D. Schmeidler. “Subjective probability and expected utility without additivity”. In: Econometrica 57.3 (1989), pp. 571–587.

[158] M. Schroder and C. Skiadas. “Optimal lifetime consumption-portfolio strategies under trading constraints and generalized recursive preferences”. In: Stochastic Processes and their Applications 108.2 (2003), pp. 155–202.

[159] N. A. Sidorova and O. Wittich. “Construction of surface measures for Brownian motion”. In: Trends in Stochastic analysis - a Festschrift in honour of Heinrich v. Weizsäcker. 2009.

[160] C. Skiadas. “Robust control and recursive utility”. In: Finance and Stochastics 7.4 (2003), pp. 475–489.

[161] K.-T. Sturm. “Analysis on local Dirichlet spaces. I. Recurrence, conservativeness and Lp-Liouville properties”. In: Journal für die reine und angewandte Mathematik 456 (1994), pp. 173–196.

[162] K.-T. Sturm. “Is a Diffusion Process Determined by Its Intrinsic Metric?” In: Chaos, Solitons & Fractals 8 (1997), pp. 1855–1866.

[163] K.-T. Sturm. New Directions in Dirichlet Forms. Ed. by J. Jost, W. Kendall, U. Mosco, M. Röckner and K.-T. Sturm. Vol. 9. Studies in Advanced Mathematics. AMS/IP, 1998.

[164] K.-T. Sturm. “On the geometry defined by Dirichlet forms”. In: Progress in Probability, Birkhäuser Basel 36 (1995), pp. 231–242.

[165] D. Talay and Z. Zheng. “Worst case model risk management”. In: Finance and Stochastics 6.4 (2002), pp. 517–537.

[166] D. Talay and L. Tubaro. “Expansion of the global error for numerical schemes solving stochastic differential equations”. In: Stochastic Analysis and Applications 8.4 (1990), pp. 483–509.

[167] R. Tevzadze, T. Toronjadze and T. Uzunashvili. “Robust utility maximization for a diffusion market model with misspecified coefficients”. In: Finance and Stochastics 17.3 (2013), pp. 535–563.

[168] A. Torchinsky. Real-variable methods in harmonic analysis. Courier Corporation, 2012.

[169] B. O. Turesson. Nonlinear Potential Theory and Weighted Sobolev Spaces. Vol. 1736. Springer Science & Business Media, 2000.

[170] A. Veraart and M. Winkel. “Time change”. In: Encyclopedia of Quantitative Finance (Editor R. Cont) IV (2010), pp. 1812–1816.

[171] V. Volkonskii. “Random time changes in strong Markov processes”. In: Theory Probab. Appl. 3 (1958), pp. 310–326.

[172] P. Wilmott, S. Howison and J. Dewynne. The Mathematics of Financial Derivatives: A Student Introduction. Cambridge University Press, 1995.

[173] M. Yor. “On some exponential functionals of Brownian motion”. In: Advances in Applied Probability 24 (1992), pp. 509–531.

[174] W. P. Ziemer. Weakly Differentiable Functions: Sobolev Spaces and Functions of Bounded Variation. Graduate Texts in Mathematics. Springer, 1989.

Curriculum Vitae

Blanka Nora Horvath
born Aug 22, 1985
Hungarian citizen

Education

PhD in Mathematics 02/2011–08/2015
ETH Zürich, Switzerland

Diploma in Mathematics 10/2008–11/2010
University of Bonn, Germany

Master of Economics 09/2007–09/2008
The University of Hong Kong

Double major studies in Mathematics and Economics 10/2004–08/2007
University of Bonn, Germany

Abitur at Deutsche Schule Budapest 09/1996–06/2004
Budapest, Hungary

Employment

Exercise coordinator in Mathematics 02/2011–08/2015
ETH Zürich, Switzerland

Quantitative internship Risk Management Group 11/2010–01/2011
AXA Cologne, Germany

Tutorial and research assistant in Mathematics 10/2008–10/2010, 10/2006–08/2007
University of Bonn, Germany