
Callable Swaps, Snowballs and Videogames

Claudio Albanese


Presented at Stanford University, October 2007

History of short rates (fund rates) for US dollar, the Euro and

the Japanese Yen.

1

Brief (and incomplete) history of interest rate derivatives.

• Black model.

• Vasicek-Hull-White, CIR model and more general affine models

(Duffie, Singleton, Cheyette, etc.)

• Market models: Heath-Jarrow-Morton and Brace-Gatarek-Musiela

• SABR (Hagan et al.)

2

Common themes.

• All models have some degree of analytic solvability, thought necessary for calibration.

• Lattice models are implemented using implicit differentiation schemes and smoothing techniques.

• Measure change techniques are ubiquitous.

• The stochastic process for the drift of rates is not modeled explicitly.

• Engineering implementations involve large clusters.

3

Real world, derivatives and building the missing link.

Processes used for interest rate derivatives deviate substantially from

the observed real world process and don’t possess these qualitative

features. See also the review paper by Q. Dai and K. Singleton (2003),

Term Structure Dynamics in Theory and Reality. Review of Financial

Studies 16, 631-678.

Consistent pricing of long-dated exotic derivatives is very sensitive
to model risk. A better agreement with the real world process is
necessary.

Punchline: A suitable mathematical and engineering implementation

framework can be built by implementing operator methods on mas-

sively parallel GPU architectures.

4

Stochastic calculus | Operator methods

Arbitrage free pricing | Arbitrage free pricing
Sigma algebra axioms | Convergence estimates on simplicial sequences
Dynamics described by SDEs | Dynamics described by Markov generators
Pade approximants | Fast exponentiation
Double precision | Single precision
Exponential propagation of errors at high frequencies | Homogenization
Diffusions and sparse matrices | Jump processes and full matrices
Analytic solvability | Reducibility to manipulations of matrices small enough to fit in memory
CPUs and CPU clusters | CPU/GPU pairs and GPU clusters
Calibrated market models with drift restrictions | Calibrated econometric models with explicit drift modeling
Measure changes | Operator manipulations possibly without probabilistic interpretation
Stochastic integrals | Abelian processes
Montecarlo methods | Dynamic conditioning

Arbitrage free pricing

Arbitrage free pricing is the leading pricing framework adopted in the

industry. According to the fundamental theorem of arbitrage-free pric-

ing, if one finds prices by risk neutral valuation of future payoffs and

if the chosen pricing measure is mutually absolutely continuous with

respect to the statistical measure, then there are no arbitrage oppor-

tunities.

When using operator methods, there is no benefit in considering nu-

meraire assets other than the money market account.

Utility pricing and indifference pricing have been considered in the

stochastic control literature, not (yet) with operator methods.

5

Measure theory based on topological spaces and operator alge-

bras

The theory of integration can be rooted on sigma algebras and mea-

sure spaces. It can also be rooted on the theory of locally compact,

Hausdorff topological spaces whereby one defines integrals as linear

bounded functionals over the linear space of continuous functions and

then completes with respect to various norms to obtain Lp spaces.

The latter approach leads to operator algebras, i.e. C∗ algebras.

The second approach has the advantage, from a computational viewpoint,
of being “understandable constructively”. Constructive mathematics is
a branch of mathematics in which one restricts logical constructs so
that any object must be explicitly constructed before an existential
quantifier is applied to it. Since Finance is a very applied and
computational science, a fully constructive approach to Mathematical
Finance seems appropriate.

6

Dynamics described by Markov generators

Consider a finite state space Λ. A Markov generator is given by a
time-dependent matrix of elements L(x, x′; t) indexed by x, x′ ∈ Λ that
allows one to define a Markov process and construct transition
probability kernels U(x, t; x′, t′) as the solutions of the backward
differential equation:

\[
\frac{d}{dt} U(x, t; x', t') + \sum_{y} L(x, y; t)\, U(y, t; x', t') = 0 \tag{1}
\]

with final time condition U(x, t; x′, t) = δ_{xx′}.
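As a minimal sketch of equation (1), assuming a time-independent generator on a three-state space, one can integrate the backward equation with explicit Euler steps starting from the final condition. The matrix entries, the step count and the helper name `backward_propagator` are illustrative assumptions, not taken from the talk.

```python
import numpy as np

# Hypothetical 3-state generator: off-diagonal entries are jump rates,
# and each row sums to zero.
L = np.array([[-0.5, 0.3, 0.2],
              [0.1, -0.4, 0.3],
              [0.2, 0.2, -0.4]])

def backward_propagator(L, t, t_final, steps=10_000):
    """Integrate dU/dt + L U = 0 backward from U(t_final) = I down to time t."""
    dt = (t_final - t) / steps
    U = np.eye(L.shape[0])
    for _ in range(steps):
        # Stepping from s to s - dt: U(s - dt) = U(s) + dt * L @ U(s)
        U = U + dt * (L @ U)
    return U

U = backward_propagator(L, t=0.0, t_final=1.0)
print(U)              # transition probabilities over one unit of time
print(U.sum(axis=1))  # each row should sum to one
```

Because each Euler step multiplies by I + dt L, which has non-negative entries and unit row sums for small dt, the result is a transition probability matrix by construction.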

7

Jump processes and full matrices

Specifying an SDE is equivalent to specifying a Markov generator.

When discretized, the generator corresponding to a diffusion process

is a tri-diagonal matrix. However, when working with generators di-

rectly on GPU platforms, sparsity patterns do not yield any appreciable

advantage and one can freely consider full matrices.
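To illustrate the tri-diagonal structure, here is a sketch that discretizes a diffusion dX = μ(X) dt + σ(X) dW on a uniform grid. It assumes a standard central-difference scheme, which is not necessarily the discretization used in the talk; the function name, the grid and the mean-reverting coefficients are hypothetical.

```python
import numpy as np

def diffusion_generator(x, mu, sigma):
    """Tridiagonal generator for dX = mu(X) dt + sigma(X) dW on a uniform grid x."""
    n = len(x)
    h = x[1] - x[0]
    L = np.zeros((n, n))
    for i in range(1, n - 1):
        m, s2 = mu(x[i]), sigma(x[i]) ** 2
        L[i, i + 1] = s2 / (2 * h**2) + m / (2 * h)  # up-jump rate
        L[i, i - 1] = s2 / (2 * h**2) - m / (2 * h)  # down-jump rate
        L[i, i] = -(L[i, i + 1] + L[i, i - 1])       # row sums to zero
    # Boundary rows are left at zero, i.e. absorbing (an illustrative choice).
    return L

x = np.linspace(0.0, 0.10, 51)                       # hypothetical short-rate grid
L = diffusion_generator(x,
                        mu=lambda r: 0.1 * (0.05 - r),   # assumed mean reversion
                        sigma=lambda r: 0.01)            # assumed constant volatility
```

The jump rates are chosen so that the first two local moments of the chain match the drift and the squared volatility. They stay non-negative as long as h |μ| ≤ σ², which holds for the parameters above.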

The single-name propagator satisfying equation (1) is given by the
so-called “path-ordered exponential”

\[
U(t_1, t_2) = P \exp\left( \int_{t_1}^{t_2} L(s)\, ds \right). \tag{2}
\]

There are two useful methods to express a path-ordered exponential:

Feynman path integrals and Dyson expansions.

8

Feynman path-integrals

The Feynman path-integral expansion is given by

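\[
P \exp\left( \int_{t_1}^{t_2} L(s)\, ds \right)
= \lim_{N \to \infty}
\big( I + \delta t_N L(t_1) \big)
\big( I + \delta t_N L(t_1 + \delta t_N) \big)
\cdots
\big( I + \delta t_N L(t_N) \big) \tag{3}
\]

where δt_N = (t_2 − t_1)/N.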
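The finite-N product in (3) can be checked numerically. The sketch below assumes a hypothetical two-state generator scaled by a time-dependent factor, so that the generators at different times commute and the path-ordered exponential has a closed form to compare against; none of the numbers come from the talk.

```python
import numpy as np

a, b = 1.0, 2.0
L0 = np.array([[-a, a], [b, -b]])        # hypothetical two-state generator

def L(s):
    return (1.0 + s) * L0                # time-dependent, commuting family

def ordered_product(L, t1, t2, N):
    """Right-hand side of (3) at finite N, earliest time on the left."""
    dt = (t2 - t1) / N
    U = np.eye(2)
    for k in range(N):
        U = U @ (np.eye(2) + dt * L(t1 + k * dt))
    return U

def exact(t1, t2):
    """Closed form exp(F * L0), with F the integral of (1 + s) over [t1, t2]."""
    F = (t2 - t1) + 0.5 * (t2**2 - t1**2)
    Pi = np.array([[b, a], [b, a]]) / (a + b)   # projector on the stationary law
    return Pi + np.exp(-(a + b) * F) * (np.eye(2) - Pi)

errors = [np.abs(ordered_product(L, 0.0, 1.0, N) - exact(0.0, 1.0)).max()
          for N in (10, 100, 1000)]
print(errors)   # the error shrinks as N grows
```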

An application is given below.

9

Dyson’s formula

Dyson’s formula is given by

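\[
P \exp\left( \int_{t_1}^{t_2} L(s)\, ds \right)
= \sum_{n=0}^{\infty} \frac{1}{n!}\, P\left( \int_{t_1}^{t_2} L(s)\, ds \right)^{n} \tag{4}
\]

where

\[
P\left( \int_{t_1}^{t_2} L(s)\, ds \right)^{n}
= n! \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n \,
L(s_1) L(s_2) \cdots L(s_n). \tag{5}
\]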
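As a quick consistency check (a standard special case, not taken from the slides), take a time-independent generator L(s) ≡ L. The nested integral in (5) is then the volume of an ordered simplex, and Dyson's formula (4) collapses to the ordinary exponential series:

```latex
\[
P\left( \int_{t_1}^{t_2} L\, ds \right)^{n}
= n!\, L^{n} \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n
= n!\, L^{n}\, \frac{(t_2 - t_1)^{n}}{n!}
= (t_2 - t_1)^{n} L^{n},
\]
\[
P \exp\left( \int_{t_1}^{t_2} L\, ds \right)
= \sum_{n=0}^{\infty} \frac{(t_2 - t_1)^{n}}{n!}\, L^{n}
= e^{(t_2 - t_1) L}.
\]
```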

An application is given below.

10

Fast exponentiation

The Feynman path integral representation is interesting as this formula
can be implemented numerically very efficiently on GPU architectures
by the method of fast exponentiation. The method works as follows.
Assume that the dynamic generators L(t) are piecewise constant as a
function of time. Suppose L(t) = L_i in the time interval
[t_i, t_i + (∆t)_i]. Assume δt is chosen so small that the following
two conditions hold:

\[
\text{(FE1)} \qquad \min_{y \in \Lambda} \big( 1 + \delta t\, L_i(y, y) \big) \ge 1/2
\]

\[
\text{(FE2)} \qquad \log_2 \frac{(\Delta t)_i}{\delta t} = n \in \mathbb{N}.
\]

These conditions lead to intervals δt of the order of one hour of
calendar time and this is indeed the choice we make. To compute
e^{(∆t)_i L_i}(x, y), we first define the elementary propagator

\[
u_{\delta t}(x, y) = \delta_{xy} + \delta t\, L_i(x, y) \tag{6}
\]

and then evaluate in sequence u_{2δt} = u_{δt} · u_{δt},
u_{4δt} = u_{2δt} · u_{2δt}, ..., u_{2^n δt} = u_{2^{n−1} δt} · u_{2^{n−1} δt}.
After n squarings, u_{2^n δt} approximates e^{(∆t)_i L_i}.
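The squaring scheme can be sketched in a few lines of NumPy. This is a CPU illustration under stated assumptions, not the GPU implementation from the talk: the function name, the three-state generator and the choice of a three-month interval are hypothetical.

```python
import numpy as np

def fast_exponentiation(L, Dt, n):
    """Approximate exp(Dt * L) by n squarings of u = I + dt * L, dt = Dt / 2**n."""
    dt = Dt / 2**n                       # (FE2) holds by construction
    u = np.eye(L.shape[0]) + dt * L      # elementary propagator
    # (FE1): every diagonal entry of the elementary propagator is >= 1/2.
    if u.diagonal().min() < 0.5:
        raise ValueError("dt too large: condition (FE1) violated; increase n")
    for _ in range(n):
        u = u @ u                        # u_{2 dt} = u_dt . u_dt, and so on
    return u

# Hypothetical generator (rates per year) on three states.
L = np.array([[-2.0, 1.5, 0.5],
              [1.0, -3.0, 2.0],
              [0.5, 0.5, -1.0]])
Dt = 0.25                                # a three-month interval, in years
n = 11                                   # dt = Dt / 2**11, roughly one hour
U = fast_exponentiation(L, Dt, n)
print(U)
```

The cost is n matrix-matrix products instead of 2^n matrix-vector steps, which is what makes the method a good match for hardware tuned for dense matrix multiplication.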

11

Internal smoothing, sensitivities and floating point errors

Matrix multiplication is accomplished numerically by invoking either

the single precision routine sgemm or the double precision routine dgemm.

For practical purposes discussed in pricing theory, single precision is

almost always sufficient. In fact, the algorithm of fast exponentiation
as described above, with a δt satisfying the above bound, has
self-smoothing properties which make the impact of floating point
errors far smaller than one would naively expect from a
back-of-the-envelope worst-case estimate.
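One way to probe this claim is to run the same squaring scheme in both precisions and compare. The sketch below is only an experiment under assumed inputs (a random 64-state generator with hypothetical jump rates), and it runs on the CPU, where NumPy typically dispatches float32 products to a BLAS sgemm and float64 products to dgemm.

```python
import numpy as np

def fast_exp(L, Dt, n, dtype):
    """n squarings of I + dt * L, carried out in the given floating point type."""
    dt = Dt / 2**n
    u = (np.eye(L.shape[0]) + dt * L).astype(dtype)
    for _ in range(n):
        u = u @ u
    return u

rng = np.random.default_rng(0)
N = 64
L = rng.uniform(0.0, 0.05, size=(N, N))    # hypothetical jump rates
np.fill_diagonal(L, 0.0)
np.fill_diagonal(L, -L.sum(axis=1))         # make each row sum to zero

U32 = fast_exp(L, Dt=1.0, n=10, dtype=np.float32)
U64 = fast_exp(L, Dt=1.0, n=10, dtype=np.float64)
max_diff = np.abs(U32.astype(np.float64) - U64).max()
print(max_diff)   # largest entrywise gap between single and double precision
```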

12

A new notion of solvability

The traditional notion of analytic solvability involves reducing the
calculation of certain quantities to the evaluation of special functions
(usually a variant of hypergeometric functions) by means of Taylor
expansion or a Pade approximant (Black-Scholes, CEV, CIR, HW,
Duffie’s affine models, Heston model, Albanese-Kuznetsov-Lawi
classification schemes, etc.).

A second notion of solvability is based on asymptotic expansions
(Hagan’s SABR, Piterbarg’s SV-BGM expansions, Papanicolaou et al.
volatility models).

A new notion of solvability that arises with operator methods is the
reducibility of the problem to the multiplication of matrices of size
small enough that all the required buffers fit in the CPU and GPU
memory spaces. This definition changes quite radically the flavour of
mathematical work and also moves the boundary of solvable models
to include a much larger number of processes and payoffs.
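To make the memory constraint concrete, here is a back-of-the-envelope sketch. It assumes three dense single precision buffers (two operands and one result of a matrix product) and a 1.5GB device memory budget, the figure quoted for the Tesla GPU elsewhere in the talk; the buffer count and the function name are illustrative assumptions.

```python
import math

def max_matrix_dimension(memory_bytes, n_buffers=3, bytes_per_entry=4):
    """Largest N such that n_buffers dense N x N matrices fit in memory_bytes."""
    return math.isqrt(memory_bytes // (n_buffers * bytes_per_entry))

# Assumptions: 1.5 GB of device memory and three single precision buffers.
budget = int(1.5 * 2**30)
N = max_matrix_dimension(budget)
print(N)   # largest state-space size under these assumptions
```

Real implementations need additional workspace, so the usable dimension is smaller than this upper bound.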

13

CPUs and CPU clusters

The traditional hardware frameworks for computational finance are
single core CPUs on which one executes single threaded code.

As an alternative, Montecarlo algorithms run on CPU clusters with
slow interconnects, over which one spawns multiple processes, typically
with a PVM-type job queuing algorithm.

14

CPU/GPU pairs and GPU clusters

Operator methods perform best on massively parallel multi-core archi-

tectures. A current example would be a Tesla GPU with 16 single-

instruction-multiple-data SIMD processors, each with 32 data registers

and one instruction register, with 1.5GB shared fast-access memory.

Such GPUs are linked to the CPU by a 100 MHz bus on a PCI-E
connection, which carries the GPU-CPU data transfers. Although this
transfer rate is orders of magnitude faster than the typical inter-node
communication speed in a cluster, the transfer on a bus is a possible
bottleneck, which needs to be avoided by executing as much of the
calculation as possible on the GPU side while keeping data buffers
resident in GPU memory.

As a next level, GPU clusters allow one to execute loosely coupled GPU
jobs in parallel while being coordinated by a CPU.

15

Execution times in seconds for various portfolios under various

configurations.

Task | GPU-O | GPU-D | Host-O | Ratio

Initialization | 4.79 | 5.16 | 3.77 | 0.79
Calibration to term structure of rates | 4.76 | 6.14 | 61.88 | 13.00
585 European swaptions | 7.62 | 10.17 | 85.30 | 11.19
Portfolio of CMS spread range accruals: 9 callable swaps and 3 callable snowballs | 21.03 | 23.74 | 134.59 | 6.40
30240 ATM European swaptions | 11.34 | 16.16 | 138.68 | 12.23
13725 ATM European swaptions and 13725 ATM Bermuda swaptions | 56.56 | 59.02 | 272.19 | 4.81

GPU-O: using the GPU with host side optimized code.

GPU-D: using the GPU with host side debug code.

Host-O: using the host only with optimized code and Intel MKL libraries.

Ratio: column 3 versus column 1, i.e. Host-O time divided by GPU-O time.

16

“Economic” models without drift restrictions

Market models are characterized by a large number of dynamically

constrained processes. They are well suited for Montecarlo simulations

but not to the application of operator methods.

The models that are best suited to operator methods are specified

through a minimalistic filtration without drift restrictions. These tend

to be models with an intuitive economic content and explanatory

power such as short rate interest rate models, credit equity models

specified as defaultable equity models, stochastic skew FX models

with stochastic drift, etc.

17

Measure Changes

A mathematical framework is largely characterized by its morphisms,

the operations we allow ourselves to carry out on the objects at our

disposal.

In the traditional measure theoretic approach to finance, measure

changes play a pivotal role as they allow one to map stochastic pro-

cesses into stochastic processes.

With operator methods, numeraire changes correspond to a transfor-

mation of generators of the form

\[
L^{G}(t) = \frac{1}{G(t)}\, L(t)\, G(t) + \frac{1}{G(t)} \frac{\partial G(t)}{\partial t} \tag{7}
\]

where G(t) is a diagonal operator corresponding to the new numeraire.
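For a diagonal G the transformation (7) acts entrywise, which the following sketch makes explicit. It assumes a hypothetical three-state generator and a time-independent numeraire vector g, so the derivative term vanishes and the transformed propagator should equal the original one conjugated by G; the function names are illustrative.

```python
import numpy as np

def numeraire_transform(L, g, dg_dt=None):
    """Equation (7) for a diagonal numeraire operator G = diag(g)."""
    g = np.asarray(g, dtype=float)
    # (G^-1 L G)(x, y) = L(x, y) * g(y) / g(x)
    LG = L * g[None, :] / g[:, None]
    if dg_dt is not None:
        # G^-1 dG/dt is diagonal with entries g'(x) / g(x)
        LG = LG + np.diag(np.asarray(dg_dt, dtype=float) / g)
    return LG

def propagate(L, Dt, n=12):
    """exp(Dt * L) by fast exponentiation (n squarings of I + dt * L)."""
    u = np.eye(L.shape[0]) + (Dt / 2**n) * L
    for _ in range(n):
        u = u @ u
    return u

L = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.0, 0.5],
              [0.0, 1.0, -1.0]])          # hypothetical generator
g = np.array([1.0, 1.2, 1.5])             # hypothetical constant numeraire

U = propagate(L, 1.0)
UG = propagate(numeraire_transform(L, g), 1.0)
print(np.allclose(UG, U * g[None, :] / g[:, None]))
```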

18

Although this mapping allows one to rederive all the classic results of
continuous time finance, from analytic solvability to quanto options
and drift restrictions, it is not particularly useful for taking full
advantage of the formalism.

Operator morphisms

The most useful operator manipulations are

• Path exponentiation

• Operator deformation and differentiation

• Operator lifting for path-dependent processes

• Block diagonalizations for Abelian processes

• Kernel splitting for dynamic conditioning

19

Time dependent adjustment function λ(t) to fit the term struc-

ture of rates.

Notice that this func-

tion is very close to 1. This is achieved by calibrating the drift of the

monetary policy process.

20

Zero curves in the deflation regime.

21

Zero curves in the regime with drift -75 bp/year.

22

Zero curves in the regime with drift -25 bp/year.

23

Zero curves in the regime with drift +25 bp/year.

24

Zero curves in the regime with drift +100 bp/year.

25

Backbone of 2Y-10Y correlation, i.e. correlation between daily

returns of the 2Y swap rate versus the 10Y swap rate as a

function of the short rate.

26

Backbone of 1Y-20Y correlation, i.e. correlation between daily

returns of the 1Y swap rate versus the 20Y swap rate as a

function of the short rate.

27

Projected occupancy probabilities of monetary policy regimes.

28

Implied volatilities for at-the-money European swaptions of tenor

2Y compared to market data.

29



30



31

Implied volatility skews for European swaptions of tenor 2Y com-

pared to market data.

32

Implied volatility skews for European swaptions of tenor 5Y com-

pared to market data.

33

Implied volatility skews for European swaptions of tenor 20Y

compared to market data. Only implied volatilities for extreme

strikes 16% over the forward failed to compute, as the graph

shows.

34

Implied volatility backbone, i.e. the scatterplot of the implied at

the money volatility of 2Y into 2Y European swaptions versus

the corresponding forward rate.

35




36




37

Bermuda premium backbone, i.e. the Bermuda premium of 4Y

into 2Y at-the-money swaptions plotted against the correspond-

ing forward rate.

38

Bermuda premium backbone, i.e. the Bermuda premium of

5Y into 10Y at-the-money swaptions plotted against the corre-

sponding forward rate.

39

2Y into 5Y convexity backbone, i.e. the convexity correction for

2Y into 5Y European constant-maturity-swaps (CMS) with

respect to European swaptions.

40

3Y into 2Y convexity backbone, i.e. the convexity correction for

3Y into 2Y European constant-maturity-swaps (CMS) with

respect to European swaptions.

41

10Y into 20Y convexity backbone, i.e. the convexity correction

for 10Y into 20Y European constant-maturity-swaps (CMS)

with respect to European swaptions.

42

Pricing functions of callable CMS spread range accruals.

43


44


45


46


47


48


49

Pricing functions of callable snowball CMS spread range accru-

als.

50



51



52

Conclusions

Operator methods are an emerging mathematical framework for fi-

nance and econometrics which is suitable for semi-parametric and non

parametric modeling.

We showed an example concerning long dated fixed income derivatives

but I worked out several others concerning credit, equity and energy

derivatives. Just google my name.

The practical engineering applications of operator methods rely on fast

implementations of matrix-matrix multiplication algorithms, which can

nowadays be best achieved on massively parallel GPUs optimized for

single precision floating point arithmetic.

53
