Callable Swaps, Snowballs and Videogames
Claudio Albanese
Presented at Stanford University, October 2007
History of short rates (fund rates) for the US dollar, the euro and
the Japanese yen.
1
Brief (and incomplete) history of interest rate derivatives.
• Black model.
• Vasicek-Hull-White, CIR model and more general affine models
(Duffie, Singleton, Cheyette, etc.)
• Market models: Heath-Jarrow-Morton and Brace-Gatarek-Musiela
• SABR (Hagan et al.)
2
Common themes.
• All models have some degree of analytic solvability, thought necessary for calibration.
• Lattice models are implemented using implicit differentiation schemes and smoothing techniques.
• Measure change techniques are ubiquitous.
• The stochastic process for the drift of rates is not modeled explicitly.
• Engineering implementations involve large clusters.
3
Real world, derivatives and building the missing link.
Processes used for interest rate derivatives deviate substantially from
the observed real world process and don’t possess these qualitative
features. See also the review paper by Q. Dai and K. Singleton (2003),
Term Structure Dynamics in Theory and Reality, Review of Financial
Studies 16, 631-678.
Consistent pricing of long-dated exotic derivatives is very sensitive
to model risk. Better agreement with the real world process is
necessary.
Punchline: A suitable mathematical and engineering implementation
framework can be built by implementing operator methods on
massively parallel GPU architectures.
4
Stochastic calculus                                   | Operator methods
------------------------------------------------------+------------------------------------------------
Arbitrage free pricing                                | Arbitrage free pricing
Sigma algebra axioms                                  | Convergence estimates on simplicial sequences
Dynamics described by SDEs                            | Dynamics described by Markov generators
Pade approximants                                     | Fast exponentiation
Double precision                                      | Single precision
Exponential propagation of errors at high frequencies | Homogenization
Diffusions and sparse matrices                        | Jump processes and full matrices
Analytic solvability                                  | Reducibility to manipulations of matrices small enough to fit in memory
CPUs and CPU clusters                                 | CPU/GPU pairs and GPU clusters
Calibrated market models with drift restrictions      | Calibrated econometric models with explicit drift modeling
Measure changes                                       | Operator manipulations possibly without probabilistic interpretation
Stochastic integrals                                  | Abelian processes
Montecarlo methods                                    | Dynamic conditioning
Arbitrage free pricing
Arbitrage free pricing is the leading pricing framework adopted in the
industry. According to the fundamental theorem of arbitrage-free pric-
ing, if one finds prices by risk neutral valuation of future payoffs and
if the chosen pricing measure is mutually absolutely continuous with
respect to the statistical measure, then there are no arbitrage oppor-
tunities.
When using operator methods, there is no benefit in considering nu-
meraire assets other than the money market account.
Utility pricing and indifference pricing have been considered in the
stochastic control literature, not (yet) with operator methods.
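As an illustration of risk neutral valuation with the money market account as numeraire, here is a minimal sketch in Python. It is not taken from the slides. It assumes a finite state space, a time-homogeneous and symmetric generator, and a short rate r(x) that depends only on the state, so that prices are obtained from the discounted propagator exp(T (L - diag(r))). The grid, the jump intensity and the helper names are all hypothetical.

```python
import numpy as np

def random_walk_generator(n_states, rate):
    """Symmetric nearest-neighbour Markov generator; every row sums to zero."""
    gen = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        gen[i, i + 1] = gen[i + 1, i] = rate
    return gen - np.diag(gen.sum(axis=1))

def symmetric_expm(a):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T

n_states = 41
short_rate = np.linspace(0.0, 0.10, n_states)    # hypothetical r(x): 0% to 10%
gen = random_walk_generator(n_states, rate=5.0)   # hypothetical intensity per year
maturity = 2.0                                    # years

# Discounted propagator exp(T (L - diag(r))): money market account as numeraire.
discounted = symmetric_expm(maturity * (gen - np.diag(short_rate)))

# Price today, for each initial state x, of one unit of currency paid at T.
zero_bond = discounted @ np.ones(n_states)

# Sanity check with a flat 3% rate: the price must be exp(-0.03 * T) in every state.
flat = symmetric_expm(maturity * (gen - 0.03 * np.eye(n_states))) @ np.ones(n_states)
```

Any other payoff vector can replace `np.ones(n_states)`; the flat-rate case is included only because its answer is known in closed form.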
5
Measure theory based on topological spaces and operator algebras
The theory of integration can be rooted in sigma algebras and measure
spaces. It can also be rooted in the theory of locally compact
Hausdorff topological spaces, whereby one defines integrals as bounded
linear functionals over the linear space of continuous functions and
then completes with respect to various norms to obtain Lp spaces.
The latter approach leads to operator algebras, i.e. C∗ algebras.
The second approach has the computational advantage of being
“understandable constructively”. Constructive mathematics is a branch
of mathematics in which one restricts the logical constructs so that
any object must be explicitly constructed before an existential
quantifier is applied to it. Since finance is a very applied and
computational science, a fully constructive approach to mathematical
finance seems appropriate.
6
Dynamics described by Markov generators
Consider a finite state space Λ. A Markov generator is given by a
time-dependent matrix of elements L(x, x′; t) indexed by x, x′ ∈ Λ that
allows one to define a Markov process and construct transition
probability kernels U(x, t; x′, t′) as the solutions of the backward
differential equation
$$\frac{d}{dt}\, U(x,t;x',t') + \sum_{y} L(x,y;t)\, U(y,t;x',t') = 0 \tag{1}$$
with final time condition $U(x,t;x',t) = \delta_{xx'}$.
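A small numerical check of the backward equation can make the definition concrete. The sketch below is not from the slides. It assumes a constant (time-independent) generator on a hypothetical 3-state space, for which the kernel is exp((t′ − t) L), and it verifies equation (1) by a central finite difference. The truncated Taylor series is used only because the matrix is tiny.

```python
import numpy as np

def expm_series(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result

# Hypothetical 3-state generator: off-diagonals are jump rates, rows sum to zero.
L = np.array([[-0.5, 0.3, 0.2],
              [0.1, -0.4, 0.3],
              [0.2, 0.2, -0.4]])

def kernel(t, t_final):
    """U(., t; ., t_final) for a constant generator: exp((t_final - t) L)."""
    return expm_series((t_final - t) * L)

t, t_final, h = 0.0, 2.0, 1e-5
U = kernel(t, t_final)

# Central difference in the first time argument, then the residual of equation (1).
dU_dt = (kernel(t + h, t_final) - kernel(t - h, t_final)) / (2 * h)
residual = np.abs(dU_dt + L @ U).max()

print("residual of backward equation:", residual)
print("row sums of U:", U.sum(axis=1))
```

Because the rows of L sum to zero and its off-diagonal entries are non-negative, each row of U is a probability distribution over final states.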
7
Jump processes and full matrices
Specifying an SDE is equivalent to specifying a Markov generator.
When discretized, the generator corresponding to a diffusion process
is a tri-diagonal matrix. However, when working with generators di-
rectly on GPU platforms, sparsity patterns do not yield any appreciable
advantage and one can freely consider full matrices.
The single-name propagator satisfying equation (1) is given by the
so-called “path-ordered exponential”
$$U(t_1,t_2) = P\exp\left(\int_{t_1}^{t_2} L(s)\,ds\right). \tag{2}$$
There are two useful methods to express a path-ordered exponential:
Feynman path integrals and Dyson expansions.
8
Feynman path-integrals
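The Feynman path-integral expansion is given by
$$P\exp\left(\int_{t_1}^{t_2} L(s)\,ds\right) = \lim_{N\to\infty} \big(I + \delta t_N L(t_1)\big)\big(I + \delta t_N L(t_1 + \delta t_N)\big)\cdots\big(I + \delta t_N L(t_N)\big) \tag{3}$$
where $\delta t_N = (t_2 - t_1)/N$.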
An application is given below.
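Separately from the application the slides turn to later, the convergence of the product in (3) can be checked numerically. The sketch below is an illustration under assumptions that are not in the slides: the time-dependent generator is taken to be L(t) = (1 + 0.5 t) L0 for a hypothetical 3-state L0, so that generators at different times commute and the path-ordered exponential reduces to an ordinary exponential of the time integral.

```python
import numpy as np

def expm_series(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result

# Hypothetical 3-state generator (rows sum to zero).
L0 = np.array([[-0.5, 0.3, 0.2],
               [0.1, -0.4, 0.3],
               [0.2, 0.2, -0.4]])

def generator(t):
    """Assumed time dependence: a scalar profile times a fixed generator."""
    return (1.0 + 0.5 * t) * L0

def feynman_product(t1, t2, n_steps):
    """Time-ordered product of (I + dt L(t_k)), earliest time on the left."""
    dt = (t2 - t1) / n_steps
    identity = np.eye(L0.shape[0])
    u = identity.copy()
    for k in range(n_steps):
        u = u @ (identity + dt * generator(t1 + k * dt))
    return u

# The integral of (1 + 0.5 s) over [0, 1] is 1.25, so the exact answer is exp(1.25 L0).
exact = expm_series(1.25 * L0)
errors = {n: np.abs(feynman_product(0.0, 1.0, n) - exact).max() for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```

The error should shrink roughly in proportion to 1/N, which is why a very small time step is needed in practice.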
9
Dyson’s formula
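Dyson’s formula is given by
$$P\exp\left(\int_{t_1}^{t_2} L(s)\,ds\right) = \sum_{n=0}^{\infty} \frac{1}{n!}\, P\left(\int_{t_1}^{t_2} L(s)\,ds\right)^{n} \tag{4}$$
where
$$P\left(\int_{t_1}^{t_2} L(s)\,ds\right)^{n} = n! \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n\, L(s_1) L(s_2) \cdots L(s_n). \tag{5}$$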
An application is given below.
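As a quick consistency check of (4) and (5), which is not part of the original slides, consider the special case of a constant generator L(s) = L. The nested integral in (5) is then the volume of the simplex t1 ≤ s1 ≤ ... ≤ sn ≤ t2, and Dyson's series collapses to the ordinary matrix exponential:

```latex
\begin{aligned}
P\Big(\int_{t_1}^{t_2} L\,ds\Big)^{n}
  &= n!\, L^{n} \int_{t_1}^{t_2} ds_1 \int_{s_1}^{t_2} ds_2 \cdots \int_{s_{n-1}}^{t_2} ds_n
   = n!\, L^{n}\, \frac{(t_2 - t_1)^{n}}{n!}
   = (t_2 - t_1)^{n} L^{n}, \\
P\exp\Big(\int_{t_1}^{t_2} L\,ds\Big)
  &= \sum_{n=0}^{\infty} \frac{(t_2 - t_1)^{n}}{n!}\, L^{n}
   = e^{(t_2 - t_1) L}.
\end{aligned}
```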
10
Fast exponentiation
The Feynman path integral representation is interesting as this formula
can be implemented numerically very efficiently on GPU architectures
by the method of fast exponentiation. The method works as follows.
Assume that the dynamic generators L(t) are piecewise constant as a
function of time. Suppose $L(t) = L_i$ in the time interval $[t_i, t_i + (\Delta t)_i]$.
Assume δt is chosen so small that the following two conditions hold:
(FE1) $\min_{y \in \Lambda} \big(1 + \delta t\, L_i(y,y)\big) \ge 1/2$
(FE2) $\log_2 \dfrac{(\Delta t)_i}{\delta t} = n \in \mathbb{N}$.
These conditions lead to intervals δt of the order of one hour of calendar
time and this is indeed the choice we make. To compute $e^{(\Delta t)_i L_i}(x,y)$,
we first define the elementary propagator
$$u_{\delta t}(x,y) = \delta_{xy} + \delta t\, L_i(x,y) \tag{6}$$
and then evaluate in sequence $u_{2\delta t} = u_{\delta t} \cdot u_{\delta t}$, $u_{4\delta t} = u_{2\delta t} \cdot u_{2\delta t}$, ..., $u_{2^n \delta t} = u_{2^{n-1}\delta t} \cdot u_{2^{n-1}\delta t}$.
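The repeated-squaring recipe above can be sketched in a few lines of NumPy. This is a CPU illustration under stated assumptions, not the author's GPU code: the generator is a hypothetical symmetric random walk on 64 states, the interval is one unit of time, and the result is compared with an eigendecomposition-based exponential, which is available only because the toy generator is symmetric.

```python
import numpy as np

def random_walk_generator(n_states, rate):
    """Symmetric nearest-neighbour Markov generator; every row sums to zero."""
    gen = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        gen[i, i + 1] = gen[i + 1, i] = rate
    return gen - np.diag(gen.sum(axis=1))

def symmetric_expm(a):
    """Reference matrix exponential of a symmetric matrix (double precision)."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T

def fast_exponentiation(gen, big_dt, n_doublings, dtype=np.float32):
    """Approximate exp(big_dt * gen) by squaring (I + dt * gen) n_doublings times."""
    dt = big_dt / 2 ** n_doublings                  # (FE2): big_dt / dt = 2**n
    if np.min(1.0 + dt * np.diag(gen)) < 0.5:       # (FE1)
        raise ValueError("dt too large: condition (FE1) is violated")
    u = (np.eye(gen.shape[0]) + dt * gen).astype(dtype)   # elementary propagator (6)
    for _ in range(n_doublings):
        u = u @ u                                   # u_{2dt} = u_dt . u_dt, and so on
    return u

gen = random_walk_generator(64, rate=10.0)          # hypothetical rates
approx = fast_exponentiation(gen, big_dt=1.0, n_doublings=12)
exact = symmetric_expm(1.0 * gen)
max_error = np.abs(approx - exact).max()
print("largest entrywise difference:", max_error)
```

Only n matrix products are needed to cover 2^n elementary time steps, which is what makes a time step as small as δt affordable.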
11
Internal smoothing, sensitivities and floating point errors
Matrix multiplication is accomplished numerically by invoking either
the single precision routine sgemm or the double precision routine dgemm.
For the practical purposes discussed in pricing theory, single precision
is almost always sufficient. In fact, the algorithm of fast exponentiation
as described above, with a δt satisfying the above bound, has
self-smoothing properties which reduce the impact of floating point
errors to far less than what one would naively expect from a
back-of-the-envelope worst case estimate.
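One way to probe this claim on a toy problem is to run the same repeated squaring in single and in double precision and compare both with a double precision reference. The sketch below is an illustration under assumptions that are not in the slides: a hypothetical symmetric 64-state generator, a one-year interval, and 13 doublings, so that δt = 1/8192 of a year, or roughly one hour. On float32 and float64 arrays NumPy's `@` typically dispatches to the BLAS routines sgemm and dgemm when NumPy is built against a BLAS library.

```python
import numpy as np

def random_walk_generator(n_states, rate):
    """Symmetric nearest-neighbour Markov generator; every row sums to zero."""
    gen = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        gen[i, i + 1] = gen[i + 1, i] = rate
    return gen - np.diag(gen.sum(axis=1))

def symmetric_expm(a):
    """Reference matrix exponential of a symmetric matrix (double precision)."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T

def fast_exponentiation(gen, big_dt, n_doublings, dtype):
    """Square the elementary propagator (I + dt * gen) n_doublings times."""
    dt = big_dt / 2 ** n_doublings
    u = (np.eye(gen.shape[0]) + dt * gen).astype(dtype)
    for _ in range(n_doublings):
        u = u @ u
    return u

gen = random_walk_generator(64, rate=7.3)       # hypothetical rates, per year
one_year, n_doublings = 1.0, 13                 # dt = 1/8192 year, roughly one hour
reference = symmetric_expm(one_year * gen)

report = {}
for dtype in (np.float32, np.float64):
    u = fast_exponentiation(gen, one_year, n_doublings, dtype)
    report[np.dtype(dtype).name] = {
        "max_abs_error": float(np.abs(u - reference).max()),
        "max_row_sum_drift": float(np.abs(u.sum(axis=1) - 1.0).max()),
    }

for name, stats in report.items():
    print(name, stats)
```

The row-sum drift measures how well total probability is conserved, which is a convenient proxy for accumulated rounding error. The numbers depend on the generator and on the BLAS backend, so the experiment is best repeated on the model of interest.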
12
A new notion of solvability
The traditional notion of analytic solvability involves reducing the
calculation of certain quantities to the evaluation of special functions
(usually a variant of hypergeometric functions) by means of a Taylor
expansion or a Pade approximant (Black-Scholes, CEV, CIR, HW,
Duffie’s affine models, Heston model, Albanese-Kuznetsov-Lawi
classification schemes, etc.).
A second notion of solvability is based on asymptotic expansions
(Hagan’s SABR, Piterbarg’s SV-BGM expansions, Papanicolaou et al.
volatility models).
A new notion of solvability that arises with operator methods is the
reducibility of the problem to the multiplication of matrices of size
small enough that all the required buffers fit in the CPU and GPU
memory spaces. This definition changes quite radically the flavour of
mathematical work and also moves the boundary of solvable models
to include a much larger number of processes and payoffs.
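To make the memory criterion concrete, here is a back-of-the-envelope sketch in Python. The 1.5 GB figure is the device memory quoted for the Tesla GPU elsewhere in these slides, taken here as 1.5 × 2^30 bytes (an assumption). The state-space sizes are illustrative and not the author's.

```python
def buffer_bytes(n_states, bytes_per_entry=4):
    """Bytes needed for one dense n_states x n_states matrix (4 = single precision)."""
    return n_states * n_states * bytes_per_entry

def max_resident_buffers(n_states, memory_bytes, bytes_per_entry=4):
    """How many such matrices fit in the given amount of memory."""
    return memory_bytes // buffer_bytes(n_states, bytes_per_entry)

gpu_memory = int(1.5 * 1024 ** 3)   # assumed: 1.5 GB read as 1.5 * 2**30 bytes

for n in (512, 1024, 4096):
    size_mib = buffer_bytes(n) / 2 ** 20
    print(n, "states:", size_mib, "MiB per matrix,",
          max_resident_buffers(n, gpu_memory), "matrices fit")
```

Doubling the number of states quadruples the size of every buffer, so the size of the state space is the quantity to watch when deciding whether a model is solvable in this sense.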
13
CPUs and CPU clusters
The traditional hardware frameworks for computational finance are
given by single core CPUs over which one executes single threaded
code.
As an alternative, Montecarlo algorithms run on CPU clusters with
slow interconnects over which one spawns multiple processes, typically
with a job-queuing, PVM-type algorithm.
14
CPU/GPU pairs and GPU clusters
Operator methods perform best on massively parallel multi-core
architectures. A current example would be a Tesla GPU with 16
single-instruction-multiple-data (SIMD) processors, each with 32 data
registers and one instruction register, with 1.5 GB of shared fast-access
memory. Such GPUs are linked to the CPU by a 100 MHz bus on a PCI-E
connection, which carries the GPU-CPU data transfers. Although this
transfer rate is orders of magnitude faster than the typical inter-node
communication speed in a cluster, the transfer on a bus is a possible
bottleneck which needs to be avoided by executing GPU side calculations
as much as possible while keeping data buffers resident in GPU memory.
As a next level, GPU clusters allow one to execute loosely coupled GPU
jobs in parallel while being coordinated by a CPU.
15
Execution times in seconds for various portfolios under various
configurations.
Task                                          GPU-O   GPU-D  Host-O  Ratio
Initialization                                 4.79    5.16    3.77   0.79
Calibration to term structure of rates         4.76    6.14   61.88  13.00
585 European swaptions                         7.62   10.17   85.30  11.19
Portfolio of CMS spread range accruals:
  9 callable swaps and 3 callable snowballs   21.03   23.74  134.59   6.40
30240 ATM European swaptions                  11.34   16.16  138.68  12.23
13725 ATM European swaptions and
  13725 ATM Bermuda swaptions                 56.56   59.02  272.19   4.81
GPU-O: using the GPU with host side optimized code.
GPU-D: using the GPU with host side debug code.
Host-O: using the host only with optimized code and Intel MKL libraries.
Ratio: column 3 (Host-O) versus column 1 (GPU-O).
16
“Economic” models without drift restrictions
Market models are characterized by a large number of dynamically
constrained processes. They are well suited for Montecarlo simulations
but not to the application of operator methods.
The models that are best suited to operator methods are specified
through a minimalistic filtration without drift restrictions. These tend
to be models with an intuitive economic content and explanatory
power such as short rate interest rate models, credit equity models
specified as defaultable equity models, stochastic skew FX models
with stochastic drift, etc.
17
Measure Changes
A mathematical framework is largely characterized by its morphisms,
the operations we allow ourselves to carry out on the objects at our
disposal.
In the traditional measure theoretic approach to finance, measure
changes play a pivotal role as they allow one to map stochastic pro-
cesses into stochastic processes.
With operator methods, numeraire changes correspond to a transformation
of generators of the form
$$L^{G}(t) = \frac{1}{G(t)}\, L(t)\, G(t) + \frac{1}{G(t)}\, \frac{\partial G(t)}{\partial t} \tag{7}$$
where G(t) is a diagonal operator corresponding to the new numeraire.
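One way to see where this formula comes from is sketched below. The slides do not state how the transformed kernel is defined, so the conjugation $U^{G}(t,t') = G(t)^{-1}\, U(t,t')\, G(t')$ is an assumption made here for illustration. Differentiating it in t and using the backward equation (1) in matrix form, $\partial_t U = -L\,U$, gives:

```latex
\begin{aligned}
\frac{\partial}{\partial t}\, U^{G}(t,t')
  &= -\,G(t)^{-1}\, \frac{\partial G(t)}{\partial t}\, G(t)^{-1}\, U(t,t')\, G(t')
     + G(t)^{-1}\, \frac{\partial U(t,t')}{\partial t}\, G(t') \\
  &= -\,G(t)^{-1}\, \frac{\partial G(t)}{\partial t}\, U^{G}(t,t')
     - G(t)^{-1}\, L(t)\, G(t)\, U^{G}(t,t') \\
  &= -\,L^{G}(t)\, U^{G}(t,t').
\end{aligned}
```

Under that assumption the transformed kernel satisfies the same backward equation with the generator given by the formula for a numeraire change.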
18
Although this mapping allows one to rederive all the classic results of
continuous time finance, from analytic solvability to quanto options and
drift restrictions, it is not particularly useful for taking full advantage
of the formalism.
Operator morphisms
The most useful operator manipulations are
• Path exponentiation
• Operator deformation and differentiation
• Operator lifting for path-dependent processes
• Block diagonalizations for Abelian processes
• Kernel splitting for dynamic conditioning
19
Time dependent adjustment function λ(t) to fit the term structure
of rates.
Notice that this function is very close to 1. This is achieved by
calibrating the drift of the monetary policy process.
20
Zero curves in the deflation regime.
21
Zero curves in the regime with drift -75 bp/year.
22
Zero curves in the regime with drift -25 bp/year.
23
Zero curves in the regime with drift +25 bp/year.
24
Zero curves in the regime with drift +100 bp/year.
25
Backbone of 2Y-10Y correlation, i.e. correlation between daily
returns of the 2Y swap rate versus the 10Y swap rate as a
function of the short rate.
26
Backbone of 1Y-20Y correlation, i.e. correlation between daily
returns of the 1Y swap rate versus the 20Y swap rate as a
function of the short rate.
27
Projected occupancy probabilities of monetary policy regimes.
28
Implied volatilities for at-the-money European swaptions of tenor
2Y compared to market data.
29
Implied volatilities for at-the-money European swaptions of tenor
5Y compared to market data.
30
Implied volatilities for at-the-money European swaptions of tenor
20Y compared to market data.
31
Implied volatility skews for European swaptions of tenor 2Y com-
pared to market data.
32
Implied volatility skews for European swaptions of tenor 5Y com-
pared to market data.
33
Implied volatility skews for European swaptions of tenor 20Y
compared to market data. Only implied volatilities for extreme
strikes 16% over the forward failed to compute, as the graph
shows.
34
Implied volatility backbone, i.e. the scatterplot of the implied at
the money volatility of 2Y into 2Y European swaptions versus
the corresponding forward rate.
35
Implied volatility backbone, i.e. the scatterplot of the implied at
the money volatility of 4Y into 5Y European swaptions versus
the corresponding forward rate.
36
Implied volatility backbone, i.e. the scatterplot of the implied at
the money volatility of 10Y into 20Y European swaptions versus
the corresponding forward rate.
37
Bermuda premium backbone, i.e. the Bermuda premium of 4Y
into 2Y at-the-money swaptions plotted against the correspond-
ing forward rate.
38
Bermuda premium backbone, i.e. the Bermuda premium of
5Y into 10Y at-the-money swaptions plotted against the corre-
sponding forward rate.
39
2Y into 5Y convexity backbone, i.e. the convexity correction
of 2Y into 5Y European constant-maturity-swaps (CMS) with
respect to European swaptions.
40
3Y into 2Y convexity backbone, i.e. the convexity correction
of 3Y into 2Y European constant-maturity-swaps (CMS) with
respect to European swaptions.
41
10Y into 20Y convexity backbone, i.e. the convexity correction
of 10Y into 20Y European constant-maturity-swaps (CMS)
with respect to European swaptions.
42
Pricing functions of callable CMS spread range accruals.
43
Pricing functions of callable CMS spread range accruals.
44
Pricing functions of callable CMS spread range accruals.
45
Pricing functions of callable CMS spread range accruals.
46
Pricing functions of callable CMS spread range accruals.
47
Pricing functions of callable CMS spread range accruals.
48
Pricing functions of callable CMS spread range accruals.
49
Pricing functions of callable snowball CMS spread range accru-
als.
50
Pricing functions of callable snowball CMS spread range accru-
als.
51
Pricing functions of callable snowball CMS spread range accru-
als.
52
Conclusions
Operator methods are an emerging mathematical framework for finance
and econometrics which is suitable for semi-parametric and
non-parametric modeling.
We showed an example concerning long-dated fixed income derivatives,
but I have worked out several others concerning credit, equity and
energy derivatives. Just google my name.
The practical engineering applications of operator methods rely on fast
implementations of matrix-matrix multiplication algorithms, which can
nowadays best be achieved on massively parallel GPUs optimized for
single precision floating point arithmetic.
53