
A Two-Regime Markov-Chain

Model for the Swaption Matrix


Dr Mathias Thorsten Keil

Christ Church College

University of Oxford

A thesis submitted in partial fulfillment of the MSc in

Mathematical Finance

August 16, 2006

Abstract

The most commonly used models for pricing complex interest rate products are

LIBOR Market Models. These models describe the evolution of a set of forward

LIBOR rates. For a given set of LIBOR rates such a model is defined by the term

structures of their volatilities and the correlations among these rates. It is therefore of

vital importance to use a parameterization for these quantities that is able to capture

the important features of the market the model is calibrated to. Since this is usually

the swap market a good parameterization of the LIBOR Market Model should also

be a good model for the volatility term structure of forward rates, i.e. the swaption

matrix which contains all available information about the volatility term structure in

the swap market.

It is a well known feature of the interest rate markets that in times of market

turmoil the volatility term structure undergoes sudden changes in shape when the

market enters these excited regimes. After a short period the market is normal again

which also switches the shape of the volatility term structure back to normal. Re-

bonato and Kainth proposed an approach to capture this market feature in a LIBOR

Market Model. They use the most common parameterization of the volatility term

structure to describe the normal market situation and the excited regime each with

one set of stochastic parameters. In the present work a simpler approach is

taken by keeping the parameters constant. The model captures the switches between

the two regimes by transition probabilities.

After introducing the setup of a LIBOR Market Model the parameterization of

the forward rate volatility term structure is presented and it is shown how this relates

to swap rates. The algorithms for calibrating to caplets as well as swaptions are

discussed. The sensitivities of the model parameters are systematically analyzed. The quality of the calibration to swaption data is assessed by letting the fit vary different subsets of the parameters. Finally, a very promising and simple parameterization of

the model that captures the regime switches is fitted with four parameters to a series

of monthly market data spanning a period of four years that includes major financial

events. This model yields a dramatic improvement over the model without regime

switches in the description of the swaption matrix without increasing the number of

fitting parameters. While the latter approach yields an average deviation of 76 basis

points in implied volatility the new model obtains 63. Dramatic improvements are

obtained during and after periods of market turbulence. In addition the fit parameters

are a lot more stable over time which implies lower re-hedging costs when using this

model.


To Meike


Acknowledgment

First of all I would like to thank my supervisor, Dr Riccardo Rebonato, for his out-

standing support and his patience throughout the project. By giving me the oppor-

tunity to participate in the Oxford program, my employer, d-fine GmbH Frankfurt,

made this work possible. I benefited from discussions with Dr Laurent Hoffmann and

Dr Matthias Mayr. For pointing out the minpack-algorithm to me I thank Dr Gotz

Rienacker.


Contents

1 Introduction
    1.1 Market Models
    1.2 LIBOR Market Models With a Smile

2 Instantaneous Volatility
    2.1 Caplets
        2.1.1 The Algorithm
    2.2 Swaptions
        2.2.1 The Algorithm
    2.3 Variation of Parameters
        2.3.1 Volatility Term Structure
        2.3.2 Swaptions

3 Calibration to Swaption Prices
    3.1 The Data
    3.2 The Fitting Algorithm
        3.2.1 Some Checks

4 Results for Fits with Regime Switches
    4.1 Regime Switches for a “Normal” and an “Excited” Market
    4.2 Fits for Many Trading Days
    4.3 Necessary and Possible Future Steps

5 Conclusions

A Bootstrapping

B The Levenberg-Marquardt Method


Chapter 1

Introduction

In the past years, with the trading of more and more complex interest-rate deriva-

tive instruments, the modelling of the underlying rates received enormous attention.

Modelling and market behaviour feed back on each other: as the models became more complex, the markets evolved as well. The observed changes in the market influence the models and the changes in modelling standards influence the markets.

Complex products include optionalities on LIBOR as well as swap rates. Therefore

the models used for pricing and hedging such products should be able to describe the

markets of caps and floors as well as swaption markets in a satisfactory manner. The

smiles observed in implied volatility surfaces of cap, floor and swaption markets have

evolved during the years. While in earlier models (like the one by [6]) the occurrence

of a smile was sometimes considered as a side effect, by now the modelling of the

complex shapes takes up maybe most of the effort.

In pricing interest-rate derivatives the market standard for modelling the term structure of interest rates is given by LIBOR and swap market models. There is a close

connection between the two markets so that models that describe the dynamics of

LIBOR rates and the ones describing swap rates can be used rather interchangeably.

However, each model assumes a log-normal distribution of its underlying rates, which cannot hold for both models at the same time, and not even for forward rates of different tenors. But the discrepancy is small [15] and does

not matter in the common applications of the models. The market models are so

successful because most interest rate derivatives depend on a finite number of points

on the yield curve and can therefore be modelled by simulating the dynamics of the

corresponding set of rates. In agreement with market conventions (as long as one

neglects smiles) the rates are log-normal and Black’s [1] formulae can be applied.

Usually in market models a set of forward LIBOR rates or a set of co-terminal swap

rates or both is considered. The set of traded instruments that should be used to


calibrate the model is therefore a rather natural choice, which makes the sensitivity

to market prices very transparent.

On the other hand the setup also has some disadvantages, which are due to modelling a finite set of rates. In addition, the sampling of a set of rates is a high-dimensional problem, although the dimensionality can be reduced by principal component analysis.

Each specification of a LIBOR Market Model results in a multi-factor model for

the simultaneous evolution of a given number of forward rates. As I briefly show in

the following section all the drifts are given by the no-arbitrage condition and the

only remaining degrees of freedom are the volatility and correlation term structures.

1.1 Market Models

Today, market models are well known and detailed descriptions can be found in textbooks and other publications. This overview is based on [3], [16], [17], [22]. The

dynamics of the forward rates are given by the equations

$$\frac{df_i}{f_i} = \mu(\{\sigma_j\},\{f_j\})\,dt + \sigma_i(T_i)\,dz_i \tag{1.1}$$

$$dz_i\,dz_j = \rho_{ij}(T_i,T_j)\,dt \tag{1.2}$$

where all variables are at time t and fi stands for the i-th forward LIBOR rate starting

at time Ti and maturing after one period of length τi, i runs from 1 to the number of

considered rates. The corresponding σi are the volatility term structures dependent

on the starting times Ti, and dzi represent the Wiener increments for each rate. The

Wiener increments are correlated via the matrix ρij . We will see that the crucial

parameters in the model are the σi and ρij . Once these quantities are determined the

functional form of the µ({σj}, {fj}) depends on the choice of numeraire only. The

actual functional form for a given choice of numeraire is fixed by the no-arbitrage

requirement. I will use the notation $\mu_i^k$ for the drift of $f_i$ under the measure k as

discussed below.

A natural choice of numeraire is to use one of the discount bonds P (Tk) with

maturity Tk. Under this numeraire one of the equations (1.1) has a zero drift term

since the corresponding forward rate is a martingale. In order to determine the other

drift terms I consider a forward rate agreement (FRA) which has the underlying rate

that starts at time Ti and matures at time Ti + τi = Ti+1. Its value is then given by

τifiP (Ti). Since FRAs are tradable, the expression

$$\tau_i f_i\,\frac{P(T_i)}{P(T_k)}$$

is a martingale in the chosen measure for any i. The special choice i = k shows that under this measure fk is also a martingale and therefore has zero drift, as stated before. In addition, the discount bonds are also tradable and P(Ti)/P(Tk) is a martingale as well. Using Ito’s lemma we can now look at the differential of the FRA value and adjust the drift terms in Eq. (1.1) such that the SDE for the FRA has zero drift.

$$d\left(f_i\,\frac{P(T_i)}{P(T_k)}\right) = \frac{P(T_i)}{P(T_k)}\,df_i + f_i\,d\left(\frac{P(T_i)}{P(T_k)}\right) + df_i\,d\left(\frac{P(T_i)}{P(T_k)}\right) \tag{1.3}$$

The ratio of the two discount bonds depends on the forward rates in the following

way if i < k

$$\frac{P(T_i)}{P(T_k)} = \prod_{j=i+1}^{k} (1+\tau_j f_j) \tag{1.4}$$

For the differential of this ratio I need to consider the product rule.

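$$
\begin{aligned}
d\left(\frac{P(T_i)}{P(T_k)}\right) &= d\left(\prod_{j=i+1}^{k}(1+\tau_j f_j)\right) \\
&= \sum_{l=i+1}^{k} \frac{1}{1+\tau_l f_l} \prod_{j=i+1}^{k}(1+\tau_j f_j)\,\tau_l\,df_l \\
&\quad + \sum_{l=i+1}^{k}\sum_{m=i+1}^{l} \frac{1}{1+\tau_l f_l}\,\frac{1}{1+\tau_m f_m} \prod_{j=i+1}^{k}(1+\tau_j f_j)\,\tau_l\,df_l\,\tau_m\,df_m \\
&= \frac{P(T_i)}{P(T_k)} \sum_{l=i+1}^{k}\left(\frac{\tau_l}{1+\tau_l f_l}\,df_l + \sum_{m=i+1}^{l} \frac{\tau_l}{1+\tau_l f_l}\,\frac{\tau_m}{1+\tau_m f_m}\,df_l\,df_m\right)
\end{aligned}
\tag{1.5}
$$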

By using Eq. (1.1) and Ito’s lemma I can simplify this expression by keeping terms

of order dt and less:

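$$
\begin{aligned}
d\left(\frac{P(T_i)}{P(T_k)}\right) &= \frac{P(T_i)}{P(T_k)} \times \sum_{l=i+1}^{k}\left(\frac{\tau_l f_l}{1+\tau_l f_l}\,(\mu_l\,dt + \sigma_l\,dz_l) + \sum_{m=i+1}^{l} \frac{\tau_l f_l}{1+\tau_l f_l}\,\frac{\tau_m f_m}{1+\tau_m f_m}\,\sigma_l\,\rho_{lm}\,\sigma_m\,dt\right) \\
&= \frac{P(T_i)}{P(T_k)} \sum_{l=i+1}^{k} \frac{\tau_l f_l}{1+\tau_l f_l}\,\sigma_l\,dz_l
\end{aligned}
\tag{1.6}
$$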

In the last step I used the fact that the ratio of discount bonds is a martingale and

therefore the sde must not have a drift term. In other words, the µl in the second

line have the form that the first and second dt terms cancel.

Now, I consider the drift terms in Eq. (1.3). From the left hand side it is clear

that the total drift is zero. Since the second term on the right hand side is just fi

times the differential of the ratio of discount bonds, this term has no drift. So the drifts

in the first and last terms need to cancel. The first term is

$$\frac{P(T_i)}{P(T_k)}\,df_i = \frac{P(T_i)}{P(T_k)}\,f_i\,(\mu_i\,dt + \sigma_i\,dz_i) \tag{1.7}$$

Keeping only terms to order dt and using Ito again, the last term is

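$$
\begin{aligned}
df_i\,d\left(\frac{P(T_i)}{P(T_k)}\right) &= f_i\,\sigma_i\,dz_i\,\frac{P(T_i)}{P(T_k)} \sum_{l=i+1}^{k} \frac{\tau_l f_l}{1+\tau_l f_l}\,\sigma_l\,dz_l \\
&= \frac{P(T_i)}{P(T_k)}\,f_i\,\sigma_i \sum_{l=i+1}^{k} \frac{\tau_l f_l}{1+\tau_l f_l}\,\sigma_l\,\rho_{il}\,dt
\end{aligned}
\tag{1.8}
$$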

The drift term µi can now be read off, by demanding that the drifts of the rhs of

Eq. (1.7) and Eq. (1.8) add up to zero:

$$\mu_i = -\sigma_i \sum_{l=i+1}^{k} \frac{\tau_l f_l}{1+\tau_l f_l}\,\sigma_l\,\rho_{il} \tag{1.9}$$

These drift terms hold for i < k when the discount bond P(Tk) is chosen as numeraire.

The same steps can be carried out for i > k where the ratio of discount bonds is

given by

$$\frac{P(T_i)}{P(T_k)} = \prod_{j=k+1}^{i} \frac{1}{1+\tau_j f_j} \tag{1.10}$$

The use of the product rule finally yields the solution

$$\mu_i = \sigma_i \sum_{l=k+1}^{i} \frac{\tau_l f_l}{1+\tau_l f_l}\,\sigma_l\,\rho_{il} \tag{1.11}$$

With µk = 0 all the drift terms in the model where we use P (Tk) as a numeraire are

now determined. For other numeraires similar results can be obtained.
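The drift formulas of Eqs. (1.9) and (1.11) can be illustrated with a minimal sketch. It assumes that the rate index of the text is used directly as a zero-based list index and that all inputs are evaluated at one point in time; the function name `lmm_drifts` and the sample numbers are hypothetical and not part of the original work.

```python
from typing import List, Sequence


def lmm_drifts(f: Sequence[float], tau: Sequence[float],
               sigma: Sequence[float], rho: Sequence[Sequence[float]],
               k: int) -> List[float]:
    """Drifts mu_i of the forward rates under the numeraire P(T_k).

    Implements Eq. (1.9) for i < k, Eq. (1.11) for i > k and mu_k = 0.
    Assumption: list position i plays the role of the rate index i.
    """
    n = len(f)
    # g[l] = tau_l f_l sigma_l / (1 + tau_l f_l), the factor shared by both sums
    g = [tau[l] * f[l] * sigma[l] / (1.0 + tau[l] * f[l]) for l in range(n)]
    mu = [0.0] * n
    for i in range(n):
        if i < k:      # Eq. (1.9): sum over l = i+1 .. k, negative sign
            mu[i] = -sigma[i] * sum(g[l] * rho[i][l] for l in range(i + 1, k + 1))
        elif i > k:    # Eq. (1.11): sum over l = k+1 .. i, positive sign
            mu[i] = sigma[i] * sum(g[l] * rho[i][l] for l in range(k + 1, i + 1))
    return mu


# Hypothetical example: three semi-annual rates, perfectly correlated
forwards = [0.04, 0.045, 0.05]
accruals = [0.5, 0.5, 0.5]
vols = [0.20, 0.18, 0.16]
corr = [[1.0, 1.0, 1.0] for _ in range(3)]
drifts = lmm_drifts(forwards, accruals, vols, corr, k=1)
```

With positive volatilities and correlations the rates with i < k receive a negative drift and those with i > k a positive one, while the rate belonging to the numeraire bond stays driftless.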

From this sketch of a derivation of the drift terms in the specific setup of the LI-

BOR Market Model it is immediately clear that the complete set of model parameters

is given by today’s forward rates and the covariance term structure for these rates.

A similar statement is true for swap market models. The only difference is that a

set of co-terminal swap rates and their volatility term structures need to be specified.

Instead of discount bonds the natural numeraires for a swap market model are swap annuities

$$B(T_i,T_n) = \sum_{j=i}^{n-1} \tau_j\,P(T_{j+1}) \tag{1.12}$$


where Ti is the start date of the swap and P(Ti+1) is a discount bond maturing at the end of the period τi. Thus the fixed leg of a swap starting at Ti has the value $SR_{N\times M}\,B(T_i,T_n)$ with the swap rate $SR_{N\times M}$, where the starting date is N = Ti and the swap matures after M = Tn − Ti years.

The sde for the forward swap rates can be written down in the same way as for the

forward LIBOR rates. Choosing a swap annuity as a numeraire and by considering

the forward value of the fixed leg of a swap it is possible to do very similar steps

as detailed above in the case of the FRA. Keeping in mind that both, the fixed leg

and the annuity are martingales, one finds the drift terms of the co-terminal swap

rates. The expressions are more complicated than for the forward LIBOR rates.

Since I will consider LIBOR Market Models in the following sections I will not go

into any further detail for the swap market models. As stated before, there is a close

connection between the two types of models. In the next chapter I show how to

calibrate a LIBOR Market Model using swap rates. But before that I present some

approaches to account for the smile observed in the Plain Vanilla Markets.

1.2 LIBOR Market Models With a Smile

As pointed out earlier, most of the current research in the field of LIBOR Market

Models goes into modelling the shape of the volatility surface (or cube) and the

correlations. Initially, when a kind of smile was first observed in JPY fixed income

markets the volatility was a monotonically decreasing function of the moneyness. The

models for plain-vanilla options started taking these features into account in the late

1990s and it was then observable in most currencies. Major financial events like the

Russia default and the LTCM crisis in late 1998 triggered market turbulences and

thereby also led to more complex volatility shapes. A historical review of the various

approaches and a nice discussion can be found in e.g. [18].

The approach that I will be using throughout this work is a constant elasticity of

variance (CEV) approach where a very common re-parameterization is applied (see

e.g. [16]). The standard form of a CEV process is given by

$$df = \mu(f,\sigma)\,dt + \sigma(\tau)\,f^{\beta}\,dz \tag{1.13}$$

where β is a constant parameter which leads to a normal process for β = 0 and a

log-normal process for β = 1 but it can also take on other values in the interval (0, 1).

Analytic solutions to some typical problems exist for special choices of β only (0, 1/2, 2/3, and 1). It is therefore very useful to work with a displaced rate

$$\frac{d(f+\alpha)}{f+\alpha} = \frac{df}{f+\alpha} = \mu^{(\alpha)}(f,\sigma)\,dt + \sigma^{(\alpha)}(\tau)\,dz \tag{1.14}$$

with the constant α. The notion of a displaced diffusion process was introduced in [21]

in the context of modelling the firm value. This process offers a lot more analytical

tractability (see [16] for detailed calculations) and in addition for a wide range of rate

values the displaced diffusion process is a re-parameterization of the CEV process

[11]. The approximate re-parameterization

$$\alpha = f_0\,\frac{1-\beta}{\beta} \tag{1.15}$$

will be used in section 2.3.2 (f0 represents the forward rate as of today).
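As a small illustration of Eq. (1.15), the following sketch maps a CEV exponent to the corresponding displacement. The function name `displacement_from_cev` and the restriction of β to the interval (0, 1] are assumptions made for this example only.

```python
def displacement_from_cev(f0: float, beta: float) -> float:
    """Approximate displacement alpha = f0 (1 - beta) / beta of Eq. (1.15).

    Assumption: beta is restricted to (0, 1], since the formula
    diverges in the normal limit beta -> 0.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    return f0 * (1.0 - beta) / beta


# Hypothetical example: a forward rate of 5% and three exponents
for beta in (1.0, 0.5, 0.25):
    print(beta, displacement_from_cev(0.05, beta))
```

The log-normal case β = 1 needs no displacement, and the displacement grows as β moves towards the normal limit.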

So far, the volatility term structure is a deterministic function of time and the

forward rate. The CEV or displaced diffusion approach can account for monotonically

decreasing smiles. An obvious extension would be to introduce jumps, as such models

are well known from equities. However, such a model is very complex and it can be

shown that for most of these models prices of the same quality can be obtained by

deterministic coefficients ([16] and references therein). A more flexible way to obtain

complex shapes is to make the volatility a function of one or more stochastic processes

other than the forward rates, e.g. [8].

In what follows a different approach is chosen which is mainly inspired by market

observation. The idea was first presented in [20]. It is based on the observation that

the instantaneous volatility term structure for forward rates and also for swap rates

has different shapes in “normal” and “excited” market situations. Changes between

normal and turbulent market situations are more or less instantaneous and therefore

modelled by a Markov chain. The following chapter explains how to calibrate such a

model and shows the possible volatility term structures that can be obtained.


Chapter 2

Instantaneous Volatility

As I pointed out in the previous chapter the crucial quantities that need to be chosen

in order to use a LIBOR or swap market model are term structures of the volatilities

and correlations. In order to calibrate the model to market data the term structure of

the instantaneous volatility needs to be parameterized and the parameters need to be

adjusted to market data. A number of possible parameterizations are available, from which I use a very common and rather general one (see e.g. [3], [16]). In addition the volatility term structure contains a stochastic component, as discussed in [19].

2.1 Caplets

Our starting point is the stochastic process

$$\frac{df}{f} = \mu(\sigma,f)\,dt + \sigma(t,T)\,dz \tag{2.1}$$

where f is the forward rate and dz is a Wiener increment. For the percentage volatility

σ I choose the well-known parameterization ([3], [16])

$$\sigma(t,T) = \sigma(\tau) = (a + b\tau)\exp(-c\tau) + d \tag{2.2}$$

with τ = T − t and a, b, c, and d real valued constants. In order to use this parame-

terization in the displaced-diffusion set up

$$\frac{df}{f+\alpha} = \mu(\sigma^{\alpha},f)\,dt + \sigma^{\alpha}(t,T)\,dz \tag{2.3}$$

I transform the parameters of the forward-rate volatility to the volatility of the dis-

placed forward rate as follows:

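$$a^{\alpha} = a\,\frac{f}{f+\alpha}, \qquad b^{\alpha} = b\,\frac{f}{f+\alpha}, \qquad c^{\alpha} = c, \qquad d^{\alpha} = d\,\frac{f}{f+\alpha} \tag{2.4}$$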

In order to use the Black formalism I consider the root mean square σ of the

instantaneous volatility which is defined by the following equation:

$$\sigma(T,t)^2\,(T-t) = \int_t^T \sigma_{inst}(T,u)^2\,du . \tag{2.5}$$

Equivalently for the volatility of the displaced forward rates

$$\sigma^{\alpha}(T,t)^2\,(T-t) = \int_t^T \sigma^{\alpha}_{inst}(T,u)^2\,du \tag{2.6}$$

With the obtained volatility I can then use the Black formula in order to obtain the

price of the caplet. With the parameterization of the instantaneous volatility as given

in Eq. (2.2) the integration can be done analytically as given in e.g. [16].
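Eqs. (2.2), (2.4) and (2.5) can be sketched in a few lines. The sketch below replaces the analytic integration mentioned above by a simple midpoint rule, which is a stand-in chosen for brevity; the function names and the number of quadrature points are assumptions of this example.

```python
import math
from typing import Tuple


def inst_vol(tau: float, a: float, b: float, c: float, d: float) -> float:
    """Instantaneous volatility of Eq. (2.2) as a function of tau = T - t."""
    return (a + b * tau) * math.exp(-c * tau) + d


def displaced_params(a: float, b: float, c: float, d: float,
                     f: float, alpha: float) -> Tuple[float, float, float, float]:
    """Parameters of the displaced-rate volatility according to Eq. (2.4)."""
    scale = f / (f + alpha)
    return a * scale, b * scale, c, d * scale


def rms_vol(t: float, T: float, a: float, b: float, c: float, d: float,
            n: int = 2000) -> float:
    """Root-mean-square (Black) volatility of Eq. (2.5).

    Assumption: midpoint quadrature with n points instead of the
    closed-form integral referred to in the text.
    """
    h = (T - t) / n
    variance = sum(inst_vol(T - (t + (k + 0.5) * h), a, b, c, d) ** 2
                   for k in range(n)) * h
    return math.sqrt(variance / (T - t))


# Hypothetical example: Black volatility of a caplet expiring in two years
params = (0.00057399, 0.00255834, 0.32406904, 0.00519951)
black_vol = rms_vol(0.0, 2.0, *params)
black_vol_displaced = rms_vol(0.0, 2.0, *displaced_params(*params, f=0.05, alpha=0.02))
```

Because the transformation of Eq. (2.4) rescales a, b and d by the same factor, the displaced volatility is simply f/(f + α) times the original one at every τ.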

In order to account for the observed changes of the volatility term structure in the

market I then introduce another stochastic parameter that determines whether the

market makes the transition from a normal state into an excited state or vice versa.

The probabilities for these transitions are

$$\begin{bmatrix} \lambda_{nn} & \lambda_{xn} \\ \lambda_{nx} & \lambda_{xx} \end{bmatrix} \tag{2.7}$$

Then the function Eq. (2.2) needs to be extended to

$$\sigma(t,T_i) = y_t\,\sigma_n(t,T_i) + (1-y_t)\,\sigma_x(t,T_i) \;\Leftrightarrow\; \sigma(t,T_i)^2 = y_t\,\sigma_n(t,T_i)^2 + (1-y_t)\,\sigma_x(t,T_i)^2 . \tag{2.8}$$

The variable y assumes values 0 or 1 and follows a two-state Markov-chain process

and σn(t, Ti) and σx(t, Ti) represent the parameterizations of the normal and excited

states respectively. For a particular realization of yt on the considered time interval I

obtain a chain of normal and excited pieces in the volatility term structure squared.

The integration of this given chain then yields the variance and therefore the root

mean square.


2.1.1 The Algorithm

As input parameters I use the current forward rate f0 and the parameterizations

for σn(t, Ti) and σx(t, Ti), i.e. an, bn, cn, dn, ax, bx, cx, and dx with the transition

probabilities λnx and λxn and the initial state (excited or normal). I also need the

displacement in the diffusion α and the expiry and strike of the caplet. The insight

is that, conditional on a particular realization of the stochastic parameter y, the

setting is exactly that of a deterministic-volatility LIBOR Market Model. The price

under regime switches can therefore be obtained by integrating the conditional prices.

For a given number Nsteps of subintervals I integrate the squares of both volatilities

$\sigma_n(t,T_i)^2$ and $\sigma_x(t,T_i)^2$ by splitting up the integral in Eq. (2.5). I perform a Monte-Carlo simulation where I determine for a number Nsim of simulations whether these

subintervals are in normal or excited states by drawing uniform random numbers

and using the probabilities from Eq. (2.7). In each simulation I add up the already

integrated variances for the subintervals and obtain the Black volatility by dividing

through the time interval and taking the square root. This volatility I can then use

in the Black formula and obtain the caplet price. I take the average over the prices of

all simulations and, from the unconditional price thus obtained, compute the implied

volatility.
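The caplet algorithm described above can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the text: the regimes are encoded as 0 (normal) and 1 (excited), `p_nx` and `p_xn` are switching probabilities per subinterval, the subinterval variances are obtained by midpoint quadrature instead of the analytic integral, and prices are undiscounted (the discount factor cancels when the implied volatility is computed). All function names are hypothetical.

```python
import math
import random


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(fwd: float, strike: float, vol: float, expiry: float) -> float:
    """Undiscounted Black price of a call on the forward rate."""
    s = vol * math.sqrt(expiry)
    if s <= 0.0:
        return max(fwd - strike, 0.0)
    h1 = (math.log(fwd / strike) + 0.5 * s * s) / s
    return fwd * norm_cdf(h1) - strike * norm_cdf(h1 - s)


def implied_vol(price, fwd, strike, expiry, lo=1e-8, hi=5.0):
    """Invert black_call by bisection (the price increases with vol)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if black_call(fwd, strike, mid, expiry) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inst_vol(tau, a, b, c, d):
    return (a + b * tau) * math.exp(-c * tau) + d


def interval_variances(params, expiry, n_steps, n_quad=20):
    """Integrated squared volatility on each of n_steps equal subintervals."""
    dt = expiry / n_steps
    h = dt / n_quad
    out = []
    for k in range(n_steps):
        acc = sum(inst_vol(expiry - (k * dt + (q + 0.5) * h), *params) ** 2
                  for q in range(n_quad))
        out.append(acc * h)
    return out


def regime_switching_caplet_vol(f0, strike, expiry, alpha, normal, excited,
                                p_nx, p_xn, n_steps=60, n_sim=1000,
                                start=0, seed=0):
    """Implied volatility of the average of the conditional Black prices."""
    rng = random.Random(seed)
    scale = f0 / (f0 + alpha)          # Eq. (2.4): displaced a, b, d

    def displaced(p):
        a, b, c, d = p
        return (a * scale, b * scale, c, d * scale)

    variances = [interval_variances(displaced(normal), expiry, n_steps),
                 interval_variances(displaced(excited), expiry, n_steps)]
    p_switch = [p_nx, p_xn]            # probability of leaving state 0 / state 1
    total = 0.0
    for _ in range(n_sim):
        state, var = start, 0.0
        for k in range(n_steps):
            # the first subinterval keeps the initial state
            if k > 0 and rng.random() < p_switch[state]:
                state = 1 - state
            var += variances[state][k]
        vol = math.sqrt(var / expiry)  # conditional Black volatility
        total += black_call(f0 + alpha, strike + alpha, vol, expiry)
    # implied volatility without displacement from the unconditional price
    return implied_vol(total / n_sim, f0, strike, expiry)
```

When both regimes share the same parameters and α = 0, every path yields the same variance and the routine returns the deterministic root-mean-square volatility, which is a convenient consistency check.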

2.2 Swaptions

A swaption is the option to enter into a swap with fixed rate K that starts at a given

time Tn and matures at time Tm. The underlying swap pays/receives a fixed rate

of the underlying nominal value and in return receives/pays a floating rate at some

initially fixed dates. Let τi denote the period following Ti, fi(t) ≡ f(t, Ti, Ti + τi)

the forward LIBOR rate spanning that period, Ni the nominal value relevant for

the payments at the end of that period, and P (t, Ti+1) a discount bond maturing at

Ti+1. In order to obtain the par rate of the swap both legs need to have the same

present value. The corresponding forward par swap rate can be viewed as a linear

combination of forward rates (see e.g. [14]):

$$SR_k(t) = \sum_{i=n}^{m-1} w_i(t)\,f_i(t) = \frac{\sum_{i=n}^{m-1} N_i\,\tau_i\,f_i(t)\,P(t,T_{i+1})}{\sum_{i=n}^{m-1} N_i\,\tau_i\,P(t,T_{i+1})} \tag{2.9}$$

where k labels the par rate of a swap starting at Tn and maturing at time Tm. I

denote the percentage instantaneous volatility of a swap rate expiring N = Tn − t

years from time t and maturing another M = Tm − Tn years after expiry by $\sigma_{N\times M}(t)$ or also by $\sigma_{SR_k}(t)$ for short. By applying Ito’s Lemma I obtain

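$$\left(\sigma_{N\times M}(t)\right)^2 \equiv \left(\sigma_{SR_k}(t)\right)^2 = \frac{\sum_i \sum_j \frac{\partial SR}{\partial f_i}\,\frac{\partial SR}{\partial f_j}\,f_i\,f_j\,\rho_{ij}\,\sigma_i\,\sigma_j}{\left(\sum_l w_l f_l\right)^2} \tag{2.10}$$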

where σi is the instantaneous volatility of the i-th forward rate and ρij is the

correlation of the i-th and the j-th rate. The sums over i, j, and l start with the

expiry and run over all fixing dates.
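The weights of Eq. (2.9) can be sketched as follows. The list layout is an assumption of this example: entry i of `discounts` holds the discount bond P(t, Ti+1) that belongs to period i, and unit nominal values are used when none are given. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def swap_rate_weights(discounts: Sequence[float], taus: Sequence[float],
                      notionals: Optional[Sequence[float]] = None) -> List[float]:
    """Weights w_i = N_i tau_i P(t, T_{i+1}) / sum_j N_j tau_j P(t, T_{j+1})."""
    if notionals is None:
        notionals = [1.0] * len(taus)
    terms = [n * tau * p for n, tau, p in zip(notionals, taus, discounts)]
    total = sum(terms)
    return [x / total for x in terms]


def swap_rate(forwards: Sequence[float], weights: Sequence[float]) -> float:
    """Forward par swap rate as a weighted sum of forward rates, Eq. (2.9)."""
    return sum(w * f for w, f in zip(weights, forwards))


# Hypothetical example: four semi-annual periods starting at T_n,
# discount bonds built from the forward rates themselves
taus = [0.5, 0.5, 0.5, 0.5]
forwards = [0.030, 0.035, 0.040, 0.045]
p_start = 0.97                      # assumed value of P(t, T_n)
discounts, p = [], p_start
for tau, f in zip(taus, forwards):
    p = p / (1.0 + tau * f)
    discounts.append(p)
weights = swap_rate_weights(discounts, taus)
par_rate = swap_rate(forwards, weights)
```

For unit nominal values the result agrees with the familiar expression (P(t, Tn) − P(t, Tm)) divided by the annuity, because each term τi fi P(t, Ti+1) equals the difference of two consecutive discount bonds.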

Using the approximation of static weights together with Eq. (2.9) and as in the

case of the forward rates Eq. (2.5) we calculate the Black volatility which is the root

mean square of the instantaneous volatility of the k-th swap rate:

$$\left(\sigma_{SR_k}(t)\right)^2 (T_n - t) = \int_t^{T_n} \left(\sigma_{SR_k}(u)\right)^2 du \tag{2.11}$$

I obtain

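$$\left(\sigma_{SR_k}(t)\right)^2 (T_n - t)\,SR_k^{\,2} \approx \sum_{i=n}^{n+m-1} \sum_{j=n}^{n+m-1} w_i\,w_j\,f_i\,f_j \int_t^{T_n} \rho_{ij}\,\sigma_i\,\sigma_j\,du \tag{2.12}$$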

and the weights have a value wi(t) independent of the integration time. The σi are

a short notation for instantaneous volatility σ(t, Ti) of the i-th forward rate with

the parameterization of Eq. (2.2). Correlations are taken into account by the simple

function

$$\rho_{ij} = \exp(-\beta\,|T_i - T_j|) \tag{2.13}$$

and can therefore be pulled out of the integral. For the given parameterization of σi the integral

$$\int_t^{T_n} \sigma_i\,\sigma_j\,du \tag{2.14}$$

can be done analytically (see e.g. [16]). The explicit result is

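$$
\begin{aligned}
\int_0^t \sigma_i\sigma_j\,du = \frac{1}{4c^3 e^{c(T_i+T_j)}} \times \Big[\; & 2c^2\Big(a^2\big(e^{2ct}-1\big) + 2ad\big(e^{ct}-1\big)\big(e^{cT_i}+e^{cT_j}\big) + 2cd^2 e^{c(T_i+T_j)}\,t\Big) \\
& + b^2\Big(-1 - 2c^2 T_i T_j - c(T_i+T_j) + e^{2ct}\big(1 + 2c^2(t-T_i)(t-T_j) + c(-2t+T_i+T_j)\big)\Big) \\
& - 2bc\Big(a\big(1 + e^{2ct}(-1 + c(2t-T_i-T_j)) + c(T_i+T_j)\big) \\
& \qquad + 2d\big(e^{cT_j}e^{ct}(-1+c(t-T_i)) + e^{cT_j}(1+cT_i) \\
& \qquad\quad + e^{cT_i}e^{ct}(-1+c(t-T_j)) + e^{cT_i}(1+cT_j)\big)\Big)\Big]
\end{aligned}
\tag{2.15}
$$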
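The closed form of Eq. (2.15) is easy to mistype, so a numerical cross-check is useful. The sketch below transcribes the formula term by term and compares it with a midpoint quadrature of the product of two volatilities of the form of Eq. (2.2); the function names, the sample parameters and the quadrature are assumptions of this example.

```python
import math


def inst_vol(u: float, T: float, a: float, b: float, c: float, d: float) -> float:
    """Eq. (2.2) evaluated at calendar time u for a rate expiring at T."""
    tau = T - u
    return (a + b * tau) * math.exp(-c * tau) + d


def integrated_covariance(t, Ti, Tj, a, b, c, d):
    """Closed form of Eq. (2.15): integral of sigma_i sigma_j from 0 to t."""
    e = math.exp
    S = Ti + Tj
    term_a = 2 * c**2 * (a**2 * (e(2 * c * t) - 1)
                         + 2 * a * d * (e(c * t) - 1) * (e(c * Ti) + e(c * Tj))
                         + 2 * c * d**2 * e(c * S) * t)
    term_b = b**2 * (-1 - 2 * c**2 * Ti * Tj - c * S
                     + e(2 * c * t) * (1 + 2 * c**2 * (t - Ti) * (t - Tj)
                                       + c * (-2 * t + S)))
    term_ab = -2 * b * c * (
        a * (1 + e(2 * c * t) * (-1 + c * (2 * t - S)) + c * S)
        + 2 * d * (e(c * Tj) * e(c * t) * (-1 + c * (t - Ti)) + e(c * Tj) * (1 + c * Ti)
                   + e(c * Ti) * e(c * t) * (-1 + c * (t - Tj)) + e(c * Ti) * (1 + c * Tj)))
    return (term_a + term_b + term_ab) / (4 * c**3 * e(c * S))


def integrated_covariance_numeric(t, Ti, Tj, a, b, c, d, n=2000):
    """Midpoint-rule reference value for the same integral."""
    h = t / n
    return sum(inst_vol((k + 0.5) * h, Ti, a, b, c, d)
               * inst_vol((k + 0.5) * h, Tj, a, b, c, d) for k in range(n)) * h


# Hypothetical parameters of roughly the size used in this chapter
args = (1.0, 2.0, 3.5, 0.0006, 0.0026, 0.32, 0.0052)
difference = integrated_covariance(*args) - integrated_covariance_numeric(*args)
```

An integral over a subinterval from ta to tb, as needed for the regime-switching simulation, follows as the difference of two evaluations of the closed form at t = tb and t = ta.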

As in the case of caplets I use the displaced-diffusion ansatz Eq. (2.3). The additional complication for swaptions is due to the correlations between the various forward rates.

2.2.1 The Algorithm

In order to obtain the implied swap-rate volatility I take the following steps. First, I initialize the parameters of the model: the parameters of the volatility term structure an, bn, cn, dn, ax, bx, cx, and dx, the transition probabilities λnx and λxn, and the correlation parameter β. The displacement parameter α needs to be set, and for the simulation I set the number of simulations Nsim and the number of possible regime switches Nsteps. The discount factors are obtained by bootstrapping

(see Appendix A) from the swap rates that are interpolated from 0.5 and 1 year

deposit rates and the par rates of 2, 5, 10, 20, and 30 years swaps. Here and in what

follows the tenor is set to half a year.

For given strike and expiry of the swaption Tn and maturity of the swap Tm I

then compute the volatility term structure in the following way. First I transform

the volatility parameters a, b, and d from the given swap-rate volatility values to

the displaced-diffusion values by multiplying the forward swap rate SRk and dividing

by the displaced swap rate SRk + α as given for the caplet case, Eq. (2.4). For Nsteps intervals I compute two arrays of variances, one for the excited market and one for the normal state. For each of the Nsteps intervals I use the integration in Eq. (2.12) in the displaced diffusion setup, however, with different integration boundaries. Let $\left(\sigma^{\alpha}_{SR_k}(t)\right)^2_{ab}$ denote the squared root mean square volatility of the displaced swap rate in the interval that runs from ta to tb. Then I get the following equation

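$$\left(\sigma^{\alpha}_{SR_k}(t)\right)^2_{ab} \simeq \frac{1}{T_n\,(SR_k+\alpha)^2} \sum_{i=n}^{n+m-1} \sum_{j=n}^{n+m-1} w_i\,w_j\,(f_i+\alpha)(f_j+\alpha)\,\rho_{ij} \int_{t_a}^{t_b} \sigma^{\alpha}_i\,\sigma^{\alpha}_j\,du \tag{2.16}$$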

The two arrays of excited-state and normal-state variances are each filled with Nsteps of the $\left(\sigma^{\alpha}_{SR_k}(t)\right)^2_{ab}$, with the parameterizations for the excited and normal states, respectively. For the first entry in both arrays the lower boundary of the time interval is ta = t (ta = t = 0 in our simulation), and for the last entry in both arrays the upper boundary of the interval is tb = Tn. Each interval spans the time $dt = (T_n - t)/N_{steps}$.

Now, I am in a position to perform a large number of simulations using the Markov-

chain process. In each of the simulations I start with the same initial state at time t,

e.g. for the normal state y from Eq. (2.8) is equal to zero in the first interval. Using

the probabilities λnx and λxn I decide for each of our Nsteps − 1 intervals following


the initial one by drawing uniform random numbers whether I stay in the same state

for the next interval or whether I switch to the other state. Finally, I end up with a

structure of y = 0 or y = 1 for each interval of the simulation under consideration.

Now, I just need to pick the corresponding entries from the two variance arrays.

Adding up all the entries that I picked for one of the simulations leaves me with one

realization of the integrated volatility term structure. After Nsim simulations I have

a Monte-Carlo sample of the $\sigma^{\alpha}_{SR_k}(t)$.

The next goal is to obtain the Monte-Carlo sample of prices for the swaptions

corresponding to the realizations of the obtained rms displaced swap-rate volatilities.

Using the displaced swap rate SRk +α, the displaced strike K +α, and the volatilities

from our Monte-Carlo simulation $\sigma^{\alpha}_{SR_k}(t)$ I compute the prices of a payer’s swaption

that correspond to the sample of volatilities via the well-known Black formula

$$V_{swpt} = \big((SR_k+\alpha)\,N(h_1) - (K+\alpha)\,N(h_2)\big) \sum_{i=n}^{n+m-1} 0.5\,P(t,T_{i+1}) \tag{2.17}$$

where N(h1,2) is the cumulative normal and

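$$h_{1,2} = \frac{\ln\!\left(\frac{SR_k+\alpha}{K+\alpha}\right) \pm \frac{1}{2}\left(\sigma^{\alpha}_{SR_k}(t)\right)^2 (T_n - t)}{\sigma^{\alpha}_{SR_k}(t)\,\sqrt{T_n - t}} . \tag{2.18}$$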

The factor 0.5 is the tenor. I obtain a set of prices and take the average.

I then plug this average price into an implied-volatility algorithm in order

to get the Black implied volatility without the displacement.
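The last two steps, pricing each volatility realization with Eqs. (2.17) and (2.18) and inverting the average price, can be sketched as follows. The sketch assumes that a sample of root-mean-square displaced volatilities is already available, takes t = 0, and uses a plain bisection as the implied-volatility algorithm; the function names and sample numbers are hypothetical.

```python
import math
from typing import Sequence


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def annuity_from_discounts(discounts: Sequence[float], tenor: float = 0.5) -> float:
    """Sum of tenor * P(t, T_{i+1}) over the payment dates, as in Eq. (2.17)."""
    return tenor * sum(discounts)


def displaced_black_payer(swap_rate, strike, alpha, vol_alpha, expiry, annuity):
    """Payer swaption price of Eqs. (2.17)-(2.18) for one volatility value."""
    s = vol_alpha * math.sqrt(expiry)
    h1 = (math.log((swap_rate + alpha) / (strike + alpha)) + 0.5 * s * s) / s
    h2 = h1 - s
    return ((swap_rate + alpha) * norm_cdf(h1)
            - (strike + alpha) * norm_cdf(h2)) * annuity


def implied_black_vol(price, swap_rate, strike, expiry, annuity,
                      lo=1e-8, hi=5.0):
    """Black implied volatility without displacement, found by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if displaced_black_payer(swap_rate, strike, 0.0, mid, expiry, annuity) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def implied_vol_from_sample(vol_sample, swap_rate, strike, alpha, expiry, annuity):
    """Average the conditional prices, then invert the undisplaced formula."""
    prices = [displaced_black_payer(swap_rate, strike, alpha, v, expiry, annuity)
              for v in vol_sample]
    return implied_black_vol(sum(prices) / len(prices),
                             swap_rate, strike, expiry, annuity)


# Hypothetical example: three volatility realizations for a 2-year expiry
annuity = annuity_from_discounts([0.97, 0.95, 0.93, 0.91])
result = implied_vol_from_sample([0.10, 0.12, 0.20], 0.05, 0.05, 0.02, 2.0, annuity)
```

Averaging prices rather than volatilities matters here: the Black price is not linear in the volatility, so the implied volatility of the average price generally differs from the average of the sampled volatilities.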

2.3 Variation of Parameters

After the model has been explained in detail it is now a good point to get some

intuition about the model parameters. In order to get a feeling for the effect that

the variation of the various parameters has I varied one of the input parameters

at a time in a reasonable range and plotted the curves for the chosen samples. In

the title of each plot I give the name of the parameter that I varied, the range of

variation [a, b], and the width of each step. The interval for the volatility σ (either

instantaneous or implied) gives the values [σ(a), σ(b)] in order to determine the way

that the lines change when increasing the parameter. If σ(a) > σ(b) then the uppermost curve corresponds to the parameter under consideration being equal to a. In each

section the choice of all the model parameters is always given at the beginning. I

begin with the parameterization of the forward-volatility term structure and then

study the parameterization of the swaption matrix. All volatilities in the plots are

given in percent.


2.3.1 Volatility Term Structure

In this subsection I show how the volatility term structure Eq. (2.2) changes when

three of the parameters a, b, c, and d are kept fixed and one parameter is varied. Except for the parameter that I varied, all parameters were set to the best-fit result for

the data of 05.01.1998. The a, b, c, and d parameters are summarized in the following

table:

an    0.00057399
bn    0.00255834
cn    0.32406904
dn    0.00519951

The dependence of the instantaneous volatility on the four parameters is rather

simple. From Eq. (2.2), which is

$$\sigma(t,T) = \sigma(\tau) = (a + b\tau)\exp(-c\tau) + d$$

it is obvious that a variation in d just yields a parallel shift of the curve (see

Fig. 2.4). At the same time

$$\lim_{\tau\to\infty} \sigma(\tau) = d$$

and therefore in all cases except the one where d is varied all the curves asymptotically

go to dn = 0.00519951.

The level of the volatility at τ = 0 is additionally influenced by a:

$$\lim_{\tau\to 0} \sigma(\tau) = a + d$$

as shown in Fig. 2.1. A maximum exists for a positive τ if b is not zero and $1/c > a/b$. If a maximum is present it is at position

$$\tau_{max} = \frac{1}{c} - \frac{a}{b} .$$

In Fig. 2.2 only the lowest line where bn was set to zero shows no maximum. The

height of the maximum is given by

$$\sigma(\tau_{max}) = \frac{b}{c}\,\exp\!\left(-1 + \frac{ac}{b}\right) + d .$$
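The position and height of the maximum can be checked numerically with a short sketch. The function `hump` is a hypothetical helper; it additionally requires b > 0 and c > 0, which is an assumption of this example made so that the stationary point is a maximum at a positive τ.

```python
import math
from typing import Optional, Tuple


def inst_vol(tau: float, a: float, b: float, c: float, d: float) -> float:
    """Instantaneous volatility of Eq. (2.2)."""
    return (a + b * tau) * math.exp(-c * tau) + d


def hump(a: float, b: float, c: float, d: float) -> Optional[Tuple[float, float]]:
    """Return (tau_max, sigma(tau_max)), or None if there is no interior maximum.

    Assumption: b > 0 and c > 0 in addition to the condition 1/c > a/b.
    """
    if b <= 0.0 or c <= 0.0 or 1.0 / c <= a / b:
        return None
    tau_max = 1.0 / c - a / b
    sigma_max = (b / c) * math.exp(-1.0 + a * c / b) + d
    return tau_max, sigma_max


# Best-fit parameters quoted in the table above
params = (0.00057399, 0.00255834, 0.32406904, 0.00519951)
tau_max, sigma_max = hump(*params)
print(tau_max, 100.0 * sigma_max)   # position in years, height in percent
```

Setting b to zero removes the hump, in line with the lowest curve of Fig. 2.2.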

In a normal market situation the maximum of the curve was found to be at 12 to 18

months [19]. An explanation can be suggested along the following lines. At the long

end of the volatility term structure the level of uncertainty is given by the expected

variation of the long term inflation. Monetary authorities usually act in such a way as

to keep these variations low which is the reason for the low volatility at the long end.


At the short end the variation in the rates directly depends on the steps taken by

the monetary authorities. These steps are usually well anticipated by the market due

to the policy of the authorities to indicate rate changes in advance. The mid range

therefore has the most uncertainty in normal market situations. In market turmoil

the situation changes dramatically within a short period of time. At the short end

rates are very uncertain and volatility rises drastically. The effect at the long end is

less pronounced which in total leads to a decaying term structure. (See also Fig. 3.2)

[Plot: instantaneous volatility σi (in percent) versus τ. Title: "an in [0.0, 0.0025] step 0.0005 and σi in [0.63, 0.80] for τ = 0.5"]

Figure 2.1: Possible shapes of the instantaneous forward-volatility term structure. The parameter an was varied as given in the title of the plot. Which choice of an each line corresponds to can be inferred from the interval for σi.

[Plot: instantaneous volatility σi (in percent) versus τ. Title: "bn in [0.0, 0.005] step 0.001 and σi in [0.57, 0.74] for τ = 0.5"]

Figure 2.2: Same as 2.1 with variation of the parameter bn.

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1.3

1.4

1.5

0 2 4 6 8 10 12 14 16

σ i

τ

cn in [0.1, 1.1] step 0.2 and σi in [ 0.70, 0.64] for τ = 0.5

Figure 2.3: Same as 2.1 with variation of the parameter cn.

20

0

0.2

0.4

0.6

0.8

1

1.2

0 2 4 6 8 10 12 14 16

σ i

τ

dn in [0.0, 0.01] step 0.002 and σi in [ 0.16, 0.96] for τ = 0.5

Figure 2.4: Same as 2.1 with variation of the parameter dn.

21

2.3.2 Swaptions

After presenting the dependence of the volatility term structure of forward rates on

the variation of the relevant parameters we now turn to the implications of these

parameter variations for the swaption matrix. In this section I present a systematic

study of the dependence of the whole swaption matrix under consideration on the

parameters a, b, c, and d in the normal and excited states and α. I use the same values

for the set of normal and excited parameters with the exception of the one parameter

that I vary. The following table summarizes all model parameters that are used:

Parameter   Value         Parameter       Value
an          0.00057399    λxn             20.0
bn          0.00255834    λxn             2.0
cn          0.32406904    α               1.18013071
dn          0.00519951    β               0
ax          0.00057399    NSim            1000
bx          0.00255834    NSteps          60
cx          0.32406904    Initial State   0.0
dx          0.00519951

The a, b, c, and d parameters correspond to the four-parameter best-fit result

of 05.01.1998. The swaptions used in the following figures are 0.5, 1, 3, 5, 10 years

expiry into 1, 2, 3, 5, 7, 10 years maturity swaps. The series always starts with the

0.5 year swaption into all the underlying swaps and continues with the 1 year expiry

swaption and so on. I plot the implied volatility in percent for all these swaptions.
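The ordering of the swaptions along the x-axis of the following plots can be written down explicitly. This is a small sketch, assuming zero-based indices (the plot titles refer to "Swaption 0" and "Swaption 29"); the names `EXPIRIES`, `TENORS` and `swaption_index` are illustrative and not taken from the original code.

```python
EXPIRIES = [0.5, 1, 3, 5, 10]   # swaption expiries in years
TENORS = [1, 2, 3, 5, 7, 10]    # lengths of the underlying swaps in years

# Row-major flattening of the swaption matrix: all swap lengths for the
# 0.5-year expiry first, then all swap lengths for the 1-year expiry, etc.
SWAPTIONS = [(expiry, tenor) for expiry in EXPIRIES for tenor in TENORS]

def swaption_index(expiry, tenor):
    """Position of the expiry x tenor swaption on the x-axis of the plots."""
    return EXPIRIES.index(expiry) * len(TENORS) + TENORS.index(tenor)

for index, (expiry, tenor) in enumerate(SWAPTIONS):
    print(index, f"{expiry}x{tenor}")
```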

In the plot for various choices of α (Fig. 2.13) the implied volatility is normalized to
the value for the 0.5 × 0.5 swaption. The reason is that varying the α parameter
corresponds to first order to multiplying the implied volatility by a factor of $(SR_k + \alpha)/SR_k$
for the N × M swaption. The fact that the implied volatilities differ for the different

choices of α is due to the swap curve which is on the 05.01.1998 basically a rising

function of the maturity (Fig. 2.14). This dependence is of course only present if

the a, b, c, and d parameters are the parameterization of the instantaneous volatility

in the displaced model. In order to obtain a reasonable distribution of the plotted

volatility term structures I use equally spaced steps in the CEV exponent βCEV and

then transform this into α by the approximate relation Eq. (1.15)

$$\alpha = f_0\,\frac{1 - \beta_{CEV}}{\beta_{CEV}}.$$
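The mapping from the CEV exponent to the displacement and the first-order scaling factor of the implied volatility can be sketched as follows. The forward-rate level `F0` is a hypothetical value chosen only for illustration, and α is assumed to be expressed in the same units as the rates.

```python
def alpha_from_beta_cev(f0, beta_cev):
    """Approximate displacement alpha = f0 * (1 - beta_CEV) / beta_CEV."""
    if beta_cev <= 0.0:
        raise ValueError("beta_cev must be positive")
    return f0 * (1.0 - beta_cev) / beta_cev

def displacement_factor(swap_rate, alpha):
    """First-order factor (SR + alpha) / SR applied to the implied volatility."""
    return (swap_rate + alpha) / swap_rate

F0 = 6.0  # hypothetical forward-rate level in percent (assumption)

# Equally spaced steps in the CEV exponent, as used for Fig. 2.13.
betas = [0.2 * k for k in range(1, 6)]
alphas = [alpha_from_beta_cev(F0, beta) for beta in betas]
for beta, alpha in zip(betas, alphas):
    print(f"beta_CEV = {beta:.1f}  alpha = {alpha:.3f}")
```

For β_CEV = 1 the displacement vanishes and the factor equals one, which is the undisplaced lognormal case.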

[Plot: σimp against swaption index. Title: “an in [0.0, 0.002] step 0.0005 and σimp in [12.93, 16.19] for Swaption 0”]

Figure 2.5: Variation of the parameter an.

[Plot: σimp against swaption index. Title: “bn in [0.0, 0.004] step 0.001 and σimp in [11.90, 14.98] for Swaption 0”]

Figure 2.6: Variation of the parameter bn.

[Plot: σimp against swaption index. Title: “cn in [0.1, 0.9] step 0.2 and σimp in [14.28, 13.05] for Swaption 0”]

Figure 2.7: Variation of the parameter cn.

[Plot: σimp against swaption index. Title: “dn in [0.0, 0.008] step 0.002 and σimp in [4.63, 19.24] for Swaption 0”]

Figure 2.8: Variation of the parameter dn.

[Plot: σimp against swaption index. Title: “ax in [0.0, 0.016] step 0.004 and σimp in [13.78, 17.19] for Swaption 0”]

Figure 2.9: Variation of the parameter ax.

[Plot: σimp against swaption index. Title: “bx in [0.0, 0.016] step 0.004 and σimp in [13.71, 14.94] for Swaption 0”]

Figure 2.10: Variation of the parameter bx.

[Plot: σimp against swaption index. Title: “cx in [0.1, 0.9] step 0.2 and σimp in [13.89, 13.80] for Swaption 0”]

Figure 2.11: Variation of the parameter cx.

[Plot: σimp against swaption index. Title: “dx in [0.0, 0.008] step 0.002 and σimp in [13.32, 14.38] for Swaption 0”]

Figure 2.12: Variation of the parameter dx.

[Plot: normalized σimp against swaption index. Title: “βCEV in [0.2, 1.0] step 0.2 and σimp in [94.44, 98.65] for Swaption 29”]

Figure 2.13: Variation of the parameter α by stepping through a range of the CEV exponent. In this plot I normalized all the swaption matrices to the 0.5×0.5 volatility. More details are given in the text.

[Plot: swap rate SR [%] against swap index.]

Figure 2.14: The swap rates that were used for obtaining Fig. 2.13.

Chapter 3

Calibration to Swaption Prices

3.1 The Data

For the data analysis I used samples of swaption prices from a data collection of 05

January 1998 through 31 May 2002, a total of 34,500 values. For each of the 1,150
trading days the collection comprises swaptions with expiries of 0.5, 1, 3, 5, and 10
years, each into swaps of lengths 1, 2, 3, 5, 7, and 10 years (30 swaptions per day).
For the same trading days I used the 6-month and 1-year deposit rates as
well as the 2-, 5-, 10-, and 20-year swap rates in order to obtain swap rates and the
corresponding discount factors by bootstrapping as described in Appendix A. The

swap rates and discount factors for all dates relevant to the pricing of the swaptions

that lie in between the given dates were obtained by linear interpolation.
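The interpolation step can be illustrated with a short sketch. It assumes piecewise-linear interpolation between sorted nodes and refuses to extrapolate; the node values below are hypothetical and do not come from the data set, and the function name is illustrative.

```python
from bisect import bisect_right

def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation of ys on the sorted nodes xs."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the node range")
    # Index of the right-hand node of the interval that contains x.
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    w = (x - x0) / (x1 - x0)
    return (1.0 - w) * ys[i - 1] + w * ys[i]

# Hypothetical nodes: maturities in years and swap rates in percent.
NODE_TIMES = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
NODE_RATES = [6.00, 6.05, 6.10, 6.20, 6.30, 6.40]

rate_3y = linear_interp(3.0, NODE_TIMES, NODE_RATES)
print(f"interpolated 3-year rate: {rate_3y:.4f} %")
```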

The data include some major financial events like the Russia default in August

1998, the LTCM crisis one month later, the unexpected rate cuts by the Fed in early

2001, and the market turmoil in the aftermath of the terror attacks on September

11 in 2001. As it has been shown in [19] it is crucial to capture these excited states

of the market in a model for the volatility term structure. Fig. 3.1 shows the first

month of data. For each day all the rows of the swaption matrix are plotted in a

row as in the previous chapter, i.e. starting with the half-year expiry swaption and

all possible underlying swap maturities in increasing order and then doing the same

for the one-year expiry swaption and so on. So the x-axis represents the swaption
matrix, the y-axis the trading days and the z-axis the implied volatility. During a
“normal” market period as in Fig. 3.1 there are basically no noticeable changes in the
shape of the matrix. Once there is turbulence in the market, as in August through

October 1998 the picture changes completely (Fig. 3.2). During the last days of July

1998 and the first days of August the picture is qualitatively the same as in Fig. 3.1.

[Surface plot: σimp against swaption (0.5x1 to 10x10) and date (05.01.1998 to 29.01.1998).]

Figure 3.1: The term structure of the swaption matrix during January 1998, a “normal” period.

With the Russia default the turmoil starts and the “slope” gets steeper. After the

LTCM crisis there is even a large jump between two trading days.

3.2 The Fitting Algorithm

In order to adjust the model parameters a least-squares fit is carried out. I use

an implementation of the minpack [13] package. The underlying algorithm is the

Levenberg-Marquardt Method as briefly described in Appendix B.

The calibration is carried out on a day by day basis. For a given trading day I use

the 30 swaption implied volatilities $\sigma^{\mathrm{mkt\,imp}}_{i\times j}$ as market data and compute the implied
volatilities for the same swaptions by the model, $\sigma^{\mathrm{imp}}_{i\times j}$. The function that needs to be
minimized is

$$f(x) = \sum_{i,j} \left(\sigma^{\mathrm{mkt\,imp}}_{i\times j} - \sigma^{\mathrm{imp}}_{i\times j}\right)^2 \qquad (3.1)$$

where x represents the vector of the parameters that are varied, i runs over all swaption
expiries and j over all swap maturities.
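The objective function of Eq. (3.1) can be sketched in a few lines. Least-squares routines of the Levenberg-Marquardt type generally work with the vector of residuals rather than with the scalar sum, so both are shown. The volatilities below are hypothetical toy values, and the function names are not taken from the original implementation.

```python
def residuals(market_vols, model_vols):
    """Differences between market and model implied volatilities."""
    if len(market_vols) != len(model_vols):
        raise ValueError("market and model vectors must have equal length")
    return [mkt - mod for mkt, mod in zip(market_vols, model_vols)]

def sum_of_squares(market_vols, model_vols):
    """Scalar objective f(x) of Eq. (3.1) for one trading day."""
    return sum(r * r for r in residuals(market_vols, model_vols))

# Hypothetical implied volatilities for three swaptions (not real data).
market = [0.150, 0.140, 0.130]
model = [0.140, 0.160, 0.130]
print(sum_of_squares(market, model))
```

In the actual calibration the model vector depends on the parameter vector x, and the optimizer repeatedly re-evaluates the residuals for trial values of x.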

[Surface plot: σimp against swaption (0.5x1 to 10x10) and date (20.07.1998 to 08.10.1998).]

Figure 3.2: The term structure of the swaption matrix during the Russia default and the LTCM crisis.

3.2.1 Some Checks

In order to test the algorithm I performed various fit runs with no regime switches

and the four fit parameters an, bn, cn, dn. The stability of the fit was tested by

varying the initial parameters for the test cases. The fits always yielded the same

very good results. As Fig. 3.3 shows the agreement with the fits obtained by [19] is

very good considering the different fit algorithms. The upper panel shows the fit to

the data from 05.01.1998 where the market was in a “normal” state. The solid line

represents the result of the model fit, the dotted line which only slightly differs for

long dated swaptions is the fit result from [19] and the dashed line shows the market

data of that day. Another consistency check is the same setup but for an excited

market situation. The lower panel shows the same as the upper panel for 05.11.2001,

where the market was in an excited state. As before both fits agree rather well. The

sum of squared deviations to market data, however, almost doubles. The difference

in relative deviations is of course smaller with the higher level of volatility. In the

following chapter I move on to the results with regime switches.

[Two-panel plot: σimp against swaption index; upper panel 05.01.1998, lower panel 05.11.2001.]

Figure 3.3: A comparison of the market data (dashed) and the fit from [19] (dotted) and this fit (solid). The upper panel shows market data and fits for the first trading day 05.01.1998 in the time series where the market was in a normal state and the lower one a fit on 05.11.2001 where the market was excited. The sum of squares of the deviations is in the lower case twice as high as in the upper.

Chapter 4

Results for Fits with Regime Switches

In this chapter the main results are presented. The full model with regime switches

can have up to 13 parameters that need to be chosen. The goal is to have the smallest

possible number of free parameters, while still allowing for a good description of the

swaption matrix. Parameters that are not varied by the fit will be kept at reasonable

values as described in the text. In the fits that are discussed in the following only

combinations of the a, b, c, d parameters are varied for the normal and excited shapes.

As a first approach the next section gives details about varying various combinations

of these parameters for one day where the market was in a normal state and one day
where it was in an excited state. After this, in Sec. 4.2, the best parameterization that was found

is applied to a large number of trading days spanning the period from January 1998

until January 2002. Finally, possible improvements and further steps are discussed.

4.1 Regime Switches for a “Normal” and an “Excited” Market

As an example for a day with a normal market I chose the first day of the data set,
05.01.1998. I start by varying one additional parameter as compared to the fit without

regime switches. In principle I expect that an additional fit parameter should reduce

the sum of squared deviations. If the financial motivation of the model is true, i.e. it

is necessary to capture regime switches between normal and excited market states,

I expect to have an exponentially decaying shape for the excited forward-volatility

term structure and therefore choose bx = 0 as a simple assumption (see Sec. 2.3).

This is a rather crude choice and a different choice might well lead to a better fit

result, but in this study the focus is on keeping the model as simple as possible. In

addition to the normal parameters I varied one of the remaining excited parameters

ax, cx, dx and kept the other two excited parameters fixed at the values found for

the four-parameter fit. The resulting sum of squares of all deviations, however, is

hardly affected. This is true for any of the fits, including ax, cx, dx. Even for a fit

that includes an overall scaling factor for the excited term structure, i.e. I fit the

parameter kx and replace ax and dx by kxax and kxdx, there is only an improvement

of a few percent. In order to present an example, the following table gives the whole

parameter set for the fit of dx in addition to the normal parameters.

Parameter   Value         Parameter       Value
an          0.00095014    λxn             20.0
bn          0.00274649    λxn             2.0
cn          0.29944074    α               1.18013071
dn          0.00449303    β               0
ax          0.00098784    NSim            1000
bx          0.0           NSteps          300
cx          0.29788851    Initial State   0.0
dx          0.00859404

Fig. 4.1 shows the resulting volatility term structures for excited (upper line)

and normal (lower line) parameters. The shapes are roughly as expected, i.e. the

excited line is decaying (which is trivially true after I chose bx = 0) and lies above the

humped normal shape. In order to meet the empirical data it is, however, necessary

that both functions asymptotically go to about the same level for large τ . Obviously,

the independent variation of parameters of the normal and excited functions does not

yield the shapes observed in the market. Since the sum of squared deviations does
not improve much, this choice of parameters does not improve the ability of the model
to describe the market. At the same time it does not worsen the model either.

As another test I keep the four normal parameters at their best fit values from the

four-parameter fit and let the solver change all four excited parameters, i.e. this time

including bx. In this case, too, there is only a very minor improvement in the overall
fit. A possible weakness of this fit is that the set of normal parameters that is kept
fixed was a best fit in the four-parameter case and is therefore presumably not a “purely”
normal set of parameters. Both the excited shape and the normal shape show a
hump in this fit, because bx differs from zero quite a bit.

As a next stage I analyze the fit to the excited market situation on 05.11.2001 in

a similar way. The best fit results from the four-parameter cases are taken for the

normal and excited a, b, c and d respectively with the exception of bx which is again

set to zero. As a first try I fit an overall factor to each of the forward-volatility shapes,

i.e. kn and kx. It turns out that it is not possible to reach the accuracy of the fit

with just the four excited parameters. There is however a clear tendency to push the

excited shape up and the normal shape down (kn = 0.7800282 and kx = 1.84407039).

In order to give the fit some flexibility at the long end (large τ) of both curves

as well as on the short end I perform a four-parameter fit with the following setup.

The normal parameters are initially set to the best fit values from the normal four-

parameter fit (of 05.01.1998) and the excited parameters are set to the values that

were obtained by the fit to the excited market of 05.11.2001. Then the parameters

kn and kx as well as dn and dx are varied by the fit to the excited market 05.11.2001.

This basically means that for both curves, normal and excited, the fit can change

the overall level and in addition the level at the short end in connection with the

height of the hump. The latter is due to the fact that the k factors change a and b

at the same rate, where a is mainly responsible for the level at the short end and b is

mainly responsible for the height of the hump (see Sec. 2.3.1). It turns out that this

choice of fit parameters yields a more than 20% improved sum of squares compared

to the four-parameter fit without regime switches. This result is quite remarkable

since the number of fitted parameters is the same and the initial normal parameters

were the best fit results from a different day. The shapes of both forward-volatility
term structures are shown in Fig. 4.2.
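The effect of the k factors described above can be made explicit. The sketch below assumes the form σ(τ) = k (a + bτ) exp(−cτ) + d, i.e. k rescales a and b while d stays a separate parameter. It uses the normal parameters of 05.01.1998 and the value kn = 0.7800282 quoted in the text purely as illustrative inputs; the function names are not from the original code.

```python
import math

def sigma_scaled(tau, k, a, b, c, d):
    """k rescales a and b at the same rate; d is left untouched."""
    return k * (a + b * tau) * math.exp(-c * tau) + d

def hump_position(a, b, c):
    """Stationary point 1/c - a/b of the unscaled curve."""
    return 1.0 / c - a / b

A, B, C, D = 0.00057399, 0.00255834, 0.32406904, 0.00519951
K_N = 0.7800282  # illustrative scaling factor taken from the text

t_hump = hump_position(A, B, C)
# Levels above the long-end value d, before and after scaling.
short_end_unscaled = sigma_scaled(0.0, 1.0, A, B, C, D) - D
short_end_scaled = sigma_scaled(0.0, K_N, A, B, C, D) - D
hump_unscaled = sigma_scaled(t_hump, 1.0, A, B, C, D) - D
hump_scaled = sigma_scaled(t_hump, K_N, A, B, C, D) - D

print(short_end_scaled / short_end_unscaled, hump_scaled / hump_unscaled)
```

Both ratios equal k, and the position of the hump is unchanged because the ratio a/b is not affected by the scaling. The long-end level is controlled by d alone.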

Similar to the case for the normal day that is displayed in Fig. 4.1 the curves have

some of the expected features. The weak point is again at the long end of the curves

where the fit result does not agree with the behavior observed in the market. The

excited shape crosses the normal one at about τ = 11. As pointed out in the case of

the fit to the normal day, the fit with an independent variation of normal and excited

parameters does not yield the observed volatility term structure.

With the result of this section in mind I present a different approach in the next

section. Obviously, the normal and excited parameters should not be considered

independently but the features observed in the market should be implemented into

the model right from the start.

[Plot: σi against τ.]

Figure 4.1: The forward-volatility term structure (in %) as a result of the 5-parameter fit for an, bn, cn, dn, dx to the data from 05.01.1998. The upper line corresponds to the excited state and the lower one to the normal state. The values of the parameters are given in the text.

[Plot: σi against τ.]

Figure 4.2: Same as in Fig. 4.1 with data from a four-parameter fit for kn, dn, kx, dx to the data from 05.11.2001.

4.2 Fits for Many Trading Days

The parameterization that is used in this section has only four fit parameters and

attempts to incorporate all the findings of the preceding sections. The goal is to

choose a parsimonious parameterization that comprises the main features observed

in market data. The model should yield the expected shapes for the normal and

excited instantaneous volatility term structures, i.e. a humped shape for the normal

and a decaying shape for the excited curve where the excited curve is well above the

normal at the short end and gets to about the same level as the normal at the long

end, as shown in Fig. 4.3. In order to obtain a decaying shape I choose bx = 0. As a

somewhat arbitrary choice I set ax = 10 × an to ensure that the excited curve starts

out well above the normal at the short end. For the long end I make the simple but

again somewhat arbitrary choice dx = dn. As fit parameters I vary an, bn, cn and

dn (which therefore uniquely determine ax and dx). Since bx was set to zero only cx

remains at the initial choice of the best fit value from the four-parameter fit without

regime switches. In the previous studies the variation of cx did not have a large effect.

This parameterization is the result of the effects that have been seen in the variation

of the various parameters in Sec. 2.3.1, the analysis of market data (see Sec. 3.1 and

[19]) and the fits with various parameterizations in the previous section. A typical

forward volatility term structure that can be obtained by the model for normal and

excited curves is given in Fig. 4.3.

[Plot: σi against τ.]

Figure 4.3: These are typical shapes for the normal and excited curves in the parameterization used for the fits in this section. For this plot I used the best-fit results from 01.04.1998.

To summarize the setup:

• Fit parameters: an, bn, cn and dn

• Constraints: ax = 10 × an, bx = 0, dx = dn

• All other parameters are set to “reasonable” values

In the following table I summarize all the parameters that were kept at a fixed value

during the fits.

Parameter   Value            Parameter       Value
cx          0.075363546      β               0
λxn         20.0             NSim            1000
λxn         2.0              NSteps          50
α           1.18013071052    Initial State   0 or 1
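The constrained parameterization can be sketched as a mapping from the four fit parameters to the full set of normal and excited parameters. The curve is assumed to have the form (a + bτ) exp(−cτ) + d; the function names are illustrative, and the input values are the four-parameter results of 05.01.1998, used here only as an example and not as a best fit of this setup.

```python
import math

C_X = 0.075363546  # fixed value of c_x from the table of fixed parameters

def full_parameters(a_n, b_n, c_n, d_n, c_x=C_X):
    """Apply the constraints a_x = 10 * a_n, b_x = 0 and d_x = d_n."""
    return {
        "normal": (a_n, b_n, c_n, d_n),
        "excited": (10.0 * a_n, 0.0, c_x, d_n),
    }

def sigma_inst(tau, a, b, c, d):
    """Assumed four-parameter form of the instantaneous volatility."""
    return (a + b * tau) * math.exp(-c * tau) + d

# Example inputs only (assumption): four-parameter fit of 05.01.1998.
params = full_parameters(0.00057399, 0.00255834, 0.32406904, 0.00519951)
for tau in (0.0, 1.0, 5.0, 15.0):
    normal = sigma_inst(tau, *params["normal"])
    excited = sigma_inst(tau, *params["excited"])
    print(f"tau = {tau:4.1f}  normal = {normal:.5f}  excited = {excited:.5f}")
```

With a positive an the excited curve starts above the normal one at τ = 0 and decays monotonically, and both curves approach the common level dn for large τ.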

With this setup I perform fits for the first trading day of each month in the years

1998, 1999, 2000 and 2001 and January 2002, a total of 49 months. For all these days

I perform the four-parameter fit in the described parameterization

• with a normal state as the initial state,

• with an excited state as the initial state.

It might seem strange to run both setups for the whole data set of trading days. Of

course, the model that has the normal state initially should fit better for a market

that is in a normal state. But many times it is not easy to decide whether the market

is normal or excited by just looking at the data. So the idea is to run both setups

for all days and then decide ex post which of the days was normal or excited. If the

fit that started with the excited state yields the lower sum of squared deviations the

market is in an excited state, otherwise the market is normal. In order to compare

these results to the model without regime switches I also fit the model without regime

switches as it was discussed in Sec. 3.2.1 to the data of each trading day.
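The ex post decision rule can be written down directly. This is a minimal sketch; the function name and the sums of squared deviations in the example are hypothetical and not taken from the fits.

```python
def classify_day(sse_normal_start, sse_excited_start):
    """Label a trading day by the initial state that gives the better fit.

    The day counts as excited only if the fit that started in the excited
    state yields the strictly lower sum of squared deviations.
    """
    if sse_excited_start < sse_normal_start:
        return "excited", sse_excited_start
    return "normal", sse_normal_start

# Hypothetical sums of squared deviations for two trading days.
print(classify_day(0.0020, 0.0031))
print(classify_day(0.0120, 0.0045))
```

The second element of the returned pair is the value attributed to the combined model for that day, i.e. the lower of the two sums.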

The expectation is to find fits of the same quality or slightly better when regime

switches are used with a normal initial state in normal periods as compared to the

model without regime switches. In the aftermath of excited periods, where the model

without regime switches does not yield satisfying fits I expect to find an improved

description by the model with regime switches and the excited initial state. It is quite

clear that this parameterization with a couple of arbitrarily chosen parameters still

leaves room for improvement. But as I show in the remainder of this section despite

these shortcomings the model with regime switches yields a noticeable improvement

in the description of market data.

The results of the three different fits are presented in Figs. 4.4–4.8. Each of these

figures contains two panels. The upper panel displays the results from the fit without

regime switches as a solid line, the results from the fit with regime switches that

started in a normal state as a dashed line and the results for the fit that started in

the excited state as a dotted line. In all these figures the lower panel gives the results

for the combined model with regime switches, i.e. at each day I decide whether the

market is normal or excited by comparing the sums of squared deviations of the two

fits with regime switches. I use the model that has a normal state as the initial state

when this model yields the better fit and the model that starts with an excited state

otherwise.

In the upper panel of Fig. 4.4 I show the sum of squared deviations for all three

runs. The fit without regime switches and the one with regime switches
that has a normal initial state are of similar quality. On average the model without

regime switches is slightly better but the effect is very small. For both of these

models the sum of squared deviations is a lot higher in periods after market turmoil.

The times after the Russia default and LTCM crisis in 1998 and the rate cuts in

early 2001 and the terror attacks in September 2001 are clearly visible in the top

panel of Fig. 4.4 in the peaks of sums of squared deviations of the described models.

However, the situation changes for the model with regime switches that has an excited

state as the initial state. This model is slightly worse than the other two models in

normal market situations but is a lot better for the periods following excited market

situations. Obviously, the choice whether the initial state is normal or excited really

corresponds to the current market situation.

This result shows that the model does behave in the expected way and that

the model with regime switches is able to describe a broader range of real market

situations than the model without regime switches. For these results it is important

to keep in mind that many of the parameter choices in the models with regime switches

were rather arbitrary and therefore one can even expect a lot of room for improvement.

As described above, in the lower panel of Fig. 4.4 I use the quality of the fits for the

two parameterizations with regime switches as an indicator whether the market is

excited or normal, i.e. the market is in a normal state if the model that starts with

the normal state yields the lower sum of squared deviations and the market is excited

if the model that starts in the excited state yields the better fit. In other words for

each month the figure gives the lower of the two possible values, which is the result

of the combined model with regime switches that always starts with the “correct”

state. The average deviation of this combined model is 63 basis points in implied

volatility. For the fit without regime switches the average deviation is 76 basis points

(in [19] where all the trading days were taken into account instead of one per month,

the average deviation is 79 basis points). This needs to be compared to the usual

bid-offer spread of 50 to 100 basis points. So the improvement that stems from

regime switches is really remarkable and was obtained with the same number of fit

parameters in both models.

Now, I turn to the fit parameters that were obtained by the different setups.

In Figs. 4.5–4.8 I use the same formatting as in Fig. 4.4. The top panels display

the results from the three setups and the bottom panel displays the result from the

“combined” model. The very remarkable result that can be read off these plots is

that in case of the combined model the fit parameters are a lot more stable than in

the model without regime switches. Especially in the cases of an and dn the effect

is clearly visible. The smallest improvement is found for bn. As a measure for the

variation of the fit parameters over time I give the standard deviations for the four

parameters in the combined model and the model without regime switches:

Model                 an         bn         cn         dn
No regime switches    0.01175    0.00197    0.28206    0.01065
Combined model        0.00116    0.00174    0.11346    0.00245
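The size of the stabilization can be read off as the ratio of the two rows of the table. The sketch below computes these ratios from the tabulated values. The helper `parameter_std` uses the sample standard deviation, which is an assumption, since the text does not state which estimator was used.

```python
import statistics

# Standard deviations of the best-fit parameters as given in the table.
STD_NO_SWITCHES = {"a_n": 0.01175, "b_n": 0.00197, "c_n": 0.28206, "d_n": 0.01065}
STD_COMBINED = {"a_n": 0.00116, "b_n": 0.00174, "c_n": 0.11346, "d_n": 0.00245}

def reduction_factors(before, after):
    """Ratio of the standard deviations, parameter by parameter."""
    return {name: before[name] / after[name] for name in before}

def parameter_std(series):
    """Sample standard deviation of one fit parameter over the trading days."""
    return statistics.stdev(series)

factors = reduction_factors(STD_NO_SWITCHES, STD_COMBINED)
for name, factor in factors.items():
    print(f"{name}: standard deviation reduced by a factor of {factor:.2f}")
```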

The stability of the model parameters is a very crucial measure for the quality of a

model. Strong day-to-day variations in the model parameters cause high (re-)hedging

costs. If the parameters change much then the prediction of the future volatility

changes which also changes the future re-hedging costs. This model is therefore a

very interesting improvement to the widely used four-parameter setup without regime

switches. The next section briefly describes possible further steps in order to make

this model usable in practical applications. As a final plot in this section I show the

swaption matrix for 02.11.1998 where the market was in an excited state (Fig. 4.9).

This is the day with the greatest improvement in the sum of squared deviations.

[Two-panel plot: sum of squared deviations against date, 01.01.1998 to 31.12.2001.]

Figure 4.4: In the top panel the sum of squared deviations is shown for all three parameterizations. The model without regime switches is represented by the solid line, the model with regime switches and the normal state as the initial state is given by the dashed line and the model with regime switches and the excited state as the initial state is represented by the dotted line. By using the “correct” initial state the sum of squared deviations should be the minimum of both models with regime switches shown in the lower panel.

[Two-panel plot: an against date, 01.01.1998 to 31.12.2001.]

Figure 4.5: Best-fit values for the parameter an. The line types in the top panel are as given in Fig. 4.4. The bottom panel shows the combined model.

[Two-panel plot: bn against date, 01.01.1998 to 31.12.2001.]

Figure 4.6: Same as Fig. 4.5 for fit parameter bn.

[Two-panel plot: cn against date, 01.01.1998 to 31.12.2001.]

Figure 4.7: Same as Fig. 4.5 for fit parameter cn.

[Two-panel plot: dn against date, 01.01.1998 to 31.12.2001.]

Figure 4.8: Same as Fig. 4.5 for fit parameter dn.

[Plot: σimp against swaption index.]

Figure 4.9: The swaption matrix on 02.11.1998 as given by market data (dashed), the fit without regime switches (dotted) and the fit for the model with regime switches and the excited initial state (solid).

4.3 Necessary and Possible Future Steps

After the previous section showed the vast improvements that can be obtained by

the introduction of regime switches even though the model definition contains some

arbitrariness it is now time to mention possible future steps. An obvious improvement

to the model presented in the last section would be a systematic study on how to

choose the parameters that were set to “reasonable” values as well as how to possibly

choose the constraints ax = 10×an and dx = dn in a better way. A very large number

of fits would be needed for such a study. With the current implementation it is very

time consuming to analyze a large variety of parameterizations and a large range of

trading days. Even with only four parameters one fit can take up to 10 minutes on a

standard PC. For a further analysis it would be necessary to improve the performance

of the code and then systematically study reparameterizations and fits for many more

trading days.

Switching from Python, the programming language used for this
prototype, to e.g. C++ would certainly improve the performance a bit. However,
it would be more important to further optimize the algorithm where this is possible
and maybe find approximations for the most time-consuming steps.

The goal of a possible re-parameterization should be to further reduce the day-

to-day variations in the best-fit parameters and, of course, to further improve the

quality of the fits. The results presented in the previous section indicate that the

idea of linking an and ax as well as dn and dx works very well. Maybe these relations

can be optimized and similar constraints can be found for bn and cn. When the other

model parameters, i.e. transition probabilities λnx and λxn, the displacement α and

the correlation parameter β or even the whole correlation function, are investigated it

is important to examine the correlation among these parameters. If two parameters

show a high correlation it does not make a lot of sense to vary both parameters at
the same time.
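One simple way to examine such a correlation is the Pearson coefficient between the best-fit values of two parameters over the trading days. This is only one possible reading of the check suggested here. The series below are hypothetical and the function name is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical best-fit values of two parameters on four trading days.
series_1 = [1.0, 2.0, 3.0, 4.0]
series_2 = [2.1, 3.9, 6.2, 7.8]
print(f"correlation = {pearson(series_1, series_2):.3f}")
```

A coefficient close to +1 or −1 would suggest that one of the two parameters could be fixed or tied to the other instead of being fitted separately.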

The ideal model can be calibrated reasonably fast, has stable best-fit parameters,
and has a maximum deviation that lies within the bid-offer spread for each
individual swaption and not only on average, as was obtained by the combined model in the
previous section.

Chapter 5

Conclusions

In this work I presented the implementation of a model for the instantaneous forward

volatility term structure of LIBOR rates. Together with the correlations among for-

ward rates the model contains all degrees of freedom that a LIBOR Market Model

has. Therefore the calibration of this model yields the specification of a LIBOR Market
Model. After the introduction, where I deduced the fact that all drift terms

in a LIBOR Market Model are given by the no-arbitrage condition, I explained the

definitions of the model for the LIBOR forward volatility term structure in detail and

showed how this model can be calibrated to either caplets and floorlets or swaptions.

I carried out a careful analysis of the possible shapes that can be obtained by varying

the different model parameters. From the large set of market data I presented two

samples that clearly show how the swaption matrix evolves over time in a normal and an

excited market situation.

Based on this observation of more or less sudden switches between normal market

situations and excited markets the model incorporates transition probabilities from

a normal market situation where the instantaneous LIBOR forward volatility has a

“humped” shape to an excited market with an exponentially decaying shape and vice

versa. Using a Levenberg-Marquardt algorithm to fit the model parameters to the swaption matrix on two sample trading days, I verified the implementation by comparing the limiting case of no regime switches with the results of the earlier work [19].
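The switching between the two regimes described above can be illustrated with a short simulation. This is a minimal sketch and not the implementation used in this work: it assumes that the transition probabilities λnx and λxn can be read as switching probabilities per time step, and the function name, the number of steps and the numerical values are hypothetical.

```python
import random

def simulate_regimes(n_steps, p_nx, p_xn, start="normal", seed=0):
    """Simulate a two-state Markov chain over n_steps time steps.

    p_nx: probability per step of switching normal -> excited (assumed
          per-step reading of lambda_nx)
    p_xn: probability per step of switching excited -> normal (assumed
          per-step reading of lambda_xn)
    start: regime at time zero; it should match the current market situation
    """
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(n_steps):
        u = rng.random()
        if state == "normal" and u < p_nx:
            state = "excited"
        elif state == "excited" and u < p_xn:
            state = "normal"
        path.append(state)
    return path

path = simulate_regimes(n_steps=10_000, p_nx=0.05, p_xn=0.25, start="normal")
frac_excited = path.count("excited") / len(path)
# Long-run probability of the excited state for a two-state chain.
stationary = 0.05 / (0.05 + 0.25)
print(frac_excited, stationary)
```

Over a long path the fraction of time spent in the excited state approaches p_nx / (p_nx + p_xn), while the first steps are dominated by the starting state.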

In the next step I used fits where the parameters for the normal and excited

shapes were varied independently. Again, these fits were carried out for one trading

day where the market was in a normal state and another trading day where the market was in an excited state shortly after the LTCM crisis. These fits showed that varying the excited and normal parameters independently led to shapes for the instantaneous volatility term structure that were inconsistent with the observations in the market. Therefore I suggested a more constrained parameterization with a close connection between normal and excited parameters.

As a final step I analyzed the quality of fits using this model with regime switches

and four fit parameters. For a total of 49 trading days that span the period from

January 1998 to January 2002, I performed fits of the reparameterized model with

regime switches and for the standard four-parameter model without regime switches.

Crucial for the quality of the model with regime switches is that the simulation starts

in the state that corresponds to the current market situation. This model then yields

improved fits to the market data as compared to the model without regime switches. While the model without regime switches yields an absolute deviation of 76 basis points in implied volatility, the model with regime switches reduces this value to 63 basis points. In addition, the best-fit parameters obtained by the model with regime switches show much less day-to-day variation than those of the model without regime switches. This is very important for estimating re-hedging costs, since stable model parameters yield reliable hedging costs over a long time.

Even though some parameter choices in the model were just obtained by an ed-

ucated guess the results are very good. This leaves some room for improvement.

In further studies it would be worthwhile to improve the performance of the algo-

rithm and implementation of the model in order to systematically study the effects

of the parameters that were kept at a fixed value during the fit. Also some other

re-parameterizations can be studied and with a faster calibration more trading days

can be taken into account.


Appendix A

Bootstrapping

In order to obtain all the discount factors that I need, I use a series of swap rates

(6m, 1y, 2y, 5y, 10y, 20y, 30y) and linearly interpolate the rates in between these

market rates. After that I have a grid of rates with a step size of half a year. From

the obtained swap rates I bootstrap the discount factors for the same time grid. The

first and second discount factor are

$$ D_0 = 1 \tag{A.1} $$

$$ D_1 = \frac{1}{1 + \tau r_0} \tag{A.2} $$

where τ is the tenor, i.e. half a year, and r0 is the first rate in our array of rates.

All the other discount factors can be obtained from the following consideration. If I

have a notional of 1 that I invest and I obtain semiannual coupons that I properly

discount and at maturity I get the notional back, then the present value is 1.

$$ 1 = D_1 r_0 \tau + D_2 r_1 \tau + \cdots + D_n r_{n-1} \tau + D_n \tag{A.3} $$

The last term corresponds to the payment of the notional. I can therefore use this

equation for a range of different n and obtain the discount factors by the following

formula:

$$ D_n = \frac{1 - \sum_{i=1}^{n-1} \tau r_{i-1} D_i}{1 + \tau r_{n-1}} \tag{A.4} $$
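The procedure of this appendix can be sketched in a few lines. This is an illustration under stated assumptions, not the code used in this work: the function names and the swap rates are hypothetical. The recursion solves Eq. (A.3) for the last discount factor, i.e. D_n (1 + τ r_{n-1}) = 1 − Σ τ r_{i-1} D_i with the sum running over i = 1, ..., n−1, and follows Eq. (A.3) literally by applying the rate r_{i-1} to the i-th coupon. A textbook par-swap bootstrap would instead apply the single rate r_{n-1} to every coupon of the swap maturing at step n.

```python
def interpolate_rates(pillars, step=0.5):
    """Linearly interpolate (maturity, rate) pillars onto a grid with spacing step."""
    pillars = sorted(pillars)
    n_points = int(round(pillars[-1][0] / step))
    grid = []
    for k in range(1, n_points + 1):
        t = k * step
        for (t0, r0), (t1, r1) in zip(pillars, pillars[1:]):
            if t0 <= t <= t1:
                w = (t - t0) / (t1 - t0)
                grid.append(r0 + w * (r1 - r0))
                break
    return grid

def bootstrap(rates, tau=0.5):
    """Discount factors D_0, ..., D_n obtained by solving Eq. (A.3) for D_n."""
    D = [1.0]                                    # Eq. (A.1)
    for n in range(1, len(rates) + 1):
        # Discounted coupons paid before the final date n.
        coupons = sum(tau * rates[i - 1] * D[i] for i in range(1, n))
        D.append((1.0 - coupons) / (1.0 + tau * rates[n - 1]))
    return D

# Hypothetical swap rates at the maturities named in the text (in years).
market = [(0.5, 0.030), (1.0, 0.032), (2.0, 0.035), (5.0, 0.040),
          (10.0, 0.045), (20.0, 0.048), (30.0, 0.049)]
rates = interpolate_rates(market)    # rates[0] is r_0, the 6m rate
D = bootstrap(rates)
print(D[1], D[-1])
```

For n = 1 the sum is empty and the recursion reduces to Eq. (A.2).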


Appendix B

The Levenberg-Marquardt Method

The original idea for this algorithm dates back to 1944, when it was first published by Levenberg [9], and later by Marquardt [10], whose presentation is followed here. The implementation that is used in the code is a function of the MINPACK software, which is documented in [13]. It is a method to iteratively solve a nonlinear least-squares problem.

For the optimization the model is given by the function $y = f(x, \beta)$ where $x$ is a vector of length $m$ representing the independent variables of the model, like swap maturity and swaption expiry. The vector $\beta$ contains the $k$ model parameters, i.e. $\alpha_n$, $\beta_n$, and so on. In addition there is a set of market data $Y$, a vector of length $n$ that here holds the Black volatilities, together with a set of $n$ vectors $X_i$ of the values of the independent variables at which the data were taken. The problem now consists of computing the set of parameters $\beta$ that minimizes the expression

$$ \Phi = \sum_{i=1}^{n} \left( f(X_i, \beta) - Y_i \right)^2 \tag{B.1} $$

where $Y_i$ refers to the $i$-th element of $Y$. The standard approach, also in other optimization algorithms, is to use a Taylor series

$$ f(X_i, \beta + \delta_t) \simeq f(X_i, \beta) + \sum_{j=1}^{k} \frac{\partial f_i}{\partial \beta_j} (\delta_t)_j \equiv f_0 + J \delta_t \tag{B.2} $$

with the vector $\delta_t$ being small and $f_i \equiv f(X_i, \beta)$; $f_0$ is the vector of unperturbed values and $J$ is the Jacobian. Plugging this into Eq. (B.1) and using $\partial \Phi / \partial \beta_j = 0$ for all $j$ leads to the minimization condition for $\delta_t$

$$ J^T J \delta_t = J^T (Y - f_0) . \tag{B.3} $$

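Note that $J$ is a matrix of size $n \times k$, $\delta_t$ is a vector of size $k$, and $Y$ and $f_0$ are vectors of size $n$. In the gradient methods the perturbation $\delta$ is taken to be in the direction of the negative gradient

$$ \delta_g = -\left( \frac{\partial \Phi}{\partial \beta_1}, \frac{\partial \Phi}{\partial \beta_2}, \ldots, \frac{\partial \Phi}{\partial \beta_k} \right) . \tag{B.4} $$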

Any valid correction must lie within 90° of the negative gradient. In practice, however, the direction of the negative gradient $\delta_g$ and the Taylor-series correction $\delta_t$ are often almost orthogonal.
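The step from Eq. (B.1) and Eq. (B.2) to Eq. (B.3), and the relation between the two directions, can be written out as follows. This is a short worked derivation in vector notation; it assumes that $J$ has full column rank, so that $J^T J$ is invertible.

```latex
\begin{align*}
\Phi(\beta+\delta) &\simeq \lVert f_0 + J\delta - Y \rVert^2, \\
\frac{\partial \Phi}{\partial \delta} &= 2 J^T \left( f_0 + J\delta - Y \right) = 0
  \;\Longrightarrow\; J^T J\,\delta_t = J^T (Y - f_0), \\
\delta_g &= -\left.\frac{\partial \Phi}{\partial \beta}\right|_{\delta=0} = 2 J^T (Y - f_0), \\
\delta_g^T \delta_t &= 2\,(Y - f_0)^T J \,(J^T J)^{-1} J^T (Y - f_0) \;\ge\; 0 .
\end{align*}
```

The last line shows that the angle between $\delta_t$ and $\delta_g$ never exceeds 90°, so the Taylor-series correction is never an uphill direction.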

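Levenberg and Marquardt improved these methods by using a strategy that interpolates between $\delta_t$ and $\delta_g$. To this end Eq. (B.3) is modified to

$$ \left( J^T J + \lambda I \right) \delta = J^T (Y - f_0) \tag{B.5} $$

where $I$ is the identity matrix and $J$ is now the scaled matrix with elements $\frac{\partial f_i}{\partial \beta_j} \Big/ \sqrt{\sum_{l=1}^{n} \left( \frac{\partial f_l}{\partial \beta_j} \right)^2}$. For $\lambda \to 0$ the solution approaches the Taylor-series correction $\delta_t$, while for large $\lambda$ it becomes a short step in the gradient direction $\delta_g$. By iteratively varying $\lambda$ such that the estimates of $\Phi$ decrease it is possible to obtain rapid convergence.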
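The iteration can be sketched for a two-parameter toy problem. This is a minimal illustration and not the MINPACK routine used in this work, which to my knowledge follows a trust-region formulation of the same idea. The test model, the starting values and the rule of changing λ by a factor of 10 are assumptions made for this example.

```python
import math

def model(x, beta):
    # Hypothetical test model; it stands in for f(x, beta) of the text.
    return beta[0] * math.exp(-beta[1] * x)

def jacobian(xs, beta):
    # Rows are data points i, columns are parameters j: J[i][j] = df_i/dbeta_j.
    return [[math.exp(-beta[1] * x), -beta[0] * x * math.exp(-beta[1] * x)]
            for x in xs]

def phi(xs, ys, beta):
    # Sum of squared deviations, Eq. (B.1).
    return sum((model(x, beta) - y) ** 2 for x, y in zip(xs, ys))

def lm_step(xs, ys, beta, lam):
    J = jacobian(xs, beta)
    resid = [y - model(x, beta) for x, y in zip(xs, ys)]      # Y - f0
    # Column norms used to scale the Jacobian as described for Eq. (B.5).
    norms = [math.sqrt(sum(row[j] ** 2 for row in J)) for j in range(2)]
    Js = [[row[j] / norms[j] for j in range(2)] for row in J]
    A = [[sum(r[a] * r[b] for r in Js) for b in range(2)] for a in range(2)]
    g = [sum(r[a] * e for r, e in zip(Js, resid)) for a in range(2)]
    A[0][0] += lam                       # J^T J + lambda * I
    A[1][1] += lam
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    d = [(g[0] * A[1][1] - A[0][1] * g[1]) / det,            # Cramer's rule
         (A[0][0] * g[1] - A[1][0] * g[0]) / det]
    # Undo the scaling to obtain the step in the original parameters.
    return [d[j] / norms[j] for j in range(2)]

def levenberg_marquardt(xs, ys, beta, lam=1e-2, max_iter=200, tol=1e-14):
    current = phi(xs, ys, beta)
    for _ in range(max_iter):
        step = lm_step(xs, ys, beta, lam)
        trial = [b + s for b, s in zip(beta, step)]
        value = phi(xs, ys, trial)
        if value < current:      # accept; move towards the Taylor-series step
            beta, current, lam = trial, value, lam / 10.0
        else:                    # reject; move towards the gradient step
            lam *= 10.0
        if current < tol:
            break
    return beta, current

xs = [0.5, 1.0, 2.0, 3.0, 5.0]
ys = [model(x, [2.0, 0.7]) for x in xs]   # synthetic data with known parameters
beta_fit, phi_fit = levenberg_marquardt(xs, ys, [1.0, 0.3])
print(beta_fit, phi_fit)
```

Because a step is only accepted when Φ decreases, the sequence of accepted Φ values is monotone, which is the property referred to above.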


Bibliography

[1] Black F, “The pricing of commodity contracts”, The Journal of Financial Economics, 3, pp. 167–179, 1976

[2] Brace A, Gatarek D, Musiela M, “The Market Model of Interest Rate Dynamics”, Mathematical Finance, Volume 7, pp. 127, April 1997

[3] Brigo D, Mercurio F, “Interest Rate Models”, Springer, January 2001

[4] Hull J, White A, “Bond Option Pricing Based on a Model for the Evolution of Bond Prices”, Advances in Futures and Option Research, 6, 2003

[5] Heath D, Jarrow R, Morton A, “Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation”, Econometrica, Vol. 60, No. 1, pp. 77–105, January 1992

[6] Hull J, White A, “One-Factor Interest-Rate Models and the Valuation of Interest-Rate Derivative Securities”, Journal of Financial and Quantitative Analysis, Volume 28, Number 2, pp. 235–254, June 1993

[7] Jamshidian F, “LIBOR and swap market models and measures”, Finance and Stochastics, Volume 1, Number 4, pp. 293–330, September 1997

[8] Joshi M, Rebonato R, “A stochastic-volatility, displaced-diffusion extension of the LIBOR market model”, Quantitative Finance 3, pp. 458–469, 2003

[9] Levenberg K, “A method for the solution of certain problems in least squares”, Quart. Appl. Math, 1944

[10] Marquardt DW, “An Algorithm for Least-Squares Estimation of Nonlinear Parameters”, Journal of the Society for Industrial and Applied Mathematics, Volume 11, Number 2, pp. 431–441, June 1963

[11] Marris D, “Financial Option Pricing and Skewed Volatility”, M. Phil Thesis, Statistical Laboratory, University of Cambridge, 1999

[12] Matsumoto M, Nishimura T, “Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator”, ACM Transactions on Modeling and Computer Simulation (TOMACS), Volume 8, Issue 1, pp. 3–30, January 1998

[13] Moré JJ, Garbow BS, Hillstrom KE, “User guide for MINPACK”, Argonne National Laboratory, 1980

[14] Rebonato R, “Interest-Rate Option Models: Understanding, Analysing and Using Models for Exotic Interest-Rate Options”, Wiley Series in Financial Engineering, 1998

[15] Rebonato R, “On the pricing implications of the joint lognormal assumption for the swaption and cap markets”, Journal of Computational Finance, Volume 2, Number 3, pp. 30–52, 1999

[16] Rebonato R, “Modern Pricing of Interest-Rate Derivatives: The LIBOR Market Model and Beyond”, Princeton University Press, November 2002

[17] Rebonato R, “Volatility and Correlation: The Perfect Hedger and the Fox”, 2nd edition, J. Wiley Chichester, West Sussex, England, 2003

[18] Rebonato R, “Interest-rate term-structure pricing models: a review”, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, Volume 460, Number 2043, pp. 667–728, March 2004

[19] Rebonato R, “Forward-Rate Volatilities and the Swaption Matrix: Why Neither Time-Homogeneity Nor Time-Dependence Are Enough”, to appear in International Journal of Theoretical and Applied Finance, August 2006

[20] Rebonato R, Kainth D, “A two-regime, stochastic-volatility extension of the LIBOR market model”, International Journal of Theoretical and Applied Finance, World Scientific, 2004

[21] Rubinstein M, “Displaced Diffusion Option Pricing”, The Journal of Finance, Volume 38, Number 1, pp. 213–217, March 1983

[22] Theis J, “Pricing with LIBOR and Swap Market Models”, Oxford, March 2005
