statistical arbitrage pairs trading, long-short strategy

Statistical Arbitrage

Pairs Trading, Long-Short Strategy

Cyrille BEN LEMRID

Credit Suisse supervisor : Frederic PECQUEUR

Academic supervisors : Olivier GUEANT, Simone SCOTTI

Paris Diderot University, Paris VII

October 1, 2012

Contents

1 Pairs Trading Model

1.1 General discussion

1.2 Cointegration

1.3 Spread dynamics

2 State of the art and model overview

2.1 Stochastic Dependencies in Financial Time Series

2.2 Cointegration-based trading strategies

2.3 Formulation as a Stochastic Control Problem

2.4 Fundamental analysis

3 Strategies Analysis

3.1 Road map for strategy design

3.2 Identification of potential pairs

3.3 Testing cointegration

3.4 Risk control and feasibility

4 Results


Introduction

This report presents my research work carried out at Credit Suisse from May to September 2012. This study has been pursued in collaboration with the Global Arbitrage Strategies team.

Quantitative strategy developers use sophisticated statistical and optimization techniques to discover and construct new algorithms. These algorithms take advantage of short-term deviations from the "fair" prices of securities. Pairs trading is one such quantitative strategy: it is a process of identifying securities that generally move together but are currently "drifting away".

Pairs trading is a common strategy among many hedge funds and banks. However, there is not a significant amount of academic literature devoted to it, owing to its proprietary nature. For a review of some of the existing academic models, see [6], [8], [11].

Our focus in this analysis is the study of two quantitative approaches to the problem of pairs trading. The first uses the properties of cointegrated financial time series as the basis for a trading strategy. In the second, we model the log-relationship between a pair of stock prices as an Ornstein-Uhlenbeck process and use it to formulate a portfolio-optimization-based stochastic control problem.

This study was performed to show that, under certain assumptions, the two approaches are equivalent.

Practitioners most often use a fundamentally driven approach, analyzing the performance of stocks around a market event and implementing strategies using back-tested trading levels.

We also study an example of a fundamentally driven strategy, using the market reaction to a stock being dropped from or added to the MSCI World Standard as a signal for a pair trading strategy on those stocks once their inclusion/exclusion has been made effective.

This report is organized as follows. Section 1 provides some background on the pairs trading strategy. The theoretical results are described in Section 2. Section 3 describes our methodology for constructing pairs and calculating returns. Finally, in Section 4, the results are illustrated with numerical examples involving a real pair of stocks.

Acknowledgements

This study has been pursued in collaboration with the Global Arbitrage Strategies team at Credit Suisse. Since my first day in the offices of Credit Suisse, I have been thoroughly assisted. Successful professionals were kind enough to answer my questions and to give their opinion on my work throughout the internship. Without their advice, this study would not have reached its current findings.

First, I would like to thank my supervisor Mr Frederic Pecqueur, Managing Director of the Index Arbitrage Team. I would also like to thank Alan Duffy and Reda Ighachan, who greatly contributed to my project and gave me valuable feedback. I am also thankful to Andrew Coman, Alex Nevill and Karl Bogoslavski for their contributions, and to the team in general.


CHAPTER 1

Pairs Trading Model

1.1 General discussion

The fundamental idea of pairs trading is that, knowing that a pair of financial instruments has historically moved together and kept a specific pattern for their spread, we can take advantage of any disturbance to this historical trend. Pairs trading involves selling the higher-priced security and buying the lower-priced security with the idea that the mispricing will correct itself in the future. The mutual mispricing between the two securities is captured by the notion of spread. The greater the spread, the higher the magnitude of the mispricing and the greater the profit potential.

There are generally two types of pairs trading: statistical arbitrage convergence/divergence trades, and fundamentally driven valuation trades. In the former, the driving force for the trade is an aberration in the long-term spread between the two securities, and to capture the mean-reversion back to the norm, you short one and go long the other. The difficulty lies in creating a program to find the pairs, and in the relationship continuing to hold. The other form of pairs trading is a more fundamentally driven variation, which is the purview of most market-neutral hedge funds: in essence they short the most overvalued stock and go long the undervalued stock. However, it is possible to determine that a security is overvalued or undervalued only if we also know the true value of the security in absolute terms.

A long-short position in the two securities is constructed such that it has a negligible beta and therefore minimal exposure to the market. Hence, the returns from the trade are uncorrelated with market returns, a feature typical of market-neutral strategies. The key to success in pairs trading lies in the identification of security pairs.

Based on the discussion so far, the truly crucial questions are: How do you identify "stocks that move together"? Need they be in the same industry? Should they only be liquid stocks? How far do they have to diverge before a position is put on? When is a position unwound?

The returns on both securities are expected to be close over the time frame considered. In other words, the increment to the logarithm of the prices at the current time must be about the same for both securities at all future time instances. This, of course, means that the time series of the logarithms of the two prices must move together, and the spread calculation formula is therefore based on the difference in the logarithms of the prices. Having explained our approach, we now need to define in precise terms what we mean when we say that the price series or the log-price series of the two securities must move together.

The idea of comovement is not captured by the correlation of two time series but by the notion of cointegration, which has been well developed in the field of statistics.

1.2 Cointegration

It is well documented that correlation is a measure of short-term linear dependencies (see [4], Theorem 4.5.7). In contrast to correlation, cointegration is a measure of long-term dependencies (see [9]).

We will briefly outline the notion of cointegration. In order to do so, we need to first define stationary and integrated time series.

Definition 1.2.1 A process $(X_t)_{t \in \mathbb{Z}}$, $X_t \in \mathbb{R}$, is said to be strictly stationary iff its finite-dimensional distributions are invariant under any time translation, i.e.

$$\forall \tau \in \mathbb{Z},\ \forall n \in \mathbb{N}^*,\ \forall (t_1, \dots, t_n) \in \mathbb{Z}^n, \quad (X_{t_1}, \dots, X_{t_n}) \sim (X_{t_1 - \tau}, \dots, X_{t_n - \tau})$$

Definition 1.2.2 A stochastic process $(X_t)_{t \in \mathbb{Z}}$, $X_t \in \mathbb{R}$, is stationary if its first and second moments are time invariant:

– $(X_t)_{t \in \mathbb{Z}} \in L^2(\mathbb{R})$, i.e. $\forall t \in \mathbb{Z},\ E[X_t^2] < \infty$

– $\forall t \in \mathbb{Z},\ E[X_t] = E[X_0] := \mu_X$

– $\forall s, t \in \mathbb{Z},\ \gamma_X(t, s) := \mathrm{Cov}(X_t, X_s) = \mathrm{Cov}(X_0, X_{s-t}) =: \gamma(s - t)$

Such a process is known as integrated of order 0 and denoted by I(0).

Definition 1.2.3 A univariate process is called integrated of order d, I(d), if in its original form it is non-stationary but becomes stationary after differencing d times.

Definition 1.2.4 If all elements of the vector $X_t = (X_t^1, \dots, X_t^N)^T$, for $t = 1, \dots, T$, are I(1), and there exists a non-null vector $\beta = (\beta^1, \dots, \beta^N)^T$ such that $\beta^T X_t$ is I(0), then the vector process $X_t$ is said to be cointegrated, and $\beta$ is called the cointegrating vector. For example, two time series $X$ and $Y$ are cointegrated if $X$, $Y$ are I(1) and there exists a scalar $\beta$ such that $X - \beta Y$ is I(0).
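The two-series example in Definition 1.2.4 can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the coefficient 1.5, the AR(1) parameter 0.5, the sample length and the function name `simulate_cointegrated_pair` are all hypothetical choices, not taken from the study. It builds a random walk Y (an I(1) series) and sets X equal to beta times Y plus a stationary AR(1) deviation, so that X - beta * Y is I(0) by construction.

```python
import random
import statistics


def simulate_cointegrated_pair(n, beta, phi, seed=0):
    """Return (x, y): y is a random walk and x - beta * y is a stationary AR(1)."""
    rng = random.Random(seed)
    x, y = [], []
    level, deviation = 0.0, 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)                       # y is I(1): cumulated shocks
        deviation = phi * deviation + rng.gauss(0.0, 1.0)  # I(0) deviation since |phi| < 1
        y.append(level)
        x.append(beta * level + deviation)
    return x, y


beta = 1.5  # assumed cointegration coefficient for the illustration
x, y = simulate_cointegrated_pair(n=2000, beta=beta, phi=0.5)
spread = [xi - beta * yi for xi, yi in zip(x, y)]

var_x = statistics.pvariance(x)            # grows with the sample length for an I(1) series
var_spread = statistics.pvariance(spread)  # stays near 1 / (1 - phi**2) for the I(0) spread
print(f"Var(X) = {var_x:.2f}, Var(X - beta*Y) = {var_spread:.2f}")
```

Each series wanders without bound, while the linear combination keeps a bounded variance; this is the property the cointegrating vector is meant to capture.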


The explanation for cointegration dynamics is captured by the notion of error correction. The idea behind error correction is that cointegrated systems have a long-run equilibrium, that is, the long-run mean of the linear combination of the two time series. If there is a deviation from the long-run mean, then one or both time series adjust themselves to restore the long-run equilibrium.

1.3 Spread dynamics

The purpose of this section is to demonstrate that the modeling of the spread in parametric terms can indeed become complex. This section formulates a stochastic model associated with a cointegration relation. We begin by describing the asset, spread, and wealth dynamics.

We assume that a risk-free asset $S_0(t)$ exists with a risk-free rate $r$ compounded continuously. Thus, $S_0(t)$ satisfies the dynamics

$$dS_0(t) = r S_0(t)\,dt \tag{1.1}$$

Let $S_1(t)$ and $S_2(t)$ denote respectively the prices of the pair of stocks $S_1$ and $S_2$ at time $t$. We assume that stock $S_2$ follows a geometric Brownian motion

$$dS_2(t) = \mu S_2(t)\,dt + \sigma S_2(t)\,dW_2(t) \tag{1.2}$$

where $\mu$ is the drift, $\sigma$ is the volatility, and $W_2(t)$ is a standard Brownian motion. Let $X(t)$ denote the spread of the two cointegrated stocks at time $t$, defined as

$$X(t) = \ln S_1(t) - \beta \ln S_2(t) \tag{1.3}$$

Since $S_1(t)$ and $S_2(t)$ are cointegrated, $X(t)$ is stationary. We assume that the spread follows an Ornstein-Uhlenbeck process

$$dX(t) = k(\theta - X(t))\,dt + \eta\,dW_X(t) \tag{1.4}$$

where $k(\theta - X(t))$ is the drift term that represents the expected instantaneous change in the spread at time $t$, and $\theta$ is the long-term equilibrium level to which the spread reverts. The rate of reversion is represented by the parameter $k$, which has to be positive to ensure stability around the equilibrium value. The standard deviation parameter $\eta$ determines the volatility of the spread. $W_X(t)$ is a standard Brownian motion, and $\rho$ denotes the instantaneous correlation coefficient between $W_X(t)$ and $W_2(t)$ (i.e. $E[dW_X(t)\,dW_2(t)] = \rho\,dt$).

By using (1.2), (1.3), (1.4) and applying Ito's lemma to $S_1(t) = e^{X(t)} S_2(t)^{\beta}$, we obtain the dynamics of $S_1(t)$ as

$$dS_1(t) = \left(\beta\mu + k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\rho\sigma\eta\right) S_1(t)\,dt + \beta\sigma S_1(t)\,dW_2(t) + \eta S_1(t)\,dW_X(t) \tag{1.5}$$
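The Ito step behind (1.5) can be spelled out as follows. This is a short derivation sketch that uses only (1.2), (1.3), (1.4) and the correlation $E[dW_X(t)\,dW_2(t)] = \rho\,dt$; time arguments are dropped for readability.

$$d\ln S_2 = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_2$$

$$d\ln S_1 = dX + \beta\,d\ln S_2 = \left(k(\theta - X) + \beta\mu - \tfrac{1}{2}\beta\sigma^2\right)dt + \eta\,dW_X + \beta\sigma\,dW_2$$

$$d\langle \ln S_1 \rangle = \left(\eta^2 + \beta^2\sigma^2 + 2\beta\rho\sigma\eta\right)dt$$

$$\frac{dS_1}{S_1} = d\ln S_1 + \tfrac{1}{2}\,d\langle \ln S_1 \rangle = \left(\beta\mu + k(\theta - X) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\rho\sigma\eta\right)dt + \beta\sigma\,dW_2 + \eta\,dW_X$$

The term $\tfrac{1}{2}\beta(\beta - 1)\sigma^2$ comes from combining $-\tfrac{1}{2}\beta\sigma^2$ in the drift of $\ln S_1$ with $\tfrac{1}{2}\beta^2\sigma^2$ in the quadratic variation.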

Let $V(t)$ be the value of a self-financing pairs-trading portfolio and let $h(t)$ and $\tilde{h}(t)$ denote respectively the portfolio weights for stocks $S_1$ and $S_2$ at time $t$. Additionally, we only allow ourselves to trade stocks $S_1$ and $S_2$ as a cointegrated pair. Thus, we require that

$$\tilde{h}(t) = -\beta h(t) \tag{1.6}$$

Finally, the wealth dynamics of the self-financed portfolio value is given by

$$dV(t) = V(t)\left(h(t)\frac{dS_1(t)}{S_1(t)} + \tilde{h}(t)\frac{dS_2(t)}{S_2(t)} + \frac{dS_0(t)}{S_0(t)}\right) \tag{1.7}$$

Using (1.1), (1.2), (1.5) and (1.6), we can rewrite (1.7) as

$$dV(t) = V(t)\left\{\left[h(t)\left(k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\sigma\eta\rho\right) + r\right]dt + h(t)\,\eta\,dW_X(t)\right\} \tag{1.8}$$

We see that the self-financed portfolio dynamics depend on the spread dynamics only; thus, in theory, we almost totally avoid the systematic market risk modelled here by $W_2(t)$.
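The disappearance of the market factor can be checked term by term. In the sketch below, $h$ is the weight on $S_1$ and $\tilde{h} = -\beta h$ the weight on $S_2$ (the tilde is notation assumed here to distinguish the two weights); the dynamics of $S_1$ and $S_2$ are those of (1.5) and (1.2).

$$h\,\frac{dS_1}{S_1} + \tilde{h}\,\frac{dS_2}{S_2} = h\left(\beta\mu + k(\theta - X) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\rho\sigma\eta\right)dt + h\beta\sigma\,dW_2 + h\eta\,dW_X - \beta h\left(\mu\,dt + \sigma\,dW_2\right)$$

$$= h\left(k(\theta - X) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\rho\sigma\eta\right)dt + h\eta\,dW_X$$

Both the drift contribution $h\beta\mu$ and the diffusion contribution $h\beta\sigma\,dW_2$ cancel exactly, so the only remaining source of randomness is $W_X$. The dependence on $W_2$ survives only through the correlation $\rho$ in the drift.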


CHAPTER 2

State of the art and model overview

2.1 Stochastic Dependencies in Financial Time Series

Assume now that we have $N \geq 2$ cointegrated financial assets, and that their log-prices are I(1) processes. It is widely assumed that stock returns are integrated of order 0, whereas stock prices are integrated of order 1 (see [1]).

Denote the vector of the asset prices by $S_t = (S_t^1, \dots, S_t^N)^T$. Each of its elements can be written as

$$S_t^i = S_0^i \exp\left(\sum_{j=0}^{t} r_j^i\right)$$

where $r_t = (r_t^1, \dots, r_t^N)^T$ are the continuously compounded asset returns, and $S_0^1, \dots, S_0^N$ are the initial prices.

Then, the log-prices can be written as

$$\ln S_t^i = \ln S_0^i + \sum_{j=0}^{t} r_j^i$$

Denote the corresponding cointegrating vector by $\beta = (\beta^1, \dots, \beta^N)^T$. By the definition of cointegration, the resulting time series

$$X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i$$

will be stationary and integrated of order 0.

The next two propositions lead to derivations of new properties of cointegrated time series that we later use for the construction of a new trading strategy.

Proposition 2.1.1 Assume that the log-prices of $N$ assets, $\ln S^i$, $i = 1, \dots, N$, are cointegrated with a cointegrating vector $\beta$. Let $X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i$ be the corresponding stationary series, and $r_t = (r_t^1, \dots, r_t^N)^T$ be the continuously compounded asset returns at time $t > 0$. Then the process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$ has the following properties:

– $Z_t = X_t - X_{t-1} = \sum_{i=1}^{N} \beta^i r_t^i$.

– If $\lim_{p \to \infty} \mathrm{Cov}[X_t, X_{t-p}] = 0$, then

$$\sum_{p=1}^{\infty} p\,\mathrm{Cov}[Z_t, Z_{t-p}] = -\mathrm{Var}\,X_t$$

where $\mathrm{Cov}[Z_t, Z_{t-p}] = \sum_{i=1}^{N}\sum_{j=1}^{N} \beta^i \beta^j\,\mathrm{Cov}\left[r_t^i, r_{t-p}^j\right]$.

Proof: See [10].

Proposition 2.1.1 is a technical result. Intuitively, it shows that the variance of the cointegration process $(X_t)$ implicitly determines the auto-covariance structure of the asset returns.

Proposition 2.1.2 Assume $\ln S^i$, $i = 1, \dots, N$, are the log-prices of $N$ assets, and $r_t = (r_t^1, \dots, r_t^N)^T$ are the continuously compounded asset returns at time $t > 0$. For some finite vector $\beta$, the process $X_t = \sum_{i=1}^{N} \beta^i \ln S_t^i$ is stationary with $\lim_{p \to \infty} \mathrm{Cov}[X_t, X_{t-p}] = 0$, and therefore the time series of the assets' log-prices are cointegrated, if and only if the process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$ has the following three properties:

– $E[Z_t] = 0$

– $\mathrm{Var}\,Z_t = -2\sum_{p=1}^{\infty} \mathrm{Cov}[Z_t, Z_{t-p}]$

– $\sum_{p=1}^{\infty} p\,\mathrm{Cov}[Z_t, Z_{t-p}] < \infty$

Proof: See [10].

As a result of Proposition 2.1.2, it follows that cointegration is a property related to the first and second moments of asset returns. In previous work, cointegration was viewed as a property of asset prices. Here cointegration is defined by the stochastic relationships among the returns.
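The moment identities in the two propositions can be checked on a case where everything is known in closed form. The sketch below assumes, purely for illustration, that the stationary series is an AR(1) process with parameter `phi` and unit innovation variance, whose autocovariance at lag $h$ is $\phi^{|h|}/(1 - \phi^2)$; the function names are hypothetical. The infinite sums are truncated at a lag where the remaining terms are numerically negligible.

```python
def ar1_autocov(lag, phi, sigma2=1.0):
    """Autocovariance of a stationary AR(1) X_t = phi * X_{t-1} + eps_t at the given lag."""
    return sigma2 * phi ** abs(lag) / (1.0 - phi ** 2)


def increment_autocov(p, phi):
    """Cov[Z_t, Z_{t-p}] for Z_t = X_t - X_{t-1}, written with the autocovariance of X."""
    return 2.0 * ar1_autocov(p, phi) - ar1_autocov(p - 1, phi) - ar1_autocov(p + 1, phi)


phi = 0.8               # assumed AR(1) parameter, |phi| < 1
lags = range(1, 2000)   # truncation of the infinite sums

var_x = ar1_autocov(0, phi)
var_z = increment_autocov(0, phi)
sum_cov = sum(increment_autocov(p, phi) for p in lags)
sum_p_cov = sum(p * increment_autocov(p, phi) for p in lags)

print(f"Var Z = {var_z:.6f},  -2 * sum Cov = {-2.0 * sum_cov:.6f}")
print(f"Var X = {var_x:.6f},  -sum p Cov   = {-sum_p_cov:.6f}")
```

In this example the lag-weighted sum of the autocovariances of $Z_t$ reproduces $-\mathrm{Var}\,X_t$, and the unweighted sum reproduces $-\tfrac{1}{2}\mathrm{Var}\,Z_t$, which is the relation the trading strategy of the next section relies on.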

2.2 Cointegration-based trading strategies

Next, we introduce a trading strategy by exploiting the theoretical results derived in the previous sections.

Summarizing the results from the two propositions: for the process $Z_t = X_t - X_{t-1} = \sum_{i=1}^{N} \beta^i r_t^i$ we have that $E[Z_t] = 0$ and $\mathrm{Var}\,Z_t = -2\sum_{p=1}^{\infty} \mathrm{Cov}[Z_t, Z_{t-p}]$. Consider a strategy where each time period we buy a value of $-\beta^i C \sum_{p=1}^{\infty} Z_{t-p}$ of stock $i$, $i = 1, \dots, N$, where $C$ is a positive scale factor. The reason for which we include the constant $C$ will become clear later.

At any point in time we can compute the profit of this strategy by multiplying the next-period return by the value purchased:

$$V_t = \sum_{i=1}^{N} -\beta^i C \left[\sum_{p=1}^{\infty} Z_{t-p}\right] r_t^i = -C \sum_{p=1}^{\infty} Z_{t-p} Z_t$$

Given that $E[Z_t] = 0$ and $\mathrm{Cov}[Z_t, Z_{t-p}] = E[Z_t Z_{t-p}]$, $p > 0$, the expected profit of this strategy is:

$$E[V_t] = E\left[-C \sum_{p=1}^{\infty} Z_{t-p} Z_t\right] = -C \sum_{p=1}^{\infty} \mathrm{Cov}[Z_{t-p}, Z_t] = 0.5\,C\,\mathrm{Var}\,Z_t$$

Since $\mathrm{Var}\,Z_t$ and $C$ are positive, the expected profit of the proposed strategy is always positive and proportional to the scale factor $C$. The reasoning behind this strategy is fairly simple.
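The expected-profit formula can be checked by simulation on a toy example. The sketch below makes several assumptions that are not part of the report: the stationary series is an AR(1) process with long-run mean zero and unit innovation variance, and the infinite sum of past increments is replaced by the lagged deviation of the series from its mean, which is what the sum telescopes to. The function name `average_profit` and the parameter values are hypothetical.

```python
import random


def average_profit(phi, scale, n_periods, seed=0, burn_in=1000):
    """Average per-period profit -C * X_{t-1} * Z_t for an AR(1) series with mean zero."""
    rng = random.Random(seed)
    x_prev, total = 0.0, 0.0
    for t in range(n_periods + burn_in):
        x_new = phi * x_prev + rng.gauss(0.0, 1.0)
        z = x_new - x_prev                # Z_t = X_t - X_{t-1}
        if t >= burn_in:                  # discard the start-up transient
            total += -scale * x_prev * z  # position sized on the lagged deviation
        x_prev = x_new
    return total / n_periods


phi, scale = 0.9, 1.0           # assumed AR(1) parameter and scale factor C
var_z = 2.0 / (1.0 + phi)       # Var Z_t of this AR(1) with unit innovation variance
expected = 0.5 * scale * var_z  # theoretical value 0.5 * C * Var Z_t
estimate = average_profit(phi, scale, n_periods=200_000)
print(f"simulated {estimate:.3f} vs theoretical {expected:.3f}")
```

The simulated average is positive and close to $0.5\,C\,\mathrm{Var}\,Z_t$, before any trading costs, which this sketch ignores.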

The cointegration relations between time series imply that the time series are bound together. Over time the time series might drift apart for a short period of time, but they ought to re-converge. The term

$$\sum_{p=1}^{\infty} Z_{t-p} = \sum_{p=1}^{\infty} \left(X_{t-p} - X_{t-1-p}\right) = X_{t-1} - \lim_{u \to -\infty} X_u$$

which, identifying the distant-past level of the spread with its long-run mean $\theta$, equals $X_{t-1} - \theta$, measures how far they diverge, and $\mathrm{sign}\left(-\beta^i C \sum_{p=1}^{\infty} Z_{t-p}\right)$ provides the direction of the trade for stock $i$. Specifically, +1 stands for a long position, whereas -1 denotes a short trade. This strategy relies on identifying spreads that have moved apart but are expected to mean revert in the future. In a typical pairs-trading strategy, spreads are identified by using correlation as a similarity measure and standard deviation as a spread measure. A trade, for example, will be put in place if the assets are highly correlated but have moved apart by more than 3 standard deviations. The trade will be unwound when the assets converge or some time limit is reached.

This approach uses cointegration as a measure of similarity. Cointegration is the natural answer to the question: how do we identify assets that move together? Proposition 2.1.2 provides the answer to the question: how far do the assets have to diverge before a trade is placed? As a result, the decision to execute a trade is driven by the cointegration properties of the assets. Having a positive expected profit is excellent news for any strategy. The proposed strategy nevertheless has some shortcomings. The initial amount of money needed each period is a random variable, and the resulting portfolio is not dollar neutral (i.e. the total dollar value of the long position is not equal to the total dollar value of the short position). To construct a dollar-neutral long-short portfolio, we first partition the cointegrated time series into two sets $L$ and $S$:

$$i \in L \iff \beta^i \geq 0, \qquad i \in S \iff \beta^i < 0$$

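Next, depending on which set a given asset belongs to, we purchase the following number of shares of asset $i$, that is, a dollar value proportional to $C$ divided by the current price $S_t^i$:

$$\frac{-\beta^i\,C\,\mathrm{sign}\left(\sum_{p=1}^{P} Z_{t-p+1}\right)}{S_t^i \sum_{j \in L} \beta^j} \quad (i \in L), \qquad \frac{\beta^i\,C\,\mathrm{sign}\left(\sum_{p=1}^{P} Z_{t-p+1}\right)}{S_t^i \sum_{j \in S} \beta^j} \quad (i \in S)$$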

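The return of this modified strategy is identical to that of the strategy proposed earlier. Hence, the expected profit of the modified strategy is also positive. Indeed, assume (without loss of generality) that $\mathrm{sign}\left(-\beta^i C \sum_{p=1}^{\infty} Z_{t-p}\right) = -1$. The long returns $R_t^L$ and the short returns $R_t^S$ of our original strategy are

$$R_t^L = \frac{\sum_{i \in L} -\beta^i C \sum_{p=1}^{\infty} Z_{t-p}\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} -\beta^i C \sum_{p=1}^{\infty} Z_{t-p}} - 1 = \frac{\sum_{i \in L} \beta^i\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \beta^i} - 1$$

$$R_t^S = 1 - \frac{\sum_{i \in S} -\beta^i C \sum_{p=1}^{\infty} Z_{t-p}\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} -\beta^i C \sum_{p=1}^{\infty} Z_{t-p}} = 1 - \frac{\sum_{i \in S} \beta^i\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \beta^i}$$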

The modified strategy has the following returns from the short and long positions:

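$$R_t^L = \frac{\sum_{i \in L} \dfrac{-\beta^i}{\sum_{j \in L} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{\infty} Z_{t-p}\right)\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \dfrac{-\beta^i}{\sum_{j \in L} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{\infty} Z_{t-p}\right)} - 1 = \frac{\sum_{i \in L} \beta^i\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in L} \beta^i} - 1$$

$$R_t^S = 1 - \frac{\sum_{i \in S} \dfrac{\beta^i}{\sum_{j \in S} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{\infty} Z_{t-p}\right)\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \dfrac{\beta^i}{\sum_{j \in S} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{\infty} Z_{t-p}\right)} = 1 - \frac{\sum_{i \in S} \beta^i\,\dfrac{S_{t+1}^i}{S_t^i}}{\sum_{i \in S} \beta^i}$$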

The above derivations indicate that the return of the modified strategy is the same as that of the original one; therefore its expected profit is positive (since we proved that the expected return of the original strategy is positive). Now we can explain why we have included the constant $C$. In the modified strategy, every time period the value $C$ is invested in the short and in the long positions. Hence, the money needed for each time period in order to execute the new strategy is a constant, and the portfolio we obtain is dollar neutral.

In reality, we cannot compute the true value of $\sum_{p=1}^{\infty} Z_{t-p}$ (nor that of the cointegration vector $\beta$). We estimate them and, with the above theoretical results in mind, we propose the following trading strategy:

– Step 1: using historical data, estimate the cointegration vector $\beta$.

– Step 2: using the estimated cointegration vector $\beta$ and historical data, construct realizations of the process $Z_t = \sum_{i=1}^{N} \beta^i r_t^i$.

– Step 3: compute the finite sum $\sum_{p=1}^{P} Z_{t-p+1}$, where $P$ is a parameter.

– Step 4: partition the assets into two sets $L$ and $S$ (depending on the sign of $\beta^i$).

– Step 5: buy (depending on which set the asset belongs to) the following number of shares (rounded down to an integer number of shares):

$$\frac{-\beta^i}{S_t^i \sum_{j \in L} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{P} Z_{t-p+1}\right) \quad (i \in L), \qquad \frac{\beta^i}{S_t^i \sum_{j \in S} \beta^j}\,C\,\mathrm{sign}\left(\sum_{p=1}^{P} Z_{t-p+1}\right) \quad (i \in S)$$

– Step 6: rebalance all the open positions the following trading day.

– Step 7: update the historical data set.

– Step 8: if it is time to re-estimate the cointegration vector (which happens every 22 trading days), go to Step 1; otherwise go to Step 2.
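Steps 2 to 5 above can be sketched for a single rebalancing date as follows. This is a minimal illustration, not the implementation used in the study: the function name `target_shares`, the dictionary-based input layout and the example prices are hypothetical, the cointegration vector of Step 1 is assumed to be already estimated, and "round down" is read here as truncation toward zero.

```python
import math


def target_shares(prices, beta, capital, window):
    """Steps 2 to 5 for one rebalancing date.

    prices: dict asset -> list of prices, oldest first (hypothetical input layout)
    beta:   dict asset -> estimated cointegration coefficient (Step 1 assumed done)
    """
    n_obs = len(next(iter(prices.values())))
    # Step 2: Z_t = sum_i beta^i * r_t^i, with continuously compounded returns
    z = [
        sum(b * math.log(prices[a][t] / prices[a][t - 1]) for a, b in beta.items())
        for t in range(1, n_obs)
    ]
    # Step 3: finite sum of the last `window` realizations of Z
    signal = sum(z[-window:])
    direction = (signal > 0) - (signal < 0)  # sign of the sum: -1, 0 or +1
    # Step 4: partition the assets on the sign of their coefficient
    long_set = [a for a, b in beta.items() if b >= 0]
    short_set = [a for a, b in beta.items() if b < 0]
    sum_long = sum(beta[a] for a in long_set)
    sum_short = sum(beta[a] for a in short_set)
    # Step 5: share counts, truncated toward zero
    shares = {}
    for a in long_set:
        shares[a] = int(-beta[a] / (prices[a][-1] * sum_long) * capital * direction)
    for a in short_set:
        shares[a] = int(beta[a] / (prices[a][-1] * sum_short) * capital * direction)
    return shares


# Illustrative data: asset A has just risen relative to asset B
example_prices = {"A": [10.0, 10.0, 11.0], "B": [16.0, 16.0, 16.0]}
example_beta = {"A": 1.0, "B": -1.0}
example_shares = target_shares(example_prices, example_beta, capital=1000.0, window=2)
print(example_shares)
```

In the example the combination has moved above its recent level, so the sketch sells asset A (positive coefficient) and buys asset B (negative coefficient), with roughly the same dollar amount `capital` on each side.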

In the next section we describe the procedures used to test the strategy and present the numerical results.

2.3 Formulation as a Stochastic Control Problem

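We recall the wealth dynamics

$$dV(t) = V(t)\left\{\left[h(t)\left(k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\sigma\eta\rho\right) + r\right]dt + h(t)\,\eta\,dW_X(t)\right\} \tag{2.1}$$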

We formulate the portfolio optimization pair-trading problem as a stochastic optimal control problem. We assume that the investor's preferences can be represented by the utility function $U(x) = \frac{1}{\gamma} x^{\gamma}$, with $x \geq 0$ and $\gamma < 1$. In this formulation, our objective is to maximize expected utility at the final time $T$. Thus, we seek to solve

$$\sup_{h(t)} E\left[\frac{1}{\gamma} V(T)^{\gamma}\right]$$

subject to

$$V(0) = v_0, \quad X(0) = x_0$$

$$dX(t) = k(\theta - X(t))\,dt + \eta\,dW_X(t)$$

$$\frac{dV(t)}{V(t)} = \left[h(t)\left(k(\theta - X(t)) + \tfrac{1}{2}\eta^2 + \tfrac{1}{2}\beta(\beta - 1)\sigma^2 + \beta\sigma\eta\rho\right) + r\right]dt + h(t)\,\eta\,dW_X(t)$$

where the supremum is taken over strategies $h(t)$ that are adapted to the filtration generated by $W_X(t)$ and $W_2(t)$. (For a rigorous formulation in a related setting, see [13].) In this optimal control problem, the first constraint just specifies the initial wealth of our portfolio and the initial spread. The second and third constraints describe the spread and wealth dynamics respectively.

In the following section, we show that a closed-form solution to the above stochastic control problem exists.

Let $G(t, v, x)$ denote the value function:

$$G(t, v, x) = \sup_{h} E_{t,x,v}\left[V(T)^{\gamma}\right]$$

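For any strategy $h(t)$, define the Dynkin operator

$$\mathcal{L}^h = -k(x - \theta)\,\partial_x + \left[h k(\theta - x) + \tfrac{1}{2}h\eta^2 + \tfrac{1}{2}h\beta(\beta - 1)\sigma^2 + h\beta\sigma\eta\rho + r\right] v\,\partial_v + \tfrac{1}{2}\left[\eta^2\,\partial_{xx} + 2h\eta^2 v\,\partial_{vx} + h^2\eta^2 v^2\,\partial_{vv}\right] \tag{2.2}$$

The HJB equation can be written using the Dynkin operator as

$$\partial_t G + \sup_{h}\left[\mathcal{L}^h G\right] = 0$$

subject to the terminal condition $G(T, v, x) = v^{\gamma}$.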

By standard arguments, one may show that the Hamilton-Jacobi-Bellman (HJB) equation corresponding to our stochastic control problem is

$$G_t + \sup_{h}\left\{\tfrac{1}{2}\left[h^2\eta^2 v^2 G_{vv} + \eta^2 G_{xx} + 2h\eta^2 v G_{vx}\right] + \left[h k(\theta - x) + \tfrac{1}{2}h\eta^2 + \tfrac{1}{2}h\beta(\beta - 1)\sigma^2 + h\beta\sigma\eta\rho + r\right] v G_v - k(x - \theta) G_x\right\} = 0 \tag{2.3}$$

where the subscripts on $G$ denote partial derivatives.

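For notational ease we let $b = k(\theta - x) + \tfrac{1}{2}\eta^2 + \beta\rho\eta\sigma + \tfrac{1}{2}\beta(\beta - 1)\sigma^2$ and rewrite (2.3) as

$$G_t + \sup_{h}\left\{\tfrac{1}{2}\left[h^2\eta^2 v^2 G_{vv} + \eta^2 G_{xx} + 2h\eta^2 v G_{vx}\right] + \left[hb + r\right] v G_v - k(x - \theta) G_x\right\} = 0 \tag{2.4}$$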

The first-order condition for the maximization in (2.4) is

$$h^* \eta^2 v G_{vv} + \eta^2 G_{vx} + b G_v = 0 \tag{2.5}$$

Assuming $G_{vv} < 0$, the first-order condition (2.5) is also sufficient, yielding

$$h^* = -\frac{\eta^2 G_{vx} + b G_v}{\eta^2 v G_{vv}} \tag{2.6}$$

Plugging (2.6) back into (2.4) yields

$$\eta^2 G_t G_{vv} - \tfrac{1}{2}\eta^4 G_{vx}^2 - \tfrac{1}{2}b^2 G_v^2 - b\eta^2 G_v G_{vx} + \tfrac{1}{2}\eta^4 G_{vv} G_{xx} + r\eta^2 v G_v G_{vv} - k(x - \theta)\eta^2 G_x G_{vv} = 0 \tag{2.7}$$

Thus, we must solve the partial differential equation (2.7) in order to determine an optimal strategy.

To obtain a closed-form solution, we consider the following separation ansatz, motivated by [13] where a different portfolio optimization problem under Vasicek [18] term structure dynamics was solved:

$$G(t, v, x) = f(t, x)\,v^{\gamma}$$

with the condition that $f(T, x) = 1$.

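For this choice of ansatz, (2.7) becomes

$$(\gamma - 1)\eta^2 f f_t - \tfrac{1}{2}\gamma\eta^4 f_x^2 - \tfrac{1}{2}\gamma b^2 f^2 - \tfrac{1}{2}\gamma\eta^4 f f_x - \gamma\beta\rho\sigma\eta^3 f f_x - \tfrac{1}{2}\gamma\beta(\beta - 1)\sigma^2\eta^2 f f_x + \tfrac{1}{2}(\gamma - 1)\eta^4 f f_{xx} + \gamma(\gamma - 1) r \eta^2 f^2 + k(x - \theta)\eta^2 f f_x = 0 \tag{2.8}$$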
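The substitution that leads from (2.7) to (2.8) uses the following partial derivatives of the ansatz $G(t, v, x) = f(t, x)\,v^{\gamma}$, listed here as a worked intermediate step:

$$G_t = f_t v^{\gamma}, \quad G_x = f_x v^{\gamma}, \quad G_{xx} = f_{xx} v^{\gamma}, \quad G_v = \gamma f v^{\gamma - 1}, \quad G_{vv} = \gamma(\gamma - 1) f v^{\gamma - 2}, \quad G_{vx} = \gamma f_x v^{\gamma - 1}$$

Every term of (2.7) then carries the common factor $\gamma v^{2\gamma - 2}$, which can be divided out. The mixed term $-b\eta^2 G_v G_{vx}$ becomes $-\gamma b \eta^2 f f_x$; expanding $b$ and combining its part $-\gamma k(\theta - x)\eta^2 f f_x$ with $-(\gamma - 1)k(x - \theta)\eta^2 f f_x$ gives the single term $k(x - \theta)\eta^2 f f_x$, since

$$\gamma k(x - \theta) - (\gamma - 1)k(x - \theta) = k(x - \theta)$$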

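We then use the following ansatz for $f(t, x)$:

$$f(t, x) = g(t)\exp\left(x B(t) + x^2 A(t)\right)$$

with $g(T) = 1$, $B(T) = 0$, $A(T) = 0$.

Plugging the ansatz into (2.8) and setting the coefficient of $x^2$ to zero yields an ordinary differential equation for $A(t)$:

$$\left[(\gamma - 1)\eta^2\right] A' - \left[2\eta^4\right] A^2 + \left[2k\eta^2\right] A - \tfrac{1}{2}\gamma k^2 = 0 \tag{2.9}$$

Similarly, setting the coefficient of $x$ in (2.8) to zero yields an ordinary differential equation for $B(t)$:

$$\left[(\gamma - 1)\eta^2\right] B' + \left[k\eta^2 - 2\eta^4 A\right] B + \left[-\gamma\eta^4 A - 2\gamma\beta\rho\sigma\eta^3 A - \gamma\beta(\beta - 1)\sigma^2\eta^2 A - 2k\theta\eta^2 A + \gamma k^2\theta + \tfrac{1}{2}\gamma k\eta^2 + \gamma\beta k\rho\eta\sigma + \tfrac{1}{2}\gamma k\beta(\beta - 1)\sigma^2\right] = 0 \tag{2.10}$$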

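Noting that (2.9) is a Riccati equation for $A(t)$, and (2.10) is a first-order linear ordinary differential equation for $B(t)$, one may obtain the solution in closed form as

$$A(t) = \frac{k\left(1 - \sqrt{1 - \gamma}\right)}{2\eta^2}\left\{1 + \frac{2\sqrt{1 - \gamma}}{1 - \sqrt{1 - \gamma} - \left(1 + \sqrt{1 - \gamma}\right)\exp\left(\frac{2k(T - t)}{\sqrt{1 - \gamma}}\right)}\right\} \tag{2.11}$$

$$B(t) = \frac{\gamma\sqrt{1 - \gamma}\left(\eta^2 + 2\beta\rho\sigma\eta + \beta(\beta - 1)\sigma^2\right)\left[1 - \exp\left(\frac{k(T - t)}{\sqrt{1 - \gamma}}\right)\right]^2 - \gamma\left(\eta^2 + 2\beta\rho\sigma\eta + \beta(\beta - 1)\sigma^2 + 2k\theta\right)\left[1 - \exp\left(\frac{2k(T - t)}{\sqrt{1 - \gamma}}\right)\right]}{2\eta^2\left[\left(1 - \sqrt{1 - \gamma}\right) - \left(1 + \sqrt{1 - \gamma}\right)\exp\left(\frac{2k(T - t)}{\sqrt{1 - \gamma}}\right)\right]} \tag{2.12}$$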

Consequently, the optimal weight $h^*(t)$ can be obtained via (2.6):

$$h^*(t, x) = \frac{1}{1 - \gamma}\left[B(t) + 2A(t)x - \frac{k(x - \theta)}{\eta^2} + \frac{\beta\rho\sigma}{\eta} + \frac{\beta(\beta - 1)\sigma^2}{2\eta^2} + \frac{1}{2}\right] \tag{2.13}$$

With the above closed-form solution in hand, we find that, as in the previous section, the optimal weight $h^*(t)$ is linear in $x$ and depends on the distance $x - \theta$ between the spread process and its long-run mean.
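The closed-form weight can be evaluated numerically as sketched below. The code is a hedged illustration: the function names and all parameter values are assumptions made for the example, time is expressed as time to maturity `tau = T - t`, and the squared bracket in `coef_b` uses the exponent `k * tau / sqrt(1 - gamma)`, which is the form that satisfies the linear ordinary differential equation for B(t) with B(T) = 0.

```python
import math


def coef_a(tau, k, eta, gamma):
    """A as a function of time to maturity tau = T - t."""
    s = math.sqrt(1.0 - gamma)
    e2 = math.exp(2.0 * k * tau / s)
    return k * (1.0 - s) / (2.0 * eta ** 2) * (1.0 + 2.0 * s / ((1.0 - s) - (1.0 + s) * e2))


def coef_b(tau, k, theta, eta, sigma, rho, beta, gamma):
    """B as a function of time to maturity tau = T - t."""
    s = math.sqrt(1.0 - gamma)
    e1 = math.exp(k * tau / s)
    e2 = e1 * e1
    c2 = eta ** 2 + 2.0 * beta * rho * sigma * eta + beta * (beta - 1.0) * sigma ** 2
    num = gamma * s * c2 * (1.0 - e1) ** 2 - gamma * (c2 + 2.0 * k * theta) * (1.0 - e2)
    den = 2.0 * eta ** 2 * ((1.0 - s) - (1.0 + s) * e2)
    return num / den


def optimal_weight(tau, x, k, theta, eta, sigma, rho, beta, gamma):
    """Optimal weight h*(t, x) on the first stock, given the current spread x."""
    a = coef_a(tau, k, eta, gamma)
    b = coef_b(tau, k, theta, eta, sigma, rho, beta, gamma)
    static = (-k * (x - theta) / eta ** 2 + beta * rho * sigma / eta
              + beta * (beta - 1.0) * sigma ** 2 / (2.0 * eta ** 2) + 0.5)
    return (b + 2.0 * a * x + static) / (1.0 - gamma)


# Illustrative (assumed) parameter values
params = dict(k=1.0, theta=0.1, eta=0.2, sigma=0.3, rho=0.5, beta=1.5, gamma=-1.0)
for spread in (0.0, 0.1, 0.2):
    print(spread, round(optimal_weight(0.5, spread, **params), 4))
```

With these values the weight on the first stock decreases as the spread rises above its long-run mean, which matches the intuition of selling the spread when it is high.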

The term $X_t - \theta$ in the optimal weight $h^*(t)$ plays the same role as the term $\sum_{p=1}^{\infty} Z_{t-p}$ found in the standard cointegration strategy; we therefore find that both approaches are consistent.

2.4 Fundamental analysis

Fundamental analysis of a business involves analyzing its financial data to gain insight into whether it is overvalued or undervalued. This is done by analyzing historical and present economic data in order to produce a financial forecast of the business. The intrinsic value of the business is found through a fundamental analysis, which consists of three main steps: (I) economic analysis, (II) industry analysis and (III) company analysis. If the intrinsic value is higher than the market price, it is recommended to buy the stock; if it is equal to the market price, it is best to hold; and if it is lower than the market price, it is a selling signal. Fundamental analysis maintains that markets may misprice an asset in the short run but that the "correct" price will eventually be reached. Profits can be made by trading the mispriced security and then waiting for the market to recognize its "mistake" and reprice the security.

In this section we study an example of a fundamentally driven strategy, using the market reaction to a stock being dropped from or added to the MSCI World Standard as a signal for a pair trading strategy on those stocks once their inclusion/exclusion has been made effective.

Both FTSE and MSCI have their own set of criteria for including stocks in their respective indices. These criteria include (but are not limited to) size, liquidity, free float and trade history. Stocks not passing through these filters are not eligible to be part of the index.

Since so much money throughout the world is passively managed (and therefore needs to closely replicate the performance of the benchmarked index), it is reasonable to assume that changes in index constituents can drive huge flows in and out of the stocks in play.

The way we constructed this fundamentally driven strategy was to look to "buy" stocks that are announced to be included in the MSCI ACWI and "sell" stocks that are announced to be dropped from the benchmark.

Figure 2.1: MSCI Rebalance Profit and Loss

The strategy sounds simple enough to track; however, there are a few practical barriers to testing this hypothesis that we had to consider. In an ideal world, to extract maximum benefit from the announcement, we would like to buy (sell) the additions (deletions) on the night the reviews are announced (or develop a strategy to pre-empt those announcements) and exit our positions at the close of business on the day when the changes become effective. The first drawback is that, since the reviews are not made public until the markets have ceased trading for the day, it is impossible to take positions at the closing levels of the day unless we already know what changes the announcements contain.

We now go about testing the "Index effect" strategy historically. As mentioned earlier, our interest covers "announcement" dates as well as "effective" dates. For this analysis, we therefore decided to go long stocks that have been announced as soon to be added to the MSCI Europe Index and to go short stocks that have been announced as soon to be deleted from the Index.

We go long (short) at the close of the day following the index change announcement and keep positions until the close of the effective date. At the close of the effective date (i.e. when all index changes take place), we square off our positions and wait for the next quarterly review.

Figure 2.2: MSCI longshort

We see that following the announcement of its addition to the benchmark index, a stock's performance is usually positive on both an absolute and a relative basis. Conversely, an announcement of a stock's deletion from the benchmark index is usually a negative trigger, on both an absolute and a relative basis.

In this study, this fundamentally driven strategy generated some impressive returns: 12.3% annualised with a Sharpe ratio of 0.7 (before commissions, borrowing fees and transaction costs).


CHAPTER 3

Strategies Analysis

3.1 Road map for strategy design

In this section we provide a road map for the design and analysis of the pairs trading strategy. The steps involved are as follows:

1. Identify stock pairs that could potentially be cointegrated. This process can be based on the stock fundamentals or alternatively on a purely statistical approach based on historical data.

2. Once the potential pairs are identified, we verify the proposed hypothesis that the stock pairs are indeed cointegrated based on statistical evidence from historical data. This involves determining the cointegration coefficient and examining the spread time series to ensure that it is stationary and mean reverting.

3. We then examine the cointegrated pairs to determine the delta. A feasible delta that can be traded on will be substantially greater than the slippage encountered due to the bid-ask spreads in the stocks.

3.2 Identification of potential pairs

The challenge in this strategy is identifying stocks that tend to move together and therefore make potential pairs. Our aim is to identify pairs of stocks with mean-reverting relative prices. To find out whether the relative price of two stocks is mean-reverting, we conduct the Dickey-Fuller test on the log-ratio of the pair.

A Dickey-Fuller test consists in determining whether the log-ratio $x_t = \log S_t^1 - \beta \log S_t^2$ of the share prices $S_t^1$ and $S_t^2$ is indeed stationary.


The critical values of the cointegration test depend on the number of observations, so we have to compute our own critical values for the Dickey-Fuller test for cointegration. The procedure is as follows:

1. We simulate two time series of $T$ error terms $(\varepsilon_t^{(i)}, \eta_t^{(i)})$, $t = 1, \dots, T$, distributed as two independent $N(0, 1)$ variables, and the associated independent random walks

$$p_t^{(i)} = p_{t-1}^{(i)} + \varepsilon_t^{(i)} \quad \text{and} \quad d_t^{(i)} = d_{t-1}^{(i)} + \eta_t^{(i)}$$

2. We estimate by regression the relation between the two time series, $p_t^{(i)} = a + b\,d_t^{(i)} + z_t^{(i)}$. Under the null of no cointegration, the residual series $z_t^{(i)}$ should be non-stationary. We therefore perform a standard Dickey-Fuller test on $z_t^{(i)}$.

3. We fit an AR(1) model for the residuals, under the alternative hypothesis, i.e.

$$\Delta z_t^{(i)} = \alpha^{(i)} + \beta^{(i)} z_{t-1}^{(i)} + u_t^{(i)}$$

and compute the t-stat for $\beta^{(i)}$, denoted $t(\beta^{(i)})$.

4. Then the quantiles at 10%, 5% and 1% of the distribution of $t(\beta^{(i)})$ give the 10%, 5% and 1% critical values for the Dickey-Fuller test for cointegration. For $T = 100$, the critical values are -3.07 at 10%, -3.37 at 5% and -3.96 at 1%.
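The four steps above can be sketched as a small Monte Carlo routine. The code is an illustration under assumptions: the function names, the number of simulations and the seed are hypothetical, the regressions are ordinary least squares with an intercept, and the empirical quantile is taken as a simple order statistic. With a modest number of simulations the output will only approximate the critical values quoted in the text.

```python
import random


def ols_with_tstat(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, t-statistic of the slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    return intercept, slope, slope / std_err


def simulated_critical_values(n_obs, n_sims, levels=(0.10, 0.05, 0.01), seed=0):
    """Empirical quantiles of the Dickey-Fuller t-stat under the null of no cointegration."""
    rng = random.Random(seed)
    t_stats = []
    for _ in range(n_sims):
        # Step 1: two independent random walks
        p, d, p_path, d_path = 0.0, 0.0, [], []
        for _ in range(n_obs):
            p += rng.gauss(0.0, 1.0)
            d += rng.gauss(0.0, 1.0)
            p_path.append(p)
            d_path.append(d)
        # Step 2: regression of one walk on the other, keep the residuals
        a, b, _ = ols_with_tstat(d_path, p_path)
        z = [pi - a - b * di for pi, di in zip(p_path, d_path)]
        # Step 3: regression of the residual increments on the lagged residuals
        dz = [z[t] - z[t - 1] for t in range(1, n_obs)]
        _, _, t_beta = ols_with_tstat(z[:-1], dz)
        t_stats.append(t_beta)
    # Step 4: lower-tail quantiles of the simulated t-statistics
    t_stats.sort()
    return {level: t_stats[int(level * n_sims)] for level in levels}


critical_values = simulated_critical_values(n_obs=100, n_sims=500)
print(critical_values)
```

Increasing `n_sims` reduces the sampling noise of the estimated quantiles at the cost of a longer run time.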

3.3 Testing cointegration

We now test the null hypothesis that stock prices are cointegrated. We proceed as follows.

1. Estimate the regression for the pair s1t

= logS

1t

, s2t

= logS

2t

:

s

1t

= a+ bs

2t

+ x

t

2. Use the Dickey-Fuller test for testing the null of unit root in x

t

. So, estimate theregression

�x

t

= ↵ + �x

t�1 + u

(i)t

and test the null hypothesis H0 : � = 0 using the corresponding t-stat. Use the criticalvalues computed previously.

In other words, we are regressing $\Delta x_t$ on lagged values of $x_t$. The null hypothesis is that $\beta = 0$, which means that the process is not mean-reverting. If the null hypothesis can be rejected at the 99% confidence level, the spread follows a weakly stationary process and is thereby mean-reverting. Research has shown that if the confidence level is relaxed, the pairs do not mean-revert well enough to generate satisfactory returns. Identifying the pairs therefore requires a very large number of regressions: with 200 stocks there are 200 × 199 / 2 = 19,900 candidate pairs, hence 19,900 regressions to run, which is quite time consuming.
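As a rough sketch, the two-step test and the pair count can be written as follows in Python with NumPy. The helper names are hypothetical, the price series are synthetic (built to be cointegrated, not market data), and the critical value -3.96 is the T = 100 figure from Section 3.2, used here only for illustration; in practice it should be recomputed for the actual sample length.

```python
from math import comb
import numpy as np

def ols(y, x):
    """OLS of y on [1, x]; returns (coefficients, residuals, slope t-statistic)."""
    X = np.column_stack([np.ones(len(x)), x])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)
    se_slope = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef, resid, coef[1] / se_slope

def cointegration_test(S1, S2, critical_value):
    """Step 1: s1 = a + b*s2 + x.  Step 2: DF regression on x, compare t(beta)."""
    s1, s2 = np.log(S1), np.log(S2)
    (a, b), x, _ = ols(s1, s2)
    _, _, t_beta = ols(np.diff(x), x[:-1])
    return a, b, t_beta, t_beta < critical_value   # True means "reject no cointegration"

# Synthetic example (assumption): s2 is a random walk, the spread is a stationary AR(1).
rng = np.random.default_rng(0)
T = 500
s2 = np.log(50.0) + np.cumsum(0.01 * rng.standard_normal(T))
spread = np.zeros(T)
for t in range(1, T):
    spread[t] = 0.5 * spread[t - 1] + 0.01 * rng.standard_normal()
S2 = np.exp(s2)
S1 = np.exp(0.5 + 1.2 * s2 + spread)

a, b, t_beta, reject = cointegration_test(S1, S2, critical_value=-3.96)
n_pairs = comb(200, 2)   # number of candidate pairs among 200 stocks
```

Looping `cointegration_test` over all `n_pairs` combinations is what makes the screening step expensive.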

3.4 Risk control and feasibility

As already mentioned, this strategy in theory almost totally avoids systematic market risk. Some market risk exposure remains because a minor beta spread is allowed for. Industry risk can also be eliminated if we invest in pairs belonging to the same industry.

The main risk we are exposed to is then the risk of stock-specific events, that is, the risk of fundamental changes implying that the prices may never mean-revert again, or at least not within the holding period. In order to control for this risk we use the rules of stop-loss and maximum holding period.

We now study a simple trading strategy to assess the feasibility of such trades. Given that stocks have a bid-ask spread, we incur trading slippage every time a trade is executed. Reducing the trading frequency reduces the effect of this slippage. Let us therefore consider the strategy where trades are put on and unwound on a deviation of δ in either direction from the long-run equilibrium m. We buy the portfolio (long S1 and short S2) when the time series is δ below the mean, and sell the portfolio (sell S1 and buy S2) when the time series is δ above the mean, i time steps later.
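The threshold rule just described can be sketched in a few lines of Python. The function name and the +1 / -1 / 0 position encoding are assumptions made for this illustration; the sketch also assumes that selling the portfolio at the upper band opens the opposite position, so that the trade alternates between the two bands.

```python
def threshold_positions(spread, mean, delta):
    """Position in the portfolio (long S1, short S2) at each time step.

    +1: long the portfolio, -1: short the portfolio, 0: no trade yet.
    A position is held until the spread reaches the opposite band.
    """
    position = 0
    positions = []
    for x in spread:
        if x <= mean - delta:
            position = 1       # spread is delta below the mean: buy the portfolio
        elif x >= mean + delta:
            position = -1      # spread is delta above the mean: sell the portfolio
        positions.append(position)
    return positions

# Hypothetical spread path around a long-run mean of 0 with delta = 0.045
example = threshold_positions([0.0, -0.05, 0.0, 0.05, 0.0, -0.05], mean=0.0, delta=0.045)
```

Because positions only change when a band is hit, the number of trades (and hence the slippage paid) falls as δ is widened.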

The profit on the trade is the incremental change in the spread, 2δ. Consider two stocks S1 and S2 that are cointegrated with the following data:

– Cointegration ratio γ = 1.5
– Delta used for trade signal δ = 0.045
– Bid price of S1 at time t = $19.50
– Ask price of S2 at time t = $7.46
– Ask price of S1 at time t + i = $20.10
– Bid price of S2 at time t + i = $7.17
– Average bid-ask spread for S1 = 0.0005 (5 basis points)
– Average bid-ask spread for S2 = 0.0010 (10 basis points)

We first examine if trading is feasible given the average bid-ask spreads.

– Average trading slippage = 0.0005 + 1.5 * 0.0010 = 0.002 (20 basis points). This is smaller than the delta value of 0.045. Trading is therefore feasible.

At time t, buy shares of S1 and short shares of S2 in the ratio 1:1.5.

– Spread at time t = log(19.50) - 1.5 * log(7.46) ≈ -0.044, i.e. about δ below the mean

At time t + i, sell the shares of S1 and buy back the shares of S2.

– Spread at time t + i = log(20.10) - 1.5 * log(7.17) ≈ 0.046, i.e. about δ above the mean
– Total return = return on S1 + γ * return on the short S2 position
  = log(20.10) - log(19.50) + 1.5 * (log(7.46) - log(7.17))
  = 0.03 + 1.5 * 0.04
  = 0.09 (9 percent)
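The arithmetic of this example can be checked with a short Python script. The variable names are chosen for this illustration only; the numbers are those listed above, and the logarithm is the natural logarithm.

```python
from math import log

gamma = 1.5                       # cointegration ratio
delta = 0.045                     # trade signal threshold
s1_entry, s2_entry = 19.50, 7.46  # prices at time t
s1_exit, s2_exit = 20.10, 7.17    # prices at time t + i
ba_s1, ba_s2 = 0.0005, 0.0010     # average bid-ask spreads (as fractions)

# Feasibility: slippage of one unit of the portfolio versus the signal size
slippage = ba_s1 + gamma * ba_s2
feasible = slippage < delta

# Spread at entry and exit
spread_entry = log(s1_entry) - gamma * log(s2_entry)
spread_exit = log(s1_exit) - gamma * log(s2_exit)

# Log return on the long S1 leg plus gamma times the log return on the short S2 leg
total_return = (log(s1_exit) - log(s1_entry)) + gamma * (log(s2_entry) - log(s2_exit))

print(f"slippage={slippage:.4f} feasible={feasible}")
print(f"spread_entry={spread_entry:.4f} spread_exit={spread_exit:.4f}")
print(f"total_return={total_return:.4f}")
```

The total return equals the change in the spread between entry and exit, which is why it comes out close to 2δ = 0.09.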


CHAPTER 4

Results

We provide an example to illustrate the stochastic control trading strategy. On September 17, 2012, we collected minute-by-minute data on two stocks traded on the New York Stock Exchange: Goldman Sachs Group, with ticker symbol GS, and JPMorgan Chase and Company, with ticker symbol JPM. This gives us a 2-dimensional time series with 444 data points. We test for co-integration and estimate the parameters in our co-integration model.

We obtain, via Monte Carlo simulation with T=444 and N=10000, the following empirical distribution of $t(\beta)$.

Mean      Std      Skewness   Kurtosis   q10%      q5%       q1%
-2.0322   0.8159   0.21331    3.5245     -3.0345   -3.3405   -3.9153

Table 4.1: Statistics of $t(\beta)$ for the pair JP/GS

For T=444, we obtain the following histogram of $t(\beta)$:

Critical values for the Dickey-Fuller test for cointegration correspond to the quantiles at 10%, 5% and 1% of our empirical distribution of $t(\beta)$. We get q10% = -3.0345, q5% = -3.3405, q1% = -3.9153.

$\alpha$     $\beta$    $t(\beta)$
-0.3485   1.3739   -3.9247

Table 4.2: Results of the regression $\log(GS) = \alpha + \beta \log(JP)$

First, as $t(\beta) < q_{1\%}$, we can assume the residual spread to be stationary, and thus we reject the null of no cointegration: the data are co-integrated at the 99% confidence level. Indeed, the ACF for the spread $X_t$ is typical of an AR process:


Figure 4.1: Empirical distribution of t(�)

Figure 4.2: Correlogram of Residual Spread

A plot of the two cointegrated series is shown in Figure 4.3.

Now we estimate mean reversion behaviour in the pair of stocks GS/JP. First, the estimated coefficients are significant for this pair, supporting the Vasicek model of mean reversion in the residual spread. Table 4.3 reports estimation results.

$\theta$    $k$       $t(\eta)$
-0.0009   6.3665   0.0433

Table 4.3: Estimation of $dX_t = k(\theta - X_t)\,dt + \eta\, dW_X(t)$

Figure 4.3: GS/JP cointegrated Time Series

Second, the level of mean reversion is strong, reflected by the large value of $k$, around 6.4. This value is also captured visually in the graphs, where the estimated state is shown to quickly revert to its mean. The implication is twofold. On the one hand, mean reversion is ample, hence the non-convergence risk is mitigated. On the other, it may be too strong, such that profit opportunities are quick to vanish for those selected pairs. Third, the estimate of $\theta$ is not zero, albeit close to zero. This suggests there remains some residual risk over and above the beta risk.
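One common way to obtain $k$, $\theta$ and $\eta$ from a discretely sampled spread is to fit the AR(1) regression implied by the exact discretisation of the mean-reverting model. The sketch below shows this on simulated data; it is an assumption-laden illustration and not necessarily the estimation procedure used for Table 4.3. The function names, the time step `dt` and the simulation parameters are hypothetical, and the fit assumes an estimated AR(1) coefficient strictly between 0 and 1.

```python
import numpy as np

def simulate_ou(k, theta, eta, dt, n, x0=0.0, seed=0):
    """Exact simulation of dX = k(theta - X)dt + eta dW on a grid of step dt."""
    rng = np.random.default_rng(seed)
    b = np.exp(-k * dt)                              # AR(1) coefficient
    sd = eta * np.sqrt((1.0 - b**2) / (2.0 * k))     # innovation std over one step
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = theta + b * (x[t - 1] - theta) + sd * rng.standard_normal()
    return x

def fit_ou(x, dt):
    """Recover (k, theta, eta) from the regression x_{t+1} = a + b * x_t + e_t."""
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    (a, b), *_ = np.linalg.lstsq(X, x[1:], rcond=None)
    resid = x[1:] - (a + b * x[:-1])
    k = -np.log(b) / dt                              # speed of mean reversion
    theta = a / (1.0 - b)                            # long-run mean
    eta = resid.std(ddof=2) * np.sqrt(2.0 * k / (1.0 - b**2))
    return k, theta, eta

# Hypothetical parameters, chosen only to exercise the estimator
path = simulate_ou(k=5.0, theta=0.0, eta=0.05, dt=0.01, n=20000, seed=1)
k_hat, theta_hat, eta_hat = fit_ou(path, dt=0.01)
```

The estimate of $k$ is the least precise of the three on short samples, which is worth keeping in mind when reading a value such as 6.4 obtained from a single trading day.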

Figure 4.4 plots the estimated AR(1) residual spread as implied from the observed return differential.

Figure 4.4: Estimation of Residual Spread

For the purpose of illustration, we plot our stock prices and the optimal policies for a whole trading day. More precisely, we show the stock prices $S^1$, $S^2$, as well as the optimal policies $\pi_1 = h^*$ and $\pi_2 = -\beta h^*$.

We then present, in Figure 4.6, the cumulative Profit and Loss function.

Figure 4.5: Stocks and optimal policies

Figure 4.6: Profit and Loss processes

As expected in a pairs trading setting, the controls are opposite in sign. We also notice that the positions, which are large during the first half of the day, are both progressively unwound in the second half, ending close to 0 by the end of the trading day. For this data set, a significant profit is instantly realized because of the leverage. The profit then fluctuates throughout the day but remains strongly positive, and by the end of the day it is approximately $1,348. Repeating this strategy on a daily basis gives a Sharpe Ratio of 1.15 and an annual return of 15.48%. Of course, these figures do not take into account the cost of borrowing or transaction costs, which are both assumed to be 0 in this model. This is unrealistic: in practice those costs can wipe out all the gains, since the pair is reweighted every minute. To avoid the slippage, one solution is to enter the trade only when the spread is wide enough, around the earnings announcement of one of the constituents of the pair.

Annual Return    15.48%
Standard Dev.    13.46%
Sharpe Ratio     1.15
Sortino Ratio    0.89

Table 4.4: Performance Measures
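For reference, performance measures of this kind can be computed from a series of daily returns as sketched below. The definitions are assumptions made for this illustration (252 trading days per year, zero risk-free rate, downside deviation measured against a zero target); the report does not state which conventions were used, so the sketch is not guaranteed to reproduce the figures of Table 4.4. The reported Sharpe ratio is at least consistent with a zero risk-free rate, since 15.48% / 13.46% ≈ 1.15.

```python
import numpy as np

def performance(daily_returns, periods_per_year=252, target=0.0):
    """Annualised return, volatility, Sharpe and Sortino ratios (risk-free rate = 0)."""
    r = np.asarray(daily_returns, dtype=float)
    annual_return = r.mean() * periods_per_year
    annual_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    # Downside deviation: root mean square of returns below the target
    downside = np.minimum(r - target, 0.0)
    downside_dev = np.sqrt(np.mean(downside**2)) * np.sqrt(periods_per_year)
    return {
        "annual_return": annual_return,
        "annual_vol": annual_vol,
        "downside_dev": downside_dev,
        "sharpe": annual_return / annual_vol,
        "sortino": annual_return / downside_dev,
    }

# Hypothetical daily returns, for illustration only
measures = performance([0.01, -0.005, 0.02, -0.01, 0.005])
```

Applied to the daily P&L of the strategy divided by the capital committed, the same function gives the inputs needed for a table such as Table 4.4.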


Conclusion and outlook

In this report, we studied two quantitative approaches to the problem of pairs trading. The first one formulated the problem of optimal trading of pairs as a stochastic control problem, for which we were able to derive a closed-form solution. The second one relied on the properties of co-integrated financial time series to construct a strategy with a theoretically positive expected return. This study was performed to show that the two approaches are equivalent, in the sense that the portfolio weight in both cases depends only on the distance between the difference in the logarithms of the prices and its long-run mean.

The applicability of the method is illustrated with minute-by-minute historical stock data. In the model, the two stock processes are co-integrated, correlated, and have constant volatility, and we ignore the costs associated with trading. The simplicity of the present formulation enables a feasible implementation of parameter calibration and the derivation of analytical formulae for the optimal trading strategies.

Further work could address the slippage issue. One solution could be to enter the trade only when the spread is wide enough, and possibly to mix this strategy with a fundamentally driven one using the MSCI rebalance signal.


Bibliography

[1] Alexander, C., Giblin, I. and W. Weddington, Cointegration and asset allocation: A new active hedge fund strategy, ISMA Centre Discussion Papers in Finance Series.

[2] Brockwell, P.J., and Davis, R.A., Introduction to Time Series and Forecasting, second edition, Springer-Verlag, New York, 2002.

[3] Brown, D. and R. Jennings, On technical analysis, The Review of Financial Studies, 527-551, 1989.

[4] Casella, G. and R. L. Berger, Statistical inference, 2012.

[5] Chen, Z. and P. Knez, Measurement of market integration and arbitrage, The Review of Financial Studies 8, 287-325, 1995.

[6] Do, B., Faff, R., and K. Hamza, A new approach to modeling and estimation for pairs trading, In Proceedings of 2006 Financial Management Association European Conference, Stockholm, June 2006.

[7] Duan, J.C., and S. Pliska, Option valuation with co-integrated asset prices, Journal of Economic Dynamics and Control, 754, 2004.

[8] Elliott, R. J., van der Hoek, J., and W. P. Malcolm, Pairs trading, Quantitative Finance, 271-276, 2005.

[9] Engle, R. F. and C. W. J. Granger, Long-run economic relationships, readings in cointegration, Oxford University Press, 1991.

[10] Galenko, A., Popova, E. and I. Popova, Trading in the Presence of Cointegration, Operations Research and Industrial Engineering, 78712.

[11] Gatev, E., Goetzmann, W. N., and K. G. Rouwenhorst, Pairs Trading: Performance of a Relative-Value Arbitrage Rule, Review of Financial Studies, 797-827, 2006.

[12] Johansen, S., Likelihood-based inference in cointegrated vector autoregressive models, Oxford University Press, 1995.

[13] Korn, R., and H. Kraft, A Stochastic Control Approach to Portfolio Problems with Stochastic Interest Rates, SIAM Journal on Control and Optimization, 1250-1269, 2002.

[14] Mudchanatongsuk, S., Primbs, J. A., and W. Wong, Optimal pairs trading: A stochastic control approach, Proceedings of the American Control Conference, 1035-1039, 2008.

[15] Dos Passos, W., Numerical methods, algorithms, and tools in C#, CRC Press, 2010.

[16] Phillips, P.C.B., and S. Ouliaris, Asymptotic properties of residual based tests for cointegration, Econometrica, 165-193, 1990.

[17] Tsay, R. S., Analysis of financial time series, Wiley, 2005.

[18] Vasicek, O., An Equilibrium Characterization of the Term Structure, Journal of Financial Economics, 177-188, 1977.
