Dynamic Cointegration Based Pairs Trading

Mads Hofstedt Jæ[email protected]

January 15, 2016

This article implements the pairs trading algorithm of Gatev et al. (2006) on recent data on the S&P 500 Index. The finding is that this version of the pairs trading strategy has become unprofitable. Several authors have suggested applying cointegration to pairs trading. Employing cointegration in pairing and using the cointegrating regression as the basis for the trading rule in the Gatev et al. (2006) algorithm does not alleviate the recent negative performance. A new pairs trading strategy is suggested whose essence is to dynamically update the equilibrium model, capturing fundamental equilibrium changes. Portfolio selection of pairs is based on trend following, allocating to pairs with the strongest recent backtest performance. This strategy has had an average annual excess return of 17%, but returns are decreasing in recent years. Many of the tested portfolios have been unprofitable since 2010.

University: Copenhagen Business School
Master thesis: Advanced Economics and Finance (cand.oecon)
Advisor: PhD David Glavind Skovmand, Department of Finance
Pages, all inclusive: 81
Symbol count, applicable content: < 150,000

Contents

Notation

Pairs Trading

1 Introduction

2 Literature
  2.1 Theories
  2.2 Profitability
  2.3 Methods

3 Data

4 Review of the Distance Approach
  4.1 Ranking by Euclidean Distance
  4.2 Generating Trading Signals and Consolidating
  4.3 Calculating Returns
    4.3.1 The Margin Account and Leverage
    4.3.2 Funding Costs
    4.3.3 Transaction Costs
  4.4 Performance of the Distance Approach
    4.4.1 Revisiting Return Calculations

5 Review of the Cointegration Approach
  5.1 Pairing
    5.1.1 Engle & Granger Approach
    5.1.2 Phillips & Ouliaris
    5.1.3 Johansen Approach
    5.1.4 Test Selection
    5.1.5 Results of Pairing
  5.2 Performance of the Cointegration Approach
  5.3 Comparison to the Literature

6 A New Trading Model
  6.1 The Continuous Cointegration Model
  6.2 Pairing
  6.3 Portfolio Selection
  6.4 Performance of Short Term Trend Following
    6.4.1 Long Only
    6.4.2 Hedging by SPX Futures

7 Conclusion

Appendix

8 Literature Overview

9 Return Calculations

10 Summary Statistic Tables

11 Cluster Variables

12 Distance Approach - Long Only

13 Cointegration Approach Hedging by SPX Futures

List of Figures

4.1 Example of the distance approach
4.2 Performance of the distance approach
5.1 Performance of the cointegration approach
6.1 Example of a new trading model
6.2 Performance of various selection methods
6.3 Performance of trend following pairs
6.4 Performance of trend following pairs, long only
6.5 Performance of trend following pairs hedging by SPX futures
12.1 Performance of the distance approach - long only
13.1 Performance of the cointegration approach hedging by SPX futures

List of Tables

4.1 Example of consolidating multiple trading signals
4.2 Performance of the distance approach, various cost applications
4.3 Distance approach summary statistics
4.4 Distance approach pair summary statistics
4.5 Average biases of simple return calculations
5.2 Selection algorithm for cointegration testing
5.3 Cointegration pairing statistics
5.4 Performance of the cointegration approach, all applications
5.5 Cointegration approach summary statistics
5.6 Cointegration approach trading statistics
5.7 Pair summary statistics: Trace and Eigen
5.8 Pair summary statistics: PO and ADF
6.1 Performance of various portfolio selection methods
6.2 Performance of trend following pairs
6.3 Trading statistics of trend following pairs
6.4 Pair trade statistics LT65 and TF65
6.5 Regression statistics LT65 and TF65
6.6 Performance of trend following pairs, long only
6.7 Performance of trend following pairs hedging by SPX futures
11.1 Cluster methods for pairing
12.1 Distance approach summary statistics - long only
13.1 Cointegration approach, hedging by SPX futures

Notation

List of Symbols

$P_{it}$: Closing price of security $i$ at time point $t$
$r_{it} = \frac{P_{it}}{P_{i,\max(\mathrm{index}(i)<t)}} - 1$: Return of issue $i$ at $t$ evaluated to last observed price
$\tau$: A day of pairing
$L$: A scalar of instantaneous leverage
$PL_t$: Profit and loss at $t$
$MAIN_{pct}$: Maintenance margin as percent of the contemporaneous market value
$b$: Buffer to maintenance margin
$MAIN_t$: Dollar amount indicating the minimal equity that must be held to maintain all positions active at $t$
$INT_{pct}$: Initial margin, percent of purchase value of long trades and the sales value of short trades
$INT_t$: Initial margin in dollars wired to the prime broker at $t$
$\omega_{i,t}$: Portfolio weight (a scalar $\in \mathbb{R}$) to the return of issue $i$ at $t$
$E_t$: Equity value in dollars at $t$
$MV_t$: Market value of active trades at $t$, both long and short
$r^f_t$: Risk free rate, the federal funds rate, yield
$FC_t$: Financing costs at $t$ in dollars
$TC_t$: Transaction costs at $t$ in dollars
$TC_{pct}$: Transaction costs in percent of turned over market value
$GL_t$: Gross leverage, the market value of long and short trades over equity
$pb$: Prime brokerage lending spread
$rebate$: Prime brokerage rebate spread
$r^{pb}_t = r^f_t + pb$: Prime brokerage lending rate on long positions
$r^{rebate}_t = r^f_t - rebate$: Prime brokerage rebate rate on margin loans
$LL_t$: Cash loan on long positions at $t$
$SP_t$: Short sale proceeds at $t$
$\mathbb{E}$: Expectations operator
$\Delta$: Difference operator: $\Delta X_t = X_t - X_{t-1}$
$\mathrm{tr}(\cdot)$: The trace of a matrix

List of Acronyms

ADR: American Depositary Receipt
APT: Arbitrage Pricing Theory
AR(q): Autoregressive process of q lags
EMH: Efficient market hypothesis
MVA: Minimum variance analysis
NASDAQ: National Association of Securities Dealers Automated Quotations
OTC: Over the counter
P&L: Profit and loss
SPX: S&P 500 index
VAR(q): Vector autoregressive process of q lags
VECM: Vector error correction model

List of Glossaries

Excess returns: A time point return of $r_t - r^f_t$.

Gross leverage: Market value of long positions and short positions divided by the equity value.

Initial margin: Dollars wired to the initial margin account. Dependent on the initial margin percent (INpct), which is the fraction of each opening trade that must be wired to the prime broker and not used as margin in new trades. An initial margin on a trade is released when the trade is terminated.

Instantaneous leverage: Scalar (L) multiplied onto an initial weight on a trading signal. If all weights in the portfolio are positive (absolute weights) and sum to one, then applying these weights yields a gross leverage of 1 (no leverage). Scaling the weights by the instantaneous leverage yields a gross leverage equal to the instantaneous leverage (L). The pairs trader often places equal weight on a long and a short position; the instantaneous leverage levers this trade. This differs from gross leverage, which is determined by the actual applied capital, dependent on the number of trading signals generated and on the drift in leverage.

Maintenance margin: The minimal equity value needed in order not to close a fraction of the active trades. The maintenance margin is a percent (MApct) of the contemporaneous market value of both long and short trades. If the equity value is less than what must be maintained, positions are closed.

Net leverage: Market value of long positions minus the market value of the short positions, divided by the equity value.

Sharpe ratio: The geometric annualized return of a return series divided by the annualized standard deviation of the series. Specifically, two Sharpe ratios are reported: (1) Sharpe(rf), which is the Sharpe ratio of an excess return series, and (2) Sharpe(0), which is the Sharpe ratio of non-excess returns. Returns in long/short trading strategies are often excess returns, as is the case in this paper; in order to calculate Sharpe(0) the risk free rate is added to the returns.

Pairs Trading

1 Introduction

Pairs trading is a long/short statistical arbitrage trading strategy betting on the relative pricing between two securities (Do et al., 2006). The strategy is simple: find two securities that move together, perhaps such that they form a stationary spread. When the spread deviates significantly from equilibrium, place a bet on reversal. This means shorting the relatively overpriced (outperforming) security and purchasing the relatively underpriced (underperforming) one. This kind of trading is an expectations arbitrage (Nath, 2003), such that the expected value of execution is strictly positive. This form of pairs trading resembles the one introduced by Gatev, Goetzmann, and Rouwenhorst in 1999, but other types of pairs trading are also known, such as Alsayed and McGroarty (2012), who trade ADRs against the underlying stocks, or Pedersen (2015), who describes pairs trading in dual listed shares, between share classes, and in other assets with identical underlying cash flows.

Pairs trading falls within the bucket of long/short equity. Long/short equity finds relatively overvalued securities for shorting and purchases relatively undervalued securities. This valuation is often done fundamentally, and the portfolio is balanced to keep beta neutrality, thus appropriately pairing the long and short sides. Some of these fundamental strategies are closer to pairs trading in that long/short positions are taken specifically within each sector, thus eliminating specific sector exposure. The pairs trading conducted in this paper differs from these two types of long/short equity strategies. First of all, it uses only the price history, relying on securities with the same factor exposure having about the same price pattern and thus being likely to be paired. As securities with similar price paths are paired, their common exposure is eliminated; thus each pair, and hence the portfolio, is in theory constructed neutral to any factor. In this regard no special portfolio construction is necessary to create market neutrality. Moreover, the strategy is not a fundamental quantitative strategy with long holding periods; it purely trades short term deviations. This kind of short term relative pricing is shown by Engelberg et al. (2009) and Jacobs and Weber (2015) to be justified when the paired securities incorporate common information at different speeds, and in the presence of noise trading. It will be reviewed in Literature that these are violations of efficient markets, and that without them pairs trading cannot be profitable.

There exist three major methods of pairs trading: 1) the distance approach (Gatev et al., 2006), 2) the cointegration approach (Vidyamurthy, 2004), and 3) stochastic modelling (Do et al., 2006). In this paper the analysis of Gatev et al. (2006) will be extended to recent data on the US large cap market. The analysis shows that the original form of pairs trading is no longer profitable, except during the turmoil of the financial crisis and recovery. Huck and Afawubo (2015) suggested that the application of cointegration to the distance framework is superior and has recently been profitable. This is tested, with the result that the cointegration approach does show higher profitability, but also higher variance than the distance approach; their Sharpe ratios are similar. The cointegration application has not been profitable in recent years, although returns are not as negative as for the distance approach. The results of testing the two methods are comparable to the findings of Do and Faff (2008), which show that too many spread trades are taken on permanently diverging spreads. These permanently diverging spreads leave the strategy unprofitable. This happens because parameters are frozen and do not incorporate the possibility of a mean shift in a spread (Triantafyllopoulos and Montana, 2011). To counter this issue a new framework for pairs trading is suggested. Within the framework a specific cointegration based model is applied on a rolling window basis. Entry triggers for trading are dynamically updated, and portfolio weights favour pairs that have recently performed well. The result of the new application to pairs trading is a strategy with a 17% annual excess return and a Sharpe ratio of 2.3; however, returns are still decaying over recent years.

The rest of the paper is organized as follows: In 2 Literature the theoretical foundations considered in the literature are presented and the main methods of pairs trading are described. Section 3 Data briefly describes the data used. Section 4 Review of the Distance Approach extends the most studied pairs trading algorithm to recent data. In this section great emphasis is put on reflecting reality in the way returns are calculated (4.3 Calculating Returns). In light of the poor performance of the distance approach, cointegration is applied in 5 Review of the Cointegration Approach. As this has also been unprofitable in recent years, a new approach to pairs trading is developed in 6 A New Trading Model with the goal of alleviating the major issues of the two aforementioned backtests. Lastly, the results are concluded in 7 Conclusion.

2 Literature

In this section a large subset of the literature covering pairs trading is reviewed. The goal is to find a justification for pairs trading. Many authors have found pairs trading justifiable through positive excess returns. The review is structured by the questions:

• What theories relate to pairs trading?

• Why should pairs trading be profitable?

• What methods have been applied?

2.1 Theories

There have been several theoretical linkages to pairs trading: APT for finding substitutable securities, a transformation of the law of one price into a relative law of one price, and the statistical empirical work on cointegration. No matter how pairs trading is theoretically linked, its presence must be an attribute of an inefficient market. To understand this it is convenient to consider the APT:

$$r_{it} = F_t \beta_i + \varepsilon_{it} \qquad (2.1)$$

This states that the return of an asset is an attribute of its factor exposure and a firm specific return which has zero expectation. The firm specific returns ($\varepsilon_{it}$) are a Brownian motion. The idea of pairs trading is to find two securities $i$ and $j$ that covary. An obvious case of covariation is when $\beta_i = \beta_j$. The spread of the two securities, $s_{ijt} = \sum_{k=0}^{t} \log \frac{1+\varepsilon_{ik}}{1+\varepsilon_{jk}}$, is then a Brownian motion. As this is a martingale it is efficiently priced and no trading rule beating the market can be applied to it. The spread has zero expected return, no reversal, and positive variance; clearly directional trading in it is folly. The notion of pairs trading arises when the common factor exposure of the two securities is priced in at different speeds (inefficiency), or when the relative firm specific returns are bounded, creating the possibility of reversal (inefficiency).
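The argument can be illustrated with a small simulation. The following is a minimal sketch, not part of the original analysis: it assumes i.i.d. Gaussian factor and firm specific shocks, and the function names and all parameter values (`factor_vol`, `idio_vol`, the number of days) are illustrative assumptions. With equal betas the common factor cancels, and the cumulative log spread of the firm specific returns behaves like a random walk whose increments show no serial correlation, i.e. no reversal to trade on.

```python
import numpy as np

def simulate_equal_beta_pair(n_days=2500, beta=1.0, factor_vol=0.01,
                             idio_vol=0.01, seed=0):
    """Simulate equation (2.1) for two securities with the same beta.

    All volatilities are illustrative assumptions, not estimates.
    Returns the two return series and the cumulative log spread of
    the firm specific returns.
    """
    rng = np.random.default_rng(seed)
    factor = rng.normal(0.0, factor_vol, n_days)   # common factor F_t
    eps_i = rng.normal(0.0, idio_vol, n_days)      # firm specific return of i
    eps_j = rng.normal(0.0, idio_vol, n_days)      # firm specific return of j
    r_i = factor * beta + eps_i
    r_j = factor * beta + eps_j
    # Spread: cumulative sum of log((1 + eps_i) / (1 + eps_j)).
    spread = np.cumsum(np.log((1.0 + eps_i) / (1.0 + eps_j)))
    return r_i, r_j, spread

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

r_i, r_j, spread = simulate_equal_beta_pair()
# Returns are correlated through the common factor ...
print("return correlation:", np.corrcoef(r_i, r_j)[0, 1])
# ... but the spread increments are serially uncorrelated (no reversal).
print("lag-1 autocorrelation of spread increments:",
      lag1_autocorr(np.diff(spread)))
```

A mean reverting spread would instead show a spread level that is pulled back towards its mean, which is the inefficiency the remainder of this section is concerned with.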

Despite the EMH's exclusion of pairs trading, several articles have reported positive excess returns from pairs trading. This phenomenon has naturally led to near-efficient theoretical linkages. One of the common linkages to pairs trading is the law of one price (Gatev et al., 1999). The intuition of Chen and Knez (1995) is as follows: securities with almost the same payoff in any future state (comovement) should face almost the same price today. The level of this price is very hard to assess, but the relative price of the securities is much easier to assess. If one of the securities falls in price, then it is relatively cheaper given constant expectations of future payoff.

In line with the relative law of one price, Do et al. (2006) and Vidyamurthy (2004) suggest that this can be viewed in terms of APT. Securities with the same risk factor exposure will have the same long run expected return; thus APT may serve as a catalyst in finding seemingly identical securities. However, as noted above, the same factor exposure does not alone allow for pairs trading.

The key essence of pairs trading is to trade two securities whose firm specific risk is cointegrating. Cointegration of two securities means weakly inefficient markets, i.e. in the long run one of the securities is redundant. Two securities that are known to have an identical long run price (return profile, Do et al. (2006)) may be traded in pairs such that any deviation from equilibrium yields a positive expected profit. This strong statistical attribute is necessary to explain why pairs trading can exist. In order to profit from the redundant securities the market cannot be perfectly efficient.

Following the efficient market hypothesis, if a cointegrating relation is observed, two outcomes may be seen: 1) The relation is a false positive and it breaks up; the arbitrageurs who previously profited from it will now take a loss. In expectation, trading the false positives carries an embedded risk that undermines the return in the seemingly cointegrating periods, such that no arbitrageur will participate in these trades. 2) The cointegrating relation is in fact a true positive; arbitrageurs will compete spread levels so close to equilibrium that no further arbitrage can take place. This is the notion of auto efficient markets, which for instance takes place in dual listed shares.

Evidence of pairs trading in truly cointegrated spreads can be found in Alsayed and McGroarty (2012), who trade ADRs and the underlying UK stocks using tick data. The paper effectively tests the notion of auto efficient markets. The auto efficient market should stem from the fact that any investor wishing to purchase a certain security will always purchase at the cheapest venue. The conclusion of Alsayed and McGroarty (2012) was that an annual excess return of 1.45% could have been made from trading 25 securities and their ADRs, rejecting full auto efficiency. Perhaps in these very obvious pairs there is a rational explanation: very short term price deviations may happen in the presence of market movers who find it too costly to monitor and trade both securities when investing. The arbitrageur is paid a fair rate for his contribution of delivering efficiency through his superior ability to monitor and trade instantly at several venues.

In theory pairs trading cannot be justified under the efficient market hypothesis; this needs to be violated through cointegration, which ensures a relative equilibrium between two substitutable assets. The question is then whether there are such violations of the efficient market and, given that there are, where the profitability stems from.

2.2 Profitability

As discussed above, market inefficiency is a necessary condition for pairs trading. In this subsection the level of inefficiency is discussed first, highlighting whether or not pairs trading is in fact profitable. Considering that many authors do find it profitable, the inefficiencies giving rise to the profitability are then investigated.

Does pairs trading work? In a wide range of articles, in fact most of the articles represented in the literature list, the authors do suggest positive returns over the past many years. Many of the articles concern what is popularly called the distance approach (Gatev et al., 1999), which is explained in detail in Review of the Distance Approach. Do and Faff (2008) show that returns have been declining rapidly since 2001, and in 2012 Do and Faff suggested that net of transaction costs there have been only slightly positive returns, if any, from the distance approach. In section Review of the Distance Approach it will appear that the distance approach can be expected to generate negative returns in the years to come, confirming Do and Faff (2008). Broussard and Vaihekoski (2012) applied the distance approach in the Finnish market; although it is more profitable than in the US market, returns have also been declining since 2001. Overall this suggests that pairs trading is no longer profitable.

The founders of the distance approach, Gatev, Goetzmann, and Rouwenhorst, thought the decline to be temporary and attributed it to two sources: 1) a decrease in fixed commissions and 2) a rise in hedge funds. The high commissions in the early years would have excluded many participants, whereas today commissions are low and technology vastly improved, making it profitable for more to participate, which in turn reduces profits. The rise in hedge funds has increased competition, leading to more efficient pricing of spreads. Do and Faff (2008) test these hypotheses and cannot find evidence that competition has rendered pairs trading unprofitable. They find that an increasing number of spreads diverge and never converge (false positives), which is attributed to a complex change in the relative law of one price relationship.

In short, Do and Faff claim that the distance approach has not been worthwhile to pursue net of costs, and that a structural change has happened such that it is impossible to rely on the stationarity of a pair spread. On the other hand, Bowen et al. (2010) justify the method on an intraday basis in the year 2007 with a 7% excess return. A range of articles not using the distance approach have also shown profitability: Huck and Afawubo (2015), Harlacher (2012), and Caldeira and Moura (2013) employ cointegration in the framework of the distance approach. Caldeira and Moura (2013) show returns of 16.38% with a Sharpe ratio of 1.34 in the Brazilian market. Bogomolov (2013) applies renko and kagi constructions to the distance framework and shows no declining returns.

Overall, today's market inefficiency is too small to justify the distance approach, but other pairs trading applications still appear profitable.

Inefficiencies Engelberg et al. (2009) and Jacobs and Weber (2015) find multiple market inefficiencies driving the pairs trading returns of the distance approach. One could argue that, given the declining returns of the distance approach, their analysis may be biased as it is based on two regimes. Nevertheless, they give logical explanations that ought to hold for any pairs trading strategy. The explanations follow:

1. News diffusion. Both Engelberg et al. (2009) and Jacobs and Weber (2015) show that when news appears regarding both members of a pair, a divergence of the spread may happen when market participants are faster at adjusting the price of one of the securities than the other. That the market incorporates information relatively slowly into one of the securities creates a lead-lag relation in the pair. The enemy of the pairs trader is firm specific news, as this permanently alters the equilibrium spread and induces a loss to the trader (Engelberg et al. (2009), Jacobs and Weber (2015), Papadakis and Wysocki (2007)).

2. Noise trading (volatility). Jacobs and Weber (2015) find more profitable trading opportunities when the idiosyncratic noise increases. Bowen et al. (2010), who trade intraday, show that the greatest profits are generated in the volatile and liquid opening and closing hours. Furthermore, Baronyan et al. (2010) find that high returns were made during the financial crisis, as will also be shown in this paper. Liquidity has the opposite effect: less liquidity is shown in Engelberg et al. (2009) to increase profitability, and the same holds for sudden changes in liquidity.

3. Competition. Engelberg et al. (2009) show that when institutional holders hold both assets of a pair, profitability decreases, linking to competition in the given spread. Jacobs and Weber (2015) show that limits to arbitrage, proxied by an array of variables, increase profits. One of the limits is an increased bid-ask spread¹. One might also link competition to the conclusion of Jacobs and Weber (2015): when many potential pairs exist, greater profits should be present. Many pairs means a greater market to trade, hence less competition per pair. In the same vein, Jacobs and Weber (2015) find that emerging markets are more fertile ground for pairs trading.

4. Coverage. When sell side analysts cover both stocks of a pair, pricing efficiency is increased (Engelberg et al., 2009). Jacobs and Weber (2015) show that greater coverage leads to less profit.

Overall, Do and Faff claim that pairs trading is hard to justify net of costs, and that it is not a premium like momentum. If a premium has ever existed it has disappeared², and not due to increased competition. On the other side of the debate, Engelberg et al. and Jacobs and Weber claim that pairs trading is justified by informational flow differentials and noise trading. They also find that competition will decay profits. The latter arguments support pairs trading as a premium whose magnitude is a function of competition.

In conclusion, the EMH does not allow for pairs trading, but if it is violated such that a spread is no longer a martingale, pairs trading is justifiable. The literature has shown pairs trading to be profitable, even for the simple distance approach, just several years ago. Other variants of pairs trading are still profitable. The inefficiencies giving rise to the profitability have been differentials in information diffusion, noise trading, and lack of competition. All these factors give rise to some mean reversion of a spread, which can be traded.

2.3 Methods³

The essence of pairs trading is finding substitutable assets, buying relatively undervalued and shorting relatively overvalued securities. The relative assessment can statistically only take place when the securities strongly covary. The general toolbox of the pairs trader is:

• A pairing method

• A spread model

• A trading strategy

• A portfolio selection scheme

¹ An increased bid-ask spread will have the adverse effect of increased costs of trading. Thus the increased spread might not yield an increase in profitability net of transaction costs; net profitability can either increase or decrease.

² For the distance approach.

³ An overview of the methods discussed in this section can be found in Literature Overview.

The literature can be grouped into four categories:

1. Distance approach

2. Cointegration approach

3. Stochastic modelling

4. Miscellaneous

Distance Approach was invented by Gatev et al. in 1999 through interviews with professionals. They filled every part of the toolbox: (1) pairs with the lowest Euclidean distance in indexed returns were selected; (2) the difference in indexed returns was used as the spread; (3) the spread standard deviation was calculated on a one year training period, and in the following six month trading period a long/short position in the spread was opened when it deviated by two standard deviations from 0; (4) only recently formed pairs were included in the portfolio. There are two major components of the distance approach: 1) pairing by distance and trading on high distance in the pair prices, and 2) frozen parameters and fixed period modelling. The main advantage of the distance approach is that it is model free and relatively simple. On the other hand, it by no means tries to reach an optimality condition. The model is robust to data snooping. Articles within the distance approach are: Do and Faff (2008), Do and Faff (2012), Broussard and Vaihekoski (2012), Bowen et al. (2010), Engelberg et al. (2009), Huck (2013), Jacobs and Weber (2015), Nath (2003), Mori and Ziobrowski (2011), and Huck and Afawubo (2015), the last comparing the three major approaches in the portfolio framework of Gatev et al. (2006).
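The four steps can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the implementation used later in the paper: the function names are hypothetical, prices are re-indexed to 1 at the start of each period, and an open position is closed when the spread crosses zero (the closing rule is an assumption here, since the summary above only states the entry rule).

```python
import numpy as np
from itertools import combinations

def distance_pairs(train_prices, top_n=5):
    """Rank pairs by the sum of squared differences of indexed prices.

    train_prices: array of shape (T, N), one column per security.
    Returns the top_n pairs (i, j) with the smallest distance.
    """
    index = train_prices / train_prices[0]          # indexed to 1 at start
    scores = []
    for i, j in combinations(range(index.shape[1]), 2):
        ssd = float(np.sum((index[:, i] - index[:, j]) ** 2))
        scores.append((ssd, i, j))
    scores.sort()
    return [(i, j) for _, i, j in scores[:top_n]]

def distance_signals(train_prices, trade_prices, i, j, k=2.0):
    """Positions in the spread of pair (i, j) over the trading period.

    +1 = long i / short j, -1 = short i / long j, 0 = flat.
    The trigger k * sigma is frozen from the training period.
    """
    train_idx = train_prices / train_prices[0]
    sigma = np.std(train_idx[:, i] - train_idx[:, j], ddof=1)
    trade_idx = trade_prices / trade_prices[0]      # assumption: re-index
    spread = trade_idx[:, i] - trade_idx[:, j]
    position = np.zeros(len(spread), dtype=int)
    current = 0
    for t, s in enumerate(spread):
        if current == 0:
            if s > k * sigma:
                current = -1                        # i is relatively expensive
            elif s < -k * sigma:
                current = 1                         # i is relatively cheap
        elif (current == -1 and s <= 0) or (current == 1 and s >= 0):
            current = 0                             # assumed exit: zero crossing
        position[t] = current
    return position
```

Because `sigma` is computed once on the training period and never updated, the sketch shares the frozen parameter property discussed above: a spread that drifts away permanently keeps the position open until an external stop.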

Cointegration approach to pairs trading was suggested by Vidyamurthy (2004), who delivered purely theoretical and practical applications for pairs trading, but no empirical work. The cointegration approach leverages the fact that a stationary spread can be predicted, yielding direct traceability of expected convergence. For instance, Puspaningrum (2012) solves the first passage time for several cointegrated spread series following different models. As a parametric model it is normative and allows for the stocks to have asymmetric effects on the disequilibrium. But, being parametric, trading the modelled spread is vulnerable to misspecification. Pairing is often done through ADF testing followed by a Granger two-way causality test, as in Baronyan et al. (2010). In Harlacher (2012), Huck and Afawubo (2015), and Caldeira and Moura (2013), trading follows the simple cointegrating regression: $\varepsilon_{ijt} = P_{it} - \alpha_{ij} - \beta_{ij} P_{jt}$. Whenever the standardized residual $\varepsilon_t / \mathrm{sd}(\varepsilon_t)$ is significantly large a disequilibrium has occurred, and a position in the spread betting on reversal is opened and then closed at the next zero crossing. Often the Gatev et al. (2006) framework of frozen parameters is used, which allows for the spread never converging⁴.

⁴ When a spread does not converge, trading is stopped at either a stop-loss trigger or at the end of the trading period.
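The trading rule built on the cointegrating regression can be sketched as follows. This is a minimal illustration under assumptions, not the exact procedure of any of the cited articles: the regression is estimated by ordinary least squares on a training sample, the parameters are then frozen, an entry threshold of two standard deviations is assumed, and the function names are hypothetical.

```python
import numpy as np

def cointegrating_regression(p_i, p_j):
    """OLS of the price of i on a constant and the price of j.

    Returns (alpha, beta, residuals) of eps = P_i - alpha - beta * P_j.
    """
    p_i = np.asarray(p_i, dtype=float)
    p_j = np.asarray(p_j, dtype=float)
    X = np.column_stack([np.ones_like(p_j), p_j])
    (alpha, beta), *_ = np.linalg.lstsq(X, p_i, rcond=None)
    resid = p_i - alpha - beta * p_j
    return alpha, beta, resid

def trade_frozen(p_i_train, p_j_train, p_i_trade, p_j_trade, entry=2.0):
    """Positions in the spread using parameters frozen after training.

    +1 = long the spread (long i, short beta units of j), -1 = short, 0 = flat.
    Entry when |standardized residual| > entry; exit at next zero crossing.
    """
    alpha, beta, resid = cointegrating_regression(p_i_train, p_j_train)
    sd = resid.std(ddof=2)                     # two estimated parameters
    z = (np.asarray(p_i_trade, dtype=float) - alpha
         - beta * np.asarray(p_j_trade, dtype=float)) / sd
    pos = np.zeros(len(z), dtype=int)
    current = 0
    for t, zt in enumerate(z):
        if current == 0 and abs(zt) > entry:
            current = -1 if zt > 0 else 1      # bet on reversal
        elif current != 0 and current * zt >= 0:
            current = 0                        # residual crossed zero
        pos[t] = current
    return alpha, beta, pos
```

The sketch omits the stop-loss and the forced exit at the end of the trading period, so a non-converging spread simply stays open in `pos`.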

Stochastic Approach. Elliott et al. (2005) suggested the use of the Ornstein-Uhlenbeck process to model the mean reverting behaviour of a spread. Do et al. (2006) nested the model of Elliott et al. (2005) in a more general setting where factor variables could be controlled for (APT), thus modelling the firm specific returns. A problem with the model, as Do et al. (2006) noted, is that it assumes a constant long run return parity of the paired stocks, which is seldom the case. Neither Do et al. (2006) nor Elliott et al. (2005) employed the model in empirical work, but Baronyan et al. (2010) implemented it on the Dow Jones 30 index using weekly data. Baronyan et al. (2010) used a very interesting portfolio set-up that is unlike many others. First, five pairs were selected from a ranking of several pairing metrics over the year preceding the pairing date. In the training period the pairs are backtested using several lag windows, i.e. rolling regressions with one-week-ahead forecasts. The window yielding the highest cumulative return is used as the parameter over the consecutive year of the trading period. In the trading period parameters are updated at every observation, and trading follows the updated model, using a two standard deviation error as entry trigger. In A New Trading Model an approach of updating parameters is used. The great advantage of updating parameters is the allowance for a non-constant mean spread, as argued in Triantafyllopoulos and Montana (2011). The mean spread error can change with firm specific news. The great advantage of stochastic modelling is the possibility of deducing optimal trading. It is often thought to fit fundamentally related assets better than the spreads of weakly substitutable assets.
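To make the mean reverting spread model concrete, the sketch below simulates an Ornstein-Uhlenbeck process $dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t$ with its exact discretization and recovers $\theta$ and $\mu$ from an AR(1) regression. This is a simplification chosen for illustration: it is not the estimation procedure of Elliott et al. (2005), and the function names and parameter values are assumptions.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, n, dt=1.0, seed=0):
    """Simulate n steps of an OU process using the exact discretization."""
    rng = np.random.default_rng(seed)
    b = np.exp(-theta * dt)                       # AR(1) coefficient
    sd = sigma * np.sqrt((1.0 - b**2) / (2.0 * theta))
    x = np.empty(n + 1)
    x[0] = x0
    for t in range(n):
        x[t + 1] = mu + (x[t] - mu) * b + sd * rng.standard_normal()
    return x

def fit_ou(x, dt=1.0):
    """Estimate (theta, mu) from the regression x[t+1] = a + b * x[t] + e.

    Requires 0 < b < 1, i.e. a mean reverting sample path.
    """
    b, a = np.polyfit(x[:-1], x[1:], 1)           # slope, intercept
    theta = -np.log(b) / dt                       # speed of mean reversion
    mu = a / (1.0 - b)                            # long run mean of the spread
    return theta, mu

# Illustrative values only.
x = simulate_ou(theta=0.1, mu=1.0, sigma=0.2, x0=1.0, n=20000)
theta_hat, mu_hat = fit_ou(x)
print("theta:", theta_hat, "mu:", mu_hat)
```

Re-running `fit_ou` on a rolling window of the most recent observations is one simple way to let the estimated mean `mu` move over time, which is the motivation for parameter updating given above.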

Miscellaneous. Huck (2009) and Huck (2010) proposed using a multi-criteria decision model (ELECTRE III) to explicitly forecast the n(n−1)/2 spreads of the system, implemented through a neural network. Huck (2010) purchased the p spreads with the highest expected return one period ahead, while shorting those with the lowest expected return. This model has the advantage of being positive rather than normative on a pre-specified equilibrium. Another interesting article is Hameed et al. (2010), who showed a 1.5% monthly return from betting on within-industry reversal, shorting the previous month's winners and buying the previous month's losers. Others are Bogomolov (2013), using Renko and Kagi constructions for spread modelling, and Liew and Wu (2013), who model pair dependence by copulas.

3 Data

When conducting pairs trading one needs to find sufficiently many securities such that enough stationary spreads are obtained for trading. The more spreads, the more potential trades, and more trades mean greater diversification and a smaller risk of moving the market. However, the number of potential pairs is quadratic in the number of issues investigated. In this paper the members of the S&P 500 index are investigated, creating a potential of 124,750 pairs to be traded. This is a large number of pairs, yet the task is doable. Had one investigated the Russell 3000 members, some smart sorting algorithm would have been needed. In fact, in section Test Selection a method to find highly cointegrated pairs without testing the entire span is suggested. Furthermore, it will be revealed that a lot of trading is conducted, hence transaction costs are essential. Choosing liquid stocks increases the likelihood that the strategy can actually be implemented on observed prices at reasonable costs. In other words, the backtest results on liquid stocks are more likely to reflect an implementable strategy without increasing the complexity of the backtest. Nevertheless, as shown in Jacobs and Weber (2015) and Engelberg et al. (2009), illiquid stocks may be more profitable to trade, especially when the idiosyncratic noise is high, which is often the case.

Data is obtained from Compustat, which has preserved a list of all members that have ever entered and left the S&P 500 index since 1983. Data on all the tickers from this list has been requested, starting in 1984 and ending in July 2015. Some of the tickers were not obtained in the first run, but data was provided on their CUSIP identification. Yet 5 symbols⁵ were never obtained; all of these firms have been acquired. 51 symbols were successfully gathered and the data agreed with Bloomberg data, but none of them showed data in their member period. In total 1,170 symbols could have been traded over the period, after having removed the 5 missing symbols and the 51 symbols of lesser information. There have been two dual share class listings, Google and Discovery Communications; in both cases the B share has been removed⁶.

Data is corrected for corporate actions such as splits, dividends, etc. The data is survivorship bias free, apart from the 56 missing symbols. Had one used the CRSP database, older data could have been gathered, which may be useful for some purposes. However, if the profitability of such a strategy is related to technology, then recent data matters the most. We strongly suspect this to be the case.

As benchmark the S&P 500 index returns are used, and the Federal Funds Rate is used as the risk free rate. However, data on the fed funds rate starts at the end of 1988, so data before this point is constructed. The constructed data is the US Treasury bill 3 month yield minus the mean difference between the 3 month bill and the fed funds rate in the overlapping data period. The spread between the two series is relatively narrow and constant, such that this very simple method can be somewhat justified. The major goal of extending the risk free rate is to get the historical level of the financing costs. The returns of the S&P 500 index minus the yield of the fed funds rate give the excess returns of the S&P 500, also called the equity premium.
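The backfill of the risk free rate described above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `extend_fed_funds`, the use of `None` for unobserved values, and the toy yields are all assumptions made for the example.

```python
from statistics import mean

def extend_fed_funds(tbill, fed_funds):
    """Backfill missing fed funds observations from the 3 month T-bill yield.

    tbill: list of T-bill yields (observed on every date).
    fed_funds: list of fed funds rates, None where not observed.
    The backfilled value is the T-bill yield minus the mean difference
    (T-bill minus fed funds) measured on the overlapping observations.
    """
    overlap = [tb - ff for tb, ff in zip(tbill, fed_funds) if ff is not None]
    mean_diff = mean(overlap)
    return [ff if ff is not None else tb - mean_diff
            for tb, ff in zip(tbill, fed_funds)]

# Made-up yields in percent, for illustration only (not thesis data).
tbill = [8.0, 7.5, 7.0, 6.5]
fed_funds = [None, None, 7.3, 6.6]
extended = extend_fed_funds(tbill, fed_funds)
```

Observed fed funds values are left untouched; only the dates without an observation are filled.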

5 These firms are: Jonathan Logan Inc (JOL.), Bausch & Lomb Hldgs -Redh (6583B), Biomet Inc (5938B), Avaya Inc (5933B), and Archstone Inc -Redh (ASN).

6 Trading A and B share classes is different from trading substitutable assets and is thus excluded in this paper. The A share is used, as this preserved the most historical information.


4 Review of the Distance Approach

This section will review the distance approach. It will extend the backtest of Gatev et al. (2006) to recent data, testing profitability in recent years. Moreover, it will contribute to the literature with a somewhat complex scheme for return calculations.

The natural start is at the beginning, which in the pairs trading world is 1999, when Gatev et al. published their working paper Pairs Trading: Performance of a Relative Value Arbitrage Rule. The authors interviewed professionals who had successfully conducted pairs trading for years. They identified a strategy of (1) finding pairs of assets that move together, and (2) taking a long/short position whenever the spread gets large, betting on reversal. The authors used data from the S&P 500 index. At each pairing date they indexed the 500 constituents' returns over the last year, then calculated the 124,750 Euclidean distances between these indices, and selected the lowest ranking pairs for consecutive trading. A position was opened in a pair whenever the spread in the continuing index prices deviated by more than two standard deviations, as measured on the historical spread. The higher priced security was shorted whilst the lower priced was purchased, betting one dollar on each of the securities. The trade was closed whenever (if) the prices crossed (zero spread), or after the 6 month trading period had elapsed. This two-step procedure was repeated at the start of every month, where the top 20 pairs⁷ were traded for the consecutive 6 months. This corresponds to 6 portfolio managers conducting the same strategy staggered by a month. A potential of 120 pairs might have been traded at once. The authors updated their paper in 2006 and reported an annual excess return of 11% in the period 1962-2002, after a conservative estimate of transaction costs.

In the following the distance approach of Gatev et al. (2006) will be replicated following the pattern of: (1) Ranking by Euclidean Distance, (2) Generating Trading Signals and Consolidating, (3) Calculating Returns.

4.1 Ranking by Euclidean Distance

Breakpoints for updating pairs are set at the first day of each month, starting on 1985-01-01 and ending on 2015-07-01. At each breakpoint a subset of the S&P 500 price data is created:

• Including all observations ranging 365 days back and including the breakpoint

• All members of the index as of the breakpoint are considered

• All members with more than 2.5% missing observations are omitted

• All dates with one or more observations missing are omitted

7 The authors also considered the top 5 pairs portfolio, as well as the pairs ranked 101 to 120, to identify that the pairing rank mattered. The pairs 101-120 showed lower return and slightly lower variance than the top 20. The top 5 was more volatile and did not show greater returns than the top 20.


Cumulative return series of the resulting price data are calculated, all starting at index 1. The Euclidean distances are calculated as:

$$\mathrm{dist}^{\tau}_{i,j} = \sum_{t=\tau-365d}^{\tau} \left(R^i_t - R^j_t\right)^2 \qquad (4.1)$$

$$\tau = \text{1985-01-01},\ \text{1985-02-01},\ \ldots,\ \text{2015-07-01}$$

$\tau$ is the breakpoint, and $R^i_t = \prod_{k=\tau-365d}^{t}(1 + r^i_k)$, where $r^i_k$ is the return of asset $i$ at time point $k$ since the last observed price, usually a day or a weekend before. $i, j$ are eligible securities. At each breakpoint a maximum of 124,750 distances are calculated. The 20 lowest distance pairs are saved for consecutive trading starting on the pairing date, such that any signals generated and traded on this day will yield some return on the next day. The optimal pairs are:

$$P_\tau = P^{1}_\tau(i,j), \ldots, P^{20}_\tau(a,b) \qquad (4.2)$$

The number of securities in the optimal set may be less than 40, as any security may appear in more than one pair.
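The ranking step of equation (4.1) can be sketched in a few lines. This is an illustrative sketch only: the function names `cumulative_index` and `rank_pairs`, the dictionary input format, and the toy tickers and returns are assumptions, and the cleaning rules for missing observations are left out.

```python
from itertools import combinations

def cumulative_index(returns):
    """Cumulative return index prod(1 + r_k), one value per observation."""
    index, level = [], 1.0
    for r in returns:
        level *= 1.0 + r
        index.append(level)
    return index

def rank_pairs(returns_by_asset, top_n=20):
    """Return the top_n (distance, i, j) tuples with the smallest distance."""
    indices = {a: cumulative_index(r) for a, r in returns_by_asset.items()}
    distances = []
    for i, j in combinations(sorted(indices), 2):
        # Sum of squared differences between the two cumulative indices.
        d = sum((x - y) ** 2 for x, y in zip(indices[i], indices[j]))
        distances.append((d, i, j))
    return sorted(distances)[:top_n]

# Toy daily returns for three hypothetical tickers.
toy = {
    "AAA": [0.01, -0.02, 0.03],
    "BBB": [0.01, -0.02, 0.02],
    "CCC": [-0.05, 0.04, 0.00],
}
best = rank_pairs(toy, top_n=1)
```

With 500 eligible securities the loop visits 500·499/2 = 124,750 pairs, which matches the maximum number of distances quoted in the text.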

4.2 Generating Trading Signals and Consolidating

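Trading pair $P^{rank}_\tau(i,j)$ is done in the period $[\tau, \tau+6m[$, where the underlying variables and parameters are calculated as:

$$R^h_t = \prod_{k=\tau-365d}^{t}(1 + r^h_k) \quad \forall\, h = i, j \;\wedge\; t \in [\tau-365d, \tau+6m[ \qquad (4.3)$$

$$S^{ij}_t = R^i_t - R^j_t \qquad (4.4)$$

$$\sigma^{ij}_\tau = \mathrm{sd}(S^{ij}_t) \quad \text{for } t \in [\tau-365d, \tau] \qquad (4.5)$$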

The signal generation is a logical function:

• Open a short position in the spread (short i, long j) when $S^{ij}_t$ crosses $2\sigma^{ij}_\tau$ from below

• Open a long position in the spread (long i, short j) when $S^{ij}_t$ crosses $-2\sigma^{ij}_\tau$ from above

• Close any positions when $S^{ij}_t$ crosses 0 ($R^i_t$ and $R^j_t$ cross)

• Close any open positions when $t \geq \tau + 6m$
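The four rules above can be written as a small state machine. The sketch below is one possible reading, under stated assumptions: the function name `generate_positions` is hypothetical, no trade is opened on the first observation (there is no earlier value to define a crossing), and the returned value is the position decided at each observation, not the return earned.

```python
def generate_positions(spread, sigma):
    """Position in the spread per observation: +1 long, -1 short, 0 flat.

    spread: spread values S_t over the trading period.
    sigma: standard deviation of the spread from the training period.
    """
    positions = []
    pos = 0
    for t in range(len(spread)):
        s = spread[t]
        prev = spread[t - 1] if t > 0 else s  # no crossing on the first day
        if pos == -1 and s <= 0:
            pos = 0       # short spread: close when the spread reaches zero
        elif pos == 1 and s >= 0:
            pos = 0       # long spread: close when the spread reaches zero
        if pos == 0:
            if prev < 2 * sigma <= s:
                pos = -1  # crossed +2 sigma from below: short the spread
            elif prev > -2 * sigma >= s:
                pos = 1   # crossed -2 sigma from above: long the spread
        positions.append(pos)
    if positions:
        positions[-1] = 0  # anything still open is closed at period end
    return positions

# Toy spread with sigma = 1: one short trade, one long trade, one late entry.
example = generate_positions(
    [0.5, 1.5, 2.5, 1.0, -0.2, -2.1, -1.0, 0.3, 2.2], sigma=1.0)
```

The last observation shows the excessive-trading issue discussed in the text: a position opened right before expiry is closed immediately by the period-end rule.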

The trading strategy is illustrated in figure 4.1, where a rather successful trade sequence appears, with four converging trades and one trade closed on the 6 month rule. In fact less than 7% of the pairs had 4 or more converging trades in their trading period. The trade reveals that the spread was not mean 0 in the training period, thus the closing rule was slightly wrong. It is also evident that the historic standard deviation is too narrow to encapsulate 95% of the spread outcomes in the trading period. Lastly, excessive trading is taking place, as a position is opened too close to the 6 month closing rule, such that on average the trade would not have closed naturally before expiry.

Figure 4.1: Example of using the distance approach for τ = 2004-03-01, where the 12th ranking pair is depicted. The left side of the panel depicts both the training and the trading period, separated by the vertical line. Panel 3 is the spread between the normalized prices of panel 1. The blue line in panel 3 is the zero line of no price differential; the red lines are the two standard deviations distance to 0 as measured in the training period. The right side panels depict the trading period, where panel 2 depicts the same spread as panel 3. The red shaded areas in panel 2 indicate a short position in the spread whilst the green shaded areas indicate long positions. Finally, panel 4 computes the cumulative return of the trade with a gross leverage of 2 at initiation.

120 signals are monitored at once as a result of the overlapping execution. There will be at most 6 active signal sequences on the same pair at once. The signals are consolidated onto the unique assets traded in order to minimize transaction costs. An example of the consolidation follows in table 4.1. When two sequences of signals on the same pair have opposite signs the pair trade is cancelled. Likewise, if a security A is indicated short and long by two distinct pairs at the same time, A is neither bought nor sold.
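The consolidation can be sketched as a simple netting of pair signals onto assets. The function name `consolidate` and the input format are assumptions made for illustration; the sign convention (a signal of +1 on pair XY meaning long X and short Y) is inferred from the numbers in table 4.1, whose three signal sequences are used as the example.

```python
def consolidate(pair_signals):
    """Net several pair signal sequences onto individual assets.

    pair_signals: list of ((first, second), signals), where a signal of +1
    means long `first` and short `second`, -1 the reverse, 0 no position.
    Returns {asset: list of net positions per day}.
    """
    n_days = len(pair_signals[0][1])
    net = {}
    for (first, second), signals in pair_signals:
        for asset in (first, second):
            net.setdefault(asset, [0] * n_days)
        for day, sig in enumerate(signals):
            net[first][day] += sig
            net[second][day] -= sig
    return net

# The three signal sequences of table 4.1 (days 1 to 12).
signals = [
    (("A", "B"), [1] * 10 + [0, 0]),
    (("B", "C"), [0] * 9 + [1, 1, 1]),
    (("A", "B"), [-1] * 3 + [0, 0] + [1] * 7),
]
assets = consolidate(signals)
```

Opposite signals on the same pair cancel (days 1 to 3), and an asset that is long in one pair and short in another nets to zero (asset B on days 11 and 12).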

4.3 Calculating Returns

Calculating returns is a complex matter (Gatev et al. (2006), Huck and Afawubo (2015)), as both short and long sales are committed over many assets. Moreover, the signals strike in a random manner, such that the strategy can trade 0 to 240 assets at once, where the future holdings are unknown. This gives rise to the problem of selecting the proper equity base. If one decides to hold 240 dollars in equity the gross leverage would vary from 0 to 1⁸. If one holds 120 dollars in equity the leverage can vary from 0 to 2, meaning that external leverage must be obtained in some periods. If on average 60 pairs are traded, then holding 120 dollars of equity would yield an average leverage of 1, and 60 dollars an average leverage of 2. Holding the 60 dollars of equity can induce gross leverage to increase to 4, and if there is an initial margin requirement of 50% on both the long and short leg, the trader is on the limit of being fully invested, and new signals will be foregone. This issue of selecting an equity base or average gross leverage, and its effect on margin requirements and funding costs, is cumbersome, but it will in the end have an effect on the return profile.

8 When ignoring drift to leverage due to price changes.

| Day | Pair AB | Pair BC | Pair AB | Asset A | Asset B | Asset C |
|-----|---------|---------|---------|---------|---------|---------|
| 1   | 1       | 0       | -1      | 0       | 0       | 0       |
| 2   | 1       | 0       | -1      | 0       | 0       | 0       |
| 3   | 1       | 0       | -1      | 0       | 0       | 0       |
| 4   | 1       | 0       | 0       | 1       | -1      | 0       |
| 5   | 1       | 0       | 0       | 1       | -1      | 0       |
| 6   | 1       | 0       | 1       | 2       | -2      | 0       |
| 7   | 1       | 0       | 1       | 2       | -2      | 0       |
| 8   | 1       | 0       | 1       | 2       | -2      | 0       |
| 9   | 1       | 0       | 1       | 2       | -2      | 0       |
| 10  | 1       | 1       | 1       | 2       | -1      | -1      |
| 11  | 0       | 1       | 1       | 1       | 0       | -1      |
| 12  | 0       | 1       | 1       | 1       | 0       | -1      |

Table 4.1: Example of consolidating several trading signals of pairs onto individual securities

Gatev et al. (2006) conducted two return calculations: (1) fully invested and (2) committed capital. The first did not consider the leverage problem, a gap to reality that we will try to fill. It will be shown in Revisiting Return Calculations that this has induced slightly overstated performance. The committed capital did have an oscillating leverage, but no active decisions were made to reach a target leverage.

Fully invested varied the equity base with the number of signals opened, which is fairly unrealistic; however, it ensured that the leverage was kept very close to 2. Choosing an appropriate equity base can replicate their portfolio with an average leverage of 2. The committed capital commits 120 dollars of equity, one dollar to each pair. This has a varying leverage ratio between 0 and 2, and from the coming results it will be clear that the average leverage would have been close to 1.
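The equity base arithmetic in this subsection can be checked with a one-line formula. The sketch assumes one dollar on each leg of every active pair and ignores the drift in leverage due to price changes; the function name `gross_leverage` is hypothetical.

```python
def gross_leverage(active_pairs, equity, dollars_per_leg=1.0):
    """Gross leverage = (long value + short value) / equity, no price drift."""
    return 2 * active_pairs * dollars_per_leg / equity

committed_min = gross_leverage(0, 120)     # no pairs open
committed_avg = gross_leverage(60, 120)    # 60 pairs open on average
committed_max = gross_leverage(120, 120)   # all 120 pairs open
small_base_avg = gross_leverage(60, 60)    # 60 dollars of equity, 60 pairs
small_base_max = gross_leverage(120, 60)   # 60 dollars of equity, 120 pairs
```

The values reproduce the ranges quoted in the text: 0 to 2 with an average of 1 for 120 dollars of committed capital, and an average of 2 with a peak of 4 for 60 dollars of equity.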

In order to reach realistic returns, financing costs, initial margin, and maintenance margins are considered. A prime brokerage set-up is constructed below, describing firstly (1) The Margin Account and Leverage, secondly (2) Funding Costs, and thirdly (3) Transaction Costs.


4.3.1 The Margin Account and Leverage

Before investigating the margin account, a basic portfolio rule is determined of equally weighting each new signal to equity, such that a new signal gets the weight $\omega^{-1} = 1/240$ of the existing equity. This position is scaled by the scalar L, which will be called the instantaneous leverage. Instantaneous leverage should be thought of as the leverage of the single trade at investing, which is identical to the gross leverage when all 120 pair signals are activated at once. More generally, if there are n weights, each weight is greater than 0, and they sum to 1, then if all weights are enacted at once L is the gross leverage at this point in time.

The margin account is cash based, such that any security is purchased or sold on margin. If the investor holds 100 dollars in equity and decides to purchase 100 dollars of IBM, then the broker will execute and effectively lend the investor 100 dollars, on which the investor must pay interest. However, the investor receives interest on his cash of 100 dollars, netting out the interest paid when there is no financing spread. It is assumed that there is a financing spread, which means that there is a negative drift even when there is no leverage. This induces a slightly conservative estimation of financing costs.

The major components to consider are equity value, market value, initial margin and maintenance margin. The equity value is the initial equity plus all P&L. The market value can be distributed into two buckets, the value of all long trades and the value of all short trades. Initial margin must be posted when entering either a long or a short trade, and is INpct of the market value of the trades at initiation. The initial margin is wired to the broker at initiation and is refunded at termination. Note that there is no distinction between the long and the short side in the requirement of posting margin⁹. When shorting, the short sale proceeds are kept by the broker, as this is a loan in the first place; the proceeds do not count towards margin. Only free cash (free equity¹⁰) can be wired as initial margin. If the investor has no free cash to post as initial margin he must forgo investing. Specifically, the decision rule related to initial margin is:

• If new trades using the leverage L require the initial margin to increase above the equity value, then all the new trades are foregone.

• If new trading opportunities arise and the free equity is negative, then these trades are skipped.
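The two decision rules above can be expressed as a single check. This is a sketch under stated assumptions: the function name `accept_new_trades` and its arguments are hypothetical, the new trades are treated as one batch, and `new_trade_value` is the gross market value of the batch after scaling by L.

```python
def accept_new_trades(equity, posted_initial_margin, new_trade_value, in_pct):
    """Decide whether a batch of new trades can be opened.

    new_trade_value: gross market value of the new trades (already levered).
    in_pct: initial margin requirement as a fraction of market value.
    """
    free_equity = equity - posted_initial_margin
    if free_equity < 0:
        return False  # no free cash to post: the trades are skipped
    required = in_pct * new_trade_value
    # All new trades are foregone if total initial margin would exceed equity.
    return posted_initial_margin + required <= equity

ok = accept_new_trades(100.0, 90.0, 20.0, 0.5)        # needs 10, has 10 free
too_big = accept_new_trades(100.0, 90.0, 30.0, 0.5)   # needs 15, has 10 free
no_cash = accept_new_trades(100.0, 110.0, 1.0, 0.5)   # free equity negative
```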

9 In practice this will depend on the arbitrageur's set-up. Different prime brokers may have different terms. It is not clear why either the short or the long positions should have more favourable margin terms compared to the other. Thus the simple stance is taken that the same terms hold for both the long and the short positions.

10 Free equity is the difference between equity and initial margin, i.e. cash not yet posted as margin.

The maintenance margin is MAINpct of the current market value of both long and short trades. While the initial margin serves to restrict the initial leverage, the maintenance margin serves to limit the drift in leverage. The maintenance margin is a calculated number and is not required to be posted on a separate margin account. It is a control number only, used to evaluate whether or not equity can comply with the required maintenance margin. If the maintenance margin is greater than equity, insufficient initial margin is posted at that moment, and positions must be reduced. In order not to have to consider what the broker might do in case of a maintenance call, it is decided that the investor must act in advance of a maintenance call:

• When the equity value falls below MAINpct + b of the market value, while still being strictly greater than MAINpct of the market value, trades are terminated. The termination starts with the least profitable trade and continues with the next least profitable trade until equity is again at least MAINpct + b of the remaining market value. This is an integer decision on the number of trades that must be terminated, and the whole value of these trades is terminated. A trade refers to a single asset position, not a pair.
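One way to read the maintenance rule is sketched below. It is an interpretation, not the thesis implementation: the function name `maintenance_action` and the position format are hypothetical, equity is assumed unchanged by the terminations (costs ignored), and only the buffer check is modelled, not the precondition that equity still exceeds MAINpct of the market value.

```python
def maintenance_action(equity, positions, main_pct, buffer):
    """Terminate least profitable positions until the buffer is restored.

    positions: list of dicts with 'value' (absolute market value) and 'pnl'.
    Returns (kept, terminated). Whole positions are terminated, worst first.
    """
    kept = sorted(positions, key=lambda p: p["pnl"])  # least profitable first
    terminated = []
    while kept and equity < (main_pct + buffer) * sum(p["value"] for p in kept):
        terminated.append(kept.pop(0))
    return kept, terminated

book = [{"value": 100.0, "pnl": -5.0}, {"value": 100.0, "pnl": 3.0}]
# 25% maintenance margin with a 2% buffer: threshold is 27% of market value.
kept, closed = maintenance_action(50.0, book, main_pct=0.25, buffer=0.02)
kept_ok, closed_ok = maintenance_action(60.0, book, main_pct=0.25, buffer=0.02)
```

In the first call equity of 50 is below 27% of 200, so the losing position is closed and the remaining requirement of 27 is met; in the second call nothing is terminated.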

In general, when margin actions are conducted the portfolio gets unbalanced, and volatility will increase. Most often maintenance actions will induce short side trades to be terminated, increasing the net leverage. In the case of momentum in the terminated trades the decision rule can improve returns, as it serves as a (late) stop loss.

The P&L before costs follows:

$$PL_t = \sum_{i \in a} r_{it} \cdot \omega_{i,t} \qquad (4.6)$$

where $a$ is the subset of assets ever traded. $\omega_{it}$ is the dollar amount invested in asset $i$ at $t$, which may be zero or even negative for short trades. $\omega_{it}$ is a complex function of past returns and the signals generated:

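$$\omega_{it} = \underbrace{\sum_{j \in \mathrm{expand}(sig_i)}}_{a}\ \underbrace{\frac{L\,E\big(\mathrm{first}(k \cdot sig_j \neq 0) - 1\big)}{w}}_{b} \cdot \underbrace{\left(\exp\left[\sum_{k=0}^{t-1} \log(sig_{j,k}\, r_{i,k} + 1)\right] + 1_{\mathrm{first}(k \cdot sig_j \neq 0) = t}\right)}_{c} \qquad (4.7)$$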

The (dollar) weight is 0 at any point until a signal is received. When the signal is received, $L/\omega$ of the equity is placed in the security. This weight will change over the consecutive days due to price changes, new signals adding on to the existing weight, or closing signals reducing the weight. In detail, the explanation of equation (4.7) follows¹¹:

• a: $sig_i$ is a consolidated signal on security $i$. $\mathrm{expand}(sig_i)$ gives the underlying unique trading signals generating the consolidated series; these are indexed by $j$. Each underlying signal is a vector of zeros followed by a sequence of either 1 or -1, then followed by more zeros until $T$. $sig_j$ is one of the underlying signals; these are constructed such that $sig_j \cdot r_i$ gives the return vector of trade $j$, meaning that the lead of the series is the actual determinant of when to initiate the trade. For instance, if the investor decides to trade at 1985-01-03, then the underlying signal returns 0 at this date and 1 (or -1) at the next day, as a return is first made a day after initiation.

• b: For each of the underlying signals the initial dollar amount is determined when the trade is initiated. $E(t)$ is equity at time point $t$. $\mathrm{first}(k \cdot sig_j \neq 0) - 1$ indicates the time point when the signal was received; at this point $1/w$ ($w = 240$) of the equity is placed in gross value of this security. The initial placement is levered by $L$.

• c: The first part is a sum of zeros until the trade is activated. At activation the cumulative product is calculated; however, this does not start at 1. The second part is a dummy returning 1 if the trade was initiated at $t-1$, thus just returning the initial equity base in $t$.

11 For an example of usage of the equation refer to section 9 Return Calculations in the Appendix.

Given the P&L the equity value is simply the initial equity plus P&L:

$$E_t = E_0 + \sum_{i=1}^{t} PL_i \qquad (4.8)$$

The market value of all active trades is:

$$MV_t = \sum_{i \in a} (1 + r_{i,t}) \cdot |\omega_{it}| \qquad (4.9)$$

The maintenance margin is:

$$MAIN_t = MAIN_{pct} \cdot MV_t \qquad (4.10)$$

It must hold that:

$$E_t > MAIN_t \quad \forall t \qquad (4.11)$$

The initial margin wired at any time point is:

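$$INT_t = \sum_{i \in a} |INT_{i,t}| \qquad (4.12)$$

$$INT_{it} = \sum_{j \in \mathrm{expand}(sig_i)} INT_{pct}\, \frac{L\,E\big(\mathrm{first}(k \cdot sig_j \neq 0) - 1\big)}{w} \cdot \left(sig_{j,t} + 1_{sig_{j,t+1} > 0} - 1_{sig_{j,t+1} < 0}\right) \qquad (4.13)$$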

The initial margin is simply the required percentage margin times the initial investments on all active trades. In order for a new trade to be made, the difference $E_t - INT_t$, the free equity, has to support the required initial margin on the new trading volume.
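Equations (4.8) to (4.11) can be evaluated for a single day as in the sketch below. The function name `margin_snapshot` and the dictionary inputs are assumptions made for illustration; the dollar weights are taken as given rather than computed from equation (4.7).

```python
def margin_snapshot(initial_equity, pnl_history, weights, returns, main_pct):
    """Evaluate (4.8)-(4.11) for one day.

    pnl_history: daily P&L values PL_1..PL_t.
    weights: {asset: dollar weight omega_it}, negative for short trades.
    returns: {asset: return r_it}.
    """
    equity = initial_equity + sum(pnl_history)                       # (4.8)
    market_value = sum((1 + returns[a]) * abs(w)                     # (4.9)
                       for a, w in weights.items())
    maintenance = main_pct * market_value                            # (4.10)
    return equity, market_value, maintenance, equity > maintenance   # (4.11)

equity, mv, maintenance, compliant = margin_snapshot(
    initial_equity=100.0,
    pnl_history=[1.0, -0.5],
    weights={"A": 50.0, "B": -50.0},
    returns={"A": 0.02, "B": -0.01},
    main_pct=0.25,
)
```

Note that the market value uses the absolute weight, so long and short trades both add to the maintenance requirement.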


4.3.2 Funding Costs

A margin account has now been constructed, from which its short sale proceeds, cash loan on long positions, wired collateral, and free equity can be deduced. These four capital measures are all subject to interest; specifically they yield:

1. Cash loan on long positions yielding $-r^{pb}_t$

2. Short sale proceeds yielding $r^{rebate}_t$

3. Wired initial margin yielding $r^{rebate}_t$

4. Free equity placed in money market products earning $r^{f}_t$

Gatev et al. (2006) did not consider a financing spread, and claimed that their returns were excess returns. The reason is that, at L = 2 and fully invested, for each pair they borrow 100 dollars on the long leg, on which they pay $r^{f}_t$ (proxy for $r^{pb}_t$), 100 dollars are received in short sale proceeds paying $r^{f}_t$ (proxy for $r^{rebate}_t$), and 100 dollars are kept in margin for both positions yielding $r^{f}_t$ (equity placed in risk free bonds). The rates on the long and the short positions offset each other; what remains is the income on the margin. As they did not add the risk free rate to their returns, they considered excess returns, which is the relevant economic measure of performance. However, with no financing spread there is no incentive for the broker to provide a financing service, so in this paper a financing spread is assumed.

The short proceeds are calculated as:

$$SP_t = \frac{INT^{short}_t}{INT_{pct}} = -\sum_{i \in a} \frac{\min(INT_{it}, 0)}{INT_{pct}} \qquad (4.14)$$

And the cash loans on the long side is:

$$CL_t = \frac{INT^{long}_t}{IN_{pct}} = \sum_{i \in a} \frac{\max(INT_{it}, 0)}{INT_{pct}} \qquad (4.15)$$

The funding costs are¹²:

$$FC_t = SP_{t-1}\, r^{rebate}_t - CL_{t-1}\, r^{pb}_t + INT_{t-1}\, r^{rebate}_t + (E_{t-1} - INT_{t-1})\, r^{rf}_t \qquad (4.16)$$

Subtracting $E_{t-1}\, r^{rf}_t$ from the funding costs yields the funding costs in excess terms:

$$FC_t = \frac{INT^{short}_{t-1}}{IN_{pct}}\, r^{rebate}_t - \frac{INT^{long}_{t-1}}{IN_{pct}}\, r^{pb}_t - INT_{t-1}\,(r^{rf}_t - r^{rebate}_t) \qquad (4.17)$$

When no maintenance actions have happened, the dollar neutral trade will have an equal amount of cash loan on the long side as of short sale proceeds on the short side, leading to the simplification¹³:

$$FC_t = \frac{1}{2}\frac{INT_{t-1}}{IN_{pct}}\,(r^{rebate}_t - r^{pb}_t) - INT_{t-1}\,(r^{rf}_t - r^{rebate}_t) \qquad (4.18)$$

12 The interest rates are daily rates. Specifically, the data gathered on the fed funds rate is a yield $r^{f}_t$ expressed as an annual yield. This rate is transformed to a daily yield by $r^{f}_t = (1 + r^{f}_t)^{1/260} - 1$.

13 The more general formula for excess funding costs when there is a long-short differential is:

$$FC_t = -SP_t\, rebate - CL_t\, pb + (SP_t - CL_t)\, r^{rf}_t - INT_{t-1}\, rebate$$

The investor who is net short gets a reduction in funding costs whenever $rebate < r^{f}_t$.


For simplicity it is assumed that the prime broker follows the market interest rates, such that $r^{pb}_t = r^{rf}_t + pb$ and $r^{rebate}_t = r^{rf}_t - rebate$, where $pb$ and $rebate$ are positive constants. According to Pedersen (2015), realistic estimates for $pb$ and $rebate$ are 30 and 25 bps respectively. The resulting financing costs are then a constant fraction of the invested value:

$$FC_t = -\frac{1}{2}\frac{INT_{t-1}}{IN_{pct}}\,(pb + rebate) - INT_{t-1}\, rebate$$

It is now evident that the funding costs correspond to the spread between the long and the short side, plus the loss of rebate when wiring margin to the prime broker instead of placing it in risk free bonds. If one has more short than long positions the funding costs may become a funding income. In high turnover strategies the total funding ($INT_t / IN_{pct}$) is very close to the market value of the trades at any time. So if $GL_t$ is gross leverage then $GL_t E_t \approx INT_t / IN_{pct}$, yielding that funding costs are:

$$FC_t \approx -GL_{t-1} E_{t-1} \left(\frac{1}{2}\, pb + \left[\frac{1}{2} + IN_{pct}\right] rebate\right) \qquad (4.19)$$

where one should read $fr = E[GL_t]\left(\frac{1}{2}\, pb + \left[\frac{1}{2} + IN_{pct}\right] rebate\right)$ as the funding rate. For instance, given an average leverage of 2 and using pb = 30 and rebate = 25 bps, the annual funding costs should be close to 70 bps. When backtesting the strategy and using the exact formula the funding costs were 76 bps. It is useful to know that funding costs are linearly increasing in gross leverage, as returns and transaction costs can also be described as functions of gross leverage, leading to optimality conditions for maximizing return (or Sharpe ratio) by gross leverage.
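The 70 bps figure can be reproduced directly from the funding rate formula. The sketch assumes an initial margin of 30%, the value applied in the backtest of section 4.4; the function names `funding_rate_bps` and `daily_rate` are hypothetical, and `daily_rate` implements the conversion from an annual to a daily yield over 260 days given in the footnote on interest rates.

```python
def funding_rate_bps(avg_gross_leverage, pb_bps, rebate_bps, in_pct):
    """fr = E[GL] * (pb / 2 + (1/2 + INpct) * rebate), in basis points."""
    return avg_gross_leverage * (0.5 * pb_bps + (0.5 + in_pct) * rebate_bps)

def daily_rate(annual_yield, days=260):
    """Convert an annual yield to a daily yield over `days` trading days."""
    return (1 + annual_yield) ** (1 / days) - 1

# Average gross leverage 2, pb = 30 bps, rebate = 25 bps, INpct = 30%.
fr = funding_rate_bps(2.0, pb_bps=30, rebate_bps=25, in_pct=0.30)
```

Here `fr` evaluates to 2 · (15 + 0.8 · 25) = 70 bps per year, in line with the approximation in the text.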

4.3.3 Transaction Costs

The final brick in the return calculations is the transaction costs, these may be classifiedinto:

1. Payment for execution.

2. Paying the bid-ask spread

3. Slippage

The first is the payment to the marketplace for facilitating the trade. The second is the cost of immediate execution; in the backtest case all trades must be executed with certainty, thus the spread must be paid. Thirdly, slippage is the issue of market impact when trading; this is both the cost of cutting deep into the other side of the order book and the cost of the order book moving due to the presence of the arbitrageur.

Gatev et al. (2006) implemented transaction costs by lagging their trades by one observation. When entering into a pair trade in the case where the observed price on the loser is a bid price and the winner is represented by an ask price, no bid-ask spread is paid at initiation (divergence). If the prices on the next day are equally likely to be bid as ask prices, then lagging the trade by one observation will on average make the trader pay half the bid-ask spread. At convergence the other half of the spread will be paid through delayed trading.

The logic of Gatev et al. (2006) clearly catches the bid-ask bounce, but it does not filter out the idiosyncratic and drifting changes to prices over the lagged day. If the model is right on average, the spread will be likely to start reversing at the entry point, thus lagging will also have the cost of foregone profit. If entry points are too narrow and the spread is likely to diverge more before reverting, then lagging the trade should improve the model and induce negative transaction costs. As Gatev et al. (2006) estimated a transaction cost above their benchmark of 37 bps, they claimed the method to be too conservative. We try to estimate transaction costs by the same procedure, but also apply a fixed cost to each trade, both when opening and when closing. The fixed cost is a fraction of the value turned over. This average cost would of course be larger for big funds due to slippage. Pedersen (2015) reports that average transaction costs for large institutional traders trading NYSE stocks are 8.8 bps per dollar traded, with 4 bps when trading small lots and an average cost of 27 bps when trading lots greater than 1% of daily volume. Trading more than 10% of the daily volume would make transaction costs increase rapidly and is not recommended.

Using a fixed transaction cost the total costs can be estimated as:

$$TC_t = -TC_{pct} \cdot \sum_{i \in a} |\Delta MV_{it} - PL_{it}| \qquad (4.20)$$

The market value changes due to turnover or profit; by subtracting the profit the turnover is identified. Using the cost above, the equity now follows:

$$E_t = E_0 + \sum_{i=0}^{t} (PL_i + FC_i + TC_i) \qquad (4.21)$$
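Equation (4.20) separates turnover from profit by subtracting the P&L from the change in market value. The sketch below illustrates this for one day; the function name `transaction_cost` and the per-asset dictionaries are assumptions, and assets missing from a dictionary are treated as having zero value or zero P&L.

```python
def transaction_cost(tc_pct, mv_prev, mv_now, pnl):
    """TC_t = -TCpct * sum_i |dMV_it - PL_it| over all assets.

    mv_prev, mv_now: {asset: market value} yesterday and today.
    pnl: {asset: P&L earned today}.
    """
    assets = set(mv_prev) | set(mv_now)
    turnover = sum(
        abs(mv_now.get(a, 0.0) - mv_prev.get(a, 0.0) - pnl.get(a, 0.0))
        for a in assets
    )
    return -tc_pct * turnover

# Asset A only moved with its profit (no turnover); asset B was newly bought.
cost = transaction_cost(
    tc_pct=0.0010,  # 10 bps per dollar turned over
    mv_prev={"A": 100.0},
    mv_now={"A": 102.0, "B": 50.0},
    pnl={"A": 2.0},
)
```

Only the 50 dollars of new exposure in B counts as turnover, giving a cost of 5 cents at 10 bps.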

A final note on transaction costs is that short selling can have further embedded costs not yet considered. There are two sources: (1) redemption of shorted issues and (2) short selling lending fees on specials. Gatev et al. (2006) showed their version to be robust to both of these, so they will not be considered any further.

Summary of return calculations: Each signal gets an equal weight of $L/\omega$ of equity, and when a security is purchased or shorted, INpct of the traded market value is wired to the prime broker as collateral. Once the position is terminated the posted margin is refunded. If more initial margin is required on new trades than the present free equity, the new trades will be foregone and the portfolio gets unbalanced. If the market moves against the investor to such an extent that equity is less than MAINpct + b of the market value, the investor will close sufficiently many of the worst trades to recover margin balance. Whenever a trade is made, TCpct of the traded lot is paid in transaction costs. The investor pays pb on all long positions, and loses rebate on short positions and on the margin wired. Using these financing costs makes the returns represent excess returns.

Figure 4.2: Cumulative returns of the distance approach. Fund average gross leverage is set to two. The black line is the unrestricted strategy not corrected for transaction costs. The red line postpones trading by one day. The blue line is comparable to the black, but subtracts 10 bps for each dollar turned over (bought or sold). The green line corresponds to the blue but further subtracts financing costs (rebate = 25, pb = 30 bps). The purple line is the cumulative return of holding the S&P 500 index in excess of the risk free rate. A logarithmic scale has been applied. The lower panel depicts daily returns on the green line.

Lastly, a remark on the constant instantaneous leverage L must be made. If the average trading opportunities are non-constant, e.g. the opportunities consistently decrease, then under a constant L leverage will be high in the early years and low in the recent years. In this case returns are not comparable across time, and perhaps many maintenance actions and margin actions will occur in the early days. Under these circumstances L should be an inverse function of the trading opportunities.

Another problematic opportunity pattern is that of regime shifting opportunities. If normally only 10% of the trades strike, then a high L must be chosen to get a leverage of 2. But if the opportunities can shift temporarily to a regime of all trades striking at once, the high L will not ensure balance in the temporary regime.

Despite the pitfalls of a constant instantaneous leverage, it has appeared to fit well for this strategy. The opportunities oscillate nicely around 60 pair trades active a day, with no trend or vividly shifting volatility.


| | No costs | Postpone a day | 10 bps | 10 bps and FC |
|---|---|---|---|---|
| Arithmetic daily return (bps) | 2.9628 | 2.1907 | 1.9822 | 1.6889 |
| Geometric ann. return | 0.0779 | 0.0566 | 0.0508 | 0.0428 |
| Daily std. dev. | 0.0039 | 0.0039 | 0.0039 | 0.0039 |
| Ann. std. dev. | 0.0632 | 0.0621 | 0.0633 | 0.0634 |
| Sharpe(0) | 1.8817 | 1.5581 | 1.432 | 1.2998 |
| Sharpe(rf) | 1.2336 | 0.9112 | 0.8016 | 0.6748 |

Table 4.2: Performance of the distance approach using various cost applications, all presented in figure 4.2 as the black, red, blue, and green lines respectively. For variable explanation refer to appendix 10 Summary Statistic Tables.

4.4 Performance of the Distance Approach

At this point a pairs trading strategy has been constructed, as well as a portfolio system for calculating returns. In this section the results of implementing the distance approach are presented. It will be observed that the distance approach had a very attractive return profile up to 2005, but since then has been generating negative returns except during the financial crisis and recovery. That performance was strong in this period was reported by Baronyan et al. (2010), and it complies with the notion that excessive noise increases performance (Engelberg et al. (2009) and Jacobs and Weber (2015)).

Figure 4.2 plots the cumulative returns of applying the distance approach, relative to the performance of the S&P 500 index. A maintenance margin of 25% with a buffer of 2%, and an initial margin of 30%, have been applied. The applied instantaneous leverage is L = 3.97, ensuring a gross leverage close to 2. Lagging the trading decision by one day vastly reduces the returns, an effect that is closely matched by applying a transaction cost of 10 bps for every dollar turned over. The returns are summarized in table 4.2 for the various cost applications. Using the Gatev et al. (2006) approach of postponing trading yielded an annual return prior to 2005 of around 8.05% p.a., which is less than the 11% reported by Gatev et al. (2006). However, they used data from 1962, and returns were higher prior to 1985 than after, so the results might not differ at all. According to the results of this study it is evident that the strategy has stopped working since 2005, and net of costs it is likely to be losing in the years to come.

Letting the returns adjusted for the 10 bps transaction cost and the financing cost be the relevant return statistic, detailed summary results are presented in table 4.3. The strategy reports modest excess returns of 4.28% p.a. over the entire period. On average the risk free rate was 3.84%, which should be added to obtain the average total return. The strategy is uncorrelated with the market, with a β close to 0 when using the market excess returns. On average 58% of the months have been winning months, and the fraction was much higher prior to 2005. The returns are in drawdown at the end of the period and have been so for 10 years. Over the entire period 550 assets were traded. On average the investor using the top 20 decision rule would hold 50 assets on a single day, with an average initial weight of 1.99% of invested


Return statistics
Arithmetic daily return (bps)   1.6889
Geometric ann. return           0.0428
Daily std. dev.                 0.0039
Ann. std. dev.                  0.0634
Sharpe(0)                       1.2998
Sharpe(rf)                      0.6748
Correlation to SPX              0.0725
β                               0.0249
Ann. α                          0.0433
t-α                             3.7049
R2                              0.0052
Std. dev. on error              0.0632
Information Ratio               0.6849
Max loss                        0.0312
Max drawdown                    0.2724
Max drawdown duration           2833
Days up                         0.5127
Months up (normality)           0.5762
Skewness                        2.7921
Excess kurtosis                 30.8501
Start                           1985-01-02
End                             2015-07-29

(a) Return statistics

Trading statistics

Weights
Assets traded                      550
Average assets held daily          50.0586
Average weight                     0.0199
Q75 weights                        0.0239
Q95 weights                        0.0597

Returns
Geom. ann. gross return long       0.1705
Geom. ann. gross return short      -0.1002
Geom. ann. transaction costs       -0.0254
Geom. ann. financing costs         -0.0076
Pct. turnover of m.val.            0.0514

Leverage
Margin violations                  0
No. of maintenance actions         0
No. of initial margin actions      0
Avg. inst. leverage                3.97
Avg. investment ratio              0.5933
Maximal invested                   0.8432
Avg. gross leverage                2.0104
Max gross leverage                 2.8411
Avg. net leverage                  -0.0336
Max net leverage                   0.0172
Min net leverage                   -0.1503
Avg. maintenance to equity         0.5026
Max maintenance to equity          0.7103

(b) Trading statistics

Table 4.3: Distance approach summary statistics, excess returns. Corresponding to the green line in figure 4.2. For variable explanations refer to appendix 10 Summary Statistic Tables.

market value. The investor applied an average gross leverage of 2.01 and a slightly negative net leverage. Both gross and net leverage oscillate around their means. The investor has on average invested 59% of the maximal investment limit, meaning that the wired initial margin corresponds to 59% of the equity value. At no point in time was the investor fully invested; the lowest slack was 16% of equity. As the investor was never fully invested, no initial margin actions were taken. Nor did the maintenance margin ever reach 98% of equity, so no positions were closed out. The portfolio was turned over 13.4 times a year, inducing substantial transaction costs of 2.5% of equity per year. The long leg of the trade consistently won, averaging 17% p.a., whilst the short leg lost 10% p.a. prior to costs.

Further summary statistics on the underlying pair trades are given in table 4.4, which is the essential diagnostic of the strategy. 2,466 pairs were traded, such that each asset must have appeared on average in 4.5 pairs over the entire period. 12,290 pair trades


Pair statistics
Pct. converging        0.5195
No. of pairs           2466
No. of trades          12290
Avg. trades per. day   1.5452

                   All        Converging   Failing
Returns
N                  12290      6385         5905
Mean               0.0112     0.067        -0.0491
Mean long          0.0325     0.0494       0.0142
Mean short         -0.0213    0.0176       -0.0633
Std. dev.          0.0869     0.0404       0.0833
Median             0.038      0.0587       -0.0298
Skewness           -1.01      13.1782      -2.294
Excess kurtosis    19.4497    377.1379     14.2624
Pct. > 0           0.6542     1            0.2803
Q5                 -0.1494    0.0344       -0.2046
Q95                0.1048     0.1268       0.0413

Sigma
Mean               0.026      0.026        0.026
Std. dev           0.0068     0.0067       0.007
Median             0.0252     0.0253       0.025
Q5                 0.0162     0.0164       0.016
Q95                0.0378     0.0377       0.0379

Duration
Mean               74.7492    45.3997      106.4845
Std. dev           59.1791    38.7378      61.0587
Median             56         33           114
Q5                 7          6            9
Q95                179        130          182

Regression
Left hand side variable: return

                   Estimate   Std. Error   t-value
Intercept          -0.0386    0.0026       -14.7652
converging         0.0989     0.0014       72.0273
rank               0          0.0001       0.0504
sd                 -0.8537    0.0724       -11.7912
duration           -0.0003    0            -24.6901
sigma              1.1732     0.0953       12.3114
entry point        -0.0004    0.0002       -1.6397
R2 adj.            0.4821
F-statistic        1895 on 6 and 12198 DF

Table 4.4: Distance approach pair summary statistics, considering all the individual constituent pair trades. Some trades might not have been executed due to consolidation (trades cancel out). Returns are gross of costs.


were signalled for execution (and closed), but due to consolidation some of these might have cancelled out. 52% of all trades converged and yielded an average profit of 6.7%, whilst the non-converging (failing) trades lost 4.91% on average, both prior to costs. The trigger standard deviation sigma was identically distributed for converging and failing trades, so sigma did not serve to identify which trades would fail. The average trade lasted 45 days when converging and 107 days when it never converged. Some of the non-converging trades were in fact not failing; they just did not get the time to converge. This is reflected in the 5% quantile of duration on failing trades of 9 days, and in positive returns on a large fraction of the failing trades. This is also reported by Gatev et al. (2006). Potentially, closing trades after e.g. 80 days, and not opening trades in the sixth month, could have increased returns.

Lastly, volume metrics can be considered. The volume obtained from Compustat covers the number of shares traded on all national stock exchanges and OTC on NASDAQ. The volume is corrected for splits and dividends, such that the corrected price times the corrected volume yields total turnover, identical to the product of the uncorrected price and volume. To get the actual turnover on a day, the volume weighted price times volume should be used. However, the volume weighted price is not observed. Assuming that trading has been either normally or uniformly distributed over a given day's price range, the mid price times volume yields the total turnover. The mid price is the midpoint between the high and the low. Using this approach, turnover is calculated for each security. On the total turnover matrix the 20 day rolling mean and standard deviation of the logarithmic turnover are calculated. These two metrics allow for constructing a matrix of the number of standard deviations that turnover is away from its mean for any security on each day. The two turnover datasets, actual level data and normalized turnover data, may be used to indicate potential fund size and whether trading is taking place in liquid or illiquid markets.
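The turnover construction described above can be sketched for a single security as follows. This is a minimal illustration and not the implementation used in the thesis: the function name, the choice to include the current day in the 20 day window, and the use of the sample standard deviation are assumptions.

```python
import numpy as np

def standardized_log_turnover(high, low, volume, window=20):
    """Return (turnover, z) for one security.

    turnover: mid price times volume, a proxy for daily dollar turnover.
    z: log turnover measured in rolling standard deviations from its
       rolling mean (NaN until a full window is available).
    """
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    volume = np.asarray(volume, dtype=float)

    mid = 0.5 * (high + low)        # midpoint of the daily price range
    turnover = mid * volume
    log_to = np.log(turnover)

    z = np.full(log_to.shape, np.nan)
    for t in range(window - 1, len(log_to)):
        w = log_to[t - window + 1 : t + 1]   # assumption: window includes day t
        sd = w.std(ddof=1)
        if sd > 0:
            z[t] = (log_to[t] - w.mean()) / sd
    return turnover, z
```

Applying the function column by column to the high, low, and volume matrices would give the level and normalized turnover datasets referred to in the text.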

The 12,290 pair trades conducted are combined with the turnover of each of the securities at opening and closing, in levels and as standardized differences from the mean. On average both opening and closing trades take place on significantly more volume than the 20 day average. In other words, trading on average takes place in liquid periods. Trades closing prior to 2005 opened on a volume 0.1 standard deviations above their average, and after 2005 this number had increased to 0.14. This difference in means is significant at the 1.5% level. If relatively high volume is interpreted as increased competition, trades are opened at more competitive prices after 2005. Do and Faff (2008) showed that more trades are permanently diverging in recent years. The same can be reported here: 55% of the trades converged prior to 2005, and after 2005 only 47% converged. The fact that more trades diverge at competitive prices can mean that competition between pairs traders and fundamental analysts has increased. Pairs trading can create a false equilibrium: fundamentalists who observe pairs trading activity can use this to fuel their directional trades in a spread that has been forced into a false equilibrium. In other words we find evidence


for the increased competition hypothesis.

The normalized entry volume does not indicate increased spread at entry nor lower returns, i.e. volume at the entry signal is meaningless in predicting returns. The fact that trading primarily takes place in liquid markets supports fund size and allows for normal transaction costs.

The minimum turnover level at entry in two paired securities indicates the potential trading lot. Entering a position in securities A and B, whose turnovers correspond to USD 40m and 50m respectively, allows for trading USD 40m in the pair when consuming the entire market. Assuming that we could trade 2% of the volume without moving the market, and using the fact that each asset corresponds to 2% of the portfolio, this pair indicates that the potential market value of all positions could amount to USD 80m (corresponding to equity of USD 40m). In other words, the single security turnover indicates the potential fund size of a fund taking on average 2% of the market when trading. The distribution of turnover is positively skewed. Using the modulus of the distribution of turnover gives an estimate of potential fund size. The modulus fund size oscillated around its mean in the period 1984-1996, then increased until 2009, and has since decreased slightly. This simply reflects that market turnover has increased over the last 30 years. In the period 1984-1996 the modulus volume at entry was USD 5.7m, whereas in the period after 2009 it was USD 66m. These numbers need to be corrected for inflation to be comparable. Nevertheless, this indicates that implementing this exact version of the strategy had a maximal capacity of USD 5.7m and 66m in the years 1984-1996 and 2009-2016 respectively. Clearly, low capacity is a problem, as it reduces the total dollar returns available to cover a set of fixed costs. However, if one instead traded the top 40 pairs rather than the top 20 pairs, capacity is likely to almost double. Implementing the strategy on more stocks and more markets will likewise increase capacity.
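The capacity arithmetic in the example above can be retraced with a small sketch. The function and its parameter names are hypothetical, and it assumes that the 2% asset weight is measured against equity and that gross leverage is 2, which is the reading under which the USD 40m equity and USD 80m market value of the text are reproduced.

```python
def potential_fund_size(turnover_a, turnover_b, participation=0.02,
                        asset_weight=0.02, gross_leverage=2.0):
    """Return (equity, market_value) implied by the less liquid leg of a pair."""
    # Dollars that can be traded in each leg without moving the market.
    tradable = participation * min(turnover_a, turnover_b)
    # Assumption: each asset is asset_weight of equity, so equity scales up the lot.
    equity = tradable / asset_weight
    # Assumption: market value of all positions is gross leverage times equity.
    market_value = gross_leverage * equity
    return equity, market_value

equity, market_value = potential_fund_size(40e6, 50e6)
print(round(equity), round(market_value))
```

With turnovers of USD 40m and 50m the tradable lot is USD 0.8m per leg, which corresponds to equity of about USD 40m and a market value of positions of about USD 80m.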

4.4.1 Revisiting Return Calculations

The suggested return calculations are complex compared with simpler methods, such as taking the average daily return on any open pair, i.e. summing all returns and dividing by the number of open pairs. This simple average procedure fixes the gross leverage at exactly two. However, it is clear that a perfectly balanced portfolio will in practice induce transaction costs too high for it to be implementable. Nevertheless, the balanced portfolio may serve as a good proxy for actual returns.

Another simple procedure is the fully invested procedure suggested by Gatev et al. (2006). This has a leverage very close to two at any point, and weights that increase as asset prices increase. This procedure does not reflect a fully implementable strategy either, yet it may proxy real returns.

The two simpler return calculations have been conducted both before and after financing costs, and the average annual premium of the simpler strategies is reported in table 4.5. The equal weighted scheme has consistently overstated the returns by more than 50 bps


                  No costs   Costs
Equal weighted    51.40      56.29
Fully invested    10.79      15.65

Table 4.5: Geometric average biases of the simple return calculations relative to the method suggested in Return Calculations. The bias is given as the annual premium, in bps, of the simpler methods.

per annum. When financing costs are considered the bias gets worse: leverage is inversely related to portfolio performance. In good periods leverage is reduced and financing costs fall, whereas in bad periods leverage increases and so do financing costs. The total effect of the leverage problem is increased volatility of the strategy, which in turn reduces the expected return. To see this, consider the expectation of the logarithm of a geometric Brownian motion:

$$\mathrm{E}[\log(S_t)] = \log(S_0) + \left(\mu - \frac{\sigma^2}{2}\right)t$$

As volatility increases on a constant net return, the long run return decreases.

Not only does the volatility arising through financing contribute to decreased returns; the leverage problem itself directly reduces returns. In high return regimes leverage is decreased, leading to lower future returns. In losing regimes leverage increases, making future losses strike even harder. These two effects move the average return in the same direction, namely downwards. Although the simple methods are biased, the magnitude is not horribly large: an uncertainty of 50 bps should definitely not alter a decision on pursuing the strategy, given the much greater uncertainties about the costs at which it can be implemented and the level of capacity.
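The volatility effect in the geometric Brownian motion expression above can be illustrated numerically. The drift of 5% and the two volatility levels below are hypothetical values chosen for illustration only; they are not estimates from the backtests.

```python
def expected_log_growth(mu, sigma, t=1.0):
    """Expected log growth over horizon t under geometric Brownian motion."""
    return (mu - 0.5 * sigma ** 2) * t

# Same drift, two volatility levels (hypothetical numbers).
calm = expected_log_growth(mu=0.05, sigma=0.06)
noisy = expected_log_growth(mu=0.05, sigma=0.15)
print(calm, noisy)
```

Holding the drift fixed, the higher volatility level lowers the expected log growth by the difference in sigma squared over two, which is the mechanism through which the leverage problem reduces long run returns.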

5 Review of the Cointegration Approach

As the distance approach has proven unprofitable over recent years, it will now be tested whether cointegration applied within the distance framework can alter the profitability. Huck and Afawubo (2015), Caldeira and Moura (2013), and Harlacher (2012) suggested exactly this. They copy every aspect of the Gatev et al. (2006) algorithm but replace the pairing with a test for cointegration and choose the top ranking pairs. Trading is also almost identical, but the spread metric is changed to $\varepsilon^{ij}_t = P^i_t - \beta^{ij}_\tau P^j_t - \alpha^{ij}_\tau$, estimated for $t \in [\tau - 365d, \tau]$. Parameters are estimated in the same training period as for the distance approach, and fixed for the following 6 months. When the spread satisfies $|\varepsilon^{ij}_t| > 2\sigma^{ij}_\tau$, where $\sigma^{ij}_\tau = \mathrm{sd}(\varepsilon^{ij}_t)$ for $t \in [\tau - 365d, \tau]$, a dollar neutral trade is opened, and it is closed at the next zero crossing or at the end of the trading period ($\tau + 6m$). In the following, the theoretical Pairing methodologies are described, then Test Selection is applied to reduce the pairing problem, the Results of Pairing are summarized, and lastly the Performance of the Cointegration Approach is evaluated.
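The trading rule just described can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: the function names are hypothetical, the spread parameters are obtained by ordinary least squares, the sample standard deviation is used for sigma, and dollar neutral sizing, consolidation, and costs are left out.

```python
import numpy as np

def fit_spread(p_i, p_j):
    """Estimate alpha, beta and sigma of the spread on the training window."""
    p_i = np.asarray(p_i, dtype=float)
    p_j = np.asarray(p_j, dtype=float)
    X = np.column_stack([np.ones_like(p_j), p_j])
    (alpha, beta), *_ = np.linalg.lstsq(X, p_i, rcond=None)
    resid = p_i - alpha - beta * p_j
    return alpha, beta, resid.std(ddof=1)

def positions(p_i, p_j, alpha, beta, sigma, k=2.0):
    """Spread position per trading day: +1 long i / short j, -1 the reverse, 0 flat."""
    eps = np.asarray(p_i, dtype=float) - beta * np.asarray(p_j, dtype=float) - alpha
    pos = np.zeros(len(eps), dtype=int)
    state = 0
    for t, e in enumerate(eps):
        if state == 0:
            if e > k * sigma:
                state = -1            # spread too high: short i, long j
            elif e < -k * sigma:
                state = 1             # spread too low: long i, short j
        elif state * e >= 0:
            state = 0                 # spread has crossed zero: close
        pos[t] = state
    pos[-1] = 0                       # everything is closed at the end of the period
    return pos
```

The parameters from `fit_spread` are held fixed over the trading period, so `positions` is meant to be called on the six months of prices following the training window.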


5.1 Pairing

In this subsection the three pairing methods that have been applied in this paper are described. The methods are:

• Engle & Granger (Engle and Granger, 1987)

• Phillips & Ouliaris (Phillips and Ouliaris, 1990)

• Johansen approach (Johansen, 1995)

5.1.1 Engle & Granger Approach

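This method can test for one cointegrating relation between the prices $P^i$ and $P^j$, by a two step regression:

$$P^i_t = \alpha^{ij}_\tau + \beta^{ij}_\tau P^j_t + \varepsilon^{ij}_t \tag{5.1}$$

$$\Delta\varepsilon^{ij}_t = a + \pi\varepsilon^{ij}_{t-1} + \sum_{i=1}^{k} \gamma_i \Delta\varepsilon^{ij}_{t-i} + u^{ij}_t \tag{5.2}$$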

Critical values of the test statistic on $\pi$ have been tabulated in Phillips and Ouliaris (1990), or can be simulated. If $\pi$ is significantly below 0, the spread series is stationary and the price series are said to be cointegrated. The lag length $k$ is determined by the Akaike Information Criterion. The test statistic on $\pi$ depends on which security is placed on the left hand side in the first regression, which in turn introduces a random element into the ordering of the pairs.
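The two step regression in (5.1) and (5.2) can be sketched with plain least squares. This is a simplified illustration: the helper names are hypothetical, the lag length k is fixed by the caller instead of being chosen by the Akaike Information Criterion, and no critical values are supplied, so the returned statistic still has to be compared with tabulated or simulated values.

```python
import numpy as np

def ols(y, X):
    """Return coefficients, residuals and coefficient standard errors."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    dof = len(y) - X.shape[1]
    cov = (u @ u / dof) * np.linalg.inv(X.T @ X)
    return b, u, np.sqrt(np.diag(cov))

def engle_granger_t(p_i, p_j, k=1):
    """t-statistic on pi in (5.2), computed on the residuals of (5.1)."""
    p_i = np.asarray(p_i, dtype=float)
    p_j = np.asarray(p_j, dtype=float)

    # Step 1: cointegrating regression of P^i on a constant and P^j.
    X1 = np.column_stack([np.ones_like(p_j), p_j])
    _, eps, _ = ols(p_i, X1)

    # Step 2: regression of the differenced residual on its lagged level
    # and k lagged differences.
    d = np.diff(eps)                       # d[s] is the difference at time s + 1
    y = d[k:]
    cols = [np.ones(len(y)), eps[k:-1]]    # constant a and lagged level
    cols += [d[k - m : len(d) - m] for m in range(1, k + 1)]
    b, _, se = ols(y, np.column_stack(cols))
    return b[1] / se[1]
```

Swapping `p_i` and `p_j` generally changes the statistic, which is the ordering dependence noted in the text.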

5.1.2 Phillips & Ouliaris

Let:

$$P_t = \begin{pmatrix} P^i_t \\ P^j_t \end{pmatrix} \tag{5.3}$$

Then estimate the regular VAR(1):

$$P_t = \Pi P_{t-1} + \xi_t \tag{5.4}$$

Then the test statistic for stationarity in residuals is:

$$PO = T \, \mathrm{tr}\left(\Omega M_{PP}^{-1}\right) \tag{5.5}$$

Where T is the number of observations considered and:

$$\Omega = \frac{1}{T}\sum_{t=1}^{T} \xi_t \xi_t' + \frac{1}{T}\sum_{s=1}^{l} \omega(s, l) \sum_{t=s+1}^{T} \left(\xi_t \xi_{t-s}' + \xi_{t-s} \xi_t'\right) \tag{5.6}$$

$\omega(s, l)$ is a weighting function defined as $\omega(s, l) = 1 - \frac{s}{l+1}$, which depends on the desired lag window $l$, and $M_{PP} = \frac{1}{T}\sum_{t=1}^{T} P_t P_t'$. The great advantage of the Phillips and Ouliaris approach is that it is invariant to which variable is set as the left hand side variable. Also, if one wishes to test for cointegration in a basket of securities, this is possible. Critical values have been tabulated in Phillips and Ouliaris (1990).
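The statistic in (5.5) with the covariance estimate in (5.6) can be sketched directly from the formulas. The function name and the default lag window of 4 are assumptions, the prices are used without demeaning as in the formulas above, and critical values are not included.

```python
import numpy as np

def phillips_ouliaris_stat(P, l=4):
    """PO statistic of (5.5) for a price array P of shape (observations, 2)."""
    P = np.asarray(P, dtype=float)
    Y, X = P[1:], P[:-1]

    # VAR(1) without constant, as in (5.4): Y = X B + xi, with Pi = B transposed.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    xi = Y - X @ B
    T = xi.shape[0]

    # Long run covariance (5.6) with weights 1 - s / (l + 1).
    omega = xi.T @ xi / T
    for s in range(1, l + 1):
        weight = 1.0 - s / (l + 1.0)
        gamma = xi[s:].T @ xi[:-s] / T     # sum over t of xi_t xi_{t-s}' divided by T
        omega += weight * (gamma + gamma.T)

    M = Y.T @ Y / T                        # second moment matrix of the prices
    return T * np.trace(omega @ np.linalg.inv(M))
```

Because the statistic is a trace over both series, reordering the columns of `P` leaves it unchanged, in line with the invariance noted above. Large values speak against the null of no cointegration.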


5.1.3 Johansen Approach

As in Phillips and Ouliaris (1990), Johansen (1995) sets up a VAR model, in which a sufficiently large lag length $k$ is chosen to ensure that there is no autocorrelation in the errors. In the framework of pairs trading, where there are $p = 2$ variables, there are two test hypotheses: $r \le 1$ and $r = 0$, where $r$ is the number of cointegrating relations. When testing, one starts from the top, testing the highest possible $r$. If $r \le 1$ is rejected there must be 2 stationary components in the model, i.e. none of the asset prices are random walks, and the arbitrageur may trade either asset individually and on average predict the next day's returns. Clearly markets should be sufficiently efficient to reject this alternative, such that either 1 or 0 cointegrating relations exist. If $r = 0$ is rejected then we accept the pair as a valid pair with one cointegrating relation; otherwise the pair is considered no further. Johansen (1995) considered several model specifications when testing for cointegration: a trend in the cointegrating relation, a constant in the cointegrating relation, seasonal variables, and/or dummies added to the model. Each specification alters the critical values, and one has to simulate these. In order to stay close to the trading rule based on the cointegrating relation $\varepsilon^{ij}_t = P^i_t - \beta^{ij}_\tau P^j_t - \alpha^{ij}_\tau$, the model specified includes a constant in the cointegrating relation and no trend. The unrestricted VECM is specified as:

$$\begin{pmatrix} \Delta P_{it} \\ \Delta P_{jt} \end{pmatrix} = \begin{pmatrix} \mu_i \\ \mu_j \end{pmatrix} + \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} \begin{pmatrix} P_{i,t-1} \\ P_{j,t-1} \end{pmatrix} + \sum_{n=1}^{k} \begin{pmatrix} \gamma^n_{11} & \gamma^n_{12} \\ \gamma^n_{21} & \gamma^n_{22} \end{pmatrix} \begin{pmatrix} \Delta P_{i,t-n} \\ \Delta P_{j,t-n} \end{pmatrix} + \begin{pmatrix} \varepsilon_{it} \\ \varepsilon_{jt} \end{pmatrix} \tag{5.7}$$

for $t \in [\tau - 365d; \tau]$

Equivalently:

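$$\Delta P_t = \mu + \Pi P_{t-1} + \sum_{n=1}^{k} \gamma^n \Delta P_{t-n} + \varepsilon_t \tag{5.8}$$

$$\mathrm{E}[\varepsilon_t \varepsilon_t'] = \Omega, \quad \varepsilon_t \sim N(0, \Omega)$$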

$k$ is by default set to two. If the assets are not cointegrated and the prices are non-stationary, $\Pi$ has rank zero; full rank of $\Pi$ would imply that both price series are stationary. When the assets are cointegrated, $\mathrm{rank}(\Pi) = 1$. The rank test can be conducted by two methods, the trace test and the maximum eigenvalue test; see Johansen (1995) or Juselius (2006) for details.
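The eigenvalue problem behind the rank test can be sketched for the VECM in (5.8), i.e. with an unrestricted constant and k lagged differences. This is a minimal reduced rank regression written from the textbook procedure, with a hypothetical function name; it is not the thesis implementation, and it returns only the eigenvalues and trace statistics, which must then be compared with critical values matching the chosen deterministic specification.

```python
import numpy as np

def johansen_trace(P, k=2):
    """Eigenvalues and trace statistics (for r = 0 and r <= 1) of model (5.8)."""
    P = np.asarray(P, dtype=float)
    dP = np.diff(P, axis=0)                 # dP[s] is the difference at time s + 1
    n = dP.shape[0] - k                     # effective number of observations

    Z0 = dP[k:]                             # current differences
    Z1 = P[k:-1]                            # lagged levels
    lags = [dP[k - m : dP.shape[0] - m] for m in range(1, k + 1)]
    Z2 = np.column_stack([np.ones(n)] + lags)   # constant and lagged differences

    # Partial the short run terms out of both the differences and the levels.
    R0 = Z0 - Z2 @ np.linalg.lstsq(Z2, Z0, rcond=None)[0]
    R1 = Z1 - Z2 @ np.linalg.lstsq(Z2, Z1, rcond=None)[0]
    S00, S01, S11 = R0.T @ R0 / n, R0.T @ R1 / n, R1.T @ R1 / n

    # Eigenvalues of S11^{-1} S10 S00^{-1} S01, sorted from largest to smallest.
    A = np.linalg.solve(S11, S01.T) @ np.linalg.solve(S00, S01)
    lam = np.sort(np.linalg.eigvals(A).real)[::-1]

    # trace[r] = -n * sum of log(1 - lambda) over the eigenvalues after the r largest.
    trace = -n * np.cumsum(np.log(1.0 - lam)[::-1])[::-1]
    return lam, trace
```

The maximum eigenvalue statistic for a given r would be `-n * log(1 - lam[r])`, using the same eigenvalues.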

5.1.4 Test Selection

Pairing by any of the three cointegration methods above is theoretically appealing, as it ensures that stationary spreads are identified for trading. However, if all 124,750 pairs were to be tested every month, the processing time would be long. To reduce processing time some selection rule must be made, which tries to predict the out of sample cointegration test statistic. There are some conditions necessary for justifying the use of a selection algorithm that are and


must be satisfied (see footnote 14). The generalized algorithm is to randomly sample some pairs and test them for cointegration. These test statistics are regressed on a set of explanatory variables. If the regression R2 is high, a qualified prediction can be made with confidence of finding the highest ranking pairs. Here, we first estimate the unconditional distribution of the random test statistics and find the 0.25% upper critical value. Out of sample tests are conducted starting with the highest conviction pairs, and continue while the 20th highest test statistic is less than the critical value. When all of the 20 highest test statistics are greater than the critical value, 20 pairs lying entirely in the 0.25% upper region have been found. The algorithm is stopped at 2,500 tests if 20 statistics have not exceeded the critical value at that point. It must be noted that, in order to make the pairing algorithm efficient, the testing of a random sample and the modelling are done every quarter, and the following two months of pairing use this model. This is noted in table 5.3 as N1, N2, N3, where N1 refers to the pairing date on which the regression is estimated, N2 to the following month, and N3 to the pairing of the month following N2. It appears from table 5.3 that the number of pairs sampled increases with distance from the modelling date, i.e. the model slowly becomes outdated.

Explanatory variables are pair correlation, pair difference in volatility, squared difference in terminal value, Euclidean distance in indexed returns, cluster variables, and some more. The cluster variables are mainly an application of hierarchical clustering. A hierarchical cluster structure is built on a dissimilarity measure. In this structure a cut is applied dividing the assets into n bundles, where n is the selected variable. Given the cut, a binary variable is constructed, indicating for each pair whether or not the two underlying assets fall in the same cluster. n is selected to maximize the correlation between this binary variable and the sampled test statistics. This optimization is done to enhance R2, which in turn reduces the number of necessary out of sample tests.

The full span of cluster variables used can be found in the appendix in table 11.1. Further detail on the applied selection algorithm follows in table 5.2. The goal of the selection algorithm is to find some of the highest ranking pairs without testing the entire span. In reality the arbitrageur may be able to test the entire span overnight and get the exact highest ranking pairs to trade the next morning.

14 Conditions are:

• The processing time of the selection algorithm must be shorter than that of testing the entire span of pairs.

• Sufficiently few, very high ranking pairs are needed. The contrary case is when the arbitrageur is satisfied with some random significantly cointegrated pairs. If that were the case, random sampling until sufficiently many significant pairs have been sampled would be faster than selection. The question is in essence one of the profitability of the pairs; in section Performance of the Cointegration Approach it will be shown that the highest ranking pairs significantly outperform random significant pairs, thus selection can be justified over sampling.


Pairing algorithm

Data input
Identify tickers: Given a pairing date, identify all members of the index.
Reduce tickers: Remove tickers with more than 2.5% missing.
Data input: Retrieve prices, returns, and indexed returns over the past year.
Explanatory: Compile the variables: cor, correlation in levels; cora, correlation in returns; distc, Euclidean distance in indices; distr, Euclidean distance in returns; diff_term, squared difference in terminal indices; diff_vol, squared difference in volatility.
Cluster structure: Produced on the array of clusters described in table 11.1.
Expand data: Set up all pairs and bind the explanatory variables (but not clusters) onto the pairs. All pairs with negative correlation in indexed returns are removed.

Model
Random sample: Sample 2,500 pairs and calculate their test statistics.
Est. distr.: Estimate the (unconditional) distribution of the test statistics.
Cut clusters: For each cluster structure, determine the number of clusters (the cut) such that the cut maximizes correlation to the test statistic. Given the cut, cluster variables are produced.
Regress: Regress all explanatory variables incl. cluster variables onto the test statistics of the random sample.
Reduce regression: If one or more variables are insignificant at the 10% level, remove the least significant variable and re-estimate. Repeat the procedure while insignificant variables exist at the 10% level.

Selection
Predict: Use the regression model to predict all test statistics.
Select and calculate: Estimate the 40 highest ranking test statistics. Include the next top ranking estimate while the p-value of the 20th highest test statistic is greater than 0.25%, as estimated from the unconditional distribution, or stop at 2,500 test statistics. If the second stopping condition was not triggered, sufficient test statistics in the 0.25% right tail have been found.
Forward selection: For the next two pairing dates set up data as above. Use the indicated cuts for cluster variables. Set up the model matrix from the reduced regression and predict test statistics. Estimate top ranking test statistics under the assumption of a stationary unconditional distribution and model.

Table 5.2: Selection algorithm for cointegration testing
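The Selection step of table 5.2 can be sketched as a sequential testing loop. The function and argument names are hypothetical, `test_fn` stands in for whichever cointegration test is used, and the sketch assumes that a larger statistic means stronger evidence of cointegration and that the initial batch is at least as large as the number of pairs wanted.

```python
import numpy as np

def select_top_pairs(predicted, test_fn, critical_value,
                     n_top=20, n_start=40, max_tests=2500):
    """Test pairs in order of predicted statistic until the n_top-th best
    observed statistic exceeds critical_value, or max_tests is reached.

    Returns the indices of the n_top best tested pairs and the number of tests run.
    """
    order = np.argsort(predicted)[::-1]           # highest conviction first
    tested = {}
    for count, idx in enumerate(order[:max_tests], start=1):
        tested[int(idx)] = test_fn(int(idx))      # the expensive cointegration test
        if count >= n_start:
            best = sorted(tested.values(), reverse=True)[:n_top]
            if best[-1] > critical_value:         # n_top-th best is in the right tail
                break
    top = sorted(tested, key=tested.get, reverse=True)[:n_top]
    return top, len(tested)
```

The number of tests returned by the function corresponds to the N1, N2, and N3 counts of table 5.3: the better the regression model predicts the test statistics, the sooner the loop stops.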


                     Trace        Eigen        PO           ADF
Super pairs
R2                   0.1777       0.1607       0.2463       0.2442
N1                   1193.0894    1072.3577    2496.748     724.7967
N2                   1542.6839    1414.6353    2468.7004    809.3517
N3                   1608.9431    1449.5935    2479.6748    883.7398
Pairs                6043         5973         4890         5724

Random pairs
P-values ∈ 0-1%      0.0194       0.0177       0.0161       0.0154
P-values ∈ 1-5%      0.0468       0.0487       0.0573       0.0492
P-values ∈ 5-10%     0.0503       0.0513       0.0611       0.0622
N                    2211.4441    1953.6785    1652.4523    1714.1689
Pairs                7188         7186         7209         7187

Table 5.3: Average number of pairs sampled before selecting the top 20, for the different pairing methods. N1 corresponds to the pairs sampled on the first pairing date, and so forth for N2 and N3. Pairs is the total number of unique pairs selected for trading.

5.1.5 Results of Pairing

Eight models of the cointegration approach are produced from the four pairing methods: the Johansen trace test statistic, the Johansen eigenvalue test statistic, the Phillips & Ouliaris test for cointegration, and the Engle & Granger approach. Each pairing method is applied to randomly selected pairs, i.e. each month random pairs are sampled and tested until 20 pairs prove significantly cointegrated at the 1% level. Each pairing method is also applied through the selection algorithm in table 5.2, aiming to find the most strongly cointegrated pairs. For both Johansen methods, the top 20 pairs are selected among those not rejecting r ≤ 1 and rejecting r = 0.

Table 5.3 summarizes some results of pairing. The table reveals that the explanatory power of the selection algorithm decreases on pairing dates later than the modelling date. Interestingly, the explanatory power is stronger for the simple cointegration methods than for the Johansen methods. The Phillips & Ouliaris approach selects fewer pairs for trading than the other methods, meaning that its ranking is more persistent. The most interesting feature is the distribution of the test statistics: if the tests work correctly, e.g. the remaining noise εt in equation (5.8) is white noise, then the correct critical values are used, i.e. the exact distribution is used. In this perfect application we should get 1% false positives at the 1% level when sampling randomly. The observation that more than 1% of the test statistics are significant at the 1% level indicates market inefficiency. But note that this evidence only holds if the right distribution of the test statistics is used.


                         Trace              Eigen              PO                 ADF
                         Super    Random    Super    Random    Super    Random    Super    Random
Geometric ann. return    0.0817   0.0033    0.0793   -0.0087   0.082    -0.007    0.0713   -0.0146
Ann. std. dev.           0.1261   0.0906    0.1159   0.0891    0.1532   0.091     0.0981   0.0904
Sharpe(rf)               0.6478   0.0366    0.6843   -0.0972   0.5352   -0.077    0.7273   -0.1611
Ann. α                   0.0885   0.0032    0.0841   -0.0077   0.0917   -0.0075   0.074    -0.0124
Max drawdown             0.309    0.432     0.3189   0.3887    0.4158   0.5414    0.3644   0.4787
Max drawdown duration    1440     4039      1233     6725      1178     5064      1461     6507
Months up (normality)    0.5754   0.509     0.5787   0.4941    0.565    0.4965    0.5825   0.487

Table 5.4: Summary statistics for cointegration approach performance of eight applied pairing schemes

5.2 Performance of the Cointegration Approach

Backtesting is conducted using the same funding and transaction costs as described in section 4.4 Performance of the Distance Approach. All backtests are constructed such that the average gross leverage is close to 2. Plots of the returns of the eight methods are given in figure 5.1, and short summary statistics of the eight series are presented in table 5.4. From this material it is clear that the random pairs considerably underperform the super selected pairs. Combined with the information in table 5.3, it is known that some pairs cointegrated at the 1% level are not false positives, but that the major fraction are false positives. The random pairs thus contain a major fraction of false positives, hence their poorer performance compared with the super selected pairs. The random pairs are excluded from further study.

More elaborate performance and trading numbers on the four cointegration metrics are given in table 5.5, which summarizes performance numbers in detail, and table 5.6, which disaggregates performance on the long and short trades and describes leverage. Tables 5.7 and 5.8 summarize performance numbers at the trade level.

The graphs in figure 5.1 show that performance was in general very appealing up to 2004. From 2004 to the summer of 2008 all the strategies went through a period of consistent losses. From the summer of 2008 until 2010 vast returns were made, but since 2010 only the Eigen and PO methods have been able to maintain value. Overall, the long term performance picture of the cointegration approaches is the same as that of the distance approach. As known from the literature (Engelberg et al. (2009), Jacobs and Weber (2015)), increased competition, increased coverage, and decreased limits to arbitrage can be the reasons for the declining returns. The technological advances over the last 15 years have been immense: it is easier to get financial information and to trade with direct market access, both factors that have effectively increased coverage and competition.


                                Trace        Eigen        PO           ADF
Arithmetic daily return (bps)   3.3256       3.1946       3.4776       2.8349
Geometric ann. return           0.0817       0.0793       0.082        0.0713
Daily std. dev.                 0.0078       0.0072       0.0095       0.0061
Ann. std. dev.                  0.1261       0.1159       0.1532       0.0981
Sharpe(0)                       0.9739       1.0384       0.804        1.1426
Sharpe(rf)                      0.6478       0.6843       0.5352       0.7273
Correlation to SPX              0.0397       0.0616       0.0534       0.0717
β                               0.0272       0.0388       0.0444       0.0382
Ann. α                          0.0885       0.0841       0.0917       0.074
t-α                             3.799        3.929        3.243        4.092
R2                              0.0016       0.0038       0.0028       0.0051
Std. dev. on error              0.126        0.1157       0.153        0.0978
Information Ratio               0.7023       0.7263       0.5995       0.7564
Max loss                        0.1407       0.1116       0.1307       0.0542
Max drawdown                    0.309        0.3189       0.4158       0.3644
Max drawdown duration           1440         1233         1178         1461
Days up                         0.5212       0.512        0.5086       0.5132
Months up (normality)           0.5754       0.5787       0.565        0.5825
Skewness                        7.5323       6.5418       10.0839      8.043
Excess kurtosis                 218.7994     127.8064     167.3764     69.7849
Start                           1985-01-02   1985-01-02   1985-01-02   1985-01-02
End                             2015-07-29   2015-07-29   2015-07-29   2015-07-29

Table 5.5: Summary statistics for cointegration approaches, super selected pairs. Corrected for transaction costs (10 bps) and financing costs (rebate=25, pb=30 bps). A maintenance margin of 25% and an initial margin of 30% have been applied. For a description of the variables refer to appendix 10 Summary Statistic Tables.


                                 Trace      Eigen      PO         ADF
Weights
Assets traded                    1040       1043       992        1026
Average assets held daily        91.0763    93.8226    87.1463    95.9807
Average weight                   0.0109     0.0106     0.0114     0.0103
Q75 weights                      0.0129     0.0125     0.013      0.0127
Q95 weights                      0.0244     0.0238     0.0279     0.0229

Returns
Geom. ann. gross return long     0.2154     0.2142     0.202      0.1964
Geom. ann. gross return short    -0.1176    -0.1186    -0.1066    -0.1073
Geom. ann. transaction costs     -0.0229    -0.0229    -0.0227    -0.0238
Geom. ann. financing costs       -0.0076    -0.0075    -0.0076    -0.0076
Pct. turnover of m.val.          0.0464     0.0464     0.0466     0.0482

Leverage
Margin violations                0          0          0          0
No. of maintenance actions       0          0          0          0
No. of initial margin actions    0          0          1          0
Avg. inst. leverage              3.53       3.44       3.0996     3.33
Avg. investment ratio            0.589      0.588      0.5875     0.5893
Maximal invested                 0.9404     0.9287     1.0081     0.8286
Avg. gross leverage              2.0009     1.9975     1.9986     2.0022
Max gross leverage               3.1386     3.0158     2.8918     2.9066
Avg. net leverage                -0.0578    -0.0565    -0.0529    -0.0589
Max net leverage                 0.1023     0.0799     0.1206     0.1038
Min net leverage                 -0.3985    -0.343     -0.559     -0.2457
Avg. maintenance to equity       0.5002     0.4994     0.4996     0.5006
Max maintenance to equity        0.7847     0.754      0.7229     0.7266

Table 5.6: Trading statistics for cointegration approaches, super selected pairs. For variable explanations refer to appendix 10 Summary Statistic Tables.


Figure 5.1: Performance plots of the cointegration approach. Each panel represents a distinct pairing approach, ordered: Johansen trace test; Johansen eigenvalue test; Phillips & Ouliaris test; Engle Granger test. Black lines represent returns on the super cointegrated pairs, and red lines returns on random cointegrated pairs at the 1% level. Purple lines are the equity premium. Summary statistics may be found in table 5.5.


Table 5.5 shows an annual return of around 8% for the strategies and volatility ranging from 10% to 15%, such that one could expect a Sharpe ratio of around 0.6. The volatility of the PO method is overstated, as a big loss and a big gain were made in 1991; since then returns have been much more stable. The Sharpe ratios are similar to that of the distance approach, such that it may seem that the cointegration methods are 2 times levered versions of the distance approach. From table 5.6 it appears that twice as many assets are traded compared with the distance approach. This is likely to be a result of the randomness induced by the pairing scheme, whereas the Euclidean distance is a single clear variable that gives a more consistent ranking. An upside of more pairs is that more assets are held daily, which reduces the weights on single assets and in reality reduces transaction costs. Returns are made on the long side, whereas the short side appears as a hedge. An interesting feature is that the maximal investment ratio is greater than 1 for the PO method, thus an initial margin action was taken. From tables 5.7 and 5.8 it appears that, although sophistication has increased compared with the distance approach, a very large fraction of the trades do not converge. A major fraction of the trades converge within 50 days (read days, not observations); the rest of the trades are mostly non-converging trades waiting for the 6 month expiry.

A final note on the strategy is that the standard deviation within the trading period and the duration both reduce expected returns. Thus the continuing spread volatility and the trade duration, when monitored, may act as stop indicators. Trades of increasing variance and duration may be closed prior to convergence for enhanced performance. It is not evident that one of the pairing methods is superior to another, but combining the signal series of the four methods can serve as diversification and increase the weights on common pairs. The PO method may be seen as superior in the sense of more consistent ranking, low volatility, and the highest fraction of converging trades.

5.3 Comparison to the Literature

The application of the cointegration approach (Trace and Eigen) has also been tested by Huck and Afawubo (2015) for the period 2000 to 2011. However, there are some small technical differences between their test and this one. Huck and Afawubo (2015) equally weight the portfolio each day and subtract transaction costs as an estimate at the end of the month. Compared to subtracting the costs at trading, this induces a smaller draw on the portfolio. As seen in Revisiting Return Calculations, equally weighting the portfolio increases performance. The comparable result of Huck and Afawubo (2015) is a consistently profitable cointegration approach with a monthly return net of costs of 175 bps. This is despite Huck and Afawubo (2015) estimating monthly costs of 33 bps, whereas we estimate them at 25 bps. In the period 2000-2011 the average monthly return of this strategy was 53 bps, vastly lower than that of Huck and Afawubo (2015). Cointegration testing in Huck and Afawubo (2015) is only done in pairs with a difference in terminal indices


Trace
  Pct. converging         0.5182
  No. of pairs            5661
  No. of trades           12033
  Avg. trades per day     1.5128

                          All         Converging   Failing
  Returns
    N                     12033       6235         5798
    Mean                  0.0195      0.1215       -0.0901
    Mean long             0.0418      0.0803       4e-04
    Mean short            -0.0223     0.0412       -0.0905
    Std. dev.             0.2082      0.1277       0.222
    Median                0.0561      0.0968       -0.0563
    Skewness              -0.2281     6.3375       -0.6793
    Excess kurtosis       25.2491     93.8241      25.2791
    Pct. > 0              0.6522      0.9761       0.3039
    Q5                    -0.3144     0.0268       -0.4502
    Q95                   0.2511      0.3015       0.1472
  Duration
    Mean                  71.7104     42.2656      103.3744
    Std. dev.             57.3119     36.8393      58.5044
    Median                55          30           110
    Q5                    6           5            9
    Q95                   176         121.3        180

  Regression (left hand side variable: return)
                          Estimate    Std. Error   t-value
    Intercept             -0.0463     0.0052       -8.8925
    converging            0.1822      0.004        45.4527
    rank                  8e-04       3e-04        2.6729
    sd                    -0.0054     8e-04        -7.1287
    duration              -5e-04      0            -14.9509
    entry point           0.001       7e-04        1.4535
    R2 adj.               0.2756
    F-statistic           909 on 5 and 11925 DF

Eigen
  Pct. converging         0.5207
  No. of pairs            5657
  No. of trades           12285
  Avg. trades per day     1.5445

                          All         Converging   Failing
  Returns
    N                     12285       6397         5888
    Mean                  0.0192      0.1211       -0.0915
    Mean long             0.0404      0.0809       -0.0036
    Mean short            -0.0212     0.0401       -0.0879
    Std. dev.             0.1967      0.1265       0.1997
    Median                0.0559      0.097        -0.0575
    Skewness              -0.2976     5.7929       -1.173
    Excess kurtosis       15.4746     66.9009      12.5575
    Pct. > 0              0.6505      0.9767       0.296
    Q5                    -0.3129     0.0273       -0.4234
    Q95                   0.2414      0.2936       0.1323
  Duration
    Mean                  71.0624     41.8191      102.8336
    Std. dev.             57.1812     36.3274      58.7369
    Median                53          29           110
    Q5                    6           5            9
    Q95                   176         119          180

  Regression (left hand side variable: return)
                          Estimate    Std. Error   t-value
    Intercept             -0.0393     0.0047       -8.3979
    converging            0.1814      0.0036       50.5698
    rank                  4e-04       3e-04        1.5135
    sd                    -0.1547     0.007        -22.138
    duration              -5e-04      0            -15.8881
    entry point           3e-04       6e-04        0.4718
    R2 adj.               0.3346
    F-statistic           1226 on 5 and 12172 DF

Table 5.7: Underlying trade statistics for Trace and Eigen super pairing methods.


PO
  Pct. converging         0.5427
  No. of pairs            4851
  No. of trades           13588
  Avg. trades per day     1.7084

                          All         Converging   Failing
  Returns
    N                     13588       7374         6214
    Mean                  0.0196      0.1086       -0.0862
    Mean long             0.038       0.0693       8e-04
    Mean short            -0.0184     0.0393       -0.0869
    Std. dev.             0.176       0.1163       0.1763
    Median                0.0505      0.0881       -0.0549
    Skewness              0.3905      7.8756       -0.7594
    Excess kurtosis       26.6796     183.9994     8.972
    Pct. > 0              0.6582      0.9638       0.2956
    Q5                    -0.2761     0.0144       -0.3901
    Q95                   0.2192      0.2681       0.1181
  Duration
    Mean                  70.6959     41.7525      105.0423
    Std. dev.             58.0107     37.3708      59.396
    Median                51          29           114
    Q5                    6           5            8
    Q95                   177         124          181

  Regression (left hand side variable: return)
                          Estimate    Std. Error   t-value
    Intercept             -0.0268     0.004        -6.6866
    converging            0.1619      0.0031       52.2323
    rank                  1e-04       2e-04        0.6099
    sd                    -0.1998     0.0144       -13.8291
    duration              -5e-04      0            -20.4012
    entry point           9e-04       5e-04        1.8679
    R2 adj.               0.3357
    F-statistic           1361 on 5 and 13458 DF

ADF
  Pct. converging         0.5414
  No. of pairs            5585
  No. of trades           13163
  Avg. trades per day     1.6557

                          All         Converging   Failing
  Returns
    N                     13163       7127         6036
    Mean                  0.0176      0.1154       -0.0978
    Mean long             0.0369      0.0748       -0.0078
    Mean short            -0.0193     0.0406       -0.0901
    Std. dev.             0.1895      0.1058       0.2011
    Median                0.0581      0.0968       -0.0613
    Skewness              1.9692      6.3212       3.8048
    Excess kurtosis       97.9368     135.5573     161.9132
    Pct. > 0              0.6591      0.9882       0.2705
    Q5                    -0.2913     0.0346       -0.4185
    Q95                   0.2038      0.2443       0.0965
  Duration
    Mean                  68.8412     41.2774      101.3872
    Std. dev.             56.4928     36.5746      58.5164
    Median                49          29           109
    Q5                    6           5            8
    Q95                   174         120          178.25

  Regression (left hand side variable: return)
                          Estimate    Std. Error   t-value
    Intercept             -0.0339     0.0043       -7.9089
    converging            0.1756      0.0033       53.1354
    rank                  1e-04       2e-04        0.2623
    sd                    -0.1287     0.0068       -18.8353
    duration              -6e-04      0            -20.6935
    entry point           -9e-04      5e-04        -1.7205
    R2 adj.               0.3553
    F-statistic           1439 on 5 and 13044 DF

Table 5.8: Underlying trade statistics for PO and ADF super pairing methods.


less than 10%, which perhaps has been another relevant filter for subsequent performance. Otherwise there is no clear answer as to why there should be such a great differential in performance. The conclusion of Huck and Afawubo (2015) is that there is no decay in returns close to the end of the period in 2011, which is in strong contrast to the results of this test.

With Huck and Afawubo (2015) and Caldeira and Moura (2013) finding their versions of the cointegration approach profitable, there are mixed signals on the profitability of the strategy. Nevertheless, with a high fraction of pairs still never converging, some large improvements have to be made to the model. In the following section a new model with foundations in the cointegration methods will be constructed.

6 A New Trading Model

The distance approach is no longer working, and the cointegration approach did not alter the return profile. However, the cointegration approach has some stronger features: spread stationarity and an asymmetric partial effect on the spread from each of the securities. One key failure embedded in both models is that they do not act on mean shifts in the spread. As noted by Jacobs and Weber (2015) and Engelberg et al. (2009), firm specific news will turn a spread trade unprofitable. Furthermore, long holding periods have been shown in the literature and above to reduce profits, and many holding periods are long. A new model will be suggested in this section, one that incorporates mean shifts in the spread and reduces the average holding period.

The portfolio construction proposed by Gatev et al. (1999) is very intelligent. The basic idea is to select the most recent pairs, which are the most relevant pairs. Updating this every month keeps the portfolio fresh and, most importantly, ensures a favourable unbalanced weighting towards pairs striking month after month. The method even works well with the weak trading scheme, where outdated signals can be reversed by a newer opposite signal. If two trading periods yield the same signal, increased conviction is put on this trade.

The proposed trading model of one year of pairing followed by six months of trading seems inferior to the portfolio construction. It has been seen above that a lot of pairs never converge, meaning that the equilibrium has either turned out to be a false equilibrium or the mean spread value is moving in time. This induces a lot of trades waiting for expiry. If one updated the equilibrium model with the most recent information, these trades might be closed faster. Continuously updating the trading model would adjust the mean spread to the new equilibrium, and informed trading would take place.

To alleviate the problems above, a general framework is suggested which emphasizes trading over pairing. The sampled pairs must be selected in order to fit the trading model, in line with the intuition of backwards induction: given the terminal outcome of trading by a known model, which pairs fit this model the best? The general framework can be illustrated as follows:


The first element that must be in place is a trading model that continuously updates the view on the equilibrium on a rolling window basis. Assuming this model is in place, the first step is to sample n pairs that will be backtested. The backtest shows the performance of the n pairs, which can be combined into a panel dataset of pair performance over time. The panel data can be combined with descriptive variables: pair ranking, pair correlation, spread volatility etc. This massive dataset saves information that may be used in (1) portfolio construction and (2) future sampling. The first can be mean variance optimization or allocation by trend following. The optimally indicated portfolio of pairs is the one that will be traded in the days to come. The latter can be asking which pairing schemes have contributed the most successful pairs, or which pairing metrics are valuable in predicting performance. The answers to these questions can improve the search for profitable pairs. Overall the optimization procedures move the model from normative to positive.

The framework can be implemented on a continuous basis; in particular, portfolio optimization is often not computationally demanding, allowing for continuous portfolio decisions. Pairing is computationally expensive, hence it is expensive to update the pool of pairs, so this may take place less frequently. The framework suggests never removing a pair from the pool of pairs, so that the data increases in time. A strong feature of the framework is that pairs included in the portfolio pass two screenings: 1) pairing, which tests for theoretical validity, and 2) past backtest performance, indicating whether the pair has in fact worked (out of sample). This double filter will mitigate the issue of false positives.

This suggested framework can become quite complex and computationally expensive. First of all, it suggests backtesting all pairs ever sampled over their entire data period, such that more tracking than trading is done. Secondly, the optimization procedures can become costly. The more complex system will prove valuable when there is systematic information that can be captured by the massive data available.

An example of an application of the framework will be shown in the subsections to follow. Firstly, the trading model of The Continuous Cointegration Model is constructed, which is just a regression on a rolling window basis. Nevertheless, it gets an overlay of optimization of the employed entry triggers such that these follow the spread volatility. In section Pairing the sampling procedure is described; however, this is a fixed procedure that will not be optimized on past information. The reason for this is that sampling is computationally expensive, so the pairs already sampled in previous sections will be reused in this section. Lastly, a small range of Portfolio Selection rules are employed and evaluated, where especially those that use trend following perform well. The intuition is as follows: if the trading model consistently finds mispricing in a given pair, this pair will consistently profit, thus the recent profitability is a relevant variable in the portfolio construction. Furthermore, pairs performing well are exactly those that fit the trading model, hence these are the ones that must be selected.

6.1 The Continuous Cointegration Model

The trading model applied will be the same as for the cointegration method, but on a rolling window basis using the past year of information, including the day of the signal. See the illustration below for an overview.

The illustration considers an updating date $\tau$ and the securities $i$ and $j$. Firstly, $\varepsilon_t = P^i_t - \alpha^{ij}_\tau - \beta^{ij}_\tau P^j_t$ is estimated for $t \in [\tau - 365 : \tau]$, resulting in the output of the standardized error, $\mathit{nspread}_\tau$. This procedure has been executed for any $t \le \tau$. Every fortnight the optimal entry trigger $k^*_t$ is calculated. This optimal entry trigger is the one that has maximized profit in the last 180 days. In the illustration $\tau$ is one of the dates of updating the entry trigger. The entry trigger actually employed is the average of the past three optimized entry triggers, a choice that serves to reduce noise from the optimization, but at the cost of slowly adapting optimal triggers. The trading follows the rules presented in Review of the Cointegration Approach: when the standardized spread $\varepsilon_t/\sigma_t$ is greater than some threshold $k_t$ a short position is taken in the spread, and when it is less than $-k_t$ a long position is taken in the spread. Positions are closed at the next zero crossing. The new feature is that $k_t$ is variable in time (and so are the parameters). $k^*_\tau$ is used for $t \in [\tau, \tau + 13d]$; at $\tau + 14d$ a new trigger is calculated which is used for the next 13 days. The idea of using an optimized trigger is to let the model itself determine optimality rather than enforcing a fixed rule on the system. Furthermore, a small $k_t$ will be optimal in periods of little volatility and many zero crossings, and a large $k_t$ in periods of few crossings and great volatility. If the pair has recently shown negative performance, $k_t$ can even become so large that no further trading will take place until performance turns positive.
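The rolling estimation and the trading rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation behind the reported results: the window is taken as 260 observations (the text specifies the past year), the regression is plain least squares, the residual is scaled by the in-window residual standard error, and the function names `rolling_standardized_spread` and `spread_positions` are hypothetical.

```python
import numpy as np


def rolling_standardized_spread(p_i, p_j, window=260):
    """Standardized equilibrium error from a rolling regression of p_i on p_j.

    For each date tau with a full window, P^i = alpha + beta * P^j + eps is
    fitted by least squares on the trailing `window` observations, including
    tau. The residual at tau is divided by the residual standard error.
    Assumption: 260 observations stand in for "the past year".
    """
    p_i = np.asarray(p_i, dtype=float)
    p_j = np.asarray(p_j, dtype=float)
    z = np.full(len(p_i), np.nan)  # dates without a full window stay NaN
    for tau in range(window - 1, len(p_i)):
        y = p_i[tau - window + 1:tau + 1]
        X = np.column_stack([np.ones(window), p_j[tau - window + 1:tau + 1]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sigma = np.sqrt(resid @ resid / (window - 2))
        if sigma > 0:
            z[tau] = resid[-1] / sigma
    return z


def spread_positions(z, k):
    """Position in the spread per date: -1 short, +1 long, 0 flat.

    Short when z > k, long when z < -k, closed at the next zero crossing.
    `k` may be a scalar or an array of time-varying triggers.
    """
    z = np.asarray(z, dtype=float)
    k = np.broadcast_to(np.asarray(k, dtype=float), z.shape)
    pos = np.zeros(len(z), dtype=int)
    current = 0
    for t in range(len(z)):
        if np.isnan(z[t]):
            current = 0
        else:
            # Close an open position once the spread has crossed zero.
            if (current == 1 and z[t] >= 0) or (current == -1 and z[t] <= 0):
                current = 0
            # Open a new position when flat and the trigger is breached.
            if current == 0:
                if z[t] > k[t]:
                    current = -1
                elif z[t] < -k[t]:
                    current = 1
        pos[t] = current
    return pos
```

Passing an array for `k` gives the time-varying trigger of the model; a scalar reproduces the fixed-trigger rule of the earlier cointegration approach.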

Optimizing the entry trigger is a complex matter, and in general there are two approaches: 1) spread modelling and solving, and 2) simulating and numerically optimizing. Only the second approach is used in this paper. The first needs a process for the spread series, where the aim is to maximize the yardsticks traded on this series. One can proxy the spread process by the observed equilibrium error, and find entry and exit triggers that maximize the total standard deviations gained by trading the given process. But note that modelling gained standard deviations on the disequilibrium only proxies the gained return of trading the exact spread (see footnote 15). The first step is to assume a model for the past spread series, or perhaps select the best fitting model from a list of models estimated over the last 180 days. E.g. when the spread series follows an AR(1), optimal trading can be deduced for this process (Puspaningrum, 2012). Intuitively, if a very small trigger slightly greater than the transaction costs is used, many trades will be executed with low profitability, while a large trigger will yield few trades of great profitability. There will be an optimal trigger between 0 and infinity. The great advantage of this procedure is that optimal triggers can either be solved directly or simulated in advance, depending on which spread process is applied. The optimal trigger will then be a known function of the type of process and the estimated parameters, thus the processing time is limited to the time of estimating parameters. Estimation is often very fast, especially compared to the second approach.

The second approach does not use the standardized spread as a proxy for returns (see footnote 15). It simulates returns for a set of pre-given entry triggers on the standardized spread. Here entry triggers range from 0.2 to 10. Each constant trigger would have induced a specific return over the past 180 days. The trigger yielding the greatest return is used. As trading is discrete, two or more triggers could have yielded the same return over the past 180 days; in such cases the lowest trigger of this set is returned. This optimization has the advantage of being entirely model free. On the other hand, although it is intuitively easy to understand and implement, it is rather slow, hence optimization only takes place every fortnight. Also, it is suboptimal in that it is optimal on the exact past, but not on the current process.
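The simulation approach can be sketched as a grid search. The following Python fragment is a hedged illustration, not the code used for the results: the grid step of 0.2 is an assumption (the text only gives the range 0.2 to 10), `spread_ret` is assumed to hold the realized return of a long position in the real spread, the `cost` parameter is a hypothetical per-trade charge, and all function names are made up for the example.

```python
import numpy as np


def simulated_return(z, spread_ret, k, cost=0.0):
    """Total return over the window from trading with a constant trigger k.

    z[t] is the standardized spread and spread_ret[t] the return of being
    long the spread from t-1 to t. A position taken at t earns from t+1 on.
    `cost` is a hypothetical round-trip charge per opened trade.
    """
    pos, total = 0, 0.0
    for t in range(len(z)):
        total += pos * spread_ret[t]
        if (pos == 1 and z[t] >= 0) or (pos == -1 and z[t] <= 0):
            pos = 0  # close at the zero crossing
        if pos == 0:
            if z[t] > k:
                pos, total = -1, total - cost
            elif z[t] < -k:
                pos, total = 1, total - cost
    return total


def optimal_trigger(z, spread_ret, grid=None, cost=0.0):
    """Trigger with the greatest simulated return; ties go to the lowest."""
    if grid is None:
        grid = np.linspace(0.2, 10.0, 50)  # assumed step of 0.2
    returns = [simulated_return(z, spread_ret, k, cost) for k in grid]
    # argmax returns the first maximum, i.e. the lowest trigger on an
    # ascending grid, matching the tie-breaking rule in the text.
    return float(grid[int(np.argmax(returns))])


def employed_trigger(optimized_history):
    """Average of the three most recent optimized triggers."""
    return float(np.mean(optimized_history[-3:]))
```

In use, `z` and `spread_ret` would be the last 180 observations of a pair, `optimal_trigger` would be called every fortnight, and `employed_trigger` would be applied to the list of results so far.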

15 The equilibrium error follows a complex process, as each point indicates the standard deviations from equilibrium, but the equilibrium moves in time. These changing parameters make it hard to link the real profit of a trade to the change in the standardized spread. More precisely, a one standard deviation change in the standardized spread will not have the same return across different days. Thus maximizing the number of standard deviations gained by trading given a trigger k is not exactly maximizing the return. Here two stances can be taken: use the standardized spread as the proxy for the real spread series, or link the standardized spread to the real spread to construct an entirely new spread that is consistent with the real spread.


Figure 6.1: Example of a new trading model. Panel 1 shows the price series of Amsouth Bancorporation (red line) and Huntington Bancshares (blue line), and the price differential of Amsouth to Huntington (black line). Panel 2 depicts the return series of trading this pair, using the signals of panel 3. Panel 3 depicts the normalized spread series from regressing Huntington onto Amsouth using the past year of information. The error is divided by the residual standard error from the training period, and is thus transformed to the number of standard deviations from the equilibrium. The equilibrium spread is normalized to 0. Red lines are the trigger barriers. The red squares indicate a short position in the spread, i.e. shorting Amsouth and buying Huntington; the opposite holds for the green squares.


The trading model is applied to each pair ever sampled over its entire track record. The first 260 observations do not provide any trading and are used for the initial estimation. In the next 180 observations an entry trigger of 2 is used, as no optimized trigger has been generated at that point. From observation 440 the algorithm follows as above. An illustration of the algorithm at work is shown in figure 6.1, applied to the pair Amsouth Bancorporation and Huntington Bancshares. The third panel of the figure shows the standardized spread and the optimized triggers. Whenever the standardized spread is greater in absolute value than the triggers, a position is placed in the real spread depicted in the first panel. It is clear that the real spread does not resemble the standardized spread (the equilibrium error). Investing 19 dollars in this pair trade in 2002 would be worth 29 dollars in 2006, which is around an 11% annual return net of transaction costs, prior to financing costs, and above the risk free rate. Trading this pair from 1985 until 2007 using this strategy would have yielded an annual return of around 17%.
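The annual return quoted for the 2002 to 2006 example can be checked with a short calculation. It assumes a holding period of four full years, which the text does not state explicitly:

```latex
\left(\frac{29}{19}\right)^{1/4} - 1 \approx 1.1115 - 1 \approx 0.11
```

A growth from 19 to 29 dollars over four years thus corresponds to a geometric annual return of roughly 11%.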

The described trading strategy is just one of many potential applications. The two main points to stress about the application are: 1) it is implemented on a continuous basis, updating the equilibrium at any point and eliminating the potential for a permanent long/short position in a spread. From an informational point of view there is no reason for not using the latest information. 2) Optimizing entry triggers makes the strategy adapt to periods of high and low volatility, through the moving window of optimized triggers. A constant trigger will forgo trades in low volatility regimes and trade too much in high volatility regimes.

6.2 Pairing

In order to save time, pairing uses the results of the cointegration approach. Each month the entire data from the trace, eigen, PO, and ADF pairing is used to select 24 pairs. This starts by taking the 4 top ranking pairs of the trace method for which both 0 and 2 cointegrating relations have been rejected. Next the top 4 pairs are selected from the eigen test which are not among the four trace pairs already chosen. This procedure continues for the PO and ADF methods, yielding a total of 16 pairs. To the bucket of 16 pairs, the four pairs of highest correlation not yet in the bucket are added. Likewise the four lowest distance pairs are included, yielding a total of 24 pairs. This procedure is repeated every month.
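The monthly bucket construction amounts to a sequential "top four not yet chosen" selection. A minimal Python sketch follows, assuming each method already delivers its pairs ranked best first and that the trace ranking has been filtered beforehand on the rejection criterion; the function name `build_monthly_bucket` and the input layout are hypothetical.

```python
def build_monthly_bucket(rankings, per_method=4):
    """Pick the top `per_method` not-yet-selected pairs from each ranking.

    `rankings` maps a method name to a list of pairs ordered best first.
    The dict order sets the priority, e.g. trace, eigen, PO, ADF,
    correlation, distance.
    """
    bucket, seen = [], set()
    for method, ranked_pairs in rankings.items():
        added = 0
        for pair in ranked_pairs:
            key = frozenset(pair)  # (A, B) and (B, A) are the same pair
            if key in seen:
                continue
            seen.add(key)
            bucket.append((method, tuple(pair)))
            added += 1
            if added == per_method:
                break
    return bucket
```

With six rankings and `per_method=4` this returns up to 24 distinct pairs, each tagged with the method that first indicated it.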

A total of 5,488 pairs were generated. Several pairs were indicated more than once. Each of these pairs has been backtested using all historic information on that pair. The entire period backtest serves to calibrate entry triggers and to yield the greatest amount of information to date on that pair, relevant for the portfolio selection.


6.3 Portfolio Selection

At this point a trading model is available and has been used for backtesting a set of 5,488 pairs. Now it is time to figure out which of these pairs ought to be traded conditional on their past performance. There is an infinite number of portfolio methods that could be applied, but here we limit the investigation to very few methods, focusing especially on trend following. Pairs with a strong trend in their performance are those that fit the model best over the past days, and are likely to do so for reasons of market inefficiency. If this inefficiency persists, the highest performing pairs ought to be selected.

The portfolio selection is subject to two rules: (1) a pair not yet indicated by the pairing scheme cannot be included in the portfolio. (2) If a security leaves the S&P 500 index, it is allowed to be traded for a maximum of one year after leaving, as long as the average one month volume exceeds USD 1m (see footnote 16). The first rule is a simple matter of future information not being usable at present. The second rule is a matter of liquidity. It is assessed that members of the S&P 500 index are sufficiently liquid to induce average transaction costs of 10 bps, and that volume is sufficiently large for a small hedge fund to find it worthwhile to pursue the strategy. This is an assessment, not a known fact. Once a member leaves the index it is too dangerous to rely on the sufficient liquidity assumption, and its volume must be assessed. Clearly one could just terminate a trade on a leaving security, but then one has to have a suggestion as to why this pair no longer forms a valid pair. Conversely, one can think that higher returns can be made on securities with less competition. There is no reason to suddenly terminate a pair once one of the issues leaves the index.
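Rule (2) can be expressed as a small eligibility check. The Python sketch below is illustrative only: it assumes "one year" means 365 calendar days, that the caller passes the observations of the most recent month, and that volume is the dollar turnover (shares traded times closing price) described in footnote 16. The function name is hypothetical.

```python
def may_trade_after_index_exit(days_since_exit, shares_traded, closing_prices,
                               max_days=365, min_avg_dollar_volume=1_000_000.0):
    """Check whether a security that left the index may still be traded.

    `shares_traded` and `closing_prices` hold the daily observations of the
    last month. Dollar volume is shares traded times closing price.
    """
    if len(shares_traded) == 0 or len(shares_traded) != len(closing_prices):
        raise ValueError("need matching, non-empty volume and price series")
    if days_since_exit > max_days:
        return False
    dollar_volume = [s * p for s, p in zip(shares_traded, closing_prices)]
    return sum(dollar_volume) / len(dollar_volume) > min_avg_dollar_volume
```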

Thirteen portfolio selection variables are considered:

1-3 Trend following (TF), selecting pairs with the highest distance of accrued return to its 65, 130, or 260 observation rolling average.

4-6 Selecting the pairs with the highest total return over the last 65, 130, or 260 days (LT).

7-9 Special return metrics:

    7 Normalized return, selecting pairs with the highest per trade return over duration, estimated on the past year.

    8 Sharpe, selecting the highest Sharpe ratio pairs, estimated on the past year.

    9 Predict, selecting the highest predicted return pairs.

10 Minimum variance, selecting pairs with the lowest intra trade variance over the past year.

11 Gatev et al. (2006) (GGR), the method of selecting recent pairs and holding them for 6 months.

12 Duration, selecting pairs with the lowest average trade duration over a 1 year rolling window.

13 R2, selecting pairs with the highest regression R2 from the equilibrium.

16 Volume is normally quoted in number of shares traded, but here volume refers to the number of shares traded times the closing price, indicating the total dollar turnover on that day.

Explanations of the selection schemes follow.

(1-3) The trend following variables compute the cumulative return from backtesting a pair; on this its rolling average (65, 130, 260) is calculated. The cumulative return over its rolling average gives the trend strength. The 160 pairs with the highest positive strength may open new trades. If a pair leaves the top 160 members, active trades are not terminated.

(4-6) These are similar to trend following but use only the total return over the last 65, 130, or 260 days as the allocation variable.

(7-8) Each trade terminated within the last 260 days in a pair is evaluated: its return, duration, and intra trade standard deviation are calculated. The average of return over duration gives the normalized return, and the average of return over standard deviation gives the Sharpe. Again the 160 pairs with the highest normalized return or Sharpe ratio are allowed to open new trades.

(9) A linear regression is estimated explaining future return by: present equilibrium R2, time since the pair was last indicated by pairing, its present trend strength, recent total return variables, and more. From this model the one week ahead returns are predicted and the model R2 is returned. The prediction times R2 gives the predicted return controlled by its conviction. The 160 pairs with the highest predicted return (multiplied by conviction) may open new trades.

(10) Uses the intra trade variance, i.e. the average variance when trading over the last 260 days. The lowest variance trades are selected for future trading.

(11) This is the distance approach method of selecting recent pairs; however, trades are not closed at the six month expiry, but new trades are not opened.

(12) The 160 pairs with the lowest past trade duration are selected, aiming for fast returns and short duration market exposure.

(13) The equilibrium R2 of the cointegrating regression from the backtests has been saved. The 160 pairs with the strongest relationship are allowed to trade. A strong relationship supports the control hypothesis: there might be no fundamental relationship, just short abnormal movements which can be identified by a control variable.
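The trend following selection (1-3) can be illustrated with a short Python sketch. It is one possible reading of the description rather than the code behind the results: trend strength is taken as the cumulative return index divided by its rolling mean, minus one, and the names `trend_strength` and `select_trending_pairs` are hypothetical.

```python
import numpy as np


def trend_strength(daily_returns, window=65):
    """Cumulative return index relative to its rolling mean, minus one.

    Assumption: "cumulative return over its rolling average" is read as a
    ratio of the compounded return index to its trailing mean.
    """
    index = np.cumprod(1.0 + np.asarray(daily_returns, dtype=float))
    if len(index) < window:
        return float("nan")
    return float(index[-1] / index[-window:].mean() - 1.0)


def select_trending_pairs(returns_by_pair, window=65, top_n=160):
    """Pairs with the highest positive trend strength, strongest first."""
    scored = [(trend_strength(r, window), pair)
              for pair, r in returns_by_pair.items()]
    positive = [(s, pair) for s, pair in scored if s > 0]  # NaN is dropped
    positive.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in positive[:top_n]]
```

Pairs returned by `select_trending_pairs` would be the ones allowed to open new trades on that date; trades already open in pairs that drop out are left untouched, as described above.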

Portfolio selection method (11) has the nice feature of increasing weights on pairs that strike most often. Furthermore, a pair in the pairing period is likely to be a very profitable pair, thus selecting recent pairs is similar to selecting recently profitable pairs. Many of the other variables seem to favour the strongest past performers and weight these equally. The prediction method is nice from the perspective that it automatically selects which input variables are relevant. A close substitute for the prediction method, and potentially a favourable one, is using a time series model for predicting future returns rather than a linear regression on presently observed variables. Common to all of the portfolio schemes is that they use one attractive variable and allocate by relative attractiveness on every single date. There is no balancing and the decisions are binary. If a pair is among the top 160 most attractive pairs, a new trade may open in this pair, and this trade will get a fixed weight relative to equity. This weight is set on the assumption that, in the long run, on average a fixed number of trades will strike at once. The weight is chosen to ensure a gross leverage of 2 on average.

Each pair is an implementation of a strategy by itself that ought to deliver arbitrage-like profits. In arbitrage circumstances the right decision is to allocate as much capital as possible to the highest predicted return trades. Trouble arises from the fact that the profits are not pure arbitrage profits, and expected returns are hard to calculate. The remedy is to weight equally amongst the highest predicted returns, using recent performance as the best indicator.

Figure 6.2: Applying the new approach trading model under various selection methods. For summary statistics refer to table 6.1.

The technical set-up to perform this strategy in real time can be thought of as follows: 1) a thread is allocated to each pair, updating the single pair dataset on each observation. Once a thread has updated a pair it 2) sends each of the portfolio variables to a central cluster. This cluster 3) makes the relevant portfolio decisions, indicating which pairs may open new trades (the recently strongest pairs in relative terms). The cluster 4) sends these green lights to the portfolio manager. 5) The portfolio manager consolidates incoming trades on pairs allowed to trade and sends these orders to the prime broker. Likewise the portfolio manager keeps track of all active pairs, sending closing signals to the prime broker at convergence.

The returns of the continuous pairs trading model under the thirteen selection schemes are plotted in figure 6.2, and various performance numbers are presented in table 6.1. Many conclusions can be drawn from the figure. Firstly, the trend following TF and LT variables are the top performers. Table 6.1 shows that TF 65 and LT 65, which use returns over the last 65 observations (roughly three months), are superior to the longer horizon trend following variables. Secondly, the GGR method has shown declining returns over long periods. Perhaps the recently selected pair paradigm is so popular that competition is high, or maybe it is not the most relevant paradigm. Thirdly, the most complex method, predicting returns, is amongst the most inferior portfolio schemes despite its flexibility. Fourth, the minimum variance method (not to be confused with Markowitz MVA) has indeed worked in finding low volatility trades, but unfortunately not positive return trades in recent years. One of the big surprises is the R2 method, which despite being disconnected from historical pair performance generates high returns at low risk. That the strong equilibria perform well confirms the model, and the meaningfulness of conducting pairs trading as a way of controlling for common information. Lastly, and perhaps most importantly, it is evident that performance has decreased for all the portfolios since 2010, indicating a weak future for this model. Perhaps refuelling the model with a stronger trading paradigm or smarter optimization can make profitability hold in the future. In fact only four selection methods (TF 65, TF 130, LT 65, Rsq) have shown steady positive returns in recent years. When performance is flat it is a sign of capturing some inefficiency strong enough to cover the transaction costs of all the other dead end trades.

                          TF 260     TF 130     TF 65      LT 260     LT 130     LT 65
Geometric ann. return     0.1318     0.1537     0.1847     0.1249     0.1222     0.1483
Ann. std. dev.            0.0726     0.0736     0.0747     0.0743     0.0744     0.0743
Sharpe(rf)                1.814      2.0872     2.4713     1.6818     1.6427     1.9948
Max drawdown              0.2558     0.1832     0.1037     0.1686     0.2308     0.1437
Max drawdown duration     1200       1169       753        1130       1199       1120
Months up (normality)     0.6854     0.7084     0.7388     0.6736     0.6701     0.7007
Skewness                  7.2642     7.0038     8.4768     7.5332     6.1514     6.5376
Excess kurtosis           35.5622    37.8313    48.0339    39.863     33.4091    33.5828
Avg. assets held daily    207.5691   209.7605   201.0791   184.3598   201.5864   195.6456
Avg. weight               0.0048     0.0047     0.0049     0.0054     0.0049     0.0051

                          Norm. return   Sharpe     Predict    Min var    GGR        dur        Rsq
Geometric ann. return     0.1292         0.0563     0.0569     0.0484     0.079      0.0826     0.1084
Ann. std. dev.            0.0849         0.0649     0.0843     0.0511     0.0871     0.0683     0.066
Sharpe(rf)                1.5214         0.8668     0.6746     0.9475     0.9074     1.2091     1.6428
Max drawdown              0.1465         0.2877     0.3015     0.2278     0.319      0.2602     0.2622
Max drawdown duration     1092           1560       1527       2777       1302       1366       1288
Months up (normality)     0.6586         0.596      0.5768     0.604      0.6004     0.6299     0.6707
Skewness                  5.4947         4.497      1.9722     5.2828     7.1217     5.2701     5.7819
Excess kurtosis           39.9129        25.3589    28.5042    62.8279    118.5389   34.5336    59.8249
Avg. assets held daily    145.9915       141.7071   103.6971   92.4309    128.4334   130.2953   147.6222
Avg. weight               0.0068         0.007      0.0092     0.0105     0.0077     0.0074     0.0067

Table 6.1: Summary statistics of a new trading approach through various selection methods, plotted in figure 6.2. 10 bps transaction costs and financing costs of pb = 30 and rebate = 25 bps have been applied. All backtests have a gross leverage close to 2. For variable explanation refer to Appendix 10 Summary Statistic Tables.

Table 6.1 shows that total performance has been great for many of the portfolio variations. Clearly these performance metrics are somewhat troublesome as they cover four types of period: 1) strong performance, 1986-1991 and 1993-2003; 2) slow and safe markets (flat returns), 2004-2007; 3) dislocation (explosive performance), 1992 and 2008-2009; 4) weak performance, 2010-2015. Nevertheless, sticking with some of the strategies through all ups and downs would have yielded annual returns in the region of 12-18% for the trend following type methods, with a volatility of less than 8%.

The two short term trend following methods TF65 and LT65 will next be combined into one strategy, which will be presented in detail. These methods are selected due to their strong historical risk and return profile, and because they show some promising performance in recent years. Most importantly, trend following indicates which pairs fit the trading model best, hence the trend following systems are considered in detail.

6.4 Performance of Short Term Trend Following

The two trend following systems TF65 and LT65 are combined into one by adding the signal series. This implies that pairs that are trending by both metrics get the highest allocation, while low conviction pairs still get some allocation, making the strategy both diversified and unbalanced, betting on the strongest horses. Furthermore, combining two portfolios decreases average transaction costs in practice, as average weights are reduced compared to using a single portfolio. A return plot of the combination of LT65 and TF65 is given in figure 6.3, return statistics are presented in table 6.2, and trading statistics are given in table 6.3. Overall performance is great, with the highest Sharpe ratio yet seen of 2.29. There is a decent amount of skewness and the kurtosis is fairly low. Despite a high historical Sharpe ratio one should not expect a return higher than 5% per annum in the years to come. If volatility permanently increases or the market crashes, one can experience increased returns. That the strategy works well in extreme market conditions can give


Return statistics
                               All         <2003-11-01  <2008-07-01  <2009-10-01  <2015-06-29
Arithmetic daily return (bps)  6.147       7.8222       0.243        23.1019      1.9537
Geometric ann. return          0.17        0.2223       0.0045       0.8005       0.0501
Daily std. dev.                0.0046      0.0045       0.0037       0.0096       0.0038
Ann. std. dev.                 0.0742      0.0719       0.0598       0.1551       0.0618
Sharpe(0)                      2.8804      3.9822       0.6464       5.2231       0.8327
Sharpe(rf)                     2.292       3.0917       0.076        5.1629       0.8114
Correlation to SPX             0.0163      -0.0396      -0.0892      0.0312       0.0245
β                              0.0064      -0.0155      -0.0021      0.0008       0.0007
Ann. α                         0.1728      0.2267       0.0089       0.8143       0.0513
t-α                            12.5053     13.2217      0.3167       5.7648       1.9659
R2                             0.0003      0.0016       0.008        0.001        0.0006
Std. dev. on error             0.0742      0.0719       0.0596       0.1552       0.0618
Information Ratio              2.329       3.1557       0.1497       5.2457       0.8313
Max loss                       0.0347      0.0292       0.0116       0.0347       0.0154
Max drawdown                   0.1189      0.0914       0.1189       0.0781       0.0663
Max drawdown duration          993         127          993          62           191
Days up                        0.547       0.5723       0.4931       0.5573       0.5083
Months up (normality)          0.7249      0.7836       0.5117       0.8587       0.5902
Skewness                       7.705       7.2739       2.3906       5.0218       3.163
Excess kurtosis                42.1473     40.2396      4.0623       13.2426      12.4977
Start                          1985-07-02
End                            2015-07-29

Table 6.2: Performance of trend following pairs as plotted in figure 6.3. The first column gives performance numbers on the entire data, the second on data indexed prior to 2003-11-01, the third on the period 2003-11-01 to 2008-07-01, and so forth for columns four and five. The periods may be named: uncompetitive, flat, crisis and recovery, competitive. For variable explanations refer to Appendix 10 Summary Statistic Tables.






Table 6.3: Trading statistics of trend following pairs TF65 and LT65 as plotted in figure 6.3. For variable explanations refer to Appendix 10 Summary Statistic Tables.


Figure 6.3: Performance of the continuous trading model applied with the portfolio selection methods TF65 and LT65. Transaction costs of 10 bps and financing costs of pb = 30 bps and rebate = 25 bps have been applied. Return statistics are reported in table 6.2.

rise to the interpretation that it serves as insurance to an equity portfolio. However, this was not the case during the IT bubble, perhaps because predominantly IT companies were inflated and deflated at the same pace. Due to the criterion of dislocation there is no guarantee of insurance. The great performance prior to 2004 may be attributed to technical limitations that for a large fraction of the period hindered the mere possibility of conducting the necessary computations. The technical developments in the recent 15 years have been immense. A $200 mid-level graphics card from 2012 could deliver the same power as the fastest supercomputer in the year 2000, priced at $50 million17.

The great performance results, and the fact that they persist at a lower rate, can perhaps bring pairs trading back into the realm of a potential premium. The fact that trend following has worked well in a portfolio selection scheme reinforces the interpretation of a persistent inefficiency being traded. Each pair is by itself an implementation of a trading strategy; if the performance of these were a random walk, timing by trend following would be impossible. The fact that trend following has worked indicates that the strategy can be timed and that the underlying performance is systematic.

17 http://data-informed.com/fast-database-emerges-from-mit-class-gpus-and-students-invention/


Pair statistics
No. of pairs                                       4734
Common pairs                                       4400
Unique LT65 pairs                                  122
Unique MR65 pairs                                  212
No. of trades                                      92131
No. of unique trades                               55490
Avg. new trades per trading day                    5.01
Avg. of days elapsed since indicated for pairing   1615.06

Returns
Mean                                               0.0187
Mean long                                          0.0269
Mean short                                         -0.0083
Std. dev.                                          0.1453
Median                                             0.0348
Skewness                                           -1.9682
Excess kurtosis                                    40.0603
Pct. > 0                                           0.7879
Q5                                                 -0.2233
Q95                                                0.1586
Mean, trace pairs                                  0.0163
Mean, eigen pairs                                  0.018
Mean, PO pairs                                     0.0225
Mean, Engle & Granger pairs                        0.0185
Mean, correlation pairs                            0.0196
Mean, distance pairs                               0.0174
Mean, duration < 60 days                           0.0498
Mean, duration > 60 days                           -0.0841

Duration
Mean duration                                      43.8933
≤ 1                                                0.066
≤ 7                                                0.3124
≤ 14                                               0.4574
≤ 30                                               0.6241
≤ 60                                               0.7661
≤ 90                                               0.8458
Q95                                                175

Table 6.4: Summary statistics on the underlying trades of the trend following strategies LT65 and TF65 presented in table 6.2 and figure 6.3. Avg. of days elapsed since indicated for pairing is the number of days since a pair was last indicated by the pairing algorithm. Duration represents the fraction of trades closed within 1 day, 7 days, etc.


Summary statistics on the underlying trades for each of the pairs in the combined short term trend following strategy are presented in table 6.4. From table 6.3 it may be deduced that the portfolio was flipped 22 times a year. From the trade statistics in table 6.4 the level of this turnover is even more evident; it appears that 55,490 unique trades were executed in total. Correspondingly, 5 unique pair trades would be executed daily. The statistics are based on 4,734 pairs and on the non-unique trades, such that trades indicated by both TF65 and LT65 get an increased weight. The average trade was opened in a pair that was last paired by the pairing algorithm 1,615 days (4.5 years) earlier. 7.34% of the trades were opened less than a month after the pair was last formed. This information tells that recently formed pairs are likely to have performed well recently. There is a bias towards trading recently formed pairs, and yet there is a great dispersion in the elapsed days since formation, thus old pairs must not be neglected.

Table 6.4 displays the distribution of returns on the trades. The distribution is neither normal nor symmetric. It has a heavy left tail; the shape is a function of many short duration trades closing within two weeks, generating a large mass of positive returns. The remaining mass is much more dispersed. Although all trades do converge at some point in time, there is still a probability that a trade does not converge for a long time. Each equilibrium error is out of sample, so the sequence may take a long time to converge. A simple intuition is that a trade may take place in a false equilibrium; if momentum exists, the losing share will keep losing whilst the winning share will keep winning. In this case α may increase (or decrease, depending on which security is on the left hand side), consistently creating a probabilistic barrier for the error to cross 0. Such effects of updating the model create a fraction of trades with long duration and dispersed returns.

On average there is a break-even in returns at 60 days (2 months, see table 6.5, regression B); trades beyond this point lose on average. Trades closed within 60 days generated an average return of 5%, whilst those beyond generated -8.4%. This fact might make one apply a rule of forced closing after 40 trading days. By mere coincidence we recorded the return on the 40th trading day, which allows us to evaluate the result of capping at the 40th trading day: the average return declines, the probability mass between -0.25 and -0.02 increases, and all other mass decreases. Thus capping at 40 trading days decreases volatility; specifically, it removes some of the most extreme observations, but unfortunately it also removes some medium duration high return trades, reducing the average return. Trades unprofitable beyond 60 days were also bad trades on the 60th day. In the portfolio set-up, capping at 60 days does not necessarily decrease profits even though the average trade return decreases. Capping reduces volatility, which in expectation increases returns. It also leaves the capital more flexible, as less is bound in long trades.
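As a worked check of the break-even claim, the fitted line of regression B in table 6.5 (intercept 0.0669, slope -0.0011 per day of duration) can be solved for the duration at which the predicted trade return is zero. The sketch below is illustrative only: the function name is hypothetical, and the coefficients are the rounded values printed in the table.

```python
def break_even_duration(intercept, slope):
    """Duration at which the fitted line intercept + slope * duration crosses zero."""
    if slope >= 0:
        raise ValueError("a break-even point requires a negative slope")
    return -intercept / slope


# Rounded estimates of regression B as printed in table 6.5.
INTERCEPT_B = 0.0669
SLOPE_B = -0.0011

days = break_even_duration(INTERCEPT_B, SLOPE_B)
print(f"break-even duration: {days:.1f} days")
```

With the rounded coefficients this gives roughly 61 days, in line with the 60 days quoted in the text.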

Table 6.4 shows that extreme returns are made, one trade losing more than 300% and one gaining more than 300%. This high trade volatility does not feed directly into portfolio volatility, as a single pair normally only takes 1% of the portfolio. Furthermore, the frequent trading makes the idiosyncratic noise diminish. About 79% of the trades have


Regression A
Left hand side variable: return

                        Estimate   Std. Error   t-value
Intercept               -0.0013    0.0051       -0.2467
Days from pairing       0          0            -2.1058
65 days return          0.0044     0.0006       6.8163
130 days return         0.0027     0.0004       6.8424
260 days return         0.0003     0.0001       2.4167
R2 (equilibrium)        0.003      0.002        1.539
Avg. duration           0          0            -0.4822
Avg. Sharpe             0.0039     0.0014       2.8353
Avg. return/vol.        0.0448     0.0035       12.7876
Avg. intra trade vol.   0          0            -0.6642
Predicted               0.0002     0            6.8751
Trigger                 0.0025     0.0012       2.1503
Correlation             0.002      0.0053       0.3761
diff_term               0          0            1.4919
diff_vol                -0.9193    0.4642       -1.9802
pca                     0.0013     0.0012       1.0652
clust                   -0.0001    0.0001       -0.5411
Eigen                   0.0012     0.0016       0.7782
PO                      0.0054     0.0017       3.2612
ADF                     0.0012     0.0016       0.754
cor                     0.0036     0.0019       1.8645
dist                    0.0032     0.0026       1.2534
R2 adj.                 0.0052
F-statistic             21.41 on 21 and 82571 DF

Regression B
Left hand side variable: return

                        Estimate   Std. Error   t-value
Intercept               0.0669     0.0005       130.6586
Duration                -0.0011    0            -165.5303
R2 adj.                 0.2297
F-statistic             27400 on 1 and 91891 DF

Table 6.5: Regression statistics using trade data from LT65 and MR65. Regression A regresses the future return on numbers known at entry. Regression B represents the relationship between the return and the duration of a trade.


on average been successful. Before and after 2009-10-01 this number was 79.5% and 75.3% respectively. More dramatically, the average trade return moved from 2.1% to 0.8%.

The average entry trigger of the trades was 0.6 standard deviations from equilibrium. This does not necessarily imply that too high a trigger has been used in the distance and cointegration approaches.

Some regression statistics are given in table 6.5. Regression A regresses the return purely on information known at entry. A few interesting observations can be drawn: 1) Variables regarding past return matter for future return, thus the entire trend following set-up is indeed meaningful. 2) High trigger trades logically generate greater returns, implying that one could balance the weights by the triggers. 3) It is significant that the Phillips and Ouliaris and correlation pairs outperform the others, implying that these metrics ought to get an increased weight in renewing the pair universe. 4) The explanatory degree R2 is so small that implementing the information can only marginally improve returns. The last observation to some degree counteracts the generalized framework. The fact that the explanatory variables do not explain much within the already trending pairs speaks against further optimization of the portfolio. Perhaps information on pairs not traded, and trades not made, is useful. Indeed, historic performance prior to trading has worked. In other words, the set-up of continually optimizing cannot take place on the past portfolio information. Information from outside the portfolio (not traded) is on the other hand useful. More generally, the optimization might be better off regarding optimal windows, switching between types of portfolios and the like, than predicting performance.

To summarize, the new approach has worked: more trades are successful than under the previous fixed period frozen parameters paradigm. Furthermore, trades are often closed within a very short duration, which is a key parameter for success: as documented, high duration is bad for the strategy; it is purely short term mispricing that is sought to be captured. Since 2010 performance has rapidly decreased. The TF65 and LT65 portfolios are optimal in the sense that past realized trade information cannot meaningfully be used to increase returns. Further internal optimization is not meaningful.

In the next two short sections, two variations of the strategy just presented will be reported. The first is a long only version, and the second is the long only version hedged by the market. The hedged version can in essence be seen as a feasible strategy if the investor has limits on short selling.

6.4.1 Long Only

So far it has been evident that the short side is generally losing due to the positive equity premium. In this regard some investors may prefer investing in the long side only. This is done by removing all net signals less than 0. Even with the short side removed, one pair may indicate purchasing security A whilst another may indicate shorting A. In


Return statistics
Arithmetic daily return (bps)    7.8486
Geometric ann. return            0.1982
Daily std. dev.                  0.0133
Ann. std. dev.                   0.2148
Sharpe(0)                        1.1314
Sharpe(rf)                       0.9228
Correlation to SPX               0.9178
β                                1.0512
Ann. α                           0.1473
t-α                              9.2747
R2                               0.8425
Std. dev. on error               0.0853
Information Ratio                1.7273
Max loss                         0.1812
Max drawdown                     0.6104
Max drawdown duration            572
Days up                          0.5504
Months up (normality)            0.6039
Skewness                         -2.6463
Excess kurtosis                  51.5738
Start                            1985-07-02
End                              2015-07-29

Returns
Geom. ann. gross return long     0.2772
Geom. ann. gross return short    0
Geom. ann. transaction costs     -0.0222
Geom. ann. financing costs       -0.0374
Pct. turnover of m.val.          0.0873

Leverage
Margin violations                0
No. of maintenance actions       0
No. of initial margin actions    0
Avg. inst. leverage              1.04
Avg. investment ratio            0.308
Maximal invested                 0.4419
Avg. gross leverage              0.9993
Max gross leverage               1.3884
Avg. net leverage                0.9993
Max net leverage                 1.3884
Min net leverage                 0
Avg. maintenance to equity       0.2498
Max maintenance to equity        0.3471


Table 6.6: Performance of trend following pairs long only; base strategy as of figure 6.4. The return plot is given in figure 6.4. For variable explanations refer to Appendix 10 Summary Statistic Tables.


Figure 6.4: Performance of trend following pairs long only. Summary statistics are given in table 6.6

such a case A is neither bought nor sold.

When applying the long only strategy the target average leverage is set to 1. It is crucial that the fund has access to invest in money market products, and to obtain leverage periodically, for the returns to represent excess returns. So far financing costs have been constructed in a lavish manner, not using internal funds before seeking external ones. This lavish manner is very critical in the long only strategy: the prime brokerage spread is paid on all trades even though on average it can be fully avoided. Thus the financing costs are altered:

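$$FC_t = \max\left(E_{t-1} - \frac{INT^{long}_{t-1}}{IN^{pct}},\ 0\right) r^f_t - \max\left(\frac{INT^{long}_{t-1}}{IN^{pct}} - E_{t-1},\ 0\right) r^{pb}_t - E_{t-1} r^f_t \qquad (6.1)$$

$$\Leftrightarrow\quad FC_t = -\max\left(\frac{INT^{long}_{t-1}}{IN^{pct}} - E_{t-1},\ 0\right) pb - E_{t-1} r^f_t \qquad (6.2)$$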

Equation (6.1) has three components: first, the return from investing free equity in money market products; second, any dollar invested above the equity value is a borrowed dollar paying the brokerage lending rate; third, the forgone profit is subtracted to yield excess returns.
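The three components of equation (6.1) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: the function and argument names are hypothetical, `invested_long_prev` stands in for the long-side investment term that is compared with E_{t-1} in the equation, and all rates are per-period rates.

```python
def financing_cost(equity_prev, invested_long_prev, rf, r_pb):
    """Per-period financing cost with the three components of equation (6.1).

    equity_prev        -- equity E at t-1
    invested_long_prev -- assumed dollar amount invested on the long side at t-1
    rf                 -- risk free rate for the period
    r_pb               -- brokerage lending rate for the period
    """
    free_equity = max(equity_prev - invested_long_prev, 0.0)  # earns the risk free rate
    borrowed = max(invested_long_prev - equity_prev, 0.0)     # pays the lending rate
    forgone = equity_prev * rf                                # subtracted to obtain excess returns
    return free_equity * rf - borrowed * r_pb - forgone


# Hypothetical numbers: equity of 100, daily rf of 1 bp, daily lending rate of 1.2 bps.
under_invested = financing_cost(100.0, 80.0, 0.0001, 0.00012)
leveraged = financing_cost(100.0, 130.0, 0.0001, 0.00012)
print(under_invested, leveraged)
```

In the under-invested case only the forgone risk free return on the invested part remains as a cost; in the leveraged case the lending rate is paid on the 30 borrowed dollars on top of the forgone risk free return on equity.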

The results of the long only version are depicted in figure 6.4 and summary statistics are given in table 6.6. The figure and table show a high beta strategy (1.05) that is highly risky, but nevertheless also very profitable, yielding an annual return of almost 20% over the risk free rate. The alpha corresponds to 15% and is significantly greater than 0. Although this version has had the greatest drawdown seen so far, it was recovered in almost two years.


Figure 6.5: Performance of trend following pairs long only short the market. Summary statistics are given in table 6.7. For variable explanations refer to Appendix 10 Summary Statistic Tables.

6.4.2 Hedging by SPX Futures

The long only strategy above may appeal to investors who do not like the hedge. It could also apply to investors who do not have the means to short individual securities. However, these investors are likely to be able to sell futures on the S&P 500 index. Consider that for each dollar purchased in a security, one dollar of the S&P 500 index is sold18. Results of this strategy are plotted in figure 6.5 and return statistics are given in table 6.7. The profile is quite similar to the basis strategy, although it has some beta left. This version seems less prone to the nature of pairs trading, as it is only half a pairs trading strategy where most of the beta is hedged. The Sharpe ratio is once again great, but tail risk has increased. The annual return since 2009-10-01 is 5.9% with a volatility of 9.2%.

18 Trading the S&P 500 index may not be easy in practice. Here the exact returns of the index are used, whereas actual returns from trading futures may differ.
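The dollar-for-dollar hedge described above can be sketched as follows. This is an illustration only: the function and argument names are hypothetical, `long_exposure` is assumed to be the long market value relative to equity at the start of each day, and any costs of trading the index are ignored.

```python
def hedged_returns(long_only_returns, long_exposure, index_returns):
    """Daily return on equity after selling one dollar of index per dollar held long."""
    if not (len(long_only_returns) == len(long_exposure) == len(index_returns)):
        raise ValueError("all series must have the same length")
    return [
        r - w * m
        for r, w, m in zip(long_only_returns, long_exposure, index_returns)
    ]


# Hypothetical two-day example.
example = hedged_returns(
    long_only_returns=[0.012, -0.020],
    long_exposure=[1.0, 1.1],
    index_returns=[0.010, -0.015],
)
print(example)
```

On the second day the short index position gains 1.1 times 1.5%, which offsets most of the loss on the long side.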









Table 6.7: Performance of trend following pairs long only short the market. Returns are plotted in figure 6.5.


7 Conclusion

In this paper two pairs trading models have been tested. Firstly, the distance approach of Gatev et al. (2006) has been replicated and extended to recent data. The results showed that the method has been unprofitable over recent years; only the great turmoil of the financial crisis and recovery has been a sufficient catalyst to generate positive returns. Cointegration was applied to the model in the quest to alleviate recent negative returns. Contrary to Huck and Afawubo (2015) and Caldeira and Moura (2013), we do not find that our application of the cointegration approach generates positive returns in recent years. Nevertheless, important results relating to the cointegration approach have been found. Evidence of market inefficiency has been found through a high number of securities being cointegrated. It was also shown that the securities with the highest cointegration statistic have generated the strongest returns. Among four cointegration tests there has been a weak indication that the Phillips and Ouliaris test is the preferred pairing metric.

The recent poor performance of both the cointegration and distance approaches has been attributed to the frozen parameter fixed window set-up. To alleviate the problems a generalized pairs trading framework has been suggested. The framework is a backward induction framework, starting with the most important part of the model, namely trading. The trading model suggested is a continuous cointegration model with a built-in optimization of entry triggers. Given the trading model, pairs have been selected to fit it. The pairs with the highest returns, i.e. the strongest trend, are those that fit the model best, and those that have been selected for trading. Using the continuous trading model and continuous selection by trend following while renewing the pair universe every month, an annual excess return of 17% since 1984 can be reported. A Sharpe ratio of 2.29 has been observed, and returns from 2010 to the summer of 2015 average 5% p.a. A long only version as well as a long only version hedged by the market have been suggested, both significantly outperforming the market. It has been shown that even this model of increased sophistication is subject to decreased performance in recent years. Many pairs trading portfolios have proven unprofitable since 2010.


References

Alsayed, H. and F. McGroarty (2012). Arbitrage and the law of one price in the market for American depository receipts. Journal of International Financial Markets, Institutions and Money 22 (5), 1258–1276.

Baronyan, S. R., İ. İ. Boduroğlu, and E. Şener (2010). Investigation of stochastic pairs trading strategies under different volatility regimes. The Manchester School 78 (s1), 114–134.

Bogomolov, T. (2013). Pairs trading based on statistical variability of the spread process. Quantitative Finance 13 (9), 1411–1430.

Bowen, D., M. C. Hutchinson, and N. O'Sullivan (2010). High frequency equity pairs trading: transaction costs, speed of execution and patterns in returns. The Journal of Trading 5 (3), 31–38.

Broussard, J. P. and M. Vaihekoski (2012). Profitability of pairs trading strategy in an illiquid market with multiple share classes. Journal of International Financial Markets, Institutions and Money 22 (5), 1188–1201.

Caldeira, J. F. and G. V. Moura (2013). Selection of a portfolio of pairs based on cointegration: A statistical arbitrage strategy. Brazilian Review of Finance 11 (1), 49–80.

Chen, Z. and P. J. Knez (1995). Measurement of market integration and arbitrage. Review of Financial Studies 8 (2), 287–325.

Do, B. and R. Faff (2008). Does naïve pairs trading still work. Technical report, Working Paper.

Do, B. and R. Faff (2012). Are pairs trading profits robust to trading costs? Journal of Financial Research 35 (2), 261–287.

Do, B., R. Faff, and K. Hamza (2006). A new approach to modeling and estimation for pairs trading. In Proceedings of 2006 Financial Management Association European Conference.

Elliott, R. J., J. Van Der Hoek, and W. P. Malcolm (2005). Pairs trading. Quantitative Finance 5 (3), 271–276.

Engelberg, J., P. Gao, and R. Jagannathan (2009). An anatomy of pairs trading: the role of idiosyncratic news, common information and liquidity. In Third Singapore International Conference on Finance.

Engle, R. F. and C. W. Granger (1987). Co-integration and error correction: representation, estimation, and testing. Econometrica: Journal of the Econometric Society, 251–276.

Gatev, E., W. N. Goetzmann, and K. G. Rouwenhorst (2006). Pairs trading: Performance of a relative-value arbitrage rule. Review of Financial Studies 19 (3), 797–827.

Gatev, E. G., W. N. Goetzmann, and K. G. Rouwenhorst (1999, March). Pairs trading: Performance of a relative value arbitrage rule. Working Paper 7032, National Bureau of Economic Research.

Hameed, A., J. Huang, and G. M. Mian (2010). Industries and stock return reversals. SSRN eLibrary.

Harlacher, M. (2012). Cointegration based statistical arbitrage.

Huck, N. (2009). Pairs selection and outranking: An application to the S&P 100 index. European Journal of Operational Research 196 (2), 819–825.

Huck, N. (2010). Pairs trading and outranking: The multi-step-ahead forecasting case. European Journal of Operational Research 207 (3), 1702–1716.

Huck, N. (2013). The high sensitivity of pairs trading returns. Applied Economics Letters 20 (14), 1301–1304.

Huck, N. and K. Afawubo (2015). Pairs trading and selection methods: is cointegration superior? Applied Economics 47 (6), 599–613.

Jacobs, H. and M. Weber (2015). On the determinants of pairs trading profitability. Journal of Financial Markets.

Johansen, S. (1995). Likelihood-based inference in cointegrated vector autoregressive models. OUP Catalogue.

Juselius, K. (2006). The cointegrated VAR model: methodology and applications. Oxford University Press.

Liew, R. Q. and Y. Wu (2013). Pairs trading: A copula approach. Journal of Derivatives & Hedge Funds 19 (1), 12–30.

Mori, M. and A. J. Ziobrowski (2011). Performance of pairs trading strategy in the US REIT market. Real Estate Economics 39 (3), 409–428.

Nath, P. (2003). High frequency pairs trading with US Treasury securities: Risks and rewards for hedge funds. Available at SSRN 565441.

Papadakis, G. and P. Wysocki (2007). Pairs trading and accounting information. Boston University and MIT Working Paper.

Pedersen, L. H. (2015). Efficiently Inefficient: How Smart Money Invests and Market Prices Are Determined. Princeton University Press.

Phillips, P. C. and S. Ouliaris (1990). Asymptotic properties of residual based tests for cointegration. Econometrica: Journal of the Econometric Society, 165–193.

Puspaningrum, H. (2012). Pairs trading using cointegration approach.

Triantafyllopoulos, K. and G. Montana (2011). Dynamic modeling of mean-reverting spreads for statistical arbitrage. Computational Management Science 8 (1-2), 23–49.

Vidyamurthy, G. (2004). Pairs Trading: quantitative methods and analysis, Volume 217. John Wiley & Sons.


Appendix

8 Literature Overview

Author(s)                         Distance / Cointegration /   R        SR     Data        Start   End
                                  Stochastic / 2FP marks
Distance
Gatev et al. (2006)               x x                          0.1126          S&P 500     1962    2006
Do and Faff (2008)                x x                                          S&P 500     1962    2008
Do and Faff (2012)                x x                          0.0428          S&P 500     1962    2008
Broussard and Vaihekoski (2012)   x x                          0.1048   0.54   Finland     1987    2008
Bowen et al. (2010)               x x                          0.0702   2.24   FTSE 100    2007    2007
Engelberg et al. (2009)           x x
Huck (2013)                       x x                                          S&P 500     2004    2009
Jacobs and Weber (2015)           x x                                          Global      2000    2013

Cointegration
Vidyamurthy (2004)                x
Huck and Afawubo (2015)           x x x x                      0.1605   1.21   S&P 500     2000    2011
Harlacher (2012)                  x x                          0.0541   0.70   S&P 500     1997    2011
Caldeira and Moura (2013)         x 1                          0.1638   1.34   Brazil      2005    2012

Stochastic
Elliott et al. (2005)             x
Baronyan et al. (2010)            x x                          0.1413   0.55   DOW 30      2001    2008
Do et al. (2006)                  x

Miscellaneous
Bogomolov (2013)                  Renko & kagi, x              0.1188   1.28   Mixed       1996    2011
Liew and Wu (2013)                Copula
Alsayed and McGroarty (2012)      Threshold                    0.0145          UK-ADR      apr-aug 2011
Hameed et al. (2010)              Reversal                     0.1994          NYSE/AMEX   1963    2006
Huck (2009)                       MCDM                                         FTSE 100    1992    2006
Huck (2010)                       MCDM                                         FTSE 100    1992    2006

Notes to table:
2FP: Fixed periods frozen parameters. R: Annual excess return. SR: Sharpe ratio (often not applicable).
1: They partially use the Gatev et al. (2006) setup of fixed periods frozen parameters, but they do not use overlapping periods, just one full pair portfolio. They use one year of training and four months of testing.
Reported returns (R): The authors often present several performance measures. The one most relevant to comparison, and often the highest, is reported.

9 Return Calculations

            expand(sig_i), j =
t    sig_i  1    2    3    r_i     ω_i,1     ω_i,2     ω_i,3     ω_i       P&L     E        MV       NL      GL
1    0      0    0    0    0.02    0         0         0         0         0       100.00   0.00     0.00    0.00
2    0      0    0    0    0.03    0         0         0         0         0       100.00   100.00   1.00    1.00
3    1      1    0    0    0.02    100.00    0         0         100.00    1.50    101.50   101.50   1.00    1.00
4    1      1    0    0    -0.02   101.50    0         0         101.50    -2.03   99.47    99.47    1.00    1.00
5    1      1    0    0    0.01    99.47     0         0         99.47     0.99    100.46   200.93   2.00    2.00
6    2      1    1    0    0.05    100.46    100.46    0         200.93    10.05   110.51   210.98   1.91    1.91
7    2      1    1    0    0.02    105.49    105.49    0         210.98    4.22    114.73   215.20   1.88    1.88
8    2      1    1    0    0.03    107.60    107.60    0         215.20    6.46    121.19   221.65   1.83    1.83
9    1      0    1    0    -0.08   0         110.83    0         110.83    -8.87   112.32   101.96   0.91    0.91
10   0      0    0    0    0.02    0         0         0         0         0       112.32   0.00     0.00    0.00
11   0      0    0    0    -0.03   0         0         0         0         0       112.32   112.32   -1.00   1.00
12   -1     0    0    -1   -0.02   0         0         -112.32   -112.32   2.25    114.57   110.07   -0.96   0.96
13   -1     0    0    -1   -0.01   0         0         -110.07   -110.07   1.10    115.67   108.97   -0.94   0.94
14   -1     0    0    -1   0.02    0         0         -108.97   -108.97   -1.63   114.03   110.61   -0.97   0.97

Illustrative example of calculating returns gross of costs on a single asset with several trades. Each signal gets a weight of 1 of equity. sig_i is a consolidated signal series on security i, and expand(sig_i) are the three underlying trades generating the consolidated signal sig_i. ω_i,j is the exposure generated by each of the three trades, which is recursive in the equity: an increase in equity increases the initial exposure. ω_i is the total exposure to the return vector r_i. P&L is ω_i r_i, E is equity, MV the gross market value, NL the net leverage, and GL the gross leverage.
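The recursion behind the exposure, P&L and equity columns of the table above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name is hypothetical, a trade is assumed to open with an exposure equal to its signal times the equity at the previous close, and an open trade's exposure is assumed to drift with the previous day's asset return. The returns on days 3 and 14 are set to 0.015, an assumption inferred from the P&L column (the table prints them rounded to 0.02). The MV, NL and GL columns are not reproduced.

```python
def run_trades(signals, returns, equity0=100.0):
    """Track per-trade exposure, total exposure, P&L and equity on a single asset.

    signals -- one tuple per day with the signal (+1, 0, -1) of each underlying trade
    returns -- the asset return of each day
    """
    n_trades = len(signals[0])
    exposure = [0.0] * n_trades
    prev_signal = [0] * n_trades
    prev_return = 0.0
    equity = equity0
    rows = []
    for signal_t, r_t in zip(signals, returns):
        for j, s in enumerate(signal_t):
            if s == 0:
                exposure[j] = 0.0               # trade is closed
            elif s != prev_signal[j]:
                exposure[j] = s * equity        # new trade: weight 1 of current equity
            else:
                exposure[j] *= 1.0 + prev_return  # open trade drifts with the asset
        total = sum(exposure)
        pnl = total * r_t
        equity += pnl
        rows.append((list(exposure), total, pnl, equity))
        prev_signal = list(signal_t)
        prev_return = r_t
    return rows


SIGNALS = [
    (0, 0, 0), (0, 0, 0), (1, 0, 0), (1, 0, 0), (1, 0, 0),
    (1, 1, 0), (1, 1, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0),
    (0, 0, 0), (0, 0, -1), (0, 0, -1), (0, 0, -1),
]
RETURNS = [0.02, 0.03, 0.015, -0.02, 0.01, 0.05, 0.02,
           0.03, -0.08, 0.02, -0.03, -0.02, -0.01, 0.015]

rows = run_trades(SIGNALS, RETURNS)
for t, (_, total, pnl, equity) in enumerate(rows, start=1):
    print(f"t={t:2d}  exposure={total:8.2f}  P&L={pnl:6.2f}  E={equity:7.2f}")
```

Under these assumptions the sketch ends with an equity of about 114.03, matching the last row of the table.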


10 Summary Statistic Tables

Table explanation for summary statistic tables:

Sharpe(0): The daily excess return plus the daily risk free return, compounded to a geometric annual return and divided by the annual standard deviation of this series.

Sharpe(rf): The geometric annual excess return divided by the annual standard deviation of the excess returns.

β: $r_t = \alpha + \beta (r^m_t - r^f_t) + \varepsilon_t$

t-α: $\dfrac{\text{ann. } \alpha \cdot \sqrt{T/260}}{\sqrt{260} \cdot sd(\varepsilon)}$

Information ratio: $\text{ann. } \alpha / (sd(\varepsilon) \cdot \sqrt{260})$

Max loss: $-\min(r_t) \ \forall\, t$

Max drawdown duration: The longest duration of a drawdown, which is not necessarily the largest drawdown. 260 observations correspond to one year.

Months up (normality): Assume that $r_t \sim N(E[r_t], sd(r_t))$; then months up is $P[q > 0]$ when $q \sim N(E[r_t] \cdot 20,\ sd(r_t) \sqrt{20})$.

Average weight: The average of the gross market value of each position divided by the total market value. This number is useful when investigating liquidity demand.

Pct. turnover of m.val.: The daily trading volume (in $) compared to the market value prior to the trade. The turnover is the sum of issues bought and sold on the long side and the sum bought or sold on the short side, in absolute numbers. Multiplying this number by 260 yields the number of times the portfolio has been flipped a year.

Margin violations: Number of observations where $E_t < MV_t \cdot MAIN^{pct}$. These are maintenance violations that have not been alleviated by the buffer rule b; in such cases the returns may not reflect reality.

No. of maintenance actions: No. of actions actively conducted to reduce market exposure in order not to get maintenance violations.

No. of initial margin actions: No. of observations where the investor has forgone one or more trades due to too little slack equity.

Avg. investment ratio: $E\left[ INT_t / E_t \right]$

Maximal invested: $\max\left[ INT_t / E_t \right]$

Avg. maintenance to equity: $E\left[ MV_t \, MAIN^{pct} / E_t \right]$
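A few of the definitions above can be checked against the "All" column of table 6.2. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the rounded figures printed in the table, so the results agree with the table only approximately.

```python
import math

TRADING_DAYS = 260  # observations per year, as used in the definitions above


def information_ratio(ann_alpha, daily_sd_eps):
    """Annual alpha divided by the annualized standard deviation of the error."""
    return ann_alpha / (daily_sd_eps * math.sqrt(TRADING_DAYS))


def t_alpha(ann_alpha, daily_sd_eps, n_obs):
    """t-statistic of alpha: the information ratio scaled by sqrt(T / 260)."""
    return information_ratio(ann_alpha, daily_sd_eps) * math.sqrt(n_obs / TRADING_DAYS)


def months_up(daily_mean, daily_sd, days_per_month=20):
    """P[q > 0] for q ~ N(mean * 20, sd * sqrt(20)), using the normal CDF."""
    mean = daily_mean * days_per_month
    sd = daily_sd * math.sqrt(days_per_month)
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))


# Rounded inputs from the "All" column of table 6.2.
ir_all = information_ratio(ann_alpha=0.1728, daily_sd_eps=0.0046)
up_all = months_up(daily_mean=6.147e-4, daily_sd=0.0046)
print(f"information ratio: {ir_all:.2f}, months up: {up_all:.3f}")
```

The table reports an information ratio of 2.329 and months up of 0.7249 for the same column; small differences are due to the rounding of the inputs.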

11 Cluster Variables

Name      Major            Minor            Data              Metric            Method
          classification   classification
ag        Hierarchical     Agglomerative    Indexed returns   Euclidean         Complete
ag2                                                                             Average
ag3                                                                             Ward
ag4                                                                             Weighted
man                                                           Manhattan         Complete
man2                                                                            Average
man3                                                                            Ward
man4                                                                            Weighted
corag                                                         1-cor.shrinkage   Complete
corag2                                                                          Average
corag3                                                                          Weighted
acorag                                      Returns           1-cor.shrinkage   Complete
acorag2                                                                         Average
acorag3                                                                         Weighted
di                         Divisive         Indexed returns   Euclidean         NA
dim                                                           Manhattan
cordi                                                         1-cor.shrinkage
acordi                                      Returns           1-cor.shrinkage
pca       Fuzzy            NA               Indexed returns   Euclidean

Table 11.1: Clustering methods for pairing. For a detailed description refer to the R documentation on the package cluster. The functions agnes, diana, and fanny correspond to agglomerative, divisive, and fuzzy clusters respectively.


Figure 12.1: Cumulative returns of the distance approach long only. A transaction cost of 10 bps, a rebate spread of 25 bps, and a brokerage spread of 30 bps have been applied. Refer to table 12.1 for return statistics.

12 Distance Approach - Long Only

It is evident from table 4.3 that the short side can be seen as a hedge that on average has induced losses to the portfolio. The hedge offsets idiosyncratic changes common within a pair, as well as common drift in the pair. As the common drift most often is positive (positive equity premium), the hedge will induce losses to the portfolio. A natural question is what would happen if the short side were removed. The short side is removed such that no security is shorted, but short signals reducing long exposure are not removed. The weight of a security is set to 1/120, and the instantaneous leverage is set so that the average gross leverage is 1 (no leverage).

Summary statistics and the return graph of the long only strategy are presented in table 12.1 and figure 12.1. The trading statistics verify a gross leverage moving within 52-142%. The return statistics show a significant outperformance of the market, with higher returns, less volatility, much better skewness, and a longest drawdown a third as long as that of the market. The recovery from the financial crisis was about 2 years faster. Most interesting is that both alpha and its significance have increased, indicating a bias towards value created on the long side.

13 Cointegration Approach Hedging by SPX Futures

If the arbitrageur has difficulties shorting single securities, or if it is too expensive to short single securities, pairs trading is not a feasible strategy. Nevertheless, as seen in


Return statistics             | Long only  | Market premium
Arithmetic daily return (bps) | 4.708      | 2.2917
Geometric ann. return         | 0.1145     | 0.0434
Daily std. dev.               | 0.0104     | 0.0114
Ann. std. dev.                | 0.1674     | 0.1839
Sharpe(0)                     | 0.9374     | 0.4522
Sharpe(rf)                    | 0.684      | 0.2363
Correlation to SPX            | 0.8283     | 1
β                             | 0.7539     | 1
Ann. α                        | 0.08       | 0
t-α                           | 4.6168     | 0
R^2                           | 0.6861     | 1
Std. dev. on error            | 0.0938     | 0
Information Ratio             | 0.8534     | NaN
Max loss                      | 0.1539     | 0.2049
Max drawdown                  | 0.5195     | 0.6788
Max drawdown duration         | 1048       | 3833
Days up                       | 0.532      | 0.5289
Months up (normality)         | 0.5804     | 0.5358
Skewness                      | 0.7443     | -6.3278
Excess kurtosis               | 84.3243    | 84.1362
Start                         | 1985-01-02 |
End                           | 2015-07-29 |

Returns
Geom. ann. gross return long  | 0.1786
Geom. ann. gross return short | 0
Geom. ann. transaction costs  | -0.0129
Geom. ann. financing costs    | -0.0404
Pct. turnover of m.val.       | 0.0519

Leverage
Margin violations             | 0
No. of maintenance actions    | 0
No. of initial margin actions | 0
Avg. inst. leverage           | 2.05
Avg. investment ratio         | 0.3037
Maximal invested              | 0.4943
Avg. gross leverage           | 1.0005
Max gross leverage            | 1.3936
Avg. net leverage             | 1.0005
Max net leverage              | 1.3936
Min net leverage              | 0
Avg. maintenance to equity    | 0.2501
Max maintenance to equity     | 0.3484


Table 12.1: Summary statistics for figure 12.1. For variable explanation refer to appendix 10 Summary Statistic Tables. Panel (a) gives return statistics of the long only distance approach and of the market premium respectively.
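A few of the return statistics in table 12.1 can be computed from a daily return series along the following lines. This is a sketch under stated assumptions, not the definitions used in the thesis (those are given in appendix 10): it assumes 252 trading days per year, a sample standard deviation, and a drawdown measured against the running peak of cumulative wealth, with duration counted in days below that peak. All function names and the sample series are hypothetical.

```python
import math


def geometric_annual_return(daily_returns, periods_per_year=252):
    """Annualized geometric return (252 days per year is an assumption)."""
    wealth = 1.0
    for r in daily_returns:
        wealth *= 1.0 + r
    return wealth ** (periods_per_year / len(daily_returns)) - 1.0


def annualized_std(daily_returns, periods_per_year=252):
    """Sample standard deviation of daily returns, scaled to a year."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return math.sqrt(variance * periods_per_year)


def max_drawdown(daily_returns):
    """Return (largest peak-to-trough loss, longest run of days below the peak)."""
    wealth = peak = 1.0
    worst = 0.0
    longest = current = 0
    for r in daily_returns:
        wealth *= 1.0 + r
        if wealth >= peak:
            # A new (or recovered) high ends the current drawdown.
            peak = wealth
            current = 0
        else:
            current += 1
            longest = max(longest, current)
            worst = max(worst, 1.0 - wealth / peak)
    return worst, longest


# Made-up series: a 50% loss on the second day, recovered on the third.
sample = [0.10, -0.50, 1.00, 0.10]
sample_drawdown, sample_duration = max_drawdown(sample)
```

Other conventions, such as a different day count or arithmetic annualization, would give different numbers, so the output of this sketch should not be expected to reproduce the table.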


Figure 13.1: Cumulative returns of the cointegration approach using the pairs selected by the eigenvalue test statistic. Short sales are removed, taking only long positions in individual stocks but shorting an equal dollar amount in the market. A transaction cost of 10 bps, a rebate spread of 25 bps, and a brokerage spread of 30 bps have been applied. Refer to table for return statistics.

Distance Approach - Long Only, the strategy works well on the long side, but risk increases when the short side is removed. The arbitrageur can hedge the long only strategy by selling futures on the S&P 500 index. Say that the arbitrageur decides to implement the Eigen test statistic for pairing, following section Review of the Cointegration Approach. For each dollar bought in a security, the arbitrageur will sell short one dollar of the market, and likewise, when he reduces long exposure by some percentage, he will reduce short exposure by the same percentage. The fund's average gross leverage is set to two.
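The dollar-for-dollar hedge described above can be sketched as follows. The sketch assumes that the futures position is simply the negative of the total long dollar exposure and ignores margin, funding, and transaction costs; the names `hedged_exposures` and `hedged_return` and the sample numbers are hypothetical.

```python
def hedged_exposures(long_positions):
    """Match every long dollar with one short dollar of the market.

    long_positions: dict mapping a security to its long dollar exposure.
    Returns (total long, futures short, gross exposure, net exposure).
    """
    total_long = sum(long_positions.values())
    futures_short = -total_long
    gross = total_long + abs(futures_short)
    net = total_long + futures_short
    return total_long, futures_short, gross, net


def hedged_return(weights, security_returns, market_return):
    """One-period return of the long book minus the matching market short."""
    long_leg = sum(w * security_returns[s] for s, w in weights.items())
    short_leg = -sum(weights.values()) * market_return
    return long_leg + short_leg


# Made-up positions: 150 dollars long in total, so 150 dollars of futures sold.
before = hedged_exposures({"A": 100.0, "B": 50.0})
# Cutting A by 20% lowers total long exposure, and the short follows one for one.
after = hedged_exposures({"A": 80.0, "B": 50.0})
```

Because the short leg is tied to the total long exposure, a percentage reduction in long exposure produces the same percentage reduction in the futures position, and the net exposure stays at zero in this sketch.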

Table 13.1 and figure 13.1 show that the strategy has been subject to bad performance since 2005 to a lesser degree. There have been three meaningful drawdowns in the entire track record, the longest from around 2006 to 2008. The returns of the long side are slightly higher than those of the pair trade version, due to higher instantaneous leverage. The returns on the short side have improved by around 3 percentage points, and transaction costs have decreased by about 40 bps per annum. The result is a cheaper hedge, giving higher returns with the same volatility as the pair trade version and leading to a Sharpe ratio close to 1.









Table 13.1: Summary statistics for figure 13.1. For variable explanation refer to appendix 10 Summary Statistic Tables.

