Genetic Programming in Statistical Arbitrage Philip Saks PhD Seminar 17.10.2007



Page 1: Philip Genetic Programming In Statistical Arbitrage

Genetic Programming in Statistical Arbitrage

Philip Saks

PhD Seminar 17.10.2007

Page 2: Philip Genetic Programming In Statistical Arbitrage

Contents

Introduction

Genetic Programming

Clustering of Financial Data

Data

Framework

Results

Conclusion

Page 3: Philip Genetic Programming In Statistical Arbitrage

Introduction

Aim: to develop an automated framework for trading strategy design by employing evolutionary computation in conjunction with other machine learning paradigms.

The present framework utilizes genetic programming (GP).

Much of the existing work on financial forecasting using GP has focused on high-frequency FX [Jonsson, 1997][Dempster and Jones, 2001][Bhattacharyya et al, 2002], and the general consensus is that there is predictability and that excess return is achievable in the presence of transaction costs.

For stocks, the results are mixed: [Allen and Karjalainen, 1999] do not significantly outperform buy-and-hold on S&P500 daily data, but [Becker and Sheshadri, 2003] do on monthly data.

Page 4: Philip Genetic Programming In Statistical Arbitrage

GP I

Evolutionary computation (EC) is a concept inspired by the Darwinian survival of the fittest principle. The rationale is that natural evolution has proved successful in solving a wide range of problems throughout time; hence an algorithm that mimics this behavior might solve a wide range of artificial problems.

The concept was pioneered by Holland (1975) in the form of Genetic Algorithms (GA).

A GA is essentially a population-based search method, where each candidate solution is encoded in a fixed-length binary string.

The population evolves mainly via three operators: selection, reproduction and mutation.

The selection process is based on the survival of the fittest principle.
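The GA loop described on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the setup used in the talk: the OneMax objective (count the ones), tournament selection, one-point crossover as the reproduction step, elitism, and all parameter values are assumptions chosen to keep the example small.

```python
import random

def fitness(bits):
    # Toy objective (OneMax, an assumption): count the ones in the string.
    return sum(bits)

def tournament(pop, rng, k=3):
    # Selection: the fittest of k randomly drawn individuals survives.
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    # Reproduction: one-point crossover of two parent strings.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate=0.02):
    # Mutation: flip each bit independently with a small probability.
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Each candidate solution is a fixed-length binary string.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        # Elitism (an assumption): carry the current best over unchanged.
        children = [best]
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `run_ga()` returns the best string found and the best fitness per generation; because of the elitism step, that fitness never decreases from one generation to the next.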

Page 5: Philip Genetic Programming In Statistical Arbitrage

GP II

GPs are basically GAs in which the genome constitutes hierarchical computer programs.

Using this representation, we can solve problems in a wide range of fields, such as symbolic or ordinary regression, classification, optimal control theory etc., since each of these areas “can be viewed as requiring discovery of a computer program that produces some desired output for particular inputs” (Koza, 1992).

Tree representation of programs, function & terminal set

Evolutionary operators: selection, cross-over & mutation
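A minimal sketch of the tree representation and of subtree cross-over and mutation might look as follows. The function set (`add`, `sub`, `mul`), the terminal set (`x` and two constants), the nested-tuple encoding and all helper names are assumptions made for illustration; the slides do not state which sets or encoding were actually used.

```python
import operator
import random

# Hypothetical function and terminal sets.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth=3):
    # Grow a tree: stop at a terminal when depth runs out or by chance.
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    # A program is a terminal or a tuple (function name, left, right).
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def nodes(tree, path=()):
    # Yield the path (sequence of child indices) to every node.
    yield path
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from nodes(tree[i], path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    # Return a copy of tree with the subtree at path swapped for new.
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(a, b, rng):
    # Subtree cross-over: graft a random subtree of b into a random node of a.
    target = rng.choice(list(nodes(a)))
    donor = rng.choice(list(nodes(b)))
    return replace(a, target, get(b, donor))

def mutate(tree, rng):
    # Subtree mutation: replace a random node with a freshly grown subtree.
    target = rng.choice(list(nodes(tree)))
    return replace(tree, target, random_tree(rng, depth=2))
```

For example, the tree `("add", ("mul", "x", "x"), 1.0)` encodes the program x * x + 1, and `evaluate` applies it to a given input.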

Page 6: Philip Genetic Programming In Statistical Arbitrage

Clustering of Financial Data

Page 7: Philip Genetic Programming In Statistical Arbitrage

Data

Hourly VWAP prices and volume for banking stocks within the Euro Stoxx universe, covering the period from 01-Apr-2003 to 29-Jun-2007 (8648 observations).

Page 8: Philip Genetic Programming In Statistical Arbitrage

Framework

Evolve trading rules with binary decisions.

We consider the classical single tree setup, but also a dual tree framework, where buy and sell rules are co-evolved.

The training set comprises 6000 samples, while the remaining 2647 are used for out-of-sample testing.

10 runs are performed for each experiment.
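To make the difference between the two setups concrete, here is a small sketch of how binary rule outputs could be turned into positions. The slides do not spell out the dual tree semantics, so the version below rests on an assumption: the buy rule is consulted only while out of the market and the sell rule only while in the market. The function names and the 1/0 position encoding are hypothetical; only the 6000-sample training size comes from the slide.

```python
def single_tree_positions(signal):
    # Single tree: the rule's binary output is taken as the position itself.
    return [1 if s else 0 for s in signal]

def dual_tree_positions(buy_signal, sell_signal):
    # Dual tree (assumed semantics): the buy rule is consulted only when
    # flat, and the sell rule only when invested.
    position, out = 0, []
    for buy, sell in zip(buy_signal, sell_signal):
        if position == 0 and buy:
            position = 1
        elif position == 1 and sell:
            position = 0
        out.append(position)
    return out

def chronological_split(series, n_train=6000):
    # First n_train samples for training, the remainder held out for
    # out-of-sample testing (no shuffling, to respect time order).
    return series[:n_train], series[n_train:]
```

Under these assumptions the dual tree setup lets entry and exit conditions differ, whereas a single tree has to encode both in one expression.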

Page 9: Philip Genetic Programming In Statistical Arbitrage

Results

Trading on VWAP, assuming 1bp market impact
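One simple way to account for a 1bp market impact is to charge the cost on every change in position. The sketch below assumes a proportional cost model and that the position held in a period earns that period's return; both are illustrative assumptions, and `net_returns` is a hypothetical helper, not the evaluation code behind the results in the talk.

```python
def net_returns(positions, returns, cost_bp=1.0):
    # Proportional cost per unit of position change (1bp = 0.0001).
    cost = cost_bp / 10_000.0
    previous, out = 0, []
    for position, r in zip(positions, returns):
        # Gross return of the held position minus the cost of trading into it.
        out.append(position * r - cost * abs(position - previous))
        previous = position
    return out
```

With positions `[1, 1, 0]`, the cost is charged in the first period (entry) and the third (exit) but not in the second, where the position is unchanged.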

Page 10: Philip Genetic Programming In Statistical Arbitrage

Sensitivity Analysis

Page 11: Philip Genetic Programming In Statistical Arbitrage

Stress Testing I

Page 12: Philip Genetic Programming In Statistical Arbitrage

Turnover Analysis

Page 13: Philip Genetic Programming In Statistical Arbitrage

Transaction Cost Implications

Page 14: Philip Genetic Programming In Statistical Arbitrage

Conclusion

It is possible to discover profitable arbitrage trading rules on the Euro Stoxx banking sector.

Cooperative co-evolution of buy and sell rules is beneficial compared with the classical single tree structure.

Optimizing in the presence of transaction costs makes a difference: for optimal performance, the assumptions used in optimization should correspond to the actual application.