adaptive predictive trading systems

Adaptive Predictive Trading Systems

Steven C. Rogers, Member, EE-Pub, Leon Luxemburg, Matt McMahon

Published: February 22, 2005

Abstract

Forecasting of financial instruments has long been of great interest to practitioners and researchers. Many powerful Digital Signal Processing (DSP) filtering techniques have been developed to support prediction. Predictive filters lend themselves well to trading decision-making. The techniques in this paper apply to high-frequency (intraday) trading as well as to long-term trades. Linear and nonlinear concepts are applied and compared in this paper, using MATLAB for implementation. These approaches are tested on various stocks.

Article Information

Field of Study: DSP

Keywords: estimation, adaptive-filters, multilayer-perceptron, prediction, trading-systems, financial-trending

I. INTRODUCTION

Digital signal processing has long been used for time series analysis. Prediction is a subset of signal processing. It requires an assumed system model of the signal to be processed. Once such a model exists, it may be manipulated to estimate future output values. A large, sophisticated model places greater demands on computer throughput. Since the advent of more powerful computers, prediction has increasingly been used in real-time applications, such as semi-automated trading systems. Linear model structures have been used for prediction because they are simple, easy to implement, and easy to analyze. However, linear models can lack accuracy: model errors and unmodeled dynamics bias their predictions. Nonlinear model structures, such as neural networks, have become a viable alternative to linear models. In this paper adaptive linear and nonlinear predictive systems are presented and compared to fixed observers. These are developed into trading systems and tested on different stock price data.

II. PREDICTION FILTERS

Prediction is the forecasting side of information processing. The motivation for using a predictive system is to derive information about the future characteristics of a signal based on current and past information. A typical adaptive predictive filter structure [1-6] is given in figure 1.

Figure 1 Adaptive Predictive Filter Architecture

In figure 1 the problem is to predict the present value from data n time steps in the past. The current estimate is then a smoothed version of the input data. To estimate future values of the input data, the input to the adaptive filter may be replaced with more current data. An adaptive filter has its weights adjusted regularly to minimize some performance criterion, usually the deviation from the input signal. The adaptive filter may be linear or nonlinear. It may be feedforward or recurrent (feedback). Feedback filters provide additional dynamics in the architecture.

The simplest structure is a finite impulse response (FIR) structure. It is also called moving average (MA) and has only zeros. It follows the equation $y_k = a_0 x_{k-1} + a_1 x_{k-2} + \dots + a_{n-1} x_{k-n}$, where $y_k$ is the output of the filter, $x$ is the input data stream, $a_i$ is the coefficient associated with the ith input value, and $n$ is the length of the filter. The coefficient vector is usually updated based on the deviation of the filter output from the current value of the input data stream, $x_k$. A typical update law is:

This update law is frequently modified to account for signal variance or noise power, such as,

Passing it through a low pass filter may smooth the variance.

Another linear model structure is the infinite impulse response (IIR) filter, which contains both poles and zeros. It is also called autoregressive-moving average (ARMA). This incorporates feedback into the output. The equation is $y_k = b_1 y_{k-1} + b_2 y_{k-2} + \dots + b_m y_{k-m} + a_0 x_{k-1} + a_1 x_{k-2} + \dots + a_{n-1} x_{k-n}$, where $m$ is the order of the feedback (autoregressive) part of the ARMA transfer function, that is, of its denominator, and $b$ is the coefficient vector associated with previous filter output values. The update law is similar to the above FIR law, but includes the feedback values as well.
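As an illustration of the adaptive FIR predictor, the following Python sketch (Python is used here only to keep the example self-contained; the paper's implementation is in MATLAB) assumes the standard least-mean-squares correction normalized by input power. That choice is an assumption consistent with an update driven by the deviation from the current input and scaled for signal variance; it is not necessarily the authors' exact law. The function name `nlms_predict` and the parameters `mu` and `eps` are illustrative.

```python
import math

def nlms_predict(x, n=2, mu=1.0, eps=1e-8):
    """One-step FIR prediction y_k = a_0*x[k-1] + ... + a_{n-1}*x[k-n].

    Coefficients are adapted with a normalized LMS step (an assumed,
    standard choice). Returns (predictions, final coefficients).
    """
    a = [0.0] * n
    preds = [0.0] * len(x)            # first n entries stay 0 (no history yet)
    for k in range(n, len(x)):
        past = [x[k - 1 - i] for i in range(n)]      # x[k-1], ..., x[k-n]
        y = sum(ai * xi for ai, xi in zip(a, past))  # prediction of x[k]
        err = x[k] - y                               # deviation from the input
        power = sum(xi * xi for xi in past) + eps    # input power estimate
        a = [ai + mu * err * xi / power for ai, xi in zip(a, past)]
        preds[k] = y
    return preds, a

# Demo on a noise-free sinusoid, which a 2-tap linear predictor can track.
signal = [math.sin(0.3 * k) for k in range(2000)]
preds, coeffs = nlms_predict(signal)
late_error = max(abs(signal[k] - preds[k]) for k in range(1900, 2000))
```

For this sinusoid the two coefficients approach 2*cos(0.3) and -1, the exact linear recurrence of the signal. An IIR variant would extend `past` with previous filter outputs, at the cost of possible instability that the FIR form does not have.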

Alternatives to the above linear adaptive filters include multilayer perceptrons (MLP) [7-10]. Generic components of neural networks are shown in figures 2 and 3.

Figure 2 Neural Network Components

Figure 3 Structure of a Neural Network

The MLP is one of the most commonly used neural network architectures and is shown in figure 4. The output equation for a feedforward single hidden layer MLP [8] may be written $O = W f(V X)$, where $O$ is the output vector, $W$ is the output matrix, $X$ is the input vector, $V$ is the input matrix, and $f(\cdot)$ is generally a squashing function selected from the examples shown in figure 3. The matrices $W$ and $V$ contain the adjustable parameters. A MATLAB code segment showing the adaptive update law for an MLP is shown in figure 5 below. In this case the squashing function is a hyperbolic tangent. Note that G is the output of the single hidden layer, dW and dV represent the momentum contribution, out is the MLP output, and mu and bet are adaptation gains. This adaptive update law is slightly modified from the backpropagation law to include the momentum component. Thus, the matrix weights are updated with each new input data sample and are constantly being tuned in a least-squares sense.
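A minimal Python sketch of such an update is given here as a study aid, since it makes the roles of G, dW, dV, mu, and bet concrete. It assumes a single output, a tanh hidden layer without bias terms, and classical momentum (each weight change is the gradient step plus `bet` times the previous change). The exact form of the momentum term in figure 5 may differ, and the helper names `make_mlp`, `forward`, and `update` are hypothetical.

```python
import math
import random

def make_mlp(n_in, n_hidden, seed=0):
    """Random small weights; dV and dW hold the momentum terms."""
    rng = random.Random(seed)
    return {
        "V": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
              for _ in range(n_hidden)],
        "W": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "dV": [[0.0] * n_in for _ in range(n_hidden)],
        "dW": [0.0] * n_hidden,
    }

def forward(net, X):
    """out = W * tanh(V * X) for one output; G is the hidden-layer output."""
    G = [math.tanh(sum(v * x for v, x in zip(row, X))) for row in net["V"]]
    out = sum(w * g for w, g in zip(net["W"], G))
    return out, G

def update(net, X, target, mu=0.05, bet=0.5):
    """One gradient step on the squared error, with momentum gain `bet`."""
    out, G = forward(net, X)
    err = target - out
    for j, g in enumerate(G):
        # Error propagated back through tanh, using W before it is changed.
        back = err * net["W"][j] * (1.0 - g * g)
        net["dW"][j] = mu * err * g + bet * net["dW"][j]
        net["W"][j] += net["dW"][j]
        for i, x in enumerate(X):
            net["dV"][j][i] = mu * back * x + bet * net["dV"][j][i]
            net["V"][j][i] += net["dV"][j][i]
    return err

# Demo: learn to predict 0.5*sin(k) from its two previous samples.
series = [0.5 * math.sin(k) for k in range(4000)]
net = make_mlp(n_in=2, n_hidden=5)
sq_errors = [update(net, [series[k - 1], series[k - 2]], series[k]) ** 2
             for k in range(2, len(series))]
early_mse = sum(sq_errors[:100]) / 100
late_mse = sum(sq_errors[-100:]) / 100
```

Because the weights change on every sample, the error returned by `update` is the prediction error made before that sample was used for training, which is the quantity of interest for an online predictor.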

Figure 4 MLP Architecture

Figure 5 MATLAB code segment of the MLP adaptive update law

The code segment in figure 5 may be easily modified into a recurrent network. Recurrent means that feedback, or node output, is returned and appended to the input vector. The recurrent code segment is shown in figure 6. Recurrence is advantageous because it incorporates more dynamics into the neural network. Note that the only modification is in the last line, which updates the input vector to include the full output of the single hidden layer. To feed back additional sources, or only a subset of the single hidden layer, the same basic structure would be used with G replaced by the appropriate terms.
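The feedback path can be sketched in Python as follows. This is an illustrative forward pass only, assuming the input vector is formed from lagged samples followed by the previous hidden-layer output G, so that V has one column per lag plus one per hidden node. The function name `recurrent_forward` and its arguments are hypothetical.

```python
import math

def recurrent_forward(V, W, series, n_lags):
    """Run a single-hidden-layer tanh network with hidden-state feedback.

    V has n_lags + n_hidden columns: the first n_lags act on lagged
    samples, the rest on the previous hidden output G.
    """
    n_hidden = len(V)
    G = [0.0] * n_hidden                 # no feedback before the first step
    outputs = []
    for k in range(n_lags, len(series)):
        lags = [series[k - 1 - i] for i in range(n_lags)]
        X = lags + G                     # input vector extended with G
        G = [math.tanh(sum(v * x for v, x in zip(row, X))) for row in V]
        outputs.append(sum(w * g for w, g in zip(W, G)))
    return outputs

# One hidden node, one lag, feedback weight 1.0 on the node's own output.
outs = recurrent_forward(V=[[0.5, 1.0]], W=[1.0],
                         series=[1.0, 1.0, 1.0], n_lags=1)
```

With a constant input the second output differs from the first only because of the fed-back hidden value, which is the extra dynamics the recurrent form provides. Setting the feedback columns of V to zero recovers the feedforward network.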

Figure 6 MATLAB code segment of the recurrent MLP adaptive update law

Once we have the adapted model, we have a means to predict the future outputs. In our case we will predict a single step ahead, so we run through the model again. The MLP prediction code segment is shown below in figure 7.

Figure 7 Code segment for step ahead prediction

Now that we have step ahead predictors, the next step is to incorporate them into profitable trading systems. The simplest approach is to create a slower comparison signal by passing the predictions through a 1st order low pass filter. Crossings of the two signals trigger buy/sell indications. This simple but effective approach is shown in the MATLAB code segment in figure 8. Note that pred is the step ahead prediction, predslow is the slower filtered signal, and alph is the discrete 1st order filter pole. A buy signal is given when pred > predslow. A sell signal occurs when pred < predslow. Since the pred signal reacts more quickly to changes than the slower predslow, it more closely represents the current price trend.
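The crossover rule can be sketched in a few lines of Python, reusing the names pred, predslow, and alph from the text. The filter form `predslow = alph*predslow + (1 - alph)*pred`, its initialization to the first prediction, and the function name `crossover_signals` are assumptions made for illustration.

```python
def crossover_signals(pred, alph=0.9):
    """Return a list of booleans: True means buy, False means sell/flat.

    predslow is a 1st order low pass filtered copy of pred with pole alph.
    """
    predslow = pred[0]                   # assumed initial condition
    signals = []
    for p in pred:
        predslow = alph * predslow + (1.0 - alph) * p
        signals.append(p > predslow)     # fast signal above slow one: buy
    return signals

rising = crossover_signals([1.0, 2.0, 3.0, 4.0], alph=0.5)
falling = crossover_signals([4.0, 3.0, 2.0, 1.0], alph=0.5)
```

A pole closer to 1 makes predslow slower, so fewer crossings occur and signals persist longer; a smaller pole reacts faster at the price of more frequent switching.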

Figure 8 Code segment for buy/sell signals

For comparison, two state-estimation observer trading schemes are developed. These are fixed parameter systems that are guaranteed to be stable. EST4 uses 4 states to estimate the current price and EST3 uses 3 states. Figure 9 shows the MATLAB code for EST4.

Figure 9 EST4 MATLAB code

The 4 states are the price estimate, the rate, the acceleration, and the jerk, which is the derivative of the acceleration. The observer is set up inside the ‘if’ block. The observer gain L is designed by pole placement using the MATLAB command ‘place’. Ts is the time step in seconds/trading day. The poles are placed reasonably fast so that they have a significantly smaller time constant than the data being tracked. Consequently, good tracking results. EST3 is similar, except that the jerk state is omitted. The two were compared to determine the impact of additional states on trading system performance.
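The fixed-gain observer idea can be sketched in Python as follows. This is not the EST4 code of figure 9: it assumes a discrete-time chain-of-integrators price model (price, rate, acceleration, jerk), a predictor-form observer, and Ackermann's formula with all poles at a single location standing in for MATLAB's `place`. The names `integrator_chain`, `observer_gain`, and `run_observer`, and the pole value 0.5, are hypothetical.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def integrator_chain(n, ts):
    """State matrix with A[i][j] = ts**(j-i) / (j-i)! for j >= i."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        term = 1.0
        for j in range(i, n):
            a[i][j] = term
            term *= ts / (j - i + 1)
    return a

def observer_gain(a, pole):
    """Ackermann's formula for C = [1, 0, ..., 0], all poles at `pole`."""
    n = len(a)
    shifted = [[a[i][j] - (pole if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    p_of_a = [[float(i == j) for j in range(n)] for i in range(n)]
    obsv, row = [], [[1.0] + [0.0] * (n - 1)]
    for _ in range(n):
        obsv.append(row[0])                  # C * A^k
        row = mat_mul(row, a)
        p_of_a = mat_mul(p_of_a, shifted)    # builds (A - pole*I)^n
    v = solve(obsv, [0.0] * (n - 1) + [1.0])
    return [sum(p_of_a[i][j] * v[j] for j in range(n)) for i in range(n)]

def run_observer(prices, n_states=4, ts=1.0, pole=0.5):
    """Return the price estimate made before each new price is seen."""
    a = integrator_chain(n_states, ts)
    gain = observer_gain(a, pole)
    x = [prices[0]] + [0.0] * (n_states - 1)
    estimates = []
    for y in prices:
        estimates.append(x[0])
        innov = y - x[0]                     # measurement residual
        x = [sum(a[i][j] * x[j] for j in range(n_states)) + gain[i] * innov
             for i in range(n_states)]
    return estimates

quadratic = [100.0 + 0.5 * k + 0.01 * k * k for k in range(80)]
est = run_observer(quadratic, n_states=4)
```

Calling `run_observer(prices, n_states=3)` drops the jerk state, which mirrors the EST3 versus EST4 comparison under the same assumptions. Poles nearer zero track faster but pass more noise into the estimate.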

III. RESULTS

The methods described above were compared against each other based on return, as defined by

where P(i) is the current price and P(i-1) is the previous price. R(i) values were summed only during the buy signal periods. This approach normalizes each price for comparison purposes. Since this study compares algorithms, an ideal transaction is assumed, i.e., transaction delays and transaction costs are not included. Daily prices for the stocks given in Table 1 over a period of about 2 years were chosen for evaluation. Table 1 shows the returns for the six approaches. The table values are computed as: if buy, then table_i = sum(R(i)). Percent return on investment is obtained by multiplying the table values by 100%.
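The bookkeeping behind the table values can be sketched in Python. The sketch assumes the return is the simple relative price change, R(i) = (P(i) - P(i-1)) / P(i-1), which is one common normalization and may not be the exact definition used in the study. It also assumes the buy flag at step i decides whether R(i) is counted. The function name `summed_return` is hypothetical.

```python
def summed_return(prices, buy):
    """Sum R(i) over the steps where the buy flag is set.

    Assumes R(i) = (P(i) - P(i-1)) / P(i-1); no transaction costs or delays.
    """
    total = 0.0
    for i in range(1, len(prices)):
        if buy[i]:
            total += (prices[i] - prices[i - 1]) / prices[i - 1]
    return total

prices = [100.0, 110.0, 99.0]
held_one_day = summed_return(prices, [False, True, False])
held_throughout = summed_return(prices, [False, True, True])
```

Multiplying the result by 100 gives the percent figure. Note that a sum of simple returns is not the compounded return: in the example, holding throughout sums to about zero even though the price ends 1% lower.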

Table 1. Return sums in percent for the approaches

The approaches rank consistently as: 1) MLP, 2) recurrent MLP (RMLP), 3) IIR, 4) EST4, 5) FIR, and 6) EST3. The RMLP is slightly ahead of MLP and they both consistently outperform the rest. EST4 and the adaptive FIR are close to each other in performance. EST4 and EST3 are fixed non-predictive state estimation observer schemes. EST4 has 4 states and EST3 has 3 states. The adaptive predictive schemes generally outperform the fixed non-predictive schemes.

The following figures show the buy-sell comparisons for some of the stocks.

IV. CONCLUSIONS

A number of alternative trading systems have been developed and compared. A simple observer-based approach has also been presented. MATLAB code has been provided, as well as simulation results for the various methods. Of the 6 approaches considered, the MLP and recurrent MLP appear to be superior. They have been examined with stock price data having different characteristics and have performed well in all cases. More testing and evaluation are necessary before definitive performance conclusions can be drawn; however, there is sufficient basis for further testing.

V. ABOUT THE AUTHORS

Steven C. Rogers, Leon Luxemburg, and Matt McMahon are with the Institute for Scientific Research, Fairmont, WV 26554, USA.

VI. REFERENCES

[1] Gelb, A., Applied Optimal Estimation, MIT Press, 1974, ISBN 0-262-57048-3.
[2] Juang, Jer-Nan, Applied System Identification, Prentice Hall, 1994, ISBN 0-13-079211-X.
[3] Kailath, T., et al., Linear Estimation, Prentice Hall, 2000, ISBN 0130224642.
[4] Baher, H., Analog & Digital Signal Processing, John Wiley & Sons, 1990, ISBN 0-471-92342-7.
[5] Ziemer, R.E., and Tranter, W.H., Principles of Communications, 4th Edition, John Wiley & Sons, 1995, ISBN 0-471-12496-6.
[6] Shenoi, K., Digital Signal Processing in Telecommunications Processing, Englewood Cliffs, NJ: Prentice-Hall, 1995, ISBN 0-13-096751-3.
[7] Haykin, Simon, Neural Network – A Comprehensive Foundation, 2nd Edition, Prentice Hall, 1999, ISBN 0-13-272250-1.
[8] Herbrich, et al., ‘Neural Networks in Economics: Background, Applications and New Developments,’ http://stat.cs.tu-berlin.de/publications
[9] Goonatilake, S., et al. (editors), Intelligent Systems for Finance and Business, Wiley, 1996, ISBN 0-471-94404-1.
[10] Beltratti, A., et al., Neural Networks for Economic and Financial Modeling, Thomson Computer Press, 1996, ISBN 1-85032-16908.

Figure 10 Exxon buy-sell comparisons

Figure 11 Boeing buy-sell comparisons

Figure 12 Raytheon buy-sell comparisons

Figure 13 Walgreen buy-sell comparisons

Figure 14 Pepsico buy-sell comparisons

Figure 15 Monster buy-sell comparisons

Figure 16 Continental Airlines buy-sell comparisons

Figure 17 Fifth Third Bancorp buy-sell comparisons

VII. APPENDIX

The following MATLAB code is the author’s interpretation and is included as a study aid only. The complete MATLAB code for the six trading systems is given in its entirety to show how the components interface.
