Neurocomputing 61 (2004) 139–168
www.elsevier.com/locate/neucom

ANFIS unfolded in time for multivariate time series forecasting
N. Arzu Şişman-Yılmaz a, Ferda N. Alpaslan b, Lakhmi Jain c
a Central Bank of the Republic of Turkey, Ankara, Turkey
b Middle East Technical University, Computer Engineering Department, Ankara 06531, Turkey
c University of South Australia, Adelaide, Australia
Abstract
This paper proposes a temporal neuro-fuzzy system named ANFIS unfolded in time, which is designed to provide an environment that keeps the temporal relationships between the variables and to forecast the future behavior of data by using fuzzy rules. It is a modification of the ANFIS neuro-fuzzy model. The rule base of ANFIS unfolded in time contains temporal TSK (Takagi–Sugeno–Kang) fuzzy rules. In the training phase, a back-propagation learning algorithm is used. The system takes the multivariate data and the number of lags needed to construct the unfolded model in order to describe a variable, and predicts the future behavior. Computer simulations are performed by using real multivariate data and a benchmark problem (Gas Furnace data). Experimental results show that the proposed model achieves online learning and prediction on temporal data. The results are compared with the results of ANFIS.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Neuro-fuzzy systems; Unfolding in time; Backpropagation
1. Introduction
In multivariate time series analysis, it is possible to define each time series in terms of its own previous values and the previous values of the other time series in the same system. The definition of each time series can be represented as a rule which can be used in a rule-based system. These rules can be utilized for forecasting the future behavior of the system.
Neuro-fuzzy systems are widely used for combining the function approximation and learning ability of neural networks with the enhanced explanation capability of fuzzy systems. The recurrent neural network is a convenient structure for processing time series data, as stated in [10]. In recurrent neural networks, if the input sequence is of a
maximum length T, the recurrent network can be turned into an equivalent feed-forward neural network defined over T time intervals. This idea is called unfolding in time [12]. The feed-forward neural network is duplicated T times so that each copy of the neural network corresponds to one time interval. In other words, each neural network represents a state of the input sequence in time. The connection weights between the same nodes of the duplicated neural networks are identical, so that the neural networks at different time intervals behave identically. A modified version of the back-propagation algorithm is used for training: the weight updates are summed up and applied to the same weights at the different time intervals. In the literature, there are various examples of unfolding-in-time applications, such as [3,12]. These examples are mostly knowledge-based systems.
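As a minimal illustration of conventional unfolding in time (an illustrative sketch, not code from the paper), consider a scalar linear recurrence unrolled over T steps; the gradient contributions of all T copies are summed and applied to the same shared weights, exactly as the summed-update rule above requires. All names and values here are our own toy choices.

```python
# Toy unfolding in time: the recurrence h(t) = w*h(t-1) + v*x(t) is unrolled
# over T steps; gradients from every unrolled copy are summed so the shared
# weights w, v stay identical across time intervals.
import numpy as np

T = 4                                 # number of unrolled time intervals
w, v, lr = 0.5, 0.5, 0.01             # shared weights and learning rate
x = np.array([0.1, 0.4, 0.2, 0.3])    # toy input sequence of length T
target = 0.6                          # toy target for the final output

for epoch in range(100):
    h = np.zeros(T + 1)               # forward pass through the T copies
    for t in range(T):
        h[t + 1] = w * h[t] + v * x[t]
    err = h[T] - target               # squared-error loss (h[T] - target)**2

    grad_w, grad_v = 0.0, 0.0         # backward pass: sum updates over copies
    dh = 2 * err                      # d(loss)/d(h[T])
    for t in reversed(range(T)):
        grad_w += dh * h[t]           # contribution of copy t to shared w
        grad_v += dh * x[t]           # contribution of copy t to shared v
        dh *= w                       # propagate the error to copy t-1
    w -= lr * grad_w                  # one summed update per shared weight
    v -= lr * grad_v
```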
In this paper, the unfolding-in-time approach is used to construct the knowledge base in a neuro-fuzzy model. A neuro-fuzzy system is duplicated for the number of time intervals needed to forecast the system output accurately. A modified temporal back-propagation algorithm is used as the learning algorithm. The aim is to present the temporal data to the neuro-fuzzy system continuously; in other words, the learning algorithm performs online learning. The neuro-fuzzy model unfolded in time is treated as a single neural network, rather than as a duplication of the basic neural network. The connections between the neural networks at different time intervals are also taken into account while the network parameters are updated.
During the experiments, each data set is processed by the Fuzzy multivariate auto-regression (MAR) algorithm [13], which is a variable extraction method for multivariate time series data. Fuzzy MAR is based on fuzzy linear regression: it extracts a set of temporal variables by solving a linear programming problem by means of the Simplex method.
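Fuzzy MAR itself is specified in [13]; purely to make the idea concrete, the sketch below shows a generic Tanaka-style fuzzy linear regression posed as a linear program (our formulation and names, not the paper's algorithm): each coefficient is a symmetric fuzzy number with center c_i and spread s_i >= 0, the total spread is minimized, and every observation must lie inside the predicted interval at level h.

```python
# Generic Tanaka-style fuzzy linear regression as a linear program
# (an assumption standing in for the Fuzzy MAR internals of [13]).
import numpy as np
from scipy.optimize import linprog

def fuzzy_linear_regression(X, y, h=0.5):
    """X: (K, p) regressors (e.g. lagged series values); y: (K,) targets."""
    K, p = X.shape
    absX = np.abs(X)
    # decision vector z = [c_1..c_p, s_1..s_p]; minimize the total spread
    cost = np.concatenate([np.zeros(p), absX.sum(axis=0)])
    # inclusion constraints: c.x_k - (1-h) s.|x_k| <= y_k <= c.x_k + (1-h) s.|x_k|
    A_ub = np.vstack([np.hstack([-X, -(1 - h) * absX]),
                      np.hstack([ X, -(1 - h) * absX])])
    b_ub = np.concatenate([-y, y])
    bounds = [(None, None)] * p + [(0, None)] * p   # centers free, spreads >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:p], res.x[p:]                     # centers, spreads

# toy usage with two lagged regressors
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.05, size=30)
centers, spreads = fuzzy_linear_regression(X, y)
```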
2. ANFIS
Adaptive neuro-fuzzy inference system (ANFIS) is a neuro-fuzzy system developed by Jang [6–9]. It has a feed-forward neural network structure where each layer is a neuro-fuzzy system component (Fig. 1). It simulates a Takagi–Sugeno–Kang (TSK) fuzzy rule [14] of type 3, where the consequent part of the rule is a linear combination of the input variables and a constant. The final output of the system is the weighted average of each rule's output. The form of the type-3 rule simulated in the system is as follows:

IF x1 is A1 AND x2 is A2 AND ... AND xp is Ap
THEN y = c0 + c1 x1 + c2 x2 + ... + cp xp.

The neural network structure contains five layers, excluding the input layer.
• Layer 0 is the input layer. It has n nodes where n is the number of inputs to the
system.
• Layer 1 is the fuzzification layer, in which each node computes the membership value of an input to a linguistic term by means of a bell-shaped function

\[
\mu_{A_i}(x) = \frac{1}{1 + \left[\left(\frac{x - c_i}{a_i}\right)^2\right]^{b_i}},
\]
Fig. 1. Basic ANFIS structure.
where a_i, b_i, c_i are parameters of the function. These are adaptive parameters: their values are adapted by means of the back-propagation algorithm during the learning stage. As the values of the parameters change, the membership function of the linguistic term A_i changes. These parameters are called premise parameters.
In this layer there exist n × p nodes, where n is the number of input variables and p is the number of membership functions per variable. For example, if size is an input variable with two linguistic values, SMALL and LARGE, then two nodes are kept in the first layer and they denote the membership values of the input variable size to the linguistic values SMALL and LARGE.
• Each node in Layer 2 provides the firing strength of a rule by means of the multiplication operator; it performs the AND operation:

\[
w_i = \mu_{A_i}(x_0)\,\mu_{B_i}(x_1).
\]

Every node in this layer computes the product of its input values and gives the product as its output, as in the above equation. The membership values \mu_{A_i}(x_0) and \mu_{B_i}(x_1) are multiplied in order to find the firing strength of Rule i, where the variable x0 has the linguistic value A_i and x1 has the linguistic value B_i in the antecedent part of the rule.
There are p^n nodes in Layer 2, one per rule; each node represents the antecedent part of a rule. If there are two variables in the system, namely x1 and x2, each taking the two fuzzy linguistic values SMALL and LARGE, there exist four rules in the system, whose antecedent parts are as follows:

IF x1 is SMALL AND x2 is SMALL
IF x1 is SMALL AND x2 is LARGE
IF x1 is LARGE AND x2 is SMALL
IF x1 is LARGE AND x2 is LARGE
• Layer 3 is the normalization layer, which normalizes the strengths of all rules according to the equation

\[
\bar{w}_i = \frac{w_i}{\sum_{j=1}^{R} w_j},
\]

where w_i is the firing strength of the ith rule, computed in Layer 2, and R is the number of rules. Node i computes the ratio of the ith rule's firing strength to the sum of all rules' firing strengths. There are p^n nodes in this layer.
• Layer 4 is a layer of adaptive nodes. Every node in this layer computes a linear function whose coefficients are adapted by using the error function of the multi-layer feed-forward neural network:

\[
\bar{w}_i f_i = \bar{w}_i\,(p_0 x_0 + p_1 x_1 + p_2).
\]

The p_i are the consequent parameters; there are n + 1 of them, where n is the number of inputs to the system (i.e. the number of nodes in Layer 0). In this example, since there are two input variables, there are three parameters (p0, p1 and p2) in Layer 4. \bar{w}_i is the output of Layer 3. The parameters are updated by a learning step: least squares approximation is used in ANFIS, while in the temporal model the back-propagation algorithm is used for training.
• Layer 5 is the output layer, whose function is the summation of the net outputs of the nodes in Layer 4. The output is computed as

\[
\sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i},
\]

where \bar{w}_i f_i is the output of node i in Layer 4 and denotes the consequent part of rule i. The overall output of the neuro-fuzzy system is the summation of the rule consequents.
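Putting the five layers together, a minimal sketch of the ANFIS forward pass for two inputs with two linguistic terms each (hence 2^2 = 4 rules) is given below; the parameter values are arbitrary placeholders, not trained values from the paper.

```python
# Five-layer ANFIS forward pass for two inputs and four rules.
import numpy as np

def bell(x, a, b, c):
    """Layer 1: generalized bell membership function."""
    return 1.0 / (1.0 + ((x - c) / a) ** 2) ** b

def anfis_forward(x0, x1, premise, consequent):
    # Layer 1: membership degrees of each input to each linguistic term
    mu0 = [bell(x0, *p) for p in premise[0]]   # terms for x0 (e.g. SMALL1, LARGE1)
    mu1 = [bell(x1, *p) for p in premise[1]]   # terms for x1 (e.g. SMALL2, LARGE2)
    # Layer 2: firing strength of each rule (product = AND)
    w = np.array([mu0[i] * mu1[j] for i in range(2) for j in range(2)])
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: each rule's linear consequent f_i = p0*x0 + p1*x1 + p2
    f = np.array([p0 * x0 + p1 * x1 + p2 for (p0, p1, p2) in consequent])
    # Layer 5: weighted sum of the rule consequents
    return float(np.dot(w_bar, f))

# placeholder (a, b, c) per linguistic term and (p0, p1, p2) per rule
premise = [[(1.0, 2.0, -1.0), (1.0, 2.0, 1.0)],
           [(1.0, 2.0, -1.0), (1.0, 2.0, 1.0)]]
consequent = [(0.1, 0.2, 0.0), (0.3, -0.1, 0.5), (-0.2, 0.4, 1.0), (0.0, 0.1, 0.2)]
print(anfis_forward(0.5, -0.3, premise, consequent))
```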
ANFIS uses a hybrid learning algorithm in order to train the network. For the premise parameters in Layer 1, the back-propagation algorithm is used; for the consequent parameters in Layer 4, a variation of least squares approximation is used, as sketched below. The example after the sketch describes the processing of ANFIS over a data set.
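The least-squares step can be sketched as follows (a minimal sketch modeled on Jang's hybrid rule, not code from the paper): with the premise parameters held fixed, the system output is linear in the consequent parameters, so stacking one row per training pair gives an ordinary least-squares problem.

```python
# Batch least-squares estimate of the Layer-4 consequent parameters
# for fixed premise parameters (names and interface are our own).
import numpy as np

def consequent_lse(X, y, w_bar):
    """X: (K, 2) input pairs, y: (K,) targets, w_bar: (K, R) normalized strengths."""
    K, R = w_bar.shape
    # row k: [wbar_k1*x0, wbar_k1*x1, wbar_k1, ..., wbar_kR*x0, wbar_kR*x1, wbar_kR]
    design = np.hstack([w_bar[:, [i]] * np.column_stack([X, np.ones(K)])
                        for i in range(R)])
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta.reshape(R, 3)        # one (p0, p1, p2) row per rule
```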
Example (Gas furnace data processed by ANFIS). ANFIS accepts the input data in the format (GasFlowRate(t − 4), CO2Concentration(t − 1)).

(1) An input data pair is given to the network.
(2) The network performs the forward pass, i.e. the output of the function, which is CO2Concentration(t), is computed.
(3) Another input data pair is presented to the network and the above computation continues until the network has been trained with (sample size − 4) data points, where sample size denotes the total number of data points in the training data set (the last four pairs cannot be used since the expected output is not known).
(4) The error is computed for this epoch by using an error measure to compare the expected output with the output of the system.
(5) Training is performed by updating the parameters in Layer 1 (a, b, c) and in Layer 4 (the p_i). This is offline learning, because the whole data set is presented to the network at once and then the parameters are updated.
(6) After a predetermined number of training epochs is reached, the training process terminates.
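A compact sketch of this offline loop is given below, reusing the anfis_forward sketch from above; a central-difference numerical gradient stands in for the exact hybrid update of step (5) purely to keep the example short, and the unpack helper (flat vector to premise/consequent parameters) is an assumed convention of ours.

```python
# Offline training loop for steps (1)-(6); the numerical gradient is a
# stand-in for the analytic hybrid update (our simplification).
import numpy as np

def train_offline(pairs, targets, theta, unpack, epochs=50, lr=0.02):
    def sse(th):                                    # steps (1)-(3): forward pass
        premise, consequent = unpack(th)
        out = np.array([anfis_forward(x0, x1, premise, consequent)
                        for x0, x1 in pairs])
        return float(np.sum((out - targets) ** 2))

    for _ in range(epochs):                         # step (6): fixed epoch budget
        grad = np.zeros_like(theta)
        for i in range(theta.size):                 # step (5): offline update
            d = np.zeros_like(theta); d[i] = 1e-6
            grad[i] = (sse(theta + d) - sse(theta - d)) / 2e-6
        theta = theta - lr * grad
        epoch_rmse = np.sqrt(sse(theta) / len(pairs))   # step (4): epoch error
    return theta
```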
The fuzzy rules produced in terms of parameters are as follows.
Rule 1:
IF GasFlowRate(t − 4) is SMALL1 AND CO2Concentration(t − 1) is SMALL2
THEN CO2Concentration(t ) = p11 * GasFlowRate(t − 4)
+p12 * CO2Concentration(t − 1) + p13
Rule 2:
IF GasFlowRate(t − 4) is SMALL1 AND CO2Concentration(t − 1) is LARGE2
THEN CO2Concentration(t ) = p21 * GasFlowRate(t − 4)
+p22 * CO2Concentration(t − 1) + p23
Rule 3:
IF GasFlowRate(t − 4) is LARGE1 AND CO2Concentration(t − 1) is SMALL2
THEN CO2Concentration(t ) = p31 * GasFlowRate(t − 4)
+p32 * CO2Concentration(t − 1) + p33
Rule 4:
IF GasFlowRate(t − 4) is LARGE1 AND CO2Concentration(t − 1) is LARGE2
THEN CO2Concentration(t ) = p41 * GasFlowRate(t − 4)
+p42 * CO2Concentration(t − 1) + p43
In this example there are two fuzzy values, SMALLi and LARGEi, for each variable (GasFlowRate(t − 4) and CO2Concentration(t − 1)), where i denotes the index of the variable. Each fuzzy value such as SMALLi is defined by the parameters in the first layer (a_i, b_i, c_i). p_jk is the parameter in the fourth layer, where j denotes the rule and k denotes the parameter index. It is used in computing the output of the system, which is CO2Concentration(t).
3. ANFIS unfolded in time
The neuro-fuzzy systems in the literature are mostly multi-layer feed-forward neural network structures. When temporal data is concerned, a neural network structure that uses temporal relationships is needed. Recurrent neural network structures are more convenient for that purpose. Unfolding in time is a method used for training recurrent neural network structures, and the neuro-fuzzy approach in this study utilizes this method.
The unfolding-in-time approach is applied to the neuro-fuzzy system in order to construct a temporal multi-layer feed-forward neural network. The feed-forward neural network is duplicated T times, where T is the number of time intervals needed in the specific problem. The resulting system is called ANFIS unfolded in time.
Fig. 2. ANFIS unfolded in time.
The neural network structure enables us to define a problem where the input is a vector such as (x, y) and the system produces only one output, which is y. A sample system can be seen in Fig. 2; each box represents one ANFIS structure as defined in Fig. 1. In the problem given in Fig. 2, it is assumed that the output of the system depends on four previous input values. In order to achieve the network structure, ANFIS is duplicated for four time intervals. The input of the neuro-fuzzy system is composed of two elements, X and Y; there is only one output of the system, which is Y. Initially (at time t = 0), X0 and Y0 are input to NN1 (the network component for time interval 1). The output of NN1 is obtained as Y1 (at time t = 1). Then, it is input to NN2. Another input, X1, is needed; since the system does not produce X1, it is supplied externally. The output for the second time interval is obtained as Y2. The same process is performed for the rest of the time intervals (two more time intervals). Finally, Y4 is obtained as the output of NN4 and is treated as the output of ANFIS unfolded in time for t = 0. In other words, the input is supplied at time 0, and the output for time 4 is obtained.
The same process is repeated for time t = 1, 2, ... until the end of the sample data set is reached.
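The forward chain just described can be sketched in a few lines (the function nn stands in for one ANFIS copy; names are ours):

```python
# Forward chain of Fig. 2: one network applied T times, feeding y back.
def unfolded_forward(nn, x_seq, y0, T=4):
    """x_seq: external inputs X0..X_{T-1}; y0: initial output value Y0."""
    y = y0
    for t in range(T):
        y = nn(x_seq[t], y)   # output Y_{t+1} becomes the next y-input
    return y                  # Y_T, the output of ANFIS unfolded in time

# usage with the ANFIS forward-pass sketch from Section 2:
# y4 = unfolded_forward(lambda x, y: anfis_forward(x, y, premise, consequent),
#                       x_seq=[0.1, 0.2, 0.3, 0.4], y0=0.0)
```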
3.1. Temporal back-propagation algorithm
The algorithm used in the neural network structure is a modification of the back-propagation algorithm. Since the basic neuro-fuzzy system is a feed-forward neural network, the back-propagation algorithm is convenient to use. Because the neural network is duplicated T times, the basic back-propagation learning algorithm is modified accordingly.
The neuro-fuzzy system is treated as a black box containing T neural networks. The connections between the neural networks are also taken into account, representing the temporal relationships. The parameters in the last network are updated according to the error in the last interval. The error in one of the previous networks is updated by using the error in that time interval and the errors propagated from the following intervals: the errors coming from the following intervals are back-propagated to the error in the given interval. Unlike the conventional unfolding-in-time method, the parameters are updated independently.
The algorithm in Fig. 3 describes the steps for processing the data for one specific time interval. The data is presented to the neuro-fuzzy system at each time interval.
Part 1 (Forward phase)
1. The data at time t = k, (x_k, y_k), is given as input to the system.
2. The output y_{k+1} is computed by the network at time interval 1. If T is greater than 1, the last element of the input vector, y_k, is replaced by y_{k+1}. The output of each step becomes one of the elements of the input vector until t = T.
3. The output of the last neural network is the output of ANFIS_unfolded_in_time; in other words, Y_k = y_{T+k}.

Part 2 (Backward phase)
1. Compute the error for the output response Y_k of the system and the given input vector (x_k, y_k): error = (y_k − Y_k)^2.
2. Back-propagate the error to the parameters in network T.
3. Back-propagate the error to the error in network T − 1. Then update the parameters in network T − 1 by using the propagated error.
4. Repeat the above step until t = 0.

Fig. 3. Forward and backward processing phases in ANFIS unfolded in time.
Fig. 4. Error computation for a time interval.
The data is processed and the output is obtained T time intervals ahead. The error is computed and back-propagated through the network, updating the parameters of each node (online learning).
The algorithm contains two phases, a forward phase and a backward phase. In the forward phase, the data at a specific time k is introduced to the system and the computations are performed according to the input value. The important feature of the temporal neuro-fuzzy system is that, at the end of the computation for a data point at time interval k, it yields the output of the system at time interval k + T.
In the backward phase, the parameters in all networks are updated according to the output produced by the neuro-fuzzy system. The error in network T is used to update only the parameters in network T. For updating the parameters in network T − 1, the error of the following network (network T) is back-propagated to network T − 1. The same process is applied to all previous time intervals until all the parameters in all networks are updated.
The output of the forward phase is accepted as the output of ANFIS unfolded in time. At the end of the backward phase, all parameters have been updated and the data in the next time interval is presented to the system.
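One such online step can be sketched as follows; the per-network forward and grad methods are a hypothetical interface introduced only to show the control flow, with grad assumed to return the parameter gradient together with the error signal handed back to the previous network.

```python
# One online step of ANFIS unfolded in time (hypothetical net interface).
def temporal_step(nets, x_seq, y0, target, lr=0.02):
    # Part 1 (forward phase): chain the T networks, feeding y back
    ys = [y0]
    for t, net in enumerate(nets):
        ys.append(net.forward(x_seq[t], ys[-1]))
    error = (target - ys[-1]) ** 2

    # Part 2 (backward phase): from network T down to network 1, each
    # network's own parameters are updated from the error signal
    # back-propagated to its interval
    delta = 2 * (ys[-1] - target)          # d(error)/d(final output)
    for t in reversed(range(len(nets))):
        param_grad, delta = nets[t].grad(x_seq[t], ys[t], delta)
        nets[t].params -= lr * param_grad  # independent per-network update
    return ys[-1], error
```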
The method of back-propagating the error is shown in Fig. 4.
Table 1
Comparison of RMSE values for different neuro-fuzzy systems and ANFIS unfolded in time

Model            nfMod   NFIDENT   ANFIS   System   ANFIS unfolded in time
RMSE             0.485   0.623     0.241   0.367    0.662
Number of rules  26      21        49      26       16
Fig. 5. Training results of ANFIS unfolded in time for Gas-furnace data.
If the error computed at time interval t is E(t), then the error is back-propagated through the neural network at time t (i.e. NN(t)). Moreover, it is also back-propagated through the neural network at time t − 1 (i.e. NN(t − 1)). For that purpose, the partial derivative of E(t) with respect to E(t − 1), i.e. ∂E(t)/∂E(t − 1), is taken. The error is propagated backwards such that the partial derivative of the error E(t) with respect to the parameters (a, b, c) and (p0, p1, p2, ...) of time interval t is summed into the error of time interval t − 1, which is E(t − 1). The procedure goes on like this for the previous time intervals.
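Written out in symbols (our notation; η is the learning rate, y(t) the output of NN(t), and θ(t) any parameter of the network at interval t), the recursion of Fig. 4 reads

\[
\delta(t) = \frac{\partial E(t)}{\partial y(t)}, \qquad
\delta(t-1) = \frac{\partial E(t-1)}{\partial y(t-1)} + \delta(t)\,\frac{\partial y(t)}{\partial y(t-1)},
\]
\[
\Delta\theta(t) = -\eta\,\delta(t)\,\frac{\partial y(t)}{\partial \theta(t)},
\qquad \theta(t) \in \{a, b, c, p_0, p_1, p_2, \ldots\},
\]

so each network's parameters receive an update driven by a local error signal that accumulates the errors propagated back from the following intervals.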
4. Experimental results
In the experiments, real data taken from [1,2,11] are used. First, the Gas-furnace data experiment, a benchmark problem, is performed. Thirteen data sets are used in the second part of the tests. The Fuzzy MAR algorithm [13] is used as a preprocessing step to obtain the input variables and the number of time intervals.
Table 2
Temporal variables defining each time series in the system

Data set                     Series   Defining variables
AAA Bonds Interest Rates     x0       x0,t−1, x0,t−2
                             x1       x1,t−1, x1,t−2, x0,t−1
Agriculture                  x0       x0,t−1, x0,t−2, x1,t−1, x1,t−2
                             x1       x2,t−1, x1,t−1, x1,t−2, x0,t−1
                             x2       x2,t−1, x2,t−2, x3,t−2, x1,t−1
                             x3       x0,t−1, x0,t−2, x1,t−1, x1,t−2, x3,t−1, x3,t−2
Flour Price Indices          x0       x1,t−1, x1,t−2, x2,t−1, x0,t−1
                             x1       x2,t−1, x0,t−1
                             x2       x2,t−1
Forest                       x0       x0,t−1, x0,t−2, x1,t−1
                             x1       x1,t−1, x0,t−1, x3,t−1
                             x2       x2,t−1, x2,t−2, x1,t−1, x3,t−1, x0,t−1
                             x3       x3,t−1, x3,t−2
Gas Furnace                  x0       x0,t−1, x0,t−4, x1,t−1
                             x1       x1,t−1, x1,t−4, x0,t−1
Grain Price Indices          x0       x0,t−1, x1,t−1
                             x1       x1,t−1, x1,t−2, x0,t−1
                             x2       x2,t−1, x1,t−1, x0,t−1
                             x3       x3,t−1, x3,t−2
Housing Starts and Sold      x0       x1,t−1, x1,t−2, x0,t−1
                             x1       x1,t−1, x1,t−2
Interest Rates               x0       x0,t−1, x1,t−1
                             x1       x1,t−1, x1,t−2, x2,t−1
                             x2       x2,t−1, x1,t−1
Investment and Inventories   x0       x0,t−1
                             x1       x0,t−1, x1,t−1
Mink-Muskrat Furs            x0       x0,t−1, x0,t−2, x1,t−1
                             x1       x1,t−1, x1,t−2
Power Station                x0       x0,t−1, x0,t−2, x2,t−1
                             x1       x1,t−1, x1,t−2, x2,t−1
                             x2       x2,t−1, x2,t−2, x0,t−2
Production and Billing       x0       x0,t−1, x0,t−2
                             x1       x1,t−1, x1,t−2, x0,t−2
Unemployment and GDP         x0       x0,t−1, x1,t−1
                             x1       x1,t−1, x1,t−2, x0,t−1
4.1. Gas-furnace data experiment
Gas-furnace data consists of 296 data pairs [2]. The data has two variables, the gas flow rate (input X) and the concentration of CO2 in the exhaust gas (output Y). The data set consists of measurements sampled at a fixed interval of 9 seconds. The measured input X_k represents the flow rate of the methane gas in a gas furnace and the output measurement Y_k represents the concentration of carbon dioxide (CO2) in the gas mixture flowing out of the furnace under a steady air supply [4].
Table 3
RMSE for experiments with ANFIS and ANFIS unfolded in time using real data

                                      ANFIS                        ANFIS unfolded in time
Series                       Var.     Train.       Recog.          Train.        Recog.
AAA Bonds Interest Rates     x0       0.119902     0.652238        0.129453      0.650826
                             x1       0.173633     4.317652        0.409995      1.146950
Agriculture                  x0       0.406124     38.933538       1.491038      1.56818
                             x1       174.645974   24059.645974    1622.169764   2181.88
                             x2       1.461472     177.453338      5.934202      8.91102
                             x3       5.529144     5241.348344     860.494170    1049.02
Flour Price Indices          x0       2.960263     28.606507       7.628765      7.875690
                             x1       2.689367     71.649687       8.334503      8.080240
                             x2       9.378680     9.485944        10.297852     8.69711
Forest                       x0       30.690742    978622.038395   53.744128     537.123
                             x1       0.016055     307.751859      0.056393      0.441184
                             x2       2.345635     7352.330969     18.841375     68.5216
                             x3       0.604859     0.848928        1.049248      0.895668
Gas Furnace                  x0       0.213040     0.514342        0.211783      0.201765
                             x1       0.109253     1.406189        0.380676      0.918988
Grain Price Indices          x0       0.107664     0.41797         0.389521      0.179064
                             x1       1.544528     2.923548        0.048258      0.086044
                             x2       0.041943     0.307117        0.094808      0.139154
                             x3       0.050431     0.092057        0.068364      0.0756004
Housing Starts and Sold      x0       0.000747     243.336994      6.911277      12.946600
                             x1       2.667528     584.922654      5.700290      11.561000
Interest Rates               x0       0.156515     1.104897        0.222275      0.745151
                             x1       0.143651     8.526677        0.2177        0.374334
                             x2       0.153290     0.665197        0.255971      0.503007
Investment and Inventories   x0       2.538655     9.461350        14.156823     6.830420
                             x1       1.737847     1980.411870     4.490392      6.38886
Mink-Muskrat Furs            x0       0.000006     1.262477        0.268479      0.295973
                             x1       0.241326     0.414645        0.284571      0.492810
Power Station                x0       0.000026     2.430449        0.592158      0.963655
                             x1       0.000175     19.510901       0.825107      0.871221
                             x2       0.000118     5.237194        0.486922      0.947947
Production and Billing       x0       1.065657     1.851736        1.339888      1.783490
                             x1       1.285181     18.492608       4.48082       7.087390
Unemployment and GDP         x0       1.361111     260.3560        0.000028      29.644100
                             x1       0.211697     9.657877        1.027380      9.2844100
It is stated that the output Y depends on its own previous values and also on the values of X. Most studies use the inputs X_{t−4} and Y_{t−1} for the output Y_t. The Gas-furnace data of [2] is widely used as a benchmark problem in the neuro-fuzzy literature; in order to compare the results for ANFIS unfolded in time, the same data is used in an experiment.
The step size for learning is set to a very small number (0.02) since the training algorithm performs online learning. The parameters are updated after each data pair is presented to the system. The results are compared with the results taken from [5].
Fig. 6. Output for AAA-CP Bonds data: (a) x0, (b) x1.
RMSE (root mean square error) is used as the error criterion in order to compare the accuracy of the system with that of other systems. The formula for RMSE is given as follows:

\[
\mathrm{RMSE} = \sqrt{\frac{\sum_{k=1}^{K} \big(Y_k - \hat{Y}_k\big)^2}{K}}.
\]
Fig. 7. Output for Agriculture data: (a) x0, (b) x1, (c) x2, (d) x3.
In RMSE, K is the number of samples in the data set, Y_k is the expected output for the given input, and Ŷ_k is the system's response. RMSE is computed after each epoch, i.e. after the whole data set has been trained. The size of the data set is 292. When a smaller set of data pairs is used, the RMSE value decreases to 0.583.
Fig. 7. (continued).
This is much smaller than the RMSE computed when the whole data set is used, which is 0.662, as seen in Table 1. The reason is that the identification of a function which holds for the whole data set is more difficult when the number of data points increases.
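For concreteness, the error measure is the standard root mean square error; a one-function sketch (names are ours):

```python
# RMSE over one epoch: y holds the expected outputs Y_k, y_hat the
# system responses, each of length K.
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))
```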
In the results shown in Table 1, the error is computed for one-step-ahead predictions.
Fig. 8. Output for Flour Price data: (a) x0, (b) x1, (c) x2.
This means that the model uses the data pair (x_{t−4}, y_{t−1}) and produces the result y_t, rather than taking (x_{t−4}, y_{t−4}) and producing y_t, as ANFIS unfolded in time does; the latter is not one-step-ahead but four-step-ahead prediction. Since the other models are all multi-layer feed-forward neural network models, they take input values only in Layer 0. It is not suitable to take the values of variables at different time instances as input to the neural network when temporal data processing is concerned.
Fig. 8. (continued).
Since the values at different time instances are important for different variables, the neuro-fuzzy model must perform learning and prediction over the time interval needed. ANFIS unfolded in time is a neuro-fuzzy structure which performs the processing for a predefined time interval. This is advantageous when time series prediction and control applications are concerned.
The training results for 296 points in time are displayed in Fig. 5. It is observed that there is no significant deviation of the output produced by the system from the expected output. Two hundred and ninety-two data points are used as input since four-step-ahead prediction is performed.
4.2. Real data experiments
In this section, the results obtained by using real data sets are presented. Thirteen data sets are used in the real data experiments. The data sets are tested by using ANFIS and ANFIS unfolded in time. The input variables for each of the series in the data sets, obtained by the Fuzzy MAR algorithm, can be seen in Table 2.
These data sets contain the following information:
• There exist two time series in the AAA Bonds data set, which are the AAA Bond Rate (x0) and the Commercial Paper Rate (x1). The data is collected quarterly between 1953 and 1970.
Fig. 9. Output for Forestry data: (a) x0, (b) x1, (c) x2, (d) x3.
• Monthly agriculture data contains four time series, which are the first difference of the logarithm of the exchange rate (x0), price (x1), logarithm of levels of sales (x2) and logarithm of shipments (x3). The data is collected between February 1978 and December 1992.
• The flour price data set contains monthly flour price indices for three US cities, which are Buffalo (x0), Minneapolis (x1) and Kansas City (x2). The data belongs to the time interval January 1972 to November 1980.
Fig. 9. (continued).
• Monthly Forestry data contains four variables, which are lumber production (x0), lumber price (x1), housing starts (x2) and disposable income (x3).
• Monthly US Grain Price data contains the price in dollars per 100-pound sack for wheat flour (x0) and per bushel for corn (x1), wheat (x2) and rye (x3). The data is obtained between January 1961 and October 1972.
Fig. 10. Output for Gas furnace data variable x0.
• Monthly US housing starts and sold data contains two variables, which are housing starts (x0) and housing sold (x1). The data is collected for the period January 1965 to December 1974.
• Monthly Interest Rate data contains three series, which are the Federal Funds Rate (x0), the 90-Day Treasury Bill Rate (x1), and the 1-Year Treasury Bill Rate (x2).
• Quarterly, seasonally adjusted US fixed investment (x0) and changes in business inventories (x1) are tested by the algorithm. The data is recorded between 1947 and 1971.
• Natural logarithms of the annual sales of mink furs (x0) and muskrat furs (x1) by the Hudson's Bay Company are used in the experiments. The data is recorded between 1850 and 1911.
• Power station data is taken from a 50 MW turbo-alternator and contains in-phase current deviations (x0), out-of-phase current deviations (x1) and frequency deviations of the voltage generated (x2).
• The Production data set contains two variables, which are the weekly production figures in thousands of units (x0) and the weekly billing figures in millions of dollars (x1) of a company.
• The Unemployment data set contains two variables, which are unemployment (x0) and gross domestic product (x1) in the UK between 1955 and 1969. The data is recorded quarterly.
In the experiments, the first half of the observations is used in the training phase, and the entire data set is used in the recognition phase.
Fig. 11. Output for Grain Price data: (a) x0, (b) x1, (c) x2, (d) x3.
The training and recognition results are presented in Table 3. Figs. 6–18 show the expected output, the obtained output, and the error (RMSE) for all data sets.
The results of the experiments can be summarized as follows:
ANFIS unfolded in time performs better prediction for variable x0 than for x1 in the AAA-CP bonds data set, as seen in Fig. 6.
Fig. 11. (continued).
In Table 3, the recognition error for x0 is slightly better than that for x1; they are 0.650826 and 1.146950, respectively.
For the Agriculture data, the recognition results are very different, as shown in Table 3: x0 and x2 are recognized better than x1 and x3. This is also validated in Fig. 7.
Fig. 12. Output for Housing data: (a) x0, (b) x1.
As seen in Fig. 8, the recognition phase yields similar patterns for the variables x0, x1 and x2 in the Flour Prices data set. All the variables are recognized with recognition errors close to the training errors, as seen in Table 3.
For the Forestry data set, the recognition errors for x1 and x3 are superior to the errors for x0 and x2, as given in Table 3. The output figure for x3 shows the best recognition result in Fig. 9.
Fig. 13. Output for Interest Rate data: (a) x0, (b) x1, (c) x2.
Variable x0 of the Gas Furnace data is also used in the real data experiments. Fig. 10 shows that the expected and obtained x0 are very close to each other. This can also be seen in Table 3.
Fig. 13. (continued).
The recognition error is 0.201765. For the Grain Prices data set, the recognition errors range between 0.075 and 0.17, which are very close results, as given in Table 3. In Fig. 11, it can be seen that the output for x0 successfully simulates the behavior of the original x0 data. The other variables also yield errors very close to zero.
The Housing data set has approximately the same recognition errors for x0 and x1, as seen in Table 3, namely 12.9466 and 11.561. Also, as seen in Fig. 12, for both x0 and x1, the output of ANFIS unfolded in time follows the same pattern as the original output but misses the local peak data points.
The Interest Rates data set variables x0, x1 and x2 yield promising recognition errors in Table 3. Fig. 13 validates this situation: the outputs obtained are very close to the expected output.
The recognition errors for x0 and x1 in the Investment and Inventories data set are very close to each other in Table 3. The out-of-sample recognition error is slightly higher than the error for the training sample for both variables, as given in Fig. 14.
The recognition error is slightly worse for x1 than for x0 in the Mink-Muskrat data set in Table 3. In Fig. 15, the outputs are close to the expected results for both of the variables.
In Table 3, for the Power Station data, x0 and x2 have approximately the same recognition errors (0.963655 and 0.947947, respectively), whereas x1 yields a better recognition error (0.871221). In addition, x1 yields the best figure among the variables in Fig. 16.
For the Production and Billing data set, the recognition error for x0 is small compared to that of x1 in Table 3.
Fig. 14. Output for Investment and Inventories data: (a) x0, (b) x1.
In Fig. 17, the output for x0 yields an almost linear figure missing most of the peak values, whereas x1 adapts itself to the fluctuations in the expected output values for x1.
In the Unemployment and GDP data, x0 gives a worse recognition error than x1, as in Table 3. For the out-of-sample values, the recognition gets worse for x1, as seen in Fig. 18. On the other hand, x0 simulates the expected results better.
Fig. 15. Output for Mink and Muskrat Furs data: (a) x0, (b) x1.
The experimental results show that:
• ANFIS yields smaller training errors than ANFIS unfolded in time. This is an expected result, since during the training phase our model uses online learning.
Fig. 16. Output for Power Station data: (a) x0, (b) x1, (c) x2.
The error obtained for a sample is a cumulative error containing the error over the whole time interval.
• Our model gives better recognition results than ANFIS. This is also not surprising: since the neuro-fuzzy model performs T-ahead prediction, the recognition results are promising.
Fig. 16. (continued).
5. Conclusion
A neuro-fuzzy system is constructed for the prediction of time series data by means of a temporal learning algorithm and a temporal neuro-fuzzy model. The model, namely ANFIS unfolded in time, provides an online environment which takes a time series and forecasts its future behavior. Because the recurrent neural network structure is convenient for time series analysis, the unfolding-in-time approach is useful for representing a recurrent neural network as a feed-forward one. The neuro-fuzzy system, which is basically a black box of feed-forward neural networks, is duplicated for T time intervals in this approach. The number of time intervals is provided to the neuro-fuzzy system as an argument; it is computed by using the Fuzzy MAR algorithm [13]. As an alternative, tests can be performed iteratively to find the best number of time intervals for the given time series data.
Because the resulting model can be used for small time intervals T, it can be applied to areas involving short-term prediction, such as:
• Financial or meteorological data forecasting, since the model is convenient for forecasting problems.
• Sequence detection as given by Rumelhart [12], a possible application of the unfolding-in-time concept.
• Image processing, specifically motion detection, because of the usage of temporal expert system rules.
Fig. 17. Output for Production and Billing data: (a) x0, (b) x1.
The method is tested on various data sets and the results are compared with the results of ANFIS. Although the training error is a little higher than that of ANFIS, the recognition error is much smaller in ANFIS unfolded in time.
Fig. 18. Output for Unemployment and GDP data: (a) x0, (b) x1.
Since ANFIS unfolded in time uses the error at T-ahead time intervals, the error back-propagated through the network is a cumulative error.
References
[1] D. Akleman, F.N. Alpaslan, Temporal rule extraction for rule-based systems using time series approach,
Proceedings of ISCA CAINE-97, San Antonio, Texas, December 1997.
[2] G.E. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden Day, San Francisco, 1970.
[3] F.N. Civelek-Alpaslan, K.M. Swigger, A temporal neural network model for constructing connectionist expert system knowledge bases, J. Network Comput. Appl. 19 (1996) 19–133.
[4] W. Farag, A. Tawfik, On fuzzy model identification and the gas furnace data, Proceedings of the IASTED International Conference, Hawaii, 2000.
[5] M.B. Gorzalczany, A. Gluszek, Neuro-fuzzy systems for rule-based modeling of dynamic processes, Proceedings of ESIT, 2000, pp. 416–422.
[6] J.-S.R. Jang, Self-learning fuzzy controllers based on temporal back propagation, IEEE Trans. Neural Networks 3 (5) (1992) 714–723.
[7] J.-S.R. Jang, ANFIS: adaptive-network-based fuzzy inference systems, IEEE Trans. Systems, Man Cybern. 23 (3) (1993) 665–685.
[8] J.-S.R. Jang, Neuro-Fuzzy and Soft Computing, Prentice-Hall, New Jersey, 1997.
[9] J.-S.R. Jang, Roger Jang's Publications and Softwares, http://www.cs.nthu.edu.tw/~jang/publication.htm.
[10] T.M. Mitchell, Machine Learning, McGraw-Hill, New York, 1997.
[11] G.C. Reinsel, Elements of Multivariate Time Series Analysis, Springer, New York, 1997.
[12] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning internal representations by error propagation, in: D.E. Rumelhart, J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, 1986, pp. 318–362.
[13] N.A. Şişman-Yılmaz, A temporal neuro-fuzzy approach for time series analysis, Ph.D. Thesis, Department of Computer Engineering, Middle East Technical University, 2003.
[14] M. Sugeno, G.T. Kang, Structure identification of fuzzy model, Fuzzy Sets and Systems 28 (1988) 15–33.