Principal Component Analysis based
Time Series Segmentation –
A New Sensor Fusion Algorithm
Janos Abonyi, Balazs Feil, Sandor Nemeth, Peter Arva
University of Veszprem, Department of Process Engineering
P.O.Box. 158, H-8200, Veszprem, Hungary
www.fmt.vein.hu/softcomp e-mail: [email protected]
Abstract
Segmentation is the most frequently used subroutine in clustering, indexing, sum-
marization, anomaly detection, and classification of time series. Although in many
real-life applications a lot of variables must be simultaneously monitored, most of the
segmentation algorithms are used for the analysis of only one time-variant variable.
Hence, this paper proposes Principal Component Analysis (PCA) based algorithms
that are able to detect: (i) changes in the mean; (ii) changes in the variance; and
(iii) changes in the correlation structure among several variables. The segments
obtained by bottom-up segmentation algorithms are hierarchically clustered using
a PCA similarity factor. The whole approach is applied to the monitoring of the
industrial production of high-density polyethylene.
Key words: sensor fusion, segmentation, PCA, fuzzy clustering, bottom-up
method
Preprint submitted to Elsevier Science 16 November 2004
1 Introduction
Real-life time series can be taken from business, physical, social and behavioral
science, economics, engineering [1–3], etc. Time series segmentation is often
used to extract internally homogeneous segments from a given time-series to
locate stable periods of time, to identify change points, or to simply compress
the original time-series into a more compact representation [4]. Although in
many real-life applications a lot of variables must be simultaneously tracked
and monitored, most of the time-series segmentation algorithms are based on
only one time-variant variable [1].
This paper deals with the problem of multivariate time series segmentation.
A univariate time series can contain data in a time ordered structure origi-
nated from a given sensor. Such time series can be taken from several sources,
e.g. in case of industrial processes the sensors measure physical or chemical
properties, e.g. pressure, temperature, concentration, flow or mass rate, valve
position, density, melt index, grain-size distribution, etc. Even when accurate
and frequent measurements are taken, the main changes of the system often
cannot be detected from the signal of a single sensor. The reason is that
the change of interest may lie in the correlation structure between the
variables (sensor signals), and only such fused information reflects the hidden
change of the system. In these cases the information and data taken from
different sensors have to be integrated, which is a typical 'sensor fusion'
problem, since "sensor fusion is the combination of sensory data or data
derived from sensory data such that the resulting information is in some sense
better than would be possible when these sources were used individually" [5].
The aim of this paper is to develop new algorithms that are able to handle
time-varying multivariate data to detect: (i) changes in the mean; (ii) changes
in the variance; and (iii) changes in the correlation structure among the
variables. Principal Component Analysis (PCA) is the most frequently applied
tool to discover such information [6], as PCA maps the multivariate data into
a lower-dimensional (usually two- or three-dimensional) space, which is useful
in the analysis and visualization of correlated high-dimensional data [2].
PCA is a widely used tool in the field of sensor fusion [7–9], and a very
popular multivariate technique for developing multivariate statistical process
monitoring methods [10]. In most of the related works, PCA is used to eliminate
the less significant components or sensors, reducing the data representation
to the most significant ones, and to plot the data in two dimensions. For
example, in [11] the measurements of an electronic nose and tongue were
visualized by PCA, and information about the mutual correlation of the sensors
was derived from this. Another interesting application field is quality
monitoring. In [12], an artificial neural network (ANN) was used to estimate
global pollution parameters in water samples, particularly the Chemical Oxygen
Demand (COD), and PCA was used to select the input variables of the neural
model. Cimander et al. applied PCA as part of a real-time expert system in
[13], which allowed data transmission of more than 1800 different signals from
instrumentation. In another work [14], Cimander and his colleagues applied
PCA and ANN for on-line monitoring of yoghurt fermentation.
Linear PCA models have two particularly desirable features: they can be
understood in great detail and they are straightforward to implement. Since the
PCA model defines a linear hyperplane, the proposed segmentation algorithms
can be considered the multivariate extension of the piecewise linear
approximation (PLA) based time series segmentation and analysis tools developed
by Keogh [15]. Keogh showed that the PLA based representation of time-series
has many desirable properties:
• High rates of data compression. For example, consider Figure 3. The original
time series contains 9600 × 11 points. The segmented version of it contains
only seven segments.
• Relative insensitivity to noise.
• Intuitiveness and ease of visualization.
Based on PLA models effective data mining algorithms have been worked
out for fast similarity search, weighted queries, and change point detection in
univariate time series [15]. Most of these algorithms utilize a simple distance
measure to compare the segments of different time series. This distance mea-
sure is calculated based on the endpoints of the linear lines used to describe
the segments [16]. Unfortunately, the distances among multivariate PCA mod-
els (i.e. hyperplanes) cannot be evaluated with this approach. Hence, for this
purpose the PCA similarity factor developed by Krzanowski [17,18] is used
to compare multivariate time series segments. This paper will show how this
PCA similarity factor can be used to search for similar segments and cluster
the detected subsequences.
The algorithms proposed in this paper are new time series segmentation tools
that can be used to extract new and useful information from multivariate data.
Clustering and segmentation are among the most frequently used data mining
algorithms, being useful in their own right as exploratory techniques, and also
as subroutines in rule discovery, indexing, summarization, anomaly detection
and classification. These topics belong to 'data mining' and 'knowledge
discovery in databases', but they can be interpreted in a much more general
way: they can be referred to as methods related to 'data fusion' and
'information fusion' [5]. The methods proposed in this paper can be seen as
'direct fusion methods' [19], because they fuse (history values of) sensor
data. The applicability of the proposed algorithms is demonstrated by the
analysis of real-life process data taken from an industrial polyethylene
plant. It will be shown that the exploitation of the segmentation results
requires a priori knowledge about the environment (experience of operators,
knowledge of engineers and scientists), so the 'indirect information fusion'
approach is also followed in this work.
The paper is organized as follows. The aim of time series segmentation is
formalized in Section 2. Section 3 describes different cost functions for
segmentation based on PCA. The new algorithms are presented in Section 4.
Section 5 presents the application examples. Conclusions are given in
Section 6.
2 Time Series Segmentation
A time-series $T = \{\mathbf{x}_k = [x_{1,k}, x_{2,k}, \ldots, x_{n,k}]^T \mid 1 \le k \le N\}$ is a finite set of
$N$ $n$-dimensional samples labelled by time points $t_1, \ldots, t_N$. A segment of $T$
is a set of consecutive time points $S(a, b) = \{a \le k \le b\}$, i.e. the samples
$\mathbf{x}_a, \mathbf{x}_{a+1}, \ldots, \mathbf{x}_b$. The $c$-segmentation of time-series $T$ is a partition of $T$ into
$c$ non-overlapping segments $S_T^c = \{S_i(a_i, b_i) \mid 1 \le i \le c\}$, such that $a_1 = 1$,
$b_c = N$, and $a_i = b_{i-1} + 1$. In other words, a $c$-segmentation splits $T$ into $c$
disjoint time intervals by segment boundaries $s_1 < s_2 < \ldots < s_c$, where the
$i$-th segment is $S_i(s_i, s_{i+1} - 1)$.
The goal of the segmentation procedure is to find internally homogeneous
segments from a given time-series. To formalize this goal, a cost function
$\mathrm{cost}(S(a, b))$ describing the internal homogeneity of individual segments
should be defined. Usually, this cost function is defined based on the
distances between the actual values of the time-series and the values given by
a simple function (a constant or linear function, or a polynomial of a higher
but limited degree) fitted to the data of each segment. For example, in [20,21]
the sum of variances of the variables in the segment was defined as
$\mathrm{cost}(S(a, b))$:
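$$\mathrm{cost}(S_i(a_i, b_i)) = \frac{1}{b_i - a_i + 1} \sum_{k=a_i}^{b_i} \| \mathbf{x}_k - \mathbf{v}_i \|^2, \tag{1}$$
$$\mathbf{v}_i = \frac{1}{b_i - a_i + 1} \sum_{k=a_i}^{b_i} \mathbf{x}_k,$$
where $\mathbf{v}_i$ is the mean of the segment.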
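As an illustration of the variance-based cost in equation (1), the following minimal sketch computes it for a toy series. The function name `variance_cost`, the list-of-lists data layout and the 0-based inclusive indices are assumptions made for this example only; the paper itself uses 1-based indices.

```python
def variance_cost(samples, a, b):
    """Equation (1): mean squared distance of x_a..x_b from the segment mean.

    samples is a list of equal-length lists (one list per time point);
    a and b are inclusive 0-based indices (an assumption of this sketch).
    """
    segment = samples[a:b + 1]
    length = len(segment)
    n_vars = len(segment[0])
    # Segment mean v_i, computed variable by variable.
    mean = [sum(x[j] for x in segment) / length for j in range(n_vars)]
    # Average of the squared Euclidean distances to the mean.
    return sum(
        sum((x[j] - mean[j]) ** 2 for j in range(n_vars)) for x in segment
    ) / length


# Toy series with two variables and a level shift after the third sample.
series = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [4.0, 5.0], [4.0, 5.0]]
cost_homogeneous = variance_cost(series, 0, 2)  # segment before the shift
cost_mixed = variance_cost(series, 0, 4)        # segment spanning the shift
print(cost_homogeneous, cost_mixed)
```

A segment that spans the level shift receives a much larger cost than the homogeneous one, which is the property the segmentation relies on.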
The segmentation algorithms simultaneously determine the θi parameters of
the models used to approximate the behavior of the system in the segments,
and the ai, bi borders of the segments by minimizing the sum of the costs of
the individual segments:
$$\mathrm{cost}(S_T^c) = \sum_{i=1}^{c} \mathrm{cost}(S_i). \tag{2}$$
This cost function can be minimized by dynamic programming (e.g. [21]),
which is unfortunately computationally intractable for many real data sets.
Hence, usually one of the following heuristic approaches is followed:
• Search for inflection points: Searching for primitive episodes located
between two inflection points [2].
• Sliding window: A segment is grown until it exceeds some error bound.
The process repeats with the next data point not included in the newly
approximated segment. For example, a linear model is fitted on the observed
period and the modelling error is analyzed [15].
• Top-down method: The time-series is recursively partitioned until some
stopping criterion is met [15].
• Bottom-up method: Starting from the finest possible approximation,
segments are merged until some stopping criterion is met [15].
• Clustering based method: Time-series segmentation may be viewed as
clustering, but with a time-ordered structure. In [22] a new fuzzy clustering
algorithm has been proposed which can be effectively used to segment large,
multivariate time-series.
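To make the sliding window heuristic from the list above concrete, here is a minimal univariate sketch. The function name `sliding_window_segments` and the use of the segment variance as the error measure are assumptions chosen for illustration; the cited works fit, for example, a linear model instead.

```python
def sliding_window_segments(values, max_error):
    """Grow a segment until its cost exceeds max_error, then start a new one.

    values is a univariate list. The cost is the segment variance, which is
    an illustrative stand-in for the modelling error.
    Returns inclusive 0-based (start, end) pairs.
    """
    def cost(a, b):
        seg = values[a:b + 1]
        mean = sum(seg) / len(seg)
        return sum((v - mean) ** 2 for v in seg) / len(seg)

    segments, start = [], 0
    for end in range(1, len(values)):
        if cost(start, end) > max_error:
            # Close the segment just before the point that broke the bound.
            segments.append((start, end - 1))
            start = end
    segments.append((start, len(values) - 1))
    return segments


window_result = sliding_window_segments(
    [1.0, 1.1, 0.9, 5.0, 5.1, 4.9], max_error=0.5
)
print(window_result)
```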
In data mining, the bottom-up algorithm has been used extensively to support
a variety of time series data mining tasks [15], hence this approach is
followed in this paper. The algorithm begins by creating a fine approximation
of the time series, and iteratively merges the lowest cost pair of adjacent
segments until a stopping criterion is met. When the pair of adjacent segments
$S_i$ and $S_{i+1}$ are merged, the cost of merging the new segment with its right
neighbor and the cost of merging the $S_{i-1}$ segment with its new larger neighbor
must be calculated. The pseudocode for the algorithm is shown in Table 1.
This algorithm is quite powerful since the merging cost evaluations require
only simple identifications of PCA models, which are easy to implement and
computationally cheap to calculate. Because of this simplicity and because
PCA defines a linear hyperplane, the proposed approach can be considered
the multivariate extension of the piecewise linear approximation (PLA) based
time series segmentation and analysis tools developed by Keogh [15,16].
Table 1
Bottom-up segmentation algorithm
• Create initial fine approximation.
• Find the cost of merging for each pair of adjacent segments:
    mergecost(i) = cost(S(a_i, b_{i+1}))
• while min(mergecost) < maxerror
    · Find the cheapest pair to merge:
        i = argmin_i(mergecost(i))
    · Merge the two segments, update the a_i, b_i boundary indices, and
      recalculate the merge costs:
        mergecost(i) = cost(S(a_i, b_{i+1}))
        mergecost(i − 1) = cost(S(a_{i−1}, b_i))
  end
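The pseudocode of Table 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation: the names `bottom_up`, `initial_length` and `variance` are hypothetical, indices are 0-based and inclusive, and a univariate variance is used as the cost only to keep the example short (any cost function, including the PCA based ones, could be passed in).

```python
def bottom_up(n_samples, cost, max_error, initial_length=2):
    """Bottom-up segmentation following Table 1.

    cost(a, b) returns the cost of the segment covering samples a..b
    (inclusive, 0-based). Returns the final segments as (a, b) pairs.
    """
    # Initial fine approximation: short consecutive segments.
    segments = [
        (a, min(a + initial_length, n_samples) - 1)
        for a in range(0, n_samples, initial_length)
    ]
    # Cost of merging each segment with its right neighbour.
    merge_cost = [
        cost(segments[i][0], segments[i + 1][1])
        for i in range(len(segments) - 1)
    ]
    while merge_cost and min(merge_cost) < max_error:
        i = merge_cost.index(min(merge_cost))  # cheapest pair
        # Merge segments i and i+1 and drop the stale merge cost.
        segments[i] = (segments[i][0], segments[i + 1][1])
        del segments[i + 1]
        del merge_cost[i]
        # Recalculate the merge costs that involve the new, larger segment.
        if i < len(segments) - 1:
            merge_cost[i] = cost(segments[i][0], segments[i + 1][1])
        if i > 0:
            merge_cost[i - 1] = cost(segments[i - 1][0], segments[i][1])
    return segments


values = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]


def variance(a, b):
    seg = values[a:b + 1]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg) / len(seg)


bottom_up_result = bottom_up(len(values), variance, max_error=0.5)
print(bottom_up_result)
```

Only the two merge costs adjacent to the merged segment are recomputed in each iteration, which is what keeps the procedure cheap compared with re-evaluating every pair.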
3 PCA based Segmentation Costs
Since the aim of this paper is to design a segmentation algorithm that is able
to detect changes in the correlation structure among several variables, the cost
function of the segmentation is based on the Principal Component Analysis
of the Fi covariance matrices of the segments:
$$\mathbf{F}_i = \frac{1}{b_i - a_i} \sum_{k=a_i}^{b_i} (\mathbf{x}_k - \mathbf{v}_i)(\mathbf{x}_k - \mathbf{v}_i)^T. \tag{3}$$
Principal Component Analysis (PCA) is based on the decomposition of the
covariance matrix $\mathbf{F}_i = \mathbf{U}_i \Lambda_i \mathbf{U}_i^T$ into a matrix $\Lambda_i$, which contains the
eigenvalues of $\mathbf{F}_i$ on its diagonal in decreasing order, and a matrix $\mathbf{U}_i$, whose
columns are the eigenvectors corresponding to these eigenvalues. Using the
first few ($p < n$) nonzero eigenvalues and the corresponding eigenvectors, the
PCA model projects the correlated high-dimensional data onto a hyperplane,
which is useful for the visualization and the analysis of multivariate data:
$$\mathbf{y}_{i,k} = \Lambda_{i,p}^{-\frac{1}{2}} \mathbf{U}_{i,p}^T \mathbf{x}_k. \tag{4}$$
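The decomposition and the projection of equation (4) can be sketched with NumPy as below. The function names `segment_pca` and `project` and the random toy data are assumptions of this example. The sketch applies equation (4) to the samples as given, so it presumes the data have already been scaled appropriately, which is also an assumption.

```python
import numpy as np


def segment_pca(X, p):
    """PCA model of one segment; the rows of X are the samples x_k."""
    F = np.cov(X, rowvar=False)             # equation (3), 1/(N-1) scaling
    eigvals, eigvecs = np.linalg.eigh(F)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:p]   # indices of the p largest
    return eigvecs[:, order], eigvals[order]


def project(X, U_p, lambda_p):
    """Equation (4): y_k = Lambda_p^(-1/2) U_p^T x_k for every row of X."""
    return (X @ U_p) / np.sqrt(lambda_p)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))  # correlated toy data
U_p, lambda_p = segment_pca(X, p=2)
Y = project(X, U_p, lambda_p)
print(Y.shape)
```

Because of the scaling by the inverse square root of the eigenvalues, the projected scores have unit variance and are uncorrelated, which is what makes the squared score length usable as a distance measure.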
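When the PCA model has an adequate number of dimensions, the distance of
the data from the p-dimensional hyperplane of the PCA model is caused by
measurement failures, disturbances and negligible information. Hence, it is
useful to analyze the reconstruction error of the projection:
$$Q_{i,k} = (\mathbf{x}_k - \hat{\mathbf{x}}_k)^T (\mathbf{x}_k - \hat{\mathbf{x}}_k) = \mathbf{x}_k^T (\mathbf{I} - \mathbf{U}_{i,p}\mathbf{U}_{i,p}^T)\mathbf{x}_k, \tag{5}$$
where $\hat{\mathbf{x}}_k = \mathbf{U}_{i,p}\mathbf{U}_{i,p}^T \mathbf{x}_k$ is the reconstruction of $\mathbf{x}_k$ from its projection.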
The analysis of the distribution of the projected data is also informative. The
Hotelling $T^2$ measure is often used to calculate the distance of the mapped
data from the center of the linear subspace:
$$T^2_{i,k} = \mathbf{y}_{i,k}^T \mathbf{y}_{i,k}. \tag{6}$$
Figure 1 illustrates these measures in the case of two variables and one
principal component.
These $T^2$ and $Q$ measures are often used for the monitoring of multivariate
systems and for the exploration of the errors and the causes of the errors.
The main idea of this paper is to use these measures as the measure of the
homogeneity of the segments:
$$\mathrm{cost}_{T^2}(S_i(a_i, b_i)) = \frac{1}{b_i - a_i + 1} \sum_{k=a_i}^{b_i} T^2_{i,k}, \qquad \mathrm{cost}_{Q}(S_i(a_i, b_i)) = \frac{1}{b_i - a_i + 1} \sum_{k=a_i}^{b_i} Q_{i,k}. \tag{7}$$
Fig. 1. Distance measures based on the PCA model.
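The two segment costs of equation (7) can be sketched with NumPy as follows. The function name `pca_costs` and the toy data (two variables lying close to a line) are assumptions of this example; the samples are used as given in equations (4)-(6), so any prior scaling or centering of the data is left to the caller.

```python
import numpy as np


def pca_costs(X, p):
    """Return (cost_T2, cost_Q) of the segment whose samples are the rows of X."""
    F = np.cov(X, rowvar=False)              # equation (3)
    eigvals, eigvecs = np.linalg.eigh(F)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:p]    # keep the p largest
    U_p, lambda_p = eigvecs[:, order], eigvals[order]
    Y = (X @ U_p) / np.sqrt(lambda_p)        # equation (4)
    T2 = np.sum(Y ** 2, axis=1)              # equation (6)
    residual = X - (X @ U_p) @ U_p.T         # x_k minus its reconstruction
    Q = np.sum(residual ** 2, axis=1)        # equation (5)
    return T2.mean(), Q.mean()               # equation (7)


rng = np.random.default_rng(1)
t = rng.normal(size=300)
# Two strongly correlated variables: the points lie close to a line.
X = np.column_stack([t, 2.0 * t + 0.01 * rng.normal(size=300)])
costs_p1 = pca_costs(X, p=1)
costs_p2 = pca_costs(X, p=2)
print(costs_p1, costs_p2)
```

With one principal component the reconstruction error is small but nonzero; with as many components as variables it vanishes up to rounding, so only the T² cost carries information in that case.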
4 Hierarchical Clustering of Segments and Time-Series
4.1 Distance Measure for PCA Models
An advantage of using the PLA segment representation of the time-series
is that it allows one to define a variety of distance measures to represent
the similarities between two time-series. The distance measure defined for
univariate piecewise linear models is calculated based on the endpoints of the
linear lines used to describe the segments. Unfortunately, the distances among
multivariate PCA models (i.e. hyperplanes) cannot be evaluated with this
approach. Hence, for this purpose the PCA similarity factor, SPCA, developed
by Krzanowski [17,18] is used to compare multivariate time series segments.
Consider two segments, $S_i$ and $S_j$, of a historical data set having the same
$n$ variables. Let the PCA models for $S_i$ and $S_j$ consist of $p$ PCs each. The
corresponding subspaces, defined by the ($n \times p$) matrices of eigenvectors of the
covariance matrices, are denoted by $\mathbf{U}_{i,p}$ and $\mathbf{U}_{j,p}$ respectively. The similarity
between these subspaces is defined based on the sum of the squares of the
cosines of the angles between each principal component of $\mathbf{U}_{i,p}$ and $\mathbf{U}_{j,p}$:
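$$S_{PCA} = \frac{1}{p} \sum_{i=1}^{p} \sum_{j=1}^{p} \cos^2 \theta_{i,j} = \frac{1}{p}\, \mathrm{trace}\left( \mathbf{U}_{i,p}^T \mathbf{U}_{j,p} \mathbf{U}_{j,p}^T \mathbf{U}_{i,p} \right). \tag{8}$$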
Because subspaces Ui,p and Uj,p contain the p most important principal com-
ponents that account for most of the variance in their corresponding data sets,
SPCA is also a measure of similarity between the segments Si and Sj.
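The similarity factor of equation (8) reduces to a few matrix products. The sketch below is illustrative: the function name `pca_similarity` and the hand-built orthonormal bases are assumptions, chosen so that the expected values are easy to check by hand.

```python
import numpy as np


def pca_similarity(U_i, U_j):
    """Equation (8): trace(U_i^T U_j U_j^T U_i) / p for two n x p bases."""
    p = U_i.shape[1]
    M = U_i.T @ U_j          # cosines of the angles between the basis vectors
    return np.trace(M @ M.T) / p


c = 1.0 / np.sqrt(2.0)
U_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
U_b = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
U_c = np.array([[1.0, 0.0], [0.0, c], [0.0, c], [0.0, 0.0]])

sim_same = pca_similarity(U_a, U_a)        # identical subspaces
sim_orthogonal = pca_similarity(U_a, U_b)  # orthogonal subspaces
sim_tilted = pca_similarity(U_a, U_c)      # one direction tilted by 45 degrees
print(sim_same, sim_orthogonal, sim_tilted)
```

The factor lies between 0 (orthogonal subspaces) and 1 (identical subspaces); in the tilted case one direction matches exactly and the other contributes cos²(45°) = 0.5, giving (1 + 0.5) / 2 = 0.75.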
4.2 Hierarchical Clustering of Subsequences
The clustering of time series can be broadly classified into two categories:
• Whole clustering: the notion of clustering here is similar to that of
conventional clustering of discrete objects. Given a set of individual time
series data, the objective is to group similar time series into the same cluster.
• Subsequence clustering: Given a single time series, individual time se-
ries (subsequences) are extracted with a sliding window. Clustering is then
performed on the extracted time series.
Lin et al. showed that the subsequence clustering approach is meaningless
[23]. Hence, in this paper the whole clustering approach is followed: the
extracted segments are treated as individual time series and clustered based
on the PCA similarity factor presented in the previous subsection.
One of the most widely used clustering approaches is hierarchical clustering,
because the resulting dendrogram effectively shows the merging of the objects
into clusters at various stages of the analysis and the similarities at each
stage of the clustering procedure (see the bottom of Figure 3). The
interpretation of the results is intuitive, which is the major reason for
applying this method.
The $S_{PCA}$ similarity factors are organized in the form of a matrix. The
similarity matrix is then scanned for the largest value, which corresponds to
the most similar segments. These two segments are linked, and the rows and
columns corresponding to the old segments are removed from the matrix. The
row and the column of the similarity matrix for the new group of segments are
then recomputed. This process is repeated until all segments have been linked.
There are a variety of ways to compute the distances between the objects and
clusters in hierarchical clustering. The utilized single-linkage method assesses
similarity by measuring the distance to the farthest object in the cluster.
The results of a hierarchical clustering are usually displayed as a dendrogram,
which is a tree-shaped map of the intersample distances in the data set.
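The linking procedure described above can be sketched as a small agglomerative loop over a similarity matrix. The function name `agglomerate`, the made-up 4 x 4 similarity matrix and the `linkage` parameter are assumptions of this example: `linkage=min` scores two groups by their least similar pair of members, `linkage=max` by their most similar pair.

```python
def agglomerate(similarity, linkage=min):
    """Greedy agglomerative clustering on a symmetric similarity matrix.

    Returns the merge history as (group_a, group_b, similarity) tuples,
    where each group is a tuple of original segment indices.
    """
    clusters = [(i,) for i in range(len(similarity))]
    history = []

    def group_similarity(a, b):
        return linkage(similarity[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Scan all pairs of groups for the largest similarity.
        s, ia, ib = max(
            (group_similarity(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        history.append((clusters[ia], clusters[ib], s))
        merged = clusters[ia] + clusters[ib]
        # Remove the two old groups and add the new one.
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append(merged)
    return history


# Hypothetical similarity factors between four segments.
S = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
merge_history = agglomerate(S)
print(merge_history)
```

The similarity recorded at each merge is the level at which the corresponding branches would join in a dendrogram.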
5 Application to Process Monitoring
Manual process supervision relies heavily on visual monitoring of character-
istic shapes of changes in process variables, especially their trends. Although
humans are very good at visually detecting such patterns, for a control sys-
tem software it is a difficult problem. The aim of this example is to show how
the proposed algorithms are able to detect meaningful temporal shapes from
multivariate historical process data of several sensors.
The monitoring of a medium and high-density polyethylene (MDPE, HDPE)
plant is considered. HDPE is a versatile plastic used for household goods,
packaging, car parts and pipes. The plant is operated by TVK Ltd., which is the
largest polymer production company in Hungary (www.tvk.hu).
An interesting problem with the process is that about ten product grades
have to be produced according to market demand. The difficulty of the problem
comes from the fact that there are more than ten process variables to
consider. Measurements are available every 15 seconds (240 per hour) on the
process variables $\mathbf{x}_k$: $x_{k,1}$ is the polymer production intensity (PE);
$x_{k,2}, \ldots, x_{k,6}$ are the inlet flowrates of hexene (C6in), ethylene (C2in),
hydrogen (H2in), the isobutane solvent (IBin) and the catalyst (Kat);
$x_{k,7}, \ldots, x_{k,9}$ are the concentrations of ethylene (C2), hexene (C6) and
hydrogen (H2); $x_{k,10}$ is the density of the slurry in the reactor (slurry); and
$x_{k,11}$ is the temperature of the reactor (T).
Before the application of the presented methods two important parameters
have to be selected. The first is the number of principal components, $p$.
As $p$ increases, the reconstruction error decreases. In the case of $p = n = 11$
the reconstruction error becomes zero and the Hotelling $T^2$ becomes the real
distance in the whole range of the data set. If $p$ is too small, the
reconstruction error will be large for the entire time-series. In these two
extreme cases the segmentation is not based on the internal relationships
among the variables, so equidistant segments are detected. When the number of
latent variables is in the range $p = 3, \ldots, 8$, reliable segments are detected
and the results (borders of segments and dendrograms) are not sensitive to the
choice of $p$ within this range. This is because the first $3, \ldots, 8$ eigenvalues
account for $95, \ldots, 99\,\%$ of the total variance.
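One common way to check how much of the total variance the first p eigenvalues contain is the cumulative explained variance ratio. The sketch below is an illustration on synthetic data, not the plant data; the function name `cumulative_variance_ratio`, the 95 % threshold and the toy variables are assumptions.

```python
import numpy as np


def cumulative_variance_ratio(X):
    """Cumulative share of the total variance held by the largest eigenvalues."""
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # descending
    return np.cumsum(eigvals) / np.sum(eigvals)


rng = np.random.default_rng(2)
# Three independent toy variables with very different variances.
X = np.column_stack([
    10.0 * rng.normal(size=500),
    5.0 * rng.normal(size=500),
    0.1 * rng.normal(size=500),
])
ratio = cumulative_variance_ratio(X)
# Smallest number of components whose eigenvalues hold at least 95 %.
p_min = int(np.searchsorted(ratio, 0.95)) + 1
print(ratio, p_min)
```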
The other important parameter is the number of segments. It can be determined
by the method presented by Vasko et al. in [20]. This method is based
on a permutation test that determines whether the increase of the model
accuracy with the increase of the number of segments is due to the underlying
structure of the data or due to the noise. In this paper a similar but much
simpler method has been applied for this purpose. It is based on the modelling
error and the relative reduction of the modelling error when the number of
segments is increased by one. The curves depend on the applied method, but
similar diagrams can be obtained in the analyzed cases. The $\mathrm{cost}_Q$ (7) can be
seen in Figure 2 as a function of the number of segments in the case of
Example 1; it shows that 7 segments give acceptable results (the relative
reduction rate is quite inaccurate because there is noise on the data).
Fig. 2. The $\mathrm{cost}_Q$ and its relative reduction rate as a function of the number
of segments in Example 1. (Top panel: weighted cost based on Q; bottom panel:
relative reduction rate; horizontal axis: number of segments.)
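The relative reduction of the modelling error can be computed as sketched below. The exact formula is not given in the text, so the definition used here, (cost(c) − cost(c+1)) / cost(c), is an assumption, as are the function name `relative_reduction` and the cost values, which are made-up numbers rather than results from the plant data.

```python
def relative_reduction(costs):
    """Relative drop of the total cost when one more segment is allowed.

    costs[c - 1] is the total cost obtained with c segments. Element c - 1 of
    the result is (cost(c) - cost(c + 1)) / cost(c), an assumed definition.
    """
    return [
        (costs[c] - costs[c + 1]) / costs[c] for c in range(len(costs) - 1)
    ]


# Made-up total costs for 1, 2, 3, 4 and 5 segments.
example_costs = [8.0, 4.0, 2.0, 1.9, 1.85]
reductions = relative_reduction(example_costs)
print(reductions)
```

In this toy case the reduction stays large up to three segments and then drops sharply, which is the kind of knee one would look for when choosing the number of segments.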
In the first example, consider a 40-hour long period of time with a product
transition between the 15th and 20th hour. The transition can be followed well
e.g. in the temperature. As mentioned above, the algorithms searched for 7
segments with 4 principal components. The results can be seen in Figure 3.
The left column contains the plots obtained with the $\mathrm{cost}_Q$ function and the
right column those obtained with the $\mathrm{cost}_{T^2}$ function. The vertical lines show
the borders of the segments. The dendrograms show, based on the applied PCA
similarity measure (8), how similar the resulting segments are. It is important
to mention that the dendrograms must not be compared, because they are not
related to the same segment borders. Both methods determine the change very
accurately and detect two segments around the 16th hour. Both methods
distinguish the irregular operation between the 26th and 29th hour. The method
based on $\mathrm{cost}_Q$ splits the first 15 hours into two segments, a split that the
method based on $\mathrm{cost}_{T^2}$ does not find; but if 6 segments are searched for by
$\mathrm{cost}_Q$, the borders are the same as with 7 segments except that the first two
segments are merged. Consequently, both methods give very similar results in
the case of a product transition.
In the second example, a 35-hour long period of production of a particular
product was analyzed. Both cost functions, $\mathrm{cost}_Q$ and $\mathrm{cost}_{T^2}$, have been
applied to this dataset as well. Based on the change of model accuracy, 8
segments seemed to be acceptable for both algorithms. As can be seen in
Figure 4, the algorithms give different results. The first 5 eventful hours
contain several segments with both algorithms: 3 segments based on $\mathrm{cost}_Q$ and
4 segments based on $\mathrm{cost}_{T^2}$. These first hours belong to the transition time
period (see e.g. the density of the slurry or the temperature) that lasts
approximately up to the 7th-8th hour. The borders of the other segments are
quite different; only the irregular operation around the 15th hour is treated
similarly: both algorithms detect 2 segments close to this time. (The
concentration of ethylene is increased, the inlet flowrates of hexene, ethylene
and isobutane solvent are decreased, and the polymer production intensity
changes rapidly during this short time period of approximately three quarters
of an hour.) As can be seen in Figure 1, the modelling error $Q$ is more
sensitive to changes in the correlation structure of the data, whereas the
Hotelling $T^2$ measure determines more accurately the changes in the mean and
variance of the data, even if the analyzed data points were generated by the
same model. These changes usually happen together during product transitions,
but they can differ under normal operating conditions. The differences between
the two approaches can be the topic of future work; here it is enough to
conclude that these methods are complementary and can be applied to a
segmentation problem together rather than separately.
Humans play the most important role in the exploitation of the results of
segmentation. In knowledge discovery from time series the goal is to detect
interesting patterns in the series that may help to better recognize the
regularities in the observed variables and thereby improve the understanding of
the system. Humans are extremely good at visual observation and are able to
follow the changes of even 10-12 variables, while this is a very difficult and
challenging problem for computers. However, humans cannot review more variables
than that, and when extremely large databases have to be analyzed it is also
worth having the analysis done by computers. For this purpose, a new method
has been developed in this paper, and humans, mainly the experts of the
analyzed technology, take part in the explanation and utilization of the
segmentation results because this cannot be done by computers. For example, it
has to be determined why a given segment is different from another one, what
its cause can be, and how it can be used in the operation of the technology to
reduce the amount of off-grade product. Hence, the exploitation of the
segmentation results requires a priori knowledge about the environment
(experience of operators, knowledge of engineers and scientists), so the
'indirect information fusion' approach is also followed in this work.
In the current state of our project we use this tool to compare the production
of different products and extract homogeneous segments, and the results show
that the proposed tool is useful for extracting information about the changes
of the operation regimes of the process and about process faults.
6 Conclusions
This paper presented a new algorithm for the segmentation of multivariate
time-series. The algorithm is based on the simultaneous identification of bor-
ders of the segments and the hyperplanes of the local PCA models used to
measure the homogeneity of the segments. Two homogeneity measures
corresponding to the two typical applications of PCA models have been defined.
The $Q$ reconstruction error segments the time-series according to the change
of the correlation among the variables, while the Hotelling $T^2$ measure
segments the time-series based on the drift of the center of the operating region.
The algorithm was applied to the monitoring of the production of high-density
polyethylene. The results suggest that the proposed tool can be applied to dis-
tinguish and cluster typical operational conditions and analyze product grade
transitions of process systems.
Besides the industrial application examples, synthetic datasets can be analyzed
as well to assess the usefulness of the method. For this
purpose, the MATLAB code of the algorithm is available from our website
(www.fmt.vein.hu/softcomp/segment), so the readers can easily test the
proposed method on their own datasets.
The application of the identified segments in an intelligent query system
designed for multivariate historical process databases is an interesting and
useful idea for future research.
Acknowledgements
The support of the Cooperative Research Center (2001-II-1), the Hungarian
Ministry of Education (FKFP - 0073 / 2001), the Hungarian Science Foundation
(T037600) and the Janos Bolyai Research Fellowship is gratefully
acknowledged.
Fig. 3. The plots on the left side show the borders of the segments and the
dendrogram based on $\mathrm{cost}_Q$, the plots on the right side those based on
$\mathrm{cost}_{T^2}$, for the period with product transition (Example 1). (Panels, each
plotted against Time [h]: C2 (w%), C6 (w%), H2 (mol%), slurry (g/cm3), T (°C),
PE (t/h), C6in (kg/h), C2in (t/h), H2in (kg/h), IBin (t/h), Katin; dendrogram
axis: Level.)
Fig. 4. The plots on the left side show the borders of the segments and the
dendrogram based on $\mathrm{cost}_Q$, the plots on the right side those based on
$\mathrm{cost}_{T^2}$, for the period without transition (Example 2). (Panels, each
plotted against Time [h]: C2 (w%), C6 (w%), H2 (mol%), slurry (g/cm3), T (°C),
PE (t/h), C6in (kg/h), C2in (t/h), H2in (kg/h), IBin (t/h), Katin; dendrogram
axis: Level.)
References
[1] S. Kivikunnas, Overview of process trend analysis methods and applications,
ERUDIT Workshop on Applications in Pulp and Paper Industry (1998) CD
ROM.
[2] G. Stephanopoulos, C. Han, Intelligent systems in process engineering: A
review, Comput. Chem. Eng. 20 (1996) 743–791.
[3] J. C. Wong, K. McDonald, A. Palazoglu, Classification of process trends based
on fuzzified symbolic representation and hidden Markov models, Journal of
Process Control 8 (1998) 395–408.
[4] M. Last, Y. Klein, A. Kandel, Knowledge discovery in time series databases,
IEEE Transactions on Systems, Man, and Cybernetics 31 (1) (2000) 160–169.
[5] W. Elmenreich, An introduction to sensor fusion, Research Report 47/2001,
Technische Universität Wien, Institut für Technische Informatik, Treitlstr. 1-
3/182-1, 1040 Vienna, Austria (2001).
[6] M. E. Tipping, C. M. Bishop, Mixtures of probabilistic principal components
analysis, Neural Computation 11 (1999) 443–482.
[7] B. Karlsson, J.-O. Järrhed, P. Wide, A fusion toolbox for sensor data fusion in
industrial recycling, IEEE Transactions on Instrumentation and Measurement
51 (1) (2002) 144–149.
[8] G. L. Marcialis, F. Roli, Fusion of LDA and PCA for face verification, Springer-
Verlag, London, UK (2002) 30–38.
[9] F. Samadzadegan, Fusion techniques in remote sensing, The International
Archives of the Photogrammetry, Remote Sensing.
[10] A. Negiz, A. Cinar, Monitoring of multivariable dynamic processes and sensor
auditing, Journal of Process Control 8 (5) (1998) 375–380.
[11] C. Natale, R. Paolesse, A. Macagnano, A. Mantini, A. D'Amico, A. Legin,
L. Lvova, A. Rudnitskaya, Y. Vlasov, Electronic nose and electronic tongue
integration for improved classification of clinical and food samples, Sensors and
Actuators B 64 (2000) 15–21.
[12] A. Charef, A. Ghauch, P. Baussand, M. Martin-Bouyer, Water quality
monitoring using a smart sensing system, Measurement 28 (2000) 219–224.
[13] C. Cimander, T. Bachinger, C.-F. Mandenius, Integration of distributed multi-
analyzer monitoring and control in bioprocessing based on a real-time expert
system, Journal of Biotechnology 103 (2003) 237–248.
[14] C. Cimander, M. Carlsson, C.-F. Mandenius, Sensor fusion for on-line
monitoring of yoghurt fermentation, Journal of Biotechnology 99 (2002) 237–
248.
[15] E. Keogh, S. Chu, D. Hart, M. Pazzani, An online algorithm for
segmenting time series, IEEE International Conference on Data Mining (2001)
http://citeseer.nj.nec.com/keogh01online.html.
[16] E. Keogh, M. Pazzani, An enhanced representation of time series which allows
fast and accurate classification, clustering and relevance feedback, 4th Int. Conf.
on KDD. (1998) 239–243.
[17] W. Krzanowski, Between group comparison of principal components, J. Amer.
Stat. Assoc. (1979) 703–707.
[18] A. Singhal, D. Seborg, Matching patterns from historical data using PCA and
distance similarity factors, Proceedings of the American Control Conference
(2001) 1759–1764.
[19] G. McKee, What can be fused?, Multisensor Fusion for Computer Vision, Nato
Advanced Studies Institute Series F (99).
[20] K. Vasko, H. Toivonen, Estimating the number of segments in time series data
using permutation tests, IEEE International Conference on Data Mining (2002)
466–473.
[21] J. Himberg, K. Korpiaho, H. Mannila, J. Tikanmaki, H. T. Toivonen,
Time-series segmentation for context recognition in mobile devices, IEEE
International Conference on Data Mining (ICDM01), San Jose, California
(2001) 203–210.
[22] J. Abonyi, B. Feil, S. Nemeth, P. Arva, Fuzzy clustering time series
segmentation, IDA 2003 Conference (2003) http://www.fmt.vein.hu/softcomp.
[23] J. Lin, E. Keogh, W. Truppel, Clustering of streaming time series is meaningless:
Implications for previous and future research, SIGKDD’03.