landman, rinat; kortela, jukka; sun, qiang; jämsä-jounela, sirkka … · landman, rinat; kortela,...

39
This is an electronic reprint of the original article. This reprint may differ from the original in pagination and typographic detail. Powered by TCPDF (www.tcpdf.org) This material is protected by copyright and other intellectual property rights, and duplication or sale of all or part of any of the repository collections is not permitted, except that material may be duplicated by you for your research use or educational purposes in electronic or print form. You must obtain permission for any other use. Electronic or print copies may not be offered, whether for sale or otherwise to anyone who is not an authorised user. Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of oscillations in control loops using data-driven causality and plant connectivity Published in: Computers and Chemical Engineering DOI: 10.1016/j.compchemeng.2014.09.017 Published: 01/01/2014 Document Version Early version, also known as pre-print Please cite the original version: Landman, R., Kortela, J., Sun, Q., & Jämsä-Jounela, S-L. (2014). Fault propagation analysis of oscillations in control loops using data-driven causality and plant connectivity. Computers and Chemical Engineering, 71, 446- 456. https://doi.org/10.1016/j.compchemeng.2014.09.017

Upload: others

Post on 12-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

This is an electronic reprint of the original article.This reprint may differ from the original in pagination and typographic detail.

Powered by TCPDF (www.tcpdf.org)

This material is protected by copyright and other intellectual property rights, and duplication or sale of all or part of any of the repository collections is not permitted, except that material may be duplicated by you for your research use or educational purposes in electronic or print form. You must obtain permission for any other use. Electronic or print copies may not be offered, whether for sale or otherwise to anyone who is not an authorised user.

Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-LiisaFault propagation analysis of oscillations in control loops using data-driven causality and plantconnectivity

Published in:Computers and Chemical Engineering

DOI:10.1016/j.compchemeng.2014.09.017

Published: 01/01/2014

Document VersionEarly version, also known as pre-print

Please cite the original version:Landman, R., Kortela, J., Sun, Q., & Jämsä-Jounela, S-L. (2014). Fault propagation analysis of oscillations incontrol loops using data-driven causality and plant connectivity. Computers and Chemical Engineering, 71, 446-456. https://doi.org/10.1016/j.compchemeng.2014.09.017

Page 2: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Fault propagation analysis of oscillations in control

loops using data-driven causality and plant connectivity

R. Landman, J. Kortela, Q. Sun, S.-L. Jamsa-Jounela

Aalto University, School of Chemical Technology, Process Control and AutomationResearch Group,P.O Box 16100 FI-00076 Espoo Finland

(e-mail: [email protected]).

Abstract

Oscillations in control loops are one of the most prevalent problems in indus-

trial processes. Due to their adverse effect on the overall process performance,

finding how oscillations propagate through the process units is of major im-

portance. This paper presents a method for integrating process causality

and topology which ultimately enables to determine the propagation path of

oscillations in control loops. The integration is performed using a dedicated

search algorithm which validates the quantitative results of the data-driven

causality using the qualitative information on plant connectivity. The out-

come is an enhanced causal model which reveals the propagation path. The

analysis is demonstrated on a case study of an industrial paperboard machine

with multiple oscillations in its drying section due to valve stiction.

Keywords: Causal analysis, Fault propagation, Control loops, Paper

machine, Connectivity information, Industrial application.

Preprint submitted to Computers & Chemical Engineering July 9, 2014

Page 3: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

1. Introduction

Large-scale industrial systems are often subject to abnormal events such

as faulty operations, external disturbances and control system failures lead-

ing to low productivity, increased operational costs and sometimes even haz-

ardous operations (Yang et al., 2010a). In particular, oscillations in control

loops are very common in industrial processes and lead to poor control perfor-

mance, low product quality and excessive energy consumption (Yuan & Qin,

2013). Oscillations in control loops are typically caused by valve problems

such as excessive friction (stiction), poor tuning of controllers or controller

interactions (Hagglund, 1995). In large-scale systems with many interacting

control loops, oscillations can easily propagate through the process units in

multiple paths, making it difficult to determine the most probable propaga-

tion path.

In recent years, capturing causality between different process variables

has become a vital tool in the diagnosis of faulty systems due to its ability

to identify the propagation path of disturbances. (Heim et al., 2002; Yang

et al., 2012). Typically, the outcome of causal analysis is a causal model in

the form of a signed directed graph (SDG) representing process variables as

nodes and causal relationships as arcs (Heim et al., 2002).

SDGs can be constructed from process knowledge and/or process data.

Models based on process knowledge can be developed using mathematical

equations describing the system (Maurya et al., 2003, 2004, 2006) or they can

be established directly from piping and instrumentation diagrams (P&IDs).

Models which are based on the physical layout of the process are typically re-

ferred to as topology-based models or process connectivity models (Di Geron-

2

Page 4: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

imo Gil et al., 2011). Several techniques for extracting plant connectivity

information from P&IDs have been developed in recent years (Yim et al.,

2006; Thambirajah et al., 2007, 2009). Topology-based models are qualita-

tive, i.e., they do not provide any information on the level of interactions

among variables.

On the other hand, data-driven causal analysis utilizes historical process

data in the form of time series and measures to what extent the time se-

ries corresponding to specific variables influence each other. Usually, the

analysis yields a causality matrix which contains the structural information

of the causal model. Among the most commonly used methods are the

cross-correlation (Bauer & Thornhill, 2008), the Granger causality (Bressler

& Seth, 2010; Granger, 1969) and the transfer entropy (Bauer et al., 2007;

Schreiber, 2000) methods. Unlike process knowledge-based modeling, data-

driven modeling does not require prior information on the intrinsic system.

Moreover, it produces a quantitative model due to its ability to estimate

the level of interactions among variables. However, the data-driven meth-

ods suffer from several limitations and drawbacks. The main difficulty in

data-driven causal analysis is in establishing the statistical significance of

the results, hereby eliminating redundant links from the causal model. Fur-

thermore, occasionally the causal model suggests several hypotheses for the

root cause or in the case of using several methods each method points to a

different potential source. In such occasions, it is essential to utilize process

knowledge for isolating the most probable root cause. Indeed, both Bauer

et al. (2005) and Yang et al. (2012) concluded that process insights derived

from process schematic or site expertise are still essential for validating the

3

Page 5: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

results of the data-driven methods.

Consequently, several attempts have been made in recent years to com-

bine data-driven causal analysis with topology-based models. For instance,

Yang et al. (2010b, 2012) applied the cross-correlation and transfer entropy

methods on an industrial case study in order to validate an SDG based on

process knowledge and vice versa. Thambirajah et al. (2009) introduced the

cause and effect analyzer which combines a causality matrix derived from

process data and qualitative information about the process in the form of a

connectivity matrix which is captured from an XML (extensive markup lan-

guage) description of the process schematic. Then, if more than one probable

root causes are detected, a search of the process connectivity matrix deter-

mines whether a propagation path is feasible and which one is most likely to

be the root cause and propagation path. However, in cases where the system

has a high degree of connectivity among the process units, finding feasible

propagation paths among the process components might not be sufficient to

capture precisely the causal topology.

The present study was designed to identify the propagation path of os-

cillations in control loops by utilizing a dedicated search algorithm which

validates each entry in the causality matrix obtained from the data-driven

analysis using the connectivity matrix extracted from the P&ID. The search

algorithm has two main functionalities: finding feasible propagation paths

between two control elements and determining whether a path is direct or

indirect. Consequently, the entries in the causality matrix which do not rep-

resent genuine direct interactions are excluded and the outcome is a refined

causality matrix which contains the structural information of the propaga-

4

Page 6: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

tion path. The efficiency of the analysis is successfully demonstrated on a

case study of an industrial board machine utilizing the Granger causality

(GC) to obtain the initial causality matrix while the connectivity matrix was

captured from an AutoCAD P&ID as an XML schema.

This type of analysis can be applied in conjunction with different fault

detection methods (Venkatasubramanian et al., 2003) in order to facilitate

the fault diagnosis procedure and expedite process recovery. Consequently,

it can assist in identifying the process units of concern once a certain fault

is detected whilst gaining valuable insights on the process dynamics.

This paper is organized as follows. Section 2 describes the data-driven

and topology based modeling techniques applied in the current study and

how they are combined using the search algorithm. Section 3 describes the

process case study and the fault propagation analysis. The paper ends with

concluding remarks in Section 4.

2. Fault Propagation Analysis

This section first provides an overview on topology based models and

data-driven causal analysis. Due to practical reasons, the section mainly

concentrates on the methods which were implemented in the current study.

Then, the refinement procedure using the search algorithm is described in

detail including each of its functionalities.

2.1. Generation of a Topology-based Model

There are two types of topology-based models: causal digraph and con-

nectivity matrix which can be considered as a graphical and a numerical

representation of the process schematics, respectively. The digraph reflects

5

Page 7: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

physical or signal flows between the equipment and instruments based on the

physical layout of the components it represents. Similarly to the digraph, the

connectivity matrix indicates the relationships between process components

in the form of a binary matrix whose elements are assigned according to the

existence of a directional connection from the row header component to the

column header component (Sun, 2013; Thambirajah et al., 2009).

In this study, topology data was extracted from an electronic P&ID which

is drawn by the specialized Autodesk AutoCAD P&ID drafting application

that has been developed based on Autodesk AutoCAD. In the developed ap-

plication, the topology data is exported in the format of ISO 15926-compliant

XML scheme XMpLant (Noumenon, 2008).

The automated generation of topology information includes the following

tasks. First, the schematic information on the initial component and the

terminal component of every line segment, such as pipes and control signals

is included in the drawing. Secondly, this information is attained through

the database object of the drawing which includes all the topology informa-

tion, namely, the names of the process components, the coordinates of the

components and the connections among them. Finally, this data is further

processed by MATLAB program and converted into connectivity information

which includes the tags, coordinates, and the connectivity between process

components (Sun, 2013).

2.2. Data-Driven Causal Analysis

Yang & Xiao (2012) have recently reviewed and evaluated different data-

driven methods for capturing causality. In practice, the appropriate data-

based method should be selected carefully based on process dynamics, the

6

Page 8: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

available data and type of fault. The outcome of the analysis is a causality

matrix where each element (i,j) in the matrix represents the causal relation-

ship from variable i to variable j. In this study, the analysis is aimed to

identify causal relationships among controllers, thus, the nodes in the causal

model represent the controllers. We used the Granger causality method to

obtain the initial causality matrix. However, due to the high level of connec-

tivity among the controllers, we employed the frequency domain methods as

well to verify the final results of the analysis and to gain further insights on

the level of interactions among the controllers. A description of the methods

which were employed in the current study is given below.

2.2.1. Time Domain Granger Causality (GC)

Granger causality has received great attention in many areas due to its

ease of implementation and efficiency when investigating causal relationships

(Seth, 2005; Yuan & Qin, 2013). Moreover, the method has been extended to

multivariate (MV) time series analysis (Geweke, 1982) which makes it highly

beneficial when investigating large-scale systems.

The basic notion of the GC is that if one time series affects another

series, then the knowledge of the former series should help to predict the

future values of the latter one (Granger, 1969). To illustrate the concept of

the method, consider two time series X1(t) and X2(t) and their corresponding

autoregressive (AR) model:

X1(t) =

p∑j=1

A11,jX1(t− j) +

p∑j=1

A12,jX2(t− j) + ε1(t)

X2(t) =

p∑j=1

A21,jX1(t− j) +

p∑j=1

A22,jX2(t− j) + ε2(t)

(1)

7

Page 9: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

where p is the model order and ε1, ε2 are the residuals for each series. (1) is

typically referred to as the unrestricted model (Bressler & Seth, 2010). The

GC from X2 to X1 is defined as:

Fx2→x1 = ln

[var(ε′1)

var(ε1)

](2)

where ε′1 is obtained from (1) by omitting all A12 coefficients for all j (Seth,

2010). The model after omitting all A12 coefficients is typically referred to

as the restricted model (Bressler & Seth, 2010). The statistical significance

of the GC can be determined via the F -statistic test (Greene, 2002) :

F =RRSr −RRSur

RRSur× T − 2p− 1

p(3)

where RSSr and RSSur are the residual sum of squares of the restricted and

unrestricted models respectively and T is the total number of observations.

For MV processes, the MV (conditional) GC (Guo et al., 2008), which is

based on the expansion of a univariate Auto Regressive (AR) model to a

Multivariate Auto Regressive (MAR) model to include all measured variables

can be used. The method requires that the time series are stochastic and

wide sense stationary (WSS). Otherwise, the AR model can produce spurious

results (Granger & Newbold, 1974). The main limitation of the method lies in

its linearity, thus its application to non-linear systems may not be appropriate

(Bressler & Seth, 2010).

2.2.2. Frequency domain methods

The frequency domain methods represent the energy transfer between

each pair of time series at each frequency. The evaluation of directional

interactions in the frequency domain is especially useful for a process with

8

Page 10: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

oscillatory behavior (Faes et al., 2010). The methods are applied by esti-

mating the MAR model of the time series followed by Fourier transform into

frequency domain. A MAR model is defined as:x1(t)

...

xN(t)

=

p∑r=1

Ar

x1(t− r)

...

xN(t− r)

+

e1(t)

...

eN(t)

(4)

where Xt = [x1(t), x2(t), ...xN(t)] is the vector of N process variables, Et =

[e1(t), e2(t), ...eN(t)] is N dimensional vector of the MV noise terms, A1, A2, ...Ap

are NxN matrices of the model coefficients and p is the model order. By

applying the Z transform operation (Z−i = e−i2πfj) on (4), it is transformed

into frequency domain:

X(f) = H(f)E(f) (5)

where H(f) is the transfer function of the MAR model with the following

relation to the model coefficients:

H(f) = (I − A(f))−1 = A(f)−1

= (I −p∑r=1

ArZ−r)−1

(6)

Σ is the noise covariance matrix of the model and is defined as:

Σ =

σ211 · · · σ1N...

. . ....

σN1 · · · σ2NN

(7)

where σ refers to covariance and σ2 to the variance of the noise terms. Notice

that since it is assumed that the noise terms are uncorrelated, Σ is diagonal

9

Page 11: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

matrix and independent on frequency. If the model coefficients were esti-

mated correctly, the diagonal elements of the Σ matrix should be relatively

small.

The Directed Transfer Function (DTF) introduced by Kaminski & Bli-

nowska (1991) is defined as:

γij(f) =Hij(f)√∑Nj=1|Hij(f)|2

(8)

The DTF represents the signal power that spreads from variable j to vari-

able i over all possible pathways. The DTF is constructed solely from the

transfer function and does not depend on the noise covariance matrix of the

MAR model. In contrast, the Partial Directed Coherence (PDC) (Baccala &

Sameshima, 2001) reveals only the power of the direct interactions between

each pair of variables. The PDC from variable j to i is defined as:

πij(f) =Aij(f)√∑Ni=1|Aij(f)|2

(9)

Notice that the PDC is a function of Aij(f) alone and similarly to the DTF

it does not depend on the noise covariance matrix. While the DTF can be

seen as a spectral measure of the total causal influence of one variable on

the other, the PDC can be seen as a measure of the direct influence of one

variable on the other. Furthermore, the PDC can be seen as the frequency

domain representation of the GC (Baccala & Sameshima, 2001), thus, it can

efficiently complement the time domain GC analysis.

According to Winterhalder et al. (2005), if the diagonal terms of the noise

covariance matrix differ by orders of magnitude, a false detection of influences

from a low variance variable to a high variance variable may occur. Therefore,

10

Page 12: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Baccala et al. (2007) and Yameshita et al. (2005) proposed a renormalized

definition of the PDC and the DTF respectively using the variance of the

noise terms.

Several techniques are available for determining the statistical significance

of the PDC and the DTF (Faes et al., 2010; Schelter et al., 2005; Takahashi

et al., 2007).

2.3. Refinement of the Causality Matrix

The refinement of the causality matrix is based on the process connectiv-

ity information. The aim of this operation is to eliminate all the values in

the causality matrix which do not represent direct causal interactions. The

realization of the refinement procedure of nxn causality matrix X is obtained

according to the following implementation:

1. In matrix X, select the next non-zero (i,j)th

element that has not been

tested.

2. Check if there is a direct physical path from controller i to controller j

using the search algorithm.

3. If there is no direct path from controller i to j, set X(i,j) to zero.

4. Repeat steps 1-3 until all non-zero elements in matrix X are tested.

Note that we define a direct path from controller i to controller j if it does

not transverse any other controller other than j. Recall that the data-driven

methods identify interactions according to their level of influence since when

a fault propagates along a certain path it may stop at some point due to signal

attenuation (Yang et al., 2010a). As a result, the data-driven methods cap-

ture only the interactions associated with the fault while the topology-based

11

Page 13: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

models capture all physical interactions between the process components.

Hence, in the refinement procedure, the connectivity information is used to

validate the results of the data-driven causal analysis and not the other way

around.

2.3.1. The Search Algorithm

The search algorithm consists of two steps: first, it finds whether a feasible

propagation path between two control elements exists using the connectivity

matrix. Then, if a physical path is found, it checks whether the path is

direct or indirect. A detailed description of both steps is given below while

an example of the algorithm implementation in the investigated case study

is shown in Section 3.4.1.

Search for a Physical Path

The algorithm for finding a physical path between two process compo-

nents is a generic algorithm which is based on a graph traversal which searches

a series of nodes, ensuring that each node is only traversed once (Thambirajah

et al., 2009). Particularly, the method uses the depth-first search algorithm

whereby the search begins at a specific node and continues along a set of

nodes until it reaches termination or the target node (Thambirajah et al.,

2009). This recursive algorithm starts by moving from the row in the con-

nectivity matrix representing the ’cause’ variable in the causality matrix and

then searches for ’1’ which indicates that the row element is connected to

the column element. The search then proceeds to the row now representing

the column element and so on until one of the following conditions occurs:

its reaches the ’effect’ variable and hereby the path is identified, it reaches a

12

Page 14: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

component which had been visited before (i.e., there is a circulation path),

or it reaches a component that is not connected to any other element. If

one of the two latter scenarios occurs, the algorithm backtracks to other

nodes until it checks every possible path originating from the ’cause’ variable

(Thambirajah et al., 2009).

In this study, the ’cause’ and ’effect’ variables represent control elements,

thus, the elements used in the search are the components which are directly

connected to the control elements via signal lines or pipe lines (Thambirajah

et al., 2009). The search first begins by checking which equipment elements

are attached to the ’cause’ controller and then proceeds from those elements

to find if there is a path leading to the ’effect’ controller. The prolog code

equivalent to this algorithm was described in detail by Yim et al. (2006)

while few generic examples for the algorithm implementation were depicted

by Thambirajah et al. (2007, 2009).

Search for a Direct Path

Once a physical path between the ’cause’ variable and the ’effect’ variable

is found, an additional unique algorithm is employed to find if it is direct or

not. According to (Jiang et al., 2008) a direct interaction between controller i

and controller j exists if the output of controller i can directly affect controller

j without going through the controller output of any other controller. Based

on this concept, the algorithm checks every possible physical path originating

from the component connected to controller i to see if it intersects with an

element which belongs to a controller that is neither i or j. Namely, if the

path between controllers i and j traverses a controller that is neither i or j,

then the path cannot be considered as direct.

13

Page 15: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

In the initial phase of implementation, all the components in the connectivity

matrix are classified into four categories: equipment, controllers, sensors and

valves. The algorithm checks the type of each element in each physical path

that had been found in the previous step. If it finds a control element (i.e.,

valve, controller or sensor) which belongs to a control loop that is neither the

’cause’ nor the ’effect’, the corresponding path is indirect. Otherwise, if the

component belongs to the ’effect’ control loop, the path is direct. The logic

behind this algorithm is illustrated in Fig. 1.

3. Process Case Study

In this section, we first describe the process case study. Next, spectral

analysis is applied to identify the variables associated with the fault. Finally,

the fault propagation analysis is applied along with a demonstration of the

search algorithm. The data-driven causal analysis involves the implementa-

tion of the time domain Granger causality (GC) while the frequency domain

methods are applied as an auxiliary step in order to provide further insights

on the level of interactions among the controllers.

3.1. Process Description

The process case study is a large-scale board machine (BM) which pro-

duces a three-layer liquid packaging boards and board cups. The analysis

is focused on the drying section of the BM since the web conditions in this

section such as temperature and moisture content have a significant effect on

the board quality, thus, they require proper control and monitoring (Karls-

son, 2000). Moreover, due to the high connectivity among the process units

14

Page 16: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Find a direct path from controller i to j

Is there a physical path between controller i and j?

Check the next physical path

from i to j

Check the next element in the existing path

Is it a control elment?

Does the element belong to control loop i

or j

Yes

NoThere is no direct path from

controller i to j

Yes

NoNo

There is a direct path from controller i to j. Return the

direct path

Does the element belong

to loop i

Yes

No

Yes

Figure 1: The logic of finding a direct path between two controllers

and the control elements in this section, faults can easily propagate through

its subsystems.

In the drying section, the remains of excess water in the web are evapo-

rated to achieve the desirable moisture content in the board using steam-filled

drying cylinders. The condensing steam in the cylinders releases latent heat

which is used to evaporate the bound water in the web. The condensate

from the cylinders is collected by siphons to condensate tanks where steam

15

Page 17: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

and condensate are separated. Steam is then delivered back to the process

and condensate is returned to a power plant. A scheme of the drying sec-

tion and its control loops can be seen in Fig. 2. The drying section contains

74 drying cylinders in total which are divided into six steam groups (SG).

Each SG and its corresponding condensate tank (CT) form a single drying

group (DG). Each DG has its own controllers to control the steam pressure,

the steam pressure difference between steam and condensate headers and the

level of the condensate. The pressure controllers (PCs) are used to control

steam pressure in each SG using 5 and/or 10 bar pressurized steam headers

(denoted as the red lines at the top of Fig. 2). The pressure difference control

between the steam headers and the condensate tanks is important for proper

operation of the drying section since condensate removal with a siphon re-

quires a proper pressure difference. This is achieved by manipulating the

control valves in the steam outlet of the condensate tanks (CTs) using pres-

sure difference controllers. The level of the condensate tanks is controlled by

regulating their outlet flow valves using level controllers (LCs).

Valve stiction, one of the main faults in the drying section, has a detrimen-

tal effect on the control loops performance and the board quality (Pozo Gar-

cia et al., 2013). The present case study entails a valve stiction in the pressure

controller PC1652 and its effect on the interacting loops of the drying section

of the board machine. The stiction diagnosis is based on the long-term main-

tenance records of the plant. The normalized time series of the controlled

variables (PVs) are shown in Fig. 3 where the measurement of PC1652 is

colored in red. The series were normalized by removing the mean values and

scaling to a unit standard deviation. The sampling interval is 10 seconds and

16

Page 18: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Figure 2: Flow sheet of the drying section. Red lines indicate steam pipes, blue lines

indicate condensate pipes and purple lines indicate mixed flow of steam and condensate

(PI=pressure indicator, PC=pressure controller, LC=level controller, SG=steam group,

C= condensate tank)

3000 samples were included in the analysis. There were slight changes in the

set points during the episode, particularly in control loop PC654 (after 2000

samples). Oscillating signals can be clearly seen in loops PC1653, PC651,

PC652, PC653, LC652, PC670, PC1652, PC671, PC672, PC673, LC653 and

LC654.

17

Page 19: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

0 500 1000 1500 2000 2500 3000

PI667PC668

PC1653LC661PC651PC652LC651PC653PC670LC652

PC1652PC671LC653PC654PC672LC654PC659PC673LC657PC660PC674LC658

Samples

Figure 3: Process measurements

3.2. Variable Subset Selection

The aim of this step is to select the subset of variables which are the

most pertinent to the fault. This operation can reduce the dimensionality of

the analysis, improve the results and facilitate their interpretation (Yuan &

Qin, 2013). Commonly used methods for this purpose are different clustering

algorithms, spectral and oscillation analysis or principal component analysis

(PCA) (Bauer et al., 2005; Yuan & Qin, 2013). In this study, the power

spectra of the time series were examined in order to detect measurements

with similar dynamic behavior. The power spectra of the process controlled

variables (PVs) can be seen in Fig. 4 where the spectra of PC1652 is shown

18

Page 20: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

in red.

0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1PI667

PC668PC1653

LC661PC651PC652LC651PC653PC670LC652

PC1652PC671LC653PC654PC672LC654PC659PC673LC657PC660PC674LC658

Frequency/Fs

Figure 4: Spectra of the process measurements (PVs)

The results show that the most prominent oscillation occurs at a fre-

quency of nearly 0.007 Hz (0.07 on the frequency axis) corresponding to

1/0.07 ≈ 14 samples per cycle. The loops oscillating at a common frequency

are: PC668, PC1653, PC651, PC652, PC653, PC670, LC652, PC1652, PC671,

LC653, PC672 and PC673. Thus, the disturbance is mainly affecting SG1,

SG2 and SG3.

3.3. Results of the GC Analysis

Seeing that this study is concerned with control loops, the concept of the

control loop digraph (Jiang et al., 2008) was adopted for the causal analysis.

According to this concept, the GC method was applied by evaluating the in-

fluences of the controllers outputs (OPs) on the process controlled variables

19

Page 21: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

(PVs) and included only the control loops which were found to be oscillating

at the same frequency based on the spectral analysis. The time domain MV

(conditional) GC analysis was applied according to Guo et al. (2008). The

MAR model estimation was performed using the least squares method and

model order was chosen based on the AIC criteria (p = 10). In addition to

mean removal and division by standard deviation, further pre-processing of

the data was required to ensure that the series are WSS. The stationarity

of the series was examined by testing for ’unit roots’ using the Augmented

Dickey-Fuller (ADF) test (Seth, 2010). Time series which were found to be

non-stationary were differentiated (i.e., x′t = xt − xt−1). The statistical sig-

nificance was determined via the F -statistic test (Granger, 1969) and the

results were corrected using the Bonferrori correction for multiple compar-

isons with a p-value of 0.01 (Seth, 2010). The initial causality matrix based

on the GC analysis is shown in Fig. 5. Each (i,j) entry in the causality

matrix is the GC from OPi to PVj. Zeros correspond to GC values which

failed the p-value significance testing.

3.4. Construction of The Causal Model

This step involves the utilization of the search algorithm in order to refine

the causality matrix. First, the search algorithm is demonstrated and then

the results of the causality matrix refinement are given.

3.4.1. Demonstration of The Search Algorithm

The first example of the output of the search algorithm is given for the

entry (PC652 → PC651) in the causality matrix (marked in green in Fig

5) and is presented in Fig. 6. The corresponding connecitivty matrix is

20

Page 22: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Fig. 8. The initial causality matrix based on the GC analysis

7.2 Construction of the causal model

This step involves the utilization of the search algorithm in order to refine the causality matrix. The procedure was implemented on the GC causality matrix according to the scheme presented in Fig. 1. First, the search algorithm is demonstrated and then the results of the GC causality matrix refinement are given.

7.2.1 Demonstration of the search algorithm

An example for the search algorithm output results for the entry (PC652→ PC651) in the causality matrix (bolded in green) is presented below:

PC1653 PC651 PC652 PC653 PC670 LC652 PC1652 PC671 LC653 PC673

PC1653 - 0 0.072 0 0 0 0 0 0.028 0

PC651 0.039 - 0.056 0 0 0 0 0 0 0

PC652 0.065 0.016 - 0 0 0 0 0 0 0

PC653 0.014 0.019 0.017 - 0.017 0 0.024 0.018 0 0

PC670 0.018 0.029 0.031 0 - 0 0.018 0 0 0

LC652 0 0 0 0 0 - 0 0 0 0

PC1652 0.024 0.013 0 0.016 0.018 0 - 0.113 0.044 0.032

PC671 0 0 0 0.016 0.029 0 0.144 - 0.074 0.031

LC653 0.105 0.013 0.014 0.015 0 0.068 0.019 0.021 - 0.012

PC673 0 0 0 0 0 0 0 0 0 -

Checking cause and effect relationship between PI-651 and PI-652... PI-651 is connected to SG1_Steam_Line There are 2 feasible propagation paths from SG1_Steam_Line to PI-652... Path 1 is Path 2 is SG1_Steam_Line SG1_Steam_Line PI-651 SG-1 PC-651 PI-652 PV-651 SG-1 PI-652 There is a direct link between SG1_Steam_Line and PI-652... The direct path is: SG1_Steam_Line SG-1 PI-652

Figure 5: The initial causality matrix based on the GC analysis

shown in Table 1. Note that according to the search algorithm, PV denotes

control valve, PI denotes pressure indicator, PC denotes controller, C denotes

condensate tank , SG denotes steam group and Steam Lines denote the pipes

providing high pressure steam to the SGs (see Fig. 2).

The algorithm starts by specifying the start and end points of the paths.

In this case, the starting and end points are the pressure indicators (PI)

of controllers PC651 and PC652. Next, the algorithm finds that PI-651 is

connected to the steam line of SG1 and it finds two propagation paths from

Table 1: The connectivity matrix for example 1

Element PI-651 SG1 Steam Line PC-651 PV-651 SG-1 PI-652

PI-651 - 0 1 0 0 0

SG1 Steam Line 1 - 0 0 1 0

PC-651 0 0 - 1 0 0

PV-651 0 0 0 - 1 0

SG-1 0 0 0 0 - 1

PI-652 0 0 0 0 0 -

21

Page 23: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Fig. 8. The initial causality matrix based on the GC analysis

7.2 Construction of the causal model

This step involves the utilization of the search algorithm in order to refine the causality matrix. The procedure was implemented on the GC causality matrix according to the scheme presented in Fig. 1. First, the search algorithm is demonstrated and then the results of the GC causality matrix refinement are given.

7.2.1 Demonstration of the search algorithm

An example for the search algorithm output results for the entry (PC652→ PC651) in the causality matrix (bolded in green) is presented below:

PC1653 PC651 PC652 PC653 PC670 LC652 PC1652 PC671 LC653 PC673

PC1653 - 0 0.072 0 0 0 0 0 0.028 0

PC651 0.039 - 0.056 0 0 0 0 0 0 0

PC652 0.065 0.016 - 0 0 0 0 0 0 0

PC653 0.014 0.019 0.017 - 0.017 0 0.024 0.018 0 0

PC670 0.018 0.029 0.031 0 - 0 0.018 0 0 0

LC652 0 0 0 0 0 - 0 0 0 0

PC1652 0.024 0.013 0 0.016 0.018 0 - 0.113 0.044 0.032

PC671 0 0 0 0.016 0.029 0 0.144 - 0.074 0.031

LC653 0.105 0.013 0.014 0.015 0 0.068 0.019 0.021 - 0.012

PC673 0 0 0 0 0 0 0 0 0 -

Checking cause and effect relationship between PI-651 and PI-652... PI-651 is connected to SG1_Steam_Line There are 2 feasible propagation paths from SG1_Steam_Line to PI-652... Path 1 is Path 2 is SG1_Steam_Line SG1_Steam_Line PI-651 SG-1 PC-651 PI-652 PV-651 SG-1 PI-652 There is a direct link between SG1_Steam_Line and PI-652... The direct path is: SG1_Steam_Line SG-1 PI-652

Figure 6: Example 1: The output of the search algorithm for the link PC652→ PC651

the steam line to PI-652. In fact, both paths are the same, but since the SG1

steam line is connected to both PC-651 and to SG1 (see Fig. 2), both paths

are identified. However, since path 1 passes through the ’cause’ controller,

it is not considered as direct and only path 2 is considered therefore as

direct. This ensures that the shortest path is identified as direct. In this

case, since at least one direct path had been found between the controllers,

the corresponding GC value in the causality matrix (0.016) remains. On the

other hand, calling the search algorithm for the entry (PC1652 → PC653)

in the causality matrix (marked in red in Fig. 5) leads to the results shown

in Fig. 7.

In this case, the algorithm found five feasible propagation paths from

PC1652 to PC653, however none of them is direct since all of them pass

22

Page 24: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

The algorithm starts by specifying the start and end points of the paths. In this case, the starting and end points are the pressure indicators (PI) of controllers PC-651 and PC-652. Next, the algorithm finds that PI-651 is connected to the steam line of SG1 and it finds two propagation paths from the steam line to PI-652. In fact, both paths are same, but since the SG1 steam line is connected to both PC-651 and to SG1, both paths are identified. However, since path 1 passes through the ‘cause’ controller, it is not considered as direct and only path 2 is considered therefore as direct. In this case, since at least one direct path had been found between the controllers, the corresponding GC value in the causality matrix (0.016) remains.

On the other hand, calling the search algorithm for the entry (PC1652→ PC653) in the causality matrix (bolded in red) leads to the following results:

Checking cause and effect relationship between PI-1652 and PI-653... PI-1652 is connected to SG3-Steam-Line There are 5 feasible propagation paths from SG3-Steam-Line to PI-653… Path 1 is Path 2 is SG3-Steam-Line SG3-Steam-Line PI-1652 PI-1652 PC-1652 PC-1652 PV-1652.2 PV-1652.2 SG-3 SG-3 PI-671 C-4 PC-671 C4-Steam-Outlet-Line PV-671.1 PV-671.1 SG2-Steam-Line SG2-Steam-Line PI-653 PI-653 Path 3 is Path 4 is SG3-Steam-Line SG3-Steam-Line PI-1652 SG-3 PC-671 PI-671 PV-671.1 PV-671.1 SG2-Steam-Line SG2-Steam-Line PI-653 PI-653 Path 5 is SG3-Steam-Line SG-3 C-4 C4-Steam-Outlet-Line PV-671.1 SG2-Steam-Line PI-653 There is no direct link between SG3-Steam-Line and PI-653...

Figure 7: Example 2: the output of the search algorithm for the link PC1652→ PC653

through control loop PC671. Therefore, the corresponding entry in the

causality matrix (0.016) is eliminated.

3.4.2. The Refined Causality Matrix and The Causal Model

The refined GC causality matrix (i.e., after all the non-zero entries have

been checked by the search algorithm) is shown in Fig. 8. All the GC values

which correspond to non-direct interactions based on the process connectivity

have been set to zero. The causal model based on the refined causality matrix

23

Page 25: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

In this case, the algorithm found five feasible propagation paths from PC1652 to PC653; however, none of them is direct since all of them pass through control loop PC671. Therefore, the corresponding entry in the causality matrix (0.016) is eliminated.

7.2.2 The refined causality matrix and the causal model

The refined GC causal matrix (i.e., after all the non-zero entries have been checked by the search algorithm) is shown in Figure 9. All the GC values which correspond to non-direct interactions based on the process connectivity have been set to zero. The

causal model based on the refined causality matrix is shown in Figure 10.

Fig. 9. The refined causal matrix based on the GC analysis

PC1653 PC651 PC652 PC653 PC670 LC652 PC1652 PC671 LC653 PC673

PC1653 - 0 0 0 0 0 0 0 0 0

PC651 0 - 0.056 0 0 0 0 0 0 0

PC652 0.065 0.016 0 0 0 0 0 0 0 0

PC653 0 0 0 - 0.017 0 0 0 0 0

PC670 0 0.029 0.031 0 - 0 0 0 0 0

LC652 0 0 0 0 0 - 0 0 0 0

PC1652 0 0 0 0 0 0 - 0.113 0.044 0

PC671 0 0 0 0.016 0.029 0 0.144 - 0.074 0

LC653 0 0.013 0.014 0 0 0.068 0 0 - 0

PC673 0 0 0 0 0 0 0 0 0 -

Figure 8: The refined causality matrix based on the GC analysis

is shown in Fig. 9.

The search algorithm was able to eliminate most of the redundant results

from the GC analysis, however the model was still assumed to have few

redundant arcs (denoted by the dashed arcs in Fig. 9 and highlighted entries

in Fig. 8) based on process knowledge. The links LC653 → PC651 and

LC653 → PC652 were identified as direct since the level controller LC653

discharges the condensate from CT4 into CT3 and the steam outlet of CT3 is

discharged to SG1 (see Fig. 2), thus affecting the pressure in SG1. However,

LC653 primarily affects the level of CT3 while the steam pressure of SG1 is

mainly affected by PC670. Similarly, the pressure controller PC1652 directly

affects the pressure difference in SG3 rather than the level of CT4. Those

types of ambiguous results are sometimes inevitable and in-depth process

knowledge is needed to detect them. Nonetheless, the search algorithm was

able to eliminate approximately 88% of the spurious results obtained from

the GC analysis, herewith affirming the efficacy of the refinement procedure.

24

Page 26: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

PC1653

PC651 PC653 PC1652

PC652 PC670 PC671

LC653LC652

Figure 9: The causal model obtained from the refined causal matrix

3.5. Frequency Domain Analysis

As discussed above, the search algorithm was able to eliminate most of

the spurious results from the GC analysis, however, due to the high level of

connectivity among the controllers in the drying section, the causal model

indicates on multiple propagation paths. Furthermore, the causal model

in Fig. 9 implies on two possible root causes: PC1652 and PC671 since

only those controllers can affect all the other ones. Thus, it is essential

to identify the most powerful interactions in order to determine the most

probable propagation path. Consequently, the frequency domain analysis

was applied as an auxiliary step. The frequency domain analysis is especially

useful if the oscillation period in known. In particular, we utilized the PDC

to validate the refined causal model due to its ability capture the direct causal

interactions. On the other hand, the DTF was used to isolate the root cause.

25

Page 27: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

3.5.1. Results of the Frequency Analysis

The frequency domain analysis was applied by evaluating the values of

the PDC and the DTF between each pair of PVs. The MAR model (p = 10)

was estimated similarly as in the GC analysis. Some of the diagonal vari-

ance terms of the noise covariance matrix differed by order of magnitude.

Therefore, the PDC and the DTF were calculated according the renormal-

ization offered by Baccala et al. (2007) and Yameshita et al. (2005) re-

spectively. The thresholds for the statistical significance of the PDC and

DTF were determined using the Direct Causal Fourier Transform (CFTd)

and the Full Causal Fourier Transform (CFTf) surrogates respectively (Faes

et al., 2010). The threshold for significance for each computed value at

each frequency was set at the 95th percentile of the empirical distribution

of the PDC/DTF values computed over 100 sets of multivariate surrogate

series. Due to the ability of the PDC to reveal only the direct interactions

among variables, the plots of the PDC values of each pair of variables in

the model were utilized in order to identify the most prominent propaga-

tion path. This approach is especially useful in this case study due to the

high number of mutual interactions among the controllers. The PDC grid

of plots are shown in Fig. 10. The PDC plots clearly show that the links

LC653 → PC052, LC653 → PC651, PC1652 → LC653 which were previ-

ously assumed to be specious are indeed most likely indirect. Therefore, the

assumptions made earlier are reasonable. The maximum PDC and DTF val-

ues corresponding to the plots are shown in Table 2 with the most powerful

interactions highlighted in gray.

In order to isolate the root cause, the DTF was utilized since it can be

26

Page 28: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC651 to PC652

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC652 to PC1653

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC652 to PC651

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC653 to PC670

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC670 to PC651

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC670 to PC652

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC1652 to PC671

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC1652 to LC653

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC671 to PC653

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC671 to PC670

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC671 to PC1652

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1PC671 to LC653

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1LC653 to PC651

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1LC653 to PC652

0 0.010.020.030.040.050

0.2

0.4

0.6

0.8

1LC653 to LC652

Figure 10: The PDC values (blue lines) at each frequency between each pair of variables

in the causal model and their corresponding threshold for significance estimated by the

CFTd surrogates (red lines). x-axis: frequency (Hz), y-axis: magnitude of the PDC.

seen as the ’amount’ of variation originated from each ’cause’ variable to the

’effect’ variable through direct and indirect paths (Gigi & Tangirala, 2010).

Under this logic, the variable which ’contributed’ the most to the variation

in the process can be seen as the root cause of a disturbance. Furthermore,

since the spectral analysis revealed a common oscillation at the frequency of

0.007 Hz, the sum of the DTF originating for each source j at that frequency

was calculated (i.e.,∑N

i=1 |hij(0.007)|2). The results are shown in Figure 11.

The results confirm that control loop PC1652 is the root cause of the

disturbance. Since PC1652 is highly interacting with the pressure difference

controller PC671, the latter also significantly contributes to the variation in

27

Page 29: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Table 2: The maximum of the PDC and DTF of each link in the causal model

Link Max PDC Max DTF

PC651→ PC652 0.486 0.482

PC652→ PC1653 0.441 0.407

PC652→ PC651 0.181 0.221

PC653→ PC670 0.773 0.808

PC670→ PC651 0.416 0.417

PC670→ PC652 0.278 0.250

PC1652→ PC671 0.805 0.812

PC1652→ LC653 0.281 0.749

PC671→ PC653 0.491 0.651

PC671→ PC670 0.332 0.612

PC671→ PC1652 0.691 0.778

PC671→ LC653 0.438 0.617

LC653→ PC651 0.231 0.393

LC653→ PC652 0.233 0.475

LC653→ LC652 0.374 0.438

the process. Finally, considering the results from the frequency analysis , the

propagation path of the valve stiction in loop PC1652 is depicted in Fig. 12.

The oscillation caused by the stiction will initially propagate to the closely

interacting loop PC671 which will primarily affect the pressure in the suc-

cessive SG but also the level of condensate in CT4. From SG3 the oscillation

will continue to propagate to SG2 (simultaneously affecting the pressure and

pressure difference controllers) until it finally reaches SG1.

28

Page 30: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

PC1653 PC651 PC652 PC653 PC670 LC652 PC1652 PC671 LC653 PC6730

1

2

3

4

5

6

Sum

of t

otal

ene

rgy

trans

fer f

rom

eac

h so

urce

at f

requ

ency

of 0

.007

Figure 11: Sum of the magnitude of the DTF originating from each ’cause’ variable at the

frequency of 0.007 Hz

PC1653

PC651 PC653 PC1652

PC652 PC670 PC671

LC653LC652

Figure 12: The propagation path of valve stiction originating in loop PC1652

29

Page 31: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

4. Conclusions and Future Directions

This paper introduced a fault propagation analysis by the virtue of the

automatic consolidation of data-driven causal analysis with topology-based

model using a dedicated search algorithm. This combination results in an

enhanced causal model due to the ability of the search algorithm to eliminate

indirect interactions from the causality matrix.

The fault propagation analysis was successfully applied to a case study of

a drying section of an industrial board machine where the search algorithm

was able to eliminate most of the spurious results of the causal analysis.

However, several redundant links remained in the causal model in spite of

the refinement procedure, thus, process expert knowledge was essential in

eliminating those. Alternatively, numerous data-driven methods can be em-

ployed in order to construct the causality matrix prior to the refinement

procedure, particularly, in cases where the system is with a high degree of

connectivity among variables.

A number of caveats need to be noted regarding the present study. First,

the study was focused only on one case study with a specific fault (valve stic-

tion). Thus, other case studies with different sources of oscillation should be

examined in order to attain a more precise evaluation of the proposed anal-

ysis. Secondly, so far, only an oscillatory disturbance has been investigated.

In the future, the proposed fault propagation analysis can be used to study

how different types of faults propagate in a system and accordingly select

the critical variables for monitoring. Moreover, studying how different faults

propagate can facilitate their diagnosis; thereby, the source of the fault can

be eliminated before it severely deteriorates the quality of the final product.

30

Page 32: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Acknowledgements

The research leading to these results has received funding from the Euro-

pean Union Seventh Framework Programme (FP7/2007-2013) under Grant

Agreement No. 257580. Especially, the authors would like to thank Stora

Enso Oyj for providing the data and the expert knowledge for the analysis.

List of Figures

1 The logic of finding a direct path between two controllers . . . 15

2 Flow sheet of the drying section. Red lines indicate steam

pipes, blue lines indicate condensate pipes and purple lines in-

dicate mixed flow of steam and condensate (PI=pressure indi-

cator, PC=pressure controller, LC=level controller, SG=steam

group, C= condensate tank) . . . . . . . . . . . . . . . . . . . 17

3 Process measurements . . . . . . . . . . . . . . . . . . . . . . 18

4 Spectra of the process measurements (PVs) . . . . . . . . . . 19

5 The initial causality matrix based on the GC analysis . . . . 21

6 Example 1: The output of the search algorithm for the link

PC652→ PC651 . . . . . . . . . . . . . . . . . . . . . . . . 22

7 Example 2: the output of the search algorithm for the link

PC1652→ PC653 . . . . . . . . . . . . . . . . . . . . . . . . 23

8 The refined causality matrix based on the GC analysis . . . . 24

9 The causal model obtained from the refined causal matrix . . 25

31

Page 33: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

10 The PDC values (blue lines) at each frequency between each

pair of variables in the causal model and their corresponding

threshold for significance estimated by the CFTd surrogates

(red lines). x-axis: frequency (Hz), y-axis: magnitude of the

PDC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

11 Sum of the magnitude of the DTF originating from each ’cause’

variable at the frequency of 0.007 Hz . . . . . . . . . . . . . . 29

12 The propagation path of valve stiction originating in loop

PC1652 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

References

Baccala, L., Sameshima, K., & Takahashi, D. (2007). Generalized partial

directed coherence. In The 15th international conference on digital signal

processing (pp. 163–166). Cardiff, UK.

Baccala, L. A., & Sameshima, K. (2001). Partial directed coherence: a new

concept in neural structure determination. Biological Cybernetics , 84 ,

463–474.

Bauer, M., Cox, J., Caveness, M. H., Downs, J. J., & Thornhill, N. F. (2007).

Finding the direction of disturbance propagation in a chemical process

using transfer entropy. IEEE Transactions on Control Systms Technology ,

15 , 12–21.

Bauer, M., Thornhill, N., & Meaburn, A. (2005). Cause and effect analysis

of a chemical process analysis of a plant-wide disturbance. In The IEEE

32

Page 34: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Seminar on Control Loop Assessment and Diagnosis (pp. 23–30). London,

UK.

Bauer, M., & Thornhill, N. F. (2008). A practical method for identifying the

propagation path of plant-wide disturbances. Journal of Process Control ,

18 , 707–719.

Bressler, S. L., & Seth, A. K. (2010). Wiener-granger causal-

ity: A well established methodology. NeuroImage, 58 , 323–329.

doi:10.1016/j.neuroimage.2010.02.059.

Di Geronimo Gil, G. J., Alabi, D. B., Lyun, O. E., & Thornhill, N. F.

(2011). Merging process models and plant topology. In The 2011 4th

International Symposium on Advanced Control of Industrial Processes (pp.

15–21). Thousand Islands Lake, Hangzhou, China: IEEE.

Faes, L., Porta, A., & Nollo, G. (2010). Testing frequency-domain causality

in multivariate time series. IEEE Transactions on Biomedical Engineering ,

57 , 1897–1906.

Geweke, J. (1982). Measurement of linear dependence and feedback between

multiple time series. Journal of The American Statistical Association, 77 ,

304–313.

Gigi, S., & Tangirala, A. K. (2010). Quantitative analysis of directional

strengths in jointly stationary linear multivariate processes. Biological Cy-

bernetics , 103 , 119–133.

Granger, C., & Newbold, P. (1974). Spurious regression in econometrics.

Journal of Econometrics , 2 , 111–120.

33

Page 35: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Granger, C. W. J. (1969). Investigating causal relations by econometric

models and cross-spectral methods. Econometrica, 37 , 424–438.

Greene, W. H. (2002). Econometric Analysis . (5th ed.). Upper Saddle River,

NJ, USA: Prentice-Hall.

Guo, S., Seth, A. K., Kendrick, K., Zhou, C., & Feng, J. (2008). Par-

tial granger causality : eliminating exogenous inputs and latent variables.

Journal of Neuroscience Methods , 172 , 79–93.

Hagglund, T. (1995). A control -loop performance monitor. Control Engi-

neering Practice, 3 , 1543–1551.

Heim, S., B; Gentil, Cauvin, S., Trave-Massuyes, L., & Braunschweig, B.

(2002). Fault diagnosis of a chemical process using causal uncertain model.

In Workshop PAIS Prestigious Applications of Intelligent Systems, 15th

European Conference on Artificial Intelligence. Lyon, France.

Jiang, H., Patwardhan, R., & Shah, S. L. (2008). Root cause diagnosis of

plant-wide oscillations using the adjacency matrix. In The 17th World

Congress, The International Federation of Automatic Control (pp. 13893–

13900). Seol, Korea.

Kaminski, M. J., & Blinowska, K. J. (1991). A new method of the description

of the information flow in the brain structures. Biological Cybernetics , 65 ,

203–210.

Karlsson, M. (2000). Papermaking Part 2, Drying . Helsinki, Finland: Fapet

Oy.

34

Page 36: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Maurya, M. R., Rengaswamy, R., & Venkatasubramanian, V. (2003). A sys-

tematic framework for the development and analysis of signed digraphs for

chemical processes. 1. algorithms and analysis. Industrial & Engineering

Chemistry Research, 42 , 4789–4810.

Maurya, M. R., Rengaswamy, R., & Venkatasubramanian, V. (2004). Ap-

plication of signed digraphs-based analysis for fault diagnosis of chemical

process flowsheets. Engineering Applications of Artificial Intelligence, 17 ,

501–518.

Maurya, M. R., Rengaswamy, R., & Venkatasubramanian, V. (2006). A

signed directed graph-based systsystem framework for steady-state mal-

function diagnosis inside control loops. Chemical Engineering Science, 61 ,

1790–1810.

Noumenon (2008). Xmplant overview. URL:

http://www.noumenon.org.uk/background.html.

Pozo Garcia, O., Tikkala, V.-M., Zakharov, A., & Jamsa-Jounela, S.-L.

(2013). Integrated FDD system for valve stiction in a paperboard ma-

chine. Control Engineering Practice, 21 , 818–828.

Schelter, B., Winterhalder, M., Eichler, M., Peifer, M., Hellwig, B.,

Guschlbauer, B., Lucking, C. H., Dahlhaus, R., & Timmer, J. (2005).

Testing for directed influences among neural signals using partial directed

cohoerence. Journal of Neuroscience Methods , 152 , 210–219.

Schreiber, T. (2000). Measuring information transfer. Physical Review Let-

ters , 85 , 461–464.

35

Page 37: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

Seth, A. K. (2005). Causal connectivity of evolved neural networks during

behavior. Network:Computation in Neural Systems , 16 , 35–54.

Seth, A. K. (2010). A matlab toolbox for granger causal connectivity analysis.

Journal of Neuroscience Methods , 186 , 262–273.

Sun, Q. (2013). Generation of process topology-based causal models . Msc

thesis Espoo, Finland.

Takahashi, D. Y., Baccala, L. A., & Sameshima, K. (2007). Connectivity

inference between neural structures via partial directed coherence. Journal

of Applied Statistics , 34 , 1255–1269.

Thambirajah, J., Benabbas, L., Bauer, M., & Thornhill, N. F. (2007). Cause

and effect analysis in chemical processes utilizing plant connectivity infor-

mation. In The IEEE Advanced Process Control Applications for Industry

Workshop (pp. 1–6). Vancouver, Canada.

Thambirajah, J., Benabbas, L., Bauer, M., & Thornhill, N. F. (2009). Cause-

and-effect analysis in chemical processes utilizing xml, plant connectivity

and quantitative process history. Computers and Chemical Engineering ,

33 , 503–512.

Venkatasubramanian, V., Rengaswamy, R., Kavuri, S. N., & Yin, K. (2003).

A review of process fault detection and diagnosis: Part iii: Process history

based methods. Computers and Chemical Engineering , 27 , 327–346.

Winterhalder, M., Schelter, B., Hesse, W., Schwab, K., Leistritz, L., Klan,

D., Bauer, R., Timmer, J., & Witte, H. (2005). Comparison of linear signal

36

Page 38: Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka … · Landman, Rinat; Kortela, Jukka; Sun, Qiang; Jämsä-Jounela, Sirkka-Liisa Fault propagation analysis of

processing techniques to infer directed interactions in multivariate neural

systems. Signal Processing , 85 , 2137–2160.

Yameshita, O., Sadato, N., Okada, T., & Ozaki, T. (2005). Evaluating

frequency-wise directed connectivity of bold signals applying relative power

contribution with the linear multivariate time-series models. NeuroImage,

25 , 478–490.

Yang, F., Shah, S. L., & Xiao, D. (2010a). SDG (signed directed graph) based

process description and fault propagation analysis for tailings pumping

process. In 13th Symposium on Automation in mining, Mineral and Metal

Processing (pp. 50–55). Cape Town, South Africa.

Yang, F., Shah, S. L., & Xiao, D. (2010b). Signed directed graph modeling

of industrial processes and their validation by data-based methods. In

Conference on Control and Fault Tolerant Systems (pp. 387–392). Nice,

France.

Yang, F., Shah, S. L., & Xiao, D. (2012). Signed directed graph based mod-

eling and its validation from process knowledge and process data. Interna-

tional Journal of Applied Mathematis and Computr Science, 22 , 387–392.

Yang, F., & Xiao, D. (2012). Progress in root cause and fault propagation

analysis of large-scale industrial processes. Journal of Control Science and

Engineering , 2012 , 1–10.

Yim, S., Ananthakumar, H., Benabbas, L., Horch, A., Drath, R., & Thorn-

hill, N. F. (2006). Using process topology in plant-wide control loop per-

formance assessment. Computers and Chemical Engineering , 31 , 86–99.

37