

Journal of Loss Prevention in the Process Industries 43 (2016) 106–119

Contents lists available at ScienceDirect

journal homepage: www.elsevier.com/locate/jlp

Main causes of long-standing alarms and their removal by dynamic state-based alarm systems*

Jiandong Wang a,*, Tongwen Chen b

a College of Engineering, Peking University, Beijing, China
b Dept. of Electrical & Computer Eng., Univ. of Alberta, Edmonton, Alberta, Canada

Article info

Article history:
Received 7 April 2016
Received in revised form 11 May 2016
Accepted 11 May 2016
Available online 14 May 2016

Keywords:
Industrial alarm systems
Long-standing alarms
Dynamic state-based alarm systems
Nuisance alarms

Abstract

Long-standing alarms are those in the alarm state continuously for a long period of time. Some long-standing alarms are nuisance alarms that degrade the performance of industrial alarm systems, and hence they should be removed. The paper analyzes the main causes that make long-standing alarms nuisance ones; industrial examples from a large-scale thermal power plant are provided as supporting evidence of the main causes. A dynamic state-based alarm system is designed to remove long-standing alarms caused by the inconsistency between the alarm design and discrete-valued operating states. The design is based on two rules formulated to select state variables and a novel alarm generation mechanism to generate state-based alarm variables. Industrial case studies illustrate the effectiveness of the dynamic state-based alarm system in significantly reducing the severity of long-standing alarms.

© 2016 Elsevier Ltd. All rights reserved.

* This research was partially supported by the National Natural Science Foundation of China under grant No. 61433001, and the Natural Sciences and Engineering Research Council of Canada under Grant No. CRDPJ-446412-12.
* Corresponding author.
E-mail addresses: [email protected] (J. Wang), [email protected] (T. Chen).

http://dx.doi.org/10.1016/j.jlp.2016.05.006
0950-4230/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

An industrial alarm system is the collection of hardware and software that generates, records and communicates alarm states to operators (ISA, 2009). Alarm systems are safeguards that prevent near misses from deteriorating into accidents. As implied by the safety pyramid in Fig. 1, almost every accident is associated with a number of near misses as precursors. Alarm systems are the tools with which industrial plant operators promptly detect the occurrences of near misses and take corrective actions to drive processes back to normal operating ranges. Retrospective investigations of a large number of accidents also support the importance of alarm systems for the safety of industrial plants. For instance, in the final report of the Buncefield accident (HSE, 1997, 2008), which is by far the most severe industrial accident in Europe, the 8th recommendation was to develop high-high level alarms for oil overfill prevention, and the 23rd recommendation was to collect accident data to find the defects of the installed alarm system. Therefore, industrial alarm systems have been well recognized as critical assets for the process safety of modern industrial plants in many industries such as power, chemical, oil-gas, petrochemical, and pulp and paper (Bransby and Jenkinson, 1998; Macdonald, 2004; Rothenberg, 2009; Pariyani et al., 2010; Stauffer and Clarke, 2016).

Despite the importance of alarm systems, industrial surveys have shown that industrial alarm systems often suffer from poor performance, in the sense of having too many alarms to be promptly handled by industrial plant operators (Bransby and Jenkinson, 1998; Rothenberg, 2009). This phenomenon of alarm overloading is clearly revealed by Table 1 (Rothenberg, 2009), which is based on a study of 39 industrial plants in the oil and gas, petrochemical, and power industries. The statistics of performance metrics such as average alarms per day are much larger than the benchmarks from the Engineering Equipment and Materials Users' Association (EEMUA).

The alarms that occur include an excessively large number of nuisance alarms, which are not associated with any abnormalities, so that no corrective actions are required from industrial plant operators. In contrast to a nuisance alarm, an informative alarm does require operators to take some corrective action; otherwise, abnormalities associated with informative alarms would have negative effects on operation safety and/or efficiency. Nuisance alarms are extremely detrimental to the important role played by alarm systems. Due to the "cry wolf" effect, operators do not trust alarm systems and are very likely to miss informative alarms that are buried among a large number of nuisance alarms. In order to remove nuisance alarms and achieve desired performance benchmarks, ANSI/ISA-18.2 presented ten stages for an alarm management lifecycle, namely, alarm philosophy, identification, rationalization, detailed design, implementation, operation, maintenance, monitoring and assessment, management of change, and audit (ISA, 2009).

Fig. 1. Safety pyramid with typical historical data (Pariyani et al., 2010).

Table 1. Cross-industry study (Rothenberg, 2009).

| Metric | EEMUA | Oil-gas | PetroChem | Power |
| Average alarms/day | 144 | 1200 | 1500 | 2000 |
| Peak alarms/10 min | 10 | 220 | 180 | 350 |
| Average standing alarms/day | 9 | 50 | 100 | 65 |

Long-standing alarms are the ones continuously remaining in the alarm state for a long period of time, e.g., 24 h. As clarified later in Section 3, many long-standing alarms are nuisance ones, so that they are culprits for the poor performance of industrial alarm systems. Long-standing alarms that are not nuisance alarms have to be checked and understood regularly, so that operators stay aware of the ongoing abnormal situations causing them. Long-standing alarms that are nuisance ones therefore add extra workload for operators; as a result, the operators may overlook the ongoing abnormal situations if the number of long-standing alarms is large. For example, an emergency shutdown of a power generation unit occurred at 23:49:13 on January 24, 2014 at a large-scale thermal power plant in Shandong Province in China; during the incident, there were 348 alarm variables in the alarm state, among which 96 alarm variables had been continuously in the alarm state for more than 24 h. Therefore, ANSI/ISA-18.2 imposed a performance metric that "there should be less than five long-standing alarms on any given day, with action plans to address them" (ISA, 2009). EEMUA-191 required a usability metric that the average number of long-standing alarms per day is no larger than nine, as given in Table 1 (EEMUA, 2013).

There are very few methods for handling long-standing alarms, despite wide recognition of their importance in practice. The industrial standard ANSI/ISA-18.2 stated that logic, programmatic, or state-based methods could be used to eliminate long-standing alarms (ISA, 2009). The guide EEMUA-191 suggested the usage of a maintenance shelf or one-shot shelving to deal with long-standing alarms (EEMUA, 2013). Hollifield and Habibi (2010) and Jerhotova et al. (2013) mentioned the implementation of some logic or state-based alarm methodologies, especially for shutdown states. Kim (1994), Hatch (2005), Arjomandi and Salahshoor (2011) and Beebe et al. (2013) recommended state-based alarming based on different operation modes such as startup, shutdown, full-rate and half-rate modes. However, no technical details were provided in these references. One possible reason for the shortage of related studies is that state-based alarming seems rather straightforward, e.g., if the device is in the shutdown state, then the alarm is turned off; however, such an intuitive design has a serious flaw in that it mistakenly ignores the previous status of alarm variables (to be clarified later in Example 4). There are also some related works on the design of alarm systems based on different process states, but they are not specific to handling long-standing alarms. Ghetie et al. (1998) used multiple binary decisions to generate an alarm signal with five states. Nihlwing and Kaarstad (2012) designed a state-based alarm system for the Halden Boiling Water Reactor Simulator, based on a number of well-defined process states, in order to detect the secondary disturbances earlier and more often. Ragsdale et al. (2012) showed that operators performed better and had more trust in three-state alarms ('OK', 'Warning', 'Alarm') than two-state alarms ('OK', 'Alarm'). Zhu et al. (2014a) obtained dynamic alarm trippoints depending upon multiple steady states and transitions between these states. Blaauwgeers et al. (2013) and Zhu et al. (2014b) recommended using dynamic alarm priorities for different process states and operational scenarios.

This paper presents two main contributions to the study of long-standing alarms. First, three main causes of long-standing alarms are identified, with a focus on those that turn long-standing alarms into nuisance ones. Industrial examples are provided as supporting evidence of the identified causes. Second, a dynamic state-based alarm system is designed to remove long-standing alarms caused by the inconsistency between the alarm design and discrete-valued operating states. In particular, two rules are formulated to select state variables based on historical data samples; a dynamic alarm generation mechanism is proposed to generate a state-based alarm variable by taking the previous status of the original alarm variable into consideration. To the best of our knowledge, the two rules and the dynamic state-based alarm generation mechanism are the first systematic techniques to handle long-standing alarms.

The rest of the paper is organized as follows. Section 2 gives some basic information on long-standing alarms. Section 3 analyzes the main causes of long-standing alarms. The dynamic state-based alarm system is designed in Section 4. Section 5 provides industrial case studies as illustrations. Some concluding remarks are presented in Section 6. A nomenclature section is given at the end of the paper.

2. Basics of long-standing alarms

This section presents the definitions of long-standing alarms, and proposes an index to quantify the severity level of long-standing alarms.

There are several closely-related definitions of long-standing alarms. Hollifield and Habibi (2010) treated long-standing alarms as those in the alarm state continuously for more than 24 h. ANSI/ISA-18.2 had a similar definition as "an alarm that remains in the alarm state for an extended period of time (e.g., 24 h)" (ISA, 2009). Rothenberg (2009) separated the definitions of stale alarms and long-standing alarms: the former as the alarms acknowledged but uncleared for 8–12 h, and the latter as the alarms acknowledged but uncleared for 24 or more hours. EEMUA-191 regarded a long-standing alarm as any alarm active for a full operating shift or longer (EEMUA, 2013). These definitions have a common feature of having large alarm durations, but are different in alarm duration thresholds.

Let $x_a(t)$ represent the value of an alarm variable at the time instant $th$, where $h$ is the real-valued sampling period of the computerized industrial alarm system, and $t$ is the integer-valued sample index. The alarm variable $x_a(t)$ takes the value of '1' ('0') when it is in the alarm (non-alarm) state. For instance, if a process variable $x(t)$ is configured with a high alarm with a trippoint $x_{tp}$, then $x_a(t)$ is generated as

$$x_a(t) = \begin{cases} 1, & \text{if } x(t) \ge x_{tp} \\ 0, & \text{if } x(t) < x_{tp} \end{cases} \tag{1}$$

Eq. (1) is applicable to both analog and digital process variables, where $x_{tp}$ is a real number if $x(t)$ is analog, and $x_{tp}$ takes the value of '1' or '0' if $x(t)$ is digital. The alarm duration, denoted as $T_{AD}$, is the number of consecutive '1's in $x_a(t)$, i.e.,

$$T_{AD} := t_2 - t_1 + 1, \tag{2}$$

where

$$x_a(t_1 - 1) = 0, \quad x_a(t_2 + 1) = 0, \quad \sum_{t=t_1}^{t_2} x_a(t) = t_2 - t_1 + 1, \quad \text{for } t_2 > t_1.$$

Here $T_{AD}$, $t_1$ and $t_2$ take integer values. Thus, the above definitions of long-standing alarms can be unified as:

Definition 1. If there is an alarm duration larger than a threshold $T_0$, i.e., $T_{AD} > T_0$, then a long-standing alarm is present.

A default threshold value in this context is $T_0 h = 8$ h, a common time interval for one shift of operators.
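Eq. (1), Eq. (2) and Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names and the toy data are hypothetical, and two simplifying assumptions are made, namely that runs of length one are counted and that a run cut off by the start or end of the record is still treated as an alarm duration.

```python
from typing import List, Sequence


def high_alarm(x: Sequence[float], x_tp: float) -> List[int]:
    """Eq. (1): the alarm variable is 1 when x(t) >= x_tp, else 0."""
    return [1 if value >= x_tp else 0 for value in x]


def alarm_durations(xa: Sequence[int]) -> List[int]:
    """Lengths, in samples, of the maximal runs of consecutive 1s in xa."""
    durations: List[int] = []
    run = 0
    for value in xa:
        if value == 1:
            run += 1
        elif run > 0:
            durations.append(run)
            run = 0
    if run > 0:  # assumption: a run truncated by the end of the record counts
        durations.append(run)
    return durations


def has_long_standing_alarm(xa: Sequence[int], t0_samples: int) -> bool:
    """Definition 1: some alarm duration exceeds the threshold T_0 (samples)."""
    return any(d > t0_samples for d in alarm_durations(xa))


# Toy data (hypothetical values, not taken from Plant A).
x = [1.0, 5.0, 6.0, 7.0, 2.0, 8.0, 1.0]
xa = high_alarm(x, x_tp=5.0)            # [0, 1, 1, 1, 0, 1, 0]
print(alarm_durations(xa))              # [3, 1]
print(has_long_standing_alarm(xa, 2))   # True, because 3 > 2
```

With a sampling period of $h = 1$ s, the default threshold $T_0 h = 8$ h corresponds to `t0_samples = 8 * 3600`.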

Definition 1 can be exploited to detect the presence of long-standing alarms. In practice, a large number of alarm variables may suffer from long-standing alarms. Thus, it is necessary to quantify the severities of long-standing alarms for different alarm variables, so that the alarm variables with higher severities can be dealt with first. According to Definition 1, we propose a long-standing alarm index for the alarm sequence $X := \{x_a(t)\}_{t=1}^{T_{max}}$ as

$$\eta = \frac{1}{T_{max}} \frac{1}{N} \sum_{i=1}^{N} T_{AD}(i). \tag{3}$$

Here $T_{AD}(i)$ is the $i$-th $T_{AD}$ defined in Eq. (2), $N$ is the total number of $T_{AD}(i)$'s, and $T_{max}$ is the data length of $X$. Clearly, $T_{max}$ is the largest possible value of $T_{AD}(i)$, $\forall i$. To make $\eta$ meaningful, $T_{max}$ should be no less than $T_0$ in Definition 1, i.e., $T_{max} \ge T_0$. The index $\eta$ is in the range [0,1]. A value of $\eta$ closer to one indicates that the alarm durations are larger on average, so that the severity of having long-standing alarms is higher. It is worth pointing out that the summation of all $T_{AD}(i)$'s should not be used as an index for long-standing alarms, because such an index is likely to misclassify chattering alarms having many small alarm durations as long-standing alarms.
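The contrast between the index of Eq. (3) and a plain sum of durations can be illustrated with a small Python sketch. The names and the two toy sequences are hypothetical; returning 0 when no alarm occurs is an assumption made here, since Eq. (3) is undefined for $N = 0$.

```python
from typing import List, Sequence


def alarm_durations(xa: Sequence[int]) -> List[int]:
    """Lengths, in samples, of the maximal runs of consecutive 1s in xa."""
    durations: List[int] = []
    run = 0
    for value in xa:
        if value == 1:
            run += 1
        elif run > 0:
            durations.append(run)
            run = 0
    if run > 0:
        durations.append(run)
    return durations


def long_standing_index(xa: Sequence[int]) -> float:
    """Eq. (3): mean alarm duration divided by the data length T_max."""
    t_max = len(xa)
    durations = alarm_durations(xa)
    if t_max == 0 or not durations:
        return 0.0  # assumption: no alarms means zero severity
    return sum(durations) / (len(durations) * t_max)


# Two hypothetical sequences of length 10.
standing = [0] + [1] * 8 + [0]   # one alarm lasting 8 samples
chattering = [1, 0] * 5          # five alarms lasting 1 sample each

print(sum(alarm_durations(standing)), long_standing_index(standing))      # 8 0.8
print(sum(alarm_durations(chattering)), long_standing_index(chattering))  # 5 0.1
```

The summed durations (8 versus 5 samples) are of similar size for the two sequences, whereas the index separates the single long alarm (0.8) from the chattering one (0.1).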

3. Main causes of long-standing alarms

This section identifies the main causes of long-standing alarms, and provides industrial examples as supporting evidence. The industrial examples are from the large-scale thermal power plant in Shandong Province in China, referred to as Plant A in the sequel. The involved signals have the sampling period $h = 1$ s, so that, with a slight abuse of notation, the sample index $t$ also denotes the time variable in seconds.

The causes of long-standing alarms can be diverse. Bransby and Jenkinson (1998) and EEMUA (2013) gave four reasons why an alarm may be long-standing:

• The alarm is indicative of an operational problem that the operator should do something about or of which the operator should be aware;

• The alarm is indicative of a problem about which others (typically maintenance staff) should do something;

• The alarm is indicative of a problem that cannot be resolved in the short term (for example, it may have to wait until an annual overhaul);

• The alarm is spurious in the sense that it requires no action and does not indicate an operational problem (i.e., it is not really an alarm).

The last reason implies that the resulting long-standing alarms are indeed one type of nuisance alarm, because a well-accepted criterion to distinguish nuisance alarms is that an informative alarm must require an operator response (ISA, 2009). By contrast, the long-standing alarms owing to the first three reasons are not nuisance alarms, because they do require corrective actions. Therefore, the long-standing alarms due to the last reason are the ones to be investigated and removed in this context. We refine the last reason by looking at more concrete causes for long-standing alarms being nuisance alarms, and identify the main causes of long-standing alarms as follows.

The first cause of long-standing alarms is that alarms are configured on variables that should not be configured with alarms. In modern computerized monitoring systems such as the distributed control system (DCS), alarm variables are very easily realized in a technical sense by clicking a mouse and entering alarm trippoint values. Such a quick alarm configuration is regarded as "free" without any cost, and is often implemented without a careful study. Hence, some alarm variables, which should not be configured with alarms, may lead to long-standing alarms.

Example 1. The two primary air fans at Plant A are controlled via two final control elements, namely, the dampers and frequency convertors. The dampers are usually exploited only at the start-up stage of the power generation unit. The frequency convertors take over the role of controlling the primary fans, when some conditions are satisfied, in order to have a high energy efficiency. The dampers and frequency convertors are controlled by two controllers running in parallel with the same control objective in terms of obtaining the desired primary air pressure. The manual mode variables of the two controllers are configured with alarms. That is, if the manual mode variables take the value of '1', then alarms occur, indicating that the controllers are in the manual mode. Obviously, the damper controller and the frequency convertor controller for the same primary air fan cannot be in the auto mode simultaneously. Long-standing alarms are unavoidable for one of the manual mode variables. As shown in Fig. 2, the primary air fan worked properly, under automatic control from the frequency convertor. Meanwhile, the damper was in the full open position, and the damper controller had been in the manual mode all the time, resulting in a long-standing alarm.

The second cause of long-standing alarms is that the design of alarm variables is inconsistent with discrete-valued operating states. The daily operation of industrial plants usually involves several states such as startup, half-rate and full-rate operations, each of which has different demands for equipment. Some equipment is intentionally stopped after industrial plants transit from one state to another. The alarm variables associated with such equipment are usually designed only for its running status; thus, long-standing alarms arise when the equipment is turned off.

Fig. 2. (a) Primary air pressure (solid) and reference signal for controllers (dash), (b) frequency convertor controller output (solid), damper controller output (dash), (c) manual mode variables for the frequency convertor (solid) and for the damper (dash) during March 1–7, 2014. [Plot not reproduced; axes: amplitude versus t (sec).]

Example 2. The 300 MW power generation unit at Plant A is equipped with three coal mills, namely, Mills A, B and C. Each mill is associated with two combustion zones composed of eight burners. An alarm variable is configured for the measurement of a digital sensor monitoring whether the burner works properly. That is, the alarm variable is in the alarm state if the burner has no fires. Owing to the variations of operational states, some mills may have to be switched off accordingly. As a result, there are no fires for the associated burners when the mills are off, and the alarm variables for the burners are in the alarm state for a long period of time. However, no actions are required from operators to address these long-standing alarms, so that they are clearly nuisance alarms. Fig. 3 presents the generated power of the entire unit, the electrical current variable and the off-status variable of the electrical motor for Mill C, and the alarm variable $x_a(t)$ for one burner of Mill C during March 1–7, 2014. Clearly, the power generation unit had been under normal operation, but Mill C was turned off and on on several occasions. When Mill C was turned off, the alarm variable would run into the alarm state for large time durations, e.g., the three largest alarm durations in Fig. 3-(c) were 9.13, 9.17 and 9.38 h, which were long-standing alarms according to Definition 1.

Fig. 3. (a) Generated power of the entire unit, (b) electrical current variable of the electrical motors for Mill C, (c) alarm variable (solid) for one burner, and off-status variable (dash, shifted upwards 0.2 for a better visualization) of the electrical motor for Mill C during March 1–7, 2014. [Plot not reproduced; axes: amplitude versus t (sec).]

The third cause of long-standing alarms is that alarm trippoints are inconsistent with operating states represented by continuous-valued process variables. Eq. (1) says that alarm variables are generated by comparing the measurements of process variables with the high and/or low alarm trippoints. In contemporary alarm systems, alarm trippoints usually take constant values. However, process variables may experience large-scale variations at different operating states that are represented by some continuous-valued process variables. The constant alarm trippoints are usually suitable only for one operating state, and hardly agree with other operating states, so that long-standing alarms may arise.

Example 3. The outlet steam temperature of the reheater is one of the critical process variables related to operational efficiency and safety. It is configured with low and high alarm trippoints 520 and 546, respectively. These constant alarm trippoints are designed for the power generation unit running at its full capacity around 300 MW. As shown in Fig. 4, the outlet steam temperature was often below the low alarm trippoint 520. The maximum alarm duration was equal to 12.01 h for the time interval $[1.11 \times 10^5, 1.54 \times 10^5]$ s, during which the generated power was about 195 MW. Hence, such a long-standing alarm results from the inconsistency between the low alarm trippoint 520 and the operating state that is represented by the generated power variable in MW.

Fig. 4. (a) Outlet steam temperature of the reheater (solid) with the low alarm trippoint 520 (dash), (b) alarm variable, (c) generated power of the entire unit during March 13–20, 2014. [Plot not reproduced; axes: amplitude versus t (sec).]

It is clear that the long-standing alarms arising from the three causes above are nuisance alarms, and their appearance should be avoided. For the long-standing alarms caused by the first main cause, a re-configuration of associated alarm variables is usually a feasible solution. Owing to the high difficulty of designing dynamic alarm trippoints, the long-standing alarms caused by the third cause are hard to handle and are left as future work. The next section designs a dynamic state-based alarm system to deal with long-standing alarms caused by the second main cause.

4. Dynamic state-based alarm system

This section designs a dynamic state-based alarm system to avoid the appearance of long-standing alarms caused by the inconsistency between the alarm design and discrete-valued operating states, which is the second main cause of long-standing alarms in Section 3. First, two major components of the dynamic state-based alarm system, namely, the state variable and the alarm generation mechanism, are discussed. Next, the design steps and the online implementation of the dynamic state-based alarm system are given.

4.1. State variable

The state variable, denoted as $x_s(t)$, is the one indicating the discrete-valued operating states; thus, it usually is a digital variable with binary values '0' and '1'. Without loss of generality, let $x_s(t) = 0$ stand for the state that does not affect the design of alarm variables.

The state variable is selected in two steps, namely, (i) the state variable candidates are selected based on related process knowledge, and (ii) the final choice among the state variable candidates is determined based on historical data samples. In the first step, once an alarm variable has been detected to have long-standing alarms via Definition 1 and the index $\eta$ in Eq. (3), several candidates of state variables can be found based on the physical principles of the equipment related to the alarm variable; see Example 6 later in Section 5 for illustration. Hence, this step is specific to each alarm variable. For the second step, a systematic way is developed as follows.

First, a qualification rule for state variables is formulated as:

Rule 1. If an alarm variable $x_a(t)$ is in the non-alarm state at time instant $(t_0 - 1)$, then the value change of the state variable $x_s(t)$ from '0' to '1' at the next time instant $t_0$ is always accompanied by an alarm occurrence, i.e., $x_a(t)$ switches to the alarm state at $t_0$.

The rationale of Rule 1 is based on the following observations on the relation between the alarm and state variables:

i. Whenever some equipment is switched off, the related alarm variable always runs into the alarm state, if the alarm variable previously is in the non-alarm state.

ii. The alarm variable may have already been in the alarm state, when the equipment is switched off.

iii. When a piece of equipment is switched on, the alarm variable does not necessarily switch to the non-alarm state.

iv. When the equipment is staying at the running or shutdown status, the alarm variable could be either in the alarm or non-alarm state.

Fig. 5. (a) Generated power of the entire unit, (b) current variable of the electrical motors for Mill C, (c) alarm variable (solid) for one burner, and off-status variable (dash, shifted upwards 0.2 for a better visualization) of the electrical motor for Mill C on March 24, 2014. [Plot not reproduced; axes: amplitude versus time (sec).]

Fig. 6. (a) Alarm variable $x_a(t)$, (b) state variable $x_s(t)$, (c) state-based alarm variables of Eq. (13) (solid) and of Eq. (14) (dash, shifted upwards by 0.2). [Plot not reproduced; axes: amplitude versus t (sec).]

The last three observations arise because the equipment status may be only one of the conditions for the occurrence and clearance of alarms. Fig. 3-(c) illustrates the above observation (i). Fig. 6-(a) and (b) (appearing later at the end of Section 4.2) lead to the observation (ii). Fig. 5, a counterpart of Fig. 3 in Example 2 during a different time period, clearly supports the observations (iii) and (iv).

Rule 1 can be evaluated based on the historical data of $x_a(t)$ and $x_s(t)$, by checking the consistency between the alarm occurrence and the event that $x_s(t)$ switches from '0' to '1'. In order to calculate the consistency, two relatives of $x_a(t)$ and $x_s(t)$ are required. The relative of $x_a(t)$, referred to as the alarm-on variable, takes the value '1' only at the time instant when $x_a(t)$ switches from the non-alarm state to the alarm state, i.e.,

$$x'_a(t) = \begin{cases} 1, & \text{if } x_a(t-1) = 0 \text{ and } x_a(t) = 1 \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

The relative of $x_s(t)$, named the state-on variable, is defined to take the value ‘1’ only when $x_s(t)$ changes from ‘0’ to ‘1’, subject to the condition in Rule 1 that $x_a(t-1)$ is in the non-alarm state, i.e.,

$$x'_s(t) = \begin{cases} 1, & \text{if } x_a(t-1) = 0,\ x_s(t-1) = 0 \text{ and } x_s(t) = 1 \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$

Mathematically, the consistency in Rule 1 is evaluated based on the two sequences $\{x'_a(t), x'_s(t)\}_{t=1}^{T_{\max}}$,

$$r = \frac{\sum_{t=1}^{T_{\max}} x'_s(t)\, x'_a(t)}{\sum_{t=1}^{T_{\max}} x'_s(t)}.$$

The consistency ratio $r$ is in the range [0,1]. In order to satisfy Rule 1, $r$ must take the value ‘1’; otherwise, Rule 1 does not hold, so that $x_s(t)$ is not a valid state variable for $x_a(t)$.
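As an illustration, Eqs. (4) and (5) and the consistency ratio $r$ can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names are hypothetical, the sequences are assumed to be lists of 0/1 samples, and the first sample (which has no predecessor) is assumed to yield ‘0’ in both relatives.

```python
from typing import List, Optional, Sequence


def alarm_on(xa: Sequence[int]) -> List[int]:
    """Alarm-on variable of Eq. (4): '1' only where xa switches from 0 to 1."""
    # Assumption: the first sample has no predecessor, so it is set to 0.
    return [0] + [int(xa[t - 1] == 0 and xa[t] == 1) for t in range(1, len(xa))]


def state_on(xa: Sequence[int], xs: Sequence[int]) -> List[int]:
    """State-on variable of Eq. (5): xs switches 0 -> 1 while xa(t-1) is 0."""
    return [0] + [
        int(xa[t - 1] == 0 and xs[t - 1] == 0 and xs[t] == 1)
        for t in range(1, len(xs))
    ]


def consistency_ratio(xs_on: Sequence[int], xa_on: Sequence[int]) -> Optional[float]:
    """Consistency ratio r; None when the state-on variable never takes '1'."""
    denominator = sum(xs_on)
    if denominator == 0:
        return None  # r is undefined without any state-on event
    return sum(s * a for s, a in zip(xs_on, xa_on)) / denominator


# Synthetic data: the alarm at t = 2 coincides with the state change,
# while the alarm at t = 6 has some other cause.
xa = [0, 0, 1, 1, 0, 0, 1, 1]
xs = [0, 0, 1, 1, 0, 0, 0, 0]
r = consistency_ratio(state_on(xa, xs), alarm_on(xa))
```

In this synthetic case every state-on event is matched by an alarm-on event at the same sample, so Rule 1 holds; the second alarm does not affect $r$ because only the ‘1’s of the state-on variable enter the denominator.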

Owing to the potential dynamics between $x_a(t)$ and $x_s(t)$, as well as the presence of stochastic noises/disturbances, there may be a random delay $\tau$ between a ‘1’ in $x'_s(t)$ and the corresponding ‘1’ in $x'_a(t)$. Hence, Rule 1 needs to be revised as follows.

Rule 2 Each ‘1’ in $x'_s(t)$ at the time instant $t_0$ is always accompanied by a subsequent alarm event $x'_a(t) = 1$ at the time instant $(t_0 + \tau) \in [t_0, t_1)$, where $t_1$ is the time instant at which $x_s(t)$ switches back from ‘1’ to ‘0’, i.e., $x_s(t) = 1$ for $t \in [t_0, t_1)$ and $x_s(t_1) = 0$.

To tolerate such a random delay $\tau$, additional ‘1’s are appended in front of each ‘1’ of $x'_a(t)$, e.g., $x'_a(t) = \cdots, 0, \underset{\uparrow}{1}, 0, \cdots$ becomes

$$x'_{a,L}(t) = \cdots, 0, \underbrace{1, \cdots, 1}_{L}, \underset{\uparrow}{1}, 0, \cdots. \tag{6}$$

Here $L$ is the number of additional ‘1’s, and should be smaller than $L_{\max}$, the minimum value of the “alarm durations” of $x_s(t)$, i.e.,

$$L_{\max} = \min_i T_{AD,x_s}(i), \tag{7}$$

where $T_{AD,x_s}(i)$ denotes the sequence of “alarm durations” obtained by applying Eq. (2) to $x_s(t)$. Thus, the consistency ratio $r$ becomes

$$r(L) = \frac{\sum_{t=1}^{T_{\max}} x'_s(t)\, x'_{a,L}(t)}{\sum_{t=1}^{T_{\max}} x'_s(t)}. \tag{8}$$
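The delay-tolerant check in Eqs. (6) to (8) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, and the “alarm durations” of $x_s(t)$ are assumed to be the lengths, in samples, of its runs of consecutive ‘1’s (Eq. (2) itself is defined earlier in the paper).

```python
from typing import List, Optional, Sequence


def extend_alarm_on(xa_on: Sequence[int], L: int) -> List[int]:
    """Eq. (6): place L additional '1's in front of each '1' of the alarm-on variable."""
    out = list(xa_on)
    for t, value in enumerate(xa_on):
        if value == 1:
            for k in range(max(0, t - L), t):
                out[k] = 1
    return out


def run_lengths_of_ones(x: Sequence[int]) -> List[int]:
    """Assumed stand-in for Eq. (2): lengths of the runs of consecutive '1's."""
    runs, count = [], 0
    for value in x:
        if value == 1:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def ratio(xs_on: Sequence[int], xa_on_L: Sequence[int]) -> float:
    """Eq. (8); the caller guarantees at least one '1' in xs_on."""
    return sum(s * a for s, a in zip(xs_on, xa_on_L)) / sum(xs_on)


def smallest_valid_L(xs: Sequence[int], xs_on: Sequence[int],
                     xa_on: Sequence[int]) -> Optional[int]:
    """Smallest L < L_max (Eq. (7)) with r(L) = 1, or None if none exists."""
    runs = run_lengths_of_ones(xs)
    if not runs or sum(xs_on) == 0:
        return None
    L_max = min(runs)
    for L in range(L_max):
        if ratio(xs_on, extend_alarm_on(xa_on, L)) == 1.0:
            return L
    return None


# Synthetic data: the alarm occurs two samples after the state change.
xs = [0, 0, 1, 1, 1, 1, 0, 0]
xs_on = [0, 0, 1, 0, 0, 0, 0, 0]
xa_on = [0, 0, 0, 0, 1, 0, 0, 0]
L_found = smallest_valid_L(xs, xs_on, xa_on)
```

With a delay of two samples, $r(0)$ and $r(1)$ are zero and $r(2)$ reaches ‘1’, so two additional ‘1’s are enough here; the search stops below $L_{\max} = 4$, the length of the only run of ‘1’s in this $x_s(t)$.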

Fig. 7. Flowchart of the steps to design a dynamic state-based alarm system.

It is possible that there are multiple state variables $x_{s,i}(t)$ for $i = 1, 2, \ldots, M$ satisfying the hard requirement $r = 1$. In this case, a soft requirement is imposed on the number of ‘1’s in $x'_{s,i}(t)$, namely, the denominator in Eq. (8). A larger denominator implies that more long-standing alarms could be removed by using the dynamic state-based alarm system. Therefore, the state variable having the largest denominator in Eq. (8) is the final choice, i.e.,

$$x_s(t) = \arg\max_{x_{s,i}(t)} \left\{ \sum_{t=1}^{T_{\max}} x'_{s,i}(t) \right\}_{i=1}^{M}. \tag{9}$$
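The selection in Eq. (9) amounts to a filter on the hard requirement followed by a maximum over the counts of ‘1’s. The sketch below is illustrative only: the function name, the dictionary layout and the candidate data are hypothetical, and the consistency ratios are assumed to have been computed beforehand via Eq. (8).

```python
from typing import Dict, Optional, Sequence, Tuple


def select_state_variable(
    candidates: Dict[str, Tuple[float, Sequence[int]]]
) -> Optional[str]:
    """Pick the candidate with r = 1 whose state-on variable has the most '1's.

    `candidates` maps a candidate name to (consistency ratio, state-on sequence).
    Returns None when no candidate meets the hard requirement r = 1.
    """
    counts = {
        name: sum(xs_on)
        for name, (r, xs_on) in candidates.items()
        if r == 1.0  # hard requirement
    }
    if not counts:
        return None
    return max(counts, key=counts.get)  # soft requirement, Eq. (9)


# Hypothetical candidates: two satisfy r = 1, one does not.
candidates = {
    "xs_1": (1.0, [0, 1, 0, 1, 0, 1]),
    "xs_2": (1.0, [0, 1, 0, 0, 0, 0]),
    "xs_3": (0.5, [0, 1, 0, 1, 0, 0]),
}
chosen = select_state_variable(candidates)
```

Here `xs_1` is chosen because it satisfies the hard requirement and its state-on variable contains three ‘1’s against one for `xs_2`.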

4.2. Alarm generation mechanism

The alarm generation mechanism generates the state-based alarm variable $x_{a,s}(t)$ based on the alarm variable $x_a(t)$ and the state variable $x_s(t)$. It takes the following form:

$$x_{a,s}(t) = f\left(\tilde{x}_{a,s}(t)\right), \tag{10}$$

where

$$\tilde{x}_{a,s}(t) = \begin{cases} 0, & \text{if } x_s(t) = 1 \text{ and } \tilde{x}_{a,s}(t-1) = 0 \\ x_a(t), & \text{otherwise.} \end{cases} \tag{11}$$

The alarm functions $f(\cdot)$ commonly adopted in practice include the filter, deadband and delay timer (ISA, 2009; Rothenberg, 2009). The delay timer is recommended as the choice of $f(\cdot)$, because it works directly on alarm variables, while the other functions require the presence of analog process variables. The $m$-sample delay timer raises (clears) an alarm if and only if $m$ consecutive samples of $\tilde{x}_{a,s}(t)$ are ‘1’s (‘0’s), i.e.,

$$x_{a,s}(t) = \begin{cases} 1, & \text{if } \tilde{x}_{a,s}(t-m+1:t) = 1 \text{ and } x_{a,s}(t-1) = 0 \\ 0, & \text{if } \tilde{x}_{a,s}(t-m+1:t) = 0 \text{ and } x_{a,s}(t-1) = 1 \\ x_{a,s}(t-1), & \text{otherwise,} \end{cases} \tag{12}$$

where $\tilde{x}_{a,s}(t-m+1:t)$ denotes the set $\{\tilde{x}_{a,s}(t-m+1), \cdots, \tilde{x}_{a,s}(t)\}$. The factor $m$ can be designed in a systematic manner based on the performance indices of alarm systems, such as the false alarm rate and the missed alarm rate; see Wang and Chen (2014) for the method to design $m$.
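A minimal Python sketch of the $m$-sample delay timer in Eq. (12) is given below. The function name and the handling of the first $m-1$ samples are assumptions made for illustration: until a full window of $m$ samples is available, the output is assumed to hold its initial value.

```python
from typing import List, Sequence


def delay_timer(x_tilde: Sequence[int], m: int, initial: int = 0) -> List[int]:
    """m-sample delay timer of Eq. (12) applied to a 0/1 sequence."""
    out: List[int] = []
    prev = initial
    for t in range(len(x_tilde)):
        window = x_tilde[max(0, t - m + 1): t + 1]
        full = len(window) == m  # assumption: no decision before m samples exist
        if full and prev == 0 and all(v == 1 for v in window):
            current = 1  # raise the alarm after m consecutive '1's
        elif full and prev == 1 and all(v == 0 for v in window):
            current = 0  # clear the alarm after m consecutive '0's
        else:
            current = prev  # otherwise hold the previous value
        out.append(current)
        prev = current
    return out


# Synthetic sequence with short chattering pulses around one sustained alarm.
x_tilde = [0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0]
filtered = delay_timer(x_tilde, m=3)
```

In this sequence the isolated ‘1’ at the second sample never raises an alarm, and the alarm raised after three consecutive ‘1’s is cleared only after three consecutive ‘0’s. With $m = 1$ the timer reduces to the identity, which is the simplification used in Eq. (13).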

In this context, the design of $f(\cdot)$ or $m$ is not the main concern. Without loss of generality, we choose $m = 1$, or equivalently, $x_{a,s}(t) = \tilde{x}_{a,s}(t)$, to avoid the complexities arising from the design of $f(\cdot)$ or $m$. Thus, Eq. (10) is simplified to

$$x_{a,s}(t) = \begin{cases} 0, & \text{if } x_s(t) = 1 \text{ and } x_{a,s}(t-1) = 0 \\ x_a(t), & \text{otherwise.} \end{cases} \tag{13}$$

One special feature of Eq. (13) lies in the dependence of $x_{a,s}(t)$ on its previous sample $x_{a,s}(t-1)$, i.e., Eq. (13) is a dynamic system. This feature makes $x_{a,s}(t)$ in Eq. (13) very different from an intuitive design of the state-based alarm system as

$$x_{a,s}(t) = \begin{cases} 0, & \text{if } x_s(t) = 1 \\ x_a(t), & \text{if } x_s(t) = 0. \end{cases} \tag{14}$$

Fig. 8. Flowchart of the online implementation of the dynamic state-based alarm system.

Fig. 9. (a) Average values of $\eta$, (b) histogram of the average values of $\eta$, (c) maximum values of $\eta$, (d) histogram of the maximum values of $\eta$, for alarm variables having non-zero alarm durations during March 1–31, 2014 for Example 5.

The state-based alarm variable $x_{a,s}(t)$ in Eq. (14) is not a proper design, because it is a static alarm generation mechanism and does not consider the previous sample of $x_a(t)$, which is indispensable as implied by Rules 1 and 2. By contrast, $x_{a,s}(t)$ in Eq. (13) depends on the previous sample of $x_a(t)$. That is, $x_a(t)$ is not passed on to $x_{a,s}(t)$ in Eq. (13) only for the time interval $t \in [t_0, t_1]$ such that $x_a(t_0 - 1) = 0$ and $x_s(t_0 : t_1) = 1$, i.e.,

$$x_{a,s}(t) = 0, \quad \text{for } t \in [t_0, t_1], \text{ where } x_a(t_0 - 1) = 0 \text{ and } x_s(t_0 : t_1) = 1. \tag{15}$$

Fig. 10. Number of alarm variables having long-standing alarms every 8 h during March 1–31, 2014 for Example 5.

Eq. (15) says that if and only if other conditions causing $x_a(t) = 1$ do not hold, then the condition $x_s(t) = 1$ plays its role by setting $x_{a,s}(t) = 0$. Therefore, $x_{a,s}(t)$ in Eq. (13) is physically reasonable, while $x_{a,s}(t)$ in Eq. (14) is not. This is numerically illustrated in the following example.
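The contrast between Eqs. (13) and (14) can also be reproduced on a short synthetic sequence, as a complement to the industrial data. The sketch below is illustrative only: the function names and the data are hypothetical, and the initial value $x_{a,s}(0)$ is assumed to be ‘0’.

```python
from typing import List, Sequence


def dynamic_state_based(xa: Sequence[int], xs: Sequence[int],
                        initial: int = 0) -> List[int]:
    """Eq. (13): suppress xa only if xs = 1 and the previous output was 0."""
    out: List[int] = []
    prev = initial
    for a, s in zip(xa, xs):
        current = 0 if (s == 1 and prev == 0) else a
        out.append(current)
        prev = current
    return out


def static_state_based(xa: Sequence[int], xs: Sequence[int]) -> List[int]:
    """Eq. (14): suppress xa whenever xs = 1, regardless of history."""
    return [0 if s == 1 else a for a, s in zip(xa, xs)]


def count_alarm_occurrences(x: Sequence[int]) -> int:
    """Number of 0 -> 1 switches, i.e., alarm occurrences."""
    return sum(1 for t in range(1, len(x)) if x[t - 1] == 0 and x[t] == 1)


# First occasion (t = 2..4): xs switches on while xa is in the non-alarm state.
# Second occasion (t = 9..10): xs switches on while xa is already in alarm.
xa = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
xs = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]

dynamic = dynamic_state_based(xa, xs)
static = static_state_based(xa, xs)
```

On the first occasion both mechanisms suppress the alarm. On the second occasion the dynamic mechanism leaves the existing alarm untouched, whereas the static mechanism clears it while $x_s(t) = 1$ and raises it again afterwards, producing an extra alarm occurrence that has no physical cause.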

Example 4. The industrial data samples are from Example 6 appearing later in Section 5, where the alarm variable is associated with the speed of one coal feeder, and the state variable is the running mode variable of the electrical motor. Fig. 6 compares the two state-based alarm variables $x_{a,s}(t)$ in Eq. (13) and $x_{a,s}(t)$ in Eq. (14). The state variable $x_s(t)$ experienced value changes from ‘0’ to ‘1’ on two occasions. On the first occasion, about $t \in [75, 80]$, $x_{a,s}(t)$ in Eq. (13) and $x_{a,s}(t)$ in Eq. (14) are the same. However, around the second occasion $t \in [135, 155]$, the alarm variable $x_a(t)$ had already been in the alarm state, possibly owing to some conditions other than $x_s(t) = 1$ forcing $x_a(t)$ into the alarm state. As a result, $x_{a,s}(t)$ in Eq. (13) was not affected by the value change of $x_s(t)$. By contrast, $x_{a,s}(t)$ in Eq. (14) introduced an artificial alarm occurrence that was solely due to the improper design of the state-based alarm system in Eq. (14).

Table 2. Comparison of long-standing alarm indices $\eta$ for 35 alarm variables and their state-based counterparts in the boiler combustion system during May 1–7, 2014.

| # | Alarm variable | State variable | $\eta$ for $x_a(t)$ | $\eta$ for $x_{a,s}(t)$ |
|---|---|---|---|---|
| 1 | Mill C-#1 coal feeder speed | Mill C-#1 electrical motor off | 1 | 0 |
| 2 | Mill C-#2 coal feeder speed | Mill C-#2 electrical motor off | 1 | 0 |
| 3 | Mill C-#1 air pressure | Mill C-#1 air valve close | 1 | 0 |
| 4 | Mill C-#2 air pressure | Mill C-#2 air valve close | 1 | 0 |
| 5 | Mill C-C1-1 seal valve | Mill C-C1-1 off | 1 | 0 |
| 6 | Mill C-C1-2 seal valve | Mill C-C1-2 off | 1 | 0 |
| 7 | Mill C-C1-3 seal valve | Mill C-C1-3 off | 1 | 0 |
| 8 | Mill C-C1-4 seal valve | Mill C-C1-4 off | 1 | 0 |
| 9 | Mill C-C2-1 seal valve | Mill C-C2-1 off | 1 | 0 |
| 10 | Mill C-C2-2 seal valve | Mill C-C2-2 off | 1 | 0 |
| 11 | Mill C-C2-3 seal valve | Mill C-C2-3 off | 1 | 0 |
| 12 | Mill C-C2-4 seal valve | Mill C-C2-4 off | 1 | 0 |
| 13 | Oil A1 fire | Oil A1 inlet valve close | 1 | 0 |
| 14 | Oil A2 fire | Oil A2 inlet valve close | 1 | 0 |
| 15 | Oil A3 fire | Oil A3 inlet valve close | 0.99 | 0 |
| 16 | Oil A4 fire | Oil A4 inlet valve close | 0.96 | 0 |
| 17 | Oil B1 fire | Oil B1 inlet valve close | 0.94 | 0 |
| 18 | Oil B2 fire | Oil B2 inlet valve close | 1 | 0 |
| 19 | Oil B3 fire | Oil B3 inlet valve close | 0.92 | 0 |
| 20 | Oil B4 fire | Oil B4 inlet valve close | 0.94 | 0 |
| 21 | Oil C1 fire | Oil C1 inlet valve close | 1 | 0 |
| 22 | Oil C2 fire | Oil C2 inlet valve close | 1 | 0 |
| 23 | Oil C3 fire | Oil C3 inlet valve close | 1 | 0 |
| 24 | Oil C4 fire | Oil C4 inlet valve close | 1 | 0 |
| 25 | Coal C1-1 fire | Coal C1-1 inlet valve close | 1 | 0 |
| 26 | Coal C1-2 fire | Coal C1-2 inlet valve close | 1 | 0 |
| 27 | Coal C1-3 fire | Coal C1-3 inlet valve close | 1 | 0 |
| 28 | Coal C1-4 fire | Coal C1-4 inlet valve close | 1 | 0 |
| 29 | Coal C2-1 fire | Coal C2-1 inlet valve close | 1 | 0 |
| 30 | Coal C2-2 fire | Coal C2-2 inlet valve close | 1 | 0 |
| 31 | Coal C2-3 fire | Coal C2-3 inlet valve close | 1 | 0 |
| 32 | Coal C2-4 fire | Coal C2-4 inlet valve close | 1 | 0 |
| 33 | Coal A1-2 fire | Coal A1-2 inlet valve close | 0.23 | 0 |
| 34 | Coal A1-4 fire | Coal A1-4 inlet valve close | 0.31 | 0 |
| 35 | Coal B2-3 fire | Coal B2-3 inlet valve close | 0.18 | 0 |

4.3. Steps of designing and implementing dynamic state-based alarm systems

The following steps are taken to design a dynamic state-based alarm system. The steps are also shown in the flowchart in Fig. 7.

Step 1 Collect the historical data samples to formulate the sequences of alarm and state variables, $\{x_a(t)\}_{t=1}^{T_{\max}}$ and $\{x_s(t)\}_{t=1}^{T_{\max}}$, as well as their relatives $\{x'_a(t)\}_{t=1}^{T_{\max}}$ in Eq. (4) and $\{x'_s(t)\}_{t=1}^{T_{\max}}$ in Eq. (5).

Step 2 Calculate the “alarm durations” in Eq. (2) for $\{x_s(t)\}_{t=1}^{T_{\max}}$ to obtain $L_{\max}$ in Eq. (7), and increase the number $L$ of additional ‘1’s appended to $x'_a(t)$ to obtain $x'_{a,L}(t)$ in Eq. (6), until the consistency ratio $r$ in Eq. (8) is equal to 1 or $L \geq L_{\max}$; if the hard requirement $r = 1$ cannot be satisfied, then $x_s(t)$ is not a valid state variable, and the design procedure stops. If there are multiple state variables satisfying $r = 1$, then Eq. (9) yields the final choice of the state variable.

Step 3 Generate the intermediate alarm variable $\tilde{x}_{a,s}(t)$ in Eq. (11).

Step 4 Design the delay timer factor $m$ based on $\tilde{x}_{a,s}(t)$ by using the method in Wang and Chen (2014).

Fig. 11. (a) Process variable $x(t)$ (solid) and its low alarm trippoint $x_{tp} = 10$ (dash), (b) alarm variable $x_a(t)$, (c) long-standing-alarm index $\eta$ (every 8 h), (d) histogram of alarm durations $T_{AD}$ during March 1–31, 2014 for Example 6.

Fig. 12. A schematic diagram of the coal feeder.

The dynamic state-based alarm system is implemented in an online manner as shown in the flowchart in Fig. 8. The system is composed of three parts. The first part is an online mechanism, involving the blocks with dash boundary lines in Fig. 8, to monitor the validity of $x_s(t)$ by checking whether Rule 2 is satisfied. If $x'_s(t)$ takes the value ‘1’, then the search for the condition $x'_a(t) = 1$ is initiated; if such a condition is not found before $x'_s(t)$ takes the value ‘1’ again, then $x_s(t)$ is not a valid state variable and the dynamic state-based alarm system has to be suspended. The second part, composed of the blocks with dot-dash boundary lines in Fig. 8, is the formulation of the state-based alarm variable $\tilde{x}_{a,s}(t)$ in Eq. (11). The third part, composed of the blocks with thick solid boundary lines, is the implementation of the $m$-sample delay timer in Eq. (12). The three parts only involve simple algebraic computations that can be easily implemented in DCSs.
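The three parts of the online implementation can be sketched as a small sample-by-sample class. This is a hedged sketch and not the DCS logic deployed at Plant A: the class and attribute names are hypothetical, an alarm-on event at the same sample as the state-on event is assumed to satisfy Rule 2, and suspension is assumed to mean that $x_a(t)$ is passed unchanged to the delay timer.

```python
from collections import deque


class OnlineStateBasedAlarm:
    """Sample-by-sample sketch of the three parts described for Fig. 8."""

    def __init__(self, m: int = 1) -> None:
        self.m = m
        self.prev_xa = 0
        self.prev_xs = 0
        self.prev_tilde = 0
        self.prev_out = 0
        self.window = deque(maxlen=m)
        self.pending = False  # a state-on event is waiting for its alarm-on event
        self.valid = True     # False once Rule 2 has been violated

    def step(self, xa: int, xs: int) -> int:
        xa_on = self.prev_xa == 0 and xa == 1                         # Eq. (4)
        xs_on = self.prev_xa == 0 and self.prev_xs == 0 and xs == 1   # Eq. (5)

        # Part 1: monitor the validity of xs(t) against Rule 2.
        if xs_on:
            if self.pending:
                self.valid = False  # previous state-on event was never matched
            self.pending = True
        if xa_on:
            self.pending = False

        # Part 2: intermediate variable of Eq. (11); assumed pass-through if suspended.
        if self.valid and xs == 1 and self.prev_tilde == 0:
            tilde = 0
        else:
            tilde = xa

        # Part 3: m-sample delay timer of Eq. (12).
        self.window.append(tilde)
        full = len(self.window) == self.m
        if full and self.prev_out == 0 and all(v == 1 for v in self.window):
            out = 1
        elif full and self.prev_out == 1 and all(v == 0 for v in self.window):
            out = 0
        else:
            out = self.prev_out

        self.prev_xa, self.prev_xs = xa, xs
        self.prev_tilde, self.prev_out = tilde, out
        return out
```

A typical use would create one instance per alarm variable and call `step` once per sampling period with the current values of $x_a(t)$ and $x_s(t)$. Only a handful of stored samples and comparisons are involved, in line with the remark that the computations are simple.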

5. Industrial case studies

This section presents two industrial case studies. The first one is on the overall status of long-standing alarms at Plant A, in order to see the severity of long-standing alarms in practice and to show the effectiveness of the proposed dynamic state-based alarm system. The second case study illustrates the design and implementation of a dynamic state-based alarm system for one representative alarm variable having long-standing alarms.

Example 5. This example investigates the long-standing alarms for all alarm variables in the industrial alarm system of the 300 MW power generation unit at Plant A. The alarm duration threshold $T_0$ in Definition 1 is chosen as $T_0 = 8$ h. The long-standing alarm index $\eta$ in Eq. (3) is calculated every 8 h for the 523 alarm variables that experienced alarm occurrences in the 31 days of March 1–31, 2014. Fig. 9 presents the average and maximum values of $\eta$ for each alarm variable, and the histograms of these average and maximum values. There are 137 alarm variables whose average value of $\eta$ equals 1, implying that they are in the alarm state all the time in the entire 31 days. An additional 116 alarm variables have average values of $\eta$ less than 1 but maximum values of $\eta$ equal to 1, meaning that Definition 1 is satisfied and long-standing alarms are present. In total, 253 alarm variables have long-standing alarms in the 31 days. Fig. 10 shows the number of alarm variables having long-standing alarms every 8 h; that is, each of these alarm variables has at least one alarm duration equal to $T_0 = 8$ h in the corresponding 8 h period. The maximum, average and minimum values in Fig. 10 are 172, 156 and 93, respectively. These numbers far exceed the benchmark of 5 to 10 alarm variables having long-standing alarms in ANSI/ISA-18.2 and EEMUA-191.

Fig. 13. (a) $x_a(t)$ (solid), $x_{s,1}(t)$ (dash), (b) $x'_a(t)$ (solid), $x'_{s,1}(t)$ (dash), (c) histogram of $\tau$, (d) $r(L)$ for $x_{s,1}(t)$ during March 1–31, 2014 for Example 6.

Fig. 14. (a) $x_a(t)$ (solid), $x_{s,2}(t)$ (dash), (b) $x'_a(t)$ (solid), $x'_{s,2}(t)$ (dash), (c) histogram of $\tau$, (d) $r(L)$ for $x_{s,2}(t)$ during March 1–31, 2014 for Example 6.

Fig. 15. (a) $x_a(t)$ (solid), $x_{s,3}(t)$ (dash), (b) $x'_a(t)$ (solid), $x'_{s,3}(t)$ (dash), (c) histogram of $\tau$, (d) $r(L)$ for $x_{s,3}(t)$ during March 1–31, 2014 for Example 6.

Fig. 16. (a) $x(t)$ (solid), $x_{tp} = 10$ (dash), (b) $x_{s,3}(t)$ (solid), $x'_{s,3}(t)$ (dash), (c) $x_a(t)$ (solid) for some enlarged parts during March 1–31, 2014 for Example 6.

The proposed dynamic state-based alarm system has been implemented online for a large number of alarm variables at Plant A. Table 2 lists 35 alarm variables and their state variables in the boiler combustion system having long-standing alarms during May 1–7, 2014. The long-standing-alarm indices for the first 32 alarm variables $x_a(t)$ are equal or close to 1, and the last 3 alarm variables have less severe long-standing alarms. The long-standing-alarm indices $\eta$ in Eq. (3) for the state-based counterparts $x_{a,s}(t)$ are all equal to 0, which shows the effectiveness of the designed dynamic state-based alarm system.

Example 6. One representative alarm variable $x_a(t)$ having long-standing alarms is the one associated with the speed of one coal feeder for Mill C at Plant A (the first one in Table 2). Since Mill C is designed as the backup mill, it is often switched on or off subject to the operation variations; so are the two coal feeders equipped for Mill C. The process variable $x(t)$ (the speed of one coal feeder) is configured with the alarm variable $x_a(t)$ using a low alarm trippoint $x_{tp} = 10$. Fig. 11 presents the time trends of $x(t)$ and $x_a(t)$, the evolution of the long-standing-alarm index $\eta$ in Eq. (3) for every 8 h, and the histogram of the alarm durations $T_{AD}$ in the 31 days of March 1–31, 2014. Some of the alarm durations are equal to the maximum alarm duration $T_{\max} = T_0 = 8$ h, and the index $\eta$ reaches the value ‘1’ several times, both of which indicate the presence of long-standing alarms in $x_a(t)$.

Fig. 12 is a schematic diagram of the coal feeder, with three major components, namely, the electrical motor, the outlet plate and the inlet plate, which may directly affect the coal feeder speed $x(t)$. Three candidates of the state variable are obtained based on this process knowledge, namely, the off-status variable $x_{s,1}(t)$ of the electrical motor, the position status variable $x_{s,2}(t)$ of the outlet plate, and the position status variable $x_{s,3}(t)$ of the inlet plate. All three candidates are binary-valued, with the value ‘1’ standing for the shutdown/close status. Fig. 13 gives the time trends of $x_a(t)$ and $x_{s,1}(t)$, those of the alarm-on variable $x'_a(t)$ and state-on variable $x'_{s,1}(t)$, the histogram of the random delay $\tau$ between ‘1’s in $x'_a(t)$ and $x'_{s,1}(t)$, and the consistency ratio $r(L)$ in Eq. (8). Note that for better visualization, $x_{s,1}(t)$ in Fig. 13-(a) is shifted upwards by 0.2, i.e., $x_{s,1}(t) + 0.2$.

Figs. 14 and 15 are the counterparts of Fig. 13 for $x_{s,2}(t)$ and $x_{s,3}(t)$, respectively. The consistency ratio in Fig. 15-(d) does not reach the value ‘1’, so Rule 2 is not satisfied. Thus, $x_{s,3}(t)$ is an invalid state variable. Fig. 16 is an enlarged view of the related variables around $t_0 = 92{,}215$ s, where $x'_{s,3}(t_0) = 1$ breaks Rule 2. By contrast, the consistency ratios in Fig. 13-(d) and Fig. 14-(d) reach the value ‘1’, meaning that both $x_{s,1}(t)$ and $x_{s,2}(t)$ satisfy the hard requirement in Rule 2. However, $x'_{s,2}(t)$ contains only one single ‘1’ as shown in Fig. 14-(b), while $x'_{s,1}(t)$ takes the value ‘1’ at 47 time instants in Fig. 13-(b), the same as the number of ‘1’s in $x'_a(t)$. Hence, Eq. (9) gives $x_{s,1}(t)$ as the final choice of the state variable, because the dynamic state-based alarm system based on $x_{s,1}(t)$ can remove more long-standing alarms.

Fig. 17. Histograms of alarm durations and long-standing-alarm indices $\eta$ for $x_a(t)$ ((a) and (d)), $x_{a,s,1}(t)$ ((b) and (e)), and $x_{a,s,2}(t)$ ((c) and (f)) during April 1–30, 2014 for Example 6.

Fig. 18. (a) $x(t)$ (solid), $x_{tp} = 10$ (dash), (b) $x_{s,1}(t)$ (solid), $x'_{s,1}(t)$ (dash), (c) $x_a(t)$ (solid), $x_{a,s,1}(t)$ (dash) during April 1–30, 2014 for Example 6.

Fig. 19. (a) $x(t)$ (solid), $x_{tp} = 10$ (dash), (b) $x_{s,2}(t)$ (solid), $x'_{s,2}(t)$ (dash), (c) $x_a(t)$ (solid), $x_{a,s,2}(t)$ (dash) during April 1–30, 2014 for Example 6.

The online implementation of the dynamic state-based alarm system in Fig. 8 is performed for the alarm variable $x_a(t)$ in the next 30 days, April 1–30, 2014. For the purpose of comparison, we obtain the two state-based alarm variables $x_{a,s,1}(t)$ and $x_{a,s,2}(t)$ based on $x_{s,1}(t)$ and $x_{s,2}(t)$, respectively. The alarm durations of $x_a(t)$, $x_{a,s,1}(t)$ and $x_{a,s,2}(t)$ are given in Fig. 17. Figs. 18 and 19 respectively present $x_{a,s,1}(t)$ and $x_{a,s,2}(t)$, as well as the process variable $x(t)$, alarm variable $x_a(t)$ and state variables $x_{s,1}(t)$ and $x_{s,2}(t)$. Clearly, $x_{a,s,1}(t)$ has much smaller alarm durations than $x_a(t)$, and the long-standing alarm indices $\eta$ for $x_{a,s,1}(t)$ every 8 h are close to zero, indicating the effectiveness of the dynamic state-based alarm system based on $x_{s,1}(t)$. The remaining alarms in $x_{a,s,1}(t)$ are owing to the random delays $\tau$ between ‘1’s in $x'_{s,1}(t)$ and $x'_a(t)$. Thus, those having small alarm durations can be safely treated as chattering alarms, and be removed by the $m$-sample delay timer. By contrast, the state-on variable $x'_{s,2}(t)$ never takes the value ‘1’, so that $x_{a,s,2}(t)$ is actually the same as $x_a(t)$, which shows that $x_{s,2}(t)$ is not a good choice as the state variable.

6. Conclusion

This paper studied long-standing alarms. First, the definition of long-standing alarms was summarized in Definition 1, namely, a long-standing alarm is present if the alarm duration $T_{AD}$ is larger than a threshold $T_0$. Second, three main causes were identified that make the resulting long-standing alarms nuisance alarms. Third, a dynamic state-based alarm system was designed to remove long-standing alarms that are caused by the inconsistency between the alarm design and discrete-valued operating states. Rules 1 and 2 were formulated to select the state variables. The alarm generation mechanism in Eq. (10) or (13) was proposed to generate the state-based alarm variables. The proposed dynamic state-based alarm systems have been successfully deployed to more than 100 alarm variables in Plant A.

The long-standing alarms caused by the third cause are difficult to handle, as mentioned in Section 3. The difficulty comes from the design of dynamic alarm trippoints to replace the current constant ones in order to work with different operating states. Multiple related process variables have to be involved, and high-dimensional normal and abnormal operation zones may need to be defined for designing dynamic alarm trippoints to remove the long-standing alarms. As a result, the design of dynamic alarm trippoints is left for future work.

Nomenclature

$x_a(t)$ the alarm variable
$t$ the integer-valued sampling index
$h$ the real-valued sampling period
$T_{AD}$ the alarm duration
$x(t)$ the process variable
$T_0$ the threshold for long-standing alarms
$\eta$ the long-standing alarm index
$x_s(t)$ the state variable
$x'_a(t)$ the alarm-on variable
$x'_s(t)$ the state-on variable
$r$ the consistency ratio
$x_{a,s}(t)$ the state-based alarm variable

References

Arjomandi, R.K., Salahshoor, K., 2011. Development of an efficient alarm management package for an industrial process plant. In: 2011 Chinese Control & Decision Conf., May 23–25, 2011, Mianyang, China, pp. 1875–1880.

Beebe, D., Ferrer, S., Logerot, D., 2013. The connection of peak alarm rates to plant incidents and what you can do to minimize. Process Saf. Prog. 32, 72–77.

Blaauwgeers, E., Dubois, L., Ryckaert, L., 2013. Real-time risk estimation for better situational awareness. In: The 12th IFAC Symp. Analysis, Design, and Evaluation of Human-machine Systems, LA, USA, Aug. 11–15, 2013, pp. 232–239.

Bransby, M.L., Jenkinson, J., 1998. The Management of Alarm Systems. Health and Safety Executive.

EEMUA, 2013. EEMUA-191: Alarm Systems – a Guide to Design, Management and Procurement. Engineering Equipment and Materials Users Association.

Ghetie, M., Sauter, D., Saif, M., 1998. Many-state alarm generation using multiple binary decisions. In: The 37th IEEE Conference on Decision and Control (CDC1998), December 16–18, 1998, Tampa, Florida, USA, pp. 575–580.

Hatch, D., 2005. Alarms: prevention is better than cure. Chem. Eng. 769, 40–42.

Hollifield, B., Habibi, E., 2010. The Alarm Management Handbook, second ed. PAS, Houston.

HSE, 1997. The Explosion and Fires at the Texaco Refinery, Milford Haven, 24 July 1994: a Report of the Investigation by the Health and Safety Executive into the Explosion and Fires on the Pembroke Cracking Company Plant at the Texaco Refinery, Milford Haven on 24 July 1994. Health and Safety Executive.

HSE, 2008. The Final Report of the Major Incident Investigation Board. Health and Safety Executive.

ISA, 2009. ANSI/ISA-18.2: Management of Alarm Systems for the Process Industries. International Society of Automation.

Jerhotova, E., Sikora, M., Stluka, P., 2013. Dynamic alarm management in next generation process control systems. In: APMS 2012, Part II, IFIP AICT 398, pp. 224–231.

Kim, I.S., 1994. Computerized systems for online management of failures: a state-of-the-art discussion of alarm systems and diagnostic systems applied in the nuclear industry. Reliab. Eng. Syst. Saf. 44, 279–295.

Macdonald, D., 2004. Practical Hazops, Trips and Alarms. Elsevier.

Nihlwing, C., Kaarstad, M., 2012. The development and usability test of a state based alarm system for a nuclear power plant simulator. In: NPIC & HMIT 2012, San Diego, USA, July 22–26, 2012.

Pariyani, A., Seider, W.D., Oktem, U.G., Soroush, M., 2010. Incidents investigation and dynamic analysis of large alarm databases in chemical plants: a fluidized-catalytic-cracking unit case study. Ind. Eng. Chem. Res. 49, 8062–8079.

Ragsdale, A., Lew, R., Dyre, B.P., Boring, R.L., 2012. Fault diagnosis with multi-state alarms in a nuclear power control simulator. In: The 56th Annual Meeting of Human Factors and Ergonomics Society (HFES 2012), October 22–26, 2012, Boston, MA, USA, pp. 2167–2171.

Rothenberg, D., 2009. Alarm Management for Process Control. Momentum Press.

Stauffer, T., Clarke, P., 2016. Using alarms as a layer of protection. Process Saf. Prog. 35, 76–83.

Wang, J., Chen, T., 2014. An online method to remove chattering and repeating alarms based on alarm durations and intervals. Comput. Chem. Eng. 67, 43–52.

Zhu, J., Shu, Y., Zhao, J., Yang, F., 2014a. A dynamic alarm management strategy for chemical process transitions. J. Loss Prev. Process Ind. 30, 207–218.

Zhu, J., Zhao, J., Yang, F., 2014b. Dynamic risk analysis with alarm data to improve process safety using Bayesian network. In: The 11th World Congress on Intelligent Control and Automation, June 29–July 4, 2014, Shenyang, China, pp. 461–466.