Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization
John MacGregor (ProSensus, Inc.; McMaster University) and Ali Cinar (Illinois Institute of Technology)


Post on 22-Feb-2016


Page 1: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

John MacGregor (ProSensus, Inc.; McMaster University)
Ali Cinar (Illinois Institute of Technology)

Page 2: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Overview
• An overall theme: making use of historical plant data

John MacGregor:
• Empirical models
• Optimization
• Control

Ali Cinar:
• Monitoring and fault diagnosis
• Fault-tolerant control

Page 3: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Models
• Mechanistic
  – Structure from theory / parameters from data
  – Advantages are well known
  – Problems:
    • Assumptions that may be poor; theory for many y's not known
    • May not incorporate many of the measured variables
    • Examples: Y's or X's that are images or PAT sensors
• Empirical
  – Structure and parameters from data
  – Advantages are again well known
  – Problems:
    • Structure is often imposed and unrealistic; no interpretability or causality

Page 4: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Latent Variable Models - Concepts

[Figure: measured variables X projected onto the latent variable space T, with axes t1 and t2]

(c) 2004-2010, ProSensus, Inc.

Summary statistics: T2 and SPE
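The two summary statistics can be illustrated with a small PCA example. This is a minimal sketch, not the presenters' implementation: the function names `fit_pca` and `t2_and_spe`, the autoscaling step and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on autoscaled data; return what is needed to score new observations."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the scaled data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                               # loadings, shape (k, A)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance of each score
    return {"mean": mean, "std": std, "P": P, "score_var": score_var}

def t2_and_spe(model, x):
    """Return (Hotelling's T2, SPE) for one observation x."""
    z = (x - model["mean"]) / model["std"]
    t = z @ model["P"]                 # projection onto the latent variable space
    residual = z - model["P"] @ t      # part of z the model does not explain
    t2 = float(np.sum(t ** 2 / model["score_var"]))
    spe = float(residual @ residual)
    return t2, spe

# Synthetic data (assumed): five measured variables driven by two latent variables.
rng = np.random.default_rng(0)
T_true = rng.normal(size=(200, 2))
A = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])
X = T_true @ A + 0.05 * rng.normal(size=(200, 5))
model = fit_pca(X, n_components=2)

normal = X[0]
faulty = X[0].copy()
faulty[0] += 5.0 * model["std"][0]     # shift one variable, breaking the correlation structure
print("normal:", t2_and_spe(model, normal))
print("faulty:", t2_and_spe(model, faulty))
```

T2 measures how far an observation lies from the centre within the latent variable plane, while SPE measures its distance off the plane, which is why a shift in a single variable shows up mainly in SPE.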

Page 5: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Latent variable regression models

Two data matrices: X and Y

Symmetric in X and Y
• No hypothesized relation between X and Y
• Both X and Y are functions of the latent variables, T
• Choice of what is X and Y depends upon objectives

$X = T P^T + E$
$Y = T C^T + F$

[Figure: X and Y both linked to the latent variables T]
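One common way to estimate such a model is the NIPALS algorithm for PLS. The sketch below is an illustration under assumptions (the slides do not say which algorithm was used); the function name `pls_nipals` is hypothetical. It returns T, P and C as in the equations above, plus the matrix W* that maps centred X directly to the scores.

```python
import numpy as np

def pls_nipals(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS PLS on mean-centred X (n x k) and Y (n x m).

    Returns T, P, C, W_star such that X ~ T P^T, Y ~ T C^T and T = X W_star
    (X and Y here meaning the centred matrices).
    """
    Xr = X - X.mean(axis=0)
    Yr = Y - Y.mean(axis=0)
    T, P, C, W = [], [], [], []
    for _ in range(n_components):
        u = Yr[:, [0]]                       # start from one column of Y
        for _ in range(max_iter):
            w = Xr.T @ u
            w /= np.linalg.norm(w)           # X weights, unit length
            t = Xr @ w                       # X scores
            c = Yr.T @ t / (t.T @ t)         # Y loadings
            u_new = Yr @ c / (c.T @ c)       # Y scores
            converged = np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = Xr.T @ t / (t.T @ t)             # X loadings
        Xr = Xr - t @ p.T                    # deflate X
        Yr = Yr - t @ c.T                    # deflate Y
        T.append(t); P.append(p); C.append(c); W.append(w)
    T, P, C, W = (np.hstack(M) for M in (T, P, C, W))
    W_star = W @ np.linalg.inv(P.T @ W)      # maps centred X straight to scores
    return T, P, C, W_star

# Synthetic example (assumed data): two latent variables drive both X and Y.
rng = np.random.default_rng(1)
T0 = rng.normal(size=(100, 2))
X = T0 @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(100, 6))
Y = T0 @ rng.normal(size=(2, 3)) + 0.01 * rng.normal(size=(100, 3))
T, P, C, W_star = pls_nipals(X, Y, n_components=2)
Xc = X - X.mean(axis=0)
print("relative X residual:", np.linalg.norm(Xc - T @ P.T) / np.linalg.norm(Xc))
```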

Page 6: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Why Latent Variable Models?

• Low dimensional models
  – Define the space containing most of the information
• Simultaneously model both the X and Y spaces
  – Model structure truly determined by the data
  – This makes models unique and interpretable
  – Provides causal models in the low dimensional LV space
    • Allows for active use of the model (e.g. optimization)
  – Allows for
    • Easy handling of missing data
    • Easy detection of abnormal observations (*)
• Other regression methods (MLR, ANN, etc.) do not share these advantages when using historical data.
  – Non-unique, uninterpretable, non-causal
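The point about missing data can be made concrete: because the measured variables are correlated through a few latent variables, the scores can still be estimated from the variables that are present. The sketch below uses a least-squares projection onto the loadings of the observed variables; this particular method and the function name `scores_with_missing` are assumptions, since the slides do not name a method.

```python
import numpy as np

def scores_with_missing(P, z):
    """Estimate scores t for a scaled observation z that contains NaNs.

    P: loadings (k variables x A components). Only the rows of P belonging to
    observed variables are used; missing values are then filled in from P t.
    """
    observed = ~np.isnan(z)
    t, *_ = np.linalg.lstsq(P[observed], z[observed], rcond=None)
    z_hat = P @ t                    # model reconstruction, including missing entries
    return t, z_hat

# Assumed example: an orthonormal loading matrix and a noise-free observation.
rng = np.random.default_rng(2)
P_demo, _ = np.linalg.qr(rng.normal(size=(5, 2)))
t_true = np.array([1.5, -0.7])
z_full = P_demo @ t_true
z_missing = z_full.copy()
z_missing[1] = np.nan                # pretend one sensor reading is unavailable
t_est, z_hat = scores_with_missing(P_demo, z_missing)
print(t_est, z_hat[1], z_full[1])
```

The estimate is only well defined when the observed variables still determine the scores, i.e. when the retained rows of P have full column rank.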

Page 7: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Optimization in Latent Variable Spaces
• For active use of model, must have causality
  – Active use: optimization / control / diagnosis
• Historical plant data generally do not contain causal information on individual variables
  – Nor will any model built from these data
• But latent variable models do provide causality in the low dimensional LV space (t1, t2, …)
  – $Y = T C^T$, $X = T P^T$ (t's define Y and X)
  – $T = X W^*$ (to change T, must move combinations of x's)
• Optimization in low dimensional LV space
  – Then X and Y obtained from LV's
• Illustrate concept with 2 industrial examples
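The idea of optimizing over the scores and then recovering X and Y can be sketched for a linear model. The example below assumes a quadratic objective on the deviation from a desired y plus an optional soft penalty on Hotelling's T2, which gives a closed-form solution; the function name, the penalty form and the numbers are illustrative assumptions (the industrial examples that follow use constrained and nonlinear formulations).

```python
import numpy as np

def optimize_in_lv_space(C, P, score_var, y_des, Q, t2_weight=0.0):
    """Minimize (y_des - C t)^T Q (y_des - C t) + t2_weight * sum(t_a^2 / s_a^2).

    C: Y loadings (m x A), P: X loadings (k x A), score_var: score variances (A,).
    Returns the optimal scores t and the corresponding x = P t and y = C t.
    """
    S_inv = np.diag(1.0 / score_var)
    lhs = C.T @ Q @ C + t2_weight * S_inv
    rhs = C.T @ Q @ y_des
    t_new = np.linalg.solve(lhs, rhs)
    return t_new, P @ t_new, C @ t_new

# Assumed toy model: 2 latent variables, 3 quality variables, 4 process variables.
C_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
P_lv = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.5, 0.0]])
score_var = np.array([4.0, 1.0])
t_target = np.array([1.0, -0.5])
y_des = C_demo @ t_target            # a desired quality that the model can reach
Q = np.eye(3)

t_free, x_free, y_free = optimize_in_lv_space(C_demo, P_lv, score_var, y_des, Q)
t_pen, x_pen, y_pen = optimize_in_lv_space(C_demo, P_lv, score_var, y_des, Q, t2_weight=1.0)
print(t_free, x_free)
print(t_pen, x_pen)
```

The search runs over only two numbers (t1, t2) even though the solution prescribes all four x's, and the T2 penalty pulls the solution toward the region covered by historical data.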

Page 8: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Optimization: Injection molding process
• GE Water Systems (2003)
  – Polyurethane film manufacture very sensitive to humidity, temperature and raw material variations
  – Operators periodically readjusted the process, largely by trial and error
    • Inject ~50 parts; measure ~10 quality variables; make adjustments (injection velocity profiles, timing sequences, etc.)
    • Iterate until within specification
  – Provided a good set of data for LV modeling
• Nonlinear PLS model
  – 20 raw material properties; 26 process variables; 10 quality variables
  – Models for both Y and Variance(Y)

Page 9: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Optimization: Injection molding process

$$\min_{t_{new,a},\ a = 1, 2, \ldots, 4}\ \left(y_{des} - \hat{y}(t_{new})\right)^T Q_1 \left(y_{des} - \hat{y}(t_{new})\right) + \hat{y}(t_{new})^T Q_2\, \hat{y}(t_{new})$$

– Constraints:
  • Humidity, temperature and raw material properties constrained to their currently measured values
  • SPE < ε; $T^2 < T^2_{99\%}$. These ensure validity of the model
– Applied only when multivariate control limits are violated
– Results:
  • Readjustment in one step
  • Improved quality
  • Reduced scrap
  • Operational since 2004

[Figure: score plot with numbered observations; axis label t[2]]

Page 10: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Optimization of a batch polymerization

Pilot plant data (Air Products & Chemicals)

[Figure: data structure. Z: recipe and initial conditions; X: variable trajectories (batches × variables × time); Y: end properties (13)]

• Very high dimensional optimization problem
• Easily solved in low dimensional LV space
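The trajectory block X is a three-way array (batches × variables × time), which is why the problem is so high dimensional. A common preprocessing step before fitting a latent variable model is to unfold it batch-wise into one long row per batch. The sketch below shows that reshaping; the array sizes are hypothetical and the unfolding convention is an assumption, as the slides do not state one.

```python
import numpy as np

def unfold_batchwise(trajectories):
    """Unfold (batches, variables, time) into (batches, variables * time).

    Columns are ordered variable by variable: all time points of variable 0,
    then all time points of variable 1, and so on.
    """
    n_batches = trajectories.shape[0]
    return trajectories.reshape(n_batches, -1)

# Hypothetical sizes, for illustration only.
n_batches, n_vars, n_times, n_recipe = 4, 3, 5, 2
rng = np.random.default_rng(0)
Z = rng.normal(size=(n_batches, n_recipe))          # recipe and initial conditions
X3 = rng.normal(size=(n_batches, n_vars, n_times))  # variable trajectories
X_unfolded = unfold_batchwise(X3)
regressors = np.hstack([Z, X_unfolded])             # one row per batch
print(regressors.shape)   # (4, 17)
```

With realistic numbers of variables and time points each row has hundreds or thousands of columns, while the latent variable model summarizes it with a handful of scores.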

Page 11: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

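Optimization for new product quality
• Constraints or desired values are specified for the 13 y's
• Minimize batch duration
• Optimization done in the three dimensional LV space

[Equation, only partly recoverable from the slide: $\min_{t_{new,a},\ a = 1, 2, 3}\ \left(y_{des} - \hat{y}(t_{new})\right)^T Q \left(y_{des} - \hat{y}(t_{new})\right) + \ldots$, with further terms weighted by $q$ that involve $T^2$ and SPE, subject to $(\hat{y}, \hat{x}, T^2, SPE) = PLS(t)$ and constraints on $\hat{x}$ and $\hat{y}$]

Multiple solutions for Z and X, all satisfying the y specifications, but with different batch times.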

Page 12: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Supervisory MPC of Batch Processes
• Objective: Control final product quality
  – Product quality only measured upon completion of batch
  – Control problem is thus one of:
    1. Predicting final quality from all the initial and evolving data
    2. Making optimal mid-course corrections at several decision points during the batch (QP)
    3. Different objectives at each decision point
  – PLS models have been shown to be ideal for modeling batch trajectory data and predicting final quality
    • Build from historical batch data
    • Plus some DOE runs at the decision points
    • Closed-loop identification used for subsequent implementations
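A mid-course correction at one decision point can be sketched as a small quadratic program. The example below is a deliberately simplified illustration: it assumes the predicted final quality responds linearly to the correction through a gain matrix `G` (hypothetical, e.g. derived from a PLS model) and ignores constraints, so the QP has a closed-form solution. None of the names or numbers come from the slides.

```python
import numpy as np

def midcourse_correction(y_pred, y_sp, G, Q, R):
    """Unconstrained QP for one decision point.

    Minimizes (y_sp - y_pred - G du)^T Q (y_sp - y_pred - G du) + du^T R du,
    where y_pred is the predicted final quality with no correction and
    G maps a correction du to the change in predicted final quality.
    """
    lhs = G.T @ Q @ G + R
    rhs = G.T @ Q @ (y_sp - y_pred)
    du = np.linalg.solve(lhs, rhs)
    return du, y_pred + G @ du

# Assumed numbers: two quality attributes, two manipulated variables.
G = np.array([[2.0, 0.0], [0.0, 4.0]])
y_pred = np.array([1.0, 1.0])        # prediction from data available so far
y_sp = np.array([2.0, 3.0])          # quality set point
Q = np.eye(2)

du_free, y_free_mpc = midcourse_correction(y_pred, y_sp, G, Q, R=np.zeros((2, 2)))
du_damped, y_damped = midcourse_correction(y_pred, y_sp, G, Q, R=np.eye(2))
print(du_free, y_free_mpc)
print(du_damped, y_damped)
```

The move penalty R trades off how aggressively the batch is corrected against how closely the set point is reached; a different Q or R at each decision point is one way to express "different objectives at each decision point".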

Page 13: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Supervisory MPC of Batch Processes
• Commercial systems in food industry
  – > 100,000 batches controlled
  – > 99.9% up-time
  – > 50% reduction in std dev of all final quality attributes
  – 20-40% increases in productivity

[Figure: distributions of a final quality attribute with ABC and without ABC, relative to the set point (SP)]

• LV models also allow MSPC monitoring throughout the batch.
• This helps make the controller robust to faults (e.g. wireless temperature sensor failure; default controllers).

Page 14: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Summary (first part)

• Latent variable models provide powerful ways to use historical operating data
  – Can make use of all measured variables
  – Provide unique, interpretable models for analysis
  – Provide causality in the LV space for optimization and control
    • Industrial examples used to illustrate this
  – Provide monitoring and diagnosis capabilities (next part)

Page 15: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Implementation and Automation of Process Supervision

• Many variations of PCA: PCA, MBPCA, DPCA, …
• Many techniques: PCA, PLS, Independent Component Analysis (ICA), …
• The Irish potato famine: a single kind of potato (Lumper). Diversity provides robustness
• Develop an SPM, FD and control system that uses many alternate techniques
  – How to decide which technique works better for a given situation? Add a management layer
  – How to improve decision-making with experience? Use distributed AI

Page 16: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Adaptive, Decentralized Process Supervision

Develop an agent-based monitoring, fault detection, diagnosis, and control system to:
– Coordinate alternative techniques for reliable and accurate fault detection, diagnosis (FDD) and control
– Improve performance via:
  o context-dependent performance assessment and decision-making
  o multi-level learning
  o adaptation

Page 17: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Distributed Artificial Intelligence
• Implement with Agent-Based Systems (ABS)
• Decision-making is decentralized and divided into hierarchical layers
• Agents:
  – are autonomous software entities
  – observe their environment
  – act on their environment according to predefined rules/algorithms
  – may adapt by changing their rules/interpretation based on their environmental conditions at run time
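The observe / act / adapt cycle described above can be sketched as a small class. This is a toy illustration only: the class name `LimitCheckAgent`, its rule (compare a value with a limit) and its adaptation (relax the limit after a false alarm) are hypothetical and are not taken from MADCABS.

```python
from dataclasses import dataclass, field

@dataclass
class LimitCheckAgent:
    """A toy agent: it observes a value, acts by raising an alarm, and can adapt its rule."""
    limit: float
    relax_factor: float = 1.1          # how much to relax the limit after a false alarm
    observations: list = field(default_factory=list)

    def observe(self, value):
        """Record a measurement taken from the environment."""
        self.observations.append(value)
        return value

    def act(self, value):
        """Apply the predefined rule: alarm when the value exceeds the limit."""
        return value > self.limit

    def adapt(self, was_false_alarm):
        """Change the rule at run time based on feedback from the environment."""
        if was_false_alarm:
            self.limit *= self.relax_factor

agent = LimitCheckAgent(limit=1.0)
first_alarm = agent.act(agent.observe(1.05))    # exceeds the initial limit
agent.adapt(was_false_alarm=True)               # feedback: that alarm was spurious
second_alarm = agent.act(agent.observe(1.05))   # same value, relaxed limit
print(first_alarm, second_alarm, agent.limit)
```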

Page 18: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

MADCABS: Monitoring Analysis Diagnosis and Control with Agent-Based Systems

• MADCABS is built using a hierarchical layout, with a physical communication layer, a process supervision layer and an agent management layer

[Figure: MADCABS layers]
– Plant (real or simulated), with sensors and actuators. Simulator output: simulated system data (example: solver .dll in our case)
– MADCABS' physical layer
– Process supervision and control agents. This layer needs performance evaluation.
– Management layer for agent performance evaluation and priority assignment

Basic information flow:
– Collection of raw data from plant
– Preprocessing, monitoring, diagnosis, control
– Mapping control actions back to process
– Evaluation of technique and agent performance

Page 19: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Process Monitoring

• Statistical process monitoring techniques used
  – Principal component analysis (PCA)
  – Dynamic PCA (DPCA)
  – Multi-block PCA (MB-PCA)

[Figure: Process connected to the Monitoring Organizer]

Monitoring Organizer:
– Calculates the performances: accumulated performances of fault detection agents are summed to find the total performance of the monitoring agent
– Builds new statistical models when all monitoring agents are performing badly or the process operating mode changes

Page 20: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Fault Detection
• Fault detection agents are the monitoring statistics for PCA, DPCA and MB-PCA
  – Hotelling's T2 and SPE statistics

Fault detection agents:
1. PCA_SPE
2. PCA_T2
3. MB-PCA_SPE
4. MB-PCA_T2
5. DPCA_SPE
6. DPCA_T2

Fault detection organizer:
– Gives out-of-control signals based on the consensus formed between fault detection agents
– Observes performances of fault detection agents under different fault magnitudes and keeps history
– Triggers diagnosis agent

[Figure: Process feeding the fault detection agents, which report to the fault detection organizer]
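How the organizer might form a consensus from the six agents can be sketched as a weighted vote. The slides do not specify the consensus rule, so the weighted-majority rule, the function name `consensus_alarm` and the example weights below are all assumptions; the weights stand in for each agent's recorded performance.

```python
def consensus_alarm(alarms, weights=None, threshold=0.5):
    """Return True when the weighted share of agents signalling a fault exceeds threshold.

    alarms:  dict mapping agent name -> bool (out-of-control signal)
    weights: dict mapping agent name -> non-negative weight (defaults to equal weights)
    """
    if weights is None:
        weights = {name: 1.0 for name in alarms}
    total = sum(weights[name] for name in alarms)
    in_favour = sum(weights[name] for name, flagged in alarms.items() if flagged)
    return in_favour / total > threshold

# Agent names follow the slide; the signals and weights are made up for illustration.
alarms = {
    "PCA_SPE": True, "PCA_T2": False,
    "MB-PCA_SPE": True, "MB-PCA_T2": False,
    "DPCA_SPE": True, "DPCA_T2": False,
}
equal_vote = consensus_alarm(alarms)   # 3 of 6 agents: not a strict majority

# Suppose the history shows the SPE-based agents performing better for this situation.
weights = {name: (2.0 if name.endswith("SPE") else 1.0) for name in alarms}
weighted_vote = consensus_alarm(alarms, weights)
print(equal_vote, weighted_vote)
```

Weighting the votes by past performance lets the organizer trust the agents that have been reliable under similar conditions, which is the role the performance history plays later in the deck.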

Page 21: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Fault Diagnosis

[Figure: block diagram linking Process, Fault Detection Organizer, Diagnosis Manager, Diagnosis Agent, Fault Identification Agents, Diagnosis Training Agent and Database; the output is the Consensus Fault Decision]

Page 22: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Fault Diagnosis: Identification techniques

• Contribution Plots
  – Variable contributions to monitoring statistics T2 and SPE
  – For an out-of-control observation: Squared Prediction Error $SPE = e e^T = \sum_j e_j^2$, $j = 1, \ldots, N$ (number of variables)
  [Figure: SPE versus observation, with a bar chart of variable contributions for X1, X3, X5]
• Fisher's Discriminant Analysis (FDA)
  – Maximize the ratio of variance between classes to variance within classes
  – Classify new observation based on the closeness to the existing clusters
  [Figure: clusters for Fault Type 1 and Fault Type 2]
• Partial Least Squares Discriminant Analysis (PLSDA)
  – $y = B_{PLS}\, x$, where Y holds the class membership (1s) for Fault Type 1 and Fault Type 2
  – Classify new observation based on the class membership
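The SPE contribution plot follows directly from the definition: each variable's contribution is its squared residual, and the contributions sum to SPE. The sketch below assumes a PCA model with orthonormal loadings and a scaled observation; the one-component loading vector and the observation are made-up numbers chosen so the arithmetic is easy to follow.

```python
import numpy as np

def spe_contributions(P, z):
    """Per-variable contributions to SPE for a scaled observation z.

    P: orthonormal PCA loadings (k variables x A components).
    Returns (contributions, SPE), where contributions[j] = e_j^2.
    """
    e = z - P @ (P.T @ z)          # residual after projecting onto the model
    contributions = e ** 2
    return contributions, float(contributions.sum())

# Assumed example: four variables that normally move together (one component).
P_one = np.full((4, 1), 0.5)                   # unit-length loading vector
z_fault = np.array([1.0, 1.0, 4.0, 1.0])       # variable index 2 carries an extra offset
contrib, spe = spe_contributions(P_one, z_fault)
print(contrib, spe, int(np.argmax(contrib)))
```

The largest bar points at the variable whose behaviour is least consistent with the model, which is the information the fault signatures on the following slides are built from.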

Page 23: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Fault Diagnosis (Identification) Agents

[Figure: Process feeding the Identification (Discrimination) Agents: FDA, PLSDA and the Contribution Map Estimator]

FDA, PLSDA: project the new fault data on the model and determine the most likely fault class

Contribution Map Estimator: variables contributing to each monitoring statistic
  PCA_SPE     [X1,X4,X7]
  PCA_T2      [X1,X4]
  MB-PCA_SPE  [X1,X4]
  MB-PCA_T2   [X1,X4,X7,X8]
  DPCA_SPE    [X1,X4,X6]
  DPCA_T2     [X1,X4]

Fault Signature for F1: [X1,X4]

Contribution Maps: [X1,X4] : [F1, F1, …]
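The contribution map can be sketched as a lookup from fault signature to the faults previously confirmed for it. In the sketch below the signature is taken to be the set of variables flagged by every detection agent; that rule is an assumption (it happens to reproduce [X1,X4] for the lists on the slide), and the class `ContributionMap` and its methods are hypothetical names.

```python
from collections import Counter

def fault_signature(flagged_by_agent):
    """Variables flagged by every detection agent, as a sorted tuple."""
    common = set.intersection(*(set(v) for v in flagged_by_agent.values()))
    return tuple(sorted(common))

class ContributionMap:
    """Maps a fault signature to the confirmed faults previously seen with it."""

    def __init__(self):
        self.history = {}

    def record(self, signature, confirmed_fault):
        """Store a confirmed diagnosis for this signature."""
        self.history.setdefault(signature, []).append(confirmed_fault)

    def most_likely(self, signature):
        """Most frequent confirmed fault for this signature, or None if unseen."""
        faults = self.history.get(signature)
        if not faults:
            return None
        return Counter(faults).most_common(1)[0][0]

# Contributing variables per agent, as listed on the slide.
flagged = {
    "PCA_SPE": ["X1", "X4", "X7"],
    "PCA_T2": ["X1", "X4"],
    "MB-PCA_SPE": ["X1", "X4"],
    "MB-PCA_T2": ["X1", "X4", "X7", "X8"],
    "DPCA_SPE": ["X1", "X4", "X6"],
    "DPCA_T2": ["X1", "X4"],
}
signature = fault_signature(flagged)
cmap = ContributionMap()
cmap.record(signature, "F1")
cmap.record(signature, "F1")
print(signature, cmap.most_likely(signature))
```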

Page 24: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Agent Performance Management Layer

Performance Evaluation:
• Record the performance of the agent and the state metrics that define the state of the system when that performance is observed.

  [State Metric_i, i = 1, …, I] = f(performance)

• Compare the current state of the system to recorded states, and estimate the performance of the agent for the current state based on its performance for similar states in history.

Page 25: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

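Performance History Space

[Figure: performance histories of Agent A, Agent B and Agent C plotted against State Metric 1 and State Metric 2. A new data point lies at distances d1, d2, d3 from the closest recorded state points, whose performances are P1, P2, P3; interpolation gives the estimates Pest,A, Pest,B and Pest,C]

New data point: what would the performances of each agent be for this state?

For each agent:
- Identify performances at closest state points.
- Obtain a performance estimate for the current state point by interpolation.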
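The two steps for each agent (find the closest recorded state points, then interpolate) can be sketched with inverse-distance weighting. The slides say only "interpolation", so the weighting scheme, the choice of three neighbours and the function name `estimate_performance` are assumptions.

```python
import numpy as np

def estimate_performance(states, performances, current_state, k=3, eps=1e-12):
    """Estimate an agent's performance at current_state from its history.

    states:        recorded state points, shape (n, number of state metrics)
    performances:  performance observed at each recorded state, shape (n,)
    Uses inverse-distance weighting over the k closest recorded states.
    """
    states = np.asarray(states, dtype=float)
    performances = np.asarray(performances, dtype=float)
    d = np.linalg.norm(states - np.asarray(current_state, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]
    if d[nearest[0]] < eps:                    # exact match with a recorded state
        return float(performances[nearest[0]])
    w = 1.0 / d[nearest]
    return float(np.sum(w * performances[nearest]) / np.sum(w))

# Made-up history for one agent in a two-metric state space.
history_states = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
history_perf = [0.2, 0.8, 0.5, 0.9]
p_mid = estimate_performance(history_states, history_perf, (1.0, 1.0))
p_exact = estimate_performance(history_states, history_perf, (2.0, 0.0))
print(p_mid, p_exact)
```

Running the same estimate for every agent gives the values the management layer needs to rank the agents for the current state.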

Page 26: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Diagnosis Performance History
• Record:
  – Fault signature
    • Fault signatures are the process variables significantly contributing to the inflation of the monitoring statistic
    • Fault signatures are available once the fault is detected
  – Performance of the agent for that fault signature
    • Performance is recorded only after diagnosis is confirmed
• Use the history to find:
  – Agents that are the best performers for the current fault signature.

The diagnosis agent uses the estimated performances of fault identification agents for the potential fault to form the consensus diagnosis decision.

Page 27: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Adaptation: Performance-Based Consensus Analysis
- Agents update their built-in knowledge and the methods they use
- Discriminant agents update their models with current data

[Figure: Adaptive FDA, Adaptive PLSDA and the Contribution Map Estimator]

Over time, after a diagnosis decision is confirmed for a fault type, the misclassifications are used to update the models of the adaptive instances.

Page 28: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Fault-tolerant Control Structures

[Figure: plant or simulator connected to System Identification, a Set of Controllers (PID control, MPC control), Controller Performance Assessment, and Monitoring and Diagnosis]

Single centralized control system

Decentralized control:
. Local coordinated MPCs
. Local MPCs integrated with local FDD modules using ABS

Page 29: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Summary/Conclusions
• Latent variable models provide powerful ways to use historical operating data
• Data-driven methods are well-suited for distributed process supervision
• Learning and adaptation in monitoring, FDD and control enable fault-tolerant control
• MADCABS provides an environment for adaptive fault diagnosis and fault-tolerant control
• There are alternative approaches: Vive la différence!

Page 30: Data-driven Methods for Monitoring, Fault Diagnosis, Control and Optimization

Acknowledgements

• IIT & ANL:
  • Fouad Teymour
  • Cindy Hood
  • Michael North
  • Arsun Artel
  • Inanc Birol
  • David Mendoza
  • Sinem Perk
  • QuanMin Shao
  • Derya Tetiker
  • Eric Tatara
  • Cenk Undey

Financial Support by National Science Foundation CTS-0325378 of the ITR program.

• McMaster University & ProSensus

• Many of my former grad students at McMaster

• The ProSensus team