Post on 03-Jan-2016
Digital Media Lab
Data Mining Applied To Fault Detection
Shinho Jeong, Jaewon Shim, Hyunsoo Lee
{cinooco, poohut, darth7}@icu.ac.kr

Introduction
Aims of work
  Neural network implementation of the non-linear PCA model using the principal curve algorithm, to increase both the rapidity and the accuracy of fault detection.
Data mining?
  Extracting useful information from raw data using statistical methods and/or AI techniques.
Characteristics
  Maximum use of the data available.
  Rigorous theoretical knowledge not required.
  Efficient for a system where the actual process deviates from the first-principles-based model.
Application
  Process monitoring: fault detection/diagnosis/isolation
  Process estimation: soft sensor

Issues
Major concerns
  Rapidity: ability to detect a fault situation at an early stage of fault introduction.
  Accuracy: ability to distinguish a fault situation from possible process variations.
Trade-off problem
  Solve through
    Frequent acquisition of process data.
    Derivation of an efficient process model through data analysis using data mining methodologies.

Inherent Problems
① Multi-collinearity problem
  Due to high correlation among variables.
  Likely to cause redundancy problem.
  Derivation of new uncorrelated feature variables required.
② Dimensionality problem
  Due to more variables than observations.
  Likely to cause over-fitting problem in model-building phase.
  Dimensional reduction required.
③ Non-linearity problem
  Due to non-linear relation among variables.
  Pre-determination of degree of non-linearity required.
  Application of non-linear model required.
④ Process dynamics problem
  Due to change of operating conditions with time.
  Likely to cause change of correlation structure among variables.

Statistical Approach
Statistical data analysis
Uni-variate SPC
  Conventional: Shewhart, CUSUM, EWMA, etc.
  Limitations
    Performs monitoring for each process variable separately.
    Inefficient for a multi-variate system, which is more concerned with how variables co-vary.
    => Need for multi-variate data analysis
Multi-variate SPC
  PCA
    Most popular multi-variate data analysis method.
    Basis for regression models (PLS, PCR, etc.).
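
As an illustration of the uni-variate charts named above, here is a minimal sketch of an EWMA chart for a single process variable. The smoothing weight `lam`, the limit width `width`, the target, sigma and the readings are all hypothetical values chosen for the example, not taken from the slides.

```python
import math

def ewma_chart(values, target, sigma, lam=0.2, width=3.0):
    """Return (ewma, lower, upper, alarm) tuples for a uni-variate EWMA chart."""
    z = target  # the EWMA statistic starts at the in-control target
    rows = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1.0 - lam) * z
        # Exact (time-varying) standard deviation of the EWMA statistic
        sd = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        lower, upper = target - width * sd, target + width * sd
        rows.append((z, lower, upper, not (lower <= z <= upper)))
    return rows

# Hypothetical readings: in control around 10.0, then a sustained upward shift
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 11.5, 11.4, 11.6, 11.5, 11.7]
chart = ewma_chart(readings, target=10.0, sigma=0.2)
first_alarm = next((i for i, row in enumerate(chart) if row[3]), None)
print("first alarm at sample index:", first_alarm)
```

One such chart is needed per process variable, and none of them sees how the variables move together, which is the limitation the slide points out.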

Linear PCA(1)
Features
  Creation of fewer (=> solves 'Dimensionality problem') and orthogonal (=> solves 'Multi-collinearity problem') new feature variables (principal components) through linear combination of original variables.
  Performs noise reduction additionally.
  Basis for PCR, PLS.
Limitation
  Linear model => inefficient for nonlinear process.
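
Linear PCA(2)
Theory
$$x = [x_1, x_2, x_3, \dots, x_m] \quad \sim \text{original var's}$$
$$\mathrm{Cov}(x)\, p_i = \lambda_i p_i, \quad (i = 1, 2, 3, \dots, m)$$
$$t_i = x p_i, \quad t = x P, \quad x = t P^{\top} \quad (P \sim \text{orthonormal matrix})$$
$$x = t_1 p_1^{\top} + t_2 p_2^{\top} + t_3 p_3^{\top} + \dots + t_l p_l^{\top} + t_{l+1} p_{l+1}^{\top} + \dots + t_m p_m^{\top} = \mathbf{t}_l P_l^{\top} + e = x' + e$$
$$x' = x P_l P_l^{\top} = F(G(x))$$
$$G(x) = x P_l = \mathbf{t}_l \quad \sim \text{encoding mapping}$$
$$F(\mathbf{t}_l) = \mathbf{t}_l P_l^{\top} = x' \quad \sim \text{decoding mapping}$$
Here $\mathbf{t}_l = [t_1, \dots, t_l]$ holds the first $l$ scores and $P_l = [p_1, \dots, p_l]$ the corresponding loading vectors.
[Figure: x → Encoding mapping → t_l → Decoding mapping → x']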
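
The encoding and decoding mappings can be sketched in a few lines of NumPy. The synthetic data, the sample size and the choice `l = 1` are assumptions made for this example; observations are stored as rows, matching the row-vector convention t = xP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations (rows) of m = 3 correlated variables
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[1.0, 2.0, -1.0]]) + 0.1 * rng.normal(size=(200, 3))
X = X - X.mean(axis=0)                      # mean-centre each variable

# Eigen-decomposition of the covariance matrix: Cov(x) p_i = lambda_i p_i
eigvals, P = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigvals)[::-1]           # sort by decreasing variance
eigvals, P = eigvals[order], P[:, order]

l = 1                                       # number of retained components (assumed)
P_l = P[:, :l]

T_l = X @ P_l                               # encoding: G(x) = x P_l
X_rec = T_l @ P_l.T                         # decoding: F(t_l) = t_l P_l'
E = X - X_rec                               # residual e = x - x'

T = X @ P                                   # all m scores
print("score covariance:\n", np.round(np.cov(T, rowvar=False), 4))
```

The printed score covariance is diagonal, which is the orthogonality property used to remove multi-collinearity, and keeping only `l` columns of `P` is the dimensional reduction.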

Linear PCA(3)
ERM inductive principle
$$R_{emp}(P_l) = \frac{1}{n} \sum_{i=1}^{n} \lVert x_i - x_i' \rVert^2, \quad \text{where } x_i' = F(G(x_i)) = x_i P_l P_l^{\top}$$
Limitation
$$G(x_i),\ F(\mathbf{t}_l) \quad \sim \text{linear functions}$$
Alternatives
  Extension of linear functions to non-linear ones using…
    Neural networks.
    Statistical method.
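
To make the empirical risk concrete, the sketch below evaluates R_emp for every choice of l on synthetic data and compares it with the sum of the discarded eigenvalues. The data, the sizes `n` and `m`, and the 1/n covariance normalisation are assumptions chosen so that the two quantities match exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 500, 4

# Hypothetical correlated data, mean-centred, observations as rows
X = rng.normal(size=(n, m)) @ rng.normal(size=(m, m))
X = X - X.mean(axis=0)

# Loadings from the covariance matrix with 1/n normalisation
eigvals, P = np.linalg.eigh(X.T @ X / n)
order = np.argsort(eigvals)[::-1]
eigvals, P = eigvals[order], P[:, order]

def empirical_risk(X, P_l):
    """Mean squared reconstruction error (1/n) * sum ||x_i - x_i P_l P_l'||^2."""
    X_rec = X @ P_l @ P_l.T
    return float(np.mean(np.sum((X - X_rec) ** 2, axis=1)))

risks = [empirical_risk(X, P[:, :l]) for l in range(1, m + 1)]
for l, risk in enumerate(risks, start=1):
    print(l, round(risk, 6), round(float(eigvals[l:].sum()), 6))
```

Each printed pair agrees: with linear G and F, the risk is fixed by the eigenvalues that are left out, so it can only be lowered further by allowing non-linear mappings.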

Kramer’s Approach
[Figure: auto-associative network, x → Input layer → Mapping layer → Bottleneck layer → Demapping layer → Output layer → x']
Limitations
  Difficult to train the networks with 3 hidden layers.
  Difficult to determine the optimal # of hidden nodes.
  Difficult to interpret the meaning of the bottle-neck layer.
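
A minimal sketch of the forward pass through such a five-layer auto-associative network is given below. The layer sizes, the tanh activations in the mapping and demapping layers, the linear bottleneck and output layers, and the random untrained weights are all assumptions for illustration; training is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

m, h, l = 6, 8, 2   # inputs, mapping/demapping nodes, bottleneck nodes (hypothetical)
layers = [
    init_layer(m, h),   # input layer -> mapping layer (non-linear)
    init_layer(h, l),   # mapping layer -> bottleneck layer (linear)
    init_layer(l, h),   # bottleneck layer -> demapping layer (non-linear)
    init_layer(h, m),   # demapping layer -> output layer (linear)
]

def forward(X):
    """Return the bottleneck activations z and the reconstruction x'."""
    (W1, b1), (W2, b2), (W3, b3), (W4, b4) = layers
    mapped = np.tanh(X @ W1 + b1)
    z = mapped @ W2 + b2
    demapped = np.tanh(z @ W3 + b3)
    X_rec = demapped @ W4 + b4
    return z, X_rec

X = rng.normal(size=(5, m))
z, X_rec = forward(X)
print(z.shape, X_rec.shape)
```

All four weight matrices have to be trained together against the reconstruction error, which is the source of the training difficulty listed above.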

Non-linear PCA(1)
Principal curve (Hastie et al. 1989)
  Statistical, non-linear generalization of the first linear principal component.
Self-consistency principle
$$x' = F(G(x)) = E\left(x \,\middle|\, \arg\min_{z} \lVert F(z) - x \rVert^2\right)$$
① Projection step (Encoding)
$$z = G(x) = \arg\min_{z} \lVert F(z) - x \rVert^2$$
② Conditional averaging (Decoding)
$$x' = F(z) = E(x \mid z)$$
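
The two steps can be alternated as in the simplified sketch below. It is not the full published algorithm: the projection step snaps each point to the nearest fitted vertex instead of projecting onto curve segments, the conditional average is a Gaussian kernel smoother whose width is `span` times the range of z, and the loop runs a fixed number of iterations. The function names and the parabola data are hypothetical.

```python
import numpy as np

def smooth(z, X, bandwidth):
    """Conditional averaging: kernel-weighted mean of X around each z value."""
    w = np.exp(-0.5 * ((z[:, None] - z[None, :]) / bandwidth) ** 2)
    return (w @ X) / w.sum(axis=1, keepdims=True)

def project(X, curve):
    """Projection step: arc-length position of the nearest curve vertex."""
    seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    d2 = ((X[:, None, :] - curve[None, :, :]) ** 2).sum(axis=2)
    return arc[d2.argmin(axis=1)]

def principal_curve(X, n_iter=5, span=0.1):
    mean = X.mean(axis=0)
    Xc = X - mean
    # Start from the scores of the first linear principal component
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    z = Xc @ Vt[0]
    for _ in range(n_iter):
        order = np.argsort(z)
        fitted = smooth(z, Xc, span * (z.max() - z.min()))   # F(z_i)
        z = project(Xc, fitted[order])                       # z_i = G(x_i)
    fitted = smooth(z, Xc, span * (z.max() - z.min()))
    return z, fitted + mean

# Hypothetical data: noisy samples along a parabola
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([t, t ** 2]) + 0.05 * rng.normal(size=(200, 2))
z, X_curve = principal_curve(X)

# Reconstruction by the first linear principal component, for comparison
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_line = np.outer(Xc @ Vt[0], Vt[0]) + X.mean(axis=0)

curve_err = float(np.mean(np.sum((X - X_curve) ** 2, axis=1)))
line_err = float(np.mean(np.sum((X - X_line) ** 2, axis=1)))
print("curve error:", curve_err, "line error:", line_err)
```

On curved data like this, the curve reconstruction error should come out below that of the straight-line fit.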

Non-linear PCA(2)
Limitations
  Finiteness of data.
  Unknown density distribution.
  No a priori information about data.
Additional consideration
  ② Conditional averaging => Locally weighted regression, kernel regression
  Increasing flexibility (span decreasing)
  Span: fraction of data considered to be in the neighborhood.
    ~ smoothness of fit
    ~ generalization capacity
[Figure: kernel curves for σ = 0.5, 1, 2 and 4; horizontal axis from -5 to 5, vertical axis from 0 to 1]
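
The effect of the kernel width on the fit can be sketched with a Gaussian kernel regression (Nadaraya-Watson estimate). Treating σ as the quantity that plays the role of the span is an assumption of this example, as are the sine-shaped data and the noise level.

```python
import numpy as np

def kernel_regression(z_train, x_train, z_query, sigma):
    """Nadaraya-Watson estimate of E(x | z) with a Gaussian kernel of width sigma."""
    w = np.exp(-0.5 * ((z_query[:, None] - z_train[None, :]) / sigma) ** 2)
    return (w @ x_train) / w.sum(axis=1)

rng = np.random.default_rng(0)
z = np.linspace(-5.0, 5.0, 101)
x = np.sin(z) + 0.1 * rng.normal(size=z.size)   # hypothetical noisy samples

train_mse = {}
for sigma in (0.5, 1.0, 2.0, 4.0):
    fit = kernel_regression(z, x, z, sigma)
    train_mse[sigma] = float(np.mean((fit - x) ** 2))
    print(sigma, round(train_mse[sigma], 4))
```

The narrowest kernel follows the training data most closely (most flexible fit), while wider kernels give smoother fits; how well each generalizes to new data is a separate question that this sketch does not measure.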

Proposed Approach(1)
Creation of non-linear principal scores
$$x = F_1(z_1) + e_1, \quad \text{where } F_1(z_1) \in C^1$$
$$e_{i-1} = F_i(z_i) + e_i, \quad \text{where } i = 1, 2, 3, \dots \text{ and } e_0 = x$$
$$x = F_1(z_1) + F_2(z_2) + \dots + F_l(z_l) + e_l = x' + e_l$$
$$z = [z_1, z_2, \dots, z_l] \quad \sim \text{non-linear principal score}$$
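
The residual recursion e_{i-1} = F_i(z_i) + e_i can be sketched as a deflation loop. To keep the example short and self-contained, the component fitter used here is the first linear principal component, which is only a stand-in for a principal-curve fit; `first_linear_pc`, `sequential_scores`, the mean-centring of e_0 and the data are all hypothetical.

```python
import numpy as np

def first_linear_pc(E):
    """Stand-in for a principal-curve fit: returns scores z and fitted values F(z)."""
    _, _, Vt = np.linalg.svd(E, full_matrices=False)
    p = Vt[0]
    z = E @ p
    return z, np.outer(z, p)

def sequential_scores(X, fit_component, n_components):
    """Extract scores one component at a time from successive residuals."""
    residual = X - X.mean(axis=0)                  # e_0 = x (mean-centred here)
    scores, parts = [], []
    for _ in range(n_components):
        z_i, fitted_i = fit_component(residual)    # F_i(z_i) fitted to e_{i-1}
        residual = residual - fitted_i             # e_i = e_{i-1} - F_i(z_i)
        scores.append(z_i)
        parts.append(fitted_i)
    return np.column_stack(scores), parts, residual

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
Z, parts, e_l = sequential_scores(X, first_linear_pc, n_components=2)
print(Z.shape, float(np.linalg.norm(e_l)))
```

Any fitter with the same return signature, such as a principal-curve routine, could be passed as `fit_component`; the columns of `Z` then hold the scores z_1, …, z_l.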

Proposed Approach(2)
Implementation of auto-associative N.N.
  Construction of 2 MLP N.N.'s from (x, z) & (z, x)
[Figure: x → Input layer → 1st MLP's hidden layer → NLPC score z → 2nd MLP's hidden layer → Reconstructed x']
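
A minimal sketch of the two-network construction follows: one MLP is fitted on the pairs (x, z) and a second on (z, x), and the two are then chained. In the slides the scores z come from the principal curve fit; here they are a synthetic curve parameter, and the network size, learning rate, iteration count and plain gradient-descent training are assumptions for illustration.

```python
import numpy as np

def train_mlp(inputs, targets, n_hidden=8, lr=0.05, n_iter=3000, seed=0):
    """Fit a one-hidden-layer tanh MLP by full-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    d_in, d_out = inputs.shape[1], targets.shape[1]
    W1 = rng.normal(scale=0.5, size=(d_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(n_iter):
        hidden = np.tanh(inputs @ W1 + b1)
        error = hidden @ W2 + b2 - targets
        losses.append(float(np.mean(error ** 2)))
        # Back-propagate the mean squared error
        g_out = 2.0 * error / error.size
        g_hidden = (g_out @ W2.T) * (1.0 - hidden ** 2)
        W2 -= lr * hidden.T @ g_out
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * inputs.T @ g_hidden
        b1 -= lr * g_hidden.sum(axis=0)

    def predict(a):
        return np.tanh(a @ W1 + b1) @ W2 + b2

    return predict, losses

# Hypothetical training pairs: z is a known curve parameter, x lies on a parabola
z = np.linspace(-1.0, 1.0, 120)
X = np.column_stack([z, z ** 2])
Z = z[:, None]

encode, enc_losses = train_mlp(X, Z, seed=0)   # 1st MLP: x -> z
decode, dec_losses = train_mlp(Z, X, seed=1)   # 2nd MLP: z -> x
X_rec = decode(encode(X))                      # chained auto-associative mapping
print(enc_losses[-1], dec_losses[-1])
```

Because each network has a single hidden layer and its own target, the two can be trained separately, in contrast to training one network with three hidden layers.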

Case Study
Objective
  Fault detection during operating mode change using 6 variables.
Data acquisition & model building
  NOC data: 120 observations => NLPCA model building
  Fault data: another 120 observations
[Figure: process flow diagram with feeds A, D, E and C, reactor, condenser, compressor, vapor/liquid separator and stripper, purge and product streams, analyzers, and flow, level, pressure and temperature indicators; one location is marked "drift"]

Model Building
Auto-associative N.N. using 2 MLP’s
[Figure: panels labelled Principal curve fitting, 1st MLP N.N. and 2nd MLP N.N.; annotations: 5 iterations, 50 iterations, 30 iterations]

Monitoring Result
NLPCA model more efficient than LPCA model!
[Figure: monitoring result plots; annotation: Fault introduction]