TRANSCRIPT
-
Model Discrimination and Parameter Estimation for Complex Reactive Systems
Yajun Wang, Weifeng Chen, Yisu Nie, L. T. Biegler
Chemical Engineering Department, Carnegie Mellon University
Pittsburgh, PA
-
2
Overview
Introduction
How? Model Building Tools
- Direct Transcription
- Parameter Estimation: Nonlinear Programming
- Inference: NLP Sensitivity
What? Industrial Case Studies
- Solid-Liquid Reactions
- Chemical Kinetics from Spectra
Why? Process Optimization
Summary and Conclusions
-
3
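[Figure: four-stage process (stages I to IV) with stage initial states z_{i,I}^0 ... z_{i,IV}^0, stage final states z_{i,I}^f ... z_{i,IV}^f, and B_i]
Reactions: A + B → C, C + B → P + E, P + C → G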
Model Building and Optimization for Complex Reactive Systems
Model Building: Formulation of First Principles Models; Parameter Estimation; Model Discrimination and Validation
Control: Optimal reference trajectories; Real-time optimization
Operations: Transitions; Upsets; Integration with logistics
-
4
Optimization Models based on Physics and Chemistry (First Principles)
Goal: establish predictive capability that extrapolates beyond observed conditions
Apply conservation laws at macroscopic and microscopic levels
Apply constitutive relationships at smallest available time/length scale.
Assess assumptions and adjust for missing information through parameter estimation and model validation
"All models are wrong; some are useful." G.E.P. Box
-
ExxonMobil PROPRIETARY
5
Work Process for Model Development (www.eurokin.org)
Fundamentals: Thermodynamics, Kinetic Databases
Microkinetic Models, Ab Initio Calculations
Create Reaction Network, Construct rate expressions
Initialize parameter estimation
Incorporate into Reaction System
Parameter Estimation
Model Discrimination and Testing
Design of Experiments
Uncertainty Quantification
Experimental Design and Data
Decision-making for model development
- Versatile, interactive user interface
- Fast, reliable numerical tools
- Integrated data, tasks and results
-
6
Dynamic Optimization Model for Reactive Systems
t, time; z(t), differential variables; y(t), algebraic variables
tf, final time; u(t), control variables; p, time-independent parameters
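A generic form of such a problem, using the variables listed on this slide, can be sketched as follows. This is an assumed standard DAE-constrained formulation: the objective Φ, the right-hand sides f and g, the initial state z_0 and the bounds are illustrative names, not taken from the slide.

```latex
\begin{aligned}
\min_{u(t),\,p,\,t_f}\quad & \Phi\bigl(z(t_f)\bigr) \\
\text{s.t.}\quad & \frac{dz(t)}{dt} = f\bigl(z(t), y(t), u(t), p, t\bigr), \qquad z(0) = z_0, \\
& g\bigl(z(t), y(t), u(t), p, t\bigr) = 0, \\
& u_L \le u(t) \le u_U, \qquad p_L \le p \le p_U, \qquad t \in [0, t_f].
\end{aligned}
```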
-
7
-
8
Nonlinear Programming Problem
\[ \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad c(x) = 0, \quad x_L \le x \le x_u \]
-
9
Full-space NLP Formulation for Parameter Estimation
Original Formulation
Barrier Approach
Can generalize for
As μ → 0, x*(μ) → x* (Fiacco and McCormick, 1968)
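The two formulations named on this slide can be sketched as below, assuming the usual interior-point setting with simple bounds x ≥ 0. The bound form and the symbol φ_μ are assumptions made for illustration.

```latex
\begin{aligned}
\text{Original:}\quad & \min_x\ f(x) \quad \text{s.t.}\quad c(x) = 0,\ x \ge 0 \\
\text{Barrier:}\quad & \min_x\ \varphi_\mu(x) = f(x) - \mu \sum_{i=1}^{n} \ln x_i \quad \text{s.t.}\quad c(x) = 0
\end{aligned}
```

For a fixed μ > 0 the logarithmic term keeps the iterates strictly inside the bounds, and the bound constraints reappear only in the limit μ → 0.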
-
10
Solution of the Barrier Problem - IPOPT
Newton Directions (KKT System)
\[ \nabla f(x) + A(x)\lambda - v = 0, \qquad c(x) = 0, \qquad XVe - \mu e = 0 \]
Solve Reducing the System
What are the Benefits for Parameter Estimation?
-
11
Inertial Corrections for Factorization of KKT Matrix
Modify KKT matrix to preserve correct inertia for each Newton iteration:
\[ \begin{bmatrix} W_k + \Sigma_k + \delta_1 I & A_k \\ A_k^T & -\delta_2 I \end{bmatrix} \]
- δ1: correct inertia to guarantee descent direction (SSOSC)
- δ2: correct rank-deficient A_k (LICQ)
KKT matrix factored by sparse L^T B L factorization
- Solution with δ1 = 0: primal variables unique
- Solution with δ1 = δ2 = 0: primal and dual variables unique
- Estimation result with δ1 = 0: unique (observable) parameter estimates, necessary for a predictive model
- Reduced Hessian available for confidence regions
-
Sensitivity of KKT Conditions
Analyze sensitivity of estimates wrt changes in data
At solution we have linearized optimality conditions
Introduce perturbations and Obtain Covariance of parameters
-
13
For normal, unbiased distributions, linear models and known V, this probability follows a χ² distribution, so that the region can be defined by:
\[ (\theta_{true} - \theta^*)^T V^{-1} (\theta_{true} - \theta^*) \le c(\alpha) \]
c(α) is the χ² value for confidence level α with n degrees of freedom.
Elliptical confidence regions are correct if the model is linear or for small levels of confidence, α.
Elliptical confidence regions are commonly used for parameter screening; nonlinear confidence regions are more expensive.
[Figure: elliptical confidence regions around θ* at the 90%, 95% and 99% levels, aligned with the principal axes of V]
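A minimal sketch of the elliptical region for the two-parameter case, assuming a known 2 x 2 covariance V. For two degrees of freedom the χ² quantile has the closed form -2 ln(1 - α), so no statistics library is needed. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def chi2_quantile_2dof(alpha):
    """Chi-square quantile c(alpha) for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - alpha)

def ellipse_semi_axes(V, alpha):
    """Semi-axis lengths of {d : d^T V^-1 d <= c(alpha)} for a symmetric 2x2 V."""
    (a, b), (_, d) = V
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    eigenvalues = (mean + radius, mean - radius)  # principal variances of V
    c = chi2_quantile_2dof(alpha)
    return tuple(math.sqrt(c * lam) for lam in eigenvalues)

def inside_region(theta, theta_star, V, alpha):
    """True if theta lies inside the alpha-level ellipse centred at theta_star."""
    (a, b), (_, d) = V
    det = a * d - b * b
    d0 = theta[0] - theta_star[0]
    d1 = theta[1] - theta_star[1]
    # Quadratic form d^T V^-1 d written out with the explicit 2x2 inverse.
    q = (d * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det
    return q <= chi2_quantile_2dof(alpha)

# Hypothetical covariance and estimate, for illustration only.
V = ((4.0, 0.0), (0.0, 1.0))
axes_95 = ellipse_semi_axes(V, 0.95)
origin_inside = inside_region((0.0, 0.0), (0.0, 0.0), V, 0.90)
```

The semi-axes scale with the square roots of the eigenvalues of V, which is why a parameter direction with a large variance shows up as a long axis of the ellipse.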
-
14
Model Discrimination
Postulate first principle models, Mj
- Rate controlling mechanisms? Slow reactions?
- What are the competing models/mechanisms?
Occam's Razor: balance model simplicity with best fit
-
15
Case Study I: Solid-Liquid Reactions (Y. Wang)
- Surface reaction, dissolution, diffusion: reaction on solid or in liquid phase?
- Different particle shapes and sizes: reaction surface?
- Product effects: products growing on surface or breaking off?
[Figure: jacketed batch reactor with preparation and reaction steps; labels: solvent, solid W, liquid X, solvent and reactant materials, reactor discharge, vent, agitator, reactor jacket, cooling water inlet, cooling water outlet]
W(s) + X(l) → Y(s/l) + Z(s)
-
16
Shrinking particle model
[Figure: reactant particle surrounded by a fluid film, with bulk and surface liquid concentrations c_l^b and c_l^s]
- Liquid reactant diffuses onto the particle surface
- Solid-liquid reaction
- Solid product breaks off from reaction surface
[Equations: mole balances dN/dt for the solid and the liquid species, each written as a surface area term (in N_0 and N with exponents 1/a and 1 - 1/a) times an Arrhenius reaction rate term that includes the surface concentration c^s]
Surface reaction rate depends on surface concentration of the liquid reactant.
-
17
Dissolution model
[Figure: reactant particles dissolving into the solvent]
- Solid particles dissolve into solvent
- Liquid-liquid reaction
- Products precipitate into solid phase
[Equations: mole balances dN/dt for the solid and the liquid species, each written as a surface area term (in N_0 and N with exponents 1/a and 1 - 1/a) times an Arrhenius dissolution rate term]
Rate independent of surface concentration of the liquid reactant.
-
18
Batch Reactor Model
Surface concentration of liquid reactant
Model indicating factor F
- F = 0: Dissolution model
- F > 0: Shrinking particle model
-
19
Lots of Data - Too Few Informative Measurements (NS = 9 Data Batches)
- Jacket temperatures (Tcw)
- Inlet flowrates (Fc)
- Reactor weight (WR)
- Reactor temperatures (TR)
- Endpoint concentrations (Ci(tf))
-
20
Errors in Variables Measured (EVM)
Measured output errors: Reactor temperatures, End-point concentrations
Measured input errors: Jacket temperatures, Weights and flowrates
+ Simultaneous parameter estimation and model solution
+ Better than output data fitting
- Additional inputs as decision variables
- EVM has 15771 variables and 13830 equation constraints
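One way to write an EVM objective is sketched below, as an assumption rather than the formulation used in the case study. It treats the measured outputs y^m (reactor temperatures, end-point concentrations) and the measured inputs u^m (jacket temperatures, weights, flowrates) symmetrically, with normal errors of covariance V_y and V_u. All symbols and the index i over measurements are illustrative.

```latex
\begin{aligned}
\min_{\theta,\; u_i,\; y_i}\quad & \sum_i (y_i^{m} - y_i)^T V_y^{-1} (y_i^{m} - y_i)
  + \sum_i (u_i^{m} - u_i)^T V_u^{-1} (u_i^{m} - u_i) \\
\text{s.t.}\quad & \text{model equations linking } y_i,\ u_i \text{ and } \theta
\end{aligned}
```

Because the true inputs u_i become decision variables, the problem grows with the number of measurements, which is consistent with the large variable count quoted on this slide.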
-
21
Estimation results
Estimation results of the full model by EVM
- Large reliability factors of parameters D and F
- Parameter UA is estimated at its upper bound
-
22
Estimation results
- Large reliability factors of parameters D and F
- Parameter UA is estimated at its upper bound
[Figure: reactor temperature, scaled temperature vs. scaled time; prediction and data]
-
23
Estimation Quality Analysis
D: diffusion coefficient
- Data are mainly temperatures, with less information on bulk and surface concentrations
- Set Cb,X = Cs,X
F: model indicating factor
- Zero is contained in the confidence interval
- Set F = 0 (dissolution model)
Heat transfer coefficient
- UA: the largest value of heat transfer coefficient
- UB: the highest value of temperature
- UC: the shape/spread
- Fix parameter UA
- Use a linearly temperature-dependent heat transfer coefficient
-
24
Posterior Probability Share
- Following the previous analysis, the batch reactor model is simplified step by step, which generates 5 candidate models
- Estimations are conducted for all candidate models and their posterior probability shares are evaluated
- Model 4 has the largest posterior probability share and requires the least computational time
-
25
Estimation Results of Selected Model
Model 4
- Fixed parameters D, F and UA
- Nonlinear heat transfer coefficient
- Large reliability factors are avoided
- Estimability of kinetic parameters is enhanced, with even smaller variances
-
26
Estimation Results of Selected Model
Model 4: data fitting
[Figure: fits of the measured input and the measured outputs]
-
27
Model Cross Validation
9-fold cross validation
a) Randomly split the measured output data into 9 sets
b) In each iteration, estimate parameters from 8 sets of data and use the remaining set for model validation
c) Repeat step b) 9 times, so that each dataset is used once in model validation and 8 times in estimation
[Figure: two panels of parameter estimates over the 9 iterations (axis labels "Parameter A 3" and "Parameter U B 3"); legend: estimated value in each iteration, average of 9 estimations, estimated value by all data]
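The 9-fold procedure on this slide can be sketched as below. This is a generic k-fold split in plain Python, not the authors' implementation: `estimate` and `validate` are hypothetical callbacks standing in for the parameter estimation and the validation step, and the toy data are invented.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Yield (estimation_indices, validation_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)          # step a): random split
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = sorted(folds[i])
        estimation = sorted(j for m, fold in enumerate(folds) if m != i for j in fold)
        yield estimation, validation

def cross_validate(data, k, estimate, validate, seed=0):
    """Steps b) and c): estimate on k-1 folds, validate on the held-out fold."""
    results = []
    for est_idx, val_idx in k_fold_indices(len(data), k, seed):
        params = estimate([data[i] for i in est_idx])
        error = validate(params, [data[i] for i in val_idx])
        results.append((params, error))
    return results

# Toy stand-in for a model: the "parameter" is the sample mean (hypothetical data).
data = [0.101, 0.098, 0.105, 0.110, 0.095, 0.102, 0.099, 0.107, 0.103]
mean = lambda xs: sum(xs) / len(xs)
mse = lambda p, xs: sum((x - p) ** 2 for x in xs) / len(xs)
results = cross_validate(data, 9, mean, mse)
```

Comparing the spread of the per-fold estimates with the estimate from all data, as in the figure, gives a quick check on how sensitive a parameter is to the choice of batches.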
-
28
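Case Study II: Measured Spectra and Reaction Models (Chen, B., 2016)
Measurement Model
\[ D = C S^T + E, \qquad D \in \mathbb{R}^{ntp \times nwp},\quad C \in \mathbb{R}^{ntp \times nc},\quad S \in \mathbb{R}^{nwp \times nc} \]
\[ E \in \mathbb{R}^{ntp \times nwp}, \qquad E_{i,j} \sim N(0, \delta^2) \]
Reaction Model
\[ \frac{dc(t)}{dt} = f(c(t), y(t), \theta), \qquad g(c(t), y(t)) = 0 \]
Instrument Precision, Instrument Ageing, Background Noise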
-
29
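Beer-Lambert Law (D = C S^T)
[Figure: real spectra, absorbance over wavelength and time; the absorbance matrix equals the concentration matrix (profiles c1, c2 over time) times the pure-component spectra (s1, s2 over wavelength)]
\[ d(t_i, \lambda_j) = c_1(t_i) s_1(\lambda_j) + c_2(t_i) s_2(\lambda_j) + \dots + c_{nc}(t_i) s_{nc}(\lambda_j) \]
UV-Visible: 190-700 nm
Near Infrared: 700-3000 nm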
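A small numerical illustration of the bilinear model D = C S^T, assuming NumPy is available. The two-species first-order reaction A → B, the rate constant, and the Gaussian-shaped pure spectra are all invented for the example.

```python
import numpy as np

def absorbance_matrix(C, S, noise_std=0.0, seed=0):
    """Return D = C S^T (+ optional Gaussian measurement noise E)."""
    D = C @ S.T
    if noise_std > 0.0:
        rng = np.random.default_rng(seed)
        D = D + rng.normal(0.0, noise_std, size=D.shape)
    return D

# Concentration profiles: ntp = 6 time points, nc = 2 species (hypothetical A -> B).
t = np.linspace(0.0, 10.0, 6)
k = 0.3                                   # hypothetical rate constant
c_A = np.exp(-k * t)
c_B = 1.0 - c_A
C = np.column_stack([c_A, c_B])           # shape (ntp, nc)

# Pure-component spectra: nwp = 4 wavelengths in the UV-visible range.
wavelengths = np.linspace(190.0, 700.0, 4)
s_A = np.exp(-((wavelengths - 300.0) / 80.0) ** 2)
s_B = np.exp(-((wavelengths - 550.0) / 80.0) ** 2)
S = np.column_stack([s_A, s_B])           # shape (nwp, nc)

D = absorbance_matrix(C, S)               # shape (ntp, nwp)
```

Each entry D[i, j] is the sum over species of c_k(t_i) s_k(λ_j), which is the elementwise form of the law given on this slide.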
-
30
Multivariate Curve Resolution (MCR)
Typical Current Methods
Non-Iterative Methods
- Window Factor Analysis (WFA)
- Subwindow Factor Analysis (SFA)
Iterative Approaches
- Iterative Target Transformation Factor Analysis (ITTFA)
- Multivariate Curve Resolution Alternating Least Squares (MCR-ALS)
Model-Free MCR Combined with Model-based Kinetics
Goals
- Develop method for simultaneous estimation of concentrations and kinetic parameters directly from spectra
- Deconvolute instrument noise and system disturbances
- Obtain confidence regions for estimated parameters
-
31
Reaction Model with Disturbances (SDEs)
\[ \frac{dc(t)}{dt} = f(c(t), y(t), \theta), \qquad g(c(t), y(t)) = 0 \]
\[ dc(t) = f(c(t), y(t), \theta)\,dt + \sigma\,dW(t), \qquad W(t) = [W_1, W_2, \dots, W_{nc}]^T, \qquad g(c(t), y(t)) = 0 \]
W_k: Standard Brownian Motion or Wiener Process
(a) W_k(0) = 0 with probability 1
(b) For 0 ≤ s < t ≤ T: W_k(t) - W_k(s) ~ sqrt(t - s) N(0, 1)
(c) For 0 ≤ s < t < u < v ≤ T: W_k(t) - W_k(s) is independent of W_k(v) - W_k(u)
[Figure: sample path of a Wiener process over 1000 steps, values between -4 and 4]
-
32
SDE Model Description
\[ dc(t) = f(c(t), y(t), \theta)\,dt + \sigma\,dW(t), \qquad g(c(t), y(t)) = 0 \]
\[ C_{i,j} = c_j(t_i), \quad i = 1,\dots,ntp; \quad j = 1,\dots,nc \]
\[ D = C S^T + E \]
- Convert Stochastic DAEs to DAEs through Euler discretization
- Recover an independent Wiener process with (small) Gaussian noise
- Compare to exact DAE solution to extract linear perturbation terms on disturbance distribution
- Simplify Jacobian terms
- Apply Maximum Likelihood Principles
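The Euler discretization step can be illustrated with a plain Euler-Maruyama simulation, in which each Wiener increment is drawn as sqrt(dt) times a standard normal sample. This is a generic sketch, not the transformation used in the talk. The first-order kinetics, the rate constant and the noise levels are hypothetical.

```python
import math
import random

def euler_maruyama(f, c0, sigma, dt, n_steps, seed=0):
    """Simulate dc = f(c) dt + sigma dW with one independent W_k per component."""
    rng = random.Random(seed)
    c = list(c0)
    path = [list(c)]
    for _ in range(n_steps):
        drift = f(c)
        # Wiener increment over dt: sqrt(dt) * N(0, 1), independent per component.
        c = [ck + fk * dt + sk * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for ck, fk, sk in zip(c, drift, sigma)]
        path.append(list(c))
    return path

def first_order(c, k=0.3):
    """Hypothetical A -> B kinetics: dcA/dt = -k cA, dcB/dt = +k cA."""
    c_A, _ = c
    return [-k * c_A, k * c_A]

noisy_path = euler_maruyama(first_order, [1.0, 0.0], sigma=[0.02, 0.02], dt=0.1, n_steps=10)
exact_path = euler_maruyama(first_order, [1.0, 0.0], sigma=[0.0, 0.0], dt=0.1, n_steps=10)
```

With sigma set to zero the scheme reduces to explicit Euler for the deterministic model, so the difference between the two paths is the accumulated disturbance.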
-
33
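Problem Transformation
Original Problem Description
\[ dc(t) = f(c(t), y(t), \theta)\,dt + \sigma\,dW(t), \qquad g(c(t), y(t)) = 0 \]
\[ C_{i,j} = c_j(t_i), \quad j = 1,\dots,nc; \quad i = 1,\dots,ntp \]
\[ D = C S^T + E \]
Transformed problem
\[ \frac{dz(t)}{dt} = f(z(t), y(t), \theta), \qquad g(z(t), y(t)) = 0 \]
\[ c_k(t_i) = z_k(t_i) + \zeta_k(t_i) \]
\[ D_{i,j} = \sum_{k=1}^{nc} c_k(t_i) s_k(\lambda_j) + \xi_{i,j}, \quad i = 1..ntp,\ j = 1..nwp,\ k = 1..nc \]
Change of variables
\[ P(c \mid z) = p(\zeta) \left| \frac{\partial F}{\partial c} \right|, \qquad \frac{\partial F}{\partial c} = 1 \quad \Rightarrow \quad P(c \mid z) = p(\zeta) \]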
-
34
Problem Transformation
Measurement Independence Assumption
\[ p(D - C S^T = E) = \prod_{i=1}^{ntp} \prod_{j=1}^{nwp} p(\xi_{i,j}) \]
Disturbance and Measurement Independence Assumption
\[ p(D - C S^T = E,\ c \mid z) = \prod_{i=1}^{ntp} \prod_{j=1}^{nwp} p(\xi_{i,j}) \prod_{i=1}^{ntp} \prod_{k=1}^{nc} p(\zeta_k(t_i)) \]
Maximum Likelihood Principle with Assumed Variance
\[ \min \sum_{i=1}^{ntp} \sum_{j=1}^{nwp} \frac{\bigl( D_{i,j} - \sum_{k=1}^{nc} c_k(t_i) s_k(\lambda_j) \bigr)^2}{\delta^2} + \sum_{i=1}^{ntp} \sum_{k=1}^{nc} \frac{\bigl( c_k(t_i) - z_k(t_i) \bigr)^2}{\sigma_k^2} \]
\[ \text{s.t.} \quad \frac{dz(t)}{dt} = f(z(t), y(t), \theta), \qquad g(z(t), y(t)) = 0, \qquad C \ge 0,\ S \ge 0 \]
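For fixed variances, the maximum likelihood objective is a sum of two weighted least-squares terms: the spectral residual D - C S^T scaled by the measurement variance, and the deviation between the noisy concentrations C and the model states Z scaled by the per-species disturbance variances. A minimal NumPy evaluation is sketched below. The names `delta2` and `sigma2` for the two variances are assumed labels, and the small matrices are invented test data.

```python
import numpy as np

def ml_objective(D, C, S, Z, delta2, sigma2):
    """Weighted least-squares objective for assumed variances.

    D: (ntp, nwp) spectra, C: (ntp, nc) concentrations, S: (nwp, nc) pure spectra,
    Z: (ntp, nc) model states, delta2: scalar measurement variance,
    sigma2: (nc,) disturbance variances, one per species.
    """
    spectral_residual = D - C @ S.T
    state_residual = C - Z
    measurement_term = np.sum(spectral_residual ** 2) / delta2
    disturbance_term = np.sum(state_residual ** 2 / sigma2)  # broadcasts over species
    return measurement_term + disturbance_term

# Invented data: 3 time points, 4 wavelengths, 2 species.
C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
S = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 2.0]])
D = C @ S.T
sigma2 = np.array([1.0, 4.0])
value_at_truth = ml_objective(D, C, S, C, delta2=0.01, sigma2=sigma2)
value_shifted = ml_objective(D, C, S, C + 0.1, delta2=0.01, sigma2=sigma2)
```

In the full problem this function would be minimized over C, S, Z and the kinetic parameters subject to the DAE model; here it is only evaluated.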
-
35
Variance Initialization/Estimation Roadmap
[Flowchart: Solve TP1, Solve TP2, Solve TP3, variance equations, variances σ_k², δ², convergence check (No: repeat; Yes: done); quantities passed between steps: z_k(t_i), c_k(t_i), s_k(λ_j)]
Apply optimality conditions to NLP for z_k(t_i), c_k(t_i) and s_k(λ_j), and substitute to get transformed problems:
- P1 → TP1
- P2 for ν_j² = Σ_k s_k(λ_j) σ_k² + δ² → TP2
- P3 for δ² → TP3
-
40
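\[ \begin{aligned}
SA + AA &\xrightarrow{k_1} ASA + HA \\
ASA + AA &\xrightarrow{k_2} ASAA + HA \\
ASAA + H_2O &\xrightarrow{k_3} ASA + HA \\
AA + H_2O &\xrightarrow{k_4} 2\,HA \\
SA(s) &\xrightarrow{k_d} SA(l) \\
ASA(l) &\xrightarrow{k_c} ASA(s)
\end{aligned} \]
\[ \begin{aligned}
r_1 &= k_1 c_{SA}(t)\, c_{AA}(t) \\
r_2 &= k_2 c_{ASA}(t)\, c_{AA}(t) \\
r_3 &= k_3 c_{ASAA}(t)\, c_{H_2O}(t) \\
r_4 &= k_4 c_{AA}(t)\, c_{H_2O}(t) \\
r_d &= \begin{cases} k_d \bigl( c_{SA}^{sat}(T) - c_{SA}(t) \bigr)^{d}, & m_{SA}(t) \ge 0 \\ 0, & m_{SA}(t) < 0 \end{cases} \\
r_g &= k_c \bigl( \max( c_{ASA}(t) - c_{ASA}^{sat}(T),\ 0 ) \bigr)^{c}
\end{aligned} \]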
-
41
Aspirin Synthesis Case
Comparison between exact and estimated parameters
Parameter  Exact     Estimated  Abs Error   Rel Error  Std Deviation
k1:        0.036031  0.036011   2.0×10^-5   0.056%     9.6×10^-6
k2:        0.15961   0.15967    6.8×10^-5   0.043%     1.7×10^-4
k3:        6.8032    7.0390     0.24        3.5%       0.13
k4:        1.8029    1.8560     0.053       2.9%       0.037
kc:        0.75669   0.76021    0.0035      0.46%      2.2×10^-3
kd:        7.1109    7.1073     0.0035      0.049%     4.6×10^-3
:          2.0627    2.0629     2.4×10^-4   0.012%     3.6×10^-4
dim(D) = 471 x 111
CPU time = 83 s, 8 iterations for variance
IPOPT = 9.63 CPU s for parameter estimation
-
42
Aspirin Synthesis Case
Typical profile (ASA) with estimated parameter values and profile bands corresponding to standard deviations
[Figure: c_ASA (mol/L) vs. time (min) over 0 to 200 min, with three zoomed insets showing the profile bands]
-
43
Recipe Optimization with Validated Model
Semi-Batch Polymer Process (Nie et al., 2013)
-
44
Semi-batch polyether polyol process
- Comprehensive population balance models for MWD properties
- Moment models implemented and compared
- Operating strategies validated in plant
-
45
Polyol Dynamic Process Validation
-
46
Process Recipe Optimization
-
47
Recipe Optimization Results
-
48
Optimal Constraint Profiles
-
49
Satisfaction of Product Specifications
-
50
Summary and Conclusions
Parameter Estimation and Model Discrimination with First Principle Models
Maximum Likelihood Formulations
- Normal measurement error distributions
Optimization-based tools
- Parameter estimation
- Statistical inference
- Probability shares
Challenging case studies
- Model discrimination with non-informative data
- Deconvolute spectral distributions
- Validation to ensure predictive optimization models
-
51
Extracting Reduced Hessian from IPOPT
- Interior point solvers do not form the reduced Hessian, but it can be extracted from the optimality conditions [1]
- KKT conditions at optimal solution: xj is the j-th column of the inverted reduced Hessian
- In IPOPT the KKT matrix is already factorized: one back-solve per column of the covariance
- If the dynamic system is linear with Gaussian noise, this reduces to the Kalman smoothing equations
1. Zavala, V. M.; Laird, C. D. & Biegler, L. T.; Journal of Process Control, 2008, 18, 876-884
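The back-solve idea can be checked on a small dense example, assuming NumPy. The sketch builds a KKT matrix from a hypothetical Hessian H and constraint Jacobian J, solves it with unit right-hand sides in the parameter positions, and compares the parameter block of the solution with the explicitly inverted reduced Hessian. It assumes the last `n_params` variables are the parameters and that the remaining block of J is square and nonsingular. It does not call IPOPT.

```python
import numpy as np

def inverse_reduced_hessian_by_backsolves(H, J, n_params):
    """Columns of the inverse reduced Hessian from solves with the KKT matrix."""
    m, n = J.shape
    K = np.block([[H, J.T], [J, np.zeros((m, m))]])
    # One right-hand side per parameter: a unit vector in that parameter's row.
    rhs = np.zeros((n + m, n_params))
    rhs[n - n_params:n, :] = np.eye(n_params)
    solution = np.linalg.solve(K, rhs)   # a solver would reuse its factorization here
    return solution[n - n_params:n, :]

def inverse_reduced_hessian_direct(H, J, n_params):
    """Reference value: invert Z^T H Z with the coordinate null-space basis Z."""
    m, n = J.shape
    J_states = J[:, :n - n_params]       # square block (m == n - n_params assumed)
    J_params = J[:, n - n_params:]
    Z = np.vstack([-np.linalg.solve(J_states, J_params), np.eye(n_params)])
    return np.linalg.inv(Z.T @ H @ Z)

# Hypothetical problem: 3 states, 2 parameters, 3 equality constraints.
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
H = M @ M.T + 5.0 * np.eye(5)            # symmetric positive definite Hessian
J = rng.normal(size=(3, 5))
J[:, :3] += 3.0 * np.eye(3)              # keep the state block well conditioned
cov_backsolve = inverse_reduced_hessian_by_backsolves(H, J, n_params=2)
cov_direct = inverse_reduced_hessian_direct(H, J, n_params=2)
```

The two results agree because any solution of the KKT system with a zero constraint residual lies in the null space of J, so the parameter block solves the reduced system directly.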
-
52
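Apply Collocation on Finite Elements: NLP
\[ \min \sum_{i=1}^{ntp} \sum_{j=1}^{nwp} \frac{\bigl( D_{i,j} - \sum_{k=1}^{nc} c_k(t_i) s_k(\lambda_j) \bigr)^2}{\delta^2} + \sum_{i=1}^{ntp} \sum_{k=1}^{nc} \frac{\bigl( c_k(t_i) - z_k(t_i) \bigr)^2}{\sigma_k^2} \]
\[ \text{s.t.} \quad \sum_{m=0}^{K} \dot{\ell}_m(\tau)\, z_{jm} - h_j f(z_{jm}, y_{jm}, \theta) = 0, \quad j = 1..ne \]
\[ g(z_{jm}, y_{jm}) = 0, \quad j = 1..ne,\ m = 1..K \]
[Equation: state values z(t_i) at the measurement times, interpolated from the collocation values within each finite element of length h]
\[ C \ge 0, \quad S \ge 0 \]
How to get the variances σ_k² and δ²?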
-
53
Posterior Probability Share
Choose from candidate models by Bayes' theorem
Posterior probability [6]: combines the prior, a penalty for the number of parameters, and a penalty for the accumulated squared errors
Normalized posterior probability share
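A sketch of how such shares could be computed is given below, under an assumed form of the posterior: prior times 2^(-p/2) for a model with p parameters times exp(-S / (2 σ²)) for an accumulated squared error S with known variance σ². This particular form is an assumption consistent with the three factors named on the slide. The formula cited as [6] may differ.

```python
import math

def posterior_probability_shares(sse, n_params, sigma2, priors=None):
    """Normalized posterior shares for candidate models (assumed known variance).

    sse: accumulated squared errors S_j, n_params: parameter counts p_j,
    sigma2: measurement variance, priors: prior probabilities (uniform if None).
    """
    n_models = len(sse)
    if priors is None:
        priors = [1.0 / n_models] * n_models
    # Work in logs to avoid underflow when S_j / sigma2 is large.
    log_posterior = [math.log(prior) - 0.5 * p * math.log(2.0) - s / (2.0 * sigma2)
                     for prior, p, s in zip(priors, n_params, sse)]
    shift = max(log_posterior)
    weights = [math.exp(v - shift) for v in log_posterior]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical comparison: equal fit quality, different parameter counts.
shares = posterior_probability_shares(sse=[1.0, 1.0], n_params=[2, 4], sigma2=1.0)
```

With equal squared errors the model with fewer parameters receives the larger share, which is the Occam's razor trade-off mentioned earlier in the talk.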
-
54
Full model estimation results
Estimation results of the full model by EVM
- Large reliability factors of parameters D and F
- Parameter UA is estimated at its upper bound
-
55
Full model estimation results
Full model: unscaled results
[Matrices: inverse reduced Hessian HR-1, its eigenvectors, and its eigenvalues]
*Order of eigenvalue/eigenvector is the same as parameter order in the result table
-
Selected model estimation results
56
Model 4
- Fixed parameters D, F and UA
- Nonlinear heat transfer coefficient
- Large reliability factors are avoided
- Estimability of kinetic parameters is enhanced, with even smaller variances
-
57
Selected model estimation results
Model 4: unscaled results
[Matrices: inverse reduced Hessian HR-1 (5 x 5) and its eigenvectors]
Eigenvalues of inverse reduced Hessian HR-1: diag(2.58E-07, 0.84, 1.14E-04, 2.86E-02, 1.78E+07)
-
Dynamic Optimization Approaches
DAE Optimization Problem
Multiple Shooting
Embeds DAE Solvers/Sensitivity; Handles instabilities
Single Shooting
Hasdorff (1977), Sullivan (1977), Vassiliadis (1994); Discretize controls
Simultaneous Collocation (Direct Transcription)
Large/Sparse NLP - Betts; B
Apply a NLP solver
Efficient for constrained problems
Simultaneous Approach
Larger NLP
Discretize state, control variables
Variational Approach
Pontryagin et al. (1956)
Bock and coworkers
Take Full Advantage of Open Structure
- Many Degrees of Freedom
- Periodic Boundary Conditions
- Multi-stage Formulations
-
59
Reduced Hessian and Covariance
We can show that the inverse of the reduced Hessian is the smoothed covariance [1]
where Z is the null space basis of the constraint Jacobian
Changing variables for simplicity
1. Pirnay, Lopez-Negrete, & Biegler, Optimal Sensitivity with IPOPT, Math Prog Comp, 2014
-
60
Simultaneous Estimation: Comparison of WLS and EVM
[Figure: reactor temperature, scaled temperature vs. scaled time, prediction and data; left panel WLS, right panel EVM]
- Fitting by EVM is much better than by WLS
- Accumulated squared errors of EVM are reduced by 44% compared with WLS