Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering Cornell University South Florida Water District Morning Meeting Sept. 24, 3003


Page 1: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive

Models

Prof. Christine Shoemaker

Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman

School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering

Cornell University

South Florida Water District Morning Meeting

Sept. 24, 3003

Page 2: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Models Help Extract Information from Point Data to Processes Continuous in Space and Time

Point Data: from monitoring or experiments at a limited number of points in space and time

Model: describes temporal and spatial connections

Forecasts (with statistical representation)

Comparison of Alternative Management Options

Understanding Processes

Page 3: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Models Help Extract Information from Data for Multiple Outputs

Point Data: from monitoring or experiments at a limited number of points in space and time

Model: describes temporal and spatial connections

Model Outputs:

Forecasts (with statistical representation)

Comparison of Alternative Management Options

Understanding Processes

Page 4: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Steps in Modeling

• Calibration: selecting parameter values within acceptable limits to fit the data as well as possible.

• Validation: applying the model and calibrated parameters to an independent data set.

• Sensitivity Analysis: assessing the impact of changes in uncertain parameter values on model output.

• Uncertainty Analysis: assessing the range of model outcomes likely given uncertainty in parameters, model error, and exogenous factors like weather.

Page 5: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Computationally Expensive Models

• With existing methods, it is difficult to calibrate many parameters using only a limited number of simulations.

• Most existing uncertainty methods require thousands of simulations.

• We can only do a limited number of model simulations if models take hours to run.

• Our methods are designed to reduce the number of simulations required to do good calibration and sensitivity analysis.

Page 6: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Methods and Applications

• We will discuss a general methodology for calibration, sensitivity analysis and uncertainty analysis that can be applied to many types of computationally expensive models.

• We will present numerical results for two “real life” examples: a watershed and a groundwater remediation.

Page 7: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

1. Effective Use of Models and Observations Through Calibration, Sensitivity Analysis and Uncertainty Analysis

A description of the technical approach and “real life” applications, including:

1. Sensitivity Analysis for a large number of parameters, with application to a large watershed.

2. Optimization methods for calibration, with application to groundwater remediation based on field data.

3. Uncertainty Analysis based on a groundwater model.

Page 8: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Cannonsville Watershed

• Cannonsville Reservoir Basin: agricultural basin

• Supply of New York City drinking water

• To avoid an $8 billion water filtration plant, model analysis is needed to help manage phosphorus

1200 km²

The watershed is subject to economic constraints if there are phosphorus (P) violations of the TMDL.

Page 9: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Monitoring Stations

[Map of the watershed. Legend: subwatershed boundaries, rivers and streams, USGS flow gauges, climate stations, sediment monitoring stations. Labeled locations: Trout Creek, W. Br. Delaware R. @ Walton, Little Delaware R., W. Br. Delaware @ Delhi, Beerston, Town Brook. Scale bar in kilometers, with north arrow.]

There are over 20,000 data points for this watershed.

Page 10: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Questions

• Using all these data, can we develop a model that is a useful forecasting tool to assess the impact of weather and phosphorus management actions on future loading to the reservoir?

• What phosphorus management strategies, if any, should be undertaken?

Page 11: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

I. Methodology for Sensitivity Analysis of a Model with Many Parameters: Application to Cannonsville Basin

• Joint work with Jennifer Benaman (Cornell Ph.D. in Civil and Environmental Engineering, 2003)

• Funded by EPA Star Fellowship

Page 12: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Sensitivity Analysis with Many Parameters

• Sensitivity Analysis measures the change in model output associated with the change (perturbation) in model input (e.g. in parameter values).

• Purposes include:

– To help select which parameters should be adjusted in a calibration and which can be left at default values.

– This makes multivariate sensitivity and uncertainty analysis more feasible for computationally expensive models.
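As a minimal sketch of the idea on this slide, the following perturbs one parameter at a time and records the relative change in a single model output. The model, parameter names, ranges, and the 10% step are all hypothetical stand-ins chosen for illustration; they are not taken from the Cannonsville study.

```python
def one_at_a_time_sensitivity(model, base, ranges, fraction=0.1):
    """Relative change in model output when each parameter is perturbed alone.

    The step is `fraction` of each parameter's range (an assumed choice).
    """
    y0 = model(base)
    indices = {}
    for name, (lo, hi) in ranges.items():
        perturbed = dict(base)
        # Move this one parameter up by a fraction of its range, staying in bounds.
        perturbed[name] = min(hi, base[name] + fraction * (hi - lo))
        y1 = model(perturbed)
        indices[name] = abs(y1 - y0) / abs(y0) if y0 != 0 else abs(y1 - y0)
    return indices


def toy_model(p):
    # Hypothetical stand-in for an expensive simulation with one output.
    return 10.0 * p["a"] + 0.1 * p["b"] + 5.0


base = {"a": 1.0, "b": 1.0}
ranges = {"a": (0.0, 2.0), "b": (0.0, 2.0)}
sens = one_at_a_time_sensitivity(toy_model, base, ranges)
# Parameters sorted from most to least influential on this output.
ranking = sorted(sens, key=sens.get, reverse=True)
```

A ranking like this is what would let a modeler keep the low-ranked parameters at default values during calibration.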

Page 13: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Sensitivity Analysis with Many Parameters: Additional Purposes

– To prioritize additional data collection, and

– To estimate potential errors in model forecasts that could be due to parameter value errors.

• Sensitivity Analysis and calibration are difficult with a large number of parameters.

Page 14: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Questions

• Can we develop a sensitivity analysis method that is:

– robust (doesn’t depend strongly on our assumptions)?

– computationally efficient for a large number of parameters (hundreds)?

– able to consider many different model outputs simultaneously?

Page 15: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Application to Cannonsville Watershed

Steps: Choose Parameters; Establish Parameter Ranges; Choose Output Variables of Concern

• 160 parameters

– 35 basinwide

– 10 vary by land use (10 x 5 land uses)

– 7 vary by soil (7 x 10 soil types)

– 2 additional for corn and hay

– 1 additional for pasture

• Ranges obtained from literature, databases, and the SWAT User’s Manual


Page 17: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Output Variables of Concern

• Basinwide (average annual from 1994-1998)

– Surface water runoff

– Snowmelt

– Groundwater flow

– Evapotranspiration

– Sediment yield

• Location in-stream (monthly average over entire simulation)

– Flow @ Beerston

– Flow @ Trout Creek

– Flow @ Town Brook

– Flow @ Little Delaware River

– Sediment load @ Beerston

– Sediment load @ Town Brook

Steps: Choose Parameters; Establish Parameter Ranges; Choose Output Variables of Concern

Page 18: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Final Results

Percentage of times in the 'Top 20', by weighting method:

A = All Equal Weights; B = Focus on Beerston; C = Focus on Calibration; D = Focus on Basinwide Management

Parameter           A     B     C     D
APM BASIN         100   100   100   100
BIOMIX BASIN      100   100   100   100
CN2 CSIL          100   100   100   100
CN2 FRSD          100   100   100   100
CN2 PAST          100   100   100   100
RSDCO PAST        100   100   100   100
SLSUBBSN BASIN    100   100   100   100
SMFMN BASIN       100   100   100   100
T_BASE PAST       100   100   100   100
T_OPT PAST        100   100   100   100
USLEK NY129       100   100   100   100
ESCO NY129        100    75    75   100
SMTMP BASIN       100    75    75   100
LAT_SED BASIN     100    50   100   100
CN2 HAY            75    75    75    75
ESCO NY132         75    75    75    50
GWQMN BASIN        75    75    75    75
TIMP BASIN         75    50    75    75
BIO_MIN PAST       75    50    50    75
ROCK NY132         75    25    50    50
REVAPMN BASIN      50    50    50    75
ROCK NY129         50    25    50    25
USLEP CSIL         25    25    50    25
HVSTI CSIL         25    25    25    50
USLEC PAST         25    25    25    25
SMFMX BASIN        25     0     0    50
GSI PAST            0     0    25     0
ROCK NY026          0     0    25     0

Callouts on the slide: “These are in top 20 for ALL cases” (the rows with 100 in every column) and “These are in top 20 most of the time”.

Page 20: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Computational Issues

• We have a robust method for determining importance and sensitivity of parameters.

• An advantage is that the number of model simulations is independent of the number of output variables, sensitivity indices, or weighting factors considered in the combined sensitivity analysis. (Almost no extra computation is required to do many output variables, indices or weightings.)

• The number of simulations is simply the number required to do a single (non-robust) univariate sensitivity analysis multiplied by the number of perturbation methods (= 2 in this example).
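The point about simulation counts can be illustrated with a small sketch: each simulation stores every output, so rankings under different output weightings reuse the same runs. Everything here is assumed for illustration, including the toy model, the two perturbation rules, and the weights; the slides do not specify the authors' actual perturbation methods or sensitivity indices.

```python
calls = 0  # counts how many times the (expensive) model is run


def toy_model(p):
    """Hypothetical stand-in for an expensive simulation with several outputs."""
    global calls
    calls += 1
    return {"flow": 3.0 * p["a"] + p["b"], "sediment": p["b"] ** 2 + 0.5 * p["c"]}


base = {"a": 1.0, "b": 2.0, "c": 1.0}
ranges = {"a": (0.0, 2.0), "b": (0.0, 4.0), "c": (0.0, 2.0)}

# Two assumed perturbation methods: +10% of the base value, +10% of the range.
methods = {
    "pct_of_base": lambda n: base[n] * 1.10,
    "pct_of_range": lambda n: base[n] + 0.10 * (ranges[n][1] - ranges[n][0]),
}

y0 = toy_model(base)
effects = {}  # (method, parameter) -> relative change of every output
for m, perturb in methods.items():
    for name in base:
        p = dict(base)
        p[name] = perturb(name)
        y = toy_model(p)
        effects[(m, name)] = {k: abs(y[k] - y0[k]) / abs(y0[k]) for k in y0}


def rank(method, weights):
    """Rank parameters by a weighted sum of output changes; runs no new simulations."""
    score = {n: sum(w * effects[(method, n)][k] for k, w in weights.items())
             for n in base}
    return sorted(score, key=score.get, reverse=True)


equal = rank("pct_of_base", {"flow": 0.5, "sediment": 0.5})
flow_only = rank("pct_of_base", {"flow": 1.0, "sediment": 0.0})
```

In this sketch the model runs once for the baseline and once per parameter per perturbation method, and asking for another weighting afterwards costs no further runs.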

Page 21: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Next Steps

• Once the most important parameters have been identified we can extend the analysis to more detailed analyses including:

– Multivariate sensitivity analysis (changes in more than one parameter at a time)

– Uncertainty Analysis (e.g. GLUE)

• Both of these analyses are highly computationally demanding and can hence only be done with a small number of parameters.

• The (univariate) sensitivity analysis done here can identify the small number of parameters on which these analyses should be focused.

Page 22: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Questions

• Can we develop a sensitivity analysis method that is:

– robust (doesn’t depend strongly on our assumptions)?

– computationally efficient for a large number of parameters (hundreds)?

– able to consider many different model outputs simultaneously?

• Yes, the results for Cannonsville indicate this is possible with this methodology.

• Models with longer simulation times require more total simulation time or fewer parameters.

Page 23: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

II: Use of Response Surface Methods in Non-Convex Optimization, Calibration and Uncertainty Analysis

• Joint work with

– Pradeep Mugunthan (PhD Candidate in Civil and Environmental Engineering)

– Rommel Regis (Postdoctoral Fellow with PhD in Operations Research)

• Funded by three National Science Foundation (NSF) Projects

Page 24: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Computational Effort for Trial and Error (Manual) Calibration

• Assume that you have P parameters and you want to consider N levels of each.

• Then the total number of possible combinations of parameter values is N^P.

• So with 10 parameters, considering only 2 values each (a very crude evaluation), there are 2^10 = 1024 possible combinations, too many to evaluate all of them for a computationally expensive function.

• With 8 parameters, considering a more reasonable 10 values each gives 10^8 = 100 million possible combinations of parameters!

• With so many possibilities, it is hard to find good solutions by trial and error with few (e.g. 100) function evaluations.
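The counts on this slide follow from raising the number of levels to the power of the number of parameters. A short check, using only the figures quoted above (the function name is made up for this sketch):

```python
def n_combinations(levels, n_params):
    """Number of parameter sets when each of n_params takes `levels` values."""
    return levels ** n_params


crude = n_combinations(2, 10)    # 10 parameters, 2 values each
finer = n_combinations(10, 8)    # 8 parameters, 10 values each

# Share of the finer grid that 100 function evaluations could cover.
coverage = 100 / finer
```

With 100 evaluations, only one in a million of the finer grid's combinations would be tried, which is why a guided search is needed.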

Page 25: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Automatic Calibration

• We would like to find the set of parameter values (decision variables) such that

– the calibration error (objective function) is minimized

– subject to constraints on the allowable range of the parameter values.

This is an Optimization Problem.

It can be a global optimization problem.
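Written out, the calibration problem described on this slide has the following form. The notation is assumed for illustration: x is the vector of P parameters, f(x) is the calibration error, and the lower and upper limits define the allowable range of each parameter.

```latex
\min_{x \in \mathbb{R}^{P}} \; f(x)
\quad \text{subject to} \quad
x_k^{\min} \le x_k \le x_k^{\max}, \qquad k = 1, \dots, P
```

When f has several local minima inside these bounds, finding the best of them is the global optimization problem mentioned above.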

Page 26: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

NSF Project 1: Function Approximation Algorithms for Environment Analysis with Application to Bioremediation of Chlorinated Ethenes

• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”

• The project is funded by the NSF Environmental Engineering Program.

• The following slides will discuss the application of these concepts to uncertainty analysis.

Page 27: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

“Real World Problem”: Engineered Dechlorination by Injection of Hydrogen Donor and Extraction

We have developed a user-friendly transport model of engineered anaerobic degradation of chlorinated ethenes that models chemical and biological species and utilizes MT3D and RT3D.

This model is the application for the function approximation research.

Page 28: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Optimization

• Because our model is computationally expensive, we need to find a better way than trial and error to get a good calibration set of parameters.

• Optimization can be used to efficiently search for a “best” solution.

• We have developed optimization methods that are designed for computationally expensive functions.

Page 29: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Optimization

• Our goal is to find the minimum of f(x), where x ∈ D.

• We want to do very few evaluations of f(x) because it is “costly” to evaluate.

f(x) can be a measure of error between model prediction and observations.

x can be the parameter values.

Page 30: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Global versus Local Minima

[Figure: F(x) plotted against x (parameter value), with a local minimum and the global minimum marked.]

Many optimization methods only find one local minimum.

We want a method that finds the global minimum.

Page 31: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Experimental Design with Symmetric Latin Hypercube (SLHD)

• To fit the first function approximation we need to have evaluated the function at several points.

• We use a symmetric Latin Hypercube (SLHD) to pick these initial points.

• The number of points we evaluate in the SLHD is (d+1)(d+2)/2, where d is the number of parameters (decision variables).
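A minimal sketch of how such a design could be generated is shown below. It assumes the usual definition of a symmetric Latin hypercube (each level is used once per dimension, and the mirror image of every point through the center is also in the design). The function names and the cell-midpoint scaling are choices made for this sketch, not details taken from the slides.

```python
import random


def slhd_size(d):
    """Number of initial points quoted on the slide: (d+1)(d+2)/2."""
    return (d + 1) * (d + 2) // 2


def symmetric_latin_hypercube(n, d, rng):
    """Return n points in the unit cube [0, 1]^d forming a symmetric Latin hypercube."""
    levels = [[0] * d for _ in range(n)]
    half = n // 2
    for k in range(d):
        perm = list(range(1, half + 1))
        rng.shuffle(perm)
        for i, lev in enumerate(perm):
            # Randomly place either the level or its mirror in row i.
            if rng.random() < 0.5:
                lev = n + 1 - lev
            levels[i][k] = lev
            levels[n - 1 - i][k] = n + 1 - lev  # mirrored partner row
        if n % 2 == 1:
            levels[half][k] = (n + 1) // 2  # center point when n is odd
    # Map level v in 1..n to the midpoint of its cell in [0, 1].
    return [[(v - 0.5) / n for v in row] for row in levels]


d = 6
n = slhd_size(d)
design = symmetric_latin_hypercube(n, d, random.Random(0))
```

Each row of `design` would then be rescaled to the parameter ranges and passed to the costly model once.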

Page 32: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

One Dimensional Example of Experimental Design to Obtain Initial Function Approximation

[Figure: objective function f(x) (measure of error) plotted against x (parameter value, one-dimensional example), with the initial design points marked.]

Costly Function Evaluation (e.g. over 0.5 hour CPU time for one evaluation).

Page 33: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Function Approximation with Initial Points from Experimental Design

[Figure: f(x) plotted against x (parameters), with the function approximation drawn through the initial points.]

In real applications x is multidimensional since there are many parameters (e.g. 10).

Page 34: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Update in Function Approximation with New Evaluation

[Figure: f(x) plotted against x (parameter value), showing the updated function approximation and the point labeled “new”.]

Update done in each iteration for function approximation for each algorithm expert.

Function Approximation is a guess of the function value of f(x) for all x.
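The fit, evaluate, and update cycle shown in these figures can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the authors' algorithm: it uses a cubic radial basis function with a linear tail, a hypothetical cheap stand-in for the costly function, and a naive rule that simply evaluates the minimizer of the current approximation on a grid.

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def fit_rbf(xs, fs):
    """1-D cubic RBF interpolant with a linear tail through the points (xs, fs)."""
    n = len(xs)
    A = [[abs(xi - xj) ** 3 for xj in xs] + [1.0, xi] for xi in xs]
    A.append([1.0] * n + [0.0, 0.0])
    A.append(list(xs) + [0.0, 0.0])
    coef = solve(A, list(fs) + [0.0, 0.0])
    lam, c0, c1 = coef[:n], coef[n], coef[n + 1]
    return lambda x: sum(l * abs(x - xi) ** 3 for l, xi in zip(lam, xs)) + c0 + c1 * x


def costly(x):
    # Hypothetical stand-in; a real evaluation would be one long simulation.
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)


xs = [0.0, 0.25, 0.5, 0.75, 1.0]          # initial experimental design
fs = [costly(x) for x in xs]
grid = [i / 200 for i in range(201)]
for _ in range(5):
    s = fit_rbf(xs, fs)                    # refit with all points so far
    candidates = [g for g in grid if min(abs(g - x) for x in xs) > 1e-3]
    x_new = min(candidates, key=s)         # cheap search on the approximation
    xs.append(x_new)
    fs.append(costly(x_new))               # one new costly evaluation
best = min(fs)
```

Only `costly` is expensive in this loop. Searching the approximation `s` is cheap, which is why the approximation can be searched heavily while f itself is evaluated only a few times.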

Page 35: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Use of Derivatives

• We use the gradient-based methods only on the function approximations R(x) (for which accurate derivatives are inexpensive to compute).

• We do not try to compute gradients/derivatives for the underlying costly function f(x).

Page 36: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Our RBF Algorithm

• Our paper on the RBF optimization algorithm will appear soon in the Journal of Global Optimization.

• The following graphs show a related RBF method called “Our RBF” as well as an earlier RBF optimization method suggested by Gutmann (2000) in the Journal of Global Optimization, called “Gutmann RBF”.

Page 37: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Comparison of RBF Methods on a 14-dimensional Schoen Test Function (Average of 10 trials)

[Figure: objective function (y-axis labeled “mean of the best value in 30 runs”, ticks from 15 to 45) versus number of function evaluations (ticks from 120 to 300). Legend: ExpRBF-L, GutmannRBF, GreedyRBF. One curve is annotated “Our RBF”.]

Page 38: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Comparison of RBF Methods on a 12-dimensional Groundwater Aerobic Bioremediation Problem (a PDE system) (Average of 10 trials)

[Figure: objective function (y-axis labeled “mean of the best value in 10 runs”, ticks from 400 to 1100) versus number of function evaluations (ticks from 80 to 200). Legend: ExpRBF-L, GutmannRBF, GreedyRBF. One curve is annotated “Our RBF”.]

Page 39: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

The following results are from:

NSF Project 1: Function Approximation Algorithms for Environment Analysis with Application to Bioremediation of Chlorinated Ethenes

• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”

• The project is funded by the NSF Environmental Engineering Program.

Page 40: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Now a real costly function: DECHLOR, a Transport Model of Anaerobic Bioremediation of Chlorinated Ethenes

• This model was originally developed by Willis and Shoemaker based on kinetics equations by Fennell and Gossett.

• This model will be our “costly” function in the optimization.

• Model based on data from a field site in California.

Page 41: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Complex model: 18 species at each of thousands of nodes of finite difference model

[Diagram of the reaction network. Group labels: Donors; Chlorinated Ethenes (PCE, TCE, DCE, VC, Ethene). Other labels: Lactate, Butyrate, Propionate, Acetate, H2, Methane, Dechlorinator, Lac2Ace, Lac2Prop, But2Ace, Prop2Ace, Hyd2Meth.]

Page 42: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Example of Objective Function for Optimization of Chlorinated Ethene Model

$$SSE = \sum_{t=1}^{T} \sum_{i=1}^{I} \sum_{j=1}^{J} \left( Y^{o}_{tij} - Y^{s}_{tij} \right)^{2}$$

where SSE is the sum of squared errors between observed and simulated chlorinated ethenes,

$Y^{o}_{tij}$ is the observed molar concentration of species j at time t, location i (observation),

$Y^{s}_{tij}$ is the simulated molar concentration of species j at time t, location i (model),

t = 1 to T represents the time points at which measured data are available,

j = 1 to J represents PCE, TCE, DCE, VC and ethene, in that order,

i = 1 to I indexes the set of monitoring locations.

Page 43: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Algorithms Used for Comparison of Optimization Performance on Calibration

• Stochastic Greedy Algorithm

– Neighborhood defined to make search global

– Neighbors generated from triangular distribution around current solution. Moves only to a better solution.

• Evolutionary Algorithms

– Derandomized evolution strategy (DES) with lambda = 10, b1 = 1/n and b2 = 1/n^0.5 (Ostermeier et al. 1992)

– Binary or real genetic algorithm (GA), population size 10, one-point crossover, mutation probability 0.1, crossover probability 1

• RBF Function Approximation Algorithms

– RBF-Gutmann: radial basis function approach, with cycle length five, SLH space-filling design

– RBF-Cornell: radial basis function approach

• FMINCON

– derivative-based optimizer in Matlab with numerical derivatives

• 10 trials of 100 function evaluations were performed for heuristic and function approximation algorithms for comparison
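Of the comparison algorithms listed above, the stochastic greedy search is simple enough to sketch in a few lines. The version below is one reading of the slide's description: neighbors are drawn from a triangular distribution spanning the whole allowable range and peaked at the current solution, and the search moves only to better points. The objective function, bounds, and seed are hypothetical.

```python
import random


def stochastic_greedy(f, lower, upper, n_evals, rng):
    """Greedy random search with triangular neighbors; returns (x, f(x), history)."""
    x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    fx = f(x)
    history = [fx]  # best value found after each function evaluation
    for _ in range(n_evals - 1):
        # The neighbor may land anywhere in the range (global search) but is
        # most likely to fall near the current solution (the triangle's mode).
        cand = [rng.triangular(lo, hi, xi) for lo, hi, xi in zip(lower, upper, x)]
        fc = f(cand)
        if fc < fx:  # move only to a better solution
            x, fx = cand, fc
        history.append(fx)
    return x, fx, history


def toy_error(x):
    # Hypothetical stand-in for a calibration error surface.
    return sum((xi - 0.5) ** 2 for xi in x)


x_best, f_best, history = stochastic_greedy(
    toy_error, [0.0] * 3, [1.0] * 3, n_evals=100, rng=random.Random(0)
)
```

Repeating the call with different seeds gives the kind of multi-trial comparison described in the last bullet.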

Page 44: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Comparison of algorithms for NS as objective function on a hypothetical problem

[Figure: -(Average NS) (ticks from -1 to 19) versus number of function evaluations (ticks from 30 to 90). Legend: FMINCON, RBF-CORNELL, RBF-GUT, FMINCON+RBF, DES, RealGA, BinaryGA. One curve is annotated “ours”. Lower curve is better.]

Average is based on 10 trials. The best possible value for -NS is -1. 28 experimental design evaluations done.

Page 45: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Boxplot comparing best objective value (CNS) produced by the algorithms in each trial over 10 trials

[Boxplot figure. Annotations: “ours”, “average”, “outlier”.]

Page 46: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Conclusions

• Optimizing costly functions is typically done only once.

• The purpose of examining multiple trials is to see how well one is likely to do when solving the problem only once.

• Hence we want the method that has both the smallest mean objective function value and the smallest variance.

• Our RBF has both the smallest mean and the smallest variance.

• The second best method is Gutmann RBF, so RBF methods seem very good in general.

Page 48: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Alameda Field Data

• The next step was to work with a real field site.

• We obtained data from a DOD field site studied by a group (including Alleman, Morse, Gossett, and Fennell).

• Running the simulation model takes about three hours for one run of the chlorinated ethene model at this site because of the nonlinearities in the kinetics equations.

Page 49: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Site Layout

Page 50: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Range of objective values for SSE objective function at Alameda field site: mean, min and max are shown for each algorithm

[Figure: SSE (mM)^2 (ticks from 150000 to 650000) for each algorithm: DES, FA-Gutmann, FA-RS, FMINCON. Annotations: “ours”, “gradient”.]

Page 51: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Conclusions on RBF Optimization of Calibration

• Radial Basis Function Approximation Methods can be used effectively to find optimal solutions of costly functions.

• “Our RBF” performed substantially better than the previous RBF method by Gutmann on the difficult chlorinated ethene remediation problem, especially because our RBF is robust (small variance).

• Both Genetic algorithms and derivative-based search did very poorly.

• The two RBF methods did much better on the Alameda field data problem than other methods.

Page 52: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

However, 300 hours is a long time to wait!

Solution: Parallel Algorithms

• We would like to be able to speed up calculations for costly functions by using parallel computers.

• To get a good speed up on a parallel computer, you need an algorithm that parallelizes efficiently.

• We are developing such an algorithm through a second NSF grant (from Computer and Information Science Directorate).

Page 53: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

III: Uncertainty Analysis

• Modelers have discovered that there is often more than one set of parameters that gives an “adequate” fit to the data.

• One approach to assessing uncertainty associated with a model output is to look at the weighted mean and the variability of the output associated with all the sets of parameters that give an equally good fit.

Page 54: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

More than one parameter value might give acceptable goodness of fit

[Figure: f(x) plotted against x (parameters), with the region of acceptable fit marked “acceptable”.]

If we impose a “filter” and allow only the acceptable points, then only the black points are incorporated in the analysis.

Page 55: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Uncertainty Analysis: GLUE Approach

• GLUE is a methodology (by Beven and co-workers) used largely in watersheds (where computation times are not long).

Page 56: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Uncertainty Analysis via GLUE: Dots are Model Simulations of Parameter Combinations Chosen at Random (Two Parameter Example)

[Figure: scatter of dots in the plane of parameter 1 versus parameter 2. Legend: parameter combination that gives R² greater than 0.75; parameter combination that gives R² less than 0.75.]

Page 57: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

GLUE Methodology (used mostly in watershed modeling)

• Step 1: Select combinations of parameter values at random and simulate the model for each combination.

• Step 2: Compare goodness of fit (e.g. R²) of each model simulation with the data.

• Step 3: Simulate the model at acceptable points and weight the output to determine variability characteristics of model output (e.g. mean and variance of the amount of contamination remaining after N years).
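The three steps can be sketched as below. This is an illustrative outline only: the one-parameter toy model, the synthetic observations, the 0.75 threshold, and the choice to weight each accepted run by its R² are assumptions made for this sketch rather than details given on the slide.

```python
import random


def r_squared(obs, sim):
    """Coefficient of determination, 1 - SSE/SST."""
    mean = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    sst = sum((o - mean) ** 2 for o in obs)
    return 1.0 - sse / sst


def glue(simulate, fit, forecast, bounds, n_samples, threshold, rng):
    accepted = []
    for _ in range(n_samples):
        # Step 1: random parameter combination, then one simulation.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Step 2: goodness of fit against the data; keep only acceptable runs.
        score = fit(simulate(x))
        if score >= threshold:
            accepted.append((x, score))
    if not accepted:
        return accepted, None, None
    # Step 3: weighted mean and variance of the forecast over accepted runs.
    total = sum(s for _, s in accepted)
    weights = [s / total for _, s in accepted]
    outputs = [forecast(x) for x, _ in accepted]
    mean = sum(w * y for w, y in zip(weights, outputs))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, outputs))
    return accepted, mean, var


times = [1, 2, 3, 4, 5]
observed = [2.0 * t for t in times]        # synthetic data, true slope 2

accepted, mean, var = glue(
    simulate=lambda x: [x[0] * t for t in times],
    fit=lambda sim: r_squared(observed, sim),
    forecast=lambda x: x[0] * 10,          # model output at a later time
    bounds=[(0.0, 4.0)],
    n_samples=500,
    threshold=0.75,
    rng=random.Random(1),
)
```

Every sample costs one full simulation whether or not it passes the filter, so the number of accepted runs is only a fraction of the number of model runs.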

Page 58: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Problems with GLUE Methodology

• We applied GLUE to the Cannonsville Watershed SWAT model predictions for sediment (a very hard quantity to model).

• We did 20,000 Monte Carlo runs (which took about three weeks of computer time).

• Of the 20,000 runs, only two were within the allowable R².

• This does not adequately characterize uncertainty, and it is not computationally feasible to make more runs.

Page 59: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

• For computationally expensive models like our groundwater problem or your Everglades problem, it is not feasible to run the model 20,000 times!

Page 60: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

• Hence GLUE has the problem that it finds very few samples within an acceptable level (filter) if the filter is fairly stringent.

Page 61: Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive Models Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel

Groundwater Example Used for Numerical Comparison with GLUE

• 2-D confined aquifer contaminated with chlorinated ethenes.

• Same PDE equations as earlier field case

• 400 m long, 100 m wide

• Modeled using a coarse 10 m x 10 m finite difference grid

– Simulation time for the 6-month calibration period was approximately ¾ minute on a Pentium 4® 3 GHz computer

– Typical simulation time for long-term forecast scenarios is of the order of several hours to days
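As a rough sizing of this example, the grid dimensions and calibration-run time above give the cell count and the wall-clock cost of the evaluation budgets compared later in the talk (200, 500, 1000 and 2000 runs). This sketch assumes the grid tiles the domain exactly and that runs are executed one after another.

```python
length_m, width_m, cell_m = 400, 100, 10
n_cells = (length_m // cell_m) * (width_m // cell_m)  # 40 columns x 10 rows

minutes_per_run = 0.75  # "approximately 3/4 minute" per 6-month calibration run
hours_by_budget = {n: n * minutes_per_run / 60 for n in (200, 500, 1000, 2000)}

print(f"grid cells: {n_cells}")
for n_runs, hours in hours_by_budget.items():
    print(f"{n_runs:>5} runs: {hours:.2f} h")
```

The same budgets applied to forecast scenarios that take hours to days per run would be far more costly, which is the case the methods in this talk target.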


Calibration Problem

• Calibration of 3 parameters was considered: 2 biological parameters and 1 biokinetic parameter

• Synthetic observations were generated for a period of 6 months using a known set of parameters

• Optimal calibration was attempted using a response surface (RS) optimization method (Regis and Shoemaker, 2004)

• GLUE-based calibration/uncertainty assessment was also performed for comparison


Output Definition

• Output: The total moles of toxic compounds (chlorinated ethenes) remaining in the aquifer at the final time period. (This cannot be measured but can be estimated with the model.)

• Uncertainty in the output was analyzed using GLUE and RS-based methods


Goodness-of-fit Measure

• Nash-Sutcliffe Efficiency Measure (Nash and Sutcliffe, 1970)

• The optimization algorithm was set up to minimize CNS = 1 − NS, so that a CNS of zero is best

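$$
NS = 1 - \frac{1}{S} \sum_{i=1}^{S} \frac{\sum_{t}\sum_{j} \left( C^{sim}_{i,j,t} - C^{obs}_{i,j,t} \right)^{2}}{\sum_{t}\sum_{j} \left( C^{obs}_{i,j,t} - C^{av}_{i} \right)^{2}}
$$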
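A minimal sketch of this goodness-of-fit measure follows. It assumes NS is averaged over S species, with each species' squared error normalized by the spread of its own observations about their mean; the nested-list layout (one list per species, flattened over locations and times) and the function names are assumptions made for the example.

```python
def nash_sutcliffe(sim, obs):
    # sim[i][n], obs[i][n]: species i, sample n (locations and times flattened).
    total = 0.0
    for sim_i, obs_i in zip(sim, obs):
        avg_i = sum(obs_i) / len(obs_i)                        # mean observation for species i
        num = sum((o - s) ** 2 for o, s in zip(obs_i, sim_i))  # squared model error
        den = sum((o - avg_i) ** 2 for o in obs_i)             # spread of the observations
        total += num / den
    return 1.0 - total / len(obs)

def cns(sim, obs):
    # Quantity to minimize: zero means a perfect fit.
    return 1.0 - nash_sutcliffe(sim, obs)

obs = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
perfect = [row[:] for row in obs]
mean_only = [[sum(row) / len(row)] * len(row) for row in obs]

cns_perfect = cns(perfect, obs)      # simulation equals observations
cns_mean_only = cns(mean_only, obs)  # simulation is just the observed mean
```

A simulation that does no better than the mean of the observations gives NS = 0 and CNS = 1, which puts the filter values of 0.01 and 0.1 used in the talk in context.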


Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Our Method 1 with 200 function evaluations


Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

GLUE 1 with 500 function evaluations


Uncertainty Estimates for Total Moles of Chlorinated Ethenes

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Marker is the mean; range is 99% of data

This is the true answer


Uncertainty Estimates for Total Moles of Chlorinated Ethenes

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Number of points after applying filter

RS200 uses 200 function evaluations. G200 found 0 solutions (none) for this filter. G500 found only 5 solutions.

This is the true answer


Uncertainty Estimates for Total Moles of Chlorinated Ethenes

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Number of points after applying filter

Marker is the mean; range is 99% of data

This is the true answer

The mean estimate is almost perfect for our RS method and is far off for the GLUE method, even with 250% as many points evaluated!


Uncertainty Estimates for Total Moles of Chlorinated Ethenes

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Number of points after applying filter

Even with 2000 function evaluations, GLUE has a much worse mean than our RS method with only 1/10 as many function evaluations.


Our Method 2 (RSG)

• Step 1: Same as in Method 1

• Step 2: Construct a function approximation surface of the output

• Step 3: Make a large number of samples from the function approximation. Do further function evaluations if the function approximation is negative, and refit the function approximation.

• Step 4: Filter out points that are not acceptable and compute statistics

• Determine statistical characteristics of model output (e.g. mean and variance of amount of contamination remaining after N years) at all acceptable points.
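The steps above can be sketched as follows. This is a simplified illustration, not the method from the talk: `expensive_model` is a hypothetical two-parameter stand-in for the simulator, the costly evaluations are drawn at random rather than taken from the RS calibration run, and inverse-distance weighting is swapped in for the response surface because it needs no external libraries. The sample count is also smaller than the 20,000 used in the slides, to keep the example quick.

```python
import math
import random
import statistics

def expensive_model(theta):
    # Hypothetical stand-in for the simulator: returns (CNS, output).
    a, b = theta
    return (a - 0.3) ** 2 + (b - 0.7) ** 2, 140.0 + 10.0 * a + 5.0 * b

def idw(theta, points, values, power=4.0):
    # Inverse-distance-weighted interpolation (assumed substitute for the RS surface).
    num = den = 0.0
    for p, v in zip(points, values):
        d = math.dist(theta, p)
        if d == 0.0:
            return v
        w = d ** -power
        num += w * v
        den += w
    return num / den

def rsg(n_evals, n_samples, cns_filter, rng):
    # Step 1: a small number of costly function evaluations.
    points = [(rng.random(), rng.random()) for _ in range(n_evals)]
    results = [expensive_model(p) for p in points]
    cns_vals = [r[0] for r in results]
    out_vals = [r[1] for r in results]
    accepted = []
    for _ in range(n_samples):
        # Steps 2-3: sample the cheap approximation instead of the model.
        theta = (rng.random(), rng.random())
        cns_hat = idw(theta, points, cns_vals)
        if cns_hat < 0.0:
            # A negative approximation triggers a real evaluation and a refit.
            # IDW never leaves the range of its data, so this cannot trigger
            # here; it is kept to mirror Step 3.
            c, y = expensive_model(theta)
            points.append(theta)
            cns_vals.append(c)
            out_vals.append(y)
            cns_hat = c
        # Step 4: keep only acceptable points.
        if cns_hat <= cns_filter:
            accepted.append(idw(theta, points, out_vals))
    return accepted

accepted = rsg(n_evals=200, n_samples=2000, cns_filter=0.1, rng=random.Random(1))
mean_output = statistics.mean(accepted)
var_output = statistics.pvariance(accepted)
```

The costly model is called only in Step 1 (and in the rare refit), so the number of samples drawn from the approximation can be made very large at little extra cost.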


Uncertainty Estimates for Total Moles of Chlorinated Ethenes

Bounds obtained using a filter of 0.01 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 141.00 to 147.00) for G500, G1000, G2000, RS200, RSG20k and TRUE. Numbers shown on the plot: 5, 6, 12, 126, 35.]

Number of points after applying filter

Our Method 2 with 200 function evaluations and 20,000 samples from the response surface


Difference Between Method 1 and Method 2

The uncertainty analysis in Method 1 is based only on actual function evaluations.

The uncertainty analysis in Method 2 is based on a very large number of samples from the function approximation.


Comments on Results

• A strict filter produces very few points with GLUE

– Even after 2000 function evaluations, only 12 points remain after filtering

• Our RS method produces the tightest bounds and also provides more points for uncertainty assessment with only 200 function evaluations

– Limited with respect to sample independence

• The RSG provides an improvement over GLUE

– Independent samples for uncertainty assessment

– A larger sample size for a tight filter
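To put the GLUE counts in perspective, the short calculation below converts the numbers quoted in the slides for the 0.01 filter (0 of 200, 5 of 500, and 12 of 2000 evaluations) into percentages. Only the counts stated in the text are used; the dictionary layout is just a convenience for the example.

```python
# GLUE evaluations -> points remaining after the CNS <= 0.01 filter (from the slides)
glue_counts = {200: 0, 500: 5, 2000: 12}

percent_kept = {n: 100.0 * kept / n for n, kept in glue_counts.items()}

for n, pct in percent_kept.items():
    print(f"G{n}: {glue_counts[n]} of {n} points kept ({pct:.1f}%)")
```

Roughly 99% or more of the GLUE model runs contribute nothing to the uncertainty estimate at this filter level.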


Effect of Relaxing Filter – CNS of 0.1

Empirical 98% Bounds obtained using a filter of 0.1 for CNS

[Figure: bounds on total moles of chlorinated ethenes (y-axis, 135.00 to 165.00) for RS200, G200, G500, G1000, G2000, RSG20k and TRUE. Numbers shown on the plot: 90, 12, 44, 84, 167, 1542.]


Percentage of Points for Different Filters

[Figure: two charts, each titled "Comparison of percentage of points after filtering". y-axis: percentage of points after filtering; x-axis: CNS filter (0.01, 0.1, 0.3, 1, inf in the first chart; 0.01, 0.1, 0.3 in the second). Series: RS200, G200, G500, G1000, G2000, RSG20k.]


Advantages of Method 2

• The samples are independent

• Reuse information from calibration

• Computationally cheap

– Uses only the same number of costly function evaluations as the regular RS optimization method (200 in these examples)

– Can obtain goodness-of-fit and output values for many thousands of points


Summary

• Models can help us use data taken at a small scale and at discrete time points to understand and manage environmental processes over large spatial areas and time frames.

• Development of computationally efficient methods for automatic calibration, sensitivity analysis and uncertainty analysis is very important.


New Project 2: Parallel Optimization Algorithms

• Funded by the Computer Science (CISE) Directorate at NSF

• The method is general and can be used for a wide range of problems including other engineering systems in addition to environmental systems.

• This research is underway.


2. How are calibration, sensitivity analysis and uncertainty analysis used in environmental analyses?


3. What are the alternatives to sensitivity analysis and uncertainty analysis?


How do we address the uncertainties that are not directly related to parameter uncertainty, such as data uncertainty?


My NSF Projects

• NSF-Environmental Engineering: applications of methods to watershed and groundwater

• NSF-Advanced Computing: development of parallel algorithms for function approximation optimization

• NSF-Statistics: development of an integration of Bayesian statistical methods with function approximation optimization for computationally expensive functions.

• All this previously funded research can be useful in applications to the Everglades.
