Complements on Surrogate Based Optimization for Engineering Design
Lecture Series AVT-167: Strategies for Optimization and Automated Design of Gas Turbine Engines
Ingrid Lepot
Cenaero
October 26, 2009
Surrogate based optimization
What is a surrogate model ?
Low-cost replacement of the original function for a wide variety of purposes
Educated guess as to what an engineering function might look like, based on a few points in space where one can afford to measure the function values
Basic idea: Avoid the temptation to invest one’s computation budget in answering the question at hand and, instead, invest in developing fast mathematical approximations to the long-running computer codes
⇒ exploration of trade-offs and gain of insight
AVT167 - SBO complements (Lecture 4) copyright@cenaero 2009 2 / 26
Surrogate based optimization typical workflow
Some important topics
In a surrogate model approach, the devil’s in the details:
What points do you sample in building the approximation ?
What approximation method do you employ ?
How do you manage the approximation model(s) ?
How do you use the approximation to suggest new, improved designs ?
How do you use the approximations to explore tradeoffs between objectives ?
(What to do if your simulation has numerical “noise” in it ?)
Sampling the design space
Design of Experiments (DoE)
Classical DoE techniques (e.g. full factorial, fractional factorial, Central Composite, Box-Behnken, ...)
Optimal DoE techniques (e.g. Taguchi methods, ...)
Space-filling techniques (random and quasi-random sequences, Latin Hypercube Sampling, ...), appropriate for computer experiments and when there is no a priori knowledge of the considered cost functions
Adaptive or so-called capture/recapture techniques, which incorporate function knowledge
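As an illustration of the space-filling family, here is a minimal sketch of Latin Hypercube Sampling on the unit hypercube. The function name `latin_hypercube` and the sample sizes are assumptions made for this example, not part of the lecture; each variable's range is cut into as many equal strata as there are samples, and every stratum is used exactly once per variable.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng=None):
    """Return an (n_samples, n_dims) Latin hypercube sample in [0, 1)^n_dims."""
    rng = np.random.default_rng(rng)
    # One random point inside each of the n_samples equal-width strata, per dimension.
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the strata independently in each dimension to decorrelate the columns.
    for j in range(n_dims):
        u[:, j] = rng.permutation(u[:, j])
    return u

# Hypothetical usage: 10 designs in a 3-variable design space.
samples = latin_hypercube(10, 3, rng=0)
```

The unit-cube samples would then be rescaled to the actual bounds of each design variable.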
Surrogate models
Two main categories of low-fidelity surrogate models:
Data-fitting models
- non-physics-based approximations
- typically involve interpolation or regression of a set of data generated from the original expensive model
- Global models
  - provide information about the global behaviour of the system
- Local models
  - construct a local model around the current design point
  - used in local search to move in a downhill direction
- characterized by the number of data points used in the fit
  - local approximations use data from a single point
  - multipoint approximations use a small number of data points
  - global approximations use a set of data points distributed over the whole design space
Surrogate models
Physics-based surrogate models
- Hierarchical models
  - also known as multi-fidelity, variable-fidelity or variable-complexity models
  - use corrected results from a low-fidelity model (e.g. coarser mesh discretization, looser convergence tolerances, simpler model that neglects some physics: Navier-Stokes vs. Euler equations, ...) as an approximation to the results of a high-fidelity model
- Reduced-order models
  - models with fewer unknowns than the original high-fidelity model
  - generated directly from a high-fidelity model through the use of a reduced basis (modal analysis or Proper Orthogonal Decomposition) and projection of the original high-dimensional system down to a small number of generalized coordinates
  - do not require multiple models of varying fidelity
Physics-based surrogates - POD techniques
Proper Orthogonal Decomposition (POD)
Also known as Karhunen-Loève Decomposition and Principal Component Analysis
Standard tool in data analysis to reduce a large, complex data set to a lower-dimensional one
⇒ identifies the most meaningful basis, removes as much redundancy as possible, and gives a compact representation
⇒ reveals the sometimes hidden, simple underlying structures in complex data sets
Widely used in CFD (low-dimensional description of turbulent flows), image processing, signal analysis, data compression, etc.
POD techniques
An efficient method for computing PODs for large-dimensional problems is the method of snapshots (Sirovich).
Consider a computationally expensive simulation $S(x) \in \mathbb{R}^n$, depending on $n_p$ parameters $(x_1, \ldots, x_{n_p})$.
Generate observations or “snapshots” $\{s_k\}_{k=1,\ldots,m}$ of $S$ at $m$ locations in the design space.
Construct the snapshot deviation matrix as $D = \big( (s_1 - \bar{s}) \;\cdots\; (s_m - \bar{s}) \big)$, where $\bar{s}$ is the mean vector.
Compute the Singular Value Decomposition (SVD) of $D$:
$$D = U \Sigma V^* = U \begin{pmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_r \end{pmatrix} V^*$$
where $\sigma_1 \ge \cdots \ge \sigma_r > 0$, $r = \operatorname{rank}(D)$.
POD techniques
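Then
$$s_k = \bar{s} + \sum_{i=1}^{r} \alpha_i^{(k)} \Phi_i \qquad (k = 1, \ldots, m)$$
where the $i$-th mode is $\Phi_i = U(:, i)$ and $\bar{s}$ is the mean snapshot vector.
The most important part of the “energy” contribution is concentrated in the first modes.
POD truncated approximation ($p < r$):
$$s_k \approx \bar{s} + \sum_{i=1}^{p} \alpha_i^{(k)} \Phi_i \qquad (k = 1, \ldots, m)$$
POD basis = optimal basis: for a given number of modes $p$, it captures more of the snapshot “energy” than any other orthonormal basis of the same size.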
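The snapshot procedure can be sketched in a few lines of NumPy. The snapshot family used here (sine profiles indexed by one parameter) is a made-up stand-in for an expensive simulation, and the names `pod_basis`, `S` and `params` are assumptions of this example. The coefficients are obtained by projecting each deviation vector onto the retained modes.

```python
import numpy as np

def pod_basis(snapshots, p):
    """POD of an (n, m) snapshot matrix (one snapshot per column), truncated to p modes."""
    s_mean = snapshots.mean(axis=1, keepdims=True)      # mean snapshot, shape (n, 1)
    D = snapshots - s_mean                              # snapshot deviation matrix
    U, sigma, _ = np.linalg.svd(D, full_matrices=False)
    modes = U[:, :p]                                    # Phi_i = U[:, i]
    coeffs = modes.T @ D                                # alpha_i^(k) = Phi_i^T (s_k - mean)
    return s_mean, modes, coeffs, sigma

# Hypothetical snapshots: n = 200 field values at m = 12 parameter values.
t = np.linspace(0.0, 1.0, 200)
params = np.linspace(0.5, 2.0, 12)
S = np.column_stack([np.sin(np.pi * a * t) for a in params])

s_mean, modes, coeffs, sigma = pod_basis(S, p=4)
S_approx = s_mean + modes @ coeffs                      # truncated reconstruction
energy = np.cumsum(sigma**2) / np.sum(sigma**2)         # cumulative "energy" per mode count
```

The squared Frobenius error of the truncated reconstruction equals the sum of the discarded squared singular values, which is one way to choose `p` from the `energy` curve.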
Non intrusive POD surrogate
Derivation of low-order models by combining
POD techniques
- to perform the space reduction of the model
- to obtain the basis functions for the low-order model
Generic data-fitting techniques like RBF or Kriging
- to perform the low-dimensional reconstruction in the design space
- to approximate the POD coefficients for cases not included in the snapshot set, i.e. to express the POD coefficients directly as functions of the design variables
For any $x$, one can estimate $S(x)$:
$$x \;\xrightarrow[\text{Kriging or RBF}]{\text{data-fitting}}\; \big(\alpha_i^{(x)}\big)_{i=1,\ldots,p} \;\xrightarrow[\text{POD}]{\text{reconstruction}}\; S(x)$$
Does not require an intrusive or code-specific implementation
- ... but is only as good as the observed data set used for its training!
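A minimal sketch of this two-step chain follows, assuming a Gaussian RBF interpolant for the POD coefficients. The kernel choice, the shape parameter `eps`, the function names and the sine-profile snapshots are all assumptions of this example; Kriging could replace the RBF step without changing the structure.

```python
import numpy as np

def gaussian_kernel(A, B, eps):
    """Gaussian RBF kernel matrix between point sets A (a, n_p) and B (b, n_p)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-eps * d2)

def build_pod_rbf_surrogate(X, snapshots, p, eps=20.0):
    """X: (m, n_p) design points, snapshots: (n, m). Returns predict(x) -> (n,) field."""
    s_mean = snapshots.mean(axis=1)
    D = snapshots - s_mean[:, None]
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    modes = U[:, :p]                                    # POD basis
    alpha = (modes.T @ D).T                             # (m, p) coefficients at the snapshots
    # Data-fitting step: interpolate each POD coefficient over the design space.
    weights = np.linalg.solve(gaussian_kernel(X, X, eps), alpha)

    def predict(x):
        x = np.asarray(x, dtype=float).reshape(1, -1)
        alpha_x = gaussian_kernel(x, X, eps) @ weights  # estimated coefficients at x
        return s_mean + modes @ alpha_x[0]              # POD reconstruction
    return predict

# Hypothetical one-parameter problem standing in for an expensive code.
t = np.linspace(0.0, 1.0, 200)
params = np.linspace(0.5, 2.0, 12)
S = np.column_stack([np.sin(np.pi * a * t) for a in params])

predict = build_pod_rbf_surrogate(params[:, None], S, p=5)
s_new = predict([1.23])    # estimated field at a parameter value outside the snapshot set
```

The simulation code is only called to produce the snapshots, so nothing inside the solver has to be modified.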
Illustration of POD approach
Optimal rotary control of the cylinder wake using POD Reduced Order Model
Model Assessment
The accuracy of the metamodel depends on the number and location of sample points in the design space.
The leave-one-out (LOO) procedure is a way to estimate the accuracy of a metamodel without the need for creating extra data for validation.
LOO procedure
Create a data set with k sample points.
k − 1 samples are used to build a metamodel and the i-th sample is left out of the fitting process.
The output value at the i-th sample is estimated with the metamodel.
- k − 1 samples → training data
- i-th sample → validation data
Repeat for all the k sample points (each sample point is used once as the validation data).
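The LOO loop can be sketched as follows. A cubic least-squares polynomial is used as a stand-in metamodel purely for illustration (the lecture does not prescribe one), and the names `loo_errors`, `fit_cubic` and `predict_cubic` are assumptions of this example.

```python
import numpy as np

def loo_errors(x, y, fit, predict):
    """Leave-one-out prediction errors for a metamodel given by fit/predict callables."""
    k = len(x)
    errors = np.empty(k)
    for i in range(k):
        keep = np.arange(k) != i                 # k - 1 training samples
        model = fit(x[keep], y[keep])
        errors[i] = predict(model, x[i]) - y[i]  # i-th sample is the validation point
    return errors

def fit_cubic(x, y):
    return np.polyfit(x, y, deg=3)

def predict_cubic(model, x):
    return np.polyval(model, x)

# Hypothetical data set: 20 samples of a 1-D function.
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(3.0 * x)
errors = loo_errors(x, y, fit_cubic, predict_cubic)
rmse = np.sqrt(np.mean(errors**2))               # single-number accuracy estimate
```

Plotting the LOO predictions against the true values is a common companion to the scalar error.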
Model Assessment
[Figure: model assessment, panels “20 samples” and “50 samples”]
Surrogate model(s) management
Key issues
Offline (i.e. a priori trained) / online (i.e. adaptively improved) model management → search infill criteria
Global/local model management (move limit strategies, trust region techniques, ...)
Holy Grail: optimal exploitation/exploration balance
Search infill criteria
Enhance the accuracy of the model using further function calls: infill or update points
Online model management
Improve the accuracy only in the region of the optimum predicted by the surrogate → local exploitation
- quickly converges to an optimum value
- possibly gets stuck at a local optimum
Ability to search away from the current optimum and explore other regions → global exploration
Instead of either exploiting or exploring the surrogate model, use infill criteria which balance these options.
Search infill criteria
Kriging provides not only a predictor, y(x), but also a measure of the possible error in the predictor, s(x).
There are 2 “zones” where it is desirable to add new sample points:
where the model is minimized → min y(x)
where there is a significant error in the prediction → max s(x)
⇒ Updating approaches based on explicit measures of uncertainty
Search infill criteria
The simplest way of balancing exploitation of y(x) and exploration using s(x) is to minimize the lower confidence bound (LCB) function
LCB(x) = y(x) − ρ s(x)
small ρ leads rapidly to an optimum, but possibly only a local one
large ρ explores the potential regions of improvement, in which the uncertainty is high
⇒ It is difficult to choose the user-defined parameter ρ to obtain a good balance between exploration and exploitation.
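The effect of ρ can be seen on a handful of candidate points. The predictor and error values below are made-up numbers standing in for Kriging outputs, and `lower_confidence_bound` is a name chosen for this sketch.

```python
import numpy as np

def lower_confidence_bound(y_hat, s, rho):
    """LCB(x) = y(x) - rho * s(x), evaluated on arrays of candidate points."""
    return np.asarray(y_hat) - rho * np.asarray(s)

# Hypothetical surrogate outputs at four candidate designs.
y_hat = np.array([1.0, 0.4, 0.6, 0.9])    # predicted objective
s = np.array([0.05, 0.02, 0.30, 0.80])    # predicted error

exploit = int(np.argmin(lower_confidence_bound(y_hat, s, rho=0.0)))  # pure exploitation
explore = int(np.argmin(lower_confidence_bound(y_hat, s, rho=2.0)))  # favours uncertainty
```

With ρ = 0 the candidate with the lowest prediction is selected; with ρ = 2 the selection moves to the candidate with the largest predicted error, even though its predicted objective is worse.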
Search infill criteria
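The amount of improvement we expect may also be evaluated → maximize the Expected Improvement (EI) criterion
$$EI(x) = \begin{cases} (y_{\min} - y)\,\Phi\!\left(\dfrac{y_{\min} - y}{s}\right) + s\,\Psi\!\left(\dfrac{y_{\min} - y}{s}\right) & \text{if } s > 0 \\ 0 & \text{if } s = 0 \end{cases}$$
where $y = y(x)$ and $s = s(x)$ are the Kriging predictor and its error estimate, $y_{\min}$ is the best objective value sampled so far, and $\Phi$ and $\Psi$ are the standard normal cumulative distribution and probability density functions.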
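The EI formula translates directly into code. This sketch uses only the Python standard library; the function name and the scalar interface are assumptions of the example.

```python
import math

def expected_improvement(y_min, y_hat, s):
    """Expected improvement over the best sampled value y_min, for a minimization problem."""
    if s <= 0.0:
        return 0.0                                         # no uncertainty, no expected gain
    u = (y_min - y_hat) / s
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))       # standard normal CDF
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return (y_min - y_hat) * cdf + s * pdf

# Hypothetical values: the prediction equals the current best, with unit predicted error.
ei = expected_improvement(y_min=0.0, y_hat=0.0, s=1.0)
```

The first term rewards points predicted to beat the current best; the second rewards points where the predictor is uncertain, so both exploitation and exploration contribute without a user-defined weight.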
Search infill criteria
The concept of a merit function can be employed to maintain diversity in the solution space.
Search both in regions where the surrogate model indicates there might be a minimizer of the objective and in regions where we know very little about the problem (few samples).
Take into account the distance between an individual and the other individuals, and favor the solutions far away from their neighbours:
merit(x) = y(x) − ρ d(x)
with ρ ≥ 0 and $d(x) = \min_i \lVert x - x_i \rVert_2$.
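A short sketch of the distance-based merit function, assuming the surrogate predictor is available as a callable. The quadratic `y_hat` and the two existing samples are placeholders invented for the example.

```python
import numpy as np

def merit(x, y_hat, samples, rho):
    """merit(x) = y(x) - rho * d(x), with d(x) the distance to the nearest existing sample."""
    x = np.asarray(x, dtype=float)
    d = np.min(np.linalg.norm(samples - x, axis=1))
    return y_hat(x) - rho * d

# Hypothetical surrogate predictor and already-evaluated designs.
def y_hat(x):
    return float(np.sum(x**2))

samples = np.array([[0.0, 0.0], [1.0, 1.0]])
value = merit([1.0, 0.0], y_hat, samples, rho=0.5)
```

Unlike LCB or EI, this criterion needs no error estimate from the surrogate, so it can be used with approximation methods that do not provide one.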
Parallel infill points
Parallel updating approaches: multiple designs are selected for sampling during each iteration → make effective use of parallel computing resources
Different approaches:
Some infill criteria (e.g. EI, PI (Probability of Improvement)) exhibit multi-modal behaviour → identify several local optima and select them as infill points
Search the infill criterion, temporarily add the surrogate-predicted value at this point (assume the model is correct at this location), rebuild the surrogate, search the infill criterion again, ...
Focus on one target at a time rather than compromise, for instance by adding two new design points per iteration, one coming from the exploration search and the other from the exploitation search
Surrogate model(s) management
A move limit strategy makes it possible to
adapt the search range of the variables during the design process based on the effectiveness of the approximations
focus the optimization search on smaller regions of the design space and exploit local models
ensure that the inner optimization does not produce design points outside the region where the surrogate model is valid
As the optimization proceeds, the idea is to enlarge or restrict the search space based on a heuristic rule in order to refine the candidate optimal region (new additional points are chosen within these move limits).
If improvement ⇒ we can trust the model and the search region is enlarged.
If no improvement ⇒ the search region is contracted.
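One possible form of such a heuristic rule is sketched below. The growth and shrink factors (2.0 and 0.5), the box-shaped region centred on the current best design, and the function name are assumptions of this example; the lecture does not specify the rule.

```python
import numpy as np

def update_move_limits(center, half_width, improved, lower, upper,
                       grow=2.0, shrink=0.5):
    """Enlarge or contract a box around the current best design, clipped to global bounds."""
    half_width = half_width * (grow if improved else shrink)
    lo = np.maximum(lower, center - half_width)
    hi = np.minimum(upper, center + half_width)
    return lo, hi, half_width

# Hypothetical 2-variable problem with global bounds [-2, 2] on each variable.
lower = np.array([-2.0, -2.0])
upper = np.array([2.0, 2.0])
center = np.array([1.8, 0.0])
half_width = np.array([0.5, 0.5])

lo_ok, hi_ok, hw_ok = update_move_limits(center, half_width, True, lower, upper)
lo_bad, hi_bad, hw_bad = update_move_limits(center, half_width, False, lower, upper)
```

New infill points would then be searched only inside `[lo, hi]`, which keeps the inner optimization within the region where the surrogate has been checked against the true function.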
Surrogate model(s) management
Example for the Rosenbrock function (locations of one additional random sample point at each design iteration)
[Figure: sample point locations, panels “with move limit” and “without move limit”]
with move limit: f* = 1.7 × 10^-8
without move limit: f* = 1.4 × 10^-6
ANOVA
Analysis of Variance
Gives information on which design variables have more influence over the outcome, e.g. performance.
Makes it possible to quantify first-order sensitivities, cross terms, higher-order interactions, ...
Based on multi-dimensional integration of the model with given variable(s) held constant. This process is repeated a number of times with different values of the fixed variable(s).
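The integration procedure can be sketched for first-order sensitivities on the unit hypercube: the model is averaged over all variables except one, for each value of the fixed variable, and the variance of these conditional means is compared with the total variance. The tensor midpoint grid, the function name and the linear test model are assumptions of this example; a full grid is only affordable for a few variables, which is why the integration is done on a cheap surrogate.

```python
import numpy as np

def first_order_indices(f, n_dims, n_grid=32):
    """First-order variance contributions of each variable of f on [0, 1]^n_dims."""
    pts = (np.arange(n_grid) + 0.5) / n_grid               # midpoint rule nodes
    mesh = np.meshgrid(*([pts] * n_dims), indexing="ij")
    values = f(*mesh)                                      # model on the full tensor grid
    total_var = values.var()
    indices = []
    for i in range(n_dims):
        other_axes = tuple(j for j in range(n_dims) if j != i)
        # x_i held constant, all other variables integrated out; repeated for every x_i.
        conditional_mean = values.mean(axis=other_axes)
        indices.append(conditional_mean.var() / total_var)
    return np.array(indices)

# Hypothetical surrogate: the second variable has twice the slope of the first.
def model(x1, x2):
    return x1 + 2.0 * x2

sens = first_order_indices(model, n_dims=2)
```

For this additive model the two indices sum to one; a sum below one would indicate that cross terms or higher-order interactions carry part of the variance.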
Conclusion
Multidisciplinary design optimization (MDO) has considerable impact on the design by increasing performance, lowering lifecycle cost and shortening design time for complex products.
However, the objective is not only to eke out a “5%” performance improvement in the design solution, but most importantly to
Gain insight into the design space
Assess key factors
Quantify trades
Point out innovative design options
Conclusion
No one optimization method works well for all problems
“Chicken and egg dilemma ...”
The best search algorithm to exploit depends upon the type of design space that has been defined. But the characteristics of the design space are typically not known until it has been explored, ... which is the primary role of the search method.
⇒ Trend towards hybrid and adaptive search strategies
Fundamental role of the optimization specification: parameterization, bounds definition, choice of simulation models, definition of cost functions and constraints, ...