AIAA-2002-5631
USE OF ADAPTIVE METAMODELING FOR DESIGN OPTIMIZATION
Jay D. Martin*
Applied Research Laboratory
State College, PA, USA 16805
Timothy W. Simpson
The Pennsylvania State University
University Park, PA, USA 16802
* Research Assistant, Applied Research Laboratory. Phone: (814) 865-5930. Email: [email protected].
† Assistant Professor, Departments of Mechanical & Nuclear Engineering and Industrial & Manufacturing Engineering.
Member AIAA. Corresponding Author. Phone/fax: (814) 863-7136/4745. Email: [email protected].
ABSTRACT
This paper describes a method to implement an
adaptive metamodeling procedure during simulation-
based design. Metamodels can be used for design space
visualization and design optimization applications when
model evaluation performance is critical. The proposed
method uses a sequential technique to update a kriging
metamodel. This sequential technique determines the
point in the metamodel's design space with the
maximum mean square error and selects this as the next
point at which to update the metamodel. At each iteration,
the quality of the metamodel is assessed using a leave-
k-out cross-validation technique with three different
values for k. The method is intended to permit
continuous updating of the metamodel to investigate the
entire design space without concern of finding an
optimal value in the metamodel or model.
Keywords: metamodel, kriging, design of experiments,
conceptual design
INTRODUCTION
Researchers at the Applied Research Laboratory at
Pennsylvania State University are developing an
advanced computing environment to support
Simulation Based Design (SBD) of undersea vehicles,
focusing on their conceptual design.1, 2 The approach
involves composing a vehicle from a variety of
subsystems, considering multiple technologies for each
possible subsystem (see Figure 1), subject to
requirements such as a maximum size and speed of
operation. Our SBD method will
iterate through possible subsystem designs in order to
achieve the system-level requirements while satisfying
all of the system constraints. These constraints can be
characterized as either constraints placed on the
subsystem by high-level requirements or constraints
that result from the coupling of subsystems into a single
system. A typical example of this is the selection of a
propulsion system that has sufficient thrust to overcome
the drag the vehicle experiences at its desired operating
conditions.
We employ multidisciplinary optimization (MDO)
techniques in order to iterate through the possible
designs to create a feasible system, if one exists, that
satisfies the given requirements. Since we are working
at the conceptual level of design, we typically have
little knowledge of how an actual subsystem will
perform until the new technologies have been fully
developed and tested. The only design rules available
are physics-based models that can be used to estimate
the performance of the different subsystems and, when
combined with the other subsystems, the system as a
whole. These physics-based models can be
computational mechanics models, dynamic analyses, or
even simple algebraic equations. The evaluation of
these physics-based models is often time-consuming.
During the optimization and constraint satisfaction
process, many analyses of each subsystem are required;
as a result, even simplified cases of vehicle design have
taken hours to complete in a fully automated process.
The majority of this time is spent trying to satisfy the
constraints and create a feasible system.
[Diagram: a System composed of SubSystems A-C, with the
technologies under investigation, the requirements, and
gaps in modeling indicated.]
Figure 1. System Composition
In the conceptual design of undersea vehicles, it has
proven difficult to create a single objective function
defining an optimal design. In order to assist in this
selection, it is beneficial to produce a large number of
9th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, 4-6 September 2002, Atlanta, Georgia
AIAA 2002-5631
Copyright © 2002 by the author(s). Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.
feasible designs instead and then visualize all of the
possible designs to understand the tradeoffs in the
system design space.3 The procedure adds significantly
to the number of calculations needed. In order to make
this a computationally efficient process, significant
improvements in the time to determine subsystem
performance are required. Consequently, we employ
metamodels in our work to achieve this goal.
A metamodel is a model of a model4 that provides an
approximation to the physics-based model that is much
faster to execute. The creation of metamodels often
uses Design of Experiments (DOE) to select a set of
points in the design space in order to create an adequate
model describing the behavior of the subsystems.
There are many different techniques that have been
developed for DOE.5 The focus in this work is on an
automated approach to the Design of Experiments and
the creation of the most appropriate metamodels to
approximate the physics-based model. As shown in
Figure 2, the proposed method is used to (a)
automatically design experiments on the different
subsystems, (b) evaluate these experiments, (c) fit a
metamodel, (d) test the fit of the model, (e) decide if and
where to perform another experiment on the model, and
(f) update the existing model with the new information.
[Diagram: Build Mode - a Design of Experiment (factors and
levels) feeds Conduct Experiment on the model (model
inputs/outputs), which feeds Fit Metamodel and Create
Metamodel (metamodel parameters); Adapt Metamodel /
Change Form is driven by metamodel error and user
interest. Use Mode - the User supplies desired data and
an area of interest to Compute Results, which returns
results from the metamodel.]
Figure 2. Functional Diagram of Method
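The build-and-use loop of Figure 2 can be sketched as follows. This is a toy illustration in Python, not the authors' Mathematica implementation: the function name `adaptive_loop` is hypothetical, the preliminary design uses plain random sampling rather than a maxi-min Latin hypercube, and a distance-to-data rule stands in for the maximum-MSE criterion (the kriging MSE grows with distance from the samples).

```python
import numpy as np

def adaptive_loop(model, bounds, n_initial=10, n_iter=5, seed=0):
    """Toy sketch of the Figure 2 loop; the kriging fit and leave-k-out
    assessment (steps (c) and (d)) are elided here and treated later."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T
    # (a)+(b): preliminary design and its evaluation (random points here,
    # not the maxi-min Latin hypercube used in the paper)
    X = lo + rng.random((n_initial, len(lo))) * (hi - lo)
    y = np.array([model(x) for x in X])
    for _ in range(n_iter):
        # (e): choose the candidate farthest from all samples -- a rough
        # surrogate for the maximum-MSE point, since the kriging MSE
        # grows with distance from the data
        cand = lo + rng.random((200, len(lo))) * (hi - lo)
        d = np.linalg.norm(cand[:, None, :] - X[None, :, :], axis=2).min(axis=1)
        x_new = cand[np.argmax(d)]
        # (f): evaluate the physics-based model there and update the data
        X = np.vstack([X, x_new])
        y = np.append(y, model(x_new))
    return X, y
```

Each pass through the loop adds exactly one new experiment, so the metamodel can be made available to a user at any time while the data set keeps growing in the background.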
This is a continuous process, meant to occur even
before a need for the metamodel is recognized, taking
advantage of idle computing time to autonomously
investigate the design space with minimal user
interaction. The method will (1) make a preliminary
version of the metamodel available once the initial
version has been fitted and (2) continuously update the
metamodel with new data as it becomes available.
The work presented here is a preliminary investigation
into the implementation of this method. Given a
physics-based model, a preliminary set of experiments
is executed against it. A kriging metamodel is fit to the
data. After each fit of the metamodel, three different
tests are completed to assess the quality of the current
metamodel. As discussed later in this paper, cross-validation
techniques are used rather than calculating
the true error in the metamodel, an expensive process.
Using the kriging metamodel, the Mean Square Error
(MSE) at any point in the space can be determined. The
point in the space with the maximum MSE is taken as
the next experimental point. The maximum MSE is
significant in that it reflects both the lack of knowledge
in the input space and the lack of knowledge about the
output. The selected point is evaluated by the physics-based
model, and a new metamodel is fit to the data.
The times to calculate the maximum MSE and to refit
the kriging metamodel have proven to be significant.
Therefore, we hope to determine whether it is necessary
to refit the metamodel at each iteration. If not, a
significant saving in computing time will be possible. A
review of previous work related to this work follows.
SEQUENTIAL METAMODELING TECHNIQUES
Many researchers advocate the use of a sequential
metamodeling approach using move limits6 or trust
regions.7-9 For instance, the Concurrent SubSpace
Optimization procedure uses data generated during
concurrent subspace optimization to develop response
surface approximations of the design space, which form
the basis of the subspace coordination procedure.10-12
More recently, Perez, et al.13 present an adaptive
experimental strategy for reducing the number of
sample points needed to maintain accuracy of second-order
response surface approximations during
optimization. Similarly, Wang14 proposes an adaptive
response surface methodology that inherits sample
points from one iteration to the next to reduce the
number of samples needed to maintain an accurate
approximation. Mathematically rigorous techniques and
trust regions are also being combined to develop several
other metamodel management frameworks to manage
the use of approximation models in optimization.15-18
Osio and Amon19 develop a multi-stage kriging strategy
to sequentially update and improve the accuracy of
surrogate approximations as additional sample points
are obtained. The Hierarchical and Interactive Decision
Refinement methodology uses statistical regression and
other metamodeling techniques to recursively
decompose the design space into sub-regions and fit
each region with a separate model during design space
refinement.20 Finally, Schonlau, et al.21 describe a
sequential algorithm to balance local and global
searches using approximations during constrained
optimization.
Most of these approaches are based on the desire to find
the optimal design. Schonlau, et al.21 and Sasena, et
al.22 have described the algorithm Efficient Global
Optimization (EGO), which selects a new experiment
location based on a combination of a measurement of
the model's lack of information at a location and the
chance that the location may be an extreme point. Cox
and John23 described a similar method that uses the
lower confidence bound of the metamodel to drive the
selection of the next experimental point. In the work by
Perez, et al.,13 a typical augmented Lagrangian
optimization technique is employed, but the lack of
knowledge of the metamodel is used to determine the
step size. Model evaluation is completed once the
optimization algorithm has selected a new point, and
the metamodels are updated with the new point. All of
these processes are repeated until some stopping
condition is met.
Osio and Amon19 use maximum entropy when selecting
their new experiment points rather than results from
optimization. In their approach, a preliminary set of
experiments is selected without any knowledge of the
outputs; they are chosen using a space-filling technique.
A metamodel is fit to the data, and an assessment is
made of the next set of points that will best improve the
capability of the model to predict the response. After
all of these experiments have been completed, the
metamodel is fit to the data once again.
The method we propose is a combination of these
methods. We desire to use a continuous process, but
we do not want to only look for extreme values of the
model within the design space. Instead, we seek an
understanding of the entire feasible design space to
facilitate design space visualization.3 The foundation
for our proposed method is kriging, which is described
next. This is followed by an overview of metamodel
assessment strategies and the experimental setup used
to test the ability of this method to improve the
metamodel accuracy at each iteration.
KRIGING METAMODELS
Originating in geostatistics,24 kriging models have been
used successfully by many researchers to approximate
deterministic computer simulations.5, 25 A kriging
model is a combination of a polynomial model plus
departures of the form:

    y(x) = f(x) + Z(x)    (1)

where y(x) is the unknown function of interest, f(x) is
a known (usually polynomial) function of x, and Z(x) is
the realization of a stochastic process with mean zero,
variance σ², and non-zero covariance. The f(x) term in
Eqn. (1) provides a global model of the design space
and is similar to the polynomial model in a response
surface. In this work, we take f(x) to be a constant term
based on previous studies.26, 27
While f(x) globally approximates the design space,
Z(x) creates localized deviations so that the kriging
model interpolates the n_s sampled data points; however,
non-interpolating kriging models can also be created to
smooth noisy data.28, 29 The covariance matrix of Z(x)
is:

    Cov[Z(x^i), Z(x^j)] = σ² R[R(x^i, x^j)]    (2)

In Eq. (2), R is the correlation matrix, and R(x^i, x^j) is
the correlation function between any two of the n_s
sampled data points x^i and x^j. R is an (n_s x n_s)
symmetric matrix with ones along the diagonal. The
correlation function R(x^i, x^j) is specified by the user,
and a variety of correlation functions exist.30-32 In this
work, we utilize the popular Gaussian correlation
function:

    R(x^i, x^j) = \exp\left[-\sum_{k=1}^{n_{dv}} \theta_k |x_k^i - x_k^j|^2\right]    (3)

where n_dv is the number of design variables, the θ_k are
the unknown correlation parameters used to fit the
model, and x_k^i and x_k^j are the kth components of
sample points x^i and x^j. In some cases, using a single
correlation parameter θ gives sufficiently good
results;5, 16, 19 however, we use a different θ_k for each
design variable to maximize the flexibility of the
approximation.
Predicted estimates, ŷ(x), of the response y(x) at
untried values of x are computed using:

    \hat{y}(x) = \hat{\beta} + r^T(x) R^{-1} (y - f\hat{\beta})    (4)

where y is the column vector of length n_s which
contains the sample values of the response, and f is a
column vector of length n_s which is filled with ones
when f(x) is taken as a constant. The correlation
vector, r^T(x), between an untried x and the sampled
data points {x^1, ..., x^{n_s}} is given by:

    r^T(x) = [R(x, x^1), R(x, x^2), ..., R(x, x^{n_s})]^T    (5)

The estimate β̂ is given by the generalized least squares
estimate:

    \hat{\beta} = (f^T R^{-1} f)^{-1} f^T R^{-1} y    (6)
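Equations (2) through (6) translate directly into code. The sketch below is a minimal Python illustration, not the authors' Mathematica implementation; `corr_matrix` and `krige` are illustrative names, and the small nugget added to the diagonal is an assumption for numerical conditioning only.

```python
import numpy as np

def corr_matrix(A, B, theta):
    """Gaussian correlation of Eq. (3): R(x_i, x_j) = exp(-sum_k theta_k (x_ik - x_jk)^2)."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2           # squared per-dimension distances
    return np.exp(-np.tensordot(d2, theta, axes=([2], [0])))

def krige(X, y, theta, x_new, nugget=1e-8):
    """Kriging predictor of Eq. (4) with constant f(x):
    yhat = beta + r^T R^-1 (y - f beta)."""
    ns = len(X)
    R = corr_matrix(X, X, theta) + nugget * np.eye(ns)  # Eq. (2); nugget for conditioning
    f = np.ones(ns)
    beta = (f @ np.linalg.solve(R, y)) / (f @ np.linalg.solve(R, f))  # Eq. (6)
    r = corr_matrix(np.atleast_2d(x_new), X, theta)[0]  # correlation vector, Eq. (5)
    return beta + r @ np.linalg.solve(R, y - f * beta)  # Eq. (4)
```

At any sampled point the predictor reproduces the observed response (up to the nugget), which is the interpolation property noted above.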
and the estimate of the variance, σ̂², of the sample
data, denoted as y, from the underlying global model
(not the variance in the observed data itself) is given by:

    \hat{\sigma}^2 = \frac{(y - f\hat{\beta})^T R^{-1} (y - f\hat{\beta})}{n_s}    (7)

where f(x) is assumed to be the constant β̂. The
maximum likelihood estimates (i.e., best guesses) for
the θ_k used to fit the model are found by maximizing:

    \max_{\theta_k > 0} \; -\frac{n_s \ln(\hat{\sigma}^2) + \ln|R|}{2}    (8)

where σ̂² and |R| are both functions of θ. While any
values for the θ_k create an interpolating kriging model,
the best kriging model is found by solving this k-dimensional
unconstrained non-linear optimization
problem. Currently, we use FindMinimum in
Mathematica Version 4.0.1 running on Windows
2000 to perform this optimization. FindMinimum
uses a modification of Powell's search method in
several dimensions. To help bound the optimization
problem, the inputs to the model are scaled to [0, 1],
and the values of θ_k are limited to values between 0.01
and 100.
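The concentrated likelihood of Eqs. (7) and (8) can be sketched as follows. For simplicity this illustration replaces the paper's FindMinimum search with a crude grid search over a single shared θ within the paper's [0.01, 100] bounds; the function names are hypothetical and the nugget term is an assumption for conditioning.

```python
import numpy as np

def neg_log_likelihood(theta, X, y, nugget=1e-8):
    """Negative of the objective in Eq. (8): ( n_s ln(sigma^2) + ln|R| ) / 2,
    with sigma^2 computed from Eq. (7) at the given theta."""
    ns = len(X)
    d2 = (X[:, None, :] - X[None, :, :]) ** 2
    R = np.exp(-np.tensordot(d2, theta, axes=([2], [0]))) + nugget * np.eye(ns)
    f = np.ones(ns)
    beta = (f @ np.linalg.solve(R, y)) / (f @ np.linalg.solve(R, f))  # Eq. (6)
    resid = y - f * beta
    sigma2 = resid @ np.linalg.solve(R, resid) / ns                   # Eq. (7)
    _, logdet = np.linalg.slogdet(R)                                  # ln|R|
    return 0.5 * (ns * np.log(sigma2) + logdet)

def fit_theta(X, y, grid=np.logspace(-2, 2, 21)):
    """Crude stand-in for the paper's FindMinimum search: a grid over a
    single shared theta within the paper's bounds [0.01, 100]."""
    ndv = X.shape[1]
    best = min(grid, key=lambda t: neg_log_likelihood(np.full(ndv, t), X, y))
    return np.full(ndv, best)
```

A real implementation would optimize each θ_k separately, as the paper does, to retain per-variable flexibility.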
One advantage of using a kriging metamodel over other
metamodels is the ease with which the Mean Square
Error (MSE) of ŷ(x) can be calculated. This is given by:

    MSE[\hat{y}(x)] = \hat{\sigma}^2 \left( 1 - [f(x) \;\; r^T(x)] \begin{bmatrix} 0 & f^T \\ f & R \end{bmatrix}^{-1} \begin{bmatrix} f(x) \\ r(x) \end{bmatrix} \right)    (9)

The MSE can be used to calculate a confidence interval
on the predicted values. In our work, we find the point
in the design space with the maximum MSE. This point
is taken as the next place to complete an experiment on
the model to update or adapt the metamodel to maintain
its accuracy. Assessment strategies for validating
metamodels during this process are described next.
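Equation (9) amounts to assembling and solving one bordered linear system per prediction point. A minimal Python sketch (`kriging_mse` is an illustrative name; the nugget term is an assumption for conditioning):

```python
import numpy as np

def kriging_mse(X, theta, sigma2, x_new, nugget=1e-8):
    """Mean square error of the predictor, Eq. (9), with constant f(x) = 1:
    MSE = sigma^2 * (1 - [f(x), r^T] [[0, f^T], [f, R]]^-1 [f(x); r])."""
    ns = len(X)
    d2 = (X[:, None, :] - X[None, :, :]) ** 2
    R = np.exp(-np.tensordot(d2, theta, axes=([2], [0]))) + nugget * np.eye(ns)
    f = np.ones(ns)
    r = np.exp(-((X - x_new) ** 2) @ theta)      # correlation vector, Eq. (5)
    # Assemble the bordered matrix [[0, f^T], [f, R]] of Eq. (9)
    A = np.zeros((ns + 1, ns + 1))
    A[0, 1:] = f
    A[1:, 0] = f
    A[1:, 1:] = R
    b = np.concatenate([[1.0], r])               # [f(x); r(x)] with f(x) = 1
    return sigma2 * (1.0 - b @ np.linalg.solve(A, b))
```

At a sampled point the MSE collapses to (numerically) zero, and far from all samples it exceeds σ̂² because of the uncertainty in β̂; this is why maximizing the MSE pushes new experiments into unexplored regions of the design space.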
METAMODEL ASSESSMENT
There are many ways to assess metamodel accuracy.33
A popular measure is the Root Mean Square Error
(RMSE) over the design space, but computing RMSE
requires sampling the model at a large number of points
that have not been used to create the metamodel, which
can become prohibitive for a model that is
computationally intensive. As a result, alternative
approaches such as cross-validation have been
developed to reduce the computational expense of
metamodel assessment.33 A popular cross-validation
method is a leave-k-out strategy. In this strategy, k
points are left out of the fitted metamodel, and the
resulting error is calculated based on the omitted points.
This can result in a large number of possible cases,
depending on the choice of k and the sample size.
Mitchell and Morris31 and Osio and Amon19 describe
using this technique for k = 1, and Meckesheimer, et
al.33 present guidelines for selecting k for different
types of metamodels, including kriging. They
recommend using k = 0.1 n_s or \sqrt{n_s} when validating
kriging metamodels.
The cross-validation RMSE, RMSE_CV, for each
metamodel is computed using:

    RMSE_{CV}(i) = \sqrt{ \frac{1}{k} \sum_{j=1}^{k} (y_j - \hat{y}_j)^2 }    (10)

and the RMSE_CV values for each metamodel prediction
are then averaged together to form the average RMSE:

    RMSE_{CV_{AVE}} = \frac{1}{n_s} \sum_{i=1}^{n_s} RMSE_{CV}(i)    (11)

This cross-validation procedure is employed in the
experimental setup, which is described next, to reduce
the computational burden of validating the metamodel.
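The leave-k-out procedure of Eqs. (10) and (11) can be sketched generically. In this illustration `fit_predict` is a placeholder for the kriging fit, and the left-out subsets are drawn at random rather than enumerating every possible case:

```python
import numpy as np

def leave_k_out_rmse(X, y, k, fit_predict, n_cases=None, seed=0):
    """Average leave-k-out RMSE of Eqs. (10)-(11).

    `fit_predict(X_train, y_train, X_test)` fits a metamodel to the
    retained points and returns predictions at X_test; it stands in
    for the kriging fit described above.
    """
    rng = np.random.default_rng(seed)
    ns = len(X)
    n_cases = n_cases or ns
    rmses = []
    for _ in range(n_cases):
        left_out = rng.choice(ns, size=k, replace=False)     # the k omitted points
        keep = np.setdiff1d(np.arange(ns), left_out)
        yhat = fit_predict(X[keep], y[keep], X[left_out])
        rmses.append(np.sqrt(np.mean((y[left_out] - yhat) ** 2)))  # Eq. (10)
    return float(np.mean(rmses))                             # Eq. (11)
```

Because only refits on subsets of the existing data are needed, no additional runs of the expensive physics-based model are required for this assessment.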
EXPERIMENTAL SETUP
This section outlines the steps taken to improve the
quality of a metamodel in the iterative process. The
process starts with a given physics-based model.

Preliminary Stage
1. Identify the input/output parameters for the
model, defining the maximum and minimum
values for the inputs to the model.
2. Perform a preliminary DOE over the expected
feasible design space of the model. Each possible
set of input parameters forms an experiment. A set
of 50 possible experiments is generated, each
using a Latin hypercube design, and the design
that maximizes the minimum distance (maxi-min)
between the points is selected in a manner similar
to Mitchell and Morris.31
3. Execute the experiments against the model and
store the results of the model in the database along
with the experiments.
4. Fit a metamodel to the results using the technique
described in the kriging section and store the
parameters in a database, making the metamodel
available for use.
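Step 2 above can be sketched as follows: generate candidate Latin hypercube designs and keep the one with the largest minimum inter-point distance. This is a minimal Python illustration (the candidate count of 50 follows the text; `maximin_lhs` is an illustrative name):

```python
import numpy as np

def maximin_lhs(n_points, n_dims, n_candidates=50, seed=0):
    """Generate `n_candidates` Latin hypercube designs on [0, 1]^n_dims
    and keep the one that maximizes the minimum inter-point distance."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        # One Latin hypercube: each column places one point per stratum
        # by permuting the strata and jittering within each one.
        D = np.empty((n_points, n_dims))
        for j in range(n_dims):
            D[:, j] = (rng.permutation(n_points) + rng.random(n_points)) / n_points
        # Score the design by its smallest pairwise distance
        diff = D[:, None, :] - D[None, :, :]
        dist = np.sqrt((diff ** 2).sum(-1))
        score = dist[np.triu_indices(n_points, 1)].min()
        if score > best_score:
            best, best_score = D, score
    return best
```

The resulting unit-cube design is then scaled to the minimum and maximum input values identified in Step 1.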
Iteration Stage
The next series of steps is repeated at each iteration to
improve the quality of the metamodel.
1. Load the metamodel from the database.
2. Assess the current metamodel. Use the leave-k-out
technique described in the metamodel assessment
section. For this experiment, we use k = 1, 0.1 n_s,
and \sqrt{n_s}, saving the results in a database.
3. Find the maximum MSE location(s) in the design
space. The same optimization routine used to find
the MLE is used to calculate the maximum MSE.
Note that there are many local maxima in the
space. To reasonably ensure that a global maximum
is found, the optimization is seeded with multiple
points selected at random.
a. Select n_s random points in the space bounded
by the maximum and minimum values chosen
in Step 1 of the preliminary stage.
b. Calculate the maximum MSE using these
random points as seeds for the gradient-based
optimization.
c. Sort the resulting points based on their MSE
value.
d. Evaluate the model at the maximum MSE
location. The following steps are used to
accomplish this task.
i. Check against known infeasible points.
This is accomplished by comparing the
point with a list of known infeasible points
and determining if it is in proximity to a
known infeasible point. If the point is
known to be infeasible, move to the next
point in the list.
ii. Evaluate the candidate feasible point.
Test the point against a set of feasibility
constraints to determine if the point is
feasible.
iii. If feasible, save the point marked as
feasible, along with the MSE value, in the
database.
iv. If infeasible, save the point marked as
infeasible for tests of future candidate
points, and continue with the next point
in the maximum MSE list.
4. Fit the metamodel to the new set of feasible data
points. Use the previous values of θ_k to seed the
optimization.
5. Evaluate the stopping criteria. If the criteria are not
met, repeat the process. We currently stop after a
predetermined number of iterations. Our current
research seeks to identify suitable stopping criteria.

As an alternative to this method, the metamodel is also
refit to the data only after every 10 iterations, as a
comparison to refitting at every iteration.
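The multistart pattern of Step 3 can be sketched generically. The paper reuses the same gradient-based routine as for the MLE; the sketch below substitutes a simple hill climb purely for illustration, and `multistart_argmax` and its step-size schedule are assumptions, not the authors' implementation:

```python
import numpy as np

def multistart_argmax(objective, bounds, n_seeds, n_steps=200, seed=0):
    """Steps 3a-3c above: seed a local search from `n_seeds` random points
    and return (value, point) candidates sorted best-first. The local
    search here is a random-perturbation hill climb standing in for the
    paper's gradient-based optimizer; `objective` would be the MSE of
    Eq. (9) evaluated on the current metamodel."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T
    results = []
    for _ in range(n_seeds):
        x = lo + rng.random(len(lo)) * (hi - lo)     # 3a: random seed point
        fx, step = objective(x), 0.1 * (hi - lo)
        for _ in range(n_steps):                     # 3b: local improvement
            x_try = np.clip(x + rng.normal(0, 1, len(lo)) * step, lo, hi)
            f_try = objective(x_try)
            if f_try > fx:
                x, fx = x_try, f_try
            else:
                step *= 0.99                         # slowly tighten the search
        results.append((fx, x))
    results.sort(key=lambda t: t[0], reverse=True)   # 3c: sort by MSE value
    return results
```

The sorted candidate list is then walked in order, skipping points near known infeasible locations, exactly as in sub-steps 3d.i-iv.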
APPLICATION
This method has been applied to a model being used to
design a new class of underwater vehicle. The model
calculates the performance of a gas generation system
and is characterized by five input and five output
parameters. The model is a Steady State Steady Flow
analysis of a solid fuel combustion process,
implemented in Mathematica.
RESULTS
A baseline case was completed for the gas generationmodel. An original DOE of 49 points from a maxi-minLatin hypercube was completed, and an initial krigingmetamodel was fit to this data.
Continuous Refitting of Metamodel
The iteration process was allowed to continue for 56
cycles, and the results are plotted in Figure 3. As seen
in the figure, the assessment strategies of k = 0.1 n_s and
\sqrt{n_s} provide very similar results. The case of k = 1
tends to underestimate the error found with the other
two choices for k. The MSE is also plotted along with
the other values to determine if there is any correlation
among any of these values. In Figure 4, the actual error
found between the metamodel and the point selected to
update the model is displayed. Vertical lines have been
used in both of these plots to indicate the correlation
between the large prediction errors and their effect on
the cross-validation assessments.
In Figure 5 and Figure 6, plots of θ_k for the first two
input parameters are shown over the iteration history.
Vertical lines have been added to highlight the two
events shown in Figures 3 and 4.
[Figure 3: plot of error versus iteration number.]