ABC and history matching with GPs
Richard Wilkinson, James Hensman
University of Nottingham
Gaussian Process Summer School, Sheffield 2015
Talk plan
(a) Uncertainty quantification (UQ) for computer experiments
(b) Design
(c) Calibration - history matching and ABC
Computer experiments
Rohrlich (1991): Computer simulation is
‘a key milestone somewhat comparable to the milestone that started the empirical approach (Galileo) and the deterministic mathematical approach to dynamics (Newton and Laplace)’
Challenges for statistics: how do we make inferences about the world from a simulation of it?
how do we relate simulators to reality?
how do we estimate tunable parameters?
how do we deal with computational constraints?
how do we make uncertainty statements about the world that combine models, data and their corresponding errors?
There is an inherent lack of quantitative information on the uncertainty surrounding a simulation, unlike in physical experiments.
Uncertainty Quantification (UQ) for computer experiments
Calibration
- Estimate unknown parameters θ
- Usually via the posterior distribution π(θ|D)
- Or history matching

Uncertainty analysis
- f(x) a complex simulator. If we are uncertain about x, e.g., X ∼ π(x), what is π(f(X))?

Sensitivity analysis
- X = (X1, . . . , Xd)⊤. Can we decompose Var(f(X)) into contributions from each Var(Xi)?
- If we can improve our knowledge of any Xi, which should we choose to minimise Var(f(X))?

Simulator discrepancy
- f(x) is imperfect. How can we quantify or correct simulator discrepancy?
Surrogate/Meta-modelling: Emulation
Code uncertainty
For complex simulators, run times might be long, ruling out brute-force approaches such as Monte Carlo methods.
Consequently, we will only know the simulator output at a finite number of points.
We call this code uncertainty.
All inference must be done using a finite ensemble of model runs
Dsim = {(θi, f(θi))}i=1,...,N
If θ is not in the ensemble, then we are uncertain about the value of f(θ).
Meta-modelling
Idea: if the simulator is expensive, build a cheap model of it and use this in any analysis.
‘a model of the model’
We call this meta-model an emulator of our simulator.
Gaussian process emulators are the most popular choice of emulator.
Built using an ensemble of model runs Dsim = {(θi, f(θi))}i=1,...,N
They give an assessment of their prediction accuracy, π(f(θ)|Dsim)
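To make the idea concrete, a minimal GP emulator can be built directly from the ensemble Dsim with a squared-exponential kernel. This is a sketch with a toy simulator and fixed hyperparameters (lengthscale, variance and jitter values here are illustrative, not from the slides):

```python
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    # Squared-exponential kernel k(a,b) = var * exp(-(a-b)^2 / (2 ls^2))
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def gp_emulator(theta_train, f_train, theta_test, ls=1.0, var=1.0, jitter=1e-6):
    """Posterior mean and variance of f(theta_test) given the ensemble D_sim."""
    K = rbf(theta_train, theta_train, ls, var) + jitter * np.eye(len(theta_train))
    Ks = rbf(theta_test, theta_train, ls, var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f_train))
    mean = Ks @ alpha                      # posterior mean at test points
    v = np.linalg.solve(L, Ks.T)
    varpost = var - np.sum(v**2, axis=0)   # posterior variance at test points
    return mean, varpost

# Toy simulator and ensemble D_sim = {(theta_i, f(theta_i))}
f = lambda t: np.sin(3 * t)
theta_i = np.linspace(0, 2, 8)
mean, varpost = gp_emulator(theta_i, f(theta_i), np.array([0.5, 1.0]))
```

In practice the hyperparameters would be estimated from Dsim, and a non-zero mean function is often included.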
Gaussian Process Illustration: zero mean
Challenges
Design: if we can afford n simulator runs, which parameter valuesshould we run it at?
High dimensional inputs: if θ is multidimensional, then even short run times can rule out brute-force approaches
High dimensional outputs: spatio-temporal fields
Incorporating physical knowledge
Difficult behaviour, e.g., switches, step-functions, non-stationarity...
Uncertainty quantification for Carbon Capture and Storage (EPSRC): transport

[Figure: pressure (MPa) against volume (10^-4 m^3/mol) isotherms at 290K, 296K, 304.2K (Tc), 307K and 310K, showing the critical point and the coexisting vapour/liquid curves]

Technical challenges:
How do we find non-parametric Gaussian process models that (i) obey the fugacity constraints and (ii) have the correct asymptotic behaviour?
Storage

Knowledge of the physical problem is encoded in a simulator f.

Inputs: permeability field K (2d field)
K ↦ f(K)
Outputs: stream function (2d field), concentration (2d field), surface flux (1d scalar), ...

[Figures: the input permeability field K; the true truncated stream field; the true truncated concentration field; Surface Flux = 6.43, . . .]
CCS examples
Left = true, right = emulated; 118 training runs, held-out test set.

[Figures: true stream field vs emulated stream field; true concentration field vs emulated concentration field]
Design
Design
We build GPs using data {(xi, yi)}i=1,...,n
Call the collection Xn = {xi}i=1,...,n ⊂ R^d the design
For observational studies we have no control over the design, but we do for computer experiments!
GP predictions made using a good design will be better than those using a poor design (cf. the location of inducing points for sparse GPs)
What are we designing for?
Global prediction
Calibration
Optimization - minimize the Expected Improvement (EI)?
Design for global prediction
e.g. Zhu and Stein 2006

For a GP with known hyperparameters, space-filling designs are good as they minimize the average prediction variance
Latin hypercubes, maximin/minimax, maximum entropy
However, if we only want to estimate the hyperparameters, then maximize
det I(θ) = −det E( ∂²/∂θ² f(X; θ) )
Usually, we want to make good predictions after having estimated the parameters, and a trade-off between these two criteria has been proposed.
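The space-filling designs mentioned above are easy to generate in practice; a sketch using scipy's quasi-Monte Carlo module (the two-dimensional bounds below are purely illustrative):

```python
import numpy as np
from scipy.stats import qmc

# Latin hypercube design: each of the n strata per dimension
# contains exactly one design point
sampler = qmc.LatinHypercube(d=2, seed=0)
X = sampler.random(n=20)                 # 20 points in [0, 1]^2

# rescale the unit-cube design to the simulator's input ranges
X_scaled = qmc.scale(X, l_bounds=[0.0, -5.0], u_bounds=[10.0, 5.0])
```

`qmc.LatinHypercube` also accepts an `optimization` argument in recent scipy versions to improve the spread of the points.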
Figure 1 of Zhu and Stein: plots of designs for prediction with unknown parameters. For all the plots ϕ = 0.5 and σ² = 1.

[From the accompanying text:] the plug-in kriging predictor and the plug-in kriging variance at the evaluation grid E. The true MSPE of the plug-in kriging prediction and the MSE of the ratio M(s; θ̂)/M(s; θ) are shown in Table 1. The table shows that the true MSPE of the plug-in prediction for DEA and DAKV designs are very close, but the DEA designs give better estimates of MSPE. The improvement is always substantial and sometimes dramatic.

In practice, the true parameters of the process are unknown and it is desirable to have a design that has good performance for a range of parameters. As a simple example, we calculated the relative efficiency for the six parameter combinations considered in Figure 1, and found that the design for ν = 0.5 and τ² = 0.01 minimizes the maximum relative efficiency among the six designs, which we will refer to as the minimax design. The maximum relative efficiency is 1.045, meaning that, in the worst case, the EA criterion for the minimax design is 4.5% larger than that for the best design we obtained for the true parameter values. In Table 2, simulation results for the minimax design are compared with the DAKV design for several different parameters. In all cases the minimax design gives

Table 1. Comparison of DEA (estimation adjusted, (2.16)) and DAKV (average kriging variance) designs using simulation. Results are based on 500 simulations; for all designs ϕ = 0.5, σ² = 1 and the sample size n = 30.

(ν, τ²)        (0.5, 0.01)    (1.0, 0.01)    (1.0, 0.10)    (3.0, 0.10)
Design         DEA    DAKV    DEA    DAKV    DEA    DAKV    DEA    DAKV
True MSPE      0.111  0.108   0.109  0.115   0.260  0.253   0.167  0.167
MSE of Ratio   0.273  1.273   0.438  1.318   0.217  0.317   0.159  0.270
Sequential design
The designs above are all ‘one-shot’ designs and can be wasteful. Instead we can use adaptive/sequential designs (active learning) and add a point at a time:
Choose location x_{n+1} to maximize some criterion/acquisition rule
C(x) ≡ C(x | {(xi, yi)}i=1,...,n)
Generate y_{n+1} = f(x_{n+1})
For optimization, we’ve seen that a good criterion for minimizing f(x) is to choose x to maximize the expected improvement criterion
C(x) = E[ ( min_{i=1,...,n} yi − f(x) )_+ ]
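Under the GP posterior f(x) ∼ N(μn(x), s²n(x)), this expectation has the standard closed form; a minimal sketch, vectorized over candidate points:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, s, y_min):
    """EI(x) = E[(y_min - f(x))_+] for f(x) ~ N(mu, s^2):
    (y_min - mu) * Phi(z) + s * phi(z), with z = (y_min - mu) / s."""
    mu, s = np.asarray(mu, float), np.asarray(s, float)
    s = np.maximum(s, 1e-12)            # guard against zero predictive sd
    z = (y_min - mu) / s
    return (y_min - mu) * norm.cdf(z) + s * norm.pdf(z)

# EI at three candidates with current best y_min = 0
ei = expected_improvement([0.5, 0.0, -0.5], [1.0, 1.0, 1.0], y_min=0.0)
```

Candidates predicted below the current best (here μ = −0.5) receive the largest EI.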
Sequential design for global prediction
Gramacy and Lee 2009, Beck and Guillas 2015

Many designs work on minimizing some function of the predictive variance/MSE
s²_n(x) = Var(f(x) | Dn)
Active learning MacKay (ALM): choose x at the point with largest predictive variance
C_n(x) = s²_n(x)
This tends to locate points on the edge of the domain.
Active learning Cohn (ALC): choose x to give the largest expected reduction in predictive variance
C_n(x) = ∫ s²_n(x′) − s²_{n∪x}(x′) dx′
ALC tends to give better designs than ALM, but has cost O(n³ + N_ref N_cand n²) for each new design point
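Because the GP predictive variance depends only on the input locations, not the outputs, an ALM design can be grown without running the simulator at all. A sketch with a fixed squared-exponential kernel (the lengthscale and candidate grid are illustrative):

```python
import numpy as np

def rbf(a, b, ls=0.3):
    # Squared-exponential kernel with unit variance
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def post_var(x_cand, X, jitter=1e-9):
    # GP predictive variance s_n^2(x); depends only on input locations X
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(x_cand, X)
    return 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)

def alm_design(x_cand, n, x0):
    """Grow a design by repeatedly adding the candidate with largest s_n^2."""
    X = list(x0)
    for _ in range(n):
        s2 = post_var(x_cand, np.array(X))
        X.append(x_cand[np.argmax(s2)])
    return np.array(X)

cand = np.linspace(0, 1, 101)
design = alm_design(cand, n=5, x0=[0.5])
```

Note that the first points added are the domain edges, illustrating the tendency the slide warns about.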
Sequential design for global prediction
MICE: Beck and Guillas 2015

The Mutual Information between Y and Y′ is
I(Y; Y′) = H(Y) − H(Y | Y′) = KL(p_{y,y′} || p_y p_{y′})
Choose the design Xn to maximize the mutual information between f(Xn) and f(Xcand \ Xn), where Xcand is a set of candidate design points. A sequential version for GPs reduces to choosing x to maximize
C_n(x) = s²_n(x) / s²_{cand\(n∪x)}(x; τ²)
[Figure 10 of Beck and Guillas. Left: comparison of ALM-1000, ALC-150, MICE-150 (τ²_s = 1 and τ²_s = 10⁻¹²), MmLHD and mMLHD, plotting normalized RMSPE against the number of design points, for the Oscillatory function over [0, 1]⁴. Right: the performance of MICE-150 for different choices of τ²_s.]

[Figure 11: the same comparison for the Oscillatory function over [0, 1]⁸.]
5.2. Piston simulation function. Here we consider a 7-dimensional example from [2], where the output describes the circular motion of a piston within a cylinder; it obeys the following equations:

y(x) = 2π √( x1 / ( x2 + x3² (x4 x5 x7) / (x6 g1(x)²) ) ),

where

g1(x) = (x3 / (2 x2)) ( √( g2(x)² + 4 x2 (x4 x5 / x6) x7 ) − g2(x) ),
g2(x) = x3 x4 + 19.62 x1 − x2 x5 / x3.

Here y(x) is the cycle time (s), which varies with seven input variables. The design space is given by x1 ∈ [30, 60] (piston weight, kg), x2 ∈ [1000, 5000] (spring coefficient, N/m), x3 ∈ [0.005, 0.020] (piston surface area, m²), x4 ∈ [90000, 110000] (atmospheric pressure, N/m²), x5 ∈ [0.002, 0.010] (initial gas volume, m³), x6 ∈ [340, 360] (filling gas temperature, K) and x7 ∈ [290, 296] (ambient temperature, K). The nonlinearity makes this deterministic computer experiment problem challenging to emulate. MICE-300 yields a slight improvement over MICE-150, see Figure 12. MICE with 300 candidate points is
[Figure 12: results for the 7-D Piston Simulation function. Left: normalized RMSPE against the number of design points for ALM-1000, ALC-150, MICE-150 (τ²_s = 1), MICE-300 (τ²_s = 1), MmLHD and mMLHD. Right: run time (s) broken down into T_cand, T_mle and T_select for MICE-300, MICE-150 and ALC-150.]

not that much more expensive than with 150; in fact, it is significantly cheaper computationally than ALC with 150. Again, the proposed algorithm MICE performs the best. For high-dimensional problems, ALM tends to be the worst, probably due to the high percentage of points on the boundary.
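The piston equations of §5.2 can be coded up directly. This is a sketch based on the reconstruction of the formulas above; the input values are simply the midpoints of the stated design space:

```python
import math

def piston_cycle_time(x1, x2, x3, x4, x5, x6, x7):
    """Cycle time y(x) of the piston function: x1 weight, x2 spring
    coefficient, x3 surface area, x4 atmospheric pressure, x5 initial
    gas volume, x6 filling gas temperature, x7 ambient temperature."""
    g2 = x3 * x4 + 19.62 * x1 - x2 * x5 / x3
    g1 = x3 / (2 * x2) * (math.sqrt(g2**2 + 4 * x2 * (x4 * x5 / x6) * x7) - g2)
    return 2 * math.pi * math.sqrt(x1 / (x2 + x3**2 * x4 * x5 * x7 / (x6 * g1**2)))

# evaluate at the midpoint of the design space
y = piston_cycle_time(45, 3000, 0.0125, 100000, 0.006, 350, 293)
```

The output is a cycle time of a fraction of a second, consistent with the stated ranges.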
6. Application to a tsunami simulator. There is a pressing need in tsunami modeling for uncertainty quantification with the specific purpose of providing accurate risk maps or issuing informative warnings. Sarri, Guillas and Dias [27] were the first to demonstrate that statistical emulators can be used for this purpose. Recently, Sraj et al. [32] studied the propagation of uncertainty in Manning’s friction parameterization to the prediction of sea surface elevations, for the Tohoku 2011 tsunami event. They used a polynomial chaos (PC) expansion as the surrogate model of a low resolution tsunami simulator. Note that Bilionis and Zabaras [3] showed that GP emulators can outperform PC expansions when small to moderate-sized training data are considered. Stefanakis et al. [33] used an active experimental design approach for optimization to study if small islands can protect nearby coasts from tsunamis.

We consider here the problem of predicting the maximum free-surface elevation of a tsunami wave at the shoreline, for a wide range of scenarios, following a subaerial landslide at an adjoining beach across a large body of shallow water. A tsunami wave simulator is used. A landslide of seafloor sediments, initially at the beach, has a Gaussian shaped mass distribution, and generates tsunami waves that propagate towards the opposite shoreline across from the beach (see Figure 13). The sea-floor bathymetry changes over time, and is used as input to the tsunami simulator. The floor motion is described by the change in bathymetry of the sloping beach over time, h(x, t) = H(x) − h0(x, t), where H(x) = x tan β is the static uniformly sloping beach, and h0(x, t) = δ exp(−(x̃ − t̃)²) is the perturbation with respect to H(x, t). Here x̃ = 2(xμ²/δ) tan φ1 and t̃ = √(g/δ) μt, where δ is the maximum vertical slide thickness, μ is the ratio of the thickness and the slide length, and tan φ1 is the beach slope. The free surface elevation is defined as z(x, t) = −h(x, t). It is assumed the initial water surface is undisturbed, that is, z(x, 0) = 0 for all x. The slope tan φ2 of the beach at the opposite shoreline is chosen so that the distance between the shorelines
Calibration, history matching and ABC
Inverse problems
For most simulators we specify parameters θ and initial conditions, and the simulator, f(θ), generates output X.
The inverse problem: observe data D, estimate the parameter values θ.
Two approaches

Probabilistic calibration
Find the posterior distribution
π(θ|D) ∝ π(θ)π(D|θ)
for likelihood function
π(D|θ) = ∫ π(D|X, θ)π(X|θ) dX
which relates the simulator output to the data, e.g.,
D = X + e + ε
where e ∼ N(0, σe²) represents simulator discrepancy, and ε ∼ N(0, σε²) represents measurement error on the data.

History matching
Find the plausible parameter set Pθ:
f(θ) ∈ PD ∀ θ ∈ Pθ
where PD is some plausible set of simulation outcomes that are consistent with simulator discrepancy and measurement error, e.g.,
PD = {X : |D − X| ≤ 3(σe + σε)}

Calibration aims to find a distribution representing plausible parameter values, whereas history matching classifies space as plausible or implausible.
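In one dimension, a history match is then just a classification of parameter space. A sketch with a hypothetical toy simulator f(θ) = θ² and the 3(σe + σε) cutoff from the definition above:

```python
import numpy as np

def plausible(theta_grid, simulator, D, sigma_e, sigma_eps):
    """History matching: keep theta with |D - f(theta)| <= 3(sigma_e + sigma_eps)."""
    fx = np.array([simulator(t) for t in theta_grid])
    return theta_grid[np.abs(D - fx) <= 3 * (sigma_e + sigma_eps)]

# hypothetical toy simulator f(theta) = theta^2
grid = np.linspace(-3, 3, 601)
P_theta = plausible(grid, lambda t: t**2, D=2.0, sigma_e=0.1, sigma_eps=0.1)
```

The plausible set here consists of two disjoint intervals around θ = ±√2; unlike a posterior, it carries no weighting within the set.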
Calibration - Approximate Bayesian Computation (ABC)
ABC algorithms are a collection of Monte Carlo methods used for calibrating simulators
they do not require explicit knowledge of the likelihood function
inference is done using simulation from the model (they are ‘likelihood-free’).
ABC methods are popular in biological disciplines, particularly genetics. They are
Simple to implement
Intuitive
Embarrassingly parallelizable
Can usually be applied
Rejection ABC
Uniform Rejection Algorithm
Draw θ from π(θ)
Simulate X ∼ f (θ)
Accept θ if ρ(D,X ) ≤ ε
ε reflects the tension between computability and accuracy.
As ε→∞, we get observations from the prior, π(θ).
If ε = 0, we generate observations from π(θ | D).
Rejection sampling is inefficient, but we can adapt other MC samplerssuch as MCMC and SMC.
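The algorithm is a few lines of code; a sketch using the toy model from the illustrations that follow (θ ∼ U[−10, 10], X ∼ N(2(θ + 2)θ(θ − 2), 0.1 + θ²), ρ(D, X) = |D − X|, D = 2):

```python
import numpy as np

rng = np.random.default_rng(0)
D, eps, N = 2.0, 5.0, 20000

# Draw theta from the prior, simulate X | theta, accept if rho(D, X) <= eps.
# The second argument of N(., .) on the slide is a variance, so take sqrt.
theta = rng.uniform(-10, 10, N)
X = rng.normal(2 * (theta + 2) * theta * (theta - 2),
               np.sqrt(0.1 + theta**2))
accepted = theta[np.abs(D - X) <= eps]
```

As ε shrinks, `accepted` moves from prior draws towards draws from π(θ | D), at the cost of a lower acceptance rate.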
ε = 10

[Scatter plot of θ against D with the acceptance band D ± ε marked, and the resulting ABC posterior density compared with the true density.]

θ ∼ U[−10, 10], X ∼ N(2(θ + 2)θ(θ − 2), 0.1 + θ²)
ρ(D, X) = |D − X|, D = 2
ε = 7.5

[The same scatter plot of θ against D with the tighter band D ± ε, and the corresponding ABC density against the true density.]
ε = 5

[Figure: theta vs D scatter with acceptance band D ± ε; ABC posterior density vs true posterior]
ε = 2.5

[Figure: theta vs D scatter with acceptance band D ± ε; ABC posterior density vs true posterior]
ε = 1

[Figure: theta vs D scatter with acceptance band D ± ε; ABC posterior density vs true posterior]
Limitations of Monte Carlo methods

Monte Carlo methods are generally guaranteed to succeed if we run them for long enough. This guarantee is costly and can require more simulation than is possible.

However:
Most methods sample naively - they don't learn from previous simulations.
They don't exploit known properties of the likelihood function, such as continuity.
They sample randomly, rather than using careful design.

We can use methods that don't suffer in this way, but at the cost of losing the guarantee of success.
Likelihood estimation (Wilkinson 2013)

It can be shown that ABC replaces the true likelihood π(D|θ) with an ABC likelihood

π_ABC(D|θ) = ∫ π(D|X) π(X|θ) dX

where π(D|X) is the ABC acceptance kernel.

We can estimate this using repeated runs from the simulator:

π̂_ABC(D|θ) ≈ (1/N) Σ_i π(D|X_i)

where X_i ∼ π(X|θ).

For many problems, we believe the likelihood is continuous and smooth, so that π_ABC(D|θ) is similar to π_ABC(D|θ′) when θ − θ′ is small.

We can model L(θ) = π_ABC(D|θ) and use the model to find the posterior in place of running the simulator.
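The Monte Carlo estimate above can be written out directly. A hedged sketch: the uniform acceptance kernel π(D|X) = 1{|D − X| ≤ ε}/(2ε) and the toy simulator are assumptions for illustration; any stochastic simulator and kernel can be substituted.

```python
import math
import random

random.seed(1)

def simulator(theta):
    # Toy simulator from the earlier slides; any draw from pi(X | theta) works.
    mean = 2 * (theta + 2) * theta * (theta - 2)
    return random.gauss(mean, math.sqrt(0.1 + theta ** 2))

def abc_likelihood(theta, D=2.0, eps=1.0, N=2000):
    """Monte Carlo estimate of pi_ABC(D | theta) = (1/N) sum_i pi(D | X_i),
    here with a uniform acceptance kernel pi(D | X) = 1{|D - X| <= eps} / (2 eps)."""
    total = 0.0
    for _ in range(N):
        if abs(D - simulator(theta)) <= eps:
            total += 1.0 / (2 * eps)
    return total / N

# The estimate varies smoothly in theta, which is what a GP model can exploit.
print([round(abc_likelihood(t), 4) for t in (-2.0, 0.0, 2.1)])
```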
History matching waves (Wilkinson 2014)

The likelihood is too difficult to model, so we model the log-likelihood instead:

l(θ) = log L(θ)

However, the log-likelihood for a typical problem ranges across too wide a range of values. Consequently, most GP models will struggle to model the log-likelihood across the parameter space.

Introduce waves of history matching: in each wave, build a GP model that can rule out regions of space as implausible. When this works, it can give huge savings in the number of simulator runs required.
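A minimal numpy sketch of the wave idea, under several assumptions: `log_lik` is a cheap stand-in for the expensive estimated log-likelihood, the GP has a fixed RBF kernel with hand-picked hyperparameters, and the implausibility cut-off (upper credible bound more than 10 units below the running maximum) is illustrative, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_lik(theta):
    # Stand-in log-likelihood; in practice estimated from simulator runs.
    return -0.5 * (theta - 1.0) ** 2 / 0.2

def gp_fit_predict(X, y, Xs, ell=0.5, sf2=25.0, noise=1e-6):
    """Minimal GP regression with an RBF kernel (fixed hyperparameters)."""
    def k(a, b):
        return sf2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = sf2 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

grid = np.linspace(-5, 5, 501)
plausible = np.ones_like(grid, dtype=bool)

for wave in range(3):
    # Each wave samples a small design from the currently plausible region only.
    X = rng.choice(grid[plausible], size=10, replace=False)
    y = log_lik(X)
    mu, sd = gp_fit_predict(X, y, grid)
    # Rule out theta whose plausible upper bound on l(theta) falls well below the best seen.
    plausible &= (mu + 3 * sd) >= y.max() - 10.0
    print(f"wave {wave}: {plausible.mean():.0%} of space still plausible")
```

Each wave concentrates the next design inside the surviving region, which is where the savings in simulator runs come from.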
Example: Ricker model

The Ricker model is one of the prototypical ecological models:
used to model the fluctuation of the observed number of animals in some population over time;
it has complex dynamics and likelihood, despite its simple mathematical form.

Ricker model: let N_t denote the number of animals at time t. Then

N_{t+1} = r N_t exp(−N_t + e_t)

where the e_t are independent N(0, σ_e²) process noise terms. Assume we observe counts y_t where

y_t ∼ Po(φ N_t)

Used in Wood (2010) to demonstrate the synthetic likelihood approach.
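The model is cheap to simulate; a sketch follows. The parameter values are illustrative (close to those used by Wood 2010, but not taken from the slides), and the function name is ours.

```python
import numpy as np

def simulate_ricker(r=np.exp(3.8), sigma_e=0.3, phi=10.0, n_t=50, N0=1.0, seed=0):
    """Simulate N_{t+1} = r * N_t * exp(-N_t + e_t) with process noise
    e_t ~ N(0, sigma_e^2), observed as counts y_t ~ Poisson(phi * N_t)."""
    rng = np.random.default_rng(seed)
    N = N0
    ys = []
    for _ in range(n_t):
        e = rng.normal(0.0, sigma_e)
        N = r * N * np.exp(-N + e)          # latent population dynamics
        ys.append(rng.poisson(phi * N))     # Poisson observation of the count
    return np.array(ys)

y = simulate_ricker()
print(y[:10])
```

Despite the one-line dynamics, the trajectories are near-chaotic, which is why the likelihood is intractable and simulation-based methods are needed.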
Results - Design 1 - 128 pts
Diagnostics for GP 1 - threshold = 5.6
Results - Design 2 - 314 pts - 38% of space implausible
Diagnostics for GP 2 - threshold = -21.8
Design 3 - 149 pts - 62% of space implausible
Diagnostics for GP 3 - threshold = -20.7
Design 4 - 400 pts - 95% of space implausible
Diagnostics for GP 4 - threshold = -16.4
MCMC Results - comparison with Wood (2010), synthetic likelihood approach

[Figure: marginal posterior densities for r, sig.e and phi; green = GP posterior, black = Wood's MCMC posterior]
Computational details

The Wood MCMC method used 10⁵ × 500 simulator runs.
The GP code used (128 + 314 + 149 + 400) = 991 × 500 simulator runs - about 1/100th of the number used by Wood's method.
By the final iteration, the Gaussian processes had ruled out over 98% of the original input space as implausible, so the MCMC sampler did not need to waste time exploring those regions.

Design for calibration (with James Hensman)
Implausibility

When using emulators for history matching and ABC, the aim is to accurately classify space as plausible or implausible by estimating the probability

p(θ) = P(θ ∈ P_θ)

based upon a GP model of the simulator or likelihood

f(θ) ∼ GP(m(·), c(·, ·))

The key determinant of emulator accuracy is the design used:

D_n = {θ_i, f(θ_i)}, i = 1, …, n
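When the plausible set is defined by bounds on the simulator output, p(θ) has a closed form under the GP: with posterior mean m(θ) and standard deviation s(θ), p(θ) = Φ((b − m)/s) − Φ((a − m)/s). A small sketch; the interval (−2, 0) is borrowed from the toy history match on a later slide, and the function names are ours.

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def plausibility(m, s, a=-2.0, b=0.0):
    """P(a < f(theta) < b) when f(theta) | D_n ~ N(m, s^2)."""
    return normal_cdf((b - m) / s) - normal_cdf((a - m) / s)

# Near a design point inside the band, s is small and p -> 1; far from the
# design, s is large and p(theta) reverts toward the prior mass of the band.
print(plausibility(m=-1.0, s=0.1), plausibility(m=-1.0, s=5.0))
```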
Entropic designs

Calibration doesn't need a global approximation to the simulator - this is wasteful.

Instead, build a sequential design θ_1, θ_2, … using the current classification

p(θ) = P(θ ∈ P_θ | D_n)

to guide the choice of design points.

First idea: add design points where we are most uncertain. The entropy of the classification surface is

E(θ) = −p(θ) log p(θ) − (1 − p(θ)) log(1 − p(θ))

and we choose the next design point where we are most uncertain:

θ_{n+1} = arg max E(θ)
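A small sketch of this criterion; the classification surface p(θ) here is a made-up logistic shape on a grid, standing in for the GP-derived probability P(θ ∈ P_θ | D_n).

```python
import math

def entropy(p):
    """Binary entropy of the classification probability p(theta)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

# Illustrative p(theta): confident (p near 1) inside |theta| < 1,
# confident (p near 0) outside, and p = 1/2 on the boundary.
grid = [i / 100.0 for i in range(-300, 301)]
p = [1.0 / (1.0 + math.exp(8.0 * abs(t) - 8.0)) for t in grid]

E = [entropy(pi) for pi in p]
theta_next = grid[max(range(len(E)), key=E.__getitem__)]
print(round(theta_next, 2))  # the most uncertain point (p = 1/2), here theta = -1.0
```

Entropy peaks exactly on the classification boundary, which is why this rule keeps revisiting the edges of the plausible region, as the next slides show.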
Toy 1d example f(θ) = sin θ

Add a new design point (simulator evaluation) at the point of greatest entropy.

After 10 and 20 iterations: this criterion spends too long resolving points at the edge of the classification region.
Expected average entropy (Chevalier et al. 2014)

Instead, we can find the average entropy of the classification surface

E_n = ∫ E(θ) dθ

where the subscript n denotes that it is based on the current design of size n.

Choose the next design point, θ_{n+1}, to minimise the expected average entropy:

θ_{n+1} = arg min J_n(θ), where J_n(θ) = E(E_{n+1} | θ_{n+1} = θ)
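The one-step lookahead can be sketched numerically. Heavy caveats: the GP update after a hypothetical observation is replaced here by a crude kernel-weighted shrink of p toward the observed class, purely to show the shape of the computation; the real criterion updates the GP posterior properly (Chevalier et al. 2014).

```python
import math

def H(p):
    # Binary entropy.
    return 0.0 if p <= 0 or p >= 1 else -p * math.log(p) - (1 - p) * math.log(1 - p)

grid = [i / 50.0 for i in range(-150, 151)]
p = [1.0 / (1.0 + math.exp(6.0 * abs(t) - 6.0)) for t in grid]  # made-up p(theta)
dx = grid[1] - grid[0]

def avg_entropy(probs):
    # E_n = integral of E(theta) d theta, approximated on the grid.
    return sum(H(q) for q in probs) * dx

def lookahead(j):
    """Crude J_n(theta_j): expected average entropy after observing at grid[j]."""
    def updated(label):
        # Stand-in for the GP update: shrink p toward the observed class
        # with a distance-decaying weight (an assumption, for illustration).
        return [(1 - w) * q + w * label
                for q, w in ((p[i], math.exp(-0.5 * ((grid[i] - grid[j]) / 0.3) ** 2))
                             for i in range(len(p)))]
    pj = p[j]
    return pj * avg_entropy(updated(1.0)) + (1 - pj) * avg_entropy(updated(0.0))

j_best = min(range(len(grid)), key=lookahead)
print(grid[j_best], avg_entropy(p))
```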
Toy 1d example f(θ) = sin θ - Expected entropy

Toy 1d: min expected entropy vs max entropy. After 10 iterations, minimising the expected average entropy has found the plausible region to reasonable accuracy, whereas maximising the entropy has not.

In 1d, a simpler space-filling criterion would work just as well.
Solving the optimisation problem

Finding θ which minimises J_n(θ) = E(E_{n+1} | θ_{n+1} = θ) is expensive. Even for 3d problems, grid search is prohibitively expensive; dynamic grids help.

We can use Bayesian optimization to find the optima:
1 Evaluate J_n(θ) at a small number of locations.
2 Build a GP model of J_n(·).
3 Choose the next θ at which to evaluate J_n using the expected-improvement (EI) criterion.
4 Return to step 2.
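The four steps can be sketched as follows. Assumptions throughout: a cheap stand-in objective for J_n, a minimal fixed-hyperparameter GP, and, following standard practice, the EI acquisition is maximised (equivalently, improvement is over the current minimum).

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def J(theta):
    # Cheap stand-in for the expensive objective J_n(theta).
    return (theta - 0.7) ** 2 + 0.1 * np.sin(5 * theta)

def gp_posterior(X, y, Xs, ell=0.4, sf2=1.0, noise=1e-6):
    # Minimal RBF-kernel GP regression (fixed hyperparameters).
    k = lambda a, b: sf2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = sf2 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sd, best):
    # EI for minimisation: E[max(best - f, 0)] with f ~ N(mu, sd^2).
    z = (best - mu) / sd
    Phi = 0.5 * (1 + np.vectorize(math.erf)(z / math.sqrt(2)))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return (best - mu) * Phi + sd * phi

Xs = np.linspace(-2, 2, 401)
X = rng.uniform(-2, 2, size=3)                        # 1) a few initial evaluations
y = J(X)
for _ in range(10):
    mu, sd = gp_posterior(X, y, Xs)                   # 2) GP model of J_n
    x_next = Xs[np.argmax(expected_improvement(mu, sd, y.min()))]  # 3) EI point
    X, y = np.append(X, x_next), np.append(y, J(x_next))           # 4) repeat
print(round(float(X[np.argmin(y)]), 2))
```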
History match

Can we learn the following plausible set? A sample from a GP on R²: find x s.t. −2 < f(x) < 0.
Iteration 10: left = p(θ), middle = E(θ), right = J̃(θ)

Iteration 15: left = p(θ), middle = E(θ), right = J̃(θ)
EPm: climate model

A 3d problem:
DTcrit_conv - critical temperature gradient that triggers convection
GAMMA - emissivity parameter for water vapour
Calibrate to global average surface temperature.

[Figure: ABC samples in the (DTcrit_conv, GAMMA) plane; MLH emulator temperature contours (285.0 to 297.5) for n = 10 and n = 30 design points; Bayesian optimization designs with 10 and 20 design points]
Conclusions

Gaussian processes are widely used in the analysis of complex computer models.

Good design can lead to substantial improvements in accuracy.
Design needs to be specific to the task required.
Space-filling designs are inefficient for calibration and history matching problems.
Entropy-based designs give a good trade-off between exploration and defining the plausible region.
Bayesian optimisation techniques allow us to solve the hard computation needed to use optimal entropic designs.
Using emulators of the likelihood function, although adding another layer of approximation, does enable progress on hard problems.
Thank you for listening!