What Can Gaussian Processes and Model Discrepancy Do For You?
Mike Grosskopf, Statistical Sciences, LANL
Work supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics, and NNSA
LA-UR-19-29720
Some topics that I will cover as time allows:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
A reminder about what Gaussian processes are:
• Have interest in estimating an unknown function, $f(\mathbf{x})$
• And want a measure of how certain your estimate is
• Being Bayesian: put a prior on it!
• Characterized by mean and covariance functions
  • The covariance function determines the supported functions

$f(\mathbf{x}) \sim GP(0, \Sigma)$

$\mathrm{Cov}\!\left(f(\mathbf{x}_i), f(\mathbf{x}_j)\right) = \kappa \prod_{k=1}^{p} e^{-\beta_k (x_{ik} - x_{jk})^2}$

$Y_i = f(\mathbf{x}_i) + \epsilon$
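A minimal numpy sketch of GP regression under this prior, with a 1-D input; the values of $\kappa$, $\beta$, and the noise level here are illustrative choices, not taken from the talk:

```python
import numpy as np

# Squared-exponential covariance from the slide:
# Cov(f(x_i), f(x_j)) = kappa * exp(-beta * (x_i - x_j)^2)
def cov(xa, xb, kappa=1.0, beta=5.0):
    return kappa * np.exp(-beta * (xa[:, None] - xb[None, :]) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 8)                                    # training inputs
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(8)   # Y_i = f(x_i) + eps

xs = np.linspace(0, 1, 50)                                  # prediction grid
K = cov(x, x) + 0.05 ** 2 * np.eye(8)                       # add noise variance
Ks = cov(xs, x)
mean = Ks @ np.linalg.solve(K, y)                           # posterior mean
var = np.diag(cov(xs, xs) - Ks @ np.linalg.solve(K, Ks.T))  # posterior variance
```

The posterior variance is what gives "uncertainty about the function where unobserved": it shrinks near the training inputs and grows between them.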
Gaussian processes are a useful regression model in many contexts
• One of the most common tools for emulation and uncertainty quantification of scientific computing models
  • Kennedy-O’Hagan hierarchical model
• Flexible enough to capture complex response surfaces
• Gives uncertainty about the function where it is unobserved
A note about what is Gaussian about Gaussian processes
• While Gaussian error is often assumed, the Gaussian process prior doesn’t require it
• GP => Gaussian prior on the coefficients of the relevant function space
Some drawbacks to Gaussian processes
• Can be computationally expensive!
  • O(n³) time, O(n²) storage
  • Many sparse-GP and other approaches improve speed, often at the cost of accuracy
• Categorical features are tricky
• Extrapolation is questionable
  • Though when isn’t it?
• Stationarity is rarely a good assumption
  • Just a necessary one
  • Need a lot of data to identify non-stationary structure
  • Mainly second-order effects
    • The mean is still a good predictor
    • Uncertainty intervals are less trustworthy
  • Stationary uncertainty saturates rather than growing
Some topics that I will cover as time allows:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
Numerical Methods for Solving Differential Equations Have Made an Incredible Impact on Modern Science
• Almost all areas of modern science utilize simulation of complex systems
• Uncertainty quantification (UQ) in scientific computing attempts to understand all sources of uncertainty that impact predictive science
• One source that’s been largely ignored is discretization uncertainty
  • “Just run at high enough resolution that it doesn’t matter”
  • Can sometimes work out rigorous bounds on the solution
    • Often either infeasible or overly conservative
  • Estimate based on convergence arguments and treat as a random effect
Can we be Bayesian about this?
• The solution is an unknown function, so we can put a prior on it.
  • A prior on functions, eh? Gaussian process!
• What do we observe?
  • Boundary conditions (BC)
  • Derivative information
• Chkrebtii, et al. (2016), (2019)
  • Sample a solution from the GP conditional on the BCs to obtain sample derivatives
  • Condition on derivatives at discrete intervals
• Similar approaches by Conrad (2017), Cockayne (2018), Schober (2016, 2019)
  • More detailed references available on request
How this works (from Chkrebtii et al. 2016):
Figure from Page 8 of Chkrebtii (2016)
Example: The Chaotic Duffing Equation
• Non-linear damped and driven oscillator
$\frac{d^2 x}{dt^2} + 0.2\,\frac{dx}{dt} + x + x^3 = 0.3 \cos(t)$
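A quick numerical sketch of this oscillator with a plain fixed-step RK4 integrator; the initial condition, step size, and time horizon are illustrative choices, not taken from the talk:

```python
import numpy as np

# Duffing equation as a first-order system: state = (x, v)
def duffing_rhs(t, state):
    x, v = state
    return np.array([v, 0.3 * np.cos(t) - 0.2 * v - x - x ** 3])

# Classical fixed-step fourth-order Runge-Kutta
def rk4(rhs, state0, t0, t1, n_steps):
    h = (t1 - t0) / n_steps
    t, state = t0, np.asarray(state0, dtype=float)
    path = [state.copy()]
    for _ in range(n_steps):
        k1 = rhs(t, state)
        k2 = rhs(t + h / 2, state + h / 2 * k1)
        k3 = rhs(t + h / 2, state + h / 2 * k2)
        k4 = rhs(t + h, state + h * k3)
        state = state + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        path.append(state.copy())
    return np.array(path)

path = rk4(duffing_rhs, [1.0, 0.0], 0.0, 50.0, 5000)
```

A probabilistic solver would replace the single trajectory above with multiple realizations whose spread reflects the discretization uncertainty.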
Example: The Chaotic Duffing Equation
• Uncertainty due to discretization is captured through multiple realizations
• While respecting the solution manifold through the realizations
A Kalman-filter/SDE formulation of some Gaussian process covariance functions shows promise here
• Using a GP as a solver means increasing observations explodes the computational cost
• Särkkä, et al. (2014) have shown an approach to solving time-resolved GPs that is linear in time
  • Works for many common covariance functions (Matérn family)
  • Can approximate the squared exponential
• Schober (2019) recently showed this works in a probabilistic numerics context
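A rough numpy sketch of that idea: a Matérn-3/2 GP written in its two-state SDE form and estimated with a single O(n) Kalman-filter pass over sorted times. The hyperparameters and test signal here are illustrative assumptions:

```python
import numpy as np

def matern32_filter(ts, ys, noise_var, sigma2=1.0, ell=0.3):
    lam = np.sqrt(3.0) / ell
    P_inf = np.diag([sigma2, lam ** 2 * sigma2])   # stationary state covariance
    H = np.array([[1.0, 0.0]])                     # observe f(t), not f'(t)
    m, P = np.zeros(2), P_inf.copy()
    means = []
    t_prev = ts[0]
    for t, y in zip(ts, ys):
        h = t - t_prev
        # Closed-form transition matrix expm(F*h) for the Matern-3/2 SDE
        A = np.exp(-lam * h) * np.array([[1 + lam * h, h],
                                         [-lam ** 2 * h, 1 - lam * h]])
        Q = P_inf - A @ P_inf @ A.T                # exact process noise
        m, P = A @ m, A @ P @ A.T + Q              # predict
        S = (H @ P @ H.T)[0, 0] + noise_var
        K = (P @ H.T / S).ravel()                  # Kalman gain
        m = m + K * (y - m[0])                     # update
        P = P - np.outer(K, H @ P)
        means.append(m[0])
        t_prev = t
    return np.array(means)

ts = np.linspace(0, 1, 200)
ys = np.sin(2 * np.pi * ts) + 0.1 * np.random.default_rng(1).standard_normal(200)
filtered = matern32_filter(ts, ys, noise_var=0.01)
```

Each observation costs a constant amount of work on a 2-by-2 state, which is where the linear-in-time scaling comes from; a backward smoothing pass would recover the full GP posterior.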
Some topics that I will cover as time allows:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
Extrapolation is constantly discouraged but lies at the heart of science and engineering
• Why do we think we can extrapolate anyway?
• That’s the point of the physics in the physical model
• We expect the underlying physical principles to allow us to generalize
What happens when we use a Gaussian process discrepancy?
• The GP does a great job of correcting bias within the range of the field data
• Identifiability issues with calibration
  • (Brynjarsdóttir 2014, Tuo 2015)
  • Clever approaches to trying to get around this (Gu 2018, Plumlee 2017)
• Stationary GPs don’t extrapolate well
Instead, let’s use discrepancy as a diagnostic in the model validation process
• Test the ability to generalize to new data
• Treat the model + discrepancy as one would a physical model and attempt to validate it on independent data
• Successful validation with a structured discrepancy can be informative of missing processes
[Diagram: Calibration Experiment + Simulation 1 → Calibration → Validation Experiment + Simulation 2 → Assessment → Prediction. If assessment indicates problems or if the model changes, inspect both the physical and statistical model and recalibrate.]
We’ve started work on UNEDF treating different nuclei properties as multivariate output
• First calibrated without any discrepancy:

$\mathbf{Y}_i = \sum_{j=1}^{p_{\text{basis}}} w_j(\boldsymbol{\theta}_i)\,\boldsymbol{\phi}_j + \delta(\mathbf{x}_i) + \epsilon$  (Higdon 2008)

Toss a GP on everything and roll?

$\mathbf{Y} \sim N\!\left(\mathbf{0},\ \boldsymbol{\phi}^{T}\Sigma_{w}\boldsymbol{\phi} + \Sigma_{\delta} + \Sigma_{\epsilon}\right)$
The predictions on calibration data look good at first glance
[Figure: predicted binding energy vs. index]
But the residuals show a lot of structure
[Figure: binding energy residual vs. index]
The structure is even more clear as a function of neutron number (N)

[Figure: binding energy residual vs. neutron number N]
So we’ll add a discrepancy to the calibration
$\mathbf{Y}_i = \sum_{j=1}^{p_{\text{basis}}} w_j(\boldsymbol{\theta}_i)\,\boldsymbol{\phi}_j + \delta(\mathbf{x}_i) + \epsilon$  (Higdon 2008)

Toss a GP on everything and roll?

$\mathbf{Y} \sim N\!\left(\mathbf{0},\ \boldsymbol{\phi}^{T}\Sigma_{w}\boldsymbol{\phi} + \Sigma_{\delta} + \Sigma_{\epsilon}\right)$
• No! We’ll try a form that may be more reasonable at capturing the missing physics
Expert elicitation indicated model discrepancy has a clear and physically intuitive pattern
• Unmodeled processes exist near “magic number” nuclei
• Test discrepancy:
  • Exponential decay with distance from the magic number
  • Magnitude of the exponential as a linear function of Z

$\delta(Z, d) = (\beta_0 + \beta_1 Z)\, e^{-\lambda d}$
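The test discrepancy is easy to write down directly. A small numpy sketch, where the magic-number list is the standard one but the coefficient values are purely illustrative (the calibrated $\beta_0$, $\beta_1$, $\lambda$ are not given here):

```python
import numpy as np

# Standard magic numbers for neutrons
MAGIC = np.array([2, 8, 20, 28, 50, 82, 126])

def discrepancy(Z, N, beta0=1.0, beta1=0.01, lam=0.5):
    # delta(Z, d) = (beta0 + beta1 * Z) * exp(-lam * d),
    # with d the distance to the nearest magic neutron number
    d = np.min(np.abs(np.asarray(N)[..., None] - MAGIC), axis=-1)
    return (beta0 + beta1 * np.asarray(Z)) * np.exp(-lam * d)

delta = discrepancy(Z=np.array([20, 50, 70]), N=np.array([20, 60, 100]))
```

The discrepancy is largest at a magic number (d = 0) and decays away from it, which is the structure the residual plots suggest.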
Comparing calibration residuals with no discrepancy case:
[Figure: calibration binding energy residuals vs. index, no discrepancy (left) vs. discrepancy #1 (right)]
How do the validation results compare:
[Figure: validation binding energy residuals vs. index, no discrepancy (left) vs. magic number-based discrepancy (right)]
Some topics that I will cover as time allows:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
Exploring Model Discrepancy with Statistics / Machine Learning / A.I.
• Driven by work to identify sources of bias in criticality simulations
• Want to use machine learning to augment the search for sources of simulation bias in a high-dimensional feature space
• Build a prediction model for bias as a function of features
• Identify features that the predictor finds most informative for predicting bias
How do we apply machine learning to this problem?
• Build a prediction model for the bias using the large set of potentially informative features:

$k_{\text{benchmark}} - k_{\text{simulation}} = \Delta k_{\text{eff}} = f(X_1, \ldots, X_{21000}) + \epsilon$

• Assess some metric of importance for features to identify important ones:
  • Not looking for causal relationships per se
    • Assessing causality from observational data often requires strong assumptions
  • Looking to point to potential relationships that can be further explored
We use a random forest regression model for predicting model bias given a fixed ‘optimal’ set of parameters
• Predicts bias as a nonlinear function of features with possibly high-order interactions
• Ensemble of regression trees
  • Each tree uses a resample of the data
  • Each split uses a random subset of features
• Averaging over many high-variance, low-bias predictors
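A sketch of this step with scikit-learn’s RandomForestRegressor on synthetic stand-in data; the real benchmark features and $\Delta k_{\text{eff}}$ values are not reproduced here, so the feature count and functional form below are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 20))      # stand-in benchmark features
# Stand-in bias: nonlinear in X0, linear in X1, plus noise
bias = 0.5 * X[:, 0] ** 2 + X[:, 1] + 0.05 * rng.standard_normal(300)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, bias)

# Impurity-based importances: a first rough look at which features matter
importance = model.feature_importances_
```

On this toy data the two truly active features dominate the importance ranking; the SHAP and ALE measures below are refinements of this basic idea.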
Given the predictor, we can use importance measures motivated by function decomposition methods
• SHapley Additive exPlanations (SHAP)
  • Lundberg and Lee, NIPS, 2017
• Decomposes each prediction into additive components assigned to each feature
  • Somewhat like Sobol indices, with expectations taken with respect to the empirical distribution of the data
  • Done by estimating the expected difference in prediction when adding feature j to a subset of conditioning features for observation i, then taking the expectation over feature subsets
• A common global measure is the mean absolute additive component over all benchmarks
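The Shapley decomposition that SHAP approximates can be computed exactly at toy scale. A sketch for a 3-feature model, where "missing" features are fixed at a baseline value of 0; the model f here is invented for illustration:

```python
import itertools
import math

def f(x):
    # Toy model: additive in x0, with an x1*x2 interaction
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(x, baseline=(0.0, 0.0, 0.0)):
    n = len(x)
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for r in range(n):
            for S in itertools.combinations(others, r):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                          / math.factorial(n))
                with_j = [x[k] if (k in S or k == j) else baseline[k] for k in range(n)]
                without_j = [x[k] if k in S else baseline[k] for k in range(n)]
                phi[j] += weight * (f(with_j) - f(without_j))
    return phi

phi = shapley_values((1.0, 2.0, 3.0))
```

The components sum exactly to the prediction minus the baseline prediction, and the x1*x2 interaction is split evenly between those two features; SHAP's contribution is making this tractable for real models over the empirical data distribution.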
Used this approach to identify a poor estimate in fluorine nuclear data that otherwise had gone unnoticed
§ Fluorine showed up as a globally important feature for predicting 233U benchmarks
§ Further investigation showed wide disagreement with ENDF and available data
§ Results to be submitted to Nuclear Data Sheets “this week”
Assessment of feature importance can be complicated by correlated features
• If two or more features can be used interchangeably, the predictor will have no way to distinguish between them
• Additionally, SHAP and similar methods assume independence in their calculation
• Empirically, we observe diffusion of importance across groups of correlated features
Accumulated Local Effects provide a potential path for reducing the impact of correlated features
• A feature importance measure using accumulated derivatives of the prediction function with respect to the feature of interest:

$ALE_j(x_j) = \int_{x_{\min,j}}^{x_j} E\!\left[\frac{\partial f(\mathbf{X})}{\partial z_j} \;\middle|\; X_j = z_j\right] dz_j$

• The derivative removes the impact due to correlation if the derivatives are known or the function can be queried directly
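A numerical sketch of this formula for the toy function used later in the talk, $f(X) = X_0^2 + X_1$, with the derivative known analytically; the grid and sample size are illustrative. The conditional expectation is estimated by averaging the derivative within bins of $X_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# Highly correlated inputs, as in the slides' example
cov = np.array([[1.0, 0.99], [0.99, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], cov, size=n)

def dfdx0(x0):
    return 2.0 * x0          # analytic derivative of X0^2 + X1 w.r.t. X0

grid = np.linspace(-2.0, 2.0, 81)
mids = 0.5 * (grid[:-1] + grid[1:])
local = np.empty(len(mids))
for i in range(len(mids)):
    in_cell = (X[:, 0] >= grid[i]) & (X[:, 0] < grid[i + 1])
    # E[df/dz0 | X0 in this cell]; fall back to the analytic value if empty
    local[i] = dfdx0(X[in_cell, 0]).mean() if in_cell.any() else dfdx0(mids[i])
ale = np.concatenate([[0.0], np.cumsum(local * np.diff(grid))])
ale -= ale.mean()            # center, as is conventional for ALE plots
```

Because only the local derivative is averaged, the strong X0-X1 correlation does not leak X1's linear effect into the ALE curve for X0, which recovers the quadratic shape.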
Accumulated Local Effects provide a potential path for reducing the impact of correlated features
• Useful guarantees on function recovery…
  • …if the derivatives are known or can be estimated by querying the function directly
• The functional variance of $ALE_j(x_j)$ is a measure of importance
  • ALE is commonly simply plotted to investigate main effects
• ALE isn’t magic
  • If the function is learned from data with a prediction model, correlations get ‘baked in’ to the model
  • No way to completely disentangle them
A simple illustrative example:
• Simple function with two active and two inactive inputs:

$f(X) = X_0^2 + X_1$

• All inputs have mean 0 and variance 1,
• But have a correlation coefficient of 0.99 between them
• Simple 1-D projection plots make all inputs look active
  • See $X_3$ on the right

[Figure: 1-D projection of f against the inactive input $X_3$]
A simple illustrative example:
• Simple function with two active and two inactive inputs:

$f(X) = X_0^2 + X_1$

• When the derivative is known, or approximate derivatives can be found by querying the function, we recover all effects, even with the correlation
A simple illustrative example:
• Simple function with two active and two inactive inputs:

$f(X) = X_0^2 + X_1$

• This doesn’t work as well if we use a machine learning model to learn $f(X)$ and apply ALE to the learned model
• But it still recovers the effects reasonably well despite the very high correlation
ALE outperforms SHAP on the simple high correlation example despite losing the perfect recovery:
[Figure: importance results on the learned model, ALE (left) vs. SHAP (right)]
Some topics that I will cover as time allows:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
  • A.I.
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
Given the observation and 2-σ error bar, which of the following is consistent with 100 more samples from the generating distribution?
(a) (b)
(c) (d)
All but (b)!
• Each observation is the true mean plus error:

$y = \mu + \epsilon$

• Typical error bars on data are designed to cover the unknown mean with a certain percentage confidence
  • NOT to cover some percent of future data
(a) (b)
(c) (d)
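A quick simulation of the distinction, assuming normal data with n = 100 observations per experiment (all numbers here are illustrative): a 2σ error bar on the mean covers the true mean about 95% of the time, but covers only a small fraction of future observations, because the standard error shrinks with n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 2000
mu, sigma = 0.0, 1.0

cover_mean, cover_future = 0, 0
for _ in range(reps):
    y = rng.normal(mu, sigma, n)
    ybar, se = y.mean(), y.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    lo, hi = ybar - 2 * se, ybar + 2 * se             # the "2-sigma error bar"
    cover_mean += (lo <= mu <= hi)                    # does it cover the true mean?
    cover_future += (lo <= rng.normal(mu, sigma) <= hi)  # a future observation?

mean_rate = cover_mean / reps       # close to 0.95
future_rate = cover_future / reps   # far below 0.95
```

With n = 100 the interval has half-width about 0.2σ, so only roughly 16% of future draws land inside it even though mean coverage is near nominal.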
What does this mean for the parametric bootstrap?
• I have seen work where the bootstrap samples were generated by adding parametric noise to the observed data
• This is not actually correct according to bootstrap theory
  • It adds error to something that already has error in it: $y$
• Instead, noise should be added to the regression mean or otherwise predicted mean: $\mu$
  • With something like a GP, it could be added to realizations of the random functions
(a) (b)
(c) (d)
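A sketch of the contrast for a simple linear regression (slope, intercept, and noise level are invented for illustration): resampling around the fitted mean reproduces the data-generating scatter, while adding noise on top of the already-noisy y roughly inflates the scatter by √2:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.linspace(0, 1, n)
y = 1.0 + 2.0 * x + 0.3 * rng.standard_normal(n)   # true mean + noise

# Fit the regression mean and residual scale
A = np.vstack([np.ones(n), x]).T
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
mu_hat = A @ coef
sigma_hat = np.sqrt(np.mean((y - mu_hat) ** 2))

def resample_spread(center, reps=200):
    # RMS deviation of bootstrap replicates from the TRUE mean line
    devs = []
    for _ in range(reps):
        y_star = center + sigma_hat * rng.standard_normal(n)
        devs.append(np.mean((y_star - (1.0 + 2.0 * x)) ** 2))
    return np.sqrt(np.mean(devs))

spread_correct = resample_spread(mu_hat)  # noise added to the fitted mean
spread_wrong = resample_spread(y)         # noise added on top of noisy data
```

The correct scheme reproduces scatter near the true noise level (0.3 here); centering resamples on y stacks two copies of the noise, so the fake replicates are systematically too dispersed.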
Summary of what I hope I covered:
• GP review
• GP Numerical Solvers
• Hierarchical Model Validation
• Exploring Model Discrepancy with
  • Statistics
  • Machine Learning
  • A.I.
• Extra time:
  • Quick aside about error bars and the parametric bootstrap
Thanks! Questions?