![Page 1: Course round-up subtitle- Statistical model building Marian Scott University of Glasgow Glasgow, Aug 2013](https://reader035.vdocument.in/reader035/viewer/2022070305/5515c95255034693758b4a2b/html5/thumbnails/1.jpg)
Course round-up: Statistical model building
Marian Scott, University of Glasgow
Glasgow, Aug 2013
Step 1
Why do you want to build a model? What is your objective?
What data are available and how were they collected?
Is there a natural response or outcome, and are there other explanatory variables or covariates?
Modelling objectives
Explore relationships
Make predictions
Improve understanding
Test hypotheses
[Diagram with boxes: Conceptual system, Data, Model, Policy; connecting labels: inputs & parameters, model results, feedbacks]
Value judgements
Different criteria of unequal importance.
The key comparison is often a comparison to observational data (RSS, AIC, ...),
but such comparisons must include the model uncertainties and the uncertainties on the observational data.
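
The two criteria named above can be sketched in a few lines. This is an illustrative example only: the observations and the two sets of predictions are made up, and the AIC formula used is the usual least-squares form with Gaussian errors, stated up to an additive constant. It does not account for uncertainties on the observations, which the slide says a full comparison must include.

```python
import math


def rss(observed, predicted):
    """Residual sum of squares between observations and model predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def aic_gaussian(observed, predicted, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to a constant:
    n * log(RSS / n) + 2 * n_params."""
    n = len(observed)
    return n * math.log(rss(observed, predicted) / n) + 2 * n_params


# Hypothetical observations and predictions from two candidate models.
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
model_a = [2.0, 4.0, 6.0, 8.0, 10.0]  # assumed to come from a 2-parameter model
model_b = [2.1, 3.9, 6.1, 7.9, 10.1]  # assumed to come from a 4-parameter model

rss_a, rss_b = rss(observed, model_a), rss(observed, model_b)
aic_a = aic_gaussian(observed, model_a, n_params=2)
aic_b = aic_gaussian(observed, model_b, n_params=4)

print(f"RSS: A={rss_a:.3f}, B={rss_b:.3f}")
print(f"AIC: A={aic_a:.2f}, B={aic_b:.2f} (lower is preferred)")
```

RSS always favours the more flexible model on the data used for fitting; AIC adds a penalty of 2 per parameter so that extra flexibility has to earn its place.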
Questions we ask about models
Is the model valid? Are the assumptions reasonable? Does the model make sense based on best scientific knowledge?
Is the model credible? Do the model predictions match the observed data?
How uncertain are the results?
Stages in modelling
Design and conceptualisation:
– Visualisation of structure
– Identification of processes
– Choice of parameterisation
Fitting and assessment:
– Parameter estimation (calibration)
– Goodness of fit
A visual model: atmospheric flux of pollutants
• Atmospheric pollutants dispersed over Europe
• In the 1970s, considerable environmental damage was caused by acid rain
• International action
• Development of the EMEP programme, models and measurements
The mathematical flux model
L: Monin-Obukhov length
u*: Friction velocity of wind
cp: constant (=1.01)
ρ: constant (=1246 g m⁻³)
T: air temperature (in Kelvin)
k: constant (=0.41)
g: gravitational acceleration (=9.81 m/s²)
H: the rate of heat transfer per unit area
gasht: current height at which measurements are taken
d: zero plane displacement
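
The equation on this slide did not survive in the transcript, so the sketch below uses the standard textbook definition of the Monin-Obukhov length, L = -(ρ cp T u*³) / (k g H), with the constants listed above. It may differ from the form on the original slide. The units of cp (J g⁻¹ K⁻¹), the reading of the 1246 g m⁻³ constant as air density, the use of gasht and d in the dimensionless height (gasht - d) / L, and all input values are assumptions made for illustration.

```python
import math

# Constants as listed on the slide; the unit of CP is an assumption.
RHO = 1246.0  # air density, g m^-3
CP = 1.01     # specific heat of air, assumed J g^-1 K^-1
K = 0.41      # von Karman constant
G = 9.81      # gravitational acceleration, m s^-2


def monin_obukhov_length(u_star, temperature, heat_flux):
    """Standard Monin-Obukhov length in metres.

    u_star: friction velocity (m/s); temperature: air temperature (K);
    heat_flux: H, the rate of heat transfer per unit area (W m^-2).
    """
    if heat_flux == 0:
        return math.inf  # neutral conditions: L is unbounded
    return -(RHO * CP * temperature * u_star ** 3) / (K * G * heat_flux)


def stability_parameter(gasht, d, length):
    """Dimensionless height (gasht - d) / L (assumed use of gasht and d)."""
    return (gasht - d) / length


# Hypothetical inputs for a daytime (upward heat flux) case.
L = monin_obukhov_length(u_star=0.3, temperature=288.0, heat_flux=100.0)
zeta = stability_parameter(gasht=3.0, d=0.5, length=L)
print(f"L = {L:.2f} m, (gasht - d)/L = {zeta:.3f}")
```

With a positive heat flux the length is negative, which corresponds to unstable conditions in the usual sign convention.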
What would a statistician do if confronted with this problem?
Look at the data
Understand the measurement processes
Think about how the scientific knowledge and the conceptual model relate to what we have measured
Step 2: understand your data
Study your data
Learn its properties
Tools: graphical
Measured atmospheric fluxes for 1997
• Measured fluxes for 1997 are still noisy.
• Is there a statistical signal, and at what timescale?
[Figure: measured fluxes for 1997; y-axis: Fluxes (ticks 0 to 15), x-axis: Index (ticks 100 to 300)]
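
One simple way to look for a signal at different timescales is to smooth the series with moving averages of different widths. The sketch below is illustrative only: the daily series is synthetic (a seasonal sine curve plus Gaussian noise), not the 1997 flux measurements, and the window lengths of 7 and 31 days are arbitrary choices.

```python
import math
import random
import statistics


def moving_average(series, window):
    """Moving average over `window` consecutive points.

    Returns len(series) - window + 1 values.
    """
    return [statistics.fmean(series[i:i + window])
            for i in range(len(series) - window + 1)]


def roughness(series):
    """Mean absolute change between consecutive points."""
    return statistics.fmean(abs(b - a) for a, b in zip(series, series[1:]))


# Synthetic stand-in for a noisy daily flux series (not real data).
rng = random.Random(1997)
signal = [7.5 + 3.0 * math.sin(2 * math.pi * day / 365) for day in range(365)]
fluxes = [s + rng.gauss(0, 2.0) for s in signal]

weekly = moving_average(fluxes, 7)
monthly = moving_average(fluxes, 31)

for name, series in [("raw", fluxes), ("7-day", weekly), ("31-day", monthly)]:
    print(f"{name:>6}: {len(series)} points, roughness {roughness(series):.3f}")
```

Plotting the raw and smoothed series together shows which features persist as the window widens.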
Key properties of any measurement
Accuracy refers to the deviation of the measurement from the ‘true’ value
Precision refers to the variation in a series of replicate measurements (obtained under identical conditions)
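
As a small numerical illustration of the two definitions, accuracy can be summarised by the bias of the replicate mean and precision by the standard deviation of the replicates. The replicate values and the 'true' value below are invented for the example.

```python
import statistics


def accuracy_and_precision(replicates, true_value):
    """Return (bias, spread): mean deviation from the true value,
    and the sample standard deviation of the replicates."""
    bias = statistics.fmean(replicates) - true_value
    spread = statistics.stdev(replicates)
    return bias, spread


true_value = 10.0  # hypothetical 'true' value
accurate_imprecise = [8.0, 12.0, 9.0, 11.0, 10.0]
inaccurate_precise = [12.1, 12.0, 11.9, 12.0, 12.0]

bias_1, sd_1 = accuracy_and_precision(accurate_imprecise, true_value)
bias_2, sd_2 = accuracy_and_precision(inaccurate_precise, true_value)
print(f"accurate but imprecise: bias {bias_1:+.2f}, sd {sd_1:.2f}")
print(f"inaccurate but precise: bias {bias_2:+.2f}, sd {sd_2:.2f}")
```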
Accuracy and precision
[Figure: panels labelled Accurate, Imprecise, Inaccurate, Precise]
Data properties
Nature and distribution of the data: continuous, counts, ...; Normal, exponential, Poisson; a transformation may be needed
Missing data, outliers, limits of detection
Use pictures to explore
Step 3: build the statistical model
Outcomes or Responses
Causes or Explanations: these are the conditions or environment within which the outcomes or responses have been observed (the covariates).
This has been the focus of much of the week, whether through a linear model, a smooth flexible model, a time series model, a Bayesian model, ...
Are you a Bayesian?
What does that mean?
It means you have prior information (belief) that you want to include in your statistical model.
You need to find a way of capturing this in the prior distribution.
The model output is then a posterior distribution on the quantity of interest, which automatically incorporates uncertainty.
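
A minimal sketch of the prior-to-posterior step, using the simplest conjugate case: a Normal prior on an unknown mean and Normal observations with known standard deviation. The prior, the data and the observation error are all hypothetical; real problems usually need a richer model and numerical methods.

```python
import math
import statistics


def normal_posterior(prior_mean, prior_sd, data, obs_sd):
    """Posterior mean and sd for a Normal mean with known observation sd."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(data) / obs_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * statistics.fmean(data))
    return post_mean, math.sqrt(post_var)


# Hypothetical prior belief and measurements.
prior_mean, prior_sd = 5.0, 2.0
data = [6.8, 7.4, 7.1, 6.9]
post_mean, post_sd = normal_posterior(prior_mean, prior_sd, data, obs_sd=1.0)
print(f"posterior: mean {post_mean:.3f}, sd {post_sd:.3f}")
```

The posterior mean lies between the prior mean and the sample mean, and the posterior standard deviation is smaller than the prior one; the posterior distribution itself is the statement of uncertainty.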
Calibration: using the data
It is a good idea, if possible, to have a training set and a test set of data: split the data (90%/10%).
Fit the model using the training set, evaluate the model using the test set.
Why? Because if we assess how well the model performs on the data that were used to fit it, then we are being over-optimistic.
Other methods: bootstrap and jackknife
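
The 90%/10% split can be sketched as follows. The data are simulated from a straight line with noise, and the model is an ordinary least-squares line; both are stand-ins chosen to keep the example short, not the models used in the course.

```python
import math
import random


def fit_line(x, y):
    """Least-squares intercept and slope for a single covariate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(x, y, intercept, slope):
    """Root mean squared prediction error of the fitted line."""
    return math.sqrt(sum((yi - intercept - slope * xi) ** 2
                         for xi, yi in zip(x, y)) / len(x))


# Simulated data (hypothetical): y = 1 + 2x + noise.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1.0) for xi in x]

# Random 90%/10% split into training and test sets.
indices = list(range(len(x)))
rng.shuffle(indices)
train_idx, test_idx = indices[:90], indices[90:]
x_train, y_train = [x[i] for i in train_idx], [y[i] for i in train_idx]
x_test, y_test = [x[i] for i in test_idx], [y[i] for i in test_idx]

intercept, slope = fit_line(x_train, y_train)   # fit on training data only
train_rmse = rmse(x_train, y_train, intercept, slope)
test_rmse = rmse(x_test, y_test, intercept, slope)  # honest assessment
print(f"slope {slope:.2f}, train RMSE {train_rmse:.2f}, test RMSE {test_rmse:.2f}")
```

The test-set error is the one to report. With a small test set it is itself noisy, which is one reason resampling methods such as the bootstrap are also used.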
Which variables to include
Use your science knowledge
Use pictures to look for patterns
Maybe use some of the more algorithmic ways to select the set (stepwise, BSR, ...)
How to compare models? Nested models (ANOVA, likelihood ratio test)
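
A sketch of a likelihood ratio test for two nested Gaussian models: a constant mean versus a straight line in one covariate. The data are simulated, and the test uses the large-sample chi-squared approximation with one degree of freedom (one extra parameter), for which the tail probability can be written with the complementary error function. For small samples the exact F-test from ANOVA would be the more usual choice.

```python
import math
import random


def rss_constant(y):
    """RSS of the smaller model: a constant mean."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def rss_line(x, y):
    """RSS of the larger model: intercept plus slope, fitted by least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def lrt_one_extra_parameter(rss_small, rss_large, n):
    """Likelihood ratio statistic n*log(RSS_small/RSS_large) and its
    approximate chi-squared (1 df) p-value."""
    statistic = n * math.log(rss_small / rss_large)
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value


# Simulated data (hypothetical): a clear linear trend plus noise.
rng = random.Random(7)
x = [float(i) for i in range(30)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1.0) for xi in x]

rss_small, rss_large = rss_constant(y), rss_line(x, y)
statistic, p_value = lrt_one_extra_parameter(rss_small, rss_large, len(y))
print(f"LR statistic {statistic:.1f}, approximate p-value {p_value:.2g}")
```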
Uncertainty and sensitivity analysis
Uncertainty (in variables, models, parameters, data)
What are uncertainty and sensitivity analyses?
Modelling tools - SA/UA
Sensitivity analysis
determining the amount and kind of change produced in the model predictions by a change in a model parameter
Uncertainty analysis
an assessment/quantification of the uncertainties associated with the parameters, the data and the model structure.
SA flow chart (Saltelli, Chan and Scott, 2000)
Design of the SA experiment
Simple factorial designs (one at a time)
Factorial designs (including potential interaction terms)
Fractional factorial designs
Important difference: design in the context of computer code experiments; random variation due to variation in experimental units does not exist.
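
The three designs can be sketched for a toy computer model with three inputs at two levels each. The model, the factor names and the levels are hypothetical; the half fraction shown is the standard two-level one that keeps runs whose coded levels (-1/+1) multiply to +1.

```python
import math
from itertools import product

# Hypothetical low/high levels for three model inputs.
levels = {"a": (0.5, 1.5), "b": (10.0, 20.0), "c": (0.1, 0.3)}
baseline = {name: low for name, (low, high) in levels.items()}


def toy_model(a, b, c):
    """Deterministic stand-in for a computer code (has an a-b interaction)."""
    return a * b + c


def one_at_a_time(levels, baseline):
    """Baseline run plus one run per factor moved to its other level."""
    runs = [dict(baseline)]
    for name, values in levels.items():
        for value in values:
            if value != baseline[name]:
                runs.append({**baseline, name: value})
    return runs


def full_factorial(levels):
    """Every combination of levels: supports estimating interactions."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]


def half_fraction(levels):
    """Two-level half fraction: coded levels multiply to +1."""
    names = list(levels)
    return [{n: levels[n][(c + 1) // 2] for n, c in zip(names, coded)}
            for coded in product((-1, 1), repeat=len(names))
            if math.prod(coded) == 1]


oat_runs = one_at_a_time(levels, baseline)
ff_runs = full_factorial(levels)
hf_runs = half_fraction(levels)
outputs = [toy_model(**run) for run in ff_runs]

# A computer code is deterministic: repeating a run gives the same output,
# so replicate runs carry no information about random variation.
repeat_outputs = [toy_model(**run) for run in ff_runs]
print(len(oat_runs), len(ff_runs), len(hf_runs), outputs == repeat_outputs)
```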
Global SA
Global SA apportions the output uncertainty to the uncertainty in the input factors, covering their entire range space.
A global method evaluates the effect of xj while all other xi, i ≠ j, are varied as well.
How is a sampling (global) based SA implemented?
Step 1: define model, input factors and outputs
Step 2: assign p.d.f.s to the input parameters/factors and, if necessary, a covariance structure (DIFFICULT).
Step 3: simulate realisations from the parameter p.d.f.s to generate a set of model runs, giving the set of output values.
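
The three steps can be sketched with a toy model. Everything specific here is an assumption for illustration: the model function, the three input factors, their distributions, the independence between them (no covariance structure), and the number of runs.

```python
import random


# Step 1: define the model, its input factors and its output (hypothetical).
def model(x1, x2, x3):
    return 4.0 * x1 + 2.0 * x2 ** 2 + 0.1 * x3


# Step 2: assign p.d.f.s to the inputs (assumed independent here).
rng = random.Random(2013)


def sample_inputs():
    return (
        rng.gauss(1.0, 0.2),     # x1 ~ Normal(mean 1, sd 0.2)
        rng.uniform(0.0, 1.0),   # x2 ~ Uniform(0, 1)
        rng.expovariate(1.0),    # x3 ~ Exponential(rate 1)
    )


# Step 3: simulate realisations and run the model once per realisation.
n_runs = 1000
inputs = [sample_inputs() for _ in range(n_runs)]
outputs = [model(*x) for x in inputs]

print(f"{n_runs} runs, output range {min(outputs):.2f} to {max(outputs):.2f}")
```

In practice Step 2 is the hard part: the sampled ranges and shapes largely determine what the analysis can say.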
SA -analysis
At the end of the computer experiment, the data are of the form (yij, x1i, x2i, ..., xni), where x1, ..., xn are the realisations of the input factors.
Analysis includes regression analysis (on raw and ranked values), standard hypothesis tests of distribution (mean and variance) for subsamples corresponding to given percentiles of x, and Analysis of Variance.
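
As a simple stand-in for the regression analysis on raw and ranked values, the sketch below computes the correlation between each input and the output, first on the raw values and then on their ranks. For a one-input regression the correlation equals the standardised slope. The model and input distributions are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def ranks(values):
    """Ranks from 1 (smallest) to n (largest); ties are not expected here."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def model(x1, x2, x3):
    """Hypothetical computer model."""
    return 4.0 * x1 + 2.0 * x2 ** 2 + 0.1 * x3


rng = random.Random(2013)
inputs = [(rng.gauss(1.0, 0.2), rng.uniform(0.0, 1.0), rng.expovariate(1.0))
          for _ in range(1000)]
y = [model(*x) for x in inputs]

columns = list(zip(*inputs))  # one sequence per input factor
raw = [pearson(col, y) for col in columns]
ranked = [pearson(ranks(col), ranks(y)) for col in columns]

for j, (r, s) in enumerate(zip(raw, ranked), start=1):
    print(f"x{j}: raw correlation {r:.2f}, rank correlation {s:.2f}")
```

Rank-based measures are useful when the relationship is monotonic but not linear.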
How can SA/UA help?
SA/UA have a role to play in all modelling stages:
– We learn about model behaviour and ‘robustness’ to change;
– We can generate an envelope of ‘outcomes’ and see whether the observations fall within the envelope;
– We can ‘tune’ the model and identify reasons/causes for differences between model and observations.
On the other hand - Uncertainty analysis
Parameter uncertainty
– usually quantified in the form of a distribution.
Model structural uncertainty
– more than one model may be fit, expressed as a prior on model structure.
Scenario uncertainty
– uncertainty on future conditions.
An uncertainty example (Ron Smith)
[Figure panels: Original; Mean of 100 simulations; Standard deviation]
An uncertainty example
[Figure panels: CV from 100 simulations; Possible bias from 100 simulations]
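
The summaries named in this example (mean, standard deviation, CV and possible bias from 100 simulations) can be computed as sketched below for a single output value. This is not the model from the example: the toy model, the lognormal input uncertainty, and the reading of 'possible bias' as the mean of the simulations minus the original run are all assumptions.

```python
import random
import statistics


def model(emission_factor):
    """Hypothetical model: output proportional to one uncertain input."""
    return 50.0 * emission_factor


original = model(1.0)  # run with the nominal input value

# 100 simulations with the input drawn from an assumed lognormal distribution.
rng = random.Random(100)
simulations = [model(rng.lognormvariate(0.0, 0.3)) for _ in range(100)]

mean_sim = statistics.fmean(simulations)
sd_sim = statistics.stdev(simulations)
cv = sd_sim / mean_sim                # coefficient of variation
possible_bias = mean_sim - original   # assumed definition of 'possible bias'

print(f"original {original:.1f}, mean {mean_sim:.1f}, sd {sd_sim:.1f}, "
      f"CV {cv:.2f}, possible bias {possible_bias:+.1f}")
```

Applied to each grid cell of a spatial model, the same four summaries would give maps of the kind the panel titles suggest.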
An uncertainty example
• Model sensitivity analysis identifies weak areas
• Lack of knowledge of the accuracy of inputs is a significant problem
• There may be biases in the model output which, although probably small in this case, may be important
• Model emulators have become popular
Take home message
• Only able to give you a flavour of what might be possible
• Good environmental science and good statistical science are key for all problems
• Think critically: test and re-test your hypotheses and assumptions
Take home message
• Resources
• Many good books (you have seen some of these over the sessions); no one size fits all
• JISC mail list: Envstat (worth joining)
• The Royal Statistical Society has an Environmental Statistics section, which sometimes holds tutorial meetings on topics.