Model Evaluation and Comparison
DAIDD, Gainesville, Florida, December 2013. Model Evaluation and Comparison. Jim Scott, Ph.D., M.A., M.P.H.
![Page 1: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/1.jpg)
DAIDD
Gainesville, Florida
December 2013
Jim Scott, Ph.D., M.A., M.P.H.
![Page 2: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/2.jpg)
Goals
By the end of this talk, I hope you’ll:
Have a good sense of what model evaluation is, why it’s important, and how it’s tied to your research question
Know some of the characteristics that are desirable in models
![Page 3: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/3.jpg)
Specific question
Identify relevant factors and information
Model formulation
Mathematics
Evaluation
![Page 4: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/4.jpg)
“E” topics: Edison, Thomas; Epicycle; Epidemiology; Elasticity; Eratosthenes; Euler; Existence; Evolution; Extrapolation; Eradication
“M” topics: Malthus, Thomas; Mars; Maternity; Maximum likelihood; Maxwell, James C.; Misery; Monte Carlo; Moon; Mortality; Mumps
![Page 5: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/5.jpg)
Let’s look at a few different types of models
Source: Wikipedia
![Page 6: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/6.jpg)
Source: Wikipedia
![Page 7: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/7.jpg)
Source: Wikipedia
![Page 8: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/8.jpg)
Source: XKCD
![Page 9: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/9.jpg)
[Figure: Influence of Candidate Height on Presidential Elections. Scatter plot with one labeled point per U.S. presidential candidate, from Thomas Jefferson and John Adams through Barack Obama and John McCain; x-axis: height (cm), 160 to 200; y-axis ticks: 30 to 70.]
Estimated Percent Popular Vote = 1.07 + 0.26 * Height
![Page 10: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/10.jpg)
Source: Whitlock & Schluter, The Analysis of Biological Data
![Page 11: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/11.jpg)
The Krebs Cycle
The Eye
![Page 12: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/12.jpg)
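[Diagram: S → I compartment model with births into S, infection at a rate proportional to SI/N, and mortality out of I. Legend: birth rate; N = S + I; infection rate; Weibull mortality applied to I.]
[Figure: P(surviving), 0.0 to 1.0, vs. time (years), 0 to 30, for two mortality curves: Normal (Weibull 2) and Exponential (Weibull 1).]
Slide credit: J. Hargrove/B. Williams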
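The two survival curves named on this slide can be sketched with the Weibull survival function S(t) = exp(-(t/scale)^shape). Shape 1 gives the exponential curve and shape 2 gives the curve the slide labels "Normal". The scale of 10 years is an assumption for illustration only; the slide does not state its parameters.

```python
import math

def weibull_survival(t, shape, scale=10.0):
    """P(surviving to time t) under a Weibull distribution.

    scale=10.0 years is an assumed value; the slide gives no parameters.
    """
    return math.exp(-((t / scale) ** shape))

for years in (0, 5, 10, 20, 30):
    exp_curve = weibull_survival(years, shape=1)  # "Exponential (Weibull 1)"
    w2_curve = weibull_survival(years, shape=2)   # "Normal (Weibull 2)"
    print(f"t={years:2d}  shape 1: {exp_curve:.3f}  shape 2: {w2_curve:.3f}")
```

With shape 2, survival stays higher early on and then drops faster than the exponential curve, which is the qualitative difference the plot shows.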
![Page 13: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/13.jpg)
Consider the previous examples
Talk with someone next to you
Come up with a list of characteristics that a “good” model should have
For example, you might say “simplicity”
![Page 14: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/14.jpg)
Accurate (i.e. low bias)
![Page 15: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/15.jpg)
A model is accurate if estimates based on the model match the truth
E.g. models that are used to predict the weather are reasonably accurate when predicting tomorrow’s weather. They are much less accurate at predicting the weather at times further into the future.
Does the model fit the data?
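One way to make "low bias" concrete is to compare the average of a model's estimates with the true value. The forecasts below are hypothetical numbers chosen only to illustrate the arithmetic.

```python
from statistics import mean

def bias(estimates, truth):
    """Average estimate minus the true value; near zero means accurate."""
    return mean(estimates) - truth

# Hypothetical temperature forecasts (deg F) for a day whose true high was 70.
next_day = [69, 71, 70, 72, 68]
ten_days_out = [75, 78, 74, 77, 76]

print("bias, next-day forecasts:", bias(next_day, 70))
print("bias, ten-day forecasts: ", bias(ten_days_out, 70))
```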
![Page 16: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/16.jpg)
Source: Wikipedia
![Page 17: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/17.jpg)
![Page 18: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/18.jpg)
Farr initially believed in the miasma theory of disease transmission – disease was propagated by “bad air”
The higher the elevation, the better the air
Mortality at terrace level X = (Mortality at terrace 1) / (terrace level X)
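Farr's elevation rule is simple enough to write as a one-line function. This is a minimal sketch; the baseline mortality of 120 is a hypothetical figure, not one from the slide.

```python
def farr_mortality(mortality_at_terrace_1, terrace_level):
    """Farr's elevation rule as stated on the slide."""
    if terrace_level < 1:
        raise ValueError("terrace levels start at 1")
    return mortality_at_terrace_1 / terrace_level

# Hypothetical baseline mortality of 120 at terrace 1.
for level in (1, 2, 3, 4):
    print(level, farr_mortality(120, level))
```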
![Page 19: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/19.jpg)
![Page 20: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/20.jpg)
r = 0.9972
Est. Temp = 155.3 + 1.90 * Pressure
[Figure: Boiling Point of Water vs Pressure; x-axis: Pressure (inches of Hg), 20 to 30; y-axis: Temp (°F), 195 to 215.]
Source: Wikipedia
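As a quick accuracy check, the fitted line can be evaluated at a pressure where the answer is known. The reference point (standard sea-level pressure of 29.92 inches of Hg, where water boils at 212 °F) is brought in here for illustration and is not on the slide.

```python
def boiling_point_linear(pressure_inhg):
    """Estimated boiling point (deg F) from the slide's fitted line."""
    return 155.3 + 1.90 * pressure_inhg

# 29.92 inHg is standard sea-level pressure, where water boils at 212 deg F.
estimate = boiling_point_linear(29.92)
print(f"estimate: {estimate:.2f} deg F, error: {estimate - 212:+.2f}")
```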
![Page 21: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/21.jpg)
Accurate (i.e. low bias)
Descriptively realistic
![Page 22: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/22.jpg)
A model is descriptively realistic if it’s derived from a correct description of the mechanism involved in whatever is being modeled
Corollary: underlying assumptions are correct
Statistical models are not descriptively realistic
▪ For example, a linear equation only models a pattern in the data – there’s nothing telling us what’s going on behind the scenes
An SIR model is more descriptively realistic
▪ A mechanism for transmission is specified
![Page 23: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/23.jpg)
[Figure: Ln P (3 to 3.4) vs. inverse T (0.0047 to 0.0052).]
Source: Wikipedia
![Page 24: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/24.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
![Page 25: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/25.jpg)
A model is precise if the estimates that the model produces have low variability
E.g. a model that estimates that it will start to rain in the next 3 – 6 hours is more precise than a model that estimates it will start to rain in the next 3 – 6 days
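The rain example can be put on a common scale by comparing the widths of the two forecast windows in hours. The helper name below is hypothetical; it is only a sketch of the comparison.

```python
def window_width_hours(start, end, unit_hours=1):
    """Width of a forecast window, converted to hours."""
    return (end - start) * unit_hours

hours_window = window_width_hours(3, 6)                # "3 - 6 hours"
days_window = window_width_hours(3, 6, unit_hours=24)  # "3 - 6 days"
print(hours_window, days_window)
```

The narrower window is the more precise estimate, whether or not it turns out to be accurate.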
![Page 26: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/26.jpg)
Weekly U.S. Influenza Surveillance Report, http://www.cdc.gov/flu/weekly/
![Page 27: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/27.jpg)
I(a,t) is the incidence at age a at time t
P(a,t) is the age-specific prevalence at age a at time t
* Simplified model
![Page 28: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/28.jpg)
![Page 29: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/29.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
![Page 30: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/30.jpg)
A model is general if it applies to a wide variety of situations
E.g. the law of supply and demand
Source: Wikipedia
![Page 31: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/31.jpg)
![Page 32: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/32.jpg)
S → I → R

$$\frac{dS}{dt} = -\beta \frac{SI}{N}, \qquad \frac{dI}{dt} = \beta \frac{SI}{N} - \gamma I, \qquad \frac{dR}{dt} = \gamma I$$
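The SIR equations on this slide can be stepped forward numerically. The sketch below uses forward Euler and the conventional symbols β (transmission rate) and γ (recovery rate); the parameter values and population sizes are assumptions chosen for illustration, not values from the talk.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the standard SIR equations."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # flow S -> I
        new_recoveries = gamma * i * dt         # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history

# Assumed values: beta = 0.5/day, gamma = 0.2/day, population of 1000.
history = simulate_sir(beta=0.5, gamma=0.2, s0=990, i0=10, r0=0, days=100)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
print(f"peak of {peak_infected:.0f} infected near day {peak_time:.0f}")
```

Forward Euler is the simplest possible choice; a smaller `dt` or a proper ODE solver would give a more accurate trajectory.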
![Page 33: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/33.jpg)
[Figure: causal pies built from the component causes A, B, X, and Z; X appears in every pie.]
Each pie represents a sufficient cause for disease (i.e. disease is inevitable)
Each letter represents a component cause for a disease
The component cause X is a necessary cause (i.e. disease cannot occur without it)
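The pie picture translates directly into sets: each sufficient cause is a set of component causes, and a necessary cause is any component found in every set. The pies below are hypothetical; only the letters A, B, X, and Z and the fact that X is necessary come from the slide.

```python
def necessary_causes(sufficient_causes):
    """Component causes present in every sufficient cause."""
    pies = [set(pie) for pie in sufficient_causes]
    return set.intersection(*pies) if pies else set()

# Hypothetical pies; the slide's exact composition is not reproduced here.
pies = [{"A", "B", "X"}, {"X", "Z"}, {"A", "X", "Z"}]
print(necessary_causes(pies))  # {'X'}
```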
![Page 34: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/34.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
Robust
![Page 35: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/35.jpg)
A model is robust if it is relatively immune to errors in the data and/or immune to small violations of model assumptions
Is the model very sensitive to relatively small changes in estimated input parameters?
▪ Model is NOT robust
Do model predictions remain accurate even when some key assumptions do not strictly hold?
▪ Model IS robust
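A simple way to probe the first question is a one-at-a-time sensitivity check: bump each input by 1% and see how much the output moves. The toy model and its parameter values below are hypothetical and stand in for whatever model is being evaluated.

```python
import math

def relative_change(model, params, name, bump=0.01):
    """Relative change in output when one parameter is increased by `bump`."""
    base = model(**params)
    bumped = dict(params, **{name: params[name] * (1 + bump)})
    return (model(**bumped) - base) / base

def toy_model(scale, growth):
    """Hypothetical model: exponential growth over 10 time units."""
    return scale * math.exp(growth * 10)

params = {"scale": 100.0, "growth": 0.5}
for name in params:
    print(name, f"{relative_change(toy_model, params, name):.4f}")
```

Here a 1% change in `scale` moves the output by about 1%, while a 1% change in `growth` moves it by about 5%, so an error in `growth` matters far more.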
![Page 36: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/36.jpg)
![Page 37: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/37.jpg)
[Figure: % disease attributable to water as a function of poor hygiene and poor sanitation.]
No shedding of pathogens (contamination) into the water ( = 0)
![Page 38: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/38.jpg)
[Figure: % disease attributable to water as a function of poor hygiene and poor sanitation.]
Moderate contamination ( = 1.0)
![Page 39: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/39.jpg)
[Figure: % disease attributable to water as a function of poor hygiene and poor sanitation.]
Very high contamination ( = 2.0)
![Page 40: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/40.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
Robust
Simple / Parsimonious
![Page 41: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/41.jpg)
A model is parsimonious if it can “accomplish a lot without much”
E.g. a model that selects a relatively small number of the most useful parameters
Simple isn’t always better
The research question should drive the complexity of the model
![Page 42: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/42.jpg)
r = 0.9972
Est. Temp = 155.3 + 1.90 * Pressure
[Figure: Boiling Point of Water vs Pressure; x-axis: Pressure (inches of Hg), 20 to 30; y-axis: Temp (°F), 195 to 215.]
Source: Wikipedia
![Page 43: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/43.jpg)
[Figure: residuals (-1.5 to 0.5) vs. fitted values (195 to 215). Annotations: “Hmmm… What does this mean?”, “Outlier!”, “Strong evidence of non-linearity”.]
![Page 44: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/44.jpg)
. regress bpt pressure pressure2

| Source | SS | df | MS |
| --- | --- | --- | --- |
| Model | 527.718935 | 2 | 263.859468 |
| Residual | .181079777 | 13 | .013929214 |
| Total | 527.900015 | 15 | 35.1933344 |

Number of obs = 16, F(2, 13) = 18942.88, Prob > F = 0.0000, R-squared = 0.9997, Adj R-squared = 0.9996, Root MSE = .11802

| bpt | Coef. | Std. Err. | t | P>\|t\| | [95% Conf. Interval] |
| --- | --- | --- | --- | --- | --- |
| pressure | 3.754805 | .2026619 | 18.53 | 0.000 | 3.31698, 4.192629 |
| pressure2 | -.0358576 | .0039463 | -9.09 | 0.000 | -.044383, -.0273322 |
| _cons | 131.7824 | 2.570509 | 51.27 | 0.000 | 126.2292, 137.3357 |

[Figure: fitted quadratic curve; x-axis: Pressure, 20 to 30; y-axis ticks: 195 to 215.]
Est. Temp = 131.8 + 3.75 Pressure – 0.036 Pressure²
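The summary statistics in the Stata output follow from its sums of squares and degrees of freedom. The sketch below recomputes them with the standard regression formulas as a consistency check; the variable names are my own.

```python
import math

# Sums of squares and degrees of freedom copied from the Stata output.
ss_model, df_model = 527.718935, 2
ss_resid, df_resid = 0.181079777, 13
ss_total, df_total = 527.900015, 15

ms_model = ss_model / df_model  # mean square, model
ms_resid = ss_resid / df_resid  # mean square, residual

f_stat = ms_model / ms_resid
r_squared = 1 - ss_resid / ss_total
adj_r_squared = 1 - ms_resid / (ss_total / df_total)
root_mse = math.sqrt(ms_resid)

print(f"F = {f_stat:.2f}")
print(f"R-squared = {r_squared:.4f}, adjusted = {adj_r_squared:.4f}")
print(f"Root MSE = {root_mse:.5f}")
```

The recomputed values agree with the reported F, R-squared, adjusted R-squared, and Root MSE to the precision shown in the output.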
![Page 45: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/45.jpg)
$$\frac{dI}{dt} = \beta \frac{SI}{N} - \gamma I$$ vs.
![Page 46: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/46.jpg)
Used as a model selection tool
Penalizes models with excessive parameter spaces
AIC = 2k – 2ln(L), where k is the number of estimated parameters and L is the maximized likelihood
AICc = AIC + 2k(k+1) / (n – k – 1), where n is the sample size
AICc is often used to avoid over-fitting when the sample size is small or the parameter space is large
Lower AICc → more parsimonious model
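The two formulas can be coded directly and used to rank candidate models. In this sketch k is the number of estimated parameters, the log-likelihood is ln(L), and n is the sample size; the two candidate fits are hypothetical numbers, not results from the talk.

```python
def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

def aicc(k, log_likelihood, n):
    """Small-sample correction: AIC + 2k(k+1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(k, log_likelihood) + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical fits to the same n = 16 observations: (k, ln L).
candidates = {"simple": (2, -20.0), "complex": (6, -18.5)}
scores = {name: aicc(k, ll, n=16) for name, (k, ll) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "preferred:", best)
```

The complex model fits slightly better, but with only 16 observations its extra parameters cost more than the improvement in likelihood is worth.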
![Page 47: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/47.jpg)
“Best” model (#1)
“Mass action” model (#5)

$$\frac{dI}{dt} = \beta \frac{SI}{N} - \gamma I$$
![Page 48: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/48.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
Robust
Simple / Parsimonious
Useful
![Page 49: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/49.jpg)
A model is useful if:
its conclusions are useful
it points the way to other good models
E.g. Modeling HIV exercise
▪ The early models weren’t necessarily accurate but they were useful
![Page 50: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/50.jpg)
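[Diagram: S → I compartment model with births into S, infection at a rate proportional to SI/N, and mortality out of I. Legend: birth rate; N = S + I; infection rate; Weibull mortality applied to I.]
[Figure: P(surviving), 0.0 to 1.0, vs. time (years), 0 to 30, for two mortality curves: Normal (Weibull 2) and Exponential (Weibull 1).]
Slide credit: J. Hargrove/B. Williams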
![Page 51: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/51.jpg)
[Figure: prevalence and incidence/mortality vs. year, 1980 to 2020; two vertical axes (0.0 to 1.0 and 0.00 to 0.20).]
Slide credit: J. Hargrove/B. Williams
![Page 52: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/52.jpg)
[Figure: relative transmission, 0.0 to 1.0, vs. year, 1985 to 2000.]
[Diagram: the S → I model with a control term C(t). Legend: birth rate; N = population; mortality applied to I.]
Including control
Slide credit: J. Hargrove/B. Williams
![Page 53: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/53.jpg)
[Figure: prevalence and incidence/mortality vs. year, 1980 to 2020; two vertical axes (0.0 to 0.4 and 0.00 to 0.06).]
Slide credit: J. Hargrove/B. Williams
![Page 54: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/54.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
Robust
Simple / Parsimonious
Useful
Inexpensive
Others???
![Page 55: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/55.jpg)
Accurate (i.e. low bias)
Descriptively realistic
Precise (i.e. low variability)
General
Robust
Simple / Parsimonious
Useful
Inexpensive
Others???
![Page 56: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/56.jpg)
It really depends on what your original research question was
Was the goal to accurately predict something?
Was the goal to determine a relationship between two or more parameters?
Was the goal to understand a system in general terms?
Was the goal to test a hypothesis? Or to generate one?
![Page 57: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/57.jpg)
Consider each of the models presented today
What are the good things about each model?
What are the shortcomings of each model?
Final word:
Source: XKCD
![Page 58: Model Evaluation and Comparison](https://reader031.vdocument.in/reader031/viewer/2022012913/56813c63550346895da5ecdf/html5/thumbnails/58.jpg)
Concepts of Mathematical Modeling, Walter Meyer, McGraw-Hill, 1984
Probability and Statistics, Charles Stone, Duxbury, 1996
Modeling Infectious Diseases in Humans and Animals, Keeling and Rohani, Princeton, 2008
Mathematical Models for Communicable Diseases, Brauer and Castillo-Chavez, SIAM, 2013
The Analysis of Biological Data, Whitlock and Schluter, Roberts and Company, 2008
Wikipedia
XKCD