examining validity and precision of prognostic models. dan mcgee department of statistics florida...
TRANSCRIPT
![Page 1: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/1.jpg)
Examining validity and precision of prognostic models.
Dan McGee
Department of Statistics
Florida State University
![Page 2: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/2.jpg)
Acknowledgements
• The National Heart, Lung, and Blood Institute. Funding: HL67640
• The Diverse Populations Collaboration
![Page 3: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/3.jpg)
• Validity
• Classification Efficacy
• Predictive Accuracy
![Page 4: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/4.jpg)
DPC Collaborating Centres
USA15 cohorts
>230,000 participants
ISRAEL4 cohorts>35,000 participants
DENMARK1 cohort
>10,000 participants
CHINA1 cohort>7,000 participants
NORWAY1 cohort>48,000 participants
PUERTO RICO1 cohort
>9,000 participants
SCOTLAND2 cohorts>22,000 participants
YUGOSLAVIA1 cohort
>6,000 participants
ICELAND1 cohort
>18,000 participants
Hawaii1 cohort
>8,000 participants
![Page 5: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/5.jpg)
• 21 Studies • 49 strata (gender, race, etc.)• 50+ CVD deaths (within 10 years)in each
strata
• 219,973 Observations– 78,980 Female– 9,938 CVD deaths (within 10 years)
![Page 6: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/6.jpg)
Some Published Framingham Risk Models.
Reference Sample Cases/Total Model
1971(Section 27)
2 year risk, people free of CHD, pool of exams 1-8.
Men:370/31,704Women:206/41,834
Logistic
1973 (Section 28)
8 year risk, people free of CVD, pool of exams 2 and 6
Men 350/3813 Women 212/4960
Logistic
1987 (Section 37)
8 year risk, people free of CVD, pool of exams 2, 6, and 10
Men 523/4970Women 359/6570
Logistic
1991 AHA(Circulation)
Pool of Exam 11 of cohort and Exam 1 of offspring free of CHD (12 year follow-up)
Men 385/2590Women 241/2983
Accelerated Failure Time
1998 (Circulation)
Pool of Exam 11 of cohort and Exam 1 of offspring free of CHD.
Men 383/2489Women 227/2856
Proportional Hazards, categorical data.
![Page 7: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/7.jpg)
The Logistic Model
it(
= ( a vector of characteristics, with
Pr( | )
exp
log ) log
, , )' ,,
Y
x
x
x x x x
i i
ij jj
p
ii
iij j
j
p
i i i pi i
FHG
IKJ
FHG
IKJ
11
1
1
1
0
0
0 1 0
x
x
![Page 8: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/8.jpg)
Age, age2, Log(age), Log(age/74)Cholesterol, Log(chol/hdl)SBP, hypotensives, Diabetes, SmokerHypot.*SBP, Chol*age, LVH-ECG, Atrial Fibrillation
![Page 9: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/9.jpg)
Predict CVD death (10 years) based on:
AgeSystolic blood pressureSerum cholesterolDiabetic statusSmoking status (yes/no)
![Page 10: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/10.jpg)
Altman D and Royston P: What do we mean byvalidating a prognostic model? Statist Med 2000;
19:453-473.
• Inform patients and their families.• Create clinical risk groups for stratification.• Inform treatment or other decisions for individual patients.
• Usefulness is determined by how well a model works in practice.
![Page 11: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/11.jpg)
High CVD risk regions, risk based on total cholesterol
Age180 6 6 7 8 10 12 13 15 17 20
160 4 4 5 6 7 8 9 10 12 14
140 2 3 3 4 5 5 6 7 8 10 65120 2 2 2 3 3 4 4 5 6 7
180 3 4 4 5 6 7 8 9 11 12
160 2 2 3 3 4 5 5 6 7 8
140 1 2 2 2 3 3 4 4 5 6 60120 1 1 1 2 2 2 2 3 3 4
180 2 2 2 3 3 4 4 5 6 7
160 1 1 2 2 2 3 3 3 4 5
140 1 1 1 1 1 2 2 2 3 3 55120 0 1 1 1 1 1 1 2 2 2
180 1 1 1 1 2 2 2 3 3 4
160 1 1 1 1 1 1 1 2 2 2
140 0 0 1 1 1 1 1 1 1 2 50120 0 0 0 0 0 1 1 1 1 1
180 0 0 0 0 0 0 0 0 1 1
160 0 0 0 0 0 0 0 0 0 0
140 0 0 0 0 0 0 0 0 0 0 40120 0 0 0 0 0 0 0 0 0 0
4 5 6 7 8 4 5 6 7 8
12 14 17 20 23 24 27 31 36 42
8 10 12 14 16 17 19 23 26 31
6 7 8 10 12 11 13 16 19 23
4 5 6 7 8 8 9 11 13 16
8 10 12 14 16 17 19 22 26 31
6 7 8 9 11 11 13 16 19 22
4 5 5 7 8 8 9 11 13 16
3 3 4 5 6 5 6 8 9 11
5 6 8 9 11 11 13 15 18 21
4 4 5 6 8 7 9 10 13 15
2 3 4 4 5 5 6 7 9 11
2 2 2 3 4 3 4 5 6 7
3 4 5 6 7 7 8 9 11 14
2 3 3 4 5 5 5 7 8 10
2 2 2 3 3 3 4 4 5 7
1 1 1 2 2 2 3 3 4 5
1 1 1 2 2 2 2 3 3 4
1 1 1 1 1 1 2 2 2 3
0 1 1 1 1 1 1 1 2 2
0 0 0 1 1 1 1 1 1 1
4 5 6 7 8 4 5 6 7 8
15% and over10%Ğ14%6Ğ9%4Ğ5%3%2%
1%
< 1%
150 200 250 300mg/dl
Cholesterol mmol
10-year risk of fatal CVD
in areas of high CVD risk
Women Men
Non-smoker Smoker Non-smoker Smoker
![Page 12: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/12.jpg)
Low CVD risk regions, risk based on total cholesterol
Age180 4 4 5 6 6 8 9 10 11 12
160 3 3 3 4 4 5 6 7 7 8
140 2 2 2 2 3 3 4 4 5 6 65120 1 1 1 2 2 2 3 3 3 4
180 2 3 3 3 4 5 5 6 6 7
160 1 2 2 2 2 3 3 4 4 5
140 1 1 1 1 2 2 2 2 3 3 60120 1 1 1 1 1 1 1 2 2 2
180 1 1 1 2 2 2 3 3 3 4
160 1 1 1 1 1 2 2 2 2 3
140 1 1 1 1 1 1 1 1 1 2 55120 0 0 0 0 1 1 1 1 1 1
180 1 1 1 1 1 1 1 1 2 2
160 0 0 0 1 1 1 1 1 1 1
140 0 0 0 0 0 1 1 1 1 1 50120 0 0 0 0 0 0 0 0 0 1
180 0 0 0 0 0 0 0 0 0 0
160 0 0 0 0 0 0 0 0 0 0
140 0 0 0 0 0 0 0 0 0 0 40120 0 0 0 0 0 0 0 0 0 0
4 5 6 7 8 4 5 6 7 8
7 8 9 11 12 14 16 18 21 24
5 5 6 7 9 9 11 12 14 17
3 4 4 5 6 6 7 9 10 12
2 2 3 3 4 4 5 6 7 8
5 5 6 7 8 9 11 12 14 17
3 4 4 5 6 6 7 8 10 12
2 2 3 3 4 4 5 6 7 8
1 2 2 2 3 3 3 4 5 6
3 3 4 5 5 6 7 8 9 11
2 2 3 3 4 4 5 5 6 8
1 2 2 2 3 3 3 4 4 5
1 1 1 1 2 2 2 2 3 4
2 2 2 3 3 3 4 5 6 7
1 1 2 2 2 2 3 3 4 5
1 1 1 1 2 2 2 2 3 3
1 1 1 1 1 1 1 2 2 2
0 1 1 1 1 1 1 1 2 2
0 0 0 1 1 1 1 1 1 1
0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 0 1 1
4 5 6 7 8 4 5 6 7 8
15% and over10%Ğ14%6Ğ9%4Ğ5%3%2%
1%
< 1%
150 200 250 300mg/dl
Cholesterol mmol
10-year risk of fatal CVD
in areas of low CVD risk
Women Men
Non-smoker Smoker Non-smoker Smoker
![Page 13: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/13.jpg)
Reliable classification of patients into different groups with different prognosis.
Area under the Receiver Operator Characteristic Curve
c-statistic, statistic of concordance.
![Page 14: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/14.jpg)
0102030405060708090
100
0 10 20 30 40 50 60 70 80 90 100
False Positives (%)
True
Pos
itive
s (%
)
Receiver Operating Characteristic (ROC) analysis
![Page 15: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/15.jpg)
.6.7
.8.9
Fra
min
gham
Mod
el
.65 .7 .75 .8 .85 .9Study Model
Area Under the ROC Curve
![Page 16: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/16.jpg)
ROC using study cohort model.6 .7 .8 .9 1
Combined
Random effects summary: .79 (.77,.81)
![Page 17: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/17.jpg)
Ordering:
* * * * * 0 1 2 3 4 5 age sbp chol smoking diabetesi i i i
If everyone were the same age, the ordering would be determined by:
sbp chol smoking diabetesi i i i* * * * 2 3 4 5
![Page 18: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/18.jpg)
.6.7
.8.9
Age
-Adj
uste
d, S
tudy
Mod
el
.65 .7 .75 .8 .85 .9Study Model
Area Under the ROC Curve
![Page 19: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/19.jpg)
ROC, study model, age-adjusted.5 .6 .7 .8 .9
Combined
Random effects summary: .71 (.70, .73)
![Page 20: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/20.jpg)
0.0
5.1
.15
.2R
OC
, With
ag
e -
age
-ad
just
ed
5 10 15(sd) age
![Page 21: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/21.jpg)
.6.7
.8.9
Age
-Onl
y, S
tudy
Mod
el
.65 .7 .75 .8 .85 .9Study Model
Area Under the ROC Curve
![Page 22: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/22.jpg)
Classification Model (Gordon 1979)
Each person belongs to either one group or another.
Estimated probabilities tend to be a unimodal right-skewed distribution.
![Page 23: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/23.jpg)
Framingham Males
Predicted Probability
Den
sity
0.0 0.2 0.4 0.6 0.8
02
46
8
![Page 24: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/24.jpg)
Predictive AccuracyGoodness of FitExplained VariationStrength of association
R2
How close are the estimated probabilities to the observed values.
![Page 25: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/25.jpg)
Ordinary Least Squares (OLS)
R2
Coefficient of determinationExplained varianceSquared correlation, observed, predicted
![Page 26: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/26.jpg)
R02
1
2
1
2
1
( )
( )
y p
y p
i ii
n
i ii
n
0.1
.2.3
R-s
qua
re b
ase
d o
n sq
uar
ed
erro
r
Average: .095
![Page 27: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/27.jpg)
Gordon (1979)
p
p
Rp
p
i from a Beta Distribution with:
O2
,
/
1
1
2
2 3
1
![Page 28: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/28.jpg)
RO2
1
2
1
2
1
( )
( )
y y
y y
i ii
n
ii
n
minimizing is not the criteria for developing estimates( )y yi ii
n
2
1
R can decrease with additional information (or even be negative)O2
![Page 29: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/29.jpg)
The error sum of squares is the only reasonable criteria forjudging residual variation in OLS. (Efron 1978)
Several exist for dichotomous dependent variables.
![Page 30: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/30.jpg)
l
(Negative log likelihood of p variable model)
l
Negative log likelihood of intercept only model)
Rl
l
p
0
L2 p
0
y p y p
y y y y
i i i ii
n
i ii
n
log( ) ( ) log( )
log( ) ( ) log( )
(
1 1
1 1
1
1
1
(Menard 2000)
![Page 31: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/31.jpg)
0.1
.2.3
.4
Likelihood based psuedo Rsq
Average: .16
![Page 32: Examining validity and precision of prognostic models. Dan McGee Department of Statistics Florida State University dan@stat.fsu.edu](https://reader030.vdocument.in/reader030/viewer/2022032604/56649e675503460f94b62995/html5/thumbnails/32.jpg)
ROC usingstudycohortmodel
R-squared,squared
error
R-squared,likelihood
based
.6
.7
.8
.9
.6 .7 .8 .9
0
.1
.2
.3
0 .1 .2 .3
0
.2
.4
0 .2 .4