K.G.M. Moons
Julius Center for Health Sciences and Primary Care, UMC Utrecht, www.juliuscenter.nl
Design and analysis of individual prognostic studies
Outline talk
• What is prognosis?
• Prognostic research: design and analysis
  – Summary of our series of 4 papers on prognostic research appearing in BMJ end 2008
• Special focus on dealing with missing values
  – Summary of series of 3 papers in J Clin Epidemiol in 2006 (Donders et al, Moons et al, van der Heijden et al)
What is prognosis?
• Prognosis = foreseeing / foretelling / predicting
  – Weather forecasts, banks going bankrupt
• Medical textbooks: (average) course of an illness
  – Prognosis of MI, Alzheimer, breast cancer
Prognosis in practice
• Too general / does not conform to practice
1. Rather predict course of illness in a particular individual
  – Patient not only has illness but also particular age, gender, symptoms, signs, test results, biomarkers, etc.
2. Prognosis in medicine not only in patients or ill individuals
– risk of pre-eclampsia in pregnant women
– prediction of heart disease or BRCA mutation in general population (Framingham risk score)
– prognosis of newborns (Apgar score)
Prognosis in practice
• Probability that an individual develops a particular state of health (outcome) over a specific time period, based on clinical + non-clinical profile (predictors)
  – Time: hours, days, months, years
  – Outcomes: death, complication, disease progression, QoL, therapy response
  – Predictors: history taking, physical examination, tests (imaging, ECG, biomarkers, genetic ‘markers’), disease state, etc.
Prognosis vs. Prognostic research
• Prognosis = predicting → prognostic research = prediction research
  – Prognostic studies = baseline prognosis
  – Prediction studies = therapy response
  – Same concepts/requirements for design, analysis, reporting
• Similarly, it does not matter whether the predictor under study is a biomarker, imaging, ECG, genetic test result, or answer to a question
Prognostic research: characteristics of design and analysis
Focus on design
1. Inherently multivariable
• Prognosis rarely estimated by single predictor (McShane LM 2005; Riley RD 2003)
  – Prognostic research requires multivariable approach in design and analysis → objectives = providing evidence on:
    1. Outcome occurrence over time
    2. Which are the true prognostic predictors
    3. Whether new predictor (e.g. biomarker) truly adds predictive information to easy-to-obtain predictors
    4. Outcome probabilities for (different) predictor combinations or tools to estimate these probabilities
Knowing the added predictive value is desired
1. Inherently multivariable
• Tools to estimate individual probabilities
  – Prognostic or prediction models / risk scores / prediction rules
• Convert predictor values to absolute probabilities
• Presented by:
  – Mathematical formula requiring calculator / computer
  – Simple scoring rule
  – Nomogram
• Examples: APACHE score, SAPS score, Nottingham prognostic score
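As a minimal sketch of how such a tool converts predictor values into an absolute probability, the snippet below applies a logistic regression formula to two patient profiles. The predictor names and coefficients are hypothetical and chosen only for illustration; they are not taken from APACHE, SAPS, the Nottingham score or any other published model.

```python
import math

# Hypothetical coefficients (log odds ratios); NOT from any published model.
INTERCEPT = -3.0
COEFFICIENTS = {"age_years": 0.02, "recent_surgery": 0.5, "collapse": 1.3}


def predicted_probability(patient):
    """Return the absolute outcome probability for one patient profile."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    # The inverse logit turns the linear predictor into a probability.
    return 1.0 / (1.0 + math.exp(-linear_predictor))


low_risk = {"age_years": 40, "recent_surgery": 0, "collapse": 0}
high_risk = {"age_years": 80, "recent_surgery": 1, "collapse": 1}
print(round(predicted_probability(low_risk), 3))   # about 0.10
print(round(predicted_probability(high_risk), 3))  # about 0.60
```

A simple scoring rule or nomogram is, in essence, a rounded or graphical version of the same calculation.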
2. Prognostic research != aetiologic research
• … despite clear similarities in design and analysis (Brotman, 2005)
1. Different aims
• Aetiologic: explain whether outcome occurrence can be attributed to a particular risk factor → pathophysiology
  – Adjusted for other risk factors, using multivariable approach
• Prognostic: (simply) to predict as accurately as possible
  – Providing insight into causality is neither an aim nor a requirement of prognostic analysis
2. Different requirements in predictors under study
• Aetiologic: factors theoretically in causal chain
• Prognostic: all variables potentially related to outcome can be considered
(Cartoon: “How long do I have, doc?” / “Do you have a red car?”)
2. Prognostic research != aetiologic research
• Every causal factor is a predictor
  – Though often weak: e.g. genetic factors
  – Not vice versa: e.g. skin color and biomarkers
3. Difference in analysis / presentation
• Both multivariable models … but different output reported
• Prognostic studies: absolute probabilities
  – Relative risk estimates (OR/RR/HR) have no direct meaning/relevance → only used to obtain absolute risks for the individual
2. Prognostic research != aetiologic research
• Aetiologic studies: focus on relative risks of etiologic or therapeutic factor relative to its absence
4. Calibration and discrimination of a multivariable model highly important in prognostic but meaningless in aetiologic studies
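To make the two performance measures concrete, here is a small sketch that computes discrimination as the c-statistic (the proportion of event/non-event pairs in which the event has the higher predicted risk) and a crude calibration check (mean predicted risk minus observed outcome frequency, often called calibration-in-the-large). The eight predictions and outcomes are made up for illustration.

```python
def c_statistic(probabilities, outcomes):
    """Discrimination: share of (event, non-event) pairs ranked correctly."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5  # ties count as half
    return concordant / (len(events) * len(non_events))


def calibration_in_the_large(probabilities, outcomes):
    """Calibration: mean predicted risk minus observed outcome frequency."""
    mean_predicted = sum(probabilities) / len(probabilities)
    observed_frequency = sum(outcomes) / len(outcomes)
    return mean_predicted - observed_frequency


# Made-up predicted risks and observed outcomes (1 = event) for eight patients.
predicted = [0.1, 0.2, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9]
observed = [0, 0, 1, 0, 1, 0, 1, 1]
print(c_statistic(predicted, observed))               # 0.78125
print(calibration_in_the_large(predicted, observed))  # about -0.025
```

A c-statistic of 0.5 means no discrimination and 1.0 perfect discrimination; a calibration difference near zero means predicted risks match the observed frequency on average.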
3. Subject selection / sampling
• Best = cohort study
  – Prospective preferred
  – Optimal measurement of predictors and outcomes
  – Retrospective (existing cohort): longer follow-up times but often poorer data
    – Dominate the literature (McShane 2005; Riley 2003)
3. Subject selection / sampling
• Not infrequently case-control data
  – Patients selected on presence/absence of outcome
• Case-control design ideal for causal studies…
  – Aimed at estimating relative risks
• … not for prognostic (or diagnostic) purposes
3. Subject selection / sampling
• Besides an often biased patient selection → sampling fraction of controls (and cases) unknown
  – Relative risks (OR/HR/RR) correct
  – Absolute risks (posterior probabilities) not
    – Applies to single marker studies
    – And to multivariable prognostic model studies
3. Subject selection / sampling
• Exception: nested case-control study (within cohort)
  – Biesheuvel et al, BMC Med Res Methodol 2008; Rutjes et al, Clin Chem 2005.
  – Sampling fraction known (weight controls with inverse sampling fraction)
  – Ideal design if:
    – Predictor measurement expensive (tumor marker, genetic marker)
    – Retrospective analysis of stored study data/human material
      – E.g. Framingham study
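The point about sampling fractions can be illustrated with a small worked sketch. It assumes a hypothetical cohort of 1000 subjects (50 cases, 950 non-cases) from which all cases and exactly 1 in 5 non-cases are sampled; all counts are invented. The odds ratio is the same in the cohort and in the sample, the naive absolute risk in the sample is far too high, and weighting controls by the inverse sampling fraction recovers the cohort risk.

```python
def risk_given_positive(cases_pos, controls_pos, control_sampling_fraction=1.0):
    """Absolute risk among predictor-positives, weighting sampled controls
    by the inverse of their sampling fraction."""
    weighted_controls = controls_pos / control_sampling_fraction
    return cases_pos / (cases_pos + weighted_controls)


def odds_ratio(cases_pos, cases_neg, controls_pos, controls_neg):
    """Odds ratio of the outcome for predictor-positive vs predictor-negative."""
    return (cases_pos / cases_neg) / (controls_pos / controls_neg)


# Hypothetical full cohort: 50 cases (30 predictor-positive),
# 950 non-cases (190 predictor-positive).
cohort_risk = risk_given_positive(30, 190)
cohort_or = odds_ratio(30, 20, 190, 760)

# Nested case-control sample: all cases, 1 in 5 non-cases (38 positive, 152 negative).
sample_or = odds_ratio(30, 20, 38, 152)
naive_risk = risk_given_positive(30, 38)
weighted_risk = risk_given_positive(30, 38, control_sampling_fraction=0.2)

print(cohort_or, sample_or)                   # both 6.0
print(round(cohort_risk, 3), round(naive_risk, 3), round(weighted_risk, 3))
```

In an ordinary case-control study the sampling fraction is unknown, so the last step is not available and only the relative measure can be trusted.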
3. Subject selection / sampling
• Randomised trial data
  – When treatment is ineffective: combine both groups
  – If treatment is effective: only control group (limited power), or may combine → include treatment(s) as separate predictor
    – Treatment studied on independent predictive effect
    – Study interaction treatment*other predictors (next talk)
  – Disadvantages of trial data: less generalisability
    – Strict eligibility criteria
    – Control group also ‘treated’ group
    – Selective refusals / consent
4. Candidate predictors
• History, physical, biomarkers, imaging, disease severity, received treatments
• When studying treatments as predictors (prognosis given treatment):
  – OK when using RCT data, careful with observational data
    – Ideally all required treatments are given and all given treatments are required
    – Treatment administration far from standardised
    – Confounding by indication
  – Most treatments have small predictive value compared to e.g. age, gender, disease stage
4. Candidate predictors
• Prognostic research = pragmatic/applied → to serve practice
• Predictors clearly defined and reproducible, to enhance generalisability
• Care with predictors requiring too much interpretation
  – Imaging test results
  – One then models the observers rather than the test results
5. Outcome
• Preferably patient-relevant outcome
  – Occurrence of event, remission of disease, death, complications, treatment response, tumor growth
  – Intermediates (LOS, physiology measures) unhelpful
    – Except when clearly associated with patient outcome → e.g. CD4 count in HIV
• Measure without knowledge of predictors (and v.v.)
  – Not needed for all-cause death
6. Statistical analysis: prognostic model development
• Katz, Ann Intern Med 2003; Harrell, Stat Med 1996 + 2001 (book); Royston + Sauerbrei 2008 (book); Steyerberg 2008 (book); Royston et al BMJ 2008; Royston et al, BMJ 2008.
• A variety of approaches found in literature
• Focus on dealing with missing values (J Clin Epi 2006)
Introduction
• Missing data always occur (all types of studies)
• Usual complete case (CC) analysis → negatively affects
  – Precision (logically)
  – Commonly validity as well
• Bias depends on type of missingness
3 types of missing values
• MCAR (missing completely at random) = missingness on a variable is independent of any other data (observed and unobserved)
• MAR (missing at random) = missingness depends on other (observed) variables but is independent of unobserved data
  – So we can in fact predict missingness
• MNAR (missing not at random) = missingness depends on unobserved (not available) study data
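The three mechanisms can be made tangible with a small simulation sketch. Everything in it is an assumption for illustration: a fully observed covariate called severity, a lab value that may go missing, and arbitrary missingness probabilities. The true mean of the lab value is 0 overall and about 0.80 among patients with severity > 0.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible


def observed_means(mechanism, n=20000):
    """Simulate one mechanism; return (mean of observed lab values,
    mean of observed lab values among patients with severity > 0)."""
    all_observed, severe_observed = [], []
    for _ in range(n):
        severity = random.gauss(0, 1)        # fully observed covariate
        lab = severity + random.gauss(0, 1)  # variable that may go missing
        if mechanism == "MCAR":
            p_missing = 0.3                           # unrelated to any data
        elif mechanism == "MAR":
            p_missing = 0.6 if severity > 0 else 0.1  # driven by observed data
        else:  # "MNAR"
            p_missing = 0.6 if lab > 0 else 0.1       # driven by the missing value itself
        if random.random() >= p_missing:
            all_observed.append(lab)
            if severity > 0:
                severe_observed.append(lab)
    return (sum(all_observed) / len(all_observed),
            sum(severe_observed) / len(severe_observed))


results = {m: observed_means(m) for m in ("MCAR", "MAR", "MNAR")}
for mechanism, (overall, severe) in results.items():
    print(mechanism, round(overall, 2), round(severe, 2))
```

Under MCAR the observed means stay close to the truth. Under MAR the overall observed mean is biased, but within a severity stratum it is not, which is why missingness (and the missing values) can be predicted from observed data. Under MNAR even the within-stratum mean is biased.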
Example
• Prognostic study using routine care cohort data
  – Aim: prognostic value of history + physical items + added value of lab tests to predict 1-year outcome in patients with bacterial meningitis
  – Very sick patients (commonly those with outcome) instantly referred to additional tests → missing history + physical
  – Less sick ones (more often without event) → missing lab tests
  – CC analysis:
    – Almost zero analysable cases
    – Selection bias → predictive values incorrect
MCAR
• MCAR = no validity problem (only efficiency)
  – Except indicator method + overall mean imputation (later)
  – MCAR can be tested easily
Table. Distribution of co-variates among subjects without and with missing values (100%: n=398). P-values from simple chi-square tests and t-tests (Wilcoxon tests). Values are percentages unless marked otherwise.

Variables                                         No missings     ≥ 1 missing     p-value
                                                  n=246 (62%)     n=152 (38%)
Pulmonary embolism (outcome variable)             47              36              0.02
Dyspnoea                                          80              66              <0.01
Malignancy                                        28              16              <0.01
Surgery in previous 3 months                      24              16              0.04
Prior deep venous thrombosis                      6               10              0.17
Wheezing                                          18              11              0.09
Previous pulmonary embolism                       5               12              0.02
Collapse with or without loss of consciousness    10              5               0.06
Signs of deep venous thrombosis                   11              7               0.15
Age (years)*                                      57 (17)         54 (18)         0.19
Positive chest x-ray                              43              36              0.17
Respiratory rate (breaths/min)*                   22 (7)          18 (6)          <0.01

* Mean (sd)
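A comparison of this kind can be reproduced with a plain chi-square test on a 2×2 table. The sketch below uses illustrative counts: 15 of 246 subjects without missings and 15 of 152 subjects with missings having a prior deep venous thrombosis. These counts are an assumption, back-calculated from the rounded percentages of roughly 6% and 10%, so the result only approximates the reported p-value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: no missings, >= 1 missing. Columns: characteristic present, absent.
chi2, p_value = chi_square_2x2(15, 231, 15, 137)
print(round(chi2, 2), round(p_value, 2))  # 1.92 0.17
```

If many characteristics, and in particular the outcome, differ between subjects with and without missing values, the data are not MCAR.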
MAR vs MNAR
• Previous table = MAR = typical for medical research
  – Greenland Am J Epidemiol 1995
• Unfortunately: could still (partly) be MNAR
  – Can never be checked (catch-22)
  – MNAR = problems (Little JASA 1993) → requires ancillary info on mechanism of missingness
Two main types of dealing with missing values
• Missing Indicator method
• Imputation methods
Indicator method (Donders, J Clin Epi 2006; Greenland, Am J Epidemiol 1995)
• Goes wrong (in prediction and etiologic research) even when MCAR → more biased results than CC analysis
• Missing indicator often associated with outcome → usually retained as predictor in prognostic model
  – Overestimation of prognostic model → optimistic calibration and discrimination
  – Ridiculous in prediction research
Imputation methods
Imputation is replacement:
• Overall mean / median
• Subgroup mean
• Hot decking
• Single imputation (SI)
• Multiple imputation (MI)
Overall mean/median imputation (Donders, J Clin Epi 2006)
• For each missing on X the overall mean from observed values is imputed
  – Diseased + non-diseased together
• All imputations have same value for X (no co-variates)
  – Distributions of X for D+ and D- will merge (more overlap, less separation)
  – Association of X with outcome dilutes = bias
  – Also: distribution of X too narrow (SD too low)
  – SE’s of X underestimated
  – Also if MCAR!
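A tiny worked example shows both effects. The values of X are invented: three observed and two missing values in each of the diseased and non-diseased groups.

```python
import statistics

# Invented data: None marks a missing value of X.
diseased = [8.0, 9.0, 10.0, None, None]
non_diseased = [4.0, 5.0, 6.0, None, None]

observed = [x for x in diseased + non_diseased if x is not None]
overall_mean = statistics.mean(observed)  # 7.0, diseased + non-diseased together


def impute(values, fill):
    """Replace every missing value by the same fill value."""
    return [fill if v is None else v for v in values]


diseased_imp = impute(diseased, overall_mean)
non_diseased_imp = impute(non_diseased, overall_mean)

# Difference in mean X between groups: observed data vs after imputation.
diff_observed = statistics.mean([8.0, 9.0, 10.0]) - statistics.mean([4.0, 5.0, 6.0])
diff_imputed = statistics.mean(diseased_imp) - statistics.mean(non_diseased_imp)

# Spread of X: observed data vs after imputation.
sd_observed = statistics.stdev(observed)
sd_imputed = statistics.stdev(diseased_imp + non_diseased_imp)

print(diff_observed, round(diff_imputed, 2))        # 4.0 2.4
print(round(sd_observed, 2), round(sd_imputed, 2))  # 2.37 1.76
```

The group difference shrinks from 4.0 to 2.4 (diluted association) and the SD drops from 2.37 to 1.76 (distribution too narrow), although the values here are missing in the same proportion in both groups.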
Subgroup mean imputation
• A priori relevant subgroups are defined
  – E.g. per outcome category, sex, age groups, etc.
• Estimate mean for subgroup
  – For each missing on X the subgroup mean is imputed
• More variation in imputed values
  – Less bias
  – SE’s still underestimated
  – Limited number of co-variates can a priori be defined
  – Requires categorisation of continuous variables (loss of information)
Single Imputation
• Regression: without or with addition of random error
  – For each variable Z with missings → regression prediction model is fitted
  – Z = a + b1.x1 + b2.x2 + b3.x3 + … + e
    – e = error term (residuals from the regression model)
Single Imputation
• Include all relevant variables including outcome (!!!!)
  – Prediction model for Z is fitted
  – Fixed beta’s (MLE’s) → same for each SI when repeated
• Prediction model used to estimate, for each subject with missing Z, the best guess given covariate (X) values
• Analyses of determinant (Z) against outcome as usual
Single Imputation
• If no addition of random error term per patient
  – 1. Each patient with same co-variates gets same imputed value
  – 2. Too optimistic imputation model
  – 1 + 2 lead to more biased association of Z vs outcome → THEREFORE ADD ERROR TERM
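The effect of the error term can be sketched with simulated data. The setup is an assumption for illustration only: one covariate x, a variable z = 2 + 1.5·x + noise, and 40% of z made missing completely at random. Missing z values are imputed once from the fitted regression line alone, and once with a random draw from the residual distribution added.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
z_true = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]
missing = [random.random() < 0.4 for _ in range(n)]

# Fit z = a + b*x by least squares on the records with observed z.
x_obs = [xi for xi, m in zip(x, missing) if not m]
z_obs = [zi for zi, m in zip(z_true, missing) if not m]
mean_x, mean_z = statistics.mean(x_obs), statistics.mean(z_obs)
b = (sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x_obs, z_obs))
     / sum((xi - mean_x) ** 2 for xi in x_obs))
a = mean_z - b * mean_x
# Residual spread (n - 1 in the denominator; close enough for a sketch).
residual_sd = statistics.stdev([zi - (a + b * xi) for xi, zi in zip(x_obs, z_obs)])

# Single imputation without and with a random error term.
z_without_error = [a + b * xi if m else zi
                   for xi, zi, m in zip(x, z_true, missing)]
z_with_error = [a + b * xi + random.gauss(0, residual_sd) if m else zi
                for xi, zi, m in zip(x, z_true, missing)]

sd_true = statistics.stdev(z_true)
sd_without_error = statistics.stdev(z_without_error)
sd_with_error = statistics.stdev(z_with_error)
print(round(sd_true, 2), round(sd_without_error, 2), round(sd_with_error, 2))
```

Without the error term all imputed values lie exactly on the regression line, so the completed variable is too narrow; adding the error term brings its spread back to roughly that of the fully observed data.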
Single imputation with error term
• Conclusions:
  – Usually correct regression coefficients
  – But SE still underestimated → significance reached too easily
  – As if all data were observed
    – Ignores that the beta’s of the imputation model were also estimated
Multiple imputation
• If you repeat SI with error term 5 times → looks like MI
  – But only variation across imputations = differences in randomly drawn and added error term
  – Still too limited variation
Multiple imputation
• Model is same:
  – Z = a + b1.x1 + b2.x2 + b3.x3 + … + e
    – e = error term (residuals from the regression model)
  – But now the distributions of b’s (!) and e’s are estimated and ‘saved’ → not fixed b’s
  – Then: per MI a random draw of b’s and e’s is taken
  – Z estimated per patient based on co-variate pattern
Multiple imputation
  – Study association/model fitted on each of 10 ‘completed’ data sets
  – 10 beta’s are averaged → 1 beta per determinant
  – 10 SE’s averaged (within-sample variation) plus accounting for between-imputation variation → 1 SE per determinant
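The pooling step is commonly done with Rubin's rules: the pooled beta is the mean of the per-dataset betas, and the pooled variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below applies these rules to hypothetical estimates from m = 5 completed data sets (the numbers are invented; with 10 data sets the calculation is identical).

```python
import math
import statistics


def pool_rubin(betas, standard_errors):
    """Pool per-imputation estimates with Rubin's rules; return (beta, SE)."""
    m = len(betas)
    pooled_beta = statistics.mean(betas)
    within = statistics.mean([se ** 2 for se in standard_errors])  # average squared SE
    between = statistics.variance(betas)  # variance of betas across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled_beta, math.sqrt(total_variance)


# Hypothetical estimates of one regression coefficient from 5 completed data sets.
betas = [0.50, 0.54, 0.46, 0.52, 0.48]
standard_errors = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_beta, pooled_se = pool_rubin(betas, standard_errors)
print(round(pooled_beta, 3), round(pooled_se, 4))  # 0.5 0.1058
```

The pooled SE (0.1058) is larger than the SE from any single completed data set (0.10), which is exactly the extra uncertainty that single imputation ignores.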
Multiple imputation
• 1 overall dataset can be created (for Table 1)
  – Means over the 10 datasets or choose 1
• MI leads to better estimates of SE’s (p-values)
  – As variation/uncertainty in estimated/imputed values is introduced
Imputation with or without outcome? (Moons, J Clin Epi 2006)
• Missingness on determinant commonly related to other determinants and directly/indirectly to outcome
• Advice (SI + MI) = use all observed patient data, i.e. all other determinants plus outcome
• Concern: using outcome to impute missing determinants and subsequently estimating association between determinants + outcome → self-fulfilling prophecy?
  – Associations of determinants overestimated (away from null)?
[Figure: Bias in the estimated regression coefficients relative to the analysis with no missings ("truth"), for complete case analysis, areg without and with y, and mice without and with y, under three scenarios (MCAR, slightly MAR, very MAR). Panels: intercept (β = -2.948), age (β = 0.017), recent surgery (β = 0.505), collapse (β = 1.352), respiratory rate (β = 0.057), abnormal x-ray (β = 0.812).]
Imputation with outcome …
• No circular analysis → no self-fulfilling prophecy
• “Ignoring in the imputation of missing determinants the association between outcome and determinants will cause rather than prevent bias → simply because (prediction) model misses an important variable, i.e. the outcome”
» Little and Rubin
Concluding remarks (1)
• Prognosis = foretelling = predicting
  – Prognostic = prediction studies
  – Therapy response is just a type of outcome
  – Prognosis is about individuals, not diseases
• Prognostic studies
  – != etiologic studies → prediction != causation
  – Sampling subjects → cohort ideal
    – Be careful with RCT and case-control data (except when nested)
  – Predictors → all types
  – Outcome → patient relevant, blinded for predictors
Concluding remarks (2)
• Dealing with missing predictor values
  – CC analysis often biased
  – Missing indicator always biased
  – Overall mean imputation as well
  – Multiple imputation reduces invalidity most
  – Also for selectively missing outcomes (De Groot Stat Med 2008)
• Inherently multivariable
  – Added independent predictive value
    – Certainly for biomarkers (too many)
  – Multivariable design and analysis required
    – Prognostic / prediction models
Prognostic models limited application in practice
• Doctors do not trust model’s probabilities
• Don’t know how to use them
• Too difficult to use → certainly if no computer
• Latter ‘will soon diminish’
Electronic patient record
‘Must be nationwide used by 2010’ (James, NEJM 2001)
(Supposed) EPR advantages (Dexter NEJM 2001; Hunt JAMA 1998; James NEJM 2001; Kawamoto BMJ 2005; Zikmund-Fisher MDM 2007)
• Ideal for prognostic research
• No need to simplify to simple risk scores
• Continuous predictors remain continuous
• No cumbersome paper versions of risk scores or nomograms
• EPR brings paperless practice / office
Utopia?
Paperless toilet
It is possible!!!!!!!! But use it with care