choosing appropriate statistical test rss6 2104
Post on 21-Jan-2015
Choosing Appropriate Statistical Test
Amr Albanna, MD, MSc
Factors Influencing the Selection of Statistical Tests
• Study Design
• Type of Data
Study Design
Descriptive Studies
• Prevalence
– Cross-sectional study
• Incidence
– Cohort study
Prevalence Versus Incidence
• Prevalence can be viewed as describing a pool of disease in a population.
• Incidence describes the input flow of new cases into the pool.
• Deaths and cures reflect the output flow from the pool.
Prevalence Versus Incidence
Prevalence at time t1 = 2/10 = 20%
Prevalence at time t2 = 3/8 = 38%
Incidence between t1 and t2 = 4/8 = 50%
Source: Silva 1999
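The arithmetic on this slide can be reproduced with a minimal Python sketch. The counts are the ones printed on the slide; the helper name `proportion` is hypothetical and only used for illustration. Note that 3/8 is 37.5%, which the slide rounds to 38%.

```python
def proportion(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a fraction between 0 and 1."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Counts as printed on the slide (figure credited to Silva 1999).
prevalence_t1 = proportion(2, 10)    # existing cases / population at t1
prevalence_t2 = proportion(3, 8)     # existing cases / population at t2
incidence_t1_t2 = proportion(4, 8)   # new cases / denominator shown on the slide

for label, value in [
    ("Prevalence at t1", prevalence_t1),
    ("Prevalence at t2", prevalence_t2),
    ("Incidence between t1 and t2", incidence_t1_t2),
]:
    print(f"{label}: {value:.1%}")
```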
Descriptive Studies
• Determine the size of a health problem in the “study base” population.
• Promote public health policies.
Analytic Studies
• Randomized-controlled trials.
• Cohort studies
• Case-control studies
• Diagnostic studies
Analytic Studies
• To effectively practice medicine, we need evidence/knowledge on 3 fundamental types of professional knowing “gnosis”:
Dia-gnosis Etio-gnosis Pro-gnosis
Analytic Studies
• Most fundamental application of clinical research: to identify causal associations between exposure(s) and outcome(s)
Exposure → ? → Outcome
Causal Vs. Non-causal Association
• Accidental: no association between A and B
• A causes B
• B causes A
– Direction of causality: does overeating cause obesity? (Taubes G, New Scientist, 2008)
• A is not causally associated with B; both are linked to a third factor C
– e.g. A = coffee, B = lung cancer, C = smoking
A Research Scenario
• Study question: Does eating affect students' intellectual ability?
• 100 students sat an exam after eating lunch.
• 50% failed the exam.
• You conclude that eating worsens students' intellectual ability.
Compared to what?
• In an old movie, comedian Groucho Marx is asked: “Groucho, how’s your wife?”
• Groucho quips: “Compared to what?”
http://en.wikipedia.org
Ideal counterfactual comparison to determine causal effects
Exposed cohort → Outcome
Counterfactual, unexposed cohort → Outcome
“Initial conditions” are identical in the exposed and unexposed groups, because they are the same population!
Maldonado & Greenland, Int J Epi 2002;31:422-29
What happens in reality?
Exposed cohort → Outcome
Counterfactual, unexposed cohort → Outcome: the counterfactual state is not observed (latent)
Substitute, unexposed cohort → Outcome
A substitute will usually be a population other than the target population during the etiologic time period: INITIAL CONDITIONS MAY BE DIFFERENT
Measures of disease frequency
• Risk
• Rate
Measures of effect
• Risk Difference
• Risk Ratio
• Rate Ratio
• Odds Ratio
Measures of potential impact
• Attributable Risk
• Population Attributable Risk
How PAR is dependent on prevalence of exposure
Szklo & Nieto. Epidemiology: Beyond the basics. 2nd Edition, 2007
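The dependence of PAR on exposure prevalence can be sketched with Levin's formula for the population attributable fraction, PAR% = Pe(RR − 1) / [1 + Pe(RR − 1)]. This formula is a standard one and is assumed here; the transcript does not show which expression the original figure used. The risk ratio of 3.0 and the prevalence values are hypothetical.

```python
def population_attributable_fraction(prevalence_exposure: float, risk_ratio: float) -> float:
    """Levin's formula: fraction of cases in the population attributable to exposure."""
    if not 0.0 <= prevalence_exposure <= 1.0:
        raise ValueError("prevalence_exposure must be between 0 and 1")
    excess = prevalence_exposure * (risk_ratio - 1.0)
    return excess / (1.0 + excess)


HYPOTHETICAL_RR = 3.0  # illustrative value, not taken from the slides

for prevalence in (0.01, 0.10, 0.25, 0.50, 0.90):
    paf = population_attributable_fraction(prevalence, HYPOTHETICAL_RR)
    print(f"exposure prevalence {prevalence:.0%}: PAR% = {paf:.1%}")
```

With the risk ratio held fixed, the attributable fraction rises as the exposure becomes more common, which is the point of the slide.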
Randomized-controlled trials
Randomization helps to make the groups “comparable” (i.e. similar initial conditions)
Eligible patients → Randomization → Treatment → Outcomes → Incidence
Eligible patients → Randomization → Placebo → Outcomes → Incidence
Difference: “RR” or “RD”
Observational Studies
[Figure: study population made up of exposed (E) and unexposed (N) individuals]
Cohort
[Figure: study population divided into Exposed (E) and Un-Exposed (N) groups]
Study population → Exposed → Disease / No Disease → Incidence of disease in exposed
Study population → Unexposed → Disease / No Disease → Incidence of disease in unexposed
Comparison: “Risk Ratio”, “Risk Difference”
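The two cohort comparisons named above can be computed from the incidence in each group. The following is a minimal sketch; the function name and the counts (30 of 100 exposed and 10 of 100 unexposed developing disease) are hypothetical and not taken from the slides.

```python
def risk_measures(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> dict:
    """Return incidence in each group plus the risk ratio and risk difference."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }


# Hypothetical cohort: 30/100 exposed and 10/100 unexposed develop disease.
result = risk_measures(30, 100, 10, 100)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```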
Case-Control
[Figure: study population sampled into Cases and Controls]
Study population → Cases (Disease) → Exposed / Un-exposed → Odds of being exposed
Study population → Controls (No disease) → Exposed / Un-exposed → Odds of being exposed
“Odds Ratio” approximates “Risk Ratio”
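How closely the odds ratio tracks the risk ratio depends on how common the disease is. The sketch below compares the two on hypothetical 2 x 2 tables (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without); the counts are illustrative only. The exposure odds ratio from a case-control study, (a/c)/(b/d), reduces to the same cross-product ad/bc.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio ad / bc for a 2 x 2 table."""
    return (a * d) / (b * c)


def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk in exposed (a / (a + b)) divided by risk in unexposed (c / (c + d))."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical tables: both have a risk ratio of 2, but disease frequency differs.
rare_disease = (20, 980, 10, 990)      # risks of 2% and 1%
common_disease = (400, 600, 200, 800)  # risks of 40% and 20%

for label, table in [("rare", rare_disease), ("common", common_disease)]:
    print(f"{label}: RR = {risk_ratio(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

In the rare-disease table the two measures nearly coincide, while in the common-disease table the odds ratio is noticeably further from 1 than the risk ratio.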
Observational Studies: Problem
Association between birth order and Down syndrome
Source: Rothman 2002. Data from Stark and Mantel (1966)
Association between maternal age and Down syndrome
Source: Rothman 2002. Data from Stark and Mantel (1966)
Association between maternal age and Down syndrome, stratified by birth order
Source: Rothman 2002. Data from Stark and Mantel (1966)
Criteria to define confounder
• A factor is a confounder if 3 criteria are met:
– a) a confounder must be causally or noncausally associated with the exposure in the source population;
– b) a confounder must be a causal risk factor (or a surrogate measure of a cause) for the disease;
– c) a confounder must not be an intermediate cause (in other words, a confounder must not be an intermediate step in the causal pathway between the exposure and the disease)
Confounding Schematic
Exposure (E) → Disease (outcome) (D)
Confounder (C), linked to both E and D
Szklo M, Nieto JF. Epidemiology: Beyond the basics. Aspen Publishers, Inc., 2000.
Gordis L. Epidemiology. Philadelphia: WB Saunders, 4th Edition.
Intermediate cause
Exposure (E) → C → Disease (D)
Confounding Schematic
E: Birth Order; D: Down Syndrome; C (confounding factor): Maternal Age
Confounding Schematic
E: HRT use; D: Heart disease; C (confounding factor): SES
Are confounding criteria met?
Association between HRT and heart disease
Control of confounding: Outline
• Control at the design stage
– Randomization
– Restriction
– Matching
• Control at the analysis stage
– Conventional approaches
• Stratified analyses
• Multivariate analyses
– Newer approaches
• Propensity scores
Observational Study on Vit E and Coronary Heart Disease
Fitzmaurice, 2004
Crude OR = (50)(384) / [(501)(65)] = 0.59
Are there potential confounders that can explain this crude OR?
Vitamin E → CHD
Confounding factor: Smoking
Could reduced smoking among Vit E users partly explain the observed protective effect?
Stratify on the confounding variable
Stratified Analyses (by smoking status)
Fitzmaurice, 2004
Stratum 1: OR (smokers) = (11)(200) / [(40)(49)] = 1.12
Stratum 2: OR (non-smokers) = (39)(184) / [(461)(16)] = 0.97
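The crude and stratum-specific odds ratios can be checked with a short Python sketch. The cell counts are those in the slide arithmetic; the (a, b, c, d) layout is an assumption chosen so that OR = ad/bc matches the slides. The Mantel-Haenszel pooled estimate at the end is a standard summary for stratified tables and is added here for illustration; it does not appear in the slides.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio ad / bc."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables) -> float:
    """Mantel-Haenszel pooled odds ratio across 2 x 2 strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Cell counts from the slides; the (a, b, c, d) ordering is assumed.
strata = {
    "smokers": (11, 40, 49, 200),
    "non-smokers": (39, 461, 16, 184),
}

# Summing the strata cell by cell recovers the crude table (50, 501, 65, 384).
crude = tuple(sum(cells) for cells in zip(*strata.values()))

print(f"crude OR: {odds_ratio(*crude):.2f}")
for name, table in strata.items():
    print(f"OR ({name}): {odds_ratio(*table):.2f}")
print(f"Mantel-Haenszel OR: {mantel_haenszel_or(strata.values()):.2f}")
```

Both stratum-specific estimates sit close to 1, while the crude estimate does not, which is the pattern expected when smoking confounds the association.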
Multivariate Analysis
Diagnostic Studies
• Diagnostic 2 X 2 table*:
            Disease +         Disease -
Test +      True Positive     False Positive
Test -      False Negative    True Negative
*When test results are not dichotomous, ROC curves can be used [see later]
Sensitivity [true positive rate]
The proportion of patients with disease who test positive = P(T+|D+) = TP / (TP+FN)
Specificity [true negative rate]
The proportion of patients without disease who test negative = P(T-|D-) = TN / (TN+FP)
Predictive value of a positive test
Proportion of patients with positive tests who have disease = P(D+|T+) = TP / (TP+FP)
Predictive value of a negative test
Proportion of patients with negative tests who do not have disease = P(D-|T-) = TN / (TN+FN)
Likelihood Ratio of a Positive Test
LR+ = P(T+|D+) / P(T+|D-) = TPR / FPR
How much more often a positive test result occurs in persons with the target condition compared to those without it
Likelihood Ratio of a Negative Test
LR- = P(T-|D+) / P(T-|D-) = FNR / TNR
How much less likely a negative test result is in persons with the target condition compared to those without the target condition
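The six diagnostic measures defined above all come from the same four cells of the 2 x 2 table. A minimal sketch follows; the function name and the counts (TP = 90, FP = 20, FN = 10, TN = 180) are hypothetical and chosen only for illustration.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute the diagnostic accuracy measures from a 2 x 2 table."""
    sensitivity = tp / (tp + fn)   # P(T+|D+)
    specificity = tn / (tn + fp)   # P(T-|D-)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # P(D+|T+)
        "npv": tn / (tn + fn),     # P(D-|T-)
        # LR+ = TPR / FPR; undefined (division by zero) when specificity is 1.
        "lr_positive": sensitivity / (1 - specificity),
        # LR- = FNR / TNR; undefined when specificity is 0.
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Hypothetical counts: 100 diseased and 200 non-diseased patients.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed within the disease columns, whereas the predictive values are computed within the test rows, so the latter change with disease prevalence in the sample.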
Continuous results: Receiver operating characteristic (ROC) curve
Blood sugar level (2 hours after food), mg/100 ml    Sensitivity (%)    Specificity (%)
70     98.6     8.8
80     97.1     25.5
90     94.3     47.6
100    88.6     69.8
110    85.7     84.1
120    71.4     92.5
130    64.3     96.9
140    57.1     99.4
150    50.0     99.6
160    47.1     99.8
170    42.9     100
180    38.6     100
190    34.3     100
200    27.1     100
Area under the curve (AUC) for an informative test ranges from 0.5 (random chance, or no predictive ability; this corresponds to the 45-degree line in the ROC plot) to 1 (perfect discrimination/accuracy).
The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test. The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
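As a rough illustration, the AUC for the blood sugar cut-offs listed above can be approximated with the trapezoidal rule. This is a sketch under two assumptions: the sensitivity and specificity values pair with the cut-offs in the order listed, and the curve is anchored at (0, 0) and (1, 1). The function name is hypothetical.

```python
# (cut-off in mg/100 ml, sensitivity %, specificity %), in the order listed above
roc_table = [
    (70, 98.6, 8.8), (80, 97.1, 25.5), (90, 94.3, 47.6), (100, 88.6, 69.8),
    (110, 85.7, 84.1), (120, 71.4, 92.5), (130, 64.3, 96.9), (140, 57.1, 99.4),
    (150, 50.0, 99.6), (160, 47.1, 99.8), (170, 42.9, 100.0), (180, 38.6, 100.0),
    (190, 34.3, 100.0), (200, 27.1, 100.0),
]


def trapezoid_auc(rows) -> float:
    """Approximate the area under the ROC curve by the trapezoidal rule."""
    # Each cut-off gives one ROC point: x = 1 - specificity, y = sensitivity.
    points = sorted((1 - spec / 100, sens / 100) for _, sens, spec in rows)
    points = [(0.0, 0.0)] + points + [(1.0, 1.0)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


auc = trapezoid_auc(roc_table)
print(f"Approximate AUC: {auc:.3f}")
```

The estimate depends on how many cut-offs are tabulated, so it should be read as an approximation of the full curve.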
Systematic Review
Bates et al. Arch Intern Med 2007
Meta-analysis
Ried K. Aus Fam Phys 2006
Type of Data
Continuous Variables
• Descriptive analysis
– Mean and 95% CI
– Median and IQR
• Comparative analysis
– Two variables
• Student t test
• Paired t test (matched pairs)
• Univariate Linear Regression
– More than two variables
• ANOVA
• Multivariate Linear Regression
Categorical Variables
• Descriptive analysis
– Proportion and 95% CI
• Comparative analysis
– Chi Square test
– Fisher's exact test
– Logistic Regression
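For two categorical variables in a 2 x 2 table, the chi-square test listed above can be computed by hand. The sketch below uses the Pearson statistic without continuity correction and obtains the p-value for 1 degree of freedom from the complementary error function. The counts are the crude Vitamin E and CHD cells that appear earlier in the deck; using them here is an illustrative choice, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) and p-value for a 2 x 2 table."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Crude Vitamin E / CHD cell counts from the earlier slide.
statistic, p_value = chi_square_2x2(50, 501, 65, 384)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

When expected cell counts are small, Fisher's exact test is the usual alternative, as the list above indicates.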
Incidence Risk Vs. Incidence Rate
Example: Hypothetical cohort of 12 initially disease-free subjects followed over a 5-year period from 1990 to 1995.
Incidence risk = 5/12 = 0.42 (42 per 100 persons)
Incidence rate = 5/25 = 0.2 per person-year (20 per 100 person-years)
Kleinbaum et al. ActivEpi
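The difference between the two measures lies only in the denominator: persons at risk for the incidence risk, accumulated person-time for the incidence rate. A minimal sketch using the slide's totals (5 new cases, 12 subjects, 25 person-years) follows; the function names are hypothetical.

```python
def incidence_risk(new_cases: int, persons_at_risk: int) -> float:
    """New cases divided by the number of persons initially at risk."""
    return new_cases / persons_at_risk


def incidence_rate(new_cases: int, person_years: float) -> float:
    """New cases divided by the total person-time of follow-up."""
    return new_cases / person_years


risk = incidence_risk(5, 12)   # proportion over the 5-year period
rate = incidence_rate(5, 25)   # cases per person-year

print(f"Incidence risk: {risk:.2f} ({100 * risk:.0f} per 100 persons)")
print(f"Incidence rate: {rate:.2f} per person-year ({100 * rate:.0f} per 100 person-years)")
```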
Statistical Significance: P-Value “or” 95% Confidence Interval
Hypothesis Testing (P-value)
• Null hypothesis → No difference.
• P-value < 0.05 → Reject the null hypothesis (there is a difference).
Problems with P-values
• Does not measure the magnitude of the difference.
• Depends on the sample size.
– A very small difference can become significant by increasing the sample size.
• Multiple testing will increase the chance of having a positive (significant difference) result due to random error.
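The sample-size point can be illustrated with a two-proportion z-test in which the observed difference is held fixed while the group size grows. This is a sketch under stated assumptions: equal group sizes, a normal approximation, and hypothetical observed proportions of 50% and 51%.

```python
import math


def two_proportion_p_value(p1: float, p2: float, n_per_group: int) -> float:
    """Two-sided p-value for a difference in proportions, equal group sizes."""
    pooled = (p1 + p2) / 2  # pooled proportion when both groups have the same size
    standard_error = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = (p1 - p2) / standard_error
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical observed proportions: the 1-point difference never changes.
sample_sizes = (100, 1_000, 10_000, 100_000)
p_values = [two_proportion_p_value(0.50, 0.51, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n per group = {n:>7}: p = {p:.4g}")
```

The same one-percentage-point difference is far from significant with 100 subjects per group and falls below 0.05 only at the largest sample size shown.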
Biggest problem!
• We know that the null hypothesis (difference = zero) is not true.
• We just need enough power (sample size) to reject the null hypothesis (and make our study “POSITIVE”).
• Example: 5-year mortality
Group 1 Group 2
0.0021633098649999 0.0021633098649999
Confidence Interval
[Figure: confidence intervals plotted on a Better/Worse axis, with interpretations: No difference (equivalent); Inconclusive; Better; No difference; May be better, not worse]