choosing appropriate statistical test rss6 2104

Choosing Appropriate Statistical Test

Amr Albanna, MD, MSc

Factors Influencing the Selection of Statistical Tests

Study Design

Type of Data

Study Design

Descriptive Studies

• Prevalence

– Cross-sectional study

• Incidence

– Cohort study

Prevalence Versus Incidence

• Prevalence can be viewed as describing a pool of disease in a population.

• Incidence describes the input flow of new cases into the pool.

• Deaths and cures reflect the output flow from the pool.

Prevalence Versus Incidence

Prevalence at time t1 = 2/10 = 20%

Source: Silva 1999

Prevalence at time t2 = 3/8 = 38%

Incidence between t1 and t2: 4/8 = 50%
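The arithmetic on this slide can be reproduced in a few lines. The sketch below assumes one reading of the figure: 10 people observed at t1 (2 diseased), 8 people remaining at t2 (3 diseased), and 4 new cases among the 8 who were disease-free at t1. The function names are illustrative, not from the source.

```python
def prevalence(existing_cases, population):
    """Proportion of the population that has the disease at one point in time."""
    return existing_cases / population


def cumulative_incidence(new_cases, at_risk_at_start):
    """Proportion of initially disease-free people who become new cases."""
    return new_cases / at_risk_at_start


# Counts read off the slide (assumed interpretation, see lead-in).
prev_t1 = prevalence(2, 10)
prev_t2 = prevalence(3, 8)
inc_t1_t2 = cumulative_incidence(4, 10 - 2)  # 8 people were disease-free at t1

print(f"Prevalence at t1: {prev_t1:.1%}")
print(f"Prevalence at t2: {prev_t2:.1%}")
print(f"Incidence t1 to t2: {inc_t1_t2:.1%}")
```

Note that 3/8 is 37.5%, which the slide rounds to 38%.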

Descriptive Studies

• Determine the size of a health problem in the “study base” population.

• Promote public health policies.

Analytic Studies

• Randomized-controlled trials.

• Cohort studies

• Case-control studies

• Diagnostic studies

Analytic Studies

• To effectively practice medicine, we need evidence/knowledge on 3 fundamental types of professional knowing (“gnosis”): Dia-gnosis, Etio-gnosis, Pro-gnosis

• Most fundamental application of clinical research: to identify causal associations between exposure(s) and outcome(s)

Exposure → ? → Outcome

Analytic Studies

Causal Vs. Non-causal Association

A   B: accidental (no association)

A → B: A causes B

B → A: B causes A

Direction of causality: does overeating cause obesity? (Taubes G, New Scientist, 2008)

A ← C → B: A is not causally associated with B; both are linked to a third factor C (e.g. C = smoking, with coffee and lung cancer as A and B)

A Research Scenario

• Study question: Does eating affect students' intellectual ability?

• 100 students underwent an exam after eating lunch.

• 50% failed the exam.

• You conclude that eating worsens students' intellectual ability.

Compared to what?

• In an old movie, comedian Groucho Marx is asked: “Groucho, how’s your wife?”

• Groucho quips: “Compared to what?”

http://en.wikipedia.org

Ideal counterfactual comparison to determine causal effects

Exposed cohort → Outcome

Counterfactual, unexposed cohort → Outcome

“Initial conditions” are identical in the exposed and unexposed groups because they are the same population!

Maldonado & Greenland, Int J Epi 2002;31:422-29

What happens in reality?

Exposed cohort → Outcome

Counterfactual, unexposed cohort → Outcome: the counterfactual state is not observed (latent)

Substitute, unexposed cohort → Outcome

A substitute will usually be a population other than the target population during the etiologic time period: INITIAL CONDITIONS MAY BE DIFFERENT

Measures of disease frequency

• Risk

• Rate

Measures of effect

• Risk Difference

• Risk Ratio

• Rate Ratio

• Odds Ratio

Measures of potential impact

• Attributable Risk

• Population Attributable Risk

How PAR is dependent on prevalence of exposure

Szklo & Nieto. Epidemiology: Beyond the basics. 2nd Edition, 2007
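One common way to express this dependence is Levin's formula for the population attributable fraction, PAR% = p(RR − 1) / (1 + p(RR − 1)), where p is the prevalence of exposure. The sketch below assumes that this is the relationship the slide illustrates; the relative risk of 3 and the prevalence values are made up for illustration.

```python
def population_attributable_fraction(exposure_prevalence, relative_risk):
    """Levin's formula: share of disease in the population attributable to exposure."""
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical relative risk, held fixed while exposure prevalence varies.
rr = 3.0
prevalences = [0.0, 0.1, 0.25, 0.5, 0.9]
paf_values = [population_attributable_fraction(p, rr) for p in prevalences]

for p, paf in zip(prevalences, paf_values):
    print(f"exposure prevalence {p:.0%} -> PAR% {paf:.1%}")
```

For a fixed relative risk, the attributable fraction rises as the exposure becomes more common, which is why a weak but widespread risk factor can matter more to public health than a strong but rare one.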

Randomized-controlled trials

Randomization helps to make the groups “comparable” (i.e. similar initial conditions)

Eligible patients → Randomization → Treatment → Outcomes (Incidence)

Eligible patients → Randomization → Placebo → Outcomes (Incidence)

Difference: “RR” or “RD”

Observational Studies

[Figure: a study population of exposed (E) and non-exposed (N) individuals]

Cohort

[Figure: the study population of exposed (E) and non-exposed (N) individuals is split into an Exposed group and an Un-exposed group; each group is followed up for Disease or No Disease]

Incidence of disease in exposed vs. incidence of disease in unexposed

Cohort: “Risk Ratio” or “Risk Difference”
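The two cohort measures can be sketched directly from the incidence in each group. The counts below are hypothetical and are not taken from any study cited in these slides.

```python
def risk(cases, group_size):
    """Incidence (risk) of disease in one group over the follow-up period."""
    return cases / group_size


def risk_ratio(risk_exposed, risk_unexposed):
    return risk_exposed / risk_unexposed


def risk_difference(risk_exposed, risk_unexposed):
    return risk_exposed - risk_unexposed


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
risk_exp = risk(30, 100)
risk_unexp = risk(10, 100)

rr = risk_ratio(risk_exp, risk_unexp)
rd = risk_difference(risk_exp, risk_unexp)

print(f"Risk ratio: {rr:.2f}")
print(f"Risk difference: {rd:.2f}")
```

The ratio says how many times higher the risk is in the exposed; the difference says how much absolute risk the exposure adds.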

Case-Control

[Figure: from the study population of exposed (E) and non-exposed (N) individuals, Cases (Disease) and Controls (No disease) are selected; each group is then classified as Exposed or Un-exposed]

Odds of being exposed among cases vs. odds of being exposed among controls

Case-control: the “Odds Ratio” approximates the “Risk Ratio” (when the disease is rare)
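A small numerical check shows why the odds ratio stands in for the risk ratio only when the disease is rare. Both cohorts below are hypothetical and share the same risk ratio of 2; only the frequency of disease differs.

```python
def risk_ratio(a, b, c, d):
    """a, b = exposed with/without disease; c, d = unexposed with/without disease."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2 x 2 table."""
    return (a * d) / (b * c)


# Hypothetical rare disease: 20/10,000 exposed vs 10/10,000 unexposed.
rare = (20, 9980, 10, 9990)
# Hypothetical common disease: 400/1,000 exposed vs 200/1,000 unexposed.
common = (400, 600, 200, 800)

rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)

print(f"Rare disease:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"Common disease: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

With the rare outcome the two measures nearly coincide; with the common outcome the odds ratio is noticeably further from 1 than the risk ratio.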

Observational Studies: Problem

Association between birth order and Down syndrome (Source: Rothman 2002; data from Stark and Mantel, 1966)

Association between maternal age and Down syndrome (Source: Rothman 2002; data from Stark and Mantel, 1966)

Association between maternal age and Down syndrome, stratified by birth order (Source: Rothman 2002; data from Stark and Mantel, 1966)

Criteria to define confounder

• A factor is a confounder if 3 criteria are met:

– a) a confounder must be causally or noncausally associated with the exposure in the source population;

– b) a confounder must be a causal risk factor (or a surrogate measure of a cause) for the disease;

– c) a confounder must not be an intermediate cause (in other words, a confounder must not be an intermediate step in the causal pathway between the exposure and the disease)

Confounding Schematic

Exposure (E) → Disease (outcome) (D), with the Confounder (C) associated with both E and D

Intermediate cause (not a confounder): Exposure (E) → C → Disease (D)

Szklo M, Nieto JF. Epidemiology: Beyond the basics. Aspen Publishers, Inc., 2000.

Gordis L. Epidemiology. Philadelphia: WB Saunders, 4th Edition.

Birth Order → Down Syndrome

Confounding factor: Maternal Age

HRT use → Heart disease

Confounding factor: SES

Are confounding criteria met?

Association between HRT and heart disease

Control of confounding: Outline

• Control at the design stage

– Randomization

– Restriction

– Matching

• Control at the analysis stage

– Conventional approaches

• Stratified analyses

• Multivariate analyses

– Newer approaches

• Propensity scores

Observational Study on Vit E and Coronary Heart Disease

Fitzmaurice, 2004

Crude OR = (50 × 384) / (501 × 65) = 0.59

Are there potential confounders that can explain this crude OR?

Vitamin E → CHD

Confounding factor: Smoking

Could reduced smoking among Vit E users partly explain the observed protective effect?

Stratify on the confounding variable

Stratified Analyses (by smoking status)

Fitzmaurice, 2004

Stratum 1: OR (smokers) = (11 × 200) / (40 × 49) = 1.12

Stratum 2: OR (non-smokers) = (39 × 184) / (461 × 16) = 0.97
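The crude and stratum-specific odds ratios can be recomputed from the cell counts quoted on these slides. The sketch also adds a Mantel-Haenszel pooled odds ratio, which is a standard summary for stratified tables but is an addition here and is not shown on the slides. Cells are named a, b, c, d in the order they appear in the slides' cross-products (a × d) / (b × c).

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a * d) / (b * c) of a 2 x 2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, weighting each stratum by its size."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Cell counts quoted on the slides (Fitzmaurice, 2004).
smokers = (11, 40, 49, 200)
non_smokers = (39, 461, 16, 184)
# The crude table is the cell-by-cell sum of the two strata.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

crude_or = odds_ratio(*crude)
or_smokers = odds_ratio(*smokers)
or_non_smokers = odds_ratio(*non_smokers)
pooled_or = mantel_haenszel_or([smokers, non_smokers])

print(f"Crude OR: {crude_or:.2f}")
print(f"OR in smokers: {or_smokers:.2f}")
print(f"OR in non-smokers: {or_non_smokers:.2f}")
print(f"Mantel-Haenszel OR: {pooled_or:.2f}")
```

Both stratum-specific estimates sit close to 1 while the crude estimate is 0.59, which is the pattern expected when smoking confounds the association.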

Multivariate Analysis

Diagnostic Studies

• Diagnostic 2 × 2 table*:

            Disease +         Disease -
Test +      True Positive     False Positive
Test -      False Negative    True Negative

*When test results are not dichotomous, ROC curves can be used [see later]

Sensitivity [true positive rate]

The proportion of patients with disease who test positive = P(T+|D+) = TP / (TP+FN)

Specificity [true negative rate]

The proportion of patients without disease who test negative = P(T-|D-) = TN / (TN+FP)

Predictive value of a positive test

Proportion of patients with positive tests who have disease = P(D+|T+) = TP / (TP+FP)

Predictive value of a negative test

Proportion of patients with negative tests who do not have disease = P(D-|T-) = TN / (TN+FN)

Likelihood Ratio of a Positive Test

LR+ = P(T+|D+) / P(T+|D-) = TPR / FPR

How much more often a positive test result occurs in persons with the target condition than in those without it

Likelihood Ratio of a Negative Test

LR- = P(T-|D+) / P(T-|D-) = FNR / TNR

How much less likely a negative test result is in persons with the target condition than in those without it
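All six measures defined on these slides come from the same four cells of the diagnostic 2 × 2 table. The sketch below computes them together; the counts are hypothetical and the function name is illustrative.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Accuracy measures from the four cells of a diagnostic 2 x 2 table."""
    sensitivity = tp / (tp + fn)           # P(T+|D+), true positive rate
    specificity = tn / (tn + fp)           # P(T-|D-), true negative rate
    false_positive_rate = fp / (fp + tn)   # P(T+|D-)
    false_negative_rate = fn / (fn + tp)   # P(T-|D+)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),             # P(D+|T+)
        "npv": tn / (tn + fn),             # P(D-|T-)
        "lr_positive": sensitivity / false_positive_rate,
        "lr_negative": false_negative_rate / specificity,
    }


# Hypothetical study: 100 diseased and 200 non-diseased patients.
measures = diagnostic_measures(tp=90, fp=20, fn=10, tn=180)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity, specificity and the likelihood ratios condition on disease status, so they do not change with prevalence; the predictive values condition on the test result and do.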

Continuous results: Receiver operating characteristic (ROC) curve

Blood sugar level (2 hours after food) in mg/100 ml    Sensitivity (%)    Specificity (%)
 70                                                     98.6                8.8
 80                                                     97.1               25.5
 90                                                     94.3               47.6
100                                                     88.6               69.8
110                                                     85.7               84.1
120                                                     71.4               92.5
130                                                     64.3               96.9
140                                                     57.1               99.4
150                                                     50.0               99.6
160                                                     47.1               99.8
170                                                     42.9              100
180                                                     38.6              100
190                                                     34.3              100
200                                                     27.1              100

Area under the curve (AUC) can range from 0.5 (random chance, or no predictive ability; refers to the 45 degree line in the ROC plot) to 1 (perfect discrimination/accuracy).

The closer the curve follows the left-hand border and then the top-border of the ROC space, the more accurate the test. The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
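The area under the curve can be approximated from the blood sugar table with the trapezoidal rule. This is a rough sketch: it assumes the curve is anchored at (0, 0) and (1, 1) and joined by straight lines between the tabulated cut-offs, so the result is an approximation rather than a published figure.

```python
# Sensitivity and specificity (%) at cut-offs 70, 80, ..., 200 mg/100 ml, from the table.
sensitivity_pct = [98.6, 97.1, 94.3, 88.6, 85.7, 71.4, 64.3,
                   57.1, 50.0, 47.1, 42.9, 38.6, 34.3, 27.1]
specificity_pct = [8.8, 25.5, 47.6, 69.8, 84.1, 92.5, 96.9,
                   99.4, 99.6, 99.8, 100, 100, 100, 100]


def roc_points(sens_pct, spec_pct):
    """Return (false positive rate, true positive rate) pairs sorted left to right."""
    points = [(0.0, 0.0), (1.0, 1.0)]  # assumed end points of the curve
    for sens, spec in zip(sens_pct, spec_pct):
        points.append((1.0 - spec / 100.0, sens / 100.0))
    return sorted(points)


def trapezoid_auc(points):
    """Area under a piecewise-linear curve through the sorted points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


auc = trapezoid_auc(roc_points(sensitivity_pct, specificity_pct))
print(f"Approximate AUC: {auc:.2f}")
```

The estimate comes out at roughly 0.9, well above the 0.5 of the 45-degree line.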

Systematic Review

Bates et al. Arch Intern Med 2007

Meta-analysis

Ried K. Aus Fam Phys 2006

Type of Data

Continuous Variables

• Descriptive analysis

– Mean and 95% CI

– Median and IQR

• Comparative analysis

– Two variables: Student t test, Paired t test (matched pairs), Univariate Linear Regression

– More than two variables: ANOVA, Multivariate Linear Regression

Categorical Variables

• Descriptive analysis

– Proportion and 95% CI

• Comparative analysis

– Chi Square test

– Fisher's exact test

– Logistic Regression
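As an illustration of a comparative analysis for categorical data, the sketch below runs a Pearson chi-square test on the crude Vitamin E 2 × 2 table quoted earlier (cells 50, 501, 65, 384). It uses only the standard library: for 1 degree of freedom the p-value equals erfc(sqrt(chi2 / 2)). No continuity correction is applied, which is a simplifying assumption; Fisher's exact test is the usual choice when expected counts are small.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for a 2 x 2 table."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


chi2, p = chi_square_2x2(50, 501, 65, 384)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The crude association is statistically significant even though the stratified analysis suggests it is largely explained by smoking, a reminder that a significance test says nothing about confounding.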

Incidence Risk Vs. Incidence Rate

Example: Hypothetical cohort of 12 initially disease-free subjects followed over a 5-year period from 1990 to 1995.

Incidence risk = 5/12 = 0.42 (42 per 100 persons)

Incidence rate = 5/25 = 0.2 per person-year (20 per 100 person-years)

Kleinbaum et al. ActivEpi
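The two measures differ only in the denominator: persons at risk for the risk, person-time at risk for the rate. A minimal sketch using the numbers from this example, assuming the 25 in the denominator is the total person-years of follow-up:

```python
new_cases = 5
subjects_at_start = 12
person_years_at_risk = 25  # assumed: total follow-up time summed over the 12 subjects

incidence_risk = new_cases / subjects_at_start      # proportion, no time unit
incidence_rate = new_cases / person_years_at_risk   # cases per person-year

print(f"Incidence risk: {incidence_risk * 100:.0f} per 100 persons")
print(f"Incidence rate: {incidence_rate * 100:.0f} per 100 person-years")
```

The rate uses less than the maximum 12 × 5 = 60 person-years because subjects stop contributing time once they develop disease or leave follow-up.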

Statistical Significance: P-Value “or” 95% Confidence Interval

Hypothesis Testing (P-value)

• Null hypothesis: no difference.

• P-value < 0.05: reject the null hypothesis (there is a difference).

Problems with P-values

• Does not measure the magnitude of the difference.

• Depends on the sample size.

– A very small difference can become significant by increasing the sample size.

• Multiple testing will increase the chance of having a positive (significant difference) result due to random error.
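Two of these problems are easy to demonstrate numerically. The sketch below assumes a two-sided z-test for two proportions with equal group sizes, and independent tests for the multiple-testing calculation; the proportions (10% vs 11%) and sample sizes are hypothetical.

```python
import math


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided z-test p-value for two proportions with equal group sizes."""
    pooled = (p1 + p2) / 2.0
    standard_error = math.sqrt(pooled * (1.0 - pooled) * 2.0 / n_per_group)
    z = (p1 - p2) / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Same 1-percentage-point difference, two very different sample sizes.
p_small_sample = two_proportion_p_value(0.10, 0.11, n_per_group=1_000)
p_large_sample = two_proportion_p_value(0.10, 0.11, n_per_group=100_000)
fwer_20_tests = family_wise_error_rate(20)

print(f"n = 1,000 per group:   p = {p_small_sample:.3f}")
print(f"n = 100,000 per group: p = {p_large_sample:.2e}")
print(f"Chance of at least one false positive in 20 tests: {fwer_20_tests:.0%}")
```

The difference is identical in both comparisons; only the sample size changes the verdict. With 20 independent tests at the 0.05 level, a spurious "significant" result is more likely than not.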

Biggest problem!

• We know that the null hypothesis (difference = zero) is not true.

• We just need enough power (sample size) to reject the null hypothesis (and make our study “POSITIVE”).

• Example: 5-year mortality

Group 1 Group 2

0.0021633098649999 0.0021633098649999

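Confidence Interval

[Figure: confidence intervals plotted against the line of no difference, on an axis running from Better to Worse. Labelled interpretations: Better; May be better, not worse; No difference (equivalent); Inconclusive]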
