Observational Designs

Oncology Journal Club

April 26, 2002

Presented Article

“Plasma Selenium Level Before Diagnosis and the Risk of Prostate Cancer Development”. J.D. Brooks, E.J. Metter, D.W. Chan, L.J. Sokoll, P. Landis, W.G. Nelson, D. Muller, R. Andres, H.B. Carter. The Journal of Urology, v. 166, pp. 2034-2038.

Study Design: nested case-control study.

52 cases

96 controls

Disease: prostate cancer

Exposure: selenium intake

Design Types

• Experimental:

– Clinical Trials

– Randomized, controlled

• Observational:

– Prospective Cohort study

– Retrospective Cohort study

– Case-Control

Experimental Designs

• Exposure/treatments are controlled by design

– dose levels fixed

– time course fixed

– systematic data collection

– predefined sample size

– usually randomized if comparative

Observational Studies

• “Sit back and watch”

– no “control” over doses, treatments, exposures

– individuals self-select exposure

• Prospective Cohort Studies

– E.g. Baltimore Longitudinal Study of Aging

– population followed forward in time

– assess exposures in the present tense

– watch for disease in the future

– usually a “representative” (random) sample, but sometimes sampling is based on exposure

– goal is to compare exposed and unexposed individuals

• Case-Control Studies

– E.g. plasma selenium level ~ prostate cancer

– population followed backward in time

– assess disease status in the present tense

– look for exposure in the past

– designed so that sampling is based on disease status

– goal is to compare diseased and non-diseased individuals

Observational Studies

Designs

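[Figure: timelines of the two designs, with X marking exposure and D marking disease]

Prospective Cohort: exposure (X) is assessed today; disease (D) is watched for in the future.

Case-Control: disease (D) is assessed today; exposure (X) is looked for in the past.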

One more to consider

• Retrospective cohort study

– Similar to prospective cohort because sample tends to be “representative”

– Sampling not based on case/disease status

– uses historical data (“chart review”)

– can be treated the same as prospective cohort study because we are comparing exposed and non-exposed populations

Key difference

WHO IS BEING COMPARED?

COHORT: EXPOSED VS. UNEXPOSED

CASE-CONTROL: DISEASED VS. NON-DISEASED

Pros & Cons

• Cohort studies are expensive

• Cohort studies can (usually) measure exposure precisely

• In cohort studies, disease prevalence can be measured

• Cohort studies are impractical for study of rare disease

• Cohort studies can assess temporal relationship

• Case control studies are cheap

• Case control studies tend to rely on recall for exposure measure

• Case control studies don’t allow for measurement of disease prevalence

• Case control studies are efficient in rare diseases

• Case control studies can’t always assess temporal relationship

In both, inferences can be biased due to confounders.

Confounding would be protected against if we could randomize!

Both allow for inference when a randomized clinical trial would be unethical.

Measuring Risk

• Cohort Study: What is the probability of getting diseased if you are exposed as compared to unexposed?

• Case-Control Study: What is the probability of having been exposed if you have the disease compared to not having the disease?

Risk in Cohort Studies

• Relative Risk (RR):

$$
\mathrm{RR} = \frac{P(\text{disease} \mid \text{exposed})}{P(\text{disease} \mid \text{unexposed})} = \frac{A/(A+B)}{C/(C+D)}
$$

             Disease   Non-Diseased   Total
Exposed      A         B              A+B
Unexposed    C         D              C+D
Total        A+C       B+D
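As a minimal sketch of the relative risk formula applied to the 2x2 table, the snippet below computes RR directly from the four cell counts. The function name and the counts are hypothetical, chosen only for illustration; they are not data from the presented article.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a cohort 2x2 table.

    a, b: diseased / non-diseased counts among the exposed
    c, d: diseased / non-diseased counts among the unexposed
    """
    risk_exposed = a / (a + b)      # P(disease | exposed)   = A / (A+B)
    risk_unexposed = c / (c + d)    # P(disease | unexposed) = C / (C+D)
    return risk_exposed / risk_unexposed


# Hypothetical counts for illustration only (not from the article).
rr = relative_risk(a=20, b=80, c=10, d=90)
print(f"RR = {rr:.2f}")  # prints "RR = 2.00"
```

With these made-up counts the exposed group has a 20% risk and the unexposed group a 10% risk, so the exposed are at twice the risk.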

Risk in Cohort Studies

• Odds Ratio (OR):

$$
\mathrm{OR}
= \frac{P(\text{disease} \mid \text{exposed}) \,/\, [1 - P(\text{disease} \mid \text{exposed})]}
       {P(\text{disease} \mid \text{unexposed}) \,/\, [1 - P(\text{disease} \mid \text{unexposed})]}
= \frac{[A/(A+B)] \,/\, [B/(A+B)]}{[C/(C+D)] \,/\, [D/(C+D)]}
= \frac{A/B}{C/D}
= \frac{AD}{BC}
$$


             Disease   Non-Diseased   Total
Exposed      A         B              A+B
Unexposed    C         D              C+D
Total        A+C       B+D

Risk in Case-Control Studies

• Odds Ratio (OR):

$$
\mathrm{OR}
= \frac{P(\text{exposure} \mid \text{disease}) \,/\, [1 - P(\text{exposure} \mid \text{disease})]}
       {P(\text{exposure} \mid \text{non-diseased}) \,/\, [1 - P(\text{exposure} \mid \text{non-diseased})]}
= \frac{[A/(A+C)] \,/\, [C/(A+C)]}{[B/(B+D)] \,/\, [D/(B+D)]}
= \frac{A/C}{B/D}
= \frac{AD}{BC}
$$


             Disease   Non-Diseased   Total
Exposed      A         B              A+B
Unexposed    C         D              C+D
Total        A+C       B+D
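The two odds ratio definitions (odds of disease by exposure, and odds of exposure by disease) can be checked numerically against the cross-product AD/BC. This is only a sketch: the function names and the counts are hypothetical and not taken from the article.

```python
import math


def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def or_cohort_view(a, b, c, d):
    # Odds of disease among exposed divided by odds of disease among unexposed.
    return odds(a / (a + b)) / odds(c / (c + d))


def or_case_control_view(a, b, c, d):
    # Odds of exposure among diseased divided by odds of exposure among non-diseased.
    return odds(a / (a + c)) / odds(b / (b + d))


# Hypothetical counts for illustration only.
a, b, c, d = 20, 80, 10, 90
cross_product = (a * d) / (b * c)

print(f"cohort view:       {or_cohort_view(a, b, c, d):.4f}")
print(f"case-control view: {or_case_control_view(a, b, c, d):.4f}")
print(f"AD/BC:             {cross_product:.4f}")
```

All three values agree (2.25 for these counts), which is the algebraic identity shown in the two derivations.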

Take Home Point

• Despite the difference in design, the odds ratio is the SAME measure of risk in both types of studies.

• In the simplest analytic approach, we can easily calculate AD/BC from the 2x2 table of an observational study.

• But, things do tend to get more complicated:

– what if exposure is not binary, like selenium level?

– what if we need to adjust for known, measured confounders, such as BMI, smoking, alcohol, time between selenium measurement and prostate cancer diagnosis?

Logistic Regression

Logistic regression allows us to do 2x2 table analysis, and much more:

Let y = 1 if prostate cancer, 0 if not

Let x = 1 if high selenium, 0 if low (assume binary for now)

What is difference between an “exposed” and “unexposed” pair of individuals?

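$$
\log\left(\frac{P(y=1 \mid x)}{1 - P(y=1 \mid x)}\right) = \beta_0 + \beta_1 x
$$

$$
\frac{P(y=1 \mid x=1)}{1 - P(y=1 \mid x=1)} = e^{\beta_0 + \beta_1} \quad \text{if } x = 1
$$

$$
\frac{P(y=1 \mid x=0)}{1 - P(y=1 \mid x=0)} = e^{\beta_0} \quad \text{if } x = 0
$$

$$
\mathrm{OR}
= \frac{P(y=1 \mid x=1) \,/\, [1 - P(y=1 \mid x=1)]}{P(y=1 \mid x=0) \,/\, [1 - P(y=1 \mid x=0)]}
= \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}}
= e^{\beta_1}
$$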
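To illustrate that logistic regression with a single binary exposure reproduces the 2x2 table analysis, here is a small sketch that fits the model log-odds = β0 + β1·x by Newton-Raphson on grouped counts and compares exp(β1) with AD/BC. The fitting routine, its name, and the counts are assumptions made for illustration; in practice one would use a statistics package rather than this hand-written fit.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_binary(a, b, c, d, n_iter=25):
    """Maximum-likelihood fit of log-odds = beta0 + beta1 * x by Newton-Raphson.

    a, b: diseased / non-diseased counts among the exposed (x = 1)
    c, d: diseased / non-diseased counts among the unexposed (x = 0)
    """
    beta0, beta1 = 0.0, 0.0
    n1, n0 = a + b, c + d
    for _ in range(n_iter):
        p1 = sigmoid(beta0 + beta1)   # fitted P(y = 1 | x = 1)
        p0 = sigmoid(beta0)           # fitted P(y = 1 | x = 0)
        # Score vector (gradient of the log-likelihood).
        g0 = (a - n1 * p1) + (c - n0 * p0)
        g1 = a - n1 * p1
        # Information matrix [[h00, h01], [h01, h11]].
        w1 = n1 * p1 * (1 - p1)
        w0 = n0 * p0 * (1 - p0)
        h00, h01, h11 = w1 + w0, w1, w1
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g, using the closed-form 2x2 inverse.
        beta0 += (h11 * g0 - h01 * g1) / det
        beta1 += (-h01 * g0 + h00 * g1) / det
    return beta0, beta1


# Hypothetical counts for illustration only.
a, b, c, d = 20, 80, 10, 90
beta0, beta1 = fit_logistic_binary(a, b, c, d)
print(f"exp(beta1) = {math.exp(beta1):.4f}")
print(f"AD/BC      = {(a * d) / (b * c):.4f}")
```

For these counts both lines show the same value, since exp(β1) from the fitted model equals the cross-product ratio of the table.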

Logistic Regression

• That was the simplest case

• Logistic regression allows us much more freedom:

• x’s can be anything (continuous, binary, etc.)

• Let’s assume that x1 = 4th quartile selenium, x2 = 3rd quartile selenium, x3 = 2nd quartile selenium

• And we need to adjust for x4 = BMI, x5 = smoking, x6 = alcohol, x7 = years before diagnosis.

• What is the interpretation of $\beta_1$?

$$
\log\left(\frac{P(y=1 \mid x)}{1 - P(y=1 \mid x)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k
$$

Interpretation of coefficients

$\beta_1$ is the log odds ratio comparing risk of prostate cancer for those in the 4th quartile of selenium to those in the lowest quartile, adjusted for BMI, smoking, alcohol, and years before diagnosis.

$e^{\beta_1} = 0.24$: Individuals in the highest quartile for selenium have 0.24 times the odds of prostate cancer compared to those in the lowest quartile, adjusting for BMI, smoking, alcohol, and years before diagnosis.
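The step between a fitted coefficient and the reported odds ratio is just exponentiation. As a small sketch using the 0.24 quoted above (the variable names are illustrative assumptions):

```python
import math

odds_ratio_q4_vs_q1 = 0.24                      # e^{beta1}, as quoted above
beta1 = math.log(odds_ratio_q4_vs_q1)           # the coefficient on the log-odds scale
odds_ratio_q1_vs_q4 = 1 / odds_ratio_q4_vs_q1   # swapping the reference group inverts the OR

print(f"beta1 = {beta1:.2f}")                                # prints "beta1 = -1.43"
print(f"OR, lowest vs highest = {odds_ratio_q1_vs_q4:.2f}")  # prints "OR, lowest vs highest = 4.17"
```

A negative coefficient corresponds to an odds ratio below 1, meaning lower odds of disease in the highest quartile than in the reference quartile.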

Why is logistic regression SO important in observational studies?

• We see it in clinical trials, but it is not as omnipresent there as in observational studies

• Big difference: in clinical trials, we often rely on randomization to ensure comparability of groups.

• In observational studies, individuals self-select treatment/exposure and that choice may be related to other factors.

We MUST perform adjustment for confounding factors!

• Examples:

1. Exercise and selenium: what if selenium is strongly associated with prostate cancer? People who exercise tend to eat better diets, rich in selenium. If we consider the association between exercise and prostate cancer without adjusting for selenium, then we may falsely conclude that exercise and prostate cancer are associated.

2. Coffee and lung cancer: A case-control study found a strong association between coffee and lung cancer. However, after adjusting for smoking, the association “went away.” Why? People who self-select smoking also tend to self-select coffee consumption.
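The coffee example can be sketched with made-up numbers. In the hypothetical tables below, coffee has no association with lung cancer within either smoking stratum, yet the crude (pooled) table shows a strong one. The adjustment shown is a Mantel-Haenszel pooled odds ratio, used here as a simple stand-in for the logistic regression adjustment discussed above; all counts and function names are assumptions for illustration only.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio AD/BC."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across a list of (a, b, c, d) strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Each tuple is (coffee cases, coffee controls, no-coffee cases, no-coffee controls).
# Hypothetical counts: smokers drink more coffee and contribute more cases.
smokers = (80, 40, 20, 10)
nonsmokers = (10, 40, 20, 80)

# Ignoring smoking means adding the two tables cell by cell.
crude_table = tuple(s + n for s, n in zip(smokers, nonsmokers))

crude_or = odds_ratio(*crude_table)
or_smokers = odds_ratio(*smokers)
or_nonsmokers = odds_ratio(*nonsmokers)
adjusted_or = mantel_haenszel_or([smokers, nonsmokers])

print(f"crude OR:       {crude_or:.2f}")
print(f"smokers OR:     {or_smokers:.2f}")
print(f"non-smokers OR: {or_nonsmokers:.2f}")
print(f"adjusted OR:    {adjusted_or:.2f}")
```

With these counts the crude odds ratio is about 2.5, while the stratum-specific and adjusted odds ratios are all 1: the apparent coffee effect is entirely due to smoking.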
