M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Post on 20-Dec-2015

Page 1: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

M2 Medical Epidemiology

Survival Analysis; Evidence-Based Medicine

Page 2: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Rates and Censoring

– Problems with naïve analyses

– Ubiquity of censored data

– Methods of adjustment for censoring

– Survival curves and their interpretation

Page 3: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

• The table below was obtained several decades ago by averaging ages at death from death certificates of physicians who had practiced in different specialties.

Category                                          Mean Age at Death (years)
Radiologists                                      60.5
Dermatologists                                    62.3
Pathologists                                      62.4
Specialists with some radiation exposure          63.7
  (gastroenterologists, urologists, etc.)
Other physicians with no exposure to radiation    65.7

• The investigators concluded that there was a dose-response effect, with specialties more exposed to radiation having shorter life expectancy.

• Why was their conclusion nonsense?

Page 4: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

The following headlines, the second of which is from the Champaign-Urbana News-Gazette, feature a researcher who has repeatedly reached very surprising conclusions about smoking and mortality.

Page 5: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

These include:

– Women are at greater risk of smoking-related death than men.

– Filter cigarettes are more dangerous than unfiltered cigarettes.

Several years ago other investigators, using a similarly designed study that received even more publicity, concluded that left-handers have a life expectancy 9 years below that of right-handers. Numerous “plausible” explanations were advanced to explain this.

These conclusions were reached by the same method used by the researchers above. Other similar studies have erroneously concluded that effective medical innovations have no impact.

Can you explain all this silliness by a single flaw in the design of the research?

Page 6: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

Death certificate studies such as those above can be grossly biased, because they don't start with a defined cohort.

By selecting into the study population only those people who have already died, they systematically exclude the people who live longest.

If the age distributions of two groups differ to start with, the resulting comparisons can lead to ridiculous scientific conclusions due to selection bias.
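The flaw can be made concrete with a small sketch. It assumes, purely for illustration, that every physician is equally likely to die at 50, 60, 70, 80 or 90, and that one specialty is long established while the other is new. The lifespans, cohort ages and function name are hypothetical and are not taken from the studies above.

```python
def mean_age_at_death_among_dead(lifespans, years_since_birth):
    """Mean age at death, counting only members who have already died.

    lifespans: eventual ages at death of all cohort members (hypothetical).
    years_since_birth: how long ago the cohort was born; a member has a
    death certificate today only if lifespan <= years_since_birth.
    """
    already_dead = [age for age in lifespans if age <= years_since_birth]
    return sum(already_dead) / len(already_dead)


# Identical mortality in both specialties (illustrative values only).
lifespans = [50, 60, 70, 80, 90]

# Long-established specialty: cohort born 100 years ago, so everyone has died.
old_specialty = mean_age_at_death_among_dead(lifespans, years_since_birth=100)

# Newer specialty: cohort born 65 years ago, so only the early deaths are seen.
new_specialty = mean_age_at_death_among_dead(lifespans, years_since_birth=65)

print(old_specialty, new_specialty)
```

Under these assumptions the newer specialty shows a much lower mean age at death even though its members die at exactly the same ages, simply because its longest-lived members have not yet produced death certificates.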

Page 7: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

Thus, only studies that begin with defined cohorts are usually valid. But

– the cohorts must usually be assembled over a period of time, since people don’t all get sick at once,

– members of the cohorts may drop out of the study, or become lost to follow-up, for numerous reasons, and

– some suffer the outcome of interest earlier than others.

Hence, some members of the cohort may be observed for much shorter periods than others. Methods that don’t take these differences in observation periods into account are subject to substantial measurement bias, because the process of monitoring for the outcome differs between individuals.

Page 8: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Problems with Naïve Survival Analyses

One way to take observation times into account is to calculate incidence density rates using person-years. This method works well when incidence density is stable over the period of study.

– However, usually this assumption is false. In the study of long-term survival, we all know that the incidence density of death (the mortality rate) rises with age, which increases over time.

– Studies of surgical outcomes must deal with perioperative mortality, which is often much higher than later mortality.

– Mortality rates from cancer change substantially after treatment; 5-year survival for some cancers is regarded as cure.

– The rate of complications of certain diseases, such as diabetes, increases greatly with duration of the disease.

For these situations, calculating overall incidence density rates using person-years pools information from different times in an inappropriate way. Other methods are necessary.
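A rough sketch of why pooling person-years can mislead, using a made-up surgical cohort in which mortality is high in the first year and low afterwards. All counts are hypothetical and chosen only to illustrate the point.

```python
def incidence_density(events, person_years):
    """Events per person-year of observation."""
    return events / person_years


# Hypothetical surgical cohort (illustrative numbers, not from the slides).
# First year after surgery: 10 deaths over 90 person-years.
perioperative = incidence_density(events=10, person_years=90)

# Years 2 to 5: 4 deaths over 320 person-years.
later = incidence_density(events=4, person_years=320)

# Pooling every period into one overall rate.
pooled = incidence_density(events=10 + 4, person_years=90 + 320)

print(f"first year: {perioperative:.4f} deaths per person-year")
print(f"later:      {later:.4f} deaths per person-year")
print(f"pooled:     {pooled:.4f} deaths per person-year")
```

The pooled rate falls between the two period-specific rates and describes neither period well, which is the inappropriate pooling over time that the slide warns about.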

Page 9: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Ubiquity of Censored Data

The term “survival data” is used in Medicine and Public Health to describe “time to event” data, where the event is any occurrence of importance to health.

Time to death, renal failure, second myocardial infarction, first asthma attack after change in therapy, or time to recovery all are “survival data” in this technical sense. (CAUTION)

Survival data in Medicine and Public Health are also usually "censored data," because for some subjects we know only that they have survived at least a certain period of time, but we don’t know when death or other outcome will occur.

– We stop most clinical trials or vaccine field trials before most subjects die or experience an unfavorable outcome.

– Otherwise, it would take too long to get an answer and researchers couldn’t get tenure.

Page 10: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

To interpret censored survival data well, we must

– avoid pooling information from different subjects inappropriately

– avoid pooling information from different times inappropriately

– take censoring into account without introducing bias into the analysis.

We use one of two methods:

– Actuarial (Cutler-Ederer)

– Product-limit (Kaplan-Meier)

Both stem from the same fundamental approach: the fundamental equation of survival analysis.

Page 11: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Probability of surviving 2 periods (example: 2 years)

Probability of surviving the 1st year (e.g. 80%)

Probability of surviving the 2nd year (not 2 years, only the 2nd year), i.e. of those alive after 1 year, what is the probability of surviving the 2nd year? (e.g. 70%)

Then the probability of surviving 2 years is 80% × 70% = 56%.

Can we just divide the number surviving 2 years by the starting number? NO: that would treat every withdrawn patient as a death.

Page 12: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curve

[Figure: survival curve. Vertical axis: percent alive, 0 to 120.]

Page 13: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

So all we need is:

The percent surviving each time period. We get that by calculating the percent dying during the time period.

Example: We start with 90 patients. During the first year 20 withdraw and 16 die.

Is the probability of dying during the 1st year 16 divided by 90, or by 70? Halfway between: divide by 80.

Page 14: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Actuarial method

Number dying during the period, divided by (number alive at the beginning of the period minus half of the number withdrawn).

16/(90 − 20/2) = 16/80 = 20%, so 80% survive.

Page 15: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

2nd year

We are starting with 90 − 20 − 16 = 54. During the 2nd year 8 are lost to follow-up and 15 die.

Probability of dying in the 2nd year is 15/(54 − 8/2) = 15/50 = 30%. So 70% survive the 2nd year.

So the probability of surviving 2 years is 80% × 70% = 56%.
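The two-year example can be checked with a few lines of Python. This is a minimal sketch of the actuarial arithmetic using the counts from the slides; the function name is an illustrative choice, and the "naive" figure is included only to show what dividing two-year survivors by the starting number would give.

```python
def actuarial_interval_survival(alive_at_start, deaths, withdrawn):
    """Chance of surviving one interval, counting withdrawals as half at risk."""
    effective_at_risk = alive_at_start - withdrawn / 2
    return 1 - deaths / effective_at_risk


# Year 1: 90 patients, 20 withdraw, 16 die.
p_year1 = actuarial_interval_survival(alive_at_start=90, deaths=16, withdrawn=20)

# Year 2: 54 patients remain, 8 are lost to follow-up, 15 die.
alive_year2 = 90 - 20 - 16
p_year2 = actuarial_interval_survival(alive_at_start=alive_year2, deaths=15, withdrawn=8)

# Two-year survival is the product of the two conditional probabilities.
two_year_survival = p_year1 * p_year2

# Naive alternative: people still observed alive after 2 years / starting number.
observed_alive_after_2_years = alive_year2 - 8 - 15
naive_estimate = observed_alive_after_2_years / 90

print(f"actuarial: {two_year_survival:.2f}, naive: {naive_estimate:.2f}")
```

The naive figure is far lower because it silently counts all 28 withdrawn patients as if they had died.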

Page 16: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Survival data, converted from chronological to biological time:

Page 17: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Fundamental equation of survival analysis

Suppose we select a set of times, symbolized by t1, t2, ... , tk. These represent not calendar time, but durations from a clinically defined starting point such as diagnosis or treatment.

Suppose that patients are observed for different durations after this starting point, usually over different intervals of calendar time, as in the previous slide.

We are interested in the probabilities that a patient survives until each of the given times t1, t2, ... , tk after the starting point. Why?

– these are useful measures of prognosis in clinical practice, both for their own sakes and as complements of the cumulative incidences (CIs) of death at t1, t2, ... , tk

– we may also use them for

• comparing cohorts with different exposures in observational epidemiological studies

• comparing treatment effects in experimental clinical trials

Page 18: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

To estimate these probabilities, we use the

Fundamental Equation of Survival Analysis

Pr{surviving through time tj} =

1 − Pr{death by time tj} =

Pr{surviving through t1}

× Pr{surviving through t2 | survival through t1}

× Pr{surviving through t3 | survival through t2}

× Pr{surviving through t4 | survival through t3}

× ...

× Pr{surviving through tj | survival through t(j−1)}

Page 19: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Thus, the probability of surviving a given duration is expressed as the product of:

– the probability of surviving an initial interval, with

– the conditional probabilities of surviving each successive interval, having survived all previous intervals.

Each of these terms may be separately estimated by pooling data from relevant persons with possibly non-concurrent experiences!

Page 20: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Notation

Ox = # alive at beginning of interval x

Dx = # dying during interval x

Wx = # withdrawn from study or lost to follow-up during interval x

Page 21: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Cutler-Ederer (Actuarial) Approach

Intervals specified in advance.

Pr{dying during interval x} =

Dx /(Ox -Wx/2)

Pr{surviving during interval x} =

1 - Pr{dying during interval x}

Page 22: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Kaplan-Meier

Keep track of withdrawals all the time.

Don’t touch the curve until someone dies.

Probability of dying is the number dying at this point divided by the number still available at the time of death.

Page 23: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Example

You start with 15 patients. You are notified about withdrawals.

On July 3rd you are notified about 2 deaths (on the same day!).

You look at the number withdrawn up to that point and find there have been 5.

You divide 2 by (15 − 5): 2/10 = 20%.

Page 24: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Contd

On July 3rd you take your line straight down from 100% to 80%.

So the probability of dying at any point is the number dying at that point divided by (the number alive at the beginning of the preceding period minus all withdrawals during that period).

Page 25: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curve

[Figure: survival curve. Vertical axis: percent alive, 0 to 120.]

Page 26: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Next

Now we have only 8 patients (15 − 5 − 2).

On December 23rd 1 patient dies. Between July 3rd and December 23rd 2 patients are withdrawn.

Divide 1 by (8 − 2): 1/6 = 16.7%.

Probability of surviving the 2nd period is 83.3%.

Probability of surviving 2 time periods is 80% × 83.3% = 66.7%.

So on December 23rd you take the line straight down from 80% to 66.7%.

Page 27: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Where do you read the curve?

At the end of the line.

Page 28: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Product-Limit (Kaplan-Meier) Approach

Intervals are determined by times at death:

– infinitesimally small intervals around each death time, and, in between,

– intervals during which no deaths occur.

Pr{surviving intervals between deaths} = 1

Pr{dying at the xth death time} = Dx/Ox, where Ox here is the number still under observation at that death time.
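The product-limit rule can be sketched as a short function that works from individual follow-up records. The data below re-create the 15-patient example from the earlier slides; the day numbers are hypothetical stand-ins (day 184 for July 3rd and day 357 for December 23rd, assuming follow-up starts on January 1st), and the withdrawal days are invented so that the counts match the slides. Patients withdrawn on exactly a death day are counted as still at risk on that day, which is a common convention rather than something the slides specify.

```python
def kaplan_meier(observations):
    """Product-limit survival estimate.

    observations: list of (time, died) pairs; died is False when the patient
    was withdrawn or lost to follow-up (censored) at that time.
    Returns a list of (death_time, survival_after_that_time) steps.
    """
    death_times = sorted({time for time, died in observations if died})
    survival = 1.0
    steps = []
    for t in death_times:
        # Everyone still under observation at time t is at risk.
        at_risk = sum(1 for time, _ in observations if time >= t)
        deaths = sum(1 for time, died in observations if died and time == t)
        survival *= 1 - deaths / at_risk
        steps.append((t, survival))
    return steps


# Hypothetical day numbers chosen to reproduce the slide's counts.
observations = (
    [(day, False) for day in (30, 60, 90, 120, 150)]  # 5 withdrawals before July 3rd
    + [(184, True), (184, True)]                      # 2 deaths on July 3rd
    + [(200, False), (250, False)]                    # 2 withdrawals in between
    + [(357, True)]                                   # 1 death on December 23rd
    + [(365, False)] * 5                              # 5 still alive at end of follow-up
)

curve = kaplan_meier(observations)
for day, survival in curve:
    print(f"day {day}: {survival:.3f}")
```

The curve only changes at the two death days, dropping to 80% and then to about 66.7%, matching the hand calculation.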

Page 29: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Methods of adjustment for censoring

Kaplan-Meier (product-limit) and Cutler-Ederer (actuarial) survival plots of the same data. Which is which?

Page 30: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Actuarial methods of adjustment for censoring

Years since entry   Alive under observation     Deaths during    Withdrawn alive or lost
into study          at start of interval (Ox)   interval (Dx)    during interval (Wx)
0-1                 146                         27               3
1-2                 116                         18               10
2-3                 88                          21               10
3-4                 57                          9                3
4-5                 45                          1                3
5-6                 41                          2                11
6-7                 28                          3                5
7-8                 20                          1                8
8-9                 11                          2                1
9-10                8                           2                6

Estimated chance that someone who starts the interval will die within the interval = qx = Dx/(Ox − Wx/2)

Estimated chance that someone who starts the interval will survive through it = px = 1 − qx

Chance of surviving from the beginning of the study to the end of the interval = Px = px × p(x−1) × p(x−2) × ... × p1 = px × P(x−1)

Page 31: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Actuarial Method of adjustment for censoring

Years since entry   Alive under observation     Deaths during    Withdrawn alive or lost
into study          at start of interval (Ox)   interval (Dx)    during interval (Wx)
0-1                 146                         27               3
1-2                 116                         18               10
2-3                 88                          21               10

q1 = D1/(O1 − W1/2) = 27/(146 − 3/2) = .1869

p1 = 1 − q1 = 1 − .1869 = .8131

Px = px × p(x−1) × p(x−2) × ... × p1 = px × P(x−1)

P1 = p1 = .8131

Page 32: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Actuarial Method of adjustment for censoring

Years since entry   Alive under observation     Deaths during    Withdrawn alive or lost
into study          at start of interval (Ox)   interval (Dx)    during interval (Wx)
0-1                 146                         27               3
1-2                 116                         18               10
2-3                 88                          21               10

P1 = p1 = .8131

q2 = D2/(O2 − W2/2) = 18/(116 − 10/2) = .1622

p2 = 1 − q2 = 1 − .1622 = .8378

Px = px × p(x−1) × p(x−2) × ... × p1 = px × P(x−1)

P2 = p2 × p1 = .8378 × .8131 = .6812
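The same steps can be carried through all ten intervals of the table given earlier. The following is a sketch of the Cutler-Ederer arithmetic as described on these slides; the row values are copied from the table, while the function and variable names are illustrative choices. Because the code does not round intermediate values, its results may differ from the hand calculation in the fourth decimal place.

```python
# (interval, Ox, Dx, Wx) rows copied from the slides' ten-interval table.
rows = [
    ("0-1", 146, 27, 3),
    ("1-2", 116, 18, 10),
    ("2-3", 88, 21, 10),
    ("3-4", 57, 9, 3),
    ("4-5", 45, 1, 3),
    ("5-6", 41, 2, 11),
    ("6-7", 28, 3, 5),
    ("7-8", 20, 1, 8),
    ("8-9", 11, 2, 1),
    ("9-10", 8, 2, 6),
]


def actuarial_life_table(rows):
    """Return (interval, qx, px, Px) for each row using qx = Dx / (Ox - Wx/2)."""
    cumulative = 1.0
    table = []
    for interval, alive, deaths, withdrawn in rows:
        q = deaths / (alive - withdrawn / 2)  # chance of dying in the interval
        p = 1 - q                             # chance of surviving the interval
        cumulative *= p                       # Px = px * P(x-1)
        table.append((interval, q, p, cumulative))
    return table


table = actuarial_life_table(rows)
for interval, q, p, cumulative in table:
    print(f"{interval:>5}  q={q:.4f}  p={p:.4f}  P={cumulative:.4f}")
```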

Page 33: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Cox Proportional Hazards

Car going at constant 20 MPH through varying traffic, curves etc.

Risk of accident varies instantaneously according to traffic, road condition etc.

Another car going through exact same roads and traffic but at 40 MPH.

Risk of accident is twice(?) as much at every instant.

Page 34: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Proportional hazards

Hazard varies over time, but the ratio of the hazards remains constant.

Sir David Cox in 1972 introduced a method to estimate the hazard ratio without calculating the actual time-dependent hazard.

This hazard ratio can be “adjusted” for covariates (Cox regression).

Output: HR, the hazard ratio (similar to OR).

Breslow introduced a way to estimate hazard at any particular time.
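To make the proportional-hazards idea concrete, the sketch below follows the two-car analogy: a baseline hazard that changes from period to period, and a second hazard that is exactly twice the baseline at every moment. This only illustrates the assumption; it is not Cox's estimation method, and the hazard values and period length are invented. It relies on the standard relation that survival equals exp(−cumulative hazard), with the hazard treated as constant within each period.

```python
import math


def survival_from_hazards(hazards, period_length=1.0):
    """Survival at the end of each period, given a hazard rate per period."""
    cumulative_hazard = 0.0
    curve = []
    for hazard in hazards:
        cumulative_hazard += hazard * period_length
        curve.append(math.exp(-cumulative_hazard))
    return curve


# Hypothetical accident hazards for the slow car: they vary with traffic and road.
baseline_hazard = [0.05, 0.20, 0.10, 0.02]

# The fast car faces twice the hazard at every instant (hazard ratio = 2).
hazard_ratio = 2.0
fast_hazard = [hazard_ratio * h for h in baseline_hazard]

slow_curve = survival_from_hazards(baseline_hazard)
fast_curve = survival_from_hazards(fast_hazard)

for slow, fast in zip(slow_curve, fast_curve):
    # Under proportional hazards, fast survival = slow survival ** hazard ratio.
    print(f"slow {slow:.3f}  fast {fast:.3f}  slow**HR {slow ** hazard_ratio:.3f}")
```

Although both hazards rise and fall over time, the fast car's survival curve is always the slow car's curve raised to the power of the hazard ratio, which is why a single number (the HR) can summarize the comparison.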

Page 35: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

Survival curves always start at 1.0 = 100% on the vertical axis and can never rise; they stay level or decline. The only issue is how fast they decline.

Further, if one follows patients long enough, all curves describing actual survival (in contrast to some other outcome that doesn't affect everyone eventually) end at zero. The issue is therefore not where they end, but how much higher one curve is relative to another, or the area between the curves.

This is no surprise; it’s just the cumulative incidence issue in another form, since survival “rates” are just complements, with respect to 1, of cumulative incidences.

Page 36: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

Trends in survival curves may be much less accurate towards the right end than at the beginning, because fewer people contribute to the computation at the right end, most subjects having been observed for shorter intervals.

However, this problem of unreliability may be somewhat mitigated by the tendency of the true survival curve to flatten out in many real situations.

Note that it’s not as much the height at the end that’s less accurate as it is the slope at the end. This point is important in understanding prognostic estimates made near the ends of the curves, as described below.

Page 37: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

Later Prognosis

Survival curves can be used to estimate the outlook for a patient who has already survived a certain length of time, by dividing the height of the curve later by its present height.

Thus, if a patient who has survived a myocardial infarction for 2 years wants to know the chances of surviving another year, divide the 3-year survival rate by the 2-year survival rate.

This gives the estimated fraction, of those who survived the first 2 years, who will make it through another year.
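As a small worked sketch of this division, suppose (hypothetically) that the curve reads 75% at 2 years and 60% at 3 years. These heights are invented for illustration and do not come from any figure in the slides.

```python
def conditional_survival(height_later, height_now):
    """Chance of reaching the later time, given survival to the present time."""
    return height_later / height_now


# Hypothetical curve heights: 75% alive at 2 years, 60% alive at 3 years.
two_year_rate = 0.75
three_year_rate = 0.60

chance_of_another_year = conditional_survival(three_year_rate, two_year_rate)
print(f"{chance_of_another_year:.0%} of 2-year survivors reach 3 years")
```

Note that the answer is higher than the 3-year survival rate itself, because the patient has already come through the first 2 years.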

Page 38: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

From the blurry curve below, can you determine roughly the chance that someone who has already survived for three years will survive for two more?

Page 39: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

Reiterating a previous point, survival analysis is applied to the development of any irreversible outcome, not just mortality.

It is also frequently applied to the first occurrence of a reversible outcome as well.

Survival curves are sometimes plotted with a logarithmic vertical scale, especially when the mortality rate is roughly constant. In that case the survival curves look like straight lines. Watch the scale or you can be badly misled.
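The straight-line behaviour on a logarithmic scale can be checked numerically. This sketch assumes a hypothetical constant mortality rate of 0.1 per year, so that survival is exp(−rate × time), and confirms that the logarithm of survival falls by the same amount every year.

```python
import math

# Hypothetical constant mortality rate (deaths per person-year).
rate = 0.1
years = range(0, 6)

# With a constant rate, survival decays exponentially.
survival = [math.exp(-rate * t) for t in years]

# Year-to-year drop in survival itself (shrinks over time: a curve).
linear_drops = [survival[i] - survival[i + 1] for i in range(5)]

# Year-to-year drop in log survival (constant: a straight line).
log_drops = [math.log(survival[i]) - math.log(survival[i + 1]) for i in range(5)]

print([round(d, 4) for d in linear_drops])
print([round(d, 4) for d in log_drops])
```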

Page 40: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

In interpreting survival curves, the choice of starting point is as critical as the shapes of the curves.

For instance, if you evaluate a screening program by starting at time of diagnosis, and compare survival from diagnosis of a screened and unscreened group, then screening will always look good.

Why?

Page 41: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

...screening will always look good. Why?

Because survival for the screened group is being measured from an earlier point in the disease process than for an unscreened group.

This is called "lead-time bias," a measurement bias. It may be that an apparent survival advantage in the screened group simply reflects the extent by which screening moved up the date of diagnosis of the disease, rather than any impact of early detection and treatment on true survival. Beware this trap!
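Lead-time bias can be illustrated with a deliberately simple sketch. It assumes one hypothetical patient whose age at death is the same with or without screening (that is, early detection brings no treatment benefit at all), and that screening finds the disease 3 years before symptoms would. All ages are invented for illustration.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years survived, measured from the date of diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient: death occurs at the same age in both scenarios.
age_at_death = 70
age_at_symptomatic_diagnosis = 67
lead_time = 3  # years by which screening moves up the diagnosis

age_at_screen_diagnosis = age_at_symptomatic_diagnosis - lead_time

unscreened_survival = survival_from_diagnosis(age_at_symptomatic_diagnosis, age_at_death)
screened_survival = survival_from_diagnosis(age_at_screen_diagnosis, age_at_death)

print(f"unscreened: {unscreened_survival} years, screened: {screened_survival} years")
```

Survival measured from diagnosis doubles, yet the patient dies at exactly the same age; the whole apparent gain equals the lead time.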

Page 42: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The figure above compares three survival curves, but gives no indication of how reliable these curves are. They might be from large samples or very small samples, and be statistically very stable or highly variable. We can't tell.

Page 43: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The graph to the right is more informative. With each curve, at the end of each interval, is the number who survived the interval without a recurrence (Ox-Dx-Wx), shown as a fraction of the number (Ox) who reached the start of the interval without a recurrence. We see from the curves that they are based on only a few patients. Specifically, we see that even though things look encouraging after two years, there is very little information in these data about that period of time.

Page 44: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

This plot gives information about variability in a different form, by using standard error bars for each survival rate.

Just like a mean, a proportion, or any other statistic, a survival rate has a standard error that reflects how variable the statistic is from sample to sample under the same conditions.

Page 45: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The standard error bars give us more direct information than the sample sizes as to how precisely the survival rate at each time is estimated by the given set of data.

The error bars below show the survival rates are quite imprecise.

Page 46: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The figure below tries to combine the best features of the previous two, by including both

• the number of individuals observed to survive each interval, and

• standard error bars for the survival rates plotted at the end of each interval.

This makes the figure “busy,” but more informative than the others we have seen.

Page 47: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The plot below compares survival of lung cancer patients diagnosed during three successive decades. Visually, the increase in long-term survival looks quite noticeable.

What special feature of this plot makes the visual impression exaggerate the beneficial trend?

Page 48: M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Curves and their Interpretation

The literature is also replete with plots of cumulative probabilities of events over time, such as the plot below. These are obtained by the same method as survival plots. The only difference is that, rather than plot the survival probability, the researchers subtract it from one first.