statistics for clinicians


Upload: mechelle-cabrera

Post on 04-Jan-2016


Page 1: Statistics for clinicians

Statistics for clinicians

Biostatistics course by Kevin E. Kip, Ph.D., FAHA

Professor and Executive Director, Research Center
University of South Florida, College of Nursing
Professor, College of Public Health
Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute
Morsani College of Medicine
Tampa, FL, USA

Page 2: Statistics for clinicians

SECTION 6.6

Introduction to survival analysis

Page 3: Statistics for clinicians

Learning Outcome:

Recognize concepts and methods used in survival analysis

Page 4: Statistics for clinicians

Survival Analysis

• A technique to estimate the probability of “survival” (and also risk of disease) that takes into account incomplete subject follow-up.

• Calculates risks over a time period with changing incidence rates.

• Wide application in a variety of disciplines, such as engineering.

Page 5: Statistics for clinicians

Survival Analysis

• With the Kaplan-Meier method (“product-limit method”), survival probabilities are calculated at each time interval in which an event occurs.

• The cumulative survival over the entire follow-up period is derived from the product of all interval survival probabilities.

• Cumulative incidence (risk) is the complement of cumulative survival.

Page 6: Statistics for clinicians

K-M formula:

$$S = \prod_{k=1}^{K} \frac{N_k - A_k}{N_k}$$

Where:
K = number of time intervals
k = sequence of time interval
N_k = number of subjects at risk
A_k = number of outcome events
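The product in the K-M formula can be sketched in a few lines of Python. This is only an illustration: the function name `km_survival` and the three-interval counts are hypothetical and do not come from the course data.

```python
from math import prod

def km_survival(at_risk, events):
    """Cumulative survival: product over intervals of (N_k - A_k) / N_k."""
    if len(at_risk) != len(events):
        raise ValueError("at_risk and events must have the same length")
    return prod((n - a) / n for n, a in zip(at_risk, events))

# Hypothetical counts for three event times (illustrative only).
at_risk = [20, 15, 10]   # N_k: subjects at risk at each event time
events = [2, 3, 1]       # A_k: outcome events at each event time

survival = km_survival(at_risk, events)   # (18/20) * (12/15) * (9/10)
risk = 1 - survival                       # cumulative incidence is the complement
print(round(survival, 3), round(risk, 3))
```

Each factor is the proportion surviving one interval, so an interval with no events contributes a factor of 1 and leaves the cumulative survival unchanged.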

Page 7: Statistics for clinicians

Survival Analysis

• With the Kaplan-Meier method, subjects with incomplete follow-up (FU) are “censored” at their last known time of FU.

• An important assumption (often not upheld) is that censoring is “non-informative” (survival experience of subjects censored is the same as those with complete FU).

• Non-fatal outcomes can also be studied.

Page 8: Statistics for clinicians

Survival Analysis

• The Life-Table method is conceptually similar to the Kaplan-Meier method.

• The primary difference is that survival probabilities are determined at pre-determined intervals (e.g., years), rather than when events occur.

Page 9: Statistics for clinicians

SECTION 6.7

Calculation and Interpretation of Survival Analysis Estimates

Page 10: Statistics for clinicians

Learning Outcome:

Calculate and interpret survival analysis estimates of incidence

Page 11: Statistics for clinicians

Survival Analysis

Example:

• Assume a study of 10 subjects conducted over a 2-year period.

• A total of 4 subjects die.

• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).

What is the probability of 2-year survival, and the corresponding risk of 2-year death?

Page 12: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | ? | 1 | 1 | ? | ? | ? | ?

Page 13: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | ? | ? | ? | ?

Page 14: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325
22 | 5 | 1 | 0 | ? | ? | ? | ?

Page 15: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325
22 | 5 | 1 | 0 | 0.20 | 0.80 | 0.54 | 0.46

Page 16: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325
22 | 5 | 1 | 0 | 0.20 | 0.80 | 0.54 | 0.46
24 | 4 | 0 | 0 | 0.0 | 1.0 | 0.54 | 0.46

Page 17: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325
22 | 5 | 1 | 0 | 0.20 | 0.80 | 0.54 | 0.46
24 | 4 | 0 | 0 | 0.0 | 1.0 | 0.54 | 0.46

Interpretation: What is the 2-year risk of death?

Page 18: Statistics for clinicians

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10
7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212
20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325
22 | 5 | 1 | 0 | 0.20 | 0.80 | 0.54 | 0.46
24 | 4 | 0 | 0 | 0.0 | 1.0 | 0.54 | 0.46

Interpretation: What is the 1-year risk of death?
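Because the Kaplan-Meier estimate changes only at the listed times, the risk at any time point is read from the last row at or before that time. The sketch below, in Python, rebuilds column (7) from columns (1) to (3) of the table and then looks up the 12-month and 24-month risks; the function names `km_table` and `risk_at` are illustrative choices, not part of the course material.

```python
def km_table(rows):
    """rows: (time, n_alive, n_died) per listed time, in time order.
    Returns a list of (time, cumulative_survival) pairs."""
    curve, surv = [], 1.0
    for time, n_alive, n_died in rows:
        surv *= (n_alive - n_died) / n_alive
        curve.append((time, surv))
    return curve

def risk_at(curve, t):
    """Cumulative risk at time t: 1 minus survival at the last listed time <= t."""
    surv = 1.0
    for time, s in curve:
        if time > t:
            break
        surv = s
    return 1 - surv

# (time in months, no. alive, no. who died) from columns (1)-(3) of the table.
rows = [(5, 10, 1), (7, 8, 1), (20, 7, 1), (22, 5, 1), (24, 4, 0)]
curve = km_table(rows)
risk_1yr = risk_at(curve, 12)   # uses the month-7 row, the last one before month 12
risk_2yr = risk_at(curve, 24)   # uses the month-24 row
print(f"1-year risk: {risk_1yr:.4f}, 2-year risk: {risk_2yr:.4f}")
```

Under these inputs the 1-year risk comes from the month-7 row (about 0.21) and the 2-year risk from the month-24 row (0.46), matching column (8).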

Page 19: Statistics for clinicians

Survival Analysis (Practice)

Example:

• Assume a study of 12 subjects conducted over a 3-year period.

• A total of 5 subjects die.

• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).

What is the probability of 3-year survival, and the corresponding risk of 3-year death?

Page 20: Statistics for clinicians

Complete the worksheet below

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
7 | 12 | 1 | 1 | 0.0833 | 0.9167 | 0.9167 | 0.0833
11 | 10 | 1 | 0 | 0.10 | 0.90 | 0.8250 | 0.1750
16 | | 1 | 0 | | | |
24 | | 1 | 1 | | | |
30 | | 1 | 0 | | | |
36 | | 0 | 0 | | | |

What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival _______ Death _________

Page 21: Statistics for clinicians

Completed worksheet

(1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 – (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 – (7)
7 | 12 | 1 | 1 | 0.0833 | 0.9167 | 0.9167 | 0.0833
11 | 10 | 1 | 0 | 0.10 | 0.90 | 0.8250 | 0.1750
16 | 9 | 1 | 0 | 0.1111 | 0.8889 | 0.7333 | 0.2667
24 | 8 | 1 | 1 | 0.125 | 0.875 | 0.6416 | 0.3584
30 | 6 | 1 | 0 | 0.1667 | 0.8333 | 0.5346 | 0.4654
36 | 5 | 0 | 0 | 0.0 | 1.0 | 0.5346 | 0.4654

What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival _0.5346_ Death _0.4654_
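The worksheet can be checked with exact fractions, which avoids the small drift that comes from rounding each interval to four decimals before multiplying. A minimal Python sketch, using only the counts from columns (2) and (3):

```python
from fractions import Fraction

# (no. alive, no. who died) at months 7, 11, 16, 24, 30, 36, from columns (2)-(3).
rows = [(12, 1), (10, 1), (9, 1), (8, 1), (6, 1), (5, 0)]

survival = Fraction(1)
for n_alive, n_died in rows:
    # Multiply in the proportion surviving this interval, kept as an exact ratio.
    survival *= Fraction(n_alive - n_died, n_alive)

risk = 1 - survival
print(survival, round(float(survival), 4), round(float(risk), 4))
```

The exact product is 77/144, roughly 0.5347, so hand calculations that round at each step can differ from it in the fourth decimal place.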

Page 22: Statistics for clinicians

SECTION 6.8

Logistic Regression Model

Page 23: Statistics for clinicians


Learning Outcome:

Recognize components and interpret parameters from the logistic regression model

Page 24: Statistics for clinicians

Logistic Regression Analysis

Conceptually similar to linear regression, but for a dichotomous outcome.

Outcome is usually coded as “0” or “1”, with “1” referring to presence of the outcome of interest (although SAS, by default, models the lower-ordered value, 0).

p represents the probability that the outcome is present (i.e., a value of 1), given the particular covariate values of an individual.

Page 25: Statistics for clinicians

Logistic Regression Analysis

The multiple logistic regression model can be written in different ways, for example in the logit form:

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

where:
p = expected probability that outcome is present
x1 through xp = independent variables
b0 through bp = regression coefficients
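The logit form used in the worked examples of this section, ln[p / (1 – p)] = b0 + b1x1 + … + bpxp, can be rearranged into equivalent odds and probability forms. The lines below are standard algebra, given as a sketch rather than as a transcription of the original slide:

```latex
\begin{align*}
\frac{p}{1-p} &= \exp\left(b_0 + b_1 x_1 + \dots + b_p x_p\right)
  && \text{(odds form)} \\
p &= \frac{\exp\left(b_0 + b_1 x_1 + \dots + b_p x_p\right)}
          {1 + \exp\left(b_0 + b_1 x_1 + \dots + b_p x_p\right)}
  && \text{(probability form)}
\end{align*}
```

The probability form is the same as dividing the predicted odds by one plus the predicted odds.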

Page 26: Statistics for clinicians

Logistic Regression Analysis

bi = change in the expected log odds of the outcome per 1-unit change in xi, holding other predictors constant

Anti-log of regression coefficient, exp(bi), produces odds ratio

Page 27: Statistics for clinicians

Logistic Regression Analysis

Example: Estimate the risk of incident CVD among persons defined as obese.

Variable b χ2 p-value

Intercept -2.367 307.38 0.0001

Obesity (yes vs. no) 0.658 9.87 0.0017

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left(\frac{p}{1-p}\right) = -2.367 + 0.658(\text{Obesity}) = \text{log odds}$$

exp(0.658) = 1.93 (odds ratio)
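The fitted equation can also be turned into predicted probabilities for each group. A minimal Python sketch, assuming obesity is coded 1 = yes and 0 = no; the function name `predicted_probability` is illustrative:

```python
from math import exp

b0, b1 = -2.367, 0.658   # intercept and obesity coefficient from the table

def predicted_probability(obese):
    """Convert the fitted log odds into a probability for obese = 0 or 1."""
    log_odds = b0 + b1 * obese
    odds = exp(log_odds)
    return odds / (1 + odds)

odds_ratio = exp(b1)                     # anti-log of the obesity coefficient
p_not_obese = predicted_probability(0)
p_obese = predicted_probability(1)
print(f"OR = {odds_ratio:.2f}, "
      f"p(not obese) = {p_not_obese:.3f}, p(obese) = {p_obese:.3f}")
```

With these coefficients the predicted probabilities are roughly 0.09 without obesity and 0.15 with obesity. The ratio of the two corresponding odds reproduces the odds ratio of 1.93, while the ratio of the two probabilities is smaller because an odds ratio is not a risk ratio.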

Page 28: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Write out the logistic regression equation below. (Practice)

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left(\frac{p}{1-p}\right) = \;?$$

Page 29: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Write out the logistic regression equation below.

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left(\frac{p}{1-p}\right) = -3.065 + 0.036(\text{age}) - 0.530(\text{female}) + 0.029(\text{BMI}) - 0.001(\text{physical activity}) + 1.067(\text{diabetes})$$

Page 30: Statistics for clinicians

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

So, the predicted odds of an individual being on a statin drug

= exp[-3.065 + 0.036(age) – 0.530(female) + 0.029(BMI) – 0.001(physical activity) + 1.067(diabetes)]

AND

Predicted probability = predicted odds / (1 + predicted odds).

Page 31: Statistics for clinicians

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics:

Age=55; male; BMI=31.4; physical activity level=2; diabetic

Predicted odds = exp[-3.065 + 0.036(55) – 0.530(0) + 0.029(31.4) – 0.001(2) + 1.067(1)]

= exp(0.8906) = 2.437

Predicted probability = predicted odds / (1 + predicted odds) = 2.437 / 3.437 = 0.71
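The same calculation can be scripted so that any set of characteristics is plugged into the fitted equation. This is a minimal Python sketch using the coefficients as printed in the table (rounded to three decimals); the dictionary keys and the function name `predict` are illustrative choices.

```python
from math import exp

# Coefficients as printed in the table (rounded to three decimals).
coef = {
    "intercept": -3.065,
    "age": 0.036,
    "female": -0.530,
    "bmi": 0.029,
    "physical_activity": -0.001,
    "diabetes": 1.067,
}

def predict(age, female, bmi, physical_activity, diabetes):
    """Return (log odds, odds, probability) of statin use for one individual."""
    log_odds = (coef["intercept"]
                + coef["age"] * age
                + coef["female"] * female
                + coef["bmi"] * bmi
                + coef["physical_activity"] * physical_activity
                + coef["diabetes"] * diabetes)
    odds = exp(log_odds)
    return log_odds, odds, odds / (1 + odds)

# Age 55, male, BMI 31.4, physical activity level 2, diabetic.
log_odds, odds, prob = predict(age=55, female=0, bmi=31.4,
                               physical_activity=2, diabetes=1)
print(f"log odds = {log_odds:.4f}, odds = {odds:.3f}, probability = {prob:.2f}")
```

With the rounded coefficients the linear predictor evaluates to about 0.891, giving odds near 2.44 and a probability of about 0.71. Note that the physical activity term enters with a negative sign.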

Page 32: Statistics for clinicians

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics (Practice):

Age=52; female; BMI=29.5; physical activity level=3; non-diabetic

Predicted odds =

Predicted probability = predicted odds / (1 + predicted odds) =

Page 33: Statistics for clinicians

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics:

Age=52; female; BMI=29.5; physical activity level=3; non-diabetic

Predicted odds = exp[-3.065 + 0.036(52) – 0.530(1) + 0.029(29.5) – 0.001(3) + 1.067(0)]

= exp(-0.8705) = 0.419

Predicted probability = predicted odds / (1 + predicted odds) = 0.419 / 1.419 = 0.295

Page 34: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Produce odds ratio estimates of statin use for the following (Practice):

Age (per year) =
Age (per 10 years) =
Female gender =
History of diabetes =

Page 35: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Produce odds ratio estimates of statin use for the following:

Age (per year) = exp(0.036) = 1.04
Age (per 10 years) = exp(10 x 0.036) = 1.43
Female gender = exp(-0.530) = 0.59
History of diabetes = exp(1.067) = 2.91
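These odds ratios follow directly from exponentiating each coefficient, scaled by the size of the change of interest. A short Python sketch, where the helper name `odds_ratio` and its `units` parameter are illustrative:

```python
from math import exp

def odds_ratio(b, units=1):
    """Odds ratio for a `units`-unit increase in a predictor with coefficient b."""
    return exp(b * units)

or_age_1yr = odds_ratio(0.036)
or_age_10yr = odds_ratio(0.036, units=10)   # scale the coefficient, then exponentiate
or_female = odds_ratio(-0.530)
or_diabetes = odds_ratio(1.067)

print(round(or_age_1yr, 2), round(or_age_10yr, 2),
      round(or_female, 2), round(or_diabetes, 2))
```

The 10-year odds ratio equals the 1-year odds ratio raised to the 10th power, not multiplied by 10, because effects are additive on the log odds scale.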

Page 36: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Interpret odds ratio estimates of statin use for the following:

Age (per 10 years) = exp(10 x 0.036) = 1.43

History of diabetes = exp(1.067) = 2.91

Page 37: Statistics for clinicians

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable b Wald χ2 p-value

Intercept -3.065 8.015 0.027

Age (per year) 0.036 5.334 0.021

Gender (female = 1) -0.530 5.082 0.024

Body mass index (per unit) 0.029 2.187 0.139

Physical activity (per unit) -0.001 0.000 0.996

History of diabetes (1 = yes) 1.067 9.250 0.002

Interpret odds ratio estimates of statin use for the following:

Age (per 10 years) = exp(10 x 0.036) = 1.43
For every 10-year increase in age, the adjusted odds of being on a statin drug increase 1.43-fold.

History of diabetes = exp(1.067) = 2.91
Persons with diabetes have 2.91 times the adjusted odds of being on a statin drug compared to persons without diabetes.

Page 38: Statistics for clinicians

SECTION 6.9

SPSS for Logistic Regression Analysis

Page 39: Statistics for clinicians


Learning Outcome:

Use SPSS to fit and interpret a logistic regression model

Page 40: Statistics for clinicians

SPSS: Analyze → Regression → Binary Logistic

Dependent Variable
Covariates

Page 45: Statistics for clinicians

SPSS: Analyze → Descriptive Statistics → Crosstabs

Row = Hx diabetes
Col = Statin use

Odds Ratio = (odds of exposure among cases) / (odds of exposure among controls)

= (17 / 88) / (24 / 372) = 0.193 / 0.0645 = 2.99
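The crude odds ratio from the crosstab can be reproduced with a few lines of Python. The labeling of the four counts below is an assumption read off the formula above (17 and 88 as exposed and unexposed cases, 24 and 372 as exposed and unexposed controls); the function name `odds_ratio_2x2` is illustrative.

```python
def odds_ratio_2x2(exposed_cases, unexposed_cases,
                   exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Assumed cell layout: diabetes is the exposure, statin use defines cases.
crude_or = odds_ratio_2x2(17, 88, 24, 372)

# The same quantity written as a cross-product ratio (ad / bc).
cross_product = (17 * 372) / (88 * 24)
print(round(crude_or, 2), round(cross_product, 2))
```

This crude estimate of about 2.99 is close to, but not the same as, the adjusted odds ratio of 2.91 for history of diabetes from the multivariable model, which holds the other predictors constant.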