Exploring the Shape of the Dose-Response Function

Outline

Traditional approach to dose-response analysis: the “step function”

Alternative: a “flexible” regression line (spline regression)

Examples: logistic, linear, and Cox regression

Example: Sleep-Disordered Breathing and Stroke

Study: the Sleep Heart Health Study

Data set: cross-sectional

Exposure variable: apnea-hypopnea index (AHI)

Dependent variable: self-reported stroke

Potential confounders: known stroke risk factors

Data set

Observations: N=5,192

Self-reported stroke: N=204

Distribution of the Apnea-Hypopnea Index (AHI)

Mean    Percentile: 5th    25th    50th    75th    95th
8.9                 0.2    1.4     4.5     11.3    34.1

Traditional Approach: Categorical Analysis

Categorization: dummy coding

AHI           Q2    Q3    Q4
0 - 1.4       0     0     0
1.5 - 4.5     1     0     0
4.6 - 11.3    0     1     0
> 11.3        0     0     1
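The dummy coding above can be sketched in a few lines. This is a minimal illustration in Python rather than SAS; the function name `quartile_dummies` is hypothetical, and it assumes that each category includes its upper cutoff (so an AHI of exactly 1.4 falls in the reference quartile), which matches the ranges in the table.

```python
# Quartile cutoffs for AHI taken from the slides (25th, 50th, 75th percentiles).
CUTOFFS = (1.4, 4.5, 11.3)


def quartile_dummies(ahi):
    """Return the indicators (Q2, Q3, Q4); quartile 1 is the reference category.

    Assumption: each category includes its upper cutoff.
    """
    c1, c2, c3 = CUTOFFS
    q2 = 1 if c1 < ahi <= c2 else 0
    q3 = 1 if c2 < ahi <= c3 else 0
    q4 = 1 if ahi > c3 else 0
    return q2, q3, q4


# One example value per quartile.
for ahi in (0.8, 3.0, 9.9, 20.0):
    print(ahi, quartile_dummies(ahi))
```

A subject in the lowest quartile gets zeros on all three indicators, which is why that quartile serves as the reference in the model.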

Traditional Approach: Step Function

Model:

Log odds (stroke) = β1 + β2(Q2) + β3(Q3) + β4(Q4) + Z

Maximum Likelihood Estimates:

Log odds (stroke) = (-9.924) + (0.301)Q2 + (0.344)Q3 + (0.454)Q4 + Z

Adjusted Odds Ratios of Prevalent Stroke by Quartile of the Apnea-Hypopnea Index

AHI Quartile    Adj. OR
I               1.0 (ref.)
II              1.35 (0.84 - 2.18)
III             1.41 (0.88 - 2.26)
IV              1.57 (0.98 - 2.53)

[Figure: adjusted OR (log scale, 0.1 to 10) by AHI quartile, Q1 to Q4: 1.0, 1.35, 1.41, 1.57]
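The adjusted odds ratios are the exponentiated coefficients of the quartile indicators. A minimal check in Python, using only the estimates reported on the slide (the names `COEFS` and `odds_ratios` are illustrative):

```python
import math

# Coefficients of the quartile indicators, as reported in the fitted model.
COEFS = {"Q2": 0.301, "Q3": 0.344, "Q4": 0.454}

# Odds ratio relative to quartile 1 = exp(coefficient), rounded to two decimals.
odds_ratios = {name: round(math.exp(beta), 2) for name, beta in COEFS.items()}

print(odds_ratios)
```

The rounded values agree with the table: 1.35, 1.41, and 1.57.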

Traditional Approach: “Step Function”


AHI           Fitted Model
0 - 1.4       Log (odds of stroke) = β1 + Z
1.5 - 4.5     Log (odds of stroke) = β1 + β2 + Z
4.6 - 11.3    Log (odds of stroke) = β1 + β3 + Z
> 11.3        Log (odds of stroke) = β1 + β4 + Z

Traditional Approach: “Step Function”


AHI           Fitted Model
0 - 1.4       Log (odds of stroke) = -9.924 + Z
1.5 - 4.5     Log (odds of stroke) = -9.623 + Z
4.6 - 11.3    Log (odds of stroke) = -9.580 + Z
> 11.3        Log (odds of stroke) = -9.470 + Z

[Figure: step function of log odds (stroke) against AHI, with steps at AHI = 1.4, 4.5, and 11.3 and levels -9.924 + Z, -9.623 + Z, -9.580 + Z, and -9.470 + Z]

Step Function: Problems

Unrealistic assumptions: a “step function”. We actually don’t believe it; our mind tries to draw an imaginary smooth line through the steps.

Choice of categories could influence the shape

Test for trend: not a test for monotonic dose-response; statistical hypothesis testing

Alternative: “Flexible” Regression Line

Spline Regression

Categorize (specify cutoff points) (as in categorical analysis)

Fit the regression line in segments (as in categorical analysis)

Enforce continuity at the junctions (knots) (new)

EXAMPLE: Linear Spline Regression

[Figure: log odds (stroke) plotted against AHI; x-axis marked at 0, 1.4, 4.5, and 11.3]

Linear Spline Regression


Fit two straight regression lines

Ensure continuity at the knot (AHI=1.4)

Method:

Define a new variable, S:

S = 0, if AHI ≤ 1.4

S = AHI - 1.4, if AHI > 1.4

Log odds (stroke) = β0 + β1(AHI) + β2(S) + Z

To the left of the knot: S = 0

Log odds (stroke) = β0 + β1(AHI) + Z

To the right of the knot: S = AHI - 1.4

Log odds (stroke) = β0 + β1(AHI) + β2(AHI - 1.4) + Z

= (β0 - 1.4β2) + (β1 + β2)(AHI) + Z

Different slopes: β1 to the left of the knot, β1 + β2 to the right

Identical predicted value at the knot (AHI = 1.4): both equations give β0 + 1.4β1 + Z
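The two properties of the linear spline (different slopes, same value at the knot) can be verified numerically. The sketch below is in Python; the coefficient values are hypothetical, chosen only for illustration, and are not estimates from the Sleep Heart Health Study.

```python
KNOT = 1.4  # knot location used on the slide


def spline_term(ahi, knot=KNOT):
    """S = 0 to the left of the knot, AHI - knot to the right."""
    return max(0.0, ahi - knot)


def log_odds(ahi, b0, b1, b2, z=0.0):
    """Log odds = b0 + b1*AHI + b2*S + Z."""
    return b0 + b1 * ahi + b2 * spline_term(ahi) + z


# Hypothetical coefficients (illustration only, not study estimates).
b0, b1, b2 = -4.0, 0.30, -0.25

# Slope over a one-unit interval on each side of the knot.
left_slope = log_odds(1.0, b0, b1, b2) - log_odds(0.0, b0, b1, b2)
right_slope = log_odds(3.4, b0, b1, b2) - log_odds(2.4, b0, b1, b2)

# Predicted value at the knot from the left-hand and right-hand equations.
at_knot_left = b0 + b1 * KNOT
at_knot_right = (b0 - KNOT * b2) + (b1 + b2) * KNOT

print(left_slope, right_slope)
print(at_knot_left, at_knot_right)
```

With these illustrative values the slope changes from b1 to b1 + b2 at the knot, while the two segment equations agree at AHI = 1.4, so the fitted line has no jump.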


More Flexible Spline Regression

Quadratic spline: AHI + AHI²

Cubic spline: AHI + AHI² + AHI³

Basic quadratic spline

Step #1

Determine cutpoints (C1, C2, C3) on the exposure scale (4 categories)

These may be percentiles or any other values of your choice.

C1=?;

C2=?;

C3=?;

Step #2

S1 = EXP**2;

S2 = 0; S3 = 0; S4 = 0;

IF EXP > C1 THEN S2 = (EXP-C1)**2;

IF EXP > C2 THEN S3 = (EXP-C2)**2;

IF EXP > C3 THEN S4 = (EXP-C3)**2;

Step #3

Regress the dependent variable on

EXP S1 S2 S3 S4 covariates

Step #4

Find the four regression equations, one per exposure category (together they form a continuous dose-response function)

Compute and display the dose-response function

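Example: pack-years of smoking and CHD (EXP = pack-years)

/* Cutpoints on the pack-years scale; these assignments belong inside a DATA step */
C1=14;
C2=29;
C3=43;

/* Quadratic term plus one truncated quadratic term per cutpoint */
S1 = EXP**2;
S2=0; S3=0; S4=0;
IF EXP > C1 THEN S2 = (EXP-C1)**2;
IF EXP > C2 THEN S3 = (EXP-C2)**2;
IF EXP > C3 THEN S4 = (EXP-C3)**2;

/* Logistic regression of disease status on the spline terms */
PROC LOGISTIC;
MODEL DIS = EXP S1 S2 S3 S4;
RUN;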

Maximum Likelihood Estimates

Parameter    DF    Estimate
Intercept    1     -1.7022     (α)
EXP          1     -0.0203     (β0)
S1           1     0.00252     (β1)
S2           1     -0.00265    (β2)
S3           1     -0.00047    (β3)
S4           1     0.000305    (β4)

Log odds (CHD) = α + β0(EXP) + β1(S1) + β2(S2) + β3(S3) + β4(S4)

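Four regression equations, one per range of EXP

EXP ≤ 14 (S1=EXP², S2=0, S3=0, S4=0):
Log odds (CHD) = α + β0(EXP) + β1(EXP²)

14 < EXP ≤ 29 (S1=EXP², S2=(EXP-14)², S3=0, S4=0):
Log odds (CHD) = α + β0(EXP) + β1(EXP²) + β2(EXP-14)²

29 < EXP ≤ 43 (S1=EXP², S2=(EXP-14)², S3=(EXP-29)², S4=0):
Log odds (CHD) = α + β0(EXP) + β1(EXP²) + β2(EXP-14)² + β3(EXP-29)²

EXP > 43 (S1=EXP², S2=(EXP-14)², S3=(EXP-29)², S4=(EXP-43)²):
Log odds (CHD) = α + β0(EXP) + β1(EXP²) + β2(EXP-14)² + β3(EXP-29)² + β4(EXP-43)²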
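To compute the dose-response function, the four equations can be collapsed into a single expression, because each truncated term is zero until the exposure passes its cutpoint. The following Python sketch uses the cutpoints and estimates reported on the slides; the function names are illustrative, and covariates are left out because none appear in the example model.

```python
# Cutpoints and maximum likelihood estimates as reported on the slides.
CUTPOINTS = (14, 29, 43)
ALPHA = -1.7022
BETAS = (-0.0203, 0.00252, -0.00265, -0.00047, 0.000305)  # EXP, S1, S2, S3, S4


def spline_terms(exposure):
    """Return (S1, S2, S3, S4) for a given number of pack-years."""
    s1 = exposure ** 2
    # Truncated quadratic terms: zero at or below the cutpoint.
    s2, s3, s4 = (max(0, exposure - c) ** 2 for c in CUTPOINTS)
    return s1, s2, s3, s4


def log_odds_chd(exposure):
    """Log odds (CHD) = alpha + b0*EXP + b1*S1 + b2*S2 + b3*S3 + b4*S4."""
    terms = (exposure,) + spline_terms(exposure)
    return ALPHA + sum(b * t for b, t in zip(BETAS, terms))


# Tabulate the fitted curve at a few exposure values.
for pack_years in (0, 14, 29, 43, 60):
    print(pack_years, round(log_odds_chd(pack_years), 4))
```

Evaluating the function on a fine grid of pack-years and plotting the result gives the continuous curve that the four segment equations form together.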

(Unrestricted) Quadratic Spline: Pack-years and CHD

[Figure: log odds (caseness), -2 to 1, plotted against pack-years, 0 to 150]

[Figure: log odds (caseness), -2 to 0.5, plotted against pack-years, 0 to 165]

Cubic Spline Regression: Log odds (stroke) vs. AHI

3 knots: 0.2, 4.5, 34.1

[Figure: log odds (stroke), -4.50 to -2.50, plotted against AHI, 0 to 50; a second axis runs from 0 to 900]

Cubic Spline Regression: Log odds (stroke) vs. AHI

4 knots: 0.2, 1.4, 11.3, 34.1

[Figure: log odds (stroke), -5.00 to -2.50, plotted against AHI, 0 to 50; a second axis runs from 0 to 1000]

Spline Regression: Applications

Regression Model    Dependent Variable    SAS Procedure
Logistic            log odds (Y=1)        PROC LOGISTIC
Linear              mean Y                PROC REG
Cox                 log (hazard)          PROC PHREG

All models are linear functions of the predictors

Spline Regression (within PROC REG)

Systolic BP vs. AHI, 3 knots: 0.1, 3.6, 29.1

[Figure: systolic BP, 124.0 to 130.0, plotted against AHI, 0 to 50; a second axis runs from 0 to 700]

Systolic BP vs. AHI, 4 knots: 0.1, 1.1, 9.5, 29.1

[Figure: systolic BP, 124.0 to 130.0, plotted against AHI, 0 to 50; a second axis runs from 0 to 700]

Systolic BP vs. AHI, 5 knots: 0.1, 1.1, 3.6, 9.5, 29.1

[Figure: systolic BP, 124.0 to 130.0, plotted against AHI, 0 to 50; a second axis runs from 0 to 700]

Spline Regression: Key Advantages

Less restrictive assumptions

More regional flexibility

Does not rely on statistical hypothesis testing

Not as sensitive to the choice of cutoff points

Visual inspection of the dose-response pattern

Might be used to guide the choice of categories for traditional categorical analysis

Spline Regression: Key Issues

Moderately sensitive to the number of knots (especially if only 3 are specified)

What do the “bumps and valleys” really mean?

Visual (subjective) interpretation

Consider the scale of the Y-axis

Consider the amount of data at the tail(s)

Straight line at the outermost segments
