
Upload: agnes-flynn

Post on 31-Dec-2015


Analyses of Covariance

• Comparing k means adjusting for 1 or more other variables (covariates)

• Ho: μ1 = μ2 = μ3 (adjusting for X)

• Combines ANOVA and regression
– μy = μk + βX

• Assumptions are the same as ANOVA + regression
– Y normally distributed, constant variance across groups
– Slope the same for each group

SAS Code

PROC GLM;
  CLASS group;
  MODEL chol12 = group cholbl / SS3 SOLUTION;
  MEANS group;      /* unadjusted group means */
  LSMEANS group;    /* means adjusted for cholbl */
  ESTIMATE 'Adjusted Mean Dif' group 1 -1;
RUN;

Adjusted Means Computation

YBAR(A)i = YBARi – β(XBARi – XBAR)

Observations

1) If β = 0 then the adjusted mean equals the unadjusted mean

2) If the mean of X is the same for all groups then the adjusted mean equals the unadjusted mean

Computing the Adjusted Means

            12-mo Avg.   Baseline Avg.
Diuretic    231.7        230.7
Placebo     219.7        224.9
Total                    227.0

β = 0.894, the regression slope of 12-month cholesterol on baseline cholesterol

YBAR(A) (Diur) = 231.7 – 0.894 (230.7 – 227.0) = 231.7 – 0.894 (3.7) = 228.4

YBAR(A) (Plac) = 219.7 – 0.894 (224.9 – 227.0) = 219.7 – 0.894 (-2.1) = 221.6

Adjusted mean difference = 228.4 – 221.6 = 6.8
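The adjusted-mean arithmetic can be checked with a few lines of Python. This is only a sketch of the slide's formula, adjusted mean = group mean of Y minus slope times (group mean of X minus overall mean of X). The function and variable names are illustrative and not part of the original SAS analysis. Note that the placebo deviation is 224.9 – 227.0 = -2.1.

```python
def adjusted_mean(ybar_group, xbar_group, xbar_overall, slope):
    """Adjusted mean: YBAR(A)i = YBARi - slope * (XBARi - XBAR)."""
    return ybar_group - slope * (xbar_group - xbar_overall)

slope = 0.894          # regression slope reported on the slide
xbar_overall = 227.0   # overall baseline cholesterol average

# 12-month and baseline averages taken from the slide's table
adj_diuretic = adjusted_mean(231.7, 230.7, xbar_overall, slope)
adj_placebo = adjusted_mean(219.7, 224.9, xbar_overall, slope)

print(f"Adjusted mean (diuretic): {adj_diuretic:.1f}")
print(f"Adjusted mean (placebo):  {adj_placebo:.1f}")
print(f"Adjusted difference:      {adj_diuretic - adj_placebo:.1f}")
```

Setting slope to 0, or giving both groups the same baseline mean, returns the unadjusted means, which matches the two observations about the formula.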

Summary of Analyses of Continuous Variables

• Y is continuous variable

• Estimate a single mean

• Compare 2 means

• Compare k means μ1, μ2, μ3, … μk

• Model means as function of 1 or more variables (LR)
– Y = β0 + β1X1 + β2X2

Summary of Analyses of Continuous Variables

• Hypothesis testing

• Ho: μ = μ0

• Ho: μ1 = μ2

• Ho: μ1 = μ2 = … = μk

• Ho: βj = 0

Summary of Analyses of Continuous Variables

• Confidence intervals

estimate ± 1.96 × SE

Analyses of Binary Outcomes

• Much of bio-medical data relates to analyses of binary outcomes:
– Cancer (yes/no)
– Survival (yes/no)
– Had side-effect (yes/no)
– Currently smoke cigarettes (yes/no)

• Social Sciences:
– Divorced (yes/no)
– Return to prison (yes/no)

• Political:
– Favor a candidate (yes/no)
– State has capital punishment (yes/no)

Analyses of Binary Variables

• Y has two outcomes (yes/no or 1/0)

• Estimate a single proportion

• Compare 2 proportions

• Compare k proportions π1, π2, π3, … πk

• Model probability as function of 1 or more variables – Y = X + X

Binary Outcomes

• Binary outcomes (Y=0 or 1) can be thought of in terms of probabilities:
P(Y=1) = π
P(Y=0) = (1 – π)

• The ratio of P(Y=1) to P(Y=0) is the odds

Odds (Y=1 versus Y=0) = P(Y=1)/P(Y=0) = π/(1 – π)

Example

• Y = 1 indicates your horse winning the race
P(Y=1) = 0.20
P(Y=0) = (1 – 0.20) = 0.80

• What are the odds of winning versus losing?

Odds = P(Winning)/P(Losing) = 0.20/0.80 = 0.25 or ¼

In gambling terms the odds are 4 to 1 against winning.

Relationship Between Probability and Odds

Probability (π)   Odds (o = π/(1 – π))
0.95              19.00
0.50              1.00
0.40              0.67
0.30              0.43
0.20              0.25
0.15              0.18
0.10              0.11
0.05              0.053
0.01              0.0101

For small values the probability and the odds are close in value
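The probability-to-odds conversion is simple enough to sketch in Python. The helper names below are illustrative only. The loop recomputes the odds for the probabilities listed in the table, and the inverse function recovers the probability from the odds.

```python
def prob_to_odds(p):
    """Odds of Y=1 versus Y=0 for a probability p, with 0 <= p < 1."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Inverse conversion: probability implied by the given odds."""
    return odds / (1 + odds)

# Probabilities from the table; odds are recomputed rather than copied
for p in [0.95, 0.50, 0.40, 0.30, 0.20, 0.15, 0.10, 0.05, 0.01]:
    print(f"{p:.2f}  {prob_to_odds(p):.4f}")

# Horse-race example: P(win) = 0.20 gives odds of 0.25, i.e. 1 to 4
print(prob_to_odds(0.20))
```

As p shrinks, the denominator 1 – p approaches 1, which is why the odds and the probability nearly coincide for small values.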

Comparing Two Groups With Binary Outcomes

π1 = probability of Y=1 for group 1

π2 = probability of Y=1 for group 2

Ways to summarize the probability differences:

1) π1 – π2: difference in probabilities

2) π1/π2: ratio of probabilities (Relative Risk)

3) [π1/(1 – π1)] / [π2/(1 – π2)]: ratio of odds (Relative Odds)

Example

Group 1: Smokers
Group 2: Non-smokers
Y = 1 indicates cough upon awakening

π1 = 0.30

π2 = 0.20

Difference = π1 – π2 = 0.10

Relative Risk = π1/π2 = 0.30/0.20 = 1.50

Relative Odds = (.30/.70)/(.20/.80) = 0.429/0.250 = 1.71
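The three summaries for the cough example (smokers 0.30, non-smokers 0.20) can be computed side by side. This is a minimal Python sketch, and the function names are illustrative, not from the slides.

```python
def risk_difference(p1, p2):
    """Difference in probabilities, group 1 minus group 2."""
    return p1 - p2

def relative_risk(p1, p2):
    """Ratio of probabilities, group 1 over group 2."""
    return p1 / p2

def relative_odds(p1, p2):
    """Ratio of odds, group 1 over group 2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

p_smokers, p_nonsmokers = 0.30, 0.20  # probabilities of cough from the example

print(f"Difference:    {risk_difference(p_smokers, p_nonsmokers):.2f}")
print(f"Relative risk: {relative_risk(p_smokers, p_nonsmokers):.2f}")
print(f"Relative odds: {relative_odds(p_smokers, p_nonsmokers):.2f}")
```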

Interpretation of Relative Risks

Group 1: Smokers
Group 2: Non-smokers

• RR = 1.50

• There is a 50% increased risk of cough for smokers compared to non-smokers.

• Smokers are at a 50% increased risk of cough compared to non-smokers.

Interpretation of Relative Risks: Changing the Reference Group

Group 1: Non-smokers
Group 2: Smokers

• RR = 0.67 (1/1.50 or .20/.30)

• There is a 33% decreased risk of cough for non-smokers compared to smokers.

• Non-smokers are at a 33% lower risk of cough compared to smokers.
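Swapping the reference group inverts the relative risk, and the percent change follows from (RR – 1) × 100. A short Python sketch using the cough probabilities (0.30 for smokers, 0.20 for non-smokers) illustrates this. The helper name is hypothetical.

```python
def percent_change_in_risk(rr):
    """Percent increase (positive) or decrease (negative) implied by a relative risk."""
    return (rr - 1) * 100

rr_smokers_vs_non = 0.30 / 0.20            # non-smokers as reference
rr_non_vs_smokers = 1 / rr_smokers_vs_non  # smokers as reference

print(f"RR = {rr_smokers_vs_non:.2f}: {percent_change_in_risk(rr_smokers_vs_non):+.0f}%")
print(f"RR = {rr_non_vs_smokers:.2f}: {percent_change_in_risk(rr_non_vs_smokers):+.0f}%")
```

The two percentages are not mirror images (50% up versus 33% down) because each is expressed relative to a different baseline risk.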

Results: During follow-up, 477 major cardiovascular events were confirmed in the aspirin group, as compared with 522 in the placebo group, for a nonsignificant reduction in risk with aspirin of 9 percent (relative risk, 0.91; 95 percent confidence interval, 0.80 to 1.03; P=0.13). With regard to individual end points, there was a 17 percent reduction in the risk of stroke in the aspirin group, as compared with the placebo group (relative risk, 0.83; 95 percent confidence interval, 0.69 to 0.99; P=0.04), owing to a 24 percent reduction in the risk of ischemic stroke (relative risk, 0.76; 95 percent confidence interval, 0.63 to 0.93; P=0.009) and a nonsignificant increase in the risk of hemorrhagic stroke (relative risk, 1.24; 95 percent confidence interval, 0.82 to 1.87; P=0.31). As compared with placebo, aspirin had no significant effect on the risk of fatal or nonfatal myocardial infarction (relative risk, 1.02; 95 percent confidence interval, 0.84 to 1.25; P=0.83) or death from cardiovascular causes (relative risk, 0.95; 95 percent confidence interval, 0.74 to 1.22; P=0.68). Gastrointestinal bleeding requiring transfusion was more frequent in the aspirin group than in the placebo group (relative risk, 1.40; 95 percent confidence interval, 1.07 to 1.83; P=0.02).

NEJM March 2005: A Randomized Trial of Low-Dose Aspirin in the Primary Prevention of Cardiovascular Disease in Women
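One way to read the abstract is to check, for each end point, whether the 95% confidence interval for the relative risk excludes 1, the value that means no difference between groups. The sketch below uses only the numbers quoted in the abstract. The shortened end-point labels and the helper name are illustrative.

```python
# (end point, relative risk, CI lower, CI upper, P value) as quoted in the abstract
results = [
    ("Major cardiovascular events", 0.91, 0.80, 1.03, 0.13),
    ("Stroke", 0.83, 0.69, 0.99, 0.04),
    ("Ischemic stroke", 0.76, 0.63, 0.93, 0.009),
    ("Hemorrhagic stroke", 1.24, 0.82, 1.87, 0.31),
    ("Myocardial infarction", 1.02, 0.84, 1.25, 0.83),
    ("Death from cardiovascular causes", 0.95, 0.74, 1.22, 0.68),
    ("GI bleeding requiring transfusion", 1.40, 1.07, 1.83, 0.02),
]

def ci_excludes_one(lower, upper):
    """True when the interval lies entirely below or entirely above RR = 1."""
    return upper < 1 or lower > 1

for name, rr, lower, upper, p in results:
    verdict = "excludes 1" if ci_excludes_one(lower, upper) else "includes 1"
    print(f"{name}: RR {rr:.2f} ({lower:.2f}, {upper:.2f}) {verdict}, P={p}")
```

For every end point listed, the interval excludes 1 exactly when the reported P value is below 0.05.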

Relationship Between Relative Risk and Relative Odds

• RO = RR × (1 – π2) / (1 – π1)

• If π1 and π2 are small (<0.10) then

– RO ≈ RR

– Because of this, relative risk and relative odds are sometimes interpreted in the same way

Example: RR = 2.0 and RR = 0.5

• RR = 2.0

π1      π2      Odds Ratio
0.20    0.10    2.25
0.10    0.05    2.11
0.05    0.025   2.05

• RR = 0.5

π1      π2      Odds Ratio
0.10    0.20    0.44
0.05    0.10    0.47
0.025   0.05    0.49
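The identity relating relative odds to relative risk, RO = RR × (1 – group 2 probability) / (1 – group 1 probability), can be used to recompute both tables. This is a minimal Python sketch with an illustrative function name. It takes the relative risk and the group 2 probability and derives the group 1 probability from them.

```python
def odds_ratio_from_rr(rr, p2):
    """Relative odds implied by a relative risk rr and group 2 probability p2."""
    p1 = rr * p2  # group 1 probability implied by the relative risk
    return rr * (1 - p2) / (1 - p1)

for rr, p2_values in [(2.0, [0.10, 0.05, 0.025]), (0.5, [0.20, 0.10, 0.05])]:
    print(f"RR = {rr}")
    for p2 in p2_values:
        p1 = rr * p2
        print(f"  p1={p1:.3f}  p2={p2:.3f}  odds ratio={odds_ratio_from_rr(rr, p2):.2f}")
```

In both tables the odds ratio moves toward the relative risk as the probabilities get smaller, and it is always further from 1 than the relative risk.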

Why Use Ratios

• In most cases the probability of an event depends on the length of time (π is a function of time)

• Using ratios removes time as a factor
λ1t = prob. of developing lung cancer for smokers
λ2t = prob. of developing lung cancer for non-smokers

– RR = λ1t / λ2t = λ1 / λ2

• Using differences does not remove time as a factor
DIF = λ1t – λ2t = (λ1 – λ2)t

Comparing Studies With Different Follow-up Time

• Study 1 follows patients for 5 years:
5λ1 = prob. of developing lung cancer for smokers
5λ2 = prob. of developing lung cancer for non-smokers

– RR = 5λ1 / 5λ2 = λ1 / λ2

• Study 2 follows patients for 30 years:
30λ1 = prob. of developing lung cancer for smokers
30λ2 = prob. of developing lung cancer for non-smokers

– RR = 30λ1 / 30λ2 = λ1 / λ2
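A small numerical sketch makes the point concrete. It assumes, as the slides appear to, that the probability of an event is roughly a per-year rate multiplied by the years of follow-up. The two rates below are hypothetical values chosen only for illustration and are not taken from any study.

```python
# HYPOTHETICAL per-year probabilities of developing lung cancer (illustration only)
rate_smokers = 0.002
rate_nonsmokers = 0.0005

def summarize(years):
    """Return (relative risk, risk difference) after the given follow-up time."""
    p_smokers = rate_smokers * years        # assumes probability = rate x time
    p_nonsmokers = rate_nonsmokers * years
    return p_smokers / p_nonsmokers, p_smokers - p_nonsmokers

for years in (5, 30):
    rr, dif = summarize(years)
    print(f"{years:>2} years: RR = {rr:.2f}, DIF = {dif:.4f}")
```

Under this assumption the relative risk is the same for the 5-year and 30-year studies, while the difference grows in proportion to the follow-up time.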

Hypothesis Testing and Confidence Intervals

• Ho: π1 = π2

• Ha: π1 ≠ π2

• Estimate π1 with p1 = number with condition/total in group 1

• Estimate π2 with p2 = number with condition/total in group 2

• p1 – p2 is the point estimate of π1 – π2

Proportions for two groups

95% CI for difference in proportions:

(p1 – p2) ± 1.96 × SE

SE = sqrt( p1(1 – p1)/n1 + p2(1 – p2)/n2 )

Proportions for two groups

Example
• 50 men with 13 smokers
• 50 women with 10 smokers

p1 = 13/50 = 0.26, p2 = 10/50 = 0.20

SE = sqrt(0.003848 + 0.0032) = 0.084

95% CI for difference = 0.06 ± 1.96*0.084

0.06 ± 0.165

(-0.105, 0.225)

We do not have evidence that the proportion of smokers is different for men and women
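The interval for the smoking example can be reproduced with a short Python sketch. It uses the usual large-sample standard error, the square root of p1(1 – p1)/n1 + p2(1 – p2)/n2, and the 1.96 multiplier. The function name and return layout are illustrative choices.

```python
import math

def diff_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Point estimate, standard error, and CI for p1 - p2 (large-sample formula)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)

# 13 of 50 men and 10 of 50 women smoke
diff, se, (lower, upper) = diff_proportions_ci(13, 50, 10, 50)
print(f"difference = {diff:.2f}, SE = {se:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print("interval includes 0:", lower < 0 < upper)
```

Because the interval includes 0, the data are compatible with no difference between the two proportions.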

Estimating a Single Proportion

Example
• N = 625 sampled; X = number who favor
• X = 300

p = 300/625 = 0.48
SE = SQRT( (0.48)(0.52)/625 ) = 0.020

95% CI: 0.48 ± 1.96*0.020
0.48 ± 0.04
(0.44, 0.52)
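The single-proportion interval follows the same pattern, with standard error equal to the square root of p(1 – p)/N. A minimal Python sketch for the N = 625, X = 300 example follows. The function name is illustrative.

```python
import math

def proportion_ci(x, n, z=1.96):
    """Sample proportion, standard error, and large-sample CI for one proportion."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, (p - z * se, p + z * se)

p, se, (lower, upper) = proportion_ci(300, 625)
print(f"p = {p:.2f}, SE = {se:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```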