One-Factor Experiments & ANCOVA Group 3 Jesse Colton; Junyan Song; Kan He; Lijuan Kang; Minqin Chen; Xiaotong Li; Xin Li ; Yaqi Xue


Post on 24-Feb-2016


Page 1: One-Factor Experiments & ANCOVA

One-Factor Experiments & ANCOVA

Group 3

Jesse Colton; Junyan Song; Kan He; Lijuan Kang; Minqin Chen; Xiaotong Li;Xin Li ; Yaqi Xue

Page 2: One-Factor Experiments & ANCOVA

Outline:

ANOVA
• History and Introduction
• Model and Overall F Test
• Pairwise Test for Group Means
• ANOVA Linear Model and Tests

ANCOVA
• Theoretical Background
• Do ANCOVA by Hand
• Check Assumptions
• Do ANCOVA by SAS

Page 3: One-Factor Experiments & ANCOVA

What Is ANCOVA?

Page 4: One-Factor Experiments & ANCOVA

Definition

• ANOVA stands for Analysis Of Variance.

• ANCOVA stands for Analysis Of Covariance.

• ANCOVA combines aspects of ANOVA and linear regression to compare samples to each other when outside variables (covariates) are involved.

• “One-Factor Experiment” means the experiment involves only a single treatment factor.

Page 5: One-Factor Experiments & ANCOVA

History

Like many of the important topics in statistical analysis, elements of ANOVA/ANCOVA come from the work of R.A. Fisher, and some from Francis Galton.

Page 6: One-Factor Experiments & ANCOVA

History

Ronald Aylmer Fisher (1890-1962)

• British statistician, eugenicist, evolutionary biologist and geneticist
• Fisher “pioneered the principles of the design of experiments and elaborated his studies of analysis of variance.” (Wikipedia)
• He also developed the method of maximum likelihood, and is known for “Fisher’s exact test”

Page 7: One-Factor Experiments & ANCOVA

History

Sir Francis Galton (1822-1911)

• Established the concept of correlation

• He “invented the use of the regression line and was the first to describe and explain the common phenomenon of regression toward the mean.”(Wikipedia)

Page 8: One-Factor Experiments & ANCOVA

Uses

• ANOVA is used to compare the means of two or more groups.
• ANCOVA is used in situations where another variable affects the experiment.
• While we normally use the t-test to compare two group means, there are many situations where it is not applicable or not as useful:
  - More than 2 samples
  - Samples with additional variables
  - Other factors leading to skewed experimental results

Page 9: One-Factor Experiments & ANCOVA

Uses

• When conducting an experiment, there is often an initial difference between test groups.
• ANCOVA “provides a way of measuring and removing the effects of such initial systematic differences between the samples.” (http://vassarstats.net/textbook/ch17pt2.html)
• If you only compare the means, you are not taking into account any prior advantage one group may have.

Page 10: One-Factor Experiments & ANCOVA

Uses

Example: Two methods of teaching a topic are tested on two different groups (A and B). However, the preliminary data show that group A has a higher IQ than group B. The fact that group A had a higher score after learning by one method therefore does not prove that the method is better. ANCOVA adjusts for the difference that existed between the groups before the experiment in order to test which method is better.

Page 11: One-Factor Experiments & ANCOVA

Uses

• By merging ANOVA with linear regression, ANCOVA controls for the effects that covariates we are not studying may have on the outcomes.

ANOVA + Linear Regression → ANCOVA

Page 12: One-Factor Experiments & ANCOVA

Aims of ‘ANOVA’ Models

• Linear models with a continuous response and one or more categorical predictors

• Description: the relation between the response variable (Y) and the predictor variable(s) (X)

• Explanation: how much of the variation in Y is explained by different sources of variation (factors or combinations of factors)

Page 13: One-Factor Experiments & ANCOVA

Completely Randomized Designs

• Experimental designs where there is no restriction on the random allocation of experimental/sampling units to groups or treatments
  - single factor and factorial designs

Page 14: One-Factor Experiments & ANCOVA

Single factor model Completely randomized design
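A sketch of the single-factor model in its usual effects form, assuming the notation used on the later slides (μ for the overall mean, τ_i for the effect of group i, a groups with n_i observations each). The symbols are assumptions chosen to match the hypotheses tested later, not a recovered formula:

```latex
y_{ij} = \mu + \tau_i + \varepsilon_{ij},
\qquad i = 1,\dots,a,\quad j = 1,\dots,n_i,
\qquad \varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} N(0,\sigma^2)
```

Equivalently, y_{ij} = \mu_i + \varepsilon_{ij} with group mean \mu_i = \mu + \tau_i.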

Page 15: One-Factor Experiments & ANCOVA

Terminology

• Factor (categorical predictor variable): usually designated factor A
• Number of observations within each group: n_i
• Each observation: y_ij

Page 16: One-Factor Experiments & ANCOVA

Data layout

Page 17: One-Factor Experiments & ANCOVA

Estimating Model Parameters

Page 18: One-Factor Experiments & ANCOVA

Estimating Model Parameters

Page 19: One-Factor Experiments & ANCOVA

Estimating Model Parameters

Least Square (LS) Estimate
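As a hedged illustration, the least squares estimates for the one-way model y_{ij} = \mu + \tau_i + \varepsilon_{ij} take the following standard form. The side condition \sum_i n_i \tau_i = 0 is an assumption made here so that \mu is identifiable; other constraints give different but equivalent parameterizations:

```latex
\hat{\mu}_i = \bar{y}_{i\cdot} = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij},
\qquad
\hat{\mu} = \bar{y}_{\cdot\cdot} = \frac{1}{N}\sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij},
\qquad
\hat{\tau}_i = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot},
\qquad
\hat{\sigma}^2 = \mathrm{MSE} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{i\cdot}\right)^2}{N-a}
```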

Page 20: One-Factor Experiments & ANCOVA

Estimating Model Parameters

Page 21: One-Factor Experiments & ANCOVA

Analysis of Variance

• Test the hypothesis

H0: μ1 = μ2 = … = μa

Ha: Not all μi are equal

Page 22: One-Factor Experiments & ANCOVA

Analysis of Variance

• Test the hypothesis

H0: τ1 = τ2 = … = τa = 0

H1: At least some τi ≠ 0

Page 23: One-Factor Experiments & ANCOVA

Analysis of Variance

Page 24: One-Factor Experiments & ANCOVA

Analysis of Variance

Page 25: One-Factor Experiments & ANCOVA

Analysis of Variance

Test Statistic
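A minimal sketch of how the overall F statistic can be computed, assuming the usual decomposition into between-group and within-group sums of squares. The function name `one_way_anova` and the toy data are hypothetical and serve only as illustration:

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    a = len(groups)            # number of groups
    n_total = len(all_values)  # N, total number of observations

    # Between-group SS: squared distance of each group mean from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group SS: squared distance of each observation from its group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between = a - 1
    df_within = n_total - a
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


# Hypothetical toy data: three groups of three observations
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(toy_groups)
print(ss_b, ss_w, df_b, df_w, f_ratio)
```

The F ratio is then compared with the upper α critical point of the F distribution with (a − 1, N − a) degrees of freedom, taken from a table or from statistical software.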

Page 26: One-Factor Experiments & ANCOVA

Analysis of Variance

Page 27: One-Factor Experiments & ANCOVA

Analysis of Variance

Page 28: One-Factor Experiments & ANCOVA

Unequal sample sizes

• The sums of squares equations provided only work for equal sample sizes
  - they can be modified for unequal sample sizes, but the result is very clumsy
  - the model comparison approach is simpler (and is used by statistical software)

Page 29: One-Factor Experiments & ANCOVA

Unequal sample sizes

• F-ratio tests are less reliable if sample sizes are different, especially if the variances are also different
  - the bigger the difference in sample sizes, the less reliable the tests become
• Use equal or similar sample sizes if possible
• But don’t omit data to balance sample sizes!

Page 30: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

Reject H0: μ1 = μ2 = … = μa, where a is the number of groups → Not all means are equal.

But which means are significantly different from each other?

We need a more detailed comparison: making multiple tests.

Page 31: One-Factor Experiments & ANCOVA

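Anova: Multiple Comparisons of Means

Making multiple tests: test all pairwise equality hypotheses

H0ij: μi = μj vs. Haij: μi ≠ μj

Number of pairs: $k = \binom{a}{2} = \frac{a(a-1)}{2}$

Using a two-sided t-test at level α, reject H0ij if

$$|T_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{S\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} > t_{N-a,\,\alpha/2}$$

where

$$S^2 = \mathrm{MSE} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2}{N - a}, \qquad N = \sum_{i=1}^{a} n_i$$

$n_i$ is the number of observations in group i and $\bar{y}_i$ is the mean of the observed values of group i. The t critical value has N − a degrees of freedom because $S^2$ pools the variability of all a groups.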
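The pairwise statistics can be sketched as follows, assuming the pooled estimate S² = MSE computed from all groups. The function name `pairwise_t` and the toy data are hypothetical; critical values are left to a t table or statistical software:

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pairwise_t(groups):
    """Return {(i, j): T_ij} for all pairs of groups (1-based labels)."""
    n_total = sum(len(g) for g in groups)
    a = len(groups)
    # Pooled variance estimate S^2 = MSE, using every group
    mse = sum((y - mean(g)) ** 2 for g in groups for y in g) / (n_total - a)
    s = sqrt(mse)

    stats = {}
    for i, j in combinations(range(a), 2):
        gi, gj = groups[i], groups[j]
        standard_error = s * sqrt(1 / len(gi) + 1 / len(gj))
        stats[(i + 1, j + 1)] = (mean(gi) - mean(gj)) / standard_error
    return stats


# Hypothetical toy data: three groups of three observations
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
t_stats = pairwise_t(toy_groups)
for pair, t_value in t_stats.items():
    print(pair, round(t_value, 3))
```

Each |T_ij| is then compared with the chosen critical value; which critical value to use is exactly what the multiple comparison methods decide.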

Page 32: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

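Least Significant Difference (LSD):

$$|T_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{S\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} > t_{N-a,\,\alpha/2}
\iff
|\bar{y}_i - \bar{y}_j| > t_{N-a,\,\alpha/2}\, S\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

The critical value

$$\mathrm{LSD} = t_{N-a,\,\alpha/2}\, S\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

is the value that the difference $|\bar{y}_i - \bar{y}_j|$ must exceed in order to be significant at level $\alpha$.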

Page 33: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

Familywise Error Rate (FWE):Type I error probability of declaring at least one pairwise difference to be falsely significant.

FWE=P{Reject at least one true null hypothesis}

If each test is done at level α, then the FWE will exceed α. Why?

Page 34: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

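Let $A_i$ denote the event of rejecting a true null hypothesis in the $i$th test, where the total number of tests is $k = \binom{a}{2}$.

$P(A_i) = \alpha$ (the type I error of each test).

$\mathrm{FWE} = P(A_1 \cup \dots \cup A_k) = P\left(\bigcup_{i=1}^{k} A_i\right)$

If the $A_i$ are independent of each other, $\mathrm{FWE} = 1 - (1 - \alpha)^k \ge \alpha$. In general, $\alpha \le \mathrm{FWE} \le \sum_{i=1}^{k} P(A_i) = k\alpha$.

Our goal is to control $\mathrm{FWE} \le \alpha$.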

Page 35: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

Two Methods:
• Bonferroni Method
• Tukey Method

Page 36: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

Bonferroni Method

• Idea: To perform k tests simultaneously, divide the FWE α among the k tests. If the error rate is allocated equally among the k tests, then each test is done at level α/k.

For example, with α = 0.05 and k = 10, each test is done at level 0.05/10 = 0.005.
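A small numeric sketch of why the correction matters. It assumes, purely for illustration, that the k tests are independent, in which case the familywise error rate is 1 − (1 − α)^k. The choice of a = 5 groups is hypothetical and is picked only because it gives the k = 10 pairs of the example:

```python
from math import comb


def familywise_error(alpha_per_test, k):
    """FWE for k independent tests, each run at level alpha_per_test."""
    return 1 - (1 - alpha_per_test) ** k


a = 5            # hypothetical number of groups
k = comb(a, 2)   # number of pairwise comparisons: a(a-1)/2 = 10
alpha = 0.05

fwe_unadjusted = familywise_error(alpha, k)            # each test at level alpha
bonferroni_level = alpha / k                           # each test at level alpha/k
fwe_bonferroni = familywise_error(bonferroni_level, k)

print(f"pairs: {k}")
print(f"FWE with each test at {alpha}: {fwe_unadjusted:.3f}")
print(f"FWE with each test at {bonferroni_level}: {fwe_bonferroni:.4f}")
```

Without adjustment the familywise error rate is roughly 0.40; with each test run at α/k it stays just below 0.05. The Bonferroni bound FWE ≤ kα itself needs no independence assumption.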

Page 37: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

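Bonferroni Method

• Test: H0ij: μi = μj vs. Haij: μi ≠ μj

At FWE = α, we reject H0ij if

$$|T_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} > t_{N-a,\,\alpha/(2k)}$$

where $k = \binom{a}{2}$ is the number of tests and

$$s^2 = \mathrm{MSE} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2}{N - a}$$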

Page 38: One-Factor Experiments & ANCOVA

Anova— Multiple Comparisons of Means

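Tukey Method

H0ij: μi = μj vs. Haij: μi ≠ μj

At FWE = α, we reject H0ij if

$$|t_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} > \frac{q_{a,\,N-a,\,\alpha}}{\sqrt{2}}$$

where $q_{a,\,N-a,\,\alpha}$ is the upper α critical point of the Studentized range distribution for a groups and N − a degrees of freedom.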

Page 39: One-Factor Experiments & ANCOVA
Page 40: One-Factor Experiments & ANCOVA

Dummy Variable:
A dummy variable is an artificial variable created to represent an attribute with two or more distinct categories/levels.

How to create a Dummy Variable:
The number of dummy variables necessary to represent a single attribute variable is equal to the number of levels (categories) k in that variable minus one, i.e. k − 1.

Page 41: One-Factor Experiments & ANCOVA

Gender: Male & Female

Categories   D1
Male         1
Female       0

Rank: Assistant & Associate & Full

Categories   D1   D2
Assistant    1    0
Associate    0    1
Full         0    0
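The coding in the two tables can be sketched as a small helper. The function name `dummy_code` is hypothetical, and treating the last listed level as the reference category (all zeros) is an assumption that matches the tables above:

```python
def dummy_code(value, levels):
    """Code `value` as k-1 dummy variables; the last level is the reference."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    # One indicator per non-reference level; the reference level gets all zeros
    return [1 if value == level else 0 for level in levels[:-1]]


gender_levels = ["Male", "Female"]
rank_levels = ["Assistant", "Associate", "Full"]

for gender in gender_levels:
    print(gender, dummy_code(gender, gender_levels))
for rank in rank_levels:
    print(rank, dummy_code(rank, rank_levels))
```

A variable with k levels always yields k − 1 indicators, so Gender needs one dummy variable and Rank needs two.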

Page 42: One-Factor Experiments & ANCOVA

ANOVA Models (a multiple regression with all categorical predictors)

General Linear Model: Y = β0 + β1 D1 + … + β(a−1) D(a−1) + ε, where D1, …, D(a−1) are dummy variables

Page 43: One-Factor Experiments & ANCOVA

Relationship between these Models:

constraint

Page 44: One-Factor Experiments & ANCOVA

Note: μ is the grand mean, but in the last case it is the mean of Group 3.

τ1, τ2 are also different from those in the last case.

Page 45: One-Factor Experiments & ANCOVA

The Interpretation differs depending on which constraint we apply.

β1: Group one mean − Group three mean

β2: Group two mean − Group three mean

β1 − β2: Group one mean − Group two mean
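A sketch that checks this interpretation numerically: fitting the dummy-coded regression by least squares and comparing the coefficients with differences of group means. It assumes the last group is the reference category; the helper names and the toy data are hypothetical:

```python
from statistics import mean


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_dummy_regression(groups):
    """Least squares fit of y = b0 + b1*D1 + ... with the last group as reference."""
    a = len(groups)
    rows, ys = [], []
    for i, g in enumerate(groups):
        for y in g:
            # Design row: intercept followed by a-1 dummy indicators
            rows.append([1.0] + [1.0 if i == d else 0.0 for d in range(a - 1)])
            ys.append(float(y))
    # Normal equations (X'X) b = X'y
    xtx = [[sum(r[u] * r[v] for r in rows) for v in range(a)] for u in range(a)]
    xty = [sum(r[u] * y for r, y in zip(rows, ys)) for u in range(a)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: three groups of three observations
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
betas = fit_dummy_regression(toy_groups)
group_means = [mean(g) for g in toy_groups]
print(betas)        # intercept, then the two dummy coefficients
print(group_means)
```

With this coding the intercept equals the mean of the reference group and each dummy coefficient equals that group's mean minus the reference mean.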

Page 46: One-Factor Experiments & ANCOVA

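How do we test ANOVA in terms of the General Linear Model?

1. Overall F-Test

H0: μ1 = μ2 = … = μa

H0: β1 = β2 = … = β(a−1) = 0

Recall the test for multiple regression coefficients:
Reduced Model: Y = β0 + ε
Full Model: Y = β0 + β1 D1 + … + β(a−1) D(a−1) + ε

p: number of parameters in H0. Here p = a − 1.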

Page 47: One-Factor Experiments & ANCOVA

Recall Test for ANOVA in terms of Model

General Linear Model

We reject H0 when F > f(a−1, N−a, α). So the overall tests of ANOVA under both models are consistent.

Page 48: One-Factor Experiments & ANCOVA

2. Test for individual regression coefficient (Pairwise Test for Group Means)

H0 differs depending on the coding of the dummy variables.

For example:

H0: T test

F test:

Full Model:

Reduced Model:

Page 49: One-Factor Experiments & ANCOVA

ANCOVA Models (a multiple regression with continuous predictors and dummy-coded factors)

Predictors: continuous variables and dummy variables

Page 50: One-Factor Experiments & ANCOVA

Overall Test for ANCOVA in terms of Linear Model:

H0:

H0:

Page 51: One-Factor Experiments & ANCOVA

What is Analysis of Covariance?

• An analysis procedure for looking at group effects on a continuous outcome when some other continuous explanatory variable also has an effect on the outcome.
• Generally, ANCOVA has one or more categorical independent variables and one or more covariates. It can be seen as multiple regression with one or more covariates and one or more dummy-coded factors.

Page 52: One-Factor Experiments & ANCOVA

Why include Covariates in ANOVA

• To reduce within-group error variance: covariates explain part of the otherwise unexplained variance, which reduces the error variance and increases statistical power.
• Elimination of confounds: if a variable that influences the dependent variable can be measured, ANCOVA is a good way to partial out its effect.

Page 53: One-Factor Experiments & ANCOVA

Assumptions of ANCOVA

• Normality of residuals
• Homogeneity of variances
• Independence of error terms
• Linearity of regression
• Homogeneity of regression slopes
• Independence of covariates and treatment effect

Page 54: One-Factor Experiments & ANCOVA

Homogeneity of Regression

Page 55: One-Factor Experiments & ANCOVA

Test the Homogeneity of Regression

• Run the ANCOVA model including the independent variables and the interaction term

• If the interaction term is significant, the assumption is violated.

• If the interaction term is not significant, rerun the model without the interaction term.

Page 56: One-Factor Experiments & ANCOVA

General Linear Model of ANCOVA

• Yij = GMY + αi + [βi(Ci – Mij) + …… ] + εij

where
  - Yij: a continuous dependent variable
  - GMY: grand mean of the dependent variable
  - αi: treatment effect
  - βi: regression coefficient for the ith covariate
  - Ci: known covariate
  - εij: error, N(0, σ²)

Page 57: One-Factor Experiments & ANCOVA

General Linear Model of ANCOVA

• Yij - [βi(Ci – Mij) + …… ] = GMY + αi + εij

Adjusted Yij = GMY + αi + εij

• The adjusted dependent variable is the dependent variable after its relationship with the covariates has been partialed out.

The left-hand side is the adjusted continuous dependent variable; the right-hand side is the same as the ANOVA model.

Page 58: One-Factor Experiments & ANCOVA

How to calculate Regression coefficient?

• The numerator is the sum of co-deviates of X and Y within the groups (SCwg)

• The denominator is the sum of squared deviates of X within the groups (SSwg(X))

• Their ratio is the estimate of the regression coefficient (β hat)

Page 59: One-Factor Experiments & ANCOVA

F test in ANCOVA

• F test in ANCOVA is same as that in ANOVA, the only difference is that now we are using the adjusted values of SSbg(Y) and SSwg(Y), along with adjusted value of df.

• If it is significant, the group means statistically differ after controlling for the effect of 1+ covariates

Page 60: One-Factor Experiments & ANCOVA

Abbreviation

• SS: sum of squared deviates
• SC: sum of co-deviates
• SST: total sum of squared deviates
• SSWG: sum of squared deviates within groups
• SSBG: sum of squared deviates between groups
• SCT: total sum of co-deviates
• SCWG: sum of co-deviates within groups
• SCBG: sum of co-deviates between groups

Page 61: One-Factor Experiments & ANCOVA

ANCOVA

http://vassarstats.net/textbook/ch17pt2.html

Example:

Comparing two methods of Hypnotic Induction

Page 62: One-Factor Experiments & ANCOVA

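Items to calculate

For the Dependent Variable Y

$SS_{T(Y)} = \sum Y_i^2 - \frac{(\sum Y_i)^2}{N_T}$

$SS_{wg(Y)} = \left(\sum Y_{ai}^2 - \frac{(\sum Y_{ai})^2}{N_a}\right) + \left(\sum Y_{bi}^2 - \frac{(\sum Y_{bi})^2}{N_b}\right)$

$SS_{bg(Y)} = SS_{T(Y)} - SS_{wg(Y)}$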

Page 63: One-Factor Experiments & ANCOVA

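Items to calculate

For the Covariate X

$SS_{T(X)} = \sum X_i^2 - \frac{(\sum X_i)^2}{N_T}$

$SS_{wg(X)} = \left(\sum X_{ai}^2 - \frac{(\sum X_{ai})^2}{N_a}\right) + \left(\sum X_{bi}^2 - \frac{(\sum X_{bi})^2}{N_b}\right)$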

Page 64: One-Factor Experiments & ANCOVA

Calculations

Page 65: One-Factor Experiments & ANCOVA

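Items to calculate

For the Covariance of X and Y (sum of the co-deviates)

$SC = \sum (X_i - M_X)(Y_i - M_Y) = \sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{N}$ (general form)

$SC_{wg} = SC_{wg(a)} + SC_{wg(b)}$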
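The quantities above can be sketched with the computational formulas ΣY² − (ΣY)²/N and ΣXY − (ΣX)(ΣY)/N. The helper names and the tiny two-group data set are hypothetical; they are not the hypnotic induction data:

```python
def sum_of_squares(values):
    """Sum of squared deviates: sum(v^2) - (sum v)^2 / n."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n


def sum_of_codeviates(xs, ys):
    """Sum of co-deviates: sum(x*y) - (sum x)(sum y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


# Hypothetical data for two groups (covariate X, dependent variable Y)
x_a, y_a = [1, 2, 3], [2, 4, 6]
x_b, y_b = [4, 5, 6], [5, 7, 9]
x_all, y_all = x_a + x_b, y_a + y_b

ss_t_x = sum_of_squares(x_all)
ss_wg_x = sum_of_squares(x_a) + sum_of_squares(x_b)
ss_t_y = sum_of_squares(y_all)
ss_wg_y = sum_of_squares(y_a) + sum_of_squares(y_b)
ss_bg_y = ss_t_y - ss_wg_y
sc_t = sum_of_codeviates(x_all, y_all)
sc_wg = sum_of_codeviates(x_a, y_a) + sum_of_codeviates(x_b, y_b)

print(ss_t_x, ss_wg_x, ss_t_y, ss_wg_y, ss_bg_y, sc_t, sc_wg)
```

Within-group quantities are computed group by group and then added, while the total quantities pool all observations first.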

Page 66: One-Factor Experiments & ANCOVA

Calculations

Page 67: One-Factor Experiments & ANCOVA

4. The Final Set of Calculations

A summary of the values we obtained so far:

X                  Y                  Covariance
SST(X) = 908.9     SST(Y) = 668.5     SCT = 625.9
SSwg(X) = 788.9    SSwg(Y) = 662.5    SCwg = 652.8
                   SSbg(Y) = 6.0

Page 68: One-Factor Experiments & ANCOVA

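4a. Adjustment of SST(Y)

The overall correlation between X and Y:

$r_T = \frac{SC_T}{\sqrt{SS_{T(X)} \times SS_{T(Y)}}} = \frac{625.9}{\sqrt{908.9 \times 668.5}} = +.803$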

Page 69: One-Factor Experiments & ANCOVA

The proportion of the total variability of Y attributable to its covariance with X is accordingly (rT)² = (+.803)² = .645

Page 70: One-Factor Experiments & ANCOVA

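We adjust SST(Y) by removing from it this proportion of covariance. Since SST(Y) = 668.5,

$[adj]SS_{T(Y)} = SS_{T(Y)} - \frac{(SC_T)^2}{SS_{T(X)}} = 668.5 - \frac{(625.9)^2}{908.9} = 237.5$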

Page 71: One-Factor Experiments & ANCOVA

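4b. Adjustment of SSwg(Y)

The overall correlation between X and Y within the two groups:

$r_{wg} = \frac{SC_{wg}}{\sqrt{SS_{wg(X)} \times SS_{wg(Y)}}} = \frac{652.8}{\sqrt{788.9 \times 662.5}} = +.903$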

Page 72: One-Factor Experiments & ANCOVA

The proportion of the within-groups variability of Y attributable to covariance with X is therefore (rwg)² = (+.903)² = .815

Page 73: One-Factor Experiments & ANCOVA

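We adjust SSwg(Y) by removing from it this proportion of covariance. Since SSwg(Y) = 662.5,

$[adj]SS_{wg(Y)} = SS_{wg(Y)} - \frac{(SC_{wg})^2}{SS_{wg(X)}} = 662.5 - \frac{(652.8)^2}{788.9} = 122.3$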

Page 74: One-Factor Experiments & ANCOVA

4c. Adjustment of SSbg(Y)

The adjusted value of SSbg(Y) can then be obtained through simple subtraction as

$[adj]SS_{bg(Y)} = [adj]SS_{T(Y)} - [adj]SS_{wg(Y)} = 237.5 - 122.3 = 115.2$
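Steps 4a to 4c can be retraced from the summary values alone. This is a sketch that assumes the adjustment SS(Y) − SC²/SS(X), which is algebraically the same as removing the proportion r² from SS(Y); the function name `adjust_ss` is hypothetical:

```python
from math import sqrt

# Summary values from step 4
ss_t_x, ss_wg_x = 908.9, 788.9
ss_t_y, ss_wg_y = 668.5, 662.5
sc_t, sc_wg = 625.9, 652.8


def adjust_ss(ss_y, sc, ss_x):
    """Remove from SS(Y) the part attributable to covariance with X."""
    return ss_y - sc ** 2 / ss_x


# Correlations: overall and within groups
r_total = sc_t / sqrt(ss_t_x * ss_t_y)
r_wg = sc_wg / sqrt(ss_wg_x * ss_wg_y)

adj_ss_t = adjust_ss(ss_t_y, sc_t, ss_t_x)      # 4a
adj_ss_wg = adjust_ss(ss_wg_y, sc_wg, ss_wg_x)  # 4b
adj_ss_bg = adj_ss_t - adj_ss_wg                # 4c, by subtraction

print(round(r_total, 3), round(r_wg, 3))
print(round(adj_ss_t, 1), round(adj_ss_wg, 1), round(adj_ss_bg, 1))
```

Note how the between-groups sum of squares grows from 6.0 before adjustment to about 115 after it: most of the original within-group variability was shared with the covariate.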

Page 75: One-Factor Experiments & ANCOVA

4d Adjustment of Means of Y for Groups A and B

Purpose: Adjust the group means of Y to the same starting point, using the aggregate correlation between X and Y within the two groups.

Page 76: One-Factor Experiments & ANCOVA

Recall for Linear Regression:

By Least Square Method:

We can get:

Page 77: One-Factor Experiments & ANCOVA

An increase of 1 unit in X is associated with an average increase of .83 units in Y.

Page 78: One-Factor Experiments & ANCOVA

bwg = .83

Overall mean of X: 15.55

Group   Mx     Deviation from 15.55   Original My   Adjusted My
A       13.1   2.45 below             29.2          29.2 + 2.45(.83) = 31.23
B       18.0   2.45 above             28.1          28.1 − 2.45(.83) = 26.07
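The adjustment of the group means can be sketched as below. It assumes the within-groups slope bwg = SCwg / SSwg(X) computed from the summary values (652.8 and 788.9) and shifts each group to the overall covariate mean of 15.55; the function name `adjusted_mean` is hypothetical:

```python
sc_wg, ss_wg_x = 652.8, 788.9
b_wg = sc_wg / ss_wg_x  # within-groups regression slope, about .83

overall_mean_x = 15.55
group_means = {"A": (13.1, 29.2), "B": (18.0, 28.1)}  # (Mx, My) per group


def adjusted_mean(mean_y, mean_x, overall_x, slope):
    """Move a group's mean of Y to where it would be at the overall mean of X."""
    return mean_y - slope * (mean_x - overall_x)


adjusted = {
    name: adjusted_mean(my, mx, overall_mean_x, b_wg)
    for name, (mx, my) in group_means.items()
}
print(round(b_wg, 2))
for name, value in adjusted.items():
    print(name, round(value, 2))
```

Group A started 2.45 units below the overall covariate mean, so its mean of Y is adjusted upward; group B started 2.45 units above, so its mean is adjusted downward.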

Page 79: One-Factor Experiments & ANCOVA

Linear Model for ANCOVA:

$$Y_{ij} = \mu + \tau_i + \beta\,(X_{ij} - \bar{X}_{..}) + \varepsilon_{ij}$$

Page 80: One-Factor Experiments & ANCOVA

[adjusted Yij]

Linear Model for ANOVA

Thus, as with the corresponding one-way ANOVA, the final step in a one-way analysis of covariance involves the calculation of an F-ratio of the general form.

Page 81: One-Factor Experiments & ANCOVA

4e. Analysis of Covariance Using Adjusted Values of SS

We have to use the adjusted values of SSbg(Y) and SSwg(Y), along with one adjusted value of df.

Page 82: One-Factor Experiments & ANCOVA

adjusted dfwg(Y) = NT − k − 1, where
• NT: total number of Y values
• k: number of groups
• 1: number of independent variables (covariates)
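A sketch of the final F-ratio using the adjusted sums of squares. The total sample size is not stated on these slides, so `n_total = 20` (ten subjects per group) is an assumption made only for illustration; the function name `ancova_f` is hypothetical:

```python
def ancova_f(adj_ss_bg, adj_ss_wg, n_total, n_groups, n_covariates=1):
    """Return (df_bg, df_wg, F) for a one-way ANCOVA from adjusted SS values."""
    df_bg = n_groups - 1
    # One extra degree of freedom is lost for each covariate
    df_wg = n_total - n_groups - n_covariates
    f_ratio = (adj_ss_bg / df_bg) / (adj_ss_wg / df_wg)
    return df_bg, df_wg, f_ratio


# Adjusted sums of squares recomputed from the summary values of step 4
adj_ss_t = 668.5 - 625.9 ** 2 / 908.9
adj_ss_wg = 662.5 - 652.8 ** 2 / 788.9
adj_ss_bg = adj_ss_t - adj_ss_wg

# ASSUMPTION: 20 observations in total; the slides do not give this number
df_bg, df_wg, f_ratio = ancova_f(adj_ss_bg, adj_ss_wg, n_total=20, n_groups=2)
print(df_bg, df_wg, round(f_ratio, 2))
```

Under that assumed sample size the test has 1 and 17 degrees of freedom; a different N changes only the within-groups degrees of freedom and therefore the mean square in the denominator.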

Page 83: One-Factor Experiments & ANCOVA

Interpretation:

Page 84: One-Factor Experiments & ANCOVA

Summary:

ANCOVA begins → Four sets of calculations → Get rid of the covariate from SS(Y) & Mean(Y) → ANOVA F test → Interpretation

Page 85: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

ANCOVA GLM

Page 86: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

Full model: the model involving all x’s
Reduced model: the model involving only those x’s from the full model whose β coefficients are not hypothesized to be 0.

Test statistic (T.S.): compares the full model with the reduced model

Page 87: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

• No interaction between the factor and the covariate. Interaction between 2 independent variables is present when the effect of one on the outcome depends on the value of the other.

• The slope terms of the within-group regressions do not differ
• The regression lines of the different groups are parallel.
• Group 1:
• Group 2:

Page 88: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

With interaction

Group 1: y = β0 + β2 c_i

Group 2: y = β0 + β1 + (β2 + β3) c_i

Slopes not equal: β2 vs. β2 + β3

Page 89: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

Testing the Interaction for Significance

Interaction of interest: the interaction between the covariate and the dummy variable.

FULL MODEL (with the interaction term): k = 3

REDUCED MODEL (without the interaction term): g = 2
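A hedged sketch of the full versus reduced comparison. It assumes one dummy variable d for the group, one covariate c, and n observations; the symbols d, c and SSE are assumptions introduced here for illustration, with k and g counting the predictors in the full and reduced models:

```latex
\text{Full model: } y = \beta_0 + \beta_1 d + \beta_2 c + \beta_3 (d \cdot c) + \varepsilon \quad (k = 3)
\\[4pt]
\text{Reduced model: } y = \beta_0 + \beta_1 d + \beta_2 c + \varepsilon \quad (g = 2)
\\[4pt]
H_0: \beta_3 = 0
\qquad
F = \frac{\left(\mathrm{SSE}_{\text{reduced}} - \mathrm{SSE}_{\text{full}}\right)/(k - g)}{\mathrm{SSE}_{\text{full}}/\left(n - (k + 1)\right)}
\qquad
\text{reject } H_0 \text{ if } F > f_{k-g,\; n-(k+1),\; \alpha}
```

Rejecting H0 means the slopes differ between the groups, so the homogeneity of regression slopes assumption does not hold.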

Page 90: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

Example:

Page 91: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

y: the response, i.e. anxiety score
c: the drug dose
Drug A:
Drug B:

Page 92: One-Factor Experiments & ANCOVA

ANCOVA Assumptions

Reject H0: anxiety level increases at different rates for drug A and drug B as the drug dose is increased.

Page 93: One-Factor Experiments & ANCOVA

SAS Implementation

Page 94: One-Factor Experiments & ANCOVA

SAS code

1. Initial data exploration

proc contents data=Instruction;
run;

proc means data=Instruction N MEAN STD MAXDEC=2;
  class method;
  var prescore postscore;
run;

proc freq data=instruction;
  tables method;
run;

proc sgplot data=instruction;
  reg x=Prescore y=PostScore / group=method;
run;

Page 95: One-Factor Experiments & ANCOVA

SAS code

2. ANOVA Model

PROC GLM DATA=Instruction;
  CLASS method;   /* note: there is no CLASS statement in PROC REG */
  MODEL PostScore = method / solution;
  MEANS method / deponly;
RUN;
QUIT;

We can also use:

PROC ANOVA data=instruction;
  class method;
  model postscore = method;
  means method / Tukey;
run;
quit;

Page 96: One-Factor Experiments & ANCOVA

SAS code

Another way: creating dummy variables for Method

data instruction_dummy;
  set instruction;
  /** create dummy variables **/
  if method="A" then do; dummy1=0; dummy2=0; end;
  if method="B" then do; dummy1=1; dummy2=0; end;
  if method="C" then do; dummy1=0; dummy2=1; end;
run;

TITLE "Regression model for Instruction method dataset";
PROC GLM DATA=Instruction_dummy;
  MODEL PostScore = dummy1 dummy2 / solution;
RUN;
QUIT;

Now we don’t need a CLASS statement in PROC GLM.

Page 97: One-Factor Experiments & ANCOVA

ANOVA output

Fail to reject the null hypothesis: H0:

Missing line because there are only two dummy variables

Fail to reject the null hypotheses: H0: H0:

Page 98: One-Factor Experiments & ANCOVA

SAS code

3. ANCOVA Model

ods graphics on;

proc glm data=instruction plots=meanplot(cl);
  class method;
  model PostScore = method PreScore / solution;   /* include covariate x: PreScore */
  lsmeans method / pdiff;
  output out=out p=yhat r=resid stdr=eresid;
run;
quit;

ods graphics off;

Page 99: One-Factor Experiments & ANCOVA

ANCOVA output

Accept the alternative hypothesis: H1: at least one

β2 ≠ 0; μ2 ≠ μ3

is almost 0, we may expect

Covariate X is significant

Page 100: One-Factor Experiments & ANCOVA

ANCOVA output

Adjusted Means:

Page 101: One-Factor Experiments & ANCOVA

SAS code

4. Checking the homogeneity of slopes

/** 1) perform an analysis that shows the slope of each of the lines **/
PROC SORT DATA=instruction;
  BY method;
RUN;

PROC GLM DATA=instruction;
  BY method;
  MODEL PostScore = PreScore / SOLUTION;
RUN;
QUIT;

/** 2) the method*prescore effect tests if the three slopes are equal **/
PROC GLM DATA=instruction;
  CLASS method;
  MODEL PostScore = method PreScore method*PreScore;   /* include interaction term */
RUN;
QUIT;

Interaction term is not significant: assumption met

Page 102: One-Factor Experiments & ANCOVA

Comparing before and after adjusted means

Page 103: One-Factor Experiments & ANCOVA

Acknowledgements:
• http://www.ats.ucla.edu/stat/sas/library/hetreg.htm
• http://www.unt.edu/rss/class/mike/6810/ANCOVA.pdf
• http://www.stat.cmu.edu/~hseltman/309/Book/chapter10.pdf
• SAS/STAT(R) 9.22 User's Guide
• Textbook: Statistics and Data Analysis from Elementary to Intermediate