l12_anova

Page 1: L12_anova

Analysis of Variance

David Chow

Nov 2014

Chap 11-1

Page 2: L12_anova

Learning Objectives

In this chapter, you learn:
• The basic concepts of experimental design
• How to use one-way analysis of variance (ANOVA)
• How to use two-way analysis of variance and interpret the interaction effect

Page 3: L12_anova

General ANOVA Setting

• The researcher designs an experiment, collects data, and draws conclusions
• The researcher controls one or more factors of interest and observes the effects on the dependent variable
• Main question: Are the groups (populations) the same?
• Each factor (independent variable) contains two or more treatments (levels)
• Levels can be numerical or categorical
• Different levels give different groups, with each group representing a population

Page 4: L12_anova

Completely Randomized Design

• CRD is the simplest experimental design
• Only one factor under consideration
• Test subjects (assumed to be homogeneous) are randomly assigned to different treatment levels

Eg: A medical experiment

Treatment (level)   Placebo (P)   Vaccine (V)
                    300           300

• Subjects are randomly assigned to get one treatment (either P or V)
• Dependent variable = no. of colds reported
• Fewer colds in the vaccine group?
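The random assignment step can be sketched in a few lines of Python. This is only an illustration: it assumes the 300 shown under each treatment is the group size, so 600 subjects in total, and the subject ids and the helper name `randomly_assign` are made up for the example.

```python
import random

def randomly_assign(subject_ids, treatments, seed=None):
    """Shuffle the subjects, then split them evenly across the treatments."""
    rng = random.Random(seed)          # seeded only so the run is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    group_size = len(shuffled) // len(treatments)
    return {
        t: shuffled[k * group_size:(k + 1) * group_size]
        for k, t in enumerate(treatments)
    }

# Hypothetical subject ids 0..599: 300 get the placebo (P), 300 the vaccine (V).
groups = randomly_assign(range(600), ["P", "V"], seed=1)
print({t: len(ids) for t, ids in groups.items()})
```

Because the shuffle ignores every characteristic of the subjects, any systematic difference in colds between the two groups can be attributed to the treatment rather than to how the groups were formed.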

Page 5: L12_anova

One-Way ANOVA: Assumptions

• Evaluate the difference among the means of three or more groups
  Eg1: Accident rates for 1st, 2nd, and 3rd shift
  Eg2: Expected mileage for five brands of tires
• Assumptions
  Populations are normally distributed
  Populations have equal variances
  Samples are randomly and independently drawn

Page 6: L12_anova

Setting Hypotheses

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
• All population means are equal (c = no. of groups)
• i.e., no factor effect

$H_1$: Not all of the population means are the same
• At least one pair with different population means
• i.e., there is a factor effect

Page 7: L12_anova

Graphical Presentation

[Figure: the three group distributions when H0 is true ($\mu_1 = \mu_2 = \mu_3$) and, in two alternative panels, when H0 is NOT true (the three means are not all equal)]

Page 8: L12_anova

Idea: Partitioning the Variation

Total variation can be split into two parts:

SST = SSA + SSW

SST = Total Sum of Squares (total variation)
SSA = Sum of Squares Among Groups (among-group variation, due to factor)
SSW = Sum of Squares Within Groups (within-group variation, due to ____)
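A quick numerical check of the partition, using the sum-of-squares definitions from the appendix. The three small groups below are made-up toy numbers, chosen only so the arithmetic is easy to follow, and `sums_of_squares` is a name invented for this sketch.

```python
def sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total: every observation against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Among: each group mean against the grand mean, weighted by group size
    ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: every observation against its own group mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return sst, ssa, ssw

toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical data
sst, ssa, ssw = sums_of_squares(toy)
print(sst, ssa, ssw)   # 60.0 54.0 6.0, and 54 + 6 = 60
```

Here most of the total variation sits among the groups (54 out of 60), which is the pattern that leads to a large F statistic.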

Page 9: L12_anova

Obtaining the Mean Squares

The mean squares are obtained by dividing the various sums of squares by their associated degrees of freedom.

Mean Square Among (d.f. = c-1):   $MSA = \frac{SSA}{c-1}$
Mean Square Within (d.f. = n-c):  $MSW = \frac{SSW}{n-c}$
Mean Square Total (d.f. = n-1):   $MST = \frac{SST}{n-1}$

Page 10: L12_anova

One-Way ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square (Variance)   F
Among Groups          SSA              c - 1                MSA = SSA / (c - 1)      FSTAT = MSA / MSW
Within Groups         SSW              n - c                MSW = SSW / (n - c)
Total                 SST              n - 1

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
df1 = c - 1
df2 = n - c

Page 11: L12_anova

Interpreting F Statistic

• The F statistic is the ratio of two variance estimates: among groups to within groups
• The ratio must always be positive
• df1 = c - 1 will typically be small
• df2 = n - c will typically be large

One-tail F-test decision rule:
Reject H0 if FSTAT > Fα, otherwise do not reject H0

[Figure: F distribution starting at 0; "Do not reject H0" to the left of Fα, "Reject H0" (area α) to the right]

Page 12: L12_anova

Eg: Are the Clubs Different?

When three different golf clubs are used, they hit the ball different distances. You randomly select five measurements for each club. At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204

Computations by EXCEL

Page 13: L12_anova

Excel Output

SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14
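The SS, df, MS and F columns of the Excel output can be reproduced by hand. The sketch below uses only the golf club distances from the example; the function name `one_way_anova` and the dictionary layout are choices made for this illustration, not Excel's own method.

```python
def one_way_anova(groups):
    """Build the one-way ANOVA quantities from raw group data."""
    c = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst,
            "df1": c - 1, "df2": n - c,
            "MSA": msa, "MSW": msw, "F": msa / msw}

clubs = [
    [254, 263, 241, 237, 251],   # Club 1
    [234, 218, 235, 227, 216],   # Club 2
    [200, 222, 197, 206, 204],   # Club 3
]
table = one_way_anova(clubs)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The printed values agree with the Excel table after rounding: SSA 4716.4, SSW 1119.6, SST 5836.0, MSA 2358.2, MSW 93.3 and F about 25.275.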

Page 14: L12_anova

Statistical Decision

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_j$ not all equal
$\alpha = 0.05$
df1 = ___   df2 = ___

Test Statistic:
$F_{STAT} = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$

Critical Value:
$F_\alpha = 3.89$

Decision:
Reject H0 at $\alpha = 0.05$

Conclusion:
There is evidence that at least one $\mu_j$ differs from the rest

[Figure: F distribution with α = .05; Fα = 3.89 separates "Do not reject H0" from "Reject H0", and FSTAT = 25.275 falls in the rejection region]
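Excel's P-value and F crit can be cross-checked without statistical tables in this particular case. When the numerator degrees of freedom equal 2 (as in the Excel output: 2 between groups, 12 within), the upper tail of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(-df2/2). The sketch below relies on that special case only; it is not a general F-distribution routine, and the function names are invented for the example.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Value with upper-tail area alpha, again only valid for df1 = 2."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

f_stat, df2, alpha = 25.275, 12, 0.05        # values from the Excel output
p_value = f_upper_tail_df1_2(f_stat, df2)
f_crit = f_critical_df1_2(alpha, df2)

print(f"p-value = {p_value:.3g}, F crit = {f_crit:.2f}")
print("Reject H0" if f_stat > f_crit else "Do not reject H0")
```

Both routes give the same decision: FSTAT is far above the critical value, and equivalently the p-value is far below 0.05.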

Page 15: L12_anova

Scatter Plot

[Figure: scatter plot of Distance (vertical axis, 190 to 270) against Club (1, 2, 3) for the golf club data, with each group mean and the grand mean marked]

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, grand mean $\bar{\bar{X}} = 227.0$

Page 16: L12_anova

ANOVA Assumptions

• Randomness and Independence
  Select random samples from the c groups (or randomly assign the levels)
• Normality
  The sample values for each group are from a normal population
• Homogeneity of Variance
  All populations sampled from have the same variance

Page 17: L12_anova

Chapter Summary

• One-way ANOVA
  Its logic & assumptions
  F-test for difference in c means

(Below are not covered)
• If H0 rejected: Tukey-Kramer procedure for multiple comparisons
• Assumption check: Levene test for homogeneity of variance
• Another experimental design: randomized block design
• Two-way analysis of variance
  Examined effects of multiple factors
  Examined interaction between factors

Page 18: L12_anova

Appendix: Math Details


Page 19: L12_anova

Total Sum of Squares

SST = SSA + SSW

$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$

Where:
SST = Total sum of squares
c = number of groups or levels
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)

Page 20: L12_anova

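Total Variation

$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, with the grand mean $\bar{\bar{X}}$ marked]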

Page 21: L12_anova

Among-Group Variation

SST = SSA + SSW

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Where:
SSA = Sum of squares among groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)

Page 22: L12_anova

Among-Group Variation

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Variation due to differences among groups

Mean Square Among = SSA/degrees of freedom:

$MSA = \frac{SSA}{c-1}$

Page 23: L12_anova

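Among-Group Variation

$SSA = n_1 (\bar{X}_1 - \bar{\bar{X}})^2 + n_2 (\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c (\bar{X}_c - \bar{\bar{X}})^2$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]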

Page 24: L12_anova

Within-Group Variation

SST = SSA + SSW

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Where:
SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j

Page 25: L12_anova

Within-Group Variation

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Summing the variation within each group and then adding over all groups

Mean Square Within = SSW/degrees of freedom:

$MSW = \frac{SSW}{n-c}$

Page 26: L12_anova

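Within-Group Variation

$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$

[Figure: Response, X plotted for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]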
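Why the three sums of squares add up: a short derivation, written with the same symbols as the definitions of SST, SSA and SSW in this appendix. It is a standard argument, added here as a supplement to the slides.

```latex
% Split each deviation from the grand mean into a within part and an among part
X_{ij} - \bar{\bar{X}} = (X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{\bar{X}})

% Square both sides and sum over all observations
SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
    + \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2
    + 2 \sum_{j=1}^{c} (\bar{X}_j - \bar{\bar{X}}) \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)

% Deviations from a group's own mean sum to zero, so the cross term vanishes
\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j) = 0
\quad\Longrightarrow\quad
SST = SSW + SSA
```

The same cancellation explains the degrees of freedom: (n - c) + (c - 1) = n - 1.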

Page 27: L12_anova

Eg: Car Wax Effectiveness

• The number of times each car went through the carwash before its wax deteriorated is shown on the next slide
• The wax producer must decide which wax to market
• Are the three waxes equally effective?

Factor: Car wax
Treatments (Levels): Type 1, Type 2, Type 3
Subjects: Cars
Response variable: Number of washes

Page 28: L12_anova

Eg: Car Wax Effectiveness

Observation       Wax Type 1   Wax Type 2   Wax Type 3
1                 27           33           29
2                 30           28           28
3                 29           31           30
4                 28           30           32
5                 31           30           31

Sample Mean       29.0         30.4         30.0
Sample Variance   2.5          3.3          2.5

Page 29: L12_anova

Eg: Car Wax Effectiveness

• Hypotheses

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all the means are equal

where:
$\mu_1$ = mean number of washes using Type 1 wax
$\mu_2$ = mean number of washes using Type 2 wax
$\mu_3$ = mean number of washes using Type 3 wax

Page 30: L12_anova

Eg: Car Wax Effectiveness

• ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F   p-Value
Treatments            5.2              a                    c              e   .42
Error                 33.2             b                    d
Total                 38.4             14

• Rejection Rule (given α = 0.05)
p-Value Approach: Reject H0 if p-value < .05
Critical Value Approach: Reject H0 if F > F.05 = h

Page 31: L12_anova

ANSWER

• ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F         p-Value
Treatments            5.2              a=2                  c=2.60         e=0.939   .42
Error                 33.2             b=12                 d=2.77
Total                 38.4             14

Critical Value: F.05 = 3.89
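The entries a to e can be recomputed from the sample means and sample variances given with the car wax data, without going back to the raw observations. The sketch below assumes five cars per wax type, as in the data table; the function name `anova_from_summary` is made up for this illustration.

```python
def anova_from_summary(sizes, means, variances):
    """One-way ANOVA quantities from per-group n, mean and sample variance."""
    n = sum(sizes)
    c = len(sizes)
    grand_mean = sum(k * m for k, m in zip(sizes, means)) / n
    # Among groups: weighted squared distance of each group mean from the grand mean
    ssa = sum(k * (m - grand_mean) ** 2 for k, m in zip(sizes, means))
    # Within groups: each sample variance times its own degrees of freedom
    ssw = sum((k - 1) * v for k, v in zip(sizes, variances))
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "df1": c - 1, "df2": n - c,
            "MSA": msa, "MSW": msw, "F": msa / msw}

wax = anova_from_summary(
    sizes=[5, 5, 5],
    means=[29.0, 30.4, 30.0],
    variances=[2.5, 3.3, 2.5],
)
for name, value in wax.items():
    print(f"{name}: {value:.3f}")
```

Up to rounding this gives SSA = 5.2, SSW = 33.2, MSA = 2.60, MSW = 2.77 and F of about 0.94, in line with the answer table.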

Page 32: L12_anova

ANSWER

• Conclusion

p-value approach: From the F-table, the p-value is greater than 0.10, because the table value for an upper-tail area of 0.10 is F = 2.81 and the test statistic 0.939 is below it. (Excel gives an exact p-value of 0.42.) Do not reject H0.

Critical value approach: FTEST = 0.939 < F.05 = 3.89, do not reject H0.

There is insufficient evidence to conclude that the mean number of washes is not the same for the three wax types.