ANOVA Demo Part 2: Analysis Psy 320 Cal State Northridge Andrew Ainsworth PhD

Upload: julius-carvell

Post on 01-Apr-2015


Page 1: ANOVA Demo Part 2: Analysis Psy 320 Cal State Northridge Andrew Ainsworth PhD

ANOVA Demo
Part 2: Analysis

Psy 320
Cal State Northridge

Andrew Ainsworth PhD

Page 2: Review: Sample Variance

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$$

This can be rewritten as:

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1} = \frac{SS}{df}$$

Page 3: Data Set

| Group | Scores | Group Mean |
|---|---|---|
| G1 | 7, 1, 10 | 6.00 |
| G2 | 11, 7, 8 | 8.67 |
| G3 | 14, 13, 16 | 14.33 |
| Mean | 9.667 | 9.667 |

FYI: N = 9

Page 4: Total Sums of Squares

Let’s calculate the Sums of Squares (SS) for this data set as it is:

$$SS_T = \sum (X_i - \bar{X}_{GM})^2$$

As we can see, the mean of 9.667 has already been calculated for us, and we are going to treat that 9.667 as the Grand Mean (i.e., the ungrouped mean).

Page 5: Total Sums of Squares

$$
\begin{aligned}
SS_T &= (7 - 9.667)^2 + (1 - 9.667)^2 + (10 - 9.667)^2 \\
&\quad + (11 - 9.667)^2 + (7 - 9.667)^2 + (8 - 9.667)^2 \\
&\quad + (14 - 9.667)^2 + (13 - 9.667)^2 + (16 - 9.667)^2 \\
&= 7.111 + 75.111 + 0.111 \\
&\quad + 1.778 + 7.111 + 2.778 \\
&\quad + 18.778 + 11.111 + 40.111 \\
&= 164
\end{aligned}
$$
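As a quick check on the arithmetic above, here is a minimal Python sketch of the Total SS calculation. The variable names (`scores`, `grand_mean`, `ss_total`) are illustrative choices and not part of the original demo.

```python
# Scores from the data set, listed group by group (G1, G2, G3).
scores = [7, 1, 10, 11, 7, 8, 14, 13, 16]

# Grand mean: the mean of all N = 9 scores, ignoring group membership.
grand_mean = sum(scores) / len(scores)

# Total SS: squared deviation of every score from the grand mean, summed.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

print(round(grand_mean, 3))  # 9.667
print(round(ss_total, 3))    # 164.0
```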

Page 6: Between Groups Sums of Squares

So, the Total Sums of Squares applies to this data if all 9 data points were collected as part of a single 9-member group.

However, what if the data were collected in groups of 3 instead?

And let’s imagine that each group is receiving a different form of treatment (e.g., an independent variable) that we think will affect the subjects’ scores, along with each individual group’s mean.

Page 7: Between Groups Sums of Squares

Note that each group has its own mean that describes the central tendency of the participants in that group (e.g., 6 is the mean of 7, 1 and 10).

Also note that with equal numbers of subjects in each group, the average of the 3 group means is the “Grand Mean” from earlier (i.e., 9.667 is the average of 6, 8.667 and 14.333).

Page 8: Between Groups Sums of Squares

So, if we want to understand the effect that the different treatments are having on the groups via the participants (i.e., how the treatments are moving the participants away from the grand mean), we can pretend for a second that every participant scored exactly at their own group mean.

Page 9: Between Groups Sums of Squares

With every participant’s score replaced by their own group mean, the data set looks like this:

| Group | Scores | Group Mean |
|---|---|---|
| G1 | 6, 6, 6 | 6.000 |
| G2 | 8.667, 8.667, 8.667 | 8.667 |
| G3 | 14.333, 14.333, 14.333 | 14.333 |
| Mean | 9.667 | 9.667 |

Note: the group means and grand mean stay the same.

Page 10: Between Groups Sums of Squares

Let’s ignore the group means for a second and calculate the SS, pretending that every participant scored at their group mean:

$$SS_{BG} = \sum (\bar{X}_{g_i} - \bar{X}_{GM})^2$$

where $\bar{X}_{g_i}$ is the mean of the group that participant $i$ belongs to.

$$
\begin{aligned}
SS_{BG} &= (6 - 9.667)^2 + (6 - 9.667)^2 + (6 - 9.667)^2 \\
&\quad + (8.667 - 9.667)^2 + (8.667 - 9.667)^2 + (8.667 - 9.667)^2 \\
&\quad + (14.333 - 9.667)^2 + (14.333 - 9.667)^2 + (14.333 - 9.667)^2 \\
&= 108.655
\end{aligned}
$$

Page 11: Between Groups Sums of Squares

Let’s take a look at that last formula and see that, when calculated, there is a lot of redundancy within each group:

$$SS_{BG} = (6 - 9.667)^2 + (6 - 9.667)^2 + (6 - 9.667)^2 + \ldots$$

For instance, for the first group we are subtracting and squaring the same number 3 times (i.e., once for each participant).

Couldn’t we come to the same answer by simply doing it once and multiplying by the number of participants in that group (i.e., 3)?

$$SS_{BG} = 3 \cdot (6 - 9.667)^2 + \ldots$$

Page 12: Between Groups Sums of Squares

This is why we typically don’t substitute the mean for every person’s score but just weight the squared difference by the number of scores in each group (i.e., $n_g$):

$$SS_{BG} = \sum n_g (\bar{X}_g - \bar{X}_{GM})^2$$

$$SS_{BG} = [3 \cdot (6 - 9.667)^2] + [3 \cdot (8.667 - 9.667)^2] + [3 \cdot (14.333 - 9.667)^2] = 108.655$$
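The equivalence of the two routes (substituting the group mean for every score versus weighting by $n_g$) can be sketched in Python as follows. The dictionary layout and variable names are assumptions made for illustration. Because the code keeps the means unrounded, it gives 108.667; the 108.655 shown on the slides results from using means rounded to three decimals.

```python
groups = {
    "G1": [7, 1, 10],
    "G2": [11, 7, 8],
    "G3": [14, 13, 16],
}

def mean(values):
    return sum(values) / len(values)

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)

# Long way: pretend every participant scored exactly at their group mean.
substituted = [mean(g) for g in groups.values() for _ in g]
ss_bg_long = sum((m - grand_mean) ** 2 for m in substituted)

# Short way: one squared deviation per group, weighted by group size n_g.
ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

print(round(ss_bg_long, 3), round(ss_bg, 3))  # 108.667 108.667
```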

Page 13: Within Groups Sums of Squares

Looking back at the original data, we can see that in fact the subjects rarely, if ever, scored exactly at their group mean.

So something else, besides our hypothesized treatment, is causing our subjects to differ within each of the groups.

We haven’t hypothesized it, and therefore we can’t explain why it’s there, so we are going to assume that it is just random variation. But we still need to identify it.

Page 14: Within Groups Sums of Squares

To identify the random variation, let’s look inside each group separately to see if we can find an average degree of variation.

Remembering that variance is SS/df, let’s identify the within-group SS value for each group.

For Group 1 (scores 7, 1, 10; mean 6):

$$SS_{WG_1} = \sum (X_i - \bar{X}_{g_1})^2$$

$$SS_{WG_1} = (7 - 6)^2 + (1 - 6)^2 + (10 - 6)^2 = 42$$

Page 15: Within Groups Sums of Squares

For Group 2 (scores 11, 7, 8; mean 8.667):

$$SS_{WG_2} = \sum (X_i - \bar{X}_{g_2})^2$$

$$SS_{WG_2} = (11 - 8.667)^2 + (7 - 8.667)^2 + (8 - 8.667)^2 = 8.667$$

Note: this equals the group mean coincidentally.

Page 16: Within Groups Sums of Squares

For Group 3 (scores 14, 13, 16; mean 14.333):

$$SS_{WG_3} = \sum (X_i - \bar{X}_{g_3})^2$$

$$SS_{WG_3} = (14 - 14.333)^2 + (13 - 14.333)^2 + (16 - 14.333)^2 = 4.667$$
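The three within-group calculations follow the same recipe, so they can be sketched with one small helper. The function name `sum_of_squares` and the dictionary layout are illustrative assumptions, not part of the original demo.

```python
groups = {
    "G1": [7, 1, 10],
    "G2": [11, 7, 8],
    "G3": [14, 13, 16],
}

def sum_of_squares(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

# One SS per group, each computed around that group's own mean.
ss_wg_by_group = {name: sum_of_squares(vals) for name, vals in groups.items()}

for name, ss in ss_wg_by_group.items():
    print(name, round(ss, 3))
# G1 42.0
# G2 8.667
# G3 4.667
```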

Page 17: Within Groups Variance

These Within Groups Sums of Squares can be used to tell us how people just “randomly” spread out within each of the groups.

Remembering that variance is SS/df, let’s divide each group’s SS by its degrees of freedom (df):

$$s^2_{WG_j} = \frac{\sum (X_i - \bar{X}_{g_j})^2}{n_j - 1}$$

where $n_j$ is the number of participants in group $j$ (e.g., for our example $n_j = 3$ for all of the groups).

Page 18: The Variance Within Each Group

$$s^2_{WG_j} = \frac{\sum (X_i - \bar{X}_{g_j})^2}{n_j - 1} = \frac{SS_{WG_j}}{df_{WG_j}}$$

For group 1:

$$s^2_{WG_1} = \frac{SS_{WG_1}}{df_{WG_1}} = \frac{42}{3 - 1} = 21$$

Page 19: The Variance Within Each Group

For group 2:

$$s^2_{WG_2} = \frac{SS_{WG_2}}{df_{WG_2}} = \frac{8.667}{3 - 1} = 4.334$$

Page 20: The Variance Within Each Group

For group 3:

$$s^2_{WG_3} = \frac{SS_{WG_3}}{df_{WG_3}} = \frac{4.667}{3 - 1} = 2.334$$
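A minimal sketch of the SS/df step for each group is shown below; `sample_variance` is a hypothetical helper name chosen for illustration. With unrounded intermediate values the code prints 4.333 and 2.333, whereas the slides show 4.334 and 2.334 because they divide SS values that were already rounded.

```python
groups = {
    "G1": [7, 1, 10],
    "G2": [11, 7, 8],
    "G3": [14, 13, 16],
}

def sample_variance(values):
    """Variance as SS / df, with df = n - 1."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    return ss / (n - 1)

variances = {name: sample_variance(vals) for name, vals in groups.items()}

for name, var in variances.items():
    print(name, round(var, 3))
# G1 21.0
# G2 4.333
# G3 2.333
```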

Page 21: Average Within Groups Variance

Now that we have the variances within each of the groups, we can calculate an average within groups variance, which is an extension of the pooled variance from the independent samples t-test.

Because the values for $n_j$ are equal, this is just a simple average. (Note: if the $n_j$ values are not equal, you can perform a weighted average or just calculate the WG variance directly, as in the next slide.)

$$s^2_{WG} = \frac{21 + 4.334 + 2.334}{3} = 9.223$$

Note: No group subscript because this is not for any particular group but an average across the groups.
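The weighted average mentioned in the note can be sketched as below, assuming each group variance is weighted by its own degrees of freedom ($n_j - 1$), which is the usual pooled-variance weighting. The helper names are illustrative. With equal group sizes the weighted and simple averages agree.

```python
groups = [[7, 1, 10], [11, 7, 8], [14, 13, 16]]

def sample_variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def pooled_variance(groups):
    """Average of the group variances, each weighted by its df (n_j - 1)."""
    weights = [len(g) - 1 for g in groups]
    variances = [sample_variance(g) for g in groups]
    return sum(w * v for w, v in zip(weights, variances)) / sum(weights)

simple_average = sum(sample_variance(g) for g in groups) / len(groups)
weighted_average = pooled_variance(groups)

print(round(simple_average, 3), round(weighted_average, 3))  # 9.222 9.222
```

The same `pooled_variance` function also accepts groups of unequal size, in which case the two averages would generally differ.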

Page 22: Within Groups Sums of Squares

The value for the overall WG Sums of Squares could have been calculated directly by simply combining the $SS_{WG}$ formula across groups:

$$SS_{WG} = \sum (X_i - \bar{X}_{g_j})^2$$

Note again that there is no group subscript on the SS value because it is done for all groups at the same time.

Page 23: Within Groups Sums of Squares

Remembering that the means for the groups are 6, 8.667 and 14.333 respectively, we can simply take every individual score and subtract the mean of the group the score belongs to.

All together now:

$$
\begin{aligned}
SS_{WG} &= (7 - 6)^2 + (1 - 6)^2 + (10 - 6)^2 \\
&\quad + (11 - 8.667)^2 + (7 - 8.667)^2 + (8 - 8.667)^2 \\
&\quad + (14 - 14.333)^2 + (13 - 14.333)^2 + (16 - 14.333)^2 \\
&= 55.333
\end{aligned}
$$
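The all-at-once calculation can be sketched in Python as follows, together with a sanity check that the between and within pieces account for the whole Total SS. Variable names are illustrative assumptions.

```python
groups = [[7, 1, 10], [11, 7, 8], [14, 13, 16]]

def mean(values):
    return sum(values) / len(values)

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Within: every score minus the mean of the group it belongs to.
ss_wg = sum((x - mean(g)) ** 2 for g in groups for x in g)

# Between and total, for the partition check.
ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(round(ss_wg, 3))                              # 55.333
print(round(ss_bg + ss_wg, 3), round(ss_total, 3))  # 164.0 164.0
```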

Page 24: ANOVA Summary Table

| Source | SS | df | MS ($s^2$) | F |
|---|---|---|---|---|
| Between Groups | 108.655 | 2 | 54.328 | 5.89 |
| Within Groups | 55.333 | 6 | 9.223 | |
| Total | 164 | 8 | | |

Let’s take what we know and see if we can put it together and summarize it using the table above.

We know that $SS_{Total} = 164$.

We know that $SS_{BG} = 108.655$.

We know that $SS_{WG} = 55.333$.

We also know that the WG variance (i.e., $MS_{WG}$ above) = 9.223, from the average of the 3 group variances.

Note: The SS for Between and Within add up to the SS-total, as they should. (With the rounded values above the sum is 163.988; the small gap from 164 comes from rounding the means to three decimals.)

Page 25: ANOVA Summary Table

We have the SS value for the BG source of variability, but we need to convert it to a variance. Remembering that variance is SS/df, we just need to figure out the BG degrees of freedom:

$$df_{BG} = (\text{number of groups}) - 1 = 3 - 1 = 2$$

Page 26: ANOVA Summary Table

We have the SS value for the BG source of variability, but we need to convert it to a variance. Remembering that variance is SS/df, we now just need to divide the SS value by the df value for the Between Groups source:

$$s^2_{BG} = MS_{BG} = \frac{SS_{BG}}{df_{BG}} = \frac{108.655}{2} = 54.328$$

Page 27: ANOVA Summary Table

We have the SS value and the MS value for the WG source of variability, but these two values should be connected somehow.

Remembering that variance is SS/df, we just need to figure out the Within Groups degrees of freedom and see whether dividing in the same way as with the BG source gives us the same value (i.e., 9.223).

Page 28: ANOVA Summary Table

When we calculated the Within Groups variance, we did so by averaging over the three individual group variances, each of which had $n - 1$ degrees of freedom.

So that’s $n - 1$ for each group, or $g \cdot (n - 1)$.

Or you can think of it this way: you need to calculate a mean for each group, so you simply take the total number of scores (i.e., $N$) and subtract 1 for every group (i.e., $g$), and that’s $N - g$.

Note that if all of the $n_j$ values are equal, then $g \cdot (n - 1) = N - g$.
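The equality in the note follows in one line once we use the fact that, with equal group sizes, $N = g \cdot n$. The more general sum over groups is added here for illustration and holds whether or not the $n_j$ are equal:

$$
\begin{aligned}
g \cdot (n - 1) &= g \cdot n - g = N - g \\
\sum_{j=1}^{g} (n_j - 1) &= \sum_{j=1}^{g} n_j - g = N - g
\end{aligned}
$$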

Page 29: ANOVA Summary Table

We have 9 total subjects and 3 groups, so that’s:

$$df_{WG} = 3 \cdot (3 - 1) = 3 \cdot 2 = 6$$

OR

$$df_{WG} = N - g = 9 - 3 = 6$$

Page 30: ANOVA Summary Table

If we divide the SS value by the df value for the Within Groups source of variance, we in fact get the same value we calculated earlier using the pooling method (9.222 here versus 9.223 there, a difference due only to rounding):

$$s^2_{WG} = MS_{WG} = \frac{SS_{WG}}{df_{WG}} = \frac{55.333}{6} = 9.222$$

Page 31: ANOVA Summary Table

In order to calculate the total SS we needed to estimate a single Grand Mean; because of this, we lose one degree of freedom.

The total degrees of freedom is simply the total number of participants (i.e., $N$) minus 1:

$$df_{Total} = N - 1 = 9 - 1 = 8$$

Note: The degrees of freedom for BG and WG sum to the df-total, as they should.

Page 32: ANOVA Summary Table

In ANOVA demo #1 we talked about how the Between Groups Variance is a measure of how far the groups are from the Grand Mean, which in turn tells us how far apart they are from each other on average.

In order for us to know whether the groups are far enough away from each other (i.e., they are significantly different), we need a measure of random variability to see if our groups differ by more than we would expect by chance.

The Within Groups Variance tells us how much individuals vary from one another on average across the groups. This is our best estimate of random variability, so we use it to test whether the groups are different by creating the F-ratio.

Page 33: ANOVA Summary Table

The F-ratio is simply the ratio of the Between Groups variance to the Within Groups variance:

$$F = \frac{s^2_{BG}}{s^2_{WG}} = \frac{MS_{BG}}{MS_{WG}} = \frac{54.328}{9.223} = 5.89$$
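Putting the pieces of the summary table together, the whole path from raw scores to the F-ratio can be sketched in a few lines of Python. The names are illustrative assumptions; the mean squares print as 54.333 and 9.222 because the code does not round intermediate values, and the F-ratio still rounds to 5.89.

```python
groups = [[7, 1, 10], [11, 7, 8], [14, 13, 16]]

def mean(values):
    return sum(values) / len(values)

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Sums of squares for each source of variability.
ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_wg = sum((x - mean(g)) ** 2 for g in groups for x in g)

# Degrees of freedom: groups - 1 and N - groups.
df_bg = len(groups) - 1
df_wg = len(all_scores) - len(groups)

# Mean squares (variances) and their ratio.
ms_bg = ss_bg / df_bg
ms_wg = ss_wg / df_wg
f_ratio = ms_bg / ms_wg

print(df_bg, df_wg)                                          # 2 6
print(round(ms_bg, 3), round(ms_wg, 3), round(f_ratio, 2))   # 54.333 9.222 5.89
```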

Page 34: ANOVA Summary Table

The Between Groups variance contains both Real and Random variability, while the Within Groups variance contains only Random variability (at least that’s what we are assuming).

So in order for an F-ratio to be large, the real group differences have to be large enough for us to see them through the random differences:

$$F = \frac{s^2_{BG}}{s^2_{WG}} = \frac{s^2_{Real} + s^2_{Random}}{s^2_{Random}}$$

If no real differences exist, then you are left with:

$$F = \frac{s^2_{BG}}{s^2_{WG}} = \frac{s^2_{Random}}{s^2_{Random}} = 1$$

Page 35: ANOVA Summary Table

The values found in the F-table indicate how much larger the between-groups variability must be than the random variability before we treat the group differences as “real”, controlling for the number of groups (i.e., $df_{BG}$) and the number of people in each group (i.e., $df_{WG}$).

For our example:

$$F_{crit}(df_{BG}, df_{WG}) = F_{crit}(2, 6) = 5.143$$

Page 36: ANOVA Summary Table

$$F_{crit}(df_{BG}, df_{WG}) = F_{crit}(2, 6) = 5.143$$

The value of 5.143 tells us that any value of 5.143 or larger is not likely to occur by accident if there are no real group differences (i.e., it has a .05 or lower probability), given the number of groups and the number of subjects per group.

Since our F-value is 5.89, and that is larger than 5.143, we can conclude that a significant difference exists somewhere between at least 2 of our group means.
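The critical value of 5.143 can be reproduced without a printed F-table. The sketch below relies on a closed form that holds only when the numerator degrees of freedom equal 2, as they do here: the upper-tail probability is $(1 + 2F/df_{WG})^{-df_{WG}/2}$. The function names are hypothetical, and for other numerator df a general-purpose F distribution routine from a statistics library would be needed instead.

```python
def upper_tail_prob_df1_2(f_value, df_wg):
    """P(F >= f_value) for an F distribution with 2 numerator df."""
    return (1 + 2 * f_value / df_wg) ** (-df_wg / 2)

def f_crit_df1_2(alpha, df_wg):
    """F value whose upper-tail probability is alpha (2 numerator df only)."""
    return (df_wg / 2) * (alpha ** (-2 / df_wg) - 1)

df_wg = 6
f_obs = 5.89  # F-ratio from the summary table

f_crit = f_crit_df1_2(0.05, df_wg)
p_value = upper_tail_prob_df1_2(f_obs, df_wg)

print(round(f_crit, 3))   # 5.143
print(round(p_value, 3))  # 0.038
print(f_obs > f_crit)     # True
```

Comparing the observed F to the critical value and comparing the p-value to .05 are two views of the same decision, so both lead to the same conclusion here.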