Previous Lecture: Phylogenetics. This Lecture: Analysis of Variance. Judy Zhong, Ph.D.
![Page 1: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/1.jpg)
Previous Lecture: Phylogenetics
![Page 2: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/2.jpg)
This Lecture: Analysis of Variance
Judy Zhong, Ph.D.
![Page 3: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/3.jpg)
Learning Objectives
Until now, we have considered two groups of individuals and we've wanted to know if the two groups were sampled from distributions with equal population means or medians.
Suppose we would like to consider more than two groups of individuals and, in particular, test whether the groups were sampled from distributions with equal population means.
How to use one-way analysis of variance (ANOVA) to test for differences among the means of several populations (“groups”).
![Page 4: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/4.jpg)
Hypotheses of One-Way ANOVA
$H_0: \mu_1 = \mu_2 = \cdots = \mu_K$
All population means are equal
No treatment effect (no variation in means among groups)
$H_1$: Not all of the population means are the same
At least one population mean is different
There is a treatment effect
This does not mean that all population means are different (some pairs may be the same)
![Page 5: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/5.jpg)
One-Factor ANOVA
All means are the same: the null hypothesis is true
(No treatment effect)
![Page 6: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/6.jpg)
One-Factor ANOVA (continued)
At least one mean is different: the null hypothesis is NOT true
(Treatment effect is present)
![Page 7: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/7.jpg)
One-Way ANOVA: Model Assumptions
The K random samples are drawn from K independent populations
The variances of the populations are identical
The underlying data are approximately normally distributed
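These three assumptions are often written as a single model. The following is a sketch in standard notation, where $y_{ij}$ denotes the j-th observation in group i; the symbols $\mu_i$, $\varepsilon_{ij}$ and $\sigma^2$ are introduced here for illustration and do not appear on the slide.

```latex
y_{ij} = \mu_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2),
\qquad i = 1, \ldots, K, \quad j = 1, \ldots, n_i
```

Independence corresponds to the errors being independent, identical variances to the single $\sigma^2$ shared by all groups, and normality to the $N(0, \sigma^2)$ error distribution.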
![Page 8: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/8.jpg)
Basic Idea: partitioning the variation
Suppose there are K groups with $n_1, n_2, \ldots, n_K$ observations.
$y_{ij}$: j-th observation in i-th group; $\bar{y}$: overall mean; $\bar{y}_i$: mean of group i
$$y_{ij} - \bar{y} = (\bar{y}_i - \bar{y}) + (y_{ij} - \bar{y}_i)$$
$\bar{y}_i - \bar{y}$: deviation of group mean from grand mean
$y_{ij} - \bar{y}_i$: deviation of observations from group mean
![Page 9: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/9.jpg)
Partitioning the variation
$$\sum_{i}\sum_{j}(y_{ij}-\bar{y})^2 = \sum_{i}\sum_{j}(y_{ij}-\bar{y}_i)^2 + \sum_{i}\sum_{j}(\bar{y}_i-\bar{y})^2$$
Total variation (total SS) = Variation due to random sampling (within SS) + Variation due to factor (between SS)
Total variation is the sum of within-group variability and between-group variability
$$y_{ij}-\bar{y} = (y_{ij}-\bar{y}_i) + (\bar{y}_i-\bar{y})$$
$y_{ij}-\bar{y}_i$: deviation of observations from group mean (within-group variability)
$\bar{y}_i-\bar{y}$: deviation of group mean from overall mean (between-group variability)
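The step from the deviation identity to the sum-of-squares identity is not spelled out on the slide. A short derivation, using the same symbols, shows why no cross term survives when the deviations are squared and summed:

```latex
\begin{aligned}
\sum_{i=1}^{K}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2
&= \sum_{i=1}^{K}\sum_{j=1}^{n_i}\bigl[(y_{ij}-\bar{y}_i)+(\bar{y}_i-\bar{y})\bigr]^2 \\
&= \sum_{i=1}^{K}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2
 + \sum_{i=1}^{K}\sum_{j=1}^{n_i}(\bar{y}_i-\bar{y})^2
 + 2\sum_{i=1}^{K}(\bar{y}_i-\bar{y})\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i) \\
&= \sum_{i=1}^{K}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2
 + \sum_{i=1}^{K}\sum_{j=1}^{n_i}(\bar{y}_i-\bar{y})^2
\end{aligned}
```

The last line follows because deviations from a group's own mean sum to zero within that group, that is, $\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i) = 0$ for every i.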
![Page 10: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/10.jpg)
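Partitioning the variation
$$\sum_{i=1}^{3}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2 = \sum_{i=1}^{3}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2 + \sum_{i=1}^{3}\sum_{j=1}^{n_i}(\bar{y}_i-\bar{y})^2$$
$\bar{y}$: overall mean
$\bar{y}_i$: mean of group i
$n_1 = 3$, $n_2 = 4$, $n_3 = 4$
[Figure: Response, X for Group 1, Group 2 and Group 3, with the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the overall mean $\bar{y}$ marked]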
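The partition can be checked numerically. Below is a minimal sketch in Python, assuming three groups of sizes 3, 4 and 4 as on the slide; the response values and the helper name `partition_ss` are made up for illustration and are not the data behind the figure.

```python
from statistics import mean

# Hypothetical responses for three groups with n1 = 3, n2 = 4, n3 = 4
# (illustrative values only; they are not taken from the lecture).
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0, 4.0],
]


def partition_ss(groups):
    """Return (total_ss, within_ss, between_ss) for a list of groups."""
    all_obs = [y for g in groups for y in g]
    grand_mean = mean(all_obs)
    # Total SS: squared deviations of every observation from the overall mean.
    total_ss = sum((y - grand_mean) ** 2 for y in all_obs)
    # Within SS: squared deviations of observations from their own group mean.
    within_ss = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # Between SS: squared deviation of each group mean from the overall mean,
    # counted once per observation in that group.
    between_ss = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return total_ss, within_ss, between_ss


total_ss, within_ss, between_ss = partition_ss(groups)
print(total_ss, within_ss + between_ss)
```

Up to floating-point rounding, the two printed numbers agree, which is the identity Total SS = Within SS + Between SS.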
![Page 11: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/11.jpg)
Basic Idea of ANOVA
[Figure: two panels showing Response, X for Group 1, Group 2 and Group 3]
If between-group variability is large and within-group variability is small => reject H0
If between-group variability is small and within-group variability is large => do not reject H0
![Page 12: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/12.jpg)
Partition of Total Variation
Total Variation (total SS) = Variation Due to Factor (Between SS) + Variation Due to Random Sampling (Within SS)
d.f.: n – 1 = (k – 1) + (n – k)
Between SS is commonly referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among Groups Variation
Within SS is commonly referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation
![Page 13: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/13.jpg)
Total Sum of Squares
Total SS = Between SS + Within SS
$$\text{Total SS} = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(y_{ij}-\bar{y})^2$$
Where:
Total SS = Total sum of squares
k = number of groups (levels or treatments)
$n_j$ = number of observations in group j
$y_{ij}$ = ith observation from group j
$\bar{y}$ = grand mean (mean of all data values)
![Page 14: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/14.jpg)
Total Variation
$$\text{Total SS} = (y_{11}-\bar{y})^2 + (y_{12}-\bar{y})^2 + \cdots + (y_{kn_k}-\bar{y})^2$$
[Figure: Response, X for Group 1, Group 2 and Group 3, with the overall mean $\bar{y}$ marked]
![Page 15: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/15.jpg)
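Between-Group Variation
$$\text{Between SS} = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(\bar{y}_j-\bar{y})^2 = n_1(\bar{y}_1-\bar{y})^2 + n_2(\bar{y}_2-\bar{y})^2 + \cdots + n_k(\bar{y}_k-\bar{y})^2$$
[Figure: Response, X for Group 1, Group 2 and Group 3, with the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the overall mean $\bar{y}$ marked]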
![Page 16: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/16.jpg)
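Within-Group Variation (continued)
$$\text{Within SS} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2 = \sum_{i=1}^{k}(n_i-1)\,S_i^2$$
where $S_i^2$ is the sample variance of group i
[Figure: Response, X for Group 1, Group 2 and Group 3, with the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$ marked]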
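The two forms of the within-group sum of squares can be compared directly. The sketch below uses the golf-club distances from the lecture's example; the function names are hypothetical, and `statistics.variance` is the usual sample variance with divisor $n_i - 1$.

```python
from statistics import mean, variance

# Golf-club distances from the lecture's example (three groups of five).
groups = [
    [254, 263, 241, 237, 251],
    [234, 218, 235, 227, 216],
    [200, 222, 197, 206, 204],
]


def within_ss_from_deviations(groups):
    """Sum of squared deviations of each observation from its group mean."""
    return sum((y - mean(g)) ** 2 for g in groups for y in g)


def within_ss_from_variances(groups):
    """Same quantity written as the sum of (n_i - 1) * S_i^2 over groups."""
    return sum((len(g) - 1) * variance(g) for g in groups)


ss_dev = within_ss_from_deviations(groups)
ss_var = within_ss_from_variances(groups)
print(ss_dev, ss_var)
```

The second form is convenient when only group sizes and sample variances are reported rather than the raw observations.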
![Page 17: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/17.jpg)
Obtaining the Mean Squares
$$\text{Within MS} = \frac{\text{Within SS}}{n-k}$$
$$\text{Between MS} = \frac{\text{Between SS}}{k-1}$$
$$\text{Total MS} = \frac{\text{Total SS}}{n-1}$$
![Page 18: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/18.jpg)
One-Way ANOVA Table

| Source of Variation | df | SS | MS (Variance) | F ratio |
|---|---|---|---|---|
| Between Groups | k – 1 | BSS | BMS = BSS / (k – 1) | F = BMS / WMS |
| Within Groups | n – k | WSS | WMS = WSS / (n – k) | |
| Total | n – 1 | TSS = BSS + WSS | | |

k = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
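The table can be filled in mechanically once the two sums of squares are known. A minimal sketch follows; the function name `anova_table` and the dictionary layout are illustrative choices, and the input numbers are the sums of squares from the lecture's golf-club example.

```python
def anova_table(bss, wss, n, k):
    """Fill in the one-way ANOVA table from the two sums of squares.

    bss: between-group sum of squares
    wss: within-group sum of squares
    n:   total number of observations across all groups
    k:   number of groups
    """
    df_between = k - 1
    df_within = n - k
    bms = bss / df_between  # between mean square
    wms = wss / df_within   # within mean square
    return {
        "Between Groups": {"df": df_between, "SS": bss, "MS": bms, "F": bms / wms},
        "Within Groups": {"df": df_within, "SS": wss, "MS": wms},
        "Total": {"df": n - 1, "SS": bss + wss},
    }


table = anova_table(bss=4716.4, wss=1119.6, n=15, k=3)
for source, row in table.items():
    print(source, row)
```

The degrees of freedom add up in the same way as the sums of squares: (k – 1) + (n – k) = n – 1.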
![Page 19: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/19.jpg)
One-Way ANOVA F Test Statistic
Test statistic:
$$F = \frac{\text{Between MS}}{\text{Within MS}}$$
Degrees of freedom:
df1 = k – 1 (k = number of groups)
df2 = n – k (n = sum of sample sizes from all populations)
![Page 20: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/20.jpg)
Interpreting One-Way ANOVA F Statistic
The F statistic is the ratio of the among-group estimate of variance to the within-group estimate of variance
The ratio can never be negative
df1 = k – 1 will typically be small
df2 = n – k will typically be large
Decision Rule: Reject H0 if F > FU; otherwise do not reject H0
[Figure: F distribution starting at 0, with the rejection region of area α = .05 to the right of the critical value FU]
![Page 21: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/21.jpg)
Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?

| Club 1 | Club 2 | Club 3 |
|---|---|---|
| 254 | 234 | 200 |
| 263 | 218 | 222 |
| 241 | 235 | 197 |
| 237 | 227 | 206 |
| 251 | 216 | 204 |
![Page 22: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/22.jpg)
Example
[Figure: dot plot of Distance (axis from 190 to 270) by Club (1, 2, 3), with the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$ and the overall mean $\bar{Y}$ marked]
$\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$
$\bar{Y} = 227.0$
![Page 23: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/23.jpg)
Example
$\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$, $\bar{Y} = 227.0$
$n_1 = 5$, $n_2 = 5$, $n_3 = 5$, $n = 15$, $k = 3$
$\text{BSS} = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$
$\text{WSS} = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$
$\text{BMS} = 4716.4 / (3 - 1) = 2358.2$
$\text{WMS} = 1119.6 / (15 - 3) = 93.3$
$$F = \frac{\text{BMS}}{\text{WMS}} = \frac{2358.2}{93.3} = 25.275$$
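The arithmetic in this example can be reproduced from the raw distances. The following is a sketch using only the Python standard library; the variable names are illustrative.

```python
from statistics import mean

# Distances for each golf club, as given in the example.
distances = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

n = sum(len(v) for v in distances.values())  # total sample size (15)
k = len(distances)                           # number of groups (3)
grand_mean = mean(y for v in distances.values() for y in v)
group_means = {club: mean(v) for club, v in distances.items()}

# Between SS: group size times squared deviation of group mean from grand mean.
bss = sum(len(v) * (group_means[c] - grand_mean) ** 2 for c, v in distances.items())
# Within SS: squared deviations of observations from their own group mean.
wss = sum((y - group_means[c]) ** 2 for c, v in distances.items() for y in v)

bms = bss / (k - 1)
wms = wss / (n - k)
f_stat = bms / wms

print(group_means, grand_mean)
print(round(bss, 1), round(wss, 1), round(bms, 1), round(wms, 1), round(f_stat, 3))
```

Running this should recover the group means, sums of squares, mean squares and F ratio shown on the slide, up to rounding.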
![Page 24: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/24.jpg)
![Page 28: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/28.jpg)
Example
H0: µ1 = µ2 = µ3
H1: µj not all equal
α = 0.05
df1 = 2, df2 = 12
Critical Value (Table 9): FU = 3.89
Test Statistic:
$$F = \frac{\text{BMS}}{\text{WMS}} = \frac{2358.2}{93.3} = 25.275$$
Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89
Conclusion: There is evidence that at least one µj differs from the rest
[Figure: F distribution starting at 0, with the rejection region of area α = .05 to the right of FU = 3.89]
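The critical value FU = 3.89 can be checked without a printed table in this particular case. When the numerator degrees of freedom equal 2 (three groups), the upper tail of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(-df2/2), which can be solved for f. The sketch below relies on that special case only; the function name is hypothetical, and for other values of df1 a statistical library or table would be needed.

```python
def f_critical_df1_2(alpha, df2):
    """Upper-alpha critical value of the F(2, df2) distribution.

    Valid only for numerator df = 2, where
    P(F > f) = (1 + 2 * f / df2) ** (-df2 / 2).
    Setting that tail probability equal to alpha and solving for f gives
    the expression returned below.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


f_u = f_critical_df1_2(alpha=0.05, df2=12)
f_stat = 2358.2 / 93.3  # BMS / WMS from the example

print(round(f_u, 2), round(f_stat, 3), f_stat > f_u)
```

The computed critical value rounds to 3.89, and the observed F ratio lies far to the right of it, which matches the decision to reject H0.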
![Page 29: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/29.jpg)
ANOVA Table

| Source | SS | DF | MS | F | P-value |
|---|---|---|---|---|---|
| Between | 4716.4 | 2 | 2358.2 | 25.275 | <0.001 |
| Within | 1119.6 | 12 | 93.3 | | |
| Total | 5836.0 | 14 | | | |
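The P-value in the table can be approximated by hand for this example. With 2 numerator degrees of freedom, the upper tail probability of the F distribution is (1 + 2F/df2)^(-df2/2). The sketch below uses that special-case formula; the function name is hypothetical, and the formula does not apply when df1 is not 2.

```python
def f_pvalue_df1_2(f_stat, df2):
    """P(F > f_stat) for an F(2, df2) distribution.

    Closed form that holds only when the numerator df is 2.
    """
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


f_stat = 2358.2 / 93.3  # Between MS / Within MS from the example
p_value = f_pvalue_df1_2(f_stat, df2=12)

print(p_value, p_value < 0.001)
```

The result is on the order of 5e-5, consistent with the "<0.001" entry in the table.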
![Page 30: Previous Lecture: Phylogenetics. Analysis of Variance This Lecture Judy Zhong Ph.D](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2c5503460f94c471dd/html5/thumbnails/30.jpg)
Next Lecture: Categorical Data Methods