anova (statistics)
Types of samples and appropriate testing:
• 1 sample: use the 1-sample t-test
• 2 samples: use the 2-sample t-test
• 3 or more samples: use ANOVA
ANOVA is also called a 2-samples-and-more test (rather than a 3-samples-and-more test), because it can also be applied to 2 samples.
ANOVA can be:
• 1-way: 1 independent variable
• 2-way: 2 independent variables
• 3-way, 4-way, etc.: 3, 4, etc. independent variables
Every independent variable must be categorical (nominal/ordinal).
In ANOVA, whatever the type, there is always only 1 dependent variable, and it must be continuous (numerical/scale).
ANOVA is UNIVARIATE (1 dependent variable). If there is more than 1 dependent variable, use MANOVA.
It can be further classified:
• 1-WAY ANOVA: independent ANOVA, repeated measure ANOVA
• 2-WAY ANOVA: independent ANOVA, repeated measure ANOVA, mixed ANOVA
So they are called 2-way independent ANOVA, 2-way mixed ANOVA, etc.
We are testing the effect of blueberry on eyesight.
• Students taking blueberry: specky, non-specky
• Students NOT taking blueberry: specky, non-specky
We could run the t-test repeatedly to compare the samples pair by pair. However, doing that inflates α (type I error, i.e. we tend to reject Ho when Ho should not be rejected). Instead of repeating the t-test, we must use ANOVA.
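A rough sketch of why repeated t-tests inflate α. It assumes the tests are independent (a simplification, since pairwise tests share data) and the function names are illustrative only.

```python
from math import comb


def pairwise_count(n_groups: int) -> int:
    """Number of distinct pairs of groups that would each need a t-test."""
    return comb(n_groups, 2)


def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false rejection across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


for groups in (2, 3, 4):
    tests = pairwise_count(groups)
    rate = familywise_error(0.05, tests)
    print(f"{groups} groups -> {tests} t-tests -> overall alpha about {rate:.3f}")
```

With 3 groups there are already 3 pairwise tests, so the overall chance of a false rejection is well above the nominal 0.05.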
One-way Independent ANOVA
• 1 categorical independent variable
• 1 continuous dependent variable
• 3 or more groups (samples)
1-WAY ANOVA: independent ANOVA, repeated measure ANOVA
The first part of this chapter deals with 1-way independent ANOVA.
Later we will look at 1-way repeated measure ANOVA.
Assumptions that MUST be fulfilled:
1. Normality (any one of three):
• S-W (Shapiro-Wilk) or K-S (Kolmogorov-Smirnov) test (p ≥ 0.05)
• Skewness test (skewness within ± 2 SE)
• Coefficient of variation: $CV = \frac{s}{\bar{x}} \times 100\% < 30\%$
2. Homogeneity of variance: Levene’s test (p ≥ 0.05)
If not normal, use non-parametric tests like Mann-Whitney or Kruskal-Wallis (but the latter does not have Post Hoc).
If homogeneous, read the Tukey test. If not, read the Dunnett's T3 test.
MENU
1. Analyze
2. Descriptive
3. Explore
4. Plot
5. Normality (S-W)
Hypotheses:
1. Ho: μ1 = μ2 = μ3 = … = μi
2. HA: At least one pair of means is not equal (it can be μ1 ≠ μ2 = μ3, etc.)
If p < 0.05 (significant, i.e. Ho rejected), then a Post Hoc test (multiple pairwise comparison test) must be done.
Post Hoc Tests:
• Tukey Test: if homogeneous, no control
• Dunnett Test: if not homogeneous (Dunnett's T3), or if there is a control group (Dunnett)
• Bonferroni Test: for repeated measure
On the other hand, if not significant, the test stops.
A study is carried out to determine whether knowledge of the Vision and Mission of the university differs among first-year, second-year and third-year students of The Management and Science University (MSU).
Knowledge Score on Vision and Mission of MSU
Student Yr1: 60 55 45 50 55 60 70 45 35 35
Student Yr2: 65 60 70 75 70 78 79 80 81 82 85
Student Yr3: 60 60 60 60 70 70 70 70 75 70
Hypotheses:
Ho: μ1 = μ2 = μ3
HA: At least one pair of means is not equal (it can be μ1 ≠ μ2 = μ3, etc.)
Transfer the data into PASW. Remember, since this is an independent test, all samples are recorded in the same column.
Variable view
Transfer the data from the test conducted in “Data View”.
Since this is an independent test, the scores of all samples go in the same column (“score”), and a second column (in this example labeled “year”) identifies the group.
In a repeated measure test we use a different column for every variable.
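The single-column layout can be sketched outside PASW as follows. The names “score” and “year” follow the menu steps in this transcript; the Python structure itself is only an illustration, not PASW's file format.

```python
# Scores from the MSU example, one list per year group.
groups = {
    "Year 1": [60, 55, 45, 50, 55, 60, 70, 45, 35, 35],
    "Year 2": [65, 60, 70, 75, 70, 78, 79, 80, 81, 82, 85],
    "Year 3": [60, 60, 60, 60, 70, 70, 70, 70, 75, 70],
}

# Independent design: every student is one row with a "year" and a "score".
long_rows = [(year, score) for year, scores in groups.items() for score in scores]

print(len(long_rows), "rows (one per student)")
for year, score in long_rows[:3]:
    print(year, score)
```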
ANALYSIS 1
Normality test
MENU
1. Analyze
2. Descriptive Statistics
3. Explore
4. Dependent = score
5. Factor = year
6. Plots
7. Normality plot
Case Processing Summary (Knowledge, by MSU Year)
            Valid             Missing           Total
MSU Year    N     Percent     N     Percent     N     Percent
Year 1      10    100.0%      0     .0%         10    100.0%
Year 2      11    100.0%      0     .0%         11    100.0%
Year 3      10    100.0%      0     .0%         10    100.0%
Descriptives (Knowledge, by MSU Year)
Statistic                       Year 1      Year 2      Year 3
Mean                            51.000      75.000      66.500
Std. Error of Mean              3.5590      2.3549      1.8333
95% CI for Mean, Lower Bound    42.949      69.753      62.353
95% CI for Mean, Upper Bound    59.051      80.247      70.647
5% Trimmed Mean                 50.833      75.278      66.389
Median                          52.500      78.000      70.000
Variance                        126.667     61.000      33.611
Std. Deviation                  11.2546     7.8102      5.7975
Minimum                         35.0        60.0        60.0
Maximum                         70.0        85.0        75.0
Range                           35.0        25.0        15.0
Interquartile Range             17.5        11.0        10.0
Skewness                        -.018       -.731       -.192
Std. Error of Skewness          .687        .661        .687
Kurtosis                        -.563       -.396       -1.806
Std. Error of Kurtosis          1.334       1.279       1.334
Tests of Normality (Knowledge, by MSU Year)
            Kolmogorov-Smirnov(a)         Shapiro-Wilk
MSU Year    Statistic   df    Sig.        Statistic   df    Sig.
Year 1      .139        10    .200*       .952        10    .695
Year 2      .195        11    .200*       .931        11    .424
Year 3      .327        10    .003        .770        10    .006
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
The S-W test showed that Year 1 and Year 2 were normal (p > 0.05) but Year 3 was not (p = .006).
So, check Year 3 skewness: skewness = -.192, SE = .687.
The skewness lies within ± 2 SE (± 1.374), so Year 3 is treated as normal and we can use ANOVA.
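The skewness check (and the coefficient of variation check) can be reproduced by hand. This is a sketch assuming the adjusted sample skewness and its usual standard error formula, which match the Year 3 figures in the Descriptives output; the function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def skewness(xs):
    """Adjusted sample skewness (assumed to be the form in the output above)."""
    n, m, s = len(xs), mean(xs), stdev(xs)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)


def skewness_se(n):
    """Standard error of skewness for a sample of size n."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def cv_percent(xs):
    """Coefficient of variation as a percentage of the mean."""
    return stdev(xs) / mean(xs) * 100


year3 = [60, 60, 60, 60, 70, 70, 70, 70, 75, 70]
g1, se = skewness(year3), skewness_se(len(year3))
print(f"skewness = {g1:.3f}, SE = {se:.3f}, within 2 SE: {abs(g1) <= 2 * se}")
print(f"CV = {cv_percent(year3):.1f}% (rule of thumb: below 30%)")
```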
ANALYSIS 2
The ANOVA test
MENU
1. Analyze
2. Compare means
3. One-way ANOVA
4. Dependent = score
5. Factor = year
6. Post Hoc
7. Tukey
8. Dunnett's T3
9. Options
10. Descriptive
11. Homogeneity
If we have a control group, under Post Hoc choose Dunnett only.
If Levene's test gives p > 0.05, use Tukey. If p < 0.05, use Dunnett's T3.
Remember, we look at Post Hoc only if we reject Ho (i.e. there is at least one pair of means that is not equal).
Test of Homogeneity of Variances (Knowledge)
Levene Statistic    df1    df2    Sig.
2.022               2      28     .151
p > 0.05, so homogeneity is assumed.
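The Levene statistic can be checked by hand. This sketch assumes the mean-centred version of the test (absolute deviations from each group mean); it computes only the statistic, not its p value.

```python
def avg(xs):
    return sum(xs) / len(xs)


def levene_w(*groups):
    """Levene statistic using absolute deviations from each group mean."""
    z = []
    for g in groups:
        m = avg(g)
        z.append([abs(x - m) for x in g])
    n_total, k = sum(len(g) for g in groups), len(groups)
    z_bar = avg([v for zi in z for v in zi])
    z_means = [avg(zi) for zi in z]
    between = sum(len(zi) * (zm - z_bar) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


year1 = [60, 55, 45, 50, 55, 60, 70, 45, 35, 35]
year2 = [65, 60, 70, 75, 70, 78, 79, 80, 81, 82, 85]
year3 = [60, 60, 60, 60, 70, 70, 70, 70, 75, 70]

w = levene_w(year1, year2, year3)
print(f"Levene statistic = {w:.3f} with df1 = 2, df2 = 28")
```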
Knowledge: Tukey HSD(a,b)
                    Subset for alpha = 0.05
MSU Year    N       1          2
Year 1      10      51.000
Year 3      10                 66.500
Year 2      11                 75.000
Sig.                1.000      .079
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 10.313.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
Homogeneous subsets: there are two subsets, subset 1 (Year 1) and subset 2 (Year 3 and Year 2).
So Year 2 and Year 3 do not differ significantly from each other, but both differ significantly from Year 1.
μ1 ≠ μ2 = μ3 (i.e. at least one pair of means is not equal)
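Footnote (a) of the Tukey table can be checked quickly. A small sketch using the three group sizes from this example:

```python
from statistics import harmonic_mean

# Group sizes for Year 1, Year 2 and Year 3.
sizes = [10, 11, 10]

# Harmonic mean sample size used when group sizes are unequal.
n_h = harmonic_mean(sizes)
print(f"harmonic mean sample size = {n_h:.4f}")
```

The result agrees with the 10.313 reported in the footnote after rounding.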
ANALYSIS 3
GLM (General Linear Model) test
The general linear model incorporates a number of different statistical models: ANOVA, ANCOVA, MANOVA, MANCOVA, ordinary linear regression, t-test and F-test. GLM is therefore a more general concept, compared to ANOVA.
MENU
1. Analyze
2. General Linear Model
3. Univariate
Plots: click year to Horizontal Axis first, then click Add.
Post Hoc, Save and Options:
• Observed power (under Options): if p is high (not significant, i.e. Ho is not rejected), look at Observed Power. If the power is high (0.8, i.e. 80%, or more), the decision not to reject Ho is confirmed. If the power is low, the small sample size used in the test is probably the reason Ho was not rejected, and Ho may in fact be false (refer to type II error).
• Estimates of effect size (under Options) returns “Partial Eta Squared”. A value of 0.14 or more means a large effect. Effect size is NOT influenced by sample size (as opposed to the p value, which can be influenced by sample size).
• Cook’s distance (under Save) shows the outliers. The value should be less than 1. A value of more than 1 means an outlier (that can be removed). See Cook’s distance in DATA VIEW under COO_1.
Estimated Marginal Means: MSU Year
Dependent Variable: Knowledge
                                      95% Confidence Interval
MSU Year    Mean      Std. Error      Lower Bound    Upper Bound
Year 1      51.000    2.707           45.454         56.546
Year 2      75.000    2.581           69.712         80.288
Year 3      66.500    2.707           60.954         72.046
Tests of Between-Subjects Effects
Dependent Variable: Knowledge
Source             Type III Sum of Squares    df    Mean Square    F           Sig.    Partial Eta Squared    Noncent. Parameter    Observed Power(b)
Corrected Model    3075.242(a)                2     1537.621       20.976      .000    .600                   41.952                1.000
Intercept          127380.859                 1     127380.859     1737.717    .000    .984                   1737.717              1.000
year               3075.242                   2     1537.621       20.976      .000    .600                   41.952                1.000
Error              2052.500                   28    73.304
Total              134160.000                 31
Corrected Total    5127.742                   30
a. R Squared = .600 (Adjusted R Squared = .571)
b. Computed using alpha = .05
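The year row of this table can be reproduced from the raw scores. This is a minimal sketch of the one-way sums of squares, F and partial eta squared; it does not compute the p value or the observed power.

```python
def one_way_anova(*groups):
    """Return sums of squares, F and partial eta squared for a one-way design."""
    n_total, k = sum(len(g) for g in groups), len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return ss_between, ss_within, f_value, eta_sq


year1 = [60, 55, 45, 50, 55, 60, 70, 45, 35, 35]
year2 = [65, 60, 70, 75, 70, 78, 79, 80, 81, 82, 85]
year3 = [60, 60, 60, 60, 70, 70, 70, 70, 75, 70]

ss_b, ss_w, f_value, eta_sq = one_way_anova(year1, year2, year3)
print(f"SS year = {ss_b:.3f}, SS error = {ss_w:.3f}")
print(f"F(2, 28) = {f_value:.3f}, partial eta squared = {eta_sq:.3f}")
```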
Estimates of Effect Size (Partial Eta Squared) and Observed Power
This refers to unweighted means. This is important when comparing the means of unequal sample sizes (as in ANOVA), where you take into consideration each mean in proportion to its sample size. Unequal sample sizes can occur, e.g. due to drop-out of participants, which can destroy the random assignment of subjects to conditions, a critical feature of the experimental design.
Cook’s Distance
Cook’s distance should be less than 1. If not, the data point can be excluded.
SPSS 17 (left) and PASW 18 (right) show different results. See row 14.
The actual reading for COO_1 should be 0.00, not 2.53. It seems that there is a bug in PASW 18.
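The expected value for row 14 can be checked by hand. This sketch assumes the textbook Cook's distance formula for a one-way design, where each case's leverage is 1 divided by its group size; PASW's own computation may differ in detail.

```python
def cooks_distances(groups):
    """Cook's distance per case, in row order, for a one-way (group means) model."""
    n_total, p = sum(len(g) for g in groups), len(groups)
    means = [sum(g) / len(g) for g in groups]
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = sse / (n_total - p)
    distances = []
    for g, m in zip(groups, means):
        h = 1 / len(g)  # leverage of every case in a group of this size
        distances.extend((x - m) ** 2 / (p * mse) * h / (1 - h) ** 2 for x in g)
    return distances


year1 = [60, 55, 45, 50, 55, 60, 70, 45, 35, 35]
year2 = [65, 60, 70, 75, 70, 78, 79, 80, 81, 82, 85]
year3 = [60, 60, 60, 60, 70, 70, 70, 70, 75, 70]

d = cooks_distances([year1, year2, year3])
print(f"row 14: {d[13]:.2f}")  # row 14 is the 4th Year 2 student (score 75)
print(f"largest value: {max(d):.3f}")
```

Row 14 scores exactly the Year 2 mean, so its residual is zero and its Cook's distance is 0.00, consistent with the SPSS 17 reading.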
Profile Plot
The profile plot can be included in the thesis results.
One-way Repeated Measure ANOVA
1-WAY ANOVA: independent ANOVA, repeated measure ANOVA
We have looked at 1-way independent ANOVA.
Now we look at 1-way repeated measure ANOVA.
In repeated measure, we repeat the test on the SAME sample but at DIFFERENT time intervals.
The data for each time or day must be put in DIFFERENT COLUMNS in PASW (one variable per time or day).
In this test, we are not concerned about homogeneity. Rather, we are concerned about sphericity (Mauchly’s Test of Sphericity). A Sig. value (p) > 0.05 for Mauchly's W shows sphericity.
For pairwise comparison (Post Hoc), we do not use Tukey or Dunnett but the Bonferroni test.
[If Sig. > 0.05, read the Sphericity Assumed row. If Sig. < 0.05, read the Greenhouse-Geisser row]
A study is carried out to determine whether knowledge of the Vision and Mission of the university differs on different days among first-year students of The Management and Science University (MSU).
Knowledge Score on Vision and Mission of MSU
Sunday: 60 55 45 50 55 60 70 45 35 35 65
Monday: 60 55 45 50 55 82 85 60 60 60 60
Friday: 85 60 60 60 60 70 70 70 70 75 70
Hypotheses:
Ho: μ1 = μ2 = μ3
HA: At least one pair of means is not equal (it can be μ1 ≠ μ2 = μ3, etc.)
Transfer the data into PASW. Remember, since this is a repeated measure test, all samples are recorded in different columns.
Variable view
Transfer the data from the test conducted in “Data View”. In a repeated measure test we use a different column for every variable.
MENU
1. Analyze
2. General Linear Model
3. Repeated Measures
4. Factor
5. Number of levels = 3
6. Define
7. (Move all knowledge to right)
8. Options
9. Compare main effects
10. Select Bonferroni (see picture on right)
11. Descriptive
12. Estimates
13. Observed power
14. Save
15. Cook’s distance
16. Plots
17. Move factor1 to Horizontal Axis
18. Add
19. Continue
(Screenshots: Step 7; Steps 9 to 13)
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
                                                                            Epsilon(a)
Within Subjects Effect    Mauchly's W    Approx. Chi-Square    df    Sig.    Greenhouse-Geisser    Huynh-Feldt    Lower-bound
factor1                   .961           .359                  2     .835    .962                  1.000          .500
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept. Within Subjects Design: factor1
[If the Sig. value of Mauchly's test > 0.05, read the Sphericity Assumed row. If Sig. < 0.05, read the Greenhouse-Geisser row]
Tests of Within-Subjects Effects
Measure: MEASURE_1
Source            Correction            Type III Sum of Squares    df        Mean Square    F        Sig.    Partial Eta Squared    Noncent. Parameter    Observed Power(a)
factor1           Sphericity Assumed    1397.515                   2         698.758        9.347    .001    .483                   18.694                .957
factor1           Greenhouse-Geisser    1397.515                   1.925     726.116        9.347    .002    .483                   17.990                .951
factor1           Huynh-Feldt           1397.515                   2.000     698.758        9.347    .001    .483                   18.694                .957
factor1           Lower-bound           1397.515                   1.000     1397.515       9.347    .012    .483                   9.347                 .787
Error(factor1)    Sphericity Assumed    1495.152                   20        74.758
Error(factor1)    Greenhouse-Geisser    1495.152                   19.246    77.685
Error(factor1)    Huynh-Feldt           1495.152                   20.000    74.758
Error(factor1)    Lower-bound           1495.152                   10.000    149.515
a. Computed using alpha = .05
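The Sphericity Assumed row can be reproduced from the raw scores. A minimal sketch of the one-way repeated measure partition (factor, subjects, error); it gives the sums of squares and F only, not the p value.

```python
def repeated_measures_anova(conditions):
    """conditions: equal-length score lists, one per day, same subject order."""
    k, n = len(conditions), len(conditions[0])
    grand = sum(sum(c) for c in conditions) / (k * n)
    cond_means = [sum(c) / n for c in conditions]
    subj_means = [sum(c[i] for c in conditions) / k for i in range(n)]
    ss_factor = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for c in conditions for x in c)
    ss_error = ss_total - ss_factor - ss_subjects
    df_factor, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_factor / df_factor) / (ss_error / df_error)
    return ss_factor, ss_error, f_value


sunday = [60, 55, 45, 50, 55, 60, 70, 45, 35, 35, 65]
monday = [60, 55, 45, 50, 55, 82, 85, 60, 60, 60, 60]
friday = [85, 60, 60, 60, 60, 70, 70, 70, 70, 75, 70]

ss_factor, ss_error, f_value = repeated_measures_anova([sunday, monday, friday])
eta_sq = ss_factor / (ss_factor + ss_error)
print(f"SS factor1 = {ss_factor:.3f}, SS error = {ss_error:.3f}")
print(f"F(2, 20) = {f_value:.3f}, partial eta squared = {eta_sq:.3f}")
```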
Look at Mauchly’s test.
In this example, W = 0.961 with Sig. = 0.835.
Since Sig. > 0.05, sphericity is assumed, so we read the Sphericity Assumed row, not the Greenhouse-Geisser row.
See that the Observed Power is high (0.957).
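The row choice and the effect of the epsilon correction can be sketched as follows. The decision uses the Sig. value of Mauchly's test; the function names are illustrative, and the epsilon value is the rounded one from the output, so the corrected df differ slightly in the last digit from the table.

```python
def choose_row(mauchly_sig, alpha=0.05):
    """Pick which row of the within-subjects table to read."""
    return "Sphericity Assumed" if mauchly_sig > alpha else "Greenhouse-Geisser"


def corrected_df(df_effect, df_error, epsilon):
    """A correction multiplies both degrees of freedom by epsilon."""
    return df_effect * epsilon, df_error * epsilon


print(choose_row(0.835))
df1, df2 = corrected_df(2, 20, 0.962)
print(f"Greenhouse-Geisser df would be about {df1:.3f} and {df2:.3f}")
```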
Pairwise Comparisons here is the Bonferroni test.
1 and 2 are not significantly different (p = 0.092).
1 and 3 are significantly different (p = 0.008).
2 and 3 are not significantly different (p = 0.209).
So we reject Ho because at least one pair of means is not equal.
Pairwise Comparisons
Measure: MEASURE_1
                                                                             95% Confidence Interval for Difference(a)
(I) factor1    (J) factor1    Mean Difference (I-J)    Std. Error    Sig.(a)    Lower Bound    Upper Bound
1              2              -8.818                   3.508         .092       -18.886        1.250
1              3              -15.909*                 4.035         .008       -27.490        -4.328
2              1              8.818                    3.508         .092       -1.250         18.886
2              3              -7.091                   3.491         .209       -17.112        2.930
3              1              15.909*                  4.035         .008       4.328          27.490
3              2              7.091                    3.491         .209       -2.930         17.112
Based on estimated marginal means
a. Adjustment for multiple comparisons: Bonferroni.
*. The mean difference is significant at the .05 level.
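The mean differences and standard errors in this table can be reproduced from the paired differences. This sketch also shows the Bonferroni adjustment itself (raw p multiplied by the number of comparisons, capped at 1); it does not compute the raw p values, and the helper names are illustrative.

```python
from itertools import combinations
from math import sqrt

days = {
    1: [60, 55, 45, 50, 55, 60, 70, 45, 35, 35, 65],  # Sunday
    2: [60, 55, 45, 50, 55, 82, 85, 60, 60, 60, 60],  # Monday
    3: [85, 60, 60, 60, 60, 70, 70, 70, 70, 75, 70],  # Friday
}


def paired_summary(a, b):
    """Mean difference (a - b) and its standard error for paired scores."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    sd = sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d, sd / sqrt(n)


def bonferroni(p_raw, n_comparisons):
    """Bonferroni-adjusted p value."""
    return min(1.0, p_raw * n_comparisons)


summaries = {}
for i, j in combinations(days, 2):
    summaries[(i, j)] = paired_summary(days[i], days[j])
    mean_d, se = summaries[(i, j)]
    print(f"{i} vs {j}: mean difference = {mean_d:.3f}, SE = {se:.3f}")
```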