Analysis of variance (ANOVA)
Statistics for Marketing & Consumer Research. Copyright © 2008 - Mario Mazzocchi
Analysis of variance (ANOVA)
(from Chapter 7)
Tests on multiple hypotheses
• Consider the situation where the means for more than two groups are compared, e.g. mean alcohol expenditure for: (a) students; (b) unemployed; (c) employees
• One could run a set of two mean comparison tests (students vs. unemployed, students vs. employed, employed vs. unemployed)
• But this produces too many results to interpret, and every additional test raises the chance of at least one false rejection
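A minimal sketch of why the pairwise route scales badly. The significance level of 0.05 and the independence of the tests are assumptions made here for illustration; pairwise tests on the same sample are not strictly independent, so the error rate shown is only indicative.

```python
from itertools import combinations

groups = ["students", "unemployed", "employees"]  # groups named in the slide
alpha = 0.05  # assumed significance level for each single test

pairs = list(combinations(groups, 2))
m = len(pairs)

# Probability of at least one false rejection across m tests,
# under the simplifying assumption that the tests are independent.
familywise = 1 - (1 - alpha) ** m

for a, b in pairs:
    print(f"{a} vs. {b}")
print(f"{m} tests, familywise error rate about {familywise:.3f}")
```

With six groups, as in the economic-position example of this deck, the same count gives 15 pairwise tests.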
Analysis of Variance
• It is an alternative approach to mean comparison for multiple groups
• It is applicable to a sample of individuals that differ in one or more given factors
• It allows testing whether variability in a variable is attributable to one (or more) factors
Example
EFS: Total Alcoholic Beverages, Tobacco, by economic position of Household Reference Person

Economic position of HRP      Mean     St. Dev.
Self-employed                 18.56    19.0
Fulltime employee             14.64    18.5
Pt employee                   12.39    15.0
Unempl.                       19.48    19.7
Ret unoc over min ni age       7.34    14.6
Unoc - under min ni age       11.99    19.1
TOTAL                         12.67    17.8

Are there significant differences across the means of these groups?
Or do the differences depend on the different levels of variability across the groups?
Analysis of Variance
• Here, the target variable is expenditure on alcoholic beverages and tobacco; the factor is the economic position of the HRP
One-way ANOVA
• Only one categorical variable (a single factor)
• Several levels (categories) for that factor
• The typical hypothesis tested through ANOVA is that the factor is irrelevant in explaining differences in the target variable (i.e. the means are equal, as in bivariate mean comparisons/t-tests)
• Apart from the tested factor(s), it should be safe to consider the groups homogeneous with one another
Null and alternative hypothesis for ANOVA
• Null hypothesis (H0): all the means are equal
• Alternative hypothesis (H1): at least two means are different
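In symbols, the two hypotheses can be written as follows. The notation is introduced here for illustration and is not from the slides: \mu_j is the population mean of group j and g is the number of groups (six in the economic-position example).

```latex
H_0:\ \mu_1 = \mu_2 = \dots = \mu_g
\qquad \text{vs.} \qquad
H_1:\ \mu_i \neq \mu_j \ \text{for at least one pair } i \neq j
```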
Measuring and decomposing the total variation
VARIATION BETWEEN THE GROUPS + VARIATION WITHIN EACH GROUP = TOTAL VARIATION
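The decomposition can be checked numerically with sums of squared deviations. The sketch below uses hypothetical group labels and expenditure values (not data from the slides) and only the Python standard library.

```python
# Hypothetical expenditure values for three groups (illustrative only).
groups = {
    "A": [12.0, 15.0, 11.0, 14.0],
    "B": [20.0, 22.0, 19.0],
    "C": [8.0, 9.0, 7.0, 10.0, 6.0],
}


def mean(xs):
    return sum(xs) / len(xs)


values = [x for xs in groups.values() for x in xs]
grand_mean = mean(values)

# Between groups: squared distance of each group mean from the grand mean,
# weighted by the group size.
ss_between = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())

# Within groups: squared distance of each observation from its own group mean.
ss_within = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)

# Total: squared distance of each observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in values)

print(ss_between, ss_within, ss_total)
```

Up to floating-point rounding, the first two printed values add up to the third, whatever the data.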
The basic principle of the ANOVA:
If the variation between the groups, which is explained by the factor, is significantly larger than the variation within the groups, then the factor is assumed to be statistically relevant in explaining the differences
The test statistic:
• The test statistic is computed as:
$$F = \frac{s_B^2}{s_W^2} = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$
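For reference, the two variances in the ratio are the sums of squares divided by their degrees of freedom. The symbols below are notation assumed here, not taken from the slides: g groups, n observations in total, n_j observations in group j, group means \bar{y}_j and grand mean \bar{y}.

```latex
s_B^2 = \frac{\sum_{j=1}^{g} n_j \,(\bar{y}_j - \bar{y})^2}{g - 1},
\qquad
s_W^2 = \frac{\sum_{j=1}^{g} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2}{n - g},
\qquad
F = \frac{s_B^2}{s_W^2} \sim F_{g-1,\; n-g} \ \text{under } H_0
```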
Distribution of the F-statistic (one-tailed test)
[Figure: density of the F distribution with the rejection area marked]
If p < 0.05 we reject H0, i.e. the means are not all equal
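The p-value is the area under the F density to the right of the observed statistic. The sketch below approximates it with the standard library only, using the identity between the upper tail of the F distribution and the regularized incomplete beta function. The helper name `f_sf`, the Simpson step count and the 0.05 threshold are choices made here; the inputs F = 4.024 with 5 and 494 degrees of freedom are the values shown on the SPSS output slide of this deck. In practice a statistics library function such as `scipy.stats.f.sf` is the usual route.

```python
import math


def f_sf(x, d1, d2, steps=20000):
    """Approximate P(F > x) for an F(d1, d2) distribution.

    Uses P(F > x) = I_z(d2/2, d1/2) with z = d2 / (d2 + d1 * x), where I is
    the regularized incomplete beta function, integrated by Simpson's rule.
    Assumes x > 0, d2 > 2 and an even number of steps.
    """
    a, b = d2 / 2.0, d1 / 2.0
    z = d2 / (d2 + d1 * x)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Beta(a, b) density; it is zero at t = 0 because a > 1 is assumed.
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = z / steps
    total = density(0.0) + density(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


p_value = f_sf(4.024, 5, 494)
alpha = 0.05  # assumed significance level
print(f"p = {p_value:.4f}; reject H0: {p_value < alpha}")
```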
ANOVA in SPSS
[Screenshot: SPSS one-way ANOVA dialog, with the target variable and the factor indicated]
SPSS output
ANOVA
EFS: Total Alcoholic Beverages, Tobacco

                  Sum of Squares    df     Mean Square    F        Sig.
Between Groups    6171.784          5      1234.357       4.024    .001
Within Groups     151535.3          494    306.752
Total             157707.1          499

Sum of Squares: variation decomposition
df: degrees of freedom
Mean Square: variance between (first row) and variance within (second row)
Sig.: p-value < 0.05, so the null is rejected
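The mean squares and the F statistic in the output follow directly from the sums of squares and the degrees of freedom. A short check using only the figures reported in the table (variable names are chosen here):

```python
# Figures copied from the SPSS ANOVA table.
ss_between, df_between = 6171.784, 5
ss_within, df_within = 151535.3, 494

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F = variance between groups / variance within groups.
f_stat = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
# → 1234.357 306.752 4.024
```

The between-groups degrees of freedom equal the number of groups minus one (6 - 1 = 5), and the two rows add up to the total of 499.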
Post-hoc tests
• They open the way to further explore the sources of variability when the null hypothesis of mean equality is rejected.
• It is usually relevant to understand which particular means are different from each other.
Some post-hoc tests
• LSD (least significant difference)
• Duncan test
• Tukey's test
• Scheffé test
• Bonferroni post-hoc method
• ...
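As an illustration of the simplest method in the list, the Bonferroni approach divides the overall significance level by the number of pairwise comparisons (equivalently, it multiplies each p-value by that number). The sketch assumes an overall level of 0.05 and uses hypothetical unadjusted p-values; they are not results from the slides.

```python
import math

n_groups = 6   # six economic-position groups in the example
alpha = 0.05   # assumed overall significance level

# Number of distinct pairs of groups.
n_pairs = math.comb(n_groups, 2)
alpha_per_test = alpha / n_pairs

# Hypothetical unadjusted p-values for three of the pairwise comparisons.
raw_p = {"pair 1": 0.002, "pair 2": 0.020, "pair 3": 0.400}

for label, p in raw_p.items():
    adjusted = min(1.0, p * n_pairs)  # Bonferroni-adjusted p-value, capped at 1
    print(label, round(adjusted, 3), "significant" if adjusted < alpha else "not significant")
```

Only the first hypothetical comparison stays significant after adjustment, which shows how conservative the correction becomes as the number of groups grows.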
ANOVA assumptions
Two key assumptions are needed to run analysis of variance safely:
1) that the sub-samples defined by the treatment are independent
2) that no big discrepancies exist in the variances of the different sub-samples
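A quick descriptive look at the second assumption, using the standard deviations reported in the example table of this deck. The ratio of the largest to the smallest group variance is only an informal indicator computed here for illustration; a formal check would use a homogeneity-of-variance test such as Levene's test.

```python
# Standard deviations from the example table, by economic position of the HRP.
std_devs = {
    "Self-employed": 19.0,
    "Fulltime employee": 18.5,
    "Pt employee": 15.0,
    "Unempl.": 19.7,
    "Ret unoc over min ni age": 14.6,
    "Unoc - under min ni age": 19.1,
}

variances = {group: sd ** 2 for group, sd in std_devs.items()}

largest = max(variances, key=variances.get)
smallest = min(variances, key=variances.get)
ratio = variances[largest] / variances[smallest]

print(f"largest: {largest}, smallest: {smallest}, variance ratio: {ratio:.2f}")
```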
Multi-way (factorial) analysis of variance
• This analysis measures the influence of two or more factors
• Besides the influence of each individual factor, it tests interactions between treatments belonging to different factors
• ANOVA with more than two factors is rarely employed, as interpretation of results becomes quite complex
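For two factors, the usual way to write the model behind this analysis is sketched below. The notation is assumed here and is not from the slides: y_{ijk} is observation k at level i of the first factor and level j of the second, \alpha_i and \beta_j are the main effects, (\alpha\beta)_{ij} is the interaction, and \varepsilon_{ijk} is the error term.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
```

Each block of terms gets its own F-test: one for each main effect and one for the interaction.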