Post on 15-Jan-2016
PSY 307 – Statistics for the Behavioral Sciences
Chapter 19 – Chi-Square Test for Qualitative Data
Chapter 21 – Deciding Which Test to Use
Chi-Square (χ²) Test
For qualitative data.
Tests whether observed frequencies are closely similar to hypothesized expected frequencies.
Expected frequencies can be probabilities determined by chance or other values based on theory.
Two Tests
One-way (one variable) chi-square: Tests observed frequencies against a null hypothesis of equal or specified proportions.
Two-way (two variable) chi-square: Tests observed frequencies against specified proportions across all cells of two cross-classified variables.
Another way of saying this is that it tests for an interaction.
Frequencies
Observed frequencies – the obtained frequency for each category in a study.
Expected frequencies – the hypothesized frequency for each category given a true null hypothesis.
Calculating Chi-Square (χ²)
Determine the expected frequencies.
Are the differences between the expected and the observed frequencies large enough to qualify as a rare outcome?
Calculate the χ² ratio.
Compare against the χ² table with appropriate degrees of freedom.
Blood Type Example
                 Blood Type
Frequency        O     A     B     AB    Total
Observed (fo)    38    38    20    4     100
Expected (fe)    44    41    10    5     100
H0: P_O = .44, P_A = .41, P_B = .10, P_AB = .05
H1: H0 is false
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
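Calculating χ²
$$
\begin{aligned}
\chi^2 &= \sum \frac{(f_o - f_e)^2}{f_e} \\
&= \frac{(38 - 44)^2}{44} + \frac{(38 - 41)^2}{41} + \frac{(20 - 10)^2}{10} + \frac{(4 - 5)^2}{5} \\
&= \frac{(-6)^2}{44} + \frac{(-3)^2}{41} + \frac{(10)^2}{10} + \frac{(-1)^2}{5} \\
&= \frac{36}{44} + \frac{9}{41} + \frac{100}{10} + \frac{1}{5} \\
&= 0.82 + 0.22 + 10.00 + 0.20 \\
&= 11.24
\end{aligned}
$$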
df = categories (c) - 1 = 4 - 1 = 3
Chi-Square Distribution
Chi Square Table
Look up the critical value for our df (c-1) and significance level (e.g., p < .05).
Is 11.24 greater than 7.81?
If yes, reject the null hypothesis. Conclude blood types are not distributed as in the general population.
Reject H0
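The blood type calculation can be reproduced in a few lines. This is only a sketch in plain Python: the variable names are illustrative, and the critical value 7.81 is copied from the slides rather than computed.

```python
# Observed counts and hypothesized proportions from the blood type example.
observed = {"O": 38, "A": 38, "B": 20, "AB": 4}
proportions = {"O": 0.44, "A": 0.41, "B": 0.10, "AB": 0.05}

n = sum(observed.values())  # total sample size

# Expected frequency for each category: hypothesized proportion times n.
expected = {k: proportions[k] * n for k in observed}

# Chi-square: squared difference divided by the expected frequency,
# summed over all categories.
chi_square = sum(
    (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
)

df = len(observed) - 1   # categories - 1
critical_value = 7.81    # taken from the slides (df = 3, .05 level)

reject_null = chi_square > critical_value
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```

Most of the total comes from blood type B, where the observed count (20) is double the expected count (10).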
About χ²
Because differences from expected values are squared, the value of χ² cannot be negative.
Because differences are squared, the χ² test is nondirectional.
A significant χ² is not necessarily due to big differences; small ones can add up.
Two-Way χ²
When observations are cross-classified according to two variables, a two-way test is used.
The two-way test examines the relationship between two variables. It is a test of independence between them.
Null hypothesis: independence.
Alternative hypothesis: H0 is false.
Returned Letter Example
                    Neighborhood
Returned Letters    Downtown    Suburbia    Campus    Total
Yes                 41          32          47        120
No                  19          38          23        80
Total               60          70          70        200
H0: Type of neighborhood and return rate of lost letters are independent.
H1: H0 is false.
Calculating Expected Frequencies
                    Neighborhood
Returned Letters    Downtown    Suburbia    Campus    Total
Yes    fo           41          32          47        120
       fe           36          42          42
No     fo           19          38          23        80
       fe           24          28          28
Total               60          70          70        200
$$f_e = \frac{(\text{column total})(\text{row total})}{\text{grand total}}$$
$$f_e = \frac{(60)(120)}{200} = \frac{7200}{200} = 36$$
$$f_e = \frac{(70)(120)}{200} = \frac{8400}{200} = 42$$
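The same rule, column total times row total divided by the grand total, fills in every cell of the table. The sketch below assumes the table is stored as a list of rows; the names are illustrative.

```python
# Observed frequencies from the returned letter example.
observed = [
    [41, 32, 47],  # returned: Downtown, Suburbia, Campus
    [19, 38, 23],  # not returned
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell:
# (column total)(row total) / grand total.
expected = [
    [c * r / grand_total for c in column_totals]
    for r in row_totals
]

for row in expected:
    print(row)
```

The expected frequencies in each row and column add up to the same totals as the observed frequencies.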
Calculating Two-Way χ²
Expected frequencies are based on the proportions found in the column and row totals.
Degrees of freedom are limited by the column and row totals.
Once expected frequencies and df have been found, calculate χ² the same as in a one-way test.
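Calculating χ²
$$
\begin{aligned}
\chi^2 &= \sum \frac{(f_o - f_e)^2}{f_e} \\
&= \frac{(41 - 36)^2}{36} + \frac{(32 - 42)^2}{42} + \frac{(47 - 42)^2}{42} + \frac{(19 - 24)^2}{24} + \frac{(38 - 28)^2}{28} + \frac{(23 - 28)^2}{28} \\
&= 0.69 + 2.38 + 0.60 + 1.04 + 3.57 + 0.89 \\
&= 9.17
\end{aligned}
$$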
df = (columns – 1)(rows – 1)
df = (3-1)(2-1) = 2
From the Chi Square Table, critical value is 5.99.
Our value of 9.17 exceeds 5.99 so reject the null. There is a relationship between neighborhood and letter return rate.
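The two-way calculation for the returned letter example can be sketched as a small function. The function name and the list-of-rows layout are assumptions made for illustration, and the critical value 5.99 is copied from the slides. Without intermediate rounding the statistic comes out near 9.18; the 9.17 on the slides results from summing terms already rounded to two decimals, and the decision is the same either way.

```python
def two_way_chi_square(observed):
    """Return (chi_square, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected frequency under independence.
            f_e = column_totals[j] * row_totals[i] / grand_total
            chi_square += (f_o - f_e) ** 2 / f_e

    df = (len(column_totals) - 1) * (len(row_totals) - 1)
    return chi_square, df


letters = [
    [41, 32, 47],  # returned: Downtown, Suburbia, Campus
    [19, 38, 23],  # not returned
]
chi_square, df = two_way_chi_square(letters)
critical_value = 5.99  # taken from the slides (df = 2, .05 level)
reject_null = chi_square > critical_value
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```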
Effect Size for χ²
Squared Cramér's phi coefficient (φc²)
Roughly estimates the proportion of explained variance (predictability) between two qualitative variables.
.01 = small effect
.09 = medium effect
.25 = large effect
$$\phi_c^2 = \frac{\chi^2}{n(k - 1)}$$
where n is the total sample size and k is the smaller of the number of rows or columns
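As an illustration, the effect size formula can be applied to the returned letter example (χ² = 9.17, n = 200, 2 rows, 3 columns). The function name is hypothetical; the formula is the one stated above, χ² divided by n(k - 1).

```python
def cramers_phi_squared(chi_square, n, rows, columns):
    """Squared Cramer's phi: chi-square / (n * (k - 1))."""
    k = min(rows, columns)  # smaller of the number of rows or columns
    return chi_square / (n * (k - 1))


phi_c_squared = cramers_phi_squared(chi_square=9.17, n=200, rows=2, columns=3)
print(f"squared Cramer's phi = {phi_c_squared:.3f}")
```

The result is about .05, which falls between the small (.01) and medium (.09) benchmarks.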
Precautions
Observations must be independent of each other. One observation per subject.
Avoid small expected frequencies – must be 5 or more.
Avoid small sample sizes – increases danger of Type II error (retaining a false null hypothesis).
Avoid very large sample sizes – with enough observations, even trivially small differences become statistically significant.
A Repertoire of Hypothesis Tests
z-test – for use with normal distributions when σ is known.
t-test – for use with one or two groups, when σ is unknown.
F-test (ANOVA) – for comparing means for multiple groups.
Chi-square test – for use with qualitative data.
Null and Alternative Hypotheses
How you write the null and alternative hypothesis varies with the design of the study – so does the type of statistic.
Which table you use to find the critical value depends on the test statistic (t, F, χ², U, T, H).
t and z tests can be directional.
Deciding Which Test to Use
Are the data qualitative or quantitative? If qualitative, use chi-square.
How many groups are there? If two, use a t-test; if more, use ANOVA.
Is the design within or between subjects?
How many independent variables (IVs or factors) are there?
Summary of t-tests
Single-group t-test – for one sample compared to a population mean.
Independent sample t-test – for comparing two groups in a between-subject design.
Paired (matched) sample t-test – for comparing two groups in a within-subject design.
Summary of ANOVA Tests
One-way ANOVA – for one IV, independent samples
Repeated Measures ANOVA – for one or more IVs where samples are repeated, matched or paired.
Two-way (factorial) ANOVA – for two or more IVs, independent samples.
Mixed ANOVA – for two or more IVs, between and within subjects.
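The decision questions and the t-test and ANOVA summaries above can be condensed into a rough lookup. This is only a sketch: the function `choose_test` and its parameters are hypothetical, it ignores the nonparametric alternatives, and it assumes the design is labeled "between", "within", or "mixed".

```python
def choose_test(qualitative, groups=2, design="between", factors=1):
    """Suggest a test from the type of data, groups, design, and number of IVs."""
    # Qualitative data: chi-square, one-way or two-way by number of variables.
    if qualitative:
        return "two-way chi-square" if factors >= 2 else "one-way chi-square"

    # Two or more IVs: some form of ANOVA, depending on the design.
    if factors >= 2:
        if design == "mixed":
            return "mixed ANOVA"
        if design == "within":
            return "repeated measures ANOVA"
        return "two-way (factorial) ANOVA"

    # One IV: the number of groups decides between t-tests and ANOVA.
    if groups == 1:
        return "single-group t-test"
    if groups == 2:
        if design == "within":
            return "paired sample t-test"
        return "independent sample t-test"
    if design == "within":
        return "repeated measures ANOVA"
    return "one-way ANOVA"


print(choose_test(qualitative=False, groups=2, design="within"))
print(choose_test(qualitative=False, groups=3))
print(choose_test(qualitative=True, factors=2))
```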
Summary of Nonparametric Tests
Two samples, independent groups – Mann-Whitney (U). Like an independent sample t-test.
Two samples, paired, matched or repeated measures – Wilcoxon (T). Like a paired sample t-test.
Three or more samples, independent groups – Kruskal-Wallis (H). Like a one-way ANOVA.
Summary of Qualitative Tests
One-way Chi Square (χ²) – one variable. Tests whether frequencies are distributed across the possible categories in equal or specified proportions.
Two-way Chi Square (χ²) – two variables. Tests whether there is an interaction (relationship) between the two variables.