psy 307 – statistics for the behavioral sciences chapter 19 – chi-square test for qualitative...

PSY 307 – Statistics for the Behavioral Sciences

Chapter 19 – Chi-Square Test for Qualitative Data

Chapter 21 – Deciding Which Test to Use

Chi-Square (χ²) Test

For qualitative data.

Tests whether observed frequencies are closely similar to hypothesized expected frequencies.

Expected frequencies can be probabilities determined by chance or other values based on theory.

Two Tests

One-way (one variable) chi-square: Tests observed frequencies against a null hypothesis of equal or specified proportions.

Two-way (two variable) chi-square: Tests observed frequencies against the proportions expected across all cells of two cross-classified variables.

Another way of saying this is that it tests for an interaction (relationship) between the two variables.

Frequencies

Observed frequencies – the obtained frequency for each category in a study.

Expected frequencies – the hypothesized frequency for each category given a true null hypothesis.

Calculating Chi-Square (χ²)

Determine the expected frequencies.

Are the differences between the expected and the observed frequencies large enough to qualify as a rare outcome?

Calculate the χ² ratio.

Compare against the χ² table with appropriate degrees of freedom.

Blood Type Example

                    Blood Type
Frequency         O     A     B    AB   Total
Observed (fo)    38    38    20     4     100
Expected (fe)    44    41    10     5     100

H0: P_O = .44, P_A = .41, P_B = .10, P_AB = .05

H1: H0 is false

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

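Calculating χ²

$$
\begin{aligned}
\chi^2 &= \sum \frac{(f_o - f_e)^2}{f_e} \\
&= \frac{(38-44)^2}{44} + \frac{(38-41)^2}{41} + \frac{(20-10)^2}{10} + \frac{(4-5)^2}{5} \\
&= \frac{(-6)^2}{44} + \frac{(-3)^2}{41} + \frac{(10)^2}{10} + \frac{(-1)^2}{5} \\
&= \frac{36}{44} + \frac{9}{41} + \frac{100}{10} + \frac{1}{5} \\
&= 0.82 + 0.22 + 10.00 + 0.20 \\
&= 11.24
\end{aligned}
$$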

df = categories (c) - 1
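The blood type calculation can be checked with a few lines of Python. This is a minimal sketch: the function name `chi_square` is a hypothetical helper introduced for illustration, and the frequencies are the ones given in the blood type table.

```python
def chi_square(observed, expected):
    """Return the chi-square statistic: the sum of (fo - fe)^2 / fe."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Blood type frequencies from the example, in the order O, A, B, AB.
observed = [38, 38, 20, 4]
expected = [44, 41, 10, 5]

chi2 = chi_square(observed, expected)
df = len(observed) - 1  # df = categories (c) - 1

print(f"chi-square = {chi2:.2f}, df = {df}")
```

Run as written, this should reproduce the value of 11.24 with 3 degrees of freedom.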

Chi-Square Distribution

Chi Square Table

Look up the critical value for our df (c-1) and significance level (e.g., p < .05).

Is 11.24 greater than 7.81?

If yes, reject the null hypothesis. Conclude blood types are not distributed as in the general population.

Reject H0
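The decision rule can be sketched as a table lookup followed by a comparison. The dictionary below holds standard .05-level critical values rounded to two decimals; only 5.99 and 7.81 appear in this chapter, so the entries for df = 1 and df = 4 are added here as an assumption from the usual chi-square table. The names `CRITICAL_05` and `reject_null` are hypothetical.

```python
# Critical values of chi-square at the .05 level for df = 1..4.
# Only 5.99 (df = 2) and 7.81 (df = 3) are quoted in the chapter itself.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def reject_null(chi2, df, table=CRITICAL_05):
    """Return True when the obtained chi-square exceeds the critical value."""
    return chi2 > table[df]


# Blood type example: obtained 11.24 with df = 3, critical value 7.81.
print(reject_null(11.24, 3))
```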

About χ²

Because differences from expected values are squared, the value of χ² cannot be negative.

Because differences are squared, the χ² test is nondirectional.

A significant χ² is not necessarily due to big differences; small ones can add up.

Two-Way χ²

When observations are cross-classified according to two variables, a two-way test is used.

The two-way test examines the relationship between two variables. It is a test of independence between them.

Null hypothesis: independence.

Alternative hypothesis: H0 is false.

Returned Letter Example

                        Neighborhood
Returned Letters    Downtown   Suburbia   Campus   Total
Yes                       41         32       47     120
No                        19         38       23      80
Total                     60         70       70     200

H0: Type of neighborhood and return rate of lost letters are independent.

H1: H0 is false.

Calculating Expected Frequencies

                        Neighborhood
Returned Letters    Downtown   Suburbia   Campus   Total
Yes       fo              41         32       47     120
          fe              36         42       42
No        fo              19         38       23      80
          fe              24         28       28
Total                     60         70       70     200

$$f_e = \frac{(\text{column total})(\text{row total})}{\text{grand total}}$$

$$f_e = \frac{(60)(120)}{200} = \frac{7200}{200} = 36$$

$$f_e = \frac{(70)(120)}{200} = \frac{8400}{200} = 42$$
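The expected frequency for every cell can be computed in one pass from the row and column totals. This is a minimal sketch; `expected_frequencies` is a hypothetical helper name, and the counts are those of the returned letter example.

```python
def expected_frequencies(table):
    """Return fe = (row total)(column total) / grand total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts; columns are Downtown, Suburbia, Campus.
observed = [
    [41, 32, 47],  # letters returned
    [19, 38, 23],  # letters not returned
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The two printed rows should match the fe rows of the table: 36, 42, 42 and 24, 28, 28.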

Calculating Two-Way χ²

Expected frequencies are based on the proportions found in the column and row totals.

Degrees of freedom are limited by the column and row totals.

Once expected frequencies and df have been found, calculate χ² the same as in a one-way test.

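Calculating χ²

$$
\begin{aligned}
\chi^2 &= \sum \frac{(f_o - f_e)^2}{f_e} \\
&= \frac{(41-36)^2}{36} + \frac{(32-42)^2}{42} + \frac{(47-42)^2}{42} + \frac{(19-24)^2}{24} + \frac{(38-28)^2}{28} + \frac{(23-28)^2}{28} \\
&= 0.69 + 2.38 + 0.60 + 1.04 + 3.57 + 0.89 \\
&= 9.17
\end{aligned}
$$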

df = (columns – 1)(rows – 1)

df = (3 – 1)(2 – 1) = 2

From the Chi Square Table, critical value is 5.99.

Our value of 9.17 exceeds 5.99 so reject the null. There is a relationship between neighborhood and letter return rate.
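The whole two-way calculation, from observed counts to χ² and df, can be sketched as follows. The function name `two_way_chi_square` is hypothetical. Note that summing the unrounded terms gives about 9.18, slightly above the 9.17 obtained by adding the two-decimal terms by hand; the decision against the critical value of 5.99 is the same either way.

```python
def two_way_chi_square(table):
    """Return (chi-square, df) for a table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (fo - fe) ** 2 / fe

    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df


# Returned letter example; columns are Downtown, Suburbia, Campus.
letters = [
    [41, 32, 47],  # returned
    [19, 38, 23],  # not returned
]

chi2, df = two_way_chi_square(letters)
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {chi2 > 5.99}")
```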

Effect Size for χ²

Squared Cramer's Phi Coefficient (φc²)

Roughly estimates the proportion of explained variance (predictability) between two qualitative variables.

.01 = small effect

.09 = medium effect

.25 = large effect

$$\phi_c^2 = \frac{\chi^2}{n(k-1)}$$

where n is the total number of observations and k is the smaller of the number of rows or columns
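As an illustration, the effect size formula can be applied to the returned letter example (χ² = 9.17, 200 letters, 2 rows, 3 columns). The chapter does not work this number out, so treat the result as a sketch; `cramers_phi_squared` is a hypothetical helper name.

```python
def cramers_phi_squared(chi2, n, rows, cols):
    """Return chi-square / (n * (k - 1)), with k the smaller of rows and cols."""
    k = min(rows, cols)
    if k < 2:
        raise ValueError("need at least two rows and two columns")
    return chi2 / (n * (k - 1))


# Returned letter example: chi-square = 9.17, n = 200, 2 rows by 3 columns.
phi2 = cramers_phi_squared(9.17, 200, rows=2, cols=3)
print(f"squared Cramer's phi = {phi2:.2f}")
```

The result, roughly .05, falls between the small (.01) and medium (.09) benchmarks.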

Precautions

Observations must be independent of each other. One observation per subject.

Avoid small expected frequencies – must be 5 or more.

Avoid small sample sizes – they increase the danger of a Type II error (retaining a false null hypothesis).

Avoid very large sample sizes – even trivial departures from the null hypothesis can then reach significance.

A Repertoire of Hypothesis Tests

z-test – for use with normal distributions when σ is known.

t-test – for use with one or two groups, when σ is unknown.

F-test (ANOVA) – for comparing means for multiple groups.

Chi-square test – for use with qualitative data.

Null and Alternative Hypotheses

How you write the null and alternative hypothesis varies with the design of the study – so does the type of statistic.

Which table you use to find the critical value depends on the test statistic (t, F, χ², U, T, H).

t and z tests can be directional.

Deciding Which Test to Use

Are the data qualitative or quantitative? If qualitative, use chi-square.

How many groups are there? If two, use t-tests; if more, use ANOVA.

Is the design within or between subjects?

How many independent variables (IVs or factors) are there?

Summary of t-tests

Single group t-test – for one sample compared to a population mean.

Independent sample t-test – for comparing two groups in a between-subject design.

Paired (matched) sample t-test – for comparing two groups in a within-subject design.

Summary of ANOVA Tests

One-way ANOVA – for one IV, independent samples

Repeated Measures ANOVA – for one or more IVs where samples are repeated, matched or paired.

Two-way (factorial) ANOVA – for two or more IVs, independent samples.

Mixed ANOVA – for two or more IVs, between and within subjects.
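The questions and summaries above can be sketched as a small decision function. This is a simplification that assumes a single independent variable, so it never returns a factorial or mixed ANOVA; the function name `choose_test` and its string arguments are hypothetical.

```python
def choose_test(data_type, n_groups, design="between"):
    """Suggest a test from data type, number of groups, and design (one IV)."""
    if data_type == "qualitative":
        return "chi-square test"
    if data_type != "quantitative":
        raise ValueError("data_type must be 'qualitative' or 'quantitative'")
    if design not in ("between", "within"):
        raise ValueError("design must be 'between' or 'within'")
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")

    if n_groups == 1:
        return "single group t-test"
    if n_groups == 2:
        if design == "between":
            return "independent sample t-test"
        return "paired sample t-test"
    # Three or more groups.
    if design == "between":
        return "one-way ANOVA"
    return "repeated measures ANOVA"


print(choose_test("quantitative", 2, design="within"))
```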

Summary of Nonparametric Tests

Two samples, independent groups – Mann-Whitney (U). Like an independent sample t-test.

Two samples, paired, matched or repeated measures – Wilcoxon (T). Like a paired sample t-test.

Three or more samples, independent groups – Kruskal-Wallis (H). Like a one-way ANOVA.

Summary of Qualitative Tests

One-way Chi Square (χ²) – one variable. Tests whether frequencies are distributed across the possible categories in equal or specified proportions.

Two-way Chi Square – two variables. Tests whether there is an interaction (relationship) between the two variables.
