

Fundamental Statistics in Applied Linguistics Research

Spring 2010, Weekend MA Program on Applied English

Dr. Da-Fu Huang

7. Finding group differences with Chi-Square when all variables are categorical

7.1 Two main uses of the chi-square test
- Test for goodness of fit of the data
- Test for group independence


7.2 Test for goodness of fit
- Only one categorical variable with 2 or more levels of choices
- Tests whether the observed frequencies match the frequencies expected if every choice were equally likely
- Measures how good the fit is to the probabilities that we expect

Desired foreign language at one university (χ² = 8.2, p = .09, df = 4)

Chinese  Spanish  French  German  Japanese
   23       20      15      13       29
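The chi-square value for this survey can be reproduced by hand. The following is a minimal Python sketch, not SPSS output. The function names are illustrative, and the p-value shortcut is an assumption that holds only when df is an even number (as it is here, df = 4).

```python
import math

def chi_square_gof(observed, expected=None):
    """Chi-square goodness-of-fit statistic and df.

    If no expected counts are given, every level is assumed equally likely.
    """
    n = sum(observed)
    if expected is None:
        expected = [n / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

def chi_square_p_even_df(stat, df):
    """Upper-tail p-value of the chi-square distribution (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("the closed form used here needs a positive, even df")
    half = stat / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Desired foreign language at one university
observed = [23, 20, 15, 13, 29]  # Chinese, Spanish, French, German, Japanese
stat, df = chi_square_gof(observed)
p = chi_square_p_even_df(stat, df)
print(f"chi-square = {stat:.1f}, df = {df}, p = {p:.3f}")
```

With 100 respondents and 5 languages, each expected count is 20, so the statistic is (9 + 0 + 25 + 49 + 81) / 20 = 8.2.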


7.3 Test for group independence
- 2 or more categorical variables with 2 or more levels of choices
- Tests whether there is any association between the variables

Desired foreign language at two universities

        Chinese  Spanish  French  German  Japanese  Total
HT U       23       20      15      13       29      100
BC U       14       25      10      26       25      100
Total      37       45      25      39       54      200

7.3 Test for group independence: observed and expected frequencies for the foreign language survey

        Chinese  Spanish  French  German  Japanese  Total
Observed frequencies
HT U       23       20      15      13       29      100
BC U       14       25      10      26       25      100
Total      37       45      25      39       54      200

Expected frequencies (row total × column total / grand total)
        Chinese       Spanish       French        German        Japanese
HT U    (100*37)/200  (100*45)/200  (100*25)/200  (100*39)/200  (100*54)/200
BC U    (100*37)/200  (100*45)/200  (100*25)/200  (100*39)/200  (100*54)/200
HT U    18.5          22.5          12.5          19.5          27
BC U    18.5          22.5          12.5          19.5          27

χ² = Σ [(O – E)² / E] = 8.374, p = .07, df = 4
(df = (rows – 1) × (columns – 1) for a two-way table; df = # levels – 1 for a one-way test)
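The expected frequencies and the chi-square statistic for the two-university survey can be checked with a few lines of code. This is a sketch in plain Python with illustrative function names, not the SPSS procedure. Each expected count is computed as row total × column total / grand total.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Pearson chi-square statistic and df for a two-way table of counts."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Columns: Chinese, Spanish, French, German, Japanese
observed = [
    [23, 20, 15, 13, 29],  # HT U
    [14, 25, 10, 26, 25],  # BC U
]
expected = expected_counts(observed)
stat, df = chi_square_independence(observed)
print(expected[0])
print(f"chi-square = {stat:.3f}, df = {df}")
```

Because both universities contribute 100 respondents, the two rows of expected counts are identical.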

7.4 Situations that look like Chi-square but are not
- Scenario #1: Case study, only one participant (the binomial test)
- Scenario #2: Binary choice, only one variable with exactly 2 levels (the binomial test)
- Scenario #3: Matched pairs with categorical outcome (the McNemar test)
- Scenario #4: Summary over a number of similar items by the same participants

Application activities (8.1.4): pp. 215-216
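For the binary-choice scenario, an exact binomial test can be sketched in a few lines. The counts below are hypothetical and do not come from the course materials. The two-sided rule used here (summing the probability of every outcome no more likely than the observed one) is one common convention, and other software may define the two-sided p-value differently.

```python
from math import comb

def binomial_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial test.

    Sums the probability of every outcome that is no more likely than the
    observed one under the null hypothesis.
    """
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[successes] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))

# Hypothetical data (not from the slides): 15 of 20 learners choose option A
p_value = binomial_test_two_sided(15, 20)
print(f"p = {p_value:.4f}")
```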


7.5 Data inspection: Tables and Crosstabs

7.5.1 Summary tables for goodness-of-fit data
- Analyze > Descriptive Statistics > Frequencies

Student English proficiency

               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Low          9       16.7       16.7              16.7
       Mid         26       48.1       48.1              64.8
       High        19       35.2       35.2             100.0
       Total       54      100.0      100.0
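The percent and cumulative percent columns follow directly from the counts. As a rough sketch outside SPSS, the same summary can be built in Python; the function name and the rebuilt list of responses are illustrative assumptions.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(values, order):
    """Return (level, count, percent, cumulative percent) rows."""
    counts = Counter(values)
    n = sum(counts[level] for level in order)
    cumulative = accumulate(counts[level] for level in order)
    return [
        (level, counts[level], 100 * counts[level] / n, 100 * running / n)
        for level, running in zip(order, cumulative)
    ]

# Rebuild the raw responses from the counts in the frequency table
responses = ["Low"] * 9 + ["Mid"] * 26 + ["High"] * 19
rows = frequency_table(responses, ["Low", "Mid", "High"])
for level, count, pct, cum_pct in rows:
    print(f"{level:<5}{count:>4}{pct:>7.1f}{cum_pct:>7.1f}")
```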



7.5.2 Summary tables for group-independence data (crosstabs)
- Analyze > Descriptive Statistics > Crosstabs
- Move variables into Row, Column, and Layer (when more than 2 variables)

Student English proficiency * Major1 Crosstabulation (Count)

                                    Major1
                                    non-English majors  English majors  Total
Student English proficiency  High            9                 4          13
                             Mid            25                 4          29
                             Low            12                 0          12
Total                                       46                 8          54


7.5.3 Bar plots with one and two categorical variables
- Graphs > Legacy Dialogs > Bar
- With one variable, choose Simple, and Summaries For Groups Of Cases
- With 2 variables, choose Clustered, and Summaries For Groups Of Cases. Put the variables in the “Category Axis” and “Define clusters by” boxes

[Figure: Bar plot with one categorical variable]

[Figures: Bar plots with two categorical variables]


7.6 Assumptions of Chi-Square (pp. 226-228)
- Independence of observations (no repeated measures)
- Nominal data (no inherent rank or order)
- Data are normally distributed (in practice, an expected frequency of at least 5 in every cell)
- Non-occurrences must be included as well as occurrences
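The cell-size assumption can be checked before running the test. The sketch below is plain Python with an illustrative function name. It applies the check to the proficiency-by-major crosstab from section 7.5.2, using expected counts (row total × column total / N), which is what the SPSS footnotes report.

```python
def low_expected_cells(table, threshold=5):
    """Count cells whose expected frequency falls below the threshold."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    n_low = sum(e < threshold for e in expected)
    return n_low, len(expected), min(expected)

# Student English proficiency (rows) by major (columns), from the crosstab
# Columns: non-English majors, English majors
crosstab = [
    [9, 4],    # High
    [25, 4],   # Mid
    [12, 0],   # Low
]
n_low, n_cells, smallest = low_expected_cells(crosstab)
print(f"{n_low} of {n_cells} cells have expected counts below 5 "
      f"(minimum {smallest:.2f})")
```

For this table all three English-major cells have expected counts below 5, so a chi-square result for it should be treated with caution.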

7.7 Chi-Square statistic test

7.7.1 One-way goodness-of-fit Chi-Square in SPSS
- Analyze > Nonparametric Tests > Chi-Square
- Put variable in “Test Variable List” box

Test Statistics

              Student English proficiency
Chi-Square    8.111a
df            2
Asymp. Sig.   .017

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.0.

Student English proficiency

        Observed N  Expected N  Residual
Low          9         18.0       -9.0
Mid         26         18.0        8.0
High        19         18.0        1.0
Total       54
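As a check on the SPSS output, the chi-square value can be worked out from the observed and expected counts, with 54 cases spread equally over 3 levels giving an expected count of 18 each:

```latex
\chi^2 = \sum \frac{(O - E)^2}{E}
       = \frac{(9 - 18)^2}{18} + \frac{(26 - 18)^2}{18} + \frac{(19 - 18)^2}{18}
       = \frac{81 + 64 + 1}{18}
       = \frac{146}{18} \approx 8.111,
\qquad df = 3 - 1 = 2
```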


7.7.2 Two-way group-independence Chi-Square in SPSS
- Analyze > Descriptive Statistics > Crosstabs
- Tick “Display clustered bar charts” box for a bar plot
- Open Statistics and tick “Chi-Square” and “Phi and Cramer’s V” boxes
- Open Cells and tick “Expected values” and all of the boxes under “Percentages”

Chi-Square statistic test (two-way group independence)

Test Statistics

              Student English proficiency  students from different colleges
Chi-Square    8.111a                       16.000b
df            2                            4
Asymp. Sig.   .017                         .003

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.0.
b. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 10.8.

Chi-Square Tests

                              Value    df  Asymp. Sig. (2-sided)
Pearson Chi-Square            10.431a   8  .236
Likelihood Ratio              10.737    8  .217
Linear-by-Linear Association   2.182    1  .140
N of Valid Cases              54

a. 11 cells (73.3%) have expected count less than 5. The minimum expected count is .33.

Notes on the rows:
- Likelihood Ratio: an alternative to the Pearson Chi-Square. It should be equivalent to the Chi-Square when sample sizes are large.
- Linear-by-Linear Association: assumes that the variables are ordinal. Report this if your variables have inherent rank.

Measures of effect size for the chi-square
- Phi (2 x 2 contingency tables with 2 levels/var)
- Cramer’s V (larger than 2 x 2 with more than 2 levels/var)

Symmetric Measures

                                Value  Approx. Sig.
Nominal by Nominal  Phi         .440   .236
                    Cramer's V  .311   .236
N of Valid Cases                54

- w = phi (2 x 2 tables); w = V × √(r – 1) (more than 2 levels), where V = Cramer’s V and r = the number of rows or columns, whichever is smaller
- Odds ratio = (N11 × N22) / (N12 × N21), using these table subscripts:

N11  N12
N21  N22
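The effect sizes reported by SPSS can be recomputed from the chi-square value and the sample size. The sketch below uses illustrative function names and makes one assumption that is not stated in the output: the table is taken to be 3 x 5, inferred from df = 8 and the 15 cells implied by the footnote. The odds-ratio counts are hypothetical.

```python
import math

def phi_coefficient(chi_square, n):
    """Phi: square root of chi-square divided by the number of cases."""
    return math.sqrt(chi_square / n)

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V, which rescales phi by the smaller table dimension."""
    return math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))

def effect_size_w(v, n_rows, n_cols):
    """w = V * sqrt(r - 1), with r the smaller of rows and columns."""
    return v * math.sqrt(min(n_rows, n_cols) - 1)

def odds_ratio(n11, n12, n21, n22):
    """Odds ratio for a 2 x 2 table: (N11 * N22) / (N12 * N21)."""
    return (n11 * n22) / (n12 * n21)

# Values from the SPSS output: chi-square = 10.431, N = 54.
# Assumption: a 3 x 5 table, inferred from df = 8 = (3 - 1) * (5 - 1).
phi = phi_coefficient(10.431, 54)
v = cramers_v(10.431, 54, 3, 5)
w = effect_size_w(v, 3, 5)
print(f"phi = {phi:.3f}, V = {v:.3f}, w = {w:.3f}")

# Hypothetical 2 x 2 counts (not from the slides)
print(f"odds ratio = {odds_ratio(20, 10, 5, 15):.1f}")
```

Under the 3 x 5 assumption the results agree with the Symmetric Measures table (Phi = .440, Cramer's V = .311), and w equals phi because V is simply phi divided by √(r – 1).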

Reporting Chi-square test results
- Contingency table with a summary of data and statistical results
- Chi-square value
- df
- p-value
- Effect size (for test for group independence): Phi, Cramer’s V, w, or odds ratio
- Example reporting (p. 239)

Contingency table (2 X 3)

Student English proficiency * Major1 Crosstabulation (Count)

                                    Major1
                                    non-English majors  English majors  Total
Student English proficiency  High            9                 4          13
                             Mid            25                 4          29
                             Low            12                 0          12
Total                                       46                 8          54

Application activities 8.5.3 (p. 234) with Chi-Square in SPSS