
Chi square goodness of fit test

• Tests the goodness of fit of supposed frequencies to sample data.

© V. Čekanavičius, G. Murauskas

goodness of fit

• Data. One categorical (nominal) sample.

• All data is divided into k categories.

• At least 5 respondents in each category.

• We make a conjecture about ratios between categories.


goodness of fit

• Statistical hypothesis

H0 : Conjecture is correct.

H1 : Conjecture is incorrect.


Conclusion based on p-value

H0 is rejected (data contradicts conjecture) if p < 0.05.

H0 is accepted (data does not contradict conjecture) if p ≥ 0.05.

Here α = 0.05 is the level of significance.


SPSS goodness of fit test

Is the ratio between national majority and national minority 7:2?


SPSS

[SPSS screenshots: the data; the menu item to choose; the dialog where the test variable is selected and the supposed ratio is entered.]
SPSS

Frequencies: minority (Minority Classification)

     Category   Observed N   Expected N   Residual
1    0 No       276          282,3        -6,3
2    1 Yes      87           80,7         6,3
Total           363

Residual = Observed - Expected (the difference).


SPSS

Test Statistics: minority (Minority Classification)

Chi-Square(a)   ,639    <- test statistic
df              1
Asymp. Sig.     ,424    <- p-value

a. 0 cells (,0%) have expected frequencies less than 5. The minimum expected cell frequency is 80,7.

Data does not contradict the ratio 7:2.


Conclusion

• The goodness of fit test showed no statistically significant difference between the supposed national majority/minority ratio and the sample data.
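The SPSS numbers above can be checked by hand. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of SPSS, and the p-value helper is valid only for df = 1, where the chi-square tail reduces to the complementary error function.

```python
import math

def chi_square_gof(observed, ratios):
    """Return (statistic, expected counts) for a goodness of fit test."""
    n = sum(observed)
    total = sum(ratios)
    expected = [n * r / total for r in ratios]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

def p_value_df1(stat):
    """Upper-tail chi-square probability; this shortcut holds only for df = 1."""
    return math.erfc(math.sqrt(stat / 2.0))

observed = [276, 87]                      # 0 No, 1 Yes (from the Frequencies table)
stat, expected = chi_square_gof(observed, [7, 2])
p = p_value_df1(stat)
print([round(e, 1) for e in expected])    # expected counts, compare with 282,3 and 80,7
print(round(stat, 3), round(p, 3))        # compare with ,639 and ,424
```

Both values agree with the SPSS output after rounding, so p ≥ 0.05 and the conjecture is not rejected.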


SPSS. Special case

A marketing analyst claims that 25% of the customers will buy a certain type of sweets packed in large boxes, 25% in medium boxes, 30% in small boxes and 20% in very small boxes.

Data: 50 bought large boxes, 40 medium, 72 small and 19 very small.

Does the data contradict the analyst's claim statistically significantly?


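SPSS

[SPSS screenshots: the data are numeric; the menu item to choose; cases are weighted with "Weight by"; in the test dialog the supposed ratio is entered and the weight variable is left alone.]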


SPSS

Frequencies: RUSIS (box type)

         Observed N   Expected N   Residual
1.00     50           45.3         4.8
2.00     40           45.3         -5.3
3.00     72           54.3         17.7
4.00     19           36.2         -17.2
Total    181


SPSS

Test Statistics: RUSIS

Chi-Square(a)   15.050
df              3
Asymp. Sig.     .002

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 36.2.

Data statistically significantly contradicts the supposed ratio.
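As a check on the sweets example, here is a small sketch that recomputes the statistic from the claimed percentages. It uses only the standard library; the helper name is an assumption for illustration, and the tail formula is the closed form for df = 3 only.

```python
import math

def chi_square_sf_df3(x):
    """Upper-tail probability of the chi-square distribution, df = 3 only."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

observed = [50, 40, 72, 19]               # large, medium, small, very small
claimed = [0.25, 0.25, 0.30, 0.20]        # the analyst's claim
n = sum(observed)
expected = [n * share for share in claimed]
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p = chi_square_sf_df3(stat)
print(round(stat, 2), df, round(p, 3))    # compare with 15.050, 3, .002
```

The largest contributions come from the small and very small boxes, which matches the residuals 17.7 and -17.2 in the Frequencies table.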

CHI SQUARE TEST FOR INDEPENDENCE

Test of association for categorical data

Chi square test

• Two categorical (nominal) variables.

• We test if those categorical variables are dependent.


Examples

• Does smoking depend on the respondent's religion?

• Do men and women vote similarly?

• Is the percentage of male students the same in all courses?


Data

All data is organized in cells according to two categorical variables.


Statistical hypothesis

H0 : variables are independent.

H1 : variables are dependent.


Conclusion based on p-value

H0 is rejected (variables are dependent) if p < 0.05.

H0 is accepted (variables are independent) if p ≥ 0.05.


Example

• Is the percentage of female employees the same for clerks and managers?


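SPSS

[SPSS screenshots: the data may be numeric or string; choose the menu item; put one variable in the row box and the other in the column box; check the required option in the Statistics dialog; then go to Cells and check the required options there.]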

SPSS

JOBCAT Employment Category * GENDER Gender Crosstabulation

                                                   GENDER Gender
JOBCAT Employment Category                         f Female   m Male    Total
1 Clerical   Count                                 206        157       363
             % within JOBCAT Employment Category   56.7%      43.3%     100.0%
             % within GENDER Gender                95.4%      68.0%     81.2%
3 Manager    Count                                 10         74        84
             % within JOBCAT Employment Category   11.9%      88.1%     100.0%
             % within GENDER Gender                4.6%       32.0%     18.8%
Total        Count                                 216        231       447
             % within JOBCAT Employment Category   48.3%      51.7%     100.0%
             % within GENDER Gender                100.0%     100.0%    100.0%


SPSS

Chi-Square Tests

                           Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                            (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square         54.935(b)   1    .000
Continuity Correction(a)   53.154      1    .000
Likelihood Ratio           61.256      1    .000
Fisher's Exact Test                                       .000         .000
N of Valid Cases           447

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 40.59.

The p-value is in the Asymp. Sig. column. p < 0.05, therefore the corresponding percentages differ statistically significantly.


Conclusion

• The chi-square test showed that the percentage of women among clerks (56.7%) is statistically significantly greater than among managers (11.9%), p < 0.01.
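The Pearson statistic for the clerks and managers table can be reproduced from the four counts. This is a rough sketch in plain Python; the function name is hypothetical, and expected counts are computed as row total times column total divided by N.

```python
def pearson_chi_square(table):
    """Return (statistic, expected counts, df) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, expected, df

# Rows: 1 Clerical, 3 Manager; columns: f Female, m Male.
table = [[206, 157],
         [10, 74]]
stat, expected, df = pearson_chi_square(table)
min_expected = min(min(row) for row in expected)
print(round(stat, 3), df, round(min_expected, 2))   # compare with 54.935, 1, 40.59
```

The minimum expected count is also what SPSS reports in footnote b, which is how the "at least 5 per cell" requirement is verified.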


SPSS. Special case

• One hundred children watched violence-prone shows and 100 watched nonviolent programs. After two weeks of observation each child was classified as either aggressive or nonaggressive. 63 watched violent shows and were aggressive, 37 watched violent shows and were nonaggressive, 30 watched nonviolent programs and were aggressive, and 70 watched nonviolent programs and were nonaggressive.

• Are TV and behavior related?

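SPSS

[SPSS screenshots: the data may be numeric or string; cases are weighted by the count variable 'kiek' ("Weight by"); in the crosstabs dialog the weight variable is left alone, one variable goes here and the next one there.]

Statistics and Cells are dealt with in the same way as before.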

SPSS

ELGESYS * TV Crosstabulation (ELGESYS = behavior)

                                   TV
ELGESYS                            nesmurt (nonviolent)   smurt (violent)   Total
agres (aggressive)      Count              30                     63               93
                        % within ELGESYS   32.3%                  67.7%            100.0%
                        % within TV        30.0%                  63.0%            46.5%
neagr (nonaggressive)   Count              70                     37               107
                        % within ELGESYS   65.4%                  34.6%            100.0%
                        % within TV        70.0%                  37.0%            53.5%
Total                   Count              100                    100              200
                        % within ELGESYS   50.0%                  50.0%            100.0%
                        % within TV        100.0%                 100.0%           100.0%

Violent TV watchers are more aggressive.


SPSS

Chi-Square Tests

                           Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                            (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square         21.887(b)   1    .000
Continuity Correction(a)   20.581      1    .000
Likelihood Ratio           22.314      1    .000
Fisher's Exact Test                                       .000         .000
N of Valid Cases           200

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 46.50.

The difference is statistically significant.


Conclusion

• The chi-square test established that the percentage of aggressive children is greater among watchers of violent TV (63%) than among non-watchers (30%), p < 0.01.
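For a 2x2 table SPSS also prints a "Continuity Correction" row. A common way to obtain it, assumed here, is the Yates correction, which subtracts 0.5 from each absolute difference before squaring. The sketch below (plain Python, illustrative function name) computes both versions for the TV data.

```python
def chi_square_2x2(table, yates=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)   # continuity correction
            stat += diff ** 2 / expected
    return stat

# Rows: violent shows, nonviolent programs; columns: aggressive, nonaggressive.
tv = [[63, 37],
      [30, 70]]
pearson = chi_square_2x2(tv)
corrected = chi_square_2x2(tv, yates=True)
print(round(pearson, 3), round(corrected, 3))   # compare with 21.887 and 20.581
```

The corrected value is always a little smaller, so it gives a slightly more conservative test; with samples this large the conclusion does not change.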


McNemar test

Most frequently applied when we have dichotomous data for the same respondents.

• Buyers and non-buyers before and after an advertisement.

• Voters and non-voters before and after TV debates.


Data

One two-valued categorical variable observed in two related populations, or in one population twice.

The data form a 2x2 table of counts a, b, c, d:

            After
Before      a     b
            c     d


H0 : no impact of the advertisement.

H1 : significant impact.

H0 is rejected (impact is statistically significant) if p < 0.05.

H0 is accepted (impact is not significant) if p ≥ 0.05.

Here 0.05 is the level of significance.


SPSS

• Voters were asked twice about their support for a candidate, before and after TV debates:

• Supported the candidate both before and after the debates: 200

• Supported before the debates, but not after: 30

• Did not support before, supported after: 60

• Did not support either before or after: 100

• Did the debates influence voters' preferences?


SPSS

[SPSS screenshots: the data; cases are weighted with "Weight by"; the menu item to choose; the two variables are placed in the dialog; the required option is checked.]


SPSS

pries * po Crosstabulation (pries = before, po = after, Ne = no, už = for)

Count
                   po
pries      Ne      už      Total
Ne         100     60      160
už         30      200     230
Total      130     260     390


SPSS

Chi-Square Tests

                     Value   Exact Sig. (2-sided)
McNemar Test                 .002(a)    <- p-value
N of Valid Cases     390

a. Binomial distribution used.

The number of supporters increased statistically significantly.
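Footnote a says that the binomial distribution was used. A minimal sketch of such an exact McNemar test, assuming the usual construction, looks only at the respondents who changed their answer (30 and 60 here) and asks how likely a split at least this uneven is when both directions of change are equally probable. The function name is illustrative.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

changed_to_no = 30    # supported before the debates, not after
changed_to_yes = 60   # did not support before, supported after
p = mcnemar_exact(changed_to_no, changed_to_yes)
print(round(p, 3))    # compare with the SPSS value .002
```

The 200 and 100 respondents who did not change their opinion do not enter the calculation at all.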


Nonparametric tests

• Are also called rank tests;

• Normality of variables is not required;

• Fit small samples;

• More difficult to interpret.

The chi-square test is a nonparametric test but not a rank test.

Typical hypothesis

• H0 : distributions of X and Y are equal

• H1 : distributions of X and Y differ



Mann-Whitney test

1. Analogue of Student's t-test for independent samples;

2. Means are not compared;

3. Compares distributions;

4. The larger mean rank shows which variable is stochastically larger.


Data

1. Two independent interval or rank samples.

2. Sample sizes can be different.

3. Rank variable has at least 5 different outcomes.



H0 : distributions are equal,

H1 : distributions differ.


H0 is rejected (distributions differ) if p < α.

H0 is accepted (distributions do not differ) if p ≥ α.

Here α is the level of significance.


Example

• We investigate respondents who are older than 40 years.

• Is classical music equally appreciated by men and women?

• Values: 1 - like it very much, 2 - like it, ..., 5 - hate it.

SPSS

• First select the suitable cases (age > 40)

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Independent Samples


SPSS

Ranks: classicl (Classical Music)

sex Respondent's Sex   N      Mean Rank   Sum of Ranks
1 Male                 321    412,07      132273,50
2 Female               462    378,06      174662,50
Total                  783

• Males chose greater marks -> they like classical music less.


SPSS

Test Statistics(a): classicl (Classical Music)

Mann-Whitney U           67709,500
Wilcoxon W               174662,500
Z                        -2,134
Asymp. Sig. (2-tailed)   ,033

a. Grouping Variable: sex Respondent's Sex

• The difference is statistically significant, p = 0,033 < 0,05.
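The U and W values in the table are tied to the rank sums by simple arithmetic: for a group of size n with rank sum R, U = R - n(n + 1)/2. The short sketch below checks this on the numbers from the Ranks table; the helper name is an assumption for illustration.

```python
def u_from_rank_sum(rank_sum, n):
    """Mann-Whitney U of one group from its rank sum and group size."""
    return rank_sum - n * (n + 1) / 2

n_male, n_female = 321, 462
u_male = u_from_rank_sum(132273.5, n_male)
u_female = u_from_rank_sum(174662.5, n_female)
u = min(u_male, u_female)            # SPSS reports the smaller of the two
print(u_male, u_female, u)           # compare u with Mann-Whitney U = 67709,500
print(u_male + u_female == n_male * n_female)   # the two U values always sum to n1 * n2
```

Wilcoxon W in the same table is simply the rank sum of one of the groups (here the Female group, 174662,500), so U and W carry the same information.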




Wilcoxon test

1. Analogue of Student's paired-samples t-test;

2. Means are not compared;

3. Compares distributions;

4. The larger mean rank of the differences shows which variable is stochastically larger.


Data

1. Two dependent (paired) interval or rank samples.


3. Usually the same respondent measured twice.



H0 : distributions are equal,

H1 : distributions differ.


H0 is rejected (distributions differ) if p < α.

H0 is accepted (distributions do not differ) if p ≥ α.

Here α is the level of significance.


Example

• Do respondents older than 50 years like classical music more than jazz?

• Each respondent rated both music styles using the following scale: 1 - like it very much, ..., 7 - hate it very much.

SPSS

• First select the suitable cases (age > 50)

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Related Samples


SPSS

Ranks (ranks for differences)

JAZZ - CLASSICL    N        Mean Rank   Sum of Ranks
Negative Ranks     138(a)   157.43      21725.00
Positive Ranks     198(b)   176.22      34891.00
Ties               161(c)
Total              497

a. JAZZ Jazz Music < CLASSICL Classical Music
b. JAZZ Jazz Music > CLASSICL Classical Music
c. CLASSICL Classical Music = JAZZ Jazz Music


SPSS

Test Statistics(b): JAZZ Jazz Music - CLASSICL Classical Music

Z                        -3.782(a)
Asymp. Sig. (2-tailed)   .000       <- p-value

a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test

Difference is statistically significant.
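The Z value can be roughly reconstructed from the Ranks table with the textbook normal approximation for the signed-rank sum. The sketch below deliberately ignores tied differences, which is an assumption; SPSS adjusts the variance for ties, so its Z (-3.782) is somewhat larger in absolute value than the uncorrected one computed here.

```python
import math

n_neg, n_pos = 138, 198               # negative and positive ranks; ties are dropped
w_neg, w_pos = 21725.0, 34891.0       # sums of ranks from the table
n = n_neg + n_pos

# Sanity check: the two rank sums must add up to 1 + 2 + ... + n.
ranks_consistent = (w_neg + w_pos == n * (n + 1) / 2)

mean_w = n * (n + 1) / 4
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)    # no tie correction
z = (min(w_neg, w_pos) - mean_w) / sd_w
p = math.erfc(abs(z) / math.sqrt(2))                # two-sided normal p-value
print(ranks_consistent, round(z, 3), p < 0.001)
```

Even without the tie correction the p-value is far below 0.05, in line with the SPSS conclusion.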


Spearman correlation


Spearman correlation test

1. Analogue of Pearson’s correlation.

2. Has the same interpretation.

3. Calculates Pearson’s correlation between ranks;

4. Can be used for already ranked data.


Data

1. Two dependent interval or ranked variables.


3. In a special case data can be ranked.



H0 : variables do not correlate.

H1 : variables correlate.


H0 is rejected (variables correlate statistically significantly) if p < α.

H0 is accepted (variables do not correlate) if p ≥ α.

Here α is the level of significance.


Example

• Respondents older than 50 years.

• Do the data support the statement that the more a respondent likes musicals, the more he/she likes classical music?

Analyze -> Correlate -> Bivariate

[SPSS screenshot: in the dialog, un-check the Pearson coefficient and check Spearman.]


SPSS

Correlations

                                                      classicl   jazz
Spearman's rho   classicl   Correlation Coefficient   1,000      ,205**
                            Sig. (2-tailed)           .          ,000
                            N                         504        497
                 jazz       Correlation Coefficient   ,205**     1,000
                            Sig. (2-tailed)           ,000       .
                            N                         497        514

**. Correlation is significant at the 0.01 level (2-tailed).

Variables correlate statistically significantly.

Correlation is positive, but weak.


Spearman correlation test for ranked data

1. Two teachers ranked their students:

2. First teacher: A, B, C, D, E, F, G, H, I, J, K, L.

3. Second teacher: B, C, A, D, H, E, F, G, K, I, J, L.

4. Do their rankings correlate?



H0 : variables do not correlate.

H1 : variables correlate.


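SPSS

• First: A, B, C, D, E, F, G, H, I, J, K, L

• Second: B, C, A, D, H, E, F, G, K, I, J, L

[SPSS screenshot: the ranks given by each teacher are entered as two variables; the remaining variable is auxiliary.]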


SPSS

Correlations (MOKYT1 = first teacher, MOKYT2 = second teacher)

                                                    MOKYT1    MOKYT2
Spearman's rho   MOKYT1   Correlation Coefficient   1.000     .916**
                          Sig. (2-tailed)           .         .000
                          N                         12        12
                 MOKYT2   Correlation Coefficient   .916**    1.000
                          Sig. (2-tailed)           .000      .
                          N                         12        12

**. Correlation is significant at the .01 level (2-tailed).

Correlation is very strong, significant and positive.
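Because neither teacher gave tied ranks, the coefficient can be checked with the classical shortcut rho = 1 - 6 Σd² / (n(n² - 1)), where d is the difference between the two ranks of a student. A minimal sketch in plain Python (variable names are illustrative):

```python
first = list("ABCDEFGHIJKL")      # first teacher's order, best first
second = list("BCADHEFGKIJL")     # second teacher's order, best first

rank1 = {student: pos + 1 for pos, student in enumerate(first)}
rank2 = {student: pos + 1 for pos, student in enumerate(second)}

n = len(first)
d_squared = sum((rank1[s] - rank2[s]) ** 2 for s in first)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))   # shortcut valid only without ties
print(d_squared, round(rho, 3))                # compare rho with .916
```

With tied ranks the shortcut is no longer exact, and Pearson's correlation has to be computed on the (average) ranks instead.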




Kruskal-Wallis test

1. Mann-Whitney test extended to more than 2 samples.

2. Interpretation is the same as for the Mann-Whitney test.

3. The larger mean rank corresponds to larger scores.

4. Gives no information on which variables differ.

5. Is also called ANOVA for rank data.


Data

1. Two or more independent interval or rank samples.

2. Each rank variable has at least 5 different outcomes.



H0 : all distributions are the same

H1 : some distributions differ.


Conclusion with p-value

H0 is rejected (some distributions differ statistically significantly) if p < α.

H0 is accepted (all distributions are equal) if p ≥ α.

Here α is the level of significance.


Example

• We investigate respondents with at least 13 years of formal education.

• Do all races like rap music equally?

• Rank variable rap: 1 - like it very much, ..., 5 - hate it.

SPSS

• First: Select Cases -> If -> educ > 13

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> K Independent Samples


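SPSS

[SPSS screenshots: the test dialog and where to place the variables.]

SPSS

Ranks: RAP Rap Music

RACE Racew of Respondent   N      Mean Rank
1 white                    617    372.20
2 black                    65     254.05
3 other                    34     309.59
Total                      716

Black respondents have the lowest mean rank, so they like rap music best (because of the coding, 1 = like it very much).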


SPSS

Test Statistics(a,b): RAP Rap Music

Chi-Square    23.311
df            2
Asymp. Sig.   .000      <- p-value

a. Kruskal Wallis Test
b. Grouping Variable: RACE Racew of Respondent

The scores depend statistically significantly on the respondent's race.
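To see where the Chi-Square value comes from, the Kruskal-Wallis statistic can be approximated from the Ranks table alone: H = 12 / (N(N + 1)) · Σ n_i (mean rank_i - (N + 1)/2)². The sketch below omits the correction for tied observations, which is an assumption; with a 5-point scale there are many ties, so the result is noticeably smaller than the tie-corrected 23.311 printed by SPSS.

```python
# Group sizes and mean ranks copied from the Ranks table.
groups = {
    "white": (617, 372.20),
    "black": (65, 254.05),
    "other": (34, 309.59),
}

total_n = sum(n for n, _ in groups.values())
grand_mean_rank = (total_n + 1) / 2
h_uncorrected = 12 / (total_n * (total_n + 1)) * sum(
    n * (mean_rank - grand_mean_rank) ** 2 for n, mean_rank in groups.values()
)
df = len(groups) - 1
print(total_n, df, round(h_uncorrected, 2))   # uncorrected H, below SPSS's 23.311
```

Dividing by the tie-correction factor, which needs the raw data, can only increase H, so the uncorrected value already exceeds the 5% critical value for df = 2 (about 5.99).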


Friedman test

1. Generalization of Wilcoxon's test to more than 2 samples.

2. For 2 samples, Wilcoxon's test is more powerful.

3. Easy to interpret.


Interpretation of ranks

1. Let us assume that a respondent evaluated the performances of three actors (a larger score means a better performance): 10 for actor A, 6 for actor B, 8 for actor C.

2. Scores are ranked. Ranks: 3 for A, 1 for B, 2 for C.
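The ranking step can be sketched as follows. The function is a hypothetical helper, not part of SPSS; it assigns rank 1 to the smallest score and, as an assumption consistent with common practice, gives tied scores the average of the ranks they occupy.

```python
def average_ranks(scores):
    """Rank scores within one respondent; ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1          # average of ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

actor_ranks = average_ranks([10, 6, 8])       # scores for actors A, B, C
print(actor_ranks)                            # [3.0, 1.0, 2.0]
print(average_ranks([5, 5, 7]))               # [1.5, 1.5, 3.0]
```

The test then compares the mean rank of each actor across all respondents.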


Data

1. Two or more dependent interval or rank samples.

2. Each rank variable has at least 5 different outcomes.



H0 : all distributions are the same

H1 : some distributions differ.


Conclusion with p-value

H0 is rejected (some distributions differ statistically significantly) if p < α.

H0 is accepted (all distributions are equal) if p ≥ α.

Here α is the level of significance.


Example

• We investigate respondents with formal education longer than 15 years.

• Are musicals, classical music and rap music equally popular?

• Rank variable rap: 1 - like it very much, ..., 5 - hate it.

SPSS

• First: Select Cases -> If -> educ > 15

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> K Related Samples


SPSS

Ranks

                              Mean Rank
CLASSICL Classical Music      1.87
MUSICALS Broadway Musicals    2.05
BIGBAND Bigband Music         2.08

Classical music got the lowest scores.


SPSS

Test Statistics(a)

N             343
Chi-Square    14.286
df            2
Asymp. Sig.   .001      <- p-value

a. Friedman Test

Not all styles are equally popular.


Friedman's test special case

• Five experts ranked three sorts of beer: A, B and C.

• First: B, C, A (i.e. the best is beer B)

• Second: B, C, A

• Third: A or C, B

• Fourth: A, B, C

• Fifth: B, A, C

• Are all sorts equally popular?


SPSS

[SPSS screenshot: the cells contain ranks; the columns are the sorts.]

SPSS

Ranks

     Mean Rank
A    2.10
B    1.60
C    2.30

The most popular is sort B.


SPSS

Test Statistics(a)

N             5
Chi-Square    1.368
df            2
Asymp. Sig.   .504

a. Friedman Test

Differences are statistically insignificant.
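The beer example is small enough to recompute completely. The sketch below assumes the usual Friedman statistic with a correction for ties (the third expert placed A and C jointly first, so both get rank 1.5); variable names are illustrative, and the p-value shortcut exp(-x/2) holds only for df = 2.

```python
import math
from collections import Counter

# One row per expert; columns are the ranks given to sorts A, B, C (1 = best).
ranks = [
    [3, 1, 2],        # first:  B, C, A
    [3, 1, 2],        # second: B, C, A
    [1.5, 3, 1.5],    # third:  A or C, then B
    [1, 2, 3],        # fourth: A, B, C
    [2, 1, 3],        # fifth:  B, A, C
]

n, k = len(ranks), len(ranks[0])
rank_sums = [sum(col) for col in zip(*ranks)]
mean_ranks = [s / n for s in rank_sums]

uncorrected = 12 / (n * k * (k + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (k + 1)
# Tie correction: each group of t tied ranks within a row contributes t^3 - t.
tie_term = sum(t ** 3 - t for row in ranks for t in Counter(row).values())
stat = uncorrected / (1 - tie_term / (n * k * (k ** 2 - 1)))
p = math.exp(-stat / 2)               # chi-square upper tail, df = 2 only

print([round(m, 2) for m in mean_ranks])   # compare with 2.10, 1.60, 2.30
print(round(stat, 3), round(p, 3))         # compare with 1.368 and .504
```

Sort B has the lowest mean rank, but with only five experts the differences are well within what chance alone would produce.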
