Nonparametric Correlation Techniques: Techniques for Correlating Nominal & Ordinal Variables

Charles M. Friel PhD, Criminal Justice Center, Sam Houston State University · Blog Staff · 2010-05-21

KEY CONCEPTS

Nonparametric Correlation Techniques

- Scales of measurement: nominal scale, ordinal scale, interval scale, ratio scale
- Metric vs. nonmetric variables
- Spearman Rank-Order Correlation Coefficient: Rho ($\rho$)
  - Rho assumptions
  - Null hypothesis in rho
  - One and two-tailed hypotheses
  - Reducing metric variables to ordinal scales of measurement
  - Resolving the problem of tied ranks
- Goodman’s & Kruskal’s Gamma ($\gamma$)
  - Gamma assumptions
  - Null hypothesis in gamma
  - The concepts of consistency & inconsistency in gamma
  - Using Z to determine the significance of gamma
- The Phi Coefficient ($\phi$)
  - Phi assumptions
  - Null hypothesis in phi
  - The relationship between phi and chi-square
- The Contingency Coefficient (C)
  - C assumptions
  - Null hypothesis in C
  - The relationship between C and chi-square
  - The relationship between C and phi
  - Limitation in the values that C can take
- Cramér’s V
  - V assumptions
  - Null hypothesis in V
  - The relationship between V and chi-square
- Guttman’s Lambda ($\lambda$)
  - Lambda assumptions
  - Null hypothesis in lambda
  - Lambda as an asymmetrical correlation coefficient
  - The concept of the reduction of the error in prediction
  - PRE: Proportionate reduction of error

Lecture Outline

- What are nonparametric correlation techniques, and what kinds of research problems are they designed to solve?
- Spearman Rank-Order Correlation Coefficient: Rho ($\rho$)
- Goodman’s & Kruskal’s Gamma ($\gamma$)
- The Phi Coefficient ($\phi$)
- Contingency Coefficient (C)
- Cramér’s V
- Guttman’s Coefficient of Predictability: Lambda ($\lambda$)

Nonparametric Correlation Techniques

If the variables X and Y are metric (i.e. interval or ratio measures) and they are to be correlated, then the appropriate technique is Pearson’s Product-Moment Correlation Coefficient.

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$

where $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are deviations from the means.

Q: What if X and/or Y is nonmetric (i.e. a nominal or ordinal measure)? How can they be correlated?

A: By use of one of a variety of nonparametric correlational techniques. Nonparametric correlational techniques are designed to estimate the correlation or association between variables measured on nominal and/or ordinal scales, or metric variables that have been reduced to nominal and/or ordinal scales.

Spearman Rank-Order Correlation Coefficient: Rho ($\rho$)

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

A technique for determining the correlation between two ordinal variables, or metric variables reduced to an ordinal scale.

Assumptions
- The two variables are ordinal, or metric variables that have been reduced to an ordinal scale of measurement,
- The correlation between the variables is linear, and
- If a test of significance is applied, the sample has been selected randomly from the population.

An Example

A prosecutor received 10 felony cases filed by an interagency organized crime task force and ranked the cases by seriousness (Serious = X) and relative prosecutability (Prosecute = Y).*

| Case | X Serious | Y Prosecute | D | $D^2$ |
|------|-----------|-------------|----|-------|
| A | 6 | 3 | 3 | 9 |
| B | 1 | 10 | -9 | 81 |
| C | 4 | 7 | -3 | 9 |
| D | 7 | 5 | 2 | 4 |
| E | 10 | 1 | 9 | 81 |
| F | 3 | 8 | -5 | 25 |
| G | 8 | 2 | 6 | 36 |
| H | 9 | 4 | 5 | 25 |
| I | 5 | 6 | -1 | 1 |
| J | 2 | 9 | -7 | 49 |
| Total | | | | 320 |

*(Rankings: 1 = the highest and 10 = the lowest)

D = the difference between the rank positions of each case on X and Y.
N = the number of paired observations (cases).

Calculation of Rho ($\rho$)

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{(6)(320)}{10(10^2 - 1)} = 1 - \frac{1920}{10(99)} = 1 - \frac{1920}{990} = -0.939$$

Interpretation
- The correlation is negative and the magnitude is high.
- As the seriousness of the crime increases, its prosecutability decreases.
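The hand calculation above can be checked with a few lines of Python. This is a minimal sketch: the function name `spearman_rho` and the list names are illustrative choices rather than anything from the slides, and the shortcut formula assumes the ranks contain no ties.

```python
def spearman_rho(x_ranks, y_ranks):
    """Spearman's rho via 1 - 6*sum(D^2) / (N*(N^2 - 1)); assumes untied ranks."""
    n = len(x_ranks)
    # D is the difference between the two rank positions of each case.
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Ranks for cases A..J from the prosecutor example.
serious = [6, 1, 4, 7, 10, 3, 8, 9, 5, 2]
prosecute = [3, 10, 7, 5, 1, 8, 2, 4, 6, 9]

rho = spearman_rho(serious, prosecute)
print(round(rho, 3))  # -0.939
```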

Spearman’s Rho SPSS Results

Rho = -0.939

Two-tailed level of significance: p < 0.001

Reducing a Metric Variable to an Ordinal Scale of Measurement

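What is the correlation between:
- The rank-ordered seriousness of 8 offences (ordinal variable), and
- The length of the sentences received by their perpetrators (ratio variable)?

| Case | Seriousness | Sentence Length (Years) | Rank of Sentence | D | $D^2$ |
|------|-------------|-------------------------|------------------|----|-------|
| A | 5 | 6 | 5 | 0 | 0 |
| B | 2 | 3 | 2 | 0 | 0 |
| C | 7 | 7 | 6 | +1 | 1 |
| D | 1 | 2 | 1 | 0 | 0 |
| E | 6 | 8 | 7 | -1 | 1 |
| F | 3 | 5 | 4 | -1 | 1 |
| G | 8 | 10 | 8 | 0 | 0 |
| H | 4 | 4 | 3 | +1 | 1 |
| Total | | | | | 4 |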

Seriousness of offence is rank-ordered from least serious (rank = 1) to most serious (rank = 8). The length of sentence is rank-ordered from lowest (rank = 1) to highest (rank = 8).

Computation of rho

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{(6)(4)}{8(8^2 - 1)} = +0.952$$

Interpretation
- The relationship is positive and the magnitude of the correlation is high.
- As the seriousness of the offence increases, the length of sentence increases as well.

The Problem of Tied Ranks

In converting a metric variable to an ordinal scale of measurement, some cases may have tied values. (Tied scores are marked with *.)

| Case | Seriousness | Sentence Length (Years) | Sentence Rank Position | Rank: Sentence | D | $D^2$ |
|------|-------------|-------------------------|------------------------|----------------|------|-------|
| A | 5 | 6* | 4 | 4.5 | 0.5 | 0.25 |
| B | 2 | 2* | 1 | 1.5 | 0.5 | 0.25 |
| C | 7 | 7 | 6 | 6 | 1.0 | 1.00 |
| D | 1 | 2* | 2 | 1.5 | -0.5 | 0.25 |
| E | 6 | 8 | 7 | 7 | -1.0 | 1.00 |
| F | 3 | 6* | 5 | 4.5 | -1.5 | 2.25 |
| G | 8 | 10 | 8 | 8 | 0.0 | 0.00 |
| H | 4 | 4 | 3 | 3 | +1.0 | 1.00 |
| Total | | | | | | 6.00 |

Cases B & D have tied sentences (2 years), as do cases A & F (6 years). In a rank ordering, cases B & D occupy rank positions 1 & 2, while cases A & F occupy rank positions 4 & 5. To determine the appropriate rank of tied cases, add the rank positions and divide by the number of tied cases.

- For cases B & D: (1+2) / 2 = 1.5, so 1.5 is the rank assigned to cases B & D.
- For cases A & F: (4+5) / 2 = 4.5, so 4.5 is the rank assigned to cases A & F.

Computation of rho

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{(6)(6)}{8(8^2 - 1)} = +0.929$$

Interpretation
- The relationship is positive and the magnitude of the correlation is high.
- As the seriousness of the offence increases, the length of sentence increases.
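The tie-handling rule (average the rank positions that the tied cases occupy) can be sketched in Python as follows. The helper name `average_ranks` is hypothetical, and the sketch simply applies the same shortcut formula to the averaged ranks, as the slides do.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 1-based, so the run occupies pos+1 .. end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Cases A..H from the tied-ranks example.
seriousness = [5, 2, 7, 1, 6, 3, 8, 4]
sentence_years = [6, 2, 7, 2, 8, 6, 10, 4]

sentence_ranks = average_ranks(sentence_years)
n = len(seriousness)
sum_d2 = sum((x - y) ** 2 for x, y in zip(seriousness, sentence_ranks))
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sentence_ranks)  # [4.5, 1.5, 6.0, 1.5, 7.0, 4.5, 8.0, 3.0]
print(round(rho, 3))   # 0.929
```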

Spearman’s Rho With Tied Ranks SPSS Results

Rho with tied ranks = +0.928

Two-tailed level of significance p= 0.001

Significance of Rho

In testing the significance of rho, the null hypothesis H0 states …

That the value of rho in the population from which the sample was drawn is 0.0

Therefore, the statistical question becomes …

What is the probability that the obtained value of rho in the sample could have come from such a population?

Given a sample size of N cases, a statistical table can be used to answer this question.

Table for Determining the Significance of Rho

Critical Values of Rho in Testing Significance

Consider the three previous examples involving:
- The prosecutor ranking the seriousness & prosecutability of criminal cases (N = 10),
- The correlation of offence seriousness and sentence length (N = 8), and
- The correlation of offence seriousness and sentence length involving tied cases (N = 8).

| Example | N | Rho | Critical Value (0.05) | Critical Value (0.01) |
|---------|----|--------|-------|-------|
| Prosecutor | 10 | -0.939 | 0.648 | 0.794 |
| Sentence | 8 | +0.952 | 0.738 | 0.881 |
| Tied ranks | 8 | +0.929 | 0.738 | 0.881 |

All three sample values of rho exceed, in absolute value, the critical value at the p = 0.01 level of significance. Therefore, we are more than 99% confident in rejecting each of these H0’s.

Derivation of the Spearman Rank-Order Correlation Coefficient ($\rho$)

Spearman’s rank-order correlation coefficient ($\rho$) can be derived from Pearson’s correlation coefficient (r).

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \rho = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$

If X and Y are ordinal variables ranked 1, 2, …, N, then

$$\sum X = \sum Y = \frac{N(N+1)}{2} \quad \text{and} \quad \sum X^2 = \sum Y^2 = \frac{N(N+1)(2N+1)}{6}$$

Given that

$$\sum x^2 = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{N} \quad \text{and} \quad \sum y^2 = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{N}$$

Then for ordinal variables X & Y

$$\sum x^2 = \frac{N(N+1)(2N+1)}{6} - \frac{1}{N}\left[\frac{N(N+1)}{2}\right]\left[\frac{N(N+1)}{2}\right]$$

This can be reduced as follows

$$\begin{aligned}
\sum x^2 &= \frac{N(2N^2+N+2N+1)}{6} - \frac{1}{N}\left[\frac{(N^2+N)(N^2+N)}{4}\right] \\
&= \frac{2N^3+N^2+2N^2+N}{6} - \frac{1}{N}\left[\frac{N^4+N^3+N^3+N^2}{4}\right] \\
&= \frac{2N^3+3N^2+N}{6} - \frac{1}{N}\left[\frac{N^4+2N^3+N^2}{4}\right] \\
&= \frac{2N^3+3N^2+N}{6} - \frac{N^4+2N^3+N^2}{4N} \\
&= \frac{2N^3+3N^2+N}{6} - \frac{N^3+2N^2+N}{4}
\end{aligned}$$

Substituting the common denominator 12

$$\sum x^2 = \frac{2(2N^3+3N^2+N)}{12} - \frac{3(N^3+2N^2+N)}{12} = \frac{4N^3+6N^2+2N-3N^3-6N^2-3N}{12} = \frac{N^3 - N}{12}$$

And by the same logic

$$\sum y^2 = \frac{N^3 - N}{12}$$

Now let d = (x - y)

$$d^2 = (x - y)^2 = x^2 - 2xy + y^2$$

$$\sum d^2 = \sum x^2 + \sum y^2 - 2\sum xy$$

Since $r = \rho$ and $r = \sum xy / \sqrt{\sum x^2 \sum y^2}$, multiply the last term on the right side of the equation by 1, written as

$$1 = \frac{\sqrt{\sum x^2 \sum y^2}}{\sqrt{\sum x^2 \sum y^2}}$$

$$\sum d^2 = \sum x^2 + \sum y^2 - 2\left[\frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}\right]\sqrt{\sum x^2 \sum y^2}$$

Then by substitution

$$\sum d^2 = \sum x^2 + \sum y^2 - 2\rho\sqrt{\sum x^2 \sum y^2}$$

Solving for $\rho$

$$\sum d^2 - \sum x^2 - \sum y^2 = -2\rho\sqrt{\sum x^2 \sum y^2}$$

$$\sum x^2 + \sum y^2 - \sum d^2 = 2\rho\sqrt{\sum x^2 \sum y^2}$$

Recall that when X & Y are ranks

$$\sum x^2 = \sum y^2 = \frac{N^3 - N}{12}$$

Then by substitution

$$\rho = \frac{\dfrac{N^3-N}{12} + \dfrac{N^3-N}{12} - \sum d^2}{2\sqrt{\left[\dfrac{N^3-N}{12}\right]\left[\dfrac{N^3-N}{12}\right]}} = \frac{2\left[\dfrac{N^3-N}{12}\right] - \sum d^2}{2\left[\dfrac{N^3-N}{12}\right]}$$

$$\rho = 1 - \frac{\sum d^2}{2(N^3-N)/12} = 1 - \frac{6\sum d^2}{N^3 - N}$$

$$\rho = r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$
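As a numerical sanity check on this derivation, the sketch below computes Pearson’s r directly on the untied ranks from the prosecutor example and compares it with the shortcut formula. The function names are illustrative, and agreement is only expected when the ranks contain no ties.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r = sum(xy) / sqrt(sum(x^2) * sum(y^2)) using deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)


def rho_shortcut(xs, ys):
    """Spearman's shortcut formula; valid for ranks 1..N without ties."""
    n = len(xs)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


serious = [6, 1, 4, 7, 10, 3, 8, 9, 5, 2]
prosecute = [3, 10, 7, 5, 1, 8, 2, 4, 6, 9]
n = len(serious)

# Intermediate result of the derivation: sum(x^2) = (N^3 - N) / 12 for ranks 1..N.
sum_x2 = sum((x - (n + 1) / 2) ** 2 for x in serious)
print(sum_x2, (n ** 3 - n) / 12)  # 82.5 82.5

print(round(pearson_r(serious, prosecute), 3))     # -0.939
print(round(rho_shortcut(serious, prosecute), 3))  # -0.939
```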

Goodman’s & Kruskal’s Gamma ($\gamma$)

$$\gamma = \frac{N_a - N_i}{N_a + N_i}$$

A technique for determining the correlation between two ordinal variables used to define a two-way cross-classification table.

Assumptions
- The two variables are ordinal, or metric variables that have been reduced to an ordinal scale of measurement,
- The correlation between the variables is linear,
- If a test of significance is applied, the sample has been selected randomly from the population,
- The columns in the table are ranked in decreasing order from left to right, and
- The rows in the table are ranked in decreasing order from top to bottom.

An Example

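A survey was conducted to measure the perceptions of citizens concerning:
- Their faith in the fairness of the criminal justice system (X), and
- Their attitude toward the death penalty (Y).

Survey Results (N = 105): Faith in Fairness (rows) by attitude toward the Death Penalty (columns)

| Faith in Fairness | Very Favorable | Favorable | Opposed | Very Opposed | Totals |
|-------------------|----------------|-----------|---------|--------------|--------|
| High | 15 | 12 | 6 | 5 | 38 |
| Medium | 12 | 8 | 10 | 8 | 38 |
| Low | 4 | 6 | 9 | 10 | 29 |
| Totals | 31 | 26 | 25 | 23 | 105 |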

Is the perceived fairness of the justice system correlated with attitudes about the death penalty?

Calculation of Gamma

Gamma measures the degree of agreement (Na) and disagreement (Ni) between the two variables. To calculate Na, begin with the frequency in the upper left-hand cell (i.e. 15) and multiply it by the sum of the frequencies in all cells below and to the right of it.

Calculation of Na

- 15 (High, Very Favorable): 15(8+10+8+6+9+10) = 15(51) = 765

Now do the same for all frequencies that have cells that fall below and to the right.

- 12 (High, Favorable): 12(10+8+9+10) = 12(37) = 444
- 6 (High, Opposed): 6(8+10) = 6(18) = 108
- 12 (Medium, Very Favorable): 12(6+9+10) = 12(25) = 300
- 8 (Medium, Favorable): 8(9+10) = 8(19) = 152
- 10 (Medium, Opposed): 10(10) = 100

Na = the sum of these computations
Na = (765+444+108+300+152+100) = 1869

Calculation of Ni, Degree of Inconsistency

The process for determining Ni is similar to that of calculating Na. Begin with the frequency in the upper right-hand cell (i.e. 5) and multiply it by the sum of the frequencies in all cells below and to the left of it.

- 5 (High, Very Opposed): 5(12+8+10+4+6+9) = 5(49) = 245

Now do the same for all frequencies that have cells that fall below and to the left.

- 6 (High, Opposed): 6(12+8+4+6) = 6(30) = 180
- 12 (High, Favorable): 12(12+4) = 12(16) = 192
- 8 (Medium, Very Opposed): 8(4+6+9) = 8(19) = 152
- 10 (Medium, Opposed): 10(4+6) = 10(10) = 100
- 8 (Medium, Favorable): 8(4) = 32

Ni = the sum of these calculations
Ni = (245+180+192+152+100+32) = 901

$$\gamma = \frac{N_a - N_i}{N_a + N_i} = \frac{1869 - 901}{1869 + 901} = \frac{968}{2770} = +0.35$$

Interpretation
- There is a positive correlation between the perception of fairness and attitudes towards the death penalty.
- As faith in the fairness of the justice system increases, people become more favorably disposed toward the death penalty.
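The cell-by-cell procedure for Na and Ni generalizes to any two-way table of ordered categories. The following is a minimal Python sketch under that reading; `gamma_counts` is an illustrative name, and the table is assumed to be stored as a list of rows in the same order as the survey table.

```python
def gamma_counts(table):
    """Return (Na, Ni): each cell times the cells below-right (Na) or below-left (Ni)."""
    n_rows, n_cols = len(table), len(table[0])
    na = ni = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # rows below the current cell
                for m in range(n_cols):
                    if m > j:                   # below and to the right: agreement
                        na += table[i][j] * table[k][m]
                    elif m < j:                 # below and to the left: inconsistency
                        ni += table[i][j] * table[k][m]
    return na, ni


# Rows: High, Medium, Low faith in fairness.
# Columns: Very Favorable, Favorable, Opposed, Very Opposed.
survey = [
    [15, 12, 6, 5],
    [12, 8, 10, 8],
    [4, 6, 9, 10],
]

na, ni = gamma_counts(survey)
gamma = (na - ni) / (na + ni)
print(na, ni)           # 1869 901
print(round(gamma, 3))  # 0.349
```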

Gamma Coefficient SPSS Results

Gamma = 0.349

Level of significance: p 0.001

Determining the Significance of Gamma

Gamma may be converted into a Z score and evaluated against the critical values 1.96 (p = 0.05) or 2.58 (p = 0.01).

H0: The correlation in the population from which the sample was drawn is 0.0.

Conversion

$$Z = \gamma\sqrt{\frac{N_a + N_i}{N(1 - \gamma^2)}}$$

$$Z = +0.35\sqrt{\frac{1869 + 901}{105(1 - 0.35^2)}} = +0.35\sqrt{\frac{2770}{105(0.8775)}} = +0.35\sqrt{30.06} = +0.35(5.483) = 1.92$$

Since 1.92 is less than 1.96, we fail to reject H0 at the 0.05 level: the sample does not provide sufficient evidence that the correlation in the population differs from 0.0.
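The Z conversion can be reproduced with a short Python sketch. The function name `gamma_z` is illustrative, and the inputs are the Na, Ni, and N values from the survey example; the code uses the unrounded gamma, which gives the same Z to two decimals.

```python
from math import sqrt


def gamma_z(na, ni, n):
    """Z = gamma * sqrt((Na + Ni) / (N * (1 - gamma^2))), as in the slide's conversion."""
    gamma = (na - ni) / (na + ni)
    return gamma * sqrt((na + ni) / (n * (1 - gamma ** 2)))


z = gamma_z(na=1869, ni=901, n=105)
print(round(z, 2))   # 1.92
print(abs(z) > 1.96)  # False: H0 is not rejected at the 0.05 level
```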

The Phi Coefficient ($\phi$)

$$\phi = \sqrt{\frac{\chi^2}{N}}$$

Phi is a derivative of chi-square ($\chi^2$). It is a technique for correlating two nominal variables, or variables that have been reduced to a nominal scale of measurement.

Assumptions
- The two variables are nominal,
- The data consist of frequencies cast in a 2x2 cross-tabulation table, and
- The sample has been randomly selected from the population if the phi coefficient is tested for significance.

An Example

A study was conducted to determine if there is a relationship between race and the sentences received by 960 misdemeanant offenders.

The Results (2x2 table)

| Sentence | White | Non-White | Total |
|----------|-------|-----------|-------|
| Probation/Deferred Adjudication | 314 | 196 | 510 |
| Jail | 210 | 240 | 450 |
| Totals | 524 | 436 | 960 |

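Step 1 Calculate the expected frequencies

| Sentence | White | Non-White | Total |
|----------|-------|-----------|-------|
| Probation/Deferred Adjudication | 278.4 | 231.6 | 510 |
| Jail | 245.6 | 204.4 | 450 |
| Totals | 524 | 436 | 960 |

Step 2 Calculate chi-square

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(314-278.4)^2}{278.4} + \frac{(196-231.6)^2}{231.6} + \frac{(210-245.6)^2}{245.6} + \frac{(240-204.4)^2}{204.4}$$

$$\chi^2 = 21.38$$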

Step 3 Convert the chi-square to a phi coefficient

$$\phi = \sqrt{\frac{\chi^2}{N}} = \sqrt{\frac{21.38}{960}} = 0.149$$

The correlation between race and sentence type is low, i.e. 0.149.

Q: Is the obtained correlation statistically significant?

A: The significance of phi is tested the same way as the significance of chi-square.
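The three steps (expected frequencies, chi-square, phi) can be sketched in Python as below. The function name `chi_square` is illustrative. Because the code keeps the expected frequencies unrounded, its chi-square comes out near 21.4 rather than exactly the 21.38 obtained from the rounded values on the slide; phi agrees to three decimals.

```python
from math import sqrt


def chi_square(observed):
    """Return (chi-square, N) for a table of observed frequencies given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected frequency = (row total * column total) / N.
            f_e = row_totals[i] * col_totals[j] / n
            chi2 += (f_o - f_e) ** 2 / f_e
    return chi2, n


# Rows: Probation/Deferred Adjudication, Jail. Columns: White, Non-White.
sentences = [[314, 196], [210, 240]]

chi2, n = chi_square(sentences)
phi = sqrt(chi2 / n)
print(round(phi, 3))  # 0.149
```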

Testing the Significance of the Phi Coefficient

If the chi-square statistic is significant at 1 df, so is the phi coefficient. The critical values of chi-square for 1 df are:
- 3.841 at the p = 0.05 level, and
- 6.635 at the p = 0.01 level.

Since $\chi^2$ = 21.38 is greater than 6.635, it is significant at p < 0.01. Therefore $\phi$ = 0.149 is also significant at p < 0.01.

Interpretation

There is a significant association between race and sentence type in the population of the order of 0.149.

Chi-Square Table: critical values of $\chi^2$ at 1 df are 3.841 at p = 0.05 and 6.635 at p = 0.01.

Phi Coefficient SPSS Results

Phi = 0.149, significance: p < 0.001

Contingency Coefficient (C)

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}}$$

A technique for determining the correlation between two nominal variables cast in a frequency table, typically one larger than 2x2.

Assumptions
- The two variables are nominal, or variables that have been reduced to a nominal scale of measurement,
- The data have been cast in a 2x2 or larger frequency table, and
- The sample has been drawn randomly from the population if the significance of C is to be tested.

An Example

A study was conducted on 960 misdemeanor cases to determine if there is a correlation between race and type of sentence.

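The Results (3x3 frequency table)

| Sentence | White | Black | Other | Totals |
|----------|-------|-------|-------|--------|
| Deferred Adjudication | 122 | 61 | 21 | 204 |
| Probation | 192 | 96 | 18 | 306 |
| Jail | 210 | 83 | 157 | 450 |
| Totals | 524 | 240 | 196 | 960 |

Step 1 Calculate the expected frequencies

| Sentence | White | Black | Other | Totals |
|----------|-------|-------|-------|--------|
| Deferred Adjudication | 111.35 | 51.00 | 41.65 | 204 |
| Probation | 167.03 | 76.50 | 62.48 | 306 |
| Jail | 245.63 | 112.50 | 91.88 | 450 |
| Totals | 524 | 240 | 196 | 960 |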

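Step 2 Calculate $\chi^2$

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\begin{aligned}
\chi^2 &= \frac{(122-111.35)^2}{111.35} + \frac{(61-51)^2}{51} + \frac{(21-41.65)^2}{41.65} \\
&+ \frac{(192-167.03)^2}{167.03} + \frac{(96-76.5)^2}{76.5} + \frac{(18-62.48)^2}{62.48} \\
&+ \frac{(210-245.63)^2}{245.63} + \frac{(83-112.5)^2}{112.5} + \frac{(157-91.88)^2}{91.88}
\end{aligned}$$

$$\chi^2 = 1.02+1.96+10.24+3.73+4.97+31.67+5.17+7.74+46.15 = 112.65$$

Step 3 Calculate the contingency coefficient

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}} = \sqrt{\frac{112.65}{960 + 112.65}} = \sqrt{0.105} = 0.32$$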
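A short Python sketch of the same computation for the 3x3 table follows. The function name `contingency_coefficient` is illustrative, and the expected frequencies are kept unrounded, so small differences from the hand-rounded terms are possible in the last decimal.

```python
from math import sqrt


def contingency_coefficient(observed):
    """Return (chi-square, C) with C = sqrt(chi2 / (N + chi2)) for a table given as rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = sum(
        (f_o - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, f_o in enumerate(row)
    )
    return chi2, sqrt(chi2 / (n + chi2))


# Rows: Deferred Adjudication, Probation, Jail. Columns: White, Black, Other.
sentences_3x3 = [
    [122, 61, 21],
    [192, 96, 18],
    [210, 83, 157],
]

chi2, c = contingency_coefficient(sentences_3x3)
df = (len(sentences_3x3) - 1) * (len(sentences_3x3[0]) - 1)
print(round(c, 3), df)  # 0.324 4
```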

Contingency Coefficient SPSS Results

Contingency Coefficient = 0.324

Level of significance: p < 0.001

Testing the Significance of the Contingency Coefficient C

The significance of C is tested by testing the significance of $\chi^2$ for (r – 1)(c – 1) df, where r is the number of rows and c the number of columns.

H0: the correlation between race and type of sentence in the population is 0.0.

$\chi^2$ = 112.65, df = (3 – 1)(3 – 1) = 4. The critical values of $\chi^2$ at 4 df are 9.488 (p = 0.05) and 13.277 (p = 0.01).

Since 112.65 is greater than 13.277, C is significant at p < 0.01.

Interpretation
- The correlation between race and type of sentence in the population is estimated to be 0.32.
- The direction of the correlation is meaningless since the variables are nominal.

Application of the Contingency Coefficient C to a 2x2 Table

Consider the previous example in which the phi coefficient was used to determine the correlation between race and type of sentence in a 2x2 table.

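| Sentence | White | Non-White | Total |
|----------|-------|-----------|-------|
| Probation/Deferred Adjudication | 314 | 196 | 510 |
| Jail | 210 | 240 | 450 |
| Totals | 524 | 436 | 960 |

$\chi^2$ = 21.38 and $\phi$ = 0.149

The contingency coefficient for the same table would be

$$C = \sqrt{\frac{21.38}{960 + 21.38}} = 0.148$$

In a 2x2 table, C equals $\phi$ within rounding error:

$$\phi = \sqrt{\frac{21.38}{960}} = 0.149$$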

Contingency Coefficient: 2x2 Table SPSS Results

Phi = Contingency Coefficient = 0.149

Level of significance: p < 0.001

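The Relationship Between the Phi Coefficient and the Contingency Coefficient in a 2x2 Table

| Contingency Coefficient | Phi Coefficient |
|-------------------------|-----------------|
| $C = \sqrt{\chi^2 / (N + \chi^2)}$ | $\phi = \sqrt{\chi^2 / N}$ |
| $C^2 = \chi^2 / (N + \chi^2)$ | $\phi^2 = \chi^2 / N$ |
| $C^2 (N + \chi^2) = \chi^2$ | $\phi^2 N = \chi^2$ |
| $C^2 (N + \chi^2) - \chi^2 = 0$ | $\phi^2 N - \chi^2 = 0$ |

Therefore

$$C^2 (N + \chi^2) - \chi^2 = \phi^2 N - \chi^2$$

$$C^2 (N + \chi^2) = \phi^2 N$$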
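One consequence of the last identity, obtained by dividing both sides by $N + \chi^2$ and substituting $\chi^2 = \phi^2 N$, is $C = \phi / \sqrt{1 + \phi^2}$, which is why C falls slightly below phi when phi is small. The sketch below checks this numerically with the slide values $\chi^2$ = 21.38 and N = 960; the variable names are illustrative.

```python
from math import sqrt

chi2, n = 21.38, 960  # values from the 2x2 race-by-sentence example

phi = sqrt(chi2 / n)
c = sqrt(chi2 / (n + chi2))

# Both sides of C^2 (N + chi2) = phi^2 N reduce to chi2.
print(round(c ** 2 * (n + chi2), 2), round(phi ** 2 * n, 2))  # 21.38 21.38

# Rearranged form: C = phi / sqrt(1 + phi^2).
print(round(phi, 3), round(c, 3), round(phi / sqrt(1 + phi ** 2), 3))  # 0.149 0.148 0.148
```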

Limitation of the Contingency Coefficient C

A correlation coefficient is designed to range on a scale from 0.0 to 1.0, where 0.0 indicates no linear correlation and 1.0 indicates a perfect linear correlation.

While a contingency coefficient may not exceed 1.0, it can be limited to less than 1.0 if the frequency table is non-symmetric.

Examples of symmetric frequency tables: 2x2, 3x3, 5x5, etc.

Examples of non-symmetric frequency tables: 2x3, 4x7, 5x6, etc.
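A small numerical check may make the ceiling on C concrete. The sketch below uses a hypothetical 2x3 table, with counts invented purely for illustration, in which every column falls entirely in one row (a perfect association). It computes chi-square by hand and then applies the formulas for C and for Cramér’s V used in these slides; the helper name `chi_square` is an assumption of this example.

```python
import math


def chi_square(table):
    """Pearson chi-square and N for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


# Hypothetical counts: each column is concentrated in a single row.
perfect = [[50, 0, 0],
           [0, 30, 20]]

chi2, n = chi_square(perfect)
k = min(len(perfect), len(perfect[0]))  # lesser of rows and columns

c = math.sqrt(chi2 / (n + chi2))        # contingency coefficient
v = math.sqrt(chi2 / (n * (k - 1)))     # Cramér's V
print(round(c, 3), round(v, 3))  # 0.707 1.0
```

Even though the association in this invented table is perfect, C stops at about 0.707, while V reaches 1.0.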


Cramér’s V Correlation Coefficient

$V = \sqrt{\chi^2 / [N(k - 1)]}$

A technique for determining the correlation between two nominal variables

An alternative to the Contingency Coefficient C if the data are cast in a non-symmetric frequency table

Assumptions

- The two variables are nominal, or are variables that have been reduced to a nominal scale of measurement
- The data have been cast in a 2x2 or larger frequency table
- The sample has been drawn randomly from the population if the significance of V is to be tested


An Example

Consider the previous examples on the correlation between race and type of sentence, but this time the data have been cast in a 2x3 frequency table.

| Sentence | White | Black | Other | Totals |
|---|---|---|---|---|
| Probation/Deferred Adjudication | 314 | 157 | 39 | 510 |
| Jail | 210 | 83 | 157 | 450 |
| Totals | 524 | 240 | 196 | 960 |

Step 1 Calculate the expected frequencies

| Sentence | White | Black | Other | Totals |
|---|---|---|---|---|
| Probation/Deferred Adjudication | 278.38 | 127.5 | 104.125 | 510 |
| Jail | 245.62 | 112.5 | 91.875 | 450 |
| Totals | 524 | 240 | 196 | 960 |
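The expected frequencies in Step 1 follow the usual rule for a contingency table: row total times column total, divided by N. A minimal sketch of that step, with the observed counts copied from the 2x3 table and variable names chosen only for illustration:

```python
# Observed counts; columns are White, Black, Other.
observed = [
    [314, 157, 39],   # Probation/Deferred Adjudication
    [210, 83, 157],   # Jail
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
```

The unrounded values for the White column are 278.375 and 245.625, which the table above shows to two decimal places.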


Step 2 Calculate chi-square

$\chi^2 = (314-278.38)^2/278.38 + (157-127.5)^2/127.5 + (39-104.125)^2/104.125 + (210-245.62)^2/245.62 + (83-112.5)^2/112.5 + (157-91.875)^2/91.875$

$\chi^2 = 111.19$

Cramér’s V

$V = \sqrt{\chi^2 / [N(k - 1)]}$

k = 2, the lesser of the number of columns or rows, therefore (k - 1) = (2 - 1) = 1

$V = \sqrt{111.19 / [960(2 - 1)]}$

V = 0.340

The Contingency Coefficient C for the same data is as follows:

$C = \sqrt{\chi^2 / (N + \chi^2)}$

$C = \sqrt{111.19 / (960 + 111.19)} = 0.322$
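The whole calculation for the 2x3 table can be reproduced in one short script. This is a sketch under the assumption that chi-square is computed from unrounded expected frequencies, so the result differs from the hand calculation in the second decimal place.

```python
import math

observed = [
    [314, 157, 39],   # Probation/Deferred Adjudication: White, Black, Other
    [210, 83, 157],   # Jail
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        chi2 += (obs - exp) ** 2 / exp

k = min(len(observed), len(observed[0]))  # lesser of rows and columns = 2

v = math.sqrt(chi2 / (n * (k - 1)))       # Cramér's V
c = math.sqrt(chi2 / (n + chi2))          # contingency coefficient

print(f"chi-square = {chi2:.2f}, V = {v:.3f}, C = {c:.3f}")
```

Run as written, this gives a chi-square of about 111.18 rather than 111.19, because the hand calculation uses expected frequencies rounded to two decimals. V and C agree with the values above to three decimal places.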


Cramér’s V SPSS Results

Cramér’s V = 0.340

Level of significance: p ≤ 0.001


Lambda (λ): Guttman’s Coefficient of Predictability

$\lambda = (F_{iv} - M_{dv}) / (N - M_{dv})$

A technique to determine the extent to which the error in the prediction of one nominal variable can be reduced by knowledge of another nominal variable.

Assumptions

- The two variables cast in the frequency table are assumed to be nominal variables, or variables reduced to a nominal scale of measurement.
- If the significance of lambda is to be tested, the sample must be selected on a random basis from the population.


An Example

A survey of 195 people was conducted on attitudes toward the death penalty. The results were cast in a two-way frequency table by gender and attitude.

The Results

| Gender / Attitude | Favorable | Mixed | Unfavorable | Totals |
|---|---|---|---|---|
| Male | 60 | 20 | 15 | 95 |
| Female | 50 | 10 | 40 | 100 |
| Totals | 110 | 30 | 55 | 195 |

Q: To what extent can the error in the prediction of gender be reduced by knowledge of the person’s attitude toward the death penalty?

Q: To what extent can the error in the prediction of attitude toward the death penalty be reduced by knowledge of the person’s gender?


Predicting Gender from Attitude

Let attitude serve as the independent variable (IV) and gender serve as the dependent variable (DV)

Therefore, the columns in the frequency table are the categories of the IV and the rows the categories of the DV.

Calculating lambda

$\lambda = (F_{iv} - M_{dv}) / (N - M_{dv})$

$F_{iv}$ = Sum of the largest cell frequencies within each category of the IV, attitude

$M_{dv}$ = The largest marginal total in the categories of the DV, gender

N = The total number of cases


| Gender (DV) / Attitude (IV) | Favorable | Mixed | Unfavorable | Totals |
|---|---|---|---|---|
| Male | 60 | 20 | 15 | 95 |
| Female | 50 | 10 | 40 | 100 |
| Totals | 110 | 30 | 55 | 195 |

Calculation of $F_{iv}$, the sum of the largest cell frequencies in each category of the IV, attitude

$F_{iv} = (60 + 20 + 40) = 120$

Calculation of $M_{dv}$, the largest marginal total in the categories of the DV, gender

$M_{dv} = 100$

Calculation of lambda

$\lambda = (F_{iv} - M_{dv}) / (N - M_{dv})$

$\lambda = [(60 + 20 + 40) - 100] / (195 - 100) = 0.21$

Interpretation

The error in predicting gender is reduced by 0.21 (21%) by knowledge of attitude toward the death penalty.
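The same three quantities can be read off the table programmatically. The sketch below stores the survey counts in a nested dictionary (a layout chosen for this example, not something prescribed by the slides) and follows the definitions of $F_{iv}$, $M_{dv}$ and N given above.

```python
# Survey counts: gender (DV) in the outer keys, attitude (IV) in the inner keys.
table = {
    "Male":   {"Favorable": 60, "Mixed": 20, "Unfavorable": 15},
    "Female": {"Favorable": 50, "Mixed": 10, "Unfavorable": 40},
}
attitudes = ["Favorable", "Mixed", "Unfavorable"]

# F_iv: the largest cell within each attitude category, summed over categories
f_iv = sum(max(table[gender][a] for gender in table) for a in attitudes)

# M_dv: the largest marginal total among the gender categories
m_dv = max(sum(cells.values()) for cells in table.values())

# N: total number of cases
n = sum(sum(cells.values()) for cells in table.values())

lam = (f_iv - m_dv) / (n - m_dv)
print(f_iv, m_dv, n, round(lam, 3))  # 120 100 195 0.211
```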


Lambda: SPSS Results

$\lambda_{gender} = 0.211$, $\lambda_{attitude} = 0.00$

Significance: $p_{gender}$ ≤ 0.089, $p_{attitude}$ = NA


Predicting Attitude from Gender

Let gender serve as the independent variable (IV) and attitude serve as the dependent variable (DV)

Therefore, the columns in the frequency table are the categories of the DV and the rows the categories of the IV.

| Gender (IV) / Attitude (DV) | Favorable | Mixed | Unfavorable | Totals |
|---|---|---|---|---|
| Male | 60 | 20 | 15 | 95 |
| Female | 50 | 10 | 40 | 100 |
| Totals | 110 | 30 | 55 | 195 |

Calculating lambda

$\lambda = (F_{iv} - M_{dv}) / (N - M_{dv})$

$F_{iv} = (60 + 50) = 110$, the sum of the largest cell frequencies within each category of the IV, gender


$M_{dv} = 110$, the largest marginal total in the categories of the DV, attitude

$\lambda = [(60 + 50) - 110] / (195 - 110) = 0.0$

Interpretation

Since $\lambda = 0.0$, the reduction in the error in predicting attitude toward the death penalty by knowledge of a person’s gender would be 0%, or none at all.

Lambda is asymmetrical

Let one variable = X and the other = Y. The reduction in error in predicting X from Y will not necessarily be the same as the reduction of error in predicting Y from X.
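The asymmetry can be seen by running one function on the table and on its transpose. In this sketch the function name `lambda_coefficient` and the convention that rows hold the DV and columns hold the IV are assumptions made for the example.

```python
def lambda_coefficient(table):
    """Lambda for predicting the row variable (DV) from the column variable (IV).

    `table` is a list of rows of cell counts.
    """
    f_iv = sum(max(col) for col in zip(*table))  # largest cell per IV category
    m_dv = max(sum(row) for row in table)        # largest DV marginal total
    n = sum(sum(row) for row in table)           # total number of cases
    return (f_iv - m_dv) / (n - m_dv)


# Rows: Male, Female. Columns: Favorable, Mixed, Unfavorable.
gender_by_attitude = [[60, 20, 15],
                      [50, 10, 40]]

# Transposing swaps the roles of IV and DV.
attitude_by_gender = [list(col) for col in zip(*gender_by_attitude)]

lambda_gender = lambda_coefficient(gender_by_attitude)    # gender from attitude
lambda_attitude = lambda_coefficient(attitude_by_gender)  # attitude from gender
print(round(lambda_gender, 3), round(lambda_attitude, 3))  # 0.211 0.0
```

The same 195 cases give a lambda of 0.211 in one direction and 0.0 in the other, matching the two hand calculations.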


The Concept of the Reduction of Error in Prediction

| Gender (DV) / Attitude (IV) | Favorable | Mixed | Unfavorable | Totals |
|---|---|---|---|---|
| Male | 60 | 20 | 15 | 95 |
| Female | 50 | 10 | 40 | 100 |
| Totals | 110 | 30 | 55 | 195 |

Consider the problem of predicting gender from attitude. If we knew nothing about attitude, the best guess about a person’s gender would be the modal category, female ($n_f = 100$).

The number of errors, therefore, would be 95.

Taking attitude into consideration, we would predict

- Male if favorable (error = 50 females)
- Male if mixed attitude (error = 10 females)
- Female if unfavorable (error = 15 males)

Total errors = (50 + 10 + 15) = 75


The proportionate reduction in error (PRE) would be as follows

PRE = (errors without IV - errors with IV) / (errors without IV)

PRE = (95 - 75) / 95 = 0.21

NB: The PRE is identical to the previously calculated value of lambda using SPSS.

$\lambda$ = PRE = 0.21
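The error-counting argument can also be written out directly, without going through the lambda formula. The sketch below counts prediction errors with and without knowledge of attitude; the dictionary layout and variable names are illustrative choices, not part of the original material.

```python
# Survey counts: gender (DV) by attitude (IV).
table = {
    "Male":   {"Favorable": 60, "Mixed": 20, "Unfavorable": 15},
    "Female": {"Favorable": 50, "Mixed": 10, "Unfavorable": 40},
}
attitudes = ["Favorable", "Mixed", "Unfavorable"]

gender_totals = {g: sum(cells.values()) for g, cells in table.items()}
n = sum(gender_totals.values())

# Without the IV: always guess the modal gender; everyone else is an error.
errors_without_iv = n - max(gender_totals.values())

# With the IV: within each attitude, guess that column's modal gender.
errors_with_iv = 0
for attitude in attitudes:
    column = [table[g][attitude] for g in table]
    errors_with_iv += sum(column) - max(column)

pre = (errors_without_iv - errors_with_iv) / errors_without_iv
print(errors_without_iv, errors_with_iv, round(pre, 2))  # 95 75 0.21
```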