©The McGraw-Hill Companies, Inc. 2008 McGraw-Hill/Irwin Non-parametric: Analysis of Ranked Data Chapter 18

Upload: collin-long

Post on 14-Jan-2016

Non-parametric: Analysis of Ranked Data

Chapter 18

GOALS

Conduct the sign test for dependent samples using the binomial and standard normal distributions as the test statistics.

Conduct a test of hypothesis for dependent samples using the Wilcoxon signed-rank test.

Conduct and interpret the Kruskal-Wallis test for several independent samples.

Compute and interpret Spearman’s coefficient of rank correlation.

Conduct a test of hypothesis to determine whether the correlation among the ranks in the population is different from zero.

The Sign Test

The Sign Test is based on the sign of a difference between two related observations.

No assumption is necessary regarding the shape of the population of differences.

The test statistic follows the binomial distribution for small samples and is approximated by the standard normal (z) distribution for large samples.

The test requires dependent (related) samples.

The Sign Test continued

Procedure to conduct the test:

Determine the sign (+ or -) of the difference between related pairs.

Determine the number of usable pairs.

Compare the number of positive (or negative) differences to the critical value.

n is the number of usable pairs (without ties), X is the number of pluses or minuses, and the binomial probability π = .5.
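The first two steps of the procedure can be sketched in Python as follows. This is a minimal illustration, not the textbook's own code: the function name `sign_counts` and the paired scores are hypothetical, and the ratings are assumed to be coded as numbers so that a larger value means a higher rating.

```python
def sign_counts(before, after):
    """Count plus signs, minus signs, and ties for paired observations."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    plus = sum(1 for b, a in zip(before, after) if a > b)
    minus = sum(1 for b, a in zip(before, after) if a < b)
    ties = len(before) - plus - minus
    return plus, minus, ties


# Hypothetical paired scores (not taken from the slides).
before = [3, 2, 4, 1, 3, 2]
after = [4, 3, 4, 2, 2, 5]

plus, minus, ties = sign_counts(before, after)
n = plus + minus  # usable pairs: tied pairs are dropped
```

Here one pair is tied, so six sampled pairs leave five usable pairs, four of them with a plus sign.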

The Sign Test - Example

The director of information systems at Samuelson Chemicals recommended that an in-plant training program be instituted for managers. The objective is to improve the knowledge of database usage in accounting, procurement, production, and so on. A sample of 15 managers was selected at random. A panel of database experts determined the general level of competence of each manager with respect to using the database. Their competence and understanding were rated as being either outstanding, excellent, good, fair, or poor. After the three-month training program, the same panel of information systems experts rated each manager again. The two ratings (before and after) are shown along with the sign of the difference. A “+” sign indicates improvement, and a “-” sign indicates that the manager’s competence using databases had declined after the training program.

Did the in-plant training program effectively increase the competence of the managers using the company’s database?

Step 1: State the null and alternative hypotheses.

H0: π ≤ .5 (There is no increase in competence as a result of the in-plant training program.)

H1: π > .5 (There is an increase in competence as a result of the in-plant training program.)

Step 2: Select a level of significance. We chose the .10 level.

Step 3: Decide on the test statistic. It is the number of plus signs resulting from the experiment.

Step 4: Formulate a decision rule.

Step 5: Make a decision regarding the null hypothesis.

Eleven out of the 14 managers in the training course increased their database competency. The number 11 is in the rejection region, which starts at 10, so H0 is rejected.

We conclude that the three-month training course was effective. It increased the database competency of the managers.
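The decision rule can be checked with a short Python sketch that uses only the standard library. It takes n = 14 usable pairs, 11 plus signs, π = .5, and the .10 significance level from the example; the helper names `upper_tail` and `critical_value` are illustrative, not from the slides.

```python
from math import comb


def upper_tail(n, x, pi=0.5):
    """P(X >= x) for X ~ Binomial(n, pi)."""
    return sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(x, n + 1))


def critical_value(n, alpha, pi=0.5):
    """Smallest x with P(X >= x) <= alpha: where the upper rejection region starts."""
    for x in range(n + 1):
        if upper_tail(n, x, pi) <= alpha:
            return x
    return None  # no rejection region exists at this alpha


n, plus_signs, alpha = 14, 11, 0.10

start = critical_value(n, alpha)       # 10
p_value = upper_tail(n, plus_signs)    # 470 / 16384, about 0.029
reject_h0 = plus_signs >= start        # True
```

The upper-tail probability at 10 is about .090, which is below .10, while at 9 it is about .212, so the one-tailed rejection region starts at 10 plus signs.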

Normal Approximation

If the number of observations in the sample is larger than 10, the normal distribution can be used to approximate the binomial.

Normal Approximation - Example

The market research department of Cola, Inc., has been given the assignment of testing a new soft drink. Two versions of the drink are considered: a rather sweet drink and a somewhat bitter one. A preference test is to be conducted consisting of a sample of 64 consumers. Each consumer will taste both the sweet cola (labeled A) and the bitter one (labeled B) and indicate a preference. Conduct a test of hypothesis to determine if there is a difference in the preference for the sweet and bitter tastes. Use the .05 significance level.
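A two-tailed test with the normal approximation might be sketched as below. Several things here are assumptions rather than content from the slides: the z formula with a continuity correction of .5 is the usual textbook form, all 64 responses are assumed usable (no ties), and the count of 40 consumers preferring cola A is hypothetical because the observed count is not part of this transcript.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(x, n):
    """z for x plus signs in n usable pairs with pi = .5 (continuity-corrected)."""
    mean = 0.5 * n
    sd = 0.5 * sqrt(n)
    if x > mean:
        x_adj = x - 0.5
    elif x < mean:
        x_adj = x + 0.5
    else:
        x_adj = x  # exactly at the mean: no correction needed
    return (x_adj - mean) / sd


n = 64        # sample size from the example
x = 40        # HYPOTHETICAL number preferring cola A
alpha = 0.05

z = sign_test_z(x, n)                           # (40 - 0.5 - 32) / 4 = 1.875
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for a two-tailed test
reject_h0 = abs(z) > z_crit                     # False for this hypothetical count
```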

Wilcoxon Signed-Rank Test for Dependent Samples

If the assumption of normality is violated for the paired-t test, use the Wilcoxon Signed-rank test.

The test requires the ordinal scale of measurement. The observations must be related or dependent.

Wilcoxon Signed-Rank Test

The steps for the test are:

Compute the differences between related observations.

Rank the absolute differences from low to high.

Return the signs to the ranks and sum the positive and negative ranks.

Compare the smaller of the two rank sums with the T value, obtained from Appendix B.7.
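The first three steps can be sketched in Python as follows. The function name `signed_rank_sums` and the ratings are hypothetical. The sketch assumes that pairs with a zero difference are dropped and that tied absolute differences share the average of the ranks they occupy.

```python
def signed_rank_sums(before, after):
    """Return (R_plus, R_minus) for paired observations, dropping zero differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(abs(d) for d in diffs)

    # Average rank for each absolute difference (ties share the mean position).
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1   # 1-based position of the first occurrence
        count = ordered.count(value)
        rank_of[value] = first + (count - 1) / 2

    r_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    r_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return r_plus, r_minus


# Hypothetical ratings (not taken from the slides).
before = [10, 12, 9, 14, 11, 8]
after = [13, 11, 9, 18, 9, 13]

r_plus, r_minus = signed_rank_sums(before, after)   # differences: 3, -1, 0, 4, -2, 5
t_stat = min(r_plus, r_minus)                       # compared with the table value of T
```

The final step, the comparison with the critical value of T, still requires the table, which is not reproduced here.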

Wilcoxon Signed-Rank Test for Dependent Samples - Example

Fricker’s is a family restaurant chain located primarily in the southeastern part of the United States. It offers a full dinner menu, but its specialty is chicken. Recently, Fricker, the owner and founder, developed a new spicy flavor for the batter in which the chicken is cooked. Before replacing the current flavor, he wants to conduct some tests to be sure that patrons will like the spicy flavor better. To begin, he selects a random sample of 15 customers. Each sampled customer is given a small piece of the current chicken and asked to rate its overall taste on a scale of 1 to 20. A value near 20 indicates the participant liked the flavor, whereas a score near 0 indicates they did not like the flavor. Next, the same 15 participants are given a sample of the new chicken with the spicier flavor and again asked to rate its taste on a scale of 1 to 20.

The results are reported in the table on the right. Is it reasonable to conclude that the spicy flavor is preferred? Use the .05 significance level.

Each assigned rank in column 6 is then given the same sign as the original difference, and the results are reported in column 7. For example, the second participant has a difference score of 8 and a rank of 6. This value is located in the R+ section of column 7.

The R+ and R- columns are totaled. The sum of the positive ranks is 75 and the sum of the negative ranks is 30.

The smaller of the two rank sums is used as the test statistic and referred to as T.

The critical values for the Wilcoxon signed-rank test are located in Appendix B.7. A portion of that table is shown below.

The value at the intersection is 25, so the critical value is 25.

The value obtained from Appendix B.7 is the largest value in the rejection region, so the decision rule is to reject the null hypothesis if the smaller of the two rank sums is 25 or less.

In this case the smaller rank sum is 30, so the decision is not to reject the null hypothesis. We cannot conclude there is a difference in the flavor ratings between the current and the spicy flavors.
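The arithmetic behind this decision can be laid out in a few lines of Python. The rank sums (75 and 30) and the critical value (25) come from the slides; the recovery of n is an added check, based on the fact that the ranks 1 through n always total n(n + 1)/2.

```python
r_plus, r_minus = 75, 30   # rank sums reported in the example
critical_t = 25            # value quoted from Appendix B.7

# Ranks 1..n total n(n + 1) / 2, so the number of usable pairs can be recovered.
total = r_plus + r_minus                                          # 105
n = next(k for k in range(1, 100) if k * (k + 1) // 2 == total)   # 14

t_stat = min(r_plus, r_minus)       # 30
reject_h0 = t_stat <= critical_t    # False: 30 is larger than 25
```

A total of 105 corresponds to 14 usable pairs, which suggests that one of the 15 sampled customers gave both flavors the same rating and was dropped.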

Kruskal-Wallis Test: Analysis of Variance by Ranks

This test is used to compare three or more samples to determine whether they came from identical populations.

The ordinal scale of measurement is required.

It is an alternative to the one-way ANOVA.

The test statistic, H, approximately follows the chi-square distribution with k - 1 degrees of freedom, where k is the number of samples.

Each sample should have at least five observations.

The sample data are ranked from low to high as if they were a single group.
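Once the combined data have been ranked, the test statistic can be computed from the rank sums of the samples. The sketch below assumes the usual form of the statistic without a correction for ties, H = 12 / (n(n + 1)) × Σ(R_j² / n_j) - 3(n + 1). The function name and the rank sums are hypothetical, and the tiny samples are for checking the arithmetic only; they are smaller than the five observations per sample that the test calls for.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H statistic from per-sample rank sums and sample sizes (no tie correction)."""
    n = sum(sizes)
    weighted = sum(r * r / m for r, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical ranking of six observations: samples hold ranks {1, 2}, {3, 4}, {5, 6}.
rank_sums = [3, 7, 11]
sizes = [2, 2, 2]

h = kruskal_wallis_h(rank_sums, sizes)   # 32 / 7, about 4.571
```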

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example

A management seminar consists of executives from manufacturing, finance, and engineering. Before scheduling the seminar sessions, the seminar leader is interested in whether the three groups are equally knowledgeable about management principles. Plans are to take samples of the executives in manufacturing, in finance, and in engineering and to administer a test to each executive. If there is no difference in the scores for the three distributions, the seminar leader will conduct just one session. However, if there is a difference in the scores, separate sessions will be given. We will use the Kruskal-Wallis test instead of ANOVA because the seminar leader is unwilling to assume that (1) the populations of management scores follow the normal distribution or (2) the population standard deviations are the same.

Step 1:

H0: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are the same.

H1: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are NOT the same.

Step 2: H0 is rejected if H is greater than 5.991. There are k - 1 = 3 - 1 = 2 degrees of freedom, and the test uses the .05 significance level.
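With three samples there are k - 1 = 2 degrees of freedom, and the critical value of 5.991 used later in the example can be reproduced without a table. The sketch relies on a special property of the chi-square distribution with exactly 2 degrees of freedom: its upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). The function name is illustrative, and the shortcut does not apply to other degrees of freedom.

```python
from math import exp, log


def chi2_critical_df2(alpha):
    """Upper-tail critical value of chi-square with 2 df: solves exp(-x / 2) = alpha."""
    return -2 * log(alpha)


k = 3                                    # number of samples in the example
df = k - 1                               # 2 degrees of freedom
critical = chi2_critical_df2(0.05)       # about 5.991

# Check: the upper-tail area beyond the critical value equals alpha.
tail_area = exp(-critical / 2)           # 0.05 up to rounding
```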

The next step is to select random samples from the three populations. A sample of seven manufacturing, eight finance, and six engineering executives was selected. Their scores on the test are recorded below.

Considering the scores as a single population, the engineering executive with a score of 35 is the lowest, so it is ranked 1. There are two scores of 38. To resolve this tie, each score is given a rank of 2.5, found by (2+3)/2. This process is continued for all scores. The highest score is 107, and that finance executive is given a rank of 21. The scores, the ranks, and the sum of the ranks for each of the three samples are given in the table below.
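The tie-handling rule can be illustrated with a small Python sketch. Only the three lowest scores mentioned above (35, 38, 38) are used, because the full table of scores is not part of this transcript; the helper name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1     # 1-based position of the first occurrence
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


lowest_three = [35, 38, 38]              # the three lowest scores in the example
ranks = average_ranks(lowest_three)      # [1.0, 2.5, 2.5]
```

The two scores of 38 occupy positions 2 and 3, so each receives (2 + 3)/2 = 2.5.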

Because the computed value of H (5.736) is less than the critical value of 5.991, the null hypothesis is not rejected. There is not enough evidence to conclude there is a difference among the executives from manufacturing, finance, and engineering with respect to their typical knowledge of management principles. From a practical standpoint, the seminar leader should consider offering only one session including executives from all areas.

Rank-Order Correlation

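Spearman’s coefficient of rank correlation reports the association between two sets of ranked observations. The features are:

It can range from -1.00 up to 1.00.

It is similar to Pearson’s coefficient of correlation, but is based on ranked data.

It is computed using the formula:

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where d is the difference between the ranks for each pair and n is the number of paired observations.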
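A minimal Python sketch of the computation follows, assuming the standard difference-of-ranks form of the coefficient, in which d is the difference between the two ranks of each pair and n is the number of pairs. The function name and the ranks are hypothetical, and this form is exact only when there are no tied ranks.

```python
def spearman_rs(ranks_x, ranks_y):
    """Spearman's r_s from two lists of ranks, via 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


# Hypothetical ranks for five paired observations (not from the slides).
ranks_x = [1, 2, 3, 4, 5]
ranks_y = [2, 1, 4, 3, 5]

r_s = spearman_rs(ranks_x, ranks_y)   # sum of d^2 is 4, so r_s = 1 - 24 / 120 = 0.8
```

Identical rankings give 1.00 and exactly reversed rankings give -1.00, which matches the stated range.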

Rank-Order Correlation - Example

Lorrenger Plastics, Inc., recruits management trainees at colleges and universities throughout the United States. Each trainee is given a rating by the recruiter during the on-campus interview. This rating is an expression of future potential and may range from 0 to 15, with the higher score indicating more potential. The recent college graduate then enters an in-plant training program and is given another composite rating based on tests, opinions of group leaders, training officers, and so on. The on-campus rating and the in-plant training ratings are given in the table on the right.

Testing the Significance of rs

State the null hypothesis: Rank correlation in the population is 0.

State the alternate hypothesis: Rank correlation in the population is not 0.

For a sample of 10 or more, the significance of $r_s$ is determined by computing t using the following formula:

$$t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}}$$

The sampling distribution of $r_s$ follows the t distribution with n - 2 degrees of freedom.
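The test can be sketched in Python, assuming the usual t transformation of the rank correlation, t = r_s × sqrt((n - 2) / (1 - r_s²)). The values of r_s and n below are hypothetical, and the critical value 2.228 is the standard two-tailed .05 table value for 10 degrees of freedom, entered by hand rather than computed.

```python
from math import sqrt


def rs_t_statistic(r_s, n):
    """t statistic for testing whether the population rank correlation is zero."""
    # Undefined when r_s is exactly +1 or -1 (division by zero).
    return r_s * sqrt((n - 2) / (1 - r_s**2))


r_s, n = 0.8, 12              # HYPOTHETICAL sample result
df = n - 2                    # 10 degrees of freedom
t = rs_t_statistic(r_s, n)    # 0.8 * sqrt(10 / 0.36), about 4.216

t_crit = 2.228                # two-tailed, .05 level, 10 df (from a t table)
reject_h0 = abs(t) > t_crit   # True for these hypothetical values
```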

Testing the Significance of rs - Example