Lesson 15 - 7 Test to See if Samples Come From Same Population

Upload: gittel

Post on 05-Jan-2016


DESCRIPTION

Lesson 15 - 7. Test to See if Samples Come From Same Population. Objectives. Test a claim using the Kruskal–Wallis test. Vocabulary. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lesson 15 - 7

Lesson 15 - 7

Test to See if Samples Come From Same Population

Page 2: Lesson 15 - 7

Objectives

• Test a claim using the Kruskal–Wallis test

Page 3: Lesson 15 - 7

Vocabulary

• Kruskal–Wallis Test: nonparametric procedure used to test the claim that k (3 or more) independent samples come from populations with the same distribution.

Page 4: Lesson 15 - 7

Test of Means of 3 or more groups

● Parametric test of the means of three or more groups (one-way ANOVA): Compared the variability among the sample means with the variability within the samples. Performed an F-test of whether all the population means are equal.

● Nonparametric case for three or more groups: Combine all of the samples and rank this combined set of data. Compare the rankings for the different groups.

Page 5: Lesson 15 - 7

Kruskal-Wallis Test

● Assumptions: Samples are independent simple random samples from three or more populations. Data can be ranked.

● If the populations have the same distribution, we would expect the values of the samples, when combined into one large dataset, to be interspersed with each other

● Thus we expect the average rank of each sample to be about the same

Page 6: Lesson 15 - 7

Test Statistic for Kruskal–Wallis Test

The test statistic is

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{1}{n_i}\left[R_i - \frac{n_i(N+1)}{2}\right]^2$$

A computational formula for the test statistic is

$$H = \frac{12}{N(N+1)}\left[\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right] - 3(N+1)$$

where
R_i is the sum of the ranks of the ith sample
R_1^2 is the square of the sum of the ranks for the first sample
R_2^2 is the square of the sum of the ranks for the second sample, and so on
n_1 is the number of observations in the first sample
n_2 is the number of observations in the second sample, and so on
N is the total number of observations (N = n_1 + n_2 + … + n_k)
k is the number of populations being compared.
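As a minimal sketch, the two forms of H can be written as Python functions and checked against each other. The function names are hypothetical and chosen only for illustration; the inputs are the rank sums R_i and the sample sizes n_i, and the numbers in the usage lines are the rank sums from Homework Problem 3 later in this lesson.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H from the computational formula, using rank sums R_i and sizes n_i."""
    n_total = sum(sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def kruskal_wallis_h_deviation(rank_sums, sizes):
    """Same statistic, written as squared deviations of R_i from n_i(N + 1)/2."""
    n_total = sum(sizes)
    total = sum((r - n * (n_total + 1) / 2.0) ** 2 / n
                for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * total


# Rank sums 23.5, 31.5, 23 with four observations per sample (N = 12).
h = kruskal_wallis_h([23.5, 31.5, 23.0], [4, 4, 4])
h_dev = kruskal_wallis_h_deviation([23.5, 31.5, 23.0], [4, 4, 4])
print(round(h, 3), round(h_dev, 3))
```

The two forms agree only when the rank sums add up to N(N + 1)/2, which is always the case when the ranks 1 through N (with mean ranks for ties) have been assigned correctly.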

Page 7: Lesson 15 - 7

Test Statistic (cont)

● Large values of the test statistic H indicate that the rank sums Ri differ from what we would expect if the distributions were the same

● If H is too large, then we reject the null hypothesis that the distributions are the same

● This is always a right-tailed test

Page 8: Lesson 15 - 7

Critical Value for Kruskal–Wallis Test

Small-Sample Case: When three populations are being compared and the sample size from each population is 5 or less, the critical value is obtained from Table XIV in Appendix A.

Large-Sample Case: When four or more populations are being compared or the sample size from one population is more than 5, the critical value is χ²α with k – 1 degrees of freedom, where k is the number of populations and α is the level of significance.
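For the large-sample case, the table lookup can be approximated in code. The sketch below is an illustration only and rests on one stated assumption: it handles an even number of degrees of freedom (k = 3, 5, 7, … populations), because the chi-square upper-tail probability then has a simple closed form that needs nothing beyond the Python standard library. The function names are hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even, positive df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^j / j! for j = 0 .. df/2 - 1, then multiply by exp(-x/2).
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_critical_even_df(alpha, df, tol=1e-10):
    """Right-tail critical value: the x with P(X > x) = alpha, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail_even_df(hi, df) > alpha:
        hi *= 2.0  # widen the bracket until the tail area drops below alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical_even_df(0.05, 2), 3))  # k = 3 populations
print(round(chi2_critical_even_df(0.05, 4), 3))  # k = 5 populations
```

For an odd number of degrees of freedom, a printed chi-square table or a statistics library would be needed instead.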

Page 9: Lesson 15 - 7

Hypothesis Tests Using Kruskal–Wallis Test

Step 0 Requirements: 1. The samples are independent random samples. 2. The data can be ranked.

Step 1 Box Plots: Draw side-by-side boxplots to compare the sample data from the populations. Doing so helps to visualize the differences, if any, between the medians.

Step 2 Hypotheses: (claim is made regarding distribution of three or more populations)
H0: the distributions of the populations are the same
H1: the distributions of the populations are not the same

Step 3 Ranks: Rank all sample observations from smallest to largest. Handle ties by finding the mean of the ranks for tied values. Find the sum of the ranks for each sample.

Step 4 Level of Significance: (level of significance determines the critical value) The critical value is found from Table XIV for small samples. The critical value is χ²α with k – 1 degrees of freedom (found in Table VI) for large samples.

Step 5 Compute Test Statistic:

$$H = \frac{12}{N(N+1)}\left[\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right] - 3(N+1)$$

Step 6 Critical Value Comparison: We reject the null hypothesis if the test statistic is greater than the critical value.
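Steps 3 and 5 can be sketched in Python as follows. This is one possible implementation under simple assumptions (numeric data, ties given the mean of the ranks they occupy, no further adjustment for ties); the function names `midranks` and `kruskal_wallis` are hypothetical. The usage lines apply it to the X, Y, Z data of Homework Problem 3.

```python
def midranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = ((start + 1) + (end + 1)) / 2.0  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis(samples):
    """Return (rank sums, H) for a list of samples."""
    combined = [x for sample in samples for x in sample]
    ranks = midranks(combined)
    n_total = len(combined)
    rank_sums, offset = [], 0
    for sample in samples:
        rank_sums.append(sum(ranks[offset:offset + len(sample)]))
        offset += len(sample)
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h


x = [13, 9, 17, 12]
y = [16, 18, 11, 13]
z = [12, 14, 9, 15]
rank_sums, h = kruskal_wallis([x, y, z])
print(rank_sums, round(h, 3))
```

Step 6 then compares `h` with the critical value from Table XIV or the chi-square table.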

Page 10: Lesson 15 - 7

Kruskal–Wallis Test Hypothesis

• In this test, the hypotheses are

H0: The distributions of all of the populations are the same

H1: The distributions of the populations are not all the same

• This is a stronger hypothesis than in ANOVA, where only the means (and not the entire distributions) are compared

Page 11: Lesson 15 - 7

Example 1 from 15.7

Values, with ranks in parentheses:

S            20-29         40-49         60-69
1            54 (29)       61 (31.5)     44 (18)
2            43 (16)       41 (14)       65 (34.5)
3            38 (11.5)     44 (18)       62 (33)
4            30 (2)        47 (21)       53 (27.5)
5            61 (31.5)     33 (3)        51 (26)
6            53 (27.5)     29 (1)        49 (22.5)
7            35 (7.5)      59 (30)       49 (22.5)
8            34 (4.5)      35 (7.5)      42 (15)
9            39 (13)       34 (4.5)      35 (7.5)
10           46 (20)       74 (36)       44 (18)
11           50 (24.5)     50 (24.5)     37 (10)
12           35 (7.5)      65 (34.5)     38 (11.5)

Medians      41            45.5          46.5
Rank sums    194.5         225.5         246

Page 12: Lesson 15 - 7

Example 1 (cont)

$$H = \frac{12}{N(N+1)}\left[\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right] - 3(N+1)$$

$$H = \frac{12}{36(36+1)}\left[\frac{194.5^2}{12} + \frac{225.5^2}{12} + \frac{246^2}{12}\right] - 3(36+1) = 1.009$$

Critical Value: (Large-Sample Case) χ²α with 2 (= 3 – 1) degrees of freedom, where 3 is the number of populations and 0.05 is the level of significance

CV = 5.991

Conclusion: Since H = 1.009 < CV = 5.991, we fail to reject (FTR) H0. There is not sufficient evidence that the distributions are different.
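The rank sums and the value of H in Example 1 can be reproduced from the raw values. The sketch below is only a check on the hand calculation, not part of the original lesson; the variable names are hypothetical, and the three lists are the columns labelled 20-29, 40-49 and 60-69 in the example table.

```python
from collections import defaultdict

# Columns of the Example 1 table, in subject order 1..12.
group_20_29 = [54, 43, 38, 30, 61, 53, 35, 34, 39, 46, 50, 35]
group_40_49 = [61, 41, 44, 47, 33, 29, 59, 35, 34, 74, 50, 65]
group_60_69 = [44, 65, 62, 53, 51, 49, 49, 42, 35, 44, 37, 38]
samples = [group_20_29, group_40_49, group_60_69]

# Sort the combined data and record which 1-based positions each value occupies.
positions = defaultdict(list)
combined_sorted = sorted(x for sample in samples for x in sample)
for position, value in enumerate(combined_sorted, start=1):
    positions[value].append(position)

# A tied value gets the mean of the positions it occupies.
mean_rank = {value: sum(p) / len(p) for value, p in positions.items()}

rank_sums = [sum(mean_rank[x] for x in sample) for sample in samples]
n_total = sum(len(sample) for sample in samples)
weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

print(rank_sums)      # rank sum for each column
print(round(h, 3))    # test statistic
print(h > 5.991)      # True would mean "reject H0" at the 0.05 level
```

The last line applies the right-tailed decision rule with the critical value 5.991 taken from the example.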

Page 13: Lesson 15 - 7

Summary and Homework

• Summary

– The Kruskal-Wallis test is a nonparametric test for comparing the distributions of three or more populations

– This test is a comparison of the rank sums of the populations

– Critical values for small samples are given in tables

– The critical values for large samples can be approximated by a calculation with the chi-square distribution

• Homework

– problems 3, 5, 7, 10 from the CD

Page 14: Lesson 15 - 7

Homework Problem 3

Problem 3

Values and Ranks

Subject Nr    X     Y     Z     Rank X    Rank Y    Rank Z
1             13    16    12    6.5       10        4.5
2             9     18    14    1.5       12        8
3             17    11    9     11        3         1.5
4             12    13    15    4.5       6.5       9

                          i = 1     i = 2     i = 3
Ri = Sum of the Ranks     23.5      31.5      23
R²i =                     552.25    992.25    529
ni =                      4         4         4         N = 12

Sorts and Ranks (sorted value, rank in parentheses)

9 (1.5), 9 (1.5), 11 (3), 12 (4.5), 12 (4.5), 13 (6.5), 13 (6.5), 14 (8), 15 (9), 16 (10), 17 (11), 18 (12)

H = 0.875

Hcr = 5.6923, so FTR H0

Page 15: Lesson 15 - 7

Homework Problem 5

Problem 5

Ranks

         Mon      Tues     Wed      Thurs       Fri
         i = 1    i = 2    i = 3    i = 4       i = 5
Ri =     48       226      144      194.5       207.5
R²i =    2304     51076    20736    37830.25    43056.25
ni =     8        8        8        8           8           N = 40

H = 18.77058

Hcr = 9.488, so Reject H0
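A quick sanity check on any hand-ranked table is that the rank sums must add up to N(N + 1)/2, here 40(41)/2 = 820. The sketch below applies that check to the Problem 5 rank sums and then recomputes H from them; the function name is hypothetical and the critical value 9.488 is the one quoted in the problem.

```python
def rank_sums_are_consistent(rank_sums, n_total, tol=1e-9):
    """True when the rank sums total N(N + 1)/2, as correctly assigned ranks must."""
    return abs(sum(rank_sums) - n_total * (n_total + 1) / 2.0) < tol


rank_sums = [48, 226, 144, 194.5, 207.5]   # Mon, Tues, Wed, Thurs, Fri
sizes = [8, 8, 8, 8, 8]
n_total = sum(sizes)

consistent = rank_sums_are_consistent(rank_sums, n_total)

# Computational formula for H, using the rank sums and sample sizes.
weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
reject = h > 9.488   # right-tailed test against the quoted critical value

print(consistent, round(h, 5), reject)
```

If the check fails, the usual cause is a tied value that was given an individual rank instead of the mean rank of its tie group.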


Page 17: Lesson 15 - 7

Homework Problem 10

Problem 10

Values and Ranks

Subject Nr    CA     DN     US     Rank CA    Rank DN    Rank US
1             578    568    506    24         21         8
2             548    530    518    16.5       13.5       11
3             521    571    485    12         23         4
4             555    569    480    18         22         3
5             548    563    458    16.5       20         2
6             530    535    456    13.5       15         1
7             502    561    513    7          19         9.5
8             492    513    491    6          9.5        5

                          i = 1       i = 2     i = 3
Ri = Sum of the Ranks     113.5       143       43.5
R²i =                     12882.25    20449     1892.25
ni =                      8           8         8          N = 24

Sort & Rank (sorted value, rank in parentheses)

456 (1), 458 (2), 480 (3), 485 (4), 491 (5), 492 (6), 502 (7), 506 (8), 513 (9.5), 513 (9.5), 518 (11), 521 (12), 530 (13.5), 530 (13.5), 535 (15), 548 (16.5), 548 (16.5), 555 (18), 561 (19), 563 (20), 568 (21), 569 (22), 571 (23), 578 (24)

The two values of 548 are tied, so each receives rank 16.5, and the rank sums total 300 = 24(25)/2.

H = 13.05875

Hcr = 9.21, so Reject H0