ch11 [non-parametric tests]

Upload: cheena-alvarez-vitug

Post on 04-Jun-2018

  • 8/13/2019 Ch11 [Non-Parametric Tests]

    1/27

    Chapter 11: Nonparametric Tests

    Elementary Statistics (Larson, Farber)

    The Sign Test

    Section 11.1

    Nonparametric Tests

    A nonparametric test is a hypothesis test that does not require any specific conditions about the shape of the populations or the value of any population parameters. Such tests are often called "distribution free" tests.

    The Sign Test is a nonparametric test that can be used to test a population median against a hypothesized value, k.

    Hypotheses

    Left-tailed test: H0: median ≥ k and Ha: median < k

    or

    Right-tailed test: H0: median ≤ k and Ha: median > k

    or

    Two-tailed test: H0: median = k and Ha: median ≠ k

    Sign Test

    To use the sign test, first compare each entry in the sample to the hypothesized median, k.

    • If the entry is below the median, assign it a – sign.
    • If the entry is above the median, assign it a + sign.
    • If the entry is equal to the median, assign it a 0.

    Compare the number of + and – signs. (Ignore 0's.) If the number of + signs and the number of – signs are approximately equal, the null hypothesis is not likely to be rejected. If they are not approximately equal, however, it is likely that the null hypothesis will be rejected.

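    Sign Test

    Test Statistic: When n ≤ 25, the test statistic is x, the smaller number of + or – signs.

    When n > 25, the test statistic is:

    $$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$

    where x is the smaller number of + or – signs and n is the total number of + and – signs.

    For n > 25, you are testing the binomial probability that p = 0.50.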
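As a rough illustration of the large-sample case, the sketch below assumes the usual continuity-corrected form z = ((x + 0.5) − 0.5n) / (√n / 2), with x the smaller sign count and n the number of nonzero signs. The function name `sign_test_z` and the counts passed to it are made up for illustration; they do not come from the slides.

```python
import math


def sign_test_z(n_plus: int, n_minus: int) -> float:
    """Large-sample sign test statistic (intended for n > 25).

    Assumed formula: z = ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2),
    where x is the smaller sign count and n = n_plus + n_minus.
    """
    n = n_plus + n_minus
    x = min(n_plus, n_minus)
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)


# Hypothetical counts (not from the slides): 10 plus signs and 26 minus signs.
z = sign_test_z(10, 26)
print(z)
```

The resulting z would then be compared with a standard normal critical value for the chosen level of significance.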

    Application

    A meteorologist claims that the daily median temperature for the month of January in San Diego is 57° Fahrenheit. The temperatures (in degrees Fahrenheit) for 18 randomly selected January days are listed below. At α = 0.01, can you support the meteorologist's claim?

    58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55

    1. Write the null and alternative hypothesis.

    H0: median = 57° and Ha: median ≠ 57°

    2. State the level of significance.

    α = 0.01

    3. Determine the sampling distribution.

    Binomial with p = 0.5

    Since Ha contains the ≠ symbol, this is a two-tailed test.

    Compare each temperature with the hypothesized median, 57:

    Temp  Sign    Temp  Sign
    58    +       55    –
    62    +       60    +
    55    –       56    –
    55    –       57    0
    53    –       61    +
    52    –       58    +
    52    –       63    +
    59    +       63    +
    55    –       55    –

    There are 8 + signs and 9 – signs. So, n = 8 + 9 = 17.
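The sign counts for the temperature data can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the values are the 18 temperatures listed in the application.

```python
# The 18 January temperatures from the application, in degrees Fahrenheit.
temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]
k = 57  # hypothesized median

plus = sum(1 for t in temps if t > k)    # entries above the median
minus = sum(1 for t in temps if t < k)   # entries below the median
zeros = sum(1 for t in temps if t == k)  # entries equal to the median (ignored)

n = plus + minus             # sample size used by the test
statistic = min(plus, minus) # small-sample test statistic (n <= 25)
print(plus, minus, zeros, n, statistic)
```

The one entry equal to 57 is ignored, which is why n is 17 rather than 18.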

    4. Find the critical value. With n = 17, use Table 8. The critical value is 2.

    5. Find the rejection region. Reject H0 if the test statistic is less than or equal to 2.

    6. Find the test statistic. The test statistic is the smaller number of + or – signs, so the test statistic is 8.
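The tabled critical value can be sanity-checked from the binomial distribution with p = 0.5. The sketch below assumes the critical value is the largest count c for which the tail probability (doubled for a two-tailed test) does not exceed α. That rule is an assumption about how such tables are typically built, not something stated on the slides, and `sign_test_critical_value` is a hypothetical helper name.

```python
from math import comb


def sign_test_critical_value(n: int, alpha: float, tails: int = 2) -> int:
    """Largest c with tails * P(X <= c) <= alpha for X ~ Binomial(n, 0.5).

    Returns -1 when no count is extreme enough to reject at this level.
    """
    cumulative = 0.0
    critical = -1
    for c in range(n + 1):
        cumulative += comb(n, c) / 2 ** n  # add P(X = c)
        if tails * cumulative <= alpha:
            critical = c
        else:
            break
    return critical


critical = sign_test_critical_value(17, 0.01)  # n = 17, alpha = 0.01, two-tailed
reject = 8 <= critical                         # test statistic from the example is 8
print(critical, reject)
```

Under that assumption the computed value agrees with the critical value of 2 read from the table, and the test statistic of 8 is well outside the rejection region.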

    7. Make your decision. The test statistic, 8, does not fall in the rejection region. Fail to reject the null hypothesis.

    8. Interpret your decision. There is not enough evidence to reject the meteorologist's claim that the median daily temperature for January in San Diego is 57°.

    The sign test can also be used with paired data (such as before and after). Find the difference between corresponding values and record the sign. Use the same procedure.

    The Wilcoxon Test

    Section 11.2

    Wilcoxon Signed-Rank Test

    The Wilcoxon signed-rank test is a nonparametric test that can be used to determine whether two dependent samples were selected from populations with the same distribution.

    To find the test statistic, w_s:

    • Find the difference for each pair: Sample 1 value – Sample 2 value.
    • Find the absolute value of each difference.
    • Rank order these absolute differences (tied values share the average of their ranks).
    • Affix a + or – sign to each of the rankings.
    • Find the sum of the positive ranks.
    • Find the sum of the negative ranks.
    • Select the smaller of the absolute values of the sums.

    Application

    The table shows the daily headache hours suffered by 12 patients before and after receiving a new drug for seven weeks. At α = 0.01, is there enough evidence to conclude that the new drug helped to reduce daily headache hours?

    1. Write the null and alternative hypothesis.

    H0: The headache hours after using the new drug are at least as long as before using the drug.

    Ha: The new drug reduces headache hours. (Claim)

    2. State the level of significance.

    α = 0.01

    Patient  Before  After  Diff.  Abs  Rank  Signed rank
    1        2.1     2.2    –0.1   0.1  1.5   –1.5
    2        3.9     2.8     1.1   1.1  5.0    5.0
    3        3.8     2.5     1.3   1.3  6.0    6.0
    4        2.5     2.6    –0.1   0.1  1.5   –1.5
    5        2.4     1.9     0.5   0.5  3.0    3.0
    6        3.6     1.8     1.8   1.8  8.0    8.0
    7        3.4     2.0     1.4   1.4  7.0    7.0
    8        2.4     1.6     0.8   0.8  4.0    4.0

    The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33.

    The sum of the negative ranks is –1.5 + (–1.5) = –3.

    The test statistic is the smaller of the absolute values of these sums, w_s = 3.

    There are 8 + and – signs, so n = 8. The critical value is 2. Because w_s = 3 is greater than the critical value, fail to reject the null hypothesis. There is not enough evidence to conclude that the new drug reduces headache hours.
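The rank sums in this example can be reproduced with a short script. This is a minimal sketch, assuming tied absolute differences share the average of their ranks (as the 1.5 entries in the table suggest). `average_rank` is a hypothetical helper, and the before/after values are the eight pairs shown in the table.

```python
def average_rank(value, pooled):
    """1-based rank of value within pooled; tied values share their average rank."""
    less = sum(1 for p in pooled if p < value)
    equal = sum(1 for p in pooled if p == value)
    return less + (equal + 1) / 2


before = [2.1, 3.9, 3.8, 2.5, 2.4, 3.6, 3.4, 2.4]
after = [2.2, 2.8, 2.5, 2.6, 1.9, 1.8, 2.0, 1.6]

# Round to one decimal so floating-point noise cannot break the tie at 0.1.
diffs = [round(b - a, 1) for b, a in zip(before, after)]
diffs = [d for d in diffs if d != 0]  # zero differences carry no sign
abs_diffs = [abs(d) for d in diffs]
ranks = [average_rank(a, abs_diffs) for a in abs_diffs]

positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)  # stored as a magnitude
w_s = min(positive_sum, negative_sum)
print(ranks, positive_sum, negative_sum, w_s)
```

Here `negative_sum` holds the absolute value of the sum of the negative ranks, so taking the minimum of the two sums gives w_s directly.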

    Wilcoxon Rank-Sum Test

    The Wilcoxon rank-sum test is a nonparametric test that can be used to determine whether two independent samples were selected from populations having the same distribution.

    Both sample sizes must be at least 10. Then n1 represents the size of the smaller sample and n2 the size of the larger sample. When the samples are the same size, it does not matter which is n1.

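    Wilcoxon Rank-Sum Test

    Test statistic: Combine the data from both samples and rank it. R = the sum of the ranks for the smaller sample. Find the z-score for the value of R:

    $$z = \frac{R - \mu_R}{\sigma_R}$$

    where

    $$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad\text{and}\quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$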
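A small sketch of the rank-sum calculation follows. It assumes the standard normal approximation z = (R − μ_R) / σ_R with μ_R = n1(n1 + n2 + 1)/2 and σ_R = √(n1·n2·(n1 + n2 + 1)/12). The helper names and the two data sets are invented for illustration, since the slides give no worked data for this test.

```python
import math


def average_rank(value, pooled):
    """1-based rank of value within pooled; tied values share their average rank."""
    less = sum(1 for p in pooled if p < value)
    equal = sum(1 for p in pooled if p == value)
    return less + (equal + 1) / 2


def rank_sum_z(sample_a, sample_b):
    """Return (R, z), where R is the rank sum of the smaller sample."""
    # sorted() is stable, so with equal sizes sample_a plays the role of n1.
    small, large = sorted((list(sample_a), list(sample_b)), key=len)
    n1, n2 = len(small), len(large)
    pooled = small + large
    R = sum(average_rank(v, pooled) for v in small)
    mu_R = n1 * (n1 + n2 + 1) / 2
    sigma_R = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return R, (R - mu_R) / sigma_R


# Hypothetical data (not from the slides): two samples of size 10 with no overlap.
group_a = list(range(1, 11))
group_b = list(range(11, 21))
R, z = rank_sum_z(group_a, group_b)
print(R, round(z, 2))
```

Because every value in `group_a` ranks below every value in `group_b`, R is as small as it can be and z is strongly negative.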

    The Kruskal-Wallis Test

    Section 11.3

    The Kruskal-Wallis Test

    The Kruskal-Wallis test is a nonparametric test that can be used to determine whether three or more independent samples were selected from populations having the same distribution.

    H0: There is no difference in the population distributions.

    Ha: There is a difference in the population distributions.

    Combine the data and rank the values. Then separate the data according to sample and find the sum of the ranks for each sample.

    R_i = the sum of the ranks for sample i.

    The Kruskal-Wallis Test

    Given three or more independent samples, the test statistic H for the Kruskal-Wallis test is:

    $$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right) - 3(N+1)$$

    where k represents the number of samples, n_i is the size of the ith sample, N is the sum of the sample sizes, and R_i is the sum of the ranks of the ith sample.

    The sampling distribution is a chi-square distribution with k – 1 degrees of freedom (where k = the number of samples).

    Reject the null hypothesis when H is greater than the critical value. (Always use a right-tailed test.)

    Application

    You want to compare the hourly pay rates of accountants who work in Michigan, New York and Virginia. To do so, you randomly select 10 accountants in each state and record their hourly pay rate as shown below. At the 0.01 level, can you conclude that the distributions of accountants' hourly pay rates in these three states are different?

    MI (1)  NY (2)  VA (3)
    14.24   21.18   17.02
    14.06   20.94   20.63
    14.85   16.26   17.47
    17.47   21.03   15.54
    14.83   19.95   15.38
    19.01   17.54   14.90
    13.08   14.89   20.48
    15.94   18.88   18.50
    13.48   20.06   12.80
    16.94   21.81   15.57

    1. Write the null and alternative hypothesis.

    H0: There is no difference in the hourly pay rate in the 3 states.

    Ha: There is a difference in the hourly pay rate in the 3 states.

    2. State the level of significance.

    α = 0.01

    3. Determine the sampling distribution.

    The sampling distribution is chi-square with d.f. = 3 – 1 = 2.

    4. Find the critical value.

    From Table 6, the critical value is 9.210.

    5. Find the rejection region.

    Reject H0 if the test statistic is greater than 9.210 (the right tail of the χ² distribution).
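The tabled critical value can be cross-checked without a table in this particular case. The sketch relies on the fact that a chi-square distribution with exactly 2 degrees of freedom has right-tail probability P(X > x) = exp(−x/2); the shortcut does not carry over to other degrees of freedom.

```python
import math

alpha = 0.01

# With 2 degrees of freedom, P(X > x) = exp(-x / 2), so the right-tail
# critical value solves exp(-x / 2) = alpha.
critical_value = -2 * math.log(alpha)
print(round(critical_value, 3))
```

The result agrees with the critical value of 9.210 read from the table.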

    Test Statistic

    Data   State  Rank
    12.80  VA     1
    13.08  MI     2
    13.48  MI     3
    14.06  MI     4
    14.24  MI     5
    14.83  MI     6
    14.85  MI     7
    14.89  NY     8
    14.90  VA     9
    15.38  VA     10
    15.54  VA     11
    15.57  VA     12
    15.94  MI     13
    16.26  NY     14
    16.94  MI     15
    17.02  VA     16
    17.47  MI     17.5
    17.47  VA     17.5
    17.54  NY     19
    18.50  VA     20
    18.88  NY     21
    19.01  MI     22
    19.95  NY     23
    20.06  NY     24
    20.48  VA     25
    20.63  VA     26
    20.94  NY     27
    21.03  NY     28
    21.18  NY     29
    21.81  NY     30

    Michigan salaries are in ranks: 2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22. The sum is 94.5.

    New York salaries are in ranks: 8, 14, 19, 21, 23, 24, 27, 28, 29, 30. The sum is 223.

    Virginia salaries are in ranks: 1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26. The sum is 147.5.

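    6. Find the test statistic.

    R1 = 94.5, R2 = 223, R3 = 147.5

    n1 = 10, n2 = 10 and n3 = 10, so N = 30

    $$H = \frac{12}{30(31)}\left(\frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10}\right) - 3(31) \approx 10.76$$

    7. Make your decision.

    The test statistic 10.76 falls in the rejection region (10.76 > 9.210), so reject the null hypothesis.

    8. Interpret your decision.

    At the 0.01 level, there is enough evidence to conclude that the distributions of hourly pay rates differ among the 3 states.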
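The rank sums and the value of H in this example can be recomputed from the raw pay rates. The sketch below assumes the usual Kruskal-Wallis form H = 12 / (N(N + 1)) · Σ(R_i² / n_i) − 3(N + 1) with average ranks for ties and no tie correction. The helper `average_rank` and the dictionary layout are illustrative choices, and the data are transcribed from the application table.

```python
def average_rank(value, pooled):
    """1-based rank of value within pooled; tied values share their average rank."""
    less = sum(1 for p in pooled if p < value)
    equal = sum(1 for p in pooled if p == value)
    return less + (equal + 1) / 2


# Hourly pay rates transcribed from the application table.
samples = {
    "MI": [14.24, 14.06, 14.85, 17.47, 14.83, 19.01, 13.08, 15.94, 13.48, 16.94],
    "NY": [21.18, 20.94, 16.26, 21.03, 19.95, 17.54, 14.89, 18.88, 20.06, 21.81],
    "VA": [17.02, 20.63, 17.47, 15.54, 15.38, 14.90, 20.48, 18.50, 12.80, 15.57],
}

pooled = [v for values in samples.values() for v in values]
N = len(pooled)

# Sum of the pooled ranks for each state.
rank_sums = {
    state: sum(average_rank(v, pooled) for v in values)
    for state, values in samples.items()
}

H = 12 / (N * (N + 1)) * sum(
    rank_sums[state] ** 2 / len(values) for state, values in samples.items()
) - 3 * (N + 1)

print(rank_sums, round(H, 2))
```

The tie at 17.47 (one Michigan and one Virginia entry) is what produces the two ranks of 17.5.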

    Rank Correlation

    Section 11.4

    Rank Correlation

    The Spearman rank correlation coefficient, r_s, is a measure of the strength of the relationship between two variables. It is calculated using the ranks of paired sample data entries. The formula for the Spearman rank correlation coefficient is

    $$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

    where n is the number of paired data entries and d is the difference between the ranks of a paired data entry.

    The hypotheses:

    H0: ρ_s = 0 (There is no correlation between the variables.)

    Ha: ρ_s ≠ 0 (There is a significant correlation between the variables.)

    Rank Correlation

    Seven candidates applied for a nursing position. The seven candidates were placed in rank order first by x and then by y. The results of the rankings are listed below. Using a 0.05 level of significance, test the claim that there is a significant correlation between the variables.

    H0: ρ_s = 0 (There is no correlation between the variables.)

    Ha: ρ_s ≠ 0 (There is a significant correlation between the variables.)

    Candidate  x  y
    1          2  1
    2          4  4
    3          1  3
    4          5  2
    5          7  6
    6          3  1
    7          6  7

    Application

    Candidate  x  y  d = x – y  d²
    1          2  1   1         1
    2          4  4   0         0
    3          1  3  –2         4
    4          5  2   3         9
    5          7  6   1         1
    6          3  1   2         4
    7          6  7  –1         1

    Σd² = 20

    $$r_s = 1 - \frac{6(20)}{7(7^2 - 1)} = 1 - \frac{120}{336} \approx 0.643$$

    Critical value = 0.715

    Since the statistic 0.643 does not fall in the rejection region, fail to reject H0. There is not enough evidence to support the claim that there is a significant correlation.
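The coefficient for this example can be recomputed from the two rank columns. This is a sketch assuming the formula r_s = 1 − 6Σd² / (n(n² − 1)). The list names are illustrative, and the ranks are copied from the table exactly as given.

```python
# Ranks for the seven candidates, as listed in the table.
x_ranks = [2, 4, 1, 5, 7, 3, 6]
y_ranks = [1, 4, 3, 2, 6, 1, 7]

n = len(x_ranks)
d_squared = [(x - y) ** 2 for x, y in zip(x_ranks, y_ranks)]
sum_d_squared = sum(d_squared)

r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

critical_value = 0.715              # value quoted in the application
reject = abs(r_s) > critical_value  # two-tailed comparison
print(sum_d_squared, round(r_s, 3), reject)
```

Since |r_s| stays below the quoted critical value, the script reaches the same fail-to-reject decision as the slide.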