chapter 15 nonparametric statistics. learning objectives determine situations where nonparametric...

CHAPTER 15 NONPARAMETRIC STATISTICS

Upload: caitlin-logan

Post on 28-Dec-2015





TRANSCRIPT

Page 1: CHAPTER 15 NONPARAMETRIC STATISTICS. Learning Objectives Determine situations where nonparametric procedures are better alternatives to the parametric

CHAPTER 15

NONPARAMETRIC STATISTICS

Page 2

Learning Objectives

• Determine situations where nonparametric procedures are better alternatives to the parametric tests

• Understand the assumptions of nonparametric tests

• Use one- and two-sample nonparametric tests

• Use nonparametric alternatives to the single-factor ANOVA

Page 3

Nonparametric vs. Parametric

• Earlier chapters assumed that we are working with random samples from normal populations

• Those procedures are called parametric methods because they are based on a particular parametric family of distributions

• This chapter describes procedures called nonparametric methods

• Nonparametric methods make no assumptions about the population distribution other than that it is continuous

Page 4

Why Nonparametric Procedures

• Useful when distributions are not close to normal

• Data need not be quantitative but can be categorical (such as yes or no, defective or nondefective) or rank data

• Are usually very quick and easy to perform

• May provide considerable improvement over the normal-theory parametric methods when the underlying distribution is not normal

• Disadvantage: they do not utilize all the information provided by the sample

• As a result, they require a larger sample size than the corresponding parametric procedure to achieve the same power when the population is normal

Page 5

Which One?

• Which one to choose?

• If both methods are applicable to a particular problem, use the more efficient parametric procedure

• Otherwise, use the nonparametric procedure

Page 6

SIGN TEST

• Used to test hypotheses about the median of a continuous distribution

• The mean of a normal distribution equals the median

• So the sign test can also be used to test hypotheses about the mean of a normal distribution

• That problem was handled with the t-test in Chapter 9

• The sign test is appropriate for samples from any continuous distribution

• It is the nonparametric counterpart of the t-test

Page 7

Description of the Test

• Use the following differences

$X_i - \tilde{\mu}_0, \quad i = 1, 2, \ldots, n$

• $X_i$ is the ith sample observation and $\tilde{\mu}_0$ is the specified median value

• Under $H_0$, the number of plus signs is a value of a binomial random variable that has the parameter p = 1/2

• Reject $H_0$ if the proportion of plus signs is significantly different from 1/2

Page 8

Using P-value

• Use the P-value, where $r^+$ is the observed number of plus signs and $R^+$ is the corresponding random variable

• If $r^+ < n/2$, the P-value is

$P = 2P\left(R^+ \le r^+ \text{ when } p = \tfrac{1}{2}\right)$

• If $r^+ > n/2$, the P-value is

$P = 2P\left(R^+ \ge r^+ \text{ when } p = \tfrac{1}{2}\right)$

• If the P-value is less than the significance level α, we will reject $H_0$ and conclude that $H_1$ is true

Page 9

The Normal Approximation

• The binomial distribution is well approximated by a normal distribution when n > 10 and p = 0.5

• Mean = np and variance = np(1-p), so under $H_0$ the mean is 0.5n and the variance is 0.25n

• Test statistic

$Z_0 = \dfrac{R^+ - 0.5n}{0.5\sqrt{n}}$

• The critical region can be chosen from the table of the standard normal distribution
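The test statistic above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `sign_test_normal_approx` and the counts (15 plus signs out of 20) are hypothetical, and the two-sided P-value is taken from the standard normal tail.

```python
from math import erfc, sqrt


def sign_test_normal_approx(r_plus, n):
    """Return (z0, two-sided P-value) for the sign test statistic."""
    # Under H0, R+ is binomial(n, 0.5): mean 0.5n, standard deviation 0.5*sqrt(n).
    z0 = (r_plus - 0.5 * n) / (0.5 * sqrt(n))
    # Two-sided normal tail area: 2 * (1 - Phi(|z0|)) = erfc(|z0| / sqrt(2)).
    p_value = erfc(abs(z0) / sqrt(2))
    return z0, p_value


# Hypothetical counts, not taken from the slides: 15 plus signs out of 20.
z0, p_value = sign_test_normal_approx(15, 20)
print(f"z0 = {z0:.3f}, two-sided P = {p_value:.4f}")
```

For a one-sided alternative, half of the returned P-value applies, provided the sign of z0 points in the direction of the alternative.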

Page 10

Sign Test for Paired Samples

• Applied to paired observations drawn from two continuous populations

• Define the paired differences as

$D_j = X_{1j} - X_{2j}, \quad j = 1, 2, \ldots, n$

• Test the hypothesis that the two populations have a common median

• This is equivalent to testing $\tilde{\mu}_D = 0$

• Done by applying the sign test to the n observed differences

Page 11

Example

• Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH was determined.

• The sample pH values are 7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42

• Manufacturing engineering believes that pH has a median value of 7.0. Do the sample data indicate that this statement is correct? Use the sign test with α = 0.05 to investigate this hypothesis. Find the P-value for this test.

Page 12

Calculate the differences

• Use the general procedure covered in Chapter 8

1. Parameter of interest is the median of the distribution of pH

2. $H_0: \tilde{\mu} = 7.0$

3. $H_1: \tilde{\mu} \ne 7.0$

4. α = 0.05

Page 13

Solution - Cont

• Data and the observed plus signs

 i     xi    xi-7   Sign
 1   7.91   +0.91    +
 2   7.85   +0.85    +
 3   6.82   -0.18    -
 4   8.01   +1.01    +
 5   7.46   +0.46    +
 6   6.95   -0.05    -
 7   7.05   +0.05    +
 8   7.35   +0.35    +
 9   7.25   +0.25    +
10   7.42   +0.42    +

5. Test statistic is the observed number of plus differences, r+ = 8

6. Reject H0 if the P-value corresponding to r+ = 8 is less than or equal to α = 0.05

Page 14

Solution-Cont.

7. Since r+ = 8 > n/2 = 5, we calculate the P-value by using the binomial formula with n = 10 and p = 0.5

• Hence, the P-value is

$P = 2P(R^+ \ge 8 \mid p = 0.5) = 2\sum_{r=8}^{10} \binom{10}{r} (0.5)^r (0.5)^{10-r} = 0.109$

• Since P = 0.109 is not less than α = 0.05, we cannot reject the null hypothesis

8. The observed number of plus signs, r+ = 8, was not large enough to indicate that the median pH is different from 7.0
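The binomial calculation in this solution can be reproduced with a short Python sketch. The function name `sign_test` is hypothetical, and dropping observations exactly equal to the hypothesized median is an assumption (none occur in the pH data). The data are the ten pH readings from the example.

```python
from math import comb


def sign_test(data, median0):
    """Return (r_plus, n, two-sided P-value) for H0: median == median0."""
    # Assumption: observations exactly equal to median0 carry no sign and are dropped.
    diffs = [x - median0 for x in data if x != median0]
    n = len(diffs)
    r_plus = sum(1 for d in diffs if d > 0)
    # Sum binomial(n, 0.5) counts in the tail on the side where r+ falls.
    if r_plus < n / 2:
        tail = sum(comb(n, r) for r in range(0, r_plus + 1))
    else:
        tail = sum(comb(n, r) for r in range(r_plus, n + 1))
    # Double the tail probability for the two-sided test, capped at 1.
    p_value = min(1.0, 2 * tail / 2**n)
    return r_plus, n, p_value


ph = [7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42]
r_plus, n, p_value = sign_test(ph, 7.0)
print(r_plus, n, round(p_value, 3))  # 8 10 0.109
```

The tail sum is (45 + 10 + 1)/1024, so doubling it gives 0.109, matching the value above.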

Page 15

Using Table

• Table of critical values for the sign test

• Appendix Table VII is for two-sided and one-sided alternative hypotheses

• Let R = min(R+, R-)

• Reject H0

– If r- ≤ critical value, when H1 is of the form (>)

– If r+ ≤ critical value, when H1 is of the form (<)

– If r ≤ critical value, when H1 is of the form (≠)

Page 16

Wilcoxon Signed-rank Test

• The sign test uses only the plus and minus signs of the differences

• It does not take into consideration the size or magnitude of these differences

• The Wilcoxon signed-rank test uses both direction (sign) and magnitude

• It applies to symmetric and continuous distributions, for which the mean equals the median

• Test H0: µ = µ0

Page 17

Description of the Test

• Compute the following quantities

$X_i - \mu_0, \quad i = 1, 2, \ldots, n$

• $X_i$ is the ith sample observation and $\mu_0$ is the specified median or mean value

• Rank the absolute differences in ascending order

• Give the ranks the signs of their corresponding differences

• Let W+ be the sum of the positive ranks and W- be the absolute value of the sum of the negative ranks, and let W = min(W+, W-)

• Table VIII contains critical values of W

• Reject H0

– If w- ≤ critical value, when H1 is of the form (>)

– If w+ ≤ critical value, when H1 is of the form (<)

– If w ≤ critical value, when H1 is of the form (≠)

Page 18

Large-Sample Approximation

• For a large sample size (n > 20), W+ (or W-) has approximately a normal distribution

• Mean and variance

$\mu_{W^+} = \dfrac{n(n+1)}{4}$

$\sigma^2_{W^+} = \dfrac{n(n+1)(2n+1)}{24}$

• Test statistic

$Z_0 = \dfrac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$

• An appropriate critical region can be chosen from a table of the standard normal distribution
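As a rough illustration of the large-sample statistic, the Python sketch below evaluates Z0 from a rank sum and a sample size. The function name `signed_rank_z` and the inputs (w+ = 250, n = 25) are hypothetical and are not taken from the slides.

```python
from math import sqrt


def signed_rank_z(w_plus, n):
    """Normal approximation for the Wilcoxon signed-rank statistic W+."""
    mean = n * (n + 1) / 4                      # mean of W+ under H0
    variance = n * (n + 1) * (2 * n + 1) / 24   # variance of W+ under H0
    return (w_plus - mean) / sqrt(variance)


# Hypothetical inputs: sum of positive ranks 250 from n = 25 nonzero differences.
z0 = signed_rank_z(250, 25)
print(f"z0 = {z0:.3f}")
```

The resulting z0 is then compared with standard normal critical values in the usual way.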

Page 19

Paired Observations

• Applied to paired observations drawn from two continuous and symmetric populations

• Define the paired differences as

$D_j = X_{1j} - X_{2j}, \quad j = 1, 2, \ldots, n$

• Test the hypothesis that the two populations have a common mean

• This is equivalent to testing that the mean of the differences is $\mu_D = 0$

Page 20

Description of the Test

• Differences are first ranked in ascending order of their absolute values

• Ranks are given the signs of the differences

• Ties are assigned average ranks

• Let W+ be the sum of the positive ranks and W- be the absolute value of the sum of the negative ranks, and let W = min(W+, W-)

• Table VIII contains critical values of W

• Reject H0

– If w- ≤ critical value, when H1 is of the form (>)

– If w+ ≤ critical value, when H1 is of the form (<)

– If w ≤ critical value, when H1 is of the form (≠)

Page 21

Example

• Consider the data in the previous example and assume that the distribution of pH is symmetric and continuous.

• Use the Wilcoxon signed-rank test with α = 0.05 to test the following hypothesis: H0: µ = 7 vs. H1: µ ≠ 7

Page 22

Solution

1. Parameter of interest is the mean of the pH

2. H0: µ = 7

3. H1: µ ≠ 7

4. α = 0.05

5. Test statistic

w = min(w+, w-)

6. Reject H0 if w ≤ $w^*_{0.05}$ = 8 from Table VIII

Page 23

Solution – Cont.

7. Signed ranks, with the observations ordered by absolute difference

 i     xi    xi-7   Signed rank
 1   7.05   +0.05      +1.5
 2   6.95   -0.05      -1.5
 3   6.82   -0.18      -3
 4   7.25   +0.25      +4
 5   7.35   +0.35      +5
 6   7.42   +0.42      +6
 7   7.46   +0.46      +7
 8   7.85   +0.85      +8
 9   7.91   +0.91      +9
10   8.01   +1.01      +10

• Determine the minimum value of the following

• w+ = (1.5 + 4 + 5 + 6 + 7 + 8 + 9 + 10) = 50.5

• w- = (1.5 + 3) = 4.5

• Test statistic is w = min(50.5, 4.5) = 4.5
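The rank sums in this solution can be checked with a small Python sketch. The function name `signed_rank_sums` is hypothetical. Rounding the differences before comparing them is an added assumption, used so that floating-point noise does not break the tie between 7.05 and 6.95. Dropping zero differences is also an assumption.

```python
def signed_rank_sums(data, mu0, ndigits=10):
    """Return (w_plus, w_minus) for the Wilcoxon signed-rank test."""
    # Round so that equal absolute differences compare as exactly equal.
    diffs = [round(x - mu0, ndigits) for x in data]
    diffs = [d for d in diffs if d != 0]  # assumption: zero differences are dropped
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average = (i + j + 2) / 2  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


ph = [7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42]
w_plus, w_minus = signed_rank_sums(ph, 7.0)
print(w_plus, w_minus, min(w_plus, w_minus))  # 50.5 4.5 4.5
```

As a quick consistency check, the two sums add to n(n+1)/2 = 55 for n = 10.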

Page 24

Solution-Cont.

8. Since w = 4.5 is less than the critical value $w^*_{0.05}$ = 8, we reject the null hypothesis

Page 25

WILCOXON RANK-SUM TEST

• Statistical inference for two samples

• The Wilcoxon rank-sum test is a nonparametric alternative to the two-sample t-test

• Two independent continuous populations X1 and X2 with means µ1 and µ2

• Wish to test the following hypotheses

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \ne \mu_2$

• n1 and n2 are the sample sizes, with n1 ≤ n2

Page 26

Description of the Test

• Arrange all n1 + n2 observations in ascending order of magnitude and assign ranks to them

• Ties are assigned the average rank

• Let W1 be the sum of the ranks in the smaller sample (1), and define W2 to be the sum of the ranks in the other sample

• W2 can also be found from

$W_2 = \dfrac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - W_1$

• Table IX contains the critical values of the rank sums for two significance levels

• Reject H0

– If w2 ≤ critical value, when H1 is of the form (>)

– If w1 ≤ critical value, when H1 is of the form (<)

– If either w1 or w2 ≤ critical value, when H1 is of the form (≠)
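The ranking procedure described above can be sketched in Python as follows. The helper names `average_ranks` and `rank_sums` and the two small samples are hypothetical, chosen only so that one tie appears. The sketch also confirms the identity for W2.

```python
def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j + 2) / 2  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sums(sample1, sample2):
    """Return (w1, w2): rank sums of the two samples in the pooled ranking."""
    ranks = average_ranks(list(sample1) + list(sample2))
    n1 = len(sample1)
    return sum(ranks[:n1]), sum(ranks[n1:])


# Hypothetical data; sample 1 is the smaller sample.
x1 = [3.1, 4.5, 2.8]
x2 = [5.0, 4.5, 6.2, 3.9]
w1, w2 = rank_sums(x1, x2)
total = (len(x1) + len(x2)) * (len(x1) + len(x2) + 1) / 2
print(w1, w2, total - w1)  # 7.5 20.5 20.5
```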

Page 27

Large-Sample Approximation

• When both n1 and n2 are moderately large, the distribution of W1 can be well approximated by the normal distribution with the following mean and variance

$\mu_{W_1} = \dfrac{n_1(n_1 + n_2 + 1)}{2}$

$\sigma^2_{W_1} = \dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}$

• Test statistic

$Z_0 = \dfrac{W_1 - \mu_{W_1}}{\sigma_{W_1}}$

• An appropriate critical region can be chosen from the table of the standard normal distribution
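A minimal Python sketch of this approximation is shown below. The function name `rank_sum_z` and the inputs (w1 = 80, n1 = 10, n2 = 12) are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt


def rank_sum_z(w1, n1, n2):
    """Normal approximation for the Wilcoxon rank-sum statistic W1."""
    mean = n1 * (n1 + n2 + 1) / 2            # mean of W1 under H0
    variance = n1 * n2 * (n1 + n2 + 1) / 12  # variance of W1 under H0
    return (w1 - mean) / sqrt(variance)


# Hypothetical inputs: rank sum 80 for the smaller sample, n1 = 10, n2 = 12.
z0 = rank_sum_z(80, 10, 12)
print(f"z0 = {z0:.3f}")
```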

Page 28

Kruskal-Wallis Test

• Recall the single-factor analysis of variance model

• The error terms $\epsilon_{ij}$ were assumed to be normally distributed with mean zero and variance $\sigma^2$

• The Kruskal-Wallis test is a nonparametric alternative

• The error terms $\epsilon_{ij}$ are assumed only to be from the same continuous distribution

Page 29

Description of the Test

• Compute the total number of observations

$N = \sum_{i=1}^{a} n_i$

• Rank all N observations from smallest to largest

• Assign the smallest observation rank 1, the next smallest rank 2, . . . , and the largest observation rank N

• Let $R_{ij}$ be the rank of observation $Y_{ij}$

• Let $R_{i.}$ denote the total and $\bar{R}_{i.}$ the average of the $n_i$ ranks in the ith treatment

Page 30

Test Statistic

• Calculate

$H = \dfrac{12}{N(N+1)} \sum_{i=1}^{a} n_i \left( \bar{R}_{i.} - \dfrac{N+1}{2} \right)^2$

• H has approximately a chi-square distribution with a-1 degrees of freedom

• Reject H0 if the observed value h is greater than or equal to the critical value, that is, if

$h \ge \chi^2_{\alpha, a-1}$

• The critical value is taken from the upper tail of the chi-square distribution table
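The statistic H can be computed directly from its definition, as in the Python sketch below. The function name `kruskal_wallis_h` and the three small groups are hypothetical. The sketch assumes there are no tied observations and raises an error otherwise.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for groups with no tied observations."""
    pooled = [y for group in groups for y in group]
    n_total = len(pooled)
    if len(set(pooled)) != n_total:
        raise ValueError("this sketch assumes no ties among the observations")
    # Rank 1 goes to the smallest observation, rank N to the largest.
    rank_of = {y: rank for rank, y in enumerate(sorted(pooled), start=1)}
    total = 0.0
    for group in groups:
        mean_rank = sum(rank_of[y] for y in group) / len(group)
        total += len(group) * (mean_rank - (n_total + 1) / 2) ** 2
    return 12 / (n_total * (n_total + 1)) * total


# Hypothetical data: a = 3 treatments with 3 observations each.
groups = [[1.2, 3.4, 2.2], [4.1, 5.6, 6.3], [7.7, 8.0, 9.1]]
print(round(kruskal_wallis_h(groups), 3))  # 7.2
```

Here the average ranks are 2, 5, and 8 around the overall mean rank 5, so H = (12/90)(27 + 0 + 27) = 7.2, to be compared with a chi-square critical value on 2 degrees of freedom.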

Page 31

Ties in the Kruskal-Wallis Test

• When observations are tied, assign an average rank to each of the tied observations

• Use the following test statistic

• ni is the number of observations in the ith treatment

• N is the total number of observations

• S2 is just the variance of the ranks
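The statistic itself is not reproduced in this transcript. The sketch below assumes the commonly stated tie-adjusted form, $H = \frac{1}{S^2}\left[\sum_{i=1}^{a} \frac{R_{i.}^2}{n_i} - \frac{N(N+1)^2}{4}\right]$ with $S^2 = \frac{1}{N-1}\left[\sum_{i=1}^{a}\sum_{j=1}^{n_i} R_{ij}^2 - \frac{N(N+1)^2}{4}\right]$, which matches the symbol definitions above. The function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j + 2) / 2  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h_ties(groups):
    """Tie-adjusted Kruskal-Wallis H (assumed form; see the lead-in text)."""
    pooled = [y for group in groups for y in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    correction = n_total * (n_total + 1) ** 2 / 4
    # S^2: variance of the (average) ranks.
    s2 = (sum(r * r for r in ranks) - correction) / (n_total - 1)
    total = 0.0
    start = 0
    for group in groups:
        rank_total = sum(ranks[start:start + len(group)])  # R_i. for this treatment
        total += rank_total ** 2 / len(group)
        start += len(group)
    return (total - correction) / s2


# Hypothetical data containing ties.
print(kruskal_wallis_h_ties([[1, 2, 2], [2, 3, 4], [4, 5, 6]]))
```

When no observations are tied, S2 reduces to N(N+1)/12 and this form gives the same value as the untied statistic.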

Page 32

Example 15-7

• Montgomery (2001) presented data from an experiment in which five different levels of cotton content in a synthetic fiber were tested to determine whether cotton content has any effect on fiber tensile strength. The sample data and ranks from this experiment are shown in the following table

• Does cotton percentage affect breaking strength? Use α=0.01

Page 33

Solution

• Rank all observations from smallest to largest

Observation    7    7    7    9   10
Position       1    2    3    4    5

Observation   11   11   11   12   12
Position       6    7    8    9   10

Observation   14   15   15   17   18
Position      11   12   13   14   15

Observation   18   18   18   19   19
Position      16   17   18   19   20

Observation   19   19   22   23   25
Position      21   22   23   24   25

• The three smallest observations are tied, so each is assigned the average rank (1 + 2 + 3)/3 = 2

• Perform the same calculations for the other tied observations
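The tie-averaging step can be sketched in Python for the 25 sorted observations listed above. The function name `average_ranks_sorted` is hypothetical. Each observation receives the mean of the positions occupied by its tied group.

```python
from collections import defaultdict


def average_ranks_sorted(sorted_values):
    """Average ranks for values that are already sorted in ascending order."""
    positions = defaultdict(list)
    for position, value in enumerate(sorted_values, start=1):
        positions[value].append(position)
    # Every member of a tied group gets the mean of the group's positions.
    return [sum(positions[v]) / len(positions[v]) for v in sorted_values]


observations = [7, 7, 7, 9, 10, 11, 11, 11, 12, 12, 14, 15, 15,
                17, 18, 18, 18, 18, 19, 19, 19, 19, 22, 23, 25]
ranks = average_ranks_sorted(observations)
for value, rank in sorted(set(zip(observations, ranks))):
    print(value, rank)
```

For example, the four observations equal to 18 occupy positions 15 to 18 and each receives rank 16.5. The ranks still sum to 25(26)/2 = 325.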

Page 34

Solution-Cont.

• Data and Ranks for the Tensile Testing Experiment

• There is a fairly large number of ties

• Use the equation that was defined for the tied observations

Page 35

Solution-Cont.

• Thus

• Test statistic

• Since h > $\chi^2_{0.01,4}$ = 13.28, we would reject the null hypothesis

• Conclude that treatments differ

• Same conclusion is given by the usual analysis of variance

Page 36

Next Agenda

• Introduces statistical quality control

• Fundamentals of statistical process control