
Upload: imogen-hodge

Post on 04-Jan-2016


Testing Hypothesis That Data Fit a Given Probability Distribution

• Problem: We have a sample of size n. Determine if the data fits a probability distribution.

• Null Hypothesis, H0: The data fits the distribution.

• Fact: Divide the range into k intervals. If the data fits the distribution, then the following random variable approximately follows the chi-square distribution with k-1 degrees of freedom.

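$$\chi^2 = \sum_{j=1}^{k} \frac{(\text{observed number of values in } j\text{th interval} - \text{expected number of values in } j\text{th interval})^2}{\text{expected number of points in } j\text{th interval}} = \sum_{j=1}^{k} \frac{(n_j - n p_j)^2}{n p_j}$$

Here n_j is the observed number of values in the jth interval and n p_j is the expected number of values in that interval.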
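The statistic sums, over the k intervals, the squared difference between the observed and expected counts divided by the expected count. A minimal Python sketch of that sum follows; the function name `chi_square_statistic` and the tallies are hypothetical, chosen only for illustration.

```python
def chi_square_statistic(observed_counts, interval_probs):
    """Sum of (n_j - n*p_j)^2 / (n*p_j) over the k intervals."""
    n = sum(observed_counts)  # sample size
    statistic = 0.0
    for n_j, p_j in zip(observed_counts, interval_probs):
        expected = n * p_j  # expected count in interval j if H0 holds
        statistic += (n_j - expected) ** 2 / expected
    return statistic


# Hypothetical tallies for k = 3 equally likely intervals (n = 60).
observed = [10, 20, 30]
probs = [1 / 3, 1 / 3, 1 / 3]
stat = chi_square_statistic(observed, probs)
print(stat)  # about 10: (100 + 0 + 100) / 20
```

Each expected count here is 20, so the two intervals that are off by 10 contribute 5 each and the middle interval contributes nothing.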


• The value of the above variable computed in a hypothesis test is called the chi-square statistic.

• If the chi-square statistic is too large (far in the right tail of the chi-square distribution), this is a surprising result: the evidence from the test contradicts the hypothesis that the data fit the probability distribution.
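How far a statistic sits in the right tail can be quantified by the probability that a chi-square variable exceeds it. The sketch below computes that tail probability with the Python standard library only, using a recurrence for the regularized upper incomplete gamma function that is valid for integer degrees of freedom. The function name `chi_square_right_tail` and the three sample statistics are assumptions made for illustration.

```python
import math


def chi_square_right_tail(x, df):
    """P(X >= x) for a chi-square variable X with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# The tail probability shrinks as the statistic grows (7 degrees of freedom).
for statistic in (5.0, 14.067, 25.0):
    print(statistic, chi_square_right_tail(statistic, 7))
```

A statistic near 14.067 leaves roughly 5% of the distribution to its right when there are 7 degrees of freedom; larger statistics leave less.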


Algorithm

1. Perform a visual test first. If it gives no reason to reject the hypothesis, proceed as follows.

2. Divide the range of values in the sample into k adjacent intervals.

3. Tally the number of observations in each interval.

4. Calculate the chi-square statistic.

5. Calculate the p-value of the test.

6. Decide if the hypothesis should be rejected.
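Steps 2 to 6 can be sketched in Python as follows (step 1, the visual test, is not covered). This is an illustration under stated assumptions, not a definitive implementation: the hypothesized distribution is uniform on [0, 1], the intervals have equal width, the distribution has no parameters estimated from the data (so k-1 degrees of freedom apply), and both samples are made up. All function and variable names are hypothetical.

```python
import math


def chi_square_right_tail(x, df):
    """P(X >= x) for a chi-square variable X with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(sample, cdf, low, high, k, alpha=0.05):
    n = len(sample)
    width = (high - low) / k
    # Step 2: k adjacent intervals of equal width covering [low, high].
    edges = [low + j * width for j in range(k + 1)]
    # Step 3: tally the observations in each interval.
    counts = [0] * k
    for value in sample:
        j = min(int((value - low) / width), k - 1)  # value == high goes in last bin
        counts[j] += 1
    # Step 4: chi-square statistic from observed and expected counts.
    statistic = 0.0
    for j in range(k):
        expected = n * (cdf(edges[j + 1]) - cdf(edges[j]))
        statistic += (counts[j] - expected) ** 2 / expected
    # Step 5: p-value from the right tail with k - 1 degrees of freedom.
    p_value = chi_square_right_tail(statistic, k - 1)
    # Step 6: reject H0 if the p-value is at or below the significance level.
    return statistic, p_value, p_value <= alpha


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (the assumed H0)."""
    return min(max(x, 0.0), 1.0)


# Made-up samples: evenly spread points, and their squares (crowded near 0).
sample_fit = [(i + 0.5) / 200 for i in range(200)]
sample_skew = [v ** 2 for v in sample_fit]

stat_fit, p_fit, reject_fit = goodness_of_fit(sample_fit, uniform_cdf, 0.0, 1.0, 8)
stat_skew, p_skew, reject_skew = goodness_of_fit(sample_skew, uniform_cdf, 0.0, 1.0, 8)
print(stat_fit, p_fit, reject_fit)
print(stat_skew, p_skew, reject_skew)
```

The evenly spread sample puts 25 points in each of the 8 intervals, matching the expected counts, so H0 is not rejected. The squared sample piles up in the first interval, which gives a large statistic and a rejection.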


Decision Rule

• Reject the hypothesis if the p-value is less than or equal to some low significance level (e.g., 0.05). Otherwise, do not reject the hypothesis.

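[Figure: chi-square density with 7 degrees of freedom, dchisq(x, 7), plotted against x from 0 to 20. The critical value (probability of exceedance 0.05) is qchisq(0.95, 7) = 14.067. Do not reject H0 to the left of the critical value; reject H0 to the right.]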
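The critical value quoted for 7 degrees of freedom can be checked numerically. The sketch below finds the point whose right-tail probability equals 0.05 by bisection, using only the Python standard library; it is one possible approach, not how `qchisq` itself is computed. The names `chi_square_right_tail` and `chi_square_critical_value` are hypothetical.

```python
import math


def chi_square_right_tail(x, df):
    """P(X >= x) for a chi-square variable X with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_critical_value(alpha, df, tol=1e-9):
    """Value whose probability of exceedance is alpha, found by bisection."""
    low, high = 0.0, 1.0
    # Grow the bracket until the tail probability at `high` drops to alpha.
    while chi_square_right_tail(high, df) > alpha:
        high *= 2.0
    # The tail probability decreases in x, so halve the bracket repeatedly.
    while high - low > tol:
        mid = (low + high) / 2.0
        if chi_square_right_tail(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


critical = chi_square_critical_value(0.05, 7)
print(round(critical, 3))  # should agree with qchisq(0.95, 7) = 14.067
```

Comparing the statistic with this critical value is equivalent to the p-value rule: a statistic at or beyond the critical value has a p-value at or below 0.05.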