Hypothesis Testing Lecture 4

Post on 20-Dec-2015


Examples of various hypotheses

• The sodium content in Furresøen is x

• Sodium content in Furresøen is equal to the content in Madamsø

• The proportion of Turks in Aalborg is x %

• Proportion of Turks in Århus is the same as in Aalborg

• Average height of men in Sweden is the same as in Denmark

Basics

• Null hypothesis

• Alternative hypothesis

• Type I error: rejecting a null hypothesis that is true

• Type II error: accepting a null hypothesis that is false

Level of significance

So we want to construct a way to decide to

• ACCEPT or

• REJECT

the hypothesis based on data in a way such that the probability of a Type I error is small (for example 5 %).

Critical Region

Assume

• We want to test if the sodium content here is approx 3.8 units

• We have data y1, …, yn

• We have calculated the average and SE.

[Figure: position of the average relative to 3.8. An average close to 3.8 supports that the content is 3.8; an average well below supports that the content is < 3.8; an average well above supports that the content is > 3.8.]

What do we know?

If the content is 3.8, then the average is normally distributed with mean 3.8.

With probability 95 %, the average is less than 2·SE from 3.8.

If the true content is 3.8, then the average is in the red area (more than 2·SE from 3.8) with probability 5 %.

Test:

• The hypothesis is that the true content is 3.8

• Estimate mean and SE.

• The critical region is where the average is more than 2·SE from 3.8

• If the average is in the critical region then reject the hypothesis, else accept
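As a minimal sketch of this decision rule, assuming the critical region is "more than 2·SE away from the hypothesised value": the function name `in_critical_region` and the example inputs are illustrative and not part of the lecture.

```python
def in_critical_region(average, se, hypothesised=3.8, cutoff=2.0):
    """Return True when the average is more than `cutoff` standard errors
    away from the hypothesised value, i.e. when the hypothesis is rejected."""
    return abs(average - hypothesised) > cutoff * se


# Illustrative inputs only (assumed values, not measured data).
print(in_critical_region(3.4, 0.1))  # True: 0.4 away, which is more than 2 * 0.1
print(in_critical_region(3.7, 0.1))  # False: 0.1 away, which is less than 2 * 0.1
```

The cutoff is kept as a parameter because "2" is a rounded stand-in for the normal quantile 1.96.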

Significance level

Prob(Type I error) = 5 %
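A short check of where the 5 % comes from, assuming the average is exactly normally distributed around the hypothesised value. The helper name `type_one_error` is hypothetical; it uses only the standard library function `math.erfc`.

```python
import math


def type_one_error(cutoff):
    """Probability that a normally distributed average falls more than
    `cutoff` standard errors from its true mean (both tails together)."""
    return math.erfc(cutoff / math.sqrt(2))


print(round(type_one_error(2.0), 4))   # 0.0455
print(round(type_one_error(1.96), 4))  # 0.05
```

So the "2·SE" rule gives a Type I error probability slightly below 5 %, and the cutoff 1.96 gives almost exactly 5 %.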

Alternative approach

Can we give a number telling us to what extent the observations support the hypothesis?

Yes, of course!

Why do you think I asked?

Hmmm

Supports hypothesis

Here we should definitely reject

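If the true content is 3.8 then the average is approximately normally distributed with mean 3.8 and standard deviation SE,

and t = (average − 3.8) / SE is approximately standard normal.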

Assume that we observe an average of 3.4 and SE = 0.1

Then what?

What is the probability of observing this???

95% of data sets will have an average in this area (mean +/- 2 SE)

Assume we obtain an average of 3.4 and standard error SE = 0.1 and the true concentration is 3.8

P-value
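The numbers from the example (average 3.4, SE = 0.1, hypothesised content 3.8) can be turned into a p-value as sketched below. This assumes the p-value meant here is the two-sided tail probability under a normal approximation; the function name `two_sided_p_value` is illustrative.

```python
import math


def two_sided_p_value(average, se, hypothesised):
    """Probability, if the hypothesis is true, of an average at least
    this many standard errors from the hypothesised value (either side)."""
    t = (average - hypothesised) / se
    return math.erfc(abs(t) / math.sqrt(2))


p = two_sided_p_value(average=3.4, se=0.1, hypothesised=3.8)
print(p)  # a very small number, on the order of 6e-05
```

The average is 4 standard errors below 3.8, so data like this would almost never occur if the true content were 3.8.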

Summing Up

A statistical test can be carried out

1. On a 5% significance level

2. By calculating the p-value

Hypothesis about the Mean

1. Is the concentration 3.8? (Normal distribution)

2. Is the proportion of Turks in Århus 7.5%? (Binomial distribution)

Sodium

1. Are data normal?

2. Estimate average and standard error

3. Calculate t = (average − 3.8) / SE

4. Is t bigger than 2 (numerically, i.e. in absolute value)? OR

5. Calculate p-value
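Steps 2 to 5 can be sketched as follows. The measurements are made up purely for illustration, the name `one_sample_test` is hypothetical, and the p-value uses a normal approximation (for small samples a t distribution would be the usual choice). Step 1, checking normality, is not covered here.

```python
import math
import statistics


def one_sample_test(values, hypothesised):
    """Return (average, SE, t, two-sided p-value) for a hypothesised mean."""
    n = len(values)
    average = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)   # standard error of the average
    t = (average - hypothesised) / se
    p_value = math.erfc(abs(t) / math.sqrt(2))     # normal approximation (assumption)
    return average, se, t, p_value


# Made-up sodium measurements, for illustration only.
measurements = [3.5, 3.9, 3.6, 3.8, 3.4, 3.7, 3.6, 3.5]
average, se, t, p_value = one_sample_test(measurements, hypothesised=3.8)

print(f"average={average:.3f}  SE={se:.3f}  t={t:.2f}  p={p_value:.4f}")
print("reject" if abs(t) > 2 else "accept")
```

For these invented numbers the average is 3.625 and t is close to −3, so the hypothesis of 3.8 is rejected by both the "bigger than 2" rule and the p-value.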

Turks

1. Are data binomial?

2. Calculate proportion p and standard error

3. Calculate t = (p − 0.075) / SE

4. Is t bigger than 2 (numerically, i.e. in absolute value)?
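The same recipe for a proportion might look like the sketch below. The counts (90 out of 1000) are invented for illustration, `proportion_test` is a hypothetical name, and the standard error is assumed to be computed from the estimated proportion as sqrt(p(1 − p)/n).

```python
import math


def proportion_test(successes, n, hypothesised):
    """Return (estimated proportion, SE, t) for a hypothesised proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # SE from the estimated proportion (assumption)
    t = (p_hat - hypothesised) / se
    return p_hat, se, t


# Invented survey counts, for illustration only.
p_hat, se, t = proportion_test(successes=90, n=1000, hypothesised=0.075)

print(f"p={p_hat:.3f}  SE={se:.4f}  t={t:.2f}")
print("reject" if abs(t) > 2 else "accept")
```

With these invented counts t is about 1.66, which is numerically smaller than 2, so the hypothesis of 7.5 % is accepted at the 5 % level.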

Last slide …

• Is 3.8 in the 95% CI?

• Accept the hypothesis (mean = 3.8) on a 5% significance level

That’s the same!!
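A small check of why the two questions are the same, assuming the 95% CI is taken as average ± 2·SE. The function names and the example averages are illustrative.

```python
def accept_by_ci(average, se, hypothesised):
    """Accept when the hypothesised value lies inside average +/- 2*SE."""
    low, high = average - 2 * se, average + 2 * se
    return low <= hypothesised <= high


def accept_by_t(average, se, hypothesised):
    """Accept when t = (average - hypothesised) / SE is at most 2 in absolute value."""
    return abs((average - hypothesised) / se) <= 2


# Illustrative averages with SE = 0.1 and hypothesised value 3.8.
results = [
    (avg, accept_by_ci(avg, 0.1, 3.8), accept_by_t(avg, 0.1, 3.8))
    for avg in (3.4, 3.7, 3.9, 4.1)
]
for avg, by_ci, by_t in results:
    print(avg, by_ci, by_t)  # the two decisions agree on every row
```

Both rules ask whether the distance between the average and 3.8 is at most 2·SE, only written in two different ways.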