![Page 1: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/1.jpg)
Hypothesis Testing
Lecture 3
![Page 2: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/2.jpg)
Examples of various hypotheses
• Average salary in Copenhagen is larger than in Bælum
• Sodium content in Furresøen is equal to the content in Madamsø
• Proportion of Turks in Århus is the same as in Aalborg
• Average height of men in Sweden is the same as in Denmark
• The average temperature is increasing over time
![Page 3: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/3.jpg)
Formulation of hypothesis
Assume we are interested in a parameter Θ (e.g. the mean of the data). Let Θ0 be a number.
There are three different kinds of hypotheses:
H0: Θ = Θ0    H0: Θ ≥ Θ0    H0: Θ ≤ Θ0
HA: Θ ≠ Θ0    HA: Θ < Θ0    HA: Θ > Θ0
H0 is called the null hypothesis.
HA is called the alternative hypothesis.
![Page 4: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/4.jpg)
Examples of various hypotheses
• Average salary in Copenhagen is larger than in Bælum
H0: μC ≥ μB. HA: μC < μB.
• Sodium content in Furresøen is equal to the content in Madamsø
H0: μF = μM. HA: μF ≠ μM.
• Proportion of Turks in Århus is the same as in Aalborg
H0: PÅ = PA. HA: PÅ ≠ PA.
• Average height of men in Sweden is the same as in Denmark
H0: μS = μD. HA: μS ≠ μD.
• The average temperature is increasing over time
For time 1 ≥ time 2: H0: μtime 1 ≥ μtime 2. HA: μtime 1 < μtime 2.
![Page 5: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/5.jpg)
NORMAL DISTRIBUTION (average height in Sweden and Denmark)
COMPARE
SMALL DIFFERENCE: EQUAL MEANS
BIG DIFFERENCE: NOT EQUAL MEANS
![Page 6: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/6.jpg)
BINOMIAL DISTRIBUTION (Proportion of Turks in Århus and Aalborg)
BIG OR NOT?
![Page 7: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/7.jpg)
The Test Procedure
Formulate a HYPOTHESIS!
![Page 8: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/8.jpg)
Numerically bigger than
Does the data support the hypothesis or not?
![Page 9: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/9.jpg)
Types of errors
• Type I error: Rejecting H0 when it is true.
• Type II error: Accepting H0 when it is false.

Decision     H0 is true      H0 is false
Reject H0    Type I error    No error
Accept H0    No error        Type II error
Ideally we would like a test where it is difficult to make errors.
![Page 10: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/10.jpg)
Unfortunately
If you make a test where
• it is difficult to make a Type I error, then
• it is easy to make a Type II error,
• and the other way around.
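The trade-off can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the average is treated as normally distributed, the test rejects when the average is more than `cutoff` standard errors from the hypothesized value, and the true value is assumed (hypothetically) to lie 2.5 standard errors away. The names `error_probabilities` and `shift_in_se` are made up for this example.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def error_probabilities(cutoff, shift_in_se):
    """Return (P(Type I error), P(Type II error)) for the rule
    'reject when the standardized average is numerically above cutoff'.

    shift_in_se is a hypothetical true deviation from the hypothesized
    value, measured in standard errors (an assumption of this sketch).
    """
    # Type I: H0 is true, standardized average ~ N(0, 1), and we reject.
    type_1 = 2.0 * (1.0 - normal_cdf(cutoff))
    # Type II: H0 is false, standardized average ~ N(shift, 1), and we accept.
    type_2 = normal_cdf(cutoff - shift_in_se) - normal_cdf(-cutoff - shift_in_se)
    return type_1, type_2

results = [(c, *error_probabilities(c, shift_in_se=2.5)) for c in (1.0, 2.0, 3.0)]
for cutoff, type_1, type_2 in results:
    print(f"cutoff {cutoff}: P(Type I) = {type_1:.3f}, P(Type II) = {type_2:.3f}")
```

Raising the cutoff makes a Type I error less likely and a Type II error more likely, and the other way around.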
![Page 11: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/11.jpg)
Level of significance
So we want to construct a way to decide to
• ACCEPT or
• REJECT
the hypothesis based on data in a way such that
![Page 12: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/12.jpg)
This sounds really technical!!!
Hmm
I don’t like this at all!
![Page 13: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/13.jpg)
Critical Region
Assume
• We want to test if the sodium content here is approx. 3.8 units
• We have data y1, …, yn
• We have calculated the average and SE.
[Figure: three regions labeled "Support that content is < 3.8", "Support that content is 3.8" and "Support that content is > 3.8"]
![Page 14: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/14.jpg)
What do we know?
If the content is 3.8 then the average is normally distributed with mean 3.8.
With probability 95% the average is less than 2*SE from 3.8.
If the true content is 3.8 then the average is in the red area with probability 5%.
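The 95% figure can be checked numerically. The sketch below is an illustration, not part of the lecture: it computes the exact normal probability of landing within 2 standard errors and compares it with a simulation. The standard deviation 0.5 and sample size 25 are hypothetical values chosen so that SE = 0.1.

```python
import math
import random
import statistics

def prob_within(k):
    """P(|Z| < k) for a standard normal Z."""
    return math.erf(k / math.sqrt(2))

def simulate_fraction_within(true_mean=3.8, sigma=0.5, n=25, k=2.0,
                             n_sets=5000, seed=1):
    """Fraction of simulated data sets whose average is within k*SE of true_mean."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error of the average (sigma assumed known)
    hits = 0
    for _ in range(n_sets):
        avg = statistics.fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if abs(avg - true_mean) < k * se:
            hits += 1
    return hits / n_sets

exact = prob_within(2.0)
simulated = simulate_fraction_within()
print(f"exact: {exact:.4f}, simulated: {simulated:.4f}")
```

The exact probability for 2 standard errors is slightly above 95%; the exact 95% cutoff for a normal distribution is about 1.96.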
![Page 15: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/15.jpg)
Test:
• The hypothesis is that the true content is 3.8
• Estimate mean and SE.
• The critical region is the area more than 2*SE away from 3.8
• If the average is in the critical region then reject the hypothesis, else accept
Significance level
Prob(Type I error) = 5 %
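As a minimal sketch, the decision rule can be written as a function. It assumes the critical region is everything more than 2*SE from the hypothesized value 3.8; the function names `in_critical_region` and `decide` are invented for this example.

```python
def in_critical_region(average, se, hypothesized=3.8, k=2.0):
    """True when the average lies more than k standard errors from the hypothesized value."""
    return abs(average - hypothesized) > k * se

def decide(average, se, hypothesized=3.8):
    """Return 'reject' or 'accept' for the hypothesis that the true content is `hypothesized`."""
    return "reject" if in_critical_region(average, se, hypothesized) else "accept"

# Hypothetical observed averages, each with SE = 0.1
for avg in (3.8, 3.9, 4.05):
    print(avg, decide(avg, 0.1))
```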
![Page 16: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/16.jpg)
Alternative approach
Can we give a number telling us to what extent the observations support the hypothesis?
Yes, of course!
Why do you think I asked?
Hmmm
Supports hypothesis
Here we should definitely reject
![Page 17: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/17.jpg)
If the true content is 3.8 then
and
Assume that we observe an average of 3.8 and SE = 0.1
Then what?
![Page 18: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/18.jpg)
What is the probability of observing this???
95% of data sets will have an average in this area (mean +/- 2 SE)
Assume we obtain an average of 3.8 and standard error SE = 0.1 and the true concentration is 3.8
![Page 19: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/19.jpg)
P-value
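The content of this slide did not survive in the transcript. Assuming the usual definition, the p-value is the probability, if the hypothesis is true, of observing an average at least as far from 3.8 as the one actually observed. A minimal sketch under a normal approximation (the function name `two_sided_p_value` is made up for this example):

```python
import math

def two_sided_p_value(average, se, hypothesized=3.8):
    """Two-sided p-value under a normal approximation: P(|Z| >= |t|)."""
    t = (average - hypothesized) / se
    return math.erfc(abs(t) / math.sqrt(2))

# Hypothetical observed averages, each with SE = 0.1
for avg in (3.8, 3.9, 4.0, 4.1):
    print(f"average {avg}: p = {two_sided_p_value(avg, 0.1):.4f}")
```

A small p-value means the data give little support to the hypothesis. An average exactly 2 standard errors away gives a p-value of just under 5%, which matches the critical region approach.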
![Page 20: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/20.jpg)
Summing Up
A statistical test can be carried out
1. On a 5% significance level
2. By calculating the p-value
![Page 21: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/21.jpg)
Hypothesis about the Mean
1. Is the concentration 3.8? (Normal Distribution)
2. Is the proportion of Turks in Århus 7.5%? (Binomial Distribution)
![Page 22: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/22.jpg)
Sodium
1. Are data normal?
2. Estimate average and standard error
3. Calculate t = (average - 3.8) / SE
4. Is t bigger than 2 (numerically)? OR
5. Calculate p-value
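Steps 2 to 4 can be sketched from raw data as follows. The measurements are hypothetical, the check for normality (step 1) is not performed here, and t is taken to be the distance from 3.8 measured in standard errors.

```python
import math
import statistics

# Hypothetical sodium measurements; not data from the lecture.
measurements = [3.9, 3.7, 4.1, 3.8, 3.6, 4.0, 3.9, 3.7]
hypothesized = 3.8

# Step 2: estimate average and standard error.
average = statistics.fmean(measurements)
se = statistics.stdev(measurements) / math.sqrt(len(measurements))

# Step 3: distance from the hypothesized value in standard errors.
t = (average - hypothesized) / se

# Step 4: is t numerically bigger than 2?
reject = abs(t) > 2

print(f"average = {average:.4f}, SE = {se:.4f}, t = {t:.2f}, reject = {reject}")
```

With a sample this small, a cutoff from the t-distribution would be somewhat larger than 2; the value 2 is the rule of thumb used on the slide.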
![Page 23: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/23.jpg)
Turks
1. Are data binomial?
2. Calculate proportion p and standard error
3. Calculate t = (p - 0.075) / SE
4. Is t bigger than 2 (numerically)?
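The same steps for a proportion might look like this. The counts are hypothetical, and the standard error is assumed to be the usual estimate sqrt(p(1 - p)/n) based on the observed proportion; the slide does not say which formula it uses.

```python
import math

# Hypothetical sample: 38 out of 400 people; not data from the lecture.
n = 400
count = 38
p0 = 0.075  # hypothesized proportion (7.5%)

# Step 2: proportion and its standard error (assumed formula).
p_hat = count / n
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Step 3: distance from the hypothesized proportion in standard errors.
t = (p_hat - p0) / se

# Step 4: is t numerically bigger than 2?
reject = abs(t) > 2

print(f"p = {p_hat:.3f}, SE = {se:.4f}, t = {t:.2f}, reject = {reject}")
```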
![Page 24: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/24.jpg)
Last slide before the end
• Is 3.8 in the 95% CI?
• Accept the hypothesis (mean = 3.8) on a 5% significance level
That’s the same!!
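A quick check of this equivalence, as a sketch: it assumes the 95% CI is average ± 2*SE and that the test accepts when the average is at most 2*SE from 3.8. The function names and the list of averages are hypothetical.

```python
def hypothesized_in_ci(average, se, hypothesized=3.8, k=2.0):
    """Is the hypothesized value inside the interval average +/- k*SE?"""
    return average - k * se <= hypothesized <= average + k * se

def accepted_by_test(average, se, hypothesized=3.8, k=2.0):
    """Does the test accept, i.e. is the average at most k standard errors away?"""
    return abs(average - hypothesized) / se <= k

# Hypothetical observed averages, each with SE = 0.1
averages = [3.5, 3.65, 3.8, 3.95, 4.1]
comparison = [(a, hypothesized_in_ci(a, 0.1), accepted_by_test(a, 0.1))
              for a in averages]
for a, in_ci, accepted in comparison:
    print(f"average {a}: 3.8 in CI = {in_ci}, test accepts = {accepted}")
```

Both columns agree for every average, because both conditions say that the average and 3.8 are at most 2*SE apart.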
![Page 25: Hypothesis Testing Lecture 3. Examples of various hypotheses Average salary in Copenhagen is larger than in Bælum Sodium content in Furresøen is equal](https://reader035.vdocument.in/reader035/viewer/2022070412/5697bf7a1a28abf838c82a5e/html5/thumbnails/25.jpg)
The End