Hypothesis Testing
9.1 Null and Alternative Hypotheses and Errors in Testing
9.2 z Tests about a Population Mean with σ Known
9.3 t Tests about a Population Mean with σ Unknown
Hypothesis testing-1
Researchers usually collect data from a sample and then use the sample data to help answer questions about the population. Hypothesis testing is an inferential statistical process that uses limited information from the sample data to reach a general conclusion about the population.
• A hypothesis test is a formalized procedure that follows a standard series of operations.
• In this way, researchers have a standardized method for evaluating the results of their research studies.
Hypothesis testing-2
The basic experimental situation for using hypothesis testing is presented here. It is assumed that the parameter μ is known for the population before treatment. The purpose of the experiment is to determine whether or not the treatment has an effect. Is the population mean after treatment the same as or different from the mean before treatment? A sample is selected from the treated population to help answer this question.
Procedures of hypothesis-testing
1. First, we state a hypothesis about a population. Usually the hypothesis concerns the value of a population parameter. For example, we might hypothesize that the mean IQ for UIC students is μ = 110.
2. Next, we obtain a random sample from the population. For example, we might select a random sample of n = 100 UIC students.
3. Finally, we compare the sample data with the hypothesis. If the data are consistent with the hypothesis, we will conclude that the hypothesis is reasonable. But if there is a big discrepancy between the data and the hypothesis, we will decide that the hypothesis is wrong.
Null and Alternative Hypotheses
• The null hypothesis, denoted H0, is a statement of the basic proposition being tested. It generally represents the status quo (a statement of “no effect” or “no difference”, or a statement of equality) and is not rejected unless there is convincing sample evidence that it is false.
• The (scientific or) alternative hypothesis, denoted Ha (or H1), is an alternative (to the null hypothesis) statement that will be accepted only if there is convincing sample evidence that it is true.
• These two hypotheses are mutually exclusive and exhaustive.
Alpha level of .05 -- the probability of rejecting the null hypothesis when it is true is no more than 5%.
Example: Alcohol appears to be involved in a variety of birth defects, including low birth weight and retarded growth. A researcher would like to investigate the effect of prenatal alcohol on birth weight. A random sample of n = 16 pregnant rats is obtained. The mother rats are given daily doses of alcohol. At birth, one pup is selected from each litter to produce a sample of n = 16 newborn rats. The average weight for the sample is 15 grams. The researcher would like to compare the sample with the general population of rats. It is known that regular newborn rats (not exposed to alcohol) have an average weight of μ = 18 grams. The distribution of weights is normal with σ = 4.
1. State the hypotheses
The null hypothesis states that exposure to alcohol has no effect on birth weight (H0: μ = 18).
The alternative hypothesis states that alcohol exposure does affect birth weight (H1: μ ≠ 18).
2. Select the level of significance (alpha)
We will use an alpha level of .05. That is, we are taking a 5% risk of committing a Type I error: the probability of rejecting the null hypothesis when it is true is no more than 5%.
3. Set the decision criteria by locating the critical region
[Figure: standard normal distribution of z with the critical region in both tails, beyond z = ±1.96 for alpha = .05.]
4. Collect data and compute sample statistics
The sample mean is then converted to a z-score, which is our test statistic.
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{15 - 18}{4/\sqrt{16}} = -3$$
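5. Arrive at a decision
Reject the null hypothesis: z = −3 lies beyond the two-tailed critical value of −1.96, so the sample gives evidence that prenatal alcohol exposure affects birth weight.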
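The arithmetic in steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration using the numbers from the rat example; the variable names are made up here, and the critical value 1.96 is the standard two-tailed cutoff for alpha = .05.

```python
import math

# Values from the rat birth-weight example.
mu_0 = 18.0         # population mean under H0 (grams)
sigma = 4.0         # known population standard deviation
n = 16              # sample size
sample_mean = 15.0  # observed sample mean (grams)

standard_error = sigma / math.sqrt(n)       # 4 / sqrt(16) = 1
z = (sample_mean - mu_0) / standard_error   # (15 - 18) / 1

critical_z = 1.96                           # two-tailed cutoff, alpha = .05
reject_h0 = abs(z) > critical_z

print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

The decision is made by comparing |z| with the critical value, which is the same as asking whether z falls in either tail of the critical region.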
Hypothesis Testing
Step 1: State null and alternate hypotheses
Step 2: Select a level of significance
Step 3: Identify the test statistic
Step 4: Formulate a decision rule
Step 5: Take a sample, arrive at a decision
Outcome: do not reject null, or reject null and accept alternate
Null Hypothesis H0:
A statement about the value of a population parameter (μ and σ).
With “=” sign. Say, “μ = 2” or “μ ≥ 2”.
Alternative Hypothesis H1:
A statement that is accepted if H0 is false.
Without “=” sign. Say, “μ ≠ 2” or “μ < 2”.
Step 1: State the null and alternate hypotheses
Three possibilities regarding means (μ0 is a constant):
H0: μ = μ0   H1: μ ≠ μ0
H0: μ ≤ μ0   H1: μ > μ0
H0: μ ≥ μ0   H1: μ < μ0
The null hypothesis always contains equality.
Step 2: Select a level of significance, α
Level of significance (α): the maximum allowable probability of making a Type I error, that is, of rejecting a true null hypothesis.
Type I error: H0 is actually true but you reject it (false positive).
Type II error: H0 is false but you fail to reject it (false negative).
Risk table
                     Researcher
Null hypothesis      Accepts H0          Rejects H0
H0 is true           Correct decision    Type I error
H0 is false          Type II error       Correct decision
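To make the Type I error cell of the risk table concrete, here is a small simulation sketch. It repeatedly samples from a population in which H0 is true and counts how often a two-tailed z test at alpha = .05 rejects anyway. The function name, the number of trials, the seed, and the population values (borrowed from the rat example) are all assumptions chosen for illustration.

```python
import math
import random

def type_i_error_rate(trials=2000, n=16, mu_0=18.0, sigma=4.0,
                      critical_z=1.96, seed=0):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds.
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
        if abs(z) > critical_z:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials the estimated rate should settle near the chosen alpha, which is what "maximum allowable probability of a Type I error" means in practice.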
Step 3: Select the test statistic
A test statistic is used to determine whether the result of the research study (the difference between the sample mean and the population mean) is more than would be expected by chance alone.
Since our hypothesis is about the population mean, we will only consider the statistics z or t for the time being.
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$
Test Statistic
• The term test statistic simply indicates that the sample mean is converted into a single, specific statistic that is used to test the hypotheses.
• The z-score statistic that is used in the hypothesis test is the first specific example of what is called a test statistic.
• We will introduce several other test statistics that are used in a variety of different research situations later.
Step 4: Formulate the decision rule.
Decision Rule: reject H0 if
• for H0: μ ≤ μ0, Computed z > Critical z
• for H0: μ ≥ μ0, Computed z < −Critical z
• for H0: μ = μ0, Computed z > Critical z or Computed z < −Critical z
The critical z is determined by the level of significance.
Critical value: The dividing point between the region where H0 is rejected and the region where H0 is not rejected, determined by the level of significance.
[Figure: standard normal curve with critical value 1.65; the do-not-reject region (probability = .95) lies to its left and the region of rejection (probability = .05) to its right.]
From the table, with statistic z, one tailed test and significance level 0.05, we found the critical value 1.65.
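Instead of reading the table, the critical value can be computed from the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=False):
    """Return the positive critical z value for a given significance level."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

one_tailed = critical_z(0.05)                   # tables round this to 1.65
two_tailed = critical_z(0.05, two_tailed=True)  # the familiar 1.96

print(round(one_tailed, 3), round(two_tailed, 3))
```

For a lower-tailed test the critical value is simply the negative of the one-tailed result.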
One-Tailed Test of Significance
H0: μ ≤ μ0
Reject H0 if z > Critical z
If H0: μ ≤ μ0 is true, it is very unlikely that the computed z value is so large.
H0: μ ≥ μ0
Reject H0 if Computed z < −Critical z
If H0: μ ≥ μ0 is true, it is very unlikely that the computed z value (from the sample mean) is so small.
Two-Tailed Tests of Significance
If H0: μ = μ0 is true, it is very unlikely that the computed z value is extremely large or small.
[Figure: standard normal curve with critical values −1.96 and 1.96; the do-not-reject region (probability = .95) lies between them and the regions of rejection (probability = .025 each) lie in the two tails.]
Example: One Tailed (Upper Tailed)
• An insurance company is reviewing its current policy rates. When originally setting the rates they believed that the average claim amount was $1,800. They are concerned that the true mean is actually higher than this, because they could potentially lose a lot of money. They randomly select 40 claims, and calculate a sample mean of $1,950. Assuming that the population standard deviation of claims is $500, and setting the level of significance at 0.05, test to see if the insurance company should be concerned.
Step 1: Set the null and alternative hypotheses
H0: μ ≤ 1800   H1: μ > 1800
Step 2: Calculate the test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{1950 - 1800}{500/\sqrt{40}} = 1.897$$
Step 3: Set Rejection Region
We need to put all of alpha in the right tail. For a one-tailed test at alpha = 0.05 the critical value is 1.645 (1.96 would be the two-tailed value). Thus, R : Z > 1.645
Step 4: Conclude
We can see that z = 1.897 > 1.645, thus our test statistic is in the rejection region. Therefore we reject the null hypothesis in favor of the alternative. The sample gives statistically significant evidence that the mean claim amount is higher than $1,800, so the insurance company has reason to be concerned about its current policy rates.
Example: One Tailed (Lower Tailed)
Trying to encourage people to stop driving to campus, the university claims that on average it takes people 30 minutes to find a parking space on campus. John does not think it takes so long to find a spot.
He calculated the mean time to find a parking space on campus for the last five times and found it to be 20 minutes. Assuming that the time it takes to find a parking spot is normally distributed, and that the population standard deviation is σ = 6 minutes, perform a hypothesis test with level of significance alpha = 0.10 to see if his claim is correct.
Step 1: Set the null and alternative hypotheses
H0: μ ≥ 30   H1: μ < 30
Step 2: Calculate the test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{20 - 30}{6/\sqrt{5}} = -3.727$$
Step 3: Set Rejection Region
We need to put all of alpha in the left tail. Thus, R : Z < -1.28
Step 4: Conclude
We can see that z = -3.727 < -1.28, thus our test statistic is in the rejection region. Therefore we reject the null hypothesis in favor of the alternative.
We conclude that the mean is significantly less than 30; the data support John's claim that the mean time to find a parking space is less than 30 minutes.
Example: Two Tailed
A sample of 40 sales receipts from a grocery store has mean = $137 and population standard deviation = $30.2. Use these values to test whether or not the mean sales at the grocery store are different from $150 with level of significance alpha = 0.01.
Step 1: Set the null and alternative hypotheses
H0: μ = 150   H1: μ ≠ 150
Step 2: Calculate the test statistic
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{137 - 150}{30.2/\sqrt{40}} = -2.722$$
Step 3: Set Rejection Region
We need to put half of alpha in the left tail, and the other half of alpha in the right tail. Thus, R : Z < -2.58 or Z > 2.58
Step 4: Conclude
We see that Z = -2.722 < -2.58, thus our test statistic is in the rejection region. Therefore we reject the null hypothesis in favor of the alternative. We can conclude that the mean is significantly different from $150; the sample gives strong evidence that the mean sales at the grocery store is not $150.
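The upper-tailed, lower-tailed, and two-tailed z tests worked through above differ only in which tail(s) hold alpha. A minimal sketch of a single helper that covers all three cases follows; the function name `z_test` and its `alternative` argument are assumptions made for illustration, not part of the original slides. It decides by comparing the p-value with alpha, which is equivalent to checking the rejection region.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu_0, sigma, n, alpha, alternative):
    """One-sample z test; `alternative` is 'greater', 'less', or 'two-sided'."""
    z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # upper-tailed: H1: mu > mu_0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # lower-tailed: H1: mu < mu_0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # two-tailed: H1: mu != mu_0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value, p_value < alpha

# Grocery receipts: H0: mu = 150 vs H1: mu != 150, alpha = 0.01.
z, p_value, reject = z_test(137, 150, 30.2, 40, 0.01, "two-sided")
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

The same call with `"less"` and the parking numbers (20, 30, 6, 5, 0.10) reproduces the lower-tailed example.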
Example: credit manager
Lisa, the credit manager, wants to check if the mean monthly unpaid balance is more than $400. The level of significance she set is .05. A random check of 172 unpaid balances revealed the sample mean to be $407. The population standard deviation is known to be $38.
Should Lisa conclude that the population mean is greater than $400, or is it reasonable to assume that the difference of $7 ($407-$400) is due to chance? (at significance level 0.05)
Example: Lisa, the credit manager
Step 1
H0: µ ≤ $400   H1: µ > $400
Step 2
The significance level is .05.
Step 3
Since σ is known, we can find the test statistic z.
Step 4
H0 is rejected if z > 1.65 (since α = 0.05).
Step 5
Make a decision and interpret the results.
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{407 - 400}{38/\sqrt{172}} = 2.42$$
The p-value is .0078 for a one-tailed test.
o Computed z of 2.42 > Critical z of 1.65,
o p of .0078 < α of .05.
Reject H0.
We can conclude that the mean unpaid balance is greater than $400.
Limitation of z-scores in hypothesis testing
• The limitation of z-scores in hypothesis testing is that the population standard deviation (or variance) must be known.
• What if you don’t know the µ and σ of the population?
• Answer: use the sample variability instead
Sample variance: s² = (sum of squared deviations)/(n − 1) = (sum of squared deviations)/df = SS/df
Since you must know the sample mean before you can compute sample variance, this places a restriction on sample variability such that only n-1 scores in a sample are free to vary. The value n-1 is called the degrees of freedom (or df ) for the sample variance.
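As a small illustration of SS, df, and s², the sketch below computes all three for a list of scores. The scores are hypothetical values chosen only so the arithmetic is easy to follow; they do not come from the text.

```python
def sample_variance(scores):
    """Return (SS, df, s^2), where s^2 = SS / df and df = n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    df = n - 1                                 # only n - 1 scores are free to vary
    return ss, df, ss / df

# Hypothetical scores for illustration only (mean = 4).
scores = [1, 3, 4, 8]
ss, df, s2 = sample_variance(scores)
print(ss, df, s2)
```

Dividing by df rather than n is what distinguishes the sample variance from the population variance formula.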
If you select all the possible samples of a particular size (n), the set of all possible t statistics will form a t distribution.
Z statistic (σ known):
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
t statistic (σ unknown):
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
The t statistic is good for:
(i) a large sample (n > 30), where the underlying distribution may or may not be Normal;
(ii) a small sample (n < 30), where the underlying distribution is Normal.
Distributions of the t statistic for different values of degrees of freedom are compared to a normal distribution.
The t distribution with df = 3. Note that 5% of the distribution is located in each of the tails t > 2.353 and t < −2.353 (10% in the two tails combined).
The label on Fries’ Catsup indicates that the bottle contains 16 ounces of catsup. A sample of 36 bottles from last hour’s production revealed a mean weight of 16.12 ounces per bottle and a sample standard deviation of 0.5 ounces. At the 0.05 significance level, test whether the process is out of control. That is, can we conclude that the mean amount per bottle is different from 16 ounces?
Step 1 State the null and the alternative hypotheses.
H0: μ = 16   H1: μ ≠ 16
Step 2 Select the significance level. The significance level is .05.
Step 3 Since the sample size is large enough and the population s.d. is unknown, we can use the test statistic t.
Step 4 State the decision rule.
Reject H0 if t > 1.96 or t < -1.96 (since α = 0.05; with a sample this large, the critical t is approximated by the critical z).
Step 5 Make a decision and interpret the results.
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{16.12 - 16.00}{0.5/\sqrt{36}} = 1.44$$
o Computed t of 1.44 < Critical value of 1.96,
o p of .1499 > α of .05,
Do not reject the null hypothesis.
The p-value is .1499 for a two-tailed test.
We cannot conclude the mean is different from 16 ounces.
Testing for a Population Mean: Unknown (Population) Standard Deviation σ, Small Sample, but the Underlying Distribution Is Normal
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
The test statistic follows the t distribution.
The critical value of t is determined by its degrees of freedom, which is equal to n-1.
The current rate for producing 5 amp fuses at an electric company is 250 per hour. A new machine has been purchased and installed. According to the supplier, the production rates are normally distributed. A sample of 10 randomly selected hours from last month revealed that the mean hourly production was 256 units, with a sample s.d. of 6 per hour.
At the 0.05 significance level, test whether the new machine is faster than the old one.
Step 1 State the null and alternate hypotheses.
H0: µ ≤ 250   H1: µ > 250
Step 2 Select the level of significance. It is .05.
Step 3 Since the underlying distribution is normal and σ is unknown, use the t distribution.
Step 4 State the decision rule.
Degrees of freedom = 10 – 1 = 9. Reject H0 if t > 1.833.
Step 5 Make a decision and interpret the results.
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{256 - 250}{6/\sqrt{10}} = 3.162$$
o Computed t of 3.162 > Critical t of 1.833
o p of .0058 < alpha of .05
Reject H0.
The p-value is 0.0058 (obtained from the t distribution; software is needed to find it).
The mean number of fuses produced is more than 250 per hour.
If the p-value is less than alpha, then reject the null hypothesis.
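Since the p-value for a t statistic needs software, here is one way it could be obtained with only the Python standard library. The sketch integrates the Student's t density numerically with Simpson's rule; this is an illustrative approach, not the method used to produce the slide's figure, and the helper names are hypothetical. A statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the density is symmetric about 0

# Fuse example: n = 10, sample mean 256, s = 6, H0: mu <= 250.
n, sample_mean, s, mu_0 = 10, 256, 6, 250
t = (sample_mean - mu_0) / (s / math.sqrt(n))
p_value = upper_tail_p(t, df=n - 1)

print(f"t = {t:.3f}, one-tailed p = {p_value:.4f}")
```

Because the p-value is below alpha = .05, the p-value rule and the critical-value rule (t > 1.833) lead to the same decision.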
Example: One-sample hypothesis test for mean
• Amount of time UIC students spend in library from survey
– Mean 41.72 minutes
– Standard deviation 40.179 minutes
– Number of cases 294
• National survey finds university library users spend mean of 38 minutes
• Is population mean for UIC Library users different from national mean?
Step 1. Hypotheses
• Null hypothesis
H0: μ = μ0, that is, μ = 38
• Alternative or research hypothesis
Ha: μ ≠ μ0, that is, μ ≠ 38
Step 2. Level of significance
• Probability of error in making decision to reject null hypothesis
• For this test choose α = 0.05
[Figure: standard normal curve with critical values −1.96 and 1.96; do-not-reject region (probability = .95) between them, regions of rejection (probability = .025 each) in the tails.]
Step 3. Test statistic
$$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{41.72 - 38}{40.179/\sqrt{294}} = 1.588$$
n = 294, so use the critical t values from the table for infinite degrees of freedom (±1.96).
Step 4. Decision
• Cannot reject the null hypothesis, since 1.588 < 1.96
• Cannot conclude that population mean is different from 38 minutes
Confidence interval and hypothesis test for library example
95% confidence interval in this example:
E = 1.96 × s/√n = 1.96 × 40.179/√294 = 4.59
[41.72 − 4.59, 41.72 + 4.59] or [37.13, 46.31]
• Confidence interval for time spent in library is 37.13 < μ < 46.31
• Hypothesized value of 38 minutes falls within confidence interval
• Therefore we cannot say that the population mean is not equal to 38 minutes; we cannot reject the null hypothesis
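The link between the 95% confidence interval and the two-tailed test at α = 0.05 in the library example can be checked numerically. The sketch below recomputes the test statistic and the interval from the survey figures; the variable names are illustrative, and 1.96 is the large-sample critical value used in the example.

```python
import math

# Library survey figures from the example.
n, mean, s, mu_0 = 294, 41.72, 40.179, 38

standard_error = s / math.sqrt(n)
t = (mean - mu_0) / standard_error

margin = 1.96 * standard_error           # E, using the large-sample critical value
lower, upper = mean - margin, mean + margin

# The two-tailed test rejects H0 exactly when mu_0 lies outside the interval.
reject_by_test = abs(t) > 1.96
reject_by_interval = not (lower <= mu_0 <= upper)

print(f"t = {t:.3f}, CI = [{lower:.2f}, {upper:.2f}]")
print(reject_by_test, reject_by_interval)
```

Both checks agree, which is why the interval and the test lead to the same conclusion here.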
Using confidence intervals or hypothesis tests
• For parameters for a single sample…
– One-sample hypothesis test involves comparison with pre-specified value…
– Which is often artificial…
– So confidence interval most appropriate for reporting results
• For parameters for two samples…
– Difference in parameters is of interest
– Hypothesis test examines directly
– Confidence interval less intuitive
Confidence interval or Hypothesis test?
• Hypothesis tests are better when the chief issue is to make a yes/no decision about whether a pattern exists in a population.
• Confidence intervals are better when the chief issue is to make a best guess of a population parameter.
When reading a scientific journal, you typically will not be told explicitly that the researcher evaluated the data using a z-score as a test statistic with an alpha level of .05. Nor will you be told that “the null hypothesis is rejected.” Instead, you will see a statement such as:
The treatment with medication had a significant effect on people’s depression scores, z = 3.85, p < .05.
Let us examine this statement piece by piece.
First, what is meant by the term significant? In statistical tests, this word indicates that the result is different from what would be expected due to chance. A significant result means that the null hypothesis has been rejected. That is, the data are significant because the sample mean falls in the critical region and is not what we would have expected to obtain if H0 were true.
Next, what is the meaning of z = 3.85? The z indicates that a z-score was used as the test statistic to evaluate the sample data and that its value is 3.85.
Finally, what is meant by p< .05? This part of the statement is a conventional way of specifying the alpha level that was used for the hypothesis test. More specifically, we are being told that an outcome as extreme as the result of the experiment would occur by chance with a probability (p) that is less than .05 (alpha) if H0 were true.
In circumstances where the statistical decision is to fail to reject H0, the report might state that:
There was no evidence that the medication had an effect on depression scores, z = 1.30, p > .05.
In this case, we are saying that the obtained result, z= 1.30, is not unusual (not in the critical region) and is relatively likely to occur by chance (the probability is greater than .05). Thus, H0 was not rejected.
Using the p-Value in Hypothesis Testing
If the p-value ≥ α, H0 cannot be rejected.
If the p-value < α, H0 is rejected.
The p-value not only tells us whether we should reject H0, but also how strong the evidence against H0 is.
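The p-value rule can be tried on the two reported results discussed earlier, z = 3.85 and z = 1.30. The sketch below assumes both were two-tailed tests (the reports do not say), and the helper name is hypothetical.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z statistic (assumes a two-tailed test)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
decisions = {}
for z in (3.85, 1.30):
    p = two_tailed_p(z)
    # Reject H0 only when the p-value is below alpha.
    decisions[z] = "reject H0" if p < alpha else "cannot reject H0"
    print(f"z = {z:.2f}, p = {p:.4f}: {decisions[z]}")
```

The smaller the p-value, the stronger the evidence against H0, which is the extra information a bare reject/do-not-reject decision does not carry.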
Sample means that fall in the critical region (shaded areas) have a probability less than alpha. H0 should be rejected.
Another example: To test the effectiveness of eye-spot patterns in deterring predation, a sample of n=16 insectivorous birds is selected. The animals are tested in a box that has two separate chambers (see figure). The birds are free to roam from one chamber to another through a doorway in a partition. On the wall of one chamber, two large eye-spot patterns have been painted. The other chamber has plain walls. The birds are tested one at a time by placing them in the doorway in the center of the apparatus.
Each animal is left in the box for 60 minutes, and the amount of time spent in the plain chamber is recorded. Suppose that the sample of n=16 birds spent an average of M = 39 minutes in the plain side, with SS=540. Can we conclude that eye-spot patterns have an effect on behavior? Note that we have no information about the population variance.
Step 1: State the hypotheses: H0: µplain side = 30 minutes (no effect, so the 60 minutes are split evenly between the chambers); H1: µplain side ≠ 30 minutes
Step 2: Locate the critical region. The test statistic is a t statistic because the population variance is not known.
df = 16 − 1 = 15. For a two-tailed test at the .05 level of significance and with 15 degrees of freedom, the critical region consists of t values greater than +2.131 or less than −2.131.
Step 3: Calculate the test statistic. s² = SS/df = 540/15 = 36; sM = sqrt(s²/16) = 1.5; the t statistic is t = (39 − 30)/1.5 = 6.
Step 4: Make a decision – reject H0, because t = 6 falls in the critical region. The eye-spot patterns appear to have an effect on behavior.
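The bird example starts from SS rather than from a standard deviation, so the chain SS → s² → standard error → t is worth tracing once in code. This is a sketch using the numbers from the example; the critical value 2.131 is the tabled two-tailed value for df = 15 quoted in the text, and the variable names are illustrative.

```python
import math

# Eye-spot example: n = 16 birds, mean time in plain chamber 39 min, SS = 540.
n, sample_mean, ss, mu_0 = 16, 39, 540, 30

df = n - 1                           # degrees of freedom
s2 = ss / df                         # sample variance, SS / df
standard_error = math.sqrt(s2 / n)   # estimated standard error of the mean
t = (sample_mean - mu_0) / standard_error

critical_t = 2.131                   # two-tailed, alpha = .05, df = 15 (from the text)
reject_h0 = abs(t) > critical_t

print(f"df = {df}, s2 = {s2}, SE = {standard_error}, t = {t}, reject H0: {reject_h0}")
```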
Example: Survey data on attitudes toward income inequality
• Imagine that we would like to find out if US adults had some net opinion on the following issue.
• “Do you think it should or should not be the government’s responsibility to reduce income differences between the rich and the poor?”
Score   Response        Number
1       should be       591
0       should not be   636
Total                   n = 1227
• 0: Assumptions: we will be doing a large-sample test for population proportions. To perform this test, we must assume that…
– Sample size is large enough that np(1-p) > 10
– The sample is a random sample of some sort
– The variable is a discrete interval-scale variable, which is automatically true for population proportions.
• 1: Hypothesis: let π denote the population proportion who favor government intervention to alleviate income inequality.
• Our null hypothesis is that the population, on average, neither supports nor opposes government intervention.
– Ho: π = 0.5
• The alternate hypothesis is then
– HA: π ≠ 0.5
• 2: Test Statistic: For an n of 1227 respondents, we calculate the following statistics:
– P = n(yes)/n(total) = 591/1227 = .4817
– σ0 = SQRT(π0(1 − π0)) = .5
– SE = σ0 / SQRT(n) = .01427
– z = (P − π0) / SE = (.4817 − .500) / .01427 = −1.282
• The z-statistic is the test statistic of interest in a large-sample test of a population proportion.
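The proportion test statistic can be recomputed directly from the survey counts. The sketch below is illustrative only and uses made-up variable names; note that it carries the sample proportion at full precision, so z comes out near −1.285 rather than the −1.282 obtained above from the rounded value .4817.

```python
import math

# Survey counts from the example.
yes, total = 591, 1227
pi_0 = 0.5                                   # proportion under H0

# Large-sample assumption check: n * pi_0 * (1 - pi_0) should exceed 10.
large_enough = total * pi_0 * (1 - pi_0) > 10

p_hat = yes / total                          # sample proportion P
sigma_0 = math.sqrt(pi_0 * (1 - pi_0))       # standard deviation under H0
se = sigma_0 / math.sqrt(total)              # standard error under H0
z = (p_hat - pi_0) / se

critical_z = 1.96                            # two-tailed, alpha = .05
reject_h0 = abs(z) > critical_z

print(f"P = {p_hat:.4f}, SE = {se:.5f}, z = {z:.3f}, reject H0: {reject_h0}")
```

Either way the statistic lies well inside ±1.96, so the rounding does not affect the decision.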
• 3: Pick α = 0.05 & determine critical z
[Figure: standard normal curve with critical values −1.96 and 1.96; do-not-reject region (probability = .95) between them, regions of rejection (probability = .025 each) in the tails. The computed z = −1.282 falls in the do-not-reject region.]