Chapter 6.1 Estimating with Confidence 1

Upload: amy-small

Post on 13-Dec-2015


Page 1: Chapter 6.1 Estimating with Confidence 1. Point estimation  Sample mean is the natural estimator of the unknown population mean.  Is the point estimation

Chapter 6.1

Estimating with Confidence


Point estimation

The sample mean is the natural estimator of the unknown population mean.

Is point estimation a good method?

1. It may never hit the true value (the population mean).

2. We have no idea about the variability of the estimate.

Therefore, we have no confidence about how close our estimator is to the true value.

[Images: Taser gun vs. net gun (flyswatter)]

Idea: It is better to use an INTERVAL than a POINT estimator.

Review: Chapter 1.3: All Normal curves N(µ, σ) share the 68-95-99.7 rule

Reminder: µ (mu) is the mean of the idealized curve, while x̄ is the mean of a sample.

σ (sigma) is the standard deviation of the idealized curve, while s is the standard deviation of a sample.

• About 68% of all observations are within 1 standard deviation (σ) of the mean (µ). Called: C = 68%, z* ≈ 1

• About 95% of all observations are within 2σ of the mean µ. Called: C = 95%, z* ≈ 2

• Almost all (99.7%) observations are within 3σ of the mean µ. Called: C = 99.7%, z* ≈ 3

[Figure: standard Normal distribution N(0, 1)]
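The three percentages in the rule can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `area_within` is only illustrative and does not come from the slides.

```python
from statistics import NormalDist

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Compare with the rounded 68-95-99.7 figures from the rule.
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
```

The exact areas differ slightly from the rounded rule, which is why the slides write z* ≈ 1, 2, 3 rather than exact values.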

Confidence levels

Confidence intervals contain the population mean in C% of samples.

Different areas under the curve give different confidence levels C.

Example: For an 80% confidence level C, 80% of the normal curve’s area is contained in the interval.

z*: z* is related to the chosen confidence level C. C is the area under the standard normal curve between −z* and z*.

[Figure: standard normal curve with central area C between −z* and z*]

The confidence interval is thus:

$\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}}$

Point estimation versus interval

When the population mean (µ) is unknown, it is better to use an interval than a point to estimate it. The theory behind interval estimation looks at the sampling distribution of the statistic. The level C confidence interval (CI) for the population mean µ is:

$\left(\bar{x} - z^* \dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z^* \dfrac{\sigma}{\sqrt{n}}\right)$

For a particular confidence level C, the appropriate z* value is given in the last row of Table D.

Example: For a 98% confidence level, z*=2.326
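When Table D is not at hand, z* can be computed from the definition: a central area C leaves (1 − C)/2 in each tail, so z* is the (1 + C)/2 quantile of N(0, 1). A minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the function name `z_star` is illustrative.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Critical value z* such that the area between -z* and z* equals `confidence`."""
    # The upper tail holds (1 - C)/2, so z* is the (1 + C)/2 quantile.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.98, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```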

Specific Confidence Intervals for population mean

99% CI for the population mean µ is: $\left(\bar{X} - 2.58\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + 2.58\dfrac{\sigma}{\sqrt{n}}\right)$, i.e.: C=99%, z*=2.576

95% CI for the population mean µ is: $\left(\bar{X} - 1.96\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\dfrac{\sigma}{\sqrt{n}}\right)$, i.e.: C=95%, z*=1.960

90% CI for the population mean µ is: $\left(\bar{X} - 1.65\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + 1.65\dfrac{\sigma}{\sqrt{n}}\right)$, i.e.: C=90%, z*=1.645

Example 1

The average lifetime of 36 randomly selected TVs of a certain brand is 20 years. Suppose the SD of the lifetimes of all TVs of this brand is 2 years.

Construct a 95% CI for the average lifetime of all TVs from this brand.

A 95% CI for the average lifetime of all TVs from this brand is: (19.35, 20.65)
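The arithmetic behind this interval can be sketched as follows. The function name `z_interval` is illustrative, and z* is computed from the normal quantile rather than read from Table D, so later decimals may differ slightly from a table-based calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar: float, sigma: float, n: int, confidence: float):
    """Level-C confidence interval for mu when the population SD sigma is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value z*
    margin = z * sigma / sqrt(n)                    # margin of error
    return xbar - margin, xbar + margin

# Example 1: n = 36 TVs, sample mean 20 years, population SD 2 years.
low, high = z_interval(20, 2, 36, 0.95)
print(f"({low:.2f}, {high:.2f})")
```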

Example 2

1. The average height of 100 randomly selected UNCW students is 5.9 feet. Suppose the SD of the heights of all students is 1.2 feet. Construct 99%, 95% and 90% CIs for the average height of all students.

A 99% CI for the average height of all students is: (5.5904, 6.2096)

A 95% CI for the average height of all students is: (5.6648, 6.1352)

A 90% CI for the average height of all students is: (5.702, 6.098)

2. Select another set of 100 UNCW students randomly. The average height of the second set of 100 students is 5.5 feet. Suppose the SD of the heights of all students is 1.2 feet. Construct a 95% CI for the average height of all students.

A 95% CI for the average height of all students is: (5.2648, 5.7352)

Note: As the confidence level C gets smaller, the CI gets narrower.
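The intervals in part 1 can be reproduced with the rounded z* values that the slides use (2.58, 1.96, 1.65). This is a sketch only; the names `SLIDE_Z_STAR` and `slide_interval` are illustrative.

```python
from math import sqrt

# Rounded z* values as used on the slides (Table D gives 2.576, 1.960, 1.645).
SLIDE_Z_STAR = {0.99: 2.58, 0.95: 1.96, 0.90: 1.65}

def slide_interval(xbar: float, sigma: float, n: int, level: float):
    """Confidence interval using the slides' rounded z* for the given level."""
    margin = SLIDE_Z_STAR[level] * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Example 2, part 1: n = 100, sample mean 5.9 ft, population SD 1.2 ft.
for level in (0.99, 0.95, 0.90):
    low, high = slide_interval(5.9, 1.2, 100, level)
    print(f"{level:.0%} CI: ({low:.4f}, {high:.4f}), width {high - low:.4f}")
```

Printing the width next to each interval makes the note concrete: with the same data, a lower confidence level gives a narrower interval.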

Outlines for z*

z* depends on the level of confidence C.

What does “confidence” mean? See applet.

This idea is only true for simple random samples and completely randomized experiments.

Margin of error: z*σ/√n

http://bcs.whfreeman.com/ips7e/#616906__657132__

In StatCrunch, use the Applet option and select the desired options. Then run the applet.

http://www.statcrunch.com/app/index.php?

Understanding of Confidence Intervals

With 95% confidence, we can say that µ should be within roughly 2 standard deviations (that is, 2σ/√n) from our sample mean x̄.

In about 95% of all possible samples of this size n, µ will indeed fall in our confidence interval.

In only about 5% of samples would x̄ be farther from µ.

Link between confidence level and margin of error

The margin of error depends on z*.

$\text{MOE} = m = z^* \dfrac{\sigma}{\sqrt{n}}$

[Figure: standard normal curve with central area C between −z* and z*; the margin of error m extends on either side of the center]

Higher confidence C implies a larger margin of error m (thus less precision in our estimates).

A lower confidence level C produces a smaller margin of error m (thus better precision in our estimates).


Example 3

a) A 90% CI for the average life that this medicine can prolong for all cancer patients is: (3.8453, 4.1547);

b) z* = 1.645; MOE = (1.645)*(0.75)/sqrt(n) = 0.1; so n = (1.645*0.75/0.1)^2 = 152.2139. We will take n = 153.
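The sample-size step in part b) solves MOE = z*σ/√n for n and rounds up, since n must be a whole number and rounding down would give a margin of error above the target. A minimal sketch; the function name `required_n` is illustrative, and z* comes from the normal quantile instead of the table value 1.645.

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma: float, margin: float, confidence: float) -> int:
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most `margin`."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z * sigma / margin) ** 2)  # always round up

# Part b): sigma = 0.75, target margin of error 0.1, 90% confidence.
print(required_n(0.75, 0.1, 0.90))
```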

Summary to Confidence Interval

If the confidence level C gets larger and n stays the same, what will happen to z*, MOE, CI, and prediction precision?

If z* and σ stay the same, when n gets bigger, what will happen to MOE and CI?
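One way to explore the second question is to hold z* and σ fixed and vary n. The sketch below borrows z* = 1.96 and σ = 1.2 from Example 2 purely as illustrative inputs; the function name `margin_of_error` is not from the slides.

```python
from math import sqrt

def margin_of_error(z_star: float, sigma: float, n: int) -> float:
    """MOE = z* * sigma / sqrt(n)."""
    return z_star * sigma / sqrt(n)

# Same z* and sigma, increasing sample sizes.
for n in (25, 100, 400):
    print(f"n = {n:3d}: MOE = {margin_of_error(1.96, 1.2, n):.4f}")
```

Because n enters through a square root, quadrupling the sample size only halves the margin of error.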


Chapter 6.2

Hypothesis Testing


6.2 Tests of hypothesis

5 Steps to Hypothesis Testing

1. State the hypothesis

2. State the level of significance

3. Calculate the test statistic

4. Find the p-value

5. Conclusion (both statistical and non-statistical)

Hypothesis Testing

The idea of hypothesis testing is to use the data to make a decision. In hypothesis testing, there are only two “decisions”, also called hypotheses, that the data could support. The two hypotheses are called the null hypothesis and the alternative hypothesis.

Forms of Null and Alternative Hypotheses

Null Hypothesis

Expectation: what somebody believes or claims before the sample is available.

Null hypothesis: the hypothesis you assume to be true, the one you are comparing against your data. It is denoted by H0.

Many times the null hypothesis is a statement of “no effect” or of “no difference”…

“Being fair”

E.g.: Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year’s data show a lower average response time?

Alternative Hypothesis

The expectation is not correct: the difference between the expectation and the sample statistic is real.

Alternative hypothesis: expresses the hopes or suspicions we bring to the data. It is denoted by Ha. The test is designed to assess the strength of evidence against the null hypothesis.

It’s a statement that “supports” the information from the data.

E.g.: Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year’s data show a lower average response time?


Example 1 Exercise 6.55 (p. 391):

Translate each of the following research questions into

appropriate H0 and Ha.

a) Census Bureau data show that the mean household income in the area served by a shopping mall is $62,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.

a) H0 : µ = $62,500 versus Ha : µ > $62,500.


Example 1 cont.

Exercise 6.55 (p. 391):

Translate each of the following research questions into

appropriate H0 and Ha.

b) Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year’s data show a different average response time?

b) H0 : µ = 2.6 hours versus Ha : µ ≠ 2.6 hours.


6.2 Tests of hypothesis

5 Steps to Hypothesis Testing

1. State the hypothesis

2. State the level of significance (α=0.05 unless otherwise stated)

3. Calculate the test statistic

4. Find the p-value

5. Conclusion (both statistical and non-statistical)

6.2 Tests of hypothesis: Review

E.g.: Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year’s data show a lower average response time?

a) H0 : µ = 2.6 hours versus Ha : µ < 2.6 hours.

E.g.: Census Bureau data show that the mean household income in the area served by a shopping mall is $62,500 per year. A market research firm questions shoppers at the mall to find out whether the mean household income of mall shoppers is higher than that of the general population.

b) H0 : µ = $62,500 versus Ha : µ > $62,500.

E.g.: Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who had purchased service contracts. Do this year’s data show a different average response time?

c) H0 : µ = 2.6 hours versus Ha : µ ≠ 2.6 hours.

One-sided and two-sided tests for P-value

A two-tail or two-sided test of the population mean has these null and alternative hypotheses:

H0 : µ = [a specific number]   Ha : µ ≠ [a specific number]

A one-tail or one-sided test of a population mean has these null and alternative hypotheses:

H0 : µ = [a specific number]   Ha : µ < [a specific number]   OR

H0 : µ = [a specific number]   Ha : µ > [a specific number]

Find P-value

The P-value is the area under the sampling distribution for values at least as extreme, in the direction of Ha, as that of our random sample.

Use Table A, or NORMALCDF in calculator.

e.g. H0 : µ = 2.6 hours versus Ha : µ < 2.6 hours gives test statistic Z = −1.6. Q: Find the p-value.

[Figure: sampling distribution of x̄, centered at the µ defined by H0, with standard deviation σ/√n]

P-value in one-sided and two-sided tests

To calculate the P-value for a two-sided test, use the symmetry of the normal curve. Find the P-value for a one-sided test (the tail area beyond the observed z), and double it.

[Figures: one-sided (one-tailed) test; two-sided (two-tailed) test]

Example 3: One-sample Z-test

A test of the null hypothesis H0 : µ = µ0 gives test statistic Z = −1.6.

a) What is the P-value if the alternative is Ha : µ > µ0 ?

b) What is the P-value if the alternative is Ha : µ < µ0 ?

c) What is the P-value if the alternative is Ha : µ ≠ µ0 ?

Example 3 (cont.): One-sample Z-test

A test of the null hypothesis H0 : µ = µ0 gives test statistic Z = 2.1.

a) What is the P-value if the alternative is Ha : µ > µ0 ?

b) What is the P-value if the alternative is Ha : µ < µ0 ?

c) What is the P-value if the alternative is Ha : µ ≠ µ0 ?
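The three parts of Example 3 differ only in which tail area is used. The following sketch shows one way to compute each case with the standard library in place of Table A or NORMALCDF; the function name `p_value` and the labels "less", "greater" and "two-sided" are illustrative choices.

```python
from statistics import NormalDist

def p_value(z: float, alternative: str) -> float:
    """P-value of a z statistic for Ha: mu < mu0, mu > mu0, or mu != mu0."""
    left_area = NormalDist().cdf(z)  # area to the left of z under N(0, 1)
    if alternative == "less":        # Ha: mu < mu0
        return left_area
    if alternative == "greater":     # Ha: mu > mu0
        return 1 - left_area
    if alternative == "two-sided":   # Ha: mu != mu0, double the smaller tail
        return 2 * min(left_area, 1 - left_area)
    raise ValueError(f"unknown alternative: {alternative!r}")

for z in (-1.6, 2.1):
    for alt in ("greater", "less", "two-sided"):
        print(f"Z = {z:+.1f}, Ha {alt}: P-value = {p_value(z, alt):.4f}")
```

Note that a one-sided P-value can be large (above 0.5) when the test statistic falls on the opposite side from Ha.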


How to do 5 steps

1. State H0 and Ha

2. State the level of significance (Usually α is 5% ).

3. Calculate the test statistic (ASSUMING THE NULL HYPOTHESIS IS TRUE)

4. Find the P-value, that is, the probability (computed assuming H0 is true) of a result at least as extreme as the observed one in the direction of Ha.

5. Draw Conclusion:

If P-value ≤ α, then we reject H0 (Enough evidence).

If P-value > α, then we do not reject H0 (Not enough evidence).

Note: The two possible conclusions are rejecting or not rejecting H0.


Example 2:

The P-value for a significance test is 0.032

a) Do you reject the null hypothesis at level α = 0.05?

b) Do you reject the null hypothesis at level α = 0.01?

c) Explain your answers.

Note that:

If P-value ≤ α, then we reject H0 (Enough evidence).

If P-value > α, then we do not reject H0 (Not enough evidence).
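The decision rule is a single comparison, so the same P-value can lead to different conclusions at different significance levels. A minimal sketch (the function name `decide` is illustrative):

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the rule: reject H0 exactly when P-value <= alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"

# Example 2: P-value = 0.032 at two significance levels.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(0.032, alpha)}")
```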

Chap 5: Sampling distribution of a sample mean = distribution of x̄

Distribution of X (n = 1): mean µ, SD σ.

Sampling distribution of x̄ (n > 1): mean µ, SD σ/√n.

Population exactly N(µ, σ): the sampling distribution of x̄ is exactly N(µ, σ/√n).

Population not exactly Normal, but with mean µ and SD σ: the sampling distribution of x̄ is approximately N(µ, σ/√n) for large n (by the Central Limit Theorem).

Standardize: the Z-score of x̄ is $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$. Reverse: $\bar{X} = \mu + Z \cdot \dfrac{\sigma}{\sqrt{n}}$.

One-sample Z-test for population mean

The test statistic is a Z-score relative to the sampling distribution of the sample mean (see Chapter 5.1):

$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$


Example 4: One-sample Z-Test (one sided)

The National Center for Health Statistics reports that the mean systolic blood pressure for males 35 to 44 years of age is 128 with a population SD=15.

The medical director of a company looks at the medical records of 72 company executives in this age group and finds that the mean systolic blood pressure in this sample is 126.07. Is this evidence that executives’ blood pressures are lower than the national average?

Answer to Example 4:

(1) Hypothesis: H0 : µ = 128 vs. Ha : µ < 128.

(2) α = 5%

(3) One-sample Z-test statistic, with x̄ = 126.07, σ = 15, n = 72:

$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{126.07 - 128}{15/\sqrt{72}} = -1.09$

(4) From Table A, the area under the standard normal curve to the left of z = −1.09 is 0.1379. Thus, P-value = normalcdf(-999, -1.09, 0, 1) = 0.1379 = 13.79%.

(5) (Statistical Conclusion) Since P-value > α, we do not reject H0.

(Non-Statistical Conclusion) That is, there is NOT enough evidence that executives’ blood pressures are lower than the national average.
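Steps (3) and (4) can be checked with a few lines of code. This sketch uses Python's `statistics.NormalDist` as a stand-in for Table A and normalcdf; the function name `z_statistic` is illustrative. Because z is not rounded to −1.09 before the lookup, the P-value may differ from 0.1379 in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(xbar: float, mu0: float, sigma: float, n: int) -> float:
    """One-sample z statistic, computed assuming H0: mu = mu0 is true."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Example 4: H0: mu = 128 vs. Ha: mu < 128.
z = z_statistic(126.07, 128, 15, 72)
p_left = NormalDist().cdf(z)  # left-tail area, the direction of Ha
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_left:.4f}")
print("reject H0" if p_left <= alpha else "do not reject H0")
```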

Example 5: One-sample Z-Test (two sided)

A new medicine treating cancer was introduced to the market decades ago and the company claimed that on average it will prolong a patient’s life for 5.2 years. Suppose the SD of the prolonged lifetimes of all cancer patients is 2.52 years.

In a 10-year study with 64 patients, the average prolonged lifetime is 4.6 years. With normality assumption, do the 10-year study’s data show a different average prolonged lifetime?

Answer to Example 5:

(1) Hypothesis: H0 : µ = 5.2 years versus Ha : µ ≠ 5.2 years.

(2) α = 5%

(3) One-sample Z-test statistic, with x̄ = 4.6, σ = 2.52, n = 64:

$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{4.6 - 5.2}{2.52/\sqrt{64}} = -1.90$

(4) From Table A, the area under the standard normal curve to the left of z = −1.90 is normalcdf(-999, -1.90, 0, 1) = 0.0287. Thus, P-value = 2*0.0287 = 0.0574 = 5.74%.

(5) (Statistical Conclusion) Since P-value > α, we do not reject H0.

(Non-Statistical Conclusion) There is NOT enough evidence to conclude that the 10-year study’s data show a different average prolonged lifetime.

Example 5.1 (Based on Example 5)

Find a 95% confidence interval for the average prolonged lifetime for all patients.

A 95% CI for the average prolonged lifetime for all patients is given by:

$4.6 \pm 1.96 \cdot \dfrac{2.52}{\sqrt{64}} = 4.6 \pm 0.6174$, that is, [3.9826, 5.2174]

Note: Since H0 : µ = 5.2, we have µ0 = 5.2, which falls inside the 95% CI. So 5.2 is a plausible value for µ at the 95% confidence level, which is why we did not reject H0 at the 5% level.

Confidence intervals to test hypotheses

For a level α two-sided significance test:

Reject H0 : µ = µ0 exactly when the hypothesized value µ0 falls outside a level (1 − α) confidence interval for µ.

[Figure: normal curve with central area C and area α/2 in each tail]

In a two-sided test, C = 1 − α, where C is the confidence level and α is the significance level.

Logic of confidence interval test

Ex: Your sample gives a 99% confidence interval of x̄ ± m = 0.84 ± 0.0101.

With 99% confidence, could samples be from populations with µ = 0.86? µ = 0.85?

[Figure: 99% C.I. centered at x̄]

Cannot reject H0 : µ = 0.85

Reject H0 : µ = 0.86


Example 6:

The P-value for a two-sided test of the null hypothesis

H0 : µ = 30 is 0.04.

a) Does the 95% confidence interval include the value 30? Why?

b) Does the 90% confidence interval include the value 30? Why?

c) Does the 99% confidence interval include the value 30? Why?

Note that in a two-sided test, C = 1 − α, where C is the confidence level and α is the significance level.
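Since C = 1 − α, a level C interval contains the hypothesized value exactly when the two-sided test does not reject at level α, that is, when P-value > α. The reasoning in Example 6 can be sketched as follows; the function name `interval_contains_null` is illustrative, and α is passed directly for each confidence level.

```python
def interval_contains_null(p_value: float, alpha: float) -> bool:
    """True when the level (1 - alpha) CI contains mu0, i.e. when H0 is not rejected."""
    return p_value > alpha

# Example 6: two-sided P-value 0.04 for H0: mu = 30.
for label, alpha in (("95%", 0.05), ("90%", 0.10), ("99%", 0.01)):
    contains = interval_contains_null(0.04, alpha)
    print(f"{label} CI includes 30: {contains}")
```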


Example 7:

A 90% confidence interval for a population mean is (12, 15).

a) Can you reject the null hypothesis that H0 : µ = 13 at the

10% significance level? Why?

b) Can you reject the null hypothesis that H0 : µ = 10 at the

10% significance level? Why?

Multiple Choice Questions

10. The P-value for a two-sided test of the null hypothesis H0 : µ = 30 is 0.09,

a) the 99% confidence interval includes the value 30.

b) the 95% confidence interval includes the value 30.

c) the 90% confidence interval does not include the value 30.

d) All of the above are correct.

Exercises on Hypothesis Testing (One-sample Z-test)

1. Because of variation in the manufacturing process, tennis balls produced by a particular machine do not have identical diameters; the diameter is supposed to be 3 in. The population SD is 0.15 in. If the average diameter of the first 36 balls made by the machine is 3.2 in, shall we stop and calibrate the machine?

2. A new medicine treating cancer was introduced to the market decades ago and the company claimed that on average it will prolong a patient’s life for 5 years. The population SD is 0.4 year. In a 10-year study with 81 patients, the average prolonged lifetime is 4.5 years. With normality assumption, shall we reject the original claim?

3. The registrar’s office claims that the average SAT score of UNCW students is 1050. The population SD is 80. Suppose you randomly select 100 UNCW students and the average SAT score of your sample is 1020. Do you agree with the claim?

4. National data show that, on average, college freshmen spend 7.5 hours a week going to parties. The population SD is 2 hours. One administrator takes a random sample of 81 freshmen from her college and finds that her students’ average time spent on parties is 6.6 hours. Shall the administrator believe that the national data apply to her students?

Solutions

1. H0 : µ = 3, Ha : µ ≠ 3; α = 5%; Z = (3.2 − 3)/(0.15/√36) = 8; P-value ≈ 0 < 5%; we reject H0 and we shall stop and calibrate the machine.

2. H0 : µ = 5, Ha : µ ≠ 5; α = 5%; Z = (4.5 − 5)/(0.4/√81) = −11.25; P-value ≈ 0 < 5%; we reject H0 and we shall reject the claim that the average is 5 years.

3. H0 : µ = 1050, Ha : µ ≠ 1050; α = 5%; Z = (1020 − 1050)/(80/√100) = −3.75; P-value ≈ 0 < 5%; we reject H0 and we do not agree with the claim that the average SAT score is 1050.

4. H0 : µ = 7.5, Ha : µ ≠ 7.5; α = 5%; Z = (6.6 − 7.5)/(2/√81) = −4.05; P-value ≈ 0 < 5%; we reject H0 and the national data do not apply.
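The four test statistics can be recomputed in one pass. This is a sketch only: the short labels and the function name `two_sided_z_test` are illustrative, and the P-values come from the normal CDF, so they appear as very small numbers instead of the table-rounded 0.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar: float, mu0: float, sigma: float, n: int):
    """Return (z, two-sided P-value) for H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p

# (label, sample mean, mu0, population SD, n) taken from the four exercises.
exercises = [
    ("1. tennis balls", 3.2, 3, 0.15, 36),
    ("2. medicine", 4.5, 5, 0.4, 81),
    ("3. SAT scores", 1020, 1050, 80, 100),
    ("4. party hours", 6.6, 7.5, 2, 81),
]

alpha = 0.05
for label, xbar, mu0, sigma, n in exercises:
    z, p = two_sided_z_test(xbar, mu0, sigma, n)
    verdict = "reject H0" if p <= alpha else "do not reject H0"
    print(f"{label}: Z = {z:.2f}, P-value = {p:.2g}, {verdict}")
```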