Copyright © 2013, 2010 and 2007 Pearson Education, Inc.
Chapter 9 Estimating the Value of a Population Parameter
Section 9.1 Estimating a Population Proportion
A point estimate is the value of a sample statistic that estimates the value of a population parameter.
The point estimate for the population proportion is
$\hat{p} = \frac{x}{n}$
where “x” is the number of “successes” (you determine what counts as a success) and “n” is the sample size.
Calculating a Point Estimate for the Population Proportion
In 2008, a University Poll asked 1783 registered voters nationwide whether they favored or opposed the death penalty for persons convicted of murder. 1123 were in favor (“success”).
Based on this sample, obtain a point estimate for the proportion of ALL registered voters (pop) who are in favor of the death penalty for persons convicted of murder.
$\hat{p} = \frac{1123}{1783} \approx 0.63$
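The point estimate above is a single division, and a minimal sketch can make the roles of x and n explicit. The function name `point_estimate` is an illustrative choice, not something from the slides.

```python
def point_estimate(successes: int, n: int) -> float:
    """Sample proportion p-hat = x / n (illustrative helper)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    return successes / n


# 2008 poll from the slide: 1123 of 1783 registered voters in favor.
p_hat = point_estimate(1123, 1783)
print(f"{p_hat:.4f}")  # 0.6298
```

The slide rounds this to 0.63. The later confidence interval slides use the less rounded 0.6298.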
A confidence interval for a population parameter consists of an interval of values built around a point estimate.
The level of confidence is the probability that the interval-building procedure, applied to a random sample, produces an interval that contains the population parameter we are trying to estimate.
The “level of confidence” we specify is denoted “c” or “1 − α”.
Another way to look at Confidence Intervals
A 95% level of confidence (“c” = 0.95 or “α” = 0.05) means that if we construct 100 different confidence intervals, each based on a different sample from the same population, then we expect about 95 of those intervals to contain the population parameter we are trying to estimate, and about 5 to miss it.
The relation between “c” and “α”
c + α = 1, so c = 1 − α and α = 1 − c
So, if “c” = 0.90 for a 90% confidence interval, then α = 1 − c = 0.10
Confidence interval estimates for the population proportion
Interval = point estimate ± margin of error “E”, that is, $\hat{p} \pm E$
The margin of error of the interval measures how far the point estimate from our sample is likely to be from the population parameter we are trying to estimate, at the chosen level of confidence.
Larson & Farber, Elementary Statistics: Picturing the World, 3e
Level of Confidence
The level of confidence “c” is the probability that the interval estimate actually contains the true population proportion “p”.
c is the area beneath the normal curve between the critical values $-z_c$ and $z_c$.
The remaining area in both tails is (1 − c), so each tail has area (1 − c)/2.
Use Table 5 or TI-84 to find the corresponding z-scores.
[Figure: standard normal curve centered at z = 0 with critical values $-z_c$ and $z_c$; area c between them and (1 − c)/2 in each tail.]
Common Levels of Confidence
If our level of confidence is c = 0.90, then we are 90% confident that the interval estimate will contain the true population proportion “p”.
The corresponding critical z-scores are ±1.645.
See bottom of Table 5 for some commonly used c-values.
[Figure: standard normal curve with c = 0.90 between $-z_c = -1.645$ and $z_c = 1.645$; area 0.05 in each tail.]
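As a cross-check on the table lookup, the critical z-score for a given c can be computed from the inverse normal CDF. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist


def z_critical(c: float) -> float:
    """Positive critical value z_c leaving (1 - c)/2 in each tail."""
    if not 0 < c < 1:
        raise ValueError("c must be between 0 and 1")
    alpha = 1 - c
    # Area to the left of z_c is c plus the lower tail, i.e. 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}  z_c = {z_critical(c):.3f}")
```

For c = 0.90 this agrees with the ±1.645 read from the table.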
Point Estimate for Population p
In a sample survey of 1250 adults, 450 of them said that their favorite sport to watch is baseball.
Find a point estimate for the population proportion of adults who say their favorite sport to watch is baseball.
n = 1250, x = 450
$\hat{p} = \frac{x}{n} = \frac{450}{1250} = 0.36$
The point estimate for the proportion of US adults who say baseball is their favorite sport to watch is 0.36, or 36%. ($\hat{q} = 1 - 0.36 = 0.64$, or 64%)
Confidence Intervals for p
A c-confidence interval for the population proportion “p” is
$\hat{p} - E < p < \hat{p} + E$, where the “margin-of-error” is $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$
The probability that this confidence interval actually contains the population “p” is “c”.
Construct a 90% confidence interval for the proportion of all adults who say baseball is their favorite sport to watch.
n = 1250, x = 450, $\hat{p} = 0.36$, $\hat{q} = 0.64$
Confidence Intervals for p
n = 1250, x = 450, $\hat{p} = 0.36$, $\hat{q} = 0.64$
$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.645 \sqrt{\frac{(0.36)(0.64)}{1250}} \approx 0.022$
$\hat{p} \pm E = 0.36 \pm 0.022$
Left end = 0.338, Right end = 0.382
Based on our sample, we can say with 90% confidence that the proportion of all adults (pop) who say baseball is their favorite sport to watch is between 33.8% and 38.2%.
TI-84 Conf Interval (Proportions)
450/1250 successes, c = 0.90
p-hat = 0.36, q-hat = 0.64
npq = (1250)(0.36)(0.64) = 288 > 10 (approx normal, so can use z-scores!)
1-PropZInt: x=450 n=1250 c=0.90
Ans: (0.338, 0.382)
Exam Q: E = ???
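The calculator result can be reproduced by hand-coding the interval. The sketch below is one possible implementation, not the calculator's own algorithm. The name `proportion_interval` is hypothetical, and the npq ≥ 10 check mirrors the normality condition stated on the slide.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x: int, n: int, c: float):
    """Return (lower, upper, E) for a c-confidence interval about p."""
    p_hat = x / n
    q_hat = 1 - p_hat
    if n * p_hat * q_hat < 10:
        raise ValueError("n*p*q < 10: normal approximation not justified")
    z = NormalDist().inv_cdf(1 - (1 - c) / 2)   # critical value z_c
    e = z * sqrt(p_hat * q_hat / n)             # margin of error E
    return p_hat - e, p_hat + e, e


low, high, e = proportion_interval(450, 1250, 0.90)
print(f"({low:.3f}, {high:.3f})  E = {e:.3f}")  # (0.338, 0.382)  E = 0.022
```

Returning E alongside the bounds answers the "E = ???" question directly. It can also be recovered from the interval as (upper − lower)/2.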
Sampling Distribution of $\hat{p}$
For a random sample of size “n”, the sampling distribution of $\hat{p}$ is approximately normal with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$, provided that npq ≥ 10 (recall: q = 1 − p).
Margin of Error
The margin of error, E, in a confidence interval for a population proportion is given by:
$E = z_{crit} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
Ex: if c = 0.90, then $z_{crit}$ = 1.645
Note: n must be ≤ 0.05N (the sample is at most 5% of a population of size N) to construct a valid interval.
Interpretation of a Confidence Interval
A 90% confidence interval indicates that:
1) if we took many random samples of size “n” from the population and built an interval from each, about 90% of those intervals would contain the population proportion, OR
2) the probability that our interval-building procedure captures the pop parameter is 0.90
Constructing a Confidence Interval for a Population Proportion
In 2008, a University Poll asked 1783 registered voters whether they favored or opposed the death penalty for persons convicted of murder.
1123 were in favor (“success”): 1123/1783 = 0.63
Obtain a 90% confidence interval for the proportion of ALL registered voters (pop) who are in favor of the death penalty for persons convicted of murder.
Solution
$\hat{p} = 0.6298$
$n\hat{p}\hat{q} = 1783(0.63)(0.37) \approx 416 \ge 10$
So, the sampling distribution is approximately normal, and n < 0.05N (assumed). c = 0.90 or α = 0.10, so $z_{crit}$ = 1.645
Lower interval bound: $0.6298 - 1.645\sqrt{\frac{(0.63)(0.37)}{1783}} = 0.6110$
Upper interval bound: $0.6298 + 1.645\sqrt{\frac{(0.63)(0.37)}{1783}} = 0.6486$
TI-84 Conf Interval (Proportions)
1123/1783 successes, c = 0.90
Stat:Tests:A:1-PropZInt: x=1123 n=1783 c=0.90
Ans: (0.6110, 0.6486)
Margin of Error “E” = ??? E = (0.6486 − 0.6110)/2 = 0.0188 or 1.88%
Conclusion
90% Conf. Interval: (0.6110, 0.6486)
Based on our sample, we are 90% confident that the proportion of ALL registered voters (pop) who are in favor of the death penalty for those convicted of murder is between 0.6110 and 0.6486 (between 61.10% and 64.86%).
Sample Size Needed for Estimating the Population Proportion p
The sample size required to estimate p at a given level of confidence with a margin of error E is:
$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$
where E is normally quoted in % because we are dealing in proportions, but is entered in the formula as a decimal (for example, 3% is E = 0.03). Round n up to the next whole number.
Sample Size Needed for Estimating the Population Proportion p
If you have not yet done the survey, but want an estimate for a minimum sample size, then use:
$n = (0.5)(0.5)\left(\frac{z_{\alpha/2}}{E}\right)^2$
Determining Sample Size
A sociologist wanted to determine the current percentage of US residents that speak only English at home.
What minimum sample size should she use if she wants her maximum estimation error to be 3% (E), with 90% confidence, assuming she uses the 2000 Census Survey result of 82.4% as a preliminary estimate?
Solution
E = 0.03, $z_{\alpha/2} = z_{0.05} = 1.645$, $\hat{p} = 0.824$
$n = (0.824)(0.176)\left(\frac{1.645}{0.03}\right)^2 = 436.04$
Round this up to 437 randomly selected American residents, which is the minimum sample size for 90% confidence that the maximum interval error is 3%.
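A short sketch of the sample-size calculation, covering both the case with a preliminary estimate and the case without one. The function name `sample_size_proportion` is hypothetical. The critical value is passed in as the table value (1.645) rather than computed, because this result sits very close to a whole number: a more precise z of about 1.64485 gives a value just under 436.

```python
from math import ceil


def sample_size_proportion(z: float, e: float, p_prior: float = 0.5) -> int:
    """Minimum n for margin of error e (as a decimal) at critical value z.

    p_prior defaults to 0.5, the no-prior-estimate case from the slides.
    """
    if not 0 < e < 1:
        raise ValueError("e must be a decimal between 0 and 1, e.g. 0.03")
    n = p_prior * (1 - p_prior) * (z / e) ** 2
    return ceil(n)  # always round up to a whole subject


# Sociologist example: prior estimate 82.4%, E = 3%, 90% confidence.
print(sample_size_proportion(1.645, 0.03, 0.824))  # 437
# Same requirements with no prior estimate.
print(sample_size_proportion(1.645, 0.03))         # 752
```

Without a prior estimate the required sample is noticeably larger, since p̂q̂ is at its maximum of 0.25 when p̂ = 0.5.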
Section 9.2 Estimating a Population Mean
The Margin of Error “E” depends on three factors:
Level of confidence: As the level of confidence increases, the margin of error widens.
Sample size: As the size of the random sample increases, the margin of error shrinks.
Population standard deviation “σ”: The more spread there is in the population, the wider our
margin of error will be for a given level of confidence.
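The three factors can be seen numerically with the z-based margin of error E = z_c·σ/√n. The sketch below uses made-up values (σ = 10, n = 100) purely for illustration, and `margin_of_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(c: float, sigma: float, n: int) -> float:
    """E = z_c * sigma / sqrt(n) for a known population sigma."""
    z = NormalDist().inv_cdf(1 - (1 - c) / 2)
    return z * sigma / sqrt(n)


# Hypothetical baseline: 90% confidence, sigma = 10, n = 100.
base = margin_of_error(0.90, 10.0, 100)
print(f"baseline          E = {base:.3f}")
print(f"higher confidence E = {margin_of_error(0.95, 10.0, 100):.3f}")  # wider
print(f"larger sample     E = {margin_of_error(0.90, 10.0, 400):.3f}")  # narrower
print(f"more spread       E = {margin_of_error(0.90, 20.0, 100):.3f}")  # wider
```

Quadrupling n halves E, because E shrinks with the square root of the sample size.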
Computing a Point Estimate for a Mean
Pennies minted after 1982 are made from 97.5% zinc and 2.5% copper.
The following data represent the weights (in grams) of 17 randomly selected pennies minted after 1982.
2.46 2.47 2.49 2.48 2.50 2.44 2.46 2.45 2.49
2.47 2.45 2.46 2.45 2.46 2.47 2.44 2.45
Based on this sample, create a point estimate for the population mean weight of all pennies minted after 1982.
The sample mean is:
$\bar{x} = \frac{2.46 + 2.47 + \cdots + 2.45}{17} = 2.464$ g
Our sample mean, 2.464 grams, forms a “point estimate” of the pop mean weight μ of all pennies minted after 1982.
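The same point estimate can be checked with Python's standard `statistics` module. The sketch also computes the sample standard deviation, which the later penny slides round to 0.02 g. The variable names are illustrative.

```python
from statistics import mean, stdev

# Weights (grams) of the 17 sampled pennies from the slide.
weights = [2.46, 2.47, 2.49, 2.48, 2.50, 2.44, 2.46, 2.45, 2.49,
           2.47, 2.45, 2.46, 2.45, 2.46, 2.47, 2.44, 2.45]

x_bar = mean(weights)   # point estimate of the population mean
s = stdev(weights)      # sample standard deviation (divides by n - 1)

print(f"n = {len(weights)}, x-bar = {x_bar:.3f} g, s = {s:.4f} g")
```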
“Student’s t-Distribution”
If a sample of size “n < 30” is drawn from a population that follows a normal distribution, the distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
follows Student’s t-distribution with (n − 1) degrees of freedom, where $\bar{x}$ is the sample mean and “s” is the sample standard deviation.
[Figure: Histogram for z]
[Figure: Histogram for t]
1. The t-distribution is different for different degrees of freedom.
2. As the sample size n increases, the distribution curve of “t” approaches the standard normal curve.
3. The area under the curve is 1, and the curve is symmetric about 0.
4. As t increases or decreases without bound, the graph approaches, but never equals, zero.
The Student t-Distribution
3. As the number of degrees of freedom (d.f.) increases, the t-distribution approaches the normal distribution.
4. Above 30 d.f., the t-distribution is close to the standard normal z-distribution.
The tails in the t-distribution are “thicker” (further from the horizontal axis) than those in the standard normal distribution, so more of the area under the curve lies far from the center.
[Figure: t-distribution curves for d.f. = 2 and d.f. = 5 compared with the standard normal curve, centered at t = 0.]
SEE Table VI !!
Critical Values of t
Find the critical value $t_c$ for a 95% confidence interval when the sample size is 5.
d.f. = n − 1 = 4 and c = 0.95, so each tail has area α/2 = 0.025.
95% of the area under the t-distribution curve with 4 degrees of freedom lies between t = ±2.776.
[Figure: t-distribution with c = 0.95 between $-t_c = -2.776$ and $t_c = 2.776$.]
The figure shows the graph of the t-distribution with 10 degrees of freedom.
The area under the curve to the right of t is shaded.
See Table VI: the value of $t_{0.20}$ with 10 degrees of freedom is 0.879.
This is the critical t for a 60% confidence interval with sample size n = 11.
A point estimate is the value of a sample statistic that estimates the value of a population parameter.
For example, the sample mean $\bar{x}$ is the point estimate for the population mean μ.
Confidence Interval for μ
Data come from a simple random sample or randomized experiment.
Sample size is small relative to the population size: n ≤ 0.05N
The data come from a population that is normally distributed, or the sample size is large.
Lower/Upper bound: $\bar{x} \pm E$, where $E = t_{crit} \cdot \frac{s}{\sqrt{n}}$
Using $t_{crit}$ with (n − 1) d.f.
Constructing a Confidence Interval
In a random sample of 20 customers at a local junk food restaurant, the mean waiting time to order is 95 seconds, and the standard deviation is 21 seconds. Assume the wait times are normally distributed.
Construct a 90% confidence interval for the mean wait time of all (pop) customers.
n = 20, $\bar{x} = 95$, s = 21, d.f. = 19, $t_c$ = ±1.729
$E = t_c \frac{s}{\sqrt{n}} = (1.729)\left(\frac{21}{\sqrt{20}}\right) = 8.1$ sec
$\bar{x} \pm E = 95 \pm 8.1$, so 86.9 sec < μ < 103.1 sec
So, based on this sample, we are 90% confident that the mean wait time for all customers is between 86.9 and 103.1 seconds.
Constructing a Confidence Interval
Weight (in grams) of Pennies
Using this sample of 17 pennies, construct a 99% confidence interval about the population mean weight (grams) of pennies minted after 1982.
Use $\bar{x}$ = 2.464 g and s = 0.02 g
2.46 2.47 2.49 2.48 2.50 2.44 2.46 2.45 2.49
2.47 2.45 2.46 2.45 2.46 2.47 2.44 2.45
$t_{\alpha/2} = t_{0.005} = 2.921$ (16 d.f.)
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = 2.921 \cdot \frac{0.02}{\sqrt{17}} = 0.0142$
Lower bound: 2.464 − 0.0142 = 2.4498
Upper bound: 2.464 + 0.0142 = 2.4782
Based on this sample, we are 99% confident that the mean weight of pennies minted after 1982 is between 2.45 and 2.48 grams.
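Both t-interval examples follow the same arithmetic, sketched below. Python's standard library has no t-distribution quantile function, so the critical value is passed in as read from Table VI. The function name `t_interval` is an illustrative assumption.

```python
from math import sqrt


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Return (lower, upper, E) with E = t_crit * s / sqrt(n).

    t_crit is the table value for n - 1 degrees of freedom.
    """
    e = t_crit * s / sqrt(n)
    return x_bar - e, x_bar + e, e


# Wait times: n = 20, 90% confidence, t = 1.729 with 19 d.f.
low, high, e = t_interval(95, 21, 20, 1.729)
print(f"wait time: ({low:.1f}, {high:.1f}) sec, E = {e:.1f}")

# Pennies: n = 17, 99% confidence, t = 2.921 with 16 d.f.
low, high, e = t_interval(2.464, 0.02, 17, 2.921)
print(f"pennies:   ({low:.4f}, {high:.4f}) g, E = {e:.4f}")
```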
The sample size required to estimate the population mean, μ, with a level of confidence “c” and a specified margin of error, E, is given by
$n = \left(\frac{z_{\alpha/2} \cdot s}{E}\right)^2$
where “n” is rounded up to the nearest whole number.
Determining the Sample Size
How large a sample of pennies would be required to estimate the mean weight of a penny manufactured after 1982 with a maximum error of 0.005 grams with 99% confidence?
Assume: s = 0.02 g
c = 99%, α = 1%, so $z_{\alpha/2} = z_{0.005} = 2.576$
s = 0.02, E = 0.005
$n = \left(\frac{z_{\alpha/2} \cdot s}{E}\right)^2 = \left(\frac{2.576(0.02)}{0.005}\right)^2 = 106.17$
Rounding up, we find the minimum sample size is n = 107.
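A minimal sketch of the sample-size formula for a mean, with the critical value supplied from the table. The name `sample_size_mean` is hypothetical.

```python
from math import ceil


def sample_size_mean(z: float, s: float, e: float) -> int:
    """Minimum n so that the margin of error is at most e: (z*s/e)^2, rounded up."""
    if e <= 0:
        raise ValueError("margin of error must be positive")
    return ceil((z * s / e) ** 2)


# Penny example: 99% confidence (z = 2.576), s = 0.02 g, E = 0.005 g.
print(sample_size_mean(2.576, 0.02, 0.005))  # 107
```

Because n grows with the square of 1/E, halving the allowed error roughly quadruples the required sample.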
Section 9.3 Estimating a Population Standard Deviation
If a simple random sample of size n is obtained from a normally distributed population with mean μ and standard deviation σ, then
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
has a chi-square distribution with (n − 1) degrees of freedom.
Characteristics of the Chi-Square Distribution
1. It is not symmetric (skewed right).
2. The shape depends on the degrees of freedom (d.f.), just like the Student’s t-distribution.
3. As the number of d.f. increases, the chi-square distribution becomes more nearly symmetric (closer to normal).
4. The values of χ² are always nonnegative (≥ 0).
Finding Critical Values for the Chi-Square Distribution: TABLE VII
Find the chi-square values that separate the middle 95% of the distribution from the 2.5% in each tail. Assume 18 degrees of freedom (d.f.).
$\chi^2_{0.975} = 8.231$
$\chi^2_{0.025} = 31.526$
Confidence Interval for σ²
If a simple random sample of size n is taken from a normal population with mean μ and standard deviation σ, then a (1 − α)·100% confidence interval for σ² is:
Lower bound: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}}$
Upper bound: $\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
Note: To find a (1 − α)·100% confidence interval about “σ”, take the square root of the lower bound and upper bound.
Constructing a Confidence Interval for a Population Variance and Standard Deviation
One way to measure stock risk (volatility) is through the standard deviation of several stock prices. The following data represent the weekly gain/loss (%) of Microsoft stock for 15 randomly selected weeks.
Compute the 90% confidence interval for the std dev (volatility/risk) of Microsoft stock.
5.34 9.63 –2.38 3.54 –8.76 2.12 –1.95 0.27 0.15 5.84 –3.90 –3.80 2.85 –1.61 –3.31
(Note: the data is approximately normal with no outliers)
$\bar{x}$ = 0.2687%, s = 4.6974%, s² = 22.0659
Table VII: $\chi^2_{0.95} = 6.571$ and $\chi^2_{0.05} = 23.685$ for 14 d.f.
Lower variance bound: $\frac{14(22.0659)}{23.685} = 13.04$
Upper variance bound: $\frac{14(22.0659)}{6.571} = 47.01$
We are 90% confident that the population stock variance is in the interval (13.04, 47.01), so the stock standard deviation (volatility) interval is (3.61%, 6.86%).
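The variance interval can be recomputed from the raw data as a check. In this sketch the chi-square critical values are taken from Table VII and passed in, since the standard library does not provide chi-square quantiles. The function and parameter names are illustrative assumptions.

```python
from math import sqrt
from statistics import variance


def variance_interval(s2: float, n: int, chi2_small: float, chi2_large: float):
    """Interval for sigma^2: dividing by the larger critical value gives the lower bound."""
    return (n - 1) * s2 / chi2_large, (n - 1) * s2 / chi2_small


# Weekly gain/loss (%) of the stock for 15 randomly selected weeks.
returns = [5.34, 9.63, -2.38, 3.54, -8.76, 2.12, -1.95, 0.27,
           0.15, 5.84, -3.90, -3.80, 2.85, -1.61, -3.31]

s2 = variance(returns)  # sample variance, n - 1 in the denominator
# 90% confidence, 14 d.f.: table values 6.571 and 23.685.
low, high = variance_interval(s2, len(returns), 6.571, 23.685)

print(f"s^2 = {s2:.4f}")
print(f"variance interval: ({low:.2f}, {high:.2f})")
print(f"std dev interval:  ({sqrt(low):.2f}%, {sqrt(high):.2f}%)")
```

Unlike the z- and t-intervals, this interval is not symmetric about s², because the chi-square distribution is skewed right.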
Section 9.4 Which Procedure Do I Use?
This next slide is very important.
You should print it and study it.
To Use Normal or t-Distribution?
Is n ≥ 30?
  Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$. If σ is unknown, use s instead.
  No: Is the population normally, or approximately normally, distributed?
    No: You cannot use the normal distribution or the t-distribution.
    Yes: Is σ known?
      Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$.
      No: Use the t-distribution with $E = t_c \frac{s}{\sqrt{n}}$ and n – 1 degrees of freedom.
To Use Normal or t-Distribution?
Determine whether to use the normal distribution, the t-distribution, or neither.
a.) n = 50, the distribution is skewed, s = 2.5
The normal distribution (z-interval) would be used because the sample size is > 30.
b.) n = 25, the distribution is skewed, s = 52.9
Neither distribution would be used because n < 30 and the distribution is skewed (not normal), so we cannot use a t- or z-interval.
c.) n = 25, the distribution is normal, σ = 4.12
The normal distribution (z-interval) would be used because although n < 30, the population is normal and the pop standard deviation is known.
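The decision procedure for choosing between the normal distribution, the t-distribution, or neither can be written as a small function. This is a sketch of the flowchart as reconstructed from the slide. The function name and the string labels it returns are illustrative choices.

```python
def choose_distribution(n: int, population_normal: bool, sigma_known: bool) -> str:
    """Return 'normal', 't', or 'neither' following the slide's flowchart."""
    if n >= 30:
        # Large sample: z-interval (use s in place of sigma if sigma is unknown).
        return "normal"
    if not population_normal:
        # Small sample from a non-normal population: no valid interval here.
        return "neither"
    # Small sample from a (roughly) normal population.
    return "normal" if sigma_known else "t"


# The three practice cases from the slide.
print(choose_distribution(50, population_normal=False, sigma_known=False))  # normal
print(choose_distribution(25, population_normal=False, sigma_known=False))  # neither
print(choose_distribution(25, population_normal=True, sigma_known=True))    # normal
```

The only branch that leads to the t-distribution is a small sample from a normal population with σ unknown, which is the situation in the wait-time and penny examples.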