Chapter 19 Confidence intervals for proportions
math2200
Sample proportion
• A good estimate of the population proportion p
• Natural sampling variability
• How does the sample proportion vary from sample to sample?
– Approximately normal: $\hat{p} \sim N\left(p, \sqrt{pq/n}\right)$, where $q = 1 - p$
– About 95% of all samples have p̂ within 2 SD of p
– Population proportion, p, is unknown
– Standard deviation is unknown
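The sampling variability described above can be illustrated with a small simulation. This is only a sketch: the population proportion 0.3, the sample size 200, and the function name `simulate_sample_proportions` are hypothetical choices made for illustration, since in practice p is unknown.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample's proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

p, n = 0.3, 200  # hypothetical population proportion and sample size
p_hats = simulate_sample_proportions(p, n, num_samples=2000)

# SD predicted by the normal model: sqrt(pq/n)
theoretical_sd = math.sqrt(p * (1 - p) / n)
# SD actually observed among the simulated sample proportions
empirical_sd = statistics.pstdev(p_hats)
# Share of simulated samples whose p-hat lies within 2 SD of p
within_2_sd = sum(abs(ph - p) <= 2 * theoretical_sd for ph in p_hats) / len(p_hats)

print(f"theoretical SD: {theoretical_sd:.4f}")
print(f"empirical SD:   {empirical_sd:.4f}")
print(f"share within 2 SD of p: {within_2_sd:.3f}")
```

The empirical SD should land close to the theoretical value, and roughly 95% of the simulated sample proportions should fall within 2 SD of p.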
We can estimate!
• Standard error estimates standard deviation: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$
• If p̂ estimates p accurately, by the 68-95-99.7% Rule, we know
– about 68% of all samples will have p̂’s within 1 SE of p
– about 95% of all samples will have p̂’s within 2 SEs of p
– about 99.7% of all samples will have p̂’s within 3 SEs of p
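The percentages in the 68-95-99.7% Rule come from the normal model. A minimal sketch that recovers them with Python's standard library follows; the helper name `share_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

def share_within(num_se):
    """Probability that a normal variable lands within num_se SDs of its mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(num_se) - z.cdf(-num_se)

for k in (1, 2, 3):
    print(f"within {k} SE: {share_within(k):.4f}")
```

The three values are approximately 68%, 95%, and 99.7%, which is where the rule gets its name.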
What can we say about p?
• The true value of p is p̂. (Wrong!)
• The true value of p is close to p̂. (OK!)
• We are 95% confident that the true value of p is between $\hat{p} - 2\,SE(\hat{p})$ and $\hat{p} + 2\,SE(\hat{p})$
• This is a 95% confidence interval for p
• In this specific context, we can also call it a one-proportion z-interval
Margin of error
• The extent of the confidence interval on either side of p̂ is called the margin of error
– The margin of error for our 95% confidence interval is 2 SE.
– For the 99.7% confidence interval, the margin of error is 3 SE.
– The more confident we want to be, the larger the margin of error must be.
– Certainty versus precision
• Confidence interval: estimate ± margin of error
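The "estimate ± margin of error" form can be sketched as a short function. The sample values (p̂ = 0.4, n = 600) and the function name `interval_from_se` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def interval_from_se(p_hat, n, num_se):
    """Return (lower, upper, margin_of_error) for p_hat ± num_se standard errors."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p-hat
    me = num_se * se                         # margin of error
    return p_hat - me, p_hat + me, me

p_hat, n = 0.4, 600  # hypothetical sample proportion and sample size

low95, high95, me95 = interval_from_se(p_hat, n, num_se=2)    # about 95% confidence
low997, high997, me997 = interval_from_se(p_hat, n, num_se=3)  # about 99.7% confidence

print(f"2 SE interval: ({low95:.3f}, {high95:.3f}), ME = {me95:.3f}")
print(f"3 SE interval: ({low997:.3f}, {high997:.3f}), ME = {me997:.3f}")
```

The 3 SE interval is wider than the 2 SE interval built from the same sample: more certainty, less precision.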
What Does “95% Confidence” Really Mean?
• Each confidence interval uses a sample statistic to estimate a population parameter.
• But, since samples vary, the statistics we use, and thus the confidence intervals we construct, vary as well.
What Does “95% Confidence” Really Mean? (cont.)
• The figure to the right shows that some of our confidence intervals capture the true proportion (the green horizontal line), while others do not.
What Does “95% Confidence” Really Mean? (cont.)
• Our confidence is in the process of constructing the interval, not in any one interval itself.
• Thus, we expect 95% of all “95%-confidence intervals” to contain the true parameter that they are estimating.
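The idea that confidence is in the process can be checked by simulation: build many intervals from many samples and count how often they capture the true proportion. In this sketch the true proportion 0.4, the sample size 500, and the function name `coverage_rate` are assumptions for illustration; in a real study p is unknown, which is why only a simulation can check coverage this way.

```python
import math
import random

def coverage_rate(p, n, num_intervals, z_star=1.96, seed=1):
    """Fraction of simulated intervals p_hat ± z_star * SE that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_intervals):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z_star * se <= p <= p_hat + z_star * se:
            hits += 1  # this interval captured the true proportion
    return hits / num_intervals

rate = coverage_rate(p=0.4, n=500, num_intervals=2000)
print(f"share of intervals containing p: {rate:.3f}")
```

The share should come out near 0.95. Any single interval either contains p or it does not; the 95% describes the long-run behavior of the method.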
Margin of Error: Certainty vs. Precision
• We can claim, with 95% confidence, that the interval $\hat{p} \pm 2\,SE(\hat{p})$ contains the true population proportion.
– The extent of the interval on either side of p̂ is called the margin of error (ME).
• In general, confidence intervals have the form estimate ± ME.
• The more confident we want to be, the larger our ME needs to be.
Margin of Error: Certainty vs. Precision (cont.)
• To be more confident, we wind up being less precise.
– We need more values in our confidence interval to be more certain.
• Because of this, every confidence interval is a balance between certainty and precision.
• The tension between certainty and precision is always there.
– Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.
Margin of Error: Certainty vs. Precision
• The choice of confidence level is somewhat arbitrary, but keep in mind this tension between certainty and precision when selecting your confidence level.
• The most commonly chosen confidence levels are 90%, 95%, and 99% (but any percentage can be used).
Critical value
• The number of SE’s affects the margin of error
• This number is called the critical value, denoted by z*
• For a 95% confidence interval, the critical value is z* ≈ 1.96 (more precise than the rule-of-thumb value of 2)
• For a 90% confidence interval, the critical value is z* ≈ 1.645
Critical Values (cont.)
• Example: For a 90% confidence interval, the critical value is 1.645:
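Critical values can be looked up from the normal model rather than from a table. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the function name `critical_value` is an assumption made for this illustration.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z* such that the central confidence_level of the normal model lies within ±z*."""
    tail = (1 - confidence_level) / 2      # area left in each tail
    return NormalDist().inv_cdf(1 - tail)  # quantile that cuts off the upper tail

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

Higher confidence levels give larger critical values, and therefore larger margins of error for the same sample.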
Margin of error
• A confidence interval that is too wide is not very useful
• How large a margin of error can we tolerate?
– Reducing the level of confidence makes the margin of error smaller (not a good idea, though)
– You should think about this ahead of time, when you design your study. Choose a larger sample!
– Generally, a margin of error of 5% or less is acceptable
Sample size
• Suppose a candidate is planning a poll and wants to estimate voter support within 3% with 95% confidence. How large a sample does she need?
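• $ME = z^* \sqrt{\hat{p}\hat{q}/n}$, so we need $0.03 = 1.96 \sqrt{\hat{p}\hat{q}/n}$
• But we do not know p̂ before we conduct the poll!
• However, $\hat{p}\hat{q} = \hat{p}(1 - \hat{p}) \le 0.25$, with the maximum at p̂ = 0.5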
Sample size
• To be conservative, we set p̂ = 0.5 (so p̂q̂ = 0.25)
• Solving $0.03 = 1.96 \sqrt{0.5 \times 0.5 / n}$ for n, we have n ≈ 1067.1
• So, we will need at least 1068 respondents to keep the margin of error as small as 3% with a confidence level of 95%
– In practice, you may need more since many people do not respond
– If the response rate is too low, then the study might be a voluntary response study, which can be biased
• To cut the SE in half, we must quadruple the sample size n
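The sample size calculation above can be sketched by solving ME = z* · sqrt(p̂q̂/n) for n. The function name `required_sample_size` and its default arguments are assumptions for this illustration; the conservative guess p̂ = 0.5 follows the slide.

```python
import math

def required_sample_size(margin_of_error, z_star=1.96, p_guess=0.5):
    """Solve ME = z* * sqrt(p*q/n) for n; return (exact n, n rounded up)."""
    n_exact = (z_star ** 2) * p_guess * (1 - p_guess) / margin_of_error ** 2
    return n_exact, math.ceil(n_exact)  # round up: a fractional respondent is not possible

n_exact, n_needed = required_sample_size(0.03)
print(f"n = {n_exact:.1f}, so at least {n_needed} respondents")

# Halving the margin of error (3% -> 1.5%) roughly quadruples the required n
n_exact_half, n_needed_half = required_sample_size(0.015)
print(f"for ME = 1.5%: at least {n_needed_half} respondents")
```

Because n sits under a square root, halving the margin of error requires about four times as many respondents.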
Assumptions
• Independence
– Plausible independence condition
– Randomization condition
  • Sampled at random?
  • From a properly randomized experiment?
– 10% condition
  • When the sample is no more than 10% of the population, sampling without replacement can be viewed roughly the same as sampling with replacement
Assumptions
• Sample size assumption
– Normal approximation comes from the CLT
– Sample size must be large enough
– Check the success/failure condition
  • $n\hat{p} \ge 10$ and $n\hat{q} \ge 10$
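The success/failure check is simple enough to sketch in code. The threshold of 10 expected successes and 10 expected failures is the usual textbook convention and is an assumption here, as is the function name `success_failure_ok`.

```python
def success_failure_ok(n, p_hat, threshold=10):
    """True when both n*p_hat and n*q_hat reach the threshold (assumed to be 10)."""
    expected_successes = n * p_hat
    expected_failures = n * (1 - p_hat)
    return expected_successes >= threshold and expected_failures >= threshold

# Hypothetical cases: a large poll passes, a small sample with a rare outcome fails
large_poll_ok = success_failure_ok(n=537, p_hat=0.53)
small_sample_ok = success_failure_ok(n=30, p_hat=0.10)
print(large_poll_ok, small_sample_ok)
```

When the check fails, the normal model is a poor description of how p̂ varies, so a z-interval should not be trusted.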
Example
• In May 2002, the Gallup Poll asked 537 randomly sampled adults “generally speaking, do you believe the death penalty is applied fairly or unfairly in this country today?”
• 53% answered “fairly”
• 7% said “don’t know”
• What can we conclude from this survey?
Example
• Plausible independence? (Yes.)
• Randomization condition? (Yes.)
• 10% condition? (Yes.)
• Success/failure condition (Yes.)
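• Let’s find a 95% confidence interval
– Standard error: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n} = \sqrt{0.53 \times 0.47 / 537} \approx 0.0215$
– Margin of error: $ME = 1.96 \times SE(\hat{p}) \approx 0.042$
– 95% confidence interval: $0.53 \pm 0.042$, or (0.488, 0.572)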
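The Gallup example can be worked through numerically. This sketch uses the figures from the slide (n = 537, 53% answered "fairly") and assumes the critical value 1.96 for 95% confidence; the variable names are chosen for illustration.

```python
import math

n = 537       # adults sampled by the Gallup Poll
p_hat = 0.53  # proportion who answered "fairly"
z_star = 1.96 # assumed critical value for 95% confidence

se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p-hat
me = z_star * se                         # margin of error
lower, upper = p_hat - me, p_hat + me    # one-proportion z-interval

print(f"SE = {se:.4f}")
print(f"ME = {me:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions, we are 95% confident that between about 48.8% and 57.2% of all adults would answer "fairly".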
A deeper understanding of confidence interval
• How should we interpret the 95% confidence?
– Randomness is from sample to sample
– This does not say p is random
  • We are the ones who are uncertain, not the parameter.
– The interval itself is random (varies from sample to sample)
– This interval is about p, not about p̂
• The sample proportion varies
– Report confidence interval or margin of error
• Always pay attention to the required assumptions
– Independence
– Sample size
• In practice, watch out for biased samples
What have we learned?
• Finally we have learned to use a sample to say something about the world at large.
• This process (statistical inference) is based on our understanding of sampling models, and will be our focus for the rest of the book.
• In this chapter we learned how to construct a confidence interval for a population proportion.
• And, we learned that interpretation of our confidence interval is key—we can’t be certain, but we can be confident.