1 chapter 19 confidence intervals for proportions

1

Chapter 19

Confidence Intervals for Proportions

2

Sampling Distribution Models

Population: the population parameter p is unknown.

Sample: the sample statistic p̂ is computed from the data.

Inference: use p̂ to learn about p.

3

Sampling Distribution of p̂

If our two conditions hold, then we know:
Shape: approximately Normal.
Center: the mean is p.
Spread: the standard deviation is $SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$.

4


Sampling Distribution of p̂

Recall the two conditions:
10% Condition: The size of the sample should be less than 10% of the size of the population.
Success/Failure Condition: np and n(1 – p) should both be at least 10.

5


Sampling Distribution of p̂

So, if we know the population proportion p and the sample size is big enough, then we can reason about the possible p̂'s by using the Normal model.

We can find the probability of obtaining a particular p̂.

We can determine whether observing a particular p̂ is unlikely or not.

6

Sampling Distribution of p̂

For example, by using the 68-95-99.7 Rule we can say something like this: 95% of the time the sample proportion, p̂, will be between

$p - 2\sqrt{\frac{p(1-p)}{n}}$ and $p + 2\sqrt{\frac{p(1-p)}{n}}$
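The 95% range above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `p_hat_95_range` and the values p = 0.5 and n = 1000 are assumptions chosen only to show the arithmetic.

```python
import math

def p_hat_95_range(p, n):
    """Range holding about 95% of sample proportions (68-95-99.7 Rule)."""
    sd = math.sqrt(p * (1 - p) / n)  # standard deviation of p-hat
    return p - 2 * sd, p + 2 * sd

# Hypothetical values, chosen only for illustration.
low, high = p_hat_95_range(p=0.5, n=1000)
print(f"About 95% of sample proportions fall between {low:.3f} and {high:.3f}")
```

Note that this calculation needs the true p, which is exactly what we usually do not have.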

7

Inference

Unfortunately the population parameter, p, is usually unknown. We would like to use a sample to tell us something about p.

Use the sample proportion, p̂, (as our best guess) to make inferences about the population proportion p.

8

War on Terrorism

According to a March 2nd 2006 ABC News/Washington Post poll of 1,000 adult Americans, 46% of those surveyed disapprove of the way that Bush is handling the US campaign against terrorism.

This poll was conducted by The Washington Post, so let's assume (hope) that they randomized correctly and obtained a representative sample. What is the population here? What is the population parameter?

9

War on Terrorism

So what is the sampling distribution of the proportion of US adults who disapprove of the way that Bush is handling the US campaign against terrorism?

What do we know? n = 1,000 and p̂ = 0.46.

If the conditions hold (do they?), then $\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{1000}}\right)$.

10

War on Terrorism

What don’t we know?

p - we don’t know the actual proportion of US adults who disapprove of the way that Bush is handling the US campaign against terrorism.

Since we don’t know p, we also don’t know the standard deviation of p̂, $\sqrt{\frac{p(1-p)}{n}}$.

What can we do? We can use p̂ to find an estimate of p.

11

Estimation

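We expect that p and p̂ should be similar, so we can use p̂ to estimate p.

When we use p̂ to estimate the standard deviation, this is called the standard error of p̂. With q = 1 – p and q̂ = 1 – p̂:

$SD(\hat{p}) = \sqrt{\frac{pq}{n}}$ is estimated by $SE(\hat{p}) = \sqrt{\frac{\hat{p}\hat{q}}{n}}$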
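For the poll described earlier (p̂ = 0.46, n = 1,000), the standard error can be computed directly. A minimal Python sketch; the function name `standard_error` is an illustrative choice, not something from the slides.

```python
import math

def standard_error(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    return math.sqrt(p_hat * q_hat / n)

se = standard_error(p_hat=0.46, n=1000)  # poll values from the slides
print(f"SE(p-hat) = {se:.4f}")
```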

12

What does this tell us?

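Figure: Normal model for p̂ centered at p, with axis marks at $p \pm \sqrt{\frac{\hat{p}\hat{q}}{n}}$, $p \pm 2\sqrt{\frac{\hat{p}\hat{q}}{n}}$, and $p \pm 3\sqrt{\frac{\hat{p}\hat{q}}{n}}$.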

13

What does this tell us?

Once again using the 68-95-99.7 Rule, we know that:

About 68% of the time (i.e. for about 68% of random samples), p̂ will be no more than 1 SE(p̂) away from p.

About 95% of the time (i.e. for about 95% of random samples), p̂ will be no more than 2 SE(p̂) away from p.

About 99.7% of the time (i.e. for about 99.7% of random samples), p̂ will be no more than 3 SE(p̂) away from p.

14

What does this tell us?

Let’s think about the second interval (95%).

Start with p̂ (because we know this value) and go out about 2 SE(p̂) in either direction.

We can be 95% sure (confident) that the interval will contain p.

15

War on Terrorism

About 95% of the time (i.e. for about 95% of random samples), p̂ will be no more than 0.032 away from p.

Start at p̂ = 0.46 and go out about 0.032 in each direction.

We can be 95% confident that this interval will contain p.

We are 95% confident that the true proportion of adults who are dissatisfied with the way the war on terrorism is going is between 42.8% and 49.2%.
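The arithmetic behind this interval can be checked with a short Python sketch. It uses the poll values from the slides (p̂ = 0.46, n = 1,000) and the "about 2 standard errors" approximation; the variable names are illustrative.

```python
import math

p_hat, n = 0.46, 1000                    # poll values from the slides
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p-hat
me = 2 * se                              # "about 2 SE" from the 68-95-99.7 Rule
interval = (p_hat - me, p_hat + me)
print(f"ME = {me:.3f}, interval = ({interval[0]:.3f}, {interval[1]:.3f})")
```

The interval works out to roughly 0.428 to 0.492, that is, 46% give or take about 3.2 percentage points.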

16

Interpretation

Plausible values for the population parameter p.

95% confidence in the process that produced this interval.

17

Statistical Confidence

Two things can happen when we create the interval as above:

1. p can either be in the interval (which will happen in about 95% of the intervals).

2. p can be outside the interval (which will happen in about 5% of the intervals).

One thing that can’t happen:

1. The parameter value can’t change!

18

Statistical Confidence

We don’t know which is true. So, we rely on our statistical confidence. The best we can say is, “We are 95%

confident that the true population proportion lies within the interval we construct.”

19

Statistical Confidence

WE ARE NOT SAYING that p is in our interval 95% of the time. That phrase implies that p is “moving around”, which we already said cannot happen (remember, p is some unknown fixed value).

If we were to calculate lots of intervals, the population parameter would be in about 95% of them.

20

Confidence Intervals

Confidence intervals come from the fact that we could take multiple samples and calculate multiple 95% confidence intervals and, if we were using the same method to find all the intervals, we would expect that about 95% of the intervals we constructed would contain the true parameter (population proportion).

21

95% Confidence

If one were to repeatedly sample 1000 registered voters at random and compute a 95% confidence interval for each sample, about 95% of the intervals produced would contain the population proportion p.
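This repeated-sampling idea can be tried out in a small simulation. The Python sketch below is only an illustration: the true proportion `true_p = 0.46`, the number of trials, and the seed are all assumed values, and the interval uses the "about 2 SE" approximation.

```python
import math
import random

def covers(p, n, rng):
    """Draw one random sample and report whether its interval contains p."""
    successes = sum(rng.random() < p for _ in range(n))
    p_hat = successes / n
    me = 2 * math.sqrt(p_hat * (1 - p_hat) / n)  # about 2 standard errors
    return p_hat - me <= p <= p_hat + me

rng = random.Random(19)                 # fixed seed so the run is repeatable
true_p, n, trials = 0.46, 1000, 2000    # true_p is a hypothetical value
coverage = sum(covers(true_p, n, rng) for _ in range(trials)) / trials
print(f"{coverage:.1%} of the intervals contain p")
```

Each simulated interval either contains `true_p` or it does not; only the long-run fraction is close to 95%.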

22

Confidence Intervals

So, what does the interval look like?

Confidence intervals for the population proportion have the form

$\hat{p} \pm ME(\hat{p})$

For 95% confidence intervals,

$ME(\hat{p}) \approx 2\,SE(\hat{p})$

23

Confidence Intervals

ME is the margin of error: the extent of the interval on either side of our estimate.

In general,

$ME(\hat{p}) = z^* \, SE(\hat{p})$

where $z^*$ is called a critical value.

24

Construction of CI

p̂, the point estimate, is the center of the interval. It merely shifts the interval along the axis.

z*, the critical value, is the number of multiples of the standard error needed to form the desired CI. This will depend on the level of confidence you want.

25

How to find z*

Need z-tables.

Based on the Normal model: $\hat{p} \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$

Between what two z-values do 95% of the observations lie on N(0,1)?

26

How to find z*

The z-values for a 95% confidence interval are not exactly 2 and –2. We use these numbers as an approximation.

1.96 and –1.96 are more exact.

So, a 95% CI for the population proportion looks like

$\hat{p} \pm (1.96)\,SE(\hat{p})$

27

How to find z*

What does a 99% CI for p look like?

What does a 90% CI for p look like?

What does an 80% CI for p look like?
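Critical values for these confidence levels can be read from a z-table or computed. A minimal Python sketch, assuming the standard library's `statistics.NormalDist` (Python 3.8 or later) stands in for the z-table; the function name `critical_value` is illustrative.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the central `confidence` area of N(0, 1) lies in (-z*, z*)."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

Each CI then has the same form as the 95% one, p̂ ± z* SE(p̂), with the corresponding z* in place of 1.96.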

28

Construction of CI

$\sqrt{\frac{\hat{p}\hat{q}}{n}}$, the standard error, is the estimate of the standard deviation of p̂.

$z^*\sqrt{\frac{\hat{p}\hat{q}}{n}}$, the margin of error, is ½ the width of the CI. This merely determines the width of the interval.

What happens to ME if n increases?

29

Construction of CI

So, the CI for p looks like

$\left(\hat{p} - z^*\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z^*\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$
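The interval formula translates almost line for line into code. The Python sketch below is one possible implementation under stated assumptions: the function name `proportion_ci` is hypothetical, z* comes from `statistics.NormalDist` rather than a table, and the example values are the poll figures from the earlier slides.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """One-proportion z-interval: p_hat +/- z* * sqrt(p_hat * q_hat / n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)         # margin of error
    return p_hat - me, p_hat + me

low, high = proportion_ci(p_hat=0.46, n=1000)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Because z* = 1.96 is slightly smaller than 2, this interval is a little narrower than the one built with the 68-95-99.7 Rule.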

30


Now we know what the interval looks like, but how do we know we can do all this?

It is based on the Normal model.

Did we check any assumptions beforehand?

What were the assumptions needed for the sampling distribution of the sample proportion?

31


We don’t know p or q. So, check the following assumptions:

Random sample
• Were data sampled randomly or are they from a randomized experiment?

Independence
• Do data values affect one another?

n < 10% of population size

Success/Failure
• $n\hat{p} \geq 10$ and $n\hat{q} \geq 10$
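Two of these checks are purely numerical and can be automated; randomization and independence still have to be judged from how the data were collected. A minimal Python sketch, in which the function name `check_conditions` is hypothetical and `population_size` is only a rough placeholder for the number of US adults, not a figure from the slides.

```python
def check_conditions(p_hat, n, population_size):
    """Return the two checks that can be computed from the numbers alone."""
    successes = n * p_hat          # n * p-hat
    failures = n * (1 - p_hat)     # n * q-hat
    return {
        "10% condition": n < 0.10 * population_size,
        "success/failure": successes >= 10 and failures >= 10,
    }

# population_size is a rough placeholder, used only for illustration.
checks = check_conditions(p_hat=0.46, n=1000, population_size=200_000_000)
print(checks)
```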

32

Steps to Forming a CI

1) Describe the population parameter of concern.
Ex: p = proportion of adults dissatisfied with the way the war on terrorism is going

2) Specify the confidence interval criteria
a) check assumptions
• random sample
• independence
• n < 10% of population size
• success/failure

33

Steps to a Confidence Interval

b) state the level of confidence
c) determine the critical value, z*

3) Collect and present sample information
a) collect the data from the population
b) find the point estimate, p̂

4) Determine the confidence interval
a) find the standard error, $\sqrt{\frac{\hat{p}\hat{q}}{n}}$

34

Steps to a Confidence Interval

b) find the margin of error, $z^*\sqrt{\frac{\hat{p}\hat{q}}{n}}$

c) find the interval, $\left(\hat{p} - z^*\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z^*\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$

d) describe your results; interpret the interval
• I am ___% confident that the true population proportion falls within the interval I constructed.

35

Example Ch. 19 #7

True or False?
a) For a given sample size, higher confidence means a smaller margin of error.
b) For a specified confidence level, larger samples provide smaller margins of error.
c) For a fixed margin of error, larger samples provide greater confidence.
d) For a given confidence level, halving the margin of error requires a sample twice as large.

36

Example

a) For a given sample size, higher confidence means a smaller margin of error.

Solution
$ME = z^* \, SE(\hat{p}) = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- fixed n implies fixed SE
- higher confidence implies higher z* (see Table T)
- so, with fixed SE and increasing z*, ME increases; the statement is FALSE

37

Example

b) For a specified confidence level, larger samples provide smaller margins of error.

Solution
$ME = z^* \, SE(\hat{p}) = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- a specified confidence level means fixed z*
- a bigger sample means increasing n, which implies smaller SE(p̂)
- so, with fixed z* and decreasing SE(p̂), ME decreases; the statement is TRUE

38

Example

c) For a fixed margin of error, larger samples provide greater confidence.

Solution
$ME = z^* \, SE(\hat{p}) = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- ME is fixed
- larger samples imply smaller SE(p̂)
- so for ME to remain the same, z* must increase
- increasing z* implies greater confidence, so the statement is TRUE

39

Example

d) For a given confidence level, halving the margin of error requires a sample twice as large.

Solution
$ME = z^* \, SE(\hat{p}) = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- a given confidence level implies fixed z*
- halving the margin of error means dividing ME by 2
- if you divide one side by 2, you must divide the other by 2:

$\frac{ME}{2} = \frac{z^* \, SE(\hat{p})}{2} = z^* \cdot \frac{SE(\hat{p})}{2} = z^* \cdot \frac{1}{2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = z^*\sqrt{\frac{1}{4} \cdot \frac{\hat{p}(1-\hat{p})}{n}} = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{4n}}$

- so, if you divide ME by 2, you need to multiply the sample size by 4, not 2; the statement is FALSE
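A quick numerical check of this result, as a Python sketch. The values p̂ = 0.5 and n = 400 are hypothetical and chosen only for illustration; the function name `margin_of_error` is likewise illustrative.

```python
import math

def margin_of_error(p_hat, n, z_star=1.96):
    """ME = z* * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.5                              # hypothetical sample proportion
me_n = margin_of_error(p_hat, 400)       # original sample size
me_2n = margin_of_error(p_hat, 800)      # twice as large
me_4n = margin_of_error(p_hat, 1600)     # four times as large
print(f"n: {me_n:.4f}  2n: {me_2n:.4f}  4n: {me_4n:.4f}")
```

Doubling n only shrinks ME by a factor of 1/√2 (about 0.71); it takes four times the sample to cut ME in half.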

40

Example Ch. 19, #20

A city ballot includes a local initiative that would legalize gambling. The issue is hotly contested and two groups decide to conduct polls to predict the outcome. The local newspaper finds that 53% of 1200 randomly selected voters plan to vote “yes”, while a college statistics class finds 54% of 450 randomly selected voters are in support. Both groups will create 95% confidence intervals.

41

Example

a) Without finding the confidence intervals, explain which one will have the larger margin of error.

Because the class's sample size is smaller, its margin of error will be larger.

42

Example

b) Find both confidence intervals.

Newspaper: (50.2%, 55.8%)
We are 95% confident that the true proportion of people who will vote to legalize gambling is between 50.2% and 55.8%.

Class: (49.4%, 58.6%)
We are 95% confident that the true proportion of people who will vote to legalize gambling is between 49.4% and 58.6%.
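Both intervals can be reproduced from the poll figures given in the problem (53% of 1200 and 54% of 450) with z* = 1.96. A minimal Python sketch; the names `ci_95` and `polls` are illustrative.

```python
import math

def ci_95(p_hat, n):
    """95% CI for a proportion using z* = 1.96."""
    me = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me

polls = {"Newspaper": (0.53, 1200), "Class": (0.54, 450)}  # (p-hat, n)
intervals = {name: ci_95(p_hat, n) for name, (p_hat, n) in polls.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.1%}, {high:.1%})")
```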

43

Example

c) Which group concludes that the outcome is too close to call? Why?

The students should conclude that the outcome is too close to call because 50% is in their interval, meaning that 50% (or slightly less) is a plausible value for p. The newspaper's interval lies entirely above 50%.

44

Cautions about Confidence Intervals

Do NOT suggest that the parameter p varies!

Do NOT imply you are certain about the parameter p!

Be sure to remember that the confidence interval is about the parameter, NOT the sample proportion(s)!

45

Sample Size and the ME

How precise should our margin of error be?

We know that we cannot be exact, but we don’t want our margin of error to be too large.
• If it is too large, it may not be useful.

46

Sample Size and the ME

There are two ways to adjust our ME.

You can reduce your confidence level.
• As you reduce confidence, the value of z* decreases.
• However, confidence levels less than 80% are rarely used in real studies. 95% and 99% are more common.

You can change your sample size.
• If we look at the equation for ME, we see that changing the sample size will change ME.
• In many cases, we may want to know how large a sample we should take to guarantee a certain ME.

47


Determining sample size.

We know that $ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

We can manipulate this equation with algebra:

$n = \frac{(z^*)^2\,\hat{p}(1-\hat{p})}{ME^2}$

48

Sample Size and the ME

This will allow us to calculate the minimum sample size needed to have a certain margin of error.

The worst case scenario, the one that needs the largest sample size, is when p = 0.5. So, if we use this value for p̂, we will be safe, meaning that we won’t choose a sample size too small to meet our required margin of error.

If you get a decimal, always round up!

49


Example: Suppose that we want to estimate the proportion of ISU students who like Stat 101 within 3% with 95% confidence. How large a sample is needed?

• ME = 0.03, z* = 1.960, and p̂ = 0.5

$n = \frac{(1.960)^2(0.5)(0.5)}{(0.03)^2} = 1067.1$, so we need n = 1068 people.
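The same calculation as a Python sketch. The function name `required_sample_size` is illustrative; the defaults z* = 1.96 and p̂ = 0.5 follow the worst-case choice described in the slides.

```python
import math

def required_sample_size(me, z_star=1.96, p_hat=0.5):
    """Smallest n giving a margin of error of at most `me`."""
    n = (z_star ** 2) * p_hat * (1 - p_hat) / me ** 2
    return math.ceil(n)  # if you get a decimal, always round up

n = required_sample_size(me=0.03)
print(f"Sample size needed: {n}")
```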
