Lecture 16
Dustin Lueker
n ≥ 30: $\bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}}$
n < 30: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
STA 291 Summer 2008 Lecture 16
Start with the confidence interval formula, assuming that the population standard deviation is known
$\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = \bar{x} \pm E$
Mathematically, we need to solve the above equation for n
$n = \sigma^2 \frac{Z_{\alpha/2}^2}{E^2}$
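The sample size formula for a mean, n = σ² Z²(α/2) / E², can be sketched in Python as below. This is only an illustration: the function name and the numbers (σ = 15, E = 2) are made up for the example and do not come from the lecture. The result is rounded up, since a fractional observation cannot be sampled.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n so that Z_{alpha/2} * sigma / sqrt(n) <= margin.

    sigma is the (assumed known) population standard deviation,
    margin is the desired margin of error E.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, about 1.96 for 95%
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs, chosen only for illustration.
print(sample_size_for_mean(sigma=15, margin=2, confidence=0.95))
```

Halving the margin of error roughly quadruples the required sample size, because E enters the formula squared.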
The sample proportion is an unbiased and efficient estimator of the population proportion
◦ The proportion is a special case of the mean
$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
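One way to see why the proportion is a special case of the mean is to code each response as 1 (success) or 0 (failure): the sample mean of the 0/1 data is the sample proportion, and its variance is p̂(1 − p̂). The sketch below checks this and then computes the large-sample interval p̂ ± Z(α/2) √(p̂(1 − p̂)/n). The data (8 successes out of 20) and the function name are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, pvariance


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical 0/1 data: 8 successes and 12 failures.
responses = [1] * 8 + [0] * 12

p_hat = mean(responses)            # the mean of 0/1 data is the proportion
spread = pvariance(responses)      # equals p_hat * (1 - p_hat) for 0/1 data
print(p_hat, spread, p_hat * (1 - p_hat))
print(proportion_ci(successes=8, n=20))
```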
As with a confidence interval for the population mean, a desired sample size for a given margin of error (E) and confidence level can be computed for a confidence interval about the population proportion
$n = \hat{p}(1-\hat{p}) \left(\frac{Z_{\alpha/2}}{E}\right)^2$
◦ This formula requires guessing $\hat{p}$ before taking the sample, or taking the safe but conservative approach of letting $\hat{p}$ = .5
   Why is this the worst case scenario? Or the conservative approach?
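The sample size formula for a proportion, n = p̂(1 − p̂) (Z(α/2) / E)², can be explored with a short Python sketch. It also suggests an answer to the question of why .5 is the conservative choice: the product p(1 − p) is largest at p = .5, so that guess yields the largest required n. The margin of error (0.03), the alternative guess (0.2), and the function name are assumptions made for the example.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, p_guess=0.5, confidence=0.95):
    """Smallest n giving margin of error <= margin for a guessed proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    return math.ceil(p_guess * (1 - p_guess) * (z / margin) ** 2)


# Conservative choice (p = .5) versus a hypothetical prior guess of .2.
print(sample_size_for_proportion(margin=0.03))
print(sample_size_for_proportion(margin=0.03, p_guess=0.2))

# p * (1 - p) peaks at p = .5, so no other guess can demand a larger sample.
grid = [i / 100 for i in range(1, 100)]
worst_case = max(grid, key=lambda p: p * (1 - p))
print(worst_case)
```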
Two independent samples
◦ Different subjects in the different samples
◦ Two subpopulations
   Ex: Male/Female
◦ The two samples constitute independent samples from two subpopulations
Two dependent samples
◦ Natural matching between an observation in one sample and an observation in the other sample
   Ex: Two measurements of the same subject
      Left/right hand
      Performance before/after training
◦ Important: Data sets with dependent samples require different statistical methods than data sets with independent samples
Take independent samples from both groups
Sample sizes are denoted by n1 and n2
◦ To use the large sample approach, both sample sizes should be at least 30
Subscript notation is the same for sample means
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
In the 1982 General Social Survey, 350 subjects reported the time spent every day watching television. The sample yielded a mean of 4.1 hours and a standard deviation of 3.3 hours.
In the 1994 survey, 1965 subjects yielded a sample mean of 2.8 hours with a standard deviation of 2 hours.
◦ Construct a 95% confidence interval for the difference between the means in 1982 and 1994.
   Is it plausible that the mean was the same in both years?
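A minimal Python sketch of this exercise, using the large-sample interval (x̄1 − x̄2) ± Z(α/2) √(s1²/n1 + s2²/n2) with the survey figures quoted above. The function name is an assumption made for the example; the interval comes out at roughly 0.94 to 1.66 hours.

```python
import math
from statistics import NormalDist


def two_mean_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample CI for mu1 - mu2 from two independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    std_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * std_error, diff + z * std_error


# Sample 1: 1982 survey, sample 2: 1994 survey (hours of TV per day).
lower, upper = two_mean_ci(4.1, 3.3, 350, 2.8, 2.0, 1965)
print(round(lower, 3), round(upper, 3))

# If 0 lies outside the interval, equal means are not plausible at this level.
print(lower <= 0 <= upper)
```

Because the whole interval lies above 0, the data suggest that mean daily viewing time was higher in 1982 than in 1994.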
For large samples
◦ For this we will consider a sample to be large if it has at least five observations for each choice (success, failure)
   All we will deal with in this class
Large sample confidence interval for p1-p2
$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Is the proportion who favor national health insurance different for Democrats and Republicans?
◦ Democrats and Republicans would be your two samples
◦ Yes and No would be your responses, how you’d find your proportions
Is the proportion of people who experience pain different for the two treatment groups?
◦ Those taking the drug and placebo would be your two samples
   Could also have them take different drugs
◦ No pain or pain would be your responses, how you’d find your proportions
Two-year Italian study on the effect of condoms on the spread of HIV
◦ Heterosexual couples where one partner was infected with HIV
   171 couples who always used condoms (UK fans), 3 partners became infected with HIV
   55 couples who did not always use a condom (U of L fans), 8 partners became infected with HIV
◦ Estimate the infection rates for the two groups
◦ Construct a 95% confidence interval to compare them
What can you conclude about the effect of condom use on being infected with HIV from the confidence interval?
   Was your Sex Ed teacher lying to you?
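The exercise can be sketched in Python with the large-sample interval for p1 − p2, taking group 1 as the couples who always used condoms (3 of 171 infected) and group 2 as those who did not (8 of 55 infected). The function name is an assumption made for the example. Note that 3 infections is below the rule of thumb of at least five observations per outcome, so the interval, roughly −0.223 to −0.033, should be read as approximate.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return p1, p2, diff - z * std_error, diff + z * std_error


# Group 1: always used condoms; group 2: did not always use condoms.
rate1, rate2, lower, upper = two_proportion_ci(3, 171, 8, 55)
print(round(rate1, 4), round(rate2, 4))   # estimated infection rates
print(round(lower, 3), round(upper, 3))   # 95% CI for p1 - p2
```

The interval lies entirely below 0, which points to a lower infection rate among couples who always used condoms.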