Lecture 16 Dustin Lueker

STA 291 Summer 2008 Lecture 16

n≥30

$$\bar{x} \pm Z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

n<30

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
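As a rough illustration of the large-sample interval (sample mean plus or minus the Z critical value times s over the square root of n), here is a minimal Python sketch. The function name `large_sample_mean_ci` and the input numbers are hypothetical and not from the lecture. The small-sample case would swap in a t critical value, which the Python standard library does not provide, so it is left out.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_mean_ci(xbar, s, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- z * s / sqrt(n); meant for n >= 30."""
    if n < 30:
        raise ValueError("n < 30: use a t critical value instead of z")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, about 1.96 for 95%
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics, chosen only for illustration
lower, upper = large_sample_mean_ci(xbar=50.0, s=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (48.04, 51.96)
```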

Start with the confidence interval formula assuming that the population standard deviation is known

$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = \bar{x} \pm E$$

Mathematically we need to solve the above equation for n

$$n = \frac{\sigma^2 Z_{\alpha/2}^2}{E^2}$$
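Solving the margin of error E = Z·σ/√n for n gives n = σ²Z²/E². A small Python sketch of that calculation follows. The function name `sample_size_for_mean` and the inputs are hypothetical, and rounding up to the next whole subject is an assumption (the usual convention), not something stated on the slide.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin, sigma assumed known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((sigma * z / margin) ** 2)  # n = sigma^2 * z^2 / E^2, rounded up


# Hypothetical inputs: sigma = 10, desired margin of error E = 2
n_needed = sample_size_for_mean(sigma=10.0, margin=2.0)
print(n_needed)  # 97
```

Halving the margin of error roughly quadruples the required sample size, because E enters the formula squared.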

The sample proportion is an unbiased and efficient estimator of the population proportion
◦ The proportion is a special case of the mean

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
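The interval for a proportion has the same shape as the one for a mean, with the standard error √(p̂(1−p̂)/n) in place of s/√n. A minimal sketch, assuming a hypothetical function name `proportion_ci` and made-up counts:

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 60 "successes" out of 200 observations
p_lower, p_upper = proportion_ci(successes=60, n=200)
print(f"95% CI for p: ({p_lower:.3f}, {p_upper:.3f})")
```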


As with a confidence interval for the population mean, a desired sample size for a given margin of error (E) and confidence level can be computed for a confidence interval about the population proportion

$$n = \hat{p}(1-\hat{p})\left(\frac{Z_{\alpha/2}}{E}\right)^2$$

◦ This formula requires guessing $\hat{p}$ before taking the sample, or taking the safe but conservative approach of letting $\hat{p}$ = .5
  Why is this the worst case scenario? Or the conservative approach?
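One way to see why .5 is the conservative choice: the product p̂(1−p̂) is largest at p̂ = .5, where it equals .25, so no other guess can demand a larger n. The sketch below checks this numerically. The function name `sample_size_for_proportion`, the margin of error of 0.03, and the rounding up are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_guess, margin, confidence=0.95):
    """n = p(1 - p) * (z / E)^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p_guess * (1 - p_guess) * (z / margin) ** 2)


# Hypothetical target: margin of error E = 0.03 at 95% confidence
n_conservative = sample_size_for_proportion(0.5, margin=0.03)
n_with_guess = sample_size_for_proportion(0.2, margin=0.03)
print(n_conservative, n_with_guess)  # 1068 683

# No guess on a grid from 0.05 to 0.95 requires more than p = 0.5 does
grid = [k / 20 for k in range(1, 20)]
largest = max(sample_size_for_proportion(p, margin=0.03) for p in grid)
print(largest == n_conservative)  # True
```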


Two independent samples
◦ Different subjects in the different samples
◦ Two subpopulations
  Ex: Male/Female
◦ The two samples constitute independent samples from two subpopulations

Two dependent samples
◦ Natural matching between an observation in one sample and an observation in the other sample
  Ex: Two measurements of the same subject
    Left/right hand
    Performance before/after training
◦ Important: Data sets with dependent samples require different statistical methods than data sets with independent samples


Take independent samples from both groups

Sample sizes are denoted by n1 and n2

◦ To use the large sample approach, both sample sizes should be at least 30 (n≥30)

The same subscript notation is used for the sample means

$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$


In the 1982 General Social Survey, 350 subjects reported the time spent every day watching television. The sample yielded a mean of 4.1 hours and a standard deviation of 3.3 hours.

In the 1994 survey, 1965 subjects yielded a sample mean of 2.8 hours with a standard deviation of 2 hours.
◦ Construct a 95% confidence interval for the difference between the means in 1982 and 1994.
  Is it plausible that the mean was the same in both years?
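One possible way to work this example is to apply the large-sample formula for a difference of means, (x̄1 − x̄2) ± Z·√(s1²/n1 + s2²/n2), to the summary figures above. The sketch below does that. The function name `two_mean_ci` is hypothetical, and both standard deviations are taken to be in hours.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_ci(xbar1, s1, n1, xbar2, s2, n2, confidence=0.95):
    """Large-sample interval for mu1 - mu2 from two independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = xbar1 - xbar2
    return diff - z * std_error, diff + z * std_error


# Summary statistics from the 1982 and 1994 surveys quoted above
tv_lower, tv_upper = two_mean_ci(4.1, 3.3, 350, 2.8, 2.0, 1965)
print(f"95% CI for the difference: ({tv_lower:.2f}, {tv_upper:.2f})")
print("zero inside interval:", tv_lower <= 0 <= tv_upper)
```

Under these assumptions the interval runs from roughly 0.94 to 1.66 hours and does not contain zero, which suggests it is not plausible that the mean was the same in both years.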



For large samples
◦ For this we will consider a large sample to be one with at least five observations for each choice (success, failure)
  This is the only case we will deal with in this class

Large sample confidence interval for p1-p2

$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$


Is the proportion who favor national health insurance different for Democrats and Republicans?
◦ Democrats and Republicans would be your two samples
◦ Yes and No would be your responses, how you’d find your proportions

Is the proportion of people who experience pain different for the two treatment groups?
◦ Those taking the drug and placebo would be your two samples
  Could also have them take different drugs
◦ No pain or pain would be your responses, how you’d find your proportions



Two year Italian study on the effect of condoms on the spread of HIV
◦ Heterosexual couples where one partner was infected with HIV
  171 couples who always used condoms (UK fans), 3 partners became infected with HIV
  55 couples who did not always use a condom (U of L fans), 8 partners became infected with HIV
◦ Estimate the infection rates for the two groups
◦ Construct a 95% confidence interval to compare them
  What can you conclude about the effect of condom use on being infected with HIV from the confidence interval?
  Was your Sex Ed teacher lying to you?
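A sketch of how this example might be worked with the large-sample interval for p1 − p2, using the counts given above. The function name `two_proportion_ci` is hypothetical. Note that the always-use group has only 3 infections, fewer than the five observations per outcome that the large-sample rule asks for, so the result should be read as approximate.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return p1, p2, (diff - z * std_error, diff + z * std_error)


# Group 1: always used condoms; group 2: did not always use condoms
rate_always, rate_not_always, (hiv_lower, hiv_upper) = two_proportion_ci(3, 171, 8, 55)
print(f"infection rates: {rate_always:.3f} vs {rate_not_always:.3f}")
print(f"95% CI for p1 - p2: ({hiv_lower:.3f}, {hiv_upper:.3f})")
```

Under these assumptions the estimated rates are about 1.8% and 14.5%, and the interval for the difference lies entirely below zero. That is consistent with a lower infection rate among couples who always used condoms.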

