
Stat 31, Section 1, Last Time

• 2 Sample Inference

• Paired Differences

– Apply 1 sample methods to differences

• Unmatched Samples

– Requires deeper methods

– Work through TTEST

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 485-504, 536-549

Approximate Reading for Next Class:

Pages 555-566, 582-611

Midterm II

Coming on Tuesday, April 10

Think about:

• Sheet of Formulas

– Again single 8 ½ x 11 sheet

– New, since now more formulas

• Redoing HW…

• Asking about those not understood

• Will schedule Extra Office Hours

• Midterm II not cumulative

2 Sample Hypo Testing

Comparison of Paired vs. Unmatched Cases

Notes:

• Can always use unmatched procedure

– Just ignore matching…

• Advantage to pairing???



• Advantage to Pairing???

• Recall previous example:

Old Textbook 7.32

– Matched Paired P-value = 1.87 x 10^-5

– Unmatched P-value = 3.95 x 10^-6

• Unmatched better!?! (can happen)



• Advantage to Pairing???

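Happens when “variation of diff’s”, $\sigma_D$, is smaller than “full sample variation”, with both measured on the scale of the sample means

I.e. (with $n$ pairs, so $n_X = n_Y = n$)

$\frac{\sigma_D}{\sqrt{n}} < \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$

(whether this happens depends on data)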

Paired vs. Unmatched Sampling

Class Example 29:

A new drug is being tested that should boost white blood cell count following chemotherapy. For a set of 4 patients, it was not administered (as a control) for the 1st round of chemotherapy, and then the new drug was tried after the 2nd round of chemotherapy. White blood cell counts were measured one week after each round of chemotherapy.

Paired vs. Unmatched Sampling

Class Example 29:

The resulting white blood cell counts were:

            Without drug    With drug
            (1st round)     (2nd round)
Patient 1   33              35
Patient 2   26              27
Patient 3   36              39
Patient 4   28              30


Class Example 29:

Does the new drug seem to raise white blood cell counts well enough to be studied further?

• Seems to be some improvement

• But is it statistically significant?

• Only 4 patients…


Let: $\mu_X$ = Average Blood c’nts w/out drug

$\mu_Y$ = Average Blood c’nts with drug

Set up:

$H_0: \mu_X = \mu_Y$

$H_A: \mu_X < \mu_Y$

(want strong evidence of improvement)


Class Example 29:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg29.xls

Results:

• Matched Pair P-val = 0.00813

– Very strong evidence of improvement

• Unmatched P-val = 0.295

– Not statistically significant
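The two P-values above can be reproduced approximately outside the spreadsheet. The following is a minimal sketch, not the spreadsheet's own calculation: it assumes the first count for each patient is without the drug and the second is with the drug, that the tests are one-sided, and that the unmatched test uses the pooled-variance (equal variance) form. The helper name `t_upper_tail` is made up for this illustration and integrates the t density numerically.

```python
import math
from statistics import mean, stdev, variance

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (t >= 0),
    using Simpson's rule on the t density between 0 and t."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

# Class Example 29 data (column order is an assumption, see above)
without_drug = [33, 26, 36, 28]
with_drug = [35, 27, 39, 30]
n = len(without_drug)

# Paired: apply 1 sample methods to the differences
diffs = [y - x for x, y in zip(without_drug, with_drug)]
se_paired = stdev(diffs) / math.sqrt(n)
t_paired = mean(diffs) / se_paired
p_paired = t_upper_tail(t_paired, n - 1)

# Unmatched: ignore the matching (pooled variance, equal sample sizes)
pooled_var = (variance(without_drug) + variance(with_drug)) / 2
se_unmatched = math.sqrt(pooled_var * 2 / n)
t_unmatched = (mean(with_drug) - mean(without_drug)) / se_unmatched
p_unmatched = t_upper_tail(t_unmatched, 2 * n - 2)

print(f"paired:    se = {se_paired:.3f}, P-value = {p_paired:.4f}")
print(f"unmatched: se = {se_unmatched:.3f}, P-value = {p_unmatched:.4f}")
```

The mean difference is the same in both tests; only the standard error changes. Differencing removes the large patient-to-patient variation, which is why the paired P-value is so much smaller here.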



Class Example 29:

Conclusions:

• Paired Sampling can give better results

• When diff’ing reduces variation

• Often happens for careful matching


Paired Sampling Visualization

Review Gray Level Testing

• Seems some uncertainty about this…

• Go over previous examples, with links

• Recall main ideas

– 0.1 < P-value: no strong evidence

– 0.01 < P-value < 0.1: somewhat strong evidence (use words to indicate strength)

– P-value < 0.01: very strong evidence
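The three ranges above can be written as a small lookup. This is only an illustrative sketch: the function name `describe_evidence` is made up here, and the treatment of a P-value exactly equal to 0.01 or 0.1 is an assumption, since the slides do not say which side those boundary values fall on. In the gray level region the wording should still be chosen by hand to reflect the strength.

```python
def describe_evidence(p_value):
    """Map a P-value to the coarse wording used for gray level testing."""
    if not 0 <= p_value <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < 0.01:
        return "very strong evidence"
    if p_value < 0.1:
        # Gray level region: refine the wording by hand
        return "somewhat strong evidence (use words to indicate strength)"
    return "no strong evidence"

for p in (0.382, 0.062, 0.031, 0.0013):
    print(p, "->", describe_evidence(p))
```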


Some examples of words, in the gray level region:

– P-value ~ 0.09: “mild evidence, but perhaps something is there”

– P-value ~ 0.07: “not strong evidence, but some indication”

– P-value ~ 0.05: “close to the boundary of strong evidence”

– P-value ~ 0.03: “fairly strong evidence”

– P-value ~ 0.02: “close to being very strong evidence”


Some earlier class examples:

March 20, page 18:

P-value of 0.094

“Quite weak evidence, i.e. only a mild suggestion”


March 20, page 44:

P-value of 0.031

“Pretty strong evidence”

P-value of 0.062

“Not very strong, but some indication of something there”


March 22, page 32, HW 6.82:

P-value of 0.382

“No evidence”

P-value of 0.171

“No evidence”

P-value of 0.0013

“Very strong Evidence”



P-value of 0.0505

“Close to boundary of strong evidence, but not quite there”

P-value of 0.0495

“Just over boundary of strong evidence, but very close”

P-value of 0.04

“Moderately strong evidence”

P-value of 0.188

“No evidence”

P-value of 0.0401

“Moderately strong evidence”


P-value of 0.739

“No evidence”



P-value of 0.0001

“Very strong evidence”


P-value of 0.00052

“Very strong evidence”

And now for something completely different…

Another fun movie

Thanks to Trent Williamson

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat31Movie-IceScraping.wmv

Inference for proportions

Sec. 8.1: A deeper look

(already know some basics, but there are some fine points worth a deeper look)

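Recall:

Counts: $X \sim \mathrm{Bi}(n, p)$, $\mu_X = np$, $\sigma_X = \sqrt{np(1-p)}$

Sample Proportions: $\hat{p} = \frac{X}{n}$, $\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$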

Inference for proportions

Calculate prob’s with BINOMDIST,

but note no BINOMINV,

so instead use Normal Approximation

Revisit Class Example 20:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls



Recall Normal Approximation to Binomial:

For $np \ge 10$ & $n(1-p) \ge 10$:

$X$ is approximately $N\left(np, \sqrt{np(1-p)}\right)$

$\hat{p}$ is approximately $N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$

So use NORMINV (and often NORMDIST)
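To see how close the approximation is, the exact Binomial probability can be compared with the Normal one. This is a sketch using Python's standard library in place of the Excel functions: `binom_cdf` plays the role of a cumulative BINOMDIST and `NormalDist(...).cdf` the role of a cumulative NORMDIST. The values n = 100, p = 0.3 and the cutoff 35 are hypothetical, chosen only for illustration, and no continuity correction is applied.

```python
import math
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bi(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_ok(n, p):
    """Rule of thumb from the slide: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p, k = 100, 0.3, 35  # hypothetical values, for illustration only

exact = binom_cdf(k, n, p)
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p))).cdf(k)

print("rule of thumb satisfied:", normal_approx_ok(n, p))
print(f"exact Binomial P(X <= {k}) = {exact:.4f}")
print(f"Normal approximation      = {approx:.4f}")
```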


Main problem: don’t know $p$

Solution: Depends on context:

CIs or hypothesis tests

Different from Normal, since mean and sd are linked, with both depending on $p$, instead of separate $\mu$ & $\sigma$.


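Case 1: Margin of Error and CIs:

$\hat{p} - p \sim N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$

[Figure: this distribution, with area 95% between $-m$ and $m$, i.e. area 0.975 to the left of $m$]

So: $m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$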


Case 1: Margin of Error and CIs:

$m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$

Continuing problem: Unknown $p$

Solution 1: “Best Guess”

Replace $p$ by $\hat{p}$

Inference for proportions

Solution 2: “Conservative”

Idea: make sd (and thus m) as large as possible

(makes no sense for Normal)

$f(p) = p(1-p) = p - p^2$

zeros at 0 & 1

max at $p = 1/2$


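Solution 2: “Conservative”

Can check by calculus that the max of $p(1-p)$ is at $p = 1/2$, so

$\max_{p \in [0,1]} p(1-p) = \frac{1}{2}\left(1 - \frac{1}{2}\right) = \frac{1}{4}$

Thus $m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{(1/4)/n}\right) = \mathrm{NORMINV}\left(0.975, 0, 1/(2 \cdot \mathrm{sqrt}(n))\right)$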
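The two margin of error formulas can be compared side by side. This is a minimal sketch in which `NormalDist(0, sd).inv_cdf(q)` stands in for Excel's NORMINV(q, 0, sd). The function names and the inputs n = 400 and p̂ = 0.2 are hypothetical, chosen only for illustration. The grid search at the start is a numerical check of the calculus claim that p(1 − p) peaks at p = 1/2.

```python
import math
from statistics import NormalDist

def margin_best_guess(p_hat, n, level=0.95):
    """Margin of error with p replaced by the best guess p_hat."""
    sd = math.sqrt(p_hat * (1 - p_hat) / n)
    return NormalDist(0, sd).inv_cdf(1 - (1 - level) / 2)

def margin_conservative(n, level=0.95):
    """Margin of error using the largest possible sd, at p = 1/2."""
    sd = 1 / (2 * math.sqrt(n))
    return NormalDist(0, sd).inv_cdf(1 - (1 - level) / 2)

# Numerical check: p(1 - p) is largest at p = 1/2 on a grid over [0, 1]
best_i = max(range(101), key=lambda i: (i / 100) * (1 - i / 100))
print("p(1 - p) is largest at p =", best_i / 100)

n, p_hat = 400, 0.2  # hypothetical values, for illustration only
m_best = margin_best_guess(p_hat, n)
m_cons = margin_conservative(n)
print(f"best guess:   p_hat +- {m_best:.4f}")
print(f"conservative: p_hat +- {m_cons:.4f}")
```

The conservative margin is never smaller than the best guess margin, and the two agree when p̂ = 1/2.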


Example: Old Text Problem 8.8

Power companies spend time and money trimming trees to keep branches from falling on lines. Chemical treatment can stunt tree growth, but too much may kill the tree. In an experiment on 216 trees, 41 died. Give a 99% CI for the proportion expected to die from this treatment.

Inference for proportions

Example: Old Text Problem 8.8

Solution: Class example 30, part 1

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls

Note: Conservative much bigger

(left end even < 0)

Since $\hat{p} = 0.19 \ll 0.5$

Big gap

So may pay substantial price for being “safe”


Inference for proportions

HW:

8.11, 8.15

Do both best-guess and conservative CIs

8.21

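Inference for proportions

Case 2: Choice of Sample Size:

Idea: Given the margin of error $m$, find sample size $n$ to make:

$0.95 = P\{|\hat{p} - p| \le m\}$

i.e. Dist’n of $\hat{p} - p$, $N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$, has area 0.95 between $-m$ and $m$

i.e. Dist’n has area 0.975 to the left of $m$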

Sample Size for Proportions

i.e. find $n$ so that $0.975 = \mathrm{NORMDIST}\left(m, 0, \sqrt{p(1-p)/n}\right)$

i.e. $m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$

Problem: in both cases, can’t “get at” $n$

Solution: Standardize,

i.e. put on N(0,1) scale

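Inference for proportions

I.e. Find $n$ so that

$0.95 = P\{|\hat{p} - p| \le m\} = P\left\{\frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \le \frac{m}{\sqrt{p(1-p)/n}}\right\} = P\left\{|Z| \le \frac{m}{\sqrt{p(1-p)/n}}\right\}$

[Figure: N(0,1) dist’n, with area 0.975 to the left of $\frac{m}{\sqrt{p(1-p)/n}}$]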


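i.e. find $n$ so that:

$\frac{m}{\sqrt{p(1-p)/n}} = \mathrm{NORMINV}(0.975, 0, 1)$

Now solve to get:

$\sqrt{n} = \frac{\mathrm{NORMINV}(0.975, 0, 1) \sqrt{p(1-p)}}{m}$

$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 p(1-p)$

Problem: don’t know $p$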


Solution 1: Best Guess

Use $\hat{p}$ from:

– Earlier Study

– Previous Experience

– Prior Idea


Solution 2: Conservative

Recall $\max_{p \in [0,1]} p(1-p) = \frac{1}{4}$

So “safe” to use:

$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 \cdot \frac{1}{4}$

Sample Size for Proportions

E.g. Old textbook problem 8.14

An opinion poll found that 44% of adults agree that parents should be given vouchers for education at a school of their choice. The result was based on a small sample. How large an SRS is required to obtain a margin of error of ±0.03, in a 95% CI?


E.g. Old textbook problem 8.14

See Class Example 30, Part 2:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls



Note: conservative version not much bigger, since 0.44 ~ 0.5, so gap is small
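The sample size formulas can be applied to problem 8.14 directly. This is a sketch, not the contents of the class spreadsheet: the function name `sample_size` is made up here, `NormalDist().inv_cdf(0.975)` stands in for NORMINV(0.975, 0, 1), and rounding up to the next whole number is an assumed convention.

```python
import math
from statistics import NormalDist

def sample_size(m, p=0.5, level=0.95):
    """Smallest n with margin of error at most m.
    The default p = 0.5 gives the conservative answer."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((z / m) ** 2 * p * (1 - p))

n_best = sample_size(0.03, p=0.44)  # best guess, using the earlier poll's 44%
n_cons = sample_size(0.03)          # conservative, using p = 1/2

print("best guess n   =", n_best)   # 1052
print("conservative n =", n_cons)   # 1068
```

The two answers differ by only 16 because 0.44(1 − 0.44) = 0.2464 is already close to the maximum value 1/4.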


HW: 8.25, give a “conservative” answer

8.26 (81), give both “best guess” and “conservative” answers
