stat 31, section 1, last time


Upload: boris-walter

Post on 01-Jan-2016


Stat 31, Section 1, Last Time

• 2 Sample Inference

• Paired Differences

– Apply 1 sample methods to differences

• Unmatched Samples

– Requires deeper methods

– Work through TTEST

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 485-504, 536-549

Approximate Reading for Next Class:

Pages 555-566, 582-611

Midterm II

Coming on Tuesday, April 10

Think about:

• Sheet of Formulas

– Again single 8 ½ x 11 sheet

– New, since now more formulas

• Redoing HW…

• Asking about those not understood

• Will schedule Extra Office Hours

• Midterm II not cumulative

2 Sample Hypo Testing

Comparison of Paired vs. Unmatched Cases

Notes:

• Can always use unmatched procedure

– Just ignore matching…

• Advantage to pairing???


• Recall previous example:

Old Textbook 7.32

– Matched Paired P-value = $1.87 \times 10^{-5}$

– Unmatched P-value = $3.95 \times 10^{-6}$

• Unmatched better!?! (can happen)

2 Sample Hypo Testing

Comparison of Paired vs. Unmatched Cases

• Advantage to Pairing???

Happens when “variation of diff’s”, $\sigma_D$,

is smaller than “full sample variation”

I.e. $\frac{\sigma_D}{\sqrt{n_D}} < \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$

(whether this happens depends on data)

Paired vs. Unmatched Sampling

Class Example 29:

A new drug is being tested that should boost white blood cell count following chemotherapy. For a set of 4 patients, it was not administered (as a control) for the 1st round of chemotherapy, and then the new drug was tried after the 2nd round of chemotherapy. White blood cell counts were measured one week after each round of chemotherapy.

Paired vs. Unmatched Sampling

Class Example 29:

The resulting white blood cell counts were:

             1st round (no drug)    2nd round (new drug)
Patient 1            33                     35
Patient 2            26                     27
Patient 3            36                     39
Patient 4            28                     30

Paired vs. Unmatched Sampling

Class Example 29:

Does the new drug seem to increase white blood cell counts well enough to be studied further?

• Seems to be some improvement

• But is it statistically significant?

• Only 4 patients…

Paired vs. Unmatched Sampling

Let: $\mu_X$ = Average blood count without drug

$\mu_Y$ = Average blood count with drug

Set up:

$H_0: \mu_X \ge \mu_Y$

$H_A: \mu_X < \mu_Y$ (want strong evidence of improvement)

Paired vs. Unmatched Sampling

Class Example 29: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg29.xls

Results:

• Matched Pair P-val = 0.00813

– Very strong evidence of improvement

• Unmatched P-val = 0.295

– Not statistically significant

Paired vs. Unmatched Sampling

Class Example 29: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg29.xls

Conclusions:

• Paired Sampling can give better results

• When diff’ing reduces variation

• Often happens for careful matching
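The effect of differencing on variation can be checked directly from the four pairs of counts in Class Example 29. The following is a minimal Python sketch using only the standard library. The variable names are illustrative, and it assumes (the slide does not label the columns) that the first count for each patient is without the drug and the second is with it. It computes only standard errors and t statistics, not the TTEST P-values from the spreadsheet.

```python
from math import sqrt
from statistics import mean, variance

# Class Example 29 counts; the column order (without drug, with drug) is assumed.
without_drug = [33, 26, 36, 28]
with_drug = [35, 27, 39, 30]
n = len(without_drug)

# Paired analysis: apply 1-sample methods to the differences.
diffs = [y - x for x, y in zip(without_drug, with_drug)]
se_paired = sqrt(variance(diffs) / n)
t_paired = mean(diffs) / se_paired

# Unmatched analysis: ignore the pairing and combine both sample variances.
se_unmatched = sqrt(variance(without_drug) / n + variance(with_drug) / n)
t_unmatched = (mean(with_drug) - mean(without_drug)) / se_unmatched

print(f"paired:    SE = {se_paired:.3f}, t = {t_paired:.2f}")
print(f"unmatched: SE = {se_unmatched:.3f}, t = {t_unmatched:.2f}")
```

Both analyses use the same average difference of 2. The paired version divides it by a much smaller standard error, because differencing removes the large patient-to-patient variation.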

Paired Sampling Visualization


Review Gray Level Testing

• Seems some uncertainty about this…

• Go over previous examples, with links

• Recall main ideas

– 0.1 < P-value : no strong evidence

– 0.01 < P-value < 0.1: somewhat strong evidence (use words to indicate strength)

– P-Value < 0.01: very strong evidence
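The three zones above can be written as a small lookup. This is only an illustrative Python sketch: the function name `evidence_strength` is hypothetical, and the treatment of values exactly equal to 0.01 or 0.1 is an assumption, since the slide does not say which zone the boundaries belong to.

```python
def evidence_strength(p_value: float) -> str:
    """Map a P-value to one of the three gray-level zones from the slide."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < 0.01:
        return "very strong evidence"
    if p_value < 0.1:
        # Gray zone: the slide asks for words that indicate the strength.
        return "somewhat strong evidence"
    # Assumption: P-values of 0.1 and above are treated as the top zone.
    return "no strong evidence"


for p in (0.382, 0.062, 0.0013):
    print(p, "->", evidence_strength(p))
```

In the middle zone the label is only a starting point; the wording should still be adjusted to how close the P-value is to either boundary.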

Review Gray Level Testing

Some examples of words, in the gray level region:

– P-value ~ 0.09: “mild evidence, but perhaps something is there”

– P-value ~ 0.07: “not strong evidence, but some indication”

Review Gray Level Testing

Some examples of words:

– P-Value ~ 0.05: “close to the boundary of strong evidence”

– P-value ~ 0.03: “fairly strong evidence”

– P-value ~ 0.02: “close to being very strong evidence”

Review Gray Level Testing

Some earlier class examples:

March 20, page 18:

P-value of 0.094

“Quite weak evidence, i.e. only a mild suggestion”

Review Gray Level Testing

March 20, page 44:

P-value of 0.031

“Pretty strong evidence”

P-value of 0.062

“Not very strong, but some indication of something there”

Review Gray Level Testing

March 22, page 32, HW 6.82:

P-value of 0.382

“No evidence”

P-value of 0.171

“No evidence”

P-value of 0.0013

“Very strong Evidence”

Review Gray Level Testing

March 22, page 32, HW 6.84:

P-value of 0.0505

“Close to boundary of strong evidence, but not quite there”

P-value of 0.0495

“Just over boundary of strong evidence, but very close”

Review Gray Level Testing

March 27, page 44, HW 7.16:

P-value of 0.04,

“moderately strong evidence”

March 27, page 44, HW 7.21:

P-value of 0.188

“No evidence”

Review Gray Level Testing

March 29, page 16, HW 6.61:

P-value of 0.0401,

“moderately strong evidence”

March 29, page 33, HW 7.27:

P-value of 0.739

“No evidence”

Review Gray Level Testing

March 29, page 33, HW 7.31:

P-value of 0.0001

“Very strong evidence”

March 29, page 33, HW 7.41:

P-value of 0.00052

“Very strong evidence”

And now for something completely different…

Another fun movie

Thanks to Trent Williamson

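Inference for proportions

Sec. 8.1: A deeper look

(already know some basics, but there are some fine points worth a deeper look)

Recall:

Counts: $X \sim Bi(n, p)$, $\mu_X = np$, $\sigma_X = \sqrt{np(1-p)}$

Sample Proportions: $\hat{p} = \frac{X}{n}$, $\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$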

Inference for proportions

Calculate prob’s with BINOMDIST,

but note no BINOMINV,

so instead use Normal Approximation

Revisit Class Example 20: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls

Inference for proportions

Recall Normal Approximation to Binomial:

For $np \ge 10$ & $n(1-p) \ge 10$:

$X$ is approximately $N\left(np, \sqrt{np(1-p)}\right)$

$\hat{p}$ is approximately $N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$

So use NORMINV (and often NORMDIST)
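To see how close the Normal approximation is, an exact Binomial probability can be compared with its Normal counterpart. This is a minimal Python sketch with the standard library; the values of `n`, `p` and `k` are hypothetical and not taken from the slides, and `binom_cdf` is an illustrative helper playing the role of a cumulative BINOMDIST.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Bi(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p, k = 100, 0.3, 35  # hypothetical values, chosen only for illustration

# Rule of thumb for using the approximation: np >= 10 and n(1-p) >= 10.
assert n * p >= 10 and n * (1 - p) >= 10

exact = binom_cdf(k, n, p)
# Normal with mean np and sd sqrt(np(1-p)), the analogue of NORMDIST.
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

No continuity correction is applied here, so the two numbers agree only roughly; the agreement improves as n grows.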

Inference for proportions

Main problem: don’t know $p$

Solution: Depends on context:

CIs or hypothesis tests

Different from Normal, since mean and sd are linked, with both depending on $p$, instead of separate $\mu$ & $\sigma$.

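Inference for proportions

Case 1: Margin of Error and CIs:

$\hat{p} - p \sim N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$

[Figure: Normal density with area 95% between $-m$ and $m$, i.e. area 0.975 to the left of $m$]

So: $m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$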

Inference for proportions

Case 1: Margin of Error and CIs:

Continuing problem: Unknown $p$

Solution 1: “Best Guess”

Replace $p$ by $\hat{p}$ in

$m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$

Inference for proportions

Solution 2: “Conservative”

Idea: make sd (and thus m) as large as possible

(makes no sense for Normal)

$f(p) = p(1-p) = p - p^2$

zeros at 0 & 1

max at $p = 1/2$

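Inference for proportions

Solution 2: “Conservative”

Can check by calculus that the max is at $p = 1/2$,

so $\max_{p \in [0,1]} p(1-p) = \frac{1}{2}\left(1 - \frac{1}{2}\right) = \frac{1}{4}$

Thus $m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{(1/4)/n}\right) = \mathrm{NORMINV}\left(0.975, 0, 1/(2\sqrt{n})\right)$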

Inference for proportions

Example: Old Text Problem 8.8

Power companies spend time and money trimming trees to keep branches from falling on lines. Chemical treatment can stunt tree growth, but too much may kill the tree. In an experiment on 216 trees, 41 died. Give a 99% CI for the proportion expected to die from this treatment.
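Both margin-of-error recipes can be applied to these numbers outside Excel. The following is a minimal Python sketch, not the class spreadsheet: `NormalDist(0, sd).inv_cdf(q)` from the standard library stands in for NORMINV(q, 0, sd), and the cutoff is 0.995 rather than 0.975 because the problem asks for a 99% CI. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, deaths = 216, 41              # data from Old Text Problem 8.8
level = 0.99                     # 99% CI
cutoff = 1 - (1 - level) / 2     # area to the left of m, here 0.995

p_hat = deaths / n

# "Best guess": replace the unknown p by p_hat in sd = sqrt(p(1-p)/n).
m_best = NormalDist(0, sqrt(p_hat * (1 - p_hat) / n)).inv_cdf(cutoff)

# "Conservative": use the largest possible sd, sqrt((1/4)/n).
m_cons = NormalDist(0, sqrt(0.25 / n)).inv_cdf(cutoff)

for name, m in (("best guess", m_best), ("conservative", m_cons)):
    print(f"{name}: m = {m:.4f}, CI = [{p_hat - m:.4f}, {p_hat + m:.4f}]")
```

The conservative margin is always at least as large as the best-guess margin, since p(1-p) can never exceed 1/4.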

Inference for proportions

Example: Old Text Problem 8.8

Solution: Class example 30, part 1: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls

Note: Conservative much bigger

(left end even < 0)

Since $\hat{p} = 0.19 < 0.5$

Big gap

So may pay substantial price for being “safe”

Inference for proportions

HW:

8.11, 8.15

Do both best-guess and conservative CIs

8.21

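Inference for proportions

Case 2: Choice of Sample Size:

Idea: Given the margin of error $m$,

find sample size $n$ to make:

$0.95 = P\{|\hat{p} - p| \le m\}$

i.e. Dist’n of $\hat{p} - p$, which is $N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$, has area 0.95 between $-m$ and $m$

i.e. Dist’n has area 0.975 to the left of $m$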

Sample Size for Proportions

i.e. find $n$ so that

$0.975 = \mathrm{NORMDIST}\left(m, 0, \sqrt{p(1-p)/n}\right)$

i.e.

$m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{p(1-p)/n}\right)$

Problem: in both cases, can’t “get at” $n$

Solution: Standardize,

i.e. put on N(0,1) scale

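Inference for proportions

I.e. Find $n$ so that

$0.95 = P\{|\hat{p} - p| \le m\} = P\left\{\frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \le \frac{m}{\sqrt{p(1-p)/n}}\right\} = P\left\{|Z| \le \frac{m}{\sqrt{p(1-p)/n}}\right\}$

[Figure: N(0,1) dist’n with area 0.975 to the left of $\frac{m}{\sqrt{p(1-p)/n}}$]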

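Sample Size for Proportions

i.e. find $n$ so that:

$\frac{m}{\sqrt{p(1-p)/n}} = \mathrm{NORMINV}(0.975, 0, 1)$

Now solve to get:

$\sqrt{n} = \mathrm{NORMINV}(0.975, 0, 1) \, \frac{\sqrt{p(1-p)}}{m}$

$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 p(1-p)$

Problem: don’t know $p$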

Sample Size for Proportions

Solution 1: Best Guess

Use $\hat{p}$ from:

– Earlier Study

– Previous Experience

– Prior Idea

Sample Size for Proportions

Solution 2: Conservative

Recall $\max_{p \in [0,1]} p(1-p) = \frac{1}{4}$

So “safe” to use:

$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 \frac{1}{4}$

Sample Size for Proportions

E.g. Old textbook problem 8.14

An opinion poll found that 44% of adults agree that parents should be given vouchers for education at a school of their choice. The result was based on a small sample. How large an SRS is required to obtain a margin of error of ±0.03 in a 95% CI?
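The sample size formula can be evaluated for this problem in a few lines. This is a minimal Python sketch rather than the class spreadsheet: `NormalDist().inv_cdf(0.975)` from the standard library plays the role of NORMINV(0.975, 0, 1), the helper name `sample_size` is hypothetical, and rounding up to the next whole number is an assumption made here so that the margin of error is not exceeded.

```python
from math import ceil
from statistics import NormalDist

m = 0.03                           # target margin of error
z = NormalDist().inv_cdf(0.975)    # NORMINV(0.975, 0, 1), for a 95% CI


def sample_size(p: float) -> int:
    """Smallest whole n with z * sqrt(p(1-p)/n) <= m."""
    return ceil((z / m) ** 2 * p * (1 - p))


n_best = sample_size(0.44)   # best guess: p taken from the earlier poll
n_cons = sample_size(0.5)    # conservative: p(1-p) at its maximum, 1/4

print("best guess:  ", n_best)
print("conservative:", n_cons)
```

The two answers differ by only a handful of respondents, because 0.44 is close to 0.5.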

Sample Size for Proportions

E.g. Old textbook problem 8.14

See Class Example 30, Part 2: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls

Sample Size for Proportions

Note: conservative version not much bigger, since 0.44 ~ 0.5 so gap is small

Sample Size for Proportions

HW: 8.25, give a “conservative” answer

8.26 (81), give both “best guess” and “conservative” answers