The paired t-test, non-parametric tests, and ANOVA July 13, 2004


Upload: roland-chapman

Post on 29-Dec-2015


Page 1: The paired t-test, non-parametric tests, and ANOVA July 13, 2004


Page 2: Review: the Experiment (note: exact numbers have been altered)

Grade 3 students at Oak School were given an IQ test at the beginning of the academic year (n=90).

Classroom teachers were given a list of names of students in their classes who had supposedly scored in the top 20 percent; these students were identified as “academic bloomers” (n=18).

BUT: the children on the teachers' lists had actually been randomly assigned to the list.

At the end of the year, the same I.Q. test was re-administered.

Page 3: The results

Children who had been randomly assigned to the “top-20 percent” list had a mean I.Q. increase of 12.2 points (sd=2.0), whereas children in the control group had a mean increase of only 8.2 points (sd=2.5).

Page 4: Confidence interval (more information!!)

95% CI for the difference: 4.0 ± 1.99(.64) = (2.7 – 5.3)

A t-curve with 88 df has slightly wider cut-offs for 95% area (t=1.99) than a normal curve (Z=1.96).
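The interval above can be reproduced from the summary statistics. The sketch below is in Python rather than SAS and assumes a pooled-variance standard error with a control group of 72 (90 students minus the 18 "bloomers"); the critical value 1.99 is taken from the slide rather than computed.

```python
import math

# Summary statistics quoted on the slides (Oak School example).
n_bloomers, mean_bloomers, sd_bloomers = 18, 12.2, 2.0
n_control, mean_control, sd_control = 72, 8.2, 2.5  # 72 = 90 - 18 (assumed)

# Pooled variance: the two sample variances weighted by their degrees of freedom.
df = n_bloomers + n_control - 2
pooled_var = ((n_bloomers - 1) * sd_bloomers ** 2
              + (n_control - 1) * sd_control ** 2) / df

# Standard error of the difference in means.
se_diff = math.sqrt(pooled_var * (1 / n_bloomers + 1 / n_control))

# Critical value for 95% area with 88 df, as quoted on the slide.
t_crit = 1.99

diff = mean_bloomers - mean_control
ci_low = diff - t_crit * se_diff
ci_high = diff + t_crit * se_diff

print(f"df = {df}, SE = {se_diff:.2f}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

Under these assumptions the standard error rounds to .64 and the interval to (2.7, 5.3), matching the slide.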

Page 5: The Paired T-test

Page 6: The Paired T-test

Paired data means you’ve measured the same person at different time points or measured pairs of people who are related (husbands and wives, siblings, controls pair-matched to cases, etc.).

For example, to evaluate whether an observed change in mean (before vs. after) represents a true improvement (or decrease):

Null hypothesis: difference (after-before)=0

Page 7: The differences are treated like a single random variable

Xi    Yi    Di = Xi - Yi
X1    Y1    D1
X2    Y2    D2
X3    Y3    D3
X4    Y4    D4
…     …     …
Xn    Yn    Dn

$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i$$

$$S_D^2 = \frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n-1}$$

$$T_{n-1} = \frac{\bar{D} - 0}{S_D/\sqrt{n}}$$

Page 8: Example Data

baseline Test2 improvement

10 9 -1

10 12 +2

9 13 +4

8 8 0

12 11 -1

11 12 +1

11 13 +2

7 11 +4

6 8 +2

9 9 0

9 8 -1

10 9 -1

9 9 0

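Is there a significant increase in scores in this group?

Average of differences = 11/13 ≈ +0.85 (rounded to +1 below); sample variance = 3.3; sample SD = 1.82

With the rounded mean: T12 = 1/(1.82/3.6) = 1.98 (with the unrounded mean, T12 = 0.85/(1.82/3.6) ≈ 1.68)

data _null_;
pval = 1-probt(1.98, 12);
put pval;
run;

0.0355517436

Significant for a one-sided test; borderline for two-sided test (based on T = 1.98; for T ≈ 1.68 the one-sided p-value is about .06)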
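As a check on the arithmetic, the paired t-statistic can be recomputed directly from the 13 rows of example data. This is a minimal Python sketch, not the SAS used in these slides; `t_upper_tail` is a hypothetical helper that integrates the t density numerically (Simpson's rule) so that no statistics library is needed.

```python
import math

# The 13 (baseline, Test2) pairs from the example table.
baseline = [10, 10, 9, 8, 12, 11, 11, 7, 6, 9, 9, 10, 9]
test2 = [9, 12, 13, 8, 11, 12, 13, 11, 8, 9, 8, 9, 9]

# Treat the differences as a single random variable.
diffs = [after - before for before, after in zip(baseline, test2)]
n = len(diffs)
mean_d = sum(diffs) / n
var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
sd_d = math.sqrt(var_d)
t_stat = (mean_d - 0) / (sd_d / math.sqrt(n))


def t_upper_tail(t, df, steps=2000):
    """One-sided P(T > t) for t >= 0, via Simpson's rule on the t density."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


p_one_sided = t_upper_tail(t_stat, n - 1)
print(f"mean = {mean_d:.3f}, variance = {var_d:.2f}, SD = {sd_d:.2f}")
print(f"t = {t_stat:.2f} on {n - 1} df, one-sided p = {p_one_sided:.3f}")
```

Working from the unrounded mean of 11/13, the statistic comes out near 1.68 rather than 1.98, so the one-sided p-value sits just above .05; calling `t_upper_tail(1.98, 12)` reproduces the SAS value of about .0356.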

Page 9: Example 2: Did the control group in the Oak School experiment improve at all during the year?

$$t_{71} = \frac{8.2}{2.5/\sqrt{72}} = \frac{8.2}{.29} \approx 28$$

p-value <.0001

Page 10: Confidence interval for annual change in IQ test score

95% CI for the increase: 8.2 ± 2.0(.29) = (7.6 – 8.8)

A t-curve with 71 df has slightly wider cut-offs for 95% area (t=2.0) than a normal curve (Z=1.96).
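The one-sample calculation for the control group follows the same pattern. A short Python sketch, assuming n = 72 (consistent with the 71 df quoted above) and using the slide's critical value of 2.0 rather than computing it:

```python
import math

# Control-group summary statistics from the slides.
n = 72
mean_change = 8.2
sd_change = 2.5

# Standard error of the mean change and the one-sample t-statistic (H0: change = 0).
se = sd_change / math.sqrt(n)
t_stat = (mean_change - 0) / se

# 95% confidence interval, with the critical value quoted on the slide.
t_crit = 2.0
ci_low = mean_change - t_crit * se
ci_high = mean_change + t_crit * se

print(f"SE = {se:.2f}, t = {t_stat:.1f} on {n - 1} df")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```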

Page 11: Summary: parametric tests

                                              | One sample (or paired sample) | Two samples
True standard deviation is known              | One-sample Z-test             | Two-sample Z-test
Standard deviation is estimated by the sample | One-sample t-test             | Two-sample t-test (equal variances are pooled; unequal variances unpooled)

Page 12: Non-parametric tests

Page 13: Non-parametric tests

t-tests require your outcome variable to be normally distributed (or close enough).

Non-parametric tests are based on RANKS instead of means and standard deviations (=“population parameters”).

Page 14: Example: non-parametric tests

10 dieters following the Atkins diet vs. 10 dieters following Jenny Craig

Hypothetical RESULTS:

Atkins group loses an average of 34.5 lbs.

J. Craig group loses an average of 18.5 lbs.

Conclusion: Atkins is better?

Page 15: Example: non-parametric tests

BUT, take a closer look at the individual data…

Atkins, change in weight (lbs): +4, +3, 0, -3, -4, -5, -11, -14, -15, -300

J. Craig, change in weight (lbs): -8, -10, -12, -16, -18, -20, -21, -24, -26, -30

Page 16: Enter data in SAS…

data nonparametric;
input loss diet $;
datalines;
+4 atkins
+3 atkins
0 atkins
-3 atkins
-4 atkins
-5 atkins
-11 atkins
-14 atkins
-15 atkins
-300 atkins
-8 jenny
-10 jenny
-12 jenny
-16 jenny
-18 jenny
-20 jenny
-21 jenny
-24 jenny
-26 jenny
-30 jenny
;
run;

Page 17: Jenny Craig

[Figure: distribution of weight change in the Jenny Craig group; x-axis: Weight Change (-30 to 20), y-axis: Percent (0 to 30).]

Page 18: Atkins

[Figure: distribution of weight change in the Atkins group; x-axis: Weight Change (-300 to 20), y-axis: Percent (0 to 30).]

Page 19: t-test doesn’t work…

Comparing the mean weight loss of the two groups is not appropriate here.

The data do not appear to be normally distributed.

Moreover, there is an extreme outlier (this outlier influences the mean a great deal).
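To see how much the single outlier drives the Atkins mean, compare means and medians for the two groups. This is an illustrative Python sketch using the hypothetical diet data from these slides:

```python
from statistics import mean, median

# Change in weight (lbs) for each dieter, as listed on the earlier slide.
atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
jenny = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]

print("Atkins   mean:", mean(atkins), " median:", median(atkins))
print("J. Craig mean:", mean(jenny), " median:", median(jenny))

# Dropping the -300 observation shows how strongly it pulls the Atkins mean.
atkins_no_outlier = [x for x in atkins if x != -300]
print("Atkins mean without the outlier:", mean(atkins_no_outlier))
```

The means favor Atkins (-34.5 vs. -18.5), but the medians (-4.5 vs. -19) and the Atkins mean without the outlier (-5) point the other way, which is why a rank-based comparison is more appropriate here.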

Page 20: Statistical tests to compare ranks:

The Wilcoxon rank-sum test (equivalent to the Mann-Whitney U test) is the analogue of the two-sample t-test.

The Wilcoxon signed-rank test is the analogue of the one-sample t-test, usually used for paired data.

Page 21: Wilcoxon rank-sum test

RANK the values, 1 being the least weight loss and 20 being the most weight loss.

Atkins: +4, +3, 0, -3, -4, -5, -11, -14, -15, -300
Ranks: 1, 2, 3, 4, 5, 6, 9, 11, 12, 20

J. Craig: -8, -10, -12, -16, -18, -20, -21, -24, -26, -30
Ranks: 7, 8, 10, 13, 14, 15, 16, 17, 18, 19

Page 22: Wilcoxon “rank-sum” test

Sum of Atkins ranks: 1 + 2 + 3 + 4 + 5 + 6 + 9 + 11 + 12 + 20 = 73

Sum of Jenny Craig’s ranks: 7 + 8 + 10 + 13 + 14 + 15 + 16 + 17 + 18 + 19 = 137

Jenny Craig clearly ranked higher! P-value* (from computer) = .017

– from ttest, p-value = .60
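The rank sums, and a p-value close to the one reported, can be reproduced by hand. The Python sketch below assumes a normal approximation with continuity correction for the p-value; that choice is an assumption, not necessarily the exact method the software used, and the simple ranking step relies on there being no tied values across the two groups.

```python
import math

atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
jenny = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]

# Rank 1 = least weight loss (largest change), rank 20 = most weight loss.
# All 20 values are distinct, so a plain lookup table is enough here.
pooled = sorted(atkins + jenny, reverse=True)
rank = {value: position + 1 for position, value in enumerate(pooled)}

w_atkins = sum(rank[v] for v in atkins)
w_jenny = sum(rank[v] for v in jenny)

# Normal approximation to the null distribution of the Atkins rank sum.
n1, n2 = len(atkins), len(jenny)
mu = n1 * (n1 + n2 + 1) / 2
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (abs(w_atkins - mu) - 0.5) / sigma      # 0.5 = continuity correction
p_two_sided = math.erfc(z / math.sqrt(2))   # equals 2 * (1 - Phi(z))

print(f"rank sums: Atkins = {w_atkins}, J. Craig = {w_jenny}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```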

Page 23: *Tests in SAS…

/*to get wilcoxon rank-sum test*/
proc npar1way wilcoxon data=nonparametric;
class diet;
var loss;
run;

/*To get ttest*/
proc ttest data=nonparametric;
class diet;
var loss;
run;

Page 24: Wilcoxon “signed-rank” test

H0: median weight loss in Atkins group = 0

Ha: median weight loss in Atkins group not 0

Atkins: +4, +3, 0, -3, -4, -5, -11, -14, -15, -300

Rank the absolute values of the differences from smallest to largest (ignore zeroes; tied values share the average rank):

Ordered absolute values: 3, 3, 4, 4, 5, 11, 14, 15, 300
Ranks: 1.5, 1.5, 3.5, 3.5, 5, 6, 7, 8, 9

Sum of negative ranks: 1.5+3.5+5+6+7+8+9 = 40
Sum of positive ranks: 1.5+3.5 = 5

P-value* (from computer) = .043; from paired (one-sample) t-test = .27
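With only nine nonzero values, the signed-rank p-value can be computed exactly by enumerating every possible assignment of signs. The Python sketch below assumes the usual convention that rank 1 goes to the smallest absolute value, with tied values sharing the average rank; `midrank` is a hypothetical helper name.

```python
from itertools import product

atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]

# Drop zeroes, then rank absolute values from smallest to largest.
nonzero = [d for d in atkins if d != 0]
abs_sorted = sorted(abs(d) for d in nonzero)


def midrank(value):
    """Average of the rank positions occupied by `value` (handles ties)."""
    first = abs_sorted.index(value) + 1
    last = first + abs_sorted.count(value) - 1
    return (first + last) / 2


ranks = [midrank(abs(d)) for d in nonzero]
w_pos = sum(r for d, r in zip(nonzero, ranks) if d > 0)
w_neg = sum(r for d, r in zip(nonzero, ranks) if d < 0)

# Under H0 each nonzero difference is equally likely to be positive or negative,
# so enumerate all 2^9 sign patterns and count those at least as extreme.
observed = min(w_pos, w_neg)
as_extreme = 0
for signs in product([0, 1], repeat=len(ranks)):
    w = sum(r for s, r in zip(signs, ranks) if s)
    if w <= observed:
        as_extreme += 1
p_two_sided = min(1.0, 2 * as_extreme / 2 ** len(ranks))

print(f"sum of positive ranks = {w_pos}, sum of negative ranks = {w_neg}")
print(f"exact two-sided p = {p_two_sided:.3f}")
```

With this convention the rank sums are 5 (positive) and 40 (negative), and the exact two-sided p-value is 22/512 ≈ .043, in agreement with the computer result quoted above.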

Page 25: *Tests in SAS…

/*to get one-sample tests (both student’s t and signed-rank)*/

proc univariate data=nonparametric;

var loss;

where diet="atkins";

run;

Page 26: What if data were paired?

e.g., one-to-one matching; find pairs of study participants who have the same age, gender, socioeconomic status, degree of overweight, etc.

Atkins: +4, +3, 0, -3, -4, -5, -11, -14, -15, -300

J. Craig: -8, -10, -12, -16, -18, -20, -21, -24, -26, -30

Page 27: Enter data differently in SAS…

10 pairs, rather than 20 individual observations

data paired;
input lossa lossj;
diff=lossa-lossj;
datalines;
+4 -8
+3 -10
0 -12
-3 -16
-4 -18
-5 -20
-11 -21
-14 -24
-15 -26
-300 -30
;
run;

Page 28: *Tests in SAS…

/*to get all paired tests*/
proc univariate data=paired;
var diff;
run;

/*To get just paired ttest*/
proc ttest data=paired;
var diff;
run;

/*To get paired ttest, alternatively*/
proc ttest data=paired;
paired lossa*lossj;
run;

Page 29: ANOVA for comparing means between more than 2 groups

Page 30: ANOVA (ANalysis Of VAriance)

Idea: For two or more groups, test difference between means, for quantitative normally distributed variables.

Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).

Like the t-test, ANOVA is a “parametric” test: it assumes that the outcome variable is roughly normally distributed.

Page 31: The “F-test”

$$F = \frac{\text{Variability between groups}}{\text{Variability within groups}}$$

Is the difference in the means of the groups more than background noise (=variability within groups)?

Page 32: Spine bone density vs. menstrual regularity

[Figure: spine bone density (SPINE, axis from 0.7 to 1.2) for the amenorrheic, oligomenorrheic, and eumenorrheic groups, annotated to show the between-group variation and the within-group variability of each group.]

Page 33: Group means and standard deviations

Amenorrheic group (n=11):
– Mean spine BMD = .92 g/cm²
– standard deviation = .10 g/cm²

Oligomenorrheic group (n=11):
– Mean spine BMD = .94 g/cm²
– standard deviation = .08 g/cm²

Eumenorrheic group (n=11):
– Mean spine BMD = 1.06 g/cm²
– standard deviation = .11 g/cm²

Page 34: The F-Test

$$s^2_{between} = n\,s^2_{\bar{x}} = 11 \times \frac{(.92-.97)^2 + (.94-.97)^2 + (1.06-.97)^2}{3-1} = .063$$

Between-group variation. The factor 11 is the size of the groups; each squared term is the difference of a group’s mean from the overall mean.

$$s^2_{within} = \text{avg } s^2 = \frac{1}{3}\left(.10^2 + .08^2 + .11^2\right) = .0095$$

The average amount of variation within groups. Each squared term is a group’s variance.

$$F_{2,30} = \frac{s^2_{between}}{s^2_{within}} = \frac{.063}{.0095} = 6.6$$

Large F value indicates that the between group variation exceeds the within group variation (=the background noise).
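The F-statistic above can be rebuilt from the three group means and standard deviations alone. The Python sketch below assumes equal group sizes (n = 11 each); the closed-form tail probability it uses holds only because the numerator has 2 degrees of freedom, and is included as an illustration rather than a general F-distribution routine.

```python
# Summary statistics for the three groups (spine BMD, g/cm^2).
means = [0.92, 0.94, 1.06]
sds = [0.10, 0.08, 0.11]
n = 11           # observations per group (equal group sizes assumed)
k = len(means)   # number of groups

# Between-group variance: n times the variance of the group means.
grand_mean = sum(means) / k
s2_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)

# Within-group variance: the average of the group variances.
s2_within = sum(sd ** 2 for sd in sds) / k

f_stat = s2_between / s2_within
df1, df2 = k - 1, k * (n - 1)

# Upper-tail probability of F(2, df2); this closed form is valid only for df1 = 2.
p_value = (1 + df1 * f_stat / df2) ** (-df2 / 2)

print(f"s2_between = {s2_between:.3f}, s2_within = {s2_within:.4f}")
print(f"F({df1},{df2}) = {f_stat:.1f}, p = {p_value:.4f}")
```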

Page 35: The F-distribution

The F-distribution is a continuous probability distribution that depends on two parameters n and m (numerator and denominator degrees of freedom, respectively):

Page 36: The F-distribution

A ratio of sample variances follows an F-distribution:

$$\frac{\sigma^2_{between}}{\sigma^2_{within}} \sim F_{n,m}$$

The F-test tests the hypothesis that two sample variances are equal. F will be close to 1 if sample variances are equal.

$$H_0: \sigma^2_{between} = \sigma^2_{within}$$

$$H_a: \sigma^2_{between} \neq \sigma^2_{within}$$

Page 37: ANOVA Table

Source of variation | d.f. | Sum of squares | Mean Sum of Squares | F-statistic | p-value
Between (k groups) | k-1 | SSB (sum of squared deviations of group means from grand mean) | SSB/(k-1) | [SSB/(k-1)] / [SSW/(nk-k)] | Go to F(k-1, nk-k) chart
Within (n individuals per group) | nk-k | SSW (sum of squared deviations of observations from their group mean) | s^2 = SSW/(nk-k) | |
Total variation | nk-1 | TSS (sum of squared deviations of observations from grand mean) | | |

TSS = SSB + SSW

Page 38: ANOVA=t-test

Source of variation | d.f. | Sum of squares | Mean Sum of Squares | F-statistic | p-value
Between (2 groups) | 1 | SSB (squared difference in means) | Squared difference in means | see equation below | Go to F(1, 2n-2) chart; notice values are just (t(2n-2))^2
Within | 2n-2 | SSW (equivalent to numerator of pooled variance) | Pooled variance | |
Total variation | 2n-1 | TSS | | |

F-statistic:

$$F_{1,2n-2} = \frac{(\bar{X}-\bar{Y})^2}{s_p^2 \cdot \frac{2}{n}} = \left(\frac{\bar{X}-\bar{Y}}{s_p\sqrt{2/n}}\right)^2 = (t_{2n-2})^2$$
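The equivalence can be checked numerically. The Python sketch below reuses the hypothetical Atkins and Jenny Craig data from the non-parametric example (chosen only because it is already in these slides), builds the sums of squares for a two-group ANOVA, and compares F with the square of the pooled two-sample t-statistic.

```python
import math

atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
jenny = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]
n = len(atkins)  # both groups have n observations

mean_a = sum(atkins) / n
mean_j = sum(jenny) / n
grand_mean = (sum(atkins) + sum(jenny)) / (2 * n)

# Sums of squares for the ANOVA table.
ssb = n * (mean_a - grand_mean) ** 2 + n * (mean_j - grand_mean) ** 2
ssw = (sum((x - mean_a) ** 2 for x in atkins)
       + sum((y - mean_j) ** 2 for y in jenny))
tss = sum((v - grand_mean) ** 2 for v in atkins + jenny)

# F-statistic: between mean square (1 df) over within mean square (2n-2 df).
pooled_var = ssw / (2 * n - 2)
f_stat = (ssb / 1) / pooled_var

# Pooled two-sample t-statistic on the same data.
t_stat = (mean_a - mean_j) / math.sqrt(pooled_var * (2 / n))

print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}, TSS = {tss:.1f}")
print(f"F = {f_stat:.4f}, t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```

The printed sums of squares satisfy TSS = SSB + SSW, and F matches t² up to floating-point error.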

Page 39: ANOVA summary

A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.

Determining which groups differ (when it’s unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…

Page 40: Question: Why not just do 3 pairwise ttests?

Answer: because, at an error rate of 5% each test, this means you have an overall chance of up to 1-(.95)^3 = 14% of making a type-I error (if all 3 comparisons were independent).

If you wanted to compare 6 groups, you’d have to do 6C2 = 15 pairwise ttests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type-I error rate of 5% each); probability of at least one type-I error = 1-(.95)^15 = 54%.

Page 41: Multiple comparisons

With 18 independent comparisons, we have 60% chance of at least 1 false positive.

Page 42: Multiple comparisons

With 18 independent comparisons, we expect about 1 false positive.
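The percentages quoted for 3, 15, and 18 comparisons all come from the same formula. A small Python sketch, assuming independent tests at a 5% type-I error rate each (`familywise_error` is a hypothetical helper name):

```python
from math import comb


def familywise_error(m, alpha=0.05):
    """Chance of at least one type-I error across m independent tests."""
    return 1 - (1 - alpha) ** m


# Number of pairwise comparisons among 3 and 6 groups.
for groups in (3, 6):
    m = comb(groups, 2)
    print(f"{groups} groups -> {m} pairwise tests, "
          f"P(at least one false positive) = {familywise_error(m):.0%}")

# The 18-comparison case: chance of any false positive, and expected count.
m = 18
print(f"{m} tests: P(at least one) = {familywise_error(m):.0%}, "
      f"expected false positives = {m * 0.05:.1f}")
```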

Page 43: Correction for multiple comparisons

How to correct for multiple comparisons post-hoc…

Bonferroni’s correction (adjusts p by most conservative amount, assuming all tests independent)

Holm/Hochberg (gives p-cutoff beyond which not significant)

Tukey’s (adjusts p)

Scheffe’s (adjusts p)
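To make the first two corrections concrete, here is a minimal Python sketch of Bonferroni and of Holm's step-down procedure (Hochberg's step-up variant, Tukey's, and Scheffe's are not shown). The three p-values are hypothetical, invented purely for illustration, and the function names are not from any library.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: the i-th smallest p-value is compared with alpha / (m - i)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # once one test fails, every larger p-value is also retained
    return reject


# Hypothetical p-values for 3 pairwise comparisons (not from the slides).
p_values = [0.001, 0.03, 0.02]

print("Bonferroni:", bonferroni_reject(p_values))
print("Holm:      ", holm_reject(p_values))
```

With these made-up inputs Bonferroni rejects only the first comparison, while Holm rejects all three, which illustrates why Bonferroni is described as the most conservative adjustment.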

Page 44: Non-parametric ANOVA

Kruskal-Wallis one-way ANOVA

Extension of the Wilcoxon rank-sum test to more than 2 groups; based on ranks

Proc NPAR1WAY in SAS

Page 45: Reading for this week

Chapters 4-5, 12-13 (last week)

Chapters 6-8, 10, 14 (this week)