Visual EDF Software to Check the Normality Assumption

Dr. Maria E. Calzada

Dr. Stephen M. Scariano

Loyola University New Orleans

Calzada@loyno.edu

http://www.loyno.edu/~calzada


Problem 7.11 of Moore’s The Basic Practice of Statistics reads (paraphrased):

Our subjects are 11 people diagnosed as being dependent on caffeine. Each subject was barred from coffee, colas, and other substances containing caffeine. Instead, they took capsules containing their normal caffeine intake. During a different time period, they took placebo capsules… Table 7.3 contains data on two of several tests given to the subjects. “Depression” is the score on the Beck Depression Inventory. Higher scores show more symptoms of depression. “Beats” is the beats per minute the subject achieved when asked to press a button 200 times as quickly as possible. We are interested in whether being deprived of caffeine affects these outcomes.


Subject  Depression  Depression    Difference       Beats      Beats  Difference
         (Caffeine)   (Placebo)  (Depression)  (Caffeine)  (Placebo)     (Beats)
      1           5          16           -11         281        201          80
      2           5          23           -18         284        262          22
      3           4           5            -1         300        283          17
      4           3           7            -4         421        290         131
      5           8          14            -6         240        259         -19
      6           5          24           -19         294        291           3
      7           0           6            -6         377        354          23
      8           0           3            -3         345        346          -1
      9           2          15           -13         303        283          20
     10          11          12            -1         340        391         -51
     11           1           0             1         408        411          -3
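The difference columns and the summary statistics quoted on the SPSS histograms can be reproduced from the raw scores. The following is a minimal sketch, assuming each difference is Caffeine minus Placebo (which matches every row of the table); the variable and function names are illustrative and are not part of the authors' software.

```python
from statistics import mean, stdev

# Table 7.3 as transcribed on the slide: (caffeine, placebo) per subject.
depression = [(5, 16), (5, 23), (4, 5), (3, 7), (8, 14), (5, 24),
              (0, 6), (0, 3), (2, 15), (11, 12), (1, 0)]
beats = [(281, 201), (284, 262), (300, 283), (421, 290), (240, 259),
         (294, 291), (377, 354), (345, 346), (303, 283), (340, 391),
         (408, 411)]


def differences(pairs):
    """Return caffeine minus placebo for each subject."""
    return [caffeine - placebo for caffeine, placebo in pairs]


diff_dep = differences(depression)
diff_beat = differences(beats)

# Sample mean and sample standard deviation (n - 1 divisor) of each column.
for name, d in (("Depression", diff_dep), ("Beats", diff_beat)):
    print(f"{name}: n={len(d)}, mean={mean(d):.1f}, sd={stdev(d):.2f}")
```

The two printed lines should agree with the Mean, Std. Dev, and N values reported under the histograms of DIFFDEP and DIFFBEAT.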


Does this matched pairs study give evidence that being deprived of caffeine raises depression scores? Check that the differences are not strikingly nonnormal.

Now check the differences in beats per minute with and without caffeine. You should hesitate to use the t procedures on these data. Why?
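For reference, the matched pairs t statistic that both questions rely on is the mean difference divided by its standard error. The sketch below assumes the differences are Caffeine minus Placebo, as in the table; `paired_t` is a hypothetical helper name. It only computes the statistic. Whether the t procedure can be trusted for a given column still depends on the normality check.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(differences):
    """One-sample t statistic for H0: mean difference = 0, with its df."""
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Differences (Caffeine minus Placebo) copied from the table.
diff_dep = [-11, -18, -1, -4, -6, -19, -6, -3, -13, -1, 1]
diff_beat = [80, 22, 17, 131, -19, 3, 23, -1, 20, -51, -3]

for name, d in (("Depression", diff_dep), ("Beats", diff_beat)):
    t, df = paired_t(d)
    print(f"{name}: t = {t:.2f} on {df} degrees of freedom")
```

A negative t for the depression differences points in the direction of higher depression scores under placebo. For comparison, the one-sided 5% critical value of the t distribution with 10 degrees of freedom is 1.812.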


Histogram of the difference in depression scores, as given by SPSS. Are these data normal?

[Figure: SPSS histogram of DIFFDEP. Horizontal axis: DIFFDEP, from -20.0 to 0.0; vertical axis: Frequency. Mean = -7.4, Std. Dev = 6.92, N = 11.]


[Figure: SPSS histogram of DIFFBEAT. Horizontal axis: DIFFBEAT, from -50.0 to 125.0; vertical axis: Frequency. Mean = 20.2, Std. Dev = 48.75, N = 11.]

Histogram of the difference in Beats score. Are these data normal?


[Figure: SPSS Normal P-P Plot of DIFFDEP. Horizontal axis: Observed Cum Prob; vertical axis: Expected Cum Prob; both run from 0.00 to 1.00.]

Normal probability plot for difference in depression data


[Figure: SPSS Normal P-P Plot of DIFFBEAT. Horizontal axis: Observed Cum Prob; vertical axis: Expected Cum Prob; both run from 0.00 to 1.00.]

Normal probability plot for difference in beats data
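The points behind a normal P-P plot pair each observation's observed cumulative probability with the cumulative probability expected under a fitted normal distribution. The sketch below makes two assumptions that are not stated on the slides: the observed probabilities use Blom's plotting position (rank - 3/8)/(n + 1/4), and tied values are not given averaged ranks. `pp_points` is a hypothetical name.

```python
from statistics import NormalDist, mean, stdev


def pp_points(data):
    """Return (observed, expected) cumulative-probability pairs."""
    n = len(data)
    # Normal distribution fitted with the sample mean and standard deviation.
    fitted = NormalDist(mean(data), stdev(data))
    points = []
    for rank, value in enumerate(sorted(data), start=1):
        observed = (rank - 0.375) / (n + 0.25)  # Blom position (assumed)
        expected = fitted.cdf(value)
        points.append((observed, expected))
    return points


# Beats differences (Caffeine minus Placebo) copied from the table.
diff_beat = [80, 22, 17, 131, -19, 3, 23, -1, 20, -51, -3]

for observed, expected in pp_points(diff_beat):
    print(f"{observed:.3f}  {expected:.3f}")
```

Points that stay close to the diagonal suggest agreement with the normal model; judging how close is "close enough" by eye is the difficulty that the EDF tests address.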


While slight departures from normality are usually inconsequential, substantive departures from normality can seriously impair the validity of statistical procedures framed under the “Normality Assumption.”

The problem is: How can we help our students distinguish between “slight departures” and “substantive departures” from normality?


EDF TESTS FOR NORMALITY

We have written a Visual Basic program that implements the Lilliefors test and the Anderson-Darling test for normality.

We like the visual aspects of the Lilliefors test.

We like the power of the Anderson-Darling test.


Lilliefors Test

We compute the distances between the data’s empirical distribution function $F_n(x)$, taken after standardizing the data with the sample mean and standard deviation, and the cumulative standard normal distribution $\Phi(x)$.

If these distances are “not too large” we do not have evidence to reject the “Normality Assumption.”

Upper-$\alpha$ percentiles for the largest distance have been tabulated using Monte Carlo simulations.
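The largest distance described above can be computed directly. This is a minimal sketch, not the authors' Visual Basic code: it assumes the data are standardized with the sample mean and the sample standard deviation (n - 1 divisor), and `lilliefors_statistic` is a hypothetical name. No critical values are included; the result still has to be compared with the tabulated percentile for the sample size.

```python
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(data):
    """Largest vertical gap between the EDF of standardized data and Phi."""
    n = len(data)
    m, s = mean(data), stdev(data)
    phi = NormalDist().cdf
    z = sorted((x - m) / s for x in data)
    largest = 0.0
    for i, zi in enumerate(z, start=1):
        # The EDF jumps from (i - 1)/n to i/n at zi, so the gap to Phi
        # must be checked on both sides of the step.
        largest = max(largest, i / n - phi(zi), phi(zi) - (i - 1) / n)
    return largest


# Differences (Caffeine minus Placebo) copied from the table.
diff_dep = [-11, -18, -1, -4, -6, -19, -6, -3, -13, -1, 1]
diff_beat = [80, 22, 17, 131, -19, 3, 23, -1, 20, -51, -3]

print(f"Depression: D = {lilliefors_statistic(diff_dep):.3f}")
print(f"Beats:      D = {lilliefors_statistic(diff_beat):.3f}")
```

Because the data are standardized first, the statistic does not change when the measurements are shifted or rescaled.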


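Anderson-Darling Test

Based on the accumulated squared distance function

$$D = n \int_{-\infty}^{\infty} \left[ F_n(x) - \Phi(x) \right]^2 \psi(x) \, dF(x),$$

where $F$ is the hypothesized distribution function (here $F = \Phi$).

Usually

$$\psi(x) = \left[ \Phi(x) \left( 1 - \Phi(x) \right) \right]^{-1}.$$

Our program implements the discrete Anderson-Darling test statistic developed by Stephens (1974) to approximate $D$.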
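With the usual weight function, the integral is commonly evaluated through the sum $A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \ln \Phi(z_{(i)}) + \ln \left( 1 - \Phi(z_{(n+1-i)}) \right) \right]$, where $z_{(1)} \le \dots \le z_{(n)}$ are the ordered standardized observations. The sketch below assumes this is the discrete form the program uses; the slides do not show it. The name `anderson_darling` is hypothetical, and no small-sample adjustment or critical values are included.

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling(data):
    """Anderson-Darling A^2 for normality with estimated mean and sd."""
    n = len(data)
    m, s = mean(data), stdev(data)
    phi = NormalDist().cdf
    # u[i - 1] is Phi evaluated at the i-th smallest standardized value.
    u = [phi((x - m) / s) for x in sorted(data)]
    total = sum((2 * i - 1) * (log(u[i - 1]) + log(1 - u[n - i]))
                for i in range(1, n + 1))
    return -n - total / n


# Differences (Caffeine minus Placebo) copied from the table.
diff_dep = [-11, -18, -1, -4, -6, -19, -6, -3, -13, -1, 1]
diff_beat = [80, 22, 17, 131, -19, 3, 23, -1, 20, -51, -3]

print(f"Depression: A^2 = {anderson_darling(diff_dep):.3f}")
print(f"Beats:      A^2 = {anderson_darling(diff_beat):.3f}")
```

The weight $\psi(x)$ grows large in both tails, so discrepancies at the extremes of the sample contribute more to $A^2$ than they do to the Lilliefors distance.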


Conclusions

The Anderson-Darling and Lilliefors tests are important tools that should be routinely used alongside probability plots to check the normality assumption.

The stand-alone desktop program enhances the craft of exploratory data analysis.

The program is available for download at http://www.loyno.edu/~calzada


Some References

David S. Moore (2000). The Basic Practice of Statistics, Second Edition. Freeman.

H. Lilliefors (1967). “On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown.” Journal of the American Statistical Association 62, pp. 399-402.

M. A. Stephens (1974). “EDF Statistics for Goodness of Fit and Some Comparisons.” Journal of the American Statistical Association 69, pp. 730-737.