new error bars; power and sample size - university of iowa · 2020. 4. 8. · presenting results...

29
Presenting results with error bars Power and sample size Summary Error bars; Power and sample size Patrick Breheny April 2 Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 1 / 29

Upload: others

Post on 19-Apr-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Error bars; Power and sample size

Patrick Breheny

April 2

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 1 / 29

Page 2: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Introduction

In this lecture, we’ll discuss two topics that come up in one-samplestudies (as well as other types of studies):• One is the presentation of results, and specifically, thedecision when making figures and tables to include error barsand whether the error bar should be based on the standarddeviation or the standard error• The other is the issue of planning a study, and determiningbeforehand the power of the study for a given sample size

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 2 / 29

Page 3: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

A dynamite plot

Figures like the following areincredibly common:

Far Near

IQ

0

20

40

60

80

100

• There are a number ofpossible variations, but thegeneral idea is to have a baror a dot or a line that showsthe mean, then error barsthat show . . . something• These particular error barsare one SD, so theyillustrate something aboutvariability of individuals’ IQsin the two samples

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 3 / 29

Page 4: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Box plot

Far Near

60

80

100

120

140

Smelter

IQ

• Keep in mind, however, thatthe “dynamite” plot is justshowing two numbers, themean and SD• This has the potential tomisrepresent the actualdistribution of data, andmore informative sorts ofplots, such as the box plot,are alternatives worthconsidering

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 4 / 29

Page 5: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Strip plot

●●

●●

●●

●●

● ●

●●

● ●

●●

●●

●●

● ●

●●

● ●

50

75

100

125

Far NearSmelter

IQ

• We could also just show allthe data• This type of plot is knownas a strip plot

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 5 / 29

Page 6: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

±1 SD again

75

80

85

90

95

100

105

110

IQ

Far Near

• This is the same as the firstplot, although with dotsreplacing the bars; this hasthe advantage that the bardoesn’t block the lower errorbar, so we can see the±1 SD region more clearly• Recall that the mean ±1 SDtypically contains about themiddle two-thirds of thedata

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 6 / 29

Page 7: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

±1 SE

75

80

85

90

95

100

105

110

IQ

Far Near

• Here, we’re plotting ±1 SEinstead of ±1 SD• The error bars are a lotsmaller here, of course, sinceSE = SD/

√n, and now say

something about thevariability of the samplemean or, if you prefer, theerror with which we havemeasured the sample mean

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 7 / 29

Page 8: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Error bars: SD vs. SE

• So, the ±SD error bar plot emphasized the variability of data• This doesn’t really have anything to do with any sort of“error” at all, and furthermore, other kinds of plots are usuallybetter for showing the distribution of your data• On the other hand, the ±SE error bar plot emphasizes

uncertainty (or possible error) about the mean• However, this isn’t an ideal plot either, since ±SE is only a68% CI, and really, it isn’t even that, since as we learned inthe previous lecture, the coverage of the CI depends on thedegrees of freedom with which we’ve measured the SD

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 8 / 29

Page 9: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

CI plot

75

80

85

90

95

100

105

110

IQ

Far Near

• So typically, the mostmeaningful route is to plotthe confidence interval• For example, the plot on theleft suggests that it isn’tunreasonable to think thatthe average IQ for bothgroups is 90

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 9 / 29

Page 10: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Variability and uncertainty are both of interest

• It should be noted, however, that the confidence interval plot,although very useful and informative, doesn’t really tell usanything about the distribution of the data, which is alsointeresting• In principle, this is the choice one makes in deciding on afigure, whether to emphasize uncertainty about the average orto emphasize variability/diversity, although there are ways toillustrate both in a single figure

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 10 / 29

Page 11: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Illustrating both variability and uncertainty

60

80

100

120

140

IQ

●●

●●

●●

●●

●●

●●

●●

Far Near

In this plot, the shaded regiondenotes the 95% confidenceinterval, and we can get a senseof the variability amongindividuals as well as theuncertainty about the populationmeans

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 11 / 29

Page 12: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

±1 SD for NHANES height

62

64

66

68

70

72

Hei

ght

Male Female

Just to make sure the point isclear, let’s look at the NHANESheight data; here are the means±SD

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 12 / 29

Page 13: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

CI plot for NHANES height

62

64

66

68

70

72

Hei

ght

Male Female

And here the error bars denotethe 95% confidence interval;both this plot and the previousone tell us different things• It is obvious that men are,on average, taller thanwomen• At the same time, it is justas obvious that there isplenty of overlap in thedistributions – i.e., that it isquite common for a womanto be taller than a man

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 13 / 29

Page 14: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

NHANES height data

55

60

65

70

75

80

Hei

ght

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

Male Female

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 14 / 29

Page 15: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Cystic FibrosisOne last comment: plottingseparate confidence intervals andlooking for overlap is not thesame as plotting a confidenceinterval for the difference

100

200

300

400

FV

C R

educ

tion

Drug Placebo

The disparity is particularlydrastic for paired studies, but alsooccurs for two-sample studies, aswe will see in a week or two

−50 0 50 100 150 200 250 300

Difference in FVC reduction

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 15 / 29

Page 16: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Planning a study

• One of the most important questions as far as planning andbudgeting a study is concerned is: how many subjects do Ineed?• The number of subjects tends to play a very large role indetermining the cost of a study, so funding agencies generallywant to know the number of subjects that a study will requirebefore they make a decision about whether or not to pay for it• But of course, the fewer subjects you have, the harder it is todistinguish a real phenomenon from chance

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 16 / 29

Page 17: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power

• The probability that you will successfully distinguish the realphenomenon from chance (i.e., reject the null hypothesis) iscaptured in the notion of power• Power is the opposite of the type II error we discussed back atthe beginning of the semester• Power is the probability of rejecting the null hypothesis giventhat it is in reality false; the type II error rate (β) was theprobability of failing to reject the null hypothesis given that itwas false• Thus, by the compliment rule:

Power = 1− β

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 17 / 29

Page 18: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Two important questions

• With the time remaining in today’s lecture, I want to addresstwo highly related questions:◦ If I plan a study with a certain number of subjects, what is my

power going to be?◦ If I want to achieve a certain power, how large does my sample

size need to be?

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 18 / 29

Page 19: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Two important questions (cont’d)

• These are important questions for any kind of data, and eachtype of study has its own formulas and procedures forcalculating power• We won’t get into the details of power calculations in thisclass (which can be quite complicated)• Instead, we will focus on the main concepts, which aregenerally similar for any type of study

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 19 / 29

Page 20: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power: Basement analogy

• Consider the following analogy1: you send a child into thebasement to find an object• What’s the probability that she actually finds it?• This depends on three things:

◦ How long does she spend looking?◦ How big is the object?◦ How messy is the basement?

1This analogy comes from Intuitive Biostatistics, which in turn credits JohnHartung for the original idea

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 20 / 29

Page 21: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power: Basement analogy (cont’d)

• If the child spends a long time looking for a large object in aclean, organized basement, then she’ll probably find it• If the child spends a short time looking for a small object in amessy basement, then there’s a good chance she won’t find it• All three of these questions have statistical analogs:

◦ How long does she spend looking? = How big was the samplesize?

◦ How big is the object? = How large is the effect size?◦ How messy is the basement? = How noisy/variable is the

data?

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 21 / 29

Page 22: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Specifying effect size and variability

• In general, one does not know the effect size or the variability– especially before we have conducted the study• So, in order to calculate power, we are going to essentiallymake up values for these quantities, and our calculated powerwill depend on the values that we choose• Of course, if we specify values that are far away from reality,our power calculations are not going to be accurate• Sometimes, reasonable values for certain quantities can bechosen on the basis of past studies or observations• Other times, a small initial study called a “pilot study” isconducted in order to provide some data with which toestimate these quantities and help plan for a larger study thatwould take place in the future.

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 22 / 29

Page 23: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Example

• Suppose we develop some intervention that we think canreduce the LDL cholesterol levels of individuals whoparticipate in it• Suppose we plan to conduct a study in which individuals tryboth the intervention and a control, and we are going to lookat the difference in each individual’s LDL cholesterol levels onand off the intervention• The power of our study (the probability that we will get ap-value under .05) depends on:◦ The sample size (how many people we enroll in our study)◦ The variability (how much variability there is in a person’s LDL

cholesterol levels)◦ The effect size (the amount by which our intervention actually

reduces cholesterol)

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 23 / 29

Page 24: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power curves

• The main concepts behind power and sample size can beillustrated in power curves – graphs of what happens to poweras we change one of sample size/variability/effect size• In the curves that follow, I will start with:

◦ A sample size of 100◦ A variability of 36 mg/dL◦ An effect size of 5 mg/dL

• And we will see what happens to power as we vary each ofthem

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 24 / 29

Page 25: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power curve #1

−20 −10 0 10 20

0.0

0.2

0.4

0.6

0.8

1.0

True effect

Pow

er

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 25 / 29

Page 26: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power curve #2

10 20 30 40 50 60

0.0

0.2

0.4

0.6

0.8

1.0

True SD

Pow

er

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 26 / 29

Page 27: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Power curve #3

0 200 400 600 800 1000

0.0

0.2

0.4

0.6

0.8

1.0

Sample size

Pow

er

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 27 / 29

Page 28: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Sample size determination

• Thus, we can determine our sample size by looking at thepower curve• For instance, in the previous example, if we want a power of80%, we would need a sample size of about 400• In reality, of course, lots of other things like money, time,resources, availability of subjects, etc., influence the actualsample size of a study• Also, we may be interested in calculating the required samplesize under a few different designs to see which way is theeasiest/cheapest to conduct the study

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 28 / 29

Page 29: New Error bars; Power and sample size - University of Iowa · 2020. 4. 8. · Presenting results with error bars Power and sample size Summary Errorbars: SD vs. SE •So,the±SD errorbarplotemphasizedthevariability

Presenting results with error barsPower and sample size

Summary

Summary

• Displaying ±SD in a figure emphasizes variability, whiledisplaying ±SE emphasizes uncertainty, although ask yourself:Wouldn’t I be better off plotting the confidence interval?• The power of a study depends on:

◦ Sample size◦ Variability◦ Effect size

Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 29 / 29