RIMI Workshop: Power Analysis Ronald D. Yockey [email protected]

Post on 22-Feb-2016

Page 1: RIMI Workshop: Power Analysis

RIMI Workshop: Power Analysis

Ronald D. Yockey [email protected]

Page 2: RIMI Workshop: Power Analysis

Goals of the Power Analysis Workshop

1. Understand what power is and why power analyses are important in conducting research.

2. Recognize the limits of Null Hypothesis Significance Testing (NHST) and how effect sizes complement NHST.

3. Understand the relationship between power, effect size, and sample size.

4. Use GPower to estimate the sample size (N) required to obtain a desired level of power (e.g., 80%) for a number of statistical procedures.

5. Provide an estimate of power for your grant proposals!

Page 3: RIMI Workshop: Power Analysis

Null Hypothesis Significance Testing (NHST)

$H_0: \mu_{\text{new drug}} = \mu_{\text{old drug}}$

$H_1: \mu_{\text{new drug}} \neq \mu_{\text{old drug}}$

Page 4: RIMI Workshop: Power Analysis

What is Power?

Power: the probability of rejecting the null hypothesis (i.e., obtaining significance) when it is false
- Ranges from 0 to 1
- When multiplied by 100%, power is expressed as a percentage

Page 5: RIMI Workshop: Power Analysis

Examples of Power

Example #1
Power = .50 (assuming the null hypothesis is false)
- 50% of the time the null hypothesis will be rejected (i.e., statistical significance will be obtained)
- 50% of the time the null hypothesis will not be rejected (i.e., statistical significance will not be obtained): a Type II Error

Page 6: RIMI Workshop: Power Analysis

Examples of Power (continued)

Example #2
Power = .80 (assuming the null hypothesis is false)
- 80% of the time the null hypothesis will be rejected (i.e., statistical significance will be obtained)
- 20% of the time the null hypothesis will not be rejected (i.e., statistical significance will not be obtained): a Type II Error
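The "80% of the time" reading of power can be illustrated by simulation. The sketch below is not from the workshop; it assumes a two-group study with a known standard deviation of 1, a true difference of half a standard deviation, and 64 participants per group (all hypothetical values), and it counts how often a two-tailed z test at α = .05 rejects the null.

```python
import random
from statistics import NormalDist


def simulated_power(d, n_per_group, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power by repeating a two-group study many times.

    Assumes normally distributed scores with a known SD of 1, so a
    two-tailed z test on the difference between means can be used.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (2 / n_per_group) ** 0.5  # standard error of the mean difference
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(d, 1.0) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            rejections += 1  # the (false) null hypothesis was rejected
    return rejections / n_sims


# Proportion of simulated studies that reach significance (roughly .80 here).
print(simulated_power(d=0.5, n_per_group=64))
```

The remaining studies, about 1 in 5, fail to reject a false null hypothesis, which is the Type II error described above.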

Page 7: RIMI Workshop: Power Analysis

Rationale for Power Analysis

- Work involved in a study - conceiving the idea, literature review, grant proposal submission, running participants, analyzing data, writing the results

- High power = high chance of obtaining significance (supporting the research hypothesis)

- Low power = low chance of obtaining significance

- Neglecting “A priori” Power Analysis frequently results in low power studies

- Power Analysis - Crucial for increasing the probability of getting significant results!

Page 8: RIMI Workshop: Power Analysis

Rationale for Power Analysis (continued)

Low power studies are very common, e.g., Power = .30

- 30% chance of achieving significance (rejecting H0)

Is spending the time and effort to conduct the study (not to mention taxpayers’ money) worth it when there is only a 3 in 10 chance of getting significance?

Recommended power level: 70% to 80% (diminishing returns in the 90%+ range)

Page 9: RIMI Workshop: Power Analysis

Factors That Influence Power

1. Alpha level (α = .05 or .01)
- Larger α = greater power

2. One-tailed vs. two-tailed tests
- One-tailed tests have greater power (for a constant α)
- Two-tailed tests are much more common (a one-tailed test may require justification)

3. The size of the standard deviation (σ)
- Smaller standard deviation = greater power (σ can be very difficult to manipulate)

Page 10: RIMI Workshop: Power Analysis

Factors That Influence Power (continued)

4. Effect size: the size of the “treatment effect” in your study
- Larger effect size = greater power

5. Sample size (N)
- Larger N = greater power (the most commonly manipulated factor for increasing power)
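The direction of each of the five factors can be checked numerically. The following is a minimal sketch, not part of the workshop materials: it uses a normal approximation to the independent samples t test with equal group sizes, and the function name `approx_power` and the baseline values (a 5-point difference, σ = 10, n = 30 per group) are assumptions chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(mean_diff, sigma, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power for comparing two group means (normal approximation)."""
    z = NormalDist()
    # Expected value of the test statistic when the true difference is mean_diff.
    delta = (mean_diff / sigma) * (n_per_group / 2) ** 0.5
    if two_tailed:
        z_crit = z.inv_cdf(1 - alpha / 2)
        return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)
    z_crit = z.inv_cdf(1 - alpha)
    return z.cdf(delta - z_crit)


baseline = approx_power(mean_diff=5, sigma=10, n_per_group=30)
print("baseline          ", round(baseline, 2))
print("1. larger alpha   ", round(approx_power(5, 10, 30, alpha=0.10), 2))
print("2. one-tailed     ", round(approx_power(5, 10, 30, two_tailed=False), 2))
print("3. smaller sigma  ", round(approx_power(5, 5, 30), 2))
print("4. larger effect  ", round(approx_power(10, 10, 30), 2))
print("5. larger N       ", round(approx_power(5, 10, 60), 2))
```

Each of the five variations should print a value above the baseline, matching the direction stated for that factor.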

Page 11: RIMI Workshop: Power Analysis

Examples of Low Power Studies

Very “realistic” low power study examples (for the independent samples t test):

Example #1 (2-tailed, α = .05)
n1 = 30, n2 = 30
Small effect (i.e., a relatively small difference between the groups; characteristic of many studies in the social and behavioral sciences)

Power = 12%!

Page 12: RIMI Workshop: Power Analysis

Examples of Low Power Studies (continued)

Example #2 (2-tailed, α = .05)
n1 = 50, n2 = 50; Small effect
Power = 17%!

Example #3 (2-tailed, α = .05)
n1 = 30, n2 = 30; Medium effect
Power = 48%

All three studies suffer from insufficient power.
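As a rough hand check on these three examples, the power values can be approximated with a normal approximation to the t test. The arithmetic below is a sketch under stated assumptions: "small" and "medium" are taken to mean d = .20 and d = .50 (Cohen's conventions), both groups have the same n, and Φ denotes the standard normal cumulative distribution function. Exact t-based values run slightly lower, which is why Example #3 comes out near .49 here rather than the 48% reported.

```latex
\begin{align*}
\delta &= d\sqrt{n/2}, \qquad
\text{power} \approx \Phi(\delta - 1.96) + \Phi(-\delta - 1.96) \\[4pt]
\text{Example 1: } \delta &= .20\sqrt{30/2} \approx 0.77,
  \quad \text{power} \approx \Phi(-1.19) + \Phi(-2.73) \approx .118 + .003 \approx .12 \\
\text{Example 2: } \delta &= .20\sqrt{50/2} = 1.00,
  \quad \text{power} \approx \Phi(-0.96) + \Phi(-2.96) \approx .169 + .002 \approx .17 \\
\text{Example 3: } \delta &= .50\sqrt{30/2} \approx 1.94,
  \quad \text{power} \approx \Phi(-0.02) + \Phi(-3.90) \approx .490 + .000 \approx .49
\end{align*}
```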

Page 13: RIMI Workshop: Power Analysis

Rationale for Power Analysis (continued)

The prevalence of low power studies is one reason why funding agencies such as NIH and NIMH (among others) often require estimates of power with the submission of a grant proposal.

And that’s why we’re here today!

Page 14: RIMI Workshop: Power Analysis

Null Hypothesis Significance Testing (NHST)

$H_0: \mu_{\text{new drug}} = \mu_{\text{old drug}}$

$H_1: \mu_{\text{new drug}} \neq \mu_{\text{old drug}}$

Page 15: RIMI Workshop: Power Analysis

NHST (Continued)

If statistical significance is obtained (e.g., p < .05), then we can declare that the groups are different.

While a “statistically significant” result with NHST tells us the groups are different, it says nothing about how different they are. Statistical significance means “beyond normal sampling error” or “reliable difference,” but it does not necessarily mean “big difference” or “important.”

Page 16: RIMI Workshop: Power Analysis

NHST (Continued)

- While NHST can be a very useful tool, it has frequently been misused, as far too many researchers have made the mistake of assuming statistical significance means “practical importance.”

- Due to this common misunderstanding, the American Psychological Association (APA) now strongly encourages that effect sizes be presented (alongside the results of significance tests), and many journals require the reporting of effect sizes for manuscript consideration.

Page 17: RIMI Workshop: Power Analysis

What is an Effect Size?

Effect size: indicates the size or degree of the effect of some treatment or phenomenon

Definitions of effect size provided by Cohen (1988, pp. 8-9):

- “The degree to which the phenomenon is present in the population.”

- “The degree to which the null hypothesis is false.”

Page 18: RIMI Workshop: Power Analysis

NHST vs. Effect Size

Cohen’s second definition of effect size (repeated):
- “The degree to which the null hypothesis is false.”

1. NHST: If you reject the null, what do you conclude? The null is false, i.e., Experimental ≠ Control (NHST doesn’t indicate how different the groups are, just that they’re not equal)

2. Effect size: indicates how different the groups are

Page 19: RIMI Workshop: Power Analysis

NHST vs. Effect Size (continued)

Basic Question of Significance Testing (NHST): Is there an effect?
- Yes or No

Basic Question of Effect Sizes: How big is the effect?
- A question of degree

Page 20: RIMI Workshop: Power Analysis

Effect Sizes in Power Analysis

Effect sizes play a fundamental role in power analysis

– To conduct a power analysis, the effect size must be estimated.

(We’ll examine several effect size measures shortly.)

Page 21: RIMI Workshop: Power Analysis

Effect Sizes in Power Analysis (continued)

Different effect sizes are often used for different statistical procedures (t tests, ANOVA, Correlation, etc.)

Page 22: RIMI Workshop: Power Analysis

Effect Sizes – Mean Differences

Effect size of the difference between two means

Example #1: IQ scores: group 1 = 115, group 2 = 105
Effect size = mean group 1 – mean group 2 = 115 – 105 = 10 IQ points

Effect size of 10 IQ points (notice the effect size indicates how different the groups are)

Page 23: RIMI Workshop: Power Analysis

Effect Sizes – Mean Differences (continued)

Example #2: Stress (breathing exercises vs. control)
Breathing exercises = 60, control = 67 (higher scores = greater stress)

Effect size = 60 – 67 = –7; effect size of 7 points

(Often the absolute value for an effect size is reported.)

Page 24: RIMI Workshop: Power Analysis

Effect Sizes – Mean Differences (continued)

Problems with mean difference approach:

1. When different scales are used (with different M and SD) to measure the same construct, the results of different studies cannot be meaningfully compared (comparing apples and oranges).

2. Power analysis requires a standardized or “scale free” measure of effect size.

Page 25: RIMI Workshop: Power Analysis

Standardized Measures of Effect Size

t tests: Cohen’s d
ANOVA: η² (eta-squared) or R²
Correlation: Pearson’s r
Multiple Regression: R²
Chi-Square Test of Independence: Cramer’s Phi

Page 26: RIMI Workshop: Power Analysis

Cohen’s d

Used for all t tests (one sample t, independent samples t, dependent samples t)

A standardized or “scale free” measure of mean differences

Page 27: RIMI Workshop: Power Analysis

Cohen’s d (continued)

$$d = \frac{M_1 - M_2}{s}$$

Where:
$M_1$ = mean for group 1
$M_2$ = mean for group 2
$s$ = average standard deviation of the two groups

Page 28: RIMI Workshop: Power Analysis

Cohen’s d (continued)

Example: Examining the effect of a drug on pain levels
- Pain questionnaire on a 10-50 scale administered to people suffering from back pain (higher score = greater pain)
- Mean pain level: old drug = 25, new drug = 20
- Standard deviation of 10

Page 29: RIMI Workshop: Power Analysis

Cohen’s d (continued)

$$d = \frac{25 - 20}{10} = .5$$

d = .5 (interpret in terms of standard deviation differences, like z-scores)

Those who took the new drug had pain levels that were .5 standard deviations lower than those who took the old drug.
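The pain example can be reproduced in a couple of lines. This is a minimal sketch; the helper name `cohens_d` is an assumption, and the second call simply multiplies every score by 10 (a hypothetical rescaled questionnaire) to show why d is called "scale free."

```python
def cohens_d(mean_1, mean_2, sd):
    """Standardized mean difference: (M1 - M2) / s."""
    return (mean_1 - mean_2) / sd


# Values from the slide: old drug = 25, new drug = 20, standard deviation = 10.
print(cohens_d(25, 20, 10))     # 0.5

# Same data on a questionnaire whose scores are 10 times larger (hypothetical):
# the raw difference grows from 5 to 50 points, but d does not change.
print(cohens_d(250, 200, 100))  # 0.5
```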

Page 30: RIMI Workshop: Power Analysis

Cohen’s conventions for d

Magnitude   d     Interpretation
Small       .20   1/5 of a std. dev. difference
Medium      .50   1/2 of a std. dev. difference
Large       .80   8/10 of a std. dev. difference

Cohen’s standards for small, medium, and large effect sizes for the independent samples t test, one sample t test, and the dependent samples t test.

Page 31: RIMI Workshop: Power Analysis

Power Table – Independent t (abridged)

         Effect Size (d)
Power    .20 – Small    .50 – Medium    .80 – Large
.50      194 (388)      32 (64)         14 (28)
.60      246 (492)      41 (82)         17 (34)
.70      310 (620)      51 (102)        21 (42)
.80      394 (788)      64 (128)        26 (52)

Sample size required per group (with total N listed in parentheses) for a given level of power and effect size for the Independent Samples t test (α = .05, 2-tailed). Note: Assumes equal n per group.
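The per-group sample sizes in the table can be approximated with a standard closed-form formula, n ≈ 2((z₁₋α/₂ + z_power) / d)². The sketch below is an illustration, not the method used to build the table: it relies on a normal approximation, so its answers tend to fall one or two participants short of the exact t-based values, and the function name `n_per_group` is an assumption.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate n per group for a two-tailed independent samples t test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value, e.g. 1.96 for alpha = .05
    z_power = z.inv_cdf(power)          # e.g. 0.84 for power = .80
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


for d in (0.20, 0.50, 0.80):
    print(d, n_per_group(d))
# Normal-approximation results for power = .80: 393, 63, 25
# (the table lists 394, 64, 26)
```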

Page 32: RIMI Workshop: Power Analysis

Cohen’s conventions for Pearson’s r

Magnitude   r
Small       .10
Medium      .30
Large       .50

Cohen’s standards for small, medium, and large effect sizes for the Pearson r correlation coefficient.

Page 33: RIMI Workshop: Power Analysis

Power Table – Pearson’s r (abridged)

         Effect Size (r)
Power    .10 – Small    .30 – Medium    .50 – Large
.50      384            43              15
.60      489            54              19
.70      616            67              23
.80      782            84              29

Sample size (N) required for a given level of power and effect size for the Pearson r correlation coefficient (α = .05, 2-tailed).
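A common shortcut for correlation sample sizes uses Fisher's z transformation: N ≈ ((z₁₋α/₂ + z_power) / atanh(r))² + 3. The sketch below applies that approximation; it is not necessarily how the table was produced, the function name `n_for_correlation` is an assumption, and its results land within about one participant of the tabled values.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate total N to detect a correlation r (two-tailed, Fisher z)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    # atanh(r) is Fisher's z transform; its standard error is 1 / sqrt(N - 3).
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


for r in (0.10, 0.30, 0.50):
    print(r, n_for_correlation(r))
```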

Page 34: RIMI Workshop: Power Analysis

Cohen’s Conventions for Cramer’s Phi/w (Chi-Square)

Magnitude   Phi, w
Small       .10
Medium      .30
Large       .50

Cohen’s standards for small, medium, and large effect sizes for the chi-square test of independence.
Note: Applies only to 2 x k tables, where k ≥ 2.

Page 35: RIMI Workshop: Power Analysis

Power Table – Chi-Square Test of Independence (abridged)

         Effect Size (Phi, w)
Power    .10 – Small    .30 – Medium    .50 – Large
.50      385            43              16
.60      490            55              20
.70      618            69              25
.80      785            88              32

Sample size required for a given level of power and effect size for the chi-square test of independence (α = .05, df = 1, i.e., 2 x 2 table).
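For df = 1 the chi-square statistic is the square of a z statistic, so the required N can be sketched as N ≈ ((z₁₋α/₂ + z_power) / w)². The code below is an illustration under that assumption (it ignores the negligible probability of rejecting in the opposite tail), and the function name `n_for_chi_square_df1` is hypothetical. Under these assumptions it appears to reproduce the tabled values.

```python
from math import ceil
from statistics import NormalDist


def n_for_chi_square_df1(w, power=0.80, alpha=0.05):
    """Approximate total N for a chi-square test with df = 1 (2 x 2 table)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    # The noncentrality parameter is N * w**2, so sqrt(N) * w plays the role
    # of the expected z statistic.
    return ceil(((z_alpha + z_power) / w) ** 2)


for power in (0.50, 0.60, 0.70, 0.80):
    print(power, [n_for_chi_square_df1(w, power) for w in (0.10, 0.30, 0.50)])
```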

Page 36: RIMI Workshop: Power Analysis

Effect Size - ANOVA

$$f = \frac{\sigma_m}{\sigma}, \qquad \sigma_m = \sqrt{\frac{\sum_{i=1}^{k} (m_i - m)^2}{k}}$$

where:

k = the number of groups, $m_i$ = the mean of the ith group, m = the grand (overall) mean, and σ = the average (or pooled) standard deviation.
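The definition of f translates directly into code. This is a minimal sketch: the function name `cohens_f` and the three group means and σ in the example are hypothetical, and the grand mean is taken as the simple average of the group means, which assumes equal group sizes.

```python
from math import sqrt


def cohens_f(group_means, sigma):
    """Cohen's f: SD of the group means divided by the pooled within-group SD."""
    k = len(group_means)
    grand_mean = sum(group_means) / k  # assumes equal n per group
    sigma_m = sqrt(sum((m - grand_mean) ** 2 for m in group_means) / k)
    return sigma_m / sigma


# Hypothetical three-group study: means of 20, 25, and 30 with sigma = 10.
print(round(cohens_f([20, 25, 30], sigma=10), 3))  # 0.408
```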

Page 37: RIMI Workshop: Power Analysis

Effect Size - ANOVA

$$\eta^2 = \frac{SS_{\text{Between}}}{SS_{\text{Total}}} = \frac{\text{Variability due to treatment (i.e., group differences)}}{\text{Total variability (treatment + error)}}$$

Page 38: RIMI Workshop: Power Analysis

Cohen’s Conventions for ANOVA (f and η²)

Magnitude   f     η²
Small       .10   .01
Medium      .25   .06
Large       .40   .14

Cohen’s standards for small, medium, and large effect sizes for the one-way between subjects analysis of variance (ANOVA).
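The two columns of the conventions table are linked by the standard identity η² = f² / (1 + f²), or equivalently f = √(η² / (1 − η²)). That identity is not stated on the slides; the sketch below uses it (with assumed function names) to show that the paired values agree after rounding.

```python
from math import sqrt


def eta_squared_from_f(f):
    """Convert Cohen's f to eta-squared."""
    return f ** 2 / (1 + f ** 2)


def f_from_eta_squared(eta_sq):
    """Convert eta-squared to Cohen's f."""
    return sqrt(eta_sq / (1 - eta_sq))


for f in (0.10, 0.25, 0.40):
    print(f, round(eta_squared_from_f(f), 2))
# 0.1 0.01
# 0.25 0.06
# 0.4 0.14
```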

Page 39: RIMI Workshop: Power Analysis

Power Table – ANOVA (abridged)

         Effect Size (f, η²)
Power    f = .10; η² = .01 Small    f = .25; η² = .06 Medium    f = .40; η² = .14 Large
.50      167 (501)                  28 (84)                     12 (36)
.60      209 (627)                  35 (105)                    14 (42)
.70      258 (774)                  43 (129)                    18 (54)
.80      323 (969)                  53 (159)                    22 (66)

Sample size (N) required per group (and total N) for a given level of power and effect size for the one-way between subjects ANOVA (α = .05). The power values provided are based on 3 groups; larger N is required to achieve the same level of power as the number of groups increases.

Page 40: RIMI Workshop: Power Analysis

Effect Size – Multiple Regression

$$R^2 = \frac{SS_{\text{Between}}}{SS_{\text{Total}}}, \qquad f^2 = \frac{R^2}{1 - R^2}$$

Page 41: RIMI Workshop: Power Analysis

Cohen’s conventions for Multiple Regression (f² and R²)

Magnitude   f²    R²
Small       .02   .02
Medium      .15   .13
Large       .35   .26

Cohen’s standards for small, medium, and large effect sizes for multiple regression.
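The f² column follows from the R² column through f² = R² / (1 − R²). A quick sketch (the function name `f_squared` is an assumption) confirms the three rows after rounding to two decimals:

```python
def f_squared(r_squared):
    """Cohen's f-squared for multiple regression: R^2 / (1 - R^2)."""
    return r_squared / (1 - r_squared)


for r2 in (0.02, 0.13, 0.26):
    print(r2, round(f_squared(r2), 2))
# 0.02 0.02
# 0.13 0.15
# 0.26 0.35
```

Because f² divides by the unexplained variance, it grows faster than R² as the effect gets larger, which is why the two columns drift apart for medium and large effects.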

Page 42: RIMI Workshop: Power Analysis

Power Table – Multiple Regression (abridged)

         Effect Size (f², R²)
Power    f² = .02; R² = .02 Small    f² = .15; R² = .13 Medium    f² = .35; R² = .26 Large
.50      292                         43                           21
.60      362                         52                           25
.70      444                         63                           30
.80      550                         77                           36

Sample size (N) required for a given level of power and effect size for multiple regression (α = .05). The power values provided are based on 3 predictors (IVs); larger N is required to achieve the same level of power as the number of predictors increases.

Page 43: RIMI Workshop: Power Analysis

Estimating Power using GPower

GPower illustration…