Sample Size/Power Calculation by Software/Online Calculators
May 24, 2018
Li Zhang, Professor
Department of Epidemiology and Biostatistics
Division of Hematology and Oncology, Department of Medicine
University of California, San Francisco
Topics
• R packages
• SAS procedures
• Online calculator: CTSI sample size calculator
• Online calculator for clinical trials: SWOG
• Software: G*Power
Power Analysis with R
https://www.statmethods.net/stats/power.html
Power/Sample size calculation for one or two proportions
• Power calculations for proportion tests (one sample)
– H0: p = p1 vs. Ha: p ≠ p1
– pwr.p.test(h, n, sig.level, power, alternative = c("two.sided", "less", "greater"))
• Power calculation for two proportions (same sample size)
– H0: p1 = p2 vs. Ha: p1 ≠ p2
– pwr.2p.test(h, n, sig.level, power, alternative = c("two.sided", "less", "greater"))
• Power calculation for two proportions (different sample sizes)
– H0: p1 = p2 vs. Ha: p1 ≠ p2
– pwr.2p2n.test(h, n1, n2, sig.level, power, alternative = c("two.sided", "less", "greater"))
• Effect size calculation
• R Demo
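The effect size h passed to these functions is Cohen's arcsine-transformed difference between two proportions (the pwr package provides ES.h(p1, p2) for it). As a rough illustration only, and in Python rather than the R used in the demo, the sketch below computes h and a normal-approximation power for a two-sided test with n subjects per group. The function names and the example values (0.5, 0.1, n = 30) are assumptions made for this sketch, not part of the talk.

```python
from math import asin, sqrt
from statistics import NormalDist

def effect_size_h(p1, p2):
    """Cohen's h: difference of the arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

def power_two_proportions(h, n, sig_level=0.05):
    """Two-sided power for two groups of n subjects each (normal approximation)."""
    z = NormalDist()
    crit = z.inv_cdf(1 - sig_level / 2)   # two-sided critical value
    shift = h * sqrt(n / 2)               # mean of the test statistic under Ha
    return z.cdf(shift - crit) + z.cdf(-shift - crit)

# Hypothetical inputs: proportions 0.5 vs. 0.1 with 30 subjects per group.
h = effect_size_h(0.5, 0.1)
power = power_two_proportions(h, n=30)
print(round(h, 3), round(power, 3))
```

Because this is an approximation on the arcsine scale, the result can differ slightly from software that uses another method, such as a Pearson chi-squared test.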
Power calculations for chi-squared tests
pwr.chisq.test(w = NULL, N = NULL, df = NULL, sig.level = 0.05, power = NULL)
• ES.w1(P0, P1):
– Effect size calculation in the chi-squared test for goodness of fit: w is the square root of the sum over cells of (P1 − P0)²/P0
– Computes effect size w for two sets of k probabilities, P0 (null hypothesis) and P1 (alternative hypothesis)
• ES.w2(P):
– Effect size calculation in the chi-squared test for association
– Computes effect size w for a two-way probability table P corresponding to the alternative hypothesis in the chi-squared test of association in two-way contingency tables
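To make the two effect size definitions concrete, here is a minimal Python sketch of the arithmetic behind w for goodness of fit and for association. The function names es_w1 and es_w2 simply mirror the pwr names, and the probabilities are made-up illustration values; this is not the pwr source code.

```python
from math import sqrt

def es_w1(p0, p1):
    """w for goodness of fit: p0 = null cell probabilities, p1 = alternative."""
    return sqrt(sum((alt - null) ** 2 / null for null, alt in zip(p0, p1)))

def es_w2(table):
    """w for association: table holds the joint cell probabilities under Ha."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    return sqrt(sum(
        # Expected probability under independence is row margin * column margin.
        (table[i][j] - row[i] * col[j]) ** 2 / (row[i] * col[j])
        for i in range(len(row))
        for j in range(len(col))
    ))

# Hypothetical probabilities chosen only for illustration.
w1 = es_w1([0.5, 0.5], [0.6, 0.4])
w2 = es_w2([[0.3, 0.2], [0.2, 0.3]])
print(round(w1, 3), round(w2, 3))
```

The df argument of pwr.chisq.test is then k − 1 for a goodness-of-fit test with k cells and (rows − 1) × (columns − 1) for a test of association.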
Power/Sample size calculation for one or two means
• Power calculations for t-tests of means (one sample, two samples and paired samples)
– One sample: H0: μ = μ1 vs. Ha: μ ≠ μ1
– Two samples or paired samples: H0: μ1 = μ2 vs. Ha: μ1 ≠ μ2
– pwr.t.test(n, d, sig.level, power, type = c("two.sample", "one.sample", "paired"), alternative = c("two.sided", "less", "greater"))
• Power calculations for two samples (different sizes) t-tests of means
– H0: μ1 = μ2 vs. Ha: μ1 ≠ μ2
– pwr.t2n.test(n1, n2, d, sig.level = 0.05, power, alternative = c("two.sided", "less", "greater"))
• R Demo
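The argument d in these functions is Cohen's d, the mean difference divided by the standard deviation. The Python sketch below shows d and a textbook normal-approximation sample size per group for a two-sided, two-sample comparison. The helper names are assumptions, and the inputs (mean difference 2, standard deviation 2.8) are the values used in the SAS t-test examples of this talk.

```python
from math import ceil
from statistics import NormalDist

def cohen_d(mean_diff, sd):
    """Standardized mean difference."""
    return mean_diff / sd

def n_per_group(d, power=0.80, sig_level=0.05):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - sig_level / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

d = cohen_d(2, 2.8)
n = n_per_group(d)
print(round(d, 3), n)
```

An exact calculation based on the noncentral t distribution, as done by pwr.t.test, typically returns a slightly larger n than this normal approximation.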
Power calculations for balanced one-way analysis of variance tests
• pwr.anova.test(k = NULL, n = NULL, f = NULL, sig.level = 0.05, power = NULL)
– k: number of groups
– n: number of observations (per group)
– f: effect size
Power calculations for the general linear model
• pwr.f2.test(u = NULL, v = NULL, f2 = NULL, sig.level = 0.05, power = NULL)
• u and v are the numerator and denominator degrees of freedom; f2 is the effect size measure.
• f2 is defined differently in two situations:
– when evaluating the impact of a set of predictors on an outcome
– when evaluating the impact of one set of predictors above and beyond a second set of predictors (or covariates)
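The two uses of f2 correspond to Cohen's two standard formulas: f2 = R²/(1 − R²) for a single set of predictors, and f2 = (R²AB − R²A)/(1 − R²AB) for the contribution of set B beyond set A. A small Python sketch of that arithmetic follows; the function names and R² values are hypothetical.

```python
def f2_from_r2(r2):
    """Effect size for a set of predictors explaining a proportion r2 of variance."""
    return r2 / (1 - r2)

def f2_incremental(r2_full, r2_reduced):
    """Effect size for set B beyond set A: full model (A and B) vs. reduced model (A)."""
    return (r2_full - r2_reduced) / (1 - r2_full)

def total_n_single_set(u, v):
    """For a single set of u predictors, v = n - u - 1, so n = u + v + 1."""
    return u + v + 1

# Hypothetical R-squared values, for illustration only.
print(round(f2_from_r2(0.20), 3))            # single set of predictors
print(round(f2_incremental(0.30, 0.20), 3))  # added value of a second set
print(total_n_single_set(u=3, v=96))
```

When pwr.f2.test is solved for v, the last helper shows how v is converted back to a total sample size in the single-set case.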
Other R Packages for Sample Size Calculation
• powerSurvEpi: Power and Sample Size Calculation for Survival Analysis of Epidemiological Studies
• epiR: Sample size for cohort, case-control and cross-sectional studies, under one- or two-stage cluster sampling
• kappaSize: Sample Size Estimation Functions for Studies of Interobserver Agreement
• powerMediation: Power/Sample size calculation for mediation analysis, simple linear regression, logistic regression, or longitudinal study
• power.roc.test {pROC}: Computes sample size, power, significance level or minimum AUC for ROC curves
• RNASeqPower: Sample Size for RNA-Seq and Similar Studies
Case-control study by library(epiR)
• A matched case control study is to be carried out to quantify the association between exposure A and an outcome B.
– Assume the prevalence of exposure in controls is 0.60 and the correlation between case and control exposures for matched pairs (rho) is 0.20 (moderate).
– Assuming an equal number of cases and controls, how many subjects need to be enrolled into the study to detect an odds ratio of 3.0 with 0.80 power using a two-sided 0.05 test?
– epi.ccsize(OR = 3.0, p0 = 0.60, n = NA, power = 0.80, r = 1, rho = 0.2, design = 1, sided.test = 2, conf.level = 0.95, method = "matched", fleiss = FALSE)
– A total of 162 subjects need to be enrolled in the study: 81 cases and 81 controls.
Case-control study by library(epiR)
• How many cases and controls are required if we select three controls per case?
• epi.ccsize(OR = 3.0, p0 = 0.60, n = NA, power = 0.80, r = 3, rho = 0.2, design = 1, sided.test = 2, conf.level = 0.95, method = "matched", fleiss = FALSE)
• A total of 204 subjects need to be enrolled in the study: 51 cases and 153 controls.
kappaSize: Sample Size Estimation Functions for Studies of Interobserver Agreement
• library(kappaSize)
• Handles outcomes with two (binary) to five categories
• Confidence interval approach
– E.g., CI3Cats
• Calculation of the lowest expected value
– E.g., FixedN4Cats
• Power-based approach
– E.g., PowerBinary
Computes sample size/power/minimum AUC for ROC curves
power.roc.test(...)
• One or two ROC curves test with roc objects:
– power.roc.test(roc1, roc2, sig.level = 0.05, power = NULL, alternative = c("two.sided", "one.sided"), reuse.auc = TRUE, method = c("delong", "bootstrap", "obuchowski"), ...)
• One ROC curve with a given AUC:
– power.roc.test(auc = NULL, ncontrols = NULL, ncases = NULL, sig.level = 0.05, power = NULL, kappa = 1, alternative = c("two.sided", "one.sided"), ...)
• Two ROC curves with the given parameters:
– power.roc.test(parslist, ncontrols = NULL, ncases = NULL, sig.level = 0.05, power = NULL, kappa = 1, alternative = c("two.sided", "one.sided"), ...)
RNASeqPower: Sample Size for RNA-Seq and Similar Studies
• rnapower(depth, n, n2 = n, cv, cv2 = cv, effect, alpha, power)
– depth: average depth of coverage for the transcript or gene of interest. Common values are 5-20; any numeric value > 0 is valid.
– n: sample size in group 1 (or both)
– n2: sample size in group 2
– cv: biological coefficient of variation in group 1 (or both)
– cv2: biological coefficient of variation in group 2
– effect: target effect size
Comments about R packages
• Pros:
– Free
– A lot of resources for different tests/study designs
– Easy to generate a figure/table for different parameter options, for example, sample size calculation for sequencing data
• Cons:
– Need to write code
– Sometimes hard to implement
– Reliability is uncertain
Power Analysis with SAS
• SAS
– PROC POWER
  • t tests, equivalence tests, and confidence intervals for means
  • tests, equivalence tests, and confidence intervals for binomial proportions
  • multiple regression
  • tests of correlation and partial correlation
  • one-way analysis of variance
  • rank tests for comparing two survival curves
  • logistic regression with binary response
  • Wilcoxon-Mann-Whitney (rank-sum) test
– PROC GLMPOWER: Compute Power and Sample Size for Repeated Measures
SAS Example: Calculate power for Pearson chi-squared tests
• Same sample size, two-sided test of proportions

proc power;
  twosamplefreq test=pchi
    groupproportions=(0.1 0.5)
    npergroup=30
    power=.;
run;

The POWER Procedure
Pearson Chi-square Test for Proportion Difference

Fixed Scenario Elements
Distribution                  Asymptotic normal
Method                        Normal approximation
Group 1 Proportion            0.1
Group 2 Proportion            0.5
Sample Size per Group         30
Number of Sides               2
Null Proportion Difference    0
Alpha                         0.05

Computed Power
Power                         0.943
SAS Example: Calculate power for Pearson chi-squared tests
• Different sample sizes, two-sided test of proportions

proc power;
  twosamplefreq test=pchi
    groupproportions=(0.1 0.5)
    groupns=25 | 50
    power=.;
run;

The POWER Procedure
Pearson Chi-square Test for Proportion Difference

Fixed Scenario Elements
Distribution                  Asymptotic normal
Method                        Normal approximation
Group 1 Proportion            0.1
Group 2 Proportion            0.5
Group 1 Sample Size           25
Group 2 Sample Size           50
Number of Sides               2
Null Proportion Difference    0
Alpha                         0.05

Computed Power
Power                         0.966
SAS Example: Calculate power for t-tests
• Two independent samples, same size

proc power;
  twosamplemeans test=diff
    meandiff=2
    stddev=2.8
    npergroup=30
    power=.;
run;

The POWER Procedure
Two-Sample t Test for Mean Difference

Fixed Scenario Elements
Distribution                  Normal
Method                        Exact
Mean Difference               2
Standard Deviation            2.8
Sample Size per Group         30
Number of Sides               2
Null Difference               0
Alpha                         0.05

Computed Power
Power                         0.776
SAS Example: Calculate power for t-tests
• One sample

proc power;
  onesamplemeans test=t
    mean=2
    stddev=2.8
    ntotal=30
    power=.;
run;

The POWER Procedure
One-Sample t Test for Mean

Fixed Scenario Elements
Distribution                  Normal
Method                        Exact
Mean                          2
Standard Deviation            2.8
Total Sample Size             30
Number of Sides               2
Null Mean                     0
Alpha                         0.05

Computed Power
Power                         0.966
SAS Example: Calculate power for t-tests
• Paired samples

proc power;
  pairedmeans test=diff
    meandiff=2
    corr=0.5
    stddev=2.8
    npairs=30
    power=.;
run;

The POWER Procedure
Paired t Test for Mean Difference

Fixed Scenario Elements
Distribution                  Normal
Method                        Exact
Mean Difference               2
Standard Deviation            2.8
Correlation                   0.5
Number of Pairs               30
Number of Sides               2
Null Difference               0
Alpha                         0.05

Computed Power
Power                         0.966
SAS Example: Calculate power for t-tests
• Two independent samples, different sizes

proc power;
  twosamplemeans test=diff
    meandiff=2
    stddev=2.8
    groupns=(20 40)
    power=.;
run;

The POWER Procedure
Two-Sample t Test for Mean Difference

Fixed Scenario Elements
Distribution                  Normal
Method                        Exact
Mean Difference               2
Standard Deviation            2.8
Group 1 Sample Size           20
Group 2 Sample Size           40
Number of Sides               2
Null Difference               0
Alpha                         0.05

Computed Power
Power                         0.727
UCSF CTSI Sample Size Calculators
• http://www.sample-size.net
– Can do most of the popular tests
– Compares the mean of a continuous measurement in two samples, allowing for clustered sampling
• A cluster randomized controlled trial is a type of randomized controlled trial in which groups of subjects (as opposed to individual subjects) are randomized.
Online Calculators: SWOG
https://stattools.crab.org
• If the primary objective is estimation rather than a hypothesis test, provide the precision of the estimate
– Example: The expected adherence rate is 80%, n = 50
– 95% CI is (66.3%, 90.0%)
• One arm binomial: H0: P = 0.1 vs. Ha: P ≠ 0.1
• One arm survival:
– Length of the accrual period
– Length of the follow-up period, i.e. the time from end of accrual to analysis
– H0: median OS = 6 months vs. Ha: median OS > 6 months
– H0: s(t) = 0.5 at 6 months vs. Ha: s(t) = 0.5 at 9 months
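The interval quoted in the adherence example is consistent with an exact (Clopper-Pearson) binomial confidence interval for 40 adherent subjects out of 50. That the SWOG calculator uses this particular method is an assumption. The Python sketch below computes such an interval by bisection; all function names are made up for the illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, conf=0.95):
    """Clopper-Pearson interval for x successes out of n trials."""
    tail = (1 - conf) / 2
    # Lower limit: the p at which P(X >= x) equals the tail probability.
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: tail - (1 - binom_cdf(x - 1, n, p)))
    # Upper limit: the p at which P(X <= x) equals the tail probability.
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - tail)
    return lower, upper

lower, upper = exact_ci(40, 50)   # 80% adherence observed in 50 subjects
print(f"{lower:.3f} {upper:.3f}")
```

Rerunning exact_ci with a larger n at the same observed rate shows how quickly the precision improves, which is the quantity to report when the study objective is estimation.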
Online Calculators: SWOG (cont.)
• Two-arm binomial: H0: P1 = P2 vs. Ha: P1 ≠ P2
– P1 = 0.1 vs. P2 = 0.25
• Two-arm survival:
– Length of the accrual period
– Length of the follow-up period, i.e. the time from end of accrual to analysis
– H0: HR = 2 vs. Ha: HR ≠ 2 (Median OS = 6 months for null, 12-month accrual and 12-month follow-up)
Online Calculators: SWOG (cont.)
Two stage
• a1: If the number of successes after completing the first stage is < a1, we reject the alternative hypothesis that p > Pa.
• r1: If the number of successes after completing the first stage is > r1, we reject the null hypothesis that p < P0.
• a2: If the number of successes after completing the trial is < a2, then we reject the alternative hypothesis.
• r2: If the number of successes after completing the trial is > r2, then we reject the null hypothesis.
Online Calculators: SWOG (cont.)
• Other options:
– Survival noninferiority
  • Competing Risk: the hazard of the competing risk random variable
  • Hazard ratios between experimental and standard defining equivalence
  • Hazard ratio must be less than hazard ratio defining equivalence
– Expected Deaths
  • Make a table of expected death information
  • Provide expected deaths for a given time
  • Provide expected deaths for a time at which the expected proportion of deaths have occurred
Online Calculators: Simon's two stage design
• http://cancer.unc.edu/biostatistics/program/ivanova/SimonsTwoStageDesign.aspx
– One arm Phase II clinical trial
– Endpoint: Response rate or binary outcome
– Incorporates interim analysis for futility
– One-sided test
– Example: H0: P = 0.1 vs. Ha: P > 0.1

Simon's two-stage design (Simon, 1989) will be used. The null hypothesis that the true response rate is 0.1 will be tested against a one-sided alternative. In the first stage, 22 patients will be accrued. If there are 2 or fewer responses in these 22 patients, the study will be stopped. Otherwise, 18 additional patients will be accrued for a total of 40. The null hypothesis will be rejected if 8 or more responses are observed in 40 patients. This design yields a type I error rate of 0.04 and power of 80% when the true response rate is 0.25.
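The operating characteristics quoted for this design can be checked directly from binomial probabilities: the null hypothesis is rejected only if the trial continues past stage 1 (more than 2 responses in 22 patients) and ends with at least 8 responses in 40. The Python sketch below evaluates that probability at the null and alternative response rates. The function names are hypothetical and the code is an independent check, not the calculator's implementation.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    if k <= 0:
        return 1.0
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def prob_reject_null(p, n1=22, r1=2, n=40, r=7):
    """P(more than r1 responses in stage 1 and more than r responses overall)."""
    n2 = n - n1
    return sum(
        binom_pmf(x1, n1, p) * binom_sf(r + 1 - x1, n2, p)
        for x1 in range(r1 + 1, n1 + 1)
    )

def prob_early_stop(p, n1=22, r1=2):
    """P(stopping for futility after stage 1)."""
    return sum(binom_pmf(x1, n1, p) for x1 in range(r1 + 1))

alpha = prob_reject_null(0.10)   # type I error at the null response rate
power = prob_reject_null(0.25)   # power at the alternative response rate
pet = prob_early_stop(0.10)      # chance of early termination under the null
print(round(alpha, 3), round(power, 3), round(pet, 3))
```

The early-termination probability under the null is what makes the two-stage design attractive: a drug with a 10% response rate is stopped after 22 patients more often than not.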
G*Power
• G*Power is a tool to compute statistical power analyses for many different t tests, F tests, χ2 tests, z tests and some exact tests.
• G*Power can also be used to compute effect sizes and to display graphically the results of power analyses.
• It is free and available for both Windows and Mac.
5 Exact: Proportion - inequality, two dependent groups (McNemar)

This procedure relates to tests of paired binary responses. Such data can be represented in a 2 × 2 table:

                 Standard
Treatment     Yes      No
Yes           p11      p12        pt
No            p21      p22        1 − pt
              ps       1 − ps     1

where pij denotes the probability of the respective response. The probability pD of discordant pairs, that is, the probability of yes/no-response pairs, is given by pD = p12 + p21. The hypothesis of interest is that ps = pt, which is formally identical to the statement p12 = p21.

Using this fact, the null hypothesis states (in a ratio notation) that p12 is identical to p21, and the alternative hypothesis states that p12 and p21 are different:

H0: p12/p21 = 1
H1: p12/p21 ≠ 1.

In the context of the McNemar test the term odds ratio (OR) denotes the ratio p12/p21 that is used in the formulation of H0 and H1.
5.1 Effect size index
The odds ratio p12/p21 is used to specify the effect size. The odds ratio must lie inside the interval [10⁻⁶, 10⁶]. An odds ratio of 1 corresponds to a null effect. Therefore this value must not be used in a priori analyses.

In addition to the odds ratio, the proportion of discordant pairs, i.e. pD, must be given in the input parameter field called Prop discordant pairs. The values for this proportion must lie inside the interval [ε, 1 − ε], with ε = 10⁻⁶.

If pD and d = p12 − p21 are given, then p12 = (pD + d)/2 and p21 = (pD − d)/2, so the odds ratio may be calculated as: OR = (pD + d)/(pD − d).
5.2 Options
Press the Options button in the main window to select one of the following options.

5.2.1 Alpha balancing in two-sided tests

The binomial distribution is discrete. It is therefore not normally possible to arrive at the exact nominal α-level. For two-sided tests this leads to the problem how to "distribute" α to the two sides. G*Power offers the three options listed here, the first option being selected by default:

1. Assign α/2 to both sides: Both sides are handled independently in exactly the same way as in a one-sided test. The only difference is that α/2 is used instead of α. Of the three options offered by G*Power, this one leads to the greatest deviation from the actual α (in post hoc analyses).

2. Assign to minor tail α/2, then rest to major tail (α2 = α/2, α1 = α − α2): First α/2 is applied to the side of the central distribution that is farther away from the noncentral distribution (minor tail). The criterion used on the other side is then α − α1, where α1 is the actual α found on the minor side. Since α1 ≤ α/2 one can conclude that (in post hoc analyses) the sum of the actual values α1 + α2 is in general closer to the nominal α-level than it would be if α/2 were assigned to both sides (see Option 1).

3. Assign α/2 to both sides, then increase to minimize the difference of α1 + α2 to α: The first step is exactly the same as in Option 1. Then, in the second step, the critical values on both sides of the distribution are increased (using the lower of the two potential incremental α-values) until the sum of both actual α values is as close as possible to the nominal α.
5.2.2 Computation

You may choose between an exact procedure and a faster approximation (see implementation notes for details):

1. Exact (unconditional) power if N < x. The computation time of the exact procedure increases much faster with sample size N than that of the approximation. Given that both procedures usually produce very similar results for large sample sizes, a threshold value x for N can be specified which determines the transition between both procedures. The exact procedure is used if N < x; the approximation is used otherwise.
Note: G*Power does not show distribution plots for exact computations.

2. Faster approximation (assumes number of discordant pairs to be constant). Choosing this option instructs G*Power to always use the approximation.
5.3 Examples
As an example we replicate the computations in O'Brien (2002, p. 161-163). The assumed table is:

                 Standard
Treatment     Yes      No
Yes           .54      .08      .62
No            .32      .06      .38
              .86      .14      1

In this table the proportion of discordant pairs is pD = .32 + .08 = 0.4 and the odds ratio OR = p12/p21 = 0.08/.32 = 0.25. We want to compute the exact power for a one-sided test. The sample size N, that is, the number of pairs, is 50 and α = 0.05.

• Select
– Type of power analysis: Post hoc
• Options
– Computation: Exact
• Input
– Tail(s): One
– Odds ratio: 0.25
– α err prob: 0.05
– Total sample size: 50
– Prop discordant pairs: 0.4
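The two G*Power inputs for this test follow from the off-diagonal cells of the table. Since p12 = (pD + d)/2 and p21 = (pD − d)/2 with d = p12 − p21, the odds ratio can also be recovered from pD and d. The Python sketch below works through the example table; the function names are made up for the illustration.

```python
def mcnemar_inputs(p12, p21):
    """Odds ratio and proportion of discordant pairs from the discordant cells."""
    odds_ratio = p12 / p21
    prop_discordant = p12 + p21
    return odds_ratio, prop_discordant

def odds_ratio_from_difference(p_d, d):
    """Odds ratio from pD and d = p12 - p21."""
    p12 = (p_d + d) / 2
    p21 = (p_d - d) / 2
    return p12 / p21

# Discordant cells of the example table: p12 = 0.08, p21 = 0.32.
odds_ratio, p_d = mcnemar_inputs(0.08, 0.32)
print(round(odds_ratio, 2), round(p_d, 2))
print(round(odds_ratio_from_difference(0.4, 0.08 - 0.32), 2))
```

Both routes give the same odds ratio for the example, which is a quick way to confirm that the values typed into the Odds ratio and Prop discordant pairs fields are mutually consistent.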
• Select
– Type of power analysis: Post hoc
• Options
– Computation: Exact
• Input
– Tail(s): Two
– Odds ratio: 0.25
– α err prob: 0.05
– Total sample size: 50
– Prop discordant pairs: 0.4
• Output
– Power (1-β err prob): 0.80
– Actual α: 0.04
– Proportion p12: 0.08
– Proportion p21: 0.32
10 F test: Fixed effects ANOVA - one way

The fixed effects one-way ANOVA tests whether there are any differences between the means µi of k ≥ 2 normally distributed random variables with equal variance σ². The random variables represent measurements of a variable X in k fixed populations. The one-way ANOVA can be viewed as an extension of the two group t test for a difference of means to more than two groups.

The null hypothesis is that all k means are identical, H0: µ1 = µ2 = . . . = µk. The alternative hypothesis states that at least two of the k means differ, H1: µi ≠ µj for at least one pair i, j with 1 ≤ i, j ≤ k.
10.1 Effect size index
The effect size f is defined as: f = σm/σ. In this equation σm is the standard deviation of the group means µi and σ the common standard deviation within each of the k groups. The total variance is then σt² = σm² + σ². A different but equivalent way to specify the effect size is in terms of η², which is defined as η² = σm²/σt². That is, η² is the ratio between the between-groups variance σm² and the total variance σt² and can be interpreted as "proportion of variance explained by group membership". The relationship between η² and f is: η² = f²/(1 + f²) or, solved for f: f = √(η²/(1 − η²)).

Cohen (1969, p. 348) defines the following effect size conventions:

• small f = 0.10
• medium f = 0.25
• large f = 0.40

If the mean µi and size ni of all k groups are known then the standard deviation σm can be calculated in the following way:

µ̄ = Σ(i = 1..k) wi µi (grand mean),
σm = √( Σ(i = 1..k) wi (µi − µ̄)² ),

where wi = ni/(n1 + n2 + · · · + nk) stands for the relative size of group i.
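The weighted formulas above translate directly into a few lines of code. The following Python sketch computes f from group means, group sizes and the common within-group standard deviation, and converts f to η². The function names and the three-group example values are assumptions chosen for illustration.

```python
from math import sqrt

def cohen_f(means, sizes, sd_within):
    """Effect size f = sigma_m / sigma from group means and group sizes."""
    total = sum(sizes)
    weights = [n / total for n in sizes]                      # relative group sizes
    grand_mean = sum(w * m for w, m in zip(weights, means))
    sigma_m = sqrt(sum(w * (m - grand_mean) ** 2
                       for w, m in zip(weights, means)))
    return sigma_m / sd_within

def eta_squared(f):
    """Proportion of variance explained by group membership."""
    return f ** 2 / (1 + f ** 2)

# Hypothetical design: three equal groups, within-group SD of 4.
f = cohen_f(means=[10, 12, 14], sizes=[20, 20, 20], sd_within=4)
print(round(f, 3), round(eta_squared(f), 3))
```

By Cohen's conventions, the f obtained for this made-up design falls just above the "large" threshold of 0.40.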
Pressing the Determine button to the left of the effect size label opens the effect size drawer. You can use this drawer to calculate the effect size f from variances, from η² or from the group means and group sizes. The drawer essentially contains two different dialogs and you can use the Select procedure selection field to choose one of them.

10.1.1 Effect size from means

In this dialog (see left side of Fig. 11) you normally start by setting the number of groups. G*Power then provides you with a mean and group size table of appropriate size. Insert the standard deviation σ common to all groups in the SD σ within each group field. Then you need to specify the mean µi and size ni for each group. If all group sizes are equal then you may insert the common group size in the input field to the right of the Equal n button. Clicking on this button fills the size column of the table with the chosen value.

Clicking on the Calculate button provides a preview of the effect size that results from your inputs. If you click on the Calculate and transfer to main window button then G*Power calculates the effect size and transfers the result into the effect size field in the main window. If the number of groups or the total sample size given in the effect size drawer differ from the corresponding values in the main window, you will be asked whether you want to adjust the values in the main window to the ones in the effect size drawer.

Figure 11: Effect size dialogs to calculate f
Example: We compare 10 groups, and we have reason to expect a "medium" effect size (f = .25). How many subjects do we need in a test with α = 0.05 to achieve a power of 0.95?
• Select
– Type of power analysis: A priori
• Input
– Effect size f: 0.25
– α err prob: 0.05
– Power (1-β err prob): 0.95
– Number of groups: 10
• Output
– Noncentrality parameter λ: 24.375000
– Critical F: 1.904538
– Numerator df: 9
– Denominator df: 380
– Total sample size: 390
– Actual Power: 0.952363
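Several of the output values in this example follow from simple arithmetic on the inputs: the noncentrality parameter is λ = f² × N, the numerator df is k − 1, and the denominator df is N − k. The short Python sketch below reproduces them for the reported total sample size; the function name is hypothetical, and the critical F and power are not recomputed here because they require the noncentral F distribution.

```python
def anova_design_summary(f, total_n, k):
    """Noncentrality parameter and degrees of freedom for a one-way ANOVA."""
    noncentrality = f ** 2 * total_n
    numerator_df = k - 1
    denominator_df = total_n - k
    n_per_group = total_n // k   # assumes total_n is a multiple of k (balanced design)
    return noncentrality, numerator_df, denominator_df, n_per_group

lam, df1, df2, n_group = anova_design_summary(f=0.25, total_n=390, k=10)
print(lam, df1, df2, n_group)
```

The total of 390 corresponds to 39 subjects in each of the 10 groups.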
Questions?