meta analysis with r

Meta analysis Conducting Meta-Analyses in R

Upload: alberto-labarga

Post on 09-May-2015


TRANSCRIPT

Page 1: Meta analysis with R

Meta analysis

Conducting Meta-Analyses in R

Page 2: Meta analysis with R

What is meta-analysis?

• Science is a cumulative process. Therefore, it is not surprising that one can often find dozens, and sometimes hundreds, of studies addressing the same basic question.

• Researchers trying to aggregate and synthesize the literature on a particular topic are increasingly conducting meta-analyses.

Page 3: Meta analysis with R

Why do we need meta-analyses?

• Literature expansion in research

• Gives researchers the ability to statistically combine many studies to increase power

• Allows us to measure “how much” of a relationship exists, rather than just whether a relationship exists

• Allows us to account for the variation in results between similar studies based on procedural characteristics of individual studies

Page 4: Meta analysis with R

What is meta-analysis?

• A standardized secondary analysis of primary data results from different studies that share the same hypothesis

• A quantitative aggregation of findings during a research synthesis

• Calculating a standardized effect size for multiple studies

Page 5: Meta analysis with R

Effect Size in MA

• Effect size makes meta-analysis possible
  – it is the “dependent variable”
  – it standardizes findings across studies such that they can be directly compared

• Any standardized index can be an “effect size” (e.g., standardized mean difference, correlation coefficient, odds-ratio) as long as:
  – It is comparable across studies
  – It represents the magnitude and direction of the relationship of interest
  – It is independent of sample size

• Different meta-analyses may use different effect size indices

Page 6: Meta analysis with R

Examples of Different Types of Effect Sizes:

• Standardized Mean Difference (continuous outcome)
  – group contrast research
    • treatment groups
    • naturally occurring groups

• Odds-Ratio (dichotomous outcome)
  – group contrast research
    • treatment groups
    • naturally occurring groups

• Correlation Coefficient
  – association between variables research

Page 7: Meta analysis with R

Statistical significance

• It turns out that many researchers do not know precisely what p < .05 means
  – Cohen (1994), “The earth is round (p < .05)”

• What it means: “Given that H0 is true, what is the probability of these (or more extreme) data?”

• The trouble is that most people want to know: “Given these data, what is the probability that H0 is true?”

Page 8: Meta analysis with R

Always a difference

• With most analyses we commonly define the null hypothesis as ‘no relationship’ between our predictor and outcome (i.e. the ‘nil’ hypothesis)

• With sample data, differences between groups always exist (at some level of precision), correlations are always non-zero.

• Obtaining statistical significance can be seen as just a matter of sample size

• Furthermore, the importance and magnitude of an effect are not accurately reflected in the p-value, because the p-value attained also depends on sample size

Page 9: Meta analysis with R

What should we be doing?

• We want to make sure we have looked hard enough for the difference – power analysis

• Figure out how big the thing we are looking for is – effect size
  – Effect size refers to the magnitude of the impact of some variable on another

Page 10: Meta analysis with R

Examples of Different Types of Effect Sizes:

• Standardized Mean Difference (continuous outcome)
  – group contrast research
    • treatment groups
    • naturally occurring groups

• Odds-Ratio (dichotomous outcome)
  – group contrast research
    • treatment groups
    • naturally occurring groups

• Correlation Coefficient
  – association between variables research

Page 11: Meta analysis with R

The Standardized Mean Difference

• Represents a standardized group comparison on a continuous outcome measure.

• Uses the pooled standard deviation (some situations use control group standard deviation).

• Commonly called “Cohen’s d” or occasionally “Hedges’ g”.

$$ES = \frac{\bar{X}_1 - \bar{X}_2}{s_{pooled}}$$

$$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$

Page 12: Meta analysis with R

The Correlation Coefficient

• Represents the strength of association between two continuous measures.

• Generally reported directly as “r” (the Pearson product moment coefficient).

$$ES = r$$

Page 13: Meta analysis with R

Odds-Ratios

• The Odds-Ratio is based on a 2 by 2 contingency table, such as the one below.

• The Odds-Ratio is the odds of success in the treatment group relative to the odds of success in the control group.

Frequencies        Success   Failure
Treatment Group    a         b
Control Group      c         d

$$ES = \frac{ad}{bc}$$

Page 14: Meta analysis with R

Converting results into a common metric

• Can convert p-values, t, F, etc. into the standardized effect size metric being used in the meta-analysis (e.g., d, r)

Page 15: Meta analysis with R

Interpreting Effect Size Results

• Cohen’s “Rules-of-Thumb”
  – standardized mean difference effect size
    • small = 0.20
    • medium = 0.50
    • large = 0.80
  – correlation coefficient
    • small = 0.10
    • medium = 0.25
    • large = 0.40

Page 16: Meta analysis with R

Cohen’s d (Hedges’ g)

• d defined for the one-sample case

$$d = \frac{\bar{X} - \mu}{s}$$

Page 17: Meta analysis with R

Cohen’s d

• Now compare to the one-sample t-statistic

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{\bar{X} - \mu}{s / \sqrt{N}}$$

• So

$$t = d\sqrt{N} \quad \text{and} \quad d = \frac{t}{\sqrt{N}}$$

• This shows how the test statistic (and its observed p-value) is in part determined by the effect size, but is confounded with sample size

• This means small effects may be statistically significant in many studies (esp. social sciences)
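A minimal Python sketch of how sample size drives the one-sample t-statistic for a fixed effect size, assuming the relation t = d√N. The function name `one_sample_t` and the value d = 0.2 are illustrative choices, not part of the slides.

```python
import math

def one_sample_t(d, n):
    """t statistic implied by a one-sample effect size d and sample size n (t = d * sqrt(n))."""
    return d * math.sqrt(n)

# Hold the (small) effect size fixed and let only the sample size grow.
for n in (25, 100, 400):
    print(n, round(one_sample_t(0.2, n), 2))
```

The effect size is identical in all three rows; only N changes, yet t doubles each time N is quadrupled.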

Page 18: Meta analysis with R

Example

• Average number of times MGEC students curse in the presence of others out of total frustration over the course of a day

• Currently taking R course vs. not

• Data:

           $\bar{X}$   $s^2$   $n$
Group 1    13          7.5     30
Group 2    11          5.0     30

Page 19: Meta analysis with R

Example

• Find the pooled variance and sd
  – Equal groups, so just average the two variances, such that $s_p^2 = (7.5 + 5.0)/2 = 6.25$

$$d = \frac{13 - 11}{\sqrt{6.25}} = .8$$

Page 20: Meta analysis with R

Odds ratios

• Especially good for 2X2 tables

• Take a ratio of two outcomes

• Although neither gets the majority, we could say which they were more likely to vote for respectively

• Odds Clinton among Dems = 564/636 = .887

• Odds McCain among Reps = 450/550 = .818

• .887/.818 (the odds ratio) means the odds of voting for Clinton among Democrats are 1.08 times the odds of voting for McCain among Republicans

• However, the 95% CI for the odds ratio is:
  – .92 to 1.28, which includes 1 (no difference in odds)

           Yes    No     Total
Clinton    564    636    1200
McCain     450    550    1000
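The odds ratio and its interval for the Clinton/McCain table can be sketched in Python as follows. The slides do not say how the 95% CI was obtained; this sketch assumes the usual Wald interval on the log-odds scale, and the function name `odds_ratio_ci` is illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b)/(c/d) with a Wald confidence interval computed on the log scale."""
    odds_ratio = (a / b) / (c / d)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Clinton: 564 yes / 636 no; McCain: 450 yes / 550 no
or_, lo, hi = odds_ratio_ci(564, 636, 450, 550)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

Under that assumption the result agrees with the figures on the slide (1.08, with an interval of .92 to 1.28).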

Page 21: Meta analysis with R

Voting Method

• Voting method was commonly employed for aggregation of studies before the conception of meta-analysis

• Procedure:
  – Studies with a dependent variable and a specific independent variable are examined
  – Studies are dichotomized as either statistically significant or not statistically significant
  – Classification with higher tally is considered to be the “true” relationship between variables

Page 22: Meta analysis with R

Voting Method

• Researcher A is conducting a study on the effects of RtI on a group of 1st graders’ fluency rate.

• In A’s study, which has a sample size of n=180, 110 children are given RtI and 70 children are given traditional instruction. After 12 weeks of instruction, children are dichotomized as either “pass” or “fail” on a reading measure.

• The improvement rate for the RtI group is .45 vs. .43 for the control group.

         RtI    Traditional   Total
Pass     50     30            80
Fail     60     40            100
Total    110    70            180

Page 23: Meta analysis with R

Voting Method

• Researcher B conducts the same study at a different site

• In B’s study, which has a sample size of n=230, 90 children receive RtI and 140 receive traditional instruction

• Again the improvement rate for the RtI group is .67 vs. .64 for the control group.

• That’s 2-0 for the experimental group!

         RtI    Traditional   Total
Pass     60     90            150
Fail     30     50            80
Total    90     140           230

Page 24: Meta analysis with R

Aggregation of Raw Data

• Suppose another researcher aggregates the data from the same studies by summing the raw data instead of employing the voting method

• Add the number of subjects in both studies that received treatment and control: n=200 received RtI and n=210 received traditional instruction

• When dichotomized into “pass” or “fail”, the improvement rate for the treatment group is now .55 vs. .57 for the control group!

• This is known as Simpson’s Paradox

         RtI    Traditional   Total
Pass     110    120           230
Fail     90     90            180
Total    200    210           410
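The reversal can be checked with a short Python sketch that tallies the "votes" per study and then pools the raw counts. The dictionary layout and the helper name `rates` are illustrative; the counts are the ones from Researcher A's and Researcher B's tables.

```python
# Per study: (RtI pass, RtI n, traditional pass, traditional n)
studies = {"A": (50, 110, 30, 70), "B": (60, 90, 90, 140)}

def rates(rti_pass, rti_n, trad_pass, trad_n):
    """Pass rates for the RtI and traditional groups."""
    return rti_pass / rti_n, trad_pass / trad_n

# Voting method: one vote per study in which RtI has the higher pass rate.
votes_for_rti = 0
for name, counts in studies.items():
    rti_rate, trad_rate = rates(*counts)
    votes_for_rti += rti_rate > trad_rate
    print(name, round(rti_rate, 2), round(trad_rate, 2))

# Raw aggregation: sum each column across studies, then recompute the rates.
pooled = [sum(column) for column in zip(*studies.values())]
pooled_rti, pooled_trad = rates(*pooled)
print("pooled", round(pooled_rti, 2), round(pooled_trad, 2))
```

RtI wins within each study, yet loses once the counts are summed, because the two studies allocate very different shares of their samples to each condition.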

Page 25: Meta analysis with R

Voting Method

• Flaws:
  – Bias in favor of large-sample studies
    • Why is this a problem?
  – No weighting of sample size
  – Tells us nothing about strength of relationship
  – Does not control for variation between studies

Page 26: Meta analysis with R

Methodological Considerations

• Determine the statistic of interest to calculate individual study effect sizes:
  – Is your hypothesis assessing the relationship between a dichotomous and continuous variable? Two continuous variables? Two dichotomous variables?
  – What does the preponderance of your studies report as an effect size, if any?
  – Based on this information you will choose one standardized effect size: r, d, or odds-ratio in your meta-analysis

Page 27: Meta analysis with R

Calculating Effect Sizes

• d-index:
  – Appropriate to use when the difference between two means is being compared; a dichotomous and continuous variable
  – Typically employed in association with t- or F-tests, based on a comparison of two conditions
  – Expresses the distance between the two group means in relation to their common SD

Page 28: Meta analysis with R

Calculating Effect Sizes

• d-index formula: $d = \frac{M_1 - M_2}{s_{pooled}}$, where:

$$s_{pooled} = \sqrt{\frac{(n_1 - 1) SD_1^2 + (n_2 - 1) SD_2^2}{n_1 + n_2 - 2}}$$

Page 29: Meta analysis with R

Calculating Effect Sizes

• So, if you were to calculate the standardized mean difference in the fluency rate of the following two groups in an RtI study, what would you get as the effect size?
  – Group 1 (experimental): M1 = 80, SD1 = 10, n1 = 250
  – Group 2 (control): M2 = 65, SD2 = 20, n2 = 230
  – Effect size = ?

• What if you had three groups?
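One way to work the two-group exercise is sketched below in Python. It assumes the d-index is the mean difference divided by the pooled SD, with the pooled variance weighted by n − 1 and divided by n1 + n2 − 2; the function names `pooled_sd` and `d_index` are illustrative.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation, weighting each variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def d_index(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference: (M1 - M2) / pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

# Group 1 (experimental): M = 80, SD = 10, n = 250
# Group 2 (control):      M = 65, SD = 20, n = 230
d = d_index(80, 10, 250, 65, 20, 230)
print(round(d, 2))
```

With these inputs the sketch gives d of about 0.96, the value the deck later quotes for this study.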

Page 30: Meta analysis with R

Calculating Effect Sizes

• What if the means and SDs aren’t reported and you only have a t-value?

$$d = \frac{2t}{\sqrt{df_{error}}}$$

• What if you have the F-value for two means?
  – Formula for d-index when the F-value of two means is reported:

$$d = \frac{2\sqrt{F}}{\sqrt{df_{error}}}$$

$$df_{error} = n_1 + n_2 - 2$$
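A small Python sketch of the two conversions, assuming d = 2t/√df_error and, for a two-group comparison only, d = 2√F/√df_error. The function names and the example values (t = 2.5, two groups of 50) are hypothetical and chosen only to show that the two routes agree when F = t².

```python
import math

def d_from_t(t, df_error):
    """d-index from a two-group t-value; df_error = n1 + n2 - 2."""
    return 2 * t / math.sqrt(df_error)

def d_from_f(f, df_error):
    """d-index from a two-group F-value (1 numerator df, so F = t^2).

    The sign of the effect is lost in F and must be taken from the reported means.
    """
    return 2 * math.sqrt(f) / math.sqrt(df_error)

# Hypothetical study: n1 = n2 = 50, reported t = 2.5 (equivalently F = 6.25)
df_error = 50 + 50 - 2
print(round(d_from_t(2.5, df_error), 3), round(d_from_f(6.25, df_error), 3))
```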

Page 31: Meta analysis with R

Calculating Effect Sizes

• r-index
  – The correlation coefficient tells you about the strength of the relationship between two variables
  – Most appropriate metric for expressing an effect size when interested in the relationship strength of two continuous variables
  – Most common in correlational studies
  – Usually reported when appropriate
  – EX: relationship between years of schooling and yearly salary

Page 32: Meta analysis with R

Calculating Effect Sizes

• What if you only have a t-value?
  – Formula for r-index when only t-value is reported:

$$r = \sqrt{\frac{t^2}{t^2 + df_{error}}}$$

Page 33: Meta analysis with R

Calculating Effect Sizes

• Think back to the previous RtI study (Page 29). The effect size, the standardized difference between the experimental and control groups, was d = 0.96.

• Suppose you want to convert this d-index into an r-index:

$$r = \frac{d}{\sqrt{d^2 + a}}, \quad \text{where} \quad a = \frac{(n_1 + n_2)^2}{n_1 n_2}$$

What do you get? r = ? What could this correlation represent?

• Or vice-versa:

$$d = \frac{2r}{\sqrt{1 - r^2}}$$
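The conversion exercise can be sketched in Python, assuming r = d/√(d² + a) with a = (n1 + n2)²/(n1·n2), and d = 2r/√(1 − r²) for the reverse direction. The function names are illustrative; the inputs are the RtI study's d = 0.96, n1 = 250 and n2 = 230.

```python
import math

def r_from_d(d, n1, n2):
    """Convert a d-index to an r-index, correcting for unequal group sizes via a."""
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d**2 + a)

def d_from_r(r):
    """Convert an r-index back to a d-index (this form implicitly assumes equal group sizes)."""
    return 2 * r / math.sqrt(1 - r**2)

r = r_from_d(0.96, 250, 230)
print(round(r, 2), round(d_from_r(r), 2))
```

Because the two groups are nearly equal in size, a is close to 4 and the round trip returns approximately the original d; with very unequal groups the two formulas would not invert each other exactly.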

Page 34: Meta analysis with R

Calculating Effect Sizes

• Odds-Ratio (OR)

• Applicable when both variables are dichotomous

• The relationship between two sets of odds

• EX: Suppose a study measures the effects of RtI on whether students in two groups (e.g., experimental/control) “pass” or “fail” a math test.

        RtI       Control
Pass    75 (a)    40 (b)
Fail    5 (c)     25 (d)

Page 35: Meta analysis with R

Calculating Effect Sizes

• Of n = 80 in RtI, the ratio of passing is 15 to 1.

• Of n = 65 in control, the ratio of passing is 1.6 to 1.

• Calculate the odds ratio:

OR = ad/bc = ?
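Worked out with the cell labels from the RtI table (a = 75, b = 40, c = 5, d = 25), the calculation would read:

$$OR = \frac{ad}{bc} = \frac{75 \times 25}{40 \times 5} = \frac{1875}{200} = 9.375$$

This matches the ratio of the two sets of odds quoted above, 15/1.6 = 9.375.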

Page 36: Meta analysis with R

Combining Effect Sizes

• Once individual study effect sizes have been calculated, the next step involves combining them to provide an average effect size.

• You must weight the individual effect sizes.
  – What do you base this weight on?

Page 37: Meta analysis with R

Combining Effect Sizes

• Suppose you have 7 d-indexes and group ns that compare the effect of homework vs. no homework on a measure of academic achievement:

Study    ni1     ni2     di
1        259     265     0.02
2        57      62      0.07
3        43      50      0.24
4        230     228     0.11
5        296     291     0.09
6        129     131     0.32
7        69      74      0.17
∑        1083    1101    1.02

Page 38: Meta analysis with R

Combining Effect Sizes

• Step One: Weighting

• Formula:

$$w_i = \frac{2(n_{i1} + n_{i2})\, n_{i1} n_{i2}}{2(n_{i1} + n_{i2})^2 + n_{i1} n_{i2} d_i^2}$$

• EX: Study 1

$$w_1 = \frac{2(259 + 265) \cdot 259 \cdot 265}{2(259 + 265)^2 + 259 \cdot 265 \cdot .02^2} = 130.98$$

Page 39: Meta analysis with R

Combining Effect Sizes

• Calculations:

Study    ni1     ni2     di      wi
1        259     265     0.02    130.98
2        57      62      0.07    29.68
3        43      50      0.24    22.95
4        230     228     0.11    114.32
5        296     291     0.09    146.59
6        129     131     0.32    64.17
7        69      74      0.17    35.58
∑        1083    1101    1.02    544.27

Page 40: Meta analysis with R

Combining Effect Sizes

• Step Two: Multiply each d-index by its weight

• Formula: $d_i w_i$

• EX: What is the answer for Study 1?

Page 41: Meta analysis with R

Combining Effect Sizes

• Calculations:

Study    ni1     ni2     di      wi        di wi
1        259     265     0.02    130.98    2.619
2        57      62      0.07    29.68     2.078
3        43      50      0.24    22.95     5.509
4        230     228     0.11    114.32    12.576
5        296     291     0.09    146.59    13.193
6        129     131     0.32    64.17     20.536
7        69      74      0.17    35.58     6.048
∑        1083    1101    1.02    544.27    62.559

Page 42: Meta analysis with R

Combining Effect Sizes

• Step Three: Divide the sum of these products by the sum of the weights.

• Formula:

$$d_{\cdot} = \frac{\sum_{i=1}^{k} d_i w_i}{\sum_{i=1}^{k} w_i}$$

• EX: d. = 62.56/544.27 = +.115 (average ES)

Page 43: Meta analysis with R

Combining Effect Sizes

• Step Four: Computing Confidence Intervals

• Formula:

$$CI_{d_{\cdot}95\%} = d_{\cdot} \pm 1.96\sqrt{\frac{1}{\sum_{i=1}^{k} w_i}}$$

• EX:

$$CI_{d_{\cdot}95\%} = .115 \pm 1.96\sqrt{\frac{1}{544.27}} = .115 \pm .084$$

  – Thus, the 95% confidence interval for the average effect runs from .031 to .199. Do we reject the null?
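The four steps for the homework d-indexes can be strung together in a short Python sketch. It assumes the weight w = 2(n1 + n2)·n1·n2 / (2(n1 + n2)² + n1·n2·d²), a weighted mean of the d-indexes, and a 95% interval of ±1.96·√(1/Σw); the variable and function names are illustrative.

```python
import math

# (n_i1, n_i2, d_i) for the seven homework studies
studies = [
    (259, 265, 0.02), (57, 62, 0.07), (43, 50, 0.24), (230, 228, 0.11),
    (296, 291, 0.09), (129, 131, 0.32), (69, 74, 0.17),
]

def weight(n1, n2, d):
    """Step 1: weight for one study's d-index (larger studies get larger weights)."""
    total = n1 + n2
    return 2 * total * n1 * n2 / (2 * total**2 + n1 * n2 * d**2)

weights = [weight(n1, n2, d) for n1, n2, d in studies]
sum_w = sum(weights)

# Steps 2 and 3: sum of d_i * w_i, divided by the sum of the weights.
sum_dw = sum(d * w for (_, _, d), w in zip(studies, weights))
d_avg = sum_dw / sum_w

# Step 4: half-width of the 95% confidence interval.
half_width = 1.96 * math.sqrt(1 / sum_w)
print(round(d_avg, 3), round(d_avg - half_width, 3), round(d_avg + half_width, 3))
```

Run on unrounded weights, the sketch reproduces the average of .115 and the interval of .031 to .199; the sum of weights differs from the tabled 544.27 only in the second decimal because the table sums rounded values.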

Page 44: Meta analysis with R

Combining Effect Sizes

• Suppose that you have 6 r-indexes and ns that show the relationship between the amount of time students spend on homework and their score on an achievement test.

• Step One: Transform the r-indexes into z-scores because as r gets larger the distribution gets more skewed.

Formula:

$$z_r = .5 \log_e\left[\frac{1 + r}{1 - r}\right]$$

Study    ni       ri      zi
1        3505     0.06    0.06
2        3606     0.12    0.12
3        4157     0.22    0.22
4        1021     0.08    0.08
5        1955     0.27    0.28
6        12146    0.26    0.27
∑        26390    1.01    1.03

Page 45: Meta analysis with R

Combining Effect Sizes

• Step Two: Weighting

• Formula: $n_i - 3$

• EX: Study 1: 3,505 − 3 = 3,502

Page 46: Meta analysis with R

Combining Effect Sizes

• Calculations:

Study    ni       ri      zi      ni - 3
1        3505     0.06    0.06    3502
2        3606     0.12    0.12    3603
3        4157     0.22    0.22    4154
4        1021     0.08    0.08    1018
5        1955     0.27    0.28    1952
6        12146    0.26    0.27    12143
∑        26390    1.01    1.03    26372

Page 47: Meta analysis with R

Combining Effect Sizes

• Step Three: multiply the weight and the effect size (i.e., z-score)

• Formula: $(n_i - 3) z_i$

• EX: Study 1: (3,502)(.06) = 210.12

Page 48: Meta analysis with R

Combining Effect Sizes

• Calculations:

Study    ni       ri      zi      ni - 3    (ni - 3)zi
1        3505     0.06    0.06    3502      210.12
2        3606     0.12    0.12    3603      432.36
3        4157     0.22    0.22    4154      913.88
4        1021     0.08    0.08    1018      81.44
5        1955     0.27    0.28    1952      546.56
6        12146    0.26    0.27    12143     3278.61
∑        26390    1.01    1.03    26372     5462.97

Page 49: Meta analysis with R

Combining Effect Sizes

• Step Four: Divide the sum of these products by the sum of the weights.

• Formula:

$$z_{\cdot} = \frac{\sum_{i=1}^{k} (n_i - 3) z_i}{\sum_{i=1}^{k} (n_i - 3)}$$

• EX: z. = 5462.97/26,372 = +.207 (average ES)

Page 50: Meta analysis with R

Combining Effect Sizes

• Step Five: Computing Confidence Intervals

• Formula:

$$CI_{z95\%} = z_{\cdot} \pm \frac{1.96}{\sqrt{\sum_{i=1}^{k} (n_i - 3)}}$$

• EX: CIz95% = .207 ± 1.96/√26,372 = .207 ± .012

  – Thus, the 95% confidence interval for the average effect runs from .195 to .219. Do we reject the null?
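The five steps for the r-indexes can be sketched in Python as below. It assumes Fisher's transformation z = .5·ln((1 + r)/(1 − r)), weights of n − 3, and a 95% interval of ±1.96/√Σ(n − 3). Rounding each z to two decimals is done here only to mirror the tabled values; the names are illustrative.

```python
import math

# (n_i, r_i) for the six homework/achievement studies
studies = [(3505, 0.06), (3606, 0.12), (4157, 0.22),
           (1021, 0.08), (1955, 0.27), (12146, 0.26)]

def fisher_z(r):
    """Step 1: Fisher's r-to-z transformation."""
    return 0.5 * math.log((1 + r) / (1 - r))

# Step 2: weights are n - 3. z values are rounded to two decimals to match the table.
weights = [n - 3 for n, _ in studies]
z_values = [round(fisher_z(r), 2) for _, r in studies]

# Steps 3 and 4: weighted sum of z divided by the sum of the weights.
z_avg = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)

# Step 5: half-width of the 95% confidence interval.
half_width = 1.96 / math.sqrt(sum(weights))
print(round(z_avg, 3), round(z_avg - half_width, 3), round(z_avg + half_width, 3))
```

With the rounded z values this reproduces .207 and the interval of .195 to .219. Carrying the z values at full precision instead gives an average of about .206, so the third decimal here is sensitive to rounding.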

Page 51: Meta analysis with R

Visualization

Page 52: Meta analysis with R

Funnel Plots

Funnel plots are a device for checking for publication bias.

• Each dot represents the overall effect from one RCT.

• As sample size increases, the width of the confidence interval should decrease.

• Result should be located in a symmetric, triangular area centered on the overall effect for all studies.

Page 53: Meta analysis with R

Funnel Plots

Missing studies will manifest as an asymmetry in the funnel plot.

• Missing studies will appear as a gap in the portion of the funnel plot where you would expect to find negative studies.

• The unopposed positive studies will shift the apparent treatment effect (blue line) towards a larger effect size than it really is.

Page 54: Meta analysis with R

Heterogeneity

• Refers to differences between the outcomes of studies included in a meta-analysis.

• If most studies are similar to each other and show a similar result (low heterogeneity), this increases confidence that the effect being measured is real.

• If results from different studies are vastly different from each other, this suggests that each study is measuring something slightly different from the other studies.

• High heterogeneity can be due to:
  • Random chance
  • Differences in patient populations between studies
  • Differences in treatment
  • Differences in assessing outcomes
  • Other differences in study methodology

Page 55: Meta analysis with R

Measures of Heterogeneity

Systematic reviews with high heterogeneity should either not combine results (in a meta-analysis) or should use statistical methods to compensate for the heterogeneity.

Fixed effects model. Assumes that any differences between study results are due only to random chance. Appropriate when heterogeneity is low.

Random effects model. Makes some conservative assumptions in order to combine studies. The overall result should be interpreted with caution since each study seems to be actually measuring something slightly different from the others. In a sense, the random effects model is comparing apples and oranges.

Subgroup analysis. If heterogeneity is high, but the differences may be due to known factors (e.g., patient age), results are sometimes stratified by these known factors and then individual strata from different studies become similar enough that they can be combined.
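The slides do not name a specific heterogeneity statistic. As an assumption for illustration, the Python sketch below uses two common ones, Cochran's Q (the weighted sum of squared deviations from the pooled effect) and I² (the share of Q in excess of its degrees of freedom), applied to the homework d-indexes and their tabled weights. The function names are illustrative.

```python
# d-indexes and weights as tabled for the seven homework studies
effects = [0.02, 0.07, 0.24, 0.11, 0.09, 0.32, 0.17]
weights = [130.98, 29.68, 22.95, 114.32, 146.59, 64.17, 35.58]

def cochran_q(effects, weights):
    """Cochran's Q: weighted squared deviations of each effect from the weighted mean."""
    mean = sum(e * w for e, w in zip(effects, weights)) / sum(weights)
    return sum(w * (e - mean) ** 2 for e, w in zip(effects, weights))

def i_squared(q, k):
    """I^2 = (Q - df) / Q with df = k - 1, truncated at zero."""
    df = k - 1
    return max(0.0, (q - df) / q) if q > 0 else 0.0

q = cochran_q(effects, weights)
i2 = i_squared(q, len(effects))
print(round(q, 2), round(i2, 2))
```

For these seven studies Q falls below its 6 degrees of freedom, so I² is truncated to 0, which is the low-heterogeneity situation in which the slides describe a fixed effects model as appropriate.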

Page 56: Meta analysis with R

Forest plots

Page 57: Meta analysis with R

Reading Forest Plots

• Green squares represent point estimates

• The size of the square is proportional to the number of subjects in the group.

• The horizontal lines show the 95% confidence interval.

• The black diamonds represent the combined results for each subgroup.

• Note that this analysis used a fixed effects model.

Page 58: Meta analysis with R

Examples

Page 59: Meta analysis with R

Conducting Meta-Analyses in R

• The meta, rmeta and metafor packages can be used for conducting meta-analyses in R.

Page 60: Meta analysis with R

Tuberculosis

• The data set taken from van Houwelingen, Arends, and Stijnen (2002) consists of randomized controlled trials of a vaccine, Bacillus Calmette-Guerin (BCG), for the prevention of tuberculosis (TB).

• The data presented consist of the sample size and the number of cases of tuberculosis. Furthermore some covariates are available that might explain the heterogeneity among studies: geographic latitude of the place where the study was done, year of publication, and method of treatment allocation (random, alternate, or systematic).

Page 61: Meta analysis with R

Tuberculosis

Page 62: Meta analysis with R

Tuberculosis

Page 63: Meta analysis with R

Tuberculosis

Page 64: Meta analysis with R

Dentifrices

• The data set is taken from Abrams and Sanso (1998) and concerns a previously published meta-analysis which was conducted of all randomized controlled trials comparing sodium monofluorophosphate (SMFP) to sodium fluoride (NaF) dentifrices (toothpastes) in the prevention of caries; see Johnson (1993).

• The outcome in each trial was the change from baseline in the decayed missing (due to caries) filled surface (DMFS) dental index at three years follow-up.

Page 65: Meta analysis with R

Dentifrices

Page 66: Meta analysis with R

Dentifrices

Page 67: Meta analysis with R
Page 68: Meta analysis with R

Validity

• The studies were usually conducted in multisection courses in which the sections had different instructors but all sections used a common final examination. The index of validity was a correlation coefficient (a partial correlation coefficient, controlling for a measure of student ability) between the section mean instructor ratings and the section mean examination score.

Page 69: Meta analysis with R

Validity

Page 70: Meta analysis with R

Validity

Page 71: Meta analysis with R

Validity