[ppt]slide 1 - welcome to the ncrm eprints repository...

2 July 2008 RMF – Meta-analysis workshop (Marsh, O’Mara, Malmberg) 1 NCRM Research Methods Festival University of Oxford Department of Education


2 July 2008 RMF – Meta-analysis workshop (Marsh, O’Mara, Malmberg) 1

NCRM Research Methods Festival
University of Oxford

Department of Education

What is meta-analysis; when and why we use meta-analysis; examples of meta-analyses; benefits and pitfalls of using meta-analysis; defining a population of studies and finding publications; coding materials; inter-rater reliability; computing effect sizes; structuring a database; and a conceptual introduction to analysis and interpretation of results based on fixed effects, random effects, and multilevel models.

2

Meta-analysis is an increasingly popular tool for summarising research findings

Cited extensively in research literature

Relied upon by policymakers

Important that we understand the method, whether we conduct or simply consume meta-analytic research

Should be one of the topics covered in all introductory research methodology courses

3

What is meta-analysis

When and why we use meta-analysis

4

Systematic synthesis of various studies on a particular research question
  Do boys or girls have higher self-concepts?

Collect all studies relevant to a topic
  Find all published journal articles on the topic

An effect size is calculated for each outcome
  Determine the size/direction of the gender difference for each study

“Content analysis”
  Code characteristics of the study: age, setting, ethnicity, self-concept domain (math, physical, social), etc.

Effect sizes with similar features are grouped together and compared; tests moderator variables
  Do gender differences vary with age, setting, ethnicity, self-concept domain, etc.?

5

Coding: the process of extracting the information from the literature included in the meta-analysis. Involves noting the characteristics of the studies in relation to a priori variables of interest (qualitative)

Effect size: the numerical outcome to be analysed in a meta-analysis; a summary statistic of the data in each study included in the meta-analysis (quantitative)

Summarise effect sizes: central tendency, variability, relations to study characteristics (quantitative)

6

7

One of the primary aims is to reach a conclusion about the magnitude of an effect, observed in specific samples and inferred to the population

Meta-analysis can test whether the studies' outcomes show more variation than is expected from sampling different research participants

In such cases, study characteristics (e.g., the measurement instrument used, population sampled, or aspects of the study's design) are coded. These characteristics are then used as predictor variables to analyze the excess variation in the effect sizes

8

Which disciplines do meta-analysis?

ISI search, 10 Feb 2008. Topic: meta-analysis; results found: 21,286

10

Amato, P. R., & Keith, B. (1991). Parental divorce and the well-being of children: A meta-analysis. Psychological Bulletin, 110, 26-46. Times Cited: 471

Linn, M. C., & Petersen, A. C. (1985). Emergence and characterization of sex differences in spatial ability: A meta-analysis. Child Development, 56, 1479-1498. Times Cited: 570

Johnson, D. W., et al. (1981). Effects of cooperative, competitive, and individualistic goal structures on achievement: A meta-analysis. Psychological Bulletin, 89, 47-62. Times Cited: 426

Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703-742. Times Cited: 387

Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. Psychological Bulletin, 104, 53-69. Times Cited: 316

Iaffaldano, M. T., & Muchinsky, P. M. (1985). Job satisfaction and job performance: A meta-analysis. Psychological Bulletin, 97, 251-273. Times Cited: 263

11

De Wolff, M., & van IJzendoorn, M. H. (1997). Sensitivity and attachment: A meta-analysis on parental antecedents of infant attachment. Child Development, 68, 571-591. Times Cited: 340

Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72, 655-684. Times Cited: 276

Cohen, E. G. (1994). Restructuring the classroom: Conditions for productive small groups. Review of Educational Research, 64, 1-35. Times Cited: 235

Hansen, W. B. (1992). School-based substance abuse prevention: A review of the state of the art in curriculum, 1980-1990. Health Education Research, 7, 403-430. Times Cited: 207

Kulik, J. A., Kulik, C-L., & Cohen, P. A. (1980). Effectiveness of computer-based college teaching: A meta-analysis of findings. Review of Educational Research, 50, 525-544. Times Cited: 198

12

Sheppard, B. H., Hartwick, J., & Warshaw, P. R. (1988). The theory of reasoned action: A meta-analysis of past research with recommendations for modifications and future research. Journal of Consumer Research, 15, 325-343. Times Cited: 515

Jackson, S. E., & Schuler, R. S. (1985). A meta-analysis and conceptual critique of research on role ambiguity and role conflict in work settings. Organizational Behavior and Human Decision Processes, 36, 16-78. Times Cited: 401

Tornatzky LG, Klein KJ. (1994). Innovation characteristics and innovation adoption-implementation - A meta-analysis of findings. IEEE Transactions on Engineering Management, 29, 28-4. Times Cited: 269

Lowe KB, Kroeck KG, Sivasubramaniam N. (1996). Effectiveness correlates of transformational and transactional leadership: A meta-analytic review of the MLQ literature. Leadership Quarterly, 7, 385-425. Times Cited: 203

Churchill GA, Ford NM, Hartley SW, et al. (1985). The determinants of salesperson performance - A meta-analysis. Journal of Marketing Research, 22, 103-118. Times Cited: 189

13

Jadad AR, Moore RA, Carroll D, et al. (1996). Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Controlled Clinical Trials, 17, 1-12. Times Cited: 2008

Boushey CJ, Beresford SAA, Omenn GS, et al. (1995). A quantitative assessment of plasma homocysteine as a risk factor for vascular disease - Probable benefits of increasing folic acid intakes. JAMA: Journal of the American Medical Association, 274, 1049-1057. Times Cited: 2,128

Alberti W, Anderson G, Bartolucci A, et al. (1995). Chemotherapy in non-small-cell lung cancer - A meta-analysis using updated data on individual patients from 52 randomized clinical trials. British Medical Journal, 311, 899-909. Times Cited: 1,591

Block G, Patterson B, Subar A. (1992). Fruit, vegetables, and cancer prevention - A review of the epidemiologic evidence. Nutrition and Cancer: An International Journal, 18, 1-29. Times Cited: 1,422

14

Question: Does feedback from university students’ evaluations of teaching lead to improved teaching?

Teachers are randomly assigned to experimental (feedback) and control (no feedback) groups

Feedback group gets ratings, augmented, perhaps, with personal consultation

Groups are compared on subsequent ratings and, perhaps, other variables

Feedback teachers improved their teaching effectiveness by .3 standard deviations compared to control teachers on the Overall Rating item; even larger differences for ratings of Instructor Skill, Attitude Toward Subject, Student Feedback

Studies that augmented feedback with consultation produced substantially larger differences, but other methodological variations had little effect.

15

Question: What is the correlation between university teaching effectiveness and research productivity?

Based on 58 studies and 498 correlations:

The mean correlation between measures of teaching effectiveness (mostly based on SETs) and research productivity was +.06;

This near-zero correlation was consistent across different disciplines, types of university, indicators of research, and components of teaching effectiveness.

This meta-analysis was followed by a primary data study (Marsh & Hattie, 2002) to evaluate the theoretical model more fully

16

Contention about global self-esteem versus multidimensional, domain-specific self-concept

Traditional reviews and previous meta-analyses of self-concept interventions have underestimated effect sizes by using an implicitly unidimensional perspective that emphasizes global self-concept.

We used meta-analysis and a multidimensional construct validation approach to evaluate the impact of self-concept interventions for children in 145 primary studies (200 interventions).

Overall, interventions were significantly effective (d = .51, 460 effect sizes).

However, in support of the multidimensional perspective, interventions targeting a specific self-concept domain and subsequently measuring that domain were much more effective (d = 1.16).

This supports a multidimensional perspective of self-concept

17

Examined predictors of sexual, nonsexual violent, and general (any) recidivism

82 recidivism studies

Identified deviant sexual preferences and antisocial orientation as the major predictors of sexual recidivism for both adult and adolescent sexual offenders. Antisocial orientation was the major predictor of violent recidivism and general (any) recidivism

Concluded that many of the variables commonly addressed in sex offender treatment programs (e.g., psychological distress, denial of sex crime, victim empathy, stated motivation for treatment) had little or no relationship with sexual or violent recidivism

18

“Epidemiologic studies have suggested that folate intake decreases risk of cardiovascular diseases. However, the results of randomized controlled trials on dietary supplementation with folic acid to date have been inconsistent”

Included 12 randomised controlled trials

The overall relative risks (95% confidence intervals) of outcomes for patients treated with folic acid supplementation compared with controls were 0.95 (0.88-1.03) for cardiovascular diseases, 1.04 (0.92-1.17) for coronary heart disease, 0.86 (0.71-1.04) for stroke, and 0.96 (0.88-1.04) for all-cause mortality.

Concluded folic acid supplementation does not reduce risk of cardiovascular diseases or all-cause mortality among participants with prior history of vascular disease.

19

In lekking species (those that gather for competitive mating), a male's mating success can be estimated as the number of females that he copulates with.

Aim of the study was to find predictors of lekking species’ mating success through analysis of 48 studies

Behavioural traits such as male display activity, aggression rate, and lek attendance were positively correlated with male mating success. The size of "extravagant" traits, such as birds' tails and ungulate antlers, and age were also positively correlated with male mating success.

Territory position was negatively correlated with male mating success, such that males with territories close to the geometric centre of the leks had higher mating success than other males.

Male morphology (measure of body size) and territory size showed small effects on male mating success.

20

21

Compared to traditional literature reviews:

(1) there is a definite methodology employed in the research analysis; and

(2) the results of the included studies are quantified to a standard metric, thus allowing for statistical techniques for further analysis.

Therefore less biased and more replicable

Able to establish generalisability across many studies (and study characteristics).

22

Analyzing the results from a group of studies can allow more accurate data analysis

Increased power

Enhanced precision due to averaging out the sampling error deviations from the true values

Also, provides corrections to mean values with distortions due to measurement error and other possible artefacts

23

Studies that are published are more likely to report statistically significant findings. This is a source of potential bias.

The debate about using only published studies: peer-reviewed studies are presumably of a higher quality VERSUS significant findings are more likely to be published than non-significant findings

There is no agreed-upon solution. However, one should retrieve all studies that meet the eligibility criteria, and be explicit about how publication bias was dealt with. Some methods for dealing with publication bias have been developed (e.g., Fail-safe N, Trim and Fill method).
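The slide names Fail-safe N without giving a formula. As a hedged illustration only, the sketch below uses Rosenthal's version, which asks how many unretrieved null-result studies would be needed to bring a combined one-tailed test down to p = .05. The function name and the example z-scores are hypothetical and do not come from the workshop.

```python
def rosenthal_fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: (sum of z)^2 / z_alpha^2 - k.

    z_scores: one standard normal deviate per retrieved study (assumed input).
    z_alpha:  critical z for a one-tailed test at alpha = .05.
    """
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k


# Hypothetical z-scores from three retrieved studies
example_z = [2.0, 2.5, 3.0]
print(round(rosenthal_fail_safe_n(example_z), 1))
```

A small fail-safe N relative to the number of retrieved studies suggests the summary result could be fragile to unpublished null findings.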

24

Increasingly, meta-analysts evaluate the quality of each study included in a meta-analysis.

Sometimes this is a global holistic (subjective) rating. In this case it is important to have multiple raters to establish inter-rater agreement (more on this later).

Sometimes study quality is quantified in relation to objective criteria of a good study, e.g. larger sample sizes; more representative samples; better measures; use of random assignment; appropriate control for potential bias; double blinding, and low attrition rates (particularly for longitudinal studies)

25

Meta-analyses should always include subjective and/or objective indicators of study quality.

In the social sciences there is some evidence that studies with highly inadequate control for pre-existing differences lead to inflated effect sizes. However, it is surprising that other indicators of study quality make so little difference.

In medical research, studies are largely limited to RCTs, where there is MUCH more control than in social science research. Here there is evidence that inadequate concealment of assignment and lack of double-blinding inflate effect sizes, but perhaps only for subjective outcomes.

These issues are likely to be idiosyncratic to individual discipline areas and research questions.

26

Defining a population of studies and finding publications

Coding materials

Inter-rater reliability

Computing effect sizes

Structuring a database

27

28

Comparison of treatment & control groups? What is the effectiveness of a reading skills program for treatment group compared to an inactive control group?

Pretest-posttest differences? Is there a change in motivation over time?

What is the correlation between two variables? What is the relation between teaching effectiveness and research productivity?

Moderators of an outcome? Does gender moderate the effect of a peer-tutoring program on academic achievement?

29

Do you wish to generalise your findings to other studies not in the sample?

Do you have multiple outcomes per study? E.g., achievement in different school subjects; 5 different personality scales; multiple criteria of success

Such questions determine the choice of meta-analytic model: fixed effects, random effects, or multilevel

30

Need to have explicit inclusion and exclusion criteria

The broader the research domain, the more detailed they tend to become

Refine criteria as you interact with the literature

Components of detailed criteria: distinguishing features; research respondents; key variables; research methods; cultural and linguistic range; time frame; publication types

31

Search electronic databases (e.g., ISI, Psychological Abstracts, Expanded Academic ASAP, Social Sciences Index, PsycINFO, and ERIC)

Examine the reference lists of included studies to find other relevant studies

If including unpublished data, email researchers in your discipline, take advantage of Listservs, and search Dissertation Abstracts International

32

Inclusion process usually requires several steps to cull inappropriate studies

Example from Bazzano, L. A., Reynolds, K., Holder, K. N., & He, J. (2006).Effect of Folic Acid Supplementation on Risk of Cardiovascular Diseases: A Meta-analysis of Randomized Controlled Trials. JAMA, 296, 2720-2726

33

Code Sheet

Study ID: 1
Year of publication: 99
Publication type (1-5): 2
Geographical region (1-7): 1
Total sample size: 87
Total number of males: 41
Total number of females: 46

Code Book/manual

Publication type (1-5)
1. Journal article
2. Book/book chapter
3. Thesis or doctoral dissertation
4. Technical report
5. Conference paper

34

Random selection of papers coded by both coders

Meet to compare code sheets

Where there is discrepancy, discuss to reach agreement

Amend code materials/definitions in code book if necessary

May need to do several rounds of piloting, each time using different papers

35

Percent agreement: common but not recommended

Cohen’s kappa coefficient

Kappa is the proportion of the optimum improvement over chance attained by the coders, where a value of 1 indicates perfect agreement and a value of 0 indicates that agreement is no better than that expected by chance

Kappas over .40 are considered to be a moderate level of agreement (but there is no clear basis for this “guideline”)
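To make the "improvement over chance" idea concrete, here is a minimal sketch of Cohen's kappa for two coders who each assigned one category code per study. The function name and the example codes are hypothetical; the formula is the standard (observed - expected) / (1 - expected).

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same items."""
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal proportions
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both coders use a single category
    return (observed - expected) / (1 - expected)


# Hypothetical publication-type codes from two coders for ten studies
coder_a = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
coder_b = [1, 1, 2, 2, 3, 3, 1, 2, 3, 2]
print(round(cohens_kappa(coder_a, coder_b), 3))
```

In this example the coders agree on 9 of 10 studies (90% agreement), yet kappa is lower because some of that agreement would be expected by chance alone.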

Correlation between different raters

Intraclass correlation. Agreement among multiple raters corrected for number of raters using Spearman-Brown formula (r)
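The Spearman-Brown correction mentioned above can be sketched as follows. This assumes the usual form of the formula, in which the reliability of the aggregate of k raters is derived from the mean correlation between pairs of raters; the function name and example values are illustrative.

```python
def spearman_brown(mean_r, k):
    """Reliability of the average of k raters, given the mean inter-rater r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical case: four coders whose ratings correlate .50 on average
print(spearman_brown(0.50, 4))
```

With a single rater (k = 1) the formula returns the original correlation, and the aggregate reliability rises as raters are added.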

36

The purpose of this exercise is to explore various issues of meta-analytic methodology

Discuss in groups of 3-4 people the following issues in relation to the gender differences in smiling study (LaFrance et al., 2003)

1. Did the aims of the study justify conducting a meta-analysis?

2. Were the selection criteria and the search process explicit?

3. How did they deal with interrater (coder) reliability?

37

1. Extend previous meta-analyses, include previously untested moderators based on theory/empirical observations

2. Search process: detailed databases and 5 other sources of studies, search terms. Selection criteria: justification provided (e.g., for excluding under the age of 13). However, not clear how many studies were retrieved and then eventually included (compare with flow chart on slide 51)

3. Multiple coders (group of coders consisted of four people with two raters of each sex coding each moderator). Interrater reliability was calculated by taking the aggregate reliability of the four coders at each time using the Spearman–Brown formula

38

39

The effect size makes meta-analysis possible

It is based on the “dependent variable” (i.e., the outcome)

It standardizes findings across studies such that they can be directly compared

Any standardized index can be an “effect size” (e.g., standardized mean difference, correlation coefficient, odds-ratio), but must
  be comparable across studies (standardization)
  represent magnitude & direction of the relation
  be independent of sample size

40

41

Means and standard deviations

Correlations

P-values

F-statistics

d

t-statistics

SE

Lipsey & Wilson (2001) present many formulae for calculating effect sizes from different information

However, need to convert all effect sizes into a common metric, typically based on the “natural” metric given research in the area. E.g.:
  Standardized mean difference
  Odds-ratio
  Correlation coefficient

42

Standardized mean difference
  Group contrast research (treatment groups; naturally occurring groups)
  Inherently continuous construct

Odds-ratio
  Group contrast research (treatment groups; naturally occurring groups)
  Inherently dichotomous construct

Correlation coefficient
  Association between variables research

43

        Study 1        Study 2        Study 3
        Cntr   Exp     Cntr   Exp     Cntr   Exp
N        10    10       50    50      100   100
M       100   105      100   105      100   105
SD       15    15       15    15       15    15
t         ?              ?              ?
p         ?              ?              ?
d         ?              ?              ?

44

45

        Study 1        Study 2        Study 3
        Cntr   Exp     Cntr   Exp     Cntr   Exp
N        10    10       50    50      100   100
M       100   105      100   105      100   105
SD       15    15       15    15       15    15
t       -0.750         -1.667         -2.360
p        0.466          0.099          0.019
d        0.333          0.333          0.333

Represents a standardized group contrast on an inherently continuous measure

Uses the pooled standard deviation (some situations use control group standard deviation)

Commonly called “d”

$$ES = \frac{\bar{X}_{G1} - \bar{X}_{G2}}{s_{pooled}}$$

If $n_1 = n_2$:

$$s_{pooled} = \sqrt{\frac{s_1^2 + s_2^2}{2}}$$

In a gender difference study, the effect size might be:

$$ES = \frac{\bar{X}_{Males} - \bar{X}_{Females}}{SD_{pooled}}$$

In an intervention study with experimental and control groups, the effect size might be:

$$ES = \frac{\bar{X}_{Exper} - \bar{X}_{Control}}{SD_{pooled}}$$
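As a minimal sketch of the standardized mean difference, the Python below computes d and t from the summary statistics of the three hypothetical studies tabulated earlier. The function name and argument order are assumptions made for illustration. The pooled standard deviation uses the general sample-size-weighted form, which is an addition here and reduces to the equal-n formula when the two groups are the same size.

```python
import math


def d_and_t(n_c, m_c, sd_c, n_e, m_e, sd_e):
    """Return (d, t) for control (c) and experimental (e) group summaries."""
    # General pooled variance, weighting each group by its degrees of freedom
    pooled_var = ((n_c - 1) * sd_c ** 2 + (n_e - 1) * sd_e ** 2) / (n_c + n_e - 2)
    sd_pooled = math.sqrt(pooled_var)
    # d: experimental minus control, in pooled standard deviation units
    d = (m_e - m_c) / sd_pooled
    # t: control minus experimental, matching the sign used in the table
    t = (m_c - m_e) / (sd_pooled * math.sqrt(1 / n_c + 1 / n_e))
    return d, t


studies = {
    "Study 1": (10, 100, 15, 10, 105, 15),
    "Study 2": (50, 100, 15, 50, 105, 15),
    "Study 3": (100, 100, 15, 100, 105, 15),
}
for name, stats in studies.items():
    d, t = d_and_t(*stats)
    print(f"{name}: d = {d:.3f}, t = {t:.3f}")
```

The effect size d is the same in all three studies, while the magnitude of t (and hence the p-value) changes with sample size. This is why the effect size, rather than the significance test, is the unit of analysis.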

46

Represents the strength of association between two inherently continuous measures

Generally reported directly as r (the Pearson product moment coefficient)

$$ES = r$$

47

48

    r        d
  0.90     4.13
  0.80     2.67
  0.70     1.96
  0.60     1.50
  0.50     1.15
  0.40     0.87
  0.30     0.63
  0.20     0.41
  0.10     0.20
  0.00     0.00
 -0.10    -0.20
 -0.20    -0.41
 -0.30    -0.63
 -0.40    -0.87
 -0.50    -1.15
 -0.60    -1.50
 -0.70    -1.96
 -0.80    -2.67
 -0.90    -4.13
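The r and d values in the table are consistent with the standard conversion d = 2r / sqrt(1 - r^2). The sketch below assumes that formula (it is not stated on the slide) and, for the reverse direction, the equal-group-size form r = d / sqrt(d^2 + 4). The function names are illustrative.

```python
import math


def r_to_d(r):
    """Convert a correlation to a standardized mean difference."""
    return 2 * r / math.sqrt(1 - r ** 2)


def d_to_r(d):
    """Convert d back to r, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4)


for r in (0.10, 0.30, 0.50, 0.90):
    print(f"r = {r:.2f} -> d = {r_to_d(r):.2f}")
```

The conversion is strongly non-linear for large correlations, which is one reason to settle on a single metric before pooling.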

Alternatively: transform rs into Fisher’s Zr-transformed rs, which are more normally distributed
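A brief sketch of the Fisher transformation, assuming the usual definitions: z = atanh(r), with standard error 1 / sqrt(n - 3) for a study of n participants. The function names are illustrative, and summary values are transformed back to r for reporting.

```python
import math


def fisher_z(r):
    """Fisher's Zr transform of a correlation coefficient."""
    return math.atanh(r)


def fisher_z_se(n):
    """Standard error of Zr for a study with n participants."""
    return 1 / math.sqrt(n - 3)


def z_to_r(z):
    """Back-transform a (mean) Zr value to the correlation metric."""
    return math.tanh(z)


z = fisher_z(0.50)
print(round(z, 4), round(fisher_z_se(103), 2), round(z_to_r(z), 2))
```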

The odds-ratio is based on a 2 by 2 contingency table

The Odds-Ratio is the odds of success in the treatment group relative to the odds of success in the control group

49

$$ES = \frac{ad}{bc}$$

Frequencies

                    Success   Failure
Treatment Group        a         b
Control Group          c         d
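The odds-ratio can be computed directly from the four cell frequencies. The sketch below uses hypothetical counts. The log transformation and its standard error are standard practice for analysing odds-ratios but are an addition here, not something stated on the slide.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds of success in treatment (a/b) relative to control (c/d)."""
    return (a * d) / (b * c)


def log_odds_ratio_se(a, b, c, d):
    """Standard error of the natural log of the odds-ratio."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical frequencies: treatment 30 success / 20 failure,
# control 15 success / 35 failure
print(odds_ratio(30, 20, 15, 35))
print(round(log_odds_ratio_se(30, 20, 15, 35), 3))
```

An odds-ratio of 1 indicates no difference between the groups; the functions above assume that no cell is zero.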

Hedges proposed a correction for small sample size bias (n < 20)

Must be applied before analysis

$$ES'_{sm} = ES_{sm}\left[1 - \frac{3}{4N - 9}\right]$$

50
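The small-sample correction can be sketched in a few lines, assuming N is the total sample size across both groups. The function name and example values are illustrative.

```python
def hedges_correction(es, total_n):
    """Apply Hedges' small-sample correction to a standardized mean difference."""
    return es * (1 - 3 / (4 * total_n - 9))


# Hypothetical study: d = 0.50 from a total of 20 participants
print(round(hedges_correction(0.50, 20), 3))
```

The correction always shrinks the effect size slightly, and the adjustment becomes negligible as N grows.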

The effect sizes are weighted by the inverse of the variance to give more weight to effects based on large sample sizes

Variance is calculated as

$$v_i = \frac{n_1 + n_2}{n_1 n_2} + \frac{d_i^2}{2(n_1 + n_2)}$$

The standard error of each effect size is given by the square root of the sampling variance

$$SE_i = \sqrt{v_i}$$

51
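The variance, standard error, and inverse-variance weight of a standardized mean difference can be sketched as follows, using the same d = 0.333 and group sizes as the three hypothetical studies tabulated earlier. The function names are assumptions for illustration.

```python
import math


def d_variance(d, n1, n2):
    """Sampling variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def d_standard_error(d, n1, n2):
    """Standard error: square root of the sampling variance."""
    return math.sqrt(d_variance(d, n1, n2))


def inverse_variance_weight(d, n1, n2):
    """Weight given to the effect size when averaging across studies."""
    return 1 / d_variance(d, n1, n2)


for n in (10, 50, 100):
    v = d_variance(1 / 3, n, n)
    print(f"n per group = {n}: v = {v:.4f}, weight = {1 / v:.1f}")
```

The same d carries roughly ten times the weight when each group has 100 participants rather than 10.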

Population
N - ‘size’
M - ‘mean’
d = effect size

Sample
n - ‘size’
m - ‘mean’
d = effect size

The “likely” population parameter is the sample parameter ± uncertainty
Standard errors (s.e.)
Confidence intervals (C.I.)

Interval estimates

52

53

54

Each study is one line in the database: effect size; duration; sample sizes; reliability of the instrument; variance of the effect size
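One way to picture "one line per study" is as a typed record. The sketch below is purely illustrative: the class name, field names, units, and values are all hypothetical and would be replaced by whatever the code book defines.

```python
from dataclasses import dataclass


@dataclass
class EffectSizeRecord:
    """One row of a meta-analytic database (hypothetical layout)."""
    study_id: int
    effect_size: float
    variance: float
    duration_weeks: float
    n_treatment: int
    n_control: int
    reliability: float

    @property
    def weight(self) -> float:
        # Inverse-variance weight used when averaging effect sizes
        return 1.0 / self.variance


# Hypothetical study entered from a completed code sheet
row = EffectSizeRecord(
    study_id=1,
    effect_size=0.33,
    variance=0.04,
    duration_weeks=8,
    n_treatment=50,
    n_control=50,
    reliability=0.85,
)
print(row.study_id, row.effect_size, row.weight)
```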

Fixed effects model

Random effects model

Multilevel model

55

Includes the entire population of studies to be considered; do not want to generalise to other studies not included (e.g., future studies).

All of the variability between effect sizes is due to sampling error alone. Thus, the effect sizes are only weighted by the within-study variance.

Effect sizes are independent.

56 56

There are 2 general ways of conducting a fixed effects meta-analysis: ANOVA & multiple regression

The analogue to the ANOVA homogeneity analysis is appropriate for categorical variables
  Looks for systematic differences between groups of responses within a variable

Multiple regression homogeneity analysis is more appropriate for continuous variables and/or when there are multiple variables to be analysed
  Tests the ability of groups within each variable to predict the effect size
  Can include categorical variables in multiple regression as dummy variables (ANOVA is a special case of multiple regression)

57 57

$$Q = \sum_{i=1}^{k} w_i \left(ES_i - \overline{ES}\right)^2$$

The homogeneity (Q) test asks whether the different effect sizes are likely to have all come from the same population (an assumption of the fixed effects model). Are the differences among the effect sizes no bigger than might be expected by chance?

$ES_i$ = effect size for each study (i = 1 to k)

$\overline{ES}$ = mean effect size

$w_i$ = a weight for each study based on the sample size

However, this (chi-square) test is heavily dependent on sample size. It is almost always significant unless the numbers (studies and people in each study) are VERY small. This means that the fixed effect model will almost always be rejected in favour of a random effects model.
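A minimal sketch of the fixed effects calculations: the weighted mean effect size, its standard error, and the Q statistic. It assumes inverse-variance weights, and the function name and example data are hypothetical. Q is compared against a chi-square distribution with k - 1 degrees of freedom, which this sketch does not compute.

```python
import math


def fixed_effects_summary(effect_sizes, variances):
    """Return (weighted mean ES, its standard error, Q)."""
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    se_mean = math.sqrt(1 / total_weight)
    # Q: weighted sum of squared deviations from the mean effect size
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    return mean_es, se_mean, q


# Three hypothetical studies with equal sampling variances
mean_es, se_mean, q = fixed_effects_summary([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])
print(round(mean_es, 3), round(se_mean, 3), round(q, 3))
```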

58 58

On the next slide, we will look at these outcomes in more detail to show the importance of various moderator variables

Do Psychosocial and Study Skill Factors Predict College Outcomes? A Meta-Analysis
Robbins, Lauver, Le, Davis, Langley, & Carlstrom (2004). Psychological Bulletin, 130, 261–288

Aim: To examine the relationship between psychosocial and study skill factors (PSFs) and college retention by meta-analyzing 109 studies

59

60

N = sample size for that variable

k = number of correlation coefficients on which each distribution was based

r = mean observed correlation

CIr 10% = lower bound of the confidence interval for observed r

CIr 90% = upper bound of the confidence interval for observed r

Academic related skills: largest effect size

Institutional size: smallest effect size

Statistically significant because CI does not contain zero

Not statistically significant because CI contains zero

Regression coefficients and their standard errors

                    B        SE      Sig?
Target            .4892    .0552     yes
Target-related    .1097    .0587     no
Non-target        .0805    .0489     no

From O’Mara, Marsh, Craven, & Debus (2006)

61

Target self-concept domains are those that are directly relevant to the intervention

Target-related are those that are logically relevant to the intervention, but not focal

Non-target are domains that are not expected to be enhanced by the intervention

61

Is only a sample of studies from the entire population of studies to be considered; want to generalise to other studies not included (including future studies).

Variability between effect sizes is due to sampling error plus variability in the population of effects.

Effect sizes are independent.

62

Variations in sampling schemes can introduce heterogeneity to the result, which is the presence of more than one intercept in the solution

E.g., if some studies used 30mg of a drug, and others used 50mg, then we would plausibly expect two clusters to be present in the data, each varying around the mean of one dosage or the other

Random effects models account for this

63

If the homogeneity test is rejected (it almost always will be), it suggests that there are larger differences than can be explained by chance variation (at the individual participant level). There is more than one “population” in the set of different studies.

Now we turn to the random effects model to determine how much of this between-study variation can be explained by study characteristics that we have coded.

The total variance associated with the effect sizes has two components, one associated with differences within each study (participant level variation) and one between study variance:

64

$$v_{Ti} = v_\theta + v_i$$

64

The random error variance component is added to the variance calculated earlier (see slide 44)

This means that the weighting for each effect size consists of the within-study variance (vi) and between-study variance (vθ)

The new weighting for the random effects model (wiRE) is given by the formula:

65

$$w_{iRE} = \frac{1}{v_i + v_\theta}$$
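The slides do not say how the between-study variance (vθ) is estimated. As one common choice, and purely as an assumption for illustration, the sketch below uses the DerSimonian-Laird moment estimator and then forms the random effects weights. The function names and data are hypothetical.

```python
def dersimonian_laird_tau2(effect_sizes, variances):
    """Moment estimate of the between-study variance (truncated at zero)."""
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    df = len(effect_sizes) - 1
    return max(0.0, (q - df) / c)


def random_effects_weights(variances, tau2):
    """Weights based on within-study plus between-study variance."""
    return [1 / (v + tau2) for v in variances]


# Three hypothetical studies whose effects differ more than chance would predict
effects = [0.1, 0.5, 0.9]
variances = [0.04, 0.04, 0.04]
tau2 = dersimonian_laird_tau2(effects, variances)
print(round(tau2, 3), [round(w, 2) for w in random_effects_weights(variances, tau2)])
```

Because the same between-study variance is added to every study, random effects weights are smaller and more nearly equal across studies than fixed effects weights, which widens the confidence interval around the mean effect.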

Do Self-Concept Interventions Make a Difference? A Synergistic Blend of Construct Validation and Meta-Analysis
O’Mara, Marsh, Craven, & Debus (2006). Educational Psychologist, 41, 181–206

Aim: To examine what factors moderate the effectiveness of self-concept interventions by meta-analyzing 200 interventions

66

QB = between group homogeneity. If the QB value is significant, then the groups (categories) are significantly different from each other

QW = within group homogeneity. If QW is significant, then the effect sizes within a group (category) differ significantly from each other

Only 2 variables had significant QB in the random effects model. ‘Treatment characteristics’ also had significant QW.

Note that the fixed effects are more significant than the random effects

67

Meta-analytic data is inherently hierarchical (i.e., effect sizes nested within studies) and has random error that must be accounted for.

Effect sizes are not necessarily independent

Allows for multiple effect sizes per study

68

New technique that is still being developed

Provides more precise and less biased estimates of between-study variance than traditional techniques

69

Level 1: outcome-level component
  Effect sizes

Level 2: study component
  Publications

70

Intercept-only model, which incorporates both the outcome-level and the study-level components (similar to a random effects model)

Expand model to include predictor variables, to explain systematic variance between the study effect sizes

71

Acute Stressors and Cortisol Responses: A Theoretical Integration and Synthesis of Laboratory Research
Dickerson & Kemeny (2004). Psychological Bulletin, 130, 355–391

Aim: To examine methodological predictors of cortisol responses in a meta-analysis of 208 laboratory studies of acute psychological stressors

72

Only 2 variables significant (Quad Time between stress onset & assessment; Time of day). The quadratic component is difficult to interpret as an unstandardized regression coefficient, but the graph suggests it is meaningfully large

73

Quadratic Function of time since Onset

Fixed, random, or multilevel?

Generally, if more than one effect size per study is included in the sample, multilevel should be used

However, if there is little variation at study level, the results of multilevel modelling meta-analyses are similar to random effects models

74

Do you wish to generalise your findings to other studies not in the sample?
  No – fixed effects
  Yes – random effects or multilevel

Do you have multiple outcomes per study?
  Yes – multilevel
  No – random effects or fixed effects
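The two questions can be read as a simple decision rule. The sketch below is one possible reading of the slide, with a hypothetical function name. It treats multiple outcomes per study as decisive for a multilevel model and otherwise chooses between fixed and random effects on the generalisation question.

```python
def choose_model(generalise: bool, multiple_outcomes: bool) -> str:
    """Suggest a meta-analytic model from the two questions on the slide."""
    if multiple_outcomes:
        # Non-independent effect sizes within studies call for multilevel
        return "multilevel"
    if generalise:
        return "random effects"
    return "fixed effects"


print(choose_model(generalise=True, multiple_outcomes=False))
print(choose_model(generalise=False, multiple_outcomes=False))
print(choose_model(generalise=True, multiple_outcomes=True))
```

In practice the choice also depends on how much variation there is at the study level, so this rule is a starting point rather than a substitute for judgement.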

75 75

The purpose of this exercise is to consider choice of meta-analytic method

Discuss in groups of 3-4 people the question in relation to the gender differences in smiling study (LaFrance et al., 2003)

Is there independence of effect sizes? What are the implications for model choice (fixed, random, multilevel)?

76 76

No independence (research reports = 162, number of effect sizes (k) = 418).

“Of the total number of reports described here, less than one fourth contributed more than one effect size to the moderator analysis... Nevertheless, appropriate caution should be used interpreting these analyses, because they challenge the assumption of effect size independence (p. 313)”.

77 77

The purpose of this exercise is to practice reading meta-analytic results tables.

This study, by Reger et al. (2004), examines the relationship between neuropsychological functioning and driving ability in dementia.

1. In Table 3, which variables are homogeneous for the “on-road tests” driving measure in the “All Studies” column? What does this tell you about those variables?

2. In Table 4, look at the variables that were homogeneous in question (1) for the “on-road tests” using “All Studies”. Which variables have a significant mean ES? Which variable has the largest mean ES?

78 78

1. Homogeneous variables (non-significant Q-values): Mental status–general cognition, Visuospatial skills, Memory, Executive functions, Language

2. All of the relevant mean effect sizes are significant. Memory and language are tied as the largest mean ESs for homogeneous variables (r = .44)

79 79

80

We established what is meta-analysis, when and why we use meta-analysis, and the benefits and pitfalls of using meta-analysis

Summarised how to conduct a meta-analysis

Provided a conceptual introduction to analysis and interpretation of results based on fixed effects, random effects, and multilevel models

Applied this information to examining the methods of a published meta-analysis

81

82

Comparing apples and oranges

Quality of studies included in the meta-analysis

What to do when studies don’t report sufficient information (e.g., “non-significant” findings)?

Including multiple outcomes in the analysis (e.g., different achievement scores)

Publication bias

83

With meta-analysis now one of the most popularly published research methods, it is an exciting time to be involved in meta-analytic research

The hottest topics in meta-analysis are:
  Multilevel modelling to address the issue of independence of effect sizes
  New methods in publication bias assessment

Also receiving attention:
  Establishing guidelines for conducting meta-analysis (best practice)
  Meta-analyses of meta-analyses

84

Purpose-built
  Comprehensive Meta-analysis (commercial)
  Schwarzer (free, http://userpage.fu-berlin.de/~health/meta_e.htm)

Extensions to standard statistics packages
  SPSS, Stata and SAS macros, downloadable from http://mason.gmu.edu/~dwilsonb/ma.html
  Stata add-ons, downloadable from http://www.stata.com/support/faqs/stat/meta.html
  HLM – V-known routine
  MLwiN
  MPlus

85

Cooper, H., & Hedges, L. V. (Eds.) (1994). The handbook of research synthesis (pp. 521–529). New York: Russell Sage Foundation.

Hox, J. (2003). Applied multilevel analysis. Amsterdam: TT Publishers.

Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park: Sage Publications.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.

86

Pick up a brochure about our intermediate and advanced meta-analysis courses

Visit our website http://www.education.ox.ac.uk/research/resgroup/self/training.php

87