![Page 1: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/1.jpg)
Multilevel Analysis
Kate Pickett
Senior Lecturer in Epidemiology
![Page 2: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/2.jpg)
Perspective
Health researchers:
• Are interested in answering research questions (not maths)
• Want to be able to apply statistical techniques
• Want to be able to interpret results
• Want to be able to communicate with consumers and statisticians
![Page 3: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/3.jpg)
Aims for this session
Understand the rationale for multilevel analysis
Understand common terminology
Interpret output from multilevel models
Be able to read and critically appraise studies using multilevel models
![Page 4: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/4.jpg)
Context and composition
Studying populations (groups) and individuals
From Rose, G. Sick individuals and sick populations. Int J Epidemiol 1985;14:32-38
![Page 5: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/5.jpg)
Levels of analysis
Health researchers may collect and use data at the level of:
• Individuals, patients
• Families or other social groupings
• Clinics or hospitals
• Small areas, neighbourhoods
• Large populations
![Page 6: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/6.jpg)
Population A Population B
How is Population A different from Population B?
![Page 7: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/7.jpg)
Ecological studies
Data are aggregated and represent a group, rather than an individual
• incidence rate of an illness
• prevalence of a particular health service
We don’t know which particular individuals within the group were ill or received the service
These group-based outcome measures are analyzed by correlating them with determinants measured for the same groups
![Page 8: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/8.jpg)
Source: Pickett KE, Kelly S, Brunner E, Lobstein T, Wilkinson RG. Wider income gaps, wider waistbands? An ecological study of obesity and income inequality. J Epidemiol Community Health 2005;59:670–674.
![Page 9: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/9.jpg)
The ecological fallacy
Associations at the group level may not hold at an individual level
• Eg, we might see that rates of obesity are correlated internationally with per capita calorie intake
• But, we don’t know if it is the obese individuals who are eating all the calories
Many group-level variables are correlated so we may get spurious correlations
• Eg, obesity rates may also be correlated with number of zoos per capita or some other completely unrelated factor
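The first point can be illustrated with a small sketch. The data below are entirely hypothetical (four populations of five individuals) and are constructed so that the group-level and individual-level associations point in opposite directions; the helper `pearson` is written here for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: 4 populations, 5 individuals each.
offsets = [-2, -1, 0, 1, 2]
groups = []
for g in range(4):
    x = [10 * g + d for d in offsets]  # exposure is higher in later populations
    y = [10 * g - d for d in offsets]  # within a population, outcome falls as exposure rises
    groups.append((x, y))

# Ecological analysis: correlate the population means.
mean_x = [sum(x) / len(x) for x, _ in groups]
mean_y = [sum(y) / len(y) for _, y in groups]
r_group = pearson(mean_x, mean_y)

# Individual-level analysis within each population.
r_within = [pearson(x, y) for x, y in groups]

print("group-level r:", r_group)
print("within-population r:", r_within)
```

Here the population means are perfectly positively correlated while the association inside every population is negative, so an inference from the group-level result to individuals would be wrong.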
![Page 10: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/10.jpg)
The atomistic fallacy
But the ecological fallacy has a flip side
Factors that affect outcomes in individuals may not operate in the same way at the population level
• Eg, teenage births are more common among the poor, but teenage birth rates are very high in some very wealthy countries.
![Page 11: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/11.jpg)
Source: Pickett KE, Mookherjee S, Wilkinson RG. Adolescent Birth Rates, Total Homicides, and Income Inequality In Rich Countries. AJPH 2005;95:1181-1183.
Example of teenage births
![Page 12: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/12.jpg)
Ecological variables
Sometimes ecological studies are done because it is quick and easy
Sometimes ecological studies are the best design for the research question
BECAUSE
Some determinants are “ecological”:
• Population density
• Air quality/pollution
• GNP
• Income inequality
• % unemployed
• Ambient temperature
![Page 13: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/13.jpg)
Context and composition
But what if we are interested in both types of variables (individual and population) simultaneously?
Eg: we might want to know about the effect of population-level unemployment on health, above and beyond the health impact of being unemployed for any given individual
![Page 14: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/14.jpg)
Multilevel models
![Page 15: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/15.jpg)
Introduction to multilevel models
[Figure: Number of papers using multilevel analysis (Medline), by year, 1995 to 2004; y-axis: number of papers, 0 to 200]
Hierarchical models
Mixed effects models
Random effects models
![Page 16: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/16.jpg)
Background
Developed in education research
Observations of students in a single class are not independent of one another
“Standard” statistical models assume that observations are independent
Two-level hierarchy
• Students within classes
Three-level hierarchy
• Students within classes within schools
Four-level hierarchy
• Students within classes within schools within local authority areas
![Page 17: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/17.jpg)
Health research context
Patients within a medical practice
Residents within neighbourhoods
Subjects within trial clusters
Hospitals within PCTs….
![Page 18: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/18.jpg)
Examples for class
Some examples are drawn from Twisk JWR “Applied Multilevel Analysis” Cambridge University Press, 2006
Example data are available at: http://www.emgo.nl/researchtools
Research question: what is the relationship between total cholesterol and age?
Statistical software: Stata (but note that MLwiN is free to UK academics: http://www.cmm.bristol.ac.uk/MLwiN/download/index.shtml)
![Page 19: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/19.jpg)
Simple linear regression
[Figure: total cholesterol (mmol/l) plotted against age (years)]
$\text{Total cholesterol} = \beta_0 + \beta_1 \times \text{age} + \varepsilon$
![Page 20: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/20.jpg)
Simple linear regression, adding a categorical variable
[Figure: total cholesterol (mmol/l) plotted against age (years), with separate series for males and females]
$\text{Total cholesterol} = \beta_0 + \beta_1 \times \text{age} + \beta_2 \times \text{gender} + \varepsilon$
![Page 21: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/21.jpg)
Simple linear regression, adding another variable (doctor)
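[Figure: total cholesterol (mmol/l) plotted against age (years), with separate series for each doctor (MD1, MD2, MD3, MD4, MD5, …)]
$\text{Total cholesterol} = \beta_0 + \beta_1 \times \text{age} + \beta_2 \times \text{MD}_1 + \beta_3 \times \text{MD}_2 + \beta_4 \times \text{MD}_3 + \beta_5 \times \text{MD}_4 + \dots + \beta_m \times \text{MD}_{m-1} + \varepsilon$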
![Page 22: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/22.jpg)
Multilevel analysis
Instead of estimating all those separate intercepts, we estimate the variance of them
In our example that means estimating 1 additional parameter, rather than 11
We are allowing the intercept to be random (random effects modelling)
An efficient way of correcting for a variable with many categories
Trade-off:
• Assumes that the different intercepts are normally distributed
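Written out, the random-intercept version of the cholesterol model might look as follows. This is a sketch in conventional notation rather than notation taken from the slides: the symbols $u_j$ (doctor-specific deviation from the overall intercept), $\sigma_u^2$ and $\sigma_\varepsilon^2$ are assumptions introduced here, with $i$ indexing patients and $j$ indexing doctors.

```latex
\begin{align}
\text{Total cholesterol}_{ij} &= \beta_0 + \beta_1 \times \text{age}_{ij} + u_j + \varepsilon_{ij} \\
u_j &\sim N(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
\end{align}
```

The single extra parameter is $\sigma_u^2$, the variance of the doctor intercepts, which replaces the 11 dummy-variable coefficients of the ordinary regression.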
![Page 23: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/23.jpg)
Example data
Cholesterol dataset
• 441 patients
• Age 44-86 years
• Cholesterol 3.90-8.86 mmol/l
• 12 doctors
![Page 24: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/24.jpg)
Non-multilevel regression
. regress cholesterol age
      Source |       SS       df       MS              Number of obs =     441
-------------+------------------------------           F(  1,   439) =  142.06
       Model |  99.3395851     1  99.3395851           Prob > F      =  0.0000
    Residual |  306.984057   439  .699280312           R-squared     =  0.2445
-------------+------------------------------           Adj R-squared =  0.2428
       Total |  406.323642   440  .923462822           Root MSE      =  .83623

------------------------------------------------------------------------------
 cholesterol |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0512619   .0043009    11.92   0.000      .042809    .0597148
       _cons |   2.798691    .268571    10.42   0.000     2.270847    3.326536
------------------------------------------------------------------------------
Example using Stata
![Page 25: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/25.jpg)
Multilevel model in Stata
. xtmixed cholesterol age ||doctor:, ml var
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0:   log likelihood = -404.68939
Iteration 1:   log likelihood = -404.68939

Computing standard errors:

Mixed-effects ML regression                     Number of obs      =       441
Group variable: doctor                          Number of groups   =        12

                                                Obs per group: min =        36
                                                               avg =      36.8
                                                               max =        39

                                                Wald chi2(1)       =    262.76
Log likelihood = -404.68939                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
 cholesterol |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0495866    .003059    16.21   0.000     .0435911    .0555822
       _cons |   2.905812    .259134    11.21   0.000     2.397919    3.413705
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
doctor: Identity             |
                  var(_cons) |   .3685781   .1541985      .1623381    .8368327
-----------------------------+------------------------------------------------
               var(Residual) |   .3314923   .0226341      .2899706    .3789597
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =   282.37 Prob >= chibar2 = 0.0000
![Page 26: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/26.jpg)
Do we need the multilevel model?
Likelihood ratio test:
• Compare -2 log likelihood of model with random intercept to -2 log likelihood of ordinary linear model
Difference has a Chi-square distribution with df = difference in number of parameters estimated
Difference = 282.37 (the LR test reported at the foot of the Stata multilevel output), highly significant
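As a rough check, the test statistic can be recomputed from numbers printed in the two Stata outputs. This is a sketch, not part of the original slides. It assumes the ordinary model's log likelihood is the maximised Gaussian log likelihood, which can be derived from the residual sum of squares. The p-value uses the identity that the upper tail of a chi-square distribution with 1 df is `erfc(sqrt(x / 2))`.

```python
from math import erfc, log, pi, sqrt

n = 441                     # number of observations (regress output)
rss = 306.984057            # residual sum of squares (regress output)
ll_multilevel = -404.68939  # log likelihood (xtmixed output)

# Maximised Gaussian log likelihood of the ordinary linear model,
# derived from its residual sum of squares.
ll_ols = -0.5 * n * (log(2 * pi) + log(rss / n) + 1)

# Likelihood ratio statistic: difference in -2 log likelihood.
lr_stat = 2 * (ll_multilevel - ll_ols)

# Upper-tail probability of a chi-square with 1 df.
p_chi2_1df = erfc(sqrt(lr_stat / 2))

# The variance being tested lies on the boundary of its parameter space,
# so Stata's chibar2(01) test halves the 1-df p-value.
p_chibar2 = 0.5 * p_chi2_1df

print(round(lr_stat, 2), p_chibar2)
```

The statistic comes out at about 282.4, in line with the `chibar2(01)` value at the foot of the `xtmixed` output.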
![Page 27: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/27.jpg)
Model parameters
Effects of age in each model:
• Coefficient in ordinary model = 0.0513
• Coefficient in multilevel model = 0.0496
• 95% CI in ordinary model (0.0428, 0.0597)
• 95% CI in multilevel model (0.0436, 0.0556)
Age is significant in both models
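The multilevel confidence interval can be reproduced from the coefficient and standard error in the `xtmixed` output. This is a minimal sketch assuming a normal (z) reference distribution, which is what the `z` column in that output suggests. The ordinary regression uses a t distribution instead, so it is not recomputed here.

```python
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal (about 1.96).
z = NormalDist().inv_cdf(0.975)

coef, se = 0.0495866, 0.003059  # age row of the xtmixed output
lower = coef - z * se
upper = coef + z * se
print(f"95% CI for age (multilevel): {lower:.4f} to {upper:.4f}")

# Standard error of the age coefficient, multilevel relative to ordinary model.
se_ols = 0.0043009              # age row of the regress output
se_ratio = se / se_ols
print(f"SE ratio (multilevel / ordinary): {se_ratio:.2f}")
```

In this dataset the standard error for age is smaller in the multilevel model, so the interval is narrower even though the point estimate barely changes.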
![Page 28: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/28.jpg)
Intraclass correlation coefficient
This measures how dependent the observations are within clusters
• Eg, how correlated are the observations of patients belonging to the same doctor?
Defined as:
• Variance between clusters / Total variance
The smaller the variance within clusters, the greater the ICC
![Page 29: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/29.jpg)
ICC (a)
Distribution of an outcome variable
Assume that the total variance = 10
![Page 30: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/30.jpg)
ICC (b)
ICC is low because:
Variance within groups is high (9)
Variance between groups is low (1)
Numerator is small, relative to denominator
ICC = 1/10 = 0.1
![Page 31: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/31.jpg)
ICC (c)
The groups are now more spread out, more different, and:
ICC is bigger because:
Variance within groups is lower (5)
Variance between groups is higher (5)
ICC=5/10 = 0.5
![Page 32: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/32.jpg)
ICC (d)
The groups are now completely different, and:
ICC is maximised because:
Variance within groups is minimal (1)
Variance between groups is maximal (9)
Numerator is large, relative to denominator
ICC=9/10 = 0.9
MUCH MORE DEPENDENCE WITHIN CLUSTER – each observation provides less unique information
![Page 33: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/33.jpg)
Impact on significance tests
Table of alpha values under different conditions of sample size and ICC
                 Intraclass correlation coefficient
Sample size      0.01      0.05      0.20
10               0.06      0.11      0.28
25               0.08      0.19      0.46
50               0.11      0.30      0.59
100              0.17      0.43      0.70
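One common way to quantify how much information clustering costs is the design effect, 1 + (m − 1) × ICC, for clusters of equal size m. This formula is not given in the slides. The sketch below also assumes that the "sample size" column refers to the number of observations per cluster, and the function names are illustrative.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations that n_total clustered ones are worth."""
    return n_total / design_effect(cluster_size, icc)

# Same grid of cluster sizes and ICC values as the table.
for m in (10, 25, 50, 100):
    row = [round(design_effect(m, icc), 2) for icc in (0.01, 0.05, 0.20)]
    print(m, row)
```

The design effect grows with both cluster size and ICC, which is the same direction in which the alpha values in the table inflate.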
![Page 34: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/34.jpg)
ICC in our example
ICC = between doctor variance/total variance
ICC = 0.3686/(0.3686+0.3315) = 0.3686/0.7001 = 0.526
52.6% of the total individual differences in cholesterol are at the doctor level
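The same calculation as a short sketch, using the two variance estimates from the `xtmixed` output and the illustrative values from the ICC (b) to (d) slides. The function name `icc` is chosen here for illustration.

```python
def icc(var_between, var_within):
    """Intraclass correlation: between-cluster variance over total variance."""
    return var_between / (var_between + var_within)

# Cholesterol example: var(_cons) and var(Residual) from the xtmixed output.
icc_example = icc(0.3685781, 0.3314923)
print(f"ICC for the cholesterol example: {icc_example:.3f}")

# Illustrative slides, total variance = 10 in each case.
icc_b = icc(1, 9)  # low between-group variance
icc_c = icc(5, 5)
icc_d = icc(9, 1)  # high between-group variance
print(icc_b, icc_c, icc_d)
```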
![Page 35: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/35.jpg)
ICC
When ICC is high
• Evidence of a contextual effect on the outcome
• Evidence of differences in composition between the clusters
• Explore by including explanatory variables at each level
When ICC is low
• No need for a multilevel analysis
![Page 36: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/36.jpg)
Back to unemployment example
![Page 37: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/37.jpg)
Population A
Population B
Red = unemployed
Data Structure
![Page 38: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/38.jpg)
An ordinary regression model
$\text{Health} = b_0 + b_1(\text{unemployed}) + b_2(\%\ \text{unemployed}) + e$
e represents the effect of all omitted variables and measurement error; it is assumed to be random (so it gets ignored)
![Page 39: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/39.jpg)
Population A
Population B
Aside from unemployment, subjects in A are different from B in other ways: composition (shape, size), context (density)
Data Structure
![Page 40: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/40.jpg)
A multi-level regression model
i = individual, j = context:
$y_{ij} = b\,x_{ij} + B\,X_j + E_j + e_{ij}$
$\text{Health}_{ij} = b\,(\text{unemployed}_{ij}) + B\,(\%\ \text{unemployed}_j) + E_j + e_{ij}$
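To see what the context-level error term does, the model can be simulated. The sketch below is illustrative only: every parameter value (coefficients, standard deviations, number of contexts) is made up, and the function names are hypothetical. It generates data from the equation above, removes the fixed part using the true coefficients, and estimates the ICC of what remains with a one-way ANOVA estimator for equal-sized groups.

```python
import random

def simulate(n_contexts=200, n_per_context=20, b=-1.0, B=-0.05,
             sd_context=1.0, sd_individual=1.0, seed=0):
    """Simulate y_ij = b*x_ij + B*X_j + E_j + e_ij (all values hypothetical)."""
    rng = random.Random(seed)
    residual_groups = []
    for _ in range(n_contexts):
        p_j = rng.uniform(0.02, 0.30)       # context's underlying unemployment rate
        E_j = rng.gauss(0, sd_context)      # error shared by everyone in context j
        x = [1 if rng.random() < p_j else 0 for _ in range(n_per_context)]
        X_j = 100 * sum(x) / n_per_context  # % unemployed in context j
        group = []
        for x_ij in x:
            e_ij = rng.gauss(0, sd_individual)  # individual-level error
            y_ij = b * x_ij + B * X_j + E_j + e_ij
            # Remove the fixed part with the true coefficients: leaves E_j + e_ij.
            group.append(y_ij - b * x_ij - B * X_j)
        residual_groups.append(group)
    return residual_groups

def anova_icc(groups):
    """One-way ANOVA estimate of the ICC for equal-sized groups."""
    k, m = len(groups), len(groups[0])
    means = [sum(g) / m for g in groups]
    grand = sum(means) / k
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((v - mu) ** 2 for g, mu in zip(groups, means) for v in g) / (k * (m - 1))
    var_between = (msb - msw) / m
    return var_between / (var_between + msw)

icc_hat = anova_icc(simulate())
print(f"estimated ICC of E_j + e_ij: {icc_hat:.2f} (true value 0.5)")
```

With equal context-level and individual-level variances the true ICC is 0.5, so residuals from an ordinary regression on such data would be far from independent.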
![Page 41: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/41.jpg)
What does this mean for critical appraisal of the health literature?
When data are hierarchical or multi-level by nature, they should be analysed appropriately
The coefficients or odds ratios from the models can be interpreted as usual
The ICC shows how much variance in the outcome occurs between the higher-level contexts
If appropriate methods are not used, standard errors and significance tests may be wrong and coefficients biased
![Page 42: Multilevel Analysis Kate Pickett Senior Lecturer in Epidemiology](https://reader038.vdocument.in/reader038/viewer/2022103006/56649e6b5503460f94b69c95/html5/thumbnails/42.jpg)
A summary
Ecological studies
• Appropriate when the research question concerns only ecological effects
• Ecological fallacy may be a problem
Individual-level studies
• Appropriate when the research question concerns only individual-level effects
• Atomistic fallacy may be a problem
Multi-level studies
• Appropriate when the research question concerns both context and composition of populations