Combining individual and aggregate data to improve estimates of ethnic voting in Britain in 2001 and 2005
Stephen Fisher, Jane Holmes, Nicky Best, Sylvia Richardson
Department of Sociology, University of Oxford
Department of Epidemiology and Biostatistics, Imperial College, London
http://www.bias-project.org.uk

Outline
  Question of interest
  Model we will use
  Analysis

[Diagram: three study designs.
  Target analysis: individual outcome $y_{ij}$, individual exposure $x_{ij}$, aggregate exposure $Z_i$, $X_i$.
  Ecological regression: aggregate outcome $Y_i$, aggregate exposure $Z_i$, $X_i$.
  Hierarchical Related Regression (HRR): individual outcome $y_{ij}$ and aggregate outcome $Y_i$, individual exposure $x_{ij}$, aggregate exposure $Z_i$, $X_i$.]

A decline in ethnic minority support for Labour?
  From 1974 to 2001, around 80% of ethnic minorities voted Labour.
  Between 2001 and 2005 there were:
    Islamic terrorist attacks
    US- and UK-led invasions of Afghanistan and Iraq
    Heightened security and suspicion of non-whites
    Unlawful detention of foreign terror suspects
    Convictions of British soldiers for Iraqi prisoner abuse
  These and other events are thought to have undermined support for Labour among ethnic minorities.
  On the other hand, the harsh stance on immigration in the Conservative 2005 election campaign may have alienated ethnic minority voters.

A decline in Muslim support for Labour?
  Initially
    We found that the gap in Labour vote between whites and non-whites narrowed between 2001 and 2005.
    Results were presented at PSA 2009.
    The audience found this interesting, but really wanted to know whether the same was true of Muslims.
  So
    We tested whether the gap in Labour vote between Muslims and non-Muslims narrowed between 2001 and 2005.

Individual-level model
  British Election Study post-election survey (BES)
    Cross-sectional survey carried out after every general election
  For subject j in constituency i,
    $y_{ij}$ = voted Labour (1) / didn't vote Labour (0)
    $x_{ij}$ = Muslim (1) / non-Muslim (0)
  But of the 1,898 subjects with validated data, only 20 are Muslim.
  Model:
    $\operatorname{logit}(p_{ij}) = \alpha_i + \beta x_{ij}$
    $p_{ij}$ = probability that subject j votes Labour
    $\alpha_i$ = area-level random effect
    $\beta$ = log odds ratio of a Muslim voting Labour compared with a non-Muslim

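As a minimal sketch of the individual-level model, assuming a logistic link with one area intercept and one Muslim effect, the following computes the two possible voting probabilities in a constituency. The names `alpha_i` and `beta` and all numeric values are illustrative assumptions, not estimates from the BES data.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic function g(eta) = e^eta / (1 + e^eta)."""
    return 1.0 / (1.0 + math.exp(-eta))


def prob_votes_labour(alpha_i: float, beta: float, x_ij: int) -> float:
    """P(subject j in constituency i votes Labour), with logit(p_ij) = alpha_i + beta * x_ij."""
    return inv_logit(alpha_i + beta * x_ij)


# Illustrative values only (assumed, not fitted).
alpha_i = -1.0  # area-level effect on the log-odds scale
beta = 2.0      # log odds ratio, Muslim vs non-Muslim

p_non_muslim = prob_votes_labour(alpha_i, beta, 0)
p_muslim = prob_votes_labour(alpha_i, beta, 1)

# The odds ratio implied by the two probabilities equals exp(beta).
odds_ratio = (p_muslim / (1 - p_muslim)) / (p_non_muslim / (1 - p_non_muslim))
print(p_non_muslim, p_muslim, odds_ratio)
```

Whatever the value of the area effect, the odds ratio between the two groups depends only on `beta`, which is why the slide labels it the log odds ratio.
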
Aggregate data
  However, we have data at the aggregate level for the entire population:
    2001 Census data on the % who are Muslim
    Number of people who voted Labour in each constituency, from general election results
  The data can be viewed as a 2x2 table with unknown cells. For constituency i:
    $y_i$ = number who vote Labour
    $n_i$ = number who are eligible to vote
    $x_i$ = number who are Muslim

                Vote Labour   Don't vote Labour   Total
  Non-Muslim    ?             ?                   $n_i - x_i$
  Muslim        ?             ?                   $x_i$
  Total         $y_i$         $n_i - y_i$         $n_i$

Ecological bias
  Standard analysis of these data will probably lead to biased results.
  Bias in ecological studies can be caused by:
    Confounding
      Confounders can be area-level (between-area) or individual-level (within-area)
      Remedy: include control variables and/or random effects in the model
    Non-linear covariate-outcome relationship, combined with within-area variability of the covariate
      No bias if the covariate is constant within the area (contextual effect)
      Bias increases as within-area variability increases
      ... unless models are refined to account for this hidden variability

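The second source of bias can be illustrated numerically. The sketch below assumes a logistic individual-level model with a single binary covariate, and compares plugging the area mean into the curve with averaging the individual probabilities over the area's mix. The function names and the values of `alpha` and `beta` are assumptions chosen for illustration only.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-eta))


def naive_area_prob(alpha: float, beta: float, x_bar: float) -> float:
    """Plug the area mean into the individual-level curve (ignores within-area variability)."""
    return inv_logit(alpha + beta * x_bar)


def averaged_area_prob(alpha: float, beta: float, x_bar: float) -> float:
    """Average the individual-level probabilities over the covariate mix in the area."""
    return x_bar * inv_logit(alpha + beta) + (1.0 - x_bar) * inv_logit(alpha)


def bias_gap(alpha: float, beta: float, x_bar: float) -> float:
    """Difference between the averaged and the naive area-level probability."""
    return averaged_area_prob(alpha, beta, x_bar) - naive_area_prob(alpha, beta, x_bar)


alpha, beta = -2.0, 3.0  # assumed values, not estimates

for x_bar in (0.0, 0.1, 0.3, 0.5, 1.0):
    within_area_variance = x_bar * (1.0 - x_bar)  # variance of a binary covariate
    print(x_bar, within_area_variance, bias_gap(alpha, beta, x_bar))
```

The gap vanishes when the covariate is constant within the area (proportion 0 or 1) and is non-zero otherwise, which matches the two bullet points above.
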
Improving ecological inference
  Aim: alleviate the bias associated with within-area covariate variability.
  Data at area level, for constituency i:
    Area-level outcome $y_i$ = number of people who vote Labour
    Area-level predictor $\bar{x}_i$ = proportion who are Muslim
  Then $y_i \sim \text{Binomial}(n_i, p_i)$, where the area-level probability $p_i$ is calculated by integrating the individual-level probabilities given by the individual-level model with respect to the within-area joint distribution $f_i(x)$ of all individual-level predictors:
    $p_i = \int p_{ij}(x)\, f_i(x)\, dx$
    $p_i$ is the average group-level probability (of voting Labour)
    $p_{ij}(x)$ is the individual-level probability given covariates x
    $f_i(x)$ is the distribution of covariate x within area i

The model for a single binary covariate
  Consider a single binary covariate x, e.g. Muslim/non-Muslim.
  $f_i(x)$ is the proportion of individuals with x = 1 in each area, i.e. the proportion Muslim in each constituency.
  Individual-level model, with area-level random effect $\alpha_i$ and Muslim log odds ratio $\beta$:
    $p_{ij} = g(\alpha_i + \beta x_{ij})$, where $g(\eta) = e^{\eta} / (1 + e^{\eta})$
    $p_{ij} = g(\alpha_i)$ if person j is non-Muslim
    $p_{ij} = g(\alpha_i + \beta)$ if person j is Muslim
  Integrated group-level model:
    $\bar{x}_i$ = proportion Muslim in constituency i (mean of $x_{ij}$)
    $p_i$ = average probability (proportion) of voting Labour in area i
    $p_i = g(\alpha_i + \beta)\, \bar{x}_i + g(\alpha_i)\, (1 - \bar{x}_i)$
    i.e. (prob. a Muslim votes Labour) x (prob. of being Muslim) + (prob. a non-Muslim votes Labour) x (prob. of being non-Muslim)

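A quick way to check the integrated group-level formula is to enumerate a small constituency and average the individual-level probabilities directly. The sketch below does this for a hypothetical constituency; the population size, number of Muslims, and the values of `alpha_i` and `beta` are all assumptions made for illustration.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic function g."""
    return 1.0 / (1.0 + math.exp(-eta))


def integrated_prob(alpha_i: float, beta: float, x_bar_i: float) -> float:
    """Group-level probability: Muslim and non-Muslim probabilities weighted by their shares."""
    return inv_logit(alpha_i + beta) * x_bar_i + inv_logit(alpha_i) * (1.0 - x_bar_i)


# Hypothetical constituency: 1,000 electors, 120 of them Muslim.
x = [1] * 120 + [0] * 880
alpha_i, beta = -0.5, 1.5  # assumed values

# Direct average of the individual-level probabilities p_ij.
mean_individual_prob = sum(inv_logit(alpha_i + beta * x_ij) for x_ij in x) / len(x)

# Same quantity from the integrated formula, using only the proportion Muslim.
x_bar_i = sum(x) / len(x)
p_i = integrated_prob(alpha_i, beta, x_bar_i)

print(mean_individual_prob, p_i)
```

The two numbers agree up to floating-point rounding, so the aggregate model needs only the proportion Muslim rather than the individual records.
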
Hierarchical Related Regression
  The parameters of the aggregate model have been derived from an underlying individual-level model.
  So the exposure-outcome relationship is assumed to be the same in both the aggregate data and the individual-level data.
  This means that the individual and aggregate data can be used simultaneously to make inference on the underlying individual-level model.
  The likelihood for the combined data is simply the product of the likelihoods of each set of data.
  This combined model is termed a hierarchical related regression (HRR) (Jackson, Best and Richardson, 2006).

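The product-of-likelihoods idea can be sketched as a sum of log-likelihoods: a Bernoulli term for each survey respondent and a binomial term for each constituency, sharing the same parameters. This is a simplified illustration, not the authors' implementation: it treats the area effects as fixed numbers and omits the random-effect distribution and any priors. The data layout, function names, and toy values are assumptions.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic function g."""
    return 1.0 / (1.0 + math.exp(-eta))


def log_binom_pmf(y: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of y successes."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log1p(-p)


def hrr_log_likelihood(alpha, beta, individuals, aggregates):
    """Combined log-likelihood.

    alpha:       dict mapping area -> intercept (fixed here; random effects in the full model)
    individuals: list of (area, x_ij, y_ij) survey records
    aggregates:  list of (area, y_i, n_i, x_bar_i) constituency records
    """
    ll = 0.0
    for area, x_ij, y_ij in individuals:
        p_ij = inv_logit(alpha[area] + beta * x_ij)
        ll += math.log(p_ij) if y_ij == 1 else math.log1p(-p_ij)
    for area, y_i, n_i, x_bar_i in aggregates:
        # Integrated group-level probability derived from the same alpha and beta.
        p_i = inv_logit(alpha[area] + beta) * x_bar_i + inv_logit(alpha[area]) * (1.0 - x_bar_i)
        ll += log_binom_pmf(y_i, n_i, p_i)
    return ll


# Toy data, entirely hypothetical.
alpha = {"A": -0.5, "B": 0.2}
beta = 1.0
individuals = [("A", 1, 1), ("A", 0, 0), ("B", 0, 1)]
aggregates = [("A", 400, 1000, 0.10), ("B", 350, 900, 0.02)]

total = hrr_log_likelihood(alpha, beta, individuals, aggregates)
individual_only = hrr_log_likelihood(alpha, beta, individuals, [])
aggregate_only = hrr_log_likelihood(alpha, beta, [], aggregates)
print(total, individual_only + aggregate_only)
```

Because `beta` appears in both terms, the 20 Muslim respondents and the constituency totals both inform the same individual-level effect.
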
Recap
  Question of interest
    How do Muslims vote? And did they change their voting behaviour between the 2001 and 2005 general elections?
  i denotes constituency, j denotes subject within a constituency

                         Individual-level data                      Aggregate data
  Outcome                $y_{ij}$ = 1 if subject j votes Labour,    $y_i$ = number who vote Labour
                         0 if subject j doesn't vote Labour         $n_i$ = electorate
  Explanatory variable   $x_{ij}$ = 1 if subject j is Muslim,       $\bar{x}_i$ = proportion who are Muslim
                         0 if subject j is not Muslim

[Figure: Proportion of electorate who voted Labour in 2001 and 2005, by constituency. x-axis: proportion who are Muslim (0.0 to 0.5); y-axis: proportion of electorate who vote Labour (0.0 to 0.4); series: 2001, 2005.]

Analyses
  To start, various models are fit to the 2001 general election only:
    Simple model with only an individual Muslim effect
    Add a contextual effect of Muslim as well as an individual effect
    Add an interaction term
    Include socio-economic status as a confounder
      Partly motivated by the apparent interaction
      Socio-economic status coded as manual/non-manual

More than one individual-level binary covariate
  For the integrated group-level model, when we have more than one binary covariate we need to know the cross-classification of individuals between covariate categories within each area, e.g. the number of Muslims who have a manual job.
  Then, writing $z_{ij}$ for the second covariate, the average probability of voting Labour in area i is
    $p_i = \sum_{x \in \{0,1\}} \sum_{z \in \{0,1\}} p_{ij}(x, z)\, p(x_{ij} = x, z_{ij} = z)$
  Estimate $p(x_{ij}, z_{ij})$ by the proportion in area i with covariates $x_{ij}$, $z_{ij}$.
  The Census does not contain these cross-classifications.
    Estimate by the product of the 2 marginals (Lasserre et al., 2000).

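The product-of-marginals approximation can be sketched as follows, assuming two binary covariates (Muslim, manual occupation) that are treated as independent within each area. The coefficient name `gamma` for socio-economic status, the function names, and all numeric values are assumptions for illustration.

```python
import math
from itertools import product


def inv_logit(eta: float) -> float:
    """Logistic function g."""
    return 1.0 / (1.0 + math.exp(-eta))


def joint_from_marginals(x_bar: float, z_bar: float) -> dict:
    """Approximate the within-area cross-classification p(x, z) by the product of the marginals."""
    return {
        (x, z): (x_bar if x else 1.0 - x_bar) * (z_bar if z else 1.0 - z_bar)
        for x, z in product((0, 1), repeat=2)
    }


def area_prob(alpha_i: float, beta: float, gamma: float, x_bar: float, z_bar: float) -> float:
    """Average probability of voting Labour over the four covariate cells."""
    joint = joint_from_marginals(x_bar, z_bar)
    return sum(
        weight * inv_logit(alpha_i + beta * x + gamma * z)
        for (x, z), weight in joint.items()
    )


# Hypothetical constituency: 10% Muslim, 45% manual occupations.
joint = joint_from_marginals(0.10, 0.45)
p_i = area_prob(alpha_i=-0.5, beta=1.5, gamma=0.6, x_bar=0.10, z_bar=0.45)
print(joint, p_i)
```

If the two covariates are associated within an area (for example, if Muslims are more likely to hold manual jobs), the product of marginals misstates the cell proportions, so this approximation is a modelling assumption rather than a fact about the data.
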
[Figure: Comparison of predictions for all models for 2001. x-axis: proportion who are Muslim (0.0 to 0.5); y-axis: proportion of electorate who vote Labour (0.0 to 0.7). Models shown: Ind. Muslim; Ind. + cont. Muslim; Ind. + cont. Muslim + interaction; Ind. + cont. Muslim + SES.]
Odds ratio of voting Labour for Muslims = 9.45 (3.20, 19.81)
Comparison of voting behaviour in 2001 and 2005
  What we are really interested in is whether Muslims changed their voting behaviour between the 2001 and 2005 general elections.
  Individual model for 2001 election
  Individual model for 2005 election

Results: odds ratios
  Parameter                                 Odds ratio (interval)
  Individual Muslim effect, 2001            8.32 (3.99, 16.47)
  Individual Muslim effect, 2005            3.55 (1.48, 6.73)
  Difference in individual Muslim effect    2.51 (1.18, 4.61)
  Socio-economic status                     0.52 (0.45, 0.59)

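To give a feel for the size of these odds ratios, the sketch below converts an odds ratio into an implied probability for a given baseline. The baseline probability of 0.40 for non-Muslims is purely hypothetical, and using the same baseline for both years is an assumption made only for illustration. The fitted model also includes area effects and socio-economic status, so these are not predictions from the analysis.

```python
def implied_probability(odds_ratio: float, baseline_prob: float) -> float:
    """Probability for the comparison group implied by an odds ratio and a baseline probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    odds = odds_ratio * baseline_odds
    return odds / (1.0 + odds)


baseline = 0.40  # hypothetical probability that a non-Muslim votes Labour

p_2001 = implied_probability(8.32, baseline)  # odds ratio reported for 2001
p_2005 = implied_probability(3.55, baseline)  # odds ratio reported for 2005
print(p_2001, p_2005)
```

Under this assumed baseline, the smaller 2005 odds ratio corresponds to a lower implied probability of voting Labour, while both remain above the baseline.
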
Conclusions
  Muslims are more likely to vote Labour than non-Muslims.
  Muslims did significantly change their voting behaviour between 2001 and 2005.
    In 2005 they were less likely to support Labour than in 2001.
  We need to find and include more individual Muslim data in our analysis.

Jackson, C. H., Best, N. G. and Richardson, S. (2006). Improving ecological inference using individual-level data. Statist. Med., 25, 2136-2159.
Lasserre, V., Guihenneuc-Jouyaux, C. and Richardson, S. (2000). Biases in ecological studies: utility of including within-area distribution of confounders. Statist. Med., 19, 45-59.