Weighting and Imputation for CORE Social Housing Statistics Julia Bowman & Niall Goulding

Upload: felix-carpenter

Post on 17-Dec-2015


Page 1: Weighting and Imputation for CORE Social Housing Statistics Julia Bowman & Niall Goulding

Weighting and Imputation for CORE Social Housing Statistics

Julia Bowman & Niall Goulding

Page 2

What CORE is

• COntinuous REcording of Social Housing Lettings

• Census – hybrid of interview and administrative data

• Household level data collected

• Private Registered Providers and Local Authorities

• Collected from all housing providers in England since 2004

• Many types of information are collected, not just the number of lettings…

Page 3

Lettings log

Page 4

2012/13 Headline stats

Context – 378,700 lettings

Household characteristics – 91% UK nationals, 22% in work, 3% under 18

Most common reason given for why the household left their last settled home – overcrowding

Average weekly rent - £79.58 / £104.52

Length of time vacant – 32 days

Staying within local authority – 90%

378,700 lettings

Overcrowding

£79.58 per week

32 days vacant

90% remain in LA

Page 5

Complementary data sets

Local Authority Housing Statistics (LAHS)

English Housing Survey (EHS)

Page 6

Users

Page 7

Interests around household characteristics

• And media interest…

Page 8

QIF bid

• Two problems we sought to resolve…

• Placed bid to the UKSA’s Quality Improvement Fund (QIF)

• Work carried out by the ONS Methodology Advisory Board

Page 9

Problem 1: LA missing records

• Lettings volume varies greatly by local authority

• Local Authority Housing Statistics (LAHS): nearly complete lettings data at LA level

• CORE: lettings data at household level

Page 10

Problem 1: LA missing records

• Some LAs do not provide logs for every letting in CORE

• Introduces bias into demographic statistics

• Lettings are grossed up to LAHS counts by urban/rural classification

• This does not account for the demographics of the population

Page 11

Solution 1: Improved Weighting

• Geographic approach maintained

• ONS area classifications (OACs) are used to replace urban/rural classifications.

• Areas grouped on many factors using a cluster methodology

Page 12

Page 13

Solution 1: Improved Weighting

• What is our best estimate for lettings per ONS cluster area?

• The higher of the LAHS and CORE figures for each LA

• If neither is available, we use an imputed LAHS figure

• Sum these to get total lettings per ONS cluster area

Page 14

Solution 1: Improved Weighting

Highest of LAHS, CORE, imputed LAHS for each LA

Sum lettings per ONS cluster area group

Compare to reported CORE figure per area group

Ratio of best estimate to CORE figure = weight
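The weighting steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the LA names, cluster groups and counts are hypothetical, and treating an imputed LAHS figure as one more candidate for the "highest" figure is our reading of the slide, not a confirmed detail of the method.

```python
from collections import defaultdict

# Hypothetical figures: (LA, ONS cluster group, LAHS count, CORE count, imputed LAHS count).
# None marks a figure that is not available.
las = [
    ("LA A", "group 1", 1200, 1000, None),
    ("LA B", "group 1", None, 300, 400),
    ("LA C", "group 2", 500, 520, None),
    ("LA D", "group 2", 800, 600, None),
]


def best_estimate(lahs, core, imputed_lahs):
    """Return the highest of the available LAHS, CORE and imputed LAHS figures."""
    candidates = [v for v in (lahs, core, imputed_lahs) if v is not None]
    return max(candidates)


def cluster_weights(rows):
    """Weight per cluster group = summed best estimates / summed CORE lettings."""
    best_total = defaultdict(int)
    core_total = defaultdict(int)
    for _name, group, lahs, core, imputed in rows:
        best_total[group] += best_estimate(lahs, core, imputed)
        core_total[group] += core or 0
    # Groups with no CORE lettings cannot be weighted, so they are skipped here.
    return {g: best_total[g] / core_total[g] for g in best_total if core_total[g] > 0}


weights = cluster_weights(las)
for group, weight in sorted(weights.items()):
    print(group, round(weight, 3))
```

Each CORE record in a cluster group would then carry that group's weight, so the weighted CORE total matches the best estimate for the group.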

Page 15

Problem 2: Record level missing data

• Both LAs and PRPs submit logs with missing household characteristics

• Age, sex, ethnicity, nationality and economic status

• This can happen because:

  • the tenant refuses to provide the information

  • some LAs do not interview

  • admin data constraints

  • IT constraints

Page 16

Solution 2: Imputation

• So how do we account for this?

• Donor imputation: Neighbour Imputation Method

• Canadian Census Edit and Imputation System – CanCEIS (Canadian Census 2001, UK Census 2011)

• Efficient, free license, variety of record editing rules

Page 17

Solution 2: Imputation

1. Raw data comes to DCLG (SPSS)

2. Data reformatted for CanCEIS (ASCII)

3. CanCEIS finds incomplete and donor records

4. CanCEIS matches records on:

  • Household characteristics that are available (age, sex, ethnicity, nationality, economic status)

  • Area classification, provider type (LA/PRP), previous tenure, size of property, asylum seeker, refugee status (and client type)

5. Record randomly picked from pool of donors

6. Imputed output data set

Example records (raw):

Age  Sex  Nationality  Area  Asylum
45   M    UK           6     N
35   M    EEA          2     N
27   F    MISSING      4     N

Example records (coded):

Age  Sex  Nationality  Area  Asylum
45   1    1            6     0
35   1    2            2     0
27   2    -10          4     0

Incomplete record and matched result:

Age  Sex  Nationality  Area  Asylum
27   2    -10          4     0
27   2    2            4     0       ×10

2
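The donor step can be illustrated with a small nearest-neighbour sketch. This is not CanCEIS and does not use its interface or its distance measure: the distance function, the pool size and the last two donor records are hypothetical, chosen only to show the idea of matching an incomplete record to similar complete records and picking one donor at random. The first three records and the -10 missing code follow the slide.

```python
import random

MISSING = -10  # missing-value code shown on the slide

# The first three records follow the slide; the last two are hypothetical donors.
records = [
    {"age": 45, "sex": 1, "nationality": 1, "area": 6, "asylum": 0},
    {"age": 35, "sex": 1, "nationality": 2, "area": 2, "asylum": 0},
    {"age": 27, "sex": 2, "nationality": MISSING, "area": 4, "asylum": 0},
    {"age": 29, "sex": 2, "nationality": 2, "area": 4, "asylum": 0},
    {"age": 26, "sex": 2, "nationality": 1, "area": 4, "asylum": 0},
]

MATCH_FIELDS = ("sex", "area", "asylum")  # assumed matching variables


def distance(a, b):
    """Hypothetical distance: categorical mismatches plus a scaled age gap."""
    mismatches = sum(a[f] != b[f] for f in MATCH_FIELDS)
    return mismatches + abs(a["age"] - b["age"]) / 10


def impute(rows, field, pool_size=2, rng=None):
    """Fill missing values of `field` from a randomly chosen near donor."""
    rng = rng or random.Random(0)
    donors = [r for r in rows if r[field] != MISSING]
    completed = []
    for rec in rows:
        if rec[field] != MISSING:
            completed.append(dict(rec))
            continue
        # Pool of the closest complete records, then a random pick from the pool.
        pool = sorted(donors, key=lambda d: distance(rec, d))[:pool_size]
        donor = rng.choice(pool)
        completed.append({**rec, field: donor[field]})
    return completed


imputed = impute(records, "nationality")
print(imputed[2])
```

Picking at random from a pool, rather than always taking the single closest record, avoids the same donor being reused for every similar incomplete record.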

Page 18

The complete process

Raw data comes to DCLG

Weighting: weights assigned

Imputation: complete records

Final data set

Page 19

Results

• What happens when we weight and impute?

Original reported data

                PRP      LA       Total %
UK              113,071  69,256   91.8%
A10             4,258    2,547    3.4%
Other EEA       1,286    936      1.1%
Other           3,537    3,710    3.6%
Missing         4,324    17,131   9.7%
Total lettings: 220,056

Imputed data

                PRP      LA       Total %
UK              116,944  84,439   91.5%
A10             4,427    3,118    3.4%
Other EEA       1,347    1,204    1.2%
Other           3,758    4,819    3.9%
Total lettings: 220,056

Weighted and imputed data

                PRP      LA       Total %
UK              116,944  96,410   91.4%
A10             4,427    3,569    3.4%
Other EEA       1,347    1,369    1.2%
Other           3,758    5,510    4.0%
Total lettings: 233,334
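A quick arithmetic check helps when reading the original reported figures. The sketch below uses only the counts from the slide and suggests that the nationality percentages there are shares of records with a known nationality, while the "Missing" percentage is a share of all lettings. That reading is inferred from the numbers, not stated on the slide.

```python
# Counts from the original reported data: (PRP, LA) per nationality group.
original = {
    "UK": (113_071, 69_256),
    "A10": (4_258, 2_547),
    "Other EEA": (1_286, 936),
    "Other": (3_537, 3_710),
    "Missing": (4_324, 17_131),
}

total = sum(prp + la for prp, la in original.values())
known = total - sum(original["Missing"])  # lettings with a recorded nationality


def pct(count, base):
    """Percentage rounded to one decimal place."""
    return round(100 * count / base, 1)


# Shares of the known-nationality records, and the missing share of all lettings.
shares = {g: pct(prp + la, known) for g, (prp, la) in original.items() if g != "Missing"}
missing_share = pct(sum(original["Missing"]), total)

print(total, shares, missing_share)
```

Because the original percentages already exclude missing records, the small shifts after imputation reflect the values given to the imputed records rather than a change of denominator alone.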

Page 20

Testing

• But what further tests can we do?

• Remove logs from a complete data set and then test weighting against the complete version

• Deleting data and then imputing it to check error rate

• Finding other unaccounted biases needing weighting

• Any other thoughts?
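The "delete and then impute" test could be prototyped along these lines. Everything here is a hypothetical stand-in: the data set is synthetic, and the imputation rule (a random donor from the same area) is far simpler than the CanCEIS matching described earlier. The point is only the shape of the test: mask known values, impute them, and count disagreements.

```python
import random

MISSING = -10


def impute_value(target, donors, field, rng):
    """Hypothetical rule: copy `field` from a random donor in the same area."""
    pool = [d for d in donors if d["area"] == target["area"]] or donors
    return rng.choice(pool)[field]


def error_rate(complete, field, share_deleted=0.2, seed=0):
    """Delete `field` from a random share of records, impute it, return the error rate."""
    rng = random.Random(seed)
    n_deleted = max(1, int(len(complete) * share_deleted))
    deleted_idx = set(rng.sample(range(len(complete)), n_deleted))
    donors = [r for i, r in enumerate(complete) if i not in deleted_idx]
    wrong = 0
    for i in deleted_idx:
        masked = {**complete[i], field: MISSING}
        if impute_value(masked, donors, field, rng) != complete[i][field]:
            wrong += 1
    return wrong / n_deleted


# Synthetic complete data set in which nationality depends only on area.
complete = [
    {"area": area, "nationality": 1 if area < 4 else 2}
    for area in range(1, 7)
    for _ in range(10)
]

rate = error_rate(complete, "nationality")
print(rate)
```

Repeating the run with different seeds and deletion shares would give a spread of error rates rather than a single figure.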

Page 21

Future work

• CORE is now National Statistics – improvements pending

• Use areas from 2011 census data

• Affordable rent weighting and imputation

• Improve data quality and volume from LAs: 2013/14 is the first year in which all LAs will participate

• Ongoing disclosure control investigations

• Make CORE data more easily available via Open Data Communities

Page 22

Thank you. Questions and comments please!