Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Upload: hector-hancock

Post on 16-Jan-2016

Page 1: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Using Research to Inform Geographic Policy

Best-fitting from Output Areas to Higher Geographies

Page 2: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Introduction

• The Geography Policy for National Statistics sets out the methodology for calculating areal statistics – “best-fitting”.

• This methodology was adopted for the 2011 Census, but for small areas it is not always as accurate as previous estimates.

• The results of ONS Geography research on these cases will feed into a review of GSS geographic policy.

Page 3: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Context

• ONS has a legal duty under the Statistics and Registration Service Act (2007) not to disclose “information which relates to and identifies a particular person”.

• Output Areas were designed for the 2001 Census with this requirement in mind.

Page 4: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Census Output Areas (OAs)

• Designed to minimise within-area and maximise between-area socio-economic variations.

• Individual OAs, and aggregations of complete OAs, are non-disclosive.

• For 2011 Census, revision of OAs was kept to a minimum. Only 2.6% of 2001 OAs were changed.

• Over 181,000 OAs cover England and Wales.

Page 5: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Census Output Areas (OAs)

• South Brent, Devon (pop. 3,000)

Page 6: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Creation of Output Areas

• Formed from groups of Thiessen polygons generated from households and snapped to residential postcode unit centroids.

• Target of 125 households per OA.

Page 7: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Census Households and Unit Postcode Centroids

Page 8: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Household / Unit Postcode Thiessen polygon building blocks

Page 9: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Census Output Areas from building blocks

Page 10: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Exact estimates

• Exact estimates are statistics for an area derived directly from the Census households located within it.

• Exact estimates were produced in 2001 for all geographies.

• Exact estimates can produce slivers when incompatible boundaries (here, parishes) are overlaid.

• Small cell adjustment was applied, to the dissatisfaction of users.

2001 OAs

Page 11: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Exact estimates

• Even before the 2001 Census was published, users expressed demand for release on future boundaries.

• We needed a methodology that could accommodate this demand.

2011 OAs

Page 12: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Statistical building blocks

• Statistical outputs should be constructed from the smallest geographical areas for which data are available.

• For Census univariate data, these building blocks are OAs; for Census multivariate data, LSOAs.

• Allows consistent and comparable statistics, even if the geography changes.

Page 13: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Best-fitting to higher geographies

• For ‘higher’ geographies, Census statistics are ‘best-fitted’ from the OA or LSOA population-weighted centroids (PWCs) that a target area contains.

• For geographies made up of aggregations of complete OAs or LSOAs (e.g. MSOA / LAD) this equates to exact fit.

• The National Statistics Postcode Lookup (NSPL) provides a pre-built best-fitting tool.
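The containment rule described above can be sketched in a few lines. This is a minimal illustration, not ONS's implementation: the OA records, coordinates, area names and the helper functions `point_in_polygon` and `best_fit` are all hypothetical, and a real workflow would more likely use a GIS library or a pre-built lookup.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def best_fit(oas, target_areas):
    """Add each OA's whole count to the target area that contains its PWC."""
    totals = {name: 0 for name in target_areas}
    for oa in oas:
        for name, polygon in target_areas.items():
            if point_in_polygon(oa["pwc"], polygon):
                totals[name] += oa["population"]
                break  # an OA is allocated to at most one target area
    return totals


# Hypothetical OAs (population-weighted centroid and a count) and target areas.
oas = [
    {"code": "OA1", "pwc": (1.0, 1.0), "population": 300},
    {"code": "OA2", "pwc": (3.0, 1.0), "population": 250},
    {"code": "OA3", "pwc": (6.0, 1.0), "population": 280},
]
target_areas = {
    "Area A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Area B": [(4, 0), (8, 0), (8, 4), (4, 4)],
    "Area C": [(8, 0), (9, 0), (9, 4), (8, 4)],  # contains no PWC
}

totals = best_fit(oas, target_areas)
print(totals)
```

In this toy layout "Area C" contains no centroid, so it receives no data at all.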

Page 14: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Output Area PWCs from Census Households (median position)
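A rough sketch of what a "median position" centroid could look like, assuming it means the coordinate-wise median of household locations. That assumption, the function name `median_centroid` and the coordinates are illustrative only; the algorithm actually used for the published PWCs may differ.

```python
from statistics import median


def median_centroid(households):
    """Coordinate-wise median of household (x, y) positions."""
    xs = [x for x, _ in households]
    ys = [y for _, y in households]
    return (median(xs), median(ys))


# Hypothetical household positions: four clustered, one remote farmhouse.
households = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (50.0, 40.0)]

pwc = median_centroid(households)
mean_centroid = (
    sum(x for x, _ in households) / len(households),
    sum(y for _, y in households) / len(households),
)
print(pwc, mean_centroid)
```

The median stays within the main cluster of households, whereas the arithmetic mean is pulled towards the outlier.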

Page 15: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

OAs and OAPWCs

Page 16: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

OA and higher geography, e.g. LSOA

Page 17: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Exception - National Parks

• The Geography Policy for National Statistics allows for exceptions to be requested from the Statistical Policy Committee.

• The irregular boundaries and very small populations of National Parks make them unsuitable for best-fitting, as very few OAPWCs fall within them.

• Statistics for National Parks are therefore exact-fitted as an exception to the policy.

Page 18: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Exception - National Parks

• Usually, the small populations located within National Parks are associated with OAPWCs that fall outside the park.

• Statistics for National Parks are therefore always under-counted if best-fitted.

Page 19: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Best-fitting to higher geographies

• Otherwise, best-fitting using the OAPWC contained within the target geography works very well.

• Statistics are non-disclosive, on a consistent, reliable and relatively simple basis.

• Research by Ralphs (2011) determined that best-fitting is a dependable methodology, and it was adopted as fundamental to the policy.

• Suitable also for non-nesting geographies. Usually…

Page 20: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

OA, OAPWC and non-nesting geography

Page 21: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Non-nesting geography unit without OAPWC

Page 22: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Statistics for areas without OAPWCs?

• Because we cannot release sub-OA statistics, no data may be published for these areas.

• But there is demand.

Page 23: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Statistics for areas without OAPWCs?

One option:

Best-fit, to the target area, the statistics for the OA population-weighted centroid (OAPWC) which is nearest to the target area’s geometric centroid and which is within the same LAD.
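One way to picture this option is the sketch below. It is illustrative only: the polygon, LAD codes, OA records and the helpers `polygon_centroid` and `nearest_oapwc` are hypothetical, and straight-line distance on planar coordinates is assumed.

```python
import math


def polygon_centroid(polygon):
    """Geometric (area-weighted) centroid of a simple polygon, shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (3 * area2), cy / (3 * area2))


def nearest_oapwc(target_polygon, target_lad, oapwcs):
    """Pick the OAPWC nearest the target's centroid, restricted to the same LAD."""
    centre = polygon_centroid(target_polygon)
    candidates = [o for o in oapwcs if o["lad"] == target_lad]
    if not candidates:
        return None
    return min(candidates, key=lambda o: math.dist(centre, o["pwc"]))


# Hypothetical target area with no OAPWC inside it, lying in "LAD1".
target = [(8, 0), (9, 0), (9, 4), (8, 4)]
oapwcs = [
    {"code": "OA2", "pwc": (3.0, 1.0), "lad": "LAD1"},
    {"code": "OA3", "pwc": (6.0, 1.0), "lad": "LAD1"},
    {"code": "OA4", "pwc": (9.5, 2.0), "lad": "LAD2"},  # closest, but wrong LAD
]

donor = nearest_oapwc(target, "LAD1", oapwcs)
print(donor["code"])
```

Here the closest centroid overall belongs to a different LAD, so the same-LAD constraint selects a more distant OA instead.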

Page 25: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Best-fitting using adjacent OAPWC

• This does produce non-disclosive statistics for the target area.

• But:

Statistics are indicative rather than precise;

Statistics for the nearby OAPWC serve for two or more areas;

Statistics cannot be aggregated up to compile statistics for higher areas.
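The aggregation problem can be shown with a toy calculation. The figures and names below are invented for illustration: one area borrows the statistics of an OAPWC that already serves a neighbouring area, so summing the area figures counts that OA twice.

```python
# Hypothetical OA populations.
oa_population = {"OA1": 300, "OA2": 250}

# Which OA supplies each area's statistics. "Area C" contains no OAPWC
# and borrows OA2, which also serves "Area B".
area_source = {"Area A": "OA1", "Area B": "OA2", "Area C": "OA2"}

area_stats = {area: oa_population[oa] for area, oa in area_source.items()}

summed_over_areas = sum(area_stats.values())  # OA2 is counted twice
actual_total = sum(oa_population.values())

print(area_stats)
print(summed_over_areas, actual_total)
```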

Page 26: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

The ‘areas with a small population’ problem

• There are a number of administrative and statistical geographies that can in principle be smaller than an Output Area:

• Parishes;

• Wards;

• Built-up areas (BUA).

• The problem is also clustered around particular areas such as the Isles of Scilly and the City of London.

• Civil parishes are a particularly good geography for the research because a large number of them are smaller than an Output Area.

Page 27: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

There is a real demand for parish data

• There is a very real demand for parish data: parishes are one of the most popular geographies on the Neighbourhood Statistics site.

• Many parishes in England are very recent creations driven by the government’s localism agenda.

• For the 2011 Census, some of the previous parish-level outputs were removed, creating a greater requirement for accuracy in the parish outputs that were produced.

• Yet we cannot currently provide statistics for parishes that do not contain an OAPWC.

Page 28: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Parish area varies enormously

Chester Castle (0.04 km²)

Stanhope, County Durham (255 km²)

Over 6,000 times larger.

Page 29: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Parish population varies enormously

• Some old parishes are completely uninhabited; more recent creations are home to up to 75,000 people.

• By contrast, OA population falls within strict thresholds (100 – 625).

Page 30: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Coverage

The majority of the population of England live in “Unparished Areas”.

All of Wales is covered by “communities”, the local equivalent to parishes.

Page 31: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Outline methodology

• Fictitious data were used as a proxy for 2011 Census data, linked to postcodes as a proxy for households.

• One copy was aggregated to OA population-weighted centroids as “model” (best-fit) data.

• A second copy was aggregated to parishes as “true” (exact-fit) data.

• Statistics for the “model” and “true” data were then compared.
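The comparison outlined above might be sketched as follows. Everything here is hypothetical: the postcode records, the `residents` variable, the OA-to-parish allocation in `oa_best_fit_parish` and the `aggregate` helper are invented to show the shape of the calculation, not the pilot's actual data or code.

```python
from collections import defaultdict

# Hypothetical postcode-level records standing in for households.
postcodes = [
    {"postcode": "PC1", "oa": "OA1", "parish": "P1", "residents": 40},
    {"postcode": "PC2", "oa": "OA1", "parish": "P1", "residents": 60},
    {"postcode": "PC3", "oa": "OA1", "parish": "P2", "residents": 20},
    {"postcode": "PC4", "oa": "OA2", "parish": "P2", "residents": 80},
    {"postcode": "PC5", "oa": "OA2", "parish": "P2", "residents": 50},
]

# Assumed best-fit allocation: the parish containing each OA's PWC.
oa_best_fit_parish = {"OA1": "P1", "OA2": "P2"}


def aggregate(records, key):
    """Sum the residents count by the given geography field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["residents"]
    return dict(totals)


# "True" (exact-fit) data: postcodes summed directly to parishes.
true_stats = aggregate(postcodes, "parish")

# "Model" (best-fit) data: whole OA totals passed to the allocated parish.
model_stats = defaultdict(int)
for oa, total in aggregate(postcodes, "oa").items():
    model_stats[oa_best_fit_parish[oa]] += total
model_stats = dict(model_stats)

# Raw and percentage error of the model against the truth, per parish.
errors = {
    parish: {
        "raw": model_stats[parish] - true_stats[parish],
        "pct": 100 * (model_stats[parish] - true_stats[parish]) / true_stats[parish],
    }
    for parish in true_stats
}
print(errors)
```

The postcode that sits in OA1 but belongs to parish P2 is what drives the error: best-fitting moves its residents to P1 along with the rest of the OA.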

Page 32: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

The datapoint ratio and model power

• The “datapoint ratio” is the ratio of the number of postcodes in the OA that would be supplying “model” data to the number of postcodes in the parish that would be receiving those data.

Page 33: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

The datapoint ratio and model power

• Where the “datapoint ratio” was close to 1:1, correlations between the “model” statistics for the external OAPWC and the equivalent “true” statistics for the receiving parish were strong.

• Elsewhere, correlations were generally weak and raw error and percentage errors were high.

• For parishes without an OAPWC, the datapoint ratio was always less than 1:1 and the mean datapoint ratio was 0.3 : 1.

Page 34: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Conclusions - parishes

• There are at least eleven possible types of spatial relationship between parishes and OAs. Most drive the datapoint ratio away from 1:1.

• There is no structural link between parishes and the output area hierarchy of geographies.

• Estimating statistics for a parish on the basis of adjacent OAPWCs is unlikely to succeed.

• Statistics for the 1,142 parishes without OAPWCs are on average likely to be over-estimated.

Page 35: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Conclusions – more general

• Using an external OAPWC to estimate statistics for an area that does not contain an OAPWC is only likely to produce statistics close to the ‘truth’ where the following conditions apply:

• The target geography is similar in number of postcodes/households to the supplying geography; and

• The target geography is structurally similar to the OA/LSOA/MSOA geography.

• The pilot project was successful in classifying the types of geography and scenarios that make best-fitting via adjacent OAPWCs problematic.

• The procedure appears to be especially weak for count/sum variables.

Page 36: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

What’s next?

• We are about to embark upon the project proper, with dedicated high-performance hardware and household-level Census data that will provide a finer degree of granularity and realism to the results.

• This will allow us to develop measures to mitigate the problems identified, in support of our aim to publish robust and non-disclosive estimates for the small geographies that users are most at home with.

• The research will feed into a review of the Geography Policy for National Statistics planned for 2014-15.

Page 37: Using Research to Inform Geographic Policy Best-fitting from Output Areas to Higher Geographies

Links:

Geography Policy for National Statistics:

http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/geography-policy-for-national-statistics.pdf

Best-fit Policy:

http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/index.html

Coady: An overview of best-fitting: Building 2011 Census estimates from Output Areas: http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/an-overview-of-best-fitting.pdf

Ralphs: Exploring the performance of best fitting to provide ONS data for non standard geographical areas

http://www.ons.gov.uk/ons/guide-method/geography/geographic-policy/best-fit-policy/exploring-the-performance-of-best-fitting-to-produce-ons-data-for-non--standard-geographical-areas.pdf

Contact ONS Geography at: [email protected]

Access Open Geography products at: https://geoportal.statistics.gov.uk/geoportal

Access as Linked Data at:   http://statistics.data.gov.uk

Follow ONS Geography at @ONSgeography