WICID AND THE 2001 INTERACTION DATA
John Stillwell and Oliver Duke-Williams
School of Geography, University of Leeds
Presentation at the Ninth International Conference on Computers in Urban Planning and Urban Management (CUPUM), University College London, 14-15 July 2005
Presentation
What is WICID?
Census 2001 interaction data sets
  Primary data sets
  Tables and counts compared with 1991
  Adjustment for disclosure control
WICID interface developments
  WICID interface
  Map selection tool
  Analytical functions
Using the 2001 data: some examples
Conclusions
What is WICID?
Web-based Interface for Census Interaction Data
Software system developed as part of the Census Interaction Data Service (CIDS)
CIDS is funded by ESRC/JISC and is part of the Census Programme 2001-2006
CIDS is a ‘Data Support Unit’ providing members of the academic community with online access to the census ‘Origin-Destination Statistics’
Overall aim is to encourage more use to be made of these data sets that are one of the products of the Census
Census 2001 interaction data sets
Origin-Destination Statistics are larger and more complex than those from previous censuses
Three sets of data in 2001:
  Special Migration Statistics (SMS) for the UK
  Special Workplace Statistics (SWS) for the UK
  Special Travel Statistics (STS) for Scotland only, which include journeys to place of study as well as place of work
Geographical units used in 2001 SMS/SWS/STS
Country | Level 1 | Level 2 | Level 3
England | London Boroughs (33), Metropolitan Districts (36), Unitary Authorities (46), Other Local Authorities (239) | CAS wards (7,969) | Output areas (165,665)
Wales | Unitary Authorities (22) | CAS wards (881) | Output areas (9,769)
Scotland | Council Areas (32) | ST wards (1,176) | Output areas (42,604)
Northern Ireland | Parliamentary Constituencies (18) | CAS wards (582) | Output areas (5,022)
Total | Districts (426) | Interaction wards (10,608) | Output areas (223,060)
Data tables and counts compared with 1991
More tables but some dropped
More counts but some dropped
Counts include infants and students
Counts for moving groups as well as wholly moving households
Classifications changed, e.g. for ethnicity, economic activity, areas outside UK, ...
Tables and counts in the 2001 and 1991 interaction data sets
Data set | Level 1 | Level 2 | Level 3
2001 SMS | 10 tables, 996 counts | 5 tables, 96 counts | 1 table, 12 counts
1991 SMS | Set 2: 11 tables, 94 counts | Set 1: 2 tables, 12 counts | -
2001 SWS | 7 tables, 936 counts | 6 tables, 354 counts | 1 table, 36 counts
2001 STS | 7 tables, 1,176 counts | 6 tables, 478 counts | 1 table, 50 counts
1991 SWS* | - | Set C: 9 tables, 274 counts | -
* 10% sample
Tables and counts from 2001 SMS Level 1 and 1991 SMS Set 2
Variable | 2001 Level 1 tables | 2001 Level 1 counts | 1991 Set 2 tables | 1991 Set 2 counts
Age | Table 1 | 75 | Tables 1, 2 | 48
Family status | Table 2 | 54 | - | -
Ethnicity | Table 3 and 3N | 33 | Table 5 | 4
Limiting illness | Table 4 | 84 | Table 6 | 4
Economic activity | Tables 5 and 8 | 378 | Table 7, 9, 10 | 21
Moving groups | Table 6 | 16 | Table 2 | 2
Tenure | Table 7 | 32 | Table 8 and 8S | 7
Occupation | Table 9 | 288 | - | -
Some knowledge of Gaelic/Welsh/Irish | Table 10 | 36 | Table 11S and 11W | 2
Marital status | - | - | Table 4 | 6
Adjustment for disclosure control
Various methods used by ONS in 2001:
  minimum thresholds of people and households before the release of data
  record swapping between areas
  small cell adjustment method (SCAM)
SCAM assumed to adjust values of 1 and 2 to values of 0 and 3
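The slide states only that SCAM is assumed to turn values of 1 and 2 into 0 or 3; the rule itself is not given here. As a minimal sketch, assuming (hypothetically) an unbiased random rounding in which a 1 becomes 3 with probability 1/3 and a 2 becomes 3 with probability 2/3, the adjustment might look like this. The function name `scam_adjust` and the probabilities are illustrative assumptions, not the ONS method.

```python
import random

def scam_adjust(value, rng):
    """Adjust a small cell count; 0 and counts of 3 or more pass through unchanged."""
    if value in (1, 2):
        # Assumed rule: round up to 3 with probability value/3, otherwise down to 0,
        # so the expected value of the adjusted cell equals the original count.
        return 3 if rng.random() < value / 3 else 0
    return value

rng = random.Random(2001)  # fixed seed so the sketch is repeatable
original = [0, 1, 2, 3, 1, 2, 7, 1, 0, 2]
adjusted = [scam_adjust(v, rng) for v in original]
print(adjusted)
```

Whatever the real rule, the visible outcome would be the same: no adjusted cell holds a 1 or a 2.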
Impact of SCAM on output area flows
MG301 gives data on flows between output areas in the UK
What are the counts included?
ONS only provide data on OA to OA flows for cells 5, 6, 8, 9, 11, 12 where flow is non-zero
• Only those with destinations in Scotland have been disclosure controlled
• Potential matrix of 223,060 origins by 180,456 destinations is > 40 billion cells
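The size of the potential matrix can be checked directly, and it shows why only non-zero flows are worth supplying. The sketch below does the arithmetic and then stores a few flows sparsely; the output area codes and counts are made-up placeholders, and the dictionary layout is an assumption for illustration rather than the format in which the data are distributed.

```python
origins = 223_060        # all UK output areas
destinations = 180_456   # destination output areas quoted on the slide

potential_cells = origins * destinations
print(f"{potential_cells:,}")  # 40,252,515,360

# Keep only non-zero flows, keyed by (origin, destination) output area codes.
flows = {
    ("OA_A", "OA_B"): 3,
    ("OA_A", "OA_C"): 5,
    ("OA_D", "OA_B"): 3,
}

def flow(origin, destination):
    """Return the stored flow, treating any missing pair as zero."""
    return flows.get((origin, destination), 0)

print(flow("OA_A", "OA_B"), flow("OA_B", "OA_A"))
```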
Impact of SCAM: e.g. on distribution of interior cell values in SMS Table MG301 (excluding Scotland)
[Figure: number of observations (log scale, 1 to 10,000,000) by interior cell value (0 to 30)]
Around 5 million migrations in total
Over 10 million values of 0
No values of 1 or 2
Over 1 million values of 3
Only cell values up to 30 shown
99.3% of values in cells are 0 or 3, accounting for 95% of flows
Other effects of SCAM
This indicates ‘single table’ effect of SCAM
There are also problems when comparing:
(a) flows between tables
(b) flows between spatial scales
Impact of SCAM when comparing tables: e.g. SMS at level 2
There are 2 tables at level 2 (wards) that contain total migrants:
  MG201 Age by sex
  MG203 Ethnic group by sex
These cells provide flows of total migrants between 10,608 wards in the UK and 9,432 wards in the UK excluding Scotland (aggregations of flows in interior cells)
What differences occur in non-zero ward to ward flow totals?
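One way to answer this question is to take the total for each ward pair from both tables and tabulate the absolute differences. The sketch below does this for a handful of invented ward pairs; the ward codes, the counts and the function name `difference_distribution` are hypothetical and only illustrate the comparison.

```python
from collections import Counter

# Hypothetical ward-to-ward totals of migrants taken from the two tables.
totals_mg201 = {("W1", "W2"): 12, ("W1", "W3"): 3, ("W2", "W3"): 6, ("W3", "W1"): 9}
totals_mg203 = {("W1", "W2"): 15, ("W1", "W3"): 3, ("W2", "W3"): 0, ("W3", "W1"): 12}

def difference_distribution(a, b):
    """Count how often each absolute difference occurs across all ward pairs."""
    pairs = set(a) | set(b)
    return Counter(abs(a.get(p, 0) - b.get(p, 0)) for p in pairs)

dist = difference_distribution(totals_mg201, totals_mg203)
share_different = 1 - dist[0] / sum(dist.values())
print(sorted(dist.items()))
print(f"{share_different:.0%} of totals differ")
```

Because adjusted small cells are 0 or 3, differences between two sums of such cells tend to fall on multiples of 3.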
Comparing total migrants in MG201 and MG203: distribution of absolute differences between alternative totals
[Figure: frequency (0 to 120,000) by difference between totals (0 to 25)]
Over 1.1 million ward to ward flow totals
Totals are different in 61% of cases
Distribution of differences dominated by 3s
Largest difference is 21 and frequency of larger differences is low
Impact of SCAM when comparing flows between spatial scales
What are the effects of SCAM when comparing net migration rates based on data from:
  Table TT 37
  SMS Table 104
and data aggregated up from:
  SMS Table 204
  SMS Table 304
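Aggregating up means summing lower-level flows to the higher-level areas that contain them before computing the rate. A minimal sketch, assuming a ward-to-borough lookup and a borough population as the denominator; all names and numbers are invented, and the choice of denominator is an assumption since the slides do not state it.

```python
from collections import defaultdict

# Hypothetical lookup from ward to borough, ward-to-ward flows and populations.
ward_to_borough = {"W1": "B1", "W2": "B1", "W3": "B2", "W4": "B2"}
ward_flows = {("W1", "W3"): 6, ("W2", "W4"): 3, ("W3", "W1"): 3, ("W1", "W2"): 9}
population = {"B1": 3000, "B2": 2000}

def net_rates_per_1000(flows, lookup, pop):
    """Aggregate flows to boroughs and return net migration per 1000 population."""
    inflow, outflow = defaultdict(int), defaultdict(int)
    for (origin, destination), count in flows.items():
        o, d = lookup[origin], lookup[destination]
        if o != d:  # moves within a borough do not change its net balance
            outflow[o] += count
            inflow[d] += count
    return {b: 1000 * (inflow[b] - outflow[b]) / pop[b] for b in pop}

rates = net_rates_per_1000(ward_flows, ward_to_borough, population)
print(rates)
```

Each aggregated total is a sum of independently adjusted cells, which is why rates built from level 2 or level 3 data can drift from those built from level 1.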
Net migration comparison for London boroughs
Net migration rates per 1000
Borough | TT 37 | SMS 104 | SMS 204 | SMS 304
Top five
City of London | 4.87 | 5.01 | 11.70 | 3.34
Kingston upon Thames | 3.33 | 3.95 | 2.76 | 3.14
Lambeth | 0.00 | 0.55 | -0.69 | 0.76
Sutton | -0.15 | 1.39 | 0.66 | 1.30
Barking and Dagenham | -0.49 | -0.16 | -0.23 | -0.16
Bottom five
Brent | -13.33 | -12.94 | -13.47 | -12.68
Kensington and Chelsea | -13.50 | -13.70 | -14.49 | -14.37
Hounslow | -14.04 | -13.12 | -14.24 | -13.49
Ealing | -14.64 | -15.08 | -14.70 | -14.84
Newham | -16.73 | -18.04 | -17.30 | -18.05
Mean net rate | -7.10 | -6.94 | -7.34 | -7.18
Correlation with TT 37 rate | - | 0.985 | 0.975 | 0.980
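The correlation row can be reproduced in principle with a Pearson coefficient between two columns of rates. The sketch below applies it to the ten boroughs listed in the table only, so its result is not expected to match the 0.985 reported, which presumably covers all boroughs; the helper `pearson` is written out here for illustration.

```python
from math import sqrt

# Rates for the ten boroughs shown in the table (top five, then bottom five).
tt37 = [4.87, 3.33, 0.00, -0.15, -0.49, -13.33, -13.50, -14.04, -14.64, -16.73]
sms104 = [5.01, 3.95, 0.55, 1.39, -0.16, -12.94, -13.70, -13.12, -15.08, -18.04]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson(tt37, sms104)
print(round(r, 3))
```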
WICID interface developments
CIDS homepage: http://cids.census.ac.uk/
WICID home page
WICID general query interface
See paper in Environment and Planning A (2003) for further details
Example of a query
Technical details
WICID uses PostgreSQL (www.postgresql.org) as its DBMS and to provide support for the storage and manipulation of geometric features
A third party add-on to PostgreSQL called PostGIS (postgis.refractions.net) offers facilities to handle spatial data that follow the OpenGIS ‘Simple Features Specification for SQL’ standard
In order for dynamic web pages to be created, a programming language is required and WICID uses PHP (PHP: Hypertext Preprocessor) (www.php.net)
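To give a flavour of the kind of spatial query this stack supports, the sketch below builds a parameterised query that finds areas intersecting a bounding box. It is written in Python for illustration although WICID itself uses PHP, and it uses function names from current PostGIS releases (`ST_Intersects`, `ST_MakeEnvelope`). The table and column names (`district_boundaries`, `area_code`, `area_name`, `geom`) and the coordinates are hypothetical, not WICID's actual schema.

```python
def areas_in_box_query(table, xmin, ymin, xmax, ymax, srid=27700):
    """Build a parameterised PostGIS query for areas intersecting a bounding box.

    The table name must come from trusted code, never from user input;
    the coordinates and SRID are passed separately as query parameters.
    """
    sql = (
        f"SELECT area_code, area_name FROM {table} "
        "WHERE ST_Intersects(geom, ST_MakeEnvelope(%s, %s, %s, %s, %s))"
    )
    return sql, (xmin, ymin, xmax, ymax, srid)

# A box in British National Grid coordinates (SRID 27700), values illustrative.
sql, params = areas_in_box_query("district_boundaries", 531000, 180000, 534000, 182000)
print(sql)
print(params)
```

The `%s` placeholders follow the style used by common Python PostgreSQL drivers, so the pair could be handed to a cursor's `execute` call once a connection exists.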
What new facilities have been developed in WICID?
Map selection tool
  This has been developed because of the demand by users (particularly students) to be able to view the geographical areas that they want to select
Analysis tools
  These have been developed to provide users with the opportunity to generate some basic statistical information and indicators derived from the data extracted
When choosing origins or destinations, users are confronted with a set of alternative selection tools
Map selection window in WICID
WICID uses a PostGIS-extended PostgreSQL database and MapServer library components
Example: Selection of City of London as a destination
Analytical Tools
Some basic statistics
Suite of indicators, some of which require additional data, e.g. distances, populations at risk (PARs)
Assembly of PARs is currently underway for 2001 data sets: needs specially commissioned counts for some variables
Example: Migration effectiveness by ethnic group for GO regions, 2000-01
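The slides do not define migration effectiveness. A common definition, assumed here, is net migration expressed as a percentage of gross migration (in-migration plus out-migration), giving a value between -100 and +100. The sketch below applies that assumed definition to invented counts; the group names and numbers are placeholders.

```python
def migration_effectiveness(inflow, outflow):
    """Net migration as a percentage of gross migration (in plus out)."""
    gross = inflow + outflow
    if gross == 0:
        return 0.0  # no movement at all, so no imbalance to measure
    return 100 * (inflow - outflow) / gross

# Hypothetical in- and out-migration counts for three groups in one region.
groups = {"Group A": (1200, 800), "Group B": (300, 500), "Group C": (150, 150)}
for name, (inflow, outflow) in groups.items():
    print(name, round(migration_effectiveness(inflow, outflow), 1))
```

Unlike a rate per 1000, this indicator needs no population at risk, which makes it usable before the PARs have been assembled.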
Using the 2001 data: some examples
Three examples based on data for London boroughs for 2000-01:
  Internal net migration and ethnicity
  Migration effectiveness by age
  Commuting connectivity
Example 1: Patterns of net migration for London boroughs in 2000-01
Rates of net migration with rest of GB
Source: 2001 Census SMS level 1
The importance of disaggregating net migration for London boroughs, 2000-01
Net rates of migration for boroughs with the rest of GB
Net rates of migration for boroughs within London
Source: 2001 Census SMS level 1
Net migration (GL only), selected ethnic groups, 2000-01
Source: 2001 Census SMS level 1
Example 2: Migration flows and net balances by age group for London, 2000-01
[Figure: migration (-40,000 to 160,000) by age group (1-4 to 85+); series: Intra-migration, In-migration, Out-migration, Net migration]
Source: 2001 Census SMS level 1
Net migration effectiveness by age, London, 2000-01
[Figure: migration effectiveness (-80 to 40) by age group (1-4 to 85+)]
Example 3: Connectivity within London
Journey to work flows by destination borough, 2001
Source: 2001 Census SWS level 1
Out-migration and in-commuting connectivity for London boroughs by ethnic group, 2001
[Figure, panel 1: outflow CI (within GL), 0 to 1, for boroughs ranked by index (1 to 33); series: White, Indian, Pakistani and OSA, Chinese, Black]
[Figure, panel 2: in-commuting CI (within GL), 0 to 1, for boroughs ranked by index (1 to 33); series: White, Black, Indian, Pakistani and OSA, Chinese]
Source: 2001 Census SMS and SWS level 1
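The connectivity index (CI) plotted above is not defined on the slides. One plausible reading, assumed here, is the share of the other boroughs with which a borough has a non-zero flow, which ranges from 0 to 1 as in the charts. The sketch below computes that assumed index for outflows using invented borough codes and counts.

```python
def outflow_connectivity(flows, origin, areas):
    """Share of the other areas that receive a non-zero flow from `origin`."""
    others = [a for a in areas if a != origin]
    linked = sum(1 for d in others if flows.get((origin, d), 0) > 0)
    return linked / len(others)

# Hypothetical borough codes and flows; a stored zero counts as no link.
areas = ["B1", "B2", "B3", "B4", "B5"]
flows = {("B1", "B2"): 12, ("B1", "B3"): 3, ("B2", "B1"): 6, ("B1", "B5"): 0}

print(outflow_connectivity(flows, "B1", areas))  # 0.5
print(outflow_connectivity(flows, "B2", areas))  # 0.25
```

An in-commuting version would count non-zero flows arriving at a destination instead. Under this reading, small-cell adjustment matters: a flow adjusted from 1 or 2 down to 0 removes a link altogether.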
Conclusions
CIDS and WICID in successful operation since 2002
2001 data sets accessible but work still underway on map selection and analysis tools to handle 2001 data
Users need to take care when using 2001 data: SCAM means that there are several measures of the same count!
Examples demonstrate some of the insights that the interaction data sets can provide into behaviour
Comparisons between 1991 and 2001 possible but users need to be aware of the definitional, measurement and geographical inconsistencies between the two sets of data
Acknowledgement
CIDS is funded by the ESRC/JISC under Census Programme Research Grant H507255177
http://cids.census.ac.uk