The Social Statistics Database:
Invaluable source of micro-data for socio-economic statistics
Johan van Rooijen
Contents of this presentation
• What does the Social Statistics Database (SSD) contain?
• Linkage keys
• Standardization
• Data and metadata
• Process
• Privacy-protection
• Output-examples
SSD: contents
The SSD is a central database containing microdata on:
• Persons
• Relations between persons
• Households
• Jobs
• Self-employment
• Social security benefits
• State and employer pensions
• Income
• Education
• Hospitalizations
• Causes of death
• Criminal offense reports
• Houses
• Vehicles
and more...
Many administrative sources, such as:
• Population Register
• Register Income Tax Declarations
• Administration of Employee Insurance Schemes
• Police Register
• Valuation of real estate registration system
Survey:
•Labour Force Survey
Future developments: more administrative sources, more surveys, big data?
SSD: linkage keys
All microdata files stored in the SSD contain a LINKAGE KEY.
Interlinking microdata is what the SSD is about!
Example:
Linking microdata on graduates with data on persons and data on employment in order to describe the transition from education to labour market participation.
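The linkage in this example can be sketched as a join on the person key (PIN). This is only a minimal illustration: the three files, their field names (`pin`, `diploma`, `start_year`, and so on) and all records are hypothetical and do not reflect the actual SSD layout. For simplicity it also assumes at most one job record per person.

```python
# Hypothetical microdata files; every record carries the linkage key "pin".
graduates = [
    {"pin": "P001", "diploma": "secondary vocational", "year": 2005},
    {"pin": "P002", "diploma": "university", "year": 2005},
    {"pin": "P003", "diploma": "university", "year": 2005},
]
persons = [
    {"pin": "P001", "sex": "F", "birth_year": 1985},
    {"pin": "P002", "sex": "M", "birth_year": 1981},
    {"pin": "P003", "sex": "F", "birth_year": 1982},
]
jobs = [
    {"pin": "P001", "start_year": 2005},
    {"pin": "P003", "start_year": 2006},
]


def index_by_key(records, key):
    """Build a lookup table from linkage key to record (assumes one record per key)."""
    return {record[key]: record for record in records}


def link_graduates(graduates, persons, jobs):
    """Left-join graduates to persons and jobs on the PIN."""
    person_by_pin = index_by_key(persons, "pin")
    job_by_pin = index_by_key(jobs, "pin")
    linked = []
    for grad in graduates:
        pin = grad["pin"]
        job = job_by_pin.get(pin)  # None when the graduate has no job record
        linked.append({
            **grad,
            **person_by_pin.get(pin, {}),
            "employed": job is not None,
            "job_start_year": job["start_year"] if job else None,
        })
    return linked


linked = link_graduates(graduates, persons, jobs)
# Share of graduates for whom a job record was found.
employed_share = sum(row["employed"] for row in linked) / len(linked)
```

Because every file shares the same standardized key, the same join pattern would apply to any other pair of files, for instance households via a HIN or enterprises via an EIN.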
SSD: contents and linkage keys
PIN: person identification number
HIN: household identification number
AIN: address identification number
EIN: enterprise identification number
SSD: standardization
Standardization is an important aspect of the SSD:
• organization of the IT-infrastructure
• file formats
• linkage keys
• names
• metadata
SSD: relation between data and metadata
In conclusion:
Microdata files are linked to other microdata files by means of a set of standardized linkage keys
Microdata files are linked to their metadata through names (file name and variable names)
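Linking data to metadata by name can be pictured as a lookup keyed on file name and variable name. The sketch below is an assumption about how such a lookup might work, not a description of the SSD's actual metadata system; the catalogue `METADATA`, the file name `JOBS_2006` and the variable names are all hypothetical.

```python
# Hypothetical metadata catalogue, keyed by (file name, variable name).
METADATA = {
    ("JOBS_2006", "PIN"): "Person identification number (linkage key)",
    ("JOBS_2006", "EIN"): "Enterprise identification number (linkage key)",
    ("JOBS_2006", "START_DATE"): "First day of the job",
}


def describe(file_name, variable_names):
    """Return the metadata description for each variable of a microdata file."""
    return {
        name: METADATA.get((file_name, name), "no metadata registered")
        for name in variable_names
    }


# "WAGE" is deliberately absent from the catalogue to show the fallback.
descriptions = describe("JOBS_2006", ["PIN", "START_DATE", "WAGE"])
```

A lookup like this only works when file and variable names are standardized, which is why names appear in the list of standardized items.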
SSD: process
SSD: privacy-protection
Legislation:
• Statistics Netherlands Act
• Netherlands Data Protection Act
These laws:
• authorize Statistics Netherlands to use personal data
• oblige Statistics Netherlands to take adequate measures aimed at privacy protection
Privacy protection measures:
• Linkage keys are anonymous, original personal identifiers are removed from the data
• Access rights to the SSD are restricted
• Limited e-mail facilities
• Staff members have to take an oath
• Check on disclosure risk of output
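Two of these measures, anonymous linkage keys and the check on disclosure risk, can be illustrated in code. The slides do not say how either is implemented, so the sketch below swaps in common techniques as assumptions: a keyed hash (HMAC-SHA256) to derive an anonymous key, and a minimum cell size to flag risky output. `SECRET_KEY`, `MIN_CELL_SIZE`, the field names and the records are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical values; a real secret would be kept outside the database.
SECRET_KEY = b"example-secret-kept-outside-the-database"
MIN_CELL_SIZE = 10  # assumed threshold below which a table cell is flagged


def pseudonymize(identifier: str) -> str:
    """Derive an anonymous linkage key from an original identifier (keyed hash)."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_record(record: dict, id_field: str = "national_id") -> dict:
    """Drop the original identifier and add the anonymous key as 'pin'."""
    cleaned = {k: v for k, v in record.items() if k != id_field}
    cleaned["pin"] = pseudonymize(record[id_field])
    return cleaned


def risky_cells(records, group_field):
    """Return the group values whose record count is below MIN_CELL_SIZE."""
    counts = Counter(r[group_field] for r in records)
    return sorted(value for value, n in counts.items() if n < MIN_CELL_SIZE)


# Twelve invented records in municipality "A", two in municipality "B".
records = [
    anonymize_record({"national_id": str(i), "municipality": "A" if i < 12 else "B"})
    for i in range(14)
]
flagged = risky_cells(records, "municipality")
```

The same identifier always maps to the same key, so files can still be interlinked after the original identifiers are removed.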
SSD: Examples of output
Transition from education to labour market participation
The effect of discharge and discharge reason on relationship dissolution
[Bar chart: percentages (axis from 0% to 40%) for women and men in four groups: no discharge (control group), discharge (redundancy), discharge (personal grounds), discharge (long-term illness)]
Dropping out of school and criminal behavior
Suspected of a crime in the period 1999-2006
Overall: 8.2%
Drop-outs: 37.6%
Research by external researchers (through Centre for Policy-related Statistics)
• The Netherlands Cancer Institute: linking SSD-data to cancer registration
• Municipalities: evaluation of their social assistance policy by enriching own data with data from the SSD
The end