Integrated Data Infrastructure (IDI)
Project manager – Guido Stark
June 2012
Linking data across government
How Statistics New Zealand maintains privacy in large-scale data integration projects
Timeline of data integration projects (diagram)
2004 Student Loans (SL)
2005 Linked Employer-Employee Data (LEED)
2007 Student Loans and Allowances (SLA)
2007 Prototype Longitudinal Business Database (LBD)
2008 LEED – MSD Benefit data
2009 Employment Outcomes of Tertiary Education (EOTE)
2010 EOTE + secondary education data
2011 LEED – Household Labour Force Survey (HLFS)
Data sources shown in the diagram:
• EMS and self-employed data (IR)
• Tertiary education (MoE)
• Secondary education (MoE)
• Student loans and allowances (IR, MSD (StudyLink) SLAM)
• Benefits history (MSD (BDD))
• HLFS (Stats NZ)
• Person to business link
Lead the Official Statistics System
• Responding to customer needs
Obtain more value from official statistics
Transform the way we deliver statistics
• Maximise the use of administrative data
• Increase use of data integration
Create a responsive, customer-focused, influential, and sustainable organisation
Data integration raises real and perceived privacy risks
Statistics 2020 – Te Kāpehu Whetū: Achieving the statistical system of the future
Data Integration Policy: Four key principles
1. The public benefits of integration outweigh both:
• privacy concerns about use, and
• risks to the integrity of the Official Statistics System or other activities of government
2. The integrated data will only be used for statistical or research purposes
3. The data integration will be conducted in an open and transparent manner
4. Data will not be integrated where an explicit commitment has been given to respondents that would preclude such action.
Safe Data
The triangle
Privacy
Confidentiality
Security
IDI security
Secure workspaces
Limited access
Access control
Regular audits of access and use
Output control
IDI confidentiality
Encryption / transformation
IDI rules
• Aggregating
• Rounding
• Suppression
Output checking
Statistical vs operational
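The three IDI rules listed above (aggregating, rounding, suppression) can be illustrated with a small sketch. The slides do not give the actual parameters, so the rounding base of 3, the suppression threshold of 6, and all function names below are assumptions chosen for illustration only, not Statistics NZ's published rules.

```python
import random
from collections import Counter

BASE = 3       # assumed rounding base (not stated in the slides)
MIN_COUNT = 6  # assumed suppression threshold (hypothetical)


def aggregate(records, key):
    """Aggregate unit records into a count per category of `key`."""
    return Counter(r[key] for r in records)


def random_round(count, base=BASE, rng=random):
    """Round `count` to an adjacent multiple of `base`.

    Rounds up with probability remainder/base, so the expected
    value of the rounded count equals the true count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def protect(counts, min_count=MIN_COUNT, rng=random):
    """Suppress small cells (None), randomly round the rest."""
    protected = {}
    for category, count in counts.items():
        if count < min_count:
            protected[category] = None  # suppressed cell
        else:
            protected[category] = random_round(count, rng=rng)
    return protected


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [{"region": "A"}] * 7 + [{"region": "B"}] * 2 + [{"region": "C"}] * 9
    print(protect(aggregate(records, "region")))
```

Only the aggregated, protected counts would leave the secure environment in this sketch; the unit records themselves never appear in the output.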
IDI privacy
Privacy Impact Assessment
• www.stats.govt.nz/IDI
Risks
• Information used in a way that is detrimental to their personal circumstances
• Information might be released that identifies them and aspects of their personal circumstances
• Unrelated information might be collected about them in an ever-growing database for non-specific purposes (i.e. ‘Big Brother’).
Benefits
• Potential for new official statistics
• Provides an evidence base for research, evaluation, and policy formulation
• Meets Statistics NZ’s strategic priorities
Data Integration Policy
The Privacy Act 1993
• Principle 1: Purpose of collection of personal information
• Principle 2: Source of personal information
• Principle 12: Unique identifiers
The Statistics Act 1975
• Section 3: Official statistics and coordination
• Section 15: Independence of the Government Statistician
• Section 37: Security of Information
• Section 21: Declaration of secrecy
Information flows
Processing steps shown in the diagram:
1. Source data
2. Load source data
3. Clean source data
4. Create link tables (linking)
5. Create core tables
6. Core tables
7. Outputs (after confidentiality checks)
Safeguards noted on the diagram:
• SNZ derived unique identifier available
• Names, addresses, date of birth removed
• Unique identifiers are encrypted
• Restricted access in accordance with Statistics NZ’s Security Framework
• Access for bona-fide statistical purposes to required data sources only
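The information flow above (clean the source data, create link tables, create core tables with names, addresses, and dates of birth removed and identifiers protected) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the IDI implementation: the field names, the exact-match linking on name and date of birth, and the secret key are all hypothetical, and a keyed hash (HMAC-SHA256) stands in for whatever encryption the IDI actually applies to unique identifiers.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held only by the integrating agency.
SECRET_KEY = b"hypothetical-key-held-by-the-integrating-agency"
# Assumed names for the fields that are removed from core tables.
DIRECT_IDENTIFIERS = {"name", "address", "date_of_birth"}


def clean(record):
    """Trim and lower-case every string field of a source record."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}


def link_key(record):
    """Assumed linking key: exact match on cleaned name and date of birth."""
    return (record["name"], record["date_of_birth"])


def build_link_table(sources):
    """Assign one derived identifier per distinct link key across all sources."""
    link_table = {}
    for records in sources.values():
        for record in records:
            link_table.setdefault(link_key(clean(record)), len(link_table) + 1)
    return link_table


def protect_id(derived_id):
    """Stand-in for identifier encryption: keyed hash of the derived identifier."""
    return hmac.new(SECRET_KEY, str(derived_id).encode(), hashlib.sha256).hexdigest()


def build_core_table(records, link_table):
    """Drop direct identifiers and attach the protected identifier."""
    core = []
    for record in records:
        cleaned = clean(record)
        row = {k: v for k, v in cleaned.items() if k not in DIRECT_IDENTIFIERS}
        row["uid"] = protect_id(link_table[link_key(cleaned)])
        core.append(row)
    return core
```

With this shape, researchers would work only with the core tables: records for the same person share a `uid` across sources, but the tables carry no names, addresses, or dates of birth. Real record linkage typically has to tolerate spelling and formatting differences, which the exact-match key here does not attempt.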
Safe Data
The triangle
Privacy
Confidentiality
Security
Safe output
Safe access
Safe projects
Safe researchers
Use of the IDI
Developing regular measures of immigrant outcomes
Developing tertiary education outcome indicators
How successful is NZ in retaining qualifications?
Intellectual property and productivity
Mapping post-compulsory school pathways and outcomes
The impact of immigration on the labour market
The influence of education on outcomes
What is the impact of gaining qualifications for beneficiaries?
Who doesn’t participate in tertiary education?
Benefit to work transitions
The effect of wage subsidies on individual and firm employment
Secondment
Statistics NZ datalab
• Legal compliance
• Bona fide research
• Non-regulatory
• Proven researcher
• Confidentiality
• Suitable data source
Access
Making the most of the IDI
Integrated Data Infrastructure (IDI) (diagram)
Central Linking Concordance (CLC)
Data sources:
• Business data
• LBD
• LEED
• Person to business link
• Education, secondary & tertiary (MoE)
• EMS and self-employed data (Inland Revenue)
• Student Loans & Allowances (Inland Revenue, MSD (StudyLink) SLAM)
• HLFS / NZIS (survey)
• Benefits (MSD (BDD))
• Migration data (DoL)
• ???
Outputs:
• Relevant releases
• Dynamic datasets
• Cutting edge cubes
• Powerful research