Archiving microdata
Standards and good practices
United Nations Statistics Commission
New York, February 26, 2009
Olivier Dupriez
World Bank, Development Data Group
and
International Household Survey Network
The value of data
• Surveys and censuses
  – High cost! High value?
• Data have value beyond the purpose for which they were originally collected (“repurposing” of data)
  – Large under-exploited potential
• Condition: proper archiving
  – Documentation, dissemination, preservation
Data archiving – Two models
By a specialized data center (“trusted repository”) (US, Canada, Europe)
• Often academic
• High level of expertise
• Infrastructure
• Standards and best practices for documentation
• Formal dissemination and preservation policies and procedures
• Support to users
By the data producer (most developing countries)
• Not seen as a key role
• Lack of expertise
• Inappropriate infrastructure
• Ad hoc practices
• No compliance with international standards
• Unclear policies and procedures
Sharing good practices
Objective: transfer data archiving good practices and standards to data producers
International Household Survey Network (IHSN)
– A network of international agencies (coordinated by World Bank / PARIS21)
– Develops tools, guidelines, training materials
– Advocates compliance with good practices and international standards
www.ihsn.org
Microdata documentation
Good documentation is needed to:
– Properly analyze the data
– Increase credibility of derived indicators and analysis
– Allow replication of data collection or analysis
– Build institutional memory
DDI + Dublin Core metadata standards (XML)
A checklist of everything you need to know
– Study description
– File description
– Variable description
– Related materials
www.ddialliance.org
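To make the four-part checklist concrete, here is a minimal sketch of a DDI-style XML record built with Python's standard library. The element names (`codeBook`, `stdyDscr`, `fileDscr`, `dataDscr`, `var`, `otherMat`) follow the DDI Codebook vocabulary as far as I recall it, heavily simplified; the output is not schema-valid DDI, and the survey title, file name and variable are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, file_name, variables):
    """Build a simplified DDI-style codebook with the four checklist sections.

    `variables` is a list of (name, label, question_text) tuples.
    """
    root = ET.Element("codeBook")

    # 1. Study description: what the survey is.
    study = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # 2. File description: one entry per data file (only one here).
    file_dscr = ET.SubElement(root, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # 3. Variable description: label and literal question for each variable.
    data_dscr = ET.SubElement(root, "dataDscr")
    for name, label, question in variables:
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
        qstn = ET.SubElement(var, "qstn")
        ET.SubElement(qstn, "qstnLit").text = question

    # 4. Related materials: questionnaires, reports, manuals.
    other = ET.SubElement(root, "otherMat")
    ET.SubElement(other, "labl").text = "Questionnaire (PDF)"

    return root


# Hypothetical survey used only for illustration.
codebook = build_codebook(
    title="Example Household Survey 2008",
    file_name="household.dat",
    variables=[
        ("hhsize", "Household size",
         "How many people usually live in this household?"),
    ],
)
xml_text = ET.tostring(codebook, encoding="unicode")
print(xml_text)
```

A real DDI document carries many more fields (sampling, coverage, producers, access conditions) plus namespaces and Dublin Core entries for the related materials; the point here is only the overall shape.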
IHSN DDI Metadata Editor
Documenting the study: sampling, data collection, scope and coverage, etc.
IHSN DDI Metadata Editor
Documenting files and variables: formulation of questions, interviewer’s instructions, computation of variables, etc.
IHSN DDI Metadata Editor
Metadata in XML format can be “transformed” into HTML, PDF or other formats
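The idea of "transforming" XML metadata can be illustrated with a short sketch that reads a simplified DDI-style record and emits an HTML variable list. This is an assumption-laden illustration, not how the IHSN editor works: production tools would more likely use XSLT stylesheets, which Python's standard library does not provide, so the sketch walks the tree by hand. The sample record is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified DDI-style record (not schema-valid DDI).
SAMPLE_XML = """<codeBook>
  <stdyDscr><citation><titlStmt>
    <titl>Example Household Survey 2008</titl>
  </titlStmt></citation></stdyDscr>
  <dataDscr>
    <var name="hhsize"><labl>Household size</labl></var>
    <var name="region"><labl>Region of residence</labl></var>
  </dataDscr>
</codeBook>"""


def codebook_to_html(xml_text):
    """Render the study title and variable list as a small HTML fragment."""
    root = ET.fromstring(xml_text)
    title = root.findtext("stdyDscr/citation/titlStmt/titl",
                          default="Untitled study")
    rows = []
    for var in root.findall("dataDscr/var"):
        # Escape values so that labels containing < or & stay valid HTML.
        name = escape(var.get("name", ""))
        label = escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    return (
        f"<h1>{escape(title)}</h1>\n"
        "<table>\n<tr><th>Variable</th><th>Label</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>"
    )


html_page = codebook_to_html(SAMPLE_XML)
print(html_page)
```

Because the metadata is stored once in a structured format, the same record could feed a PDF report or a web catalogue without being retyped.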
Microdata cataloguing
XML/DDI metadata is web-ready, “browsable and searchable”
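As a rough illustration of what "searchable" can mean, the sketch below runs a keyword search over variable labels in a tiny in-memory catalogue. The catalogue layout, study identifiers and the `search_catalog` helper are all hypothetical; a real catalogue would index the full DDI metadata in a database or search engine.

```python
# Hypothetical catalogue: one entry per documented survey.
CATALOG = [
    {
        "id": "HS-2008",
        "title": "Example Household Survey 2008",
        "variables": {
            "hhsize": "Household size",
            "region": "Region of residence",
        },
    },
    {
        "id": "LFS-2007",
        "title": "Example Labour Force Survey 2007",
        "variables": {
            "empstat": "Employment status",
            "region": "Region of residence",
        },
    },
]


def search_catalog(catalog, keyword):
    """Return (study id, variable name) pairs whose label contains keyword.

    Matching is case-insensitive and looks at variable labels only.
    """
    needle = keyword.lower()
    hits = []
    for study in catalog:
        for name, label in study["variables"].items():
            if needle in label.lower():
                hits.append((study["id"], name))
    return hits


print(search_catalog(CATALOG, "region"))
print(search_catalog(CATALOG, "employment"))
```

Variable-level search of this kind is only possible when variables have been documented consistently, which is the practical payoff of a shared metadata standard.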
Microdata dissemination
• Growing demand for microdata
• Potential to add much value to existing data
• But requires:
  – Enabling legislation
  – Formal policy/procedures (IHSN guidelines)
  – Technical capacity to prepare data for dissemination
    • Documenting, cataloguing
    • Anonymizing (IHSN tools being tested)
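To give a feel for the anonymization step, here is a minimal sketch of two common operations: dropping direct identifiers, then flagging records whose combination of quasi-identifiers is rare (a k-anonymity-style check). This is not the method used by the IHSN tools mentioned above, which the slides do not describe; the field names, the choice of quasi-identifiers and the threshold `k` are assumptions for illustration.

```python
from collections import Counter

# Assumed field roles for this hypothetical dataset.
DIRECT_IDENTIFIERS = {"name", "address"}
QUASI_IDENTIFIERS = ("region", "age_group", "sex")


def drop_direct_identifiers(records):
    """Return copies of the records without directly identifying fields."""
    return [
        {key: value for key, value in record.items()
         if key not in DIRECT_IDENTIFIERS}
        for record in records
    ]


def risky_records(records, k=2):
    """Return records whose quasi-identifier combination occurs < k times."""
    def combo(record):
        return tuple(record[field] for field in QUASI_IDENTIFIERS)

    counts = Counter(combo(record) for record in records)
    return [record for record in records if counts[combo(record)] < k]


# Hypothetical respondents.
records = [
    {"name": "A. Person", "address": "1 Main St", "region": "North",
     "age_group": "30-39", "sex": "F", "income": 1200},
    {"name": "B. Person", "address": "2 Main St", "region": "North",
     "age_group": "30-39", "sex": "F", "income": 900},
    {"name": "C. Person", "address": "3 Hill Rd", "region": "South",
     "age_group": "70-79", "sex": "M", "income": 400},
]

released = drop_direct_identifiers(records)
flagged = risky_records(released, k=2)
print(flagged)  # the single South / 70-79 / M record is unique, so it is flagged
```

Flagged records would then need further treatment (recoding, suppression, or similar) before release; which treatment is appropriate depends on the dissemination policy.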
Data and metadata preservation
Situation in many countries: documents in hard copy only, outdated storage media, multiple versions of datasets, much information lost (or never generated).
Goal: data and documentation remain readable, meaningful, understandable and accessible. This requires managing hardware, software and storage media (not only backups, but also “migration”).
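One widely used preservation practice, not named on the slide but consistent with its goal, is fixity checking: recording a checksum for every archived file so that corruption or unintended changes can be detected after a backup or a media migration. The sketch below is a minimal version using SHA-256; the file names are hypothetical and the demo runs in a temporary folder.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        str(path.relative_to(folder)): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files from the old manifest that are missing or altered."""
    return sorted(
        name for name, digest in old_manifest.items()
        if new_manifest.get(name) != digest
    )


# Demo: simulate an unintended change to a data file.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp)
    (archive / "household.dat").write_text("1,4,urban\n")
    (archive / "questionnaire.txt").write_text("Q1. Household size?\n")
    before = build_manifest(archive)

    (archive / "household.dat").write_text("1,5,urban\n")
    after = build_manifest(archive)
    damaged = changed_files(before, after)

print(damaged)
```

Checksums only detect that a file changed; keeping data readable over time still depends on migrating file formats and media, as the slide notes.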
On-going: IHSN-ICPSR guidelines (Open Archival Information System - OAIS; ISO 14721)
Conclusions and recommendations
– National statistical offices (NSOs) do not need all the features of advanced data centers, but data archiving is part of their mandate
– Documentation and preservation are a MUST, even if you don’t disseminate
– Good practices and standards are relatively easy to implement
– Good documentation of past surveys helps improve the quality of future surveys