Federal Statistical Office
eSTATISTIK.core - Integrating Respondents’ IT Systems into Data Collection
UNECE Work Session on Statistical Data Editing
Bonn, 25-27 September 2006
Michael Schäfer
Federal Statistical Office, Germany
© Federal Statistical Office Germany; Units IIC4, VD2
Outline
Overview and history
Architecture
Workflow
Validation
Conclusion
The core of .CORE
A new procedure for the primary collection of raw data from businesses and authorities
Integrates IT systems of respondents into the data collection system, allowing reporting to become a fully automated and seamless workflow
Includes methodological improvements
Main objectives: reduce the burden on respondents, increase data collection efficiency and data quality
Initiated 03/2003, in production since 03/2005
Partners: Statistical offices and AWV
Current reporting channels
[Diagram: business data management systems (ERP; HR, GL, ...) report to the statistical offices via paper, web (WWW), phone and fax. Annotations:]
Manual entry, small data volumes, little or no standardisation
Large data volumes, but no standard procedures/formats
Multiple, parallel reporting, various procedures/formats
Internal workflow, little or no support by standard ERP software
A multiplicity of data collection procedures
The .CORE reporting channel
[Diagram: business data management systems (ERP; HR, GL, ...) report to the statistical offices via the web (WWW) through CORE.server.]
Direct data retrieval and automated message generation
Multi-message documents
A single point of delivery
Survey-independent data and programming interfaces
Architecture
Data formats: DatML/RAW (raw data), DatML/SDF (survey definition), DatML/RES (validation protocol)
[Diagram with three columns: Business, Transport, Statistics. Labels shown: DMS with statistics modules (Survey A, Survey B) and CORE.connect; CORE.reporter with CORE.connect; KonVert (business data, raw data); XML over HTTPS; DCS with data reception, validation, transformation and forwarding; metadata, metadata access and resource database; production (Survey A, Survey B) at the statistical offices.]
Data validation overview
Where does validation take place?
optionally at the respondent (statistics module/CORE.reporter)
automatically on CORE.server
How is validation performed?
by using the free CORE.connect library (Java, C, .NET)
What does validation require?
the XML schema definition of DatML/RAW
one or more survey definitions (DatML/SDF)
What is the result of a validation?
on CORE.server, a detailed validation report (DatML/RES), downloadable via CORE.connect and IDEV (on-line DCS)
on the client, an InspectionReport object
Survey definitions (DatML/SDF)
[Diagram: element tree of a survey definition. Elements shown: survey description, input data model, output data model, variable, message data group, case data group, reference data, classification. Occurrence markers legible in the diagram: variable group *, data type 1, value space ?, include +.]
Legend:
1 = 1 occurrence
? = 0-1 occurrences
* = 0-n occurrences
+ = 1-n occurrences
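The occurrence markers in the legend (1, ?, *, +) can be read as minimum and maximum counts. The following is a minimal sketch of that reading in Python; the names CARDINALITY and occurrence_ok are hypothetical and not part of DatML/SDF or CORE.connect.

```python
from typing import Dict, Optional, Tuple

# Occurrence markers from the slide legend, mapped to (minimum, maximum).
# None stands for "no upper bound" (the "n" in the legend).
CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "1": (1, 1),     # exactly one occurrence
    "?": (0, 1),     # zero or one occurrence
    "*": (0, None),  # zero to n occurrences
    "+": (1, None),  # one to n occurrences
}


def occurrence_ok(marker: str, count: int) -> bool:
    """Return True if `count` occurrences satisfy the given marker."""
    low, high = CARDINALITY[marker]
    return count >= low and (high is None or count <= high)
```

For example, occurrence_ok("+", 0) is False because "+" requires at least one occurrence, while occurrence_ok("?", 0) is True.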
Data validation objects
Value
Conformance with data type and value space
Occurrence: unconditional / independent
Of variables, variable groups and data groups
mandatory or optional
Occurrence: conditional / dependent
Of variables and variable groups
Conditions test existence and values of variables and variable groups
mandatory or optional or forbidden
Occurrence: instances
Minimum and maximum number of data and variable groups
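The value, unconditional occurrence and instance checks listed above could look roughly like this. It is a sketch in Python under stated assumptions: VariableDef, check_values and check_instances are hypothetical names, the record is a plain dictionary, and none of this reflects the actual DatML/SDF structures or the CORE.connect API.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional, Sequence


@dataclass(frozen=True)
class VariableDef:
    """Hypothetical variable definition (not the real DatML/SDF structure)."""
    name: str
    data_type: type
    value_space: Optional[frozenset] = None  # None = no restriction
    mandatory: bool = False


def check_values(defs: Sequence[VariableDef], record: Mapping[str, Any]) -> List[str]:
    """Check unconditional occurrence, data type and value space."""
    findings: List[str] = []
    for d in defs:
        if d.name not in record:
            if d.mandatory:
                findings.append(f"{d.name}: mandatory variable missing")
            continue
        value = record[d.name]
        if not isinstance(value, d.data_type):
            findings.append(f"{d.name}: expected {d.data_type.__name__}")
        elif d.value_space is not None and value not in d.value_space:
            findings.append(f"{d.name}: value {value!r} outside value space")
    return findings


def check_instances(group: str, count: int, minimum: int,
                    maximum: Optional[int] = None) -> List[str]:
    """Check the minimum and maximum number of instances of a group."""
    if count < minimum:
        return [f"{group}: {count} instance(s), at least {minimum} required"]
    if maximum is not None and count > maximum:
        return [f"{group}: {count} instance(s), at most {maximum} allowed"]
    return []
```

Collecting findings in a list rather than stopping at the first error mirrors the idea of a detailed validation report; the message texts are purely illustrative.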
Conditions
[Diagram: a case data group includes variable groups, and variable groups include further variable groups and variables. Relationship labels shown: includes, has (condition), targets, overrides.]
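One possible reading of the relationships on this slide is that a condition targets a variable or variable group and, when its test holds, overrides that element's default occurrence rule (mandatory, optional or forbidden). The sketch below illustrates that reading in Python; it is an assumption, not the documented DatML/SDF semantics, and Condition, effective_occurrence and check_occurrence are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Condition:
    """Hypothetical condition: tests the record, then overrides occurrence."""
    target: str                                   # name of the targeted element
    test: Callable[[Mapping[str, Any]], bool]     # existence/value test
    occurrence: str                               # "mandatory", "optional" or "forbidden"


def effective_occurrence(name: str, default: str,
                         conditions: Sequence[Condition],
                         record: Mapping[str, Any]) -> str:
    """Return the occurrence rule for `name`; the first matching condition wins."""
    for c in conditions:
        if c.target == name and c.test(record):
            return c.occurrence
    return default


def check_occurrence(name: str, default: str,
                     conditions: Sequence[Condition],
                     record: Mapping[str, Any]) -> Optional[str]:
    """Return a finding if the record violates the effective rule, else None."""
    rule = effective_occurrence(name, default, conditions, record)
    present = name in record
    if rule == "mandatory" and not present:
        return f"{name}: required but missing"
    if rule == "forbidden" and present:
        return f"{name}: present but forbidden"
    return None
```

In this sketch a hypothetical variable export_turnover could be optional by default, mandatory when another variable states that the business exports, and forbidden when it states the opposite.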
Some remarks on quality
“Give us the data you have and we will see what we can make of it”
Facilitates data provision for respondents
Reduces errors because respondents no longer have to transform or calculate data
Improved description, fewer interpretation errors
Alignment of statistical terms to business terms facilitates understanding of our definitions
Standardised documentation for software producers
Respondents no longer need to find out what data we want; they can rely on the statistics module to provide it.
System-provided data
Fewer human interactions = increased coherence and consistency of data over time and per respondent
Conclusion
.CORE is mainly an IT-based solution, but marketing activities, methodological improvements, intensive co-operation with the user community and political support are of equal importance
.CORE has regular data validation capabilities, but provides them to respondents and collectors in a new, standardised and survey-independent way
Automation, standards and methodological work contribute most to improving data quality
As a fully metadata-driven system, .CORE relies on good tool support in back-end procedures
Information and contacts:
eSTATISTIK.core:
http://www.statistik-portal.de
Information and support, +49 (0)611/75-2040
Mr. Jörg Decker, +49 (0)611/75-2442
Mr. Michael Schäfer, +49 (0)611/75-3652
Thank you very much for your attention!
Any questions?