ESS.VIP Validation

Post on 25-Feb-2016

Page 1: ESS.VIP Validation

Eurostat

ESS.VIP Validation

Objectives, scope & concepts

Angel Simón Delgado, EUROSTAT

Email: [email protected]

Page 2: ESS.VIP Validation

ESS.VIP VALIDATION

• VIP VALIDATION – First Phase
• VIP VALIDATION – Deliverables in First Phase
• ESS.VIP VALIDATION – Definition of the project
• Validation Service (EDIT)

Page 3: ESS.VIP Validation

ESS VIP-V First Phase

Overall goal: To develop validation solutions to be used by different production chains (horizontal integration) within the ESS (vertical integration)

Bottom-up approach:
• Extensive consultation of all possible stakeholders
• Participative management
• Business driven approach
• From pilot experience to general principles

Page 4: ESS.VIP Validation

ESS VIP-V First Phase

Scope, objectives and outputs

• Documentation / Standardisation:
a) Template and guidelines for process description
b) Template and guidelines for a standard documentation of the validation process

• Methodological analysis of Data Validation:
a) Typology of validation rules
b) Standard definition of validation levels
c) Standard formalised “syntax” (understandable by business users) to express validation rules

• Distribution of responsibilities in the production chain:
a) Guidelines to be used for the attribution of responsibility in the whole production chain (MSs and Eurostat) by the WG
b) Guidelines based on efficiency principles (Validation Corrections: “the sooner, the better”)
c) Preparing for IS/IT solutions and architecture

Page 5: ESS.VIP Validation

ESS VIP-V First Phase

Scope, objectives and outputs

• Towards IT/IS solutions and architecture:
a) Users’ requirements to develop new software that allows business users to input validation rules and the corresponding error messages in a shared Central Repository of Validation Rules. The new software should be able to generate the rules in the Validation syntax developed by the project.
b) Validation Architecture defining the elements and their relationships in an integrated validation system ("common platform") to be used by:
   • Internal users, in an appropriate IS architecture to facilitate horizontal integration of IT/IS systems
   • All stakeholders in the production chain

Page 6: ESS.VIP Validation

Contribution to the VISION

• More efficient production chain with clear attribution of responsibilities
• Standard definitions, guidelines and validation syntax
• Development of a common validation procedure
• Common solutions to be shared within the ESS

Page 7: ESS.VIP Validation

ESS.VIP Validation First Phase - Deliverables

Documentation
• 1.1 Inventory of documents
• 1.2 Analysis of inventory
• 1.5 Inventory of validation rules
• 1.6 Inventory of error messages

Methodology
• 1.4 Validation typologies
• 2.4 Analysis of validation typologies
• 2.5 Levels of validation

Examples
• 1.3 Validation & statistical processes
• 2.4 Analysis of validation typologies
• 3.1 Validation rules by typology
• 3.3 Error messages

Solutions
• 3.2 Validation syntax
• 4.1 Functional specifications for GUI

Templates & Guidelines
• 2.1 Documentation of validation process
• 2.2 Documentation of statistical process
• 3.3 Error messages
• 3.4 Selection of validation rules
• 3.5 Improvement actions
• 3.6 Attribution of responsibilities

Page 8: ESS.VIP Validation

Deliverables: Validation levels

Data
• Within an organisation
   • Within a domain
      • From the same source
         • Same dataset
            • Same file: Level 0: Format & file structure; Level 1: Cells, records, file
            • Between files: Level 2: Revisions and Time series
         • Between datasets: Level 2: Between correlated datasets
      • From different sources: Level 3: Mirror checks
   • Between domains: Level 4: Consistency checks
• Between different organisations: Level 5: Consistency checks

Validation complexity increases with the level.
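The validation levels on this slide can be read as a decision hierarchy: the wider the scope of data a check compares, the higher its level. The following is a minimal sketch of that reading, not part of the project deliverables. The `Scope` class, its field names and the `validation_level` function are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """What a check compares against (hypothetical field names)."""
    same_organisation: bool = True
    same_domain: bool = True
    same_source: bool = True
    same_dataset: bool = True
    same_file: bool = True
    structural_only: bool = False  # True for format / file-structure checks


def validation_level(scope: Scope) -> int:
    """Return the validation level, walking the hierarchy from the outside in."""
    if not scope.same_organisation:
        return 5  # consistency checks between different organisations
    if not scope.same_domain:
        return 4  # consistency checks between domains
    if not scope.same_source:
        return 3  # mirror checks
    if not scope.same_dataset or not scope.same_file:
        return 2  # correlated datasets, or revisions and time series
    return 0 if scope.structural_only else 1


print(validation_level(Scope(structural_only=True)))  # 0
print(validation_level(Scope(same_file=False)))       # 2
print(validation_level(Scope(same_source=False)))     # 3
```

The function tests the widest scope first, so a check that crosses organisations is always Level 5 regardless of the other flags.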

Page 9: ESS.VIP Validation

Deliverables: Typology of validation rules

File Structure
• Filename
• File type
• Delimiters
• Format

1 file checks
• Type
• Length
• Presence
• Allowed character
• Uniqueness
• Range

1n file checks
• Consistency
• Control
• Conditional

>1 file checks
• Referential integrity
• Code list
• Cardinality
• Mirror
• Time series
• Revised data integrity
• Model-based consistency
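To make the single-file part of the typology concrete, here is a small sketch of the type, length, presence, allowed character, uniqueness and range checks applied to a few records. The record layout (`id`, `country`, `month`), the limits used and the function name `single_file_checks` are assumptions made for this example; they do not come from the project.

```python
from collections import Counter

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": "A1", "country": "ES", "month": 3},
    {"id": "A2", "country": "FR", "month": 13},
    {"id": "A2", "country": "", "month": 7},
]


def single_file_checks(rows):
    """Return (row index, failed check type) pairs for one file."""
    failures = []
    id_counts = Counter(row["id"] for row in rows)
    for index, row in enumerate(rows):
        # Type check, then range check on the same field.
        if not isinstance(row["month"], int):
            failures.append((index, "type"))
        elif not 1 <= row["month"] <= 12:
            failures.append((index, "range"))
        # Presence, length and allowed-character checks on the country code.
        if row["country"] == "":
            failures.append((index, "presence"))
        elif len(row["country"]) != 2:
            failures.append((index, "length"))
        elif not row["country"].isalpha():
            failures.append((index, "allowed character"))
        # Uniqueness check on the identifier.
        if id_counts[row["id"]] > 1:
            failures.append((index, "uniqueness"))
    return failures


print(single_file_checks(records))
```

Each failure carries the name of the check type, which is the kind of information a "Rule type ID" in an error message could point to.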

Page 10: ESS.VIP Validation

Deliverables: Guidelines for the attribution of responsibility of validation activities in the whole production chain

• Guidelines for the allocation of responsibility for the implementation of validation rules within the ESS, based on an AGREEMENT between Eurostat and the NSIs with periodic performance reviews from both sides

• Proposal for a generic business architecture of data validation:

Step 1: Data preparation by the NSIs
Step 2: Transmission of data and validation report
Step 3: Loading data to production database
Step 4: Additional processing & dissemination

Validation Controls – Different actors – Different responsibilities

Page 11: ESS.VIP Validation

Deliverables: Standard template for error/warning messages

Standard templates for error/warning messages and for validation report

Validation report structure:

Header
• Time stamp
• User ID
• Data checked (dataset name…)

Body
• Rules applied
• Total failures
   • No. errors
   • No. warnings
• Total records
• Records failed
• Sum of weights
• Maximum admissible error weight
• Rate of acceptance
• Maximum possible amount of error
• Rate of performance

Footer
• Error/warning messages

Error/warning message structure:
• Rule ID
• Severity
• Rule type ID
• Message text
• Action
• Failing data
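The two templates on this slide map naturally onto two record types. The sketch below shows one possible shape, assuming Python dataclasses. The class and field names are illustrative rather than the project's specification. The weights and rates listed in the report body are left out because the slide does not say how they are computed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Message:
    """Error/warning message, one field per item in the template."""
    rule_id: str
    severity: str  # assumed values: "error" or "warning"
    rule_type_id: str
    message_text: str
    action: str
    failing_data: str


@dataclass
class ValidationReport:
    # Header
    time_stamp: datetime
    user_id: str
    data_checked: str
    # Body (counts only; weights and rates are not modelled here)
    rules_applied: int
    total_records: int
    records_failed: int
    # Footer
    messages: List[Message] = field(default_factory=list)

    @property
    def no_errors(self) -> int:
        return sum(1 for m in self.messages if m.severity == "error")

    @property
    def no_warnings(self) -> int:
        return sum(1 for m in self.messages if m.severity == "warning")

    @property
    def total_failures(self) -> int:
        return self.no_errors + self.no_warnings


report = ValidationReport(
    time_stamp=datetime.now(timezone.utc),
    user_id="demo_user",
    data_checked="demo_dataset",
    rules_applied=2,
    total_records=10,
    records_failed=2,
    messages=[
        Message("R1", "error", "range", "Month out of range", "Correct and resend", "month=13"),
        Message("R2", "warning", "presence", "Country missing", "Please confirm", "country=''"),
    ],
)
print(report.total_failures, report.no_errors, report.no_warnings)  # 2 1 1
```

Deriving the failure counts from the footer messages keeps the body and the footer of the report consistent by construction.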

Page 12: ESS.VIP Validation

Deliverables: VALS - Validation syntax

Standard syntax for validation language
• To define a meta-language for the domain of statistical data validation to express, document and communicate validation rules
• Trade-off between Human-understandable and Computer-parseable language
• Implementation through Graphical User Interface to support business users to input and maintain validation rules and rule-sets

Types and examples:
• Type check: validate ( type(A1.Rcount)='TEXT' )
• Range & math: validate ( Table2.C_5 between (Table2.C_5_1 + Table2.C_5_2 + Table2.C_5_3 - tolerance) and (Table2.C_5_1 + Table2.C_5_2 + Table2.C_5_3 + tolerance) )
• Code list check: validate ( match_codelist (A1.Quarter, CL_QRT) )
• Range check: validate ( H.HB050 between 1 and 12 )
• Rules & Metarules: validate ( age_range_rule.result = true and country_codes_rule.result = true )
• …
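The slides do not define the semantics of VALS, so the following Python sketch only mirrors what the example rules on this slide appear to check. It assumes that `between` is inclusive at both ends (as in SQL) and that a metarule is the logical AND of other rule results. The contents of `CL_QRT` and all function names are hypothetical.

```python
# Assumed content of the CL_QRT code list; the real list is not given in the slides.
CL_QRT = {"Q1", "Q2", "Q3", "Q4"}


def between(value, low, high):
    """Inclusive range test (assumed semantics of VALS 'between')."""
    return low <= value <= high


def range_and_math(c_5, components, tolerance):
    """Table2.C_5 must equal the sum of its components within a tolerance."""
    total = sum(components)
    return between(c_5, total - tolerance, total + tolerance)


def match_codelist(value, codelist):
    """Code list check: the value must belong to the code list."""
    return value in codelist


def range_check(hb050):
    """H.HB050 must lie between 1 and 12."""
    return between(hb050, 1, 12)


def metarule(*rule_results):
    """A metarule holds only if every referenced rule holds."""
    return all(rule_results)


print(range_and_math(100, [40, 35, 24], tolerance=1))  # True
print(match_codelist("Q5", CL_QRT))                    # False
print(metarule(range_check(6), match_codelist("Q2", CL_QRT)))  # True
```

Writing the rules as small named functions keeps them close to the business-readable form of the syntax while remaining directly executable.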

Page 13: ESS.VIP Validation

Proposed approach

ESS.VIP VALIDATION - Package 1: IMPLEMENTATION

Goals
• Implementation of the methodological developments of VIP-V Phase I in the statistical domains/WGs
• Maintenance and refinement of standards developed
• User requirements for further developments
• Evaluation, monitoring and reporting

Page 14: ESS.VIP Validation

Proposed approach

ESS.VIP VALIDATION - Package 2: MICRO DATA

Goals
• Vertical integration of the micro data validation within the ESS production processes, taking into account the results of the first phase of ESS VIP-V
• Extension of the functional specifications to apply to micro data validation
• Integrated solutions for micro data validation

Page 15: ESS.VIP Validation

Proposed approach

ESS.VIP VALIDATION - Package 3: SOLUTIONS

Goals
• Adaptation of existing validation tools to the functional specifications issued from ESS VIP-V Phase I
• Deployment of validation solutions to MSs
• Distribution of validation rules in agreed language
• Building-Blocks in an adequate web services architecture
• Provision of web services validation solutions to be used by Member States before transmission to Eurostat

Page 16: ESS.VIP Validation

Proposed approach

ESS.VIP VALIDATION - Package 4: EXTENSIONS AND GENERAL COORDINATION

Goals
• Overall coordination of the project
• Coherence of validation approaches within ESS
• Implementation of the meta-language within ESS
• Analysis of links with other VIP & ESS VIP projects
• Good practices identification
• More sophisticated validation solutions:
   • Longitudinal validation
   • Mirror checks validation
   • ESS wide shared final warehouses

Page 17: ESS.VIP Validation

Elements in a validation system: ESS production chain

[Diagram. Labels: Member States (statistical production and validation); Statistical domain 1, …, n; Single Entry Point; Web Interface; Validation Services; Eurostat]

Page 18: ESS.VIP Validation

Validation Service

[Diagram. Labels: Data; Data Definition registry; Validation rules repository; Validation Service; Errors; Validation report; Metrics]
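One plausible reading of this diagram is that the Validation Service takes the data, looks up its structure in the Data Definition registry and its rules in the Validation rules repository, and returns errors, a validation report and metrics. The sketch below illustrates that reading only. The direction of the flows is an assumption, and the registry contents, rule identifiers and the `failure_share` metric are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Rule = Callable[[Record], bool]

# Hypothetical in-memory stand-ins for the registry and the repository.
DATA_DEFINITIONS: Dict[str, List[str]] = {"demo_dataset": ["country", "month"]}
RULES_REPOSITORY: Dict[str, Dict[str, Rule]] = {
    "demo_dataset": {
        "R1_month_range": lambda r: 1 <= r["month"] <= 12,
        "R2_country_present": lambda r: bool(r["country"]),
    }
}


def validation_service(dataset: str, data: List[Record]):
    """Validate records and return (errors, report, metrics)."""
    fields = DATA_DEFINITIONS[dataset]
    rules = RULES_REPOSITORY[dataset]
    errors: List[Tuple[int, str]] = []
    for index, record in enumerate(data):
        # Structure check against the data definition comes first.
        if any(name not in record for name in fields):
            errors.append((index, "structure"))
            continue
        # Then every rule registered for the dataset is applied.
        for rule_id, rule in rules.items():
            if not rule(record):
                errors.append((index, rule_id))
    failed = {index for index, _ in errors}
    report = {
        "data_checked": dataset,
        "rules_applied": len(rules),
        "total_records": len(data),
        "records_failed": len(failed),
    }
    metrics = {"failure_share": len(failed) / len(data) if data else 0.0}
    return errors, report, metrics


sample = [
    {"country": "ES", "month": 3},
    {"country": "", "month": 14},
    {"month": 2},
]
errors, report, metrics = validation_service("demo_dataset", sample)
print(errors)
print(report["records_failed"])  # 2
```

Keeping the rules and data definitions outside the service function reflects the separation the diagram draws between the service and its repositories.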

Page 19: ESS.VIP Validation

Validation architecture elements – Global overview

[Diagram. Labels: VALIDATION RULESETS MAINTENANCE; Repository of rulesets and their metadata; VALIDATION SERVICE; SYNTAX ANALYSER; SYNTAX SPECIFICATIONS; Data; Data Structure; Validation Report; SEP (Single Entry Point)]

Page 20: ESS.VIP Validation

Current actions and next steps

• Assessment of Eurostat domains in the field of validation
• Set of standard documentation for domain managers for harmonisation of communication with data providers
• Task Force to:
   • Identify best practices in ESS
   • Advise on implementation in the ESS
   • Optimise the validation process

Page 21: ESS.VIP Validation

Current actions and next steps

• Functional specifications for Validation services (EDIT) to be adapted to the findings of the project
• Tools development:
   • System to create/maintain a central repository of validation rules
   • Development and/or adaptation of IT Tools (EDIT, eDAMIS)
   • Validation Quality Metrics

Page 22: ESS.VIP Validation

Thank you

More information on:
Email: [email protected]
(only from EUROSTAT): http://www.cc.cec/wikis/display/ESTATmethodology/ESS.VIP+VALIDATION