Daniela Cristiana Docan I 6th Sept. I INSPIRE Conference 2017, Strasbourg

EEA Data Quality Management supporting INSPIRE implementation

Data Quality in INSPIRE

INSPIRE Technical Guidelines use ISO 19157 Geographic Information – Data Quality

• Data Quality Elements mentioned in INSPIRE TG

ISO 19157 Geographic Information – Data Quality

Data Quality in INSPIRE

INSPIRE TGs cover:

1. Data quality elements/sub-elements
2. Data quality measures (tests to be applied on the dataset)
3. Minimum data quality requirements/conformance quality level

Data Quality in INSPIRE: Protected Sites TG

Recommendations:

• DQ elements and sub-elements
• Corresponding DQ measures
• Minimum data quality requirements (acceptance criteria or conformance quality level)

Data Quality in INSPIRE: Protected Sites TG

Item: fields, records, values, features, relationships, files in the dataset package

Data Quality in INSPIRE: Protected Sites TG

INSPIRE data quality requirements

INSPIRE: Completeness/Omission

INSPIRE: Rate of missing items

INSPIRE: No recommendation/constraints

EEA’s Data Flow stages

•Source: EEA Common Workspace/Generic QA/QC

• Guidelines for Reporting Obligations, XML schema, database schema, quality control checks
• A priori DQ requirements: absolute positional accuracy, 10 m

“what we want”

• Automatic and manual quality checks
• Conformance test (minimum data quality requirements)
• Metadata and/or standalone data quality report
• A posteriori DQ results/values

“what we get”

• Run automatic quality checks [QA scripts – XQuery]
• Automatic QA report for Member States (MS)

• ETL (Extract Transform Load) tools
• Automatic and manual quality checks

• EEA’s automatic quality report

• Source: Eionet Central Data Repository (CDR)

equivalent to measures in ISO standard

• How will EEA’s quality checks connect with ISO elements & measures?

Requirements in INSPIRE:

• Element: Completeness-Omission
• Standardised measure: Rate of missing items [error rate] (e.g. real, percentage, ratio)
• Minimum data quality requirements: None

EEA’s CDF quality checks:

• ✓ Mandatory values
• ✓ User-defined data quality measure: all records must have the SITE_CODE field filled
• Conformance test: the acceptance criterion is [0%] errors in the dataset
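The pairing above (rate of missing items as the measure, 0% errors as the acceptance criterion) can be sketched in a few lines of Python. This is only an illustration: the slides state that EEA's QA scripts are written in XQuery, and the function name, record layout, and sample site codes below are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class CompletenessResult:
    total: int          # number of records checked
    missing: int        # records with the mandatory field empty
    error_rate: float   # rate of missing items, as a percentage
    conformant: bool    # outcome of the conformance test


def check_mandatory_field(records, field="SITE_CODE", max_error_rate=0.0):
    """Compute the rate of missing items for one mandatory field.

    A value counts as missing when it is absent, None, or blank.
    `max_error_rate` is the acceptance criterion in percent (0% on the slide).
    """
    total = len(records)
    missing = sum(1 for r in records if not str(r.get(field) or "").strip())
    error_rate = 100.0 * missing / total if total else 0.0
    return CompletenessResult(total, missing, error_rate, error_rate <= max_error_rate)


# Hypothetical sample records; the codes and names are invented.
records = [
    {"SITE_CODE": "RO0001", "SITE_NAME": "Site A"},
    {"SITE_CODE": "", "SITE_NAME": "Site B"},
    {"SITE_CODE": None, "SITE_NAME": "Site C"},
    {"SITE_CODE": "RO0004", "SITE_NAME": "Site D"},
]

result = check_mandatory_field(records)
print(f"{result.missing}/{result.total} missing "
      f"({result.error_rate:.1f}%), conformant: {result.conformant}")
```

The error rate is the a posteriori result ("what we get"); comparing it with `max_error_rate` is the conformance test against the a priori requirement ("what we want").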

Source: Eionet Central Data repository (CDR)

Data Quality Rule Registry (DQRR)

• [catalogue of standardised and user-defined data quality measures]

= Measures in ISO 19157

= Elements in ISO 19157

Data Quality Rule Registry (DQRR)

• Data quality checks: different points of view

1. “Minimal mapping unit”

[the smallest size of area allowed to be represented in a given data set]

Topological consistency or conceptual consistency?
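Whichever ISO element the check is finally assigned to, the test itself is simple to state: flag every polygon whose area is below the minimal mapping unit. A minimal sketch, assuming polygons are given as rings of (x, y) coordinates in a projected, metre-based reference system; the 1 ha threshold and the feature names are hypothetical.

```python
def polygon_area(ring):
    """Planar area of a simple polygon via the shoelace formula.

    `ring` is a list of (x, y) vertices in metres; it may or may not
    repeat the first vertex at the end.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def below_mmu(polygons, mmu_m2):
    """Return the names of polygons smaller than the minimal mapping unit."""
    return [name for name, ring in polygons.items() if polygon_area(ring) < mmu_m2]


# Hypothetical features: A is 200 m x 100 m, B is 50 m x 50 m.
polygons = {
    "A": [(0, 0), (200, 0), (200, 100), (0, 100)],
    "B": [(0, 0), (50, 0), (50, 50), (0, 50)],
}

flagged = below_mmu(polygons, mmu_m2=10_000)  # assumed threshold: 1 ha
print(flagged)
```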

2. “C34 – Coordinate accuracy”

“These are required to be in the ETRS89 (2D)-EPSG:4258 coordinate reference system, with a 10 m accuracy. Hence a check is required to ensure that, when coordinates are reported, each coordinate is to 4 decimal places, adhering to the 10 m accuracy required.”

CDF’s guideline

The number of decimal places for decimal-degree coordinates

☺ Precision or resolution, not absolute positional accuracy
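The distinction can be made concrete with a short sketch. Counting decimal places only measures how finely a coordinate is written down; converting that step to metres shows the resolution, which says nothing about how far the reported point lies from its true position. The figure of about 111 km per degree is a rough spherical approximation, and the function names and sample value are illustrative assumptions.

```python
import math
from decimal import Decimal

# Rough length of one degree of latitude in metres (spherical approximation).
METRES_PER_DEGREE = 111_320.0


def decimal_places(value: str) -> int:
    """Number of decimal places in a coordinate as reported (kept as text,
    because converting to float would drop trailing zeros)."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def resolution_in_metres(places: int, latitude_deg: float = 0.0):
    """Ground size of the smallest step expressible with `places` decimals.

    Returns (north-south, east-west) in metres; the east-west step
    shrinks with the cosine of the latitude.
    """
    step = 10.0 ** -places
    north_south = step * METRES_PER_DEGREE
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


reported_latitude = "48.5839"  # illustrative value
places = decimal_places(reported_latitude)
ns, ew = resolution_in_metres(places, float(reported_latitude))
print(f"{places} decimal places: about {ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions four decimal places correspond to a step of roughly 11 m north-south, which is why the rule is associated with "10 m". A coordinate can still pass this check while being hundreds of metres away from the true location.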

Source: www.ncetm.org.uk

Logical consistency-Conceptual consistency or Format consistency?

The error vector for a single point (Source: Weir et al., 2001, p. 413)
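By contrast with a decimal-place count, absolute positional accuracy compares a reported position with a reference position accepted as true. The sketch below uses the usual planar definition of the error vector (difference in x, difference in y, and its length); it is not taken from Weir et al., and the coordinates are hypothetical values in a metre-based projected system.

```python
import math


def error_vector(measured, reference):
    """Return (dx, dy, length) of the error vector between a measured
    point and its reference position, both as (x, y) in metres."""
    dx = measured[0] - reference[0]
    dy = measured[1] - reference[1]
    return dx, dy, math.hypot(dx, dy)


dx, dy, length = error_vector(measured=(1003.0, 2004.0), reference=(1000.0, 2000.0))
within_10m = length <= 10.0  # a priori requirement quoted on the slides
print(dx, dy, length, within_10m)  # 3.0 4.0 5.0 True
```

Evaluating this needs an independent reference dataset, which is the practical difference from a format check on the number of decimals.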

Conclusions

INSPIRE Data Specification requirements/recommendations

Data Specifications are not restrictive on data quality (e.g. the elements to be covered, the measures (tests) to be applied, or the minimum data quality requirements)

Consistency in defining data quality elements and measures (tests) across different annexes or/and themes

(e.g. Annex III - Topological consistency and Temporal consistency and validity)

• Conclusions

EEA’s QA/QC workflow

• Fulfils the INSPIRE requirements on data quality
• New components of data quality will be covered/improved (e.g. absolute positional accuracy)
• The Data Quality Rule Registry (DQRR) project promotes the “interoperability of the data quality” by proposing:

• Common criteria to categorise data quality checks across EEA’s production stages
• Harmonised DQ terminology across different Core Data Flows (e.g. record uniqueness, duplicate elements, duplicates entries, duplicate value, duplicities, and uniqueness of primary key)
• Assigning/linking the existing DQ checks to ISO quality elements and sub-elements
• Harmonised Standalone Data Quality Reports
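The terminology point lends itself to a small illustration: the six labels quoted above all describe the same kind of check, so a registry can map each of them to one canonical term before checks are compared across data flows. This is a hedged sketch only; the canonical term "uniqueness", the data flow names, and the structure are assumptions and do not describe the actual DQRR.

```python
from collections import defaultdict

# Labels quoted on the slide, normalised to lower case.
UNIQUENESS_LABELS = {
    "record uniqueness",
    "duplicate elements",
    "duplicates entries",
    "duplicate value",
    "duplicities",
    "uniqueness of primary key",
}
CANONICAL_TERM = "uniqueness"  # hypothetical harmonised name


def harmonise(label: str) -> str:
    """Map a data-flow-specific check label to its harmonised term."""
    key = " ".join(label.lower().split())
    return CANONICAL_TERM if key in UNIQUENESS_LABELS else key


# Hypothetical checks collected from three Core Data Flows.
checks = [
    ("CDF-A", "Duplicate value"),
    ("CDF-B", "duplicities"),
    ("CDF-C", "Mandatory values"),
]

grouped = defaultdict(list)
for flow, label in checks:
    grouped[harmonise(label)].append(flow)

print(dict(grouped))
```

Once labels are harmonised, each canonical term can be linked to a single ISO 19157 element and sub-element instead of repeating the assignment per data flow.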

•Q&A

Thank you,

Daniela Cristiana Docan

daniela.docan@eea.europa.eu
