data quality definitions


Upload: michael-kuesters

Post on 22-Jan-2015


DESCRIPTION

What Data Quality is all about...

TRANSCRIPT

Page 1: Data Quality Definitions

Page 1

Oliver Grill, Michael Küsters

Data Quality Management Definitions

The Characteristics of Data Quality

Page 2: Data Quality Definitions


What is “Data Quality”?

Data Quality stands for:
1. Harmonized information need and provision
2. Mutual understanding of data capability
3. Trustworthy and credible information

Data Quality Characteristics:
- Accurate
- Precise
- Relevant
- Complete
- Consistent
- Timely
- Transparent

Page 3: Data Quality Definitions


The Characteristic “Accuracy”

Data Accuracy is the degree to which a data object matches the real-world object or event it describes.

Data accuracy is measured as the reciprocal of the maximum gap between data and reality. [high is good]

Accuracy stands for:
- Good fit between the data and reality
- The ability to draw correct conclusions from data
- Business processes that match reality

Examples of Data Accuracy issues:
- Frank Meyer is recorded as “Fritz Meier” in the database.
- An incident is reported as €23m when the loss was €12k.
- The amount invoiced does not represent the customer’s usage.
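The accuracy measure on this slide can be tried out on numeric data. The sketch below is one possible reading, assuming the "gap" is the absolute difference between a recorded value and the real value; the function name and the sample figures are illustrative and not part of the original deck.

```python
import math

def accuracy(recorded, actual):
    """Reciprocal of the largest absolute gap between recorded and real values."""
    max_gap = max(abs(r - a) for r, a in zip(recorded, actual))
    # A perfect match has no gap; this sketch treats its accuracy as unbounded.
    return math.inf if max_gap == 0 else 1.0 / max_gap

# Hypothetical incident losses in euros: the first one is reported as 23m instead of 12k.
reported = [23_000_000, 5_000]
real = [12_000, 5_000]
score = accuracy(reported, real)  # very small, because one gap is huge
```

Because the measure uses the maximum gap, a single badly wrong record dominates the score, which fits the slide's "high is good" reading.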

Page 4: Data Quality Definitions


The Characteristic “Precision”

Data Precision is the closeness between all possible interpretations of a data object.

Data precision is measured as the reciprocal of the maximum distance between all applicable data interpretations. [high is good]

Precision stands for:
- A close link between desired and offered information
- The ability to pinpoint decisions based on data
- Lean business processes

Examples of Data Precision issues:
- Frank Meyer lives in Bonn, or Cologne? Or was that Jon Myers?
- This billing incident was caused by Mediation... I think...
- Why do we charge the customer 2 minutes for a 59-second call?
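To make the precision measure concrete, here is a minimal sketch that assumes every interpretation of a data object can be expressed as a number, so that "distance" is the absolute difference between two interpretations. The function name and the example values are assumptions made for illustration.

```python
from itertools import combinations
import math

def precision(interpretations):
    """Reciprocal of the largest distance between any two interpretations."""
    max_distance = max(
        (abs(a - b) for a, b in combinations(interpretations, 2)),
        default=0,  # a single interpretation leaves no room for disagreement
    )
    return math.inf if max_distance == 0 else 1.0 / max_distance

# Hypothetical: a 59-second call is read as 59 s by one system and as 120 s by another.
billed_seconds = [59, 120]
score = precision(billed_seconds)  # 1 / 61
```

Non-numeric interpretations (for example, two candidate cities) would need a distance function of their own, which this sketch does not attempt.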

Page 5: Data Quality Definitions


The Characteristic “Relevance”

Data Relevance is the closeness between data consumer need and data provider output.

Data relevance is measured as the percentage of all data required divided by all data provided. [100% is best]

Relevance stands for:
- Data that helps you know what you want to know
- The ability to use data with maximum efficiency
- Not having to sort through information you don’t need

Examples of Data Relevance issues:
- The Revenue Assurance report also tells you about the weather!
- A CSR asks the cell phone customer if they have a microwave.
- You need to fill in a 7-page form to apply for a tariff change.
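The relevance ratio can be sketched with sets of field names. This assumes "required divided by provided" means the share of provided fields that the consumer actually requires, which is the reading under which 100% is the best possible value; the field names are made up for the example.

```python
def relevance(required, provided):
    """Percentage of provided fields that the consumer actually requires."""
    provided = set(provided)
    if not provided:
        return 0.0  # nothing provided, so nothing relevant was delivered
    return 100.0 * len(provided & set(required)) / len(provided)

# Hypothetical report: two needed fields plus two that nobody asked for.
required_fields = {"revenue", "leakage"}
report_fields = {"revenue", "leakage", "weather", "microwave_owner"}
score = relevance(required_fields, report_fields)  # 50.0
```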

Page 6: Data Quality Definitions


The Characteristic “Completeness”

Data Completeness is the extent to which the data consumer’s need is met.

Data completeness is measured as the percentage of data available divided by the data required. [100% is best]

Completeness stands for:
- Data that does not leave any open questions
- The ability to make a good decision based on available data
- Closeness between “need to know” and what the data tells you

Examples of Data Completeness issues:
- We cannot tell how many cell phone contracts Egon Huber has.
- The CC application does not provide a “Call back wanted” field.
- A summary report includes projects that did not report status!
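Completeness is the mirror image of relevance: it asks how much of what is required is actually available. A minimal sketch, assuming data needs can be listed as field names (all names below are hypothetical):

```python
def completeness(required, available):
    """Percentage of required fields that are actually available."""
    required = set(required)
    if not required:
        return 100.0  # no stated need, so nothing is missing
    return 100.0 * len(required & set(available)) / len(required)

# Hypothetical customer record: two of the four required fields are missing.
required_fields = {"name", "contract_count", "call_back_wanted", "status"}
available_fields = {"name", "status", "address"}
score = completeness(required_fields, available_fields)  # 50.0
```

Extra fields such as `address` do not raise the score; they only affect relevance.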

Page 7: Data Quality Definitions


The Characteristic “Consistency”

Data Consistency is the synchronization of data objects across the company.

Data consistency is measured as the reciprocal of the ratio of distinct data objects per described object or event. [100% is best]

Consistency stands for:
- Data in harmony across the company
- The ability to trust in data regardless of source
- Identical information available to all processes and units

Examples of Data Consistency issues:
- We send Mr. Smith’s invoices to “Smith” and ads to “Schmitz”.
- Asking DWH or SAP for revenue yields different numbers.
- Mr. Kim defines “churn” as cancel/total and Mr. Jones as cancel/new.
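One way to read the consistency measure is as the number of described objects divided by the number of distinct data objects that represent them. The sketch below follows that reading; the dictionary layout, the function name, and the sample customers are assumptions, and each described object is assumed to have at least one representation.

```python
def consistency(representations):
    """Described objects divided by distinct data objects, as a percentage.

    `representations` maps each described object to the values that the
    various systems hold for it.
    """
    distinct = sum(len(set(values)) for values in representations.values())
    if distinct == 0:
        return 100.0  # nothing recorded, so nothing can disagree
    return 100.0 * len(representations) / distinct

# Hypothetical: billing and marketing disagree on one of two customers.
customer_names = {
    "customer_1": ["Smith", "Schmitz"],
    "customer_2": ["Jones", "Jones"],
}
score = consistency(customer_names)  # 2 objects / 3 distinct values, about 66.7
```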

Page 8: Data Quality Definitions


The Characteristic “Transparency”

Data Transparency is the ability to trace data back to its origin and find out its real-world meaning.

Data transparency is measured as the maximum traceable distance as a percentage of total processing steps. [100% is best]

Transparency stands for:
- Trustworthy data in the entire data supply chain
- The ability to connect data with its real meaning
- Real accountability for data objects

Examples of Data Transparency issues:
- We can’t tell why Frank Müller is now “Udo Huber” in the DB!
- A report contains a figure which nobody can explain.
- Project leaders get away with reporting “green” when it’s “red”!
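The transparency measure can be sketched by walking a processing chain backwards from the final figure until the trail breaks. This assumes each processing step is simply flagged as documented or undocumented; the flag list and function name are illustrative only.

```python
def transparency(steps_documented):
    """Share of processing steps that can be traced back without a break.

    `steps_documented` is ordered from the final output back to the origin;
    each entry says whether that step's lineage is documented.
    """
    if not steps_documented:
        return 100.0  # no processing steps, so the data is its own origin
    traceable = 0
    for documented in steps_documented:
        if not documented:
            break  # the trail ends at the first undocumented step
        traceable += 1
    return 100.0 * traceable / len(steps_documented)

# Hypothetical chain: report, aggregation, undocumented transform, source load.
lineage = [True, True, False, True]
score = transparency(lineage)  # 50.0
```

The documented step behind the break does not count, because it cannot be reached from the report.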

Page 9: Data Quality Definitions


The Characteristic “Timeliness”

Data Timeliness is the availability of data at the time it needs to be utilized.

Data timeliness is measured as the percentage of processing time attributed to waiting for data. [0% is best]

Timeliness stands for:
- Data that is available without delay
- The ability to know what you need, when you need it
- Smooth information flow: ‘Data Delayed’ is ‘Data Denied’!

Examples of Data Timeliness issues:
- The agenda is distributed during the Telco!
- Customers choose a competitor before credit is approved!
- Receiving a “budget exceeded” SMS after you went over the limit!
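Unlike the other measures, timeliness is best at 0%. A minimal sketch, assuming waiting time and total processing time are both known in seconds (the credit-approval figures are invented):

```python
def timeliness(waiting_seconds, total_seconds):
    """Percentage of processing time spent waiting for data (0% is best)."""
    if total_seconds <= 0:
        raise ValueError("total processing time must be positive")
    return 100.0 * waiting_seconds / total_seconds

# Hypothetical credit approval: 45 of 60 minutes are spent waiting for data.
score = timeliness(waiting_seconds=45 * 60, total_seconds=60 * 60)  # 75.0
```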