Dallas Data Brewery Meetup #2: Data Quality Perception

Posted on 18-Dec-2014

Category:

Technology

DESCRIPTION

Brief, introductory slides for the second Dallas Data Brewery meetup. Topic: Data Quality Perception.

TRANSCRIPT

Data Quality Perception

data brewery

Dallas Data Brewery, June 2013

Topic

■ What is "high quality data"?

■ What are data quality expectations? What expectations do you, or people or businesses you know, have?

■ Business issues and data quality: how do you deal with them?

■ What happens when you ignore it?

What is data quality?

Dimensions

■ completeness – data provided

■ accuracy – reflecting real world

■ credibility – regarded as true

■ timeliness – up-to-date

■ consistency – matching facts across datasets

■ integrity – valid references between datasets

... and there are more
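
Two of the dimensions above, integrity and timeliness, can be turned into simple numeric probes. The following is only a minimal sketch: the record layout, the field names (`id`, `supplier_id`, `updated`), and the sample data are assumptions made for illustration and do not come from the slides.

```python
from datetime import date, timedelta


def integrity(children, parents, key):
    """Share of child records whose `key` refers to an existing parent id."""
    if not children:
        return 1.0
    parent_ids = {p["id"] for p in parents}
    valid = sum(1 for c in children if c.get(key) in parent_ids)
    return valid / len(children)


def timeliness(records, today, max_age_days):
    """Share of records updated no more than `max_age_days` before `today`."""
    if not records:
        return 1.0
    limit = today - timedelta(days=max_age_days)
    fresh = sum(1 for r in records if r["updated"] >= limit)
    return fresh / len(records)


# Hypothetical sample data, invented for illustration only.
suppliers = [{"id": 1}, {"id": 2}]
contracts = [
    {"supplier_id": 1, "updated": date(2013, 6, 1)},
    {"supplier_id": 2, "updated": date(2013, 1, 15)},
    {"supplier_id": 9, "updated": date(2013, 5, 20)},     # dangling reference
    {"supplier_id": None, "updated": date(2012, 11, 3)},  # missing reference
]

integrity_score = integrity(contracts, suppliers, "supplier_id")
timeliness_score = timeliness(contracts, date(2013, 6, 30), 90)
print(f"integrity: {integrity_score:.0%}")
print(f"timeliness: {timeliness_score:.0%}")
```

Each probe returns a share between 0 and 1, so scores for different dimensions can be reported side by side.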

Fallacies

■ “good data are error-free and valid”

■ “improving quality means cleansing”

■ “it is an IT problem”

■ “it can be fixed”

Short Story: Completeness

Open Public Procurements

from this...

... to this:

http://tendre.sme.sk

[Chart: percentage (0% to 100%) by period, 2005-3 to 2010-9; scale annotations: none, better, have it all]

Quality measure

completeness: 55%

What percentage of the field is filled and successfully processed?
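
The completeness question can be answered with a small probe that counts a value only when it is both present and parseable. This is a sketch under stated assumptions: the `amount` field, the `parse_amount` helper, and the sample records are hypothetical and are not taken from the procurement data behind the slides.

```python
def parse_amount(raw):
    """Hypothetical parser: return a float, or None if the value cannot be processed."""
    if raw is None:
        return None
    # Assumes a decimal comma and optional spaces as thousands separators.
    text = raw.strip().replace(" ", "").replace(",", ".")
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def completeness(records, field, parse):
    """Share of records where `field` is filled and `parse` succeeds."""
    if not records:
        return 0.0
    ok = sum(1 for r in records if parse(r.get(field)) is not None)
    return ok / len(records)


# Hypothetical sample records, invented for illustration only.
records = [
    {"amount": "1200,50"},  # filled and parseable
    {"amount": ""},         # empty
    {"amount": "n/a"},      # filled but not parseable
    {"amount": "35 000"},   # filled and parseable
    {},                     # field missing entirely
]

score = completeness(records, "amount", parse_amount)
print(f"completeness: {score:.0%}")
```

Counting unparseable values as missing makes the measure stricter than a plain non-empty check, which matches the "filled and successfully processed" wording.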

type 1 + type 2

What percentage of the field is filled and successfully processed?

[Chart: percentage (0% to 100%) by period, 2005-3 to 2010-9]

Quality measure

completeness: 88%

[Chart scale annotations: none, better, have it all]

What does that mean: “high quality data”?

85%?

Conclusion

appropriate for the given purpose

Data Project

■ define data quality requirements

■ measure during development

■ provide data quality report
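
The three steps above can be sketched as a comparison of measured scores against agreed requirements. This is an illustrative sketch only: the threshold values, the measured integrity score, and the report format are assumptions, not something prescribed by the slides.

```python
def quality_report(requirements, measured):
    """Compare measured scores with required minimums, one line per dimension."""
    lines = []
    for dimension, required in sorted(requirements.items()):
        actual = measured.get(dimension)
        if actual is None:
            shown, status = "n/a", "NOT MEASURED"
        else:
            shown = f"{actual:.0%}"
            status = "OK" if actual >= required else "BELOW REQUIREMENT"
        lines.append(f"{dimension}: {shown} (required {required:.0%}) {status}")
    return lines


# Hypothetical requirements and measurements, invented for illustration only.
requirements = {"completeness": 0.85, "integrity": 0.95}
measured = {"completeness": 0.88, "integrity": 0.91}

report = quality_report(requirements, measured)
for line in report:
    print(line)
```

Because the requirements are stated per dimension, the same data can pass for one purpose and fail for another simply by changing the thresholds.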

More topics

■ Data quality measurement: indicators, probes

■ Data quality management: roles, processes, impact

■ Data cleansing
