Introduction to Data Quality Services
INTRODUCTION TO DATA QUALITY SERVICES
Presentation by Tim Mitchell (Artis Consulting)
www.TimMitchell.net
Today’s Agenda
Overview of DQS
Structure
  Knowledge Base
  DQS Project
Operations
  Matching
  Cleansing
Administration
SSIS Component
Shortcomings
About the Presenter
Tim Mitchell
BI Consultant, Artis Consulting
North Texas SQL Server User Group
SQL Server MVP
Contributing author, MVP Deep Dives Vol 2
Coauthor, SSIS Design Patterns
TimMitchell.net | twitter.com/Tim_Mitchell
Housekeeping
Questions
Surveys
Overview of Data Quality Services
What is DQS?
DQS is a knowledge-driven data cleansing and matching service
Built on top of SQL Server 2012
Simple yet powerful interface
What is DQS?
Replaces manual data quality work you’re already doing
Stored procedures
Triggers
Custom applications
DQS Structure
DQS Structure and Flow
[Diagram: a Knowledge Base (domains, composite domains, matching policies) used by multiple matching projects and cleansing projects]
Knowledge Base
Starting point for data quality provisioning
Uses locally customized data stores or marketplace data sources
Highly reusable and evolutionary
Key elements:
  Domains
  Matching policies
Knowledge Base
Create by:
Knowledge discovery
Domain management
Matching rule
Domains
Domain = data field
  Domain rules
Composite domains
  Allow greater flexibility in domain rules
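The domain ideas above can be pictured with a small Python sketch. This is not the DQS API (the deck notes there is none); the `Domain` class, the `composite_rule` function, and the toy State/Zip rules are all hypothetical names and data invented for illustration. It only shows why a composite domain adds flexibility: a cross-field rule can reject a record whose individual fields each look valid.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Domain:
    """Hypothetical stand-in for a domain: one data field plus its rules."""
    name: str
    rules: list[Callable[[str], bool]]

    def is_valid(self, value: str) -> bool:
        # A value is valid only if every domain rule accepts it.
        return all(rule(value) for rule in self.rules)


# Two single-field domains with toy rules (assumed, not real reference data).
state = Domain("State", [lambda v: v in {"TX", "OK", "LA"}])
zip_code = Domain("Zip", [lambda v: re.fullmatch(r"\d{5}", v) is not None])


def composite_rule(record: dict[str, str]) -> bool:
    """Cross-field rule: in this toy data set, Texas ZIP codes start with 7."""
    if record["State"] == "TX":
        return record["Zip"].startswith("7")
    return True


def validate(record: dict[str, str]) -> list[str]:
    """Return the names of the domains (or composite domain) that failed."""
    failures = [d.name for d in (state, zip_code) if not d.is_valid(record[d.name])]
    # Only check the cross-field rule once each field is individually valid.
    if not failures and not composite_rule(record):
        failures.append("State+Zip")
    return failures
```

Calling `validate({"State": "TX", "Zip": "10001"})` flags only the composite rule, because each field passes its own domain rules but the combination does not.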
Data Quality Project
Create interactive projects for data matching and cleansing
Leverage one or more domains in an existing knowledge base
Somewhat reusable
Data Quality Project
Nondestructive – no changes to source of data to be cleansed
No changes to the KB either
Separately, DQS project data can be used to improve the knowledge base
DQS Operations
Cleansing
  Process data against known entities and domain rules
  Similar to Fuzzy Lookup transform in SSIS
Matching
  Group data together
  Similar to Fuzzy Grouping transform in SSIS
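To make the difference between the two operations concrete, here is a rough Python sketch using the standard library's `difflib`. It is an analogy only: DQS and the SSIS fuzzy transforms use their own algorithms, and the function names `cleanse` and `match`, the sample values, and the 0.8 cutoff are assumptions made for this example. Cleansing compares each value to a list of known entities; matching compares values to each other and groups the near-duplicates.

```python
from difflib import SequenceMatcher, get_close_matches


def cleanse(value, known_values, cutoff=0.8):
    """Return the closest known value, or None if nothing is similar enough."""
    hits = get_close_matches(value, known_values, n=1, cutoff=cutoff)
    return hits[0] if hits else None


def match(values, cutoff=0.8):
    """Group values whose similarity to a group's first member meets the cutoff."""
    groups = []
    for value in values:
        for group in groups:
            # Compare case-insensitively against the group's first member.
            score = SequenceMatcher(None, value.lower(), group[0].lower()).ratio()
            if score >= cutoff:
                group.append(value)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([value])
    return groups


if __name__ == "__main__":
    cities = ["Dallas", "Houston", "Austin"]  # hypothetical known entities
    print(cleanse("Dalas", cities))
    print(match(["Tim Mitchell", "Tim Mitchel", "Jane Doe"]))
```

The greedy grouping in `match` is deliberately simple; it depends on input order and is meant only to show the shape of the operation.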
DQS Administration
Monitor past activity
Set logging options
Set confidence thresholds
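One way to think about confidence thresholds is as cut points that decide what happens to a proposed correction. The sketch below is a hypothetical model, not DQS code: the `Status` names, the `classify` function, and the default values 0.8 and 0.7 are assumptions chosen for illustration, and the actual settings should be checked in the DQS client's administration screens.

```python
from enum import Enum


class Status(Enum):
    CORRECTED = "corrected"   # applied automatically
    SUGGESTED = "suggested"   # held for human review
    UNCHANGED = "unchanged"   # confidence too low to act on


def classify(score, auto_correct_min=0.8, suggest_min=0.7):
    """Map a confidence score in [0, 1] to a status using two thresholds."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if auto_correct_min < suggest_min:
        raise ValueError("auto_correct_min must be >= suggest_min")
    if score >= auto_correct_min:
        return Status.CORRECTED
    if score >= suggest_min:
        return Status.SUGGESTED
    return Status.UNCHANGED
```

Raising `auto_correct_min` sends more values to review instead of changing them automatically; lowering `suggest_min` surfaces more low-confidence candidates.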
DQS and SSIS
SQL Server Integration Services has an integrated hook into DQS
DQS Cleansing Component
Provides automated, noninteractive data cleansing operations
Demos
Shortcomings
V1 product
No API – must use DQS client interactively
SSIS component only does cleansing
Final Thoughts
CU1 performance improvements
http://bit.ly/IKmMow
DQS videos / blogs
http://technet.microsoft.com/en-us/sqlserver/hh780961
My blog (www.TimMitchell.net)
DQS/MDS virtual chapter
masterdata.sqlpass.org
Questions?