INTRODUCTION TO DATA QUALITY SERVICES
Presentation by Tim Mitchell (Artis Consulting)
www.TimMitchell.net
Today’s Agenda
Overview of DQS
Structure
  Knowledge Base
  DQS Project
Operations
  Matching
  Cleansing
Administration
SSIS Component
Shortcomings
About the Presenter
Tim Mitchell
BI Consultant, Artis Consulting
North Texas SQL Server User Group
SQL Server MVP
Contributing author, MVP Deep Dives Vol 2
Coauthor, SSIS Design Patterns
TimMitchell.net | twitter.com/Tim_Mitchell
Housekeeping
Questions
Surveys
Overview of Data Quality Services
What is DQS?
DQS is a knowledge-driven data cleansing and matching service
Built on top of SQL Server 2012
Simple yet powerful interface
What is DQS?
Replaces manual data quality work you’re already doing
Stored procedures
Triggers
Custom applications
DQS Structure
DQS Structure and Flow
[Diagram: a Knowledge Base (Domains, Composite Domains, Matching Policies) with Matching Projects and Cleansing Projects built on it]
Knowledge Base
Starting point for data quality provisioning
Uses locally customized data stores or marketplace data sources
Highly reusable and evolutionary
Key elements:
  Domains
  Matching policies
Knowledge Base
Create by:
  Knowledge discovery
  Domain management
  Matching rule
Domains
Domain = data field
  Domain rules
Composite domains
  Allows greater flexibility in domain rules
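As a rough illustration of what a domain rule and a composite-domain rule express, here is a minimal Python sketch. It is not DQS code: DQS rules are defined in the DQS client, and the domain names (Email, Zip, Country), the patterns, and the helper names below are all assumptions made for the example.

```python
import re

# Illustrative only: DQS domain rules are defined in the DQS client, not in Python.
# The domain names and patterns below are assumptions made for this sketch.

def email_rule(value):
    """Domain rule: value must look like name@host.tld."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None

def zip_rule(value):
    """Domain rule: five digits, optionally followed by -dddd."""
    return re.fullmatch(r"\d{5}(-\d{4})?", value) is not None

# One rule per domain (data field).
DOMAIN_RULES = {"Email": email_rule, "Zip": zip_rule}

def country_zip_rule(record):
    """Composite rule spanning two domains: a US record needs a valid Zip."""
    return record.get("Country") != "US" or zip_rule(record.get("Zip", ""))

def failed_rules(record):
    """Return the names of the rules that the record violates."""
    failures = [name for name, rule in DOMAIN_RULES.items()
                if name in record and not rule(record[name])]
    if not country_zip_rule(record):
        failures.append("Country+Zip")
    return failures
```

A single-domain rule looks at one field in isolation; the composite rule is where the extra flexibility comes from, because it can relate one field to another.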
Data Quality Project
Create interactive projects for data matching and cleansing
Leverage one or more domains in an existing knowledge base
Somewhat reusable
Data Quality Project
Nondestructive – no changes to source of data to be cleansed
No changes to the KB either
Separately, DQS project data can be used to improve the knowledge base
DQS Operations
Cleansing
  Process data against known entities and domain rules
  Similar to Fuzzy Lookup transform in SSIS
Matching
  Group data together
  Similar to Fuzzy Grouping transform in SSIS
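The difference between the two operations can be sketched in a few lines of Python using the standard-library difflib module. This is only an analogy for the lookup-versus-grouping distinction, not how DQS or the SSIS fuzzy transforms are implemented; the known-values list, the function names, and the 0.8 cutoffs are assumptions for illustration.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical stand-in for values held in a knowledge base domain.
KNOWN_CITIES = ["Dallas", "Fort Worth", "Plano"]

def cleanse(value, known=KNOWN_CITIES, cutoff=0.8):
    """Cleansing: look the value up against known values (fuzzy lookup).

    Returns the closest known value, or the input unchanged if nothing
    is similar enough.
    """
    matches = get_close_matches(value, known, n=1, cutoff=cutoff)
    return matches[0] if matches else value

def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(values, threshold=0.8):
    """Matching: group values that resemble each other (fuzzy grouping).

    Each value joins the first group whose first member is similar enough;
    otherwise it starts a new group.
    """
    groups = []
    for value in values:
        for group in groups:
            if similarity(value, group[0]) >= threshold:
                group.append(value)
                break
        else:
            groups.append([value])
    return groups
```

Cleansing compares each incoming value with a reference set, while matching compares the incoming values with each other, which is why the first needs a knowledge base of known entities and the second needs a matching policy.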
DQS Administration
Monitor past activity
Set logging options
Set confidence thresholds
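To show the idea behind confidence thresholds, here is a minimal Python sketch of how two cutoffs could split results into automatic corrections, suggestions for review, and values left alone. The function name, the status labels, and the 0.8 and 0.7 values are assumptions chosen for illustration; the actual thresholds are set in the DQS client.

```python
def classify(confidence, auto_correct_min=0.8, suggest_min=0.7):
    """Bucket a proposed correction by its confidence score (0.0 to 1.0).

    The threshold values and labels are hypothetical examples, not settings
    read from DQS.
    """
    if confidence >= auto_correct_min:
        return "corrected"   # confident enough to apply automatically
    if confidence >= suggest_min:
        return "suggested"   # shown to a reviewer rather than applied
    return "unchanged"       # too uncertain to propose
```

Raising the upper threshold sends more values to manual review; lowering it applies more corrections without a person looking at them.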
DQS and SSIS
SQL Server Integration Services has an integrated hook into DQS
DQS Cleansing Component
Provide automated, noninteractive data cleansing operations
Demos
Shortcomings
V1 product
No API – must use DQS client interactively
SSIS component only does cleansing
Final Thoughts
CU1 performance improvements: http://bit.ly/IKmMow
DQS videos / blogs: http://technet.microsoft.com/en-us/sqlserver/hh780961
My blog (www.TimMitchell.net)
DQS/MDS virtual chapter: masterdata.sqlpass.org
Questions?