denodo data virtualization platform architecture: data discovery and data governance (session 4 from...
Five In-depth Technology and Architecture Sessions on Data Virtualization
Session 4: Data Discovery & Governance
Today’s Speaker
Anastacio Molano
Head of Solutions and Business Development
Architect-to-Architect Series
■ Series of five webinars over 3 months
■ Deeper look into Denodo Platform
■ Architectural Overview
■ Performance
■ Scalability
■ Data Discovery and Governance (today’s session)
■ Security
Denodo Express
■ Denodo Express
■ Free to Download
■ Fully functioning Data Virtualization Platform
■ Single user, supports common data sources
■ Many of the same capabilities of Denodo Platform
■ Performance, Data Discovery, Governance, internal Security, Publishing, …
Data Discovery & Governance – Architecture Modules
Data Discovery and Governance
■ Metadata Repository
■ Embedded Apache Derby database
■ Contains view and data source configuration data
■ Relationships between views – dependencies, etc.
■ Searchable – Catalog searches, etc.
■ Data Lineage
■ Trace how data changes between source and consumer
■ Change Impact Analysis
■ What is the impact of a change in a data source schema?
Data Discovery and Governance
■ RESTful Web Services
■ Supports Global Search functionality
■ Index and search data sources
■ ‘Google’-like search
■ Linked Data Services
Metadata Introspection
■ Denodo Platform gathers metadata from data sources
■ Automatically or via configuration
■ Maps native data types to ‘Denodo types’
■ Inspects indexes in the sources
■ Analyzes source query capabilities and abstracts them into common model
■ Stores all metadata and configuration data in Metadata repository
■ Uses built-in Apache Derby database
■ Small size – only metadata is stored; actual data is retrieved in real time from the sources or the cache
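As a rough illustration of the type-mapping step described on this slide, the sketch below normalizes native column types from two kinds of source into one common set of type names. The mapping table, the common type names, and the functions `to_common_type` and `introspect` are hypothetical; they are not taken from the Denodo type system.

```python
# Hypothetical mapping from (source kind, native type) to a common type name.
# The names are illustrative only; they are NOT Denodo's real type names.
NATIVE_TO_COMMON = {
    ("oracle", "VARCHAR2"): "text",
    ("oracle", "NUMBER"): "decimal",
    ("mysql", "VARCHAR"): "text",
    ("mysql", "INT"): "int",
    ("mysql", "DATETIME"): "timestamp",
}


def to_common_type(source_kind: str, native_type: str) -> str:
    """Return the common type for a native type, falling back to 'text'."""
    key = (source_kind.lower(), native_type.upper())
    return NATIVE_TO_COMMON.get(key, "text")


def introspect(source_kind: str, columns: dict) -> dict:
    """Map every column of a source table to its common type."""
    return {name: to_common_type(source_kind, t) for name, t in columns.items()}


if __name__ == "__main__":
    # Column names and native types as they might be read from a source catalog.
    print(introspect("oracle", {"id": "NUMBER", "name": "VARCHAR2"}))
```

Keeping the mapping in a single table means a view built on top of several sources only ever sees the common type names, which is the point of abstracting the sources into one model.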
Metadata Catalog
■ Two ways to inspect catalog
■ Graphically in Denodo Admin Tool
■ Search and browse contents of Metadata Repository
■ Filter by element type, name, date created, etc.
■ Drill down to view schema for individual elements
■ Programmatically
■ SQL Query using ‘list’ and ‘desc’ commands
■ e.g. ‘list views’ or ‘desc view address’
■ Stored procedures for complex catalog queries
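A minimal sketch of the programmatic route, assuming the server is reached through some Python DB-API style cursor (for example over an ODBC or JDBC bridge). The two command strings come from the slide; the helper names, the identifier check, and the `StubCursor` class are hypothetical, and the stub only exists so the sketch runs without a server.

```python
import re

# Accept only simple identifiers so the view name can be placed in a command safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def catalog_commands(view_name: str) -> list:
    """Build the two catalog commands shown on the slide for one view."""
    if not _IDENTIFIER.match(view_name):
        raise ValueError(f"not a simple identifier: {view_name!r}")
    return ["list views", f"desc view {view_name}"]


def run_catalog_commands(cursor, view_name: str) -> dict:
    """Run each command on a DB-API style cursor and collect the rows."""
    results = {}
    for command in catalog_commands(view_name):
        cursor.execute(command)
        results[command] = cursor.fetchall()
    return results


class StubCursor:
    """Stand-in for a real cursor; returns canned rows per command (assumption)."""

    def __init__(self, canned: dict):
        self._canned = canned
        self._last = None

    def execute(self, command: str) -> None:
        self._last = command

    def fetchall(self) -> list:
        return self._canned.get(self._last, [])


if __name__ == "__main__":
    cursor = StubCursor({"list views": [("address",)]})
    print(run_catalog_commands(cursor, "address"))
```

With a real connection, the stub would be replaced by the cursor object of whichever driver is in use; the shape of the returned rows depends on that driver and is not specified here.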
Data Lineage
■ Graphical view for showing data lineage for any field in any virtual view
■ Trace source of any field
■ Includes any functions applied to field contents
■ Trace source of calculated fields
■ View calculations used to create new fields
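The idea behind field-level lineage can be sketched as a walk over a small derivation table: each derived field records the expression that produced it and the fields it was built from. The view names, field names, and expressions below are invented for illustration, and this is not how the Denodo Platform stores lineage internally.

```python
# Each derived field maps to (expression, [upstream fields]).
# Fields that do not appear as keys are treated as source fields.
# All names here are hypothetical examples.
LINEAGE = {
    "customer_report.full_name": (
        "concat(first_name, ' ', last_name)",
        ["customer.first_name", "customer.last_name"],
    ),
    "customer.first_name": ("trim(FNAME)", ["crm_db.CUST.FNAME"]),
    "customer.last_name": ("trim(LNAME)", ["crm_db.CUST.LNAME"]),
}


def trace(field: str, lineage: dict, depth: int = 0) -> list:
    """Return indented lines showing how `field` is derived, down to the sources."""
    indent = "  " * depth
    if field not in lineage:
        return [f"{indent}{field} (source)"]
    expression, upstream = lineage[field]
    lines = [f"{indent}{field} = {expression}"]
    for parent in upstream:
        lines.extend(trace(parent, lineage, depth + 1))
    return lines


def source_fields(field: str, lineage: dict) -> set:
    """Return the set of source fields that `field` ultimately depends on."""
    if field not in lineage:
        return {field}
    _, upstream = lineage[field]
    found = set()
    for parent in upstream:
        found |= source_fields(parent, lineage)
    return found


if __name__ == "__main__":
    print("\n".join(trace("customer_report.full_name", LINEAGE)))
```

Recording the expression alongside the upstream fields is what lets a lineage view show not only where a calculated field comes from but also which functions were applied along the way.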
Used By Tool
■ Graphical view for showing where a view is used
■ ‘Big picture’ view of usage
■ Useful tool for seeing impact of changes on whole system
Change Impact Analysis
■ Denodo Platform can perform impact analysis to show impact of data source changes
■ Highlights changes to data source
■ Shows other views impacted by the change
■ i.e. derived views using the changed base view
■ Select which views you want to propagate changes to…
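Impact analysis of this kind can be pictured as a traversal of the view dependency graph in the "used by" direction. The sketch below assumes a hypothetical set of base and derived views and finds every view that sits downstream of a changed base view; it illustrates the general technique, not the product's implementation.

```python
from collections import deque

# view -> views it is built from (all names are hypothetical)
DEPENDS_ON = {
    "customer": ["bv_crm_customer"],
    "orders": ["bv_erp_orders"],
    "customer_orders": ["customer", "orders"],
    "top_customers": ["customer_orders"],
}


def used_by(depends_on: dict) -> dict:
    """Invert the dependency map: view -> views that use it directly."""
    reverse = {}
    for view, parents in depends_on.items():
        for parent in parents:
            reverse.setdefault(parent, []).append(view)
    return reverse


def impacted_views(changed: str, depends_on: dict) -> list:
    """Breadth-first walk over 'used by' edges, nearest views first."""
    reverse = used_by(depends_on)
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in reverse.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    print(impacted_views("bv_crm_customer", DEPENDS_ON))
```

Selective propagation then amounts to choosing a subset of the returned list and applying the schema change only to those views.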
Data Discovery – Global Search
■ Denodo Platform also supports data discovery for non-Admin Tool users
■ Not all users have access to Admin Tool
■ Browser-based ‘global search’ provides simple search mechanism
■ Keyword-based searches for intuitive discovery process
■ Search metadata and data to find what you want
■ Browse schema or actual data
■ Traverse relationships between entities
Data Discovery – Under the Covers
■ Global Search uses Denodo Aracne’s indexer to index data source contents
■ Based on Lucene indexer
■ Indexes stored in Denodo Platform
■ Searches run against indexes
■ Faster retrieval of ‘hits’ without overhead of full scans on sources
■ Indexing can be scheduled
■ Denodo Scheduler runs indexing jobs
■ e.g. overnight when minimal impact
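To show why searching an index avoids full scans on the sources, here is a toy inverted index in plain Python. It is not Lucene and not Denodo Aracne; the row identifiers, sample rows, and function names are made up, and a real indexer adds ranking, analyzers, and persistent storage.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows: dict) -> dict:
    """Build term -> set of row ids. This is the step a scheduled job would run."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for term in tokenize(text):
            index[term].add(row_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of rows containing every query term, using only the index."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


if __name__ == "__main__":
    # Hypothetical rows pulled from two sources during an indexing run.
    rows = {
        "customer:1": "Acme Corp, Madrid",
        "customer:2": "Acme Ltd, London",
        "order:7": "Shipment to Madrid",
    }
    index = build_index(rows)
    print(search(index, "acme madrid"))
```

The sources are read once when the index is built; each keyword search afterwards is a few set lookups, which is why the indexing run can be scheduled for a quiet period.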
Data Discovery & Governance - Summary
■ Data discovery and governance capabilities are pervasive in the Denodo Platform
■ Users can inspect catalog of virtualization objects through catalog search to find data combinations for reuse
■ Data lineage helps users to understand where data has come from and how it has changed from the source
■ Impact analysis helps architects understand the consequences of changes in the data source schemas
■ Propagate changes selectively with a single click
■ Global Search gives a full view of the Data Virtualization project
■ Both data and metadata at the same time
■ Start from whole data sets, then drill down to individual data rows
■ Point and click to traverse associations between entities and to access related data
Q&A
Data Virtualization – Next Steps
Move forward at your own pace
Download Denodo Express –
The fastest way to Data Virtualization
Denodo Community: Documents, Videos, Tutorials, and more.
Attend Architect-to-Architect Series
Performance
Scalability
Data Discovery and Governance
Security
Move forward with one of our Data Virtualization experts
Phone: (+1) 877-556-2531 (NA)
Phone: (+44) (0)20 7869 8053 (EMEA)
Email: [email protected] | www.denodo.com
Thank You!
Next Session: Session 5
Denodo Platform: Security