Transcript
Page 1: Data Vault Introduction

Data Vault Fundamentals & Best Practices


Erik Fransen, managing consultant
+31 6 159 444 76
@erikfransen

Page 2: Data Vault Introduction

Agenda
• Introduction
• Data Vault Basics
• Benefits & Challenges
• Best practices: Automation & Data Virtualization
• Recommended reading

Page 3: Data Vault Introduction

• Founded in 1998, The Hague, NL
• 40+ consultants
• Business Intelligence, Data Vault, Datawarehousing, Datawarehouse Automation, Big Data, Data Virtualization
• Business & technical consultancy, end-to-end implementation projects of Data Vault EDW, audits, training, certification

• Wide range of customers (profit, non-profit) across various industries

• Since 2009 Genesee Academy partner for Data Vault Day and Data Vault Certification in NL, B & D

• Implementation partner of Cisco, MapR, Qlik & Tableau

Page 4: Data Vault Introduction

The Data Vault modeling approach

Data Vault is a data modeling approach

…so it fits into the family of modeling approaches:


3rd Normal Form | Ensemble Modeling | Dimensional

• While 3rd Normal Form is optimal for Operational Systems

…and Dimensional is optimal for Data Marts

…Ensemble Modeling is optimal for the Datawarehouse

• And Data Vault is the leading form of Ensemble Modeling

Page 5: Data Vault Introduction

Forms of Ensemble Modeling


Page 6: Data Vault Introduction

Why do we use Data Vault for DWH?


• When we need a DWH that supports:
– Integration
– Traceability
– History
– Incremental Build
– Agility

• Gracefully Adapts to New Sources
• Full Auditability - Source to Mart
• Enterprise View of Central Data
• Ready for Automation

Data Vault is specifically designed for modelling the EDW

Page 7: Data Vault Introduction

The Data Vault Ensemble


• The Data Vault Ensemble conforms to a single key – embodied in the Hub construct

• The parts for the Data Vault Ensemble only include:
– Hubs: The Natural Business Keys
– Links: The Natural Business Relationships
– Satellites: All Context, Descriptive Data and History of Links and Hubs

“Separating things that change from things that don’t change”
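The three constructs above can be sketched as tables. This is a minimal illustration using Python's built-in sqlite3 module, not a reference design: all table and column names are hypothetical, and the surrogate keys, load dates and record-source columns are assumed conventions that the slides do not prescribe.

```python
import sqlite3

# All names below are hypothetical; load_date and record_source are assumed conventions.
DDL = """
CREATE TABLE h_customer (                 -- Hub: one row per natural business key
    customer_hk   TEXT PRIMARY KEY,       -- surrogate key (assumed)
    customer_bk   TEXT NOT NULL UNIQUE,   -- natural business key
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE h_store (
    store_hk      TEXT PRIMARY KEY,
    store_bk      TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE l_customer_store (           -- Link: a relationship between hubs
    customer_store_hk TEXT PRIMARY KEY,
    customer_hk   TEXT NOT NULL REFERENCES h_customer(customer_hk),
    store_hk      TEXT NOT NULL REFERENCES h_store(store_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    UNIQUE (customer_hk, store_hk)
);
CREATE TABLE s_customer_details (         -- Satellite: context and history of a hub
    customer_hk   TEXT NOT NULL REFERENCES h_customer(customer_hk),
    load_date     TEXT NOT NULL,          -- each change adds a row, so history is kept
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The hubs hold only keys, the link holds only key pairs, and everything that changes over time sits in the satellite, which is one way to read "separating things that change from things that don't change".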

Page 8: Data Vault Introduction

The Data Vault modeling approach

• As the scope of the EDW is expanded and new data sources added, the Data Vault can adapt to these changes without impacting the existing model

• This is what allows the EDW to be built incrementally and to adapt to change without the need for re-engineering.

[Diagram: new area absorbed into the existing model; hubs shown: H_Cust, H_Sale, H_Empl, H_Store, H_Car]

Tools for DWH Automation update the Data Vault EDW (model + data) in a fast, agile & consistent way
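A small sketch of what "absorbing a new area" could look like in practice, assuming hypothetical table names loosely based on the hub labels in the diagram. The new hub and link are added with CREATE TABLE statements only; nothing in the existing model is altered or reloaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Existing model (hypothetical names): one hub that already holds data.
conn.executescript("""
CREATE TABLE h_cust (cust_hk TEXT PRIMARY KEY, cust_bk TEXT NOT NULL UNIQUE);
INSERT INTO h_cust VALUES ('hk-c1', 'C-001');
""")


def schema(connection):
    """Return {table_name: CREATE statement} for every table in the database."""
    query = "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    return dict(connection.execute(query))


before = schema(conn)

# New area: a car hub plus a link that attaches it to the existing customer hub.
conn.executescript("""
CREATE TABLE h_car (car_hk TEXT PRIMARY KEY, car_bk TEXT NOT NULL UNIQUE);
CREATE TABLE l_cust_car (
    cust_car_hk TEXT PRIMARY KEY,
    cust_hk     TEXT NOT NULL REFERENCES h_cust(cust_hk),
    car_hk      TEXT NOT NULL REFERENCES h_car(car_hk)
);
""")

after = schema(conn)

# Existing definitions are untouched; only new tables appear.
unchanged = all(after[name] == sql for name, sql in before.items())
added = sorted(set(after) - set(before))
print(unchanged, added)
```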

Page 9: Data Vault Introduction

Data Vault Benefits

• Business benefits
– Ability to adapt quickly to new business needs
– Data is traceable, allowing for a fully auditable, integrated data store
– Allows the EDW to absorb all data all of the time
– Easily adapts to new data sources and changing business rules, without expensive re-engineering
– Results in a Data Warehouse with lower total cost of ownership (TCO)
– Automation: short time to market, consistent quality

• Project/development benefits
– Ideal for agile development techniques, resulting in lower project risk and more frequent deliverables
– Can be built incrementally without compromising the core architecture
– Automation: fast and incremental sprints, predictable costs

• Architectural benefits
– Parallel loading
– Data architecture that supports future expanded scope
– Can scale to virtually any size
– Ready for Automation: forces standardization

Page 10: Data Vault Introduction

Data Vault Modeling Process

The Modeling Process for creating a Data Vault model includes three primary steps:

1) Identify and Model the Core Business Concepts
• Business interviews are at the heart of this step: What do you do? What are the main things you work with?
• Also find the best/target Natural Business Key

2) Identify and Model the Natural Business Relationships
• Specific Unique Relationships

3) Analyze and Design the Context Satellites
• Consider Rate of Change, Type of Data and also the Sources of your data during the design process

Ideally the data vault is modelled based on business processes and business concepts
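The three steps can be walked through on a single flat source record. The sketch below is illustrative only: the record, the attribute names and the slow/fast rate-of-change classification are assumptions, and business keys are used directly as identifiers to keep it short.

```python
# Hypothetical flat record from a source system; all names are illustrative.
source_row = {
    "customer_no": "C-001",
    "store_code": "S-17",
    "customer_name": "Jansen",
    "loyalty_points": 120,
}

# Step 1: core business concepts become hubs, identified by their natural business key.
hubs = {
    "h_customer": {"business_key": source_row["customer_no"]},
    "h_store": {"business_key": source_row["store_code"]},
}

# Step 2: the natural business relationship becomes a link between the hub keys.
link = {
    "l_customer_store": {
        "customer_key": hubs["h_customer"]["business_key"],
        "store_key": hubs["h_store"]["business_key"],
    }
}

# Step 3: context goes into satellites, split here by an assumed rate of change.
RATE_OF_CHANGE = {"customer_name": "slow", "loyalty_points": "fast"}

satellites = {}
for attribute, rate in RATE_OF_CHANGE.items():
    satellite = satellites.setdefault(
        f"s_customer_{rate}", {"parent_key": source_row["customer_no"]}
    )
    satellite[attribute] = source_row[attribute]

print(sorted(satellites))
```

Splitting by rate of change means a frequently updated attribute does not force new history rows for attributes that rarely change.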

Page 11: Data Vault Introduction

Getting data out of the Data Vault

• Problem:
– The Data Vault EDW is about data decomposition, data registration and data integration
– Data Vault is neither intended nor designed or optimized for data distribution and data consumption downstream of the EDW
– This typically leads to many complex physical data marts (high maintenance, high cost)

• Solution:
– Start thinking differently: focus on creating functional data products for the business
– Stop loading and replicating data physically, start using data virtualization
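As a rough illustration of the idea, a plain SQL view can stand in for a virtual data mart. This is only a sketch: a database view in SQLite is a stand-in for a data virtualization tool, and the table, column and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE h_customer (customer_hk TEXT PRIMARY KEY, customer_bk TEXT NOT NULL UNIQUE);
CREATE TABLE s_customer_details (
    customer_hk TEXT NOT NULL REFERENCES h_customer(customer_hk),
    load_date   TEXT NOT NULL,
    city        TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
INSERT INTO h_customer VALUES ('hk-c1', 'C-001');
INSERT INTO s_customer_details VALUES ('hk-c1', '2015-01-01', 'The Hague');
INSERT INTO s_customer_details VALUES ('hk-c1', '2016-02-01', 'Amsterdam');

-- Virtual "data mart": a view over hub and satellite, no rows are copied.
CREATE VIEW dim_customer AS
SELECT h.customer_bk AS customer_no, s.city
FROM h_customer AS h
JOIN s_customer_details AS s ON s.customer_hk = h.customer_hk
WHERE s.load_date = (
    SELECT MAX(s2.load_date)
    FROM s_customer_details AS s2
    WHERE s2.customer_hk = s.customer_hk
);
""")

rows_before = conn.execute("SELECT customer_no, city FROM dim_customer").fetchall()

# A new satellite row is visible through the view immediately, without any reload.
conn.execute("INSERT INTO s_customer_details VALUES ('hk-c1', '2016-03-01', 'Utrecht')")
rows_after = conn.execute("SELECT customer_no, city FROM dim_customer").fetchall()

print(rows_before, rows_after)
```

The consumer queries a simple customer dimension while the history stays in the satellite; changing the shape of the mart means redefining the view rather than reloading data.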

Page 12: Data Vault Introduction

Eliminate the need for physical data marts
• No data replication needed
• Real-time data refreshment
• No redundant data storage
• Simple updates of data models
• Simple queries

• Short Time to Market
• Automatic updates
• Lower storage costs
• High performance
• Ready for Big Data

[Diagram labels: CRM, ERP, Weblogs, Production; Data; Data Copy; Data Vault EDW; Data Virtualization Tool + Data Abstraction Layers; No Data Copy at all; SQL; Steering information]

Page 13: Data Vault Introduction

Data Layers for Data Virtualization

[Diagram labels: Virtual Application Layer, Virtual Business Layer, Virtual “Physical” Layer; SuperNova Data Model, Uniform Data Model, Operational Data Model, Data Virtualization “Physical” Model; Webservices, Views; Data Vault datawarehouse; Any other source data; Automated step!]

Page 14: Data Vault Introduction

Wrap up

• Data Vault Basics:
– Hubs, Links, Satellites
– Integration, history, incremental modelling, agility

• Benefits:
– Business, project, architecture
– Make use of automation tools for fast, agile and consistent delivery

• Challenges:
– Data downstream of the Data Vault EDW
– Solution: use virtual data marts and automate SuperNova data models for reporting & analytics

Page 15: Data Vault Introduction

Recommended reading on SuperNova
Free download: http://www.cisco.com/web/services/enterprise-it-services/data-virtualization/documents/whitepaper-cisco-datavaul.pdf

Page 16: Data Vault Introduction

Recommended reading on Data Vault
Free downloads: http://hanshultgren.wordpress.com/

Page 17: Data Vault Introduction

Recommended reading on Ensemble & Data Vault
Modeling the Agile Data Warehouse with Data Vault

• Data Vault Modeling
• Agile Data Warehousing BI
• Enterprise Data Warehousing
• Data Integration and DWBI Architecture
• Unified Decomposition™
• Ensemble Modeling™

• A complete book on Data Vault
• An Introduction, a Guide and a Reference
• Modeling, Architecture & the Data Warehousing Program
• Data & Semantic Integration for Enterprise Central Meaning
• Applying Concepts to a successful Agile DWBI Program

Page 18: Data Vault Introduction

Recommended reading on Data Virtualization
Data Virtualization in Business Intelligence Architectures

• First independent book on data virtualization that explains in a product-independent way how data virtualization technology works.
• Illustrates concepts using examples developed with commercially available products.
• Shows you how to solve common data integration challenges such as data quality, system interference, and overall performance by following practical guidelines on using data virtualization.
• Apply data virtualization right away with three chapters full of practical implementation guidance.
• Understand the big picture of data virtualization and its relationship with data governance and information management.

Page 19: Data Vault Introduction

Data Vault Training & Certification

• CDVDM: March 31, April 1 2016 Amsterdam
• DVD: March 2, 2016 Diegem

• www.centennium-opleidingen.nl
• For all questions: [email protected]

Page 20: Data Vault Introduction

A short history of Data Vault
• 2002: First papers published by Dan Linstedt
• 2006: Start of the CDVDM certification program by Genesee Academy
• 2007: Start of Data Vault EDW implementations
– Primarily in Europe (NL, S), some in USA
• 2008-2015: Several books published on Data Vault by Dan Linstedt, Hans Hultgren and others
• 2013: Data Vault on the radar in B, DACH, UK, USA, AUS, NZ, Asia
• 2013: Data Vault EDW implementations going worldwide
• 2015: Over 900 CDVDM professionals and 750+ Data Vault EDWs worldwide

