data analytics - unifying all your data with a data fabric · 2018-11-21 · internationally...

Unifying All Your Data with a Data Fabric: Beyond the Data Warehouse and Data Lake

Rick F. van der Lans
Industry analyst
Email [email protected]
Twitter @rick_vanderlans
www.r20.nl

Copyright © 1991 - 2018 R20/Consultancy B.V., The Netherlands. All rights reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the explicit written permission of the copyright owners.



Rick F. van der Lans

Rick F. van der Lans is a highly respected independent analyst, consultant, author, and internationally acclaimed lecturer specializing in data warehousing, business intelligence, big data, and database technology. He is managing director of R20/Consultancy BV.

He has presented countless seminars, webinars, and keynotes at industry-leading conferences. Rick helps clients worldwide design their data warehouse, big data, and business intelligence architectures and solutions, and assists them with selecting the right products. He has been influential in introducing the new logical data warehouse architecture worldwide, which helps organizations develop more agile business intelligence systems.

In 2018 he was selected as the sixth most influential BI analyst worldwide by onalytica.com.

Affiliate to SimplicityBI: SimplicityBI and Rick have independently promoted the use of data virtualization technology for years. To support the market better, they have decided to work more closely together. In the role of affiliate, Rick presents seminars and webinars, writes blogs for the SimplicityBI website, and assists the SimplicityBI specialists.

You can get in touch with Rick van der Lans via:
Email: [email protected]
www.r20.nl
Twitter: @Rick_vanderlans
LinkedIn: http://www.linkedin.com/pub/rick-van-der-lans/9/207/223

Agenda and Subjects

1. Introduction
2. Current Data Delivery Systems
3. The New Form of Data Usage: The Data Marketplace
4. Replication of Meta Data Specifications
5. Data Virtualization in a Nutshell
6. Unifying All the Data Delivery Platforms
7. Closing Remarks


Part 1: Introduction

Data hasn’t changed, it’s just more of the same

Data usage has changed

Self-service BI
Embedded BI
Supplier- and Customer-driven BI
Applied AI in Text, Image, Video Analysis
Edge Analytics
Data Marketplace
Data Science
Automated decisions

The Supply Chain

Entire network of entities, directly or indirectly interlinked and interdependent in serving the same consumer or customer.

It comprises vendors that supply raw material, producers who convert the material into products, warehouses that store, distribution centers that deliver to the retailers, and retailers who bring the product to the ultimate user.

Stages shown in the diagram: Raw materials, Supplier, Manufacturing, Distribution, Customer, Consumer

The Data Supply Chain

Entire network of …

It comprises vendors that supply raw data, producers who convert the data into products, data warehouses that store data, distribution centers that deliver data to the retailers, and retailers who bring the data to the ultimate user.

Actors in the Data Supply Chain

Data supply chain: Data producer, Data provider, Data distributor, Data retailer, Data enricher / blender, Data buyer, Data consumer

Acxiom, Equifax, InfoUSA, Teletrack

Tracking: AdSonar, Pulse260, Quantcast, Rubicon, Undertone, TrafficMarketplace

1990 census: 87% of the US population can be identified by ZIP code, gender, and date of birth

[Bar chart: Market Cap in Billion $US (axis 0 to 500) for Amazon, Boeing, Facebook, GE, Google, Heineken, HP, ING, LinkedIn, Microsoft, Netflix, Twitter, Walmart]

Underlined companies are data-driven.

[Bar chart: Ratio Market Cap / Annual Revenue (axis 0 to 50), companies in chart order: Twitter, Facebook, LinkedIn, Google, Netflix, Microsoft, Amazon, GE, ING, Heineken, Boeing, HP, Walmart]

[Bar chart: Annual Revenue / Employees (axis 0 to 2,500,000), companies in chart order: Netflix, Google, Facebook, Microsoft, Amazon, Boeing, ING, GE, HP, LinkedIn, Walmart, Heineken, Twitter]


Successful Data-Driven Organizations


Part 2: Current Data Delivery Systems

The Classic Data Warehouse Architecture

[Diagram: source systems, staging area, data warehouse, and data marts, linked by three ETL steps and feeding analytics & reporting]

Limitations of the Classic DW Architecture

Limited flexibility
Duplication of data
Diminished data quality
Limited support for operational business intelligence
Complex incorporation of big data technology
Complex import of external data
Restricted support for self-service BI
Non-trivial support for bi-modal BI
Difficult support for streaming data

The Logical Data Warehouse Architecture

[Diagram: a data virtualization server between analytics & reporting and the data stores: the data warehouse (loaded from source systems through a staging area by ETL), external data, and big data]

The Data Lake

[Diagram: data sources, the data lake, investigative analytics, and data science; the connections are labeled ET and ETL]

Challenges of a Physical Data Lake

Big data too big to move
Too slow to copy and bandwidth issues
Complex “T” moved to data usage
Company politics
Data privacy and protection regulations
Data in data lake is stored outside original security realm
Metadata to describe data
Some sources are hard to copy, for example, mainframe data
Refreshing of data lake
Management of data lake required

Data Services and Apps

[Diagram: client app 1, client app 2, ... client app n; an API gateway; a system exposing APIs]

Managed File Transfer

[Diagram: file production, network, file processing]

Data Streaming

[Diagram: producers of data, storage of streaming data, a stream processor, three listeners, consumers of data]

Part 3: The New Form of Data Usage: The Data Marketplace

Examples of Public Data Marketplaces

DataMarket offers more than 45,000 datasets from around the world, delivered by, among others, 42 governments.

DataStreamX is the global marketplace for commercial data. Founded in 2014, their mission is to accelerate data access worldwide by bringing together buyers and vendors of data onto one simple-to-use platform.

QunB allows companies to upload their own data to QunB and to combine it with other datasets; these datasets can be sold or given away for free.

Knoema provides access to over 100 million time series. All available data is interactive and can be exported if needed.

Data.Gov offers more than 190,000 data sets.

Shopping for Data at the Data Marketplace

Data Warehouse - Tailor-Made Reports


We Assume Too Much

We’re in Denial

[Diagram: the classic data warehouse architecture (source systems, staging area, data warehouse, data marts, analytics & reporting, linked by ETL), annotated with external data and governance]

Assumption: Users Know What They Want

“If I had asked people what they wanted, they would have said faster horses.”
- Henry Ford

Assumption: Users Know What They Want

“People don’t know what they want until you show it to them.”
- Steve Jobs

Assumption: Users Know What They Want

“It’s not the customer’s job to know what they want.”
- Steve Jobs


Assumption: Transactional Data Fulfills the User’s Information Needs


Assumption: Users Understand BI Tools

Source: Wayne Eckerson - http://insideanalysis.com/2013/04/the-promise-of-self-service-bi/ April 2013


Assumption: Users Love Developing Reports

Assumption: Users Love Developing Reports

“Most good programmers do programming not because they expect to get paid or get adulation by the public, but because it is fun to program.”
- Linus Torvalds

Assumption: Users Love Developing Reports

“In fifteen years we’ll be teaching programming just like reading and writing … and wondering why we didn’t do it sooner.”
- Mark Zuckerberg

The Private/Enterprise Data Marketplace

[Diagram: business users, the enterprise data marketplace, data sets]

Potential Data Products

Data as file
Data via SQL
Report
Embeddable KPI
Service
Stream of Data
Apps

Data Warehouse versus Data Marketplace

With an enterprise data warehouse, IT develops what the business requests.

With an enterprise data marketplace, IT develops what they think the business needs.

Enterprise Data Marketplace and the Data Shopper

The data marketplace is a storefront
Users can shop for data products
Private data and public data
Users are shoppers
Internal and external users
Find the data products that meet the users’ needs
Users can develop their own data products to be shared by others

From Tailor-Made to Ready-Made

Types of Data Marketplaces (Data Stores)

Type of Store        Description
Tailor-made stores   Sell data products asked for specifically by customers
Specialty stores     Sell small set of highly specialized data products
Brand stores         Sell only data products they produce themselves
Mom and pop stores   Sell data products others produce
General stores       Sell data products they produce and others produce
eBay-like            Sell data products for third parties
Hyper stores         Sell everything

Features of a Data Marketplace

Data description
  Categorization
  Definitions
  Tags
  Search
Metadata
  Data catalog
  Business glossary
Data security and privacy
Interfaces
  File interface
  Service interface
  SQL interface
  Analytical interface
Data insert
  by owner
  by customers
Price
  Free
  Subscription
  Pay by the sip
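
The features on this slide can be pictured as fields on a catalog entry plus a search function. The following is a minimal Python sketch under that assumption, not the design of any marketplace product: `DataProduct`, `Catalog`, `publish`, `search`, the pricing labels, and the two sample products are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DataProduct:
    """One entry in the marketplace catalog (hypothetical structure)."""
    name: str
    description: str
    category: str
    tags: Set[str] = field(default_factory=set)
    interfaces: Set[str] = field(default_factory=set)  # e.g. "file", "service", "sql"
    pricing: str = "free"  # "free", "subscription", or "pay_by_the_sip"


class Catalog:
    """Keeps published data products and lets shoppers search them."""

    def __init__(self) -> None:
        self._products: List[DataProduct] = []

    def publish(self, product: DataProduct) -> None:
        self._products.append(product)

    def search(self, text: str = "", tag: str = "", interface: str = "") -> List[DataProduct]:
        # Every filter that is supplied must match; empty filters are ignored.
        text = text.lower()
        hits = []
        for p in self._products:
            if text and text not in (p.name + " " + p.description).lower():
                continue
            if tag and tag not in p.tags:
                continue
            if interface and interface not in p.interfaces:
                continue
            hits.append(p)
        return hits


# Hypothetical sample products.
catalog = Catalog()
catalog.publish(DataProduct(
    name="Monthly sales per region",
    description="Aggregated invoice amounts per sales region",
    category="Sales",
    tags={"sales", "kpi"},
    interfaces={"sql", "file"},
    pricing="subscription",
))
catalog.publish(DataProduct(
    name="Customer churn scores",
    description="Churn probability per customer, refreshed weekly",
    category="Marketing",
    tags={"customers", "model"},
    interfaces={"service"},
    pricing="pay_by_the_sip",
))

sql_sales = catalog.search(text="sales", interface="sql")
print([p.name for p in sql_sales])
```

A real marketplace would add security, privacy rules, and a business glossary on top of this; the sketch only shows how description, tags, interfaces, and price can hang off a single catalog entry.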


Challenge 1: Research and Development

Challenge 2: Prioritizing Development of Data Products


Challenge 3: Marketing and Selling Data Products

Challenge 4: Discoverable Data Products

Categories
Descriptions
Definitions
Tags
Metadata
Data catalog
Business glossary

Challenge 5: Who Pays?

Data products are developed before they are requested
Data warehouse reports are paid in advance
Pay by the sip?
Subscription?
What if data products don’t sell?

Challenge 6: Organization

Developers need input from the business
Developers need to understand the business
Current and future needs
BICC not a cost center anymore
The need for commercially-oriented people

Comparison on Characteristics

                         Data Warehouse       Data Lake                  Data Marketplace
User                     Business             Data scientists            Anyone
Deliverable              Data                 Model                      Data product
Development style        Pre-programmed       Investigative              Pre-programmed and investigative
Data usage               Query                Query                      Query and create
Data form                Processed            Raw                        Processed
Payment of development   Before development   Before/while development   After development
Development time         After request        After request              Before request


IT Must Understand the Business

Part 4: Replication of Meta Data Specifications

Many Data Delivery Systems

The classic data warehouse architecture
The data lake
The data marketplace
Data services
Managed file transfer
Data streaming

From Data to Reports

Specifications between source systems and analytics & reporting:
Data structure specifications
Integration specifications
Transformation specifications
Data security specifications
Data cleansing specifications
Analytical specifications
Visualization specifications
Data privacy specifications

Data Delivery System 1: The Data Warehouse

[Diagram: source systems, staging area, data warehouse, and data marts, linked by three ETL steps and feeding analytics & reporting]

Data structure specifications
Integration specifications
Transformation specifications
Data cleansing specifications
Analytical specifications
Visualization specifications

Data Delivery System 2: The Data Lake

[Diagram: data sources, the data lake, investigative analytics, and data science; the connections are labeled ET and ETL]

Data Delivery System 3: The Data Marketplace

[Diagram: business users, the data marketplace, data sources]

Data Delivery System 4: Data Services

[Diagram: client app 1, client app 2, ... client app n; an API gateway; a system exposing APIs]

Data Delivery System 5: Managed File Transfer

[Diagram: file production, network, file processing]

Data Delivery System 6: Data Streaming

[Diagram: producers of data, storage of streaming data, a stream processor, three listeners, consumers of data]

Shared Sources and Shared Users

[Diagram: the same data sources feed all data delivery systems, which serve the same business users]

Data delivery systems:
Data warehouse
Data lake
Data marketplace
Data services
Data file transfer
Data streaming

Drawback: Replicated Specifications

Data warehouse
Data lake
Data marketplace
Data streaming
Data file transfer
Data services

Drawback: Replicated Specifications

[Diagram: Source System 1 and Source System 2 feed the data warehouse, the data lake, and data services, which serve analytics & reporting, data science, and an app; equals signs between the delivery systems mark specifications that are implemented more than once]
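
Why replicated specifications are a drawback can be shown with a small Python sketch. It assumes a single cleansing rule (normalizing country names) that has been implemented once in a data warehouse ETL job and once in a data service; both functions, the mapping values, and the sample inputs are hypothetical and only illustrate how two copies of one specification can drift apart.

```python
from typing import Iterable, List

# Two copies of "the same" cleansing specification, maintained in
# different delivery systems (both functions are hypothetical).


def warehouse_clean_country(value: str) -> str:
    """Copy implemented in the data warehouse ETL."""
    mapping = {"NL": "Netherlands", "NLD": "Netherlands", "HOLLAND": "Netherlands"}
    key = value.strip().upper()
    return mapping.get(key, value.strip())


def data_service_clean_country(value: str) -> str:
    """Copy implemented in a data service; the HOLLAND rule was never carried over."""
    mapping = {"NL": "Netherlands", "NLD": "Netherlands"}
    key = value.strip().upper()
    return mapping.get(key, value.strip())


def find_disagreements(samples: Iterable[str]) -> List[str]:
    """Return the inputs for which the two copies produce different results."""
    return [
        s for s in samples
        if warehouse_clean_country(s) != data_service_clean_country(s)
    ]


samples = ["nl", " NLD ", "Holland", "Belgium"]
print(find_disagreements(samples))
```

The report and the app would show different country values for the same source record, even though both teams believe they implemented the same rule.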

The Solution is not an Extension of the Data Warehouse Architecture

The Solution is not an Extension of the Data Lake


Part 5: Data Virtualization in a Nutshell

Data Virtualization Overview

[Diagram: a data virtualization server between data consumers and data sources]

Data consumers: production application, website, analytics & reporting, mobile app, internal portal, dashboard

Data sources: SQL databases, streaming databases, social media data, Hadoop/NoSQL database, ESB messaging, unstructured data, legacy database, cloud applications, private data, applications

Data Virtualization Overview

[Diagram: the same data virtualization server, now with the interfaces on both sides]

Data consumers: production application, website, analytics & reporting, mobile app, internal portal, dashboard

Interfaces offered to data consumers: ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, XQuery, MDX/DAX

Interfaces used toward data sources: JMS, SQL, SQL+, XSLT, Hive, Prop., Excel, JSON, CICS, SOAP

Example labels in the diagram: SQL statement (consumer side); JMS message, SQL statement, SOAP message (source side)

Data sources: streaming databases, social media data, private data, applications, unstructured data, SQL databases, Hadoop/NoSQL database, ESB messaging, legacy database, cloud applications

Importing Source Data

[Diagram: data consumer; data virtualization server containing a virtual table pointing to source; source]

Page 72

Developing Virtual Tables

[Diagram: inside the Data Virtualization Server, a data consumer accesses a virtual table that is defined on top of a virtual table pointing to a source.]

Virtual table: may contain row selections, column selections, column concatenations, transformations, column and table name changes, groupings, aggregations, data cleansing, …
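As an illustration of the operations listed for a virtual table, the sketch below uses a plain SQL view in an in-memory SQLite database as a stand-in for a virtual table. This is an assumption for demonstration only: the table `src_customer`, the view `v_active_customer`, and all column names are hypothetical, and a real data virtualization server would define virtual tables through its own tooling.

```python
import sqlite3

# In-memory database standing in for a data source (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (
        cust_id INTEGER PRIMARY KEY,
        fname   TEXT,
        lname   TEXT,
        ctry    TEXT,
        status  TEXT
    );
    INSERT INTO src_customer VALUES
        (1, 'ann',  'lee',   'nl', 'A'),
        (2, 'bob',  'stone', 'us', 'I'),
        (3, 'cara', 'diaz',  'nl', 'A');

    -- The "virtual table": only the definition is stored, no data is copied.
    -- It renames a column, concatenates two columns, transforms a value,
    -- and selects only the active customers.
    CREATE VIEW v_active_customer AS
    SELECT cust_id               AS customer_id,
           fname || ' ' || lname AS full_name,
           UPPER(ctry)           AS country_code
    FROM   src_customer
    WHERE  status = 'A';
""")


def query_virtual_table(connection):
    """Return all rows of the virtual table, ordered by customer id."""
    return connection.execute(
        "SELECT customer_id, full_name, country_code "
        "FROM v_active_customer ORDER BY customer_id"
    ).fetchall()


rows = query_virtual_table(conn)
print(rows)  # [(1, 'ann lee', 'NL'), (3, 'cara diaz', 'NL')]
```

The data consumer only sees `v_active_customer`; the source table keeps its original names and rows.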

Page 73

Layers of Virtual Tables

[Diagram: inside the Data Virtualization Server, virtual tables are stacked in three layers between data sources and data consumers: data source layer, enterprise data layer, data consumption layer.]
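The layering can be sketched with views defined on views. The example below is a minimal sketch, assuming SQLite views as a stand-in for virtual tables; the source tables (`crm_cust`, `erp_ord`) and the `src_`, `ent_`, and `rpt_` prefixes are hypothetical naming choices, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two stand-in data sources (hypothetical tables).
    CREATE TABLE crm_cust (id INTEGER, nm TEXT);
    CREATE TABLE erp_ord  (ord_no INTEGER, cust INTEGER, amt REAL);
    INSERT INTO crm_cust VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO erp_ord  VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- Data source layer: one virtual table per source table.
    CREATE VIEW src_customer AS SELECT id, nm FROM crm_cust;
    CREATE VIEW src_order    AS SELECT ord_no, cust, amt FROM erp_ord;

    -- Enterprise data layer: consistent names, sources joined.
    CREATE VIEW ent_customer_order AS
    SELECT c.id     AS customer_id,
           c.nm     AS customer_name,
           o.ord_no AS order_id,
           o.amt    AS amount
    FROM   src_customer c
    JOIN   src_order o ON o.cust = c.id;

    -- Data consumption layer: shaped for one specific report.
    CREATE VIEW rpt_revenue_per_customer AS
    SELECT customer_name, SUM(amount) AS revenue
    FROM   ent_customer_order
    GROUP  BY customer_name;
""")

revenue = conn.execute(
    "SELECT customer_name, revenue FROM rpt_revenue_per_customer "
    "ORDER BY customer_name"
).fetchall()
print(revenue)  # [('Acme', 150.0), ('Globex', 75.0)]
```

Because each layer only refers to the layer below it, a change in a source table name is absorbed in the data source layer and does not reach the report definition.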

Page 74

Different Users Accessing Different Virtual Layers

[Diagram: reporting, self-service BI, and data science users access different layers of virtual tables: data consumption layer, enterprise data layer, source data layer.]

Page 75

Caching to Minimize Access to Data Stores

[Diagram: two virtual tables, one with a cache and one without a cache, each defined on a data source.]
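The difference between a virtual table with and without a cache can be sketched as follows. This is a minimal illustration, not how any particular product implements caching: the `VirtualTable` and `CountingSource` classes and the time-to-live refresh policy are assumptions made for the example.

```python
import time
from typing import Callable, List, Optional


class VirtualTable:
    """Virtual table over a data source, with an optional time-based cache."""

    def __init__(self, fetch: Callable[[], List[tuple]],
                 ttl_seconds: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # stand-in for a query sent to the source
        self._ttl = ttl_seconds    # None means: no cache
        self._clock = clock
        self._cached: Optional[List[tuple]] = None
        self._loaded_at = 0.0

    def read(self) -> List[tuple]:
        now = self._clock()
        fresh = (self._ttl is not None and self._cached is not None
                 and now - self._loaded_at < self._ttl)
        if fresh:
            return self._cached    # served from the cache, source untouched
        rows = self._fetch()       # the source is accessed
        if self._ttl is not None:
            self._cached, self._loaded_at = rows, now
        return rows


class CountingSource:
    """Hypothetical data source that counts how often it is queried."""

    def __init__(self):
        self.hits = 0

    def query(self) -> List[tuple]:
        self.hits += 1
        return [(1, "Acme"), (2, "Globex")]


cached_src, direct_src = CountingSource(), CountingSource()
with_cache = VirtualTable(cached_src.query, ttl_seconds=60)
without_cache = VirtualTable(direct_src.query)

for _ in range(3):
    with_cache.read()
    without_cache.read()

print(cached_src.hits, direct_src.hits)  # 1 3
```

The consumer calls `read()` in both cases; whether the source is accessed is decided inside the virtual table.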

Page 76

[Diagram, no title: labels read Data Virtualization (twice), Data sources, ETL (twice), Cached (twice).]

Page 77

The Market of Data Virtualization Servers

AtScale

Cirro Data Hub

DataVirtuality (Pipes, Pipes Prof, LDW)

Denodo Platform

Dremio

Fraxses

IBM InfoSphere Federation Server & IBM Data Virtualization Manager for z/OS (formerly Rocket Data Virtualization)

Red Hat JBoss Data Virtualization (Teiid)

Stone Bond Enterprise Enabler Virtuoso

Tibco Data Virtualization (formerly Cisco & Composite)

And many more …

Page 78

Part 6: Unifying All the Data Delivery Systems

Page 79

The Unified Data Fabric

[Diagram: data sources pass through Common Data Extraction into the data delivery systems (data warehouse, data lake, data marketplace, data services, data file transfer, data streaming), which reach analytics & reporting through Common Data Delivery.]

Page 80

New Principles for Data Delivery Platforms

One unified data fabric for all the data

Transactional, external, fast (streaming), sensor, …

One unified data fabric for all forms of data consumption

Standard reporting, self-service BI, apps, data science, mobile apps, …

Centralized and active metadata specifications

Searchable definitions and descriptions for technical and business users

Lineage and impact analysis

Data storage and access technology agnostic

Hadoop, SQL, cubes, …

Abstraction

Pushing the processing to the data, not the data to the processing

Decentralized data production

Edge analytics

Hyper-decentralized data production and storage
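Lineage and impact analysis, listed above under the metadata principle, both come down to walking a dependency graph in opposite directions. The sketch below assumes the metadata is available as a simple mapping from each object to the objects it is derived from; the mapping `DERIVED_FROM` and every object name in it are hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata: each object maps to the objects it is derived from.
DERIVED_FROM = {
    "src_customer": ["crm.customers"],
    "src_order": ["erp.orders"],
    "ent_customer_order": ["src_customer", "src_order"],
    "rpt_revenue": ["ent_customer_order"],
    "rpt_churn": ["src_customer"],
}


def _reachable(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def lineage(obj):
    """Upstream: everything `obj` is directly or indirectly derived from."""
    return _reachable(obj, DERIVED_FROM)


def impact(obj):
    """Downstream: everything that would be affected if `obj` changed."""
    used_by = defaultdict(list)
    for target, sources in DERIVED_FROM.items():
        for source in sources:
            used_by[source].append(target)
    return _reachable(obj, used_by)


print(sorted(lineage("rpt_revenue")))
print(sorted(impact("crm.customers")))
```

Keeping the specifications in one central place is what makes both questions answerable with a single traversal.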

Page 81

The Unified Data Fabric

[Diagram: all data (transactional, sensor, external, and streaming data) enters the unified data fabric, whose building blocks are the data hub, data warehouse, master data, and cached data; data delivery connects the fabric to data consumption.]

Page 82

Data Fabric Building Blocks: ODS

Data loaded continuously

Data replication, ESB, messaging

Historic data

Versions

Not integrated data structures, but integrable data structures

Standardized, processed

As close to up to date as possible

Polyglot persistent
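One way to read "historic data" and "versions" is that every change is kept as a new version instead of overwriting the old value. The sketch below is an assumption about what that could look like, not a design taken from the slides; the `VersionedStore` class and its integer timestamps are hypothetical, and it assumes changes for a key arrive in time order.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedStore:
    """Append-only store that keeps every version of every record."""

    def __init__(self):
        # key -> list of (valid_from, value), in order of arrival
        self._versions = defaultdict(list)

    def load(self, key, valid_from, value):
        """Append a new version; assumes increasing valid_from per key."""
        self._versions[key].append((valid_from, value))

    def as_of(self, key, moment):
        """Return the value that was valid at `moment`, or None."""
        versions = self._versions.get(key, [])
        idx = bisect_right([start for start, _ in versions], moment)
        return versions[idx - 1][1] if idx else None


store = VersionedStore()
store.load("customer-1", 1, "Amsterdam")
store.load("customer-1", 5, "Utrecht")   # continuous load: a change arrives

print(store.as_of("customer-1", 3))  # Amsterdam
print(store.as_of("customer-1", 5))  # Utrecht
```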

Page 83

Data Fabric Building Blocks: Data Warehouse

Primarily structured data

Data loaded periodically - ETL

Historic data

Versions

Not integrated data structures, but integrable data structures

Cleansed, standardized, processed

As close to up to date as possible

Polyglot persistent

Auditable and governable

SQL access to make data available to many tools

Page 84

Data Fabric Building Blocks: Master and Cached Data

Master data:

In the original sense of the word

The golden records

The single version of the truth

Fast master data interface

Cached data:

Used to speed up access

For continuous and incidental access

Multiple database technologies

Page 85

Reporting and Analysis

[Diagram: the Unified Data Fabric diagram (all data, data hub, data warehouse, master data, cached data, data delivery, data consumption), shown for reporting and analysis.]

Page 86

Data Science

[Diagram: the Unified Data Fabric diagram (all data, data hub, data warehouse, master data, cached data, data delivery, data consumption), shown for data science.]

Page 87

Mobile Apps

[Diagram: the Unified Data Fabric diagram (all data, data hub, data warehouse, master data, cached data, data delivery, data consumption), shown for mobile apps.]

Page 88

Streaming Data

[Diagram: the Unified Data Fabric diagram (all data, data hub, data warehouse, master data, cached data, data delivery, data consumption), shown for streaming data.]

Page 89

Tools for the Unified Data Fabric (1)

[Diagram: the Unified Data Fabric diagram annotated with example tools: Denodo, SQL Server views, PolyBase, MS Master Data Services, MS SQL Server, MS Parallel DW, Azure SQL Database, Azure SQL DW.]

Page 90

Tools for the Unified Data Fabric (2)

[Diagram: the Unified Data Fabric diagram annotated with example tools: Azure Stream Analytics, Cosmos DB, Azure Data Lake, Hadoop, Power BI, Reporting Services, Azure Databricks, SparkR.]

Page 91

Part 7: Closing Remarks

Page 92

Data usage has changed

Self-service BI

Embedded BI

Supplier- and Customer-driven BI

Applied AI in Text, Image, Video Analysis

Edge Analytics

Data Marketplace

Data Science

Automated decisions

Page 94

Advantages of a Unified Data Fabric

Improved time to market

Define specifications once, and reuse many times

Improved report consistency

One data processing factory

Reduced duplication of data

Easier management of data

Improved transparency

Reduced development costs

Less reinvention of the wheel over and over again

Page 95

Watch Out For Data Delivery Islands!

Page 97

Whitepapers by Rick van der Lans – www.r20.nl

Architecting the Multi-Purpose Data Lake With Data Virtualization, April 2018

The Next Wave of Analytics - At the Edge, December 2017

Data Virtualization in the Time of Big Data, December 2017

Developing a Bi-Modal Logical Data Warehouse Architecture Using Data Virtualization, October 2016

Designing a Logical Data Warehouse, February 2016

Designing a Data Virtualization Environment; A Step-By-Step Approach, January 2016

How Drill Enriches Self-Service Analytics; The Added Value of a SQL-on-Everything Engine; November 2015; sponsored by MapR Technologies

Strengthening Self-Service Analytics With Data Preparation and Data Virtualization, September 2015

Agile Data Modeling: Not an Option: but Essential, April 2015

Streamlining Self-Service BI with Data Virtualization and a Business Directory, March 2015

Migrating to Virtual Data Marts using Data Virtualization, January 2015

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization, November 2014

The New Generation of Self-Service BI; Avoiding Typical Self-Service BI Pitfalls With an Integrated BI Platform

Creating an Agile Data Integration Platform using Data Virtualization

Empowering Operational Business Intelligence with Data Replication

Data Virtualization for Business Intelligence Agility, February 2012

Page 98

Articles by Rick van der Lans – www.r20.nl

Simplifying Big Data Integration with Data Virtualization, October 2017

Data Virtualization for Developing Customer-Facing Apps, August 2017

Do Data Scientists Really Ask for Physical Data Lakes, May 2017

Challenges for Developing Data Lakes, March 2017

What Do You Mean, SQL Can't Do Big Data?, March 2017

When to Use NoSQL, January 2017

The Big BI Dilemma, November 2016

The Roots of the Logical Data Warehouse Architecture, November 2016

The Logical Data Warehouse Architecture is Not the Same as Data Virtualization, October 2016

Data Virtualization is Not the Same as Data Federation, October 2016

The Big BI Dilemma, October 2016

Interview with Rick van der Lans: New Technologies Complementing Traditional BI, September 2016

Analysts and Data Scientists Need SQL-on-Everything, December 2015

Convergence of Data Virtualization Servers and SQL-on-Hadoop Engines?

Polyglot Persistence and Future Integration Costs

Data Virtualization and Data Vault: Double Agility

Big Data Warehouses Require Hybrid Data Storage

Drowning in Data Lakes and Data Reservoirs

An Overlooked Difference Between SQL-on-Hadoop Engines and Classic SQL Database Servers

Data Virtualization: Where Do We Stand Today?

….
