Red Hat JBoss Data Virtualization

www.dlt.com Red Hat JBoss Data Virtualization July, 2016 Rick Stewart, Middleware SA Herndon, VA

Upload: dlt-solutions

Post on 20-Jan-2017


TRANSCRIPT

Page 1: Red Hat JBOSS Data Virtualization

www.dlt.com

Red Hat JBoss Data Virtualization

July, 2016

Rick Stewart, Middleware SA

Herndon, VA

Page 2: Red Hat JBOSS Data Virtualization

7/19/16 DLT Solutions LLC – Proprietary & Confidential 2

“Kiss” “Whitesnake” “Poison”

“Bad Company”

Data Warehouse

Page 3: Red Hat JBOSS Data Virtualization

“Bad Company”

“Kiss” “Whitesnake” “Poison”

Data Warehouse

Data Virtualization Server

Page 4: Red Hat JBOSS Data Virtualization

What does Data Virtualization software do?

DATA CONSUMERS

BI Reports

Applications

Virtual Consolidated Data Source

Data Virtualization Software
•Consume
•Compose
•Connect

DATA SOURCES

SAP; Salesforce.com; Oracle DW; XML, CSV & Excel files

Siloed & Complex

Virtualize, Abstract, Federate

Easy, Real-time Information Access

Page 5: Red Hat JBOSS Data Virtualization

“Bad Company”

“Kiss” “Whitesnake” “Poison”

Data Warehouse

Data Virtualization Server

Page 6: Red Hat JBOSS Data Virtualization

“Bad Company”

“Kiss” “Whitesnake” “Poison”

Data Warehouse

Data Virtualization Server

Page 7: Red Hat JBOSS Data Virtualization

Data Challenges Getting Bigger

BI Reports; Operational Reports; Enterprise Applications; Cloud Native Applications; Mobile Applications

Hadoop; NoSQL; Cloud Apps; Data Warehouse & Databases; Mainframe; XML, CSV & Excel Files; Enterprise Apps

Integration Complexity

Consumption & Creation

Siloed

How to Integrate?

Page 8: Red Hat JBOSS Data Virtualization

Improve Access to Your Data

BI Reports; Operational Reports; Enterprise Applications; Cloud Native Applications; Mobile Applications

Hadoop; NoSQL; Cloud Apps; Data Warehouse & Databases; Mainframe; XML, CSV & Excel Files; Enterprise Apps

Broad & Streamlined

Adaptable & Secure

Federated & Meaningful

Data Virtualization Server

Page 9: Red Hat JBOSS Data Virtualization

Simplify Access to Your Data

streaming databases; social media data; production application; big data stores; website; ESB; analytics & reporting; unstructured data; mobile App; data warehouse & data marts; internal portal dashboard; external data; private data; production databases; applications

ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, OData, SQL

JMS, SQL, JDBC, OData, Hive, RSS, Excel, JSON, REST, SOAP

JMS message, SQL statement, SOAP message

Data Virtualization Server

Page 10: Red Hat JBOSS Data Virtualization

Turn Siloed Data into Actionable Information

Data Consumers
BI Reports & Analytics; Mobile Applications; Applications & Portals; ESB, ETL

JBoss Data Virtualization

Consume
Standard based Data Provisioning: JDBC, ODBC, SOAP, REST, OData

Compose
Unified Virtual Database / Common Data Model
Data Transformations

Connect
Native Data Connectivity

Design Tools; Dashboard; Optimization; Caching; Security; Metadata

Data Sources
Hadoop; NoSQL; Cloud Apps; Data Warehouse & Databases; Mainframe; XML, CSV & Excel Files; Enterprise Apps

Siloed & Complex

Virtualize, Transform, Federate

Easy, Real-time Information Access

Page 11: Red Hat JBOSS Data Virtualization

Supported Data Sources


Enterprise RDBMS:

•Oracle

•IBM DB2

•Microsoft SQL Server

•Sybase ASE

•MySQL

•MariaDB

•PostgreSQL

•Ingres

Enterprise EDW:

•Teradata

•Netezza

•Greenplum

Search:

•Apache Solr

Hadoop:

•Apache

•Hortonworks

•Cloudera

•More coming…

Office Productivity:

•Microsoft Excel

•Microsoft Access

•Google Spreadsheets

Specialty Data Sources:

•ModeShape Repository

•Mondrian

•MetaMatrix

•LDAP

•Apache POI for Excel

NoSQL:

•JBoss Data Grid

•MongoDB

•Cassandra

•More coming…

Enterprise & Cloud Applications:

•Salesforce.com

•SAP

Technology Connectors:

•Flat Files, XML Files, XML over HTTP

•SOAP Web Services

•REST Web Services

•OData Services


Page 12: Red Hat JBOSS Data Virtualization

Data As A Service

Contextual view of disparate source data

Single point of access

Standard based interfaces

Shareable integration and transformation logic

Reusable data services

But you cannot achieve this by writing more application code…

Hadoop NoSQL Cloud Apps Data Warehouse & Databases

Mainframe XML, CSV& Excel Files

Enterprise Apps

JBoss Data Virtualization

BI Dashboard & Reports

Analytical Applications

ESB/SOA Integration

BPM Applications

Mobile Applications

SQL Statement, SOAP Message, REST Message

REST Request

JSON Result

SQL Request

SQL Result

Page 13: Red Hat JBOSS Data Virtualization

Logical Architecture


Data Consumers

Data Sources

Page 14: Red Hat JBOSS Data Virtualization

Teiid Data Virtualization Designer


Page 15: Red Hat JBOSS Data Virtualization


Tooling VirtualDB Engine Server

Page 16: Red Hat JBOSS Data Virtualization

Tooling VirtualDB Engine Server

Users create data models based on metadata:
•Imported from data sources
•Supplied via DDL
•Provided by Engine
•Specified by user

Models are packaged in a Virtual Database (VDB)

Physical Models representing actual data sources

Logical Models
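The split between physical and logical models can be illustrated with a small stand-in. This is only a sketch, not JBoss Data Virtualization itself: it assumes two in-memory SQLite databases play the role of separate sources, and the names `crm`, `erp`, `customers`, `orders`, and `customer_totals` are hypothetical. The tables act as physical models, and the view acts as a logical model that hides where the data lives.

```python
import sqlite3

# Stand-ins for two independent data sources (assumption: in a real
# deployment these would be separate systems, not SQLite databases).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS erp")

# "Physical models": tables that mirror what each source actually stores.
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
conn.executemany(
    "INSERT INTO erp.orders VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 75.0)]
)

# "Logical model": a view that joins both sources.
# SQLite only allows TEMP views to span attached databases.
conn.execute("""
    CREATE TEMP VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM crm.customers AS c
    JOIN erp.orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
""")


def customer_totals():
    """Query the logical model without naming either source."""
    return conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    ).fetchall()
```

A consumer calls `customer_totals()` and never sees the `crm` or `erp` names, which is the point of layering logical models over physical ones.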

Page 17: Red Hat JBOSS Data Virtualization

Tooling VirtualDB Engine Server

Build XML Document models from XML Schemas

Map XML Document models to other data models

Enable data access via XML

Page 18: Red Hat JBOSS Data Virtualization

Tooling VirtualDB Engine Server

Virtual Databases (VDBs) are deployment archives similar to .WAR files.

VDBs contain
•Source metadata and models
•View metadata and models
•System metadata
•Connection information, which is bound to sources at deployment time

VDBs are deployed to the query engine

VDB Internals

Source Models

Connector Binding Properties

View Models

Manifest Info
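The idea that connection information is bound to sources at deployment time can be sketched as follows. This is an illustration under stated assumptions, not the real VDB archive format or any Teiid API: the classes `SourceModel` and `VirtualDatabase`, the function `deploy`, and the connection names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SourceModel:
    name: str            # model name as seen inside the VDB
    connection_ref: str  # logical connection name, resolved at deployment


@dataclass
class VirtualDatabase:
    name: str
    source_models: list = field(default_factory=list)
    view_models: dict = field(default_factory=dict)  # view name -> defining SQL


def deploy(vdb, environment):
    """Bind each source model to the connection this environment provides."""
    bindings = {}
    for model in vdb.source_models:
        if model.connection_ref not in environment:
            raise LookupError(f"no connection named {model.connection_ref!r}")
        bindings[model.name] = environment[model.connection_ref]
    return bindings


# The same VDB definition is deployed against two environments.
sales = VirtualDatabase(
    name="Sales",
    source_models=[SourceModel("Customers", "crmDS"), SourceModel("Orders", "erpDS")],
    view_models={"CustomerTotals": "SELECT ... FROM Customers JOIN Orders ..."},
)
test_env = {"crmDS": "test-crm-connection", "erpDS": "test-erp-connection"}
prod_env = {"crmDS": "prod-crm-connection", "erpDS": "prod-erp-connection"}
```

Because the archive carries only logical connection names, the same VDB can move from test to production without its models changing.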

Page 19: Red Hat JBOSS Data Virtualization

Tooling VirtualDB Engine Server

JBoss Data Virtualization can offer finer-grained security control:
•Authentication: Kerberos, LDAP, WS-UsernameToken, HTTP Basic, SAML
•Authorization: Virtual data views, Role based access control
•Administration: Centralized management of Virtual DB privileges
•Audit: Centralized audit logging and dashboard
•Protection: Row and column masking, SSL encryption (ODBC and JDBC)
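Role based access control combined with row and column masking can be sketched in a few lines. This is a conceptual illustration only, not the product's policy syntax: the roles, the `POLICIES` table, and the functions `mask_column` and `secure_view` are hypothetical.

```python
def mask_column(value, visible=4):
    """Replace all but the last `visible` characters with '*'."""
    text = str(value)
    if len(text) <= visible:
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]


# Hypothetical policy: which rows a role may see and which columns are masked.
POLICIES = {
    "analyst": {"row_filter": lambda row: row["region"] == "EAST", "masked": {"ssn"}},
    "auditor": {"row_filter": lambda row: True, "masked": set()},
}


def secure_view(rows, role):
    """Return only the rows and column values the given role may see."""
    policy = POLICIES[role]  # an unknown role raises KeyError: deny by default
    result = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level restriction
        result.append(
            {
                key: mask_column(value) if key in policy["masked"] else value
                for key, value in row.items()
            }
        )
    return result
```

Applying the policy in the virtual layer means every consumer gets the same restrictions, whichever interface it connects through.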

Page 20: Red Hat JBOSS Data Virtualization


Tooling VirtualDB Engine Server

Query Engine

JDBC API

VDB

Connector Binding (1)

Connector Binding (2)

C1 C2

Oracle DB

SQL Server DB

Data Consumer Apps

The Query Engine provides the core data virtualization functionality: a federating relational query engine with a rule- and cost-based optimizer, an advanced query planner, caching, and hint processing.

The Query Engine hosts VDBs, binds to data sources, and performs query execution and results processing.

Page 21: Red Hat JBOSS Data Virtualization


Tooling VirtualDB Engine Server

The Teiid Query engine is hosted in JBoss EAP and uses key container-provided services:

•Transaction manager

•JAAS security framework

•Container managed data sources

•EAP management infrastructure

•EAP deployment

The Server exposes views and services to consumers and manages connections and connection pools for data sources.

JBoss EAP
Applications
Security: JAAS
Transaction Manager
JDV Runtime Engine: BufferMgr, Threading, Local Caches, etc.
VDBs
ODBC Socket Transport, Admin Socket Transport, JDBC Socket Transport
Profile Service
ODBC, JDBC, Admin / AdminShell, JON
DS, JCA, Translators
Embedded DS
xxx-ds.xml, yyy-ds.xml, zzz-ds.xml

Page 22: Red Hat JBOSS Data Virtualization

Tooling VirtualDB Engine Server

CACHING & MATERIALIZATION

Multiple levels of caching to meet performance requirements and manage load on source systems:

Materialized Views
–External or internal materialized views
–Ability to override use of materialized views

Result Set Caching
–Applied to results returned from user queries and virtual procedure calls
–Configurable time to live and max. number of entries

Code Table Caching
–Suited for integrating reference data with transaction/operational data, e.g. country code, state code, etc.
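Result set caching with a configurable time to live and maximum number of entries can be sketched as below. This is a minimal illustration, not the engine's implementation: the class `ResultSetCache` is hypothetical, and evicting the least recently used entry when the cache is full is an assumption, since the slide does not name an eviction policy.

```python
import time
from collections import OrderedDict


class ResultSetCache:
    """Cache query results with a time to live and a maximum entry count."""

    def __init__(self, ttl_seconds, max_entries, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.clock = clock
        self._entries = OrderedDict()  # query text -> (expiry time, rows)

    def get(self, query, run_query):
        now = self.clock()
        entry = self._entries.get(query)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(query)  # mark as recently used
            return entry[1]
        rows = run_query(query)  # miss or expired entry: go to the source
        self._entries[query] = (now + self.ttl, rows)
        self._entries.move_to_end(query)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return rows
```

A repeated query inside the time to live is served from memory, so the source system sees one request instead of many.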

QUERY

Access Patterns – criteria requirements on pushdown queries

Pushdown – decompose user query into source queries
–Projection minimization to remove unused select items
–Decompose aggregates over joins/unions
–Generating SQL matching Teiid system functions

Dependent Joins (can use hints) – feed equi-join values from one side of the join to the other

Partition aware aggregation and joins

Copy Criteria – uses criteria transitivity to minimize join tuples.

PERFORMANCE OPTIMIZATION
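The dependent join listed on this slide can be sketched with two independent stand-in sources. This shows the general technique only, not Teiid's planner: the helper `make_source`, the function `dependent_join`, and the `customers` and `orders` tables are hypothetical, and in-memory SQLite databases stand in for real sources.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database standing in for one source."""
    db = sqlite3.connect(":memory:")
    db.execute(ddl)
    db.executemany(insert_sql, rows)
    return db


def dependent_join(customer_db, order_db, region):
    # Step 1: run the selective side of the join first.
    customers = customer_db.execute(
        "SELECT id, name FROM customers WHERE region = ?", (region,)
    ).fetchall()
    if not customers:
        return []
    # Step 2: feed its join keys to the other source as an IN criterion,
    # so only matching rows leave that source.
    keys = [customer_id for customer_id, _ in customers]
    placeholders = ", ".join("?" for _ in keys)
    orders = order_db.execute(
        f"SELECT customer_id, amount FROM orders "
        f"WHERE customer_id IN ({placeholders})",
        keys,
    ).fetchall()
    # Step 3: finish the join in the engine on the reduced row sets.
    names = dict(customers)
    return sorted((names[customer_id], amount) for customer_id, amount in orders)
```

The benefit grows with the size difference between the two sides: the large source returns only rows whose keys already matched, rather than its whole table.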

Page 23: Red Hat JBOSS Data Virtualization

Business Dashboard


Page 24: Red Hat JBOSS Data Virtualization

Bring It All Together

Data Integration: JBoss Data Virtualization

In-memory Cache: JBoss Data Grid

Messaging and Event Processing: JBoss A-MQ and JBoss BRMS

BI Analytics (historical, operational, predictive)

Composite Applications

Hadoop

Red Hat Storage

Structured Data, Streaming Data, Semi-Structured Data

Capture & Process

Integrate & Analyze

Page 25: Red Hat JBOSS Data Virtualization

Questions?

Page 26: Red Hat JBOSS Data Virtualization

Bring It All Together


Page 27: Red Hat JBOSS Data Virtualization

Thank You!

Page 28: Red Hat JBOSS Data Virtualization

JBoss Data Virtualization – Use Cases


Self-Service Business Intelligence

The virtual, reusable data model provides a business-friendly representation of data. Users can interact with their data without knowing the complexities of the underlying databases or where the data is stored, and multiple BI tools can acquire data from a centralized data layer. Gain better insights from Big Data by using JBoss Data Virtualization to integrate it with existing information sources.

360° Unified View

Deliver a complete view of master & transactional data in real-time. The virtual data layer serves as a unified, enterprise-wide view of business information that improves users’ ability to understand and leverage enterprise data.

Agile SOA Data Services

A data virtualization layer delivers the data services layer that SOA applications are missing. JBoss Data Virtualization increases agility and loose coupling through virtual data stores that leave the underlying sources untouched, and through data services that encapsulate the data access logic and allow multiple business services to acquire data from a centralized data layer.

Regulatory Compliance

A data virtualization layer delivers data firewall functionality. JBoss Data Virtualization improves data quality via centralized access control, a robust security infrastructure, and a reduction in physical copies of data, thus reducing risk. Furthermore, the metadata repository catalogs enterprise data locations and the relationships between the data in various data stores, enabling transparency and visibility.

Page 29: Red Hat JBOSS Data Virtualization

A B C D

JBoss Data Virtualization

Leveraged TPC-H like schema, data and queries

Used 4 different commercial enterprise RDBMS

Each database held 1 TB of data representing
•150 million customers, with over
•600 million order records, and
•6 billion order line items.
•Total 4 TB of data

Findings:
•No measurable JDV query overhead vs. direct queries
•Queries to federated data from four data sources ran 61.7 percent faster vs. baseline
•Scaling query workload by 2x resulted in <10% impact on response time

Download Benchmark Study @ http://www.redhat.com/en/resources/jboss-data-virtualization-query-performance-benchmark-study