Cloud and Big Data - Dell EMC

Cloud + Big Data – Putting it all Together

Even Solberg

© 2009 VMware Inc. All rights reserved

Upload: lenguyet

Post on 08-Mar-2018

TRANSCRIPT

Big, Fast and Flexible Data

Cloud Delivery Model: Data as a service for private and public clouds

Flexible: OSS Relational, Document, Object, Key / Value

Fast: OLTP workloads, Analytic workloads

Big: Big Data Processing, Big Data Analytics

Big, Fast and Flexible Data

Cloud Delivery Model: Data as a service for private and public clouds

Flexible: OSS Relational, Document, Object, Key / Value

Fast: OLTP workloads, Analytic workloads

Big: Big Data Processing, Big Data Analytics

Products shown: Serengeti, GemFire, vPostgres

Cloud Stack – Neutral View

SaaS

PaaS

IaaS

Big Data IaaS

…but first, some Background. How to build an IaaS Cloud

Central Infrastructure Management

Automated ITIL Process Including Approvals

Automated Provisioning Multi Tenancy IT Service Catalog Resource Distribution Resource Allocation

Customer A

Customer B

Customer C

Customer D

Service Catalog

Service Catalog

Service Catalog

Service Catalog

Users Groups

Users Groups

Users Groups

Users Groups

Administrative Interface / Resource Allocation and Definition

Out Of The Box Integration

Human Interaction

Integration must be built

Generate Ticket 1st Line Support

Performance Mgmt Resource Mgmt Capacity Mgmt Compliance Mgmt

Cost Models Usage Allocation Pay As You Go CB / SB

Exported Billing Information

Service Delivery Management

Customer Portal (https://customer.portal.org): Service Catalog, Workflow engine, SLA Descriptions, Show back, Billing Information

Service Owner

Network & Security: Firewall, VPN, Load Balancer, NAT

ITSM: Ticketing, Change Mgmt, Support

Change & Release Mgmt

Service Renewal

Gold Silver Bronze

Cust C Cust B Cust A

Automated ITIL Process Including Approvals

Automated Provisioning Multi Tenancy IT Service Catalog Resource Distribution Resource Allocation

Customer A

Customer B

Customer C

Customer D

Service Catalog

Service Catalog

Service Catalog

Service Catalog

Users Groups

Users Groups

Users Groups

Users Groups

Administrative Interface / Resource Allocation and Definition

Out Of The Box Integration

Human Interaction

Integration must be built

Generate Ticket 1st Line Support

Gold Silver Bronze

Cust C Cust B Cust A

Performance Mgmt Resource Mgmt Capacity Mgmt Compliance Mgmt

Cost Models Usage Allocation Pay As You Go CB / SB

Exported Billing Information

Service Delivery Management

Customer Portal (https://customer.portal.org): Service Catalog, Workflow engine, SLA Descriptions, Show back, Billing Information

Service Owner

Network & Security: Firewall, VPN, Load Balancer, NAT

ITSM: Ticketing, Change Mgmt, Support

Change & Release Mgmt

Service Renewal

vCNS

vCenter Operations Suite

Service Manager

-- DynamicOps

Service Manager

Application Director

Service Manager / ITBM

Service Manager

vSphere

vCenter Chargeback

vCloud Director

Organization: Finance Organization: Marketing

Org VDC Catalogs Org VDC Catalogs

VMware vSphere

VMware vCenter Server

Resource Pools Datastores Port Groups

Provider Virtual Datacenters

Gold Bronze Silver

Users & Policies Users & Policies
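The layering on this slide (vSphere resources pooled into provider virtual datacenters, which are then carved into per-organization VDCs with their own catalogs and users) can be sketched as a small data model. This is only an illustration: the class names, fields, and capacity figures below are assumptions made for the example and are not the vCloud Director API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProviderVDC:
    """A tier of pooled vSphere resources (hypothetical model, not the vCloud API)."""
    name: str                    # e.g. "Gold", "Silver", "Bronze"
    cpu_ghz: float               # total capacity of the tier (made-up figure)
    allocated_ghz: float = 0.0   # capacity already handed to organizations

    def allocate(self, ghz: float) -> None:
        # Refuse allocations that exceed what the tier still has free.
        if self.allocated_ghz + ghz > self.cpu_ghz:
            raise ValueError(f"{self.name}: not enough capacity for {ghz} GHz")
        self.allocated_ghz += ghz


@dataclass
class OrgVDC:
    """A slice of one provider VDC assigned to a single organization."""
    provider: ProviderVDC
    cpu_ghz: float


@dataclass
class Organization:
    """A tenant with its own users, catalogs, and org VDCs."""
    name: str
    users: List[str] = field(default_factory=list)
    catalogs: List[str] = field(default_factory=list)
    vdcs: List[OrgVDC] = field(default_factory=list)

    def add_vdc(self, provider: ProviderVDC, ghz: float) -> OrgVDC:
        provider.allocate(ghz)
        vdc = OrgVDC(provider, ghz)
        self.vdcs.append(vdc)
        return vdc


# Two organizations sharing one provider tier, as in the diagram.
gold = ProviderVDC("Gold", cpu_ghz=100.0)
finance = Organization("Finance", users=["alice"], catalogs=["db-templates"])
marketing = Organization("Marketing", users=["bob"], catalogs=["web-templates"])
finance.add_vdc(gold, 60.0)
marketing.add_vdc(gold, 30.0)
print(gold.cpu_ghz - gold.allocated_ghz)  # capacity left in the Gold tier
```

The point of the sketch is the isolation boundary: each organization sees only its own VDCs, catalogs, and users, while the provider tier enforces the shared capacity limit.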

Complete Cloud Suite

Virtualization (server, storage, network): vSphere

Management: vFabric Application Director, vCenter Operations Mgmt Suite, vCenter Site Recovery Manager

Cloud Infrastructure: vCloud Director, Software Defined Storage, Software Defined Networking, Software Defined Availability, Software Defined Security

Extensibility: vCloud APIs, vCloud Connector, vCenter Orchestrator

Virtualizing Hadoop Project Serengeti

3 Big Reasons to Virtualize Hadoop

1. Virtualize Hardware

SQL, Hadoop, DSS, NoSQL

Unified Big Data Infrastructure: Private, Public

Big SQL, Hadoop, NoSQL

2. Rapid Provisioning

I want my Hadoop cluster NOW!

3. Leverage Capabilities

Increase Utilization

No single points of failure

VM Isolation

Resource Management

What? Hadoop in a VM? Really?

Actually, Hadoop performs well in a virtual machine

Performance of Hadoop for Several Workloads

[Chart: ratio of time taken to native, y-axis 0 to 1.2, series 1 VM and 2 VMs. Lower is better.]

Fast Provisioning

From a “seed” node to a cluster

Thin Provisioning: 60GB => 3.5GB

Linked Clone: ~6 seconds
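A rough back-of-the-envelope sketch of what these figures mean for a whole cluster. It reads the slide as saying that a 60GB provisioned seed disk occupies about 3.5GB when thin provisioned and that one linked clone takes about 6 seconds. The per-clone delta size, the node count, and the assumption that clones are created one after another are illustrative guesses, not numbers from the talk.

```python
SEED_PROVISIONED_GB = 60.0   # provisioned size of the seed node disk (from the slide)
SEED_ACTUAL_GB = 3.5         # space actually used when thin provisioned (from the slide)
CLONE_SECONDS = 6.0          # approximate time per linked clone (from the slide)


def full_copy_footprint_gb(nodes: int) -> float:
    """Disk used if every node were a full, thick copy of the seed."""
    return nodes * SEED_PROVISIONED_GB


def linked_clone_footprint_gb(nodes: int, delta_gb_per_clone: float) -> float:
    """Disk used when clones share the seed disk and store only their own changes.

    delta_gb_per_clone is an assumed value; real deltas grow as nodes write data.
    """
    return SEED_ACTUAL_GB + nodes * delta_gb_per_clone


def sequential_clone_time_s(nodes: int) -> float:
    """Cloning time if clones are created one at a time (an assumption)."""
    return nodes * CLONE_SECONDS


nodes = 20  # hypothetical cluster size
print(full_copy_footprint_gb(nodes))
print(linked_clone_footprint_gb(nodes, delta_gb_per_clone=0.5))
print(sequential_clone_time_s(nodes))
```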

Being Efficient through Resource Over-commitment

Memory over-commitment
• Hadoop JVMs hold onto memory even when not busy
• vSphere memory overcommit allows us to pack more Hadoop nodes per host
• If you use EM4J, this can be optimized further

Disk over-commitment
• Hadoop is designed for large datasets
• Thin provisioning is very effective at reducing disk footprint
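The packing argument can be made concrete with a small calculation. The host size, VM size, and overcommit ratio below are hypothetical values chosen only to show the arithmetic; a safe ratio depends on how much memory the Hadoop JVMs actually touch and should be measured, not assumed.

```python
import math


def hadoop_nodes_per_host(host_memory_gb: float,
                          node_vm_memory_gb: float,
                          overcommit_ratio: float = 1.0) -> int:
    """Number of Hadoop node VMs that fit on one host.

    overcommit_ratio is configured VM memory divided by physical memory;
    1.0 means no overcommit. All inputs here are illustrative assumptions.
    """
    if host_memory_gb <= 0 or node_vm_memory_gb <= 0:
        raise ValueError("memory sizes must be positive")
    if overcommit_ratio < 1.0:
        raise ValueError("overcommit_ratio must be at least 1.0")
    return math.floor(host_memory_gb * overcommit_ratio / node_vm_memory_gb)


without_overcommit = hadoop_nodes_per_host(96, 16)
with_overcommit = hadoop_nodes_per_host(96, 16, overcommit_ratio=1.5)
print(without_overcommit, with_overcommit)
```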

Performance

Create more, smaller VMs
• Makes Hadoop scale better
• A single large Hadoop node is limited by JVM scalability
• Allows for easier/faster adjustment of packing of VMs across hosts by vSphere (through DRS)

Sizing/configuration of storage is critical
• Plan on ~50 MB/sec of bandwidth per core
• SAN ports/switches will limit performance
• SANs are typically configured by default for IOPS, not bandwidth
• Performance of the backend storage should be tested/sized
• Local disks will give ~100 MB/sec per disk: pick the correct controller
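The two rules of thumb above (~50 MB/sec per core, ~100 MB/sec per local disk) combine into a simple sizing calculation. The sketch below is a starting estimate only: the core counts are hypothetical, and it ignores controller limits and SAN bottlenecks, which the slide says must be tested.

```python
import math

MB_PER_SEC_PER_CORE = 50         # planning figure from the slide
MB_PER_SEC_PER_LOCAL_DISK = 100  # approximate local disk throughput from the slide


def required_bandwidth_mb_s(cores: int) -> int:
    """Storage bandwidth a host should be able to sustain for its cores."""
    return cores * MB_PER_SEC_PER_CORE


def local_disks_needed(cores: int) -> int:
    """Minimum number of local disks to feed the given number of cores."""
    return math.ceil(required_bandwidth_mb_s(cores) / MB_PER_SEC_PER_LOCAL_DISK)


for cores in (8, 12, 16):  # hypothetical host sizes
    print(cores, required_bandwidth_mb_s(cores), local_disks_needed(cores))
```

In other words, under these planning figures a host needs roughly one local disk for every two cores.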

Summary

• Hadoop does work well in a virtual environment
• Plan a virtual cluster, enable other big-data solutions on the same infrastructure
• Leverage the recipes to automate your configuration and deployment

“The big glaring hole [with cloud] is data handling.”

-Adrian Kunzle, MD Head of Engineering & Architecture, JPMorgan Chase

New Ways to Work with Data

NoSQL
• In-memory
• Key/value pairs, simplicity, high productivity
• Different offerings, different data models: document, graph, big table, column

NewSQL
• In-memory
• Scalability benefits of in-memory systems with standardized SQL

Classic SQL
• Traditional RDBMS
• ACID (atomicity, consistency, isolation, durability)
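To make the contrast between key/value access and ACID transactions concrete, here is a minimal sketch. A plain Python dict stands in for a key/value store and an in-memory SQLite database stands in for a classic RDBMS. Both are stand-ins chosen for illustration and are not products mentioned in the talk. The account names and amounts are made up.

```python
import sqlite3

# Key/value style: one key, one value, each write independent of the others.
kv_store = {"account:alice": 100, "account:bob": 50}
kv_store["account:alice"] -= 30  # simple and fast, but no multi-key transaction here

# Classic SQL style: the schema enforces a rule, the transaction enforces atomicity.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money atomically: both updates are committed, or neither is."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            db.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)         # fits within alice's balance
rejected = transfer(conn, "alice", "bob", 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, rejected, balances)
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterwards: the rollback is the "atomicity" in ACID.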

How do you scale the data tier?

vFabric GemFire

Application Data Lives Here

Application Data Sleeps Here

Key Capabilities

Low-latency, linearly-scalable, memory-based data fabric

Data-aware execution

Active/continuous querying and event notification
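The idea behind continuous querying and event notification is that a client registers a query once and is then called back whenever a matching entry is written, instead of polling. The toy class below sketches that pattern in plain Python. It is not the GemFire API: the class, method names, and sample data are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Predicate = Callable[[str, Any], bool]
Listener = Callable[[str, Any], None]


class Region:
    """Toy in-memory key/value region with continuous queries (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._queries: List[Tuple[Predicate, Listener]] = []

    def register_continuous_query(self, predicate: Predicate, listener: Listener) -> None:
        # The query stays registered and is evaluated on every later write.
        self._queries.append((predicate, listener))

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value
        for predicate, listener in self._queries:
            if predicate(key, value):
                listener(key, value)  # push the event to the interested client

    def get(self, key: str) -> Any:
        return self._data.get(key)


alerts: List[str] = []
trades = Region()
trades.register_continuous_query(
    lambda key, trade: trade["qty"] > 1000,  # the standing query
    lambda key, trade: alerts.append(key),   # the notification callback
)
trades.put("t1", {"symbol": "VMW", "qty": 500})
trades.put("t2", {"symbol": "EMC", "qty": 5000})
print(alerts)
```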

Primary Use Cases

Web session cache, L2 cache

App data cache, in-memory DB

Grid data fabric: client compute

Grid data fabric: fabric compute
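As an illustration of the "app data cache" use case, here is a minimal read-through cache: the application asks the cache, and only a miss reaches the slower backing store. The class and the dict used as a stand-in database are assumptions for the example and say nothing about how GemFire implements caching.

```python
from typing import Any, Callable, Dict


class ReadThroughCache:
    """Serve reads from memory and load from the backing store only on a miss."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._entries: Dict[str, Any] = {}
        self.misses = 0

    def get(self, key: str) -> Any:
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._loader(key)  # fetch once, keep in memory
        return self._entries[key]


# A dict stands in for the database behind the cache (hypothetical data).
database = {"user:1": {"name": "Ada"}, "user:2": {"name": "Linus"}}
cache = ReadThroughCache(loader=database.__getitem__)

first = cache.get("user:1")   # miss: loaded from the backing store
second = cache.get("user:1")  # hit: served from memory
print(first, second, cache.misses)
```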

vFabric Data Director

Provisioning Backup / Restore Clone One click HA

DBA App Dev

Automation Self-Service

Resource Mgmt

Security Mgmt

Database Templates Monitor

IT Admin

Policy Based Control

DBA

Existing Applications New Applications

Public Cloud Private Cloud Hybrid Cloud

Big Data PaaS Cloud Foundry & vFabric

Cloud Stack – Neutral View

SaaS

PaaS

IaaS

Cloud Stack – Classic Pyramid

SaaS PaaS IaaS

Cloud Stack – By Numbers

SaaS PaaS IaaS

Cloud Stack – By Value

SaaS

IaaS

PaaS

Big Data PaaS Architecture

Infrastructure as a Service (IaaS)

Coordination

Data Integration

Data Process

U-Data Store

Graph Store

Read / Write Access

Languages

Workflow Scheduling Metadata

Analytics

UI Framework Other Application Services

Big Data API

Business Intelligence Applications

Application Lifecycle Management

Security

Systems Monitoring & Management

OSS community

Data Services

Other Services

Msg Services

vFabric Postgres

vFabric RabbitMQ™

Additional partner services …

Data Services

Other Services

Msg Services

Private Clouds

Public Clouds

Micro Clouds

.COM

Partners

VMware Cloud Application Platform

Virtual Datacenter Cloud Infrastructure and Management

Rich Web

Programming Model

Social and Mobile

Data Access

Integration Patterns

Batch Framework

WaveMaker Spring Tool Suite

Cloud Foundry

App Monitoring (Spring Insight)

Performance Mgmt (Hyperic)

Automated App Provisioning (AppDirector)

Java Optimizations (EM4J, …)

Java Runtime (tc Server)

Web Runtime (ERS)

Messaging (RabbitMQ)

Global Data (GemFire)

In-mem SQL (SQLFire)

Big Data SaaS Cetas

Data Sources

On-Premise Installation

Cloud-based Installation

Summary

Big, Fast and Flexible Data

Cloud Delivery Model: Data as a service for private and public clouds

Flexible: OSS Relational, Document, Object, Key / Value

Fast: OLTP workloads, Analytic workloads

Big: Big Data Processing, Big Data Analytics

Products shown: Serengeti, GemFire, vPostgres
