benchmarking sahara based big data as a service solutions

24
Benchmarking Sahara-based Big-Data-as-a-Service Solutions Zhidong Yu, Weiting Chen (Intel) Trevor McKay (Red Hat) May 2015

Upload: zhidong-yu

Post on 11-Aug-2015

293 views

Category:

Technology


2 download

TRANSCRIPT

Benchmarking Sahara-based

Big-Data-as-a-Service Solutions

Zhidong Yu, Weiting Chen (Intel)

Trevor McKay (Red Hat)

May 2015

o Why Sahara

o Sahara introduction

o Deployment considerations

o Performance testing and results

o Future envisioning

o Summary and Call to Action

2

Agenda

o You or someone at your company is using AWS, Azure, or Google

o You’re probably doing it for easy access to OS instances, but also the modern application features, e.g. AWS’ EMR or RDS or Storage

o Migrating to, or even using, OpenStack infrastructure for workloads means having application features, e.g. Sahara & Trove

o Writing applications is complex enough without having to manage supporting (non-value-add) infrastructure

3

Why Sahara: Cloud features

o Writing data analysis applications are especially hard

o Complexity in acquiring data

o Complexity in organizing (ETL) data

o Complexity in analyzing data

o Complexity in integrating analysis into applications (bonus!)

o This compounded complexity does not even include the necessary tooling and infrastructure

4

Why Sahara: Data analysis

o Why Sahara

o Sahara introduction

o Deployment considerations

o Performance testing and results

o Future envisioning

o Summary and Call to Action

5

Agenda

o Repeatable cluster provisioning and management

operations

o Data processing workflows (EDP)

o Cluster scaling (elasticity), Storage integration

(Swift, Cinder, HCFS)

o Network and security group (firewall) integration

o Service anti-affinity (fault domains & efficiency)

6

Sahara features

7

Sahara

architecture

o Users get choice of integrated data progressing

engines

o Vendors get a way to integrate with OpenStack and

access users

o Upstream - Apache Hadoop (Vanilla), Hortonworks,

Cloudera, MapR, Apache Spark, Apache Storm

o Downstream - depends on your OpenStack vendor

8

Sahara plugins

o Why Sahara

o Sahara introduction

o Deployment considerations

o Performance testing and results

o Future envisioning

o Summary and Call to Action

9

Agenda

10

Storage Architecture#2 #4#3#1

Host

HDFS

VM

Comput-

ing Task

VM

Comput-

ing Task

Host

HDFS

VM

Comput-

ing Task

VM

HDFS

Host

HDFS

VM

Comput-

ing Task

VM

Comput-

ing Task

Legacy

NFSGlusterFS Ceph

External

HDFSSwift

HDFS

Scenario #1: computing and data service collocate in the

VMs

Scenario #2: data service locates in the host world

Scenario #3: data service locates in a separate VM world

Scenario #4: data service locates in the remote network

o Tenant provisioned (in VM)o HDFS in the same VMs of computing tasks vs. in

the different VMso Ephemeral disk vs. Cinder volume

o Admin providedo Logically disaggregated from computing taskso Physical collocation is a matter of deploymento For network remote storage, Neutron DVR is very

useful feature

o A disaggregated (and centralized) storage system has significant valueso No data silos, more business opportunitieso Could leverage Manila serviceo Allow to create advanced solutions (.e.g. in-memory

overlayer)o More vendor specific optimization opportunities

o Container seems to be promising but still need better support

o Determining the appropriate cluster size is always a challenge to tenants

o e.g. small flavor with more nodes or large flavor with less nodes

11

Compute EnginePros Cons

VM

• Best support in OpenStack

• Strong security

• Slow to provision

• Relatively high runtime performance overhead

Container

• Light-weight, fast provisioning

• Better runtime performance than VM

• Nova-docker readiness

• Cinder volume support is not ready yet

• Weaker security than VM

• Not the ideal way to use container

Bare-Metal

• Best performance and QoS

• Best security isolation

• Ironic readiness

• Worst efficiency (e.g. consolidation of workloads with

different behaviors)

• Worst flexibility (e.g. migration)

• Worst elasticity due to slow provisioning

o Direct Cluster Operations

o Sahara is used as a provisioning engine

o Tenants expect to have direct access to the virtual cluster

o e.g. directly SSH into the VMs

o May use whatever APIs comes with the distro

o e.g. Oozie

o EDP approach

o Sahara’s EDP is designed to be an abstraction layer for tenants to consume the services

o Ideally should be vendor neutral and plugin agnostic

o Limited job types are supported at present

o 3rd party abstraction APIs

o Not supported yet

o e.g. Cask CDAP

Data Processing API

Deployment Considerations Matrix

Storage

Compute

Distro/Plugin

Data Processing API

Vanilla CDH HDP MapRSpark

VM Container Bare-metal

Tenant vs. Admin

provisioned

Disaggregated vs. Collocated HDFS vs. other options

Traditional EDP

(Sahara native)3rd party APIs

Storm

Performance results in the next section

o Why Sahara

o Sahara introduction

o Deployment considerations

o Performance testing and results

o Future envisioning

o Summary and Call to Action

14

Agenda

● Host OS: CentOS7

● Guest OS: CentOS7

● Hadoop 2.6.0

● 4-Nodes Cluster

● Baremetal

●OpenStack using KVM○qemu-kvm v1.5

●OpenStack using Docker○Docker v1.6

15

Testing Environment

Ephemeral Disk Performance

Host

HDFS

Host

Nova Inst.

Store

VM

HDFS

VM

HDFS

RAID

…..

…..vs.

● 1.3x read overhead, 2.1x overhead

○ disk access pattern change: 10%

○ virtualization overhead: 90%

■ 60% due to I/O overhead

■ 30% due to memory efficiency in virtualization

Heavy tuning is required.

1.3x overhead

2.1x overhead

o Performance difference comes from several factors:

o Free of I/O virtualization overhead

o DVR bring huge performance enhancement

o Location awareness(HVE) is not enabled

External HDFS Performance

Host

HDFS

VMVM

Host

Nova Inst.

Store

VM

HDFS

VM

HDFS

RAID

…..

….. vs.

RAID

2x improvement

1.3x overhead

Host

Swift Performance

o Similar to external HDFS but even worse

o Without enabling location awareness, all the traffic go through the Swift

proxy node

Swift

…..

VMVM

Host

Nova Inst.

Store

VM

HDFS

VM

HDFS

RAID

…..

….. vs.

RAID

1.35x overhead

● For an I/O intensive workload, 2.19x overhead is big but is consistent with previous

results.

● Container demonstrates a fair performance compared to KVM

○ considering OpenStack services also consumes resources, 0.46x is not that bad.

Bare-metal vs. Container vs. VM

Host

Container

Host Host

VM

HDFS

HDFS

HDFS

2.19X

1.46X

1X

o Why Sahara

o Sahara introduction

o Deployment considerations

o Performance testing and results

o Future envisioning

o Summary and Call to Action

20

Agenda

o Architectured for disaggregated computing and storage

o Supporting more storage backend

o Integration with Manila

o Better support for container and bare-metal (Nova-docker and Ironic)

o Magnum as an alternative?

o EDP as a PaaS like layer for Sahara core provisioning engine

o Data connector abstraction

o Workflow management

o Policy engine for resource and SLA management

o Auto-scale, auto-tune

o Sahara needs to offer broader vendor integration opportunities (not just engines)

o A complete big data stack may have many options at each layer

o e.g. acceleration libraries, analytics, developer oriented application framework (e.g.

CDAP)

o Requires a more generic plugin/driver framework to support it

21

Future of Sahara? (NOT a roadmap)

o Upgrade HDP plugin to HDP 2.2

oHadoop HA for CDH and HDP

oEnhancements to provisioning through Heat

oBring Sahara to python-openstackclient

oBare-metal clusters with Ironic

oSecurity enhancements

oEDP enhancements

oJob scheduling

oCoordination

oLog retrieval

o Improved parameter specification

22

Liberty Roadmap Highlights

o Great improvement in Sahara Kilo release. Production ready with

real customer deployments.

o A complete Big-Data-as-a-Service solution requires more

considerations than simply adding a Sahara service to the

existing OpenStack deployment

o Preliminary benchmark results show the performance gap with

bare-metal is still huge. Tuning and optimizations are required.

o Many features could be added to enhance Sahara. Opportunities

exist for various types of vendors.

23

Summary and Call-to-Action

Join in the Sahara community and make it even more vibrant!

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at [intel.com].

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Include as footnote in appropriate slide where performance data is shown:o §Configurations: [describe config + what test used + who did testing] o §For more information go to http://www.intel.com/performance.

Intel, the Intel logo, {List the Intel trademarks in your document} are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2015 Intel Corporation.

24

Disclaimer