Enabling high performance big data platform with …
TRANSCRIPT
2
Shortcomings of Hadoop
• Administration tooling
• Performance
• Reliability
• SQL support
• Backup and recovery
From 451 Research 2013 Hadoop survey
3
Where can we improve Hadoop?
• Issues
– Inherent data latency in HDFS
– Cannot support a large number of small files
– Efficiency of MapReduce, HBase, Hive, etc.
• High demand to improve
– Real-time operation
– Fast execution
– Streaming data
[Diagram: Hadoop stack showing HDFS™ (Hadoop Distributed File System), MapReduce, HBase, Hive, Pig, and SQL (e.g. Impala)]
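The small-files issue can be made concrete with back-of-the-envelope arithmetic. The sketch below is not from the slides. It assumes the commonly quoted rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file or block) and a 128 MiB block size, and the function name `namenode_heap_bytes` is hypothetical.

```python
# Rule-of-thumb estimate of NameNode heap consumed by file metadata.
# ASSUMPTION: ~150 bytes per namespace object (file or block). This is a
# commonly quoted approximation, not a figure taken from the slides.
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimate heap used by num_files files of blocks_per_file blocks each."""
    # Each file contributes one file object plus one object per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    total_bytes = 10 * 1024**3       # the same 10 GiB of data in both cases
    block_size = 128 * 1024**2       # ASSUMPTION: 128 MiB HDFS block size
    small_file_size = 10 * 1024      # ASSUMPTION: 10 KiB per small file

    large_files = total_bytes // block_size       # 80 files
    small_files = total_bytes // small_file_size  # 1,048,576 files

    print(f"{large_files:,} large files: "
          f"{namenode_heap_bytes(large_files) / 1024:.1f} KiB of heap")
    print(f"{small_files:,} small files: "
          f"{namenode_heap_bytes(small_files) / 1024**2:.1f} MiB of heap")
```

Under these assumptions the same 10 GiB of data costs about 23 KiB of NameNode heap as 80 block-sized files, but about 300 MiB as roughly a million 10 KiB files.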
4
HDFS Operation
[Diagram: a client, NameNodes (HDFS Federation), and three DataNodes holding replicated blocks, with write, read, and replication paths]
• HDFS Federation
• Faster Disks
• Faster CPU and Memory
• Bigger network pipe
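The write, read, and replication paths in the HDFS Operation slide can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions, not HDFS code. The class and function names are hypothetical, placement is plain round-robin rather than rack-aware, and a block is forwarded whole instead of being streamed in packets.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def receive(self, block_id: int, data: bytes, downstream: list) -> None:
        """Store the block, then forward it to the next node in the pipeline."""
        self.blocks[block_id] = data
        if downstream:
            downstream[0].receive(block_id, data, downstream[1:])


class NameNode:
    """Holds metadata only: which DataNodes store which block."""

    def __init__(self, datanodes: list, replication: int = 3):
        self.datanodes = datanodes
        self.replication = replication
        self.block_map = {}  # block_id -> list of DataNode

    def allocate(self, block_id: int) -> list:
        # ASSUMPTION: round-robin placement; real HDFS placement is rack-aware.
        n = len(self.datanodes)
        targets = [self.datanodes[(block_id + i) % n]
                   for i in range(self.replication)]
        self.block_map[block_id] = targets
        return targets


def write_block(namenode: NameNode, block_id: int, data: bytes) -> int:
    """Write one block through the replication pipeline.

    Returns the number of bytes that crossed the network: one hop from the
    client to the first DataNode plus one hop per additional replica.
    """
    pipeline = namenode.allocate(block_id)
    pipeline[0].receive(block_id, data, pipeline[1:])
    return len(data) * len(pipeline)


def read_block(namenode: NameNode, block_id: int) -> bytes:
    """Ask the NameNode for locations, then read from the first replica."""
    return namenode.block_map[block_id][0].blocks[block_id]


if __name__ == "__main__":
    nodes = [DataNode(f"dn{i}") for i in range(4)]
    nn = NameNode(nodes)
    sent = write_block(nn, block_id=1, data=b"x" * 1024)
    print("bytes on the wire:", sent)
    print("replicas:", [dn.name for dn in nodes if 1 in dn.blocks])
```

In this model every written byte crosses the network once per replica, which is one reason the network appears in the list of things to improve.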
5
RDMA (Remote Direct Memory Access)
RDMA over InfiniBand or Ethernet
[Diagram: data path between Application 1 (Rack 1) and Application 2 (Rack 2) across the user, kernel, and hardware layers, contrasting TCP/IP through OS and NIC buffers with RDMA through the HCAs]
6
RDMA: Critical for Efficient Data Movement
ZERO Copy Remote Data Transfer
Low Latency, High Performance Data Transfers
InfiniBand - 56Gb/s, RoCE*
Kernel Bypass, Protocol Offload
* RDMA over Converged Ethernet
[Diagram: application buffers in user space exchanging data directly, across the user, kernel, and hardware layers]
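The difference between copying a buffer and sharing it can be illustrated in-process. This is only an analogy. The Python standard library cannot issue RDMA operations, so the sketch below shows the zero-copy idea with `memoryview` inside a single process, not a remote transfer.

```python
# In-process analogy for "zero copy": a copy duplicates the bytes,
# a memoryview exposes the same underlying buffer without duplicating it.
payload = bytearray(b"x" * 1_000_000)

copied = bytes(payload[:500_000])     # allocates and copies 500,000 bytes
view = memoryview(payload)[:500_000]  # no copy: shares payload's memory

# Mutate the original buffer and observe which object sees the change.
payload[0] = ord("A")

print(chr(copied[0]))  # the copy still holds the old byte
print(chr(view[0]))    # the view reflects the new byte
```

The copy is a snapshot that costs memory and CPU time proportional to its size, while the view costs neither. RDMA applies the same principle across machines by letting the adapter move data directly between application buffers.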
7
HDFS Operation with RDMA
[Diagram: a client, NameNodes, and three DataNodes holding replicated blocks, with write, read, and replication paths]
8
HDFS RDMA Acceleration – Solution 1
• Hadoop HDFS-RDMA acceleration:
– 100% Java code written on top of JXIO
• Same memory footprint as the vanilla client/server
– First results show double the performance for HDFS WRITE operations
• With 3 replicas, compared to vanilla
9
Accelio, High-Performance Reliable Messaging and RPC Library
• Open source
– https://github.com/accelio/accelio/ and www.accelio.org
• Faster RDMA integration into applications
• Maximizes message and CPU parallelism
10
HDFS RDMA Acceleration – Solution 2
• Package available at: http://hadoop-rdma.cse.ohio-state.edu/
• Big performance gain with RDMA support
12
RDMA-Enabled MapReduce
• Unstructured Data Accelerator - UDA
– Uses RDMA to do the Shuffle & Merge
– Plug-in architecture
– Open-source
• Supported Hadoop Distributions
– Apache 3.0, Apache 2.2.x, Apache 1.3
– Cloudera Distribution Hadoop 4.4 Inbox
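For readers unfamiliar with the step UDA accelerates, the following is a minimal sketch of what "Shuffle & Merge" does logically: sorted map outputs are merged into one key-ordered stream for the reducer. It is not UDA code and involves no RDMA. The function names are hypothetical, and the word-count job is chosen only for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def map_task(lines):
    """Emit (word, 1) pairs sorted by key, like a map task's sorted output."""
    return sorted((word, 1) for line in lines for word in line.split())


def shuffle_and_merge(map_outputs):
    """K-way merge of already-sorted map outputs into one sorted stream."""
    return heapq.merge(*map_outputs, key=itemgetter(0))


def reduce_counts(merged):
    """Sum the values for each key in the merged, key-ordered stream."""
    return {key: sum(value for _, value in group)
            for key, group in groupby(merged, key=itemgetter(0))}


if __name__ == "__main__":
    outputs = [
        map_task(["big data", "fast data"]),
        map_task(["fast network"]),
    ]
    print(reduce_counts(shuffle_and_merge(outputs)))
```

In a real cluster the map outputs live on different nodes, so this merge is fed by network transfers. That transfer is the part an RDMA-based shuffle plug-in targets.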
13
Storage Limitations for Hadoop
• Hadoop uses local disks to maintain data locality and reduce latency
– High-value data often resides on external storage systems
– Users copy data onto HDFS, run analytics, and then copy the results to another system
– This wastes storage space
– As data sources increase, managing data becomes a nightmare
• An alternative is to access the external data directly, without having to deal with "copying"
– This must still deliver performance
14
Storage: From Scale-Up to Scale-Out
• Scale-out storage systems using distributed computing
architectures
– Scalable and resilient
16
Fastest and Lowest Latency Storage Access with iSER
[Chart: K IOPS at 4K I/O size (axis 0 to 2500)]
• iSCSI (TCP/IP): 130
• 1 x FC 8 Gb port: 200
• 4 x FC 8 Gb port: 800
• iSER, 1 x 40GbE/IB port: 1100
• iSER, 2 x 40GbE/IB port (+ Acceleration): 2300
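The chart values can be turned into bandwidth and speedup figures with simple arithmetic. The sketch below assumes that "4K" means 4096 bytes and that the five values map to the five configurations in the order they appear. The function names are hypothetical.

```python
# K IOPS at 4K I/O size, as read from the chart (in the order listed there).
KIOPS = {
    "iSCSI (TCP/IP)": 130,
    "1 x FC 8 Gb port": 200,
    "4 x FC 8 Gb port": 800,
    "iSER, 1 x 40GbE/IB port": 1100,
    "iSER, 2 x 40GbE/IB port (+ acceleration)": 2300,
}

IO_SIZE_BYTES = 4096  # ASSUMPTION: "4K" means 4096 bytes per I/O


def throughput_gbps(kiops: float) -> float:
    """Convert thousands of I/Os per second into gigabits per second."""
    return kiops * 1000 * IO_SIZE_BYTES * 8 / 1e9


def speedup(label: str, baseline: str = "iSCSI (TCP/IP)") -> float:
    """IOPS ratio of one configuration relative to the baseline."""
    return KIOPS[label] / KIOPS[baseline]


if __name__ == "__main__":
    for label, kiops in KIOPS.items():
        print(f"{label:42s} {throughput_gbps(kiops):6.2f} Gb/s "
              f"{speedup(label):5.1f}x vs iSCSI")
```

Under these assumptions, 1100 K IOPS corresponds to about 36 Gb/s, close to the line rate of a single 40 Gb/s port, and the two-port iSER configuration delivers about 17.7 times the IOPS of iSCSI over TCP/IP.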
18
Hadoop over Cloud?
Concerns:
• Heavily utilized, rather than being massively provisioned
• Cloud storage is slower and more expensive
• Data locality makes a big difference for performance
Benefits:
• Lowering the cost of innovation
• Procuring large-scale resources quickly
• Running closer to the data
• Simplifying Hadoop operations
Performance?
19
Fastest OpenStack Storage Access
• Using OpenStack built-in components and management
– RDMA is already inbox and used by OpenStack
• RDMA enables faster performance, with much lower CPU utilization
[Diagram: Using RDMA to accelerate iSCSI storage. Compute servers (KVM hypervisor hosting VMs, Open-iSCSI with iSER, adapter) connect over an RDMA-capable interconnect to storage servers (iSCSI/iSER target (tgt), adapter, RDMA cache, local disks), with OpenStack (Cinder)]
20
Fast Interconnect with RDMA to Boost Big Data
4X Faster Run Time! Benchmark: TestDFSIO (1 terabyte, 100 files)
2X Higher Performance! Benchmark: 1M Records Workload (4M Operations)
2X faster run time and 2X higher throughput
2X Faster Run Time! Benchmark: Memcached Operations
3X Faster Run Time! Benchmark: Redis Operations
23
All trademarks are property of their respective owners. All information is provided “As-Is” without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and
completeness of the information contained herein. HPC Advisory Council Mellanox undertakes no duty and assumes no obligation to update or correct any information presented herein
Questions?