delivering a flexible it infrastructure for analytics on ibm power systems

30
Delivering a Flexible IT Infrastructure for Analytics with HDP on IBM Power Systems Nov 30, 2016

Upload: hortonworks

Post on 09-Jan-2017

152 views

Category:

Technology


5 download

TRANSCRIPT

Page 1: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Delivering a Flexible IT Infrastructure for Analytics

with HDP on IBM Power Systems

Nov 30, 2016

Page 2: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Agenda

• Hortonworks and IBM Power Systems Partnership

• Customer Analytics Journey

• Open Community Innovation

• Leading Time to Insights

2

Page 3: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Hortonworks, IBM Collaborate to Offer Open Source Distribution on Power SystemsLatest Hortonworks Data Platform (HDP) to provide IBM customers with more choice in open source Hadoop distribution for big data processing

3

Las Vegas, NV (IBM Edge) - 19 Sep 2016: IBM (NYSE: IBM) and Hortonworks (NASDAQ: HDP) today announced the planned availability of Hortonworks Data Platform (HDP®) for IBM Power Systems enabling POWER8 clients to support a broad range of new applications while enriching existing ones with additional data sources.

Scott Gnau, CTO, Hortonworks at Edge. Youtube: http://bit.ly/2dSOliW

James Wade, Director of Application Hosting, Florida Blue. Youtube: http://bit.ly/2dxVHIY 

© 2016 IBM Corporation

Page 4: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Customer Story – Guidewell Health - Florida Blue

• Business Problem– Transformational journey resulting in rapid expansion of business models– Technology innovation required to keep up with the business expansion while improving

client satisfaction, reducing costs and supporting the company’s green IT initiativeso Existing x86 server sprawl not sustainable

• Solution with Hortonworks, IBM OpenPOWER servers and Sage Solutions Consulting– Embraces the open software and hardware model adopted by Florida Blue– Hortonworks supporting new fraud analytics initiative to reduce costs and client premiums– OpenPOWER to enable smaller datacenter footprint with stronger reliability

4

See the full story in this Hortonworks Blog post.

© 2016 IBM Corporation

Page 5: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Payment Tracking

DueDiligence

SocialMapping

ProductDesign M & ACall

AnalysisMachine

Data

DefectDetecting

FactoryYields

CustomerSupport

BasketAnalysis Segments

CustomerRetention

SentimentAnalysis

OptimizeInventories

SupplyChain

Cross-Sell

VendorScorecards

AdPlacement

CyberSecurity

DisasterMitigation

InvestmentPlanning

AdPlacement

RiskModeling

ProactiveRepair

InventoryPredictions

NextProduct Recs

OPEXReduction

HistoricalRecords

MainframeOffloads

Device Data

Ingest

Rapid Reporting

DigitalProtection

Dataas a

Service

FraudPrevention

PublicData

Capture

INNOVATE

RENOVATE

EXPLORE OPTIMIZE TRANSFORM

ACTIVEARCHIVE

ETLONBOARD

DATAENRICHMENT

DATADISCOVERY

SINGLEVIEW

PREDICTIVEANALYTICS

M&A StorageBlending

M&A Ingest

Integration

Page 6: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

About Hortonworks

The Leader in Connected Data PlatformsPublicly traded on NASDAQ: HDPHortonworks DataFlow for data in motionHortonworks Data Platform for data at restPowering new modern data applications

Partnering for Customer SuccessLeader in open-source community, focused on innovation to meet enterprise needsUnrivaled support subscriptions

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Founded in 2011

Original 24 Architects, Developers, Operators of Hadoop from Yahoo!

1000+Employees

1500+Ecosystem

Partners

Page 7: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

HPD is a 100% Open Source Connected Data Platform

Eliminates Riskof vendor lock-in by delivering100% Apache open source technology

Maximizes Community Innovationwith hundreds of developers across hundreds of companies

Integrates Seamlesslythrough committed co-engineering partnerships with other leading technologies

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 8: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Apache Hadoop Committers

We Employ the Committersone third of all committers to the Apache® Hadoop™ project, and a majority in other important projects

Our Committers Innovateand expand Open Enterprise Hadoop

We Influence the Hadoop Roadmapby communicating important requirementsto the community through our leaders

Hortonworks Influences the Apache Community

Page 9: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Hortonworks Nourishes the Community and Ecosystem

Hortonworks Community Connection

Hortonworks Partnerworks

• Community Q/A Resources

• Articles & Code Repos!

• Community of (big data) developers

• Open Ecosystem of Big Data for vendors & end-users

• Advance Apache™ Hadoop®

• Enable more Big Data Apps

• World class partner program

• Network of partners providing best-in-class solutions

Hadoop & Big data ecosystem

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 10: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Hortonworks Delivers Proactive Support

Hortonworks SmartSense™with machine learning and predictive analytics on your cluster

Integrated Customer Portalwith knowledge base and on-demand training

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 11: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Client Value proposition for HDP on Power:

• 100% Open Source Hadoop running on OpenPOWER Hardware Platform– OpenPOWER is open-source hardware solution for open source SW– Intel x86 – perceived as default/commodity option - but x86 is not open– Open = no vendor lock-in and flexibility

• Processor and Server Architecture optimized for big data processing– 2x per core performance compared to intel x86

o Fewer cores & servers needed: contain server sprawlo Improved HW price/performance

– Higher server & data reliability – designed to run the enterprise

11

Page 12: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

5 IBMers contributing to Linux and Apache Projects

1999

IBM is investing in the Linux ecosystems & open innovation

270+ OpenPOWER-based innovations under way

2016

50k+ IBMers contributing to 150+ open organizations

1. Source: https://developer.ibm.com/start/

1

BlockchainHyperledger

Open Source Databases

12

Page 13: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

13 13

Accelerated innovation through collaboration of partners

Amplified capabilities driving industry performance leadership

Vibrant ecosystem through open development

Cloud ComputingHyperscale & Large scale

Datacenters

High PerformanceComputing & Analytics

Domestic IT Agendas

Industry adoption, Open choice

OpenPOWER Strategy

Moore’s law no longer satisfies performance gain

Numerous IT consumption models

Growing workload demands

Mature Open software ecosystem

Market Shifts

OpenPOWER is an open development community, using the POWER Architecture to serve the evolving needs of customers.

OpenPOWER, a catalyst for Open Innovation

Page 14: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Fueling an Open Development Community

14

Chip / SOC

BoardsSystems

I/O / StorageAcceleration

System / SoftwareIntegration

Implementation / HPC /Research

Page 15: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

- Design and cost optimized for deployments of multiples (cloud and cluster)

- Broad number of optimal solutions

- Co-Designed with the OpenPOWER Ecosystem

IBM Support

Community / 3rd Party Support

running

The LC Line

The L Line

PurePower

Enterprise& IFLs

IaaS

Scale-Out, Linux-Only

ConvergedInfrastructure

Scale-Up

- Enterprise level RAS for single system deployments

- Solutions for Big Data & Analytics

- Converged infrastructure offering

- Rapid time to value and simplicity of management

- Enterprise level robustness and IFL capability

- Solution editions for in memory databases

- (HANA, DB2 BLU)

- Hosted cloud and hybrid cloud solutions

- Rapid deployments and POCs

The IBM Power Systems Linux Portfolio

Broad Linux portfolio delivers all your Linux deployment needs

POWER8 is designed for the Big Data era and delivers price-performance leadership to the Linux Market!

155 © 2016 IBM Corporation

Page 16: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Introducing the IBM Power Systems LC Line

16

S822LC For High Performance Computing

•Incorporates the new POWER8 processor with NVIDIA NVLink

•Delivers 2.8X the bandwidth to GPUs accelerators

•Up to 4 integrated NVIDIA “Pascal” GPUs

S822LC For Big Data

•Ideal for storage-centric and high data through-put workloads

•Brings 2 POWER8 sockets for Big Data workloads

•Big data acceleration with work CAPI and GPUs

S821LC

•Storage rich single socket system for big data applications

•Memory Intensive workloads

S822LC

•2X memory bandwidth of Intel x86 systems

•Memory Intensive workloads

S812LC

•2 POWER8 sockets in a 1U form factor

•Ideal for environments requiring dense computing

NEW NEWNEWBig Data

High Performance Computing

ComputeIntensive

Announce 9/8, GA 9/26Announce and GA 9/8

Announce and GA 9/8

OpenPOWER servers for cloud and cluster deployments that are different by design

Page 17: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Innovation Pervasive in the Design

Power Systems S822LC for Big Data

NVIDIA: Tesla K80 GPU Accelerator

Red Hat, Ubuntu, SUSE: Linux OS

Mellanox: InfiniBand/Ethernet Connectivity in and out of server

HGST: Optional NVMe Adapters

Alpha Data with Xilinx FPGA: Optional CAPI Accelerator

Broadcom: Optional PCIe Adapters

QLogic: Optional Fiber Channel PCIe

Samsung: SSDs & NVMe

Hynix, Samsung, Micron: DDR4

IBM: POWER8 CPU

17

Page 18: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Leading Operational DBMSs Available & Optimized for Linux on Power

In-Memory, NoSQLSAP HANAMongoDBNeo4jEnterpriseDBMariaDB

Open SourceIBM DB2 BLURedisLabsCassandraPostgreSQL

18

Page 19: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

19

IBM is your single, trusted vendor to support and help you manage your Linux infrastructure

1Based on IBM internal data. 2Original equipment manufacturer

Page 20: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Pric

e/P

erfo

rman

ce

Moore’s Law

Processor Technology

2000 2020

Firmware / OSAcceleratorsSoftwareStorageNetwork

20

Data holds competitive valueFull system and stack open innovation required

You are here

44 zettabytes

unstructured data

2010 2020

structured data

Data G

rowth

Today’s challenges demand innovation

Page 21: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

4X

Threads per core4X

Mem. Bandwidth1

6XMore cache2 @ Lower Latency

SMT=Simultaneous Multi-Threading OLTP = On-Line Transaction Processing

These design decisions result in best performance for data centric workloads like: Spark, Hadoop, Database, NoSQL, Big Data Analytics, OLTP

POWER8: Designed for data to deliver breakthrough performance

POWER8SMT8

x86Hyperthread

Parallel Processing

POWER8pipe

Data flow

x86 pipe POWER8

x86 POWER8 + OpenPOWER

x86

21

1. Up to 4X depending on specific x86 and POWER8 servers being compared2. Up to 6X more cache comparing Intel e7-8890 servers to 12 core POWER8 servers. See speaker notes for more details

© 2016 IBM Corporation

Page 22: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

SAP S&D industry benchmark demonstrates that x86 technologies have NOT improved performance per core

Source: http://global.sap.com/solutions/benchmark/sd2tier.epx

Certification # 2016023Date: 05/10/2016Fujitsu PRIMEQUEST 2800E3192 Cores & 384 Threads E7-8890 v4 at 2.20 GHz2048 GB MemoryUsers: 74,000, SAPS: 404,200Windows Server 2012 R2SQL Server 2012

Sandy Bridge EP Ivy Bridge EP Ivy Bridge EX Haswell EP Haswell EX Broadwell EP Broadwell EXCertification # 2014034Date: 10/3/2014IBM Power E87080 Cores & 640 Threads POWER8 at 4.19 GHz2 TB MemoryUsers: 79,750, SAPS: 436,100AIX 7.1DB2 10.5

POWER8

2.58x

Certification # 2013039Date: 12/10/2013Cisco UCS C420 M332 Cores & 64 Threads E5-4650 at 2.70 GHz256 GB MemoryUsers: 13,010, SAPS: 71,170Windows Server 2012SQL Server 2012

Certification # 2014017Date: 05/05/2014Dell PowerEdge R72024 Cores & 48 Threads E5-2697 v2 at 2.70 GHz256 GB MemoryUsers: 10,253, SAPS: 55,970RHEL 6.5SAP ASE 16

Certification # 2014018Date: 05/05/2014Cisco UCS B260 M430 Cores & 60 threads E7-4890 v2 at 2.80 GHz512 GB MemoryUsers: 12,280, SAPS: 67,020Windows Server 2012SQL Server 2012

Certification # 2014033Date: 09/10/2014Dell PowerEdge R73036 Cores & 72 threads E5-2699 v3 at 2.3 Ghz256 GB MemoryUsers: 16,500, SAPS: 90,120RHEL 7.0SAP ASE 16

Certification # 2015012Date: 05/05/2015Dell PowerEdge R93072 Cores & 144 threads E7-8890 v3 at 2.5 GHz1 TB MemoryUsers: 31,000, SAPS: 170,030RHEL 7.1SAP ASE 16

Certification # 2016006Date: 03/31/2016Cisco UCS C240 M444 Cores & 88 Threads E5-2699 v4 at 2.20 GHz512 GB MemoryUsers: 21,210, SAPS: 115,820Windows Server 2012 R2DB2 10.1

Page 23: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Rel

ativ

e R

PE2*

* Per

Cor

e

P

OW

ER

6 =

1.0

**Gartner RPE2 Details:http://www.gartner.com/technology/research/RPE2-methodology-details.jsp

RPE2** numbers are derived from the following six benchmark inputs:SAP SD Two-Tier, TPC-C, TPC-H, SPECjbb2006 and two SPEC CPU2006 components

The data on this chart is derived from RPE2 from Gartner, Inc.'s Competitive Profile tool. © 2014 Gartner, Inc. and/or its affiliates. All rights reserved.

This was a POWER8 Design Goal

POWER73.55 GHz

16 cores

POWER7+4.2 GHz

16 cores

POWER83.52 GHz

24 cores

POWER64.2 GHz

16 cores

POWER Performance per core is increasing!

IncreasingPerformance

The servers shown are best in each category (sockets and number of cores)

1.0

Page 24: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Example: OpenPOWER acceleration with NVLink

Current CPU to GPU PCIe Attachment

System Bottleneck

Graphics Memory

New POWER8

CPU

P100Tesla GPU

NVL

ink NVLink

NVLink

115GB/s

80 GB/s

80 G

B/s 80 G

B/s

New POWER8 with NVLink Processor Technology

CPU

GPU

PC

Ie

32 G

B/s

Graphics Memory

Graphics Memory

POWER8 with NVLink delivers 2.8X the

bandwidthPCIe Data Pipe

POWER8 NVLink Data Pipe

P100Tesla GPUz

© 2016 IBM Corporation

Page 25: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

YCSB running MongoDB on POWER8 delivers leadership performance and 2.2X better price-performance than Intel Xeon E5-2690 v3 Haswell

IBM Power

S822LC(16-core, 256GB)

HP DL380 Gen9(24-core, 256GB)

Server web price*-3-year warranty

$16,295 $24,615

System Cost-Server + RHEL OS + MongoDB Annual Subscription

$29,584($16,295 + $1,299 + $11,990)

$37,904($24,615 + $1,299 +

$11,990)

MongoDB YCSB(total operations per second)

297.5 k ops 169.5 k ops

$ / Op per Sec 100 $ / ops 223 $ / ops 2.2X better

2.2XBetter Price-Performance

33%Lower HW costs and

maintenance

75%More Performance

per Server

@

•Based on IBM internal testing of single system and OS image running Yahoo Cloud Services Benchmark (YCSB) 0.6.0, 1M record workload at 50/50 read/write factor. Conducted under laboratory condition, individual result can vary based on workload size, use of storage subsystems & other conditions.• IBM Power System S822LC; 16 cores (2 x 8c chips) / 128 threads, POWER8; 3.3 GHz, 256 GB memory, MongoDB 3.3.4, RHEL 7.2. Competitive stack: HP Proliant DL380 Gen9; 24 cores (2 x 12c chips) / 48 threads; Intel E5-2690 v3; 2.6 GHz; 256 GB memory, MongoDB 3.3, RHEL 7.2 . Both server priced with 2 x 1TB SATA 7.2K rpm HDD, 1 Gb 2-port, 2 x 16gbps FCA. Configurations represent the highest processor frequency for that specific processor running the MongoDB server on 1 socket & the YCSB application workload on the 2nd socket. RAM disk was used to focus testing on processor technology differences.Pricing is based on web pricing for S822LC http://www-03.ibm.com/systems/power/hardware/s822lc-commercial/buy.html and HP DL380 Gen9 https://h22174.www2.hp.com/SimplifiedConfig/Index MongoDB https://www.mongodb.com/compare/mongodb-oracle Page: 6

Page 26: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

POWER Advantages for Spark

• Streaming and SQL benefit from High Thread Density and Concurrency

• Machine Learning benefits from Large Caches and Memory Bandwidth

• Graph also benefits from Large Caches, Memory Bandwidth and Higher Computational Strength

26

Machine Learning SQL Graph

2X Core-to-Core Advantage

Machine Learning SQL Graph

1.5X Price Performance Advantage

7-Node S812LC 10-core vs. 7-Node E5-2690 v3 12-core

© 2016 IBM Corporation

Page 27: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

Future Proof Your Hadoop Infrastructure

• Total Cost of Ownership benefits of a Linux on Power decision

– Less infrastructure means reduced costs in many areas:o Energy, cooling, server administration, floor space, SW licensing

– Position for future growth, avoid hitting the data center wall with cluster sprawl

– As your workloads evolve, POWER8 gives you options:o Scale up each node by exploiting the memory bandwidth and multi-threadingo Add new workload optimized servers to the cluster (such a GPU with NVLink)

27

Page 28: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

HDP and OpenPOWER – Better Together

• Leading Open Hadoop and Spark distribution on the leading Open Server platform – built for Big Data, driving continuous innovation in the community

• Industry Experience to Lead Clients on an Analytics Journey – Discovery workshops to identify the key business use cases and the client progression

• Mission Critical Support– Stable, trusted Hadoop platform on proven Power System with outstanding client support

• Performance optimized for enterprise scale– Price performance leadership with POWER8– Exploiting POWER8 advantages for key workloads such as SQL and Spark– Future exploitation of accelerators to maintain leadership

28© 2016 IBM Corporation

Page 29: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

29

Hortonworks (HDP) on Power Roadmap

3Q2016

Ramp Up ProgramHDP early code on Power

Head start on your Analytics Journey

IBM and Hortonworks assistance for non-prod

POCs

General AvailabilityHDP on Power

Proven Power Reference Configs

Hortonworks and IBM production

Support

General AvailabilityHDP next on

Power

Simultaneous releases

4Q2016 Q1 2017 later 2017

Partnership Announce

Edge Ann Sept 19Note: Timing and content subject to change.

© 2016 IBM Corporation

Page 30: Delivering a Flexible IT Infrastructure for Analytics on IBM Power Systems

How to Get Started with HDP on OpenPOWER Systems

• Join the Hortonworks Community: https://community.hortonworks.com/

• Learn more about the benefits of Hortonworks: http://hortonworks.com/training/

• Learn more about the benefits of IBM Power Systems and OpenPOWER

• If you are interested in discussing a HDP on Power Systems option or proposal, talk to your Hortonworks or Power sales reps

30© 2016 IBM Corporation