Munich HUG, 21.11.2013

© Hortonworks Inc. 2013 - Confidential

Hortonworks: We Do Hadoop.

Our mission is to enable your Modern Data Architecture by delivering One Enterprise Hadoop

November 2013

Agenda

• Hortonworks Overview of Tez
  – Quick and painless
• A driver for Tez: The Stinger Initiative
• Tez Deep Dive
• Demo


A Brief History of Apache Hadoop

[Timeline, 2004 to 2013: Apache Project Established; Yahoo! begins to operate at scale; Enterprise Hadoop; Hortonworks Data Platform]

• Focus on INNOVATION. 2005: Hadoop created at Yahoo!
• Focus on OPERATIONS. 2008: Yahoo team extends focus to operations to support multiple projects & growing clusters
• STABILITY. 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo


Our Mission: Enable your Modern Data Architecture by delivering One Enterprise Hadoop

Our Commitment

• Innovate in the Open: We employ the core architects and operators of Hadoop and drive innovation through open source Apache Foundation projects to avoid vendor lock-in
• Certify for the Enterprise: We engineer, test and certify the Hortonworks Data Platform for enterprise usage and deliver the highest quality of support
• Interoperate with the Ecosystem: We work with partners to deeply integrate Hadoop with key technologies so you can leverage existing skills and investments

Headquarters: Palo Alto, CA
Employees: 240+ and growing
Customers: 120+ and growing
Investors: Benchmark, Index, Yahoo, Dragoneer, Tenaya

Trusted Partners with:


Goal: Interoperable and Familiar

[Diagram: layers APPLICATIONS, DATA SYSTEMS and SOURCES, with labels RDBMS, EDW, MPP, HANA, BusinessObjects BI, OPERATIONAL TOOLS, DEV & DATA TOOLS, INFRASTRUCTURE, Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured)]



Betting on Hortonworks…

Teradata Portfolio for Hadoop

• Seamless data access between Teradata and Hadoop (SQL-H)

• Simple management & monitoring with Viewpoint integration

• Flexible deployment options

HDInsight & HDP for Windows

• Only Hadoop Distribution for Windows Azure & Windows Server

• Native integration with SQL Server, Excel, and System Center

• Extends Hadoop to .NET community

Complete Portfolio for Hadoop

Appliances

Instant Access + Infinite Scale

• SAP can assure their customers they are deploying an SAP HANA + Hadoop architecture fully supported by SAP

• Enables analytics apps (BOBJ) to interact with Hadoop


Hortonworks Approach to Enterprise Hadoop

Identify and introduce enterprise requirements into the public domain

Work with the community to advance and incubate open source projects

Apply Enterprise Rigor to provide the most stable and reliable distribution

Community Driven Enterprise Apache Hadoop


Driving Hadoop Innovation

Total Net Lines Contributed to Apache Hadoop
[Chart labels: 614,041 lines; 449,768 lines; 147,933 lines; End Users]

Total Number of Committers to Apache Hadoop: 63 total
[Chart labels: 21; Yahoo: 10; Cloudera: 7; Facebook: 5; IBM: 3; LinkedIn: 3; 10 Others]

Hortonworks engineers focus on making Apache Hadoop an enterprise-viable platform that powers modern data architectures and deeply integrates with existing data center technologies


HDP: Enterprise Hadoop Platform

Hortonworks Data Platform (HDP)

• The ONLY 100% open source and complete platform

• Integrates full range of enterprise-ready services

• Certified and tested at scale

• Engineered for deep ecosystem interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP)
 HADOOP CORE: HDFS, YARN, MAPREDUCE, TEZ
 DATA SERVICES, OPERATIONAL SERVICES and LOAD & EXTRACT components shown: HIVE & HCATALOG, PIG, HBASE, SQOOP, FLUME, NFS, WebHDFS, KNOX*, OOZIE, AMBARI, FALCON*
 PLATFORM SERVICES, Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
 Deployment targets: OS/VM, Cloud, Appliance]


Hortonworks: The Value of “Open” for You

• Connect With the Hadoop Community: We employ a large number of Apache project committers & innovators so that you are represented in the open source community
• Avoid Vendor Lock-in: Hortonworks Data Platform remains as close to the open source trunk as possible and is developed 100% in the open so you are never locked in
• The partners you rely on, rely on Hortonworks: We work with partners to deeply integrate Hadoop with data center technologies so you can leverage existing skills and investments
• Certified for the Enterprise: We engineer, test and certify the Hortonworks Data Platform at scale to ensure the reliability and stability you require for enterprise use
• Support from the experts: We provide the highest quality of support for deploying at scale. You are supported by hundreds of years of Hadoop experience


SQL-in-Hadoop with Apache Hive

• Apache Hive is the standard for SQL interaction with Hadoop
  – Enterprise makes final purchasing decision on two key characteristics: 'compatibility' with existing investments (60%) and skills (20%)
  – Most applications claim Hive compatibility TODAY*
• Stinger Initiative: Simple Focus
  – Performance
  – SQL-Compatibility

* Claims publicly made by: Teradata, Microsoft, Oracle, Microstrategy, IBM, Information Builders, SAS, QlikTech, SAP, Tableau, Tibco, Actuate, Jaspersoft, Alteryx, Datameer, Pentaho

[Diagram: Business Analytics and Custom Apps use SQL through Hive, which runs on Tez and MapReduce over YARN and HDFS in Hadoop]

Improves existing tools & preserves investments


Stinger Initiative Goals

• Enables Hive to support interactive workloads
• Improves existing tools & preserves investments

[Diagram: Query Planner (Hive) + Execution Engine (Tez) + File Format (ORC file) = 100X+; Data Types + Windowing & Subqueries = SQL Compatible]


Stinger: Hive For All Analytics

[Diagram: analytics workloads from Interactive to Batch: Enterprise Reports, Dashboard / Scorecard, Parameterized Reports, Visualization, Data Mining]

100X+ Faster and SQL Compatible


Stinger Roadmap

• Data types: CHAR, VARCHAR, DATETIME
• Improvements to DECIMAL datatype
• Subqueries for IN, NOT IN, HAVING
• SQL:2003 windowing functions
• ORCFile
• Join optimizations
• Intelligent Optimizer
• Column Statistics
• Authentication and Authorization Enhancements
• Integration with Tez and Tez Service
• Vectorization Preview
• Full vector query


Stinger: Some early Results

• Query Engine Work ONLY
• Uses TPC “style” benchmark
• Just a few weeks of work
• OTHER work coming


Apache Tez: Accelerating Hadoop Query Processing


Tez – Introduction

• Distributed execution framework targeted towards data-processing applications.

• Based on expressing a computation as a dataflow graph.

• Built on top of YARN – the resource management framework for Hadoop.

• Open source Apache incubator project and Apache licensed.


Old School Hadoop: MapReduce


Fundamentals of YARN

• The fundamental idea of YARN is to split up the two major responsibilities of the JobTracker/TaskTracker into separate entities:
  – a global ResourceManager
  – a per-application ApplicationMaster
  – a per-node slave NodeManager
  – a per-application Container running on a NodeManager


New School Hadoop with YARN


Tez – Design Themes

• Empowering End Users
• Execution Performance


Tez – Empowering End Users

• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying deployment



• Expressive dataflow definition APIs
  – Enable definition of complex data flow pipelines using simple graph connection APIs. Tez expands the logical plan at runtime.
  – Targeted towards data processing applications like Hive/Pig but not limited to them. Hive/Pig query plans naturally map to Tez dataflow graphs with no translation impedance.

[Diagram: Preprocessor Stage, Partition Stage and Aggregate Stage, made up of tasks TaskA-1, TaskA-2, TaskB-1, TaskB-2, TaskC-1, TaskC-2, TaskD-1, TaskD-2, TaskE-1, TaskE-2]


• Expressive dataflow definition APIs

[Diagram: Distributed Sort, with a Sampler producing Samples and Ranges that feed stages of tasks (Task-1, Task-2)]



• Flexible Input-Processor-Output runtime model
  – Construct physical runtime executors dynamically by connecting different inputs, processors and outputs.
  – End goal is to have a library of inputs, outputs and processors that can be programmatically composed to generate useful tasks.

[Diagram: example task compositions
 IntermediateReduce: ShuffleInput, ReduceProcessor, FileSortedOutput
 FinalReduce: ShuffleInput, ReduceProcessor, HDFSOutput
 PairwiseJoin: Input1, Input2, JoinProcessor, FileSortedOutput]
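The Input-Processor-Output idea can be illustrated with a minimal sketch. The names below (`IpoSketch`, `compose`, `sortAscending`) are hypothetical stand-ins chosen for this example and are not the Tez API; the sketch only shows how a task might be assembled from three independently chosen parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-ins for the Input / Processor / Output roles; not the Tez API.
class IpoSketch {
    /** Wires an input, a processor and an output into one runnable task. */
    static <A, B> Runnable compose(Supplier<List<A>> input,
                                   Function<List<A>, List<B>> processor,
                                   Consumer<List<B>> output) {
        return () -> output.accept(processor.apply(input.get()));
    }

    /** Example processor: returns a sorted copy of its input. */
    static List<Integer> sortAscending(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    /** Composes one task from an in-memory input, the sort processor and a list sink. */
    static List<Integer> runExample() {
        List<Integer> sink = new ArrayList<>();
        Supplier<List<Integer>> input = () -> Arrays.asList(3, 1, 2);
        Runnable task = IpoSketch.<Integer, Integer>compose(
                input, IpoSketch::sortAscending, sink::addAll);
        task.run();
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints [1, 2, 3]
    }
}
```

Swapping only the output (for example a sink that writes to a file instead of a list) yields a different task without touching the input or the processor, which is the composition the slide describes.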



• Data type agnostic
  – Tez is only concerned with the movement of data: files and streams of bytes.
  – Does not impose any data format on the user application. MR applications can use Key-Value pairs on top of Tez. Hive and Pig can use tuple-oriented formats that are natural and native to them.

[Diagram: a Tez Task moves Bytes from a File or Stream; User Code interprets them as Key Value pairs or Tuples]



• Simplifying deployment
  – Tez is a completely client-side application.
  – No deployments to do. Simply upload to any accessible FileSystem and change local Tez configuration to point to that.
  – Enables running different versions concurrently. Easy to test new functionality while keeping stable versions for production.
  – Leverages YARN local resources.

[Diagram: two Client Machines, each running a TezClient, use Tez Lib 1 and Tez Lib 2 stored on HDFS; TezTasks run on NodeManagers]



• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying usage

With great power APIs come great responsibilities

Tez is a framework on which end user applications can be built


Tez – Execution Performance

• Performance gains over Map Reduce
• Optimal resource management
• Plan reconfiguration at runtime
• Dynamic physical data flow decisions



• Performance gains over Map Reduce
  – Eliminate replicated write barrier between successive computations.
  – Eliminate job launch overhead of workflow jobs.
  – Eliminate extra stage of map reads in every workflow job.
  – Eliminate queue and resource contention suffered by workflow jobs that are started after a predecessor job completes.

[Diagram: Pig/Hive on MR compared with Pig/Hive on Tez]



• Optimal resource management
  – Reuse YARN containers to launch new tasks.
  – Reuse YARN containers to enable shared objects across tasks.

[Diagram: the Tez Application Master, in its own YARN Container, sends Start Task to a TezTask Host in another YARN Container and receives Task Done; TezTask1 and TezTask2 run in that host and use Shared Objects]
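A minimal sketch of why container reuse helps with shared objects, under stated assumptions: the `Container` class and its `getOrLoad` method below are hypothetical and model only the caching idea, not the YARN or Tez container APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of a reusable container; not the Tez or YARN API.
class ContainerReuseSketch {
    /** Stands in for a container that stays alive across several tasks. */
    static class Container {
        private final Map<String, Object> sharedObjects = new HashMap<>();
        int loads = 0; // how many times an object actually had to be built

        /** Returns the cached object for a key, building it only on first use. */
        Object getOrLoad(String key, Supplier<Object> loader) {
            return sharedObjects.computeIfAbsent(key, k -> {
                loads++;
                return loader.get();
            });
        }
    }

    /** Runs taskCount tasks in one container; each task needs the same object. */
    static int loadsForTasks(int taskCount) {
        Container container = new Container();
        for (int task = 0; task < taskCount; task++) {
            container.getOrLoad("lookup-table", () -> new int[] {1, 2, 3});
        }
        return container.loads;
    }

    public static void main(String[] args) {
        System.out.println(loadsForTasks(3)); // prints 1
    }
}
```

With a fresh container per task the object would be built once per task; in the reused container it is built once and shared by later tasks.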



• Plan reconfiguration at runtime
  – Dynamic runtime concurrency control based on data size, user operator resources, available cluster resources and locality.
  – Advanced changes in dataflow graph structure.
  – Progressive graph construction in concert with user optimizer.

[Diagram: Stage 1 has 50 maps producing 100 partitions from HDFS Blocks; Stage 2 is planned with 100 reducers. With only 10 GB of data and the available YARN Resources, Stage 2 is reconfigured from 100 to 10 reducers]



• Dynamic physical data flow decisions
  – Decide the type of physical byte movement and storage on the fly.
  – Store intermediate data on distributed store, local store or in-memory.
  – Transfer bytes via blocking files or streaming and the spectrum in between.

[Diagram: at runtime, a Producer hands data to the Consumer through a Local File, or a Producer with small output hands it over In-Memory]
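One way to picture such a runtime decision is a small rule that picks a storage tier from the observed output size. This is an illustrative sketch only: the `Store` enum, the `choose` method, the memory budget and the `mustSurviveNodeLoss` flag are assumptions made for the example, not Tez behaviour or configuration.

```java
// Hypothetical decision rule; thresholds and names are assumptions for illustration.
class DataFlowDecisionSketch {
    enum Store { IN_MEMORY, LOCAL_FILE, DISTRIBUTED_STORE }

    /**
     * Picks where a producer's intermediate output is kept.
     * Small outputs stay in memory, larger ones go to a local file, and
     * outputs that must outlive the node go to the distributed store.
     */
    static Store choose(long outputBytes, long memoryBudgetBytes, boolean mustSurviveNodeLoss) {
        if (mustSurviveNodeLoss) {
            return Store.DISTRIBUTED_STORE;
        }
        if (outputBytes <= memoryBudgetBytes) {
            return Store.IN_MEMORY;
        }
        return Store.LOCAL_FILE;
    }

    public static void main(String[] args) {
        long budget = 64L * 1024 * 1024; // assumed 64 MB in-memory budget
        System.out.println(choose(1024, budget, false));               // prints IN_MEMORY
        System.out.println(choose(1024L * 1024 * 1024, budget, false)); // prints LOCAL_FILE
    }
}
```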


Tez – Deep Dive – API

// Vertices: one per processing stage
DAG dag = new DAG();
Vertex map1 = new Vertex(MapProcessor.class);
Vertex map2 = new Vertex(MapProcessor.class);
Vertex reduce1 = new Vertex(ReduceProcessor.class);
Vertex reduce2 = new Vertex(ReduceProcessor.class);
Vertex join1 = new Vertex(JoinProcessor.class);
…….
// Edges: producer, consumer, data movement, data source, scheduling, output and input classes
Edge edge1 = new Edge(map1, reduce1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge2 = new Edge(map2, reduce2, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge3 = new Edge(reduce1, join1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge4 = new Edge(reduce2, join1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
…….
// Assemble the graph
dag.addVertex(map1).addVertex(map2)
   .addVertex(reduce1).addVertex(reduce2)
   .addVertex(join1)
   .addEdge(edge1).addEdge(edge2)
   .addEdge(edge3).addEdge(edge4);

[Diagram: map1 → reduce1 → join1 and map2 → reduce2 → join1; edges labelled Scatter_Gather, Bipartite, Sequential]

Simple DAG definition API
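To make the graph shape concrete, here is a self-contained sketch that models the same five vertices and four edges with plain strings and derives an order in which the vertices could start. `DagOrderSketch` and its methods are hypothetical and written for this example; they mirror the fluent `addVertex`/`addEdge` style of the slide but are not the Tez classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini DAG model; not the Tez DAG/Vertex/Edge classes.
class DagOrderSketch {
    private final Map<String, List<String>> successors = new LinkedHashMap<>();
    private final Map<String, Integer> inDegree = new LinkedHashMap<>();

    DagOrderSketch addVertex(String name) {
        successors.putIfAbsent(name, new ArrayList<>());
        inDegree.putIfAbsent(name, 0);
        return this;
    }

    DagOrderSketch addEdge(String from, String to) {
        addVertex(from).addVertex(to);
        successors.get(from).add(to);
        inDegree.merge(to, 1, Integer::sum);
        return this;
    }

    /** Kahn's algorithm: a vertex becomes ready once all its producers are done. */
    List<String> topologicalOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(inDegree);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.poll();
            order.add(vertex);
            for (String next : successors.get(vertex)) {
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != successors.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }

    /** The graph from the slide: two map/reduce chains feeding one join. */
    static List<String> slideExample() {
        return new DagOrderSketch()
                .addVertex("map1").addVertex("map2")
                .addVertex("reduce1").addVertex("reduce2")
                .addVertex("join1")
                .addEdge("map1", "reduce1").addEdge("map2", "reduce2")
                .addEdge("reduce1", "join1").addEdge("reduce2", "join1")
                .topologicalOrder();
    }

    public static void main(String[] args) {
        System.out.println(slideExample()); // prints [map1, map2, reduce1, reduce2, join1]
    }
}
```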


Tez – Deep Dive – API

• Data movement: defines routing of data between tasks
  – One-To-One: Data from the ith producer task routes to the ith consumer task.
  – Broadcast: Data from a producer task routes to all consumer tasks.
  – Scatter-Gather: Producer tasks scatter data into shards and consumer tasks gather the data. The ith shard from all producer tasks routes to the ith consumer task.
• Scheduling: defines when a consumer task is scheduled
  – Sequential: Consumer task may be scheduled after a producer task completes.
  – Concurrent: Consumer task must be co-scheduled with a producer task.
• Data source: defines the lifetime/reliability of a task output
  – Persisted: Output will be available after the task exits. Output may be lost later on.
  – Persisted-Reliable: Output is reliably stored and will always be available.
  – Ephemeral: Output is available only while the producer task is running.

Edge properties define the connection between producer and consumer vertices in the DAG
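The three data movement types can be expressed as a small routing function. The sketch below is an illustration under stated assumptions: `EdgeRoutingSketch`, its `DataMovement` enum and `consumersFor` are hypothetical names for this example, and task and shard indices are taken to be zero-based.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical routing helper; not the Tez edge API. Indices are zero-based.
class EdgeRoutingSketch {
    enum DataMovement { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    /** Returns the consumer task indices that receive the given shard of the given producer. */
    static List<Integer> consumersFor(DataMovement movement, int producer, int shard,
                                      int consumerCount) {
        switch (movement) {
            case ONE_TO_ONE:
                // The ith producer feeds the ith consumer.
                return Collections.singletonList(producer);
            case SCATTER_GATHER:
                // The ith shard of every producer feeds the ith consumer.
                return Collections.singletonList(shard);
            case BROADCAST:
            default:
                // Every consumer receives the producer's data.
                List<Integer> all = new ArrayList<>();
                for (int consumer = 0; consumer < consumerCount; consumer++) {
                    all.add(consumer);
                }
                return all;
        }
    }

    public static void main(String[] args) {
        System.out.println(consumersFor(DataMovement.ONE_TO_ONE, 2, 0, 4));     // prints [2]
        System.out.println(consumersFor(DataMovement.BROADCAST, 0, 0, 3));      // prints [0, 1, 2]
        System.out.println(consumersFor(DataMovement.SCATTER_GATHER, 5, 1, 4)); // prints [1]
    }
}
```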


Tez – Deep Dive – Scheduling

[Diagram: for vertices map1 and reduce1, the Vertex Scheduler starts vertices and tasks, the DAG Scheduler is asked for priority (Get Priority), and the Task Scheduler is asked for containers (Get container)]

• Vertex Scheduler: determines when tasks in a vertex can start
• DAG Scheduler: determines priority of task
• Task Scheduler: allocates containers from YARN and assigns them to tasks


Tez – Deep Dive – Task Execution

[Diagram: a Task Attempt (logical, in the AM) starts a container with env, cmd line and resources; the Task Attempt (real, on the machine) runs a Task JVM that gets the task and runs Input, Processor and Output, exchanging Data Events and Data Information]

• Start task shell with user specified env, resources etc.
• Fetch and instantiate Input, Processor, Output objects
• Receive (incremental) input information and process the input
• Provide output information


Tez - Sessions

• The amount of work programmed into a script/query may not be doable within a single Tez DAG.


Tez - Sessions

• Even better performance gains may be achieved through caching with the session: Within AM or container


Tez – Automatic Reduce Parallelism

[Diagram: in the App Master, the Map Vertex sends Data Size Statistics to the Vertex Manager of the Reduce Vertex, which drives the Vertex State Machine: Set Parallelism, Cancel Task, Re-Route]

Event Model: Map tasks send data statistics events to the Reduce Vertex Manager.

Vertex Manager: Pluggable user logic that understands the data statistics and can formulate the correct parallelism. Advises the vertex controller on parallelism.
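As a rough illustration of what such pluggable logic might compute, the sketch below sums the reported map output sizes and caps the reducer count accordingly. `ReduceParallelismSketch`, `chooseParallelism` and the 1 GB-per-reducer target are assumptions made for this example, not Tez's actual policy; the numbers echo the earlier slide where 50 maps with about 10 GB of data end up with 10 reducers instead of 100.

```java
// Hypothetical parallelism rule; the bytes-per-reducer target is an assumption.
class ReduceParallelismSketch {
    /**
     * Chooses a reducer count from observed map output sizes.
     * Never exceeds the configured count and never drops below one.
     */
    static int chooseParallelism(long[] mapOutputBytes, long bytesPerReducer,
                                 int configuredReducers) {
        long total = 0;
        for (long bytes : mapOutputBytes) {
            total += bytes;
        }
        // Ceiling division: enough reducers so each handles at most bytesPerReducer.
        long needed = Math.max(1, (total + bytesPerReducer - 1) / bytesPerReducer);
        return (int) Math.min(configuredReducers, needed);
    }

    /** 50 maps of 200 MB each (about 10 GB), 1 GB per reducer, 100 configured reducers. */
    static int slideLikeExample() {
        long[] outputs = new long[50];
        java.util.Arrays.fill(outputs, 200L * 1024 * 1024);
        return chooseParallelism(outputs, 1024L * 1024 * 1024, 100);
    }

    public static void main(String[] args) {
        System.out.println(slideLikeExample()); // prints 10
    }
}
```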


Tez – Reduce Slow Start/Pre-launch

[Diagram: in the App Master, the Map Vertex sends Task Completed events to the Vertex Manager of the Reduce Vertex, which tells the Vertex State Machine to Start Tasks]

Event Model: Map completion events are sent to the Reduce Vertex Manager.

Vertex Manager: Pluggable user logic that understands the data size. Advises the vertex controller to launch the reducers before all maps have completed so that shuffle can start.
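A slow-start policy of this kind could be sketched as a ramp between two completion fractions. Everything here is an assumption for illustration: the class `SlowStartSketch`, the method `reducersToLaunch` and the `minFraction`/`maxFraction` parameters are hypothetical and are not presented as Tez's implementation.

```java
// Hypothetical slow-start rule; parameter names and the linear ramp are assumptions.
class SlowStartSketch {
    /**
     * Returns how many reducers should have been launched so far.
     * Below minFraction of completed maps nothing starts; at maxFraction or above
     * all reducers start; in between the count ramps up linearly.
     */
    static int reducersToLaunch(int completedMaps, int totalMaps, int totalReducers,
                                double minFraction, double maxFraction) {
        double done = (double) completedMaps / totalMaps;
        if (done < minFraction) {
            return 0;
        }
        if (done >= maxFraction) {
            return totalReducers;
        }
        double ramp = (done - minFraction) / (maxFraction - minFraction);
        return (int) Math.floor(ramp * totalReducers);
    }

    public static void main(String[] args) {
        System.out.println(reducersToLaunch(10, 100, 20, 0.25, 0.75)); // prints 0
        System.out.println(reducersToLaunch(50, 100, 20, 0.25, 0.75)); // prints 10
        System.out.println(reducersToLaunch(75, 100, 20, 0.25, 0.75)); // prints 20
    }
}
```

Launching some reducers early lets them begin fetching map output while the remaining maps are still running, which is the overlap the slide refers to.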


Tez – Current status

• Apache Incubator Project
  – Rapid development. Over 330 JIRAs opened. Over 220 resolved.
  – Growing community.
• Focus on stability
  – Testing and quality are highest priority.
  – Working on Tez+YARN to fix basic performance overheads.
  – Code ready and deployed on multi-node environments.
• DAG of MR processing is working
  – Already functionally equivalent to Map Reduce. Existing Map Reduce jobs can be executed on Tez with few or no changes.
  – Working Hive prototype that can target Tez for execution of queries (HIVE-4660).
  – Work started on prototype of Pig that can target Tez.


Tez – Current status

[Diagram: Typical pattern in a TPC-DS query. A Fact Table is joined with Dimension Table 1, Dimension Table 2 and Dimension Table 3 in turn, producing Result Table 1, Result Table 2 and Result Table 3. A second panel, "Optimization for small data sets", shows the Fact Table with the dimension tables.]

Both can now run as a single Tez job


Tez – MRR Performance

[Chart: TPC-DS Query 12 with Hive on Tez. Elapsed Time (seconds), axis 0 to 80, for RC File Scale 200, ORC File Scale 200, RC File Scale 1000 and ORC File Scale 1000, comparing Traditional Map-Reduce with Tez Map-Reduce-Reduce. Data labels on the chart: 55, 54, 75, 65 and 35, 34, 55, 46.]


Tez – Roadmap

• Full DAG support
  – Multi-way input and output.
  – Other graph connection patterns.
• Performance optimizations
  – Container reuse
  – Cross task shared resources
  – Using HDFS data caching
• Runtime plan optimizations
  – Automatic input (map) parallelism
  – Automatic aggregation (reduce) parallelism
• Usability
  – Stability and testability
  – Recovery and history


Tez – Community

• Early adopters and contributors welcome
  – Adopters to drive more scenarios. Contributors to make them happen.
  – Hive and Pig communities are on-board and making great progress: HIVE-4660 and PIG-3446
• Stay tuned for Tez meetups with deep dives on Tez architecture and using Tez
  – http://www.meetup.com/Apache-Tez-User-Group
• Useful links
  – Work tracking: https://issues.apache.org/jira/browse/TEZ
  – Code: https://github.com/apache/incubator-tez
  – Developer list: [email protected]
  – User list: [email protected]
  – Issues list: [email protected]





Tez – Takeaways

• Distributed execution framework that works on computations represented as dataflow graphs

• Naturally maps to execution plans produced by query optimizers

• Execution architecture designed to enable dynamic performance optimizations at runtime

• Open source Apache project – your use-cases and code are welcome

• It works and is already being used by Hive


Tez


Tez: https://github.com/apache/tez.git

Demo: https://github.com/t3rmin4t0r/tez-autobuild

Thanks for your time and attention!

Questions?



