Apache Hadoop YARN

2/23/13 · Introduction To YARN · Adam Kawa, Spotify · The 9th Meeting of Warsaw Hadoop User Group


DESCRIPTION

Introduction to Apache Hadoop YARN at Warsaw Hadoop User Group (WHUG)

TRANSCRIPT

Page 1: Apache Hadoop YARN

2/23/13

Introduction To YARN

Adam Kawa, Spotify

The 9th Meeting of Warsaw Hadoop User Group

Page 2: Apache Hadoop YARN

About Me

Data Engineer at Spotify, Sweden

Hadoop Instructor at Compendium (Cloudera Training Partner)

2.5+ years of experience in Hadoop

Page 3: Apache Hadoop YARN

“Classical” Apache Hadoop Cluster

Can consist of one, five, a hundred, or 4,000 nodes

Page 4: Apache Hadoop YARN

Hadoop Golden Era

One of the most exciting open-source projects on the planet

Becomes the standard for large-scale processing

Successfully deployed by hundreds/thousands of companies

Like a salutary virus

... But it has a few drawbacks

Page 5: Apache Hadoop YARN

HDFS Main Limitations

A single NameNode that

keeps all metadata in RAM

performs all metadata operations

becomes a single point of failure (SPOF)

(not the topic of this presentation…)

Page 6: Apache Hadoop YARN

“Classical” MapReduce Limitations

Limited scalability

Poor resource utilization

Lack of support for alternative frameworks

Lack of wire-compatible protocols

Page 7: Apache Hadoop YARN

Limited Scalability

Scales to only

~4,000-node clusters

~40,000 concurrent tasks

Smaller and less-powerful clusters must be created

Image source: http://www.flickr.com/photos/rsms/660332848/sizes/l/in/set-72157607427138760/

Page 8: Apache Hadoop YARN

Poor Cluster Resources Utilization

Hard-coded values. TaskTrackers must be restarted after a change.

Page 9: Apache Hadoop YARN

Poor Cluster Resources Utilization

Map tasks are waiting for slots that are NOT currently used by Reduce tasks

Hard-coded values. TaskTrackers must be restarted after a change.

Page 10: Apache Hadoop YARN

Resource Needs That Vary Over Time

How can the number of map and reduce slots be hard-coded efficiently?
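The waste caused by fixed slot counts can be sketched with a toy model (not Hadoop code; the function and numbers are illustrative): a reduce slot can never run a map task, so idle slots of one kind coexist with queued tasks of the other.

```python
# Toy model (not Hadoop code): a TaskTracker with hard-coded slot counts.
# Map tasks queue up even while reduce slots sit idle, because a reduce
# slot can never run a map task.

def wasted_slots(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Return the number of idle slots that cannot serve the pending work."""
    idle_map = max(0, map_slots - pending_maps)
    idle_reduce = max(0, reduce_slots - pending_reduces)
    waste = 0
    if pending_maps > map_slots:      # maps are queued...
        waste += idle_reduce          # ...yet reduce slots sit idle
    if pending_reduces > reduce_slots:
        waste += idle_map
    return waste

# Early in a job: 100 map tasks pending, no reduce tasks yet.
print(wasted_slots(map_slots=8, reduce_slots=4,
                   pending_maps=100, pending_reduces=0))  # 4
```

Whatever split is hard-coded, one phase of the job leaves the other kind of slot idle, which is exactly the utilization problem YARN's generic containers remove.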

Page 11: Apache Hadoop YARN

(Secondary) Benefits Of Better Utilization

The same calculation on a more efficient Hadoop cluster

Less data center space

Less silicon waste

Less power utilization

Less carbon emissions

==

Less disastrous environmental consequences

Image source: http://globeattractions.com/wp-content/uploads/2012/01/green-leaf-drops-green-hd-leaf-nature-wet.jpg

Page 12: Apache Hadoop YARN

“Classical” MapReduce Daemons

Page 13: Apache Hadoop YARN

JobTracker Responsibilities

Manages the computational resources (map and reduce slots)

Schedules all user jobs

Schedules all tasks that belong to a job

Monitors task executions

Restarts failed tasks and speculatively runs slow tasks

Calculates job counter totals

Page 14: Apache Hadoop YARN

Simple Observation

JobTracker has lots of tasks...

Page 15: Apache Hadoop YARN

JobTracker Redesign Idea

Reduce responsibilities of JobTracker

Separate cluster resource management from job coordination

Use slaves (many of them!) to manage the life-cycle of jobs

Scale to (at least) 10K nodes, 10K jobs, 100K tasks

Page 16: Apache Hadoop YARN

Disappearance Of The Centralized JobTracker

Page 17: Apache Hadoop YARN

YARN – Yet Another Resource Negotiator

Page 18: Apache Hadoop YARN

Resource Manager and Application Master

Page 19: Apache Hadoop YARN

YARN Applications

MRv2 (simply MRv1 rewritten to run on top of YARN)

No need to rewrite existing MapReduce jobs

Distributed shell

Spark, Apache S4 (real-time processing)

Hamster (MPI on Hadoop)

Apache Giraph (graph processing)

Apache HAMA (matrix, graph and network algorithms)

... and http://wiki.apache.org/hadoop/PoweredByYarn

Page 20: Apache Hadoop YARN

NodeManager

More flexible and efficient than TaskTracker

Executes any computation that makes sense to the ApplicationMaster

Not only map or reduce tasks

Containers with variable resource sizes (e.g. RAM, CPU, network, disk)

No hard-coded split into map and reduce slots

Page 21: Apache Hadoop YARN

NodeManager Containers

NodeManager creates a container for each task

Containers have variable resource sizes

e.g. 2 GB RAM, 1 CPU, 1 disk

The number of created containers is limited

by the total sum of resources available on the NodeManager
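The admission rule above can be sketched as a simplified model (class and method names are illustrative, not the YARN API): a NodeManager admits containers of varying sizes until its total resources run out.

```python
# Simplified model (names are illustrative, not the YARN API): a NodeManager
# admits containers of varying sizes until its resources are exhausted.

class NodeManager:
    def __init__(self, total_mb, total_vcores):
        self.free_mb = total_mb
        self.free_vcores = total_vcores
        self.containers = []

    def try_allocate(self, mem_mb, vcores):
        """Admit a container if enough memory and CPU remain; else reject."""
        if mem_mb <= self.free_mb and vcores <= self.free_vcores:
            self.free_mb -= mem_mb
            self.free_vcores -= vcores
            self.containers.append((mem_mb, vcores))
            return True
        return False

nm = NodeManager(total_mb=8192, total_vcores=8)
print(nm.try_allocate(2048, 1))  # True
print(nm.try_allocate(6144, 2))  # True
print(nm.try_allocate(1024, 1))  # False - no memory left
```

Unlike the fixed map/reduce split, nothing here cares what kind of task runs inside a container; only the total resource budget limits admission.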

Page 22: Apache Hadoop YARN

Application Submission In YARN

Page 23: Apache Hadoop YARN

Resource Manager Components

Page 24: Apache Hadoop YARN

Uberization

Running small jobs in the same JVM as the ApplicationMaster

mapreduce.job.ubertask.enable: true (false is default)

mapreduce.job.ubertask.maxmaps: 9*

mapreduce.job.ubertask.maxreduces: 1*

mapreduce.job.ubertask.maxbytes: dfs.block.size*

* Users may override these values, but only downward
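The decision these properties control can be sketched as follows (a simplified illustration, not the actual check in Hadoop's MRAppMaster; the 128 MB default for maxbytes is an assumption standing in for dfs.block.size): a job runs as an uber task only if it is small on every axis.

```python
# Sketch of the uber-task decision (simplified; the real logic lives in
# Hadoop's MRAppMaster). A job is "uberized" - run inside the AM's JVM -
# only if it is small on every axis.

def is_uber_job(enabled, num_maps, num_reduces, input_bytes,
                maxmaps=9, maxreduces=1,
                maxbytes=128 * 1024 * 1024):  # assumed dfs.block.size
    return (enabled
            and num_maps <= maxmaps
            and num_reduces <= maxreduces
            and input_bytes <= maxbytes)

print(is_uber_job(True, 5, 1, 64 * 1024 * 1024))   # True
print(is_uber_job(True, 20, 1, 64 * 1024 * 1024))  # False - too many maps
```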

Page 25: Apache Hadoop YARN

Interesting Differences

Task progress, status, and counters are passed directly to the Application Master

Shuffle handlers (long-running auxiliary services in Node Managers) serve map outputs to reduce tasks

YARN does not support JVM reuse for map/reduce tasks

Page 26: Apache Hadoop YARN

Fault-Tolerance

Failure of running tasks or NodeManagers is handled similarly to MRv1

Applications can be retried several times: yarn.resourcemanager.am.max-retries 1

The Resource Manager can start the Application Master in a new container: yarn.app.mapreduce.am.job.recovery.enable false

Page 27: Apache Hadoop YARN

Resource Manager Failure

Two interesting features

RM restarts without the need to re-run currently running/submitted applications [YARN-128]

ZK-based High Availability (HA) for RM [YARN-149] (still in progress)

Page 28: Apache Hadoop YARN

Wire-Compatible Protocol

Client and cluster may use different versions

More manageable upgrades

Rolling upgrades without disrupting the service

Active and standby NameNodes upgraded independently

Protocol buffers chosen for serialization (instead of Writables)

Page 29: Apache Hadoop YARN

YARN Maturity

Still considered as both production and not production-ready

The code is already promoted to the trunk

Yahoo! runs it on 2,000- and 6,000-node clusters

Using Apache Hadoop 2.0.2 (alpha)

Anyway, it will replace MRv1 sooner rather than later

Page 30: Apache Hadoop YARN

How To Quickly Setup The YARN Cluster

Page 31: Apache Hadoop YARN

Basic YARN Configuration Parameters

yarn.nodemanager.resource.memory-mb 8192

yarn.nodemanager.resource.cpu-cores 8

yarn.scheduler.minimum-allocation-mb 1024

yarn.scheduler.maximum-allocation-mb 8192

yarn.scheduler.minimum-allocation-vcores 1

yarn.scheduler.maximum-allocation-vcores 32
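How the minimum/maximum settings constrain a request can be sketched as follows (a simplified model: YARN schedulers are commonly described as rounding a memory request up to a multiple of the minimum allocation and capping it at the maximum; exact behavior varies by scheduler and version).

```python
import math

# Simplified model of container-request normalization: memory is rounded
# up to a multiple of the minimum allocation and clamped to the maximum
# (a common description of YARN scheduler behavior; details vary).

def normalize_memory_request(requested_mb, min_alloc_mb=1024, max_alloc_mb=8192):
    rounded = math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb
    return min(max(rounded, min_alloc_mb), max_alloc_mb)

print(normalize_memory_request(1536))  # 2048 - rounded up to the next 1024 MB step
print(normalize_memory_request(9000))  # 8192 - capped at the maximum
```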

Page 32: Apache Hadoop YARN

Basic MR Apps Configuration Parameters

mapreduce.map.cpu.vcores 1

mapreduce.reduce.cpu.vcores 1

mapreduce.map.memory.mb 1536

mapreduce.map.java.opts -Xmx1024M

mapreduce.reduce.memory.mb 3072

mapreduce.reduce.java.opts -Xmx2560M

mapreduce.task.io.sort.mb 512
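Note the relationship between the pairs above: the JVM heap in `*.java.opts` must fit inside the container size in `*.memory.mb`, leaving headroom for JVM overhead and off-heap buffers. A quick illustrative check (the 85% ratio is an assumption based on the common rule of thumb, not a Hadoop setting):

```python
# Illustrative sanity check: the -Xmx heap must leave headroom inside the
# container. The 0.85 ratio is a rule-of-thumb assumption, not a Hadoop
# configuration value.

def heap_fits(container_mb, xmx_mb, headroom_ratio=0.85):
    """True if the heap is at most ~85% of the container size."""
    return xmx_mb <= container_mb * headroom_ratio

print(heap_fits(1536, 1024))  # True - the map settings above leave ~33% headroom
print(heap_fits(3072, 2560))  # True - the reduce settings leave ~17% headroom
```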

Page 33: Apache Hadoop YARN

Apache Whirr Configuration File
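The slide showed the configuration file itself, which did not survive transcription. A hedged sketch of what a Whirr properties file for a YARN cluster of that era typically contained (keys and role names as in Apache Whirr ~0.8; every value below is a placeholder, not taken from the slide):

```properties
# Illustrative hadoop-yarn.properties for Apache Whirr (values are
# placeholders, not from the original slide).
whirr.cluster-name=hadoop-yarn
whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,3 hadoop-datanode+yarn-nodemanager
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hadoop.version=2.0.2-alpha
```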

Page 34: Apache Hadoop YARN

Running And Destroying A Cluster

$ whirr launch-cluster --config hadoop-yarn.properties

$ whirr destroy-cluster --config hadoop-yarn.properties

Page 35: Apache Hadoop YARN

It Might Be A Demo

But what about running production applications on a real cluster consisting of hundreds of nodes?

Join Spotify!

[email protected]

Page 36: Apache Hadoop YARN

Image source: http://allthingsd.com/files/2012/07/10Questions.jpeg

Page 37: Apache Hadoop YARN

Thank You!