Meet Hadoop Family: part 2 YARN

Upload: caizerx

Post on 11-Jan-2017


TRANSCRIPT

Page 1: Meet Hadoop Family: part 2

Meet Hadoop Family: part 2

YARN

Page 2: Meet Hadoop Family: part 2
Page 3: Meet Hadoop Family: part 2

• What is it? A resource management platform for a Hadoop cluster. It allows dynamic memory and CPU sharing between processing frameworks such as MapReduce, Spark, and others.

• Purpose: more predictable performance and better cluster utilization

• Compared to MapReduce v1: MapReduce v1 starts to break down on clusters of more than 4000 nodes. YARN allows other frameworks to run on it and supports multi-tenancy. YARN is backward compatible with MapReduce v1.

Page 4: Meet Hadoop Family: part 2

YARN Architecture

Page 5: Meet Hadoop Family: part 2

Scheduler types

• FIFO Scheduler

• Capacity scheduler: fixed pools for resources; FIFO scheduling within each pool

• Fair scheduler: weighted pools for resources; fair sharing
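
As a baseline for comparing the three scheduler types, FIFO scheduling can be sketched in a few lines of Python. This is an illustration only, not YARN's implementation: the function name `fifo_allocate` and the notion of abstract "resource units" are assumptions made for the example.

```python
def fifo_allocate(capacity, demands):
    """Grant resources strictly in submission order (hypothetical helper).

    capacity: total resource units in the cluster
    demands:  list of (app_name, requested_units), oldest submission first
    """
    allocation = {}
    remaining = capacity
    for app_name, requested in demands:
        # Earlier applications are served first; later ones get what is left.
        granted = min(requested, remaining)
        allocation[app_name] = granted
        remaining -= granted
    return allocation


example = fifo_allocate(100, [("app1", 80), ("app2", 50), ("app3", 10)])
print(example)
```

With a capacity of 100 units, app1 receives its full 80, app2 receives the remaining 20, and app3 receives nothing until earlier applications release resources. This starvation of later, smaller jobs is the problem that pool-based schedulers aim to address.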

Page 6: Meet Hadoop Family: part 2

Capacity Scheduler

• Capacity is guaranteed for each pool, with hard limits and soft limits

• Hierarchical pool with a root pool

• Elasticity, with optional preemption
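
The interplay of guaranteed capacity, limits, and elasticity can be sketched as a two-pass allocation. The sketch assumes that the "soft limit" is a pool's guaranteed share (which it may exceed when other pools are idle) and the "hard limit" is its maximum share. The function `capacity_allocate`, the field names, and the flat (non-hierarchical) pool layout are all assumptions for illustration, not the Capacity Scheduler's actual code.

```python
def capacity_allocate(total, pools):
    """Two-pass sketch of capacity scheduling with elasticity (hypothetical).

    total: total resource units in the cluster (the root pool)
    pools: {name: {"guaranteed_pct": int, "maximum_pct": int, "demand": int}}
    """
    allocation = {}
    # Pass 1: every pool receives up to its guaranteed share.
    for name, pool in pools.items():
        guaranteed = total * pool["guaranteed_pct"] // 100
        allocation[name] = min(pool["demand"], guaranteed)

    # Pass 2 (elasticity): idle capacity is lent to pools that still have
    # unmet demand, but never beyond a pool's hard limit.
    idle = total - sum(allocation.values())
    for name, pool in pools.items():
        hard_limit = total * pool["maximum_pct"] // 100
        ceiling = min(pool["demand"], hard_limit)
        extra = max(0, min(idle, ceiling - allocation[name]))
        allocation[name] += extra
        idle -= extra
    return allocation


pools = {
    "prod": {"guaranteed_pct": 60, "maximum_pct": 80, "demand": 90},
    "dev": {"guaranteed_pct": 40, "maximum_pct": 60, "demand": 10},
}
print(capacity_allocate(100, pools))
```

Here "dev" uses only 10 of its guaranteed 40 units, so "prod" grows past its guaranteed 60 units but stops at its hard limit of 80. If "dev" later needs its guaranteed share back, it either waits for "prod" to release containers or, with preemption enabled, reclaims them.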

Page 7: Meet Hadoop Family: part 2

Preemption Option

• T1: Time of App2’s submission

• T2: Time at which App1 can finish

• T3: Time at which App2 can finish

Page 8: Meet Hadoop Family: part 2

Fair Scheduler

• Each application is assigned to a pool; subpools are possible

• Excess capacity is spread across all pools

• Pools with minimum resources defined receive priority during allocation

• Minimum resources are the minimum amount of resources that must be allocated to a pool before any fair allocation takes place; they are often used to satisfy an SLA (service level agreement)

• Pools can be assigned a weight

• Preemption types: minimum share and fair share
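
The points above (minimum resources first, then weighted sharing of the excess) can be sketched as follows. This is a simplified model, not the Fair Scheduler's real algorithm: the function `fair_allocate` and its field names are hypothetical, and it assumes the sum of all minimums fits within the cluster.

```python
def fair_allocate(total, pools):
    """Weighted fair sharing with minimum resources (hypothetical sketch).

    total: total resource units in the cluster
    pools: {name: {"weight": number, "min": number, "demand": number}}
    """
    # Step 1: satisfy each pool's minimum resources before any fair sharing.
    allocation = {n: min(p["min"], p["demand"]) for n, p in pools.items()}
    remaining = total - sum(allocation.values())

    # Step 2: split the excess by weight; capacity a pool cannot use is
    # redistributed among the pools that still have unmet demand.
    active = {n for n, p in pools.items() if allocation[n] < p["demand"]}
    while remaining > 1e-9 and active:
        weight_sum = sum(pools[n]["weight"] for n in active)
        satisfied = set()
        granted_total = 0
        for n in active:
            share = remaining * pools[n]["weight"] / weight_sum
            need = pools[n]["demand"] - allocation[n]
            grant = min(need, share)
            allocation[n] += grant
            granted_total += grant
            if grant >= need:
                satisfied.add(n)
        remaining -= granted_total
        if not satisfied:
            break  # every active pool took its full share; nothing is left
        active -= satisfied
    return allocation


pools = {
    "etl": {"weight": 2, "min": 20, "demand": 100},
    "adhoc": {"weight": 1, "min": 0, "demand": 100},
    "reports": {"weight": 1, "min": 0, "demand": 10},
}
print(fair_allocate(100, pools))
```

In this example "reports" needs only 10 units, so the part of its weighted share that it cannot use flows back to "etl" and "adhoc" in a 2:1 ratio, matching the idea that excess capacity is spread across pools.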

Page 9: Meet Hadoop Family: part 2
Page 10: Meet Hadoop Family: part 2
Page 11: Meet Hadoop Family: part 2

• Resource manager web interface, port 8088

• Job history web interface, port 19888
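
The same ports also serve REST endpoints, which can be handy for scripting. The sketch below builds URLs for the ResourceManager's cluster applications endpoint and the job history server's MapReduce jobs endpoint. The host names are placeholders, the helper names are hypothetical, and the response layout handled in `list_app_ids` reflects the JSON structure as commonly documented; treat it as an assumption to verify against your Hadoop version.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def rm_apps_url(host, port=8088, states=None):
    """URL of the ResourceManager applications endpoint (hypothetical helper)."""
    base = f"http://{host}:{port}/ws/v1/cluster/apps"
    if states:
        # e.g. states=["RUNNING", "ACCEPTED"] becomes one comma-separated value
        return base + "?" + urlencode({"states": ",".join(states)})
    return base


def history_jobs_url(host, port=19888):
    """URL of the job history server's MapReduce jobs endpoint."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body; needs a reachable cluster."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def list_app_ids(payload):
    """Extract application IDs from a decoded applications response."""
    apps = (payload.get("apps") or {}).get("app") or []
    return [app["id"] for app in apps]


print(rm_apps_url("rm.example.com", states=["RUNNING"]))
print(history_jobs_url("history.example.com"))
```

On a live cluster, `list_app_ids(fetch_json(rm_apps_url("<resourcemanager_host>")))` would return the IDs that the web interface shows on its applications page.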

Page 12: Meet Hadoop Family: part 2

Log Aggregation

• Logs can be grouped by application

• Logs are stored in HDFS (they were not in MapReduce v1)

• Gives better load balancing when writing logs
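
Grouping by application works because every container ID embeds the ID of the application it belongs to. The sketch below illustrates the idea only; it is not how the NodeManagers perform aggregation. It assumes the conventional ID layout `container_[e<epoch>_]<clusterTimestamp>_<appSeq>_<attempt>_<container>`, and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Optional "e<epoch>_" prefix, then cluster timestamp and application sequence.
_CONTAINER_RE = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+$")


def application_id_for(container_id):
    """Derive the application ID embedded in a container ID."""
    match = _CONTAINER_RE.match(container_id)
    if match is None:
        raise ValueError(f"unrecognized container ID: {container_id!r}")
    cluster_timestamp, app_sequence = match.groups()
    return f"application_{cluster_timestamp}_{app_sequence}"


def group_by_application(entries):
    """Group (container_id, message) pairs by their owning application."""
    grouped = defaultdict(list)
    for container_id, message in entries:
        grouped[application_id_for(container_id)].append(message)
    return dict(grouped)


entries = [
    ("container_1410901177871_0001_01_000001", "AM started"),
    ("container_1410901177871_0002_01_000001", "AM started"),
    ("container_1410901177871_0001_01_000002", "map task done"),
]
print(group_by_application(entries))
```

Both log lines from application 0001 end up under the same key even though they came from different containers, which is the view that aggregated logs give per application.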

Page 13: Meet Hadoop Family: part 2

Common Commands

• Show applications
  yarn application -list
  yarn application -list -appStates ALL
  yarn application -status <application_id>
  yarn application -list -appStates FINISHED

• Kill application
  yarn application -kill <application_id>

• Show logs
  yarn logs -applicationId <application_id>

• List YARN nodes
  yarn node -list
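
For scripting, these commands can be wrapped in a thin Python layer. The sketch below only assembles argument lists for the `yarn` CLI and runs them when the binary is on the PATH; the helper names are hypothetical and the output is returned as raw text rather than parsed.

```python
import shutil
import subprocess


def list_apps_argv(states=None):
    """Arguments for listing applications, optionally filtered by state."""
    argv = ["yarn", "application", "-list"]
    if states:
        argv += ["-appStates", ",".join(states)]
    return argv


def status_argv(application_id):
    return ["yarn", "application", "-status", application_id]


def kill_argv(application_id):
    return ["yarn", "application", "-kill", application_id]


def logs_argv(application_id):
    return ["yarn", "logs", "-applicationId", application_id]


def nodes_argv():
    return ["yarn", "node", "-list"]


def run(argv):
    """Run a yarn command and return its standard output as text."""
    if shutil.which(argv[0]) is None:
        raise RuntimeError("the 'yarn' command was not found on PATH")
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout


print(" ".join(list_apps_argv(states=["FINISHED"])))
```

Keeping argument construction separate from execution makes the wrapper testable on machines without a Hadoop client installed; on a cluster node, `run(nodes_argv())` returns the same text as typing the command in a shell.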

Page 14: Meet Hadoop Family: part 2

Questions?

https://www.meetup.com/Jakarta-Hadoop-Big-Data/