Apache YARN - Hadoop Cluster Management
YARN Core
Resource Manager (RM)
Node Managers (NM)
Application Masters (AM)
[Diagram: Node 1 runs the RM; Node 2 and Node 3 each run an NM and host the AMs]
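The division of labour between the three components can be sketched with a toy model. This is not the YARN API: the class names, the memory sizes, the node names, and the "most free memory first" placement rule are all assumptions made for illustration. It only shows that the RM places containers on NMs, and that the first container of an application hosts its AM.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy Node Manager: tracks free memory and the containers it hosts."""
    name: str
    free_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id: str, role: str, mb: int) -> bool:
        if mb > self.free_mb:
            return False
        self.free_mb -= mb
        self.containers.append((app_id, role, mb))
        return True


class ResourceManager:
    """Toy Resource Manager: tries the node with the most free memory first."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, app_id: str, role: str, mb: int):
        for node in sorted(self.nodes, key=lambda n: -n.free_mb):
            if node.launch(app_id, role, mb):
                return node.name
        return None  # no node has enough free memory

    def submit(self, app_id: str, am_mb: int, task_mb: int, tasks: int):
        # The first container of an application hosts its Application Master.
        placements = [self.allocate(app_id, "AM", am_mb)]
        # The AM then requests task containers from the RM.
        placements += [self.allocate(app_id, "task", task_mb) for _ in range(tasks)]
        return placements


# Hypothetical two-worker cluster, matching Node 2 and Node 3 in the diagram.
nodes = [NodeManager("node2", 4096), NodeManager("node3", 4096)]
rm = ResourceManager(nodes)
placements = rm.submit("app_1", am_mb=1024, task_mb=2048, tasks=3)
print(placements)
```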
Capacity Scheduler
Organize jobs into queues
Use resources of other queues if they are not busy
Preemption
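The three points above can be illustrated with a simplified model of elastic queue capacity. The queue names, the percentages, and the `share_cluster` helper are hypothetical, and the real Capacity Scheduler has more settings than this sketch covers. Each queue first receives up to its guaranteed share, and idle capacity is then lent to queues that still have demand.

```python
def share_cluster(total, guaranteed, demand):
    """Give each queue min(demand, guarantee), then lend idle capacity out."""
    alloc = {q: min(demand[q], guaranteed[q]) for q in guaranteed}
    idle = total - sum(alloc.values())
    for q in guaranteed:
        extra = min(idle, demand[q] - alloc[q])
        alloc[q] += extra
        idle -= extra
    return alloc


# Hypothetical queues with guarantees in percent of the cluster.
guaranteed = {"etl": 60, "adhoc": 40}

# "etl" is mostly idle, so "adhoc" borrows beyond its 40% guarantee.
borrowed = share_cluster(100, guaranteed, {"etl": 10, "adhoc": 90})

# "etl" becomes busy again; "adhoc" falls back to its guarantee.
reclaimed = share_cluster(100, guaranteed, {"etl": 60, "adhoc": 90})

# The difference is what preemption would have to take back from "adhoc".
to_preempt = borrowed["adhoc"] - reclaimed["adhoc"]
print(borrowed, reclaimed, to_preempt)
```

In this model preemption is only the difference between the two allocations. How and when containers are actually killed is outside the sketch.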
Hive and MapReduce
Multiple MapReduce jobs per query
Separate Application Master and containers for each job
10+ seconds of overhead in a busy cluster
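A rough back-of-the-envelope calculation shows why this matters. It assumes that the 10-second figure applies to each job and that the jobs of a query run one after another. Both are assumptions, not statements from the slides, and the four-job query is hypothetical.

```python
def scheduling_overhead_seconds(jobs_per_query, overhead_per_job=10.0):
    """Lower bound on time spent starting AMs and containers for one query."""
    return jobs_per_query * overhead_per_job


# Hypothetical query that compiles to four sequential MapReduce jobs.
overhead = scheduling_overhead_seconds(4)
print(overhead)
```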
Spark
Driver in Client or Application Master
Spark Web UI – linked as the Application Master URL in the Resource Manager UI
Easy deployment
YARN and Docker
Running Docker containers in YARN
Isolating applications
Packaging complex applications
Llama – Impala on YARN
Get resources from YARN
Single Application Master per queue – multiplex all Impala requests
YARN vs Mesos
Mesos
Scalable, global resource manager for the entire data center
2-level scheduler
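Two-level scheduling can be sketched as a single round of resource offers. This is a toy model, not the Mesos API: the `Framework` class, the `offer_round` function, the framework names, and the CPU counts are all hypothetical. The first level (the master) decides what to offer to whom; the second level (each framework) decides how much of the offer to accept.

```python
class Framework:
    """Second level: decides how much of an offer to accept for its own tasks."""

    def __init__(self, name, cpus_per_task, pending_tasks):
        self.name = name
        self.cpus_per_task = cpus_per_task
        self.pending_tasks = pending_tasks

    def consider(self, offered_cpus):
        """Return the number of CPUs accepted from the offer."""
        runnable = min(self.pending_tasks, offered_cpus // self.cpus_per_task)
        self.pending_tasks -= runnable
        return runnable * self.cpus_per_task


def offer_round(free_cpus, frameworks):
    """First level: the master offers the free CPUs to each framework in turn."""
    accepted = {}
    for fw in frameworks:
        used = fw.consider(free_cpus)
        accepted[fw.name] = used
        free_cpus -= used
    return accepted, free_cpus


frameworks = [Framework("framework_a", 2, 3), Framework("framework_b", 4, 5)]
accepted, left = offer_round(16, frameworks)
print(accepted, left)
```

The master never needs to know what a task is. It tracks only which resources are free, which is part of what lets this design scale across a data center.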
Myriad
Dynamic YARN