resource management and query optimization in the cloud · centralized resource management [yarn,...

48

Upload: others

Post on 17-Apr-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •
Page 2: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

cluster ROIConsolidate workloads

• Wide variety

Workload Heterogeneity in Shared Clusters

Page 3: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

APIs

high cluster utilization

centralized distributed

Resource Management in Shared Clusters

Page 4: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

Page 5: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

Page 6: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

Page 7: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

Page 8: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Page 9: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Page 10: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Page 11: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Page 12: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Page 13: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Centralized vs. Distributed Scheduling

Centralized Distributed

Workload heterogeneity

Task placement

Enforcing scheduling

invariants

Allocation latency

Slot utilization

Scalability

Page 14: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

• “Trade performance guarantees for allocation latency”

choose among scheduling typesBased on job type, job characteristics, cluster load, etc.

Mercury provides a programmatic way to use otherwise idle resources

Mercury: Key Insight

Page 15: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

• “Trade performance guarantees for allocation latency”

choose among scheduling typesBased on job type, job characteristics, cluster load, etc.

Mercury provides a programmatic way to use otherwise idle resources

Mercury: Key Insight

Gains over YARN:

Up to 40% task throughput

Up to 66% mean job latency

Page 16: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •
Page 17: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

Page 18: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

Page 19: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

resource type

Page 20: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

resource type

Page 21: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Container Types

GUARANTEED containers

QUEUEABLE containers

• opportunistically

Page 22: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Container Types

GUARANTEED containers

QUEUEABLE containers

• opportunistically

central

distributed

Page 23: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Use of Container Types in DAGs: Examples

Page 24: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

• LocalRM

• Queuing

• Framework

• Application

Mercury Architecture over YARN

Page 25: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

Page 26: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

Page 27: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

request(GUARANTEED, …)

Page 28: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

request(GUARANTEED, …)

allocate(…)

Page 29: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

request(GUARANTEED, …)

allocate(…)

Page 30: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

GUARANTEED Request and Allocation

start(GUARANTEED, …)

request(GUARANTEED, …)

allocate(…)

Page 31: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

QUEUEABLE Request and Allocation

Page 32: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

QUEUEABLE Request and Allocation

Page 33: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

QUEUEABLE Request and Allocation

request(QUEUEABLE, …)

Page 34: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

QUEUEABLE Request and Allocation

request(QUEUEABLE, …)allocate(…)

Page 35: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

QUEUEABLE Request and Allocation

start(QUEUEABLE, …)

request(QUEUEABLE, …)allocate(…)

Page 36: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Task Execution: Conflict Resolutiontwo priorities

types of schedulers shared resources

Page 37: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Application Policies

• container type to be requested for each task

Page 38: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Framework Policies

Page 39: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

rebalance

reorderingjob arrival time

QUEUEABLE containers per node

Load Shaping Policies

Mercury Runtime

Mercury Runtime

Mercury Runtime

Page 40: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •
Page 41: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Experimental Setup

Page 42: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Task Throughput for Increasing Task Duration

Page 43: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Cosmos-based Workload: Task Throughput

Page 44: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Cosmos-based Workload: Job Latency

Page 45: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •
Page 46: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Application Engines

M/R AM REEFTezSpark Runtime

Cluster-wide resource management: YARN++

YARN + Federation

YARN + Rayon

YARN + Mercury

YARN + Mercury

YARN + Mercury YARN + Mercury YARN + Mercury

Per-job/framework Resource Management

Hive …Storm Giraph PigSpark

The Bigger Picture

Page 47: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •

Conclusion

Page 48: Resource Management and Query Optimization in the Cloud · Centralized Resource Management [YARN, Mesos, Omega, Borg] Node Manager Node Manager Node Manager • •