resource management and query optimization in the cloud · centralized resource management [yarn,...

Post on 17-Apr-2020

3 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

cluster ROIConsolidate workloads

• Wide variety

Workload Heterogeneity in Shared Clusters

APIs

high cluster utilization

centralized distributed

Resource Management in Shared Clusters

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

Centralized Resource Management[YARN, Mesos, Omega, Borg]

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Distributed Resource Management[Apollo, Sparrow]

Node Manager

Node Manager

Node Manager

Centralized vs. Distributed Scheduling

Centralized Distributed

Workload heterogeneity

Task placement

Enforcing scheduling

invariants

Allocation latency

Slot utilization

Scalability

• “Trade performance guarantees for allocation latency”

choose among scheduling typesBased on job type, job characteristics, cluster load, etc.

Mercury provides a programmatic way to use otherwise idle resources

Mercury: Key Insight

• “Trade performance guarantees for allocation latency”

choose among scheduling typesBased on job type, job characteristics, cluster load, etc.

Mercury provides a programmatic way to use otherwise idle resources

Mercury: Key Insight

Gains over YARN:

Up to 40% task throughput

Up to 66% mean job latency

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

resource type

Mercury Architecture (Conceptual)

Mercury Runtime

Mercury Runtime

Mercury Runtime

Mercury Resource Management Framework•

resource type

Container Types

GUARANTEED containers

QUEUEABLE containers

• opportunistically

Container Types

GUARANTEED containers

QUEUEABLE containers

• opportunistically

central

distributed

Use of Container Types in DAGs: Examples

• LocalRM

• Queuing

• Framework

• Application

Mercury Architecture over YARN

GUARANTEED Request and Allocation

GUARANTEED Request and Allocation

GUARANTEED Request and Allocation

request(GUARANTEED, …)

GUARANTEED Request and Allocation

request(GUARANTEED, …)

allocate(…)

GUARANTEED Request and Allocation

request(GUARANTEED, …)

allocate(…)

GUARANTEED Request and Allocation

start(GUARANTEED, …)

request(GUARANTEED, …)

allocate(…)

QUEUEABLE Request and Allocation

QUEUEABLE Request and Allocation

QUEUEABLE Request and Allocation

request(QUEUEABLE, …)

QUEUEABLE Request and Allocation

request(QUEUEABLE, …)allocate(…)

QUEUEABLE Request and Allocation

start(QUEUEABLE, …)

request(QUEUEABLE, …)allocate(…)

Task Execution: Conflict Resolutiontwo priorities

types of schedulers shared resources

Application Policies

• container type to be requested for each task

Framework Policies

rebalance

reorderingjob arrival time

QUEUEABLE containers per node

Load Shaping Policies

Mercury Runtime

Mercury Runtime

Mercury Runtime

Experimental Setup

Task Throughput for Increasing Task Duration

Cosmos-based Workload: Task Throughput

Cosmos-based Workload: Job Latency

Application Engines

M/R AM REEFTezSpark Runtime

Cluster-wide resource management: YARN++

YARN + Federation

YARN + Rayon

YARN + Mercury

YARN + Mercury

YARN + Mercury YARN + Mercury YARN + Mercury

Per-job/framework Resource Management

Hive …Storm Giraph PigSpark

The Bigger Picture

Conclusion

top related