Page 1: Mesos

Motivation Mesos Implementation Evaluation Conclusion

Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center

Muhammad Anis uddin Nasir

KTH Royal Institute of Technology

November 20, 2012

Muhammad Anis uddin Nasir Mesos 1/32

Page 2: Mesos

1 Motivation

2 Mesos
  Overview
  Architecture

3 Implementation

4 Evaluation

5 Conclusion

Page 3: Mesos

Motivation

Diverse cluster computing frameworks

Page 4: Mesos

Motivation

No optimal framework

Run multiple frameworks
  Higher utilization
  Data sharing between clusters
  Reduced cost

Page 9: Mesos

Existing Solutions

Static Partitioning

Virtual Machines

Page 11: Mesos

Fine-grained

Hadoop and Dryad
  Slots
  Tasks

Benefits
  Data locality
  Utilization

Page 18: Mesos

Mesos

Common resource sharing layer

Fine-grained sharing

Across diverse cluster computing frameworks

Page 19: Mesos

Goals

High utilization

Data sharing among frameworks

Page 21: Mesos

Challenges

Scalable

Support diverse frameworks

Efficient

Fault tolerant

Highly available

Page 26: Mesos

Other Benefits

Run multiple instances of the same framework
  Isolate production and experimental jobs
  Run multiple versions of a framework

Build specialized frameworks targeting particular problem domains

Page 30: Mesos

Design Elements

Fine-grained sharing
  Allocation at the level of tasks within a job
  Improves utilization, latency, and data locality

Resource offers
  Simple and scalable
  Application-controlled scheduling mechanism

Page 36: Mesos

Fine-Grained Sharing

Page 37: Mesos

Resource Offers

Mesos offers resources to frameworks

Each framework chooses resources according to its needs

Benefits
  Gives frameworks freedom in their implementation
  Keeps Mesos simple and scalable

Drawback
  Decentralized decisions might not be optimal
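The offer mechanism on this slide can be sketched as a toy exchange between a master and a framework scheduler. This is a minimal illustration under stated assumptions, not the Mesos API: the names `Offer`, `Task`, `GreedyScheduler`, `resource_offer`, and `run_master` are hypothetical, and resources are reduced to CPUs and memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Free resources on one node, as advertised by the master (hypothetical type)."""
    node: str
    cpus: int
    mem_gb: int


@dataclass(frozen=True)
class Task:
    """A pending task with fixed resource needs (hypothetical type)."""
    name: str
    cpus: int
    mem_gb: int


class GreedyScheduler:
    """Toy framework scheduler: it, not the master, decides what to launch."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offer(self, offer):
        """Return the tasks to launch on this offer; an empty list declines it."""
        launched = []
        cpus, mem = offer.cpus, offer.mem_gb
        for task in list(self.pending):  # iterate over a copy while removing
            if task.cpus <= cpus and task.mem_gb <= mem:
                launched.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem_gb
        return launched


def run_master(offers, scheduler):
    """Offer each node's free resources in turn and record what was launched."""
    placement = {}
    for offer in offers:
        accepted = scheduler.resource_offer(offer)
        placement[offer.node] = [t.name for t in accepted]
    return placement


offers = [Offer("node1", cpus=2, mem_gb=4), Offer("node2", cpus=4, mem_gb=8)]
tasks = [Task("t1", 2, 2), Task("t2", 2, 4), Task("t3", 1, 1)]
scheduler = GreedyScheduler(tasks)
placement = run_master(offers, scheduler)
print(placement)
```

The master in this sketch knows nothing about the tasks; it only hands out offers, which is why the mechanism stays simple. The placement is whatever the framework's own policy picks, so it need not be globally optimal.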

Page 44: Mesos

Architecture

Page 46: Mesos

More Features

Resource Allocation
  Fair sharing
  Strict priorities
  Delay sharing

Isolation
  Linux Containers
  Solaris Projects

Scalability and Robustness
  Filters
  Re-offering resources
  Incentives

Fault Tolerance
  Soft-state master
  Report node failures to frameworks
  Multiple schedulers
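The fair-sharing policy listed under Resource Allocation can be illustrated with a small max-min allocation over a single resource. This is only a sketch under stated assumptions: the function names `next_framework` and `allocate` are hypothetical, CPUs are the only resource, and the code is not taken from the Mesos allocation module.

```python
def next_framework(allocated, weights=None):
    """Pick the framework with the smallest weighted allocation.

    allocated: dict mapping framework name -> CPUs currently held.
    weights:   optional dict of relative weights (default 1.0 each).
    Ties are broken by framework name so the result is deterministic.
    """
    weights = weights or {}
    return min(allocated, key=lambda fw: (allocated[fw] / weights.get(fw, 1.0), fw))


def allocate(free_cpus, demand, weights=None):
    """Hand out CPUs one at a time to the neediest framework that still has demand."""
    allocated = {fw: 0 for fw in demand}
    remaining = dict(demand)
    while free_cpus > 0:
        candidates = {fw: allocated[fw] for fw in remaining if remaining[fw] > 0}
        if not candidates:
            break  # every framework is satisfied; leave the rest idle
        fw = next_framework(candidates, weights)
        allocated[fw] += 1
        remaining[fw] -= 1
        free_cpus -= 1
    return allocated


shares = allocate(10, {"hadoop": 8, "mpi": 2, "spark": 8})
print(shares)
```

In this example the small MPI demand is met in full and the two larger frameworks split what is left evenly.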

Page 62: Mesos

Implementation

10,000 lines of C++ code

Runs on Linux, Solaris, and OS X

Supports frameworks written in Java, C++, and Python

ZooKeeper

Frameworks
  Hadoop
  Torque and MPI
  Spark

Page 70: Mesos

Spark

Data flow for logistic regression

Page 72: Mesos

Dynamic Resource Sharing

96-node Mesos cluster

4 CPU cores and 15 GB RAM per node

Page 75: Mesos

Data Locality with Resource Offers

16 instances of Hadoop on 93 EC2 nodes

1.7x speedup with Mesos

97% data locality with 5 s delay scheduling
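The locality result on this slide relies on a framework declining offers for a bounded time until a node holding its data is offered. A minimal sketch of that waiting rule follows. The function `choose_offer`, its parameters, and the offer representation are hypothetical and chosen only for illustration.

```python
def choose_offer(offers, preferred_nodes, max_delay):
    """Return (node, time, is_local) for the first offer the framework accepts.

    offers:          iterable of (time_in_seconds, node) pairs in arrival order.
    preferred_nodes: nodes that hold the task's input data.
    max_delay:       how long to hold out for a local node before taking any node.
    Returns None if every offer was declined.
    """
    start = None
    for t, node in offers:
        if start is None:
            start = t  # the wait is measured from the first offer seen
        if node in preferred_nodes:
            return node, t, True  # local offer: accept immediately
        if t - start >= max_delay:
            return node, t, False  # waited long enough: accept a non-local node
        # otherwise decline and wait for a later offer
    return None


offers = [(0, "n7"), (1, "n3"), (2, "n5"), (6, "n9")]
local = choose_offer(offers, preferred_nodes={"n5"}, max_delay=5)
fallback = choose_offer(offers, preferred_nodes={"n1"}, max_delay=5)
print(local, fallback)
```

With a 5-second limit, the first call accepts the local node `n5` at time 2, while the second never sees a preferred node and falls back to `n9` once the limit has passed.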

Page 78: Mesos

Scalability

99 Amazon EC2 nodes

Scaled to 50,000 emulated slaves, 200 frameworks, and 100K tasks

Could not scale beyond 50,000 slaves because the Amazon EC2 cluster was the bottleneck

Page 81: Mesos

Further Experiments

Fault Tolerance
  Mesos masters connected to a 5-node ZooKeeper quorum
  Fault detection and recovery in 10 s

Overhead
  LINPACK for MPI and WordCount Hadoop benchmark
  Mesos overhead was less than 4%

Page 88: Mesos

Conclusion

Mesos shares clusters efficiently among diverse frameworks
  Fine-grained sharing at the level of tasks
  Resource offers, a scalable mechanism for application-controlled scheduling

Enables co-existence of current frameworks and development of new specialized frameworks

Page 92: Mesos

References

Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A. D., Katz, R., Shenker, S., et al. (n.d.). Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center.

http://incubator.apache.org/mesos/

http://datainthecloud.blogspot.se/2011/10/mesos-platform-for-fine-grained.html

https://www.usenix.org/conference/nsdi11/mesos-platform-fine-grained-resource-sharing-data-center

http://static.usenix.org/event/nsdi11/tech/slides/hindman.pdf

Page 94: Mesos

Different workloads

Homogeneous Tasks
  Elastic frameworks with constant task duration
  Rigid frameworks with exponential task duration

Placement Preferences
  Weighted fair allocation policy
  Lottery scheduling

Heterogeneous Tasks
  Random task assignment
  Reserve some resources on each node for small tasks
  Maximum task duration

Framework incentives
  Short tasks
  Elastic tasks
  Do not accept unknown resources

Page 95: Mesos

MotivationMesos

ImplementationEvaluationConclusion

Different workloads

Homogeneous Taskselastic frameworks with constant task duration

rigid frameworks with exponential task duration

Placement Preferences

weighted fair allocation policylottery scheduling

Heterogeneous Tasks

random task assignmentreserve some resources on each node for small tasksmaximum task duration

Framework incentives

short taskselastic tasksdo not accept unknown resources

Muhammad Anis uddin Nasir Mesos 31/32

Page 96: Mesos

MotivationMesos

ImplementationEvaluationConclusion

Different workloads

Homogeneous Taskselastic frameworks with constant task durationrigid frameworks with exponential task duration

Placement Preferences

weighted fair allocation policylottery scheduling

Heterogeneous Tasks

random task assignmentreserve some resources on each node for small tasksmaximum task duration

Framework incentives

short taskselastic tasksdo not accept unknown resources

Muhammad Anis uddin Nasir Mesos 31/32

Page 97: Mesos

MotivationMesos

ImplementationEvaluationConclusion

Different workloads

Homogeneous Taskselastic frameworks with constant task durationrigid frameworks with exponential task duration

Placement Preferences

weighted fair allocation policylottery scheduling

Heterogeneous Tasks

random task assignmentreserve some resources on each node for small tasksmaximum task duration

Framework incentives

short taskselastic tasksdo not accept unknown resources

Muhammad Anis uddin Nasir Mesos 31/32

Page 98: Mesos

MotivationMesos

ImplementationEvaluationConclusion

Different workloads

Homogeneous Taskselastic frameworks with constant task durationrigid frameworks with exponential task duration

Placement Preferencesweighted fair allocation policy

lottery scheduling

Heterogeneous Tasks

random task assignmentreserve some resources on each node for small tasksmaximum task duration

Framework incentives

short taskselastic tasksdo not accept unknown resources

Muhammad Anis uddin Nasir Mesos 31/32

Page 99: Mesos

MotivationMesos

ImplementationEvaluationConclusion

Different workloads

Homogeneous Taskselastic frameworks with constant task durationrigid frameworks with exponential task duration

Placement Preferencesweighted fair allocation policylottery scheduling

Heterogeneous Tasks

random task assignmentreserve some resources on each node for small tasksmaximum task duration

Framework incentives

short taskselastic tasksdo not accept unknown resources

Muhammad Anis uddin Nasir Mesos 31/32

Page 107: Mesos

Different workloads

Homogeneous Tasks
  elastic frameworks with constant task duration
  rigid frameworks with exponential task duration

Placement Preferences
  weighted fair allocation policy
  lottery scheduling

Heterogeneous Tasks
  random task assignment
  reserve some resources on each node for small tasks
  maximum task duration

Framework incentives
  short tasks
  elastic tasks
  do not accept unknown resources
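
The lottery scheduling item under Placement Preferences can be illustrated with a small sketch. It assumes each framework has an intended share (a weight) and that the next offer goes to a framework drawn with probability proportional to that weight. The function name, framework names, and weights are hypothetical and are not part of the Mesos API.

```python
import random


def lottery_offer(intended_shares, rng):
    """Pick the framework that receives the next resource offer.

    Each framework holds tickets in proportion to its intended share, so over
    many draws it is offered resources roughly that fraction of the time.
    """
    frameworks = list(intended_shares)
    weights = [intended_shares[f] for f in frameworks]
    return rng.choices(frameworks, weights=weights, k=1)[0]


# Hypothetical framework names and shares, chosen only for illustration.
shares = {"hadoop": 3, "mpi": 1}
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [lottery_offer(shares, rng) for _ in range(10_000)]
hadoop_fraction = draws.count("hadoop") / len(draws)
```

With weights of 3 and 1, the first framework should receive close to three quarters of the offers, which is the weighted fair outcome the slide refers to.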

Page 117: Mesos

Limitations

Fragmentation
  not optimal bin packing
  large jobs may starve
  minimum offer size on each slave

Interdependent framework constraints
  scenarios where only one task can be accommodated

Framework Complexity
  framework scheduling is complex
  framework has a choice
  failures are easy to handle with resource offers
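
The minimum offer size listed under Fragmentation can be sketched as a filter on which nodes are offered at all. This is a simplified illustration that tracks only free CPUs per node; the function name, node names, and numbers are hypothetical and do not reflect the real Mesos allocator interface.

```python
def offerable_nodes(free_cpus, min_offer_cpus):
    """Return the nodes whose free CPUs meet the minimum offer size.

    Nodes below the threshold are withheld until more resources free up,
    so small tasks cannot keep consuming every freed slice and starve
    a framework that needs a large allocation on one node.
    """
    return {node: free for node, free in free_cpus.items() if free >= min_offer_cpus}


# Hypothetical cluster state: free CPUs per node.
free = {"node1": 1, "node2": 4, "node3": 2}

without_minimum = offerable_nodes(free, 1)  # every node is offered
with_minimum = offerable_nodes(free, 4)     # only node2 is offered
```

The trade-off is that resources on withheld nodes sit idle while they accumulate, which is one reason the packing is not optimal.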
