Apache Mesos: a simple explanation of basics
Mesos
“There's Just No Getting around It: You're Building a Distributed System” -Mark Cavage
A simple presentation on mesos by Gladson Manuel
What is Mesos?
● Mesos is a kernel designed to run on distributed systems.
● In a distributed environment, Mesos runs on every machine.
● Its scheduler can handle multiple resource types, such as CPU and memory.
Why Mesos?
● Scalability
● Fault tolerance
● Docker support
● Isolation between tasks with Linux containers
● Frameworks can be built in Java/Python/Scala/C/C++
● Web UI
Mesos partitioning
● Mesos is a datacenter kernel, so the resources of a node do not belong to that node alone. They belong to the whole distributed system.
Zookeeper:
● Used to elect a new master if the running master fails.
-- It is recommended to keep the number of masters odd. Leader election is based on a strict majority: if the masters are split into two groups, only the group holding more than half of the nodes can elect a leader, and an odd count rules out an even split.
Zookeeper quorum:
The number of ZooKeeper servers that must agree before a decision is accepted: a strict majority of the configured servers.
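The majority rule can be checked with a little arithmetic. This is only a sketch: `quorum_size` and `tolerated_failures` are names made up for illustration and are not part of ZooKeeper or Mesos.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Members that can fail while a strict majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare an odd, an even, and the next odd ensemble size.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Going from 3 to 4 members raises the quorum but not the number of failures that can be survived, which is why odd sizes are the usual recommendation.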
● Slave 1 reports to the master that it has 4 CPUs and 4 GB of memory free. The master then invokes the allocation policy module, which tells it that framework 1 should be offered all available resources.
● The master sends a resource offer describing what is available on slave 1 to framework 1.
● The framework’s scheduler replies to the master with information about two tasks to run on the slave, using <2 CPUs, 1 GB RAM> for the first task, and <1 CPU, 2 GB RAM> for the second task.
● Finally, the master sends the tasks to the slave, which allocates appropriate resources to the framework’s executor, which in turn launches the two tasks (depicted with dotted-line borders in the figure). Because 1 CPU and 1 GB of RAM are still unallocated, the allocation module may now offer them to framework 2.
● In addition, this resource offer process repeats when tasks finish and new resources become free.
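The numbers in the walk-through above can be replayed in a few lines of Python. This is a toy model, not Mesos code: the `Resources` class and `accept_offer` function are hypothetical and only track CPUs and memory.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpus: int
    mem_gb: int

    def fits(self, other: "Resources") -> bool:
        """True if `other` can be carved out of these resources."""
        return other.cpus <= self.cpus and other.mem_gb <= self.mem_gb

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


def accept_offer(offer, tasks):
    """Launch every task that still fits; return (launched, leftover)."""
    remaining = offer
    launched = []
    for task in tasks:
        if remaining.fits(task):
            launched.append(task)
            remaining = remaining.minus(task)
    return launched, remaining


# Slave 1 offers 4 CPUs and 4 GB; framework 1 asks for two tasks.
offer = Resources(cpus=4, mem_gb=4)
tasks = [Resources(cpus=2, mem_gb=1), Resources(cpus=1, mem_gb=2)]
launched, leftover = accept_offer(offer, tasks)
print(len(launched), leftover)
```

The leftover of 1 CPU and 1 GB is what the allocation module may then offer to framework 2.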
Getting Started
● Download the source, install dependencies, and build
$ wget http://www.apache.org/dist/mesos/0.20.1/mesos-0.20.1.tar.gz
● Clone from git repository
$ git clone http://git-wip-us.apache.org/repos/asf/mesos.git
Common issues while building
● Java home not set
-- Fix: export JAVA_HOME=/usr/java/<jdk_as_mentioned_by_mesos> (JAVA_HOME must point to the JDK root directory, not to bin/java)
● Maven downloads nothing
-- Fix: set the Maven proxy in ~/.m2/settings.xml
● DELAY!!
-- Sorry, it's a compilation process. Either upgrade your hardware to a monster configuration or try precompiled packages such as Mesosphere's (the practical workaround).
Mesosphere installation
● Add the key of the Mesosphere repository
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
$ DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
$ CODENAME=$(lsb_release -cs)
● Add the Mesosphere repository to the Ubuntu sources
$ echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
  sudo tee /etc/apt/sources.list.d/mesosphere.list
● Update the package index and install
$ sudo apt-get -y update
$ sudo apt-get install mesosphere
Build is done (finally). Now what?
● Start the Mesos master
$ ./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
● Start the Mesos slave
$ ./bin/mesos-slave.sh --master=127.0.0.1:5050
● Access the web UI
http://127.0.0.1:5050
Frameworks in detail
Components:
Scheduler: receives resource offers and launches tasks
Executor: launched by the slave to execute tasks on that slave
Flow of Execution
● The slave notifies the master about its available resources
● Tasks are scheduled by the scheduler, so the scheduler knows which tasks are waiting to run
● The scheduler sends each task to a suitable slave based on that slave's available resources
● The slave checks whether the executor is already running; if not, it launches a new one, then executes the task on the executor
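The last step, reusing an executor that is already running, might be pictured like this. The `Slave` class below is a hypothetical stand-in written for this slide, not the real slave implementation.

```python
class Slave:
    """Toy slave that keeps one executor per executor id."""

    def __init__(self):
        self.executors = {}  # executor_id -> list of task ids it has run

    def run_task(self, executor_id, task_id):
        """Run a task; return True if a new executor had to be started."""
        started_new = executor_id not in self.executors
        if started_new:
            # No executor with this id is running yet, so launch one.
            self.executors[executor_id] = []
        self.executors[executor_id].append(task_id)
        return started_new


slave = Slave()
first = slave.run_task("exec-A", "task-1")   # no executor yet
second = slave.run_task("exec-A", "task-2")  # executor is reused
print(first, second, slave.executors)
```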
Status Updates
● Non-terminal updates (TASK_RUNNING)
● Terminal updates (TASK_FINISHED). Terminal updates are very important since they are the only way Mesos learns that a task is done. Only then are the resources freed
● TASK_LOST (slave terminated)
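Why terminal updates matter for freeing resources can be sketched as follows. The `ResourceLedger` class is hypothetical bookkeeping invented for this example, and the set of terminal states listed is illustrative rather than complete.

```python
# Illustrative, partial list of terminal task states.
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


class ResourceLedger:
    """Toy record of the CPUs held by running tasks."""

    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self.held = {}  # task_id -> CPUs held by that task

    def launch(self, task_id, cpus):
        if cpus > self.free_cpus:
            raise ValueError("not enough free CPUs")
        self.free_cpus -= cpus
        self.held[task_id] = cpus

    def status_update(self, task_id, state):
        # Only a terminal update releases what the task was holding.
        if state in TERMINAL_STATES and task_id in self.held:
            self.free_cpus += self.held.pop(task_id)


ledger = ResourceLedger(total_cpus=4)
ledger.launch("task-1", cpus=2)
ledger.status_update("task-1", "TASK_RUNNING")
free_while_running = ledger.free_cpus
ledger.status_update("task-1", "TASK_FINISHED")
free_after_finish = ledger.free_cpus
print(free_while_running, free_after_finish)
```

If the terminal update never arrives, the ledger keeps treating the CPUs as busy, which is the situation the slide warns about.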
Activities
● Callbacks
Callbacks are synchronous and single-threaded. Since only one call is made at a time, callbacks never run concurrently
--registered(frameworkId,masterInfo)
--resourceOffers(offers)
--statusUpdate(taskStatus)
● Actions
Asynchronous, for example sendStatusUpdate() gets queued in the driver
--launchTasks(offerId,taskInfo,filters)
--killTask(taskId)
--declineOffer(offerId,filters)
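A rough picture of how callbacks and actions fit together, using the method names listed above. Everything else is assumed for illustration: `QueueingDriver` is a fake driver that only records actions, `OneShotScheduler` is a made-up scheduler, and offers are plain strings rather than real offer objects.

```python
from collections import deque


class QueueingDriver:
    """Fake driver: actions return at once and are only queued."""

    def __init__(self):
        self.outbox = deque()

    def launchTasks(self, offer_id, tasks, filters=None):
        self.outbox.append(("launchTasks", offer_id, tasks))

    def declineOffer(self, offer_id, filters=None):
        self.outbox.append(("declineOffer", offer_id))

    def killTask(self, task_id):
        self.outbox.append(("killTask", task_id))


class OneShotScheduler:
    """Launches a single task on the first offer and declines the rest."""

    def __init__(self):
        self.launched = False
        self.statuses = []

    def registered(self, driver, framework_id, master_info):
        print("registered as", framework_id)

    def resourceOffers(self, driver, offers):
        for offer_id in offers:
            if not self.launched:
                driver.launchTasks(offer_id, ["task-1"])
                self.launched = True
            else:
                driver.declineOffer(offer_id)

    def statusUpdate(self, driver, status):
        self.statuses.append(status)


driver = QueueingDriver()
scheduler = OneShotScheduler()
# Callbacks are invoked one after another, never at the same time.
scheduler.registered(driver, "framework-1", "master@127.0.0.1:5050")
scheduler.resourceOffers(driver, ["offer-1", "offer-2"])
scheduler.statusUpdate(driver, "TASK_RUNNING")
print(list(driver.outbox))
```

The callbacks return immediately because each action is just appended to the driver's queue, which is the asynchronous behaviour the slide describes.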
What's going on in the code?
● Every request is checked against each offer in a loop
● If an offer satisfies a request, a MesosTask is created and launched by calling driver.launchTasks(offer.getId,tasks,filters)
● The task is executed and then performs an exit()
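The request and offer loop might look roughly like this. It is a sketch under assumptions: requests and offers are plain dictionaries with made-up keys, `match_requests` is a hypothetical helper, and each offer is used for at most one request. In a real framework, the point where a pair is recorded is where driver.launchTasks would be called.

```python
def match_requests(requests, offers):
    """Pair each request with the first unused offer that satisfies it."""
    launches = []
    used_offers = set()
    for request in requests:
        for offer in offers:
            if offer["id"] in used_offers:
                continue
            if offer["cpus"] >= request["cpus"] and offer["mem"] >= request["mem"]:
                # A real scheduler would call driver.launchTasks here.
                launches.append((offer["id"], request["name"]))
                used_offers.add(offer["id"])
                break
    return launches


requests = [
    {"name": "a", "cpus": 2, "mem": 1024},
    {"name": "b", "cpus": 1, "mem": 512},
]
offers = [
    {"id": "o1", "cpus": 1, "mem": 2048},
    {"id": "o2", "cpus": 4, "mem": 4096},
]
print(match_requests(requests, offers))
```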