(private) cloud computing with mesos at twitter benjamin hindman @benh
![Page 1: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/1.jpg)
(Private) Cloud Computing with Mesos at Twitter
Benjamin Hindman @benh
![Page 2: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/2.jpg)
what is cloud computing?
self-service
scalable
economic
elastic
virtualized
managed
utility
pay-as-you-go
![Page 3: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/3.jpg)
what is cloud computing?
• “cloud” refers to large Internet services running on 10,000s of machines (Amazon, Google, Microsoft, etc)
• “cloud computing” refers to services by these companies that let external customers rent cycles and storage
– Amazon EC2: virtual machines at 8.5¢/hour, billed hourly
– Amazon S3: storage at 15¢/GB/month
– Google AppEngine: free up to a certain quota
– Windows Azure: higher-level than EC2, applications use API
![Page 4: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/4.jpg)
what is cloud computing?
• cheap nodes, commodity networking
• self-service (use personal credit card) and pay-as-you-go
• virtualization
– from co-location, to hosting providers running the web server, the database, etc and having you just FTP your files … now you do all that yourself again!
• economic incentives
– provider: sell unused resources
– customer: no upfront capital costs building data centers, buying servers, etc
![Page 5: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/5.jpg)
“cloud computing”
• infinite scale …
![Page 6: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/6.jpg)
“cloud computing”
• always available …
![Page 7: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/7.jpg)
challenges in the cloud environment
• cheap nodes fail, especially when you have many
– mean time between failures for 1 node = 3 years
– mean time between failures for 1000 nodes = 1 day
– solution: new programming models (especially those where you can efficiently “build-in” fault-tolerance)
• commodity network = low bandwidth
– solution: push computation to the data
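The failure numbers on this slide can be checked with a quick calculation. This is a rough sketch that assumes nodes fail independently at a constant rate, so the cluster-wide mean time between failures is the per-node figure divided by the node count. The function name is illustrative and not from the talk.

```python
DAYS_PER_YEAR = 365

def cluster_mtbf_days(node_mtbf_years, nodes):
    """Mean time between failures across the whole cluster, in days.

    Assumption: independent nodes with a constant failure rate, so the
    failure rates add up and the cluster MTBF shrinks linearly with size.
    """
    return node_mtbf_years * DAYS_PER_YEAR / nodes

print(cluster_mtbf_days(3, 1))     # a single node: 3 years, expressed in days
print(cluster_mtbf_days(3, 1000))  # 1000 nodes: close to the slide's "1 day"
```

Under these assumptions a 1000-node cluster sees a failure roughly every day, which is why fault tolerance has to be part of the programming model rather than an afterthought.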
![Page 8: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/8.jpg)
moving target
infrastructure as a service (virtual machines)
software/platforms as a service
why?
• programming with failures is hard
• managing lots of machines is hard
![Page 10: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/10.jpg)
programming with failures is hard
• analogy: concurrency/parallelism
– imagine programming with threads that randomly stop executing
– can you reliably detect and differentiate failures?
• analogy: synchronization
– imagine programming where communicating between threads might fail (or worse, take a very long time)
– how might you change your code?
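One common answer to that last question is to put a deadline on every remote interaction and to decide explicitly what happens when it passes. The sketch below uses only Python's standard library. `flaky_call` is a hypothetical stand-in for a remote request, and the retry policy is an assumption made for illustration, not something the talk prescribes.

```python
import concurrent.futures
import time

def flaky_call(delay_s):
    """Hypothetical stand-in for a remote request that may be slow."""
    time.sleep(delay_s)
    return "ok"

def call_with_deadline(delays, timeout_s):
    """Make one attempt per entry in `delays`, waiting at most `timeout_s`
    seconds for each. A timeout cannot distinguish a failed peer from a
    slow one; the caller only learns that no answer arrived in time."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(delays)) as pool:
        for attempt, delay in enumerate(delays, start=1):
            future = pool.submit(flaky_call, delay)
            try:
                return f"attempt {attempt}: {future.result(timeout=timeout_s)}"
            except concurrent.futures.TimeoutError:
                continue  # stop waiting for this attempt and try again
    return "gave up"

print(call_with_deadline([0.5, 0.0], timeout_s=0.2))  # first attempt is too slow
print(call_with_deadline([0.5], timeout_s=0.2))       # every attempt is too slow
```

Note that the abandoned attempt still runs to completion in the background, which mirrors the real problem: a request that timed out may still have taken effect on the other side.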
![Page 11: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/11.jpg)
problem: distributed systems are hard
![Page 12: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/12.jpg)
solution: abstractions (higher-level frameworks)
![Page 13: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/13.jpg)
MapReduce
• Restricted data-parallel programming model for clusters (automatic fault-tolerance)
• Pioneered by Google
– Processes 20 PB of data per day
• Popularized by Apache Hadoop project
– Used by Yahoo!, Facebook, Twitter, …
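To make the "restricted programming model" concrete, here is a single-process word count written in the map/reduce style. It is an illustration only and does not use Hadoop's API; the function names are made up for this sketch. The point is that the user writes only `map_phase` and `reduce_phase`, both free of side effects, so a real framework could re-run either one on another machine after a failure.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    """Combine all values emitted for one key."""
    return word, sum(counts)

def run_mapreduce(lines):
    """Toy driver: map every line, group values by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:  # in a cluster, each line could be mapped elsewhere
        for key, value in map_phase(line):
            groups[key].append(value)  # the "shuffle": group by key
    return dict(reduce_phase(key, values) for key, values in groups.items())

print(run_mapreduce(["a b a", "B c"]))
```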
![Page 14: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/14.jpg)
beyond MapReduce
• many other frameworks follow MapReduce’s example of restricting the programming model for efficient execution on clusters
– Dryad (Microsoft): general DAG of tasks
– Pregel (Google): bulk synchronous processing
– Percolator (Google): incremental computation
– S4 (Yahoo!): streaming computation
– Piccolo (NYU): shared in-memory state
– DryadLINQ (Microsoft): language integration
– Spark (Berkeley): resilient distributed datasets
![Page 15: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/15.jpg)
everything else
• web servers (apache, nginx, etc)
• application servers (rails)
• databases and key-value stores (mysql, cassandra)
• caches (memcached)
• all our own twitter specific services …
![Page 16: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/16.jpg)
managing lots of machines is hard
• getting efficient use out of a machine is non-trivial (even if you’re using virtual machines, you still want to get as much performance as possible)
[figure: machines running nginx and Hadoop]
![Page 17: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/17.jpg)
managing lots of machines is hard
• getting efficient use out of a machine is non-trivial (even if you’re using virtual machines, you still want to get as much performance as possible)
[figure: machines running nginx and Hadoop]
![Page 18: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/18.jpg)
problem: lots of frameworks and services … how should we allocate resources (i.e., parts of a machine) to each?
![Page 19: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/19.jpg)
idea: can we treat the datacenter as one big computer and multiplex applications and services across available machine resources?
![Page 20: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/20.jpg)
solution: mesos
• common resource sharing layer
– abstracts resources for frameworks
[figure: uniprogramming (nginx and Hadoop without Mesos) vs. multiprogramming (nginx and Hadoop on top of Mesos)]
![Page 21: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/21.jpg)
twitter and the cloud
• owns private datacenters (not a consumer)
– commodity machines, commodity networks
• not selling excess capacity to third parties (not a provider)
• has lots of services (especially new ones)
• has lots of programmers
• wants to reduce CAPEX and OPEX
![Page 22: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/22.jpg)
twitter and mesos
• use mesos to get cloud-like properties from datacenter (private cloud) to enable “self-service” for engineers
(but without virtual machines)
![Page 23: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/23.jpg)
computation model: frameworks
• A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster
• A job consists of one or more tasks
• A task (e.g., map, reduce) is implemented by one or more processes running on a single machine
[figure: a cluster with a framework scheduler (e.g., JobTracker) and four executors (e.g., TaskTrackers) running tasks 1 and 5, tasks 3 and 7, task 4, and tasks 2 and 6; Job 1: tasks 1, 2, 3, 4; Job 2: tasks 5, 6, 7]
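The framework, job, and task hierarchy can be written down as a small data model. The sketch below is illustrative only: the class names and the executor labels `A` to `D` are assumptions, while the job-to-task and task-to-executor layout follows the figure on this slide.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    task_id: int
    executor: str  # executor that runs this task (labels are made up)

@dataclass
class Job:
    name: str
    tasks: list = field(default_factory=list)

@dataclass
class Framework:
    name: str
    jobs: list = field(default_factory=list)

    def tasks_by_executor(self):
        """Group task ids by the executor they run on."""
        placement = {}
        for job in self.jobs:
            for task in job.tasks:
                placement.setdefault(task.executor, []).append(task.task_id)
        return placement

# Layout taken from the figure: Job 1 has tasks 1-4, Job 2 has tasks 5-7.
job1 = Job("Job 1", [Task(1, "A"), Task(2, "D"), Task(3, "B"), Task(4, "C")])
job2 = Job("Job 2", [Task(5, "A"), Task(6, "D"), Task(7, "B")])
hadoop = Framework("Hadoop", [job1, job2])
print(hadoop.tasks_by_executor())
```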
![Page 24: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/24.jpg)
two-level scheduling
• Advantages:
– Simple: easier to scale and make resilient
– Easy to port existing frameworks, support new ones
• Disadvantages:
– Distributed scheduling decision not optimal
[figure: the Mesos master (inputs: organization policies, resource availability) produces the framework schedule; the framework schedulers produce the task schedule]
![Page 25: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/25.jpg)
resource offers
• Unit of allocation: resource offer
– Vector of available resources on a node
– E.g., node1: <1CPU, 1GB>, node2: <4CPU, 16GB>
• Master sends resource offers to frameworks
• Frameworks select which offers to accept and which tasks to run
Push task scheduling to frameworks
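From the framework's side, "select which offers to accept and which tasks to run" might look like the sketch below. It is not the Mesos API: `Resources`, `select_offers`, and the first-fit policy are assumptions chosen for illustration, and the offer values reuse the node1/node2 example from the slide.

```python
from typing import NamedTuple

class Resources(NamedTuple):
    cpus: int
    mem_gb: int

def select_offers(offers, pending_tasks):
    """Place each pending task on the first offered node where it fits.

    offers: dict of node name -> Resources offered on that node
    pending_tasks: list of (task name, Resources needed)
    Returns a dict of task name -> node; tasks that fit nowhere are
    left out and simply wait for a later offer.
    """
    remaining = dict(offers)
    launched = {}
    for name, need in pending_tasks:
        for node, free in remaining.items():
            if free.cpus >= need.cpus and free.mem_gb >= need.mem_gb:
                launched[name] = node
                remaining[node] = Resources(free.cpus - need.cpus,
                                            free.mem_gb - need.mem_gb)
                break
    return launched

offers = {"node1": Resources(1, 1), "node2": Resources(4, 16)}
tasks = [("big", Resources(2, 4)), ("small", Resources(1, 1))]
print(select_offers(offers, tasks))
```

Because the placement policy lives in the framework, a framework that cares about data locality could replace first fit with its own rule without any change to the master.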
![Page 26: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/26.jpg)
Mesos Architecture: Example
[figure: a Mesos master with an allocation module, slaves S1 <8CPU,8GB>, S2 <8CPU,16GB>, and S3 <16CPU,16GB>, and two frameworks (Hadoop JobTracker, MPI JobTracker) with their executors]
• Slaves continuously send status updates about resources: S1:<8CPU,8GB>, S2:<8CPU,16GB>, S3:<16CPU,16GB>
• Pluggable scheduler to pick framework to send an offer to: (S1:<8CPU,8GB>, S2:<8CPU,16GB>)
• Framework scheduler selects resources and provides tasks: (task1: [S1:<2CPU,4GB>]; task2: [S2:<4CPU,4GB>])
• Framework executors launch tasks and may persist across tasks: task 1:<2CPU,4GB>, task 2:<4CPU,4GB>
• Remaining resources: S1 <6CPU,4GB>, S2 <4CPU,12GB>
• Next offer: (S1:<6CPU,4GB>, S3:<16CPU,16GB>); reply: ([task1: S1:<4CPU,2GB>]); S1 is left with <2CPU,2GB>
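The numbers in this example can be replayed with a toy master that only does resource bookkeeping. This is a sketch, not Mesos code: the `Master` class and its methods are hypothetical, and which slaves go into each offer is passed in by hand, whereas in the architecture above that choice belongs to the pluggable allocation module.

```python
class Master:
    """Toy bookkeeping of unallocated resources per slave."""

    def __init__(self, slaves):
        self.free = dict(slaves)  # slave name -> (cpus, mem_gb) still free

    def offer(self, slave_names):
        """Build an offer from the current free resources of some slaves."""
        return {name: self.free[name] for name in slave_names}

    def launch(self, tasks):
        """Apply a framework's reply: a list of (task, slave, cpus, mem_gb)."""
        for task, slave, cpus, mem_gb in tasks:
            free_cpus, free_mem = self.free[slave]
            if cpus > free_cpus or mem_gb > free_mem:
                raise ValueError(f"{task} does not fit on {slave}")
            self.free[slave] = (free_cpus - cpus, free_mem - mem_gb)

master = Master({"S1": (8, 8), "S2": (8, 16), "S3": (16, 16)})

first_offer = master.offer(["S1", "S2"])
master.launch([("task1", "S1", 2, 4), ("task2", "S2", 4, 4)])

second_offer = master.offer(["S1", "S3"])
master.launch([("task1", "S1", 4, 2)])

print(first_offer, second_offer, master.free)
```

The second offer already reflects what the first framework took from S1, which is the sense in which the master multiplexes the same machines across frameworks.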
![Page 27: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/27.jpg)
twitter applications/services
if you build it … they will come
let’s build a url shortener (t.co)!
![Page 28: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/28.jpg)
development lifecycle
1. gather requirements
2. write a bullet-proof service (server)
‣ load test
‣ capacity plan
‣ allocate & configure machines
‣ package artifacts
‣ write deploy scripts
‣ set up monitoring
‣ other boring stuff (e.g., sarbanes-oxley)
3. resume reading timeline (waiting for machines to get allocated)
![Page 29: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/29.jpg)
development lifecycle with mesos
![Page 30: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/30.jpg)
t.co
• launch on mesos!
CRUD via command line:
$ scheduler create t_co t_co.mesos
Creating job t_co
OK (4 tasks pending for job t_co)
![Page 31: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/31.jpg)
t.co
• launch on mesos!
CRUD via command line:
$ scheduler create t_co t_co.mesos
Creating job t_co
OK (4 tasks pending for job t_co)
tasks represent shards
![Page 32: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/32.jpg)
t.co
[figure: a cluster with a scheduler and four executors running tasks 1 to 7, started by `$ scheduler create t_co t_co.mesos`]
![Page 33: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/33.jpg)
t.co
• is it running? (“top” via a browser)
![Page 34: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/34.jpg)
what it means for devs?
• write your service to be run anywhere in the cluster
• anticipate ‘kill -9’
• treat local disk like /tmp
![Page 35: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/35.jpg)
bad practices avoided
• machines fail; force programmers to focus on shared-nothing (stateless) service shards and clusters, not machines
– hard-coded machine names (IPs) considered harmful
– manually installed packages/files considered harmful
– using the local filesystem for persistent data considered harmful
![Page 36: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/36.jpg)
level of indirection #ftw
[figure: nginx and t.co running on top of Mesos]
@DEVOPS_BORAT: Need replace server!
![Page 38: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/38.jpg)
level of indirection #ftw
example from operating systems?
![Page 39: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/39.jpg)
isolation
[figure: an executor running task 1 and task 5]
what happens when task 5 executes: while (true) {}
![Page 40: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/40.jpg)
isolation
• leverage linux kernel containers
[figure: task 1 (t.co) in container 1 and task 2 (nginx) in container 2, each with its own CPU and RAM]
![Page 41: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/41.jpg)
software dependencies
1. package everything into a single artifact
2. download it when you run your task
(might be a bit expensive for some services, working on next generation solution)
![Page 42: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/42.jpg)
t.co + malware
what if a user clicks a link that takes them some place bad?
let’s check for malware!
![Page 43: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/43.jpg)
t.co + malware
• a malware service already exists … but how do we use it?
[figure: a cluster with a scheduler and four executors running tasks]
![Page 45: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/45.jpg)
t.co + malware
• a malware service already exists … but how do we use it?
[figure: a cluster with a scheduler and four executors running tasks]
how do we name the malware service?
![Page 46: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/46.jpg)
naming part 1
‣ service discovery via ZooKeeper
‣ zookeeper.apache.org
‣ servers register, clients discover
‣ we have a Java library for this
‣ twitter.github.com/commons
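The "servers register, clients discover" pattern can be sketched without a running ZooKeeper ensemble. The `Registry` class below is an in-memory stand-in written for this illustration; it is neither the ZooKeeper client API nor the Java library mentioned above, and the service name and endpoints are made up.

```python
class Registry:
    """In-memory stand-in for a coordination service such as ZooKeeper."""

    def __init__(self):
        self._endpoints = {}  # service name -> set of "host:port" strings

    def register(self, service, endpoint):
        """Called by a server instance when it starts."""
        self._endpoints.setdefault(service, set()).add(endpoint)

    def unregister(self, service, endpoint):
        """Called here by hand; with ZooKeeper ephemeral nodes the entry
        disappears on its own when the server's session ends."""
        self._endpoints.get(service, set()).discard(endpoint)

    def discover(self, service):
        """Called by clients instead of hard-coding machine names."""
        return sorted(self._endpoints.get(service, ()))

registry = Registry()
registry.register("malware", "host-a:9001")  # hypothetical endpoints
registry.register("malware", "host-b:9001")
before = registry.discover("malware")
registry.unregister("malware", "host-a:9001")  # e.g. the task was moved
after = registry.discover("malware")
print(before, after)
```

Clients ask for the service by name each time, so a task that is rescheduled onto another machine becomes reachable again as soon as it registers.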
![Page 47: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/47.jpg)
naming part 2
‣ naïve clients via proxy
![Page 48: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/48.jpg)
naming
• PIDs
• /var/local/myapp/pid
![Page 49: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/49.jpg)
t.co + malware
• okay, now for a redeploy! (CRUD)
$ scheduler update t_co t_co.config
Updating job t_co
Restarting shards ...
Getting status ...
Failed Shards = []
...
![Page 50: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/50.jpg)
rolling updates …
[flowchart: Updater {Job, Update Config} → Restart Shards → Healthy? (No → Rollback) → Complete? (No → Restart Shards; Yes → Finish {Success, Failure})]
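One reading of the flowchart is the loop below: restart a batch of shards, check health, roll back everything touched so far on the first unhealthy batch, and finish once every shard has been restarted. The function signature, the callbacks, and the batch size are assumptions made for this sketch; the talk does not show the updater's actual interface.

```python
def rolling_update(shards, restart, is_healthy, rollback, batch_size=1):
    """Restart shards batch by batch; roll back on the first unhealthy batch."""
    updated = []
    for start in range(0, len(shards), batch_size):
        batch = shards[start:start + batch_size]
        for shard in batch:
            restart(shard)
        updated.extend(batch)
        if not all(is_healthy(shard) for shard in batch):
            for shard in reversed(updated):  # undo in reverse order
                rollback(shard)
            return "Failure"
    return "Success"

# Demo with four shards where shard 2 never becomes healthy.
restarted, rolled_back = [], []
result = rolling_update(
    shards=[0, 1, 2, 3],
    restart=restarted.append,
    is_healthy=lambda shard: shard != 2,
    rollback=rolled_back.append,
    batch_size=2,
)
print(result, restarted, rolled_back)
```

Updating in batches keeps most shards serving traffic at any moment, at the cost of briefly running two versions side by side.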
![Page 51: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/51.jpg)
datacenter operating system
Mesos
+ Twitter specific scheduler
+ service proxy (naming)
+ updater
+ dependency manager
= datacenter operating system (private cloud)
![Page 52: (Private) Cloud Computing with Mesos at Twitter Benjamin Hindman @benh](https://reader035.vdocument.in/reader035/viewer/2022062518/56649eac5503460f94bb37a0/html5/thumbnails/52.jpg)
Thanks!