Finagle, linkerd, and Apache Mesos: Twitter-style microservices at scale


Finagle, linkerd, and Mesos
Twitter-style microservices at scale

oliver gould, cto @ buoyant

ApacheCon North America, May 2016


oliver gould • cto @ buoyant

• open-source microservice infrastructure

• previously, tech lead @ twitter: observability, traffic

• core contributor: finagle

• creator: linkerd

• loves: dogs

@olix0r
ver@buoyant.io

overview

• 2010: A Failwhale Odyssey

• Automating the Datacenter

• Microservices: A Silver Bullet

• Finagle: The Once and Future Layer 5

• Introducing linkerd

• Demo

• Q&A

2010: A FAILWHALE ODYSSEY

Twitter, 2010

10^7 users

10^7 tweets/day

10^2 engineers

10^1 services

10^1 deploys/week

10^2 hosts

0 datacenters

10^1 user-facing outages/week

https://blog.twitter.com/2010/measuring-tweets

The Monorail, 2010

10^3 of RPS

10^2 of RPS/host

10^1 of RPS/process

hardware lb

the monorail

mysql memcache kestrel

Problems with the Monorail

Ruby performance

MySQL scaling

Memcache operability

Deploys

Events

https://blog.twitter.com/2013/new-tweets-per-second-record-and-how

Asymmetry

Photo by Troy Holden

Provisioning

automating the datacenter

mesos.apache.org

UC Berkeley, 2010

Twitter, 2011

Apache, 2012

Abstracts compute resources

Promise: don’t worry about the hosts

aurora.apache.org

Twitter, 2011

Apache, 2013

Schedules processes on Mesos

Promise: no more puppet, monit, etc.

timelines

Aurora (or Marathon, or …)

host

Mesos

host host host host host

users notifications

x800 x300 x1000

microservices: A SILVER BULLET

scaling teams

growing software

flexibility

performance correctness monitoring debugging efficiency security

resilience

not a silver bullet. (sorry.)

Resilience is an imperative: our software runs on the truly dismal computers we call datacenters. Besides being heinously complex… they are unreliable and prone to operator error.

Marius Eriksen (@marius), RPC Redux

resilience in microservices

software you didn’t write

hardware you can’t touch

network you can’t configure

break in new and surprising ways

and your customers shouldn’t notice

resilient microservices means

resilient communication

datacenter

[1] physical
[2] link
[3] network
[4] transport: aurora, marathon, … (mesos); canal, weave, …; aws, azure, digitalocean, gce, …
[5] session: rpc (http/2, mux, …)
[6] presentation: json, protobuf, thrift, …
[7] application: business logic (languages, libraries)

layer 5 dispatches requests onto layer 4 connections

finagle: THE ONCE AND FUTURE LAYER 5

github.com/twitter/finagle

RPC library (JVM)

asynchronous

built on Netty

scala

functional

strongly typed

first commit: Oct 2010

used by…

programming finagle

  val users = Thrift.newIface[UserSvc]("/s/users")
  val timelines = Thrift.newIface[TimelineSvc]("/s/timeline")

  Http.serve(":8080", Service.mk[Request, Response] { req =>
    for {
      user <- users.get(userReq(req))
      timeline <- timelines.get(user)
    } yield renderHTML(user, timeline)
  })

operating finagle

transport security

service discovery

circuit breaking

backpressure

deadlines

retries

tracing

metrics

keep-alive

multiplexing

load balancing

per-request routing

service-level objectives

[figure: the finagle client stack. modules include: Observe, Session timeout, Retries, Request draining, Load balancer, Monitor, Observe, Trace, Failure accrual, Request timeout, Pool, Fail fast, Expiration, Dispatcher]

“It’s slow” is the hardest problem you’ll ever debug.

Jeff Hodges (@jmhodges), Notes on Distributed Systems for Young Bloods

the more components you deploy, the more problems you have


lb algorithms:
• round-robin
• fewest connections
• queue depth
• exponentially-weighted moving average (ewma)
• aperture

load balancing at layer 5
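the ewma idea above can be sketched in a few lines of scala. this is an illustration only, with made-up names (Node, observe, pickBest): finagle’s real balancer combines a peak-ewma latency estimate with outstanding request counts and time-based decay.

```scala
// Toy latency-aware balancing with an exponentially-weighted moving
// average (ewma). Illustrative only: Finagle's real balancer also
// accounts for outstanding load and decays estimates over time.
final case class Node(name: String, var ewmaMs: Double)

// Fold a new latency sample into the node's smoothed estimate.
// alpha close to 1 weights recent samples more heavily.
def observe(n: Node, latencyMs: Double, alpha: Double = 0.3): Unit =
  n.ewmaMs = alpha * latencyMs + (1 - alpha) * n.ewmaMs

// Route the next request to the node with the lowest smoothed latency.
def pickBest(nodes: Seq[Node]): Node = nodes.minBy(_.ewmaMs)

val a = Node("replica-a", ewmaMs = 100.0)
val b = Node("replica-b", ewmaMs = 100.0)
observe(a, 500.0) // slow response: 0.3*500 + 0.7*100 = 220.0
observe(b, 80.0)  // fast response: 0.3*80  + 0.7*100 = 94.0
// pickBest(Seq(a, b)) now prefers replica-b
```

unlike round-robin, a slow replica’s score rises with each bad sample, so traffic drains away from it without any operator action.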

timeouts & retries

[diagram: web → timelines → users → db, with per-hop settings timeout=400ms retries=3, timeout=400ms retries=2, timeout=200ms retries=3]

deadlines

[diagram: web → timelines → users → db. web starts with timeout=400ms; after 177ms elapsed, the next hop carries deadline=223ms; after a further 213ms elapsed, only deadline=10ms remains for the last hop]
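the arithmetic in the deadline slide is just subtraction: each hop forwards its deadline minus the time already elapsed, so the whole call tree shares one budget. a minimal sketch (the `remaining` helper is hypothetical, not finagle’s Deadline API):

```scala
// Deadline propagation: a caller forwards (deadline - elapsed) downstream,
// so the time budget shrinks as the request descends the call graph.
def remaining(deadlineMs: Long, elapsedMs: Long): Long =
  math.max(0L, deadlineMs - elapsedMs)

val webBudget = 400L                       // web -> timelines: timeout=400ms
val toUsers   = remaining(webBudget, 177L) // 177ms elapsed: deadline=223ms
val toDb      = remaining(toUsers, 213L)   // 213ms more: deadline=10ms left
```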

retry budget

typical: retries=3; worst case: 300% more load!!!

better: retryBudget=20%; worst case: 20% more load
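a retry budget caps retries as a fraction of real traffic instead of a fixed per-request count. the sketch below is a deliberately simplified counter (integer percentages, no time window); finagle’s actual com.twitter.finagle.service.RetryBudget works over a sliding time window and reserves a minimum retry allowance.

```scala
// Simplified retry budget: permit retries worth at most `percent` of the
// requests seen so far. This only illustrates the invariant; it is not
// Finagle's RetryBudget.
final class Budget(percent: Int) {
  private var requests = 0 // real requests observed
  private var retries  = 0 // retries granted

  def deposit(): Unit = requests += 1

  // Grant a retry only while retries stay under `percent`% of requests.
  def tryWithdraw(): Boolean =
    if ((retries + 1) * 100 <= requests * percent) { retries += 1; true }
    else false
}

val budget = new Budget(20)               // retryBudget=20%
(1 to 100).foreach(_ => budget.deposit()) // 100 real requests
val granted =
  Iterator.continually(budget.tryWithdraw()).takeWhile(identity).size
// granted == 20: worst case is 20% extra load, not 300%
```

with retries=3 every request may triple itself under total failure; the budget bounds amplification globally no matter how many requests fail.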

tracing


layer 5 routing

layer 5 routing

applications refer to logical names

requests are bound to concrete names

delegations express routing

/s/users

/io.l5d.zk/prod/users

/s => /io.l5d.zk/prod/http
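a delegation like /s => /io.l5d.zk/prod/http is a prefix-rewrite rule: applied to the logical name /s/users it yields a concrete name under the zookeeper namer. a toy sketch of that mechanism follows (the `delegate` helper is hypothetical; finagle’s real com.twitter.finagle.Dtab also handles alternates, unions, and recursive resolution):

```scala
// Toy dtab-style delegation: rewrite a logical path by substituting a
// matching prefix. Real dtabs support far more than prefix rewriting.
def delegate(name: String, rules: Seq[(String, String)]): String =
  rules.collectFirst {
    case (prefix, target) if name == prefix || name.startsWith(prefix + "/") =>
      target + name.stripPrefix(prefix)
  }.getOrElse(name)

val dtab = Seq("/s" -> "/io.l5d.zk/prod/http")
// delegate("/s/users", dtab) == "/io.l5d.zk/prod/http/users"
// names that match no rule pass through unchanged
```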

per-request routing: staging

GET / HTTP/1.1
Host: mysite.com
Dtab-local: /s/B => /s/B2

per-request routing: debug proxy

GET / HTTP/1.1
Host: mysite.com
Dtab-local: /s/E => /s/P/s/E

so all i have to do is rewrite my app in scala?

linkerd

github.com/buoyantio/linkerd

microservice rpc proxy

layer-5 router

aka l5d

built on finagle & netty

pluggable

http, thrift, …

etcd, consul, kubernetes, marathon, zookeeper, …

magic resiliency sprinkles

transport security

service discovery

circuit breaking

backpressure

deadlines

retries

tracing

metrics

keep-alive

multiplexing

load balancing

per-request routing

service-level objectives

[diagram: Service A, Service B, and Service C instances, each paired with a linkerd proxy]

namerd

released in March

centralized routing policy

delegates logical names to service discovery

pluggable

etcd

kubernetes

zookeeper

namerd

demo: gob’s microservice

[diagram builds: web and wordgen, each with a linkerd (l5d) sidecar; then a gen-v2 instance with its own l5d is added; then namerd is added for routing policy]

[diagram builds: a DC/OS cluster with a master running marathon and zookeeper, a public node and several nodes behind ELBs; then a linkerd on every node; then namerd]

web (x1), gen (x3), word (x3), word-growthhack (x3)

github.com/buoyantio/linkerd-examples

linkerd roadmap

• Netty 4.1
• HTTP/2 + gRPC (linkerd#174)
• TLS client certs
• Richer routing policies
• Announcers
• More configurable everything

more at linkerd.io

slack: slack.linkerd.io

email: ver@buoyant.io

twitter:

• @olix0r

• @linkerd

thanks!
