nats and netlify

NATS and Netlify Building out a data plane for a globally distributed system @ry_boflavin @netlify

Upload: ryan-neal

Post on 16-Apr-2017


Category:

Engineering



TRANSCRIPT

Page 1: Nats and netlify

NATS and Netlify Building out a data plane for a globally distributed system

@ry_boflavin @netlify

Page 2: Nats and netlify

About Me: Ryan Neal

- Head of Infrastructure at Netlify

  - Simultaneously fixing and breaking everything

- Senior Dev at Yelp

  - Internal tools and metrics team

  - Used to about 400k metrics/sec

  - 12-18k pageviews/sec

- FDE at Palantir

  - Developed counter-terrorism software

  - 4 billion records / day


Page 3: Nats and netlify


A developer’s toolkit for deploying git-backed, browser-driven sites to an intelligent CDN

- Global CDN

- CI cluster

- Redundant DNS

- Prerender cluster

- Mongo cluster

- Rails cluster

- 4 cloud providers

- 14 PoPs

Page 4: Nats and netlify

Distributed systems are cool

[Diagram: API cluster, Global CDN, Pre-Render cluster, and CI cluster, made up of buildbot, API, and CDN nodes, plus DB land]

Page 5: Nats and netlify

The problem

[Diagram: your request, CDN Node, db]

Page 6: Nats and netlify

The problem

[Diagram: your request, CDN Node, db]

Page 7: Nats and netlify

The problem

[Diagram: your request, CDN Node, proxy, db]

Page 8: Nats and netlify

The problem

[Diagram: your request, CDN Node, proxy, origin, db]

Page 9: Nats and netlify

The problem

[Diagram: your request, CDN Node, proxy, origin, api, db]

Page 10: Nats and netlify

The problem

[Diagram: your request, CDN Node, db, with three components marked X]

Page 11: Nats and netlify

Unity

- Cohesive view of system

- Traceability between services

- Build for now, not later

Page 12: Nats and netlify

The Naive Solution

[Diagram: random service (x2), logs, Papertrail daemon, papertrail]

Page 13: Nats and netlify

Immediate Problem

- Make the logs searchable

- Easy to add more logs

Long Term Vision

- A generic system to let services push data out

- An easy way to access that data for new and fun uses

Tool Requirements

- Easy installation

- Good scaling factors

- Secure

Spec before building


Page 14: Nats and netlify

And so the story begins...


RabbitMQ

- Existing infrastructure

- Didn’t need enterprise messaging features

- Data was only metrics, telemetry and logs

Kafka

- Didn’t want to run ZooKeeper

- Didn’t need rewind or buffering

Page 15: Nats and netlify

Creating the Data plane

[Diagram: random service, logs, nats]

Page 16: Nats and netlify

Creating the Data plane

[Diagram: random service, logs, nats, streamer]

Page 17: Nats and netlify

Creating the Data plane

[Diagram: random service (x2), logs, nats, streamer]

Page 18: Nats and netlify

Creating the Data plane

[Diagram: random service (x2), logs, nats, streamer, elastinats, es]

Page 19: Nats and netlify

Creating the Data plane

[Diagram: random service (x2), logs, nats, streamer, elastinats (x4), es]

Page 20: Nats and netlify

Creating the Data plane

[Diagram: random service (x2), logs, nats, streamer, tap (x2), elastinats (x4), es]

Page 21: Nats and netlify

Elastinats lessons


    // Non-blocking: the work runs in a goroutine, so the NATS callback returns immediately.
    func(m *nats.Msg) {
        stats.IncrementMessagesConsumed()
        go func() {
            payload := message.NewPayload(string(m.Data), m.Subject)

            // maybe it is json!
            _ = json.Unmarshal(m.Data, payload)
            c <- payload
        }()
    }

    // Blocking: the callback waits on the send to c until a receiver (or buffer space) is available.
    func(m *nats.Msg) {
        stats.IncrementMessagesConsumed()
        payload := message.NewPayload(string(m.Data), m.Subject)

        // maybe it is json!
        _ = json.Unmarshal(m.Data, payload)
        c <- payload
    }

- Don’t block the consumer
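A goroutine per message is one way to keep the callback from blocking. Another, sketched below, is a non-blocking send that drops the message when the buffer is full. This is not code from the talk: the `handoff` helper and the tiny buffer size are hypothetical, and dropping is only reasonable if losing some logs or telemetry is acceptable.

```go
package main

import "fmt"

// handoff tries to pass payload to the worker channel without blocking.
// It reports false when the buffer is full and the payload was dropped.
// (Hypothetical helper, for illustration only.)
func handoff(c chan<- string, payload string) bool {
	select {
	case c <- payload:
		return true
	default:
		return false
	}
}

func main() {
	c := make(chan string, 2) // deliberately small buffer for the demo
	dropped := 0
	for _, p := range []string{"a", "b", "c"} {
		if !handoff(c, p) {
			dropped++
		}
	}
	fmt.Println("queued:", len(c), "dropped:", dropped)
}
```

Counting the drops, as `main` does here, gives a signal that the buffer is too small or the downstream worker too slow.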

Page 22: Nats and netlify

Elastinats lessons


- Don’t block the consumer

- Use ES’s Bulk API
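For readers unfamiliar with the Bulk API: it accepts many documents in one request as newline-delimited JSON, with an action line before each document. The sketch below only builds such a body using the standard library. The `bulkBody` function and the `logs` index name are hypothetical, this is not how elastinats is necessarily implemented, and depending on the Elasticsearch version the action line may need additional fields.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// bulkBody encodes docs as a newline-delimited bulk request body:
// one "index" action line followed by one source line per document.
// (Hypothetical helper, for illustration only.)
func bulkBody(index string, docs []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends the newline the format requires
	for _, doc := range docs {
		action := map[string]interface{}{
			"index": map[string]interface{}{"_index": index},
		}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	docs := []map[string]interface{}{
		{"msg": "first log line"},
		{"msg": "second log line"},
	}
	body, err := bulkBody("logs", docs) // "logs" is an assumed index name
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

The resulting bytes would be sent in a single POST to the cluster's `_bulk` endpoint, replacing one HTTP round trip per message with one per batch.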

Page 23: Nats and netlify

Elastinats lessons


- Don’t block the consumer

- Use ES’s Bulk API

- Add error reporting

    handle := func(nc *nats.Conn, sub *nats.Subscription, err error) {
        log.Warn(err)
    }

    nc, err := nats.Connect(serverString, nats.Secure(tlsConfig), nats.ErrorHandler(handle))
    if err != nil {
        panic(err)
    }

Page 24: Nats and netlify

Elastinats lessons


- Don’t block the consumer

- Use ES’s Bulk API

- Add error reporting

- Use buffering

    // Buffer incoming messages in a large channel.
    ch := make(chan *nats.Msg, 100000)
    sub, err := nc.ChanSubscribe(subject, ch)
    if err != nil {
        log.Fatal("Failed to subscribe")
    }
    defer sub.Unsubscribe()

    // Or raise the pending limits on a synchronous subscription.
    sub, err := nc.SubscribeSync(subject)
    if err != nil {
        log.Fatal("Failed to subscribe")
    }
    defer sub.Unsubscribe()
    err = sub.SetPendingLimits(numMsgs, numBytes)
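The buffering and Bulk API lessons fit together naturally: a consumer can drain the buffered channel into batches and flush each batch in one request. The sketch below shows one possible shape for that loop using only the standard library. The `batch` function, its parameters, and the string message type are hypothetical and are not taken from elastinats.

```go
package main

import (
	"fmt"
	"time"
)

// batch drains in and calls flush when size messages have accumulated or
// when interval elapses with a partial batch. It returns once in is closed.
// (Hypothetical helper, for illustration only.)
func batch(in <-chan string, size int, interval time.Duration, flush func([]string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	buf := make([]string, 0, size)
	emit := func() {
		if len(buf) == 0 {
			return
		}
		flush(buf)
		buf = make([]string, 0, size) // fresh slice so flush may keep the old one
	}
	for {
		select {
		case msg, ok := <-in:
			if !ok {
				emit() // flush whatever is left before exiting
				return
			}
			buf = append(buf, msg)
			if len(buf) >= size {
				emit()
			}
		case <-ticker.C:
			emit() // time-based flush keeps quiet periods from delaying data
		}
	}
}

func main() {
	in := make(chan string, 100) // buffered, standing in for the subscription channel
	for i := 0; i < 5; i++ {
		in <- fmt.Sprintf("msg-%d", i)
	}
	close(in)
	batch(in, 2, time.Second, func(b []string) {
		fmt.Println("flushing", len(b), "messages")
	})
}
```

The size limit bounds the request payload, while the interval bounds how stale a partially filled batch can get.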

Page 25: Nats and netlify

Future Work


Page 26: Nats and netlify

Future Work


- Use a nats_metrics library to measure and push to nats

- Add more taps for log analysis

- Migrate legacy services to push based metrics and logs

Page 27: Nats and netlify


Questions?

Page 28: Nats and netlify

Links

https://github.com/netlify/elastinats

https://github.com/netlify/streamer

https://github.com/rybit/nats_metrics


https://github.com/rybit

[email protected]

Check out the slides on slideshare!!