Redundant Devops about reinventing the wheel

Upload: szabolcs-szabolcsi-toth

Post on 23-Jan-2018

Category:

Engineering


TRANSCRIPT

Page 1: Redundant devops

Redundant Devops

about reinventing the wheel

Page 2: Redundant devops

Szabolcs Szabolcsi-Toth @_Nec

Page 3: Redundant devops

Senior Engineer

Page 4: Redundant devops

JSConf Budapest 2017

Curator, Organizer

Page 5: Redundant devops

What’s this about?

Page 6: Redundant devops

• Metrics

• Error Logs

• Logging

• Secret Store

• Service Discovery

• Process Supervision

• Running Programs

• Connecting Services

Page 7: Redundant devops

Metrics

Page 8: Redundant devops

Metrics are

• time bound, historical

• numeric data

• software, network or hardware property

Page 9: Redundant devops

Metrics are great!

• see trends

• mark releases

• notice anomalies: spikes & gaps

• create alerts


Page 10: Redundant devops

Metric delivery

• collect (scrape) or push data?

• collect periodically

• put metric data where it can be collected

Page 11: Redundant devops

Tools for metrics

• prometheus

• graphite

Page 12: Redundant devops

Node best practices

• put your metrics on an accessible endpoint, e.g. /metrics or /status

• there are Node libraries that automate this by instrumenting http

• let the metrics tool do the scraping and delivery

• watch those nice graphs ☺ (check out Grafana)
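To make the first and third bullets concrete, here is a minimal sketch that uses only Node's built-in http module. The counter names, the `renderMetrics` and `createServer` helpers and the port are assumptions made for illustration; the output follows the Prometheus text format, and a real app would more likely use an instrumentation library.

```javascript
const http = require('http');

// In-process counters; the names are illustrative, not a standard.
const counters = { http_requests_total: 0 };

// Render counters in the Prometheus text exposition format:
// a "# TYPE" line followed by a "name value" line for each counter.
function renderMetrics(values) {
  return Object.entries(values)
    .map(([name, value]) => `# TYPE ${name} counter\n${name} ${value}\n`)
    .join('');
}

// The app only exposes the numbers; the metrics tool comes and scrapes them.
function createServer() {
  return http.createServer((req, res) => {
    counters.http_requests_total += 1;
    if (req.url === '/metrics') {
      res.writeHead(200, { 'Content-Type': 'text/plain; version=0.0.4' });
      res.end(renderMetrics(counters));
      return;
    }
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('hello\n');
  });
}

// Usage (port 3000 is an arbitrary choice): createServer().listen(3000);
```

The app never opens a connection to the metrics backend; it only answers when scraped.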

Page 13: Redundant devops

Key metrics

• latency: check for slow queries, create performance tests for them, iterate on the code and re-test; do not average, use a histogram

• resource usage: slow memory leaks, a disk that is getting full; predict resource shortages via trends
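A small sketch of why "do not average, use a histogram" matters. The bucket bounds and the sample latencies are made-up values, and `createHistogram`, `observe` and `quantileBound` are hypothetical helper names.

```javascript
// Upper bounds of the latency buckets in milliseconds (assumed values).
const BOUNDS_MS = [10, 50, 100, 500, 1000];

function createHistogram(bounds = BOUNDS_MS) {
  // One counter per bound, plus a final overflow bucket.
  return { bounds, counts: new Array(bounds.length + 1).fill(0), total: 0 };
}

// Count the observation in the first bucket whose bound is >= the value.
function observe(histogram, valueMs) {
  let index = histogram.bounds.findIndex((bound) => valueMs <= bound);
  if (index === -1) index = histogram.bounds.length; // overflow bucket
  histogram.counts[index] += 1;
  histogram.total += 1;
}

// Smallest bucket bound that covers at least a fraction q (0..1) of observations.
function quantileBound(histogram, q) {
  if (histogram.total === 0) return NaN;
  const target = Math.ceil(histogram.total * q);
  let seen = 0;
  for (let i = 0; i < histogram.counts.length; i += 1) {
    seen += histogram.counts[i];
    if (seen >= target) {
      return i < histogram.bounds.length ? histogram.bounds[i] : Infinity;
    }
  }
  return Infinity;
}

const latency = createHistogram();
[8, 9, 12, 7, 950].forEach((ms) => observe(latency, ms));

console.log(latency.counts);               // → [ 3, 1, 0, 0, 1, 0 ]
console.log(quantileBound(latency, 0.5));  // → 10
console.log(quantileBound(latency, 0.99)); // → 1000
```

The average of these five samples is 197.2 ms, which describes none of the requests; the buckets show that most requests are fast and one is very slow.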

Page 14: Redundant devops

Sending metrics is not the job of your app

Page 15: Redundant devops

Error logs

Page 16: Redundant devops

Catch errors as fast as possible!

• instant alert of production errors

• use while feature testing

• keep an eye on it during releases

• aggregate errors in a single service, see all

• catch before the user

Page 17: Redundant devops

Ideal error reports have

• environment of the error: build / release / branch / server

• stack trace: exact code location

• custom data: anything that helps identify the problem

Page 18: Redundant devops

Error log delivery

• can happen any time, hopefully rarely

• push data

• expect the unexpected, handle the unhandled

• never log secrets

• sampling, throttling, timeout: do not let error logging itself kill your app
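The last bullet can be sketched as a wrapper around whatever function pushes the error report. This is an illustration under assumptions, not the API of any of the services named in this talk: `send` stands in for a real client call, `now` for a clock, and only throttling is shown (sampling and timeouts would be layered on in a similar way).

```javascript
// Wrap a `send` function so that at most `limit` reports go out per window.
// `send` and `now` are injected, hypothetical stand-ins for a real
// error-reporting client and a clock.
function createThrottledReporter(send, { limit = 5, windowMs = 60000, now = Date.now } = {}) {
  let windowStart = now();
  let sent = 0;

  return function report(error, context = {}) {
    const t = now();
    if (t - windowStart >= windowMs) {
      // A new window starts: reset the budget.
      windowStart = t;
      sent = 0;
    }
    if (sent >= limit) return false; // over budget: drop the report
    sent += 1;
    try {
      send({ message: error.message, stack: error.stack, context });
    } catch (sendError) {
      // Reporting must never take the app down, so the failure is swallowed.
    }
    return true;
  };
}
```

During an error storm the app then spends a bounded amount of work on reporting, and a broken reporter cannot crash it.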

Page 19: Redundant devops

Tools & services for error reporting

• airbrake

• errbit (airbrake api, open source)

• sentry

• raygun

• rollbar

• …

Page 20: Redundant devops

Integrate, get notified!

• pagerduty

• slack / hipchat: chatops, resolve and react within your chat

Page 21: Redundant devops

Logging

Page 22: Redundant devops

Logging vs Error logging

• logging is anticipated

• error logs are occasional

Page 23: Redundant devops

Log levels

Page 24: Redundant devops
Page 25: Redundant devops

Log levels, recap

• fatal - needs instant intervention, see error logs

• error - inform the user, see error logs

• warn - escalate if happens again

• info - just a step in a regular flow

• debug - full of lines, and traces

Page 26: Redundant devops

Benefits of logging, custom logs

• debug

• custom events

• tracking the usage and behaviour of app

• profile, AB test, product development

Page 27: Redundant devops

Logging in node

• console.log

• bragi

• debug

• npmlog

• winston

Page 28: Redundant devops

Logging in node - general

• has timestamps

• has loglevels

• can be routed to stdout/stderr

• can be formatted

• create or use Correlation ID
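The properties listed above fit in a few lines. The sketch below is not the API of any library named in this talk; the `createLogger` helper and the JSON field names (`time`, `level`, `cid`, `message`) are assumptions. It relies on `crypto.randomUUID`, which needs a reasonably recent Node.js version.

```javascript
const crypto = require('crypto');

// Lower number = more severe; same levels as in the recap above.
const LEVELS = { fatal: 0, error: 1, warn: 2, info: 3, debug: 4 };

// Build a logger that writes one JSON line per entry to the given stream.
function createLogger({ level = 'info', correlationId, stream = process.stdout } = {}) {
  // Reuse an incoming correlation ID, or create one at the edge of the system.
  const cid = correlationId || crypto.randomUUID();

  const log = (entryLevel, message, data = {}) => {
    if (LEVELS[entryLevel] > LEVELS[level]) return; // more verbose than configured
    stream.write(`${JSON.stringify({
      time: new Date().toISOString(),
      level: entryLevel,
      cid,
      message,
      ...data,
    })}\n`);
  };

  const logger = { cid };
  Object.keys(LEVELS).forEach((name) => {
    logger[name] = (message, data) => log(name, message, data);
  });
  return logger;
}

// Usage: createLogger({ correlationId: 'abc-123' }).info('order created', { orderId: 42 });
```

Passing the same `correlationId` to every service that handles a request is what lets the log lines be joined up later.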

Page 29: Redundant devops

Correlation ID quick guide

[Diagram: the same correlation ID (cID) accompanies a request through the services and appears in each of their logs]

Page 30: Redundant devops

Best practices

• just put it to stdout (Docker & Kubernetes clearly encourage this)

• let the log collector handle it

• pipe stdout to a file, or whatever you like

• be able to switch to debug mode at runtime: use signals

• never log secrets
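Switching to debug mode at runtime with a signal might look like the sketch below. The choice of SIGUSR2 and the `state` and `installDebugToggle` names are assumptions; Node.js reserves SIGUSR1 for starting its debugger, so it is avoided here.

```javascript
// Shared, mutable log level that a logger would consult on each call.
const state = { level: 'info' };

// Flip between 'info' and 'debug' whenever the signal arrives.
// The emitter is injectable so the handler can be exercised without real signals.
function installDebugToggle(emitter = process, signal = 'SIGUSR2') {
  emitter.on(signal, () => {
    state.level = state.level === 'debug' ? 'info' : 'debug';
    process.stdout.write(`log level is now ${state.level}\n`);
  });
}

// Usage: call installDebugToggle() at startup,
// then from a shell on the same host: kill -USR2 <pid>
```

No restart or redeploy is needed to get debug output from a misbehaving instance, and sending the signal again turns it back off.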

Page 31: Redundant devops

Log collectors

• fluentd

• logstash

• syslog-ng

• rsyslog

Page 32: Redundant devops

A good log collector should

• read from stdout / file tail

• use your correlation ID

• remove the burden of transferring your logs

Page 33: Redundant devops

Remote logging

• Stackdriver (fluentd based)

• Elasticsearch (fluentd based)

Page 34: Redundant devops

Sending logs is not the job of your app

Page 35: Redundant devops

Secret Store

Page 36: Redundant devops

Secrets

• passwords / usernames

• db names

• API keys

• private keys

Page 37: Redundant devops

NOT Secret Storage

× source code

× private VCS repositories

× config files

× simple database fields

× ENV variables

Page 38: Redundant devops

Benefits

• ACL, policies: access a set of secrets with revocable tokens

• centralized key rotation: edit and update all secrets in one place

• single-use access, n-use access

• time-bound keys

• audit log

• runtime access: no secrets stored on disk

• build-time access
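To show how revocable, n-use and time-bound tokens behave together with an audit log, here is a toy in-memory model. It is only an illustration: `ToySecretStore` and its methods are hypothetical, nothing is encrypted, and it is not how any of the products named later in this talk are implemented.

```javascript
// Toy in-memory model of token-based access; NOT a real secret store.
class ToySecretStore {
  constructor(now = Date.now) {
    this.now = now;
    this.secrets = new Map(); // name -> value
    this.tokens = new Map();  // token -> { names, usesLeft, expiresAt }
    this.audit = [];          // every read attempt and its outcome
  }

  put(name, value) {
    this.secrets.set(name, value);
  }

  // Grant a token for a set of secret names, limited by use count and lifetime.
  grant(token, names, { uses = Infinity, ttlMs = Infinity } = {}) {
    this.tokens.set(token, {
      names: new Set(names),
      usesLeft: uses,
      expiresAt: this.now() + ttlMs,
    });
  }

  revoke(token) {
    this.tokens.delete(token);
  }

  read(token, name) {
    const grant = this.tokens.get(token);
    const ok = Boolean(grant) && grant.names.has(name)
      && grant.usesLeft > 0 && this.now() < grant.expiresAt;
    // The audit entry records the secret's name, never its value or the token.
    this.audit.push({ name, ok, at: this.now() });
    if (!ok) throw new Error('access denied');
    grant.usesLeft -= 1;
    return this.secrets.get(name);
  }
}
```

A single-use token is `grant('t1', ['db/password'], { uses: 1 })`; a second `read` with it throws and still leaves a line in the audit log.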

Page 39: Redundant devops

How it works

Page 40: Redundant devops

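[Diagram: build server, app server, Version Control and Secret Store]

Build time: the build server sends a token and a secret/name to the Secret Store and gets the actual secrets back; the secrets are built into the deployed code.

Run time: the app server sends a token and a secret/name to the Secret Store and gets the actual secrets back; the secrets are requested on app startup and stored only in memory.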
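The run-time path could be sketched as follows. `fetchSecret(token, name)` is a hypothetical client function standing in for a real call to the secret store; `loadSecrets` and the secret names in the usage line are likewise made up.

```javascript
// Fetch the named secrets once at startup and keep them only in memory.
// `fetchSecret(token, name)` is a hypothetical client call that resolves to
// the secret's value; a real one would talk to the secret store over the network.
async function loadSecrets(fetchSecret, token, names) {
  const entries = await Promise.all(
    names.map(async (name) => [name, await fetchSecret(token, name)]),
  );
  const secrets = new Map(entries);
  // Expose lookups only; this code never writes the values to disk or logs.
  return {
    get: (name) => secrets.get(name),
    has: (name) => secrets.has(name),
  };
}

// Usage, early in app startup:
// const secrets = await loadSecrets(fetchSecret, process.env.STORE_TOKEN, ['db/password']);
```

Only the token has to reach the app server; the secrets themselves exist there for the lifetime of the process and no longer.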

Page 41: Redundant devops

Secret store server

• powerful encryption

• has to be unlocked on start

• secrets are totally inaccessible without unlocking

Page 42: Redundant devops

Secret store services

• HashiCorp Vault

• Amazon KMS

• Docker Swarm

• Keywhiz

Page 43: Redundant devops

Never store your secrets in your source code

Page 44: Redundant devops

Service Discovery

Page 45: Redundant devops

Service discovery can help

• Service Registration: register a service and notify other services about it

• Service Discovery: searching for services?

• Monitoring: is a service active and responding?

• Load Balancing: direct traffic to the new service

Page 46: Redundant devops

How it works

• can act like a DNS: simple use case, internal network

• can write / create configs: more complex, more control

Page 47: Redundant devops

How it works

[Diagram: the SD agent checks the APP's port and PID and reports to the service registry; metric scraping starts and the load balancer (LB) directs traffic to the app]

Page 48: Redundant devops

Service discovery agent

• separate task, job, process

• can be configured what to check

• independent of your app
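The division of labour between agent and registry can be modelled in a few lines. This is a toy sketch, not the protocol of any tool named in this talk: `ToyRegistry`, the 15-second TTL and the addresses in the usage line are assumptions.

```javascript
// Toy service registry: agents report check results, clients ask who is healthy.
class ToyRegistry {
  constructor({ ttlMs = 15000, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.instances = new Map(); // "service@address" -> { service, address, lastPass }
  }

  // Called by the agent (not by the app) after each check it runs.
  report(service, address, passed) {
    const key = `${service}@${address}`;
    if (passed) {
      this.instances.set(key, { service, address, lastPass: this.now() });
    } else {
      this.instances.delete(key);
    }
  }

  // Addresses whose last passing check is recent enough, e.g. for a load balancer.
  healthy(service) {
    const cutoff = this.now() - this.ttlMs;
    return [...this.instances.values()]
      .filter((i) => i.service === service && i.lastPass >= cutoff)
      .map((i) => i.address);
  }
}

// Usage: registry.report('api', '10.0.0.5:3000', true); registry.healthy('api');
```

Because entries expire, an instance whose agent stops reporting drops out of `healthy()` without the app doing anything.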

Page 49: Redundant devops

Service discovery services

• Apache Zookeeper

• Netflix Eureka

• HashiCorp Consul

• Doozer

• Etcd (can be used to build service discovery)

Page 50: Redundant devops

Registering services is not the job of your app

Page 51: Redundant devops

Process Supervision

Page 52: Redundant devops

Process supervision

• keeping your app working

• based on some property you define: not just the process ID, but a port, a ping or an HTTP response

• can fail after trying
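One way to read these bullets: the supervisor checks a property you chose, restarts the app when the check fails, and gives up after a bounded number of attempts. A minimal sketch with injected, hypothetical `check` and `restart` functions (a real supervisor would also wait between attempts):

```javascript
// Run one supervision pass: check the app, restart it if needed, and give up
// after `maxRestarts` restart attempts. `check` returns true when the chosen
// property (port, ping, HTTP response...) looks fine; `restart` restarts the app.
function supervise(check, restart, maxRestarts = 3) {
  for (let attempt = 0; attempt <= maxRestarts; attempt += 1) {
    if (check()) return { healthy: true, restarts: attempt };
    if (attempt < maxRestarts) restart();
  }
  return { healthy: false, restarts: maxRestarts };
}
```

When the result is `{ healthy: false }` the supervisor has failed after trying, which is the point at which a human should be alerted.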

Page 53: Redundant devops

Process supervision in Node-land

• PM2

• forever

Page 54: Redundant devops

Process supervision in general

• monit: manages any process, small footprint, simple

Page 55: Redundant devops

Using Monit

• Pro: monit can instantly restart your failing service

• Con: you might not know why it was failing

Not using Monit

• Pro: you can debug what actually happened

• Con: MTTR* can be relatively high

*Mean Time To Repair

Page 56: Redundant devops

Running Programs

Page 57: Redundant devops

Simple role

• start & stop your app: watch the process itself, handle process state

• send signals to the app: signals can be interpreted as tasks

Page 58: Redundant devops

Running Programs in general

• runit

• upstart

• systemd

• Supervisord

• God

• Circus

Page 59: Redundant devops

A good program runner

• distribution independent: you can migrate your scripts any time

• easy to configure

Page 60: Redundant devops

monit + runit (or similar)

• avoid using auto restart in both: they do not know about each other, which can create weird race conditions

• use runit to configure app start/stop

• let monit decide when to restart, and have it use runit to do so

Page 61: Redundant devops

Connecting Services

Page 62: Redundant devops

Goals & benefits

• decoupling: separate services, loosen up the connection between them

• scaling: scale up easily when needed, scale down after

Page 63: Redundant devops

HTTP based APIs

vs

Message Queues

Page 64: Redundant devops

HTTP based APIs

[Diagram: Service “1” and Service “2” connected through a LOAD BALANCER]

Page 65: Redundant devops

Message Queues

[Diagram: Service “1” and Service “2” connected through a MESSAGE QUEUE]

Page 66: Redundant devops

HTTP based APIs or

Message Queues?

It depends

Page 67: Redundant devops

HTTP APIs

• async / sync

• remote

• open API

Msg Queues

• async (usually)

• grouped, close

• low latency
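The asynchronous, decoupled nature of a message queue can be shown with a toy in-memory model. `ToyQueue` is hypothetical and lives in a single process; a real broker adds persistence, acknowledgements and a network protocol.

```javascript
// Toy in-memory queue: producers and consumers only know the queue,
// never each other.
class ToyQueue {
  constructor() {
    this.messages = [];
    this.consumers = [];
    this.next = 0;
  }

  subscribe(handler) {
    this.consumers.push(handler);
  }

  // Accept the message immediately; delivery happens later, in drain().
  publish(message) {
    this.messages.push(message);
  }

  // Hand queued messages to the consumers in round-robin order.
  drain() {
    while (this.messages.length > 0 && this.consumers.length > 0) {
      const handler = this.consumers[this.next % this.consumers.length];
      this.next += 1;
      handler(this.messages.shift());
    }
  }
}
```

Service “1” can publish while no consumer is running, and scaling Service “2” is a matter of subscribing more consumers; with an HTTP API the caller needs a responding instance at the moment of the call.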

Page 68: Redundant devops

Lessons learned

Page 69: Redundant devops

Prototype & learn

• use whatever modules and services you like

• get ready to go to live & production environments

• get ready to scale easily

Page 70: Redundant devops

Focus your app

• your app should do its job!

• not send logs or metrics, notify service registries, or keep itself running

• keep it simple

Page 71: Redundant devops

Talk to your ops

• they are here to run your app

• can help you a lot

• get on a common ground

• ask the right questions

Page 72: Redundant devops

With many thanks to

Peter Wilcsinszky / @pepov

Ferenc Kovacs / @Tyr43l

Page 73: Redundant devops

Let’s talk! :)

Find me around here, or come visit us in 2 weeks!

JSConf Budapest 2017

Page 74: Redundant devops

Thank you