scalable distributed system architecture

Upload: nicholas-van-de-walle

Post on 15-Jul-2015

TRANSCRIPT

Page 1: Scalable Distributed System Architecture

Scalable Raspberry Pi Architecture

(Distributed Systems)

[Diagram labels: LB, a node, app, HDFS]

-Aulëkin

Page 2: Scalable Distributed System Architecture

Goal: "Painless" Development and Maintenance

Specifically, SaaS ecosystems (Google, Facebook, Twitter)

Page 3: Scalable Distributed System Architecture

Mindset Primer:

Simple vs Easy, avoid Complection: easy hides complexity, simple eliminates it

UNIX Philosophy: do 1 thing well, compose through universal interfaces

Power of Functional Programming: focus on inputs and outputs, referential transparency
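A minimal sketch of "do 1 thing well, compose through universal interfaces", assuming plain text is the universal interface. The function names (`compose`, `strip_blank_lines`, `to_lower`, `sort_lines`) are illustrative and not from the talk. Each step is a pure function, so the same input always gives the same output (referential transparency).

```python
from functools import reduce
from typing import Callable

TextStep = Callable[[str], str]


def compose(*steps: TextStep) -> TextStep:
    """Chain single-purpose steps left to right, like a UNIX pipe."""
    return lambda text: reduce(lambda acc, step: step(acc), steps, text)


def strip_blank_lines(text: str) -> str:
    """Drop lines that are empty or whitespace only."""
    return "\n".join(line for line in text.splitlines() if line.strip())


def to_lower(text: str) -> str:
    """Lowercase the whole text."""
    return text.lower()


def sort_lines(text: str) -> str:
    """Sort lines alphabetically."""
    return "\n".join(sorted(text.splitlines()))


# Every step takes text and returns text, so any order of steps composes.
pipeline = compose(strip_blank_lines, to_lower, sort_lines)
```

Because no step touches shared state, `pipeline` can be tested, reordered, or reused without knowing anything about the other steps.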

Page 4: Scalable Distributed System Architecture

Web Developer = Internet Plumber

Page 5: Scalable Distributed System Architecture

Our Tests

Starting a New Service

Growing Existing Service

Performance Monitoring

Logging

Host Management

Decoupled consumption

Deployment

Page 6: Scalable Distributed System Architecture

Starting a New Service

How much work does it take to get a new service ready for users?

How do you communicate with your team about increased cluster complexity?

Growing Existing Service

How hard is it to add features to an existing service?

Can you increase robustness / redundancy when unforeseen edge cases are stumbled upon?

Page 7: Scalable Distributed System Architecture

Performance Monitoring

How much visibility do you have into the runtime performance characteristics of your programs?

If something is breaking, how do you diagnose + fix?

Logging

How do you monitor what your application is actually doing?

How do you manage logs of all user + server interactions?

Page 8: Scalable Distributed System Architecture

Host Management

How do you set up the program's environment?

- Configuration (DB access), env vars, libs / packages

Deployment

How do you get your code (as a binary or as scripts) onto host machines?

Page 9: Scalable Distributed System Architecture

Decoupled consumption

How easy is it to replace a service?

How easy is consuming a new service?

Are your services just "big objects?"

Page 10: Scalable Distributed System Architecture

[Diagram: a single app instance]

N = 1

Growing New Service: Easy because there's 1

Growing Old Service: Not there yet

Monitoring: HTOP

Logging: node server.js > $(date +"%d-%m-%y-%s").log

Host Management: apt-get install

Deployment: What's deployment?

Decoupled Consumption: Hardcoded

Page 11: Scalable Distributed System Architecture

Not even there yet

Upstart + scp

Bash scripts

Growing New Service

Growing Old Service

Monitoring

Logging

Host Management

Deployment

Decoupled Consumption: Hardcoded

LB

app

app

Git push!

HTOP always

Growing steadily

N = 4
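A rough sketch of what the LB box does at this stage, assuming simple round-robin over a hardcoded list of app instances. The backend addresses are made up for illustration; the slides do not say which balancing strategy was used.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Yield backends in turn, forever, so load spreads evenly."""
    if not backends:
        raise ValueError("need at least one backend")
    return cycle(backends)


# Hypothetical, hardcoded backends (matching "Decoupled Consumption: Hardcoded").
balancer = round_robin(["app-1:3000", "app-2:3000"])
first_target = next(balancer)
```

The hardcoded list is exactly what becomes painful as N grows: every new app instance means editing and redeploying the balancer.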

Page 12: Scalable Distributed System Architecture

[Diagram: LB in front of three app instances, plus HDFS]

N < 10

Growing New Service: Still a monolith

Growing Old Service: 10k lines, little structure

Monitoring: StatsD, New Relic

Logging: Dump to Loggly + HDFS

Host Management: Bash Scripts

Deployment: Bash script -> Git push's

Decoupled Consumption: Hardcoded
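For the StatsD entry, a small sketch of emitting a metric from an app. StatsD accepts plain-text UDP datagrams of the form `<name>:<value>|<type>`, with 8125 as its usual default port. The metric name and the helper functions here are illustrative assumptions, not code from the talk.

```python
import socket

VALID_TYPES = {"c", "ms", "g"}  # counter, timer (milliseconds), gauge


def format_metric(name: str, value: float, metric_type: str) -> bytes:
    """Build one StatsD datagram: <name>:<value>|<type>."""
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unsupported metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget over UDP; the app never blocks on monitoring."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Hypothetical metric name: count one handled request.
request_counter = format_metric("app.requests", 1, "c")
```

Calling `send_metric(request_counter)` would ship the counter to a local StatsD daemon, assuming one is listening.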

Page 13: Scalable Distributed System Architecture

Turn an app into an image

Compose images into 'pods'

Unique & Reproducible

Generic Daemons (enables reuse)

Handle logs, monitoring

Programmatic Host Config

Idempotent (and fast)

Can also deploy

Your app is just an app

Pools servers together

Config rules are mixable
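To illustrate "idempotent" and "mixable" host config, here is a toy sketch that treats a config file as text and each rule as a line that must be present. This is not how Ansible or Salt are implemented; `ensure_line` and `apply_rules` are hypothetical helpers that only show the property.

```python
from typing import Iterable


def ensure_line(config_text: str, line: str) -> str:
    """Return config_text with `line` present; do nothing if it already is."""
    lines = config_text.splitlines()
    if line in lines:
        return config_text  # already in the desired state
    return "\n".join(lines + [line]) + "\n"


def apply_rules(config_text: str, rules: Iterable[str]) -> str:
    """Apply several rules in order; rule sets can be mixed freely."""
    for rule in rules:
        config_text = ensure_line(config_text, rule)
    return config_text


# Hypothetical rule sets for two roles that share one host.
base_rules = ["LANG=en_US.UTF-8"]
app_rules = ["NODE_ENV=production", "LANG=en_US.UTF-8"]
configured = apply_rules("", base_rules + app_rules)
```

Running `apply_rules` a second time on `configured` returns the same text, which is what makes it safe to re-run config on every host at any time.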

Page 14: Scalable Distributed System Architecture

[Diagram: several load-balanced groups, each an LB in front of three app instances, plus HDFS nodes]

N < 50

Growing New Service: Bunch of work

Growing Old Service: Tight coupling, hard

Monitoring: StatsD (New Relic $$$)

Logging: Loggly ($$$) or HDFS

Host Management: Ansible / Salt

Deployment: Above + Docker

Decoupled Consumption: DNS rules

Page 15: Scalable Distributed System Architecture

DNS management decouples services

Key-Value store handles configuration management

API-driven private DNS rules are like a programmable service phone book

Also feature flagging, distributed coordination (kv-store semaphores)
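A sketch of the three uses named on this slide (service phone book, feature flags, kv-store semaphores), using an in-memory stand-in for the key-value store. This is not the Consul API; `KVStore`, `try_acquire`, `release`, and all keys and addresses are assumptions for illustration. The only primitive the semaphore needs is an atomic compare-and-set.

```python
import threading
from typing import Dict, Optional


class KVStore:
    """In-memory stand-in for a cluster key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        with self._lock:
            self._data[key] = value

    def compare_and_set(self, key: str, expected: Optional[str], new: str) -> bool:
        """Write `new` only if the current value still equals `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


def try_acquire(store: KVStore, key: str, limit: int) -> bool:
    """Take one of `limit` slots; False if full or if another writer raced us."""
    current = store.get(key)
    count = int(current) if current is not None else 0
    if count >= limit:
        return False
    return store.compare_and_set(key, current, str(count + 1))


def release(store: KVStore, key: str) -> None:
    """Give a slot back, retrying if another writer changed the counter."""
    while True:
        current = store.get(key)
        if current is None or int(current) == 0:
            return
        if store.compare_and_set(key, current, str(int(current) - 1)):
            return


store = KVStore()
store.put("service/billing", "10.0.0.7:8080")  # phone book entry (made-up address)
store.put("flags/new-checkout", "on")          # feature flag
```

Consumers look up `service/billing` at call time instead of hardcoding an address, so a service can be moved or replaced by updating one key.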

Page 16: Scalable Distributed System Architecture

[Diagram: many load-balanced groups, each an LB in front of three app instances, plus HDFS nodes]

N < 500

Growing New Service: Balls of mud

Growing Old Service: Time consuming

Monitoring: StatsD

Logging: Kafka / Flume

Host Management: Ansible / Salt

Deployment: Above + Docker

Decoupled Consumption: Consul + DNS

Page 17: Scalable Distributed System Architecture

What were we doing again?

Plumbing the internet, moving data around

We make programs which run in parallel but share resources (global state, hosts, pod characteristics)

Manage inter-service HTTP communication, wire our cluster together

- Running code

- Managing services and their resources

- Managing communication

What does that sound like?

Page 18: Scalable Distributed System Architecture

Need a Cluster OS

Give services resources

Run those services only when needed (upstart for apps, cron for jobs)

Orchestrate service connections

Properly isolate service processes

Page 19: Scalable Distributed System Architecture

A Data Center's Kernel

Distributed Scheduling and resource management

Enter Mesos

Modeled after Google's Borg

Manages 10,000+ nodes in Twitter's data centers

+ Aurora
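A toy sketch of "distributed scheduling and resource management": given nodes with free CPU and memory, put each task on the first node where it fits. This is a deliberate simplification and not Mesos's actual mechanism (Mesos offers resources to frameworks such as Aurora, which decide what to launch). `Node`, `Task`, `place`, and the node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpus: float   # free CPU cores
    mem_mb: int   # free memory in MB
    tasks: List[str] = field(default_factory=list)


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


def place(task: Task, nodes: List[Node]) -> Optional[str]:
    """First-fit placement: reserve resources on the first node with room."""
    for node in nodes:
        if node.cpus >= task.cpus and node.mem_mb >= task.mem_mb:
            node.cpus -= task.cpus
            node.mem_mb -= task.mem_mb
            node.tasks.append(task.name)
            return node.name
    return None  # nothing fits; the task has to wait


# Hypothetical cluster of two small machines.
cluster = [Node("pi-1", 1.0, 512), Node("pi-2", 4.0, 1024)]
app_host = place(Task("app", 2.0, 256), cluster)
```

The caller asks for resources, not for a machine, which is the shift in mindset the slide describes.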

Page 20: Scalable Distributed System Architecture

[Diagram: a grid of "a node" boxes coordinated by three ZooKeeper (zk) instances and three masters]

Mesos

Page 21: Scalable Distributed System Architecture

[Diagram: the same grid of nodes, three zk instances, and three masters, now with LB, app, and HDFS tasks spread across the nodes]

Page 22: Scalable Distributed System Architecture

What have we gotten?

Higher level of abstraction

Deployments are tagged and reproducible

Lots of redundancy, 'Startup DevOps' is baked in

Stop worrying about individual machines!

Automatic High Availability

Constraints and Labels control what runs where
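A small sketch of how constraints and labels can decide what runs where: nodes carry labels, a task carries required label values, and only matching nodes are eligible. The label names and `eligible_nodes` are made up for illustration; real schedulers have their own constraint syntax.

```python
from typing import Dict, List


def eligible_nodes(
    node_labels: Dict[str, Dict[str, str]],
    constraints: Dict[str, str],
) -> List[str]:
    """Return the nodes whose labels satisfy every constraint, sorted by name."""
    return sorted(
        name
        for name, labels in node_labels.items()
        if all(labels.get(key) == value for key, value in constraints.items())
    )


# Hypothetical labels on three nodes.
labels = {
    "node-a": {"rack": "1", "disk": "ssd"},
    "node-b": {"rack": "2", "disk": "hdd"},
    "node-c": {"rack": "2", "disk": "ssd"},
}
ssd_nodes = eligible_nodes(labels, {"disk": "ssd"})
```

An empty constraint set matches every node, so unconstrained tasks can land anywhere.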

Page 23: Scalable Distributed System Architecture

Why not start with Mesos?

Page 24: Scalable Distributed System Architecture

Unproven

Except that Mesos runs Twitter, Airbnb, Netflix, TWC, PayPal, OpenTable, Groupon, Foursquare, eBay... (at least for data processing needs)

Additional Setup Complexity

Compared to N < 10, perhaps. Starting w/ Mesos leads to less code churn down the line. Less slowing down.

Page 25: Scalable Distributed System Architecture

Why start with Mesos?

Start with Microservices, Private PaaS (Heroku)

Easy rolling deployments

Higher node utilization

Forced decoupling

Stop worrying about individual machines!
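A sketch of what a rolling deployment does, assuming the scheduler can upgrade one instance at a time and check its health. `rolling_deploy` and its `upgrade` / `healthy` callbacks are hypothetical stand-ins for whatever the cluster manager provides.

```python
from typing import Callable, List


def rolling_deploy(
    instances: List[str],
    new_version: str,
    upgrade: Callable[[str, str], None],
    healthy: Callable[[str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Upgrade in batches; stop at the first unhealthy batch.

    Returns the instances that were upgraded and passed the health check.
    Instances not yet reached keep serving the old version.
    """
    upgraded: List[str] = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for instance in batch:
            upgrade(instance, new_version)
        if not all(healthy(instance) for instance in batch):
            break  # halt the rollout before touching more instances
        upgraded.extend(batch)
    return upgraded
```

With `batch_size=1`, at most one instance is ever out of rotation, so users keep being served throughout the deploy.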

Page 26: Scalable Distributed System Architecture

Invest in Infrastructure

Developer Happiness :) We love building things

Productivity soars when tedium is removed

Hackathon projects can be quickly scaled to prod

Let computers do the boring things - avoid human mistakes