Mesos at OpenTable Pablo Delgado Senior Data Engineer OpenTable @pablete MesosCon 2015, Seattle, WA

Upload: pablo-delgado

Post on 13-Apr-2017

Page 1: Mesos at OpenTable

Mesos at OpenTable

Pablo Delgado Senior Data Engineer OpenTable @pablete

MesosCon 2015, Seattle, WA

Page 2: Mesos at OpenTable

OpenTable: the world’s leading provider of online restaurant reservations

• Over 32,000 restaurants worldwide

• More than 760 million diners seated since 1998, representing more than $30 billion spent at partner restaurants

• Over 16 million diners seated every month

• OpenTable has seated over 190 million diners via a mobile device. Almost 50% of our reservations are made via a mobile device

• OpenTable currently has a presence in the US, Canada, Mexico, the UK, Germany and Japan

• OpenTable has nearly 600 partners including Facebook, Google, TripAdvisor, Urbanspoon, Yahoo and Zagat.

Page 3: Mesos at OpenTable

At OpenTable

we aim to power

the best dining experiences!

Page 4: Mesos at OpenTable

Service Oriented Architecture

Page 5: Mesos at OpenTable

From monolith to microservices

Page 6: Mesos at OpenTable

• Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. PAPER: http://mesos.berkeley.edu/mesos_tech_report.pdf

• Omega: flexible, scalable schedulers for large compute clusters PAPER: http://research.google.com/pubs/pub41684.html

Apache Mesos

Page 7: Mesos at OpenTable

Apache Mesos

• Mesos slaves connect to masters and offer resources like CPU, disk, and memory.

• Masters take those offers and make decisions about resource allocation using frameworks like Singularity.

• Frameworks in turn choose to use resource offers, and run tasks on slaves.
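The offer cycle described in these bullets can be sketched as a toy, in-process model. This is only an illustration: the `Offer` and `Task` shapes and the first-fit policy are assumptions made for the example, not the Mesos API or Singularity's actual scheduling logic.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources one slave advertises via the master (hypothetical shape)."""
    slave: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """A unit of work the framework wants to run (hypothetical shape)."""
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, pending):
    """Toy framework scheduler: place pending tasks on offers, first fit.

    Returns (launched, still_pending), where launched is a list of
    (task name, slave) pairs. Leftover resources in an offer are ignored
    here; a real framework would decline them back to the master.
    """
    launched = []
    remaining = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(remaining):
            if task.cpus <= cpus and task.mem_mb <= mem:
                launched.append((task.name, offer.slave))
                cpus -= task.cpus
                mem -= task.mem_mb
                remaining.remove(task)
    return launched, remaining


offers = [Offer("slave-1", cpus=2.0, mem_mb=2048),
          Offer("slave-2", cpus=1.0, mem_mb=512)]
tasks = [Task("web", 1.0, 1024), Task("worker", 1.5, 1024), Task("cron", 0.5, 256)]
launched, waiting = schedule(offers, tasks)
```

In this run "web" and "cron" fit into the first offer, while "worker" stays pending until a slave offers at least 1.5 CPUs.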

Page 8: Mesos at OpenTable

Apache Mesos

[Diagram: a Mesos cluster spread over availability zones 2a, 2b and 2c. It shows one Mesos Master and two Standby Masters, each alongside Zookeeper and Netflix’s Exhibitor, and six Mesos Slaves running Docker.]

Page 9: Mesos at OpenTable

Hubspot’s Singularity Scheduler

Page 10: Mesos at OpenTable

• Native Docker Support

• JSON REST API and Java Client

• Fully featured web application (replaces and improves Mesos Master UI)

• Deployments, automatic rollbacks, and healthchecks

• Configurable email alerts to service owners

Singularity Features

Page 11: Mesos at OpenTable

Hubspot’s Singularity

Process types:

• Web Services

• Workers

• Scheduled (CRON-type) Jobs

• On-Demand Processes

Slave placement:

• GREEDY

• SEPARATE_BY_DEPLOY

• SEPARATE_BY_REQUEST

• OPTIMISTIC

Executors:

• Mesos executor

• Singularity executor

• Docker executor

Page 12: Mesos at OpenTable

Linux Containers

Page 13: Mesos at OpenTable

Docker

• Immutability

• Portability

• Isolation

Page 14: Mesos at OpenTable

Service Discovery

Page 15: Mesos at OpenTable

Services no longer live at a well-known address/port, so we needed a registry, or some other dynamic way, to find them. It also had to be Mesos-agnostic.

• Services announce their presence to the Discovery Server

• Services subscribe to changes in their dependencies' announcements

• Services un-announce on termination; after a crash, their announcement times out
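The announce / subscribe / un-announce lifecycle in the bullets above can be sketched with a small in-memory registry. This is a toy model under stated assumptions: the class name, method names and lease (TTL) mechanism are hypothetical and are not the API of OpenTable's Discovery Server.

```python
import time


class DiscoveryRegistry:
    """In-memory stand-in for a discovery server (hypothetical API)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._announcements = {}   # (service, address) -> lease expiry time
        self._subscribers = {}     # service -> list of callbacks

    def announce(self, service, address):
        """Register or refresh an instance; re-announcing acts as a heartbeat."""
        key = (service, address)
        is_new = key not in self._announcements
        self._announcements[key] = self._clock() + self._ttl
        if is_new:
            self._notify(service)

    def unannounce(self, service, address):
        """Explicit removal on clean termination."""
        if self._announcements.pop((service, address), None) is not None:
            self._notify(service)

    def expire(self):
        """Drop instances whose lease ran out, for example after a crash."""
        now = self._clock()
        dead = [key for key, expiry in self._announcements.items() if expiry <= now]
        for key in dead:
            del self._announcements[key]
        for service in {service for service, _ in dead}:
            self._notify(service)

    def discover(self, service):
        """Return the currently announced addresses of a service."""
        return sorted(addr for (svc, addr) in self._announcements if svc == service)

    def subscribe(self, service, callback):
        """Call callback(addresses) whenever the service's instance set changes."""
        self._subscribers.setdefault(service, []).append(callback)

    def _notify(self, service):
        for callback in self._subscribers.get(service, []):
            callback(self.discover(service))


# Usage with a fake clock so the timeout is deterministic.
now = [0.0]
registry = DiscoveryRegistry(ttl_seconds=10.0, clock=lambda: now[0])
seen = []
registry.subscribe("timezone", seen.append)
registry.announce("timezone", "10.0.0.5:31001")
registry.announce("timezone", "10.0.0.6:31002")
registry.unannounce("timezone", "10.0.0.5:31001")   # clean shutdown
now[0] = 11.0                                        # the other instance crashed
registry.expire()
```

Each change pushes the new address list to subscribers, so a dependent service never needs a fixed host or port and does not need to know that Mesos placed the task.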

Service Discovery

Page 16: Mesos at OpenTable

Service Discovery

[Diagram: three Zookeeper nodes and three Discovery Servers across availability zones 2a, 2b and 2c. Service instances labeled A and B interact with the Discovery Servers through arrows labeled Announce, Discover and Subscribe.]

Page 17: Mesos at OpenTable

Service Discovery API

Page 18: Mesos at OpenTable

FrontDoor

Page 19: Mesos at OpenTable

FrontDoor

• Route external traffic to internal services

• Simple Discovery-aware proxy

• Dynamic configuration

• Developer friendly configuration via Git repo

REQUEST_URI=/api/timezone* passthru timezone
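A sketch of how a rule like the one above might be evaluated. The grammar (`FIELD=glob action service`) is inferred from this single example and is an assumption, as are the function names; FrontDoor's real configuration format may differ.

```python
from fnmatch import fnmatchcase


def parse_rule(line):
    """Parse 'REQUEST_URI=<glob> <action> <service>' (assumed grammar)."""
    condition, action, service = line.split()
    field, _, pattern = condition.partition("=")
    return {"field": field, "pattern": pattern, "action": action, "service": service}


def route(rules, request_uri):
    """Return the service name of the first matching passthru rule, else None."""
    for rule in rules:
        if (rule["field"] == "REQUEST_URI"
                and rule["action"] == "passthru"
                and fnmatchcase(request_uri, rule["pattern"])):
            return rule["service"]
    return None


rules = [parse_rule("REQUEST_URI=/api/timezone* passthru timezone")]
matched = route(rules, "/api/timezone/v1/lookup")
unmatched = route(rules, "/api/menu")
```

A discovery-aware proxy would then look the returned service name up in Service Discovery to pick a live instance, rather than forwarding to a fixed address.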

Page 20: Mesos at OpenTable

Monitoring

Page 21: Mesos at OpenTable

Monitoring

https://github.com/opentable/mesos_stats

• Finds your service name by parsing the task names.

• Includes a Grafana dashboard

• Runs inside Mesos
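To illustrate what "parsing the task names" could look like, here is a minimal sketch. The task-name layout (`<service>-<deploy>-<start timestamp>-<instance>`) and all names below are hypothetical; mesos_stats may use a different pattern, so treat this only as an illustration of the idea.

```python
import re
from collections import Counter

# Assumed layout: <service>-<deploy id>-<start timestamp>-<instance number>
TASK_NAME = re.compile(
    r"^(?P<service>.+)-(?P<deploy>[^-]+)-(?P<started>\d+)-(?P<instance>\d+)$"
)


def service_name(task_name):
    """Extract the service part of a task name, or None if it does not match."""
    match = TASK_NAME.match(task_name)
    return match.group("service") if match else None


def tasks_per_service(task_names):
    """Count running tasks per service, ignoring names that do not parse."""
    return Counter(
        name for name in map(service_name, task_names) if name is not None
    )


names = [
    "timezone-service-v42-1433239354000-1",
    "timezone-service-v42-1433239354000-2",
    "search-api-v7-1433239400000-1",
    "not a task name",
]
counts = tasks_per_service(names)
```

Grouping by the extracted name is what lets per-task resource statistics be rolled up into one series per service on a dashboard.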

Page 22: Mesos at OpenTable

All together

Page 23: Mesos at OpenTable

Overview

[Diagram: how the pieces fit together. Components shown: Github, Continuous Integration, Docker Registry, Singularity, three Masters with Zookeeper, six Slaves with Docker, three Discovery servers, and FrontDoor.]

Page 24: Mesos at OpenTable

Developer’s Concerns

[Diagram: Github, Continuous Integration, Docker Registry, Singularity.]

• Initialize projects with a Continuous Integration template

• Enable monitoring/logging of application-level errors

• Build the project as an immutable Docker image

• Deploy to Mesos through Singularity using a REST API
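The last step, deploying through Singularity's REST API, might look roughly like the sketch below. It only builds the HTTP request and does not send it. The `/api/deploys` path and the field names (`requestId`, `containerInfo`, `memoryMb`) are assumptions to verify against the documentation of the Singularity version in use; the host, request id and image name are placeholders.

```python
import json
from urllib.request import Request


def build_deploy_request(base_url, request_id, deploy_id, image,
                         cpus=1.0, memory_mb=512):
    """Build (but do not send) a deploy request for a Docker image.

    Field names are assumptions modeled on Singularity's deploy object.
    """
    payload = {
        "deploy": {
            "requestId": request_id,
            "id": deploy_id,
            "containerInfo": {
                "type": "DOCKER",
                "docker": {"image": image},
            },
            "resources": {"cpus": cpus, "memoryMb": memory_mb},
        }
    }
    return Request(
        url=f"{base_url}/api/deploys",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_deploy_request(
    "http://singularity.example.com/singularity",   # placeholder host
    request_id="timezone",
    deploy_id="build-123",
    image="registry.example.com/timezone:build-123",
)
body = json.loads(req.data)
```

A CI job could send this with `urllib.request.urlopen(req)` once the image has been pushed to the Docker Registry, using the build number as the deploy id so every deploy maps to one immutable image.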

Page 25: Mesos at OpenTable

Operational Concerns

[Diagram: Singularity, three Masters with Zookeeper, six Slaves with Docker, three Discovery servers, Docker Registry, and FrontDoor.]

• Provide Mesos with resources

• Monitor and maintain external traffic routing

• Monitor and replace failing resources

Page 26: Mesos at OpenTable

Stateless Simplicity

Stateless Mesos Cluster

• Datastores: MySQL, PostgreSQL, MongoDB

• Caches: Redis, Memcached

• Other: Zookeeper, Amazon S3

Page 27: Mesos at OpenTable

[Diagram: deployment environments. Locations: US Data Center, EU Data Center, AWS us-west-2 and AWS eu-west-1. Environment labels: PROD, QA and DATA PROCESSING.]

Page 28: Mesos at OpenTable

[Diagram: the same environments (US Data Center, EU Data Center, AWS us-west-2, AWS eu-west-1; PROD, QA, DATA PROCESSING), now with Kafka shown alongside them.]

Page 29: Mesos at OpenTable

Data Processing

Page 30: Mesos at OpenTable

Distributed Multitenant Data Processing

Page 31: Mesos at OpenTable

Spark’s Approach

• Generalize MapReduce in order to support new apps in the same engine

• General DAGs and Data Sharing

• Unification benefits both the engine, which is more efficient, and the user, for whom it is simpler

• Handles batch, interactive and online processing

• API available for Java, Scala, Python, SQL, R

Page 32: Mesos at OpenTable

Spark RDDs

Resilient Distributed Datasets (RDDs) are fault-tolerant distributed collections.

They exist in the form of:

• Parallelized Collections

• External datasets, distributed datasets from any storage source supported by Hadoop, including your local file system, HDFS, Cassandra, HBase, Amazon S3, etc.
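Why RDDs are fault-tolerant can be illustrated with a toy model: each dataset remembers how to compute its partitions from its parent (its lineage), so a lost partition is rebuilt rather than restored from a replica. `ToyRDD` below is a hypothetical stand-in written for this example, not Spark's API.

```python
class ToyRDD:
    """Tiny stand-in for an RDD: a partition count plus a recipe per partition."""

    def __init__(self, compute, num_partitions):
        self._compute = compute              # partition index -> list of records
        self.num_partitions = num_partitions

    @classmethod
    def parallelize(cls, records, num_partitions):
        """Build a dataset from a local collection, split round-robin."""
        records = list(records)
        return cls(lambda i: records[i::num_partitions], num_partitions)

    def filter(self, predicate):
        """Lazy transformation: nothing runs until a partition is requested."""
        parent = self
        return ToyRDD(
            lambda i: [r for r in parent.partition(i) if predicate(r)],
            self.num_partitions,
        )

    def partition(self, index):
        """(Re)compute one partition from the lineage."""
        return self._compute(index)

    def collect(self):
        """Action: compute every partition and gather the results."""
        return [r for i in range(self.num_partitions) for r in self.partition(i)]


lines = ToyRDD.parallelize(
    ["INFO start", "ERROR disk full", "INFO ok", "ERROR timeout"], num_partitions=2
)
errors = lines.filter(lambda line: line.startswith("ERROR"))
result = errors.collect()
rebuilt = errors.partition(1)   # recomputing a "lost" partition from lineage
```

Real Spark adds caching, shuffles and distribution on top, but the recovery idea is the same: recompute only the missing partition from its parents.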

Page 33: Mesos at OpenTable

RDD Graph

[Diagram: the same RDD graph in two views. Dataset-level view: file: HadoopRDD (path = hdfs://...) feeding errors: FilteredRDD (func = _.contains(…), shouldCache = true). Partition-level view: the file RDD and errors RDD split into partitions, processed by Task 1, Task 2, Task 3, ... Task n.]

Page 34: Mesos at OpenTable

Scheduling Process: lifetime of a job

rdd1.join(rdd2).groupBy(…).filter(…)

• RDD Objects: build operator DAG, then pass the DAG to the DAGScheduler

• DAGScheduler: split graph into stages of tasks; submit each stage as ready (as a TaskSet); agnostic to operators

• TaskScheduler: launch tasks via cluster manager; retry failed or straggling tasks; report "stage failed" back to the DAGScheduler; doesn’t know about stages

• Worker: execute tasks in threads; store and serve blocks (Block manager)
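The "split graph into stages" step can be sketched for a simple linear pipeline. This is a deliberately simplified toy: the operator names in `WIDE` are assumed to need a shuffle, branches such as the second input of a join are ignored, and none of this is Spark's actual DAGScheduler code.

```python
# Operators assumed to require a shuffle (a "wide" dependency).
WIDE = {"join", "groupBy", "reduceByKey"}


def split_into_stages(operators):
    """Cut a linear chain of operators into stages at shuffle boundaries.

    Narrow operators are pipelined into the current stage; a wide operator
    needs its input shuffled first, so it starts a new stage.
    """
    stages = []
    current = []
    for op in operators:
        if op in WIDE and current:
            stages.append(current)
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


stages = split_into_stages(["textFile", "map", "join", "groupBy", "filter"])
```

Here `filter` is pipelined into the same stage as `groupBy`, because it can run on each partition without moving data between workers.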

Page 38: Mesos at OpenTable

Alternating Least Squares (ALS) in MLlib

Page 39: Mesos at OpenTable

Running Spark

[Diagram: a Driver Program (SparkContext) connected to a Cluster Manager and to two Worker Nodes. Each Worker Node runs an Executor containing Tasks and a Cache.]

Page 40: Mesos at OpenTable

Mesos Coarse Grained

[Diagram: the Driver Program (SparkContext) is labeled Framework and the Mesos Master takes the Cluster Manager role. Each Worker Node runs a Mesos Executor that holds an Executor with its Tasks and Cache.]

Page 41: Mesos at OpenTable

Mesos Fine Grained

[Diagram: the Driver Program (SparkContext) is labeled Framework and the Mesos Master takes the Cluster Manager role. Each Worker Node runs a Mesos Executor; the diagram shows four Executor boxes and four Tasks across the two Worker Nodes.]
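Choosing between the two modes is a matter of Spark configuration. The sketch below assembles a `spark-submit` command line; `spark.mesos.coarse` and `spark.cores.max` are Spark properties, while the ZooKeeper host, the application file and the helper function itself are placeholders introduced for this example.

```python
def spark_submit_args(master, app, coarse_grained, max_cores=None):
    """Assemble a spark-submit command line for a Mesos-backed job."""
    conf = {"spark.mesos.coarse": "true" if coarse_grained else "false"}
    if max_cores is not None:
        # Cap the total cores the job may hold across the cluster.
        conf["spark.cores.max"] = str(max_cores)
    args = ["spark-submit", "--master", master]
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args + [app]


args = spark_submit_args(
    master="mesos://zk://zk1.example.com:2181/mesos",   # placeholder ZooKeeper host
    app="etl_job.py",                                   # placeholder application
    coarse_grained=True,
    max_cores=8,
)
```

In coarse-grained mode the job holds its executors for its whole lifetime, so capping cores matters on a shared cluster; fine-grained mode gives resources back between tasks at the cost of higher task launch overhead.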

Page 42: Mesos at OpenTable

Pull Requests (maybe merged)

[SPARK-7373] Add docker support for launching drivers in mesos cluster mode.

[SPARK-5338] Add cluster mode support for Mesos

[SPARK-5095] Support capping cores and launch multiple executors in coarse mode

[SPARK-6707] Mesos Scheduler should allow the user to specify constraints based on slave attributes

[SPARK-6287] Add dynamic allocation to the coarse-grained Mesos scheduler

Page 43: Mesos at OpenTable

Memory-centric distributed storage system (cache)

Distributed file system

General engine for large-scale data processing

Kernel for the datacenter

Ideal data processing stack

Page 44: Mesos at OpenTable

Other frameworks

• KAFKA on mesos https://github.com/mesos/kafka

• SAMZA on mesos https://github.com/banno/samza-mesos

• PHOENIX (secor on mesos) https://github.com/stealthly/phoenix

• CASSANDRA on mesos https://github.com/mesosphere/cassandra-mesos

We are also using:

We are considering:

• CHRONOS https://github.com/mesos/chronos

• MARATHON https://github.com/mesosphere/marathon

Page 45: Mesos at OpenTable

[Diagram: data pipeline. Components shown: User Activity, Kafka, backups, JSON, ETL, a Query/Processing Layer (Spark SQL, Spark MLlib, Spark Streaming), and Data Products.]

Page 46: Mesos at OpenTable

ETL with Spark / Spark SQL

{"userId":"xxxxxxxx","event":"personalizer_search","query_longitude":-77.16816,"latitude":38.918159,"req_attribute_tag_ids":["pizza"],"req_geo_query":"Current Location","sort_by":"best","longitude":-77.168156,"query_latitude":38.91816,"req_forward_minutes":30,"req_party_size":2,"req_backward_minutes":30,"req_datetime":"2015-06-02T12:00","req_time":"12:00","res_num_results":784,"calculated_radius":5.466253405962307,"req_date":"2015-06-02"},"type":"track","messageId":"b4f2fafc-dd4a-45e3-99ed-4b83d1e42dcd","timestamp":"2015-06-02T10:02:34.323Z"}
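The kind of ETL step applied to such events can be sketched with the standard library alone, standing in for what a Spark SQL `GROUP BY` over the JSON would do. The sample records below are hypothetical, flattened events that reuse a few field names from the event above; the function names are made up for this example.

```python
import json
from collections import Counter


def parse_events(lines):
    """Yield decoded events, skipping lines that are not valid JSON."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def searches_by_party_size(events):
    """Count personalizer_search events per requested party size."""
    counts = Counter()
    for event in events:
        if event.get("event") == "personalizer_search":
            counts[event.get("req_party_size")] += 1
    return counts


raw = [
    '{"event": "personalizer_search", "req_party_size": 2, "req_date": "2015-06-02"}',
    '{"event": "personalizer_search", "req_party_size": 4, "req_date": "2015-06-02"}',
    'not json at all',
    '{"event": "other_event", "req_party_size": 2}',
    '{"event": "personalizer_search", "req_party_size": 2, "req_date": "2015-06-03"}',
]
counts = searches_by_party_size(parse_events(raw))
```

Tolerating malformed lines matters in practice, because one bad record in a stream of activity events should not fail the whole batch.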

Page 47: Mesos at OpenTable

Matrix Factorization. Spark MLlib

• Collaborative Filtering

• Topic Modeling

• Restaurant Demand Analysis

Page 48: Mesos at OpenTable

nigiri sashimi gari maki roku rolls roll godzilla chirashi robata zushi omakase yellowtail unagi

samba toro gyoza aburi spider starburst nakazawa shabu sasa katana sake hapa maguro tsunami

raku kappo yasuda otoro seki tamari ra teppanyaki caterpillar japan shashimi hamasaku

Early explorations with Word2vec: Find synonyms for “Sushi”

We use Apache Spark’s Implementation of Word2Vec (skip-gram model)

Page 49: Mesos at OpenTable

A restaurant like your favorite one but in a different city. Find the “synonyms” of the restaurant in question, then filter by location!

Downtown upscale sushi experience with sushi bar

• San Francisco: Akiko’s, SF

• Maui: Sansei Seafood Restaurant & Sushi Bar, Maui

• Chicago: Masaki Sushi Chicago

• New York: Sushi of Gari, Gari Columbus, NYC
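"Find the synonyms, then filter by location" can be sketched as a nearest-neighbour search over embedding vectors. The three-dimensional vectors below are made up purely for illustration (real Word2Vec vectors are learned and much longer), "Hypothetical Steakhouse" is an invented entry, and the function names are not from Spark.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_in_city(vectors, cities, query, city, top_n=1):
    """Rank restaurants in `city` by similarity to `query`, best first."""
    q = vectors[query]
    candidates = [
        (cosine(q, vec), name)
        for name, vec in vectors.items()
        if name != query and cities[name] == city
    ]
    return [name for _, name in sorted(candidates, reverse=True)[:top_n]]


# Made-up embeddings; only the relative directions matter here.
vectors = {
    "Akiko's": [0.9, 0.1, 0.8],
    "Sushi of Gari": [0.8, 0.2, 0.9],
    "Masaki Sushi": [0.85, 0.15, 0.7],
    "Hypothetical Steakhouse": [0.1, 0.9, 0.2],
}
cities = {
    "Akiko's": "San Francisco",
    "Sushi of Gari": "New York",
    "Masaki Sushi": "Chicago",
    "Hypothetical Steakhouse": "New York",
}
best_in_new_york = similar_in_city(vectors, cities, "Akiko's", "New York")
```

Filtering by city before ranking keeps the search cheap and guarantees that every suggestion is actually in the city the diner is visiting.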

Page 50: Mesos at OpenTable

Keep in touch: @pablete