sprayer: low latency, reliable multichannel messaging


Upload: pablo-enfedaque

Post on 18-Dec-2014


DESCRIPTION

At Telefonica PDI we are developing an internal messaging service to be used by our own products. Sprayer is a low latency, reliable messaging system that delivers messages to a single receiver, a predefined group of receivers or a specific list of receivers over different channels (SMS, HTTP, WebSockets, email, and Android, iOS and Firefox OS native push…). We use Redis, MongoDB and RabbitMQ to implement Sprayer. In this talk we review Sprayer’s architecture and, for each of these technologies, explain why, where and for what it is used, along with some tips. Talk given together with Javier Arias (@javier_arilos) at NoSQL Matters Barcelona 2013.

TRANSCRIPT

Page 1: Sprayer: low latency, reliable multichannel messaging

Sprayer: low latency, reliable multichannel messaging

for Telefonica Digital

Page 2: Sprayer: low latency, reliable multichannel messaging

who are we?

Javier Arias (@javier_arilos)

Javier is a software architect and developer who has worked in different sectors such as M2M, telcos, finance and airports.

Pablo Enfedaque (@pablitoev56)

Pablo is a SW R&D engineer with a strong background in high performance computing, big data and distributed systems.

Page 3: Sprayer: low latency, reliable multichannel messaging

some context

Telefónica is the 4th largest telco in the world

2 years ago Telefonica Digital was established to spread our business to the digital world

former Telefonica R&D / PDI was merged into this new company

Page 4: Sprayer: low latency, reliable multichannel messaging

overview

we are developing an internal messaging service to be used by our own products

we have polyglot persistence using different NoSQL technologies

in this talk we will review Sprayer’s architecture and, for each technology, how it is used

Page 5: Sprayer: low latency, reliable multichannel messaging

why sprayer?

a common push messaging service. why?

➔ each project with messaging needs was implementing its own server its own way

➔ 5 push messaging systems in the company

➔ none of them supporting a wide variety of transports

➔ independent deployment and operations

Page 6: Sprayer: low latency, reliable multichannel messaging

the problem

cross technology push: iOS, Android, WebSockets, HTTP, email, SMS, Firefox OS

point to point and pubsub: 1 to 1, 1 to N, 1 to group

PaaS, multitenant

Page 7: Sprayer: low latency, reliable multichannel messaging

inspiration

➔ Google’s Thialfi: http://research.google.com/pubs/pub37474.html

➔ Twitter Timeline: http://www.infoq.com/presentations/Twitter-Timeline-Scalability

➔ Pusher: http://www.pusher.com

➔ Pubnub: http://www.pubnub.com

➔ Amazon SNS: http://aws.amazon.com/sns/

Page 8: Sprayer: low latency, reliable multichannel messaging

the proposal

SPRAYER!

Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, to a predefined group of receivers or to a specific list of receivers over different channels (WebSockets, SMS, Email, HTTP and iOS, Android or Firefox OS native push…)

Page 9: Sprayer: low latency, reliable multichannel messaging

the proposal

SPRAYER!

our motto: you care about business, we deliver your messages

Page 10: Sprayer: low latency, reliable multichannel messaging

server side API

Page 11: Sprayer: low latency, reliable multichannel messaging

server side API

?

Page 12: Sprayer: low latency, reliable multichannel messaging

server side API challenges

➔ common interface for all channels

➔ reliable, consistent, idempotent

➔ route messages efficiently

➔ simple and user oriented

◆ manage subscriptions

◆ send messages: to list or group (topic)

◆ get delivery feedback

➔ standards based (HTTP + JSON)
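
A minimal sketch of what "idempotent" could mean for the accepter, assuming the caller supplies its own message id. The field names (`message_id`, `channel`, `to`, `body`) and the `Accepter` class are illustrative assumptions, not Sprayer's actual API, and a plain dict stands in for the operational storage.

```python
import json


class Accepter:
    """Hypothetical accepter: stores each message once, keyed by its id."""

    def __init__(self):
        self.accepted = {}  # stand-in for operational storage

    def accept(self, raw_json):
        msg = json.loads(raw_json)
        msg_id = msg["message_id"]
        # Re-sending the same request must not create a second message.
        created = msg_id not in self.accepted
        if created:
            self.accepted[msg_id] = msg
        return {"message_id": msg_id, "created": created}


request = json.dumps({
    "message_id": "m-1",     # chosen by the caller (assumption)
    "channel": "sms",
    "to": ["+34600000000"],  # explicit list; a group name would also fit
    "body": "hello",
})

accepter = Accepter()
first = accepter.accept(request)
retry = accepter.accept(request)  # e.g. the client retried after a timeout
```

With this shape a client can safely repeat a request whose response was lost: the second call reports the same message id and stores nothing new.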

Page 13: Sprayer: low latency, reliable multichannel messaging

architecture

[diagram: APPLICATION <BACKEND>; sprayer backend (ACCEPTER REST API, operational storage, MESSAGES DISPATCHING); sms gateway, email gateway, GCM, APNs]

Page 14: Sprayer: low latency, reliable multichannel messaging

messages dispatching

Page 15: Sprayer: low latency, reliable multichannel messaging

messages dispatching

?

Page 16: Sprayer: low latency, reliable multichannel messaging

message dispatching challenges

➔ scaling horizontally

➔ reliability

➔ different channels:

◆ HTTP (outbound)

◆ Websockets (inbound)

◆ iOS push (APNs)

◆ Android push (GCM)

◆ SMS

◆ eMail

Page 17: Sprayer: low latency, reliable multichannel messaging

architecture

[diagram: APPLICATION <BACKEND>; sprayer backend (ACCEPTER REST API, operational storage, MESSAGES ROUTING, dispatchers: IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

Page 18: Sprayer: low latency, reliable multichannel messaging

outbound-stateless dispatchers

simple dispatchers: HTTP, iOS, Android...

➔ take message, get msg subscribers, dispatch to receiver, report feedback

➔ completely stateless

[diagram: ACCEPTER REST API; operational storage; ANDROID dispatcher; GCM]
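
The four steps of a simple dispatcher can be sketched as a single function that keeps nothing between calls. The function name, the message fields and the fake sender are assumptions made for illustration; they stand in for the subscriptions storage, the push gateway and the feedback path.

```python
def dispatch(message, get_subscribers, send, report):
    """One stateless pass: everything it needs arrives as arguments."""
    for receiver in get_subscribers(message["group"]):
        try:
            send(receiver, message["body"])
            report(message["id"], receiver, "delivered")
        except Exception as exc:  # a failed receiver must not stop the rest
            report(message["id"], receiver, "failed: %s" % exc)


# Hypothetical stand-ins for subscriptions storage, GCM and the feedback path.
subscribers = {"news": ["device-a", "device-b"]}
feedback = []


def fake_send(receiver, body):
    if receiver == "device-b":
        raise RuntimeError("unregistered")


dispatch(
    {"id": "m-1", "group": "news", "body": "hi"},
    get_subscribers=lambda group: subscribers.get(group, []),
    send=fake_send,
    report=lambda msg_id, receiver, status: feedback.append((msg_id, receiver, status)),
)
```

Because the function holds no state, any number of identical dispatcher processes can take jobs in parallel.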

Page 19: Sprayer: low latency, reliable multichannel messaging

connection aware dispatchers

clients (websockets, HTTP long poll …)

➔ messages are stored until clients connect

➔ client inits a persistent connection

➔ potentially, millions of clients

[diagram: ACCEPTER REST API; operational storage; WEBSOCKETS dispatcher with ROUTER and DELIVERER; inboxes]
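
A rough sketch of the inbox idea, assuming one pending queue per receiver: the router always appends, and the deliverer drains the queue once the client has an open connection. The `Inboxes` class and its method names are hypothetical, and a dict of deques stands in for the real inbox storage.

```python
from collections import defaultdict, deque


class Inboxes:
    """Hypothetical per-receiver inboxes held in memory."""

    def __init__(self):
        self.pending = defaultdict(deque)
        self.connected = {}  # receiver id -> callable that writes to its socket

    def route(self, receiver, message):
        # Router side: always append, whether or not the receiver is online.
        self.pending[receiver].append(message)
        self.flush(receiver)

    def connect(self, receiver, write):
        # Deliverer side: the client opened a persistent connection.
        self.connected[receiver] = write
        self.flush(receiver)

    def flush(self, receiver):
        write = self.connected.get(receiver)
        while write is not None and self.pending[receiver]:
            write(self.pending[receiver].popleft())


inboxes = Inboxes()
inboxes.route("alice", "first")   # alice is offline: the message waits
received = []
inboxes.connect("alice", received.append)
inboxes.route("alice", "second")  # alice is online: delivered immediately
```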

Page 20: Sprayer: low latency, reliable multichannel messaging

message routing

Page 21: Sprayer: low latency, reliable multichannel messaging

message routing

?

Page 22: Sprayer: low latency, reliable multichannel messaging

message routing challenges

routing (two-steps):

➔ API routes messages to N dispatchers

➔ each dispatcher routes message to M receivers (subscribers of a group)

both steps must be decoupled

The number of receivers could be thousands
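
The two decoupled steps can be sketched as follows, assuming one job queue per channel. The data, the function names and the in-memory deques are illustrative stand-ins, not Sprayer's code.

```python
from collections import defaultdict, deque

# Hypothetical data: which channels a group uses, and its subscribers per channel.
channels_by_group = {"news": ["android", "sms"]}
subscribers = {
    ("news", "android"): ["device-a", "device-b"],
    ("news", "sms"): ["+34600000000"],
}
queues = defaultdict(deque)  # one job queue per channel (stand-in for a broker)


def api_route(message):
    """Step 1: the API enqueues one small job per channel, not per receiver."""
    for channel in channels_by_group[message["group"]]:
        queues[channel].append(message)


def dispatcher_route(channel):
    """Step 2: each dispatcher expands its jobs into the M receivers."""
    deliveries = []
    while queues[channel]:
        message = queues[channel].popleft()
        for receiver in subscribers[(message["group"], channel)]:
            deliveries.append((receiver, message["body"]))
    return deliveries


api_route({"group": "news", "body": "hi"})
android = dispatcher_route("android")
sms = dispatcher_route("sms")
```

The API's work grows with the number of channels only; the expansion to thousands of receivers happens later, inside each dispatcher.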

Page 23: Sprayer: low latency, reliable multichannel messaging

architecture

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, operational storage, subscriptions storage, FEEDBACK, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

Page 24: Sprayer: low latency, reliable multichannel messaging

async message delivery feedback

Page 25: Sprayer: low latency, reliable multichannel messaging

?

async message delivery feedback

Page 26: Sprayer: low latency, reliable multichannel messaging

async delivery feedback challenges

make msg feedback available through API to clients

feedback must not compromise message delivery or API

feedback: msg delivery, connections, push

The number of updates could be millions

Page 27: Sprayer: low latency, reliable multichannel messaging

architecture

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, operational storage, subscriptions storage, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

Page 28: Sprayer: low latency, reliable multichannel messaging

technology stack

Page 29: Sprayer: low latency, reliable multichannel messaging

subscriptions storage

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, operational storage, subscriptions storage, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

?

Page 30: Sprayer: low latency, reliable multichannel messaging

subscriptions storage

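[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, operational storage, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs; the subscriptions storage label is no longer shown]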

Page 31: Sprayer: low latency, reliable multichannel messaging

dispatcher receiver inboxes

[diagram: ACCEPTER REST API; WEBSOCKETS dispatcher with ROUTER and DELIVERER; inboxes]

?

Page 32: Sprayer: low latency, reliable multichannel messaging

dispatcher receiver inboxes

[diagram: ACCEPTER REST API; WEBSOCKETS dispatcher with ROUTER and DELIVERER; inboxes]

Page 33: Sprayer: low latency, reliable multichannel messaging

redis

Redis is an open source, advanced key-value store. It is often referred to as a data structure server (...) - (redis.io)

why redis?

- amazingly fast

- easy to use

- usage patterns: shared cache, queues, pubsub, distributed lock, counting things

Page 34: Sprayer: low latency, reliable multichannel messaging

redis use cases

use cases in Sprayer:

➔ group subscribers x channel

➔ channels x group

➔ websockets channel queues (potentially millions of receivers)

limitations for our use cases:

➔ memory bound

➔ queries and pagination

➔ high throughput queues
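
One way to picture the first two use cases is as sets: one set of receivers per group and channel, and one set of channels per group. The sketch below uses plain Python sets as a stand-in for Redis sets (the Redis commands would be SADD and SMEMBERS); the key layout and function names are assumptions, not Sprayer's actual schema.

```python
from collections import defaultdict

# Plain Python sets stand in for Redis sets; the key names are made up.
store = defaultdict(set)


def subscribe(tenant, group, channel, receiver):
    # Redis equivalent: SADD on both keys.
    store["%s:group:%s:%s" % (tenant, group, channel)].add(receiver)
    store["%s:channels:%s" % (tenant, group)].add(channel)


def channels(tenant, group):
    # Redis equivalent: SMEMBERS.
    return sorted(store["%s:channels:%s" % (tenant, group)])


def group_subscribers(tenant, group, channel):
    return sorted(store["%s:group:%s:%s" % (tenant, group, channel)])


subscribe("acme", "news", "android", "device-a")
subscribe("acme", "news", "android", "device-a")  # duplicate is a no-op
subscribe("acme", "news", "sms", "+34600000000")
```

Set semantics make subscribing idempotent for free, which fits the API challenges listed earlier.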

Page 35: Sprayer: low latency, reliable multichannel messaging

redis concerns

➔ what happens when the dataset does not fit in memory? two strategies

◆ partition datasets to different redis clusters

◆ sharding: based on tenant would be easy

➔ FT and HA

◆ easy way: master-slave with virtual IPs, switch slave’s IP when master’s out. home made daemon

◆ sentinel based, some tests done, needs to be supported by client library

◆ redis cluster being implemented; limited features
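
Tenant-based sharding could look like the sketch below: hash the tenant id and pick one of a fixed list of shards. The shard addresses and the function name are hypothetical. A stable hash from `hashlib` is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib

# Hypothetical shard addresses.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(tenant):
    """Map a tenant to one shard, the same one on every process and every run."""
    digest = hashlib.sha1(tenant.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]
```

All keys of a tenant land on the same instance, so set operations within a tenant stay local. The trade-off of a plain modulo is that changing the number of shards remaps most tenants.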

Page 36: Sprayer: low latency, reliable multichannel messaging

operational storage

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, operational storage, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

?

Page 37: Sprayer: low latency, reliable multichannel messaging

operational storage

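[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs; the operational storage label is no longer shown]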

Page 38: Sprayer: low latency, reliable multichannel messaging

mongodb

mongoDB (from "humongous") is a document database (...) features: full index support, replication & HA, auto-sharding... (mongodb.org)

why mongoDB?

➔ scaling & HA

➔ great performance

➔ dynamic schemas

➔ versatile

Page 39: Sprayer: low latency, reliable multichannel messaging

mongodb use cases

use cases in Sprayer:

➔ operational DB, administrative data

➔ message delivery feedback updates (potentially millions of records)

limitations for our use cases:

➔ operations with sets of subscribers

➔ high throughput queues
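
With millions of feedback updates, duplicates and out-of-order arrivals are to be expected. A small sketch of an update rule that tolerates both, assuming one record per message and receiver; the status names and their order are invented for the example, and a dict stands in for the feedback collection.

```python
# A dict keyed by (message id, receiver) stands in for a feedback collection.
feedback = {}

# Assumed status order; a later status is never overwritten by an earlier one.
RANK = {"accepted": 0, "dispatched": 1, "delivered": 2}


def apply_update(msg_id, receiver, status):
    key = (msg_id, receiver)
    current = feedback.get(key)
    if current is None or RANK[status] > RANK[current]:
        feedback[key] = status


updates = [
    ("m-1", "device-a", "dispatched"),
    ("m-1", "device-a", "delivered"),
    ("m-1", "device-a", "dispatched"),  # redelivered duplicate, arrives late
]
for update in updates:
    apply_update(*update)
```

Against a document database the same rule would typically be expressed as a conditional upsert on the record key, so that applying an update twice leaves the record unchanged.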

Page 40: Sprayer: low latency, reliable multichannel messaging

mongodb concerns

no concerns about mongodb for our use case.

maybe one for the long term: can it handle the huge amount of feedback write operations without affecting the API?

Page 41: Sprayer: low latency, reliable multichannel messaging

async queues

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER, feedback queue, per-channel queues (email, sms, android, WS, HTTP, iOS) and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

?

?

Page 42: Sprayer: low latency, reliable multichannel messaging

async queues

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs; the queue labels are no longer shown]

Page 43: Sprayer: low latency, reliable multichannel messaging

rabbitmq

robust messaging for applications, easy to use (www.rabbitmq.com)

why rabbitmq?

➔ very fast

➔ reliable

➔ builtin clustering

Page 44: Sprayer: low latency, reliable multichannel messaging

rabbitmq use cases

use cases in Sprayer:

➔ jobs for dispatchers (API => dispatchers)

➔ feedback status updates: message delivery, connections, device status (dispatchers => API)

limitations for our use cases:

➔ not scaling well to millions of queues (websocket receiver inboxes)
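
The two queue directions can be sketched with the standard library, using `queue.Queue` and a worker thread as a stand-in for a broker and a dispatcher process. The job fields and the sentinel shutdown are assumptions made for the example.

```python
import queue
import threading

jobs = queue.Queue()      # API => dispatchers
feedback = queue.Queue()  # dispatchers => API (status feeder)


def dispatcher():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut down
            break
        feedback.put((job["id"], "delivered"))


worker = threading.Thread(target=dispatcher)
worker.start()

# The API only enqueues and returns; it never waits for the delivery itself.
for n in range(3):
    jobs.put({"id": "m-%d" % n})
jobs.put(None)
worker.join()

statuses = [feedback.get_nowait() for _ in range(3)]
```

Neither side calls the other directly, so a slow dispatcher or a burst of feedback only lengthens a queue instead of blocking the API.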

Page 45: Sprayer: low latency, reliable multichannel messaging

rabbitmq concerns

no concerns!

rabbitmq is best suited to very high throughput messaging

Page 46: Sprayer: low latency, reliable multichannel messaging

full tech stack

[diagram: APPLICATION <BACKEND>; sprayer backend with ACCEPTER REST API, STATUS FEEDER and dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL); sms gateway, email gateway, GCM, APNs]

Page 47: Sprayer: low latency, reliable multichannel messaging

sum up

Page 48: Sprayer: low latency, reliable multichannel messaging

design threats

Page 49: Sprayer: low latency, reliable multichannel messaging

design threats

related data in different places: redis, rabbitmq and mongo

we are not transactional; our components remain sane in case of a DB failure, and idempotent operations help here

light implementation of Unit of Work architectural pattern

idempotent operations are a must here
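
A very light take on the Unit of Work idea, assuming every step is idempotent: collect the writes that belong together and, if one store fails, simply run the whole unit again. The class, the step functions and the simulated failure are all hypothetical; sets and dicts stand in for redis and mongo.

```python
class UnitOfWork:
    """Hypothetical light unit of work: idempotent steps, retried as a whole."""

    def __init__(self):
        self.steps = []

    def add(self, step):
        self.steps.append(step)

    def commit(self, retries=3):
        for _ in range(retries):
            try:
                for step in self.steps:
                    step()  # safe to repeat: every step is idempotent
                return True
            except ConnectionError:
                continue  # one store was down; run the whole unit again
        return False


subscribers = set()                         # stand-in for redis
audit = {}                                  # stand-in for mongo
failures = [ConnectionError("mongo down")]  # fail once, then recover


def add_subscriber():
    subscribers.add("device-a")


def record_subscription():
    if failures:
        raise failures.pop()
    audit["device-a"] = "subscribed"


uow = UnitOfWork()
uow.add(add_subscriber)
uow.add(record_subscription)
ok = uow.commit()
```

There is no rollback here: the first step is applied twice, which is harmless only because adding to a set is idempotent.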

Page 50: Sprayer: low latency, reliable multichannel messaging

architecture guidelines

Page 51: Sprayer: low latency, reliable multichannel messaging

architecture guidelines

➔ asynchronous processing / queues everywhere

➔ dedicated dispatchers for each transport

➔ common API interface

➔ used the best tool for each responsibility: polyglot persistence

➔ processes as stateless as possible

Page 52: Sprayer: low latency, reliable multichannel messaging
Page 53: Sprayer: low latency, reliable multichannel messaging

YES, SPRAYER DOES!

thanks for coming