serverless platforms - qcon new york...infrastructure • cluster schedulers that can scale to 10s...

Serverless Platforms Diptanu Choudhury @diptanu 1 QCon New York


Page 1: Serverless Platforms - QCon New York...Infrastructure • Cluster Schedulers that can scale to 10s of 1000s of cores • Storage services that can scale to petabytes of data • Interactive

Serverless Platforms

Diptanu Choudhury@diptanu


QCon New York


The GOES Mission

A lineage of geostationary satellites

Current mission is called GOES-R

ABI views earth with 16 different spectral bands


GOES-R Products

• Atmospheric events

• Fire control

• Vegetation Monitoring

• NASA makes most of the data available to research community


Product Development Workflow

Iterative Experimentation-based Model Development

Deployment of Models in Production and Image Processing

Data serving and Collaboration APIs


The Organization

Science

• Plans mission objectives

• Develops models and products

• Conducts research and drives collaboration

Engineering

• Operates infrastructure to run models

• Designs systems that scale to petabytes of data and 10s of 1000s of cores

• Designs real-time services through which other services consume data

• Dev-Infra to improve productivity of researchers


Infrastructure

• Cluster Schedulers that can scale to 10s of 1000s of cores

• Storage services that can scale to petabytes of data

• Interactive batch processing systems

• Services to stream real time information

• APIs to access data from various transformation jobs


Job Management System

Authorization/Authentication Service

Identity Management

Telemetry System

Compute Node

Compute Node

Object Store

Jobs


Infrastructure Stack circa 2017

Locking and Co-ordination Management

Cluster Schedulers

Block Storage

Service Discovery

Telemetry Services

Object Store

Microservices

Batch Processing

LBaaS

Identity Management

Tracing Services

Applications

The Platform


Platforms as a Service

Locking and Co-ordination Management

Cluster Schedulers

Block Storage

Service Discovery

Telemetry Services

Object Store

Microservices

Batch Processing

LBaaS

Identity Management

Tracing Services

Applications

Release Management

Build Infrastructure

Configuration Management


Serverless Architecture


Function Controller

Requests

Application Logic

Application Container

Payload


Serverless Architecture

• Incremental evolution over PaaS

• Serverless from the perspective of users

• Clear separation of concerns between infrastructure engineering and application developers


Serverless Platform

• Optimistically concurrent Cluster Scheduler

• Application Container

• Function Controller

• RPC Middleware

• Function Invocation shims in data sources

• Packaging and distribution service


Cluster Scheduler

• Low latency and high throughput scheduling

• No head of line blocking

• Optimistically concurrent

• Scale up to a large number of containers


THE PATH TO PRODUCTION FOR A SCHEDULER IS FILLED WITH PAIN

CHOOSE AN EXISTING SCHEDULER IF YOU CAN


Event Driven Scheduler

Evaluations ~= State Change Event


Data Model


External Event

Evaluation Creation

Evaluation Queuing

Evaluation Processing

Optimistic Coordination

State Updates
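The stages listed above can be sketched as a small loop. This is only an illustration of optimistic coordination under stated assumptions, not the scheduler from the talk: `ClusterState`, `Evaluation`, `plan`, `apply_plan`, and `run` are hypothetical names. A plan is computed against a snapshot of the state and is committed only if the state index has not moved since; a rejected evaluation goes back on the queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    index: int = 0                                   # bumped on every committed change
    free_cores: dict = field(default_factory=dict)   # node -> free cores
    placements: dict = field(default_factory=dict)   # job -> node


@dataclass
class Evaluation:
    job: str
    cores: int


def plan(snapshot_index, free_cores, ev):
    """Pick the first node with room, based on a possibly stale snapshot."""
    for node, free in sorted(free_cores.items()):
        if free >= ev.cores:
            return {"index": snapshot_index, "job": ev.job,
                    "node": node, "cores": ev.cores}
    return None


def apply_plan(state, p):
    """Commit only if nothing changed since the snapshot (the optimistic check)."""
    if p["index"] != state.index or state.free_cores[p["node"]] < p["cores"]:
        return False
    state.free_cores[p["node"]] -= p["cores"]
    state.placements[p["job"]] = p["node"]
    state.index += 1
    return True


def run(state, events):
    """External events become evaluations, which are queued and processed."""
    queue = deque(Evaluation(job, cores) for job, cores in events)
    while queue:
        ev = queue.popleft()
        p = plan(state.index, dict(state.free_cores), ev)
        if p is None:
            continue              # no capacity; a real system would retry later
        if not apply_plan(state, p):
            queue.append(ev)      # conflict: re-evaluate against fresh state
    return state
```

With a single worker, as in `run`, plans never conflict. The check matters once several workers plan from the same snapshot: the first commit wins and the others re-plan.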


Templated Jobs

• Can be used for low throughput use cases

• Expensive to start new processes

• Bounded by network operations of scheduler


Variability

• Sharing resources

• Daemons

• Global resources

• Maintenance activities

• Queuing

• Power Limits


Source: The Tail at Scale


Reduce Variability

• Differentiate service classes

• Break down long running functions into small ones.

• Minimize background activity
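One way to read "differentiate service classes" is a queue that always serves latency-sensitive work before batch work. The sketch below assumes two classes, `interactive` and `batch`; the class names and `ClassedQueue` are hypothetical and not taken from the talk.

```python
import heapq
import itertools


class ClassedQueue:
    """Serves interactive requests before batch ones, FIFO within a class."""

    PRIORITY = {"interactive": 0, "batch": 1}   # assumed service classes

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()           # tie-breaker keeps FIFO order

    def push(self, service_class, request):
        entry = (self.PRIORITY[service_class], next(self._seq), request)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Strict priority can starve the batch class under sustained interactive load, so a production queue would likely add aging or per-class quotas.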


Application Containers

• Application containers range from language sandboxes to full-blown containers

• A reduced surface area helps make the UX better

• Application containers include function invocation middleware, which is packaged during build


Function Controller

• Maintains the network topology of functions

• Optimizes for low latency and not cost

• Works with the scheduler to scale containers

• Works with telemetry system to scale underlying clusters
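A rough sizing rule for a controller that favors latency over cost can be derived from Little's law (in-flight requests = arrival rate × latency) plus spare headroom. The function below is an assumption-laden sketch: `desired_containers`, `per_container_concurrency`, and `headroom` are hypothetical names, and the talk does not say which formula the controller actually uses.

```python
import math


def desired_containers(requests_per_sec, mean_latency_sec,
                       per_container_concurrency=1, headroom=0.5):
    """Containers needed to serve the load with spare capacity.

    headroom=0.5 over-provisions by 50%, trading cost for latency.
    """
    in_flight = requests_per_sec * mean_latency_sec       # Little's law
    needed = in_flight / per_container_concurrency
    return math.ceil(needed * (1 + headroom))
```

For example, 100 requests per second at 0.25 s mean latency keeps 25 requests in flight, so with one request per container and 50% headroom the rule asks for 38 containers.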


Software Load Balancer

• Service discovery aware LB

• Reactive Socket protocol for delivery of request payloads to function containers

• Doesn't affect container life cycles

• Drops requests that don't have any containers associated with them
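The routing behavior described above can be sketched with an in-memory registry standing in for service discovery. `FunctionRouter`, `update`, and `route` are hypothetical names, and round-robin is an assumed balancing policy. The router never starts or stops containers; it only picks among the ones it has been told about, and returns `None` when there are none so the caller can drop the request.

```python
import itertools


class FunctionRouter:
    def __init__(self):
        self._registry = {}   # function name -> list of container addresses
        self._cursors = {}    # function name -> round-robin counter

    def update(self, function, containers):
        """Called when service discovery reports a new container set."""
        self._registry[function] = list(containers)
        self._cursors[function] = itertools.count()

    def route(self, function):
        """Return a container address, or None if the request must be dropped."""
        containers = self._registry.get(function)
        if not containers:
            return None
        i = next(self._cursors[function])
        return containers[i % len(containers)]
```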


Reactive Sockets

• RSocket is a means for asynchronous communication between functions and data sources

• Messages are multiplexed over a single connection

• Flexible interaction models

• Provides mechanisms for back pressure and request cancellation
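The back pressure idea can be shown with credit accounting: the consumer grants a number of credits and the producer emits at most that many items. This mirrors the intent of RSocket's REQUEST_N and CANCEL frames but is not the wire protocol or any RSocket library API; `CreditedStream` is a hypothetical, synchronous stand-in.

```python
from collections import deque


class CreditedStream:
    """Producer side of a stream that only emits what the consumer requested."""

    def __init__(self, items):
        self._pending = deque(items)
        self._credits = 0
        self._cancelled = False
        self.delivered = []          # stands in for "sent to the consumer"

    def request(self, n):
        """Consumer grants n more credits; producer drains up to that many."""
        if self._cancelled:
            return
        self._credits += n
        while self._credits > 0 and self._pending:
            self.delivered.append(self._pending.popleft())
            self._credits -= 1

    def cancel(self):
        """Consumer cancels: drop everything not yet delivered."""
        self._cancelled = True
        self._pending.clear()
```

A slow consumer simply requests less, so the producer never buffers unbounded output on its behalf.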


RPC Middleware

• RPC implementation moved into the platform

• Hedged requests to reduce tail latency

• Tied requests to reduce mean and tail latency

• Heavy use of back pressure
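A hedged request sends the same call to a second replica if the first has not answered within a short delay, then takes whichever reply arrives first. The sketch below uses `asyncio` and covers hedging only, not tied requests. `hedged_call`, `hedge_after`, and the `slow` and `fast` replicas are hypothetical names for illustration, and the requests are assumed to be safe to send more than once.

```python
import asyncio


async def hedged_call(replicas, request, hedge_after):
    """Try replicas in order, adding one more every `hedge_after` seconds
    until some reply arrives. Returns the first result, cancels the rest."""
    if not replicas:
        raise ValueError("at least one replica is required")
    tasks = []
    try:
        for replica in replicas:
            tasks.append(asyncio.create_task(replica(request)))
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_after, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return done.pop().result()
        # Every replica has been tried; wait for whichever finishes first.
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        for t in tasks:
            t.cancel()               # no-op for tasks that already finished


# Hypothetical replicas used only to demonstrate the behavior.
async def slow(request):
    await asyncio.sleep(0.5)
    return f"slow:{request}"


async def fast(request):
    await asyncio.sleep(0.01)
    return f"fast:{request}"
```

Calling `asyncio.run(hedged_call([slow, fast], "x", hedge_after=0.05))` hedges to `fast` after 50 ms and returns its reply. Hedging trades a small amount of duplicate work for a shorter tail, so the delay is usually set near a high percentile of expected latency.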


Cluster Topology

• Scheduler chooses the cluster topology for containers

• Function Controller can change the topology based on real time latency

• Chained functions need to be placed as close to each other as possible
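Placing a chained function "close" to its caller can be approximated by choosing, among nodes with free capacity, the one with the lowest measured latency to the upstream node. `place_next`, `latency_ms`, and `free_slots` are hypothetical names; the talk does not describe the actual placement algorithm.

```python
def place_next(upstream_node, latency_ms, free_slots):
    """Pick the node nearest to `upstream_node` that still has a free slot.

    latency_ms[a][b] is the measured latency from node a to node b.
    Returns None when no node has capacity.
    """
    candidates = [node for node, free in free_slots.items() if free > 0]
    if not candidates:
        return None
    # Ties are broken by node name so the choice is deterministic.
    return min(candidates, key=lambda n: (latency_ms[upstream_node][n], n))
```

Re-running the same selection with fresh latency measurements is one way a controller could adjust the topology over time.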


Caching

• Various forms of caching

• Depends on life cycle management of functions

• External caches like Redis and Memcached work best

• Hard to use in-memory clustered caches like Groupcache


External Data Sources

• Functions can read from external data sources just like traditional applications

• Avoid polling external data sources

• Data sources like Kafka can cooperate with functions through a shim that invokes functions with data mutations
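The push model described above, where the data source calls the function instead of the function polling, can be sketched with an in-memory log. `MutationLog` and `attach_shim` are hypothetical names; this is a stand-in for a data source, not a Kafka client or Kafka API.

```python
class MutationLog:
    """Stand-in data source that pushes each mutation to attached shims."""

    def __init__(self):
        self._records = []
        self._shims = []

    def attach_shim(self, invoke):
        """Register a callable that forwards mutations to a function."""
        self._shims.append(invoke)

    def append(self, record):
        """Store the record, then invoke every shim with the mutation."""
        self._records.append(record)
        offset = len(self._records) - 1
        for invoke in self._shims:
            invoke({"offset": offset, "value": record})
```

The function only runs when there is new data, so no container has to stay alive just to poll.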


Build and Distribution

• Packaging is part of the serverless experience

• Packaging pipeline should transform a function to an application container.

• Unit testing should be much easier with functions than it was for traditional applications

• Performance of the platform can be improved by caching containers on compute nodes
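Caching containers on a compute node can be sketched as a bounded least-recently-used cache in front of the distribution service. `ImageCache` and its `fetch` callable are hypothetical names, and LRU is an assumed eviction policy that the talk does not specify.

```python
from collections import OrderedDict


class ImageCache:
    """Keeps the most recently used container images on a node."""

    def __init__(self, capacity, fetch):
        self._capacity = capacity
        self._fetch = fetch              # callable: image name -> image handle
        self._images = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, name):
        if name in self._images:
            self._images.move_to_end(name)       # mark as recently used
            self.hits += 1
            return self._images[name]
        self.misses += 1
        image = self._fetch(name)                # slow path: pull the image
        self._images[name] = image
        if len(self._images) > self._capacity:
            self._images.popitem(last=False)     # evict least recently used
        return image
```

Every hit avoids a pull from the distribution service, which is where the cold-start saving comes from.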


Operations in the world of Serverless

• Reduced amount of operations for developers

• Metrics related to various concerns of the platform should be very targeted.


Application metrics

• Latency at the API Gateway level

• Throughput of events processed

• Latency distribution and planning.
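Tracking a latency distribution, rather than a mean, usually comes down to reporting percentiles. The sketch below uses the nearest-rank method on raw samples; `percentile` and `summarize` are hypothetical helper names, and a real telemetry system would more likely use histograms or sketches than sort every sample.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100     # ceil(p/100 * n) in integers
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Median and tail percentiles of a batch of latency samples."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

With samples `[10, 20, 30, 400]` the median is 20 ms while p99 is 400 ms, which is the kind of gap a mean would hide.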


Platform Metrics

• Latency of function invocation

• Throughput of events dispatched to the platform

• Number of active functions