DevOps in the Amazon Cloud – Learn from the Pioneers: Netflix Suro

Netflix Data Pipeline Sudhir Tonse (@stonse) Danny Yuan (@g9yuayon)

Upload: gaurav-gp-pal

Post on 15-Jan-2015

Category: Technology

DESCRIPTION

DevOps helps accelerate the delivery of software applications through automation and by removing the silos between Development and Operations. The Netflix Platform Engineering team has developed a robust data pipeline solution called SURO, which has been open sourced. Learn from the experience of pioneers like Netflix and how they use the data pipeline for new and innovative use cases. This presentation, by Danny Yuan of the Netflix Platform Engineering team, covers the operational and monitoring aspects of applications on cloud platforms.

TRANSCRIPT

Page 1: DevOps in the Amazon Cloud – Learn from the Pioneers: Netflix Suro

Netflix Data Pipeline

Sudhir Tonse (@stonse), Danny Yuan (@g9yuayon)

Page 2:

photo credit: http://www.flickr.com/photos/decade_null/142235888/sizes/o/in/photostream/

"Netflix is a log generating company that also happens to stream movies"

- Adrian Cockcroft

Page 3:

Data is the most important asset at Netflix

Page 4:

If all the data is easily available to all teams, it can be leveraged in new and exciting ways

Pages 5-8:

~1000 Device Types

~500 Apps/Web Services

~100 Billion Events/Day
• 3.2M messages per second at peak time
• 3GB per second at peak time

Dashboard

Page 9:

Types of Events

• User Interface Events
  • Search Event (‘Matrix’ using PS3 …)
  • Star Rating Event (HoC: 5 stars, Xbox, US, …)

• Infrastructural Events
  • RPC Call (API -> Billing Service, ‘/bill/..’, 200, …)
  • Log Errors (NPE, “Movie is null”, …, …)

• Other Events …

Page 10:

Making Sense of Billions of Events

Page 11:

http://netflix.github.io +

ElasticSearch, Druid

Page 13:

A Humble Beginning

Page 17:

Evolution … Scale!

Page 20:

[Diagram: many Application instances]

Page 21:

We Want to Process App Data in Hadoop

Page 27:

Our Hadoop Ecosystem

Page 28:

@NetflixOSS Big Data Tools

Page 29:

Hadoop as a Service

Page 30:

Pig Scripting on Steroids

Page 31:

Pig Married to Clojure

Page 32:

S3MPER

S3mper is a library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index.

Page 33:

Efficient ETL with Cassandra

Cassandra

Page 34:

Offline Analysis

Page 35:

Evolution … Speed!

Page 36:

hgrep -C 10 -k 5,2,3 'users.*[1-9]{3}' *catalina.out s3://bucket

Page 42:

We Want to Aggregate, Index, and Query Data in Real Time

Page 43:

Interactive Exploration

Page 44:

Let’s walk through some use cases

Page 45:

client activity event

*/name = “movieStarts”

Pages 46-49:

Pipeline Challenges

• App owners: send and forget

• Data scientists: validation, ETL, batch processing

• DevOps: stream processing, targeted search

Page 50:

Message Routing

Page 53:

We Want to Consume Data Selectively in Different Ways
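
Selective consumption of this kind can be pictured as a router that checks each event against per-destination filters, in the spirit of the `name = "movieStarts"` filter on page 45. The following is only a minimal sketch under stated assumptions: the class `RoutingSketch`, the sink names "hadoop" and "search", and the map-based event shape are all hypothetical and are not Suro's actual API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of selective message routing; not Suro's actual API.
class RoutingSketch {
    // Sink name -> filter deciding whether that sink receives an event.
    private final Map<String, Predicate<Map<String, String>>> routes = new LinkedHashMap<>();

    void addRoute(String sink, Predicate<Map<String, String>> filter) {
        routes.put(sink, filter);
    }

    // Returns the sinks whose filter accepts the event, in registration order.
    List<String> route(Map<String, String> event) {
        List<String> sinks = new ArrayList<>();
        for (Map.Entry<String, Predicate<Map<String, String>>> r : routes.entrySet()) {
            if (r.getValue().test(event)) {
                sinks.add(r.getKey());
            }
        }
        return sinks;
    }

    // Example wiring (assumed): every event goes to the batch sink,
    // only "movieStarts" events also go to the search sink.
    static RoutingSketch example() {
        RoutingSketch router = new RoutingSketch();
        router.addRoute("hadoop", e -> true);
        router.addRoute("search", e -> "movieStarts".equals(e.get("name")));
        return router;
    }

    public static void main(String[] args) {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("name", "movieStarts");
        System.out.println(example().route(event)); // prints [hadoop, search]
    }
}
```

Keeping the filters next to the sinks means producers can stay "send and forget" while each consumer decides what it wants to see.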

Page 55:

• Message broker

• High-throughput

• Persistent and replicated

Page 56:

There Is More

Pages 57-58:

Intelligent Alerts

Pages 59-66:

Guided Debugging in the Right Context

Pages 67-71:

What We Need

• Ad-hoc query with different dimensions

• Quick aggregations and Top-N queries

• Time series with flexible filters

• Quick access to raw data using boolean queries
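
To make the "Top-N queries" requirement concrete, here is a small in-memory sketch of a Top-N aggregation over a single dimension. It is illustrative only: the class `TopNSketch` and the device values are hypothetical, and a real analytics store answers such queries over indexed, distributed data rather than a list in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory Top-N over one dimension; hypothetical names throughout.
class TopNSketch {
    // Counts occurrences of each value and returns the n most frequent values,
    // highest count first; ties are broken alphabetically so the result is deterministic.
    static List<String> topN(List<String> values, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        List<String> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return keys.subList(0, Math.min(n, keys.size()));
    }

    public static void main(String[] args) {
        // Hypothetical device-type dimension taken from a batch of events.
        List<String> devices = Arrays.asList("PS3", "Xbox", "PS3", "iPad", "Xbox", "PS3");
        System.out.println(topN(devices, 2)); // prints [PS3, Xbox]
    }
}
```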

Page 72:

Druid

• Rapid exploration of high dimensional data

• Fast ingestion and querying

• Time series

Page 73:

ElasticSearch

• Real-time indexing of event streams

• Killer feature: boolean search

• Great UI: Kibana

Page 74:

The Old Pipeline

Page 75:

The New Pipeline

Page 77:

There Is More

Page 78:

It’s Not All About Counters and Time Series

Page 80:

RequestId    Parent Id   Node Id   Service Name   Status
4965-4a74    0           123       Edge Service   200
4965-4a74    123         456       Gateway        200
4965-4a74    456         789       Service A      200
4965-4a74e   456         abc       Service B      200

Status:200
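
Rows like those in the table can be stitched back into a call tree by linking each row's Parent Id to another row's Node Id. The sketch below is one hedged way to do that, assuming the root has Parent Id "0" and the data contains no cycles; the class `TraceTreeSketch` and its `Span` type are hypothetical names, not part of any Netflix tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rebuild a call tree from trace rows like those on the slide.
class TraceTreeSketch {
    static final class Span {
        final String parentId;
        final String nodeId;
        final String service;

        Span(String parentId, String nodeId, String service) {
            this.parentId = parentId;
            this.nodeId = nodeId;
            this.service = service;
        }
    }

    // Renders the tree rooted at parent id "0", one service per line, indented by call depth.
    static String render(List<Span> spans) {
        Map<String, List<Span>> children = new HashMap<>();
        for (Span s : spans) {
            children.computeIfAbsent(s.parentId, k -> new ArrayList<>()).add(s);
        }
        StringBuilder out = new StringBuilder();
        append(children, "0", 0, out);
        return out.toString();
    }

    // Depth-first walk; assumes parent/child links form a tree (no cycles).
    private static void append(Map<String, List<Span>> children, String parentId,
                               int depth, StringBuilder out) {
        for (Span s : children.getOrDefault(parentId, new ArrayList<>())) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(s.service).append('\n');
            append(children, s.nodeId, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
            new Span("0", "123", "Edge Service"),
            new Span("123", "456", "Gateway"),
            new Span("456", "789", "Service A"),
            new Span("456", "abc", "Service B"));
        System.out.print(render(spans));
    }
}
```

Run against the four rows above, this prints Edge Service at the top, Gateway indented beneath it, and Service A and Service B as siblings under Gateway.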

Pages 81-86:

Distributed Tracing

Page 87:

A System that Supports All These

Page 88:

A Data Pipeline To Glue Them All

Page 89:

Make It Simple

Pages 90-91:

Message Producing

• Simple and Uniform API

• messageBus.publish(event)

Page 92:

Consumption Is Simple Too

    consumer.observe().subscribe(new Subscriber<Ackable<IncomingMessage>>() {
        @Override
        public void onNext(Ackable<IncomingMessage> ackable) {
            // Process the typed payload, then acknowledge the message.
            process(ackable.getEntity(MyEventType.class));
            ackable.ack();
        }
        // onCompleted() and onError() omitted for brevity
    });

    consumer.pause();
    consumer.resume();

Page 93:

RxJava

• Functional reactive programming model

• Powerful streaming API

• Separation of logic and threading model

Pages 94-98:

Design Decisions

• Top Priority: app stability and throughput

• Asynchronous operations

• Aggressive buffering

• Drops messages if necessary
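
These design decisions can be sketched as a publisher that buffers in a bounded queue and never blocks the caller, dropping and counting messages when the buffer is full. This is a minimal illustration only: the class `DroppingPublisherSketch` and its methods are hypothetical and do not reproduce Suro's client.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a non-blocking publisher; not Suro's actual client.
class DroppingPublisherSketch {
    private final BlockingQueue<String> buffer;
    private final AtomicLong dropped = new AtomicLong();

    DroppingPublisherSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    // Never blocks the calling application thread: if the buffer is full the
    // message is dropped and counted, so the app stays stable under back-pressure.
    boolean publish(String message) {
        boolean accepted = buffer.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Intended for a background sender thread; returns null when nothing is buffered.
    String poll() {
        return buffer.poll();
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        DroppingPublisherSketch publisher = new DroppingPublisherSketch(2);
        publisher.publish("event-1");
        publisher.publish("event-2");
        publisher.publish("event-3"); // buffer is full, so this one is dropped
        System.out.println("dropped = " + publisher.droppedCount()); // prints dropped = 1
    }
}
```

Exposing the drop counter matters in practice: trading completeness for application stability is only acceptable if the loss is visible to operators.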

Page 99:

Anything Can Fail

Page 100:

Cloud Resiliency

Pages 101-104:

Fault Tolerance Features

• Write and forward with auto-reattached EBS (Amazon’s Elastic Block Store)

• Disk-backed queue: big-queue

• Customized scaling down
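
The write-and-forward idea behind a disk-backed queue can be sketched as follows: persist each message to a file first, and let a forwarder drain the file later, so messages survive a process restart as long as the volume does. This is a deliberately naive illustration; the class `DiskQueueSketch` is hypothetical, and the big-queue library itself is implemented very differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

// Hypothetical write-and-forward sketch; big-queue itself is implemented very differently.
class DiskQueueSketch {
    private final Path file;

    DiskQueueSketch(Path file) {
        this.file = file;
    }

    // Persists the message (assumed to contain no newline) before anything is forwarded.
    synchronized void enqueue(String message) throws IOException {
        Files.write(file, Collections.singletonList(message), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Returns every persisted message in order and clears the file.
    // A real forwarder would only clear after the downstream sink confirms receipt.
    synchronized List<String> drain() throws IOException {
        if (!Files.exists(file)) {
            return Collections.emptyList();
        }
        List<String> messages = Files.readAllLines(file, StandardCharsets.UTF_8);
        Files.write(file, new byte[0],
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        return messages;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("disk-queue", ".log");
        new DiskQueueSketch(file).enqueue("event-1");
        // A fresh instance stands in for a restarted process reading the same volume.
        System.out.println(new DiskQueueSketch(file).drain()); // prints [event-1]
        Files.deleteIfExists(file);
    }
}
```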

Page 106:

There’s More to Do

• Contribute to @NetflixOSS

• Join us :-)

Page 107:

Summary

http://netflix.github.io +

ElasticSearch, Druid

Page 108:

You can build your own web-scale data pipeline using open source components

Page 109:

Thank You!

Sudhir Tonse http://www.linkedin.com/in/sudhirtonse Twitter: @stonse

Danny Yuan http://www.linkedin.com/pub/danny-yuan/4/374/862 Twitter: @g9yuayon