How Netflix Directs 1/3rd of Internet Traffic
Haley Tucker, Mohit Vora
QCon San Francisco, Nov 16, 2015
Playback Overview
DATA PLANE (CDN)
CONTROL PLANE
STREAM
NETFLIX DEVICE
Project 366 #59; 280212 Days Gone By..., CC BY-SA, Pete 2012, Flickr
AUDIO
VIDEO
TEXT
STREAMS
How do we build a streaming “tape”?
Determine the preferred experience
DEVICE
TITLE
CONNECTIONS
COUNTRY
NETWORK
Broadband - wired or wifi
Cellular - Edge, 3G, LTE, ...
CUSTOMER
That’s exactly what I want... now where can I get it?
Point the device to appropriate locations
Steering
GENERATE PLAYBACK MANIFEST
PLAYBACK MANIFEST
Uh-oh, the content is encrypted!
Keymaster, CC BY-SA, Sean McGrath 2007, Flickr
LICENSE
And...Action!
SESSION (START, STOP, PAUSE, RESUME, KEEPALIVE)
SESSION EVENTS
LICENSE
PLAYBACK MANIFEST
GENERATE PLAYBACK MANIFEST
SESSION (START, STOP, PAUSE, RESUME, KEEPALIVE)
PLAYBACK LIFECYCLE
Data Plane (CDN)
What is a Content Delivery Network?
Open Connect
A NETFLIX ORIGINAL
[Chart: BYTES STREAMED (y-axis) vs. CONTENT RANK (x-axis)]
PREDICTABLE VIEWING PATTERNS
FILLING WHEN YOU SLEEP
Dreaming…, CC BY-SA, Eleni Boulsaiki 2009, Flickr
Open Connect
A NETFLIX ORIGINAL
READ XOR WRITE
ONE WAY, CC BY-SA, Kenny Louie 2010, Flickr
Content Delivery Mechanisms
DATA PLANE (CDN)
CONTROL PLANE
STREAM
NETFLIX DEVICE
STREAM
ISP DATA CENTER
ISP ROUTER
NETFLIX DEVICE
ISP CO-LOCATION
STREAM
ISP DATA CENTER
NETFLIX DEVICE
IXP DATA CENTER
NFLX ROUTER
ISP ROUTER
ISP ROUTER
IXP INTERCONNECTION
NETFLIX
Control Plane
OPEN CONNECT
STREAM
NETFLIX DEVICE
CDN CONTROL PLANE
DEVICE CONTROL PLANE
DON’T KEEP SECRETS
Network Proximity
Content Positioning
Load Distribution
Network Proximity
Social Network in a Course, CC BY-SA, Hans Põldoja 2010, Flickr
By Specification?
Doesn’t scale
Border Gateway Protocol
BGP ROUTE 175.231.128.0/24 (+ proximity attributes)
TAKEAWAY Use BGP
ISP2 DATA CENTER
ISP2 BGP ROUTES
CONTROL PLANE
IXP DATA CENTER
ISP1 BGP ROUTES
ISP1 DATA CENTER
ISP1
NFLX
BGP ROUTE 175.231.128.0/24 (+ proximity attributes)
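The slides say that network proximity is learned from the BGP routes each location receives, together with proximity attributes. As a minimal sketch of how a control plane might use that data, the Python below matches a client IP against advertised prefixes and orders the candidate locations. The `Route` fields, the use of AS-path length as the proximity attribute, and the location names are assumptions for illustration; real BGP best-path selection involves more criteria than this.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # advertised prefix, e.g. "175.231.128.0/24"
    location: str     # hypothetical name of the site that learned the route
    as_path_len: int  # assumed proximity attribute: shorter means closer


def rank_locations(client_ip, routes):
    """Return locations ordered from most to least preferred for client_ip."""
    ip = ipaddress.ip_address(client_ip)
    matches = []
    for route in routes:
        network = ipaddress.ip_network(route.prefix)
        if ip in network:
            matches.append((network, route))
    # Most specific prefix first, then the shortest AS path as a tie-breaker.
    matches.sort(key=lambda m: (-m[0].prefixlen, m[1].as_path_len))
    return [route.location for _, route in matches]


routes = [
    Route("175.231.128.0/24", "isp1-data-center", 1),
    Route("175.231.128.0/24", "ixp-data-center", 2),
    Route("175.231.0.0/16", "isp2-data-center", 3),
]
print(rank_locations("175.231.128.7", routes))
# ['isp1-data-center', 'ixp-data-center', 'isp2-data-center']
```

Because the ordering is derived from what the ISPs advertise, nobody has to maintain a hand-written mapping of client networks to locations, which is the "doesn't scale" problem the slides mention.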
Content Positioning
LOCALIZE TRAFFIC
ISP DATA CENTER
SERVE CACHE MISS
HOW DO WE DETERMINE WHAT CONTENT WILL BE POPULAR TOMORROW?
CHANGING CATALOG
EVOLVING MEMBER TASTES
MINIMIZE FILL CHURN
ISP DATA CENTER
OFF PEAK FILL
USE HISTORICAL DATA
[Chart: BYTES STREAMED (y-axis) vs. CONTENT RANK (x-axis)]
bytesStreamed/bytesStored
IS ONE DAY OF HISTORY ENOUGH?
EXPONENTIALLY WEIGHTED MOVING AVERAGE
[Chart: WEIGHT (y-axis) vs. DAYS AGO (x-axis, 0 to 40); … = 0.9]
TAKEAWAY Weigh Recent Data Higher
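A minimal Python sketch of the exponentially weighted moving average, applied to the bytesStreamed/bytesStored popularity measure from the earlier slide. It assumes the 0.9 on the chart is a per-day decay factor, so a value from k days ago gets weight 0.9**k; the slide does not spell this out. The function names and the sample numbers are illustrative.

```python
def popularity(bytes_streamed, bytes_stored):
    """Popularity measure from the slides: bytes streamed per byte stored."""
    return bytes_streamed / bytes_stored


def ewma(daily_values, decay=0.9):
    """Weighted mean in which the value from k days ago has weight decay**k.

    daily_values[0] is the most recent day, daily_values[k] is k days ago.
    """
    if not daily_values:
        raise ValueError("need at least one day of history")
    weights = [decay ** k for k in range(len(daily_values))]
    weighted_sum = sum(w * v for w, v in zip(weights, daily_values))
    return weighted_sum / sum(weights)


# Hypothetical title: a 4 GB file that was streamed heavily only yesterday.
history = [popularity(s, 4e9) for s in (8e9, 2e9, 2e9)]
print(round(ewma(history), 3))
```

A single day of history would rank this title purely on its spike; the weighted average still favours the recent day but lets older days pull the score back, which helps keep fill churn down.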
HOW SHOULD CONTENT BE ALLOCATED?
MILLIONS OF FILES
THOUSANDS OF SERVERS
HOW SHOULD CONTENT BE ALLOCATED?
SVR1
SVR2
SVR3
SVR4
FILE1
FILE3
FILE1
ALLOCATE MULTIPLE REPLICAS
RESILIENT TO CLUSTER CHANGES
REPEATABLE
TAKEAWAY Consistent Hashing
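The three requirements above (multiple replicas, resilience to cluster changes, repeatability) can be illustrated with a small consistent-hash ring in Python. This is a generic textbook sketch, not the Open Connect implementation: the MD5-based hash, the number of virtual nodes, and the `HashRing` name are all assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server owns many points so load spreads evenly around the ring.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._server_count = len(set(servers))

    def servers_for(self, file_id, replicas=2):
        """Walk clockwise from the file's position, collecting distinct servers."""
        wanted = min(replicas, self._server_count)
        chosen = []
        idx = bisect.bisect(self._points, _hash(file_id))
        while len(chosen) < wanted:
            server = self._ring[idx % len(self._ring)][1]
            if server not in chosen:
                chosen.append(server)
            idx += 1
        return chosen


ring = HashRing(["SVR1", "SVR2", "SVR3", "SVR4"])
for name in ("FILE1", "FILE3"):
    print(name, ring.servers_for(name))
```

The placement depends only on the file name and the server list, so any node can recompute it. When a server leaves, only the files that had a replica on that server move.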
ISP2 DATA CENTER
WHAT TO FILL?
CONTROL PLANE
IXP DATA CENTER
WHERE TO FILL FROM?
ISP1 DATA CENTER
S3
FILL OVER HTTP
Load Distribution
[Chart: BYTES STREAMED (y-axis) vs. CONTENT RANK (x-axis)]
LOTS OF THROUGHPUT
LOTS OF STORAGE
CONTENT WITH CONFLICTING CONSTRAINTS
SSD BASED
SPINNING DISK BASED
WITHIN CLUSTERS ON EACH SERVER
[Chart: BYTES STREAMED (y-axis) vs. CONTENT RANK (x-axis), divided into MEMORY, SSD, and SPINNING DISK tiers]
TAKEAWAY Tier Infrastructure
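One way to picture the tiering takeaway is a greedy placement: rank content by popularity, fill the fastest tier first, and spill the rest to slower, larger tiers. The Python below is a sketch under that assumption; the tier capacities, file sizes, and popularity scores are made up, and the slides do not describe the actual placement algorithm.

```python
def assign_tiers(files, tiers):
    """Place the most popular files on the fastest tier that still has room.

    files: list of (name, size, popularity) tuples.
    tiers: list of (tier_name, capacity) tuples, fastest first.
    Returns {name: tier_name}; files that fit nowhere are left out.
    """
    remaining = [[tier_name, capacity] for tier_name, capacity in tiers]
    placement = {}
    for name, size, _popularity in sorted(files, key=lambda f: f[2], reverse=True):
        for tier in remaining:
            if size <= tier[1]:
                tier[1] -= size
                placement[name] = tier[0]
                break
    return placement


files = [("niche", 4, 0.2), ("hit", 4, 9.0), ("steady", 4, 3.0)]
tiers = [("memory", 4), ("ssd", 4), ("spinning_disk", 100)]
print(assign_tiers(files, tiers))
# {'hit': 'memory', 'steady': 'ssd', 'niche': 'spinning_disk'}
```

The small, fast tier absorbs the head of the popularity curve, where throughput matters, while the long tail sits on the tier with the most storage.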
ACROSS SERVERS WITHIN CLUSTERS
BALANCE BALANCE
ACROSS EQUIDISTANT CLUSTERS
HOW DO WE BALANCE LOAD?
OPEN CONNECT
NETFLIX DEVICE
CDN CONTROL PLANE
DEVICE CONTROL PLANE
LOAD BALANCER
STREAM
USING CONTENT DISTRIBUTION
HOW DO WE BALANCE LOAD?
AND WHEN WE HAVE EQUALLY ATTRACTIVE LOCATIONS TO SERVE FROM: FLIP A COIN
[Chart: SYSTEM METRICS (y-axis) vs. INCIDENT LOAD (x-axis), with a MAX level and SANE and INSANE regions]
HOW DO WE LOAD SERVERS OPTIMALLY?
… AMIDST EVER CHANGING INTERNET WEATHER
[Chart: TRAFFIC vs. time (t)]
… AND DAILY TRAFFIC EBBS AND FLOWS
+ SERVE STREAMS
FEEDBACK
- TRAFFIC EFFECT ON SYSTEM METRICS
CONTROL
WE INTRODUCE A FEEDBACK LOOP
TAKEAWAY PID CONTROLLER
                    DC MOTOR         LOADING SERVERS
Process Variable    Current RPM      System Metrics
Set Point           Desired RPM      System Metrics Max
Control Variable    Input Voltage    Controlled Traffic
ISP2 DATA CENTER
CONTROL TO 80%
CONTROL PLANE
IXP DATA CENTER
NO CONTROL
ISP1 DATA CENTER
0.0 < CONTROL VAR < 1.0
[Chart: TRAFFIC vs. time (t)]
NEXT HOP
TRAFFIC SHIFTS TO NEXT HOP LOCATION
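The feedback loop can be sketched as a textbook discrete PID controller in Python. The mapping follows the slides: the process variable is a system metric, the set point is its target level (0.8 here, echoing "CONTROL TO 80%"), and the control variable is the fraction of traffic the location accepts, kept between 0.0 and 1.0. The gains, the toy model in which utilisation equals offered load times accepted fraction, and all names are assumptions; nothing here is tuned for a real server.

```python
class PIDController:
    """Textbook discrete PID loop; gains and limits are illustrative."""

    def __init__(self, kp, ki, kd, set_point, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.set_point = set_point
        self.out_min, self.out_max = out_min, out_max
        self._integral = 0.0
        self._prev_error = None

    def update(self, process_variable, dt=1.0):
        error = self.set_point - process_variable
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        output = self.kp * error + self.ki * self._integral + self.kd * derivative
        # Keep the control variable inside 0.0..1.0, as on the slide.
        return max(self.out_min, min(self.out_max, output))


def simulate(offered_load, steps=100):
    """Toy plant: utilisation = offered load x fraction of traffic accepted."""
    pid = PIDController(kp=0.2, ki=0.3, kd=0.0, set_point=0.8)
    control_var = 1.0
    utilisation = offered_load * control_var
    for _ in range(steps):
        control_var = pid.update(utilisation)
        utilisation = offered_load * control_var
    return control_var, utilisation


control_var, utilisation = simulate(offered_load=1.5)
print(round(control_var, 3), round(utilisation, 3))
```

With an offered load of 1.5 times capacity, the loop settles on accepting roughly 53% of the traffic so that utilisation sits at the 0.8 target; the remainder is what would shift to the next-hop location. Under light load the control variable saturates at 1.0 and nothing is shed.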
Steering
STREAM
NETFLIX DEVICE
PLAYBACK SERVICES
CDN CONTROL PLANE
STEERING
Got URLs for f1, f2, …, fn?
Yes, here are the URLs
PROXIMITY
HEALTH
CONTENT
CASS
KAFKA
OPEN CONNECT
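The steering exchange ("Got URLs for f1, f2, …, fn?") combines three inputs named on the slide: proximity, health, and content. A minimal Python sketch of that combination follows. The `Cluster` fields, the integer proximity rank, and the URL pattern on the reserved `.invalid` domain are hypothetical stand-ins, not the real steering service's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    proximity_rank: int  # lower is closer to the client (assumed input)
    healthy: bool
    files: set = field(default_factory=set)


def steer(file_ids, clusters):
    """Return, per file, URLs on healthy clusters that hold it, closest first."""
    candidates = sorted(
        (c for c in clusters if c.healthy), key=lambda c: c.proximity_rank
    )
    return {
        file_id: [
            f"https://{c.name}.example.invalid/{file_id}"
            for c in candidates
            if file_id in c.files
        ]
        for file_id in file_ids
    }


clusters = [
    Cluster("ixp", 2, True, {"f1", "f2"}),
    Cluster("isp1", 1, True, {"f1"}),
    Cluster("isp2", 0, False, {"f1", "f2"}),
]
print(steer(["f1", "f2"], clusters))
```

Returning an ordered list per file gives the device a fallback location if the first URL stops performing.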
Architecture Evolution
5 CHALLENGES
API
STEERING
SESSION
MANIFEST
DRM LICENSE
How did we evolve from here...
API
STEERING
SESSION
MANIFEST
DRM LICENSE
CLIENT SCRIPTS
SERVICE LAYER
RULES
INSIGHTS
...to here.
5 SOLUTIONS
CACHE
DEVICE
CUSTOMER
TITLE
NETWORK
Broadband - wired or wifi
Cellular - Edge, 3G, LTE, ...
CONNECTIONS
COUNTRY
CHALLENGE High dimensionality
How can we quickly alter the playback experience in a targeted manner?
USE CASE Stream Filtering
ALL STREAMS FOR CONTENT
RULES
ENGINE
BEST STREAMS FOR SESSION
EXAMPLE RULES
ENGINE
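The example rules on this slide did not survive extraction, so the Python below is only a sketch of the shape of the idea: a rule has a condition over the session's dimensions (device, country, network, and so on) and a filter over streams, and the engine narrows all streams for the content down to the best streams for the session. The `Stream`, `Session`, and `Rule` types and the sample rule are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Stream:
    kind: str          # "audio", "video" or "text"
    bitrate_kbps: int


@dataclass(frozen=True)
class Session:
    device: str
    country: str
    network: str


@dataclass(frozen=True)
class Rule:
    applies: Callable  # Session -> bool: does this rule target the session?
    keep: Callable     # Stream -> bool: which streams survive the rule?


def best_streams(all_streams, session, rules):
    """Apply every matching rule in order, narrowing the candidate streams."""
    streams = list(all_streams)
    for rule in rules:
        if rule.applies(session):
            streams = [s for s in streams if rule.keep(s)]
    return streams


# Hypothetical rule: on cellular networks, drop video above 1750 kbps.
rules = [
    Rule(
        applies=lambda session: session.network == "cellular",
        keep=lambda s: s.kind != "video" or s.bitrate_kbps <= 1750,
    )
]
catalog = [Stream("video", 3000), Stream("video", 1050), Stream("audio", 96)]
print(best_streams(catalog, Session("iphone", "US", "cellular"), rules))
```

Because rules are data rather than code paths, a targeted change (one device family in one country, for example) does not require a redeploy of the service that evaluates them.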
CONFIGURATION MANAGEMENT UI
UPDATING RULES
TOPIC
PUBLISH
RULES
SUBSCRIBE
TAKEAWAY Dynamic Business Rules
API
STEERING
SESSION
MANIFEST
DRM LICENSE
RULES
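The rule-update flow (configuration UI publishes to a topic, services subscribe) can be sketched with an in-process stand-in for the topic. In the Python below, `Topic`, `RuleStore`, the message shape, and the version check are all assumptions; the slides do not say which messaging system or payload format is used.

```python
class Topic:
    """In-process stand-in for a publish/subscribe topic."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


class RuleStore:
    """Rules a service currently applies; replaced whole on each update."""

    def __init__(self, topic):
        self.version = 0
        self.rules = {}
        topic.subscribe(self._on_update)

    def _on_update(self, message):
        # Ignore stale or duplicate deliveries (assumed versioning scheme).
        if message["version"] > self.version:
            self.version = message["version"]
            self.rules = dict(message["rules"])


topic = Topic()
manifest_rules = RuleStore(topic)
steering_rules = RuleStore(topic)
topic.publish({"version": 1, "rules": {"max_cellular_kbps": 1750}})
print(manifest_rules.rules, steering_rules.rules)
```

Every subscribed service picks up the same rule set without a deployment, which is what makes the rules "dynamic".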
CHALLENGE Pinpoint what is broken
Haystacks, CC BY-SA, John Pavelka 2008, Flickr
3:00 AM: Pager goes off
METRICS AND ALERTING
OK...error code 105 is elevated. But why?
Indexed Logging
TAKEAWAY Detailed Domain Insights
API
STEERING
SESSION
MANIFEST
DRM LICENSE
RULES
INSIGHTS
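To make the "error code 105 is elevated, but why?" question concrete, here is a small Python sketch of indexed logging: each event is indexed by every field, so an on-call engineer can filter on one field and break the matches down by another. The `LogIndex` class, the field names, and the sample events are invented for illustration; the slides do not name the logging system.

```python
from collections import Counter, defaultdict


class LogIndex:
    def __init__(self):
        self._events = []
        self._index = defaultdict(set)  # (field, value) -> positions in _events

    def add(self, event):
        position = len(self._events)
        self._events.append(event)
        for key, value in event.items():
            self._index[(key, value)].add(position)

    def find(self, **criteria):
        """Return events matching every field=value pair, in insertion order."""
        sets = [self._index.get(item, set()) for item in criteria.items()]
        hits = set.intersection(*sets) if sets else set(range(len(self._events)))
        return [self._events[i] for i in sorted(hits)]

    def breakdown(self, by, **criteria):
        """Count matching events per value of the field named by `by`."""
        return Counter(event.get(by) for event in self.find(**criteria))


logs = LogIndex()
logs.add({"error_code": 105, "device": "xbox", "country": "BR"})
logs.add({"error_code": 105, "device": "xbox", "country": "US"})
logs.add({"error_code": 105, "device": "iphone", "country": "US"})
logs.add({"error_code": 200, "device": "xbox", "country": "US"})
print(logs.breakdown("device", error_code=105).most_common(1))
# [('xbox', 2)]
```

A metric alone says that error 105 went up; the indexed events say which device, country, or title is driving it.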
CHALLENGE Large amount of state
How can we enable faster UIs and low-end devices?
We introduced a server-side caching tier
MANIFESTS
CUSTOMER A
CUSTOMER A
CUSTOMER B
Watch out for resiliency issues!!
Ping Pong project, CC BY-SA, Michael Knowles 2008, Flickr
API
STEERING
SESSION
MANIFEST
DRM LICENSE
RULES
INSIGHTS
TAKEAWAY Reduce client state
CACHE
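A rough Python sketch of the server-side caching idea: manifests are kept per customer on the server so the device does not have to hold and resend that state. The `ManifestCache` class, the (customer, title) key, the time-to-live, and the injected clock are assumptions. On a miss or an expired entry the manifest is simply regenerated, which is one simple way to keep playback working when the cache cannot help.

```python
import time


class ManifestCache:
    def __init__(self, generate, ttl_seconds=300.0, clock=time.monotonic):
        self._generate = generate  # callable(customer_id, title_id) -> manifest
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # (customer_id, title_id) -> (stored_at, manifest)

    def get(self, customer_id, title_id):
        key = (customer_id, title_id)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Miss or stale entry: regenerate rather than fail the request.
        manifest = self._generate(customer_id, title_id)
        self._entries[key] = (now, manifest)
        return manifest


def generate_manifest(customer_id, title_id):
    # Placeholder for the real (expensive) manifest generation.
    return {"customer": customer_id, "title": title_id}


cache = ManifestCache(generate_manifest)
print(cache.get("CUSTOMER A", "title-1"))
```

In a real deployment the cache would be a separate tier, and the resiliency warning on the slide applies: the request path has to tolerate that tier being slow or unavailable.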
CHALLENGE Managing device protocols
Square peg, round hole, CC BY-SA, Simon Law 2006, Flickr
Can we allow devices to define their own protocols?
DYNAMIC SCRIPTING PLATFORM
XBOX
iPHONE
HTML5 PLAYER
xbox.groovy
iphone.groovy
html5.groovy
JAVA SERVICE LAYER
SESSION
LICENSE
MANIFEST
TAKEAWAY Client-driven protocols
API
CLIENT SCRIPTS
SERVICE LAYER
STEERING
SESSION
MANIFEST
DRM LICENSE
RULES
INSIGHTS
CACHE
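The talk's platform runs per-device Groovy scripts (xbox.groovy, iphone.groovy, html5.groovy) on top of a Java service layer. The Python below only mimics that split so the idea is visible in a few lines: the service layer exposes stable operations, and each device script decides what its own protocol looks like. The request shapes and the behaviour of each script are hypothetical.

```python
class ServiceLayer:
    """Stand-in for the shared service layer; return values are placeholders."""

    def manifest(self, title_id):
        return {"title": title_id, "streams": ["audio", "video", "text"]}

    def license(self, title_id):
        return {"title": title_id, "license": "<opaque>"}


def xbox_script(services, request):
    # Hypothetical protocol: one round trip returns manifest and license together.
    title = request["title"]
    return {"manifest": services.manifest(title), "license": services.license(title)}


def html5_script(services, request):
    # Hypothetical protocol: separate calls selected by an "action" field.
    operations = {"manifest": services.manifest, "license": services.license}
    return operations[request["action"]](request["title"])


SCRIPTS = {"xbox": xbox_script, "html5": html5_script}


def handle(device_type, request, services=None):
    """Dispatch a device request to that device's own script."""
    services = services or ServiceLayer()
    return SCRIPTS[device_type](services, request)


print(handle("xbox", {"title": "t1"}))
print(handle("html5", {"title": "t1", "action": "license"}))
```

Device teams can change their own script, and therefore their wire protocol, without touching the service layer or other devices.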
CHALLENGE Enabling high-velocity innovation
CC BY-SA, Nathan E Photography 2008, Flickr
How can we expose new data with the least amount of churn?
MANIFEST
Stream
● Bitrate
● Framerate
● Dynamic Data
API
Stream’
● Bitrate
● Dynamic Data
This works from API:
● stream.getBitrate()
● stream.getDynamicData().get("FRAME_RATE")
CLIENT SCRIPT
Stream’’
● Dynamic Data
This works from CLIENT SCRIPT!
● stream.getDynamicData().get("BIT_RATE")
● stream.getDynamicData().get("FRAME_RATE")
Works both ways!
API
CLIENT SCRIPTS
SERVICE LAYER
STEERING
SESSION
MANIFEST
DRM LICENSE
RULES
INSIGHTS
TAKEAWAY Data pass-thru
CACHE
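The pass-thru idea from the Stream slides can be sketched in Python, mirroring the Java-style accessors shown there. The assumption is that the manifest layer emits every field in a generic map, and an intermediate layer that only knows some fields as typed properties still forwards the whole map untouched. `ApiStream`, `from_manifest`, and the sample record are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class ApiStream:
    """The API layer's view: one typed field plus the untouched dynamic data."""
    bitrate: int
    dynamic_data: dict = field(default_factory=dict)


def from_manifest(record):
    # The API layer understands BIT_RATE; everything else passes through as-is.
    return ApiStream(bitrate=record["BIT_RATE"], dynamic_data=dict(record))


# The manifest layer starts emitting FRAME_RATE; the API layer is not redeployed.
stream = from_manifest({"BIT_RATE": 3000, "FRAME_RATE": 24})

print(stream.bitrate)                      # typed access, as from the API layer
print(stream.dynamic_data["FRAME_RATE"])   # new field, visible without API changes
print(stream.dynamic_data["BIT_RATE"])     # a client script can read known fields too
```

New data reaches client scripts as soon as the producing service emits it, and intermediate layers only need a code change when they want typed access to a field.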
TAKEAWAYS
● BGP based proximity
● Tiered Infrastructure
● PID Controller
● EWMA for historical data
● Consistent Hashing
● Dynamic business rules
● Detailed domain insights
● Reduce client state
● Client-driven protocols
● Data pass-thru
Questions?
Haley Tucker @hwilson1204
Mohit Vora @mohitvora
Image Attributions
● Background image from https://www.flickr.com/photos/centralasian/4099515384, Image was cropped and red lines and dots were drawn on top, https://creativecommons.org/licenses/by/2.0/.
● Image from https://www.flickr.com/photos/28705377@N04/4142872268, No modifications made, https://creativecommons.org/licenses/by/2.0/.
● Image of cassette is from https://www.flickr.com/photos/comedynose/6939206771, Image was cropped, https://creativecommons.org/licenses/by/2.0/.
● Image of speaker is from https://www.flickr.com/photos/av_hire_london/5578975575, No changes made, https://creativecommons.org/licenses/by/2.0/.
● Image of television is from https://www.flickr.com/photos/jvcamerica/3660897684/, No changes made, https://creativecommons.org/licenses/by/2.0/.
● Image of text is from https://www.flickr.com/photos/dno1967b/5754743006, No changes made, https://creativecommons.org/licenses/by/2.0/.
● Background image from https://www.flickr.com/photos/mcgraths/866572532, Image was cropped, https://creativecommons.org/licenses/by/2.0/.
● Image from https://www.flickr.com/photos/thatguyfromcchs08/2300190277, Image is dimmed, https://creativecommons.org/licenses/by/2.0/.
● Image from https://www.flickr.com/photos/mknowles/3134373590, Image was cropped, https://creativecommons.org/licenses/by-sa/2.0/.