Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks Alon Shalita, Brian Karrer, Igor Kabiljo, Arun Sharma, Alessandro Presta, Aaron Adcock, Herald Kllapi, and Michael Stumm March 2016

Page 1: Social Hash - USENIX · Summary Assignment problems are common in distributed systems design Proposed Social Hash framework for solving assignment problems Two-level design optimizes

Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks

Alon Shalita, Brian Karrer, Igor Kabiljo, Arun Sharma, Alessandro Presta, Aaron Adcock, Herald Kllapi, and Michael Stumm

March 2016

Page 2:

Assignment Problem

[Figure labels: Alon’s HTTP requests; Igor’s HTTP requests; Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 3:

Assignment Problem

[Figure labels: Alon’s HTTP requests; Igor’s HTTP requests; Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 4:

Assignment Problem Optimization

[Figure labels: Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 5:

Solution Requirements

Balanced

[Figure labels: Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 6:

Solution Requirements

Adaptive

[Figure labels: Alon’s HTTP requests; Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 7:

Solution Requirements

Stable

[Figure labels: Alon’s HTTP requests; Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 8:

Solution Requirements

Fast decision

[Figure labels: Alon’s HTTP requests; Point-of-Presence (PoP); Front-end clusters; Cache (x3)]

Page 9:

Social Hash framework

Page 10:

Social Hash framework

[Figure: Objects (e.g., data records or HTTP requests) map to Groups by static assignment; Groups map to Components (e.g., compute clusters or storage subsystems) by dynamic assignment]

Page 11:

Static assignment

▪ Goal: assign similar objects to the same group

▪ Data access pattern -> represent as graph -> graph partitioning

▪ Large-scale optimization: slow, time-consuming
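The pipeline on this slide (access pattern, then graph, then partitioning) can be illustrated with a toy partitioner. This is a minimal sketch under stated assumptions, not the partitioner used in production, which the slides do not describe: it assumes an undirected edge list, a round-robin starting point, and greedy single-node moves under a per-group size cap. The names `partition`, `capacity`, and `rounds` are hypothetical.

```python
from collections import Counter, defaultdict


def partition(edges, num_groups, capacity, rounds=5):
    """Toy balanced partitioner (illustrative only).

    Each node greedily moves to the group holding most of its neighbors,
    as long as that group has fewer than `capacity` members.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)
    assert capacity * num_groups >= len(nodes), "capacity too small"

    # Round-robin start: balanced, but ignores the graph structure.
    group_of = {node: i % num_groups for i, node in enumerate(nodes)}
    size = Counter(group_of.values())

    for _ in range(rounds):
        moved = False
        for node in nodes:
            current = group_of[node]
            counts = Counter(group_of[n] for n in neighbors[node])
            best, best_count = current, counts.get(current, 0)
            for group, count in sorted(counts.items()):
                if group != current and count > best_count and size[group] < capacity:
                    best, best_count = group, count
            if best != current:
                size[current] -= 1
                size[best] += 1
                group_of[node] = best
                moved = True
        if not moved:
            break  # local optimum reached
    return group_of


# Two triangles joined by a single edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
groups = partition(edges, num_groups=2, capacity=4)
print(groups)
```

On this small input the two triangles end up in separate groups. Even a toy like this needs repeated passes over the whole graph, which hints at why optimization at large scale is slow.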

Page 12:

Dynamic assignment

▪ Goal: adapt to maintain load balance by altering the group -> component assignment

▪ hardware changes

▪ dynamic workload

▪ addition and removal of objects

▪ Two-level framework separates optimization from adaptation

▪ Slow optimization -> static

▪ Fast adaptation -> dynamic

▪ Group-to-component ratio controls tradeoff
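The separation described on this slide can be sketched as two tables: a static one (object to group) that only the slow optimizer rewrites, and a dynamic one (group to component) that can be changed quickly. The class below is an illustration under assumptions, not the production design: the name `TwoLevelAssignment`, the round-robin initial placement, and the least-loaded reassignment rule are all hypothetical.

```python
class TwoLevelAssignment:
    """Illustrative two-level mapping: object -> group -> component."""

    def __init__(self, object_to_group, components):
        # Static level: produced offline by the slow optimization step.
        self.object_to_group = dict(object_to_group)
        self.components = sorted(components)
        groups = sorted(set(self.object_to_group.values()))
        # Dynamic level: start with a simple round-robin placement (assumption).
        self.group_to_component = {
            g: self.components[i % len(self.components)]
            for i, g in enumerate(groups)
        }

    def lookup(self, obj):
        """Two table lookups, so the per-request decision is fast."""
        group = self.object_to_group[obj]
        return self.group_to_component[group]

    def remove_component(self, failed):
        """Move only the failed component's groups to the least-loaded survivors."""
        self.components.remove(failed)
        load = {c: 0 for c in self.components}
        for c in self.group_to_component.values():
            if c in load:
                load[c] += 1
        orphaned = sorted(g for g, c in self.group_to_component.items() if c == failed)
        for g in orphaned:
            target = min(self.components, key=lambda c: (load[c], c))
            self.group_to_component[g] = target
            load[target] += 1


# 8 groups on 4 components: a group-to-component ratio of 2.
objects = {f"user{i}": i % 8 for i in range(40)}
table = TwoLevelAssignment(objects, ["A", "B", "C", "D"])
before = {o: table.lookup(o) for o in objects}
table.remove_component("D")
after = {o: table.lookup(o) for o in objects}
unchanged = sum(1 for o in objects if before[o] == after[o])
print(f"{unchanged} of {len(objects)} objects kept their component")
```

A higher group-to-component ratio gives the dynamic level smaller units to move, so load can be rebalanced more evenly, while fewer, larger groups keep more related objects together. That is the tradeoff the last bullet refers to.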

Page 13:

Social Hash framework

[Figure: architecture diagram. Labels: Request; key; Lookup; Social Hash Tbl; group g; Assignment Tbl; c; Missing key assignment; failed; group; Graph Partitioning; graph specifications; Dynamic Assignment; monitoring info; operator console]

Page 14:

HTTP Request Routing

Page 15:

Social Hash for Facebook’s web routing

▪ Objects: HTTP requests, identified by user; Components: front-end clusters

▪ PoP: Dynamic assignment by hash ring

[Figure labels: Point-of-Presence (PoP); Front-end clusters; Cache (x2)]
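The slide says only that the PoP performs dynamic assignment "by hash ring". A generic consistent-hash ring that maps group IDs to front-end clusters might look like the sketch below. Everything specific here is an assumption for illustration: the class name `HashRing`, the use of MD5, the `vnodes` parameter, and the cluster names.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string (MD5 chosen only for determinism)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring; not necessarily the production variant."""

    def __init__(self, clusters, vnodes=64):
        # Each cluster owns `vnodes` points on the ring to smooth the load.
        self._points = sorted(
            (_hash(f"{cluster}#{i}"), cluster)
            for cluster in clusters
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def cluster_for(self, group_id) -> str:
        """Return the cluster owning the first ring point after the group's hash."""
        h = _hash(f"group:{group_id}")
        idx = bisect.bisect_right(self._keys, h) % len(self._keys)
        return self._points[idx][1]


full = HashRing(["fe1", "fe2", "fe3"])
reduced = HashRing(["fe1", "fe2"])  # fe3 drained
moved = sum(
    1 for g in range(1000)
    if full.cluster_for(g) != reduced.cluster_for(g)
)
print(f"{moved} of 1000 groups changed cluster")
```

Only the groups that were on the drained cluster change destination, which is consistent with the stability requirement from the earlier slides.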

Page 16:

Edge locality for Facebook’s web routing

[Figure: edge locality (0.00 to 1.00) versus number of groups (10 to 100,000, log scale)]

▪ Production routing: 21k groups for tens of front-end clusters

▪ Over half of friendships are within groups

▪ Updated on a weekly basis (~1% movement)
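The slides do not define edge locality explicitly. Reading the bullet "over half of friendships are within groups" as the definition, it is the fraction of edges whose two endpoints fall in the same group. A minimal sketch under that assumption, with a hypothetical function name and toy data:

```python
def edge_locality(edges, group_of):
    """Fraction of edges with both endpoints in the same group (assumed definition)."""
    edges = list(edges)
    if not edges:
        return 0.0
    local = sum(1 for u, v in edges if group_of[u] == group_of[v])
    return local / len(edges)


# Toy friendship graph: 4 edges, 3 of them inside a group.
friendships = [("alon", "igor"), ("igor", "arun"), ("alon", "arun"), ("arun", "brian")]
group_of = {"alon": 0, "igor": 0, "arun": 0, "brian": 1}
print(edge_locality(friendships, group_of))  # 0.75
```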

Page 17:

Live traffic experiment: TAO miss rate

[Figure: percentage change in TAO miss rate (%), from -30 to 10, plotted by day over two weeks (Sun to Sat, twice)]

▪ Orange: traffic shifts

▪ Red: duration of test

▪ Green: updated Social Hash table

Page 18:

Storage Sharding

Page 19:

Assignment Problem 2: Storage sharding

Arun’s query

Objects: data records

Components: storage machines

Page 21:

Static assignment

▪ Minimize fanout through bipartite graph partitioning

▪ Graph contains recent queries and data records

▪ edge => query accesses data record

▪ Dotted: edge locality optimization

▪ Solid: fanout optimization

[Figure: average fanout (10 to 50) versus number of groups (3 to 10,000, log scale)]
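The metric on this slide can be made concrete with a small calculation over the bipartite graph. The sketch below assumes that a query's fanout is the number of distinct groups holding the records it accesses; the slides do not spell this out, and the function name and toy data are hypothetical.

```python
def average_fanout(query_records, group_of):
    """Mean number of distinct groups touched per query (assumed definition).

    query_records: dict mapping a query to the data records it accesses,
                   i.e. the edges of the bipartite graph.
    group_of:      dict mapping each data record to its group.
    """
    fanouts = [
        len({group_of[record] for record in records})
        for records in query_records.values()
        if records
    ]
    return sum(fanouts) / len(fanouts) if fanouts else 0.0


query_records = {
    "q1": ["r1", "r2"],        # both records in group 0 -> fanout 1
    "q2": ["r1", "r3", "r4"],  # groups 0, 1, 2         -> fanout 3
}
group_of = {"r1": 0, "r2": 0, "r3": 1, "r4": 2}
print(average_fanout(query_records, group_of))  # 2.0
```

Co-locating the records that a query reads together lowers this number, which is the objective the fanout-optimized partitioning targets directly.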

Page 22:

Storage sharding deployment

▪ Graph database with thousands of storage servers

▪ Group-to-component ratio of 8

▪ Static assignment every few months

▪ Results:

▪ Average latencies decreased by over 50%

▪ CPU utilization decreased by over 50%

Page 23:

Summary

▪ Assignment problems are common in distributed systems design

▪ Proposed Social Hash framework for solving assignment problems

▪ Two-level design optimizes performance with graph partitioning

▪ Two Facebook integrations in production for over a year

▪ HTTP Request Routing: > 25% reduction in TAO miss rate

▪ Storage Sharding: Latency and CPU utilization reduced by over 50%