Facebook Haystack
Finding a needle in Haystack: Facebook's photo storage
An Analysis of Facebook Photo Caching
PWL SF - June 30, 2015
Sargun Dhillon (@Sargun)
Agenda
• The Haystack problem
• Design & Architecture
• Takeaways
Needle Storage & Serving For Facebook
• Write Once
• Read Often
• Delete Rarely
Why Haystack as a Paper?
Really BIG dataset
120 million new photos a day
Clever optimizations
Where did Haystack come from?
Network Attached Storage (NAS) mounted over NFS
CDNs for low-latency
Pareto Distribution
Figure: Theoretical Image Access CDF (Pareto)
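To make the shape of that curve concrete, here is a minimal sketch that uses a Zipf-style rank popularity (the discrete cousin of a Pareto distribution) and reports what share of requests the most popular images receive. The function name, the catalogue size of 10,000 images, and the exponent of 1.0 are illustrative assumptions, not figures from the talk.

```python
def cumulative_access_share(num_images: int, exponent: float = 1.0) -> list[float]:
    """Cumulative share of requests going to the top-k images, for k = 1..num_images.

    Popularity of the image at a given rank is assumed proportional to
    1 / rank ** exponent (a Zipf-style stand-in for a Pareto tail).
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_images + 1)]
    total = sum(weights)
    shares = []
    running = 0.0
    for weight in weights:
        running += weight
        shares.append(running / total)
    return shares


shares = cumulative_access_share(num_images=10_000)
for top in (10, 100, 1_000, 10_000):
    print(f"top {top:>6} images -> {shares[top - 1]:.1%} of requests")
```

Under these assumptions a small head of popular images absorbs most requests, while the remaining share is spread thinly over a long tail that a cache or CDN cannot hold economically.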
Fetches are expensive
• Multiple seeks:
  • Directory metadata
  • Inode
  • File contents
• File metadata is 10s of kilobytes
• Long-tail uncacheable
Decision to Build
• Existing systems unable to be adapted
  • Hadoop
  • MySQL
  • Traditional NAS appliances
• Don’t need to solve for the kitchen sink
  • Log data
  • Development work
Design Constraints
High Throughput & Low Latency
Cost-effective*
*CDNs are Expensive
Separation of concerns
• Haystack Store
• Haystack Cache
• Haystack Directory
Concerns: Read, Write, Delete Needles
There is no right ratio
Just enough memory for metadata
Store little metadata
Volume Layer
Volume layer above filesystem
Volumes are append-only
Arranged Into Logical Volumes
Append-only Data File
10 bytes of metadata per photo
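A back-of-the-envelope sketch of what that figure implies for memory, combining it with the 120 million photos a day quoted earlier in the talk. Replication, multiple stored sizes per photo, and hash-table overhead are deliberately ignored, so treat the result as a rough lower bound rather than a measured number.

```python
BYTES_PER_PHOTO = 10               # in-memory metadata figure from the slide
NEW_PHOTOS_PER_DAY = 120_000_000   # upload rate quoted earlier in the talk


def index_bytes(num_photos: int, bytes_per_photo: int = BYTES_PER_PHOTO) -> int:
    """Raw in-memory index size, ignoring replication and data-structure overhead."""
    return num_photos * bytes_per_photo


daily = index_bytes(NEW_PHOTOS_PER_DAY)
print(f"{daily / 1e9:.1f} GB of index per day of uploads")  # prints 1.2 GB
```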
Read by <Key, Alt Key, Cookie>
Checks cookie for security
Modifications are appends
Deletions change offset to 0
Compaction for reclamation
Similar to Bitcask & CDB
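The slides above describe a log-structured volume: appends only, an in-memory index, a cookie check on read, deletes recorded as offset 0, and compaction to reclaim space. A minimal sketch of that idea follows. The `Volume` class, the header layout, and the `SUPERBLOCK` marker are hypothetical and chosen for brevity; on-disk details such as checksums and padding are left out, and an in-memory buffer stands in for the preallocated file.

```python
import io
import struct

SUPERBLOCK = b"HAYSTACK"          # hypothetical marker; keeps offset 0 free of needles
HEADER = struct.Struct("<QIQI")   # key, alt_key, cookie, data size (assumed layout)


class Volume:
    """Append-only needle store with an in-memory (key, alt_key) -> offset index."""

    def __init__(self):
        self.data = io.BytesIO()  # stands in for the volume file
        self.data.write(SUPERBLOCK)
        self.index = {}           # (key, alt_key) -> (offset, size)

    def write(self, key, alt_key, cookie, payload):
        # Modifications are appends: the newest offset simply replaces the old one.
        offset = self.data.seek(0, io.SEEK_END)
        self.data.write(HEADER.pack(key, alt_key, cookie, len(payload)) + payload)
        self.index[(key, alt_key)] = (offset, len(payload))

    def read(self, key, alt_key, cookie):
        offset, size = self.index.get((key, alt_key), (0, 0))
        if offset == 0:           # unknown or deleted
            return None
        self.data.seek(offset)
        _, _, stored_cookie, _ = HEADER.unpack(self.data.read(HEADER.size))
        if stored_cookie != cookie:   # cookie check guards against guessed URLs
            return None
        return self.data.read(size)

    def delete(self, key, alt_key):
        # Deletion only touches the index; the bytes stay until compaction.
        if (key, alt_key) in self.index:
            self.index[(key, alt_key)] = (0, 0)

    def compact(self):
        # Copy live needles into a fresh volume, skipping deleted and stale ones.
        fresh = Volume()
        for (key, alt_key), (offset, size) in self.index.items():
            if offset == 0:
                continue
            self.data.seek(offset)
            _, _, cookie, _ = HEADER.unpack(self.data.read(HEADER.size))
            fresh.write(key, alt_key, cookie, self.data.read(size))
        return fresh


vol = Volume()
vol.write(1, 0, cookie=42, payload=b"photo-v1")
vol.write(1, 0, cookie=42, payload=b"photo-v2")   # overwrite by appending
vol.write(2, 0, cookie=7, payload=b"other")
vol.delete(2, 0)
compacted = vol.compact()
```

After compaction only the latest version of photo 1 remains, so the new volume is smaller than the original while serving the same reads.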
Volumes preallocated
Pitchfork: generates artificial load
Checksum verified on compaction
Directory marks volumes offline
Recovery: Rsync*
* With QoS
Restore: Multiple Replicas
OCP: Open Compute Project
12x3TB SATA in RAID6
RAID Controller with NVRAM
Only Writes Cached
Good at reads xor writes, not both
Latencies Table
Workload           Read Throughput   Avg. Read Latency   Write Throughput   Avg. Write Latency
Only Reads         770.6             33.2                -                  -
Only Writes        -                 -                   6099.4             4.9
Multiwrite (x16)   -                 -                   10843.8            43.9
Reads And Writes   718.1             41.6                232.0              11.9
“Known Unknowns” and “Unknown Unknowns”
Haystack Store
• Responsibilities:
  • Read needles
  • Write needles
• Append-only
• O(1) read cost*
*Usually
Concerns: Caching
Two caching rules
Request isn’t from CDN
Request is to write-enabled store
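The two rules can be written as a single predicate. This is a minimal sketch that assumes both conditions must hold together before the Cache keeps a photo; the `PhotoRequest` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoRequest:
    from_cdn: bool              # True if the request arrived via the CDN
    store_write_enabled: bool   # True if the photo lives on a write-enabled Store machine


def should_cache(req: PhotoRequest) -> bool:
    """Cache only direct (non-CDN) requests for photos on write-enabled stores."""
    return (not req.from_cdn) and req.store_write_enabled


print(should_cache(PhotoRequest(from_cdn=False, store_write_enabled=True)))
print(should_cache(PhotoRequest(from_cdn=True, store_write_enabled=True)))
```

The reasoning behind the rules: a miss at the CDN is unlikely to hit in an internal cache anyway, and write-enabled machines hold the newest, most-requested photos while also needing shelter from read load.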
Haystack Cache
• Simple cache
• Optimizations, given access patterns
Concerns: Mapping, Load Balancing, CDN Management, Directing
Maps logical volumes to physical machines
Mapping based on business rules
Load balances reads
Directs writes to relevant logical volume
Directs reads away from CDN
Directory
• Manages capacity
• Manages volume mapping
• Manages image mapping
• Manages CDN
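A minimal sketch of the mapping and balancing duties listed above, assuming a write goes to every replica of one write-enabled logical volume and a read goes to any single replica. The `Directory` class, its method names, and the random selection policy are illustrative assumptions, not the production design.

```python
import random


class Directory:
    """Toy directory: logical volume -> physical machines, plus write-enabled state."""

    def __init__(self, rng=None):
        self.replicas = {}          # logical volume id -> list of machine ids
        self.write_enabled = set()  # logical volumes that still accept writes
        self.rng = rng or random.Random()

    def add_volume(self, volume_id, machines, writable=True):
        self.replicas[volume_id] = list(machines)
        if writable:
            self.write_enabled.add(volume_id)

    def mark_read_only(self, volume_id):
        # Capacity management: a full (or unhealthy) volume stops taking writes.
        self.write_enabled.discard(volume_id)

    def pick_for_write(self):
        # A write targets one logical volume and all of its physical replicas.
        if not self.write_enabled:
            raise RuntimeError("no write-enabled logical volumes")
        volume_id = self.rng.choice(sorted(self.write_enabled))
        return volume_id, self.replicas[volume_id]

    def pick_for_read(self, volume_id):
        # A read is load-balanced across the replicas of the photo's volume.
        return self.rng.choice(self.replicas[volume_id])


directory = Directory(random.Random(0))
directory.add_volume(1, ["store-a", "store-b", "store-c"])
directory.add_volume(2, ["store-d", "store-e", "store-f"], writable=False)
write_volume, write_targets = directory.pick_for_write()
read_target = directory.pick_for_read(2)
```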
Tying it together
Write-path
• Involves
• Store
• Directory
• Smart client
Directory uses URLs for directing
http://⟨CDN⟩/⟨Cache⟩/⟨Machine id⟩/⟨Logical volume, Photo⟩
URL Makeup
• CDN
• Cache Node
• Machine ID
• Logical Volume ID
• Photo ID & Alt ID
• Cookie
Strips URL left-to-right
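Left-to-right stripping means each hop removes its own component and forwards what is left to the next hop. The sketch below follows the URL shape shown earlier; the host names, the comma-separated final segment, and both function names are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def strip_hop(url: str) -> tuple[str, str]:
    """Return (host handling this hop, URL to forward to the next hop)."""
    parts = urlsplit(url)
    next_host, _, rest = parts.path.lstrip("/").partition("/")
    if not rest:
        raise ValueError("no further hop encoded in URL")
    return parts.netloc, f"{parts.scheme}://{next_host}/{rest}"


def parse_store_url(url: str) -> tuple[str, str, str]:
    """Split the final URL into (machine id, logical volume, photo)."""
    parts = urlsplit(url)
    volume, photo = parts.path.lstrip("/").split(",", 1)
    return parts.netloc, volume, photo


# Hypothetical URL in the <CDN>/<Cache>/<Machine id>/<Logical volume, Photo> form.
url = "http://cdn.example/cache7.example/machine42/vol12,photo9001"
cdn, cache_url = strip_hop(url)          # the CDN misses and forwards to the Cache
cache, store_url = strip_hop(cache_url)  # the Cache misses and forwards to the Store
machine, volume, photo = parse_store_url(store_url)
print(cdn, cache, machine, volume, photo)
```

Dropping the CDN segment up front is also how a request can be steered past the CDN entirely: the client is handed a URL that already starts at the Cache.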
Read-path
• Involves:
  • Directory
  • Cache
  • CDN
Narrow Scope
“That simplicity let us build and deploy a working system in a few months instead of a few years.”
Sometimes you’re solving the wrong problem
Smart Clients & Ecosystem Control
Simple Optimizations
Open Source Implementation: WeedFS