GFS/HDFS
15-440 Distributed Systems
Building Blocks
• Google WorkQueue (scheduler)
• Google File System
• Chubby lock service (Paxos-based)
• Two other pieces helpful but not required:
  • Sawzall (language)
  • MapReduce
• BigTable: build a more application-friendly storage service using these parts
Overview
• Google File System (GFS) and Hadoop Distributed File System (HDFS)
• BigTable
Google Disk Farm
Early days…
…today
Google Platform Characteristics
• Lots of cheap PCs, each with disk and CPU
  • High aggregate storage capacity
  • Spread search processing across many CPUs
• How to share data among PCs?
Google Platform Characteristics
• 100s to 1000s of PCs in cluster
• Many modes of failure for each PC:
  • App bugs, OS bugs
  • Human error
  • Disk failure, memory failure, net failure, power supply failure
  • Connector failure
• Monitoring, fault tolerance, auto-recovery essential
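A rough sketch of why cluster size makes failure routine: if each PC fails independently with some small daily probability, the chance that at least one PC in the cluster fails grows quickly with the number of PCs. The per-machine failure rate below is a hypothetical placeholder chosen only for illustration (it is not a figure from the slides), and independence between failures is a simplifying assumption.

```python
def prob_any_failure(n_machines: int, p_fail: float) -> float:
    """Probability that at least one of n_machines fails,
    assuming each fails independently with probability p_fail."""
    return 1.0 - (1.0 - p_fail) ** n_machines


def expected_failures(n_machines: int, p_fail: float) -> float:
    """Expected number of failed machines under the same assumption."""
    return n_machines * p_fail


if __name__ == "__main__":
    # Hypothetical per-machine daily failure probability (illustrative only).
    P_FAIL_PER_DAY = 0.001
    for n in (1, 100, 1000):
        p_any = prob_any_failure(n, P_FAIL_PER_DAY)
        exp_f = expected_failures(n, P_FAIL_PER_DAY)
        print(f"{n:5d} PCs: P(at least one failure) = {p_any:.3f}, "
              f"expected failures = {exp_f:.2f}")
```

Under these assumptions, a failure that is rare for a single PC becomes an everyday event at cluster scale, which is the motivation for building monitoring and auto-recovery into the platform rather than treating failure as exceptional.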
Data-Center Network