Problem-solving on large-scale clusters: theory and applications

Lecture 4: GFS & Course Wrap-up


Page 1:

Problem-solving on large-scale clusters: theory and applications

Lecture 4: GFS & Course Wrap-up

Page 2:

Today’s Outline

• File Systems Overview
  – Uses
  – Common distributed file systems

• GFS
  – Goals
  – System design
  – Consistency model
  – Performance

• Introduction to Distributed Systems
  – MapReduce as a distributed system

Page 3:

File System Uses

Uses:
– Persistence of data
– Inter-Process Communication
– Shared Namespace of Objects

Requirements:
– Lots of files
– Random Access
– Permissions
– Consistency

Page 4:

Common Distributed FSs

Traditional:

– NFS
– SMB
– AFS

What about:
– Kazaa
– BitTorrent
– Kerberos

Page 5:

GFS Goals

Common usage and environment patterns:

– Regular component failure
– Few, large, multi-GB files
– Reads are streaming
– Writes are appends
– Control over client implementation

What are the consequences for…
– data caching?
– latency and bandwidth balance?
– consistency?
– control of data flow?

Page 6:

GFS System Design

System attributes and components:

– A single master controls the file namespace
– Data is broken into 64 MB “chunks”
– Chunks are replicated across many “chunk servers”
– Clients talk directly to chunk servers for data (see the read-path sketch below)

[Figure: GFS architecture, from the GFS paper]
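To make that division of labor concrete, here is a minimal read-path sketch in Python. It is illustrative only: the class and method names (Master.lookup, ChunkServer.read_chunk, and so on) are hypothetical stand-ins, not the actual GFS API, and reads that span a chunk boundary are ignored for brevity.

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as on this slide

class ChunkServer:
    """Stores chunk replicas; serves file bytes directly to clients."""
    def __init__(self):
        self.chunks = {}  # chunk handle -> bytearray

    def read_chunk(self, handle, offset, length):
        return bytes(self.chunks[handle][offset:offset + length])

class Master:
    """Holds namespace metadata only; file data never passes through it."""
    def __init__(self):
        self.chunk_table = {}  # (path, chunk_index) -> (handle, [ChunkServer])

    def lookup(self, path, chunk_index):
        return self.chunk_table[(path, chunk_index)]

class Client:
    def __init__(self, master):
        self.master = master
        self.location_cache = {}  # clients cache chunk locations, not data

    def read(self, path, offset, length):
        idx = offset // CHUNK_SIZE
        if (path, idx) not in self.location_cache:
            # The master is contacted once for metadata, then bypassed.
            self.location_cache[(path, idx)] = self.master.lookup(path, idx)
        handle, replicas = self.location_cache[(path, idx)]
        # Data flows straight from a chunk server replica.
        return replicas[0].read_chunk(handle, offset % CHUNK_SIZE, length)

Note the design choice the sketch encodes: because clients cache locations rather than data, the master stays off the data path and its load scales with metadata operations, not bytes read.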

Page 7:

GFS Write Flow

Control flow:

1. Client gets the chunk list from the master

2. Master responds with the primary and secondary chunk servers

3. Client starts pipelining data to the chunk servers

4. Once all chunk servers hold the data, the client asks the primary to commit the write

5. The primary chunk server assigns a serial order to the write and signals it to the replica servers

6. The chunk servers respond with a successful commit

7. The client is notified of a good write

[Figure: GFS write control and data flow, from the GFS paper]
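The key point is that data flow (steps 3) is decoupled from control flow (steps 4–6). Below is a minimal Python sketch of that sequence; ReplicaStub and gfs_style_write are hypothetical names, and real GFS pipelines data from server to server rather than from the client to each replica.

class ReplicaStub:
    """Hypothetical stand-in for a chunk server in the write protocol."""
    def __init__(self):
        self.staged = None  # data received but not yet committed
        self.log = []       # committed mutations, in serial order

    def buffer_data(self, data):  # step 3: pure data flow
        self.staged = data

    def commit(self, serial_no):  # steps 4-6: pure control flow
        self.log.append((serial_no, self.staged))
        return True

def gfs_style_write(primary, secondaries, data, serial_no):
    # Step 3: push the bytes to every replica before any commit happens.
    for server in [primary] + secondaries:
        server.buffer_data(data)
    # Step 4: once all replicas hold the data, ask the primary to commit.
    primary.commit(serial_no)
    # Step 5: the primary imposes the same serial order on the secondaries,
    # which is why all replicas apply mutations in an identical sequence.
    acks = [s.commit(serial_no) for s in secondaries]
    # Steps 6-7: report success only if every replica committed.
    return all(acks)

# Usage: gfs_style_write(ReplicaStub(), [ReplicaStub(), ReplicaStub()],
#                        b"record", serial_no=1) returns True.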

Page 8:

Consequences of the design

Questions:

1. Where are the bottlenecks?

2. What if a replica fails?

3. What if the primary fails?

4. What if the master fails?

5. Why do writes need to be ordered?

[Figure: GFS write flow, from the GFS paper]

How do you work around these issues?

Page 9:

GFS Consistency: Terms

Two new terms:

Consistent: All chunk servers have the same data

Defined: The result of the “last write” is fully available. A defined chunk is also a consistent chunk. (For example, if two clients concurrently write overlapping regions and every replica applies the fragments in the same order, all replicas end up byte-identical but may mingle pieces of both writes: consistent, yet not defined.)

Questions:
– Is data corrupt if it is inconsistent?
– Is data corrupt if it is undefined?
– Can applications use data in either state?

Page 10:

GFS Consistency: Mutations

Consistency for types of writes:

– Single random write
  • Consistent and defined
– Single append
  • Consistent and defined
– Concurrent random write
  • Consistent, but not necessarily defined
– Aborted write
  • Inconsistent
– Concurrent append
  • Final location is defined
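Because a concurrent append leaves the record defined at an offset GFS chooses, but possibly with padding or duplicates around it, GFS applications (per the paper) make records self-validating using checksums and unique record IDs. Here is a sketch of that reader-side discipline in Python; the record layout (16-byte ID, MD5 digest, 4-byte length) is invented for illustration, not the paper's format.

import hashlib

ID_SIZE = 16
HEADER = ID_SIZE + 16 + 4  # record ID + MD5 digest + length field

def pack_record(record_id: bytes, payload: bytes) -> bytes:
    """Make a record self-validating: id + checksum + length + payload."""
    checksum = hashlib.md5(record_id + payload).digest()
    return record_id + checksum + len(payload).to_bytes(4, "big") + payload

def iter_valid_records(blob: bytes):
    """Scan a chunk region, skipping padding and duplicate record IDs."""
    seen, pos = set(), 0
    while pos + HEADER <= len(blob):
        rid = blob[pos:pos + ID_SIZE]
        checksum = blob[pos + ID_SIZE:pos + ID_SIZE + 16]
        length = int.from_bytes(blob[pos + ID_SIZE + 16:pos + HEADER], "big")
        payload = blob[pos + HEADER:pos + HEADER + length]
        if (len(payload) == length
                and hashlib.md5(rid + payload).digest() == checksum):
            if rid not in seen:  # drop duplicates left by retried appends
                seen.add(rid)
                yield payload
            pos += HEADER + length
        else:
            pos += 1  # padding or a torn region: resynchronize byte-by-byte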

Page 11:

GFS Single Master Handling

Single master = bottleneck = single point of failure (SPF)

1. Master persists changes to multiple replicas

2. Can delegate to “shadow masters”

3. Naming done via DNS (easy failover)

How does this work:

1. If the network is partitioned?

2. Over multiple data centers?

Page 12:

GFS Performance

Here are some sample perf numbers:

– 1 MB replicates in about 80 ms
– 342-node cluster:

• 72 TB available, 55 TB used
• 735 K files, 22 K dead files, 992 K chunks
• 13 GB of chunk server metadata
• 48 MB of master metadata
• Read rate: ~580 MB/s
• Write rate: ~2 MB/s
• Master ops: ~320 ops/s

100 Mbit full duplex with a Gigabit backbone
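A back-of-the-envelope check (my arithmetic, derived from the numbers above, not a figure on the slide): 48 MB of master metadata over 992 K chunks is roughly 48 MB / 992 K ≈ 50 bytes per chunk, which is why a single master can comfortably keep all metadata in memory.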

Data from the GFS paper

Page 13:

GFS Performance, Cont’d

More sample perf numbers:

– 227-node cluster:
  • 180 TB available, 155 TB used
  • 737 K files, 232 K dead files, 1550 K chunks
  • 21 GB of chunk server metadata
  • 60 MB of master metadata
  • Read rate: ~380 MB/s
  • Write rate: ~100 MB/s
  • Master ops: ~500 ops/s

Data from the GFS paper

Page 14:

And now …

• An overview of the concepts we’ve been alluding to all quarter: parallel and distributed systems

Page 15:

Parallel vs Distributed Computing

• Parallel computing
  – Dividing a problem into identical tasks to be executed at the same time on multiple machines or threads

• Distributed computing
  – Dividing a problem into (possibly identical) tasks to be executed on multiple machines or threads, but generally on machines separated by a network

• Parallel computing is often a limited form of distributed computing

Is the MapReduce programming model parallel or distributed?

Page 16:

Requirements: Parallel Computing

• Requirement: minimal (to no) data dependencies!
  – Also nice: a static data source

• Nice to have: minimal communication overhead
  – Replicating state
  – Coordinating / scheduling tasks (e.g., administrative overhead)
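As a concrete illustration (a hypothetical example, not from the slides): squaring a list of numbers with a process pool has no data dependencies between tasks, so each worker touches only its own input and the only overhead left is the pool's own scheduling.

from multiprocessing import Pool

def square(n):
    # Each task depends only on its own input: no shared state,
    # no communication between workers while they run.
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # The pool carries the coordination/scheduling overhead;
        # the work itself is perfectly parallel.
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, ..., 81]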

Page 17:

Requirements: Distributed System

• See worksheet / activity

Page 18:

Distributed System Design (1 of 2)

• From studying large (but not necessarily distributed) systems, we know that distributed systems trade off one guarantee for another
  – Ergo, you need to know what you’re designing for, e.g., your use-cases

• From studying MapReduce, we know that successful distributed systems minimize data dependencies and administrative communication

Page 19:

Distributed System Design (2 of 2)

• From MapReduce & GFS, we know that a distributed system must assume that components fail
  – The network may not be reliable
    • Latency can be high
    • Bandwidth is not infinite
    • Data may get corrupted
  – Attacks from malicious users
  – Hardware failures: motherboards, disks, and memory chips will fail
  – … etc …
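A minimal sketch of how code typically absorbs these assumptions (send_request here is a hypothetical transport function, not any specific library's API): retry with exponential backoff for high latency and dropped links, and verify a checksum so corrupted data is detected rather than trusted.

import hashlib
import random
import time

def call_with_retries(send_request, payload, max_attempts=5):
    """Retry an unreliable call and verify a checksum on the response.
    `send_request` is a hypothetical transport: payload -> (data, digest)."""
    for attempt in range(max_attempts):
        try:
            data, digest = send_request(payload)
            if hashlib.sha256(data).hexdigest() == digest:
                return data
            # Checksum mismatch: the bytes were corrupted in flight.
        except (TimeoutError, ConnectionError):
            pass  # High latency or a dropped link: just try again.
        # Exponential backoff with jitter so retries don't stampede.
        time.sleep((2 ** attempt) * 0.1 * (1 + random.random()))
    raise RuntimeError("all attempts failed; treat the component as down")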