The Hadoop Distributed File System: Architecture and Design
by Dhruba Borthakur
Presented by Bryant Yao



TRANSCRIPT

Page 1: The Hadoop Distributed File System: Architecture and Design by Dhruba Borthakur

The Hadoop Distributed File System: Architecture and Design
by Dhruba Borthakur
Presented by Bryant Yao

Page 2

Introduction

What is it? It's a file system!
◦ Supports most of the operations a normal file system would.
Open source implementation of GFS (Google File System).
Written in Java.
Designed primarily for GNU/Linux.
◦ Some support for Windows.

Page 3

Design Goals

HDFS is designed to store large files (think TB or PB).
HDFS is designed for computer clusters made up of racks.
Write once, read many model.
◦ Useful for reading many files at once but not single files.
Streaming access of data.
◦ Data is coming to you constantly and not in waves.
Make use of commodity computers.
Expect hardware to fail.
"Moving computation is cheaper than moving data."

[Diagram: a cluster made up of Rack 1 and Rack 2]

Page 4

Master/Slave Architecture

[Diagram: one namenode (master) connected to several datanodes (slaves)]

Page 5

Master/Slave Architecture cont.

1 master, many slaves.
The master manages the file system namespace and regulates access to files by clients.
Data is distributed across slaves. The slaves store the data as "blocks".
What is a block?
◦ A portion of a file.
◦ Files are broken down into and stored as a sequence of blocks.

[Diagram: File 1 broken down into blocks A, B, and C]
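The file-to-blocks mapping above can be sketched in a few lines of Java (the deck's own language). This is a minimal illustration, not the real HDFS client; the class name and the 64MB/150MB figures below are just for the example:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of how a file's bytes map onto a sequence of
// fixed-size blocks, as on the slide (File 1 -> blocks A, B, C).
class BlockSplitter {
    // Returns the size of each block a file of the given length occupies.
    // All blocks are full-size except possibly the last one.
    static List<Long> blockSizes(long fileLength, long blockSize) {
        List<Long> sizes = new ArrayList<>();
        long remaining = fileLength;
        while (remaining > 0) {
            sizes.add(Math.min(blockSize, remaining));
            remaining -= blockSize;
        }
        return sizes;
    }
}
```

For example, a 150MB file with 64MB blocks splits into three blocks of 64MB, 64MB, and 22MB.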

Page 6

Task Flow

Page 7

Namenode

Master.
Handles metadata operations.
◦ Stored in a transaction log called the EditLog.
Manages datanodes.
◦ Passes I/O requests to datanodes.
◦ Informs the datanode when to perform block operations.
◦ Maintains a BlockMap which keeps track of which blocks each datanode is responsible for.
Stores all files' metadata in memory.
◦ File attributes, number of replicas, file's blocks, block locations, and checksum of a block.
Stores a copy of the namespace in the FsImage on disk.

Page 8

Datanode

Slave.
Handles data I/O.
Handles block creation, deletion, and replication.
Local storage is optimized so files are stored over multiple file directories.
◦ Storing all data in a single directory would not be optimal.

Page 9

Data Replication

Makes copies of the data!
Replication factor determines the number of copies.
◦ Specified by the namenode or during file creation.
Replication is pipelined!

Page 10

Pipelining Data Replication

Blocks are split into portions (4KB).

[Diagram: a block split into 3 portions A, B, and C flows through datanodes 1, 2, and 3 as a pipeline; each datanode forwards a portion to the next node as soon as it has received it]
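To see why the pipeline helps, here is a small Java sketch of the timing (an assumed simplification of the slide, not HDFS code): portions flow through the chain of datanodes as a wavefront, so transfers to different nodes overlap instead of happening one after another.

```java
// Sketch of pipelined replication timing. A block is cut into small
// portions (the deck says 4KB) and each datanode forwards a portion
// to the next node as soon as it has received it.
class ReplicationPipeline {
    // Time step (0-indexed) at which the given datanode has fully
    // received the given portion: portions flow as a wavefront.
    static int arrivalStep(int portionIndex, int nodeIndex) {
        return portionIndex + nodeIndex;
    }

    // Steps to replicate all portions to all nodes with pipelining:
    // portions + nodes - 1, rather than portions * nodes sequentially.
    static int totalSteps(int portions, int nodes) {
        return portions + nodes - 1;
    }
}
```

With 3 portions and 3 datanodes, the pipeline finishes in 5 steps instead of the 9 a strictly sequential copy would take.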

Page 11

Replication Policy

Communication bandwidth between computers in the same rack is greater than between computers in different racks.
We could replicate data across racks… but this would consume the most bandwidth.
We could replicate data across all computers in a rack… but if the rack dies we're in the same position as before.

Page 12

Replication Policy cont.

Assume only three replicas are created.
◦ Split the replicas between 2 racks.
◦ Rack failure is rare, so we're still able to maintain good data reliability while minimizing bandwidth cost.
Version 0.18.0
◦ 2 replicas in current rack (2 different nodes)
◦ 1 replica in remote rack
Version 0.20.3.x
◦ 1 replica in current rack
◦ 2 replicas in remote rack (2 different nodes)
What happens if the replication factor is 2 or > 3?
◦ No answer in this paper.
◦ Some other papers state that the minimum is 3.
◦ The author wrote a separate paper stating every replica after the 3rd is placed randomly.
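The 0.20.x-style placement above can be sketched as follows. This is an illustration of the rule as the slide states it, not the actual Hadoop placement code; the rack names and method signature are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the placement rule described on the slide: 1 replica on the
// writer's rack, 2 on one remote rack, and (per the author's separate
// paper) every replica after the 3rd on a randomly chosen rack.
class ReplicaPlacer {
    static List<String> place(String localRack, String remoteRack,
                              List<String> allRacks, int replicas,
                              Random rng) {
        List<String> placement = new ArrayList<>();
        if (replicas >= 1) placement.add(localRack);   // writer's rack
        if (replicas >= 2) placement.add(remoteRack);  // remote rack, node 1
        if (replicas >= 3) placement.add(remoteRack);  // remote rack, node 2
        for (int i = 3; i < replicas; i++) {           // beyond the 3rd: random
            placement.add(allRacks.get(rng.nextInt(allRacks.size())));
        }
        return placement;
    }
}
```

With the default of three replicas this touches only two racks, which is the bandwidth/reliability compromise the slide describes.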

Page 13

Reading Data

Read the data that's closest to you!
◦ If the block/replica of data you want is on the datanode/rack/data center you're on, read it from there!
Read from datanodes directly.
◦ Can be done in parallel.
The namenode is used to generate the list of datanodes which host a requested file, as well as to get checksum values to validate blocks retrieved from the datanodes.
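"Closest" on this slide can be made concrete with a simple distance ranking, sketched below in Java. The tiers (same node < same rack < same data center < elsewhere) follow the slide; the numeric values and names are illustrative, not an HDFS API:

```java
// Sketch of "read the closest replica": rank candidate replica
// locations by locality relative to the reader.
class ReplicaSelector {
    // Smaller is closer: 0 = same node, 1 = same rack,
    // 2 = same data center, 3 = anywhere else.
    static int distance(String readerNode, String readerRack, String readerDc,
                        String replicaNode, String replicaRack, String replicaDc) {
        if (replicaNode.equals(readerNode)) return 0;
        if (replicaRack.equals(readerRack)) return 1;
        if (replicaDc.equals(readerDc))     return 2;
        return 3;
    }
}
```

A client would read from the replica with the smallest distance, falling back to farther tiers only when nearer ones hold no copy.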

Page 14

Writing Data

Data is written once.
Split into blocks, typically of size 64MB.
◦ The larger the block size, the less metadata stored by the namenode.
Data is written to a temporary local block on the client side and then flushed to a datanode once the block is full.
◦ If a file is closed while the temporary block isn't full, the remaining data is flushed to the datanode.
If the namenode dies during file creation, the file is lost!
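The client-side staging described above can be sketched as a small buffered writer (a toy model, not the real HDFS client; a tiny block size is used so the behavior is easy to see):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of client-side write staging: bytes accumulate in a temporary
// local block and are flushed to a datanode when the block fills,
// or when the file is closed with a partial block remaining.
class StagedWriter {
    private final int blockSize;
    private final StringBuilder buffer = new StringBuilder();
    final List<String> flushedBlocks = new ArrayList<>(); // "sent to a datanode"

    StagedWriter(int blockSize) { this.blockSize = blockSize; }

    void write(String data) {
        buffer.append(data);
        while (buffer.length() >= blockSize) {      // block full: flush it
            flushedBlocks.add(buffer.substring(0, blockSize));
            buffer.delete(0, blockSize);
        }
    }

    void close() {                                  // flush any partial block
        if (buffer.length() > 0) {
            flushedBlocks.add(buffer.toString());
            buffer.setLength(0);
        }
    }
}
```

Writing ten characters with a block size of four flushes two full blocks during the writes and a final two-character block on close.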

Page 15

Hardware Failure

Imagine a file is broken into 3 blocks spread over three datanodes.

[Diagram: blocks A, B, and C stored on datanodes 1, 2, and 3 respectively]

If the third datanode died, we would have no access to block C and we couldn't retrieve the file.

[Diagram: datanode 3 fails, leaving block C unreachable]

Page 16

Designing for Hardware Failure

Data replication
Safemode
◦ Heartbeat
◦ Block report
Checkpoints
Re-replication

Page 17

Checkpoints

EditLog + FsImage = File System Namespace

Page 18

Checkpoints

The FsImage is a copy of the system taken before any changes have occurred.
The EditLog is a log of all the changes to the namenode since its startup.
Upon startup, the namenode applies all the changes to the FsImage to create an up-to-date version of itself.
◦ The resulting FsImage is the checkpoint.
If either the FsImage or EditLog is corrupt, HDFS will not start!
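The checkpoint process can be sketched as replaying log entries over a snapshot. This is a toy model in the deck's Java: the namespace is a path-to-metadata map and the EditLog entry format ("CREATE path value" / "DELETE path") is invented for the example, not HDFS's real on-disk format:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a checkpoint: replay EditLog entries on top of the last
// FsImage to reconstruct the current namespace. The merged result is
// what would be written out as the new FsImage.
class Checkpoint {
    static Map<String, String> apply(Map<String, String> fsImage,
                                     List<String> editLog) {
        Map<String, String> namespace = new HashMap<>(fsImage);
        for (String entry : editLog) {
            String[] parts = entry.split(" ", 3);
            if (parts[0].equals("CREATE")) {
                namespace.put(parts[1], parts.length > 2 ? parts[2] : "");
            } else if (parts[0].equals("DELETE")) {
                namespace.remove(parts[1]);
            }
        }
        return namespace;
    }
}
```

Once the merged namespace is persisted as the new FsImage, the EditLog can be truncated, which is exactly why the slide treats the result as the checkpoint.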

Page 19

Heartbeat and Blockreport

A heartbeat is a message sent from the datanode to the namenode.
◦ Periodically sent to the namenode, letting the namenode know it's "alive."
◦ If it's dead, assume you can't use it.
Blockreport
◦ A list of blocks the datanode is handling.
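The heartbeat bookkeeping on the namenode side can be sketched as follows. This is an assumed simplification: the timeout value and class are illustrative, not HDFS's actual defaults or code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of heartbeat tracking: the namenode records the last heartbeat
// time per datanode and treats any node silent for longer than a
// timeout as dead (and therefore unusable, per the slide).
class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void heartbeat(String datanode, long now) {
        lastSeen.put(datanode, now);
    }

    boolean isAlive(String datanode, long now) {
        Long seen = lastSeen.get(datanode);
        return seen != null && now - seen <= timeoutMillis;
    }
}
```

A node the namenode has never heard from, or one past its timeout, is simply excluded from reads, writes, and replication targets.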

Page 20

Safemode

Upon startup, the namenode enters "safemode" to check the health status of the cluster. Only done once.
The heartbeat is used to ensure all datanodes are available to use.
The blockreport is used to check data integrity.
◦ If the number of replicas retrieved is different from the number of replicas expected, there is a problem.

[Table: replicas expected vs. replicas found for block A]
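The expected-vs-found comparison can be sketched as a map diff (an illustration of the slide's check, not the namenode's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the safemode integrity check: compare the replica count
// reported in blockreports against the expected replication factor
// for each block, and flag any mismatch.
class SafemodeCheck {
    // Returns block -> found-count for every block whose reported
    // replica count differs from the expected one.
    static Map<String, Integer> mismatches(Map<String, Integer> expected,
                                           Map<String, Integer> reported) {
        Map<String, Integer> problems = new HashMap<>();
        for (Map.Entry<String, Integer> e : expected.entrySet()) {
            int found = reported.getOrDefault(e.getKey(), 0);
            if (found != e.getValue()) problems.put(e.getKey(), found);
        }
        return problems;
    }
}
```

Blocks flagged this way are the ones the namenode would later re-replicate (or clean up, if over-replicated).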

Page 21

Other

Can view the file system through the FS Shell or the web.
Communicates through TCP/IP.
File deletes are a move operation to a "trash" folder which auto-deletes files after a specified time (default is 6 hours).
A rebalancer moves data off datanodes which are close to filling up their local storage.

Page 22

Relation with Search Engines

Originally built for Nutch.
◦ Intended to be the backbone for a search engine.
HDFS is the file system used by Hadoop.
◦ Hadoop also contains MapReduce, which has many applications, like indexing the web!
Analyzing large amounts of data.
Used by many, many companies.
◦ Google, Yahoo!, Facebook, etc.
It can store the web!
◦ Just kidding.

Page 23

"Pros/Cons"

The goal of this paper is to describe the system, not analyze it. It gives a great beginning overview.
Probably could've been condensed/organized better.
Some information is missing:
◦ SecondaryNameNode
◦ CheckpointNode
◦ Etc.

Page 24

Pros/Cons of HDFS, In and Beyond the Paper

Pros
◦ It accomplishes everything it set out to do.
◦ Horizontally scalable – just add a new datanode!
◦ Cheap, cheap, cheap to build.
◦ Good for reading and storing large amounts of data.
Cons
◦ Security
◦ No redundancy of the namenode – single point of failure
◦ The namenode is not scalable
◦ Doesn't handle small files well
◦ Still in development, many features missing

Page 25

Questions?

Thank you for listening!