
1

Cloud Computing

Lectures 11, 12 and 13

Cloud Storage

2014-2015

2

Up until now…

• Introduction

• Definition of Cloud Computing

• Grid Computing

• Content Distribution Networks

• Cycle-Sharing

• Distributed Scheduling

• Map Reduce

3

Outline

• Components of Cloud Platforms

• Storage Types

• Storage Products

• Cloud File Systems

• Cloud Object Storage

4

Components of Cloud Computing

Platforms

Data Storage

Execution Model

Programming Model

Monitoring

•How to program an application?

•How is the platform viewed?

•Which abstraction is accessible: VM? API? Framework?

•Which operations can I perform?

•How are my data stored and accessed?

•Monitoring: How can I evaluate the state of executions/nodes/data...?

5

Major Cloud Platforms

• Apache Hadoop

• Amazon Web Services

• Google App Engine

• Microsoft Azure

• OpenStack

6

Storage Types

• Cloud platforms offer a range of storage types with different search, streaming and indexing capabilities.

• File System:

• Hierarchical organization, files, permission, streaming data,...

• Object Storage:

• Direct Program <-> Storage interaction

• Object ID indexing

• Tables (NoSQL DB):

• records and tables

• Search

• No relational model

• Relational Databases:

• Full relational model

• Conventional services

• We will see that the categories are becoming blurred...

7

Storage Products (i)

• File System

• Hadoop File System / Google File System

• Object/Byte Storage

• Amazon S3

• MS Azure Blobs

• Table

• Hadoop HBase / Google Big Table (AppEngine Datastore)

• Amazon Simple DB

• MS Azure Tables

• Hadoop Hive

• Yahoo PNUTS

• Relational Databases

• Amazon RDS

• SQL Azure

8

Cloud File System: HDFS/GFS

• Distributed File System

• Reimplementation of the Google File System (GFS).

• Runs on clusters of generic machines.

• HDFS is tuned for:

• Very large files.

• Streaming access.

• Generic hardware.

• Key to scalability: data operations don’t go through the central server (the namenode).

9

Blocks

• Blocks simplify space management: allocation and replication are done per block, and a file may grow almost indefinitely.

• Evolution:

• Disk blocks: 512 bytes

• File system blocks: 2, 4, 8 KB

• HDFS blocks: 64 MB

• Large, contiguous 64 MB blocks minimize seek overhead relative to transfer time.

• A file smaller than one block does not occupy a full block: it only uses the space it actually needs.

10

Namenode

• Manages the file system name space: folder hierarchy, name uniqueness,…

• Maintains the folder tree and the metadata in 2 files: namespace image and edit log.

• HDFS cannot operate without the namenode.

• Files can be written, read, renamed and deleted.

• It is not possible to:

• Write in the middle of a file.

• Write concurrently to the same file.

• Fault tolerance mechanism: atomic replication to another machine.

11

Datanode

• Manages a set of blocks.

• Processes clients’ or the namenode’s read/write requests.

• Periodically notifies the namenode of the blocks it holds.

• If a block’s replication factor drops below the configured value, a new replica is created.

12

Permissions

• Permissions in HDFS are similar to UNIX:

• user, group and other

• read, write and execute

• As the user is very often remote, any username presented by a remote node is trusted. Therefore, protection is weak.

• Permissions are geared more towards managing a group of users in the cluster than towards security.

13

Consistency Model

• Formalization of the visibility of read and write operations.

• After an operation call finishes, who sees what, and when?

• HDFS model: there is no guarantee that the last block has been written unless sync() is called.

14

Error Checking

• Block correctness is checked using a checksum function (CRC32).

• At file creation:

• The client calculates the checksum for each 512-byte chunk.

• The datanode stores the checksums.

• At file access:

• The client reads the data and the checksums from the datanode.

• If the check fails, it tries other replicas.

• Periodically, the datanode checks the checksums of its own blocks.
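
As a rough illustration of this per-chunk checksum scheme, the sketch below computes and verifies CRC32 values for 512-byte chunks using java.util.zip.CRC32. The class and method names are made up for the example; this is not HDFS code.

import java.util.Arrays;
import java.util.zip.CRC32;

public class ChunkChecksums {
    static final int CHUNK = 512; // checksum granularity used in the slide

    // Compute one CRC32 value per 512-byte chunk of the block.
    static long[] checksums(byte[] block) {
        int n = (block.length + CHUNK - 1) / CHUNK;
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            CRC32 crc = new CRC32();
            crc.update(block, i * CHUNK, Math.min(CHUNK, block.length - i * CHUNK));
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // On read: recompute and compare against the stored checksums;
    // a mismatch means this replica is corrupt and another one should be tried.
    static boolean verify(byte[] block, long[] stored) {
        return Arrays.equals(checksums(block), stored);
    }
}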

15

Reading

• Client contacts the namenode to get the list of the datanodes with the file’s blocks (stored in memory).

• Receives an FSDataInputStream that transparently chooses the best datanode, opens and closes connections to the datanodes, requests block locations from the namenode, repeats operations if necessary and logs failed datanodes.
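
A minimal read sketch using the Hadoop FileSystem API (org.apache.hadoop.fs); the path is a placeholder and the configuration is assumed to point at a running HDFS cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class HdfsRead {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();  // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);      // talks to the namenode
        // open() returns an FSDataInputStream that fetches block locations
        // from the namenode and streams each block from the closest datanode.
        try (FSDataInputStream in = fs.open(new Path("/data/input.txt"))) {
            byte[] buf = new byte[4096];
            int read;
            while ((read = in.read(buf)) > 0) {
                System.out.write(buf, 0, read);
            }
            System.out.flush();
        }
    }
}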

16

Reading

17

Choosing Nodes: Distance

• Nodes choose the closest sources of data.

• Assumes a tree-structured organization.

• Distance is the number of hops between the tree nodes.

• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)

• distance(/d1/r1/n1, /d1/r1/n2) = 2 (processes on the same rack)

• distance(/d1/r1/n1, /d1/r2/n3) = 4 (processes on different racks)

• distance(/d1/r1/n1, /d2/r3/n4) = 6 (processes on different datacentres)

18

Distance Between Nodes

19

Writing (+ creating)

• The client asks the namenode to create the new file; the namenode checks permissions and name uniqueness. If this succeeds, the client receives an FSOutputStream.

• The namenode provides a set of datanodes for replication.

• Block write requests are kept in a data queue.

• Unconfirmed block write requests are kept in an ack queue.
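
A minimal write sketch using the Hadoop FileSystem API again; note that current Hadoop versions expose the stream as FSDataOutputStream (the slide’s FSOutputStream), and the path and contents are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class HdfsWrite {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // create() asks the namenode to register the new file (permission and
        // uniqueness checks) and returns a stream that pipelines block writes
        // to the datanodes chosen by the namenode.
        try (FSDataOutputStream out = fs.create(new Path("/data/output.txt"))) {
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
            out.hsync(); // force the current block to the datanodes (cf. sync() on the consistency slide)
        }
    }
}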

20

Writing

21

Writing

• If a datanode fails during the write, the client changes the block id so that the corrupted replica can be identified and deleted later.

• By default, the write is considered done as soon as one of the replicas is successfully written. The other replicas are written asynchronously.

22

Command Line Tool

• hadoop fs (see the example invocations after this list):

• ls

• mkdir

• rm

• rmr

• put

• copyToLocal

• copyFromLocal
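
Some typical invocations of the tool; the paths and file names are illustrative.

hadoop fs -ls /user/alice
hadoop fs -mkdir /user/alice/logs
hadoop fs -put access.log /user/alice/logs/
hadoop fs -copyToLocal /user/alice/logs/access.log ./access.log
hadoop fs -rm /user/alice/logs/access.log
hadoop fs -rmr /user/alice/logs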

23

Cloud Object Store:

Amazon Simple Storage Service (S3)

24

S3

• Amazon’s persistent object storage system.

• Implementation based on the Dynamo system (SOSP 2007).

• Accessible using HTTP through several interfaces, e.g. REST and SOAP.

25

Dynamo: Intuition

• CAP Theorem: Consistency, Availability and Partition tolerance - pick two!

• At Amazon: availability = the clients’ trust.

• It cannot be sacrificed.

• In large data centres there are going to be frequent faults:

• The possibility of a partition has to be accounted for.

• Most data services tolerate small inconsistencies:

• Relaxed consistency ==> eventual consistency.

26

Consistency Models

• Strong Consistency: Once a write operation has finished for the requester, any subsequent read will return the value that was written.

• Weak Consistency: The system does not guarantee that subsequent accesses return the written value. Some condition must hold for the written value to be returned (a time interval, an access to a synchronization variable,…). The period between the write finishing and the value becoming visible is called the inconsistency window.

• Eventual Consistency: The system guarantees that, if there are no new writes, the updates will eventually become visible to all clients (e.g. DNS): a DNS name update is propagated between zones until all clients see the new value.

27

Variants of Eventual Consistency

• Causal Consistency: Two causally related writes (A happens before B) cannot lead to B being written before A. There are no guarantees regarding write operations that are not causally related.

• Read-your-writes Consistency: After a process A writes a value, A’s subsequent reads always return that value (or a newer one); a particular case of causal consistency.

• Session Consistency: A practical implementation of the previous model. All operations are done in the context of a session. During the session, the system guarantees “read-your-writes”. After certain faults, the session is ended and the guarantee restarts with a new session.

• Monotonic Reads Consistency: Once a process has seen a value, its subsequent reads never return an older value.

• Monotonic Writes Consistency: The system guarantees that writes by the same process are applied in order. Systems that do not provide this guarantee are rare and notoriously hard to program.

28

Dynamo Assumptions

• Interaction Model:

• Reads and writes of whole objects, identified by unique IDs.

• Binary objects of up to 5 GB.

• No operations spanning multiple objects.

• ACID properties (Atomicity, Consistency, Isolation, Durability):

• Atomicity/Isolation: writes of whole objects.

• Durability: replicated writes.

• Only consistency is not strong.

• Efficiency:

• Optimize for the 99.9th percentile.

29

Design Decisions

• Incremental Scalability:

• Adding nodes has to be simple.

• Load balancing and support for heterogeneity:

• The system must distribute the requests.

• And support nodes with different characteristics.

• Solution: organize the nodes in a Chord-like DHT.

30

Design Decisions

• Symmetry:

• All nodes are equally responsible peers.

• Decentralization:

• Avoid single points of failure.

31

Dynamo: Design Decisions

Problem / Technique / Advantage:

• Partitioning: consistent hashing. Advantage: incremental scalability.

• Write availability: vector clocks with conflict resolution of writes at read time. Advantage: version size does not depend on the update rate.

• Temporary faults: relaxed (sloppy) quorum and hinted handoff. Advantage: high availability and durability.

• Permanent faults: anti-entropy with Merkle trees. Advantage: replicas are synchronized asynchronously.

• Membership and fault detection: gossip-based membership protocol. Advantage: maintains symmetry and avoids a centralized directory.

32

Dynamo: API

• Two operations:

• put(key, context, object)

• key: object ID.

• context: vector clocks and object’s history.

• object: data to be written.

• get(key)
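
Dynamo is an internal Amazon system, so the following Java interface is only an illustrative rendering of these two operations; all type names are hypothetical.

import java.util.List;

// Hypothetical rendering of Dynamo's interface: a get() may return several
// divergent versions plus the context (vector clocks) needed to reconcile
// them in a later put().
interface DynamoStore {
    Result get(byte[] key);
    void put(byte[] key, Context context, byte[] object);

    final class Result {
        final List<byte[]> versions;   // one entry per divergent replica version
        final Context context;         // opaque vector-clock summary
        Result(List<byte[]> versions, Context context) {
            this.versions = versions;
            this.context = context;
        }
    }

    final class Context { /* opaque: vector clocks and object history */ }
}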

33

Partitioning and Replication

• Uses consistent hashing.

• Similar to Chord:

• Each node has an id in the key space.

• Nodes are arranged in a ring.

• Data are stored in the node with the lowest key that is larger than the object’s key.

• Replication:

• Each object is replicated on the N nodes that follow the node associated with the object.

34

The Chord Ring with Replication

35

Virtual Nodes

• Problem: few nodes, or heterogeneous nodes, lead to bad load balancing.

• Dynamo solution:

• Use virtual nodes.

• Each physical node holds several “virtual node” tickets.

• More powerful machines can have more tickets.

• “Virtual node” tickets are distributed randomly.
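
The sketch below puts the last two slides together: consistent hashing with virtual-node tickets and a preference list of N distinct physical successors. The hash choice and class names are illustrative, not Dynamo’s actual implementation.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class Ring {
    private final TreeMap<Long, String> ring = new TreeMap<>(); // ring position -> physical node
    private final int replicas; // N

    Ring(int replicas) { this.replicas = replicas; }

    // More powerful machines can be given more tickets.
    void addNode(String node, int tickets) {
        for (int t = 0; t < tickets; t++) {
            ring.put(hash(node + "#" + t), node);
        }
    }

    // The object lives on the first node clockwise from its key and is
    // replicated on the next N-1 *distinct* physical nodes.
    List<String> preferenceList(String key) {
        Set<String> nodes = new LinkedHashSet<>();
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        for (String n : tail.values()) { if (nodes.size() < replicas) nodes.add(n); }
        for (String n : ring.values()) { if (nodes.size() < replicas) nodes.add(n); } // wrap around
        return new ArrayList<>(nodes);
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xff);
            return h;
        } catch (Exception e) { throw new RuntimeException(e); }
    }
}

For example, ring.addNode("A", 8) followed by ring.addNode("B", 16) gives B twice as many tickets as A, and preferenceList("some-key") returns the N physical nodes responsible for that key.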

36

Data Versions

• Nodes for writing and reading are selected based on load.

• So, we have eventual consistency:

• There may be different versions written on different replicas.

• Conflict resolution is made when reading and not when writing.

• Syntactic Reconciliation:

• Some changes can be made automatically. For formats with clearly identifiable parts and operations (e.g. mail file).

• Semantic Reconciliation:

• The user must decide.

• Divergence is uncommon. For all read operations:

• 99.94% - 1 version;

• 0.00057% - 2 versions;

• 0.00047% - 3 versions;

• 0.00009% - 4 versions.

• Timeout:

• After a number of generations without writing, versions are discarded.

37

Vector Clocks (i)

• Represents time in a distributed system without clock synchronization.

• Replaces physical time with causality.

• A vector clock is a list of (node, counter) pairs.

• If every position of the vector clock of an event A is less than or equal to the corresponding position for an event B, and at least one is strictly smaller, then A happened before B: there is a causal chain of events from A to B.
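
A small Java sketch of this comparison rule; the class is illustrative, but this is the kind of clock Dynamo carries in the put/get context.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class VectorClock {
    private final Map<String, Long> counters = new HashMap<>(); // node -> counter

    // The node that coordinates a write increments its own entry.
    void increment(String node) {
        counters.merge(node, 1L, Long::sum);
    }

    // A happened before B iff every counter in A is <= the one in B
    // and at least one is strictly smaller. If neither happened before
    // the other, the versions are concurrent (divergent replicas).
    boolean happenedBefore(VectorClock other) {
        boolean strictlySmaller = false;
        Set<String> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (String n : nodes) {
            long a = counters.getOrDefault(n, 0L);
            long b = other.counters.getOrDefault(n, 0L);
            if (a > b) return false;
            if (a < b) strictlySmaller = true;
        }
        return strictlySmaller;
    }

    boolean concurrentWith(VectorClock other) {
        return !happenedBefore(other) && !other.happenedBefore(this);
    }
}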

38

Vector Clocks (ii)

(Figure: vector clock example; the horizontal axis is real time)

39

Object Versions

• If we assign a vector clock timestamp to all object versions we can detect divergent replicas.

• Example:

• X, Y and Z are servers with replicas of object D.

• D5 is a semantic reconciliation performed by the user.

40

Executing get() and put()

• For good performance, two possibilities:

• Route requests through a load balancer that chooses the node based on the load:

• Creates a bottleneck.

• Use a client side library to choose the node where to send the request (which will be the coordinator):

• Requires recompiling the client. Probably irrelevant in AWS.

• Then the coordinator executes the quorum reads or writes.

41

Read/Write Operations

• Dynamo supports reads and writes using a quorum model, so an operation does not have to wait for all the replicas.

• Let R and W be the number of replicas that must synchronously take part in a read and in a write, respectively.

• If R + W > N we have a quorum-based system: the set of replicas used for a write always overlaps with the set used for a read.

• It is then impossible to read an object without contacting at least one replica that saw the latest write.

• Example: with N = 3, R = 2 and W = 2 (a common Dynamo configuration), R + W = 4 > 3.

• Latency is determined by the slowest node in the R (or W) set. Therefore, to improve performance, one lowers R or W.
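
A rough sketch of a coordinator-side quorum read under these rules: it contacts all N nodes in the preference list but returns after the first R answers, so the slowest replicas stop dominating latency. Error handling and reconciliation of divergent versions are omitted, and the Replica interface is hypothetical.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class QuorumRead {
    interface Replica { byte[] read(byte[] key) throws Exception; }

    // Ask all N replicas in the preference list, but return as soon as
    // R of them have answered.
    static List<byte[]> read(List<Replica> preferenceList, byte[] key, int r) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(preferenceList.size());
        try {
            CompletionService<byte[]> cs = new ExecutorCompletionService<>(pool);
            for (Replica rep : preferenceList) {
                cs.submit(() -> rep.read(key));
            }
            List<byte[]> versions = new ArrayList<>();
            while (versions.size() < r) {
                versions.add(cs.take().get()); // blocks until the next replica answers
            }
            return versions; // divergent versions would be reconciled afterwards
        } finally {
            pool.shutdownNow();
        }
    }
}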

42

Sloppy Quorum

• To ensure availability, Dynamo uses a “sloppy quorum”.

• Each data item has a preference list: a list of N nodes spanning multiple machines and data centers.

• Operations are performed not on the N “home” replicas but on the first N healthy nodes of the preference list.

43

Tolerating Temporary Faults:

Hinted Handoff

• Assuming N = 3. If A is unavailable or fails when we write, send a replica to D.

• D marks the replica as temporary and returns the data to A as soon as it recovers.

• Replicas are chosen from a preference list of nodes.

• Preference lists always span multiple datacenters for fault tolerance.

44

Membership and Fault Detection

• Ring Membership:

• At startup, use an external entry point to avoid partitioned rings.

• Gossip asynchronously to update the DHT: exchange membership lists with a random node every 2 seconds.

• Fault Detection:

• Faults are detected by neighbours via periodic messages with a timeout on the reply.

45

Permanent Faults

• When a hinted replica (one holding writes that belong to another node) is considered failed, the data is synchronized with the new replica using Merkle trees.

46

Merkle Trees

• Accelerates synchronization between nodes by comparing trees of hashes.

• Each tree node stores a hash of its children.

• It makes it very easy to identify what needs to be exchanged.

• The update can be asynchronous:

• An out-of-date tree is not serious.

47

Merkle Trees: Dynamo

• Each node has a set of keys.

• All objects are leaves of the Merkle tree.

• Replicas periodically exchange the top of their Merkle trees.

• If the hashes differ, they recursively exchange the hashes of lower nodes.
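
A compact sketch of the idea, assuming both replicas build their trees over the same key ranges; the structure and hash choice are illustrative, not Dynamo’s code.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;

public class MerkleNode {
    final String hash;
    final MerkleNode left, right;
    final int lo, hi; // key range covered by this node (indices into the leaf list)

    private MerkleNode(String hash, MerkleNode left, MerkleNode right, int lo, int hi) {
        this.hash = hash; this.left = left; this.right = right; this.lo = lo; this.hi = hi;
    }

    // Leaves hash the objects of a key range; each inner node hashes the
    // concatenation of its children's hashes.
    static MerkleNode build(List<byte[]> objects, int lo, int hi) {
        if (hi - lo == 1) {
            return new MerkleNode(sha1(objects.get(lo)), null, null, lo, hi);
        }
        int mid = (lo + hi) / 2;
        MerkleNode l = build(objects, lo, mid);
        MerkleNode r = build(objects, mid, hi);
        return new MerkleNode(sha1((l.hash + r.hash).getBytes(StandardCharsets.UTF_8)), l, r, lo, hi);
    }

    // Two replicas compare roots; only where hashes differ do they descend,
    // so the key ranges that need to be exchanged are found quickly.
    static void diff(MerkleNode a, MerkleNode b, List<int[]> outOfSyncRanges) {
        if (a.hash.equals(b.hash)) return;
        if (a.left == null) { outOfSyncRanges.add(new int[]{a.lo, a.hi}); return; }
        diff(a.left, b.left, outOfSyncRanges);
        diff(a.right, b.right, outOfSyncRanges);
    }

    private static String sha1(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte x : MessageDigest.getInstance("SHA-1").digest(data)) {
                sb.append(String.format("%02x", x & 0xff));
            }
            return sb.toString();
        } catch (Exception e) { throw new RuntimeException(e); }
    }
}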

48

Back to S3

• Additional issues when compared to Dynamo:

• Access to S3 is controlled by an ACL based on the clients’ AWS identity and checked with their secret key.

• Occasionally, some S3 calls fail and must be repeated. Programs accessing S3 should take this into account.

• Dynamo replication is performed between data centers.

• This large scale replication has some lag.
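
A minimal retry helper along these lines, with exponential backoff; the attempt count and delays are arbitrary choices for the example.

import java.util.concurrent.Callable;

public class S3Retry {
    // Retry an S3 call a few times with exponential backoff, since
    // occasional failures are expected and should simply be repeated.
    static <T> T withRetries(Callable<T> s3Call, int maxAttempts) throws Exception {
        long backoffMs = 100;
        for (int attempt = 1; ; attempt++) {
            try {
                return s3Call.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) throw e;   // give up after the last attempt
                Thread.sleep(backoffMs);
                backoffMs *= 2;                        // 100 ms, 200 ms, 400 ms, ...
            }
        }
    }
}

It would wrap any S3 call, for instance withRetries(() -> s3Service.getObject(bucket, "myobj"), 3) with the JetS3t service shown a few slides below.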

49

Service Level Agreements

• Hosting contracts and cloud platforms, like S3, include SLAs.

• Very often described as average, median and/or variances of response times:

• Extreme cases are always problematic.

• Amazon optimizes for 99.9% of the requests:

• Example: 300 ms response time for 99.9% of the requests below a peak rate of 500 requests per second.

50

Buckets and Objects

• S3 data are stored as Dynamo objects.

• Operations on objects are:

– PUT, GET, DELETE, HEAD (get metadata)

• Objects can be grouped in buckets.

• Buckets are used for delimiting namespaces:

• http://mybucket.s3.amazonaws.com/myobj

• http://s3.amazonaws.com/mybucket/myobj

51

S3: REST GET

Sample Request

GET /my-image.jpg HTTP/1.1

Host: bucket.s3.amazonaws.com

Date: Wed, 28 Oct 2009 22:32:00 GMT

Authorization: AWS 02236Q3V0WHVSRW0EXG2:0RQf4/cRonhpaBX5sCYVf1bNRuU=

Sample Response

HTTP/1.1 200 OK

x-amz-id-2: eftixk72aD6Ap51TnqcoF8eFidJG9Z/2mkiDFu8yU9AS1ed4OpIszj7UDNEHGran

x-amz-request-id: 318BC8BC148832E5

Date: Wed, 28 Oct 2009 22:32:00 GMT

Last-Modified: Wed, 12 Oct 2009 17:50:00 GMT

ETag: "fba9dede5f27731c9771645a39863328"

Content-Length: 434234

Content-Type: text/plain

Connection: close

Server: AmazonS3

[434234 bytes of object data]

See http://s3.amazonaws.com/doc/s3-developer-guide/RESTAuthentication.html

52

S3: REST PUT

Sample Request

PUT /my-image.jpg HTTP/1.1
Host: myBucket.s3.amazonaws.com
Date: Wed, 12 Oct 2009 17:50:00 GMT
Authorization: AWS 15B4D3461F177624206A:xQE0diMbLRepdf3YB+FIEXAMPLE=
Content-Type: text/plain
Content-Length: 11434
Expect: 100-continue
[11434 bytes of object data]

Sample Response

HTTP/1.1 100 Continue
HTTP/1.1 200 OK
x-amz-id-2: LriYPLdmOdAiIfgSm/F1YsViT1LW94/xUQxMsF7xiEb1a0wiIOIxl+zbwZ163pt7
x-amz-request-id: 0A49CE4060975EAC
x-amz-version-id: 43jfkodU8493jnFJD9fjj3HHNVfdsQUIFDNsidf038jfdsjGFDSIRp
Date: Wed, 12 Oct 2009 17:50:00 GMT
ETag: "fbacf535f27731c9771645a39863328"
Content-Length: 0
Connection: close
Server: AmazonS3

53

S3: REST in Java

public void createBucket() throws Exception {
    // S3 timestamp pattern.
    String fmt = "EEE, dd MMM yyyy HH:mm:ss ";
    SimpleDateFormat df = new SimpleDateFormat(fmt, Locale.US);
    df.setTimeZone(TimeZone.getTimeZone("GMT"));

    // Data needed for signature
    String method = "PUT";
    String contentMD5 = "";
    String contentType = "";
    String date = df.format(new Date()) + "GMT";
    String bucket = "/onjava";

    // Generate signature
    StringBuffer buf = new StringBuffer();
    buf.append(method).append("\n");
    buf.append(contentMD5).append("\n");
    buf.append(contentType).append("\n");
    buf.append(date).append("\n");
    buf.append(bucket);
    String signature = sign(buf.toString());

    // Connection to s3.amazonaws.com
    HttpURLConnection httpConn = null;
    URL url = new URL("http", "s3.amazonaws.com", 80, bucket);
    httpConn = (HttpURLConnection) url.openConnection();
    httpConn.setDoInput(true);
    httpConn.setDoOutput(true);
    httpConn.setUseCaches(false);
    httpConn.setDefaultUseCaches(false);
    httpConn.setAllowUserInteraction(true);
    httpConn.setRequestMethod(method);
    httpConn.setRequestProperty("Date", date);
    httpConn.setRequestProperty("Content-Length", "0");
    String AWSAuth = "AWS " + keyId + ":" + signature;
    httpConn.setRequestProperty("Authorization", AWSAuth);

    // Send the HTTP PUT request.
    int statusCode = httpConn.getResponseCode();
    if ((statusCode / 100) != 2) {
        // Deal with S3 error stream.
        InputStream in = httpConn.getErrorStream();
        String errorStr = getS3ErrorCode(in);
    }
}

54

S3: REST in JetS3t

String awsAccessKey = "YOUR_AWS_ACCESS_KEY";

String awsSecretKey = "YOUR_AWS_SECRET_KEY";

AWSCredentials awsCredentials =

new AWSCredentials(awsAccessKey, awsSecretKey);

S3Service s3Service = new RestS3Service(awsCredentials);

S3Bucket euBucket = s3Service.createBucket("eu-bucket", S3Bucket.LOCATION_EUROPE);
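
Continuing the snippet above, uploading an object with the same JetS3t service looks roughly like this; the key and contents are placeholders.

S3Object helloObject = new S3Object("hello.txt", "Hello from JetS3t");
s3Service.putObject(euBucket, helloObject);
System.out.println("Stored object " + helloObject.getKey() + " in " + euBucket.getName());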

55

Windows Azure

56

Azure Storage (i)

• Volatile storage:

• Instance disk

• Memory cache

• Persistent Storage:

• Windows Azure Storage:

• Blobs (objects)

• Tables

• Queues

• SQL Azure:

• Relational DB

57

Azure Storage (ii)

• Service is accessible via Web Services or libraries on top of these (C#, VB, Java).

• Blobs, Tables and Queues are stored in partitions.

• Partitions are the replication and load balancing unit. Blobs and queues are not sharded. Tables may be.

• All partitions have 3 replicas.

• Partitions are represented in a DFS as one or more extents (contiguous files) of up to 1GB.

58

Blobs

• A blob is a <name, object> pair.

• Allows storage of objects from a few bytes up to 50 GB.

• Blobs are stored in containers.

• There is no hierarchy in blob storage, but it can be simulated because names may contain “/”s.

• URL schema: http://<StorageAccount>.blob.core.windows.net/<Container>/<BlobName>

59

Operations on Blobs

• Put: creating

• Get: reading

• Set: updating

• Delete: eliminating

• Lease: 1 minute locking.

60

Next Time...

• Storage in Cloud Platforms
