
CS6703 - GRID AND CLOUD COMPUTING (Final IT / 7th Sem. / 2013R Notes)

PANIMALAR INSTITUTE OF TECHNOLOGY / DEPARTMENT OF INFORMATION TECHNOLOGY

UNIT IV

Open source grid middleware packages – Globus Toolkit (GT4) architecture, configuration – Usage of Globus – Main components and programming model – Introduction to Hadoop framework – MapReduce, input splitting, map and reduce functions, specifying input and output parameters, configuring and running a job – Design of Hadoop file system, HDFS concepts, command-line and Java interface, dataflow of file read & file write.

1. Discuss the Open Source Grid Middleware Packages.

Open Source Grid Middleware Packages

Two well-established organizations are behind the grid standards:

1) Open Grid Forum (OGF)

2) Object Management Group (OMG)

Middleware: a software layer that connects software components. It lies between the operating system and the applications.

Grid Middleware: a layer between the hardware and the application software that enables the sharing of heterogeneous resources and manages the virtual organizations created around the grid.

Popular Grid Middleware: BOINC – Berkeley Open Infrastructure for Network Computing

1) UNICORE (open source)

Uniform Interface to Computing Resources.

Focus: high-level programming models (Java on Unix).

Developed by the German Grid Computing Community.

A ready-to-run Grid system including client and server software.

2) GLOBUS (GT4 - open source)

Focus: low-level services (C and Java on Unix).

It is a library jointly developed by Argonne National Laboratory, the University of Chicago, and the USC Information Sciences Institute, funded by DARPA, NSF, and NIH.

It is an open source toolkit for grid computing.


3) GRIDBUS (open source)

Focus: abstraction and market models (Java on Unix).

Supports eScience and eBusiness applications.

4) LEGION (not open source)

Focus: high-level programming models (C++ on Unix).

It was the first integrated grid middleware architected from first principles to address the

complexity of grid environments.

Different types of grid environments available in the market: (comparison table not reproduced in these notes)


Grid and Web Services: Convergence

Grid community meets Web services: Open Grid Services Architecture (OGSA)

Service orientation to virtualize resources - Everything is a service!

From Web services:

Standard interface definition mechanisms.

Evolving set of other standards: security, etc.

From Grids (Globus Toolkit):

Service semantics, reliability & security models.

Lifecycle management, discovery, other services

OGSA implementation: WSRF – A framework for the definition & management of composable,

interoperable services.


The WSRF specification

The Web Services Resources Framework is a collection of 4 different specifications:

WS-ResourceProperties

WS-ResourceLifetime

WS-ServiceGroup

WS-BaseFaults

Related specifications

WS-Notification

WS-Addressing

Web services vs. Grid services

“Web services” address discovery & invocation of persistent services.

Interface to persistent state of entire enterprise

In Grids we also need transient services, created/destroyed dynamically.

Interfaces to the states of distributed activities.

E.g. workflow, video conf., dist. data analysis.

Significant implications for how services are managed, named, discovered, and used.

In fact, much of our work is concerned with the management of services.


Open Grid Services Infrastructure (OGSI) Specification

Defines fundamental interfaces (using extended WSDL) and behaviors that define a Grid

Service.

A unifying framework for interoperability & establishment of total system properties.

Defines WSDL conventions and extensions for describing and naming services.

Defines basic patterns of interaction, which can be combined with each other and with custom

patterns in a myriad of ways.


2. Draw and explain the Globus Toolkit architecture. (or) What is GT4? Describe in detail the components of GT4 with a suitable diagram.

Topics: The GT4 Library – Globus Job Workflow – Client-Globus Interactions

Introduction of GT4

Open middleware library for the grid computing communities.

Support many operational grids and their applications on an international basis.

The toolkit addresses common problems and issues related to grid resource discovery,

management, communication, security, fault detection, and portability.

It includes a rich set of service implementations.

Provides tools for building new web services in Java, C, and Python, a powerful standards-based security infrastructure, and client APIs.

The Globus Toolkit was initially motivated by a desire to remove obstacles that prevent

seamless collaboration, and thus sharing of resources and services, in scientific and

engineering applications.

The shared resources can be computers, storage, data, services, networks, science

instruments (e.g., sensors).

The Globus Toolkit GT4 supports distributed and cluster computing services.


The GT4 Library

GT4 offers the middle-level core services in grid applications.

The high-level services and tools, such as MPI, Condor-G, and Nimrod/G, are developed by third parties for general-purpose distributed computing applications.

The local services, such as LSF, TCP, Linux, and Condor, are at the bottom level and are

fundamental tools supplied by other developers.

GT4 is based on industry-standard web service technologies.

The GT4 library's functional modules help users discover available resources, move data between sites, and manage user credentials.

Globus Job Workflow

A typical job execution sequence proceeds as follows:-

The user delegates his credentials to a delegation service.

The user submits a job request to GRAM with the delegation identifier as a parameter.


GRAM parses the request, retrieves the user proxy certificate from the delegation service,

and then acts on behalf of the user.

GRAM sends a transfer request to the RFT (Reliable File Transfer), which applies

GridFTP to bring in the necessary files.

GRAM invokes a local scheduler via a GRAM adapter and the SEG (Scheduler Event

Generator) initiates a set of user jobs.

The local scheduler reports the job state to the SEG.

Once the job is complete, GRAM uses RFT and GridFTP to stage out the resultant files.

The grid monitors the progress of these operations and sends the user a notification when

they succeed, fail, or are delayed.

Client-Globus Interactions

There are strong interactions between provider programs and user code.

GT4 makes heavy use of industry-standard web service protocols and mechanisms in

service description, discovery, access, authentication, authorization, and the like.


GT4 makes extensive use of Java, C, and Python to write user code.

Web service mechanisms define specific interfaces for grid computing.

Web services provide flexible, extensible, and widely adopted XML-based interfaces.


GT4 provides a set of infrastructure services for accessing, monitoring, managing, and

controlling access to infrastructure elements.

GT4 implements standards to facilitate construction of operable and reusable user code.

Developers can use these services and libraries to build simple and complex systems

quickly.

A high-security subsystem addresses message protection, authentication, delegation, and

authorization.

The toolkit programs provide a set of useful infrastructure services.

Three containers are used to host user-developed services written in Java, Python, and C,

respectively.

Containers provide implementations of security, management, discovery, state

management, and other mechanisms required when building services.

They extend open source service hosting environments with support for a range of useful

web service specifications, including WSRF, WS-Notification, and WS-Security.

A set of client libraries allow client programs in Java, C, and Python to invoke operations

on both GT4 and user-developed services.


3. Write short notes on Hadoop Framework.

A distributed framework for parallel processing and storage of data.

It is an open source implementation of Google's MapReduce and the Google File System (GFS).

A top-level Apache project.

Written in Java; runs on Linux, Mac OS X, Windows, and Solaris.

Client applications can be written in various languages.

Users: Yahoo!, Facebook, Amazon, eBay, etc.

Hadoop - Applications

Types :

Statistical analysis.

Web applications like crawling (scan through Internet pages to create an index of data),

feeds, graphs.

Similarity and tagging applications.

Batch data processing, not real-time / user facing

Being applied to the back-end of web search

Log Processing

Document Analysis and Indexing

Web Graphs and Crawling

However, nowadays it is also used for many real-time / user-facing applications.

Hadoop Core, the flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing metaphor.

HBase builds on Hadoop Core to provide a scalable, distributed database.

Pig is a high-level data-flow language and execution framework for parallel computation.

It is built on top of Hadoop Core.

ZooKeeper is a highly available and reliable coordination system. Distributed

applications use ZooKeeper to store and mediate updates for critical shared state.

Hive is a data warehouse infrastructure built on Hadoop Core that provides data summarization, ad hoc querying, and analysis of datasets.


Hadoop is a software framework for distributed processing of large datasets across large

clusters of computers

Large datasets -> Terabytes or petabytes of data.

Large clusters -> hundreds or thousands of nodes.

Hadoop is an open-source implementation of Google MapReduce.

Hadoop is based on a simple programming model called MapReduce.

Hadoop is based on a simple data model, any data will fit.

The Hadoop framework consists of two main layers:

Distributed file system (HDFS)

Execution engine (MapReduce)

The Hadoop Core project provides the basic services for building a cloud computing

environment with commodity hardware, and the APIs for developing software that will

run on that cloud.

The two fundamental pieces of Hadoop Core are the MapReduce framework (the cloud computing environment) and the Hadoop Distributed File System (HDFS).

The Hadoop Core MapReduce framework requires a shared file system.

This shared file system does not need to be a system-level file system, as long as

there is a distributed file system plug-in available to the framework.

While Hadoop Core provides HDFS, HDFS is not required.

In Hadoop JIRA (the issue-tracking system), item 4686 is a tracking ticket to

separate HDFS into its own Hadoop project.


In addition to HDFS, Hadoop Core supports the CloudStore (formerly Kosmos) file system (http://kosmosfs.sourceforge.net/) and the Amazon Simple Storage Service (S3) file system (http://aws.amazon.com/s3/).

The Hadoop Core framework comes with plug-ins for HDFS, CloudStore, and S3.

Users are also free to use any distributed file system that is visible as a system-mounted file system, such as the Network File System (NFS), the Global File System (GFS), or Lustre.

When HDFS is used as the shared file system, Hadoop is able to take advantage

of knowledge about which node hosts a physical copy of input data, and will

attempt to schedule the task that is to read that data, to run on that machine.

4. Discuss MAPREDUCE with suitable diagrams.

Large-Scale Parallel Data Processing framework

Thousands of CPUs can be used without the hassle of managing them.

The user only needs to write two functions: map and reduce.

It provides automatic partitioning of the job into subtasks, automatic retry on failures, linear scalability, locality of task execution, and monitoring & status updates.

Simple programming model - solve many problems

A natural for log processing

Great for most web search processing

Fits into workflow frameworks as chains of M/R jobs, e.g. feed processing.

A lot of SQL also maps trivially to this construct, e.g. Pig.

Map

Applies a given function to each element of a sequence of data to produce an output sequence.

E.g. square x = x * x, map square [1,2,3,4,5] = [1,4,9,16,25]

Reduce

Allows a combining operation to be defined on a sequence of data; folds the sequence to return a single output.

E.g. add [1,4,9,16,25] = 55


Map function

Takes a key/value pair and generates a set of intermediate key/value pairs

map(k1, v1) -> list(k2, v2)

Reduce function

Merges together all the intermediate values associated with the same intermediate key.

reduce(k2, list(v2)) -> list(k3, v3)
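
To make these signatures concrete, here is a minimal word-count sketch written against the org.apache.hadoop.mapreduce API; the class names WordCountMapper and WordCountReducer are illustrative, not part of Hadoop itself.

// WordCountMapper.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in a line of input
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, ONE);
    }
  }
}

// WordCountReducer.java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// reduce(k2, list(v2)) -> list(k3, v3): sum the counts emitted for each word
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}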

Abstracts functionality common to all Map/Reduce applications.

Distribute tasks to multiple machines.

Sorts, transfers and merges intermediate map outputs to the Reducers.

Monitors task progress.

Handles faulty machines, faulty tasks transparently.

Provides pluggable APIs and configuration mechanisms for writing applications:

Map and Reduce functions

Input formats and splits

Number of tasks, data types, etc…


Provides status about jobs to users.

Application writer specifies

– A pair of functions called Map and Reduce and a set of input files

Workflow

– Input phase generates a number of FileSplits from input files (one per Map task)

– The Map phase executes a user function to transform input kv-pairs into a new set of kv-pairs

– The framework sorts & Shuffles the kv-pairs to output nodes

– The Reduce phase combines all kv-pairs with the same key into new kv-pairs

– The output phase writes the resulting pairs to files

All phases are distributed with many tasks doing the work

– Framework handles scheduling of tasks on cluster

– Framework handles recovery when a node fails

The Hadoop Core MapReduce environment provides the user with a sophisticated framework to manage the execution of map and reduce tasks across a cluster of machines.

The user is required to tell the framework the following (a driver sketch illustrating these settings appears below):

– The location(s) in the distributed file system of the job input

– The location(s) in the distributed file system for the job output

– The input format

– The output format

– The class containing the map function

– Optionally, the class containing the reduce function

– The JAR file(s) containing the map and reduce functions and any support classes

If a job does not need a reduce function, the user does not need to specify a reducer class,

and a reduce phase of the job will not be run.

The framework will partition the input, and schedule and execute map tasks across the

cluster.

If requested, it will sort the results of the map task and execute the reduce task(s) with the

map output.

The final output will be moved to the output directory, and the job status will be reported

to the user.
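
A minimal job driver sketch showing how the settings listed above (input/output locations, formats, mapper, and optional reducer) are supplied to the framework; it assumes the WordCountMapper and WordCountReducer classes from the earlier sketch are packaged in the job JAR and uses the newer Job API, so class and path names are illustrative.

// WordCountDriver.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCountDriver.class);                // JAR containing the map/reduce classes

    FileInputFormat.addInputPath(job, new Path(args[0]));    // job input location(s)
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // job output location

    job.setInputFormatClass(TextInputFormat.class);          // input format
    job.setOutputFormatClass(TextOutputFormat.class);        // output format

    job.setMapperClass(WordCountMapper.class);                // class containing the map function
    job.setReducerClass(WordCountReducer.class);              // optional: class containing the reduce function
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);         // run the job and report its status
  }
}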


MapReduce is based on key/value pairs.

The framework will convert each record of input into a key/value pair, and each pair will

be input to the map function once.

The map output is a set of key/value pairs—nominally one pair that is the transformed

input pair, but it is perfectly acceptable to output multiple pairs.

The map output pairs are grouped and sorted by key.

The reduce function is called one time for each key, in sort sequence, with the key and

the set of values that share that key.

The reduce method may output an arbitrary number of key/value pairs, which are written

to the output files in the job output directory.

If the reduce output keys are unchanged from the reduce input keys, the final output will

be sorted.

The framework provides two processes that handle the management of MapReduce jobs:-

TaskTracker manages the execution of individual map and reduce tasks on a compute

node in the cluster.

JobTracker accepts job submissions, provides job monitoring and control, and

manages the distribution of tasks to the TaskTracker nodes.

Hadoop Map/Reduce Architecture

Client: provides the UI for submitting jobs and polls for status information.

JobTracker (master): accepts MR jobs, assigns tasks to slaves, monitors tasks, and handles failures.

TaskTracker (slaves): run Map and Reduce tasks and manage intermediate output.

Task: runs the Map or Reduce function and reports progress.


One JobTracker process per cluster and one or more TaskTracker processes per node in

the cluster.

The JobTracker is a single point of failure; however, it will work around the failure of individual TaskTracker processes.

5. Elaborate HDFS concepts with suitable illustrations.

o Blocks

o Namenodes and Datanodes

Blocks

A disk has a block size, which is the minimum amount of data that it can read or write.

Filesystems for a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk block size.

Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally 512 bytes.

This is generally transparent to the filesystem user who is simply reading or writing a

file—of whatever length.

There are tools to perform filesystem maintenance, such as df and fsck, that operate on

the filesystem block level.

HDFS has the concept of a block, but it is a much larger unit—64 MB by default.

Files in HDFS are broken into block-sized chunks, stored as independent units.

Having a block abstraction for a distributed filesystem brings several benefits.

1. A file can be larger than any single disk in the network. Blocks from a file to be

stored on any of the disks in the cluster.

2. Making the unit of abstraction a block rather than a file simplifies the storage subsystem. The storage subsystem deals with blocks, simplifying storage management and eliminating metadata concerns (file metadata such as permissions does not need to be stored with the blocks).

3. Blocks fit well with replication for providing fault tolerance and availability.

% hadoop fsck / -files -blocks will list the blocks that make up each file in the filesystem.

Namenodes and Datanodes


The Hadoop framework consists of two main layers:

– Distributed file system (HDFS)

– Execution engine (MapReduce)

An HDFS cluster has two types of node operating in a master-worker pattern: a namenode (the master) and a number of datanodes (workers).

The namenode manages the filesystem namespace.

o It maintains the filesystem tree and the metadata for all the files and directories in

the tree.

o This information is stored persistently on the local disk in the form of two files:

the namespace image and the edit log.

o The namenode also knows the datanodes on which all the blocks for a given file

are located

A client accesses the filesystem by communicating with the namenode and datanodes.

The client presents a POSIX-like filesystem interface, so the user code does not need to

know about the namenode and datanode to function.

Datanodes are the workhorses of the filesystem.

They store and retrieve blocks when they are told and they report back to the namenode

periodically with lists of blocks that they are storing.

Without the namenode, the filesystem cannot be used.

Drawback


If the machine running the namenode were obliterated, all the files on the filesystem

would be lost since there would be no way of knowing how to reconstruct the files from

the blocks on the datanodes.

Hadoop provides two mechanisms to make the namenode resilient to failure:

First Mechanism

Back up the files that make up the persistent state of the filesystem metadata.

Hadoop can be configured so that the namenode writes its persistent state to multiple filesystems; these writes are synchronous and atomic.

The usual configuration choice is to write to the local disk as well as a remote NFS mount.

Second Mechanism

Run a secondary namenode, which periodically merges the namespace image with the edit log to prevent the edit log from becoming too large.

It usually runs on a separate physical machine, since it requires plenty of CPU and as much memory as the namenode to perform the merge.

It keeps a copy of the merged namespace image, which can be used in the event of the namenode failing.

However, the state of the secondary namenode lags that of the primary, so in the event of total failure of the primary, data loss is almost certain.

6. Write short notes on Command-Line and Java Interface.

The command line is one of the simplest interfaces to HDFS.

Two important configuration properties:

o fs.default.name, set to hdfs://localhost/, which makes HDFS (running on localhost) the default filesystem.

o dfs.replication, set to 1 so that HDFS doesn't replicate filesystem blocks by the default factor of three.

When running with a single datanode, HDFS cannot replicate blocks to three datanodes, so it would perpetually warn about under-replicated blocks; this setting avoids that.
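
A few typical filesystem shell commands (the paths are illustrative):

% hadoop fs -mkdir /user/tom
% hadoop fs -copyFromLocal docs/quangle.txt /user/tom/quangle.txt
% hadoop fs -ls /user/tom
% hadoop fs -cat /user/tom/quangle.txt
% hadoop fs -copyToLocal /user/tom/quangle.txt quangle.copy.txt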

The Java Interface

Reading Data from a Hadoop URL

Reading Data Using the FileSystem API

Writing Data

Directories

Querying the Filesystem


Deleting Data

1) Reading Data from a Hadoop URL

The simplest way to read a file from a Hadoop filesystem is to use a java.net.URL object to open a stream to read the data from.
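
A minimal sketch of this approach (the class name URLCat is illustrative); the URL stream handler factory can be set only once per JVM:

// URLCat.java
import java.io.InputStream;
import java.net.URL;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {
  static {
    // Tell java.net.URL how to handle hdfs:// URLs (can only be done once per JVM)
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
  }

  public static void main(String[] args) throws Exception {
    InputStream in = null;
    try {
      in = new URL(args[0]).openStream();          // e.g. hdfs://localhost/user/tom/quangle.txt
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}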

2) Reading Data Using the FileSystem API

A file in a Hadoop filesystem is represented by a Hadoop Path object (not a java.io.File object).

FileSystem is the general filesystem API.


The first step is to retrieve an instance of the filesystem we want to use, HDFS in this case.

There are two static factory methods for getting a FileSystem instance:

– public static FileSystem get(Configuration conf) throws IOException

– public static FileSystem get(URI uri, Configuration conf) throws IOException
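
A minimal sketch using the FileSystem API (the class name FileSystemCat is illustrative):

// FileSystemCat.java
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {
  public static void main(String[] args) throws Exception {
    String uri = args[0];                            // e.g. hdfs://localhost/user/tom/quangle.txt
    FileSystem fs = FileSystem.get(URI.create(uri), new Configuration());
    InputStream in = null;
    try {
      in = fs.open(new Path(uri));                   // returns an FSDataInputStream (supports seek)
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}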


3) Writing Data

Class: FileSystem

The simplest method takes a Path object for the file to be created and returns an output stream to write to:

– public FSDataOutputStream create(Path f) throws IOException

append()

As an alternative to creating a new file, you can append to an existing file using the append() method:

– public FSDataOutputStream append(Path f) throws IOException
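
A minimal sketch that copies a local file into a Hadoop filesystem using create() (the class name and paths are illustrative):

// FileCopy.java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileCopy {
  public static void main(String[] args) throws Exception {
    String localSrc = args[0];                       // local source file
    String dst = args[1];                            // e.g. hdfs://localhost/user/tom/output.txt

    InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
    FileSystem fs = FileSystem.get(URI.create(dst), new Configuration());
    OutputStream out = fs.create(new Path(dst));     // returns an FSDataOutputStream
    IOUtils.copyBytes(in, out, 4096, true);          // true: close both streams when done
  }
}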


4) Directories

FileSystem provides a method to create a directory:

– public boolean mkdirs(Path f) throws IOException

It creates all of the necessary parent directories if they don't already exist and returns true if the directory (and any necessary parents) was successfully created.


5) Querying the Filesystem

File metadata: FileStatus class

An important feature of any filesystem is the ability to navigate its directory structure and retrieve information about the files and directories that it stores.

The FileStatus class encapsulates filesystem metadata for files and directories, including file length, block size, replication, modification time, ownership, and permission information.

Method: getFileStatus()
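
A minimal sketch that prints some of the metadata returned by getFileStatus() (the class name is illustrative):

// ShowFileStatus.java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowFileStatus {
  public static void main(String[] args) throws Exception {
    String uri = args[0];                            // file or directory to inspect
    FileSystem fs = FileSystem.get(URI.create(uri), new Configuration());
    FileStatus stat = fs.getFileStatus(new Path(uri));
    System.out.println("path:          " + stat.getPath());
    System.out.println("length:        " + stat.getLen());
    System.out.println("block size:    " + stat.getBlockSize());
    System.out.println("replication:   " + stat.getReplication());
    System.out.println("modification:  " + stat.getModificationTime());
    System.out.println("owner:         " + stat.getOwner());
    System.out.println("permissions:   " + stat.getPermission());
    System.out.println("is directory:  " + stat.isDirectory());   // isDir() in older releases
  }
}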


6) Deleting Data

• delete() method on FileSystem to permanently remove files or directories:

– public boolean delete(Path f, boolean recursive) throws IOException

• If f is a file or an empty directory, then the value of recursive is ignored.

• A nonempty directory is only deleted, along with its contents, if recursive is true

(otherwise an IOException is thrown).
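
For example, a short sketch that recursively deletes a job output directory before the job is re-run (the path is illustrative):

// CleanOutputDir.java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CleanOutputDir {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(URI.create("hdfs://localhost/"), new Configuration());
    // true = recursive: delete the directory and everything beneath it
    boolean deleted = fs.delete(new Path("/user/tom/output"), true);
    System.out.println("deleted: " + deleted);
  }
}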

7. Illustrate dataflow in HDFS during file read/write operation with suitable diagrams.

Anatomy of a File Read

Fig. A client reading data from HDFS

1) The client opens the file it wishes to read by calling open() on the FileSystem object, which for HDFS is an instance of DistributedFileSystem.

2) DistributedFileSystem calls the namenode, using RPC, to determine the locations of the first few blocks in the file. DistributedFileSystem returns an FSDataInputStream (an input stream that supports file seeks) to the client for it to read data from. FSDataInputStream in turn wraps a DFSInputStream, which manages the datanode and namenode I/O.

3) The client then calls read() on the stream


4) Data is streamed from the datanode back to the client, which calls read() repeatedly on the

stream

5) When the end of the block is reached, DFSInputStream will close the connection to the

datanode, then find the best datanode for the next block

6) When the client has finished reading, it calls close() on the FSDataInputStream.

If the DFSInputStream encounters an error while communicating with a datanode, it will try the next closest datanode for that block.

The DFSInputStream also verifies checksums for the data transferred to it from the datanode.

If a corrupted block is found, it is reported to the namenode before the DFSInput Stream

attempts to read a replica of the block from another datanode.

An important aspect of this design is that the client contacts datanodes directly to retrieve data and is guided by the namenode to the best datanode for each block.

This allows HDFS to scale to a large number of concurrent clients, since the data traffic is spread across all the datanodes in the cluster.

The namenode merely has to service block location requests; it does not, for example, serve data, which would quickly become a bottleneck as the number of clients grew.


Fig. Network distance in Hadoop

Anatomy of a File Write


1) The client creates the file by calling create() on DistributedFileSystem.

2) DistributedFileSystem makes an RPC call to the namenode to create a new file in the filesystem's namespace, with no blocks associated with it.

3) As the client writes data, DFSOutputStream splits it into packets, which it writes to an internal queue, called the data queue. The data queue is consumed by the DataStreamer, whose responsibility it is to ask the namenode to allocate new blocks by picking a list of suitable datanodes to store the replicas.

4) The DataStreamer streams the packets to the first datanode in the pipeline which stores

the packet and forwards it to the second datanode in the pipeline. Similarly, the second

datanode stores the packet and forwards it to the third (and last) datanode in the pipeline.

5) DFSOutputStream also maintains an internal queue of packets that are waiting to be

acknowledged by datanodes, called the ack queue. A packet is removed from the ack

queue only when it has been acknowledged by all the datanodes in the pipeline.

6) When the client has finished writing data, it calls close() on the stream

7) This action flushes all the remaining packets to the datanode pipeline and waits for

acknowledgments before contacting the namenode to signal that the file is complete.