grid computing project

8/6/2019 Grid Computing Project

1/24

Grid Computing

Vincent Poon, University of Pennsylvania

Oby Sumampouw, University of Pennsylvania

ABSTRACT

Grid computing brings the diverse resources of multiple administrative domains to bear on

large scale computing problems. Recent advances in desktop computing power and

network bandwidth have generated widespread interest and investment in grid

technologies. This paper examines the current status of grid computing through a review

of recent literature on the topic. The analysis focuses on how grids are implemented, the

benefits and drawbacks of grid computing, and both public and private applications of grid

technologies.

Keywords: grid computing, distributed computing, Internet, Information Technology

CIT 595



Table of Contents

Introduction

Implementation
  What is the Grid?
  Grid Computing Architecture
  GRID Middleware (Globus Toolkit)
  GRID Framework

Benefits

Drawbacks, Risks, and Limitations
  Security
  Impact on Network Traffic
  Accounting and Charging for Grid Resources
  Amdahl's Law

Applications
  SETI@Home
  Folding@Home
  Private/Corporate Applications

Conclusion

Appendix
  Task Summary



Introduction

Since the dawn of computers, there have always been computational problems requiring massive amounts of processing power. Despite Moore's Law, the large-scale calculations demanded by these problems have consistently exceeded the computational capabilities of even the fastest processors available. Several approaches have been taken to satisfy this demand for large-scale processing power, including supercomputing and cluster computing. These approaches have typically relied on multiple processors or computers operating in parallel to act as a single, ultra-fast computer. They have had their share of successes, but have also been limited by high costs and short life spans.

We think that two general trends in computing portend a potential explosion in grid computing. First, while personal computer processors have achieved exponential increases in speed and capabilities, the processing demands of the average user's typical computer usage have not kept pace. As a result, most CPUs remain idle for large amounts of time. Second, the rapid expansion of the Internet and broadband access has essentially created a high-speed network between most of the personal computers currently in use. Grid computing offers the potential to take advantage of these two trends by breaking up computational problems into smaller pieces and transmitting these pieces over the Internet to harness the large number of idle CPU cycles. In this way, grid computing represents a low-cost, efficient use of computer resources to solve large-scale computational problems.

Implementation

What is the Grid?

The term grid computing is not defined rigidly. According to [1], grid computing has

three characteristics as shown in figure 1:

1.) Decentralized Resource Coordination:

All resources within the network are handled at the local level. Grid computing handles the integration and distribution of users from multiple domains. The grid must also address the security issues which emerge from the interactions among many users. This approach is the opposite of traditional server-client resource coordination, where resources are heavily centralized on the server.

2.) Open source, standard and general purpose protocols and interfaces:

Grid computing is used to handle a wide variety of applications with users on different domains. It is therefore important that the communication protocols among the nodes are implemented in a standard way. Open source development plays a significant role in ensuring that the protocols can be extended to serve specific applications.

3.) Deliver high quality services:

Grid computing must be able to handle complex interactions among resources in a responsive and coordinated way.

Figure 1. The Basic Foundation of a grid enabled application Source [2]

In some scientific communities, grid computing refers to CPU scavenging, where idle machines are converted into a shared computing resource, such as the system provided by SETI@home [3] to search for extraterrestrial life. However, based on the three criteria listed above, SETI@home and Folding@home should not be considered grid applications. The public nature of SETI@home and Folding@home creates security compromises, and they are prone to malicious attack [2]. In addition, we think that since SETI@home and Folding@home clients cannot interact with each other, they do not follow the specification of the grid computing model. However, since both projects are commonly acknowledged as grid computing examples [4], we will introduce a few distributed computing models that can be considered grid computing under a broader definition.

a) Internet Computing uses the Internet as a means of solving large problems. Large problems are divided into smaller subproblems, which are distributed through the Internet to small computing resources such as personal computers and laptops. SETI@home and Folding@home use this model. Resource nodes join by installing a client program. The client program downloads a small problem, uses otherwise idle CPU cycles to solve it, and sends the solution back to the server. The server assigns a unique ID tag to each chunk of the problem, and each chunk is solved by several users. This redundant problem solving maintains accuracy and prevents a backlog from building up behind nodes that fail to return a result.
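The ID tagging and redundant solving described above can be sketched as follows. This is a minimal illustration, not the SETI@home or Folding@home server: the class name `WorkUnitServer`, the `quorum` parameter, and the rule of accepting a result once enough clients agree are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import count


class WorkUnitServer:
    """Toy server: tags each chunk with a unique ID and collects redundant results."""

    def __init__(self, quorum=2):
        self.quorum = quorum                # hypothetical: matching results needed
        self._ids = count(1)                # source of unique ID tags
        self.chunks = {}                    # unit ID -> chunk of the problem
        self.results = defaultdict(list)    # unit ID -> results returned by clients

    def add_chunk(self, chunk):
        """Store a chunk and return the unique ID tag assigned to it."""
        unit_id = next(self._ids)
        self.chunks[unit_id] = chunk
        return unit_id

    def submit(self, unit_id, result):
        """Record one client's result for a work unit."""
        self.results[unit_id].append(result)

    def accepted(self, unit_id):
        """Return the result reported by at least `quorum` clients, else None."""
        if not self.results[unit_id]:
            return None
        value, votes = Counter(self.results[unit_id]).most_common(1)[0]
        return value if votes >= self.quorum else None


server = WorkUnitServer(quorum=2)
uid = server.add_chunk([1, 2, 3])
server.submit(uid, 6)      # honest client
server.submit(uid, 7)      # faulty or malicious client
server.submit(uid, 6)      # second honest client confirms
print(server.accepted(uid))
```

Because the same chunk goes to several clients, one missing or wrong answer neither blocks the work unit nor corrupts the accepted result.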

Management of unused CPU cycles is delegated to the client's operating system, usually by setting the program's default priority to the lowest level. For example, the Folding@home client program runs with a default niceness of 19, which is the lowest scheduling priority in Linux. Internet computing's major advantages are scalability and a high degree of independence from network latency due to its decentralized nature. However, due to the free and open nature of the client program, this computing model is prone to security attacks.
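As a rough illustration of how a client could hand idle-cycle management to the operating system, the sketch below raises its own niceness to 19 using Python's standard `os.nice` call, which is available on Unix-like systems only. The function name `drop_to_lowest_priority` is an assumption for this example, and the code is not taken from the Folding@home client.

```python
import os


def drop_to_lowest_priority(target=19):
    """Raise this process's niceness toward `target` and return the new value."""
    current = os.nice(0)            # an increment of 0 just reports the niceness
    if current < target:
        return os.nice(target - current)
    return current


if __name__ == "__main__":
    print("niceness is now", drop_to_lowest_priority())
```

Once the niceness has been raised, the scheduler gives the process CPU time mainly when nothing with a higher priority is runnable, so the client needs no idle-detection logic of its own.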

b) P2P, or peer-to-peer, can be regarded as a storage grid. The advantage of a peer-to-peer distributed model is decentralized control. The nodes interact among themselves, which relieves the central server of the burden of managing resources. Kazaa, Limewire and Napster are prime examples of the P2P model. In this model, resources such as data and network bandwidth are located on local clients called peers. Peers can share and leverage unused resources by aggregating cycles and sharing digital content. Available download bandwidth is directly correlated with the number of clients available, which is the greatest strength of the P2P model. Unfortunately, since P2P has no centralized control, it is hard to implement an efficient search mechanism, and it tends to suffer from high network latency because the speed of the network depends on the number of users aggregated for a certain resource. To alleviate this problem, other P2P models have been developed, namely hybrid decentralized P2P, in which a server holds metadata about each resource so that searching is faster, and partially centralized P2P, in which several nodes are gathered and managed by one larger node that acts like a pseudo server.
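The hybrid decentralized model can be pictured with a small sketch in which the server stores only metadata (which peers hold which resource) while the data itself stays on the peers. The class `MetadataIndex` and its method names are hypothetical and do not correspond to the protocol of any of the systems named above.

```python
from collections import defaultdict


class MetadataIndex:
    """Central index for a hybrid P2P network: stores who has what, not the data."""

    def __init__(self):
        self._holders = defaultdict(set)    # resource name -> peers holding it

    def register(self, peer, resource):
        """Record that `peer` can serve `resource`."""
        self._holders[resource].add(peer)

    def unregister(self, peer):
        """Forget a peer that has left the network."""
        for peers in self._holders.values():
            peers.discard(peer)

    def lookup(self, resource):
        """Return the peers to contact directly for the resource."""
        return sorted(self._holders.get(resource, set()))


index = MetadataIndex()
index.register("peer-a", "dataset.bin")
index.register("peer-b", "dataset.bin")
index.unregister("peer-a")
print(index.lookup("dataset.bin"))
```

A search becomes a single dictionary lookup on the server instead of a query flooded across the peers, which is the speedup the hybrid model aims for; the transfer itself still happens peer to peer.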

Grid Computing Architecture

In general, grid architecture can be represented as in Figure 2.

[Figure 2: stack of five layers, from top level to bottom level: Application, Collective, Resource, Connectivity, Fabric]

Figure 2. High level concept of Grid Computing Architecture

Explanation of each layer:

a) Fabric:

Fabric is the lowest layer in grid architecture. Unlike in conventional computer architecture, where the lowest layer represents logic gates, the fabric is an abstract layer that represents local computing resources such as storage, networking and computational resources.

b) Connectivity:

The connectivity layer connects several fabrics into one giant node of fabric. It provides secure connections and is implemented using network protocols such as the Internet protocols (TCP/IP) and application protocols such as DNS.

c) Resource:

The resource layer deals with the management of many connectivity layers. It comprises information protocols, used to obtain information about configuration, load and usage policies, and management protocols, which negotiate the policies for handling resource requirements and operations.

d) Collective:

The collective layer consists of the protocols for interactions among several different resources. This layer includes directory services, accounting and payment, collaboration services, and scheduling services, to name a few.

e) Application:

The application layer is the highest layer in grid computing architecture. This layer calls the other layers to perform the desired actions. The application layer is simply the program we are working with to solve our problems.

GRID Middleware (Globus Toolkit)

Since grid computing is relatively new, standards are still being developed to accommodate the openness and integrity of grids. There are two competing industry standards groups: the Global Grid Forum, started in 1999, and the Enterprise Grid Alliance, founded in 2004 [5]. Examples of middleware and APIs used for developing grid applications include the Globus Toolkit, the Berkeley Open Infrastructure for Network Computing (BOINC), Simple Grid Protocol and the Java CoG Kit.

The Globus Toolkit (GT) was developed by the Globus Alliance, which comprises research and development groups based at several institutions, including the University of Chicago, the University of Edinburgh and the University of Southern California. GT is the de facto standard for grid computing [2] and comprises three main services:


a) The core services:

Basic infrastructure to enable grid computing, such as resource management for naming and locating computational resources on remote systems, security and system-level services, and status monitoring.

b) Security services:

Security is implemented using the standard GSI (Grid Security Infrastructure) and CAS (Community Authorization Service). GSI offers services such as basic certification, PKI, and many other security libraries.

c) Data/Resource Management:

Protocols to ensure rapid and secure data transfer among resource nodes. There are four protocols: GridFTP, Reliable File Transfer (RFT), Replica Location Service (RLS), and Extensible Input/Output (XIO). GRAM (Globus Resource Allocation Manager) handles resource management for GT. TeraGrid [5], TIGER and Taiwan UniGrid [6] are examples of grid projects that use GT.

GRID Framework

Like many other high performance computing models, a grid-enabled application has a typical framework, as shown in Figure 3.

Figure 3. The typical framework of a grid-enabled application. Source [2]


FindServiceData(), which provides information about the service, such as status and registry

SetTerminationTime(), which sets how long until the service is terminated

Destroy(), which allows the client to destroy instances

OGSA interfaces, which are called WSDL PortTypes, also implement additional methods such as

SubscribeToNotificationTopic(), which allows delivery of notifications via third-party messaging services

RegisterService(), which registers the GSH

CreateService(), which creates new grid service instances, among many other interfaces.

Benefits

Grid computing offers several benefits over regular cluster computing and supercomputer models. For example, grid computing enables several resource nodes, such as regular desktop computers, supercomputers and even cluster computers, to be connected as one giant computer. This is possible because grid computing has a transparency layer that hides from the user the fact that the grid is a network of computers. In addition, the grid offers several other benefits, such as:

a) The ability to use computing resources regardless of their location and, therefore, regardless of which people and organizations manage them [8]. Unlike on a regularly connected network (i.e. the Internet or a server-client network), a user in Chicago could access a file on a computer in Atlanta as if the file were on his personal desktop. This is possible because the grid treats the storage and computing power of several clusters as a single computer by implementing the transparency layer and virtualization programs.

b) Internet computing and P2P offer a cheap way to solve a large problem. This is especially useful for scientists who are working on a problem that requires massive computing facilities but who do not have sufficient funding to purchase adequate facilities. The grid model allows members of the public to contribute to science in ways that were not available before.


c) Unlike supercomputers, cluster computing builds a giant computing resource from regular computer components such as regular Intel Pentium processors, regular DDR SDRAM, etc. Thus its cost and scalability are superior to those of supercomputers. Grids have performance and scalability similar to those of cluster computing and may cost less. According to [6], once network speed surpasses a certain limit, the speed of a grid network does not affect grid performance much.

Drawbacks, Risks, and Limitations

Security

Grid computing poses a variety of unique security challenges. In a more traditional server-client model, a client is authenticated by a server to use the server's resources. In a grid computing environment, however, resources from different administrative domains are brought to bear on a single computation. As [9] points out, it is quite possible in such an environment for a particular grid resource to act as both a server and a client. When a user first sends out a computation onto the grid, the first resource to receive the request is acting as a server. Yet this initial server may quickly become a client as it requests assistance from other resources on the grid.

This scenario highlights one of the security demands of grid computing - delegation. That is, a user needs to be able to delegate authority to the grid application he/she is running, so that the application can then authorize any subprocesses it needs to run on other grid resources. With the resources of a grid widely spread out, all using different security policies with various levels of security, this can be quite a challenge. To solve this problem, a proxy can be used. If the proxy is recognized and trusted by all administrative domains, the user can log in to the proxy, and all requests for new grid tasks would be handled by the proxy. Of course, this would necessitate a global identification system [10], since different administrative domains may contain the same local login IDs. One proposed solution is to use a naming system similar to DNS, where components are added to an ID progressively until it is globally unique.
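The DNS-like naming idea can be sketched as below. This is only an illustration under stated assumptions: the function `globally_unique_id`, the dot separator, and the example login and domain components are all hypothetical and are not taken from [10].

```python
def globally_unique_id(login, domain_parts, taken):
    """Append domain components, most specific first, until the ID is unused."""
    candidate = login
    remaining = list(domain_parts)
    while candidate in taken:
        if not remaining:
            raise ValueError(f"cannot make {login!r} unique")
        candidate = f"{candidate}.{remaining.pop(0)}"
    return candidate


# Hypothetical IDs already registered with the global identification system.
taken = {"alice", "alice.cis"}
new_id = globally_unique_id("alice", ["cis", "upenn", "edu"], taken)
taken.add(new_id)
print(new_id)
```

Two users who share the local login `alice` in different administrative domains end up with different global IDs, while a login that is already unique is left unchanged.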

In addition to the authentication issues just mentioned, grids also need to manage confidentiality. The nature of grid computing is such that the data being computed will be copied to many different machines, each a potential security leak. One of the original drivers of grid computing was the demand for large-scale, high performance computing by the scientific communities. The need for security in these initial scientific applications was minimal, since scientific research is typically done openly with peer review and public funding. As such, researchers didn't have to worry about the confidentiality of data being sent over public grids. As grid computing expands beyond its scientific roots into the private arena, however, grids will be running more mission-critical, confidential applications, and keeping the data being sent over the grid secret will become a priority.

On the flip side, every machine that participates in a grid network wants to ensure its own

security with respect to the opening of its resources to the grid. Each machine needs to guard

against malicious code that it might receive from the grid (or an unauthorized party posing as a

grid member) as a computing task. It is crucial that this security safeguard is in place at the local

machine level, to prevent a major risk of grid computing: the propagation of viruses, worms, or

other malicious content over the grid [11].

Thus, there are dual security concerns at play in grid computing: the desire for confidentiality of the data sent over the grid, and the desire to protect each resource/machine from malicious data/code. One way of solving these problems is to run grid processes in a sandbox [12], where the local client system has limited access to the grid data, and the grid code being run has limited access to the local system. Encryption can be combined with this model when transferring data between resources on the grid, so as to prevent unauthorized parties that intercept the data from reading it. For example, a public-key cryptosystem such as RSA can be employed either to encrypt the data directly, or to open a secure channel of communication between grid participants.

Maintaining the integrity of data on the grid is also an important security challenge. That is, after a user's task is complete, he/she needs to ensure that no individual part of the grid has tampered with the data being computed. This can be dealt with by using MD5 checksums, redundancy (where the same task is parceled out to several grid resources), or a combination of the two.
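A minimal sketch of the two integrity checks, using Python's standard `hashlib` module, might look like this. The helper names and the sample payloads are assumptions made for the example, and the sketch presumes that the reference checksum reaches the verifier through a channel the tampering party cannot alter.

```python
import hashlib


def md5_digest(payload: bytes) -> str:
    """Hex MD5 checksum of a result payload."""
    return hashlib.md5(payload).hexdigest()


def verify_result(payload: bytes, claimed_digest: str) -> bool:
    """True if the payload still matches the checksum that accompanied it."""
    return md5_digest(payload) == claimed_digest


def replicas_agree(payloads) -> bool:
    """True if every redundant copy of a task produced identical bytes."""
    return len({md5_digest(p) for p in payloads}) == 1


result = b"energy=-1042.7"
digest = md5_digest(result)
print(verify_result(result, digest))                 # untouched payload
print(verify_result(b"energy=-9999.9", digest))      # altered payload
print(replicas_agree([result, result, b"energy=-9999.9"]))
```

The checksum detects changes to a payload after it was produced, while the replica comparison catches a resource that computed or reported a different answer in the first place, which is why the text suggests combining the two.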


Finally, a security challenge for future grid growth will perhaps not be a technical hurdle,

but a legal one. Grids already encompass geographically widespread and diverse administrative

domains, with some even spanning multiple continents. With laws and policies widely differing

between countries and states with respect to encryption and privacy [13], grid administrators will

need to find a way to secure the grid while still respecting local laws. In addition, legal statutes

may govern the unauthorized installation of middleware, even if the computer resources wouldn't have been used for anything else. For example, David McOwen was sued for $415,000 for installing grid computing programs on the computers at his college, even though these programs were set to use only idle CPU resources [14].

Impact on Network Traffic

One of the concerns about grid computing is its impact on network traffic. Grid computing was first developed in the mid-1990s, when many individual users did not even have broadband access in their homes. From the start, then, grid computing designers have been motivated to minimize the impact of their software on network traffic. Grid computing software employs sophisticated scheduling and caching to reduce the impact on the user's network capacity. Users can control settings that allow them to decide when to receive/transmit data, and how much data to cache. For example, the BOINC client can restrict network usage to certain times of the day, and maximum upload/download rates can be set. Finally, bandwidth monitors can be used [15] to ensure that a minimal level of network capacity is available to the user at all times. Of course, all of these measures can slow down the grid's overall processing speed.
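The kind of user-controlled restriction described above can be sketched as a small policy object. The class `TransferPolicy` and its fields are hypothetical and do not reproduce the configuration format or API of BOINC or any other client; they only show how a time window and a rate cap translate into decisions.

```python
from dataclasses import dataclass


@dataclass
class TransferPolicy:
    """Hypothetical user settings; not the configuration format of any real client."""
    start_hour: int          # transfers allowed from this hour (0-23) ...
    end_hour: int            # ... up to, but not including, this hour
    max_kbytes_per_sec: int  # cap on upload/download rate

    def allows(self, hour: int) -> bool:
        """True if `hour` falls inside the allowed window (which may wrap midnight)."""
        if self.start_hour <= self.end_hour:
            return self.start_hour <= hour < self.end_hour
        return hour >= self.start_hour or hour < self.end_hour

    def min_seconds(self, kbytes: int) -> float:
        """Shortest time a transfer of `kbytes` may take under the rate cap."""
        return kbytes / self.max_kbytes_per_sec


policy = TransferPolicy(start_hour=22, end_hour=6, max_kbytes_per_sec=50)
print(policy.allows(23), policy.allows(12))
print(policy.min_seconds(3000))
```

The trade-off noted in the text is visible here: a work unit that finishes at noon under this policy waits until 22:00 before its result can be uploaded, and the upload is then stretched by the rate cap.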

Accounting and Charging for Grid Resources

The public, research-oriented grids currently in use do not have any measures in place to account for the expenditure of resources such as network bandwidth. The middleware for these projects is distributed with licenses that free the originators from any liability with respect to the use of the software. Any bandwidth or power expenditures, then, are paid for by the users of the middleware, who in essence voluntarily give up these resources for the progress of the grid project.


As the use of grid computing expands and more data is pushed out over the grid, however, accounting and charging systems will be needed to keep track of resource expenditures and

payments. In traditional computing paradigms, charging for bandwidth and server processing/

storage usage is fairly straightforward using conventional metering based on time or amount

used on a per-client basis. The challenge in grid systems is that any given task can use a wide

array of resources spread out across the grid simultaneously. Thus, as [16] points out, if proper

charging of usage is to take place, it is imperative that all administrative domains within a grid

agree to the same standard of accounting and charging.

Deciding which standard to apply can be an involved process, and can depend on what

the grid is primarily used for. For example, some grid computing tasks may involve large

amounts of data analysis and data mining, thereby requiring heavy bandwidth usage to transfer

the data sets across the grid, whereas other tasks might use very limited bandwidth but require

intense processor usage. [17] lists several examples of resources that might be metered and

charged for:

CPU time

Memory Usage

Page faults

Storage Usage

Bandwidth Consumption

Software and Libraries accessed

Signals Received/Context Switches

A particular grid, then, may charge based on any combination of such resources. A grid could also have varying grid service classes [18], where, for instance, some classes would receive lower latency and other benefits but be charged at higher rates. The accounting for this, however, would still be complicated, and some have suggested a flat-rate pricing model to simplify the process. Others have proposed market-based schemes [19], where grid resources are considered producers and the users of the grid are considered consumers. In such a scheme, the producers would offer a set of services for a given price in an auction, and consumers would then bid on these services. The ostensible benefit of such a system would be similar to the advantages of private markets in other economies - e.g. the users with the most urgent computing tasks (as determined by the amount bid at auction) would get serviced first. And just as there are brokers in real-world economies, software brokers have been proposed - intelligent software agents that seek out resources at the best prices for their owners. In the future, public participants in grid networks may even be compensated monetarily for allowing their spare computer resources to be used by whoever is willing to pay for them. The hope is that utility (in this case computational success) is maximized for all under such a free-market system.
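To make the charging options concrete, the sketch below meters a job against a combination of resources, applies a service-class rate, and orders jobs by auction bid. Every number and name in it (the prices in `RATES`, the classes in `SERVICE_CLASS_PERCENT`, the job names) is invented for illustration and does not come from [17], [18] or [19].

```python
# Hypothetical prices in cents per unit; real grids would negotiate these.
RATES = {"cpu_hours": 100, "bandwidth_gb": 25, "storage_gb": 5}

# Hypothetical service classes: percentage applied to the metered total.
SERVICE_CLASS_PERCENT = {"standard": 100, "low_latency": 150}


def charge_cents(usage, service_class="standard"):
    """Meter each resource, sum the cost, then apply the service-class rate."""
    metered = sum(RATES[name] * amount for name, amount in usage.items())
    return metered * SERVICE_CLASS_PERCENT[service_class] // 100


def service_order(bids):
    """Auction-style scheduling: the highest bid is serviced first."""
    return [job for job, _ in sorted(bids.items(), key=lambda kv: kv[1], reverse=True)]


usage = {"cpu_hours": 10, "bandwidth_gb": 4}
print(charge_cents(usage))                   # standard class
print(charge_cents(usage, "low_latency"))    # same usage, higher-priced class
print(service_order({"render": 40, "backup": 5, "risk-model": 90}))
```

Even this toy version shows why a shared standard matters: every administrative domain contributing resources to the job would have to agree on the keys of `RATES` and on how usage is measured before the totals could be reconciled.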

Amdahl's Law

Amdahl's law is a general statement about the limitations of parallelization in computing, on which grid computing inherently relies. In his original paper [20], Amdahl referred to an inevitable portion of the computational load that he called "data management housekeeping", and pointed out that this portion is mostly sequential and hence will limit the gains that can be achieved through parallel processing. Although he did not give any equations in his paper, a common formulation of his ideas is [21]:

$$S = \frac{1}{(1 - f) + \dfrac{f}{k}}$$

where S is the overall system speedup, f is the fraction of work performed by the component being analyzed, and k is the speedup of the new component. Applying the equation to parallelization, we find that if the fraction of the work that can be made parallel is not 100%, then doubling the number of CPUs does not double the speed of the overall system.
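The common formulation S = 1 / ((1 - f) + f / k) is easy to evaluate numerically. The sketch below assumes, for illustration only, a workload in which 90% of the work can be parallelized and in which spreading that part over n CPUs speeds it up by exactly k = n; the function name `amdahl_speedup` is ours.

```python
def amdahl_speedup(f: float, k: float) -> float:
    """Overall speedup S when a fraction f of the work is sped up by a factor k."""
    if not 0.0 <= f <= 1.0 or k <= 0:
        raise ValueError("need 0 <= f <= 1 and k > 0")
    return 1.0 / ((1.0 - f) + f / k)


# Hypothetical workload: 90% of the work can be spread over n CPUs (k = n).
for n in (1, 2, 4, 8, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

Going from one CPU to two gives a speedup of about 1.8 rather than 2, and however many CPUs are added the speedup stays below 1 / (1 - f) = 10, because the sequential 10% is never shortened.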

Within a grid computing context, we can take Amdahl's law even further by noticing that even the speedup from parallelizing a process is itself limited by communication time [22]. That is, even if a process can be sped up by dividing it into chunks and calculating these chunks separately over a grid, this speedup is limited by the time it takes to transfer the initial data through the grid to the grid resources and eventually back to the user after the calculation is complete. Even if we were to assume instantaneous computation of results by the grid resources, the calculation time can be no smaller than the time it takes to communicate data over the network (which in turn involves the significant security and potential accounting overhead described earlier).
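The communication bound can be made explicit with a simplified model, which is an assumption for illustration rather than a formula taken from [22]. Suppose a job takes time $T_1$ on a single machine, divides perfectly across $n$ grid resources, and incurs a fixed communication time $T_c$ to ship the data out and the results back. Then

$$T(n) = T_c + \frac{T_1}{n} \;\ge\; T_c, \qquad S(n) = \frac{T_1}{T(n)} = \frac{T_1}{T_c + T_1/n} \;<\; \frac{T_1}{T_c}.$$

Under these assumptions no number of grid resources pushes the speedup past $T_1 / T_c$. For example, a job with $T_1 = 100$ minutes and $T_c = 5$ minutes cannot be sped up by more than a factor of 20, even if the computation itself were instantaneous.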


Amdahl's law thus gives the theoretical bounds on possible speedups from the use of grid computing. In practice, we find that no grid application in use to date relies on low latency or fast response times. Even if a grid has more processing capability than any single computer, it will never have the fast communication time inherent in a single computer, and this places significant limitations on the applications of grid computing.

Applications

SETI@Home

One of the earliest, and most successful, public-resource grid computing projects was

SETI@Home, or the Search for Extra-Terrestrial Intelligence. The project uses grid computing

to analyze radio waves from outer space for signs of intelligent life. This analysis requires the

use of fast Fourier transforms and adjustments to correct for what is known as Doppler drift [23],

all of which requires large amounts of computational processing. In fact, even with 3.96 million

users as of 2002, the project still receives more raw data than it can analyze, creating a rising

backlog of data to be examined. The following diagram [15] depicts the overall process:

[Figure: SETI@Home data flow. Components shown: tapes from Arecibo, data splitters, work unit storage, data server, user database, science database, the Internet, 2.4 million users]

First, data is sent on 35GB tapes to a centralized server location, where the data on the tapes is broken up into work units. The data is very amenable to this process and is ideal for grid computing, since observations of different portions of the sky are independent of one another, and hence can be broken up into work units fairly easily. These work units are stored on a data server that distributes them upon request from users that have the client software installed (using the HTTP protocol to avoid being blocked by firewalls). This is a more limited form of grid computing in that the clients do not communicate with each other, but instead send the completed work units directly back to the data server. This simplifies security and synchronization issues, and most Internet computing projects have followed this model.

One interesting and innovative aspect of SETI@Home is its use of two databases [24].

One is a science database, which is needed to store results from completed work units. But it is

the user database that enabled the project to garner as much support as it has. The user database

stores information about the submitter whenever it receives a completed work unit, recording a

variety of stats such as team, country, and total CPU time contributed. This allows for fun,

friendly competition between different teams and countries, which in turn helps spread the word

about the project. One problem with this, however, is that some users go to extremes and send

fake or manipulated data to increase their stats [25]. To combat this, the SETI@Home project

uses a redundancy level of 2 to 3, and has looked into embedding encrypted tags into work units

to verify that no tampering has taken place.

Folding@Home

As discussed previously, one of the challenges in grid computing is designing algorithms for the task at hand that can be massively parallelized. The simulation of protein folding was in the past an example of an application that required enormous amounts of computing power, but could not easily be spread out over more than a few hundred CPUs. The Pande group at Stanford, however, came up with a method using ensemble dynamics that made it easy to divide the work of protein folding simulations into separate computations, resulting in an almost linear speedup with the number of processors [26]. They formed a public distributed computing project called Folding@Home in 2000 based on their algorithm, and as of March 2007 almost 2 million CPUs have contributed to the project.

The actual implementation of Folding@Home is very similar to that of SETI@Home - work units are created and distributed in the same fashion. What is unique about Folding@Home is that the developers have ported the code so that it can take advantage of a wide variety of resources. [27] shows the mix of contributing platforms as of March 2007:


OS Type             Current TFLOPS   Active CPUs   Total CPUs

Windows                        155        163467      1630664
Mac OS X/PowerPC                 7          8974        95656
Mac OS X/Intel                  10          3180         7864
Linux                           43         25570       216555
GPU                             45           769         2287
PLAYSTATION3                   392         29920        43712

Total                          652        231880      1996738

Official support for the ATI Radeon x1900 GPU was added in September 2006, and support for the PS3 was added on March 15, 2007. As can be seen from the table, these sources provide a much higher TFLOPS-per-processor ratio than desktop CPUs, and have boosted the project much closer to its goal of 1 petaflop. Thus, the Folding@Home project is a proof of concept that grid networks can take advantage of a panoply of computing resources to tackle large problems.
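The per-processor comparison can be checked directly from the table. The sketch below copies the TFLOPS and active CPU columns and divides one by the other; expressing the ratio in GFLOPS per active processor is our choice of unit for readability, and the function and variable names are ours.

```python
# (current TFLOPS, active CPUs) per platform, copied from the March 2007 table.
PLATFORMS = {
    "Windows": (155, 163467),
    "Mac OS X/PowerPC": (7, 8974),
    "Mac OS X/Intel": (10, 3180),
    "Linux": (43, 25570),
    "GPU": (45, 769),
    "PLAYSTATION3": (392, 29920),
}


def gflops_per_active_processor(tflops, active):
    """Average throughput of one active processor, in GFLOPS."""
    return tflops * 1000 / active


ranking = sorted(
    PLATFORMS,
    key=lambda name: gflops_per_active_processor(*PLATFORMS[name]),
    reverse=True,
)
for name in ranking:
    print(f"{name:18s} {gflops_per_active_processor(*PLATFORMS[name]):6.2f} GFLOPS")
```

GPUs and PS3 consoles come out first and second in this ranking, well ahead of every desktop platform, which matches the observation in the text.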

Private/Corporate Applications

Most businesses in the private sector already have large investments in IT and computing resources. Yet these resources are typically not used uniformly, which provides an opportunity for efficiency gains from employing grid computing. According to IBM's vice president of grid computing [28], in a typical enterprise environment, Windows desktops and servers have roughly 5 to 10% utilization, and Unix servers have between 10 and 20% utilization. By using grid computing, companies can lower their IT costs by using idle desktop cycles rather than purchasing new servers, and can divert resources from idle divisions to busier ones. For these reasons, grid computing is gaining traction in the enterprise market. Indeed, corporate investment in grid computing has grown exponentially in the past few years. Worldwide spending totaled $719 million in 2005 and $1.8 billion in 2006, and is expected by analysts to reach a staggering $12 billion in 2007 and $24.5 billion by 2011 [29]. The following table shows the results of a survey in [30] citing reasons given for implementing grid technology:

Reduce overall capital costs 69%

Increase performance/service levels 62%

Greater flexibility in assigning IT resources 52%

Improve utilization rates 41%

Reduce IT staffing costs 41%


Reduce IT upgrade cycle 17%

Reduce data center floorspace 17%

Corporations are finding other uses for grid computing besides taking advantage of excess cycles on desktops. For example, eBay is using grid computing to spread work across its more than 15,000 servers [31]. Its system administrators normally have to manage each server individually, but with grid technology they can manage entire domains together. One problem they are facing, however, is the difficulty of finding common grid computing standards in the industry. Industry organizations such as the Enterprise Grid Alliance (EGA) are working to resolve such issues.

Many companies are experimenting with grid computing by incrementally adding grid technologies alongside their current IT systems. Rather than immediately install grid middleware on all their desktops and risk bringing down mission-critical systems, many companies are adding dedicated grids that are used for the most resource-intensive computations. For example, UPS recently moved its billing application from a mainframe to a Linux grid [32].

In this approach, the grid does not completely replace the traditional mainframe, but complements it. So far, it has been a success: the UPS team discovered that a process that took 270 minutes on the mainframe could be done in less than 40 minutes on a mere two-server, 8-CPU grid. As predicted by Amdahl's law, however, they found diminishing returns when adding a third or fourth server, with only a few percentage points of performance differential. Another major problem UPS ran into was licensing: grid computing doesn't help if you don't have any software to run on the grid. It turns out that many software licenses are node-locked, which means they tie the software to a designated computer. Grid computing requires concurrent-use licenses, which allow more than one user to run the software simultaneously.
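
The diminishing returns reported for the UPS grid can be illustrated with a rough calculation. The sketch below is not UPS's own analysis. It assumes that Amdahl's law holds exactly, treats the 270-minute mainframe run as the single-processor baseline, takes 40 minutes as the two-server runtime (the text gives this only as an upper bound), and assumes four CPUs per server. All function and constant names are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


def implied_parallel_fraction(speedup: float, processors: int) -> float:
    """Invert Amdahl's law: parallel fraction implied by an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / processors)


BASELINE_MINUTES = 270.0  # mainframe runtime reported in the text
GRID_MINUTES = 40.0       # upper bound reported for the two-server grid
CPUS_PER_SERVER = 4       # assumption: 8 CPUs split evenly over 2 servers

# Parallel fraction that would explain the observed two-server speedup.
p = implied_parallel_fraction(BASELINE_MINUTES / GRID_MINUTES, 2 * CPUS_PER_SERVER)


def projected_minutes(servers: int) -> float:
    """Runtime the model projects for a grid with the given number of servers."""
    return BASELINE_MINUTES / amdahl_speedup(p, servers * CPUS_PER_SERVER)


if __name__ == "__main__":
    for servers in (2, 3, 4):
        minutes = projected_minutes(servers)
        saved = 100.0 * (1.0 - minutes / BASELINE_MINUTES)
        print(f"{servers} servers: {minutes:5.1f} min, {saved:4.1f}% below baseline")
```

Under these assumptions the projected runtime falls to roughly 29 minutes with three servers and 24 minutes with four. The savings relative to the mainframe baseline therefore rise only from about 85% to about 89% and 91%, which is one way to read the few-percentage-point differential described above.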

Within the realm of incremental approaches, Sun Microsystems offers an innovative and perhaps ironic approach. In the past, computer vendors offered server time on a per-use basis: for example, a company might pay to use a server for a given amount of time. This approach fell out of favor when the price of PCs fell dramatically. Yet now Sun is reviving the on-demand computing approach with the Sun Grid Utility [33], which allows the public to use Sun's grid for $1 per CPU hour. For example, if a job uses 1000 of the grid CPUs for one minute, it would count as 16.67 CPU hours, and hence cost $17 [34]. This allows companies to tap into large amounts of computing power when they need it, and reduces the cost of capital for startups, which don't have to purchase servers immediately. Sun's ostensible strategy is to give people a chance to experience the capability of grids, as a way of driving business to Sun's grid computing offerings. This heralds a potential future where large corporations will be able to sell their idle CPU cycles to drive down their computing costs.
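
The per-CPU-hour pricing arithmetic can be checked with a short calculation. This is a minimal sketch, not Sun's billing logic: the function names are hypothetical, and rounding the charge up to the next whole dollar is an assumption made only to match the $17 figure in the example.

```python
import math

PRICE_PER_CPU_HOUR = 1.00  # dollars per CPU hour, as quoted in the text


def cpu_hours(cpus: int, minutes: float) -> float:
    """CPU hours consumed by a job that holds `cpus` CPUs for `minutes`."""
    return cpus * minutes / 60.0


def job_cost(cpus: int, minutes: float) -> int:
    """Charge in whole dollars, rounded up (the rounding rule is an assumption)."""
    return math.ceil(cpu_hours(cpus, minutes) * PRICE_PER_CPU_HOUR)


if __name__ == "__main__":
    hours = cpu_hours(1000, 1)
    print(f"{hours:.2f} CPU hours, charged ${job_cost(1000, 1)}")
```

The same job spread over 100 CPUs for ten minutes consumes the same number of CPU hours, which shows why this pricing model does not penalize wide, short-running jobs.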

Conclusion

Thus, grid computing delivers high quality services through a decentralized coordination of resources. Open source protocols and standardized interfaces are now bringing the advanced distributed job handling capabilities of grid networks to a wider audience than ever before. The advantages of this are clear: more efficient use of computational resources and increased productivity, at a lower cost than other computing paradigms. Perhaps more importantly, grid computing offers solutions to problems so large that they were previously considered infeasible or cost-prohibitive.

Even so, risks remain: security concerns and Amdahl's law place significant limitations on the ultimate reach of grid computing. But if future development of grid computing remains consistent with its historical trends, bright researchers and a private sector with a vested interest will continue to develop new and innovative methodologies to minimize these drawbacks.

Appendix

Task Summary

Vincent Poon
    Drawbacks, Risks, Limitations
    Applications of Grid Computing

Oby Sumampouw
    Implementation of Grid Computing
    Benefits of Grid Computing

References

1 Foster, I. What is the Grid? A Three Point Checklist. Argonne National Laboratory & University of Chicago, July 20, 2002. Available online at http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf

2 Silva, V. Grid Computing for Developers. Hingham, Massachusetts, Charles River Media, Inc. 2006.

3 Sullivan III, W. T., Werthimer, D., Bowyer, S., Cobb, J., Gedye, D. & Anderson, D. A new major SETI project based on Project Serendip data and 100 000 personal computers. In Proc. 5th Int. Conf. Bioastronomy (ed. C. B. Cosmovici, J. Bowyer & D. Werthimer). Bologna, Italy: Editrice Composition. IAU Colloquium No. 161. 2001

4 Abbas, A. Grid Computing: A Practical Guide to Technology and Applications. Hingham, Massachusetts, Charles River Media, Inc. 2004.

5 Beckman, P.H. Building The Tera Grid. Philosophical Transactions of The Royal Society A. (2005) 363, p1715-1728.

6 Chang, H., Li, K., Lin, Y., Yang, C., Wang, H., Lee, L. Performance Issues of Grid Computing Based on Different Architecture Cluster Computing Platforms. Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA05) Vol 2, p321-324. Issued 28-30 March 2005.

7 Gannon, D., Chiu, K., Govindaraju, M., and Slominski, A. An Analysis of the Open Grid Services Architecture. Department of Computer Science, Indiana University, Bloomington, IN. Available online at http://www.extreme.indiana.edu/~aslom/papers/ogsa_analysis3.html

8 Coveney, P.V. Scientific Grid Computing. Philosophical Transactions of The Royal Society A. (2005) 363, p1707-1713.

9 Foster, I. The Grid: a new infrastructure for 21st century science. Physics Today, v 55, n 2, Feb. 2002, p 42-7.

10 Humphrey, M.; Thompson, M. Security for Grids. Proceedings of the IEEE, v 93, n 3, March, 2005, p 644-652

11 Johnston, W.; Jackson, K.; Talwar, S. Overview of Security Considerations for Computational and Data Grids. Proceedings 10th IEEE International Symposium on High Performance Distributed Computing, 2001, p 439-40

12 Cummings, M.; Huskamp, J. Grid Computing. EDUCAUSE Review, vol. 40, no. 6 (November/December 2005): 116-17.

13 Ramakrishnan, L. Securing Next Generation Grids. IT Professional, v 6, n 2, March-April 2004, p 34-9

14 Hermida, A. When Screensavers are a Crime. BBC news online, Jan 28, 2002. HTTP: http://news.bbc.co.uk/1/hi/sci/tech/1782050.stm

15 Surveyer, J. Grid Computing Uses Spare CPU Power. NetworkWorld, July 15, 2002. HTTP: http://www.networkworld.com/news/tech/2002/0715tech.html

16 McGinnis, L.F.; Thigpen, W.; Hacker, T.J. Accounting and Accountability for Distributed and Grid Systems. Proceedings CCGRID 2002. 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2002, p 284-5

17 Zhengyou, L.; Zhang, L.; Shoubin, D.; Wenguo, W. Charging and Accounting for Grid Computing System. Grid and Cooperative Computing. Second International Workshop (GCC 2003) (Lecture Notes in Comput. Sci. Vol.3032), 2004, pt. 2, p 644-51 Vol.2

18 Stiller, B.; Gerke, J.; Flury, P.; Reichl, P. Charging Distributed Services of a Computational Grid Architecture. Proceedings First IEEE/ACM International Symposium on Cluster Computing and the Grid, 2001, p 596-601

19 Buyya, R.; Abramson, D.; Venugopal, S. The Grid Economy. Proceedings of the IEEE, v 93, n 3, March, 2005, Grid Computing, p 698-714

20 Amdahl, G. Validity of the single processor approach to achieving large scale computing capabilities. AFIPS Spring Joint Computer Conference, 1967.

21 Null, L.; Lobur, J. The Essentials of Computer Organization and Architecture, Second Edition, 2006, p 328-329

22 Browne, J. Performance and Scalability. CS395T Lecture Notes. HTTP: http://www.cs.utexas.edu/~browne/CS395Tf2002/

23 Korpela, E.; Werthimer, D.; Anderson, D.; Cobb, J.; Lebofsky, M. SETI@HOME - Massively distributed computing for SETI. Computing in Science and Engineering, v 3, n 1, January/February, 2001, p 78-83

24 Anderson, D.P.; Cobb, J.; Korpela, E.; Lebofsky, M.; Werthimer, D. SETI@home: an experiment in public-resource computing. Communications of the ACM, v 45, n 11, Nov. 2002, p 56-61

25 Bansal, R. ET or EC? IEEE Antennas and Propagation Magazine, v 43, n 4, Aug. 2001, p 118

26 Larson, S. M.; Snow, C. D.; Shirts, M.; Pande, V. S. Folding@Home and Genome@Home: Using distributed computing to tackle previously intractable problems in computational biology. Computational Genomics, Horizon Press, 2002

27 Folding@Home client statistics by OS. HTTP: http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats

28 Thibodeau, P. IBM Expands Grid Offerings. Computerworld, May 5, 2003. Vol. 37, Iss. 18; p. 7

29 Vreede, S.V. Grid Computing Market Trends. Faulkner's Advisory for IT Studies, March 2007.

30 Summit Strategies. Grid Computing Facts. InfoTech Trends, Apr 2004.

31 Patrick T. EBay Seeks Grid Standards as It Expands Massive System. Computerworld, Sep 25, 2006. Vol. 40, Iss. 39; p. 18

32 Julie B. How to avoid bumps on the road to grid computing. Network World, Feb 19, 2007. Vol. 24, Iss. 7; p. 32

33 Solheim, S. Sun Grid Goes Live. InfoWorld, 3/20/2006. Vol. 28, Iss. 12; p. 17

34 Sun Utility Computing website. HTTP: http://www.sun.com/service/sungrid/