

Defining the Grid: A Snapshot on the Current View Draft v0.4, 15 June 2006

Heinz Stockinger
Swiss Institute of Bioinformatics (Vital-IT)
CH-1015 Lausanne, Switzerland
[email protected]

Active contributions from:

Greg Astfalk, Malcolm Atkinson, Miguel Bote-Lorenzo, Rajkumar Buyya, Lorenzo Cerutti, Walfredo Cirne, Brian Coghlan, Jose Cunha, Andrea Domenici, Flavia Donno, Dietmar Erwin, Laurent Falquet, Stephen Flinter, Ian Foster, Geoffrey Fox, Fabrizio Gagliardi, Wolfgang Gentzsch, Andrew Hanushevsky, Emir Imamagic, Fotis Karayannis, Dan Katz, Dieter Kranzlmüller, Domenico Laforenza, Erwin Laure, Max Lemke, Rodrigo Fernandes de Mello, Miron Livny, Gabriel Mateescu, Rodrigo Mello, André Merzky, Reagan Moore, John Morrison, Maria S. Perez, Ron Perrot, Jean-Marc Pierson, Thierry Priol, Jean Salzemann, Dave Snelling, Michela Taufer, Domenico Talia, Sathish Vadhiyar, Frank van Lingen, Gregor von Laszewski

Abstract

The term “Grid” was introduced in early 1998 with the launch of the book “The Grid: Blueprint for a New Computing Infrastructure”. Since that time many technological changes have occurred in both hardware and software. One of the most important seems to be the wide acceptance of Web services. Although the basic Grid idea has not changed much in the last decade, many people have different ideas about what a Grid really is. In this article we report on a survey in which we invited many people in the field of Grid computing to give us their current understanding.

“Computational Grids are the equivalent to the electrical power Grid” [1]

“With Web Services we allow a thousand flowers to bloom. With a Grid we organize the planting and growth of a crop of plants to make harvesting easier.” [MA]

1 Introduction

The ideas of Grid computing have been around for much longer than the famous book [1] by Ian Foster and Carl Kesselman. However, the launch of the book started a new era in computing and created an entirely new research field: Grid computing. The original ideas and definitions compare a computing Grid with the electric power Grid [1]. The name is even reflected in European companies such as the Austrian Power Grid, swissgrid, etc., which operate national electricity Grids. In addition to this original vision, Ian Foster gave the following checklist [2], which was widely accepted:

1) coordinates resources that are not subject to centralized control …
2) … using standard, open, general-purpose protocols and interfaces …
3) … to deliver nontrivial qualities of service
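The three-point checklist above can be read as a simple conjunction of conditions. The following is a minimal sketch of that reading in Python; the class `SystemProfile`, its field names, and the functions `unmet_criteria` and `is_grid` are hypothetical names introduced only for illustration and do not come from [2].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of a system; field names are illustrative."""
    decentralized_control: bool    # 1) resources not subject to centralized control
    open_standard_protocols: bool  # 2) standard, open, general-purpose protocols
    nontrivial_qos: bool           # 3) delivers nontrivial qualities of service


def unmet_criteria(profile: SystemProfile) -> list[str]:
    """Return the checklist points the system fails (empty list: all three met)."""
    checks = {
        "decentralized control": profile.decentralized_control,
        "standard, open protocols": profile.open_standard_protocols,
        "nontrivial qualities of service": profile.nontrivial_qos,
    }
    return [name for name, ok in checks.items() if not ok]


def is_grid(profile: SystemProfile) -> bool:
    """A system passes the checklist only if no point is unmet."""
    return not unmet_criteria(profile)


# Assumed example: a local cluster managed by one central scheduler.
local_cluster = SystemProfile(
    decentralized_control=False,
    open_standard_protocols=True,
    nontrivial_qos=True,
)
print(is_grid(local_cluster), unmet_criteria(local_cluster))
```

Under this reading, a centrally managed local cluster fails the first point regardless of the middleware installed on it, which is one of the disputed cases discussed in this article.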

More recently, Ian Foster and Steve Tuecke gave a clear description of what they mean by Grid and service-oriented architecture [3]. The article also gives clear definitions for utility computing and on-demand computing and the differences to Grid computing. Grid definitions from other authors can be found in [8, 9, 10].

In general, computer science does not always have definitions as strict as those in physics or mathematics. As a result, many Grid researchers and people working with Grid technology have different views on what a Grid is. The most common discrepancies concern the hardware: for some, a local cluster with a middleware system on top is a Grid, whereas others believe that a wide-area network connection has to be involved. Other main discrepancies are on the software side: what actually makes a piece of software “Grid software”? Is any kind of middleware using Grid security already Grid software? Most of us have had similar discussions in the past, which often did not reach a full conclusion.

Due to the recent changes in Web and Grid service technologies, it is often not clear any more where to draw the border between Web services and Grid services [4]. We are particularly interested in the current view of Grid researchers and therefore conducted a survey in early 2006 to invite people to express their views. The following article reports on opinions collected from many researchers in the world-wide Grid community and tries to focus on the basic characteristics. Having an idea about the current view one can get an impression of how computer scientists in the Grid domain perceive Grid concepts and how they can be applied to other science domains.

2 Background on the Survey

In spring 2006 we started a survey in which we contacted more than 170 Grid researchers all over the globe and asked for their current views on how they define the Grid. The guiding criterion for the survey was not to influence the answers, i.e. we refrained from giving questions or definitions to either agree or disagree with. The people contacted should have the maximum freedom in their definitions. The main guideline was the following:

Try to define what are the important aspects that build a Grid, what is distinctive, and where are the borders to distributed computing, Internet computing etc.

Additionally, people were asked to give precise answers of at most half a page to one page. More than 40 people responded to this call, and a distilled summary is given in this article. We are aware that the freedom we gave to respondents makes it difficult to summarize all the opinions obtained. However, it also reflects reality since many researchers have different views. Given the pool of answers, we classify the responses according to a few categories that are characteristic of computational Grids.

3 Survey Results

One of the main interests of the survey was to find out if people have more or less a common understanding of a Grid or if there are many conflicting opinions. The result is of course biased in the sense that we mainly asked researchers working actively in the field. This has the advantage that we get a more condensed view of what the community thinks rather than the general public.

There is also no obvious way to evaluate the received answers. We used the following approach: in a first pass we highlighted all the main keywords that were used to describe the Grid. Not surprisingly, many similar words and phrases were used to describe the vision as well as the main characteristics. Therefore, in a second pass a classification method was used to categorize the answers according to:

1) Grid Vision
2) Differences with respect to other computing domains such as distributed computing, Internet and Web computing
3) Grid Characteristics

In the following subsections we describe the results of the survey based on the answers we received. Often we cite people directly, identified by their initials. For details on the person represented by each set of initials, refer to Section 5. Sometimes people agree with certain definitions by GGF [6] or CoreGRID [11]. In this case, no direct citations are used.

In more detail, several people describe parts of the Grid, its characteristics, and what is different with respect to traditional approaches. There are many overlaps and hardly any contradictions. We consider this a main message of the paper: the Grid community surveyed here is actually rather coherent in what it understands by a Grid.

3.1 The Grid Vision

3.1.1 Overview

The overall vision that was given in [1] has not changed but a few more additions were given such as the ones below. For instance, “in the Grid vision there is a distinction between (a) the Grid approach, or paradigm, that represents a general concept and idea to promote a vision for sophisticated international scientific and business-oriented collaborations and (b) the physical instantiation of a production Grid based on available resources and services to enable the vision for sophisticated international scientific and business-oriented collaborations” [GvL].

A Grid infrastructure must provide a set of technical capabilities, as follows [3]:

“Resource modeling. Describes available resources, their capabilities, and the relationships between them to facilitate discovery, provisioning, and quality of service management.

Monitoring and notification. Provides visibility into the state of resources—and notifies applications and infrastructure management services of changes in state—to enable discovery and maintain quality of service. Logging of significant events and state transitions is also needed to support accounting and auditing functions.

Allocation. Assures quality of service across an entire set of resources for the lifetime of their use by an application. This is enabled by negotiating the required level(s) of service and ensuring the availability of appropriate resources through some form of reservation—essentially, the dynamic creation of a service-level agreement.

Provisioning, life-cycle management, and decommissioning. Enables an allocated resource to be configured automatically for application use, manages the resource for the duration of the task at hand, and restores the resource to its original state for future use.

Accounting and auditing. Tracks the usage of shared resources and provides mechanisms for transferring cost among user communities and for charging for resource use by applications and users.”

In addition, security is an important aspect [GA].


“We can consider the grid as the combination of distributed, high-throughput and collaborative systems for the effective sharing and distributed coordination of resources which belong to different control domains” [MP]. Generally, a Grid provides a “distributed computing power infrastructure. It is supposed to provide researchers (users) with a single entry point to launch jobs” [LF]. Simply put, Grid means "distributed computing across multiple administrative domains" [DS]. Sometimes the Grid is also described as the “software environment” [GA] that integrates, virtualizes, and manages distributed resources (software and hardware). Another view is that a Grid is “a very large scale resource management system” [AD].

It is also important to point out that the Grid paradigm can be built with different technologies, which generally also means that there is no such thing as a ‘typical Grid technology’. “Web services are merely a mechanism (out of many possible mechanisms) that can be used to build a schedulable grid” [AH]. This is further stressed by the statement that “Grid services are the current technological approach for the deployment of Grid infrastructures. However, it is very important to notice that Grid services are the ‘current approach’ since other technologies not related to services could be employed in order to build Grid infrastructures (e.g. software components)” [MB]. “Therefore, a multiplicity of technologies is desirable and may need to be employed concurrently in a heterogeneous Grid” [JM]. Consequently, other technologies that are not Web service based can and will be used to build Grids.

Others consider the Grid “more a concept or movement rather than a system” [ML] which brings people together. It is an enabling factor, much “more sociological or cultural rather than technical” [ML]. Along the same line is the following opinion “I think one should completely dissociates the Grid definition which is rather a concept to be defined from a user’s point of view, from technical implementation of the architecture, protocols, services and technology. So definitely, defining the Grid, would rather be to define a set of features. If we observe a given unidentified system that can achieve these features, then this system can be defined as a Grid” [JS].

To conclude, we also present already commonly agreed definitions by GGF and CoreGRID Network of Excellence since they were suggested in the survey:

GGF [6]: “A system that is concerned with the integration, virtualization, and management of services and resources in a distributed, heterogeneous environment that supports collections of users and resources (virtual organizations) across traditional administrative and organizational domains (real organizations).”

CoreGRID (submitted by [TP] for the CoreGRID executive committee): "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide location independent, pervasive, reliable, secure and efficient access to a coordinated set of services encapsulating and virtualizing resources (computing power, storage, instruments, data, etc.) in order to generate knowledge."

3.1.2 Classification

Often, people try to classify different “Grid types” according to their main functionalities, but this classification is not always agreed upon. However, we try to convey the main ideas. In principle, most people distinguish between pure Computational Grids and the more enhanced Data Grids. However, there are also additional classifications such as [DS]:


1) “Collaboration Grids: These Grids involve multiple organizations (institutions) and individuals, security domains, protocols, discovery mechanisms, etc.” Important aspects are: “
- Widely distributed, virtual organizations (VOs)
- Service level agreements & commercial partnerships
- Business model: increase overall revenue

2) Enterprise Grids: These Grids are in most ways as technically complex as in item 1) above and involve the complete life cycle of service deployment, provision, management, and decommissioning, just like Collaboration Grids. However, the multiple domains are either absent or highly integrated, at least at a political level. These are the production Grids of major data centers. Important aspects are:
- Virtualization of enterprise resources and applications
- Aggregation and centralization of management
- Business model: reduce total cost of ownership

“In the enterprise security and auditing is even of greater importance” [GA].

3) Cluster Grids: Aimed at high performance/throughput computing, these Grids are mostly workload scheduling environments. They tend to be static, rather than dynamic like the above. The services are either generic in nature, e.g. a job submission service, or provide the same service all the time. They do not typically support the whole service life cycle.” [DS]. However, clusters themselves (if not connected to other clusters) are typically not called a Grid.

Another way of categorizing Grids is according to their “geographical distribution, their organizational scope and resource ownership” [GM]: We can then distinguish between cluster Grids, campus Grids, enterprise Grids, and global Grids. “A cluster Grid (also called department Grid) contains resources located at one site within one organization, and belonging to a single owner. A campus Grid differs from a cluster Grid in that its resources belong to multiple owners. Unlike campus Grids, enterprise Grids contain resources located at multiple sites. Finally, global Grids contain resources from multiple organizations” [GM]. Collaboration Grids are sometimes also called “Beyond Firewall Grids” [DT]. An alternative way of naming different Grids is “IntraGrid, ExtraGrid and InterGrid” [DT].
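The categorization by [GM] quoted above is essentially a decision rule over three attributes: site, organization, and owner. A minimal sketch of that rule follows, assuming each resource can be labeled with exactly one site, one organization, and one owner; the `Resource` class and the `classify_grid` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource record; all three labels are assumptions."""
    site: str
    organization: str
    owner: str


def classify_grid(resources: list[Resource]) -> str:
    """Classify a set of resources following the [GM] categories.

    The checks go from the widest scope to the narrowest, so a Grid
    spanning several organizations is 'global' even if it also spans
    several sites and owners.
    """
    if not resources:
        raise ValueError("at least one resource is required")

    organizations = {r.organization for r in resources}
    sites = {r.site for r in resources}
    owners = {r.owner for r in resources}

    if len(organizations) > 1:
        return "global Grid"
    if len(sites) > 1:
        return "enterprise Grid"
    if len(owners) > 1:
        return "campus Grid"
    return "cluster Grid"


# Assumed example: one organization, one site, two owning departments.
campus = [
    Resource(site="main-campus", organization="uni-a", owner="physics"),
    Resource(site="main-campus", organization="uni-a", owner="biology"),
]
print(classify_grid(campus))
```

Ordering the checks from widest to narrowest scope is a design choice of this sketch; the quoted text only states how each category differs from the previous one.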

Clusters and Grids are sometimes used in the same context but a majority of people surveyed makes a clear distinction between Grids and clusters: “The key distinction between clusters and Grids is mainly in the way resources are managed. In case of clusters, the resource allocation is performed by a centralized resource manager and all nodes cooperatively work together as a single unified resource. In case of Grids, each node has its own resource manager and don't aim for providing a single system view" [RB]. Another distinction is that “a Grid is composed by different administrative domains, whose resources are managed by dynamic virtual organizations” [MP].

3.1.3 Hardware vs. Software

The physical instantiation of a Grid relies on hardware and software components. Whereas on the hardware side no particular features are identified (sometimes wide-area network connections are considered to be an important part of a Grid [4]), the software side is more distinct. For instance, “the key distinction between the Grid and other distributed computing is the use of Grid middleware” [DK]. However, the definition of middleware is also not always commonly agreed on so others suggest to “avoid the definitions tied to resource or middleware level” [AM].

3.1.4 Basic Services


At least since the advent of the Open Grid Services Architecture it has become clear that basically any conventional “service” (in the sense of Service Oriented Architecture) can be provided by or via a Grid. However, the most basic ones are the following: resource selection, scheduling, secure execution, data management, data integrity and privacy, authentication, and fault recovery.

3.2 Differences/commonalities with other computing domains

In the late 1990s Grid computing emerged as a new domain in computer science, although standard techniques and protocols are taken from related domains such as the Internet, distributed computing or the database community. Grid computing therefore cannot be discussed in isolation and has many overlaps with “traditional” domains. Nevertheless, a considerable part of our surveyed contributors make clear distinctions between Grid computing, distributed computing and Internet computing. This is also the most controversial part of the entire survey. A short insight is given here.

3.2.1 Grid vs. Distributed Computing

Some people consider the Grid as a “general” [DT], some as a “special” form of distributed computing [4] whereas others think that a distinctive feature is the complexity of the Grid in several ways, characterized by scalability and transparency:

Scalability: “The borders with distributed computing might be defined as the point at which 2-way contexts begin to be replaced by N-way contexts. By context I think we primarily mean security, architecture and programming models” [BC].

Number of organizations involved: “the main differences are the (potential) inter-organizational characteristics and the looser dependence between the participating partners (either services or institutions).” [JP]

Transparency: a Grid should further be “platform agnostic” [SF] and be able to utilize heterogeneous resources (both hardware and software).

However, we can also find opinions that go in the opposite direction such that there is no “line” between distributed and Grid computing, rather “they complement each other and are part of each other” [MT].

3.2.2 Grid vs. Internet (Web)

The Grid community has adopted and enhanced many Internet and Web service technologies. For instance, a Grid service is a Web service with additional features. However, there is no common agreement about where the border is (if it exists at all) between Internet and Grid computing. Some argue: “whereas in the Internet (Web) messages are exchanged between two points, a Grid provides a higher level of abstraction” [DKr]. Furthermore, the vision goes even beyond that and extends the Internet even more: “similar to today's World Wide Web as our global information platform, we are building the World Wide Grid to become our global collaboration platform, connecting computers and storage, applications and data, experiments, instruments, sensors and other digital devices” [WG].

Along the lines of the metaphor used on the first page, [MA] makes the following distinction: “I see web services as a computing subsystem that is independently developed and independently deployed across heterogeneous platforms. They are independently managed services that may be composed by other services, e.g. via a workflow enactment. There need not be any a priori design and implementation consistency among the web services to make such composition easy over and above adherence to WSI-like standards. There is no a priori arrangement to permit distributed WS management. In the case of a Grid the available services are independently developed and independently deployed across heterogeneous platforms. However, the designers of a Grid choose to give up some independence between services; instead services comply with commonly agreed higher standards, implementing virtual homogeneity. This chosen consistency is intended to make it easier to deploy software and services across the Grid and easier to compose services offered by a Grid” [MA]. A Grid typically provides a communication layer that enables services to communicate with each other which leads to a similar argumentation as the one above: “… discriminates Grids from the Web, which is a (large) set of independent servers” [AD].

A statement more commonly agreed to is that a Grid could be seen as an extension of the Internet. “Therefore, same basic rule can be applied in the Grid world – integration of heterogeneous resources can be achieved by using standardized protocols and services. Internet protocols provide a good basis for linking resources. However, a wider set of standards is needed for advanced functionalities, such as job execution, data management, security operations, etc.” [EI]. A representative argument to underline that the Grid extends the Internet: “Internet services are not a different research field but a part of the Grid research” [MT].

3.3 Grid Characteristics

Grids typically have a set of characteristics. The most dominant ones that people generally agree on are the following ones:
- Collaboration
- Aggregation
- Virtualization
- Service orientation
- Heterogeneity
- Decentralized control
- Standardization and interoperability
- Access transparency
- Scalability
- Reconfigurability
- Security

In addition to that we can identify a set of important topics and aspects:
- Application support
- Computing model
- Licensing model
- Procedures and policies
- Auditing

Collaboration

A commonly agreed aspect of a Grid is the sharing of resources in a distributed fashion. Furthermore, it “spans multiple administrative domains seamlessly” [AM]. Some even go as far as to define “collaboration Grids” [GF]. It is furthermore important that the collaboration provides positive synergies among users and service providers. “Done properly, it will result in synergistic, and potentially emergent, advantages that otherwise will remain unreachable” [JM]. Finally, the resources should be “shared in a fair way” [FvL].

Aggregation


A Grid is more than the sum of all parts: “A Grid aggregates many resources and therefore provides an aggregation of the capacity of the individual resources into a higher capacity virtual resource. The capability of individual resources is preserved. As a consequence, from a global standpoint the Grid enables running larger applications faster (aggregation capacity), while from a local standpoint the Grid enables running new applications” [GM]. The aggregation is also used for “improved performance, higher quality of service, better utilization, and easier access to data” [FD]. Finally, resources can (or should) be added dynamically or statically [DE].
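One way to picture the distinction between aggregated capacity and preserved capability is the sketch below. It assumes capacity is a single additive number (for example CPU cores) and capability is a set of feature labels; both modeling choices, and the names `Resource` and `aggregate`, are illustrative assumptions rather than part of the quoted definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource: additive capacity plus a set of capabilities."""
    name: str
    capacity: int                 # e.g. CPU cores; the unit is an assumption
    capabilities: frozenset[str]  # e.g. {"mpi"}; labels are illustrative


def aggregate(resources: list[Resource]) -> tuple[int, frozenset[str]]:
    """Combine resources into one virtual resource.

    Capacity adds up (larger applications can run), while each individual
    capability is kept (applications new to a local site become runnable).
    """
    total_capacity = sum(r.capacity for r in resources)
    all_capabilities = frozenset().union(*(r.capabilities for r in resources))
    return total_capacity, all_capabilities


# Assumed example: two sites with different strengths.
site_a = Resource("site-a", capacity=64, capabilities=frozenset({"mpi"}))
site_b = Resource("site-b", capacity=16, capabilities=frozenset({"large-memory"}))
print(aggregate([site_a, site_b]))
```

In this toy model a user at site-a gains access to a capability that exists only at site-b, which corresponds to the "local standpoint" in the quotation.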

Virtualization

Grid services are often provided with a certain interface that hides the complexity of the underlying resources. This is known as virtualization, which provides an abstract “layer” between clients and resources [GA]. Therefore, a Grid provides the “ability to virtualize the sum of parts into a singular wide-area programming model” [BC]. Virtualization covers both data (flat files, databases etc.) and computing resources [WG]. The list of resources to virtualize can be extended as follows [RM]:

Grid as workflow virtualization – the use of Grid computing services to execute and manage processes across multiple compute platforms

Data Grid as data virtualization – the management of shared collections independently of the remote storage systems where the data is stored

Semantic Grid as information virtualization – the ability to reason on inferred attributes from multiple independent information repositories.

“Virtualization is based on the ability to manage naming conventions, state information, access methods, and remote operations independently of the remote resource. All of the grid environments require” [RM]:

Name space virtualization, logical names for resources, users, files, and metadata that are independent of the name spaces used on the remote resource.

Trust virtualization, the ability to manage authentication and authorization independently of the remote resource.

Constraint virtualization, the ability to manage access controls independently of the remote resource.

Access virtualization, the ability to port an arbitrary access mechanism on top of the Grid middleware. For Data Grids, this is the ability to support access through multiple loadable libraries (Windows, Perl, Python, C), Java, Digital libraries (DSpace, Fedora, OAI-PMH), workflow actors (Kepler), Web browsers, etc.

Network virtualization, the ability to manage transport in the presence of network devices such as firewalls, load levelers, private virtual networks. This typically requires multiple protocols to support client-initiated versus server-initiated I/O, bulk operations versus single-file operations.

Latency management, the ability to minimize the number of messages sent over wide area networks. Examples include execution of procedures at the remote resource when the complexity (ratio of operations to bytes transmitted) is sufficiently small. The standard case is data filtering or sub-setting.

Federation, the ability to interoperate across multiple grid environments. This requires the ability to share logical name spaces, and Shibboleth-style authentication. Grids establish trust mechanisms to allow assertions about the authenticity of an individual to be verified from the “home” Grid.
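The latency management item in the list above describes a concrete decision: run a procedure at the remote resource when its complexity, defined there as the ratio of operations to bytes transmitted, is sufficiently small. A minimal sketch of that decision follows. The function names and the default `threshold` value are assumptions made for illustration; the source does not say what counts as "sufficiently small".

```python
def complexity(operations: float, bytes_transmitted: float) -> float:
    """Ratio of operations to bytes transmitted, as defined in the text."""
    if bytes_transmitted <= 0:
        raise ValueError("bytes_transmitted must be positive")
    return operations / bytes_transmitted


def run_at_remote_resource(
    operations: float,
    bytes_transmitted: float,
    threshold: float = 1.0,  # assumed cut-off; not given in the source
) -> bool:
    """Return True if the procedure should execute where the data lives.

    A low ratio means little computation per byte, so moving the data
    over a wide-area network would dominate the cost. Filtering and
    sub-setting are the standard cases.
    """
    return complexity(operations, bytes_transmitted) < threshold


# Assumed example: filtering 1 GB of data with roughly 0.1 operations per byte.
print(run_at_remote_resource(operations=1e8, bytes_transmitted=1e9))
# Assumed example: a compute-heavy job on a small 1 MB input.
print(run_at_remote_resource(operations=1e12, bytes_transmitted=1e6))
```

In a real Data Grid the threshold would depend on network bandwidth and on the compute power available at the storage resource, neither of which is modeled here.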


Service Orientation

Grids provide services, following the concept of a service-oriented architecture. In the widest sense “all large scale collections of services can be viewed as Grids” [GF].

Heterogeneity

A Grid typically consists of “heterogeneous computing resources” [RM], i.e. there is a variety of different hardware and software components with different performance and latency characteristics.

Decentralized Control

We have seen this characteristic already in Ian Foster's three-point checklist [2], but we list it here again since it was mentioned several times in the survey answers. In other words, “components are under control of multiple entities, i.e. the key difficulties in Grids lay exactly in not having a single "owner" of the whole system” [WC], i.e. the resources are “under different ownerships” [JM]. “One of the requirements of a Grid is the use of distributed control mechanisms” [MP].

Standardization and Interoperability

A Grid “promotes standard interface definitions for services that need to inter-operate to create a general distributed infrastructure to fulfill users’ tasks and provide user level utilities” [FD]. “Grids systems that implement one standard must interoperate with Grids that adhere to the same standard” [DE].

“Grid is exposing the need for increased levels of integration of distinct technologies and for increased agreements in the standardization of services. The success of the implementation of the Grid very much depends on these aspects” [JC]. Furthermore, the Grid should provide uniform access to heterogeneous resources through virtualization [GM].

An even stronger statement on standards and interoperability is the following one: “Any Grid not based on standards is wasteful. If you consider what I said years ago that "Grid will do for services what the Web does for data" then if the Grids and their services are not interoperable it just doesn't work. The rule in Grids for us in HP is 'ruthless standardization.'" [GA]

Access Transparency

The Grid “should allow its users to access the computing infrastructure without having to be intimately aware of the underlying architecture or network topology” [SF]. This is sometimes considered “the most distinctive aspect of Grid Computing, that is, the levels of transparency provided for the end-user, through the virtualization of resources” [JC].

Scalability

Even if Grid implementations and infrastructures sometimes do not solve a “new problem”, it is often the scale of data, resources and users that contributes to the additional complexity of a Grid. This is also expressed by the fact that a Grid should be “non-trivial in the sense of what a user was not able to solve earlier” [SV].

Reconfigurability


A Grid should be “dynamically reconfigurable” as it is specified in the definition from CoreGRID.

Security

Secure access to resources is an essential feature of a Grid. Therefore, “authorized users and applications have a limited number of operations (even none at all)” [JM]. Grid security is one of the first things that real Grid users have to deal with and is therefore essential for any Grid software system that spans multiple administrative domains.

Application Support

In general, a Grid might support a large variety of different applications. “Applications should also be part of the Grid and the whole Grid environment (where for environment I mean the hardware, middleware, and applications) should be data-driven. In particular, it should be able to react to changes of the system and application behaviors captured by application and system data” [MT].

Computational Model

In general, a Grid supports “several computational models (e.g., batch, interactive, distributed and parallel computing...)” [AD].

Licensing Model

Since Grids originate from the academic community, there is a “global emphasis on open source software” [FK], which is also followed by several companies that are involved in Grid development.

Procedures and Policies

Grid users and service providers interact with each other much as participants in an open market do, where certain rules have to be followed. Therefore, “procedure and policies” [FG] need to be in place to allow for (coordinated) sharing of resources.

3.4 Discussion

Although there is enthusiasm in the Grid community, not everyone believes that today’s Grid implementations satisfactorily achieve the high goals defined in the overall Grid vision. This was also partly evident in the responses we received. This section reflects on the current status of the Grid as well as its relevance to the IT community.

Status and trend

“One of the biggest fears for Grid computing is that it might be seen as today's sexy technology that will quickly get replaced by tomorrow's sexy technology” [SF]. Grid researchers and technologists have to start pointing to results and applications that use the Grid to solve problems, or that enable new applications that would have been unachievable without the Grid. A similar opinion is as follows: “Contemporary Grid implementations are still far from initially described image and from being widely adopted” [EI].

Relevance to wider IT community

In a recent market survey we analyzed mainly the central European IT market and looked at how Grid technologies are, or can be, applied in a business and/or commercial IT environment [6]. The major outcome was that many companies use distributed computing technologies but are not yet ready to adopt a Grid computing model. This raises the question of why the Grid is not yet more widespread in the commercial world. Another question from our survey: “Is there something that mainstream corporate IT can gain from the Grid, or is it just reserved for the boffins running nuclear simulations, protein folding experiments, or whatever? How can an IT manager of a bank or insurance company utilize grid technologies to solve his/her business and technical problems?” [SF].

4 Conclusion

The presented survey is one of the first attempts to get an overall view of what the Grid research community thinks about the definition of a Grid. We previously interviewed a set of companies on their perception of Grid usability in a business environment [7] and found that the opinions of IT leaders in industry are rather diverse with respect to Grid computing. Therefore, it was of major interest to see what the Grid community itself thinks about the topic. An interesting result is that there are hardly any big discrepancies within the research community.

5 Contributors

| Init. | Name | Organization | Country |
|---|---|---|---|
| AD | Andrea Domenici | University of Pisa | Italy |
| AH | Andrew Hanushevsky | Stanford Linear Accelerator Center | USA |
| AM | André Merzky | University of Amsterdam | The Netherlands |
| BC | Brian Coghlan | Trinity College Dublin | Ireland |
| DE | Dietmar Erwin | Research Centre Jülich | Germany |
| DK | Dan Katz | Louisiana State University | USA |
| DKr | Dieter Kranzlmüller | University of Linz / CERN | Austria / Switzerland |
| DL | Domenico Laforenza | CNR Pisa | Italy |
| DS | Dave Snelling | Fujitsu | UK |
| DT | Domenico Talia | University of Calabria | Italy |
| EI | Emir Imamagic | University of Zagreb | Croatia |
| EL | Erwin Laure | CERN | Switzerland |
| FD | Flavia Donno | CERN | Switzerland |
| FG | Fabrizio Gagliardi | Microsoft | Switzerland |
| FK | Fotis Karayannis | GRNet | Greece |
| FvL | Frank van Lingen | California Institute of Technology | USA |
| GA | Greg Astfalk | HP | USA |
| GF | Geoffrey Fox | Indiana University | USA |
| GM | Gabriel Mateescu | National Research Council Canada | Canada |
| GvL | Gregor von Laszewski | Argonne National Lab | USA |
| IF | Ian Foster | Argonne National Lab / U. Chicago | USA |
| JC | Jose Cunha | University of Lisbon | Portugal |
| JM | John Morrison | University College Cork | Ireland |
| JP | Jean-Marc Pierson | INSA Lyon | France |
| JS | Jean Salzemann | CNRS Clermont-Ferrand | France |
| LC | Lorenzo Cerutti | Swiss Institute of Bioinformatics | Switzerland |
| LF | Laurent Falquet | Swiss Institute of Bioinformatics | Switzerland |
| MA | Malcolm Atkinson | National e-Science Centre | UK |
| MB | Miguel Bote-Lorenzo | University of Valladolid | Spain |
| ML | Max Lemke | European Commission | Belgium |
| ML | Miron Livny | University of Wisconsin | USA |
| MP | Maria S. Perez | Technical University of Madrid | Spain |
| MT | Michela Taufer | University of Texas at El Paso | USA |
| RB | Rajkumar Buyya | University of Melbourne | Australia |
| RM | Reagan Moore | San Diego Supercomputer Center | USA |
| RM | Rodrigo Fernandes de Mello | University of São Paulo | Brazil |
| RP | Ron Perrot | Queen's University Belfast | UK |
| SF | Stephen Flinter | Science Foundation Ireland | Ireland |
| SV | Sathish Vadhiyar | Indian Institute of Science | India |
| TP | Thierry Priol | CoreGRID | France |
| WC | Walfredo Cirne | Federal University of Campina Grande | Brazil |
| WG | Wolfgang Gentzsch | D-Grid Initiative | Germany |

Acknowledgements

HS is supported by the EU project EMBRACE Grid which is funded by the European Commission within its FP6 Programme, under the thematic area "Life sciences, genomics and biotechnology for health", contract number LUNG-CT-2004-512092.

References

[1] Ian Foster, Carl Kesselman. The Grid. Blueprint for a new computing infrastructure. Morgan Kaufmann, 1998.

[2] Ian Foster. What is the Grid? A Three Point Checklist. http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf, July 20, 2002.

[3] Ian Foster, Steve Tuecke. Enterprise Distributed Computing, ACM Queue, Vol. 3, No. 6 - July/August 2005.

[4] Heinz Stockinger. Grid Computing: A Critical Discussion on Business Applicability, IEEE DS Online, 7(6), art. no. 0606-o6002, June 2006.

[5] Oracle Grid Index, http://www.oracle.com/corporate/press/2005_apr/emeagridindex2.html, 2005.

[6] http://www.ggf.org/documents/GFD.44.pdf

[7] Erich Schikuta, Flavia Donno, Heinz Stockinger, Elisabeth Vinek, Helmut Wanek, Thomas Weishäupl, Christoph Witzany. Business In the Grid: Project Results, 1st Austrian Grid Symposium, OCG Verlag, Hagenberg, Austria, December 1-2, 2005.

[8] Ian Foster, Carl Kesselman, Steve Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International Journal of Supercomputing Applications, 15(3):200-222, 2001.

[9] Miguel L. Bote-Lorenzo, Yannis A. Dimitriadis, Eduardo Gómez-Sánchez. Grid Characteristics and Uses: A Grid Definition. First European Across Grids Conference, Santiago de Compostela, Spain, February 13-14, 2004.

[10] Andrew Grimshaw. What is a Grid, Grid Today, 1(26), 2002.

[11] CoreGRID Network of Excellence, http://www.coregrid.org, June 2006.