
The Astrophysics Simulation Collaboratory Portal: a Framework for Effective Distributed Research

Ruxandra Bondarescu (a,b), Gabrielle Allen (c), Gregory Daues (a), Ian Kelley (c), Michael Russell (c), Edward Seidel (c,a,b), John Shalf (d), Malcolm Tobias (e)

(a) National Center for Supercomputing Applications, Champaign, IL, USA
(b) University of Illinois at Urbana-Champaign, Department of Physics and Astronomy, Champaign, IL, USA
(c) Max Planck Institute for Gravitational Physics, Albert Einstein Institute, Potsdam, Germany
(d) Lawrence Berkeley National Laboratory, Berkeley, CA, USA
(e) Washington University, St. Louis, MO, USA

Abstract

We describe the Astrophysics Simulation Collaboratory (ASC) portal, a collaborative environment in which distributed projects can perform research. The ASC project seeks to provide a web-based problem solving framework for the astrophysics community to harness computational Grids. To facilitate collaboration amongst distributed researchers within a virtual organization, the ASC Portal supplies specialized tools for the management of large-scale numerical simulations and the resources on which they are performed. The ASC Virtual Organization uses the Cactus framework for studying numerical relativity and astrophysics phenomena. We describe the architecture of the ASC Portal and present its components with emphasis on elements related to the Cactus framework.

Key words: Grid Computing, Globus, Cactus, Astrophysics Simulation Collaboratory Portal, GridLab

Email addresses: [email protected] (Ruxandra Bondarescu), [email protected] (Gabrielle Allen), [email protected] (Gregory Daues), [email protected] (Ian Kelley), [email protected] (Michael Russell), [email protected] (Edward Seidel), [email protected] (John Shalf), [email protected] (Malcolm Tobias).

Preprint submitted to Elsevier Science 1 July 2003


1 Introduction

In a variety of scientific disciplines, complex research problems are being investigated by geographically distributed collaborations. Computational grids are emerging as a distributed computing infrastructure that will allow these collaborations to seamlessly access computational resources and perform large-scale numerical simulations. The Cactus framework (www.cactuscode.org) and Computational Toolkit (7), a scientific code developed at the Albert Einstein Institute and used throughout the world, is a step towards providing a common computational environment in the astrophysics community. Cactus was initially designed for solving the equations governing Einstein's Theory of General Relativity, but is presently used in various scientific disciplines such as bioinformatics and atmospheric sciences. Cactus enables research into a large variety of physical problems and is portable across all computer architectures. It has emerged as an important tool for groups of researchers performing large-scale numerical simulations. While numerous scientific groups are benefiting from the features of Cactus, effective use of this code on diverse and distributed computational resources would benefit from a connection between the scientists, their simulations and emerging Grid services (1; 5). Such a connection is provided by the simulation management framework of the Astrophysics Simulation Collaboratory (ASC) portal (www.ascportal.org).

The ASC Portal is intended to deliver a collaborative simulation management framework for generic applications, with the development driven by a particular community of astrophysicists, numerical relativists and computational science researchers that use and develop their codes with Cactus. This community makes up a virtual organization (VO) (4), denoted as the ASC Virtual Organization (ASC VO). End user involvement from this VO has been a crucial factor in providing a truly useful and appropriate working environment through the ASC Portal, both in terms of core functionalities and also in the design of the user interface. The ASC Portal codebase serves as a solid foundation for extensions to the next generation portlet and services frameworks, such as that being developed by the Portal Group of the GridLab project.

An important recent aspect of the ASC development is a collaboration between the ASC and the new European GridLab project (www.gridlab.org). The ASC project involves the Washington University Gravity Group, the University of Illinois at Urbana-Champaign/National Center for Supercomputing Applications, the University of Chicago/Argonne National Laboratory, Rutgers University, and the Albert Einstein Institute in Potsdam, Germany. The GridLab project is a European-funded collaboration involving eight countries in Europe and three institutes in the US along with industrial partners Compaq and Sun. The ASC and GridLab projects both involve components aimed at solving the common problem of managing simulations on various supercomputers across the world; the web portal technology is a key component of the solution to this problem. The GridLab project is extending the ASC technology to a wider community of researchers (10).

The ASC Portal has prototyped and tested many different Grid and collaborative capabilities (11; 15). This paper primarily focuses on those aspects concerned with simulation management that are important to the researchers in the ASC VO. Section 2 describes the basic architecture of the ASC Portal, Section 3 describes how computational resources are managed and accessed, Section 4 introduces the configuration tools that are provided for portal administrators, Section 5 illustrates how simulations are staged and monitored, Section 6 describes the ASC Portal's role at SC2002, and Section 7 outlines future directions and plans following from this work.

2 Portal Architecture

The ASC Portal is an instance of an emerging class of Web Portals, termed Grid Portals, that utilize Grid technologies. The Grid infrastructure with which the portal interacts is constructed with Globus (3) middleware. In this setting, the portal application typically comprises several distinct layers consisting of multiple clients, an application server, and a collection of application services maintained on the portal host. The application server is a central, continually active web application that handles user requests from web-based clients. The ASC application server is written in Java and employs Java Servlets and Java Server Pages technology to deliver content in the form of DHTML pages and Java applets to web browser clients. Furthermore, the ASC Portal can interact with clients such as cellular telephones and other handheld devices.

Deployment of the application server as a web application is accomplished within the open source Tomcat servlet container of the Apache Jakarta Project. To reduce the amount of data the application server must keep in memory, the portal uses a MySQL relational database system for persistent storage of information on users, accounts, resources, code configurations, simulations, etc. The application server is capable of being deployed on most popular platforms. The application services on the portal host connect to the Grid through the Java Commodity Grid (CoG) Toolkit (2). The Java CoG interfaces with services on remote resources through standard Grid protocols, which are a part of the Globus Toolkit. Jobs are submitted to Globus gatekeepers through the Globus Resource Allocation Manager (GRAM) protocol. Mutual authentication between hosts is accomplished via the Grid Security Infrastructure (GSI), and file transfers to GSI-enabled FTP servers are achieved using GridFTP.
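To make the interplay of these components concrete, the fragment below sketches how an application service might stage a file to a remote GSI-enabled FTP server through the Java CoG Kit. It is a minimal illustration under stated assumptions, not the portal's actual code: the host name and file paths are invented, and the org.globus class and method signatures should be checked against the deployed CoG Kit version.

// Sketch: staging a file to a GSI-enabled FTP server with the Java CoG Kit.
// Host names and paths are illustrative only.
import java.io.File;

import org.globus.ftp.GridFTPClient;
import org.globus.gsi.gssapi.ExtendedGSSManager;
import org.ietf.jgss.GSSCredential;
import org.ietf.jgss.GSSManager;

public class StageFile {
    public static void main(String[] args) throws Exception {
        // Load the user's default proxy credential via the GSS-API.
        GSSManager manager = ExtendedGSSManager.getInstance();
        GSSCredential proxy =
            manager.createCredential(GSSCredential.INITIATE_AND_ACCEPT);

        // Connect and mutually authenticate to the GridFTP daemon (GSI).
        GridFTPClient ftp = new GridFTPClient("remote.example.org", 2811);
        ftp.authenticate(proxy);

        // Transfer a parameter file into the user's working directory.
        ftp.put(new File("blackhole.par"), "/home/user/blackhole.par", false);
        ftp.close();
    }
}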


Related portal efforts include projects that use general portal frameworks such as the Grid Portal Development Kit (GPDK) (13) and the SDSC Grid Portal Toolkit (GridPort) (www.gridport.npaci.edu). The GPDK is a Java-based toolkit; projects that have adopted GPDK include the VisPortal (8) developed at LBL that delivers centralized access to visualization interfaces in a collaborative setting. The SDSC GridPort makes use of server-side Perl modules with an HTML/JavaScript client. Applications that build upon GridPort include the UCSD Telescience Portal and the GAMESS Web portal, a tool for the study of quantum chemistry.

3 Basic Portal Features

3.1 Resource Management

The portal currently stores information about computational resources and Globus services within its own persistence layer. To allow current portal users to work with machines with minimal deployment of Grid infrastructure (e.g., a Globus gatekeeper with an interactive jobmanager, along with a GridFTP daemon), this information is entered by portal administrators through portal web interfaces. The ASC Portal does include tools for querying information about remote resources with the Monitoring and Discovery Service (MDS). In our experience, the Grid infrastructure technologies that support information publishing and retrieval mechanisms are currently the most difficult to configure properly for production use because of the lack of uniformity of the batch systems at various sites. Since much of the information that MDS provides is essentially static, we found it useful to cache this information within the ASC Portal database. This provides the additional benefit of enabling developers to more easily relate information about other entities to Grid resources. For example, we can very easily test and record the results of whether a user's credentials authenticate to the Grid services running on the machines made accessible via the portal. This information can be used to personalize a user's view of the Grid.
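As an illustration of this caching idea, the sketch below records whether a user's credentials authenticate to a given resource in a MySQL table via JDBC. The table and column names (resource_access, can_auth) and the connection URL are hypothetical; the paper does not describe the portal's actual schema.

// Sketch: caching a credential test result in the portal database.
// Schema and credentials are hypothetical stand-ins.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ResourceAccessCache {
    public static void recordAuthResult(String user, String host,
                                        boolean ok) throws Exception {
        Connection db = DriverManager.getConnection(
            "jdbc:mysql://localhost/ascportal", "portal", "secret");
        // REPLACE keeps one row per (user, resource) pair up to date.
        PreparedStatement st = db.prepareStatement(
            "REPLACE INTO resource_access (username, hostname, can_auth) " +
            "VALUES (?, ?, ?)");
        st.setString(1, user);
        st.setString(2, host);
        st.setBoolean(3, ok);
        st.executeUpdate();
        db.close();
    }
}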

3.2 Resource Groups

The various platforms encountered in a Grid of machines often require different resource management techniques. In a production environment it becomes practical for users to group resources to which they can authenticate. One can use ASC Portal interfaces to define different working groups of machines. For example, a scientist can have a group of machines on which a certain simulation code is known to run well. Within each given portal session, a particular working group may then be selected. The resource groups provide clean working interfaces, avoiding a large list of possible resources that may be unavailable or unusable to the researcher. This feature also simplifies the compatibility problem that arises when resources belong to different virtual organizations that accept different Grid certificates.

3.3 Authentication to Computational Resources and Grid Services

For secure access to web portal sessions and computational resources, the ASC Portal makes use of Grid Security Infrastructure (GSI) certificates and proxies. These certificates allow a researcher, after an initial authentication, to move from one Grid resource to the next without re-establishing identity, i.e., they provide single sign-on access. Grid certificates are obtained from a trusted Certificate Authority such as the NCSA Alliance or the NPACI partnership. Users first establish their identity to the Certificate Authority, whereupon the organization issues user certificates. From the certificate/key pair the user can generate proxy credentials. A proxy consists of a new certificate/key pair that is signed by the owner of the user certificate rather than by the certificate authority. Mutual authentication to a remote machine is achieved by sending both the user and proxy certificate to the remote party. A chain of trust is established by the Certificate Authority validating the signature of the user certificate. Further delegation of the proxy credential enables authentication to additional Grid resources from the remote machine.

The ASC Portal utilizes an online credential repository and retrieval service, the MyProxy (12) server, to securely access proxy credentials. Portal users delegate a set of proxy credentials to a MyProxy server along with authentication information and retrieval restrictions. This task must be performed only occasionally, e.g., once every seven days, giving a balance between security (the credentials are transient, with a finite lifetime) and convenience. The user logs onto the ASC Portal using the login name and password that protect their stored proxy credentials. The portal contacts the MyProxy server and requests a delegation of the user's credential, supplying the name and pass phrase provided by the user. After verifying these and checking the retrieval restrictions, the MyProxy repository delegates a proxy credential back to the portal to be used for subsequent authentication. The proxy credentials that are obtained for work within the portal are short-lived, lasting for the duration of a typical user session. In some cases, our users require more than one Grid certificate to be able to access resources from different organizations. To support this scenario, the ASC Portal allows users to retrieve multiple credentials per user session. Recently, the MyProxy server was enhanced to enable users to upload and retrieve credentials with one username and password.
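The following sketch shows the retrieval step of this workflow using the Java CoG Kit's MyProxy client class. It is illustrative only: the server name is invented, and the exact org.globus.myproxy method signatures should be verified against the installed CoG Kit.

// Sketch: retrieving a short-lived delegated proxy at portal login.
// Server name is hypothetical; 7512 is the conventional MyProxy port.
import org.globus.myproxy.MyProxy;
import org.ietf.jgss.GSSCredential;

public class SessionLogin {
    public static GSSCredential login(String user, String passphrase)
            throws Exception {
        MyProxy repo = new MyProxy("myproxy.example.org", 7512);
        // Ask for a two-hour delegation, roughly the length of a portal
        // session; the credential stored in the repository lives longer.
        int lifetimeSeconds = 2 * 60 * 60;
        return repo.get(user, passphrase, lifetimeSeconds);
    }
}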


3.4 Job Staging

The ASC portal provides a complete job submission interface that allows users to run a wide variety of applications on remote Grid resources. The interface allows the end user to specify all properties of the job and set the environment required by the application. The submitted jobs may be started interactively or may be queued into a batch system of a remote machine. The flexibility of the interface allows the submission of simple single-processor jobs as well as complex distributed computations that run across multiple supercomputers (using Globus with MPICH-G2). While the application services of the portal presently interact with Globus gatekeepers through the GRAM protocol, in the future the portal will interact with higher-level services such as resource brokers that automate the resource selection process.
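A minimal sketch of such a GRAM submission through the Java CoG Kit follows. The RSL string, gatekeeper contact, queue name, and executable path are invented for illustration; the org.globus.gram API is used as we understand it, not as an excerpt of the portal's actual job staging service.

// Sketch: submitting a queued 32-processor run to a Globus gatekeeper.
// Contact string and RSL attributes are illustrative.
import org.globus.gram.GramJob;
import org.globus.gram.GramJobListener;

public class SubmitCactusRun {
    public static void main(String[] args) throws Exception {
        // RSL describing the executable, its argument, and the resources.
        String rsl = "&(executable=/home/user/cactus_wavetoy)" +
                     "(arguments=wavetoy.par)(count=32)(jobType=mpi)" +
                     "(queue=batch)(maxWallTime=120)";
        GramJob job = new GramJob(rsl);

        // Report status transitions (pending, active, done, failed).
        job.addListener(new GramJobListener() {
            public void statusChanged(GramJob j) {
                System.out.println("Job status: " + j.getStatusAsString());
            }
        });

        // Gatekeeper host, port, and the batch jobmanager to use.
        job.request("remote.example.org:2119/jobmanager-pbs");
    }
}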

4 Portal Administration

The ASC Portal provides web interfaces for administering user accounts, resources, and services that are to be made accessible to end users. Account administration includes the ability to create user accounts, edit user profiles, and set allowed credentials. Administrators specify user access rights within the portal by defining roles and assigning them to users accordingly. By default, the ASC Portal includes an administrative role for users allowed to administer the portal, a user role for typical user rights, and a cactus role for accessing Cactus web interfaces. Resource administration includes the ability to specify resources and their properties as described in the Resource Management section above. Finally, service administration includes the ability to specify the services on remote resources made available from the ASC Portal.

Another aspect of the ASC Portal is its support for automated database upgrading during the build process. Since we use a MySQL database, we are able to modify the production database structure without losing data. Singletons are used for each Java class that represents a persistent object, and each singleton performs SQL queries directly with JDBC. Thus, when developers commit changes to a persistent entity class that alter the underlying data model, they must also commit a SQL script for modifying the database to conform to the new data model. Each script is named by its date. Finally, we include build targets that run the appropriate scripts for modifying the database, based on the natural ordering imposed by the filenames of each script and the last script that was run on the target database. While this scheme requires developers to be careful about code and database changes, it simplifies the administration of the ASC Portal and ensures that production databases conform to the changes in the ASC Portal web application.
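The following sketch illustrates the script-ordering idea just described: upgrade scripts are named by date, sorted, and every script newer than the last one applied is run. The directory layout and the runScript helper are hypothetical stand-ins for the portal's build targets.

// Sketch: replaying date-named SQL upgrade scripts in order.
// Paths and the last-applied marker are illustrative.
import java.io.File;
import java.util.Arrays;

public class DbUpgrade {
    public static void main(String[] args) {
        File[] scripts = new File("sql/upgrades").listFiles();
        if (scripts == null) return;
        Arrays.sort(scripts); // e.g. 2003-05-02.sql sorts before 2003-06-17.sql
        String lastApplied = "2003-05-02.sql"; // read from the database
        for (File script : scripts) {
            // Run only scripts that postdate the last applied one.
            if (script.getName().compareTo(lastApplied) > 0) {
                runScript(script);
            }
        }
    }

    private static void runScript(File script) {
        System.out.println("applying " + script.getName());
        // ... execute each statement in the file over JDBC ...
    }
}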


5 Simulation Management

Performing large-scale numerical simulations with advanced application codes on Grid resources is complicated by various factors. Computational Grids are typically heterogeneous, comprised of supercomputers of various architectures and platforms that have different installed software and present different environments to users. Furthermore, the resources are maintained by diverse virtual organizations. The collaborative group work space provided by the ASC Portal intends to facilitate simulation management in a complex Grid environment. For effective collaboration, ASC members need a uniform simulation management framework containing the core elements: (1) source code management, (2) simulation tracking, and (3) an executable repository.

5.1 Source Code Management

One of the key features of the Cactus framework is its modular structure. Cactus has a central core (called the "Flesh") into which independent application and computational infrastructure modules (called "Thorns") can be interfaced cleanly. The design provides the end user with the flexibility to choose the most appropriate set of thorns to perform their particular application. This feature is important for large, geographically distributed collaborations of physicists and computational scientists with the various members contributing independent elements of the overall simulation code. As improved or additional thorns become available they can be incorporated and leveraged immediately. A production simulation code can consist of many different thorns from diverse sources, and while this greatly enhances the capabilities of the simulation code, it creates a challenge in the area of source code management. The numerous elements of the production simulation code must be gathered from various source code repositories. The management of these elaborate source code configurations becomes a problem for both individual researchers and collaborators within a research group. The ASC Portal provides useful tools and a working environment that facilitate source code management.

An important piece of the portal's source code management environment is its listings of the contents of Concurrent Versions System (CVS) source code repositories. Portal administrators may easily enter the listings of group shared repositories, and individual users may enter the contents of their personal repositories, making them shareable if they so choose. Portal users can perform multiple CVS checkouts to construct a source code configuration on a Grid resource. The contents of the code configurations are recorded by the portal, enabling users to easily move an entire working setup from one machine to another. Members of a collaboration can share their code configurations, saving valuable time and preventing duplication errors. The configurations of standard test problems of a research group may also be stored in the portal. By maintaining the source code work history of a collaboration, the portal provides better organization, extending research capabilities.
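As a sketch of what such a recorded code configuration might look like, the fragment below models one CVS module entry and the checkout command that would reconstruct it on a target machine. The class, the repository path, and the assumption that a branch tag is stored alongside each module are all illustrative; the portal's actual data model is not shown in the paper.

// Sketch: one entry of a recorded code configuration, replayable as a
// standard CVS checkout. Repository root and tag are hypothetical.
public class CodeConfiguration {
    static class CodeModule {
        String cvsRoot;  // e.g. ":pserver:anonymous@cvs.example.org:/cactus"
        String module;   // e.g. "CactusWave/WaveToyC"
        String tag;      // branch or release tag, assumed stored

        CodeModule(String cvsRoot, String module, String tag) {
            this.cvsRoot = cvsRoot;
            this.module = module;
            this.tag = tag;
        }

        // The checkout command to run on the remote resource.
        String checkoutCommand() {
            return "cvs -d " + cvsRoot + " checkout -r " + tag + " " + module;
        }
    }

    public static void main(String[] args) {
        CodeModule m = new CodeModule(
            ":pserver:anonymous@cvs.example.org:/cactus",
            "CactusWave/WaveToyC", "release-4-0");
        System.out.println(m.checkoutCommand());
    }
}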

Future enhancements to the source code handling apparatus will address a variety of issues that occur when developing scientific codes to be run on a Grid. In a typical scenario, development starts by editing a code configuration on a base machine (e.g., a personal laptop). Once the local development is complete, the developer must verify that the code configuration is portable to the various architectures and operating systems that make up the Grid of machines. The portal will provide the service of automatically checking that the code configuration still compiles, executes, and passes basic unit tests successfully on a heterogeneous Grid of machines. The portal will also provide a mechanism for users to share code under development via patches. This functionality will (1) speed the rate at which updates to a configuration can be made and (2) improve the stability and portability of code configurations.

Fig. 1. The simulations list of the ASC Portal is organized by project; in this case the "Black Hole" project is shown. Interactive maps showing the physical location of simulations can be launched.


Fig. 2. Diagram displaying the current range of interactions between the ASC Portal and a running Cactus simulation: (1) the application starts and announces to the portal, (2) notifications are sent to users, (3) users can link to a steering interface and to interactive maps.

5.2 Simulation Tracking

Large-scale simulations are very expensive from a computational viewpoint. The high cost of numerical simulations imposes a limit on the number of large runs a research group can expect to perform. The results obtained are very valuable and are often studied by many different scientists within the community. A large-scale numerical simulation submitted to a remote machine may start only after an hour, a day, or a week. When the simulation starts, the only available information may be that the job is active, and this information is not enough to judge the performance or the quality of results of the simulation. Under these circumstances, following the progress of the simulation is vital, but usually inconvenient or impossible.

Fig. 3. Visualization tools launched by the ASC Portal. The visualization tools cover a wide range of data formats, with xgraph displaying one-dimensional data, gnuplot displaying two-dimensional data, and LCAVision rendering three-dimensional data.

The ASC Portal enables effective monitoring of numerical simulations by listening to messages from the application itself. Scientists who log in to the ASC Portal see a complete list of their own simulations and those of other members of their VO (Figure 1). The portal maintains a group workspace indexed by project name. The history of the VO's simulations on a given project is collected and displayed to all authorized scientists. The ASC Portal displays simulation information such as the machine where the simulation is running, the name of the scientist running the simulation, and the date of the initial report. Periodically, the Cactus simulation updates the portal records by providing the current iteration number and an estimate of the time to completion. Portal users can provide their email address and mobile telephone number, and can choose to receive status messages and updates about their simulations.

Cactus simulations can be further monitored and interacted with. Any active simulation that includes the toolkit thorn HTTPD is provided with its own webserver (Figure 2). This webserver can then be connected to from any remote browser, and provides a full description of the simulation, such as a complete list of initial parameters, the names and locations of the output files, etc. It also provides a steering interface from which authorized users can change parameters in real time or pause and stop the run. The data generated by the simulation can be analyzed using local client tools or remote visualization tools. Figure 3 displays several visualization tools launchable from the ASC Portal.

The present simulation tracking framework has proved to be extremely useful for the users in the ASC VO. Currently, the messaging is a one-way communication from the application code to the portal host. Extensions to this framework will allow two-way communication so that users can control their applications through mobile devices such as cellular phones and PDAs. The communication between the Cactus code and the portal is performed via XML-RPC. The present XML-RPC server will evolve into an independent Grid service, based on the Open Grid Services Architecture (OGSA) (5), that receives messages from and sends messages to any application. The XML-based communication will use GSI security and the SOAP protocol, providing secure messaging between the portal and emerging Grid services on a wide variety of computing platforms and architectures. By changing from a tailored XML-RPC implementation to one that adheres to the OGSA standard, we will gain interoperability with other advanced Grid services.
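The sketch below shows what the portal-side listener for these messages could look like, using the Apache XML-RPC library's reflective WebServer. The paper names the protocol (XML-RPC) but not the library or the handler interface, so the simulation.announce and simulation.update methods here are hypothetical.

// Sketch: a portal-side XML-RPC listener for simulation announcements
// and periodic updates. Handler and method names are hypothetical.
import org.apache.xmlrpc.WebServer;

public class SimulationListener {
    // Public methods become XML-RPC calls, e.g. "simulation.update".
    public static class SimulationHandler {
        public boolean announce(String user, String machine, String project) {
            System.out.println(user + " started a " + project +
                               " run on " + machine);
            return true; // in practice, stored in the portal database
        }

        public boolean update(String simId, int iteration, double hoursLeft) {
            System.out.println(simId + ": iteration " + iteration +
                               ", ~" + hoursLeft + " h to completion");
            return true;
        }
    }

    public static void main(String[] args) {
        WebServer server = new WebServer(8080);
        server.addHandler("simulation", new SimulationHandler());
        server.start(); // Cactus thorns POST XML-RPC requests here
    }
}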

5.3 Executable Repository

An additional goal of the ASC is to enable not only experienced researchers to work with Cactus and the ASC Portal, but also students and scientists who are either casually interested in or just beginning to learn astrophysics and general relativity. Portal users can simply choose a general problem and a target machine. The portal automatically selects the necessary files from a repository, stages them to the remote machine, and starts the simulation. As the simulation is running, the generated output is displayed and standard visualization tools can be launched to view the results. By using the repository, ASC members avoid the repetition of trivial and tedious work such as the gathering and compilation of commonly used source code configurations. While the repository contents are currently added by portal administrators, in the future portal users will be able to upload working binary executables from their production codes. The general framework will allow the addition of executables from any scientific application. The repository contents will be organized by projects and made available to authorized groups of users. The production repository is an effective and organized way of sharing valuable compiled codes within a virtual organization.

6 Proof of Concept: SC2002 Demonstrations

The features of the ASC portal described in this paper were shown in a number of demonstrations at the Supercomputing 2002 High Performance Networking and Computing Conference (SC2002). The portal could access and use many different resources across various testbeds, including the GridLab, ASC, and Global Grid Testbed Collaboration (GGTC) testbeds. The GGTC testbed (6) consisted of 69 different machines in 14 countries on 5 continents. These machines were wildly diverse in size and platform, and were connected to a variety of institutions and virtual organizations. The testbed contained supercomputers with thousands of processors, small Linux clusters, and even a Sony PlayStation in Manchester, England.

Each of the GGTC machines had common Grid (Globus) and application (Cactus) infrastructure installed and deployed. From the ASC Portal a Cactus application could be staged and run on any machine, with the application announcing to the portal's simulation list. The portal also showed the status of a new generic task farming infrastructure developed by the GridLab project using their Grid Application Toolkit. One application running in a task farming mode was motivated by the ASC VO users: a parameter survey of binary black hole coalescence parameters. Black hole simulations are very demanding computationally, typically requiring hundreds of processors. Furthermore, black hole simulations require time-consuming and tedious parameter tuning. These simulations can be expedited by task farming test problems across distributed machines, the results of which automatically steer parameters in much larger production runs. This application demonstrated how Grid computing can simplify, economise, and speed up the use of resources. The Global Grid Testbed Collaboration was awarded two HPC awards at SC2002 for its application use of this large testbed.


7 Future Directions

Grid computing promises many benefits for research communities who make heavy use of a bewildering variety of supercomputing resources, developing and running complex large-scale applications that create terabytes of data which must then be analysed, visualized, and archived. This paper has described some initial advantages that are implemented in the ASC Portal: simplifying the use of resources, the monitoring of simulations, and the coordination of virtual organizations.

Viewing distributed, heterogeneous computational resources, coupled with Grid infrastructure, as a single virtual machine leads the way to new intelligent Grid-aware applications and unprecedented coordination and interaction of communities with their colleagues, data, and simulations. Grid-aware applications will be able to migrate between supercomputers as more appropriate resources are needed, deciding when and if to distribute independent tasks to additional resources. Increased accuracy for large-scale physics simulations will be available by coupling together supercomputers, with communication patterns and computational loads automatically adapting to the available bandwidth. Such dynamic and self-configuring applications will require extensive tracking and logging, such as is beginning to be provided by the ASC Portal.

The end users of Grid technology will be given new opportunities both for cataloguing and understanding their computational work. Right now the ASC Portal collects basic information about the VO's simulations in a persistent database. This interface could very easily be expanded to collect additional information from simulations, including:

• Building a repository of benchmark results for standard simulation modules which could then be used for selecting an appropriate resource.

• Collecting accounting information, such as the number of processors and CPU hours used by the members of the VO on their different resources, to help with budgeting and resource selection.

• Building a database of physics parameters and results. For the ASC VO, such a database might tabulate the type of initial data together with the main computation method used for evolution and the error at a certain point in time.

The ASC Portal has implemented basic notification services, informing users by email or SMS of the status of their simulations. There are many potential extensions which can be added. The Cactus team is planning to implement a generic notification thorn, which will provide the application with the ability to request user notification for significant events. Such notifications could signal the appearance of a black hole horizon or of an unexpected physical feature which should be analysed, or could suggest that a user might want to interact with the simulation to change a physics parameter. The portal can also interact with the machines, checking for situations such as a disk becoming full, or advising users that a queue is empty and could be immediately exploited.

The ASC Portal is being migrated to a Portlet (14) framework, called GridSphere (www.gridsphere.org), being developed by the GridLab project. In this framework, individual components become independent modules that are easily incorporated into other projects. The portlet framework can group portlets into independent web applications, simplifying development within large collaborations. Furthermore, the portlet interface brings enhanced customization, giving each user a personalized display. The migration of ASC components into the GridLab portlet framework will allow many different and diverse communities to benefit from this technology.

8 Conclusion

The potential of Grid computing is just beginning to be realized. In order to reach the next stage in the evolution of Grid computing, research projects like the ASC and GridLab, as well as the organizations that provide computational resources to application scientists, must cooperate more effectively towards building a solid Grid infrastructure and technologies for successfully administering and utilizing that infrastructure. Grid portals will begin to play a more prominent role because of their flexibility for providing user interfaces to Grid services such as GRAM. However, the next stage of Grid computing requires that more research communities start considering how the Grid can be effectively utilized to meet the ever-growing resource needs of today's scientific applications. The ASC has taken part in this process by working closely with the research communities that are using Cactus to help drive the development of tools for utilizing the Grid. More importantly, we hope to make it easier for application scientists to acquire the resources they need and to build better tools for managing their applications on the Grid. We look forward to continuing our research and collaborating with others in this very exciting field.

Acknowledgements

First, we would like to thank the physics users of the ASC VO, in particular Peter Diener, Frank Herrmann, Denis Pollney, and Jian Tao. We would like to recognize Brad and Galina Wind for developing LCAVision, an advanced 3D AMR visualization tool, as part of the ASC project. We also thank Brad for his contributions to the ASC Portal. We gratefully acknowledge the help of the Cactus team, particularly Tom Goodale, Gerd Lanfermann, and Thomas Radke, and our ASC and GridLab colleagues Thomas Dramlitsch, Ian Foster, Sai Iyer, Gregor von Laszewski, Jason Novotny, Miroslav Rudek, Wai-Mo Suen, and Oliver Wehrens. We especially acknowledge Cornel Izbasa, Dana Petcu, Victor Bobea, Dumitru Vulcanov, Cristian Bucovan, Mihai Burada, Catalin Milos, Gabriela Turcu, and other students of West University and Politehnica University of Timisoara, Romania, for portal testing. The ASC project is funded by the NSF (KDI 9979985) with computational support from US NRAC grant 93S025. Important contributions to this work have been made by the EU GridLab (IST-2001-32133) and Astrophysics Network (HPRN-CT-2000-00137) projects, and the German DFN Verein GriKSL project (TK 602 - AN 200). We thank the staff at NCSA, LRZ, AHPCC, SDSC, PSC, and RZG who have provided the necessary software installations and support at the different sites used by the ASC VO. We also acknowledge generous support from Microsoft.

References

[1] G. Allen, E. Seidel, J. Shalf, "Scientific Computing on the Grid", Byte, Spring (2002) 24–32.
[2] Commodity Grid Kits: http://www.globus.org/cog.
[3] I. Foster, C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit", International Journal of Supercomputing Applications 11 (2) (1997) 115–128.
[4] I. Foster, C. Kesselman, S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Intl. J. Supercomputer Applications 15 (3), http://www.globus.org/research/papers/anatomy.pdf.
[5] I. Foster, C. Kesselman, J. Nick, S. Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration", Open Grid Service Infrastructure WG, Global Grid Forum, June 2002.
[6] The Global Grid Testbed Collaboration: http://scb.ics.muni.cz/static/SC2002.
[7] T. Goodale, G. Allen, G. Lanfermann, J. Masso, T. Radke, E. Seidel, J. Shalf, "The Cactus Framework and Toolkit: Design and Applications", in: Vector and Parallel Processing - VECPAR'2002, 5th International Conference, Lecture Notes in Computer Science, Springer, Berlin, 2003.
[8] T. J. Jankun-Kelly, O. Kreylos, J. Shalf, K.-L. Ma, B. Hamann, K. I. Joy, E. W. Bethel, "Deploying Web-based Visual Exploration Tools on the Grid", IEEE Computer Graphics and Applications, March/April 2003, IEEE Computer Society Press, Los Alamitos, California.
[9] "Jetspeed - Jetspeed Overview", http://jakarta.apache.org/jetspeed.
[10] I. Kelley, J. Novotny, M. Russell, O. Wehrens, "ASC Portal Evaluation", Technical Report D4.1, GridLab (2002), http://www.gridlab.org/Resources/Deliverables/D4.1.pdf.
[11] G. von Laszewski, M. Russell, I. Foster, J. Shalf, G. Allen, G. Daues, J. Novotny, E. Seidel, "Community Software Development with the Astrophysics Simulation Collaboratory", Concurrency and Computation: Practice and Experience, http://www.cactuscode.org/Papers/CCE_2001.pdf.
[12] J. Novotny, S. Tuecke, V. Welch, "An Online Credential Repository for the Grid: MyProxy", in: Proceedings of the Tenth IEEE International Symposium on High Performance Distributed Computing (HPDC-10), San Francisco, IEEE Press, 2001, http://www.globus.org/research/papers/myproxy.pdf.
[13] J. Novotny, "The Grid Portal Development Kit", Concurrency and Computation: Practice and Experience 14 (13-15) (2002) 1129–1144.
[14] Portlet Specification: http://www.jcp.org/en/jsr/detail?id=168.
[15] M. Russell, G. Allen, G. Daues, I. Foster, E. Seidel, J. Novotny, J. Shalf, G. von Laszewski, "The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development", Cluster Computing 5 (3), http://www.cactuscode.org/Papers/CCJ_2001.ps.gz.

Ruxandra Bondarescu graduated from the University of Illinois at Urbana-Champaign with a B.S. in physics in 2002. She has worked with the Astrophysics Simulation Collaboratory project for the last two years, and is a collaborator of the Cactus Code development team at the Albert Einstein Institute in Germany and the European GridLab project. She is currently working on various projects including numerical relativity of compact scalars, Grid computing, web portal development, and the integration of community codes with Grid technology. She will begin her Ph.D. study in physics at Cornell University in the fall of 2003.

Gabrielle Allen, who holds a Ph.D. in Astrophysics from Cardiff University, has led the Cactus Code team at the Albert Einstein Institute in Germany for the past 4 years. Before this she spent 10 years as a computational physicist in the field of numerical astrophysics, and has a deep understanding of user requirements for computational software and infrastructure. Allen is a PI for the European GridLab and German DFN GriKSL projects.

Gregory Daues is a research programmer at the National Center for Supercomputing Applications. Daues received his Ph.D. in physics from Washington University in St. Louis. He has led the Astrophysics Simulation Collaboratory efforts at NCSA since 2001. Prior to that, he was a developer of the Simulated Cluster Archive, a web-based archive of numerically simulated galaxy clusters. He has worked in the areas of web portal development, Grid computing, numerical cosmology and astrophysics, and numerical relativity.

Ian Kelley has worked at the Albert Einstein Institute in Germany for the last four years. He is the lead architect for Living Reviews in Relativity, an exclusively web-based peer-reviewed journal publishing reviews of research in all areas of relativity. In his work with the EU-funded GridLab project, he developed the GridTools package, a set of customizable command-line tools for using Grids. Kelley contributed to the development of the Global Grid Testbed Collaboration for SC2002.


Michael Russell is presently the leader of the Computer Science group at the Max Planck Institute for Gravitational Physics (Albert Einstein Institute). He has been the lead developer of the ASC Portal at the University of Chicago. Russell specializes in the design and development of science collaboratories: Grid computing environments that enable researchers to effectively share and coordinate the use of supercomputing resources across institutional, geographical, and political boundaries towards the study of complex scientific problems. Russell is also the project manager for the Grid Portals Work Package and a member of the GridLab Technical Board.

Ed Seidel has led both the Computer Science and the Numerical Relativity Research groups at the Albert Einstein Institute in Germany for the past 7 years, with over 120 publications spanning both disciplines. Seidel holds additional positions as Senior Research Scientist at NCSA and as Adjunct Professor at the University of Illinois. Seidel is a co-chair of the Global Grid Forum Applications Working Group, and is a PI for the NSF-funded Astrophysics Simulation Collaboratory, the European GridLab project, and the German DFN GriKSL project. Seidel has a Ph.D. from Yale University.

John Shalf is a staff scientist at LBNL. He is involved in projects that cover visualization of AMR data, distributed/remote visualization, Grid portal technology, high-performance networking, and computer architecture. He has also been a visiting researcher at the Albert Einstein Institute/Max Planck Institute in Potsdam, Germany, and is a member of the Technical Advisory Board for the EU GridLab project, which seeks to create application-oriented APIs and frameworks for Grid computing.

Malcolm Tobias is the administrator for the Center for Scientific Parallel Computing at Washington University in St. Louis. Tobias continues to work with the gravity group at Washington University (WUGRAV), where he received his Ph.D., and is involved with the KDI Astrophysics Simulation Collaboratory project.
