Page 1: Grid Computing Development At AEI - DFN

CeBIT 13 March 2005

Grid Computing Development At AEI

CeBIT

Talk prepared by Michael Russell

With (many) contributors, including:
Gabrielle Allen
Kelly Davis
Robert Engel
Hartmut Kaiser
Jason Novotny
Thomas Radke
Ed Seidel
Oliver Wehrens
and Jarek Nabrzyski

Albert Einstein Institute

Page 2: Grid Computing Development At AEI - DFN

THE GRID: Dependable, consistent, pervasive access to high-end resources

CACTUS is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multi-dimensional simulations

www.CactusCode.org

Page 3: Grid Computing Development At AEI - DFN

Cactus: Parallel, Collaborative, Modular Application Framework

http://www.CactusCode.org
Open source PSE for scientists and engineers ... USER DRIVEN ... easy parallelism, no new paradigms, flexible, Fortran, legacy codes.
Flesh (ANSI C) provides code infrastructure (parameter, variable, scheduling databases, error handling, APIs, make, parameter parsing)
Thorns (F77/F90/C/C++) are plug-in and swappable modules or collections of subroutines providing both the computational infrastructure and the physical application. Well-defined interface through 3 config files
Everything implemented as a swappable thorn ... use best available infrastructure without changing application thorns.
Collaborative, remote and Grid tools
Computational Toolkit: existing thorns for (Parallel) IO, elliptic, MPI unigrid driver, coordinates, interpolations, and more.
Integrate other common packages and tools, HDF5, PETSc, GrACE ..
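The flesh/thorn split described above can be pictured as a small plug-in registry. The sketch below is illustrative C++ only and is not the Cactus API (the real flesh is ANSI C and thorns are declared through configuration files); `Flesh`, `Thorn`, `register_thorn` and `run_schedule` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a thorn: a named module that fills one role
// (for example "driver" or "application") through a single routine.
struct Thorn {
    std::string name;
    std::string capability;
    std::function<std::string()> routine;
};

// Hypothetical stand-in for the flesh: it holds no physics or infrastructure
// of its own, it only tracks which thorn currently fills each capability.
class Flesh {
public:
    // Registering a second thorn for the same capability swaps the first out.
    void register_thorn(const Thorn& thorn) {
        assert(thorn.routine && "a thorn must provide a routine");
        active_[thorn.capability] = thorn;
    }

    // Run the active thorn for each requested capability, in the given
    // order, and collect what each routine reports.
    std::vector<std::string> run_schedule(const std::vector<std::string>& order) const {
        std::vector<std::string> log;
        for (const auto& capability : order) {
            auto it = active_.find(capability);
            if (it != active_.end()) log.push_back(it->second.routine());
        }
        return log;
    }

private:
    std::map<std::string, Thorn> active_;
};
```

In this sketch, replacing the thorn registered under "driver" changes what the schedule runs without touching the thorn registered under "application", which is the property the slide attributes to swappable thorns.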

Page 4: Grid Computing Development At AEI - DFN

Modularity of Cactus...

[Diagram: the user selects desired functionality and the code is created. Labelled modules around the Cactus Flesh: Application 1, Application 2, Sub-app, Legacy App 2, Symbolic Manip App, AMR (GrACE, etc), Unstructured, MPI layer 3, I/O layer 2, Remote Steer 2, MDS/Remote Spawn, Globus Metacomputing Services, Abstractions.]

Page 5: Grid Computing Development At AEI - DFN

Grid-Enabled Cactus

Cactus and its ancestor codes have been using Grid infrastructure since 1993 ... motivated by simulation requirements ...
Support for Grid computing was part of the design requirements for Cactus 4.0 (experiences with Cactus 3)
Cactus compiles out-of-the-box with Globus [using globus device of MPICH-G(2)]
Design of Cactus means that applications are unaware of the underlying machine/s that the simulation is running on … applications become trivially Grid-enabled
Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture
Involved in lots of ongoing Grid projects ....

Page 6: Grid Computing Development At AEI - DFN

Why Grid Computing?

AEI Numerical Relativity Group has access to high-end resources in over ten centers in Europe/USA
They want:
  Bigger simulations, more simulations and faster throughput
  Intuitive IO at local workstation
  No new systems/techniques to master!!
How to make best use of these resources?
  Provide easier access … no one can remember ten usernames, passwords, batch systems, file systems, … great start!!!
  Combine resources for larger production runs (more resolution badly needed!)
  Dynamic scenarios … automatically use what is available
  Remote/collaborative visualization, steering, monitoring
Many other motivations for Grid computing ...

Page 7: Grid Computing Development At AEI - DFN

Grids, The Main Idea

The idea is to make computational resources (clusters, data servers, applications, scientific instruments, etc.) as readily available as electrical power….
And to provide computational resources transparently to users of differing levels of expertise and application backgrounds.
Computational services should interact to perform specified tasks efficiently, securely and with minimal human intervention…
This is what we’re trying to build… we’re a long way from this vision!

Page 8: Grid Computing Development At AEI - DFN

Grand Picture

[Diagram: simulations launched from the Cactus Portal; Grid-enabled Cactus runs on distributed machines (Origin: NCSA, T3E: Garching); remote steering and monitoring from airport; remote viz in St Louis; remote viz and steering from Berlin; viz of data from previous simulations in SF café. Other labels: DataGrid/DPSS, Downsampling, Globus, http, HDF5, IsoSurfaces.]

Page 9: Grid Computing Development At AEI - DFN

Computing On Demand!

[Diagram: a running simulation moving between sites. Site labels: NCSA, SDSC, RZG, LRZ. Step labels: Go!; Find best resources; Free CPUs!!; Add more resources; Queue time over, find new machine; Clone job with steered parameter; Look for horizon; Found a horizon, try out excision; Calculate/Output Grav. Waves; Calculate/Output Invariants; Archive data.]

Page 10: Grid Computing Development At AEI - DFN

From User’s Point Of View

Page 11: Grid Computing Development At AEI - DFN

Cactus Grid Projects

User Portal (KDI Astrophysics Simulation Collaboratory)
  Efficient, easy, access to resources … interfaces to everything else
Collaborative Working Methods (KDI ASC)
Large Scale Distributed Computing (Globus)
  Only way to get the kind of resolution we really need
Remote Monitoring (TiKSL/GriKSL)
  Direct access to simulation from anywhere
Remote Visualization (Live/Offline) (TiKSL/GriKSL)
  Collaborative analysis during simulations/Viz of large datasets
Remote Steering (TiKSL/GriKSL)
  Live collaborative interaction with simulation (eg IO/Analysis)
Dynamic, Adaptive Scenarios (GridLab/GrADs)
  Simulation adapts to changing Grid environment
Make Grid Computing useable/accessible for application users !!
  GridLab: Grid Application Toolkit

Page 12: Grid Computing Development At AEI - DFN

GridLab Project

Funded by the EU (5+ M€), January 2002 – December 2004
Application and Testbed oriented
  Cactus Code, Triana Workflow, all the other applications that want to be Grid-enabled
Main goal: to develop a Grid Application Toolkit (GAT) and set of grid services and tools...:
  resource management (GRMS),
  data management,
  monitoring,
  adaptive components,
  mobile user support,
  security services,
  portals,
... and test them on a real testbed with real applications

Page 13: Grid Computing Development At AEI - DFN

GridLab Is An Architecture

Page 14: Grid Computing Development At AEI - DFN

And A Global Effort

PSNC (Poznan) - coordination
AEI (Potsdam)
ZIB (Berlin)
Univ. of Lecce
Cardiff University
Vrije Univ. (Amsterdam)
SZTAKI (Budapest)
Masaryk Univ. (Brno)
NTUA (Athens)
Sun Microsystems
Compaq (HP)
ANL (Chicago, I. Foster)
ISI (LA, C. Kesselman)
UoWisconsin (M. Livny)

collaborating with:
Users!
  EU Astrophysics Network,
  DFN TiKSL/GriKSL
  NSF ASC Project
other Grid projects
  Globus, Condor,
  GrADS,
  PROGRESS,
  GriPhyn/iVDGL,
  Most of the other European Grid Projects (GRIDSTART)
  GWEN

Page 15: Grid Computing Development At AEI - DFN

GridLab Testbed Snapshot

Page 16: Grid Computing Development At AEI - DFN

GridLab Goals

Get Computational Scientists using the “Grid” and Grid services for real, everyday, production work (AEI Relativists, EU Network, Grav Wave Data Analysis, Cactus User Community), all the other potential grid apps
Make it easier for applications to make flexible, efficient, robust, use of the resources available to their virtual organizations
Dream up, prototype, and test new application scenarios which make adaptive, dynamic, wild, and futuristic uses of resources.

Page 17: Grid Computing Development At AEI - DFN

What Do Our Users Need?

Application oriented environment
Flexible, easy-to-use, simple interfaces
Efficient and effective use of resources
Robustness, fail-safety, adaptability
The ability to work in distributed teams
Support for mobile working environments

Page 18: Grid Computing Development At AEI - DFN

What Do Our Users Want?

Larger computational resources
  Memory/CPU
Faster throughput
  Cleverer scheduling, configurable scheduling, co-scheduling, exploitation of un-used cycles
Easier use of resources
  Portals, grid application frameworks, information services, mobile devices
Remote interaction with simulations and data
  Notification, steering, visualization, data management
Collaborative tools
  Notification, visualization, video conferencing, portals
Dynamic applications, New scenarios
  Grid application frameworks connecting to services

Page 19: Grid Computing Development At AEI - DFN

Many Application Scenarios!

Dynamic Staging
  move to faster/cheaper/bigger machine
Multiple Universe
  create clone to investigate steered parameter
Automatic Convergence Testing
  from initial data or initiated during simulation
Look Ahead
  spawn off and run coarser resolution to predict likely future
Spawn Independent/Asynchronous Tasks
  send to cheaper machine, main simulation carries on
Application Profiling
  best machine/queue
  choose resolution parameters based on queue
Dynamic Load Balancing
  inhomogeneous loads
  multiple grids
Portal
  User/virtual organisation interface to the grid.
Intelligent Parameter Surveys
  farm out to different machines
Make use of
  Running with management tools such as Condor, Entropia, etc.
  Scripting thorns (management, launching new jobs, etc)
  Dynamic use of eg MDS for finding available resources

Page 20: Grid Computing Development At AEI - DFN

Our Role in GridLab

Development of the Grid Application Toolkit
Development of application scenarios using GAT and GridLab technologies
Numerical relativists are the target user group for GridLab.
Development of GridSphere Portal Framework
Requirements and design for data management tools and visualization services
General support of GridLab services on our production resources

Page 21: Grid Computing Development At AEI - DFN

Grid Application Toolkit

Page 22: Grid Computing Development At AEI - DFN

Grid Application Toolkit

The GAT provides functionality through a carefully constructed set of generic high-level APIs, through which an application will be able to call the underlying grid services,
Set of application developer APIs for Grid tools, services and software libraries, (and example implementations) that support the development of grid-enabled applications (open source!)
Usable from any high level “application” (any generic code, Cactus, Triana, Portals, Scripts, …)

Page 23: Grid Computing Development At AEI - DFN

GAT Goals

The GAT provides an API and an associated set of tools which enable end-users and application developers to make easy and flexible use of the Grid,
The infrastructure, and in particular the GAT, must allow developers to develop their applications independently of the deployment of grid services,
Users must be able to make use of such applications in the absence of a fully-deployed infrastructure.
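One way to read the last two goals is that the application talks to a single call while interchangeable back ends are tried behind it, so a plain local back end can stand in when no grid service is deployed. The following is a minimal sketch of that idea under stated assumptions; `CopyAdaptor`, `CopyEngine`, `add_adaptor` and `copy_file` are hypothetical names and do not describe the actual GAT design.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adaptor: a named back end that tries to copy a file and
// reports whether it succeeded. A real adaptor would wrap a grid service.
struct CopyAdaptor {
    std::string name;
    std::function<bool(const std::string&, const std::string&)> copy;
};

// Hypothetical engine: the application calls copy_file() and does not need
// to know which back end ends up doing the work.
class CopyEngine {
public:
    void add_adaptor(CopyAdaptor adaptor) {
        assert(adaptor.copy && "an adaptor must provide a copy routine");
        adaptors_.push_back(std::move(adaptor));
    }

    // Try the adaptors in registration order. Returns the name of the first
    // one that succeeds, or an empty string if none could do the copy.
    std::string copy_file(const std::string& source, const std::string& target) const {
        for (const auto& adaptor : adaptors_) {
            if (adaptor.copy(source, target)) return adaptor.name;
        }
        return "";
    }

private:
    std::vector<CopyAdaptor> adaptors_;
};
```

Registering a grid-backed adaptor first and a local one last gives the fallback behaviour: the same application code keeps working when the grid-backed adaptor reports failure.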

Page 24: Grid Computing Development At AEI - DFN

The Grid is complex …

[Diagram: an Application asking “Is there a better resource I could be using?” faces many separate pieces. Labels: Monitoring, Resource Management, Information, Security, Data Management, Application Manager, Logging, Notification, Migration, Profiling; GLOBUS, UNICORE, Other Grid Infrastructure?; SOAP, WSDL, Corba, OGSA, Other.]

Page 25: Grid Computing Development At AEI - DFN

…need to make it easier to use

[Diagram: the Application asks “Is there a better resource I could be using?” by calling GAT_FindResource( ); the GAT sits between the Application and The Grid.]
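The question on this slide, whether a better resource is available, comes down to comparing the current machine with candidates reported by some information service. The sketch below shows one such comparison under stated assumptions: the `Resource` fields and the `find_better_resource` function are hypothetical, the selection rule (enough free CPUs and a shorter queue wait) is only an example, and none of this reflects the real signature or behaviour of `GAT_FindResource`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a machine as an information service might
// report it; the fields are illustrative only.
struct Resource {
    std::string name;
    int free_cpus;
    double queue_wait_minutes;
};

// Return the candidate with the shortest queue wait among those that have
// enough free CPUs and a strictly shorter wait than the current resource.
// A null result means "stay where you are".
const Resource* find_better_resource(const Resource& current,
                                     const std::vector<Resource>& candidates,
                                     int cpus_needed) {
    assert(cpus_needed > 0);
    const Resource* best = nullptr;
    for (const auto& candidate : candidates) {
        if (candidate.free_cpus < cpus_needed) continue;
        if (candidate.queue_wait_minutes >= current.queue_wait_minutes) continue;
        if (best == nullptr || candidate.queue_wait_minutes < best->queue_wait_minutes) {
            best = &candidate;
        }
    }
    return best;
}
```

The returned pointer refers into the `candidates` vector, so the caller should use it before that vector is modified.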

Page 26: Grid Computing Development At AEI - DFN

The Same Application …

[Diagram: the same Application on top of the GAT, shown three times. Setting labels: Laptop, Super Computer, The Grid. Callouts: No network! Firewall issues!]

Page 27: Grid Computing Development At AEI - DFN

Why Another Grid-API?

The situation today:
  Grids: everywhere
    Supposedly. At least many projects ☺
  Grid applications: nowhere
    Almost. At least our experience is that this is difficult, GGF APPS group
Why is this?
  Application programmers accept the Grid as a computing paradigm only very slowly.
  Problems: (multifold and often cited - amongst others)
    Interfaces are NOT simple (see next slides. . .)
      Typical Globus code... ahem... ☺
    Different and evolving interfaces to the ’Grid’
      Versions, new services, new implementations, WSDL does not solve all problems at all
    Environment changes in many ways
      Globus, grid members, services, network, applications, ...

Page 28: Grid Computing Development At AEI - DFN

Dynamic Middleware

Globus, Unicore, my_service, your_service, . . .
The same functionality has different interfaces all over the place.
  But you don't want to recompile your app every time, not to speak of recoding...
  WSDL does not mean the end of all problems (see CoG code), but the beginning of new ones... - on application level, WSDL is not trivial enough
Restricting yourself to Globus does not help either: version changes every couple of months (2.4.x, 3.2.y, 4.a.b)
  and gets bug fixes. Changes often are MAJOR - we have seen a number of them over the last couple of years...
The application that runs today will fail tomorrow!
  Right now, it is basically impossible for a programmer to focus on the science, not on IT (i.e. Grid) problems.

Page 29: Grid Computing Development At AEI - DFN

Dynamic Grids

Services (and interfaces) get exchanged (“upgraded”) on a regular basis
  That is related to the point above, but also a social problem!
Institutions (resources, services, applications) join/leave YOUR grid without (much) notice.
  The grid is designed to ease and simplify that kind of fluctuation - it's not a bug, it's a feature!
  But the applications are not able to make use of that feature right now …
The Grid changes AT RUNTIME – services go down, resources get busy/free, disks and storage nodes are empty/full, . . . THINGS CONSTANTLY CHANGE.
  Today's Grid middleware makes it possible to cope with that, but utilizing that in an intelligent way is a major programming effort, and bloats the application with code that needs constant maintenance...
Applications need LOTS of code for handling transient problems.
Most applications share most of these problems, but code reuse is difficult/impossible.
  We can reuse the Globus libraries, right, but isn't every project re-inventing its own abstraction layer for these?
  In our experience/projects: they do!
Aren’t we all re-inventing abstraction layers for this?
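A typical example of the transient-problem code mentioned on this slide is a retry loop with a growing delay, which many applications end up writing for themselves. The sketch below is one generic way to factor it out; the `Outcome` categories, `RetryResult` and `retry_transient` are hypothetical names, and the doubling delay is an assumption for illustration rather than anything GridLab or Globus prescribes.

```cpp
#include <cassert>
#include <functional>

// Outcome of one attempt at a grid operation (illustrative categories).
enum class Outcome { Success, TransientFailure, PermanentFailure };

struct RetryResult {
    Outcome outcome;     // outcome of the last attempt
    int attempts;        // how many times the operation was called
    int total_delay_ms;  // sum of the delays requested between attempts
};

// Call `operation` until it succeeds, fails permanently, or max_attempts is
// used up. The delay doubles after each transient failure. `wait` is passed
// in so that the caller decides how to sleep (and tests need not sleep).
RetryResult retry_transient(const std::function<Outcome()>& operation,
                            int max_attempts, int first_delay_ms,
                            const std::function<void(int)>& wait) {
    assert(max_attempts > 0 && first_delay_ms >= 0);
    RetryResult result{Outcome::TransientFailure, 0, 0};
    int delay_ms = first_delay_ms;
    while (result.attempts < max_attempts) {
        result.outcome = operation();
        ++result.attempts;
        if (result.outcome != Outcome::TransientFailure) break;
        if (result.attempts < max_attempts) {
            wait(delay_ms);
            result.total_delay_ms += delay_ms;
            delay_ms *= 2;
        }
    }
    return result;
}
```

Keeping this logic in one shared helper, rather than around every service call, is the kind of reuse the slide argues applications currently lack.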

Page 30: Grid Computing Development At AEI - DFN

Copy a File: Globus GASS

#include <cstdio>
#include <fcntl.h>
#include <sys/stat.h>
#include "globus_gass_copy.h"

int RemoteFile::GetFile (char const* source_URL, char const* target) {
  globus_url_t source_url;
  globus_io_handle_t dest_io_handle;
  globus_ftp_client_operationattr_t source_ftp_attr;
  globus_result_t result;
  globus_gass_transfer_requestattr_t source_gass_attr;
  globus_gass_copy_attr_t source_gass_copy_attr;
  globus_gass_copy_handle_t gass_copy_handle;
  globus_gass_copy_handleattr_t gass_copy_handleattr;
  globus_ftp_client_handleattr_t ftp_handleattr;
  globus_io_attr_t io_attr;
  int output_file = -1;

  /* parse the source URL and check that its scheme is supported */
  if ( globus_url_parse (source_URL, &source_url) != GLOBUS_SUCCESS ) {
    printf ("can not parse source_URL \"%s\"\n", source_URL);
    return (-1);
  }
  if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_FTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP &&
       source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
    printf ("can not copy from %s - wrong prot\n", source_URL);
    return (-1);
  }

  /* initialise the copy handle and its attributes */
  globus_gass_copy_handleattr_init (&gass_copy_handleattr);
  globus_gass_copy_attr_init (&source_gass_copy_attr);
  globus_ftp_client_handleattr_init (&ftp_handleattr);
  globus_io_fileattr_init (&io_attr);
  globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
  globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr,
                                            &ftp_handleattr);
  globus_gass_copy_handle_init (&gass_copy_handle,
                                &gass_copy_handleattr);

  /* set protocol-specific attributes for the source */
  if (source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
      source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
    globus_ftp_client_operationattr_init (&source_ftp_attr);
    globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                   &source_ftp_attr);
  }
  else {
    globus_gass_transfer_requestattr_init (&source_gass_attr,
                                           source_url.scheme);
    globus_gass_copy_attr_set_gass(&source_gass_copy_attr,
                                   &source_gass_attr);
  }

  /* open the local target file */
  output_file = globus_libc_open ((char*) target,
                                  O_WRONLY | O_TRUNC | O_CREAT,
                                  S_IRUSR | S_IWUSR | S_IRGRP |
                                  S_IWGRP);
  if ( output_file == -1 ) {
    printf ("could not open the file \"%s\"\n", target);
    return (-1);
  }

  /* convert the output file descriptor to a globus_io_handle */
  if ( globus_io_file_posix_convert (output_file, 0,
                                     &dest_io_handle)
       != GLOBUS_SUCCESS) {
    printf ("Error converting the file handle\n");
    return (-1);
  }

  /* start the transfer; my_callback is not shown on the slide */
  result = globus_gass_copy_register_url_to_handle (
               &gass_copy_handle, (char*)source_URL,
               &source_gass_copy_attr, &dest_io_handle,
               my_callback, NULL);
  if ( result != GLOBUS_SUCCESS ) {
    printf ("error: %s\n", globus_object_printable_to_string
                               (globus_error_get (result)));
    return (-1);
  }

  globus_url_destroy (&source_url);
  return (0);
}

Page 31: Grid Computing Development At AEI - DFN

Copy a File: CoG/RFT

package org.globus.ogsa.gui;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.net.URL;
import java.util.Date;
import java.util.Vector;
import javax.xml.rpc.Stub;
import org.apache.axis.message.MessageElement;
import org.apache.axis.utils.XMLUtils;
import org.globus.*;
import org.gridforum.ogsi.*;
import org.gridforum.ogsi.holders.TerminationTimeTypeHolder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RFTClient {
  public static void copy (String source_url, String target_url) {
    try {
      File requestFile = new File (source_url);
      BufferedReader reader = null;
      try {
        reader = new BufferedReader (new FileReader (requestFile));
      } catch (java.io.FileNotFoundException fnfe) { }

      Vector requestData = new Vector ();
      requestData.add (target_url);

      // transferCount is not defined in the slide excerpt
      TransferType[] transfers1 = new TransferType[transferCount];
      RFTOptionsType multirftOptions = new RFTOptionsType ();

      multirftOptions.setBinary (Boolean.valueOf (
          (String)requestData.elementAt (0)).booleanValue ());
      multirftOptions.setBlockSize (Integer.valueOf (
          (String)requestData.elementAt (1)).intValue ());
      multirftOptions.setTcpBufferSize (Integer.valueOf (
          (String)requestData.elementAt (2)).intValue ());
      multirftOptions.setNotpt (Boolean.valueOf (
          (String)requestData.elementAt (3)).booleanValue ());
      multirftOptions.setParallelStreams (Integer.valueOf (
          (String)requestData.elementAt (4)).intValue ());
      multirftOptions.setDcau(Boolean.valueOf(
          (String)requestData.elementAt (5)).booleanValue ());

      int i = 7;
      for (int j = 0; j < transfers1.length; j++)
      {
        transfers1[j] = new TransferType ();
        transfers1[j].setTransferId (j);
        transfers1[j].setSourceUrl ((String)requestData.elementAt (i++));
        transfers1[j].setDestinationUrl ((String)requestData.elementAt (i++));
        transfers1[j].setRftOptions (multirftOptions);
      }

      TransferRequestType transferRequest = new TransferRequestType ();
      transferRequest.setTransferArray (transfers1);

      int concurrency = Integer.valueOf
          ((String)requestData.elementAt(6)).intValue();
      if (concurrency > transfers1.length)
      {
        System.out.println ("Concurrency should be less than the number "
                            + "of transfers in the request");
        System.exit (0);
      }
      transferRequest.setConcurrency (concurrency);

      TransferRequestElement requestElement = new TransferRequestElement ();
      requestElement.setTransferRequest (transferRequest);

      ExtensibilityType extension = new ExtensibilityType ();
      extension = AnyHelper.getExtensibility (requestElement);

      OGSIServiceGridLocator factoryService = new OGSIServiceGridLocator ();
      Factory factory = factoryService.getFactoryPort (new URL (source_url));
      GridServiceFactory gridFactory = new GridServiceFactory (factory);

      LocatorType locator = gridFactory.createService (extension);
      System.out.println ("Created an instance of Multi-RFT");

      MultiFileRFTDefinitionServiceGridLocator loc
          = new MultiFileRFTDefinitionServiceGridLocator();
      RFTPortType rftPort = loc.getMultiFileRFTDefinitionPort (locator);

      ((Stub)rftPort)._setProperty (Constants.AUTHORIZATION,
                                    NoAuthorization.getInstance());
      ((Stub)rftPort)._setProperty (GSIConstants.GSI_MODE,
                                    GSIConstants.GSI_MODE_FULL_DELEG);
      ((Stub)rftPort)._setProperty (Constants.GSI_SEC_CONV,
                                    Constants.SIGNATURE);
      ((Stub)rftPort)._setProperty (Constants.GRIM_POLICY_HANDLER,
                                    new IgnoreProxyPolicyHandler ());

      int requestid = rftPort.start ();
      System.out.println ("Request id: " + requestid);
    }
    catch (Exception e)
    {
      System.err.println (MessageUtils.toString (e));
    }
  }
}

Page 32: Grid Computing Development At AEI - DFN

Copy a File: GAT/C

#include <GAT.h>

GATResult RemoteFile_GetFile (GATContext context,
                              char const* source_url,
                              char const* target_url)
{
  GATStatus status = 0;
  GATLocation source = GATLocation_Create (source_url);
  GATLocation target = GATLocation_Create (target_url);
  GATFile file = GATFile_Create (context, source, 0);

  if (source == 0 || target == 0 || file == 0) {
    return GAT_MEMORYFAILURE;
  }

  if ( GATFile_Copy (file, target, GATFileMode_Overwrite) != GAT_SUCCESS ) {
    GATContext_GetCurrentStatus (context, &status);
    return GATStatus_GetStatusCode (status);
  }

  GATFile_Destroy (&file);
  GATLocation_Destroy (&target);
  GATLocation_Destroy (&source);

  return GATStatus_GetStatusCode (status);
}

Page 33: Grid Computing Development At AEI - DFN

Copy a File: GAT/C++

#include <iostream>
#include <string>
#include <GAT++.hpp>

GAT::Result RemoteFile::GetFile (GAT::Context context,
                                 std::string source_url,
                                 std::string target_url)
{
  try {
    GAT::File file (context, source_url);
    file.Copy (target_url);
  }
  catch (GAT::Exception const &e) {
    std::cerr << "Some error: " << e.what() << std::endl;
    return e.Result();
  }
  return GAT_SUCCESS;
}

Page 34: Grid Computing Development At AEI - DFN

Code Statistics

Code          GASS   CoG   C GAT   C++ GAT
Lines total    100    80      20        15
Init            30    30       5         2
Action          30    30       1         1
Cleanup         20    10       9         2
Language        10     5       5         5

Page 35: Grid Computing Development At AEI - DFN

GridLab & Cactus

Page 36: Grid Computing Development At AEI - DFN

Cactus/GAT Integration

[Diagram: the Cactus Flesh with its thorns (physics and computational infrastructure modules) and build system; a CGAT thorn (Cactus GAT wrappers, additional functionality) sits on the GAT library, which connects to GridLab services.]

Page 37: Grid Computing Development At AEI - DFN

Task Farming on the Grid

TFM implemented in Cactus
GAT (GRAM, GRMS) used for starting remote TFMs
Designed for the Grid
Tasks can be anything
[Diagram labels: one TFM above four further TFMs; fork/exec.]
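The core of a task farm is a manager that hands independent tasks to workers and gathers the results. The sketch below simulates that in a single process under stated assumptions: `Task`, `TaskResult` and `farm_tasks` are hypothetical names, workers are only indices assigned in round-robin order, and nothing here starts remote managers through GAT, GRAM or GRMS as the slide describes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <vector>

// A task is any independent unit of work: here a label plus a callable
// that returns its result as a string.
struct Task {
    std::string label;
    std::function<std::string()> run;
};

struct TaskResult {
    std::string label;
    std::size_t worker;  // index of the (simulated) worker that ran the task
    std::string output;
};

// Hypothetical manager: take tasks from the queue one by one, assign them
// to workers in round-robin order, and collect the results.
std::vector<TaskResult> farm_tasks(std::queue<Task> pending, std::size_t worker_count) {
    assert(worker_count > 0);
    std::vector<TaskResult> results;
    std::size_t next_worker = 0;
    while (!pending.empty()) {
        Task task = pending.front();
        pending.pop();
        results.push_back({task.label, next_worker, task.run()});
        next_worker = (next_worker + 1) % worker_count;
    }
    return results;
}
```

Because each task is self-contained, a failed task can simply be queued again, which is one reason a scenario like this tolerates faults well.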

Page 38: Grid Computing Development At AEI - DFN

Task Farming Motivation

Requested by local physics group
  Parameter surveys, e.g. looking for critical phenomena in gravitational wave collapse by varying amplitude, testing different formalisms of Einstein Equations for evolving same initial data
Scenario is inherently quite robust and fault tolerant
Good migration path to the Grid
  Start easy (not too much Grid!), task farm across local homogeneous workstations and on single supercomputers.
  Use public keys first, then test standard Grid infrastructure
  Use of GAT then means users can start testing GridLab services (should still work for them if services not ready)
  CGAT team can then test real physics runs using wider Grid and GridLab services.

Page 39: Grid Computing Development At AEI - DFN

GridSphere Portal

Page 40: Grid Computing Development At AEI - DFN

What is a Grid Portal?

“A portal is a web based application that commonly provides personalization, single sign on, content aggregation from different sources and hosts the presentation layer of Information Systems” (JSR 168)
Grid Portals build upon the familiar Web portal model, such as Yahoo or Amazon, to deliver the benefits of Grid computing to virtual communities of users, providing a single access point to Grid services and resources.

Page 41: Grid Computing Development At AEI - DFN

Developing Grid Portals

Grid web application development still remains a tedious task with little in the way of reusable components, forcing developers to constantly “re-invent” the wheel.
Often difficult and hard to maintain glue code must be written connecting the portal to Grid services, due to lack of/evolving standards.
Lack of real usability has made it difficult to test and evaluate user interfaces.
A Portal is only as good as the underlying deployed infrastructure…. Portal development often involves debugging underlying middleware

Page 42: Grid Computing Development At AEI - DFN

Early Grid Portal Projects

Grid-Port:
  Perl based framework developed by Mary Thomas and Steve Mock at San Diego Supercomputer Center (SDSC)
Grid Portal Development Toolkit (GPDK):
  Developed by Jason Novotny at Lawrence Berkeley National Laboratory (LBNL)
Astrophysics Simulation Collaboratory (ASC):
  Developed by Michael Russell at University of Chicago

Page 43: Grid Computing Development At AEI - DFN

GridSphere 2.0

Page 44: Grid Computing Development At AEI - DFN

Personalized Environment

Page 45: Grid Computing Development At AEI - DFN

Single Sign-On Capabilities

Page 46: Grid Computing Development At AEI - DFN

Submit Jobs

Page 47: Grid Computing Development At AEI - DFN

Perform File Transfers

Page 48: Grid Computing Development At AEI - DFN

Manage Resources

Page 49: Grid Computing Development At AEI - DFN

Value Added Services

Page 50: Grid Computing Development At AEI - DFN

Data Mgmt And Viz Tools

Page 51: Grid Computing Development At AEI - DFN

Page 52: Grid Computing Development At AEI - DFN

Page 53: Grid Computing Development At AEI - DFN

Page 54: Grid Computing Development At AEI - DFN

Some Achievements…

Many successful demos and awards…
Our software is being adopted by many groups and large-scale projects around the world, including the D-Grid Initiative and HPC Europa!
Our ideas and technologies are becoming topics of research at conferences like the Global Grid Forum.
But we’re only now in the process of putting our technologies to use here at AEI.

Page 55: Grid Computing Development At AEI - DFN

Demo at GGF10!

Prepared a demo at GGF announcing kickoff of the “Deutsche-Grid”
Migrating a test application that was put together by our partners at AEI to help us build solutions tailored to their needs.
Yet the lessons learned here apply to a large class of applications!

Page 56: Grid Computing Development At AEI - DFN

SC2002, Baltimore

Varied applications deployed on the GGTC testbed
  Cactus Black Hole Simulations
  ASC GridLab Portal
  Smith-Waterman
  Nimrod-G
  GridLab Task Farming scenario
  Visapult
Highlights
  GGTC won 2 of the 3 HPC Awards
  Won (with Visapult/LBL group) Bandwidth Challenge
  $2000 prize money to UNICEF children's fund

Page 57: Grid Computing Development At AEI - DFN

Global Grid Testbed Collaboration (GGTC)

Driven by GGF APPS and GridLab testbed and applications
Whole testbed constructed very swiftly (few weeks)
5 continents: North America, Europe, Asia, Africa, Australia
Over 14 countries, including:
  China, Japan, Singapore, S.Korea, Egypt, Australia, Canada, Germany, UK, Netherlands, Czech, Hungary, Poland, USA
About 70 machines, with thousands of processors (~7500)
  Many hardware types, including PS2, IA32, IA64, MIPS, IBM Power, Alpha, Hitachi/PPC, Sparc
  Many OSs, including Linux, Irix, AIX, OSF, Tru64, Solaris, Hitachi
Many different organizations (big centers/individuals)
All ran same Grid infrastructure! (Globus)

Page 58: Grid Computing Development At AEI - DFN

Global Grid Testbed Collaboration Map

Page 59: Grid Computing Development At AEI - DFN

Bandwidth Challenge: Highest Performing Application

Distributed simulations using Cactus, Globus and Visapult
With John Shalf/LBL and others
16.8 Gigabits/second
scinet.supercomp.org/bwc
Six sites: USA/Dutch/Czech/Poland

Page 60: Grid Computing Development At AEI - DFN

Grid-xclock

Simple application for testing and debugging.
xclock is standard X utility, run on any machine with X installed
Requires:
  o xclock binary
  o X libraries
  o To display remotely, need to open outgoing ports from machine it is running on to machine displaying

Page 61: Grid Computing Development At AEI - DFN

Preparing for Production

Now we have some basic tools and APIs available for alpha / beta testing.
While there are many enhancements we have planned for GAT, GridSphere, etc, we are turning our attention to the needs of our own users at AEI and members of the GridLab Virtual Organization.
GridLab technologies are beginning to mature, and this means we can start building real solutions for the physicists at AEI, the reason why we are here in the first place.

Page 62: Grid Computing Development At AEI - DFN

Building a Production Grid

Constructing a Grid that includes HPC resources from LSU-AEI-KISTI.
Going to require that users access this Grid with our software; this encourages both better software design and new ways of thinking about how best to exploit this Grid.
This effort will build upon the technologies and expertise we’ve been developing here at AEI and with our partners in the GridLab Project.

Page 63: Grid Computing Development At AEI - DFN

The Cactus Portal…

Goal is to build a production Grid portal to support the use of Cactus applications on the Grid.
Support for job submission and tracking.
Data management tools.
Higher-level visualization services.
Automated software deployment.
Notification services (e.g. AIM, Email, SMS).
SSH access to resources from portal.
Improved credential management.
And whatever else our users want!

Page 64: Grid Computing Development At AEI - DFN

Conclusion…

We have a long way to go, but we’ve made real progress and this is the year we get to test our work.
We’re looking not just to support the scientists at this institute, but to get input and collaboration from communities around the world, from varying application backgrounds.
Visit www.gridlab.org for more info!