Development & Implementation of an Inter-institutional Multi-purpose Grid: SURAgrid, 11/22/05
UNC-Charlotte: Grid Computing - ITSC 4010-001
• Mary Fran Yafchak, SURA
• Jim Jokl, University of Virginia
• Art Vandenberg, Georgia State University
Southeastern Universities Research Association
Presentation agenda
• About SURAgrid - Mary Fran Yafchak
• SURAgrid build/portal - MF Yafchak
• SURAgrid authN/authZ - Jim Jokl
• SURAgrid applications - Art Vandenberg
• Q&A - All
This is a living, breathing project. Exchange of ideas encouraged!
About SURAgrid
• A “beyond regional” initiative in support of SURA regional strategy
• “Mini-About” SURA:
– SURA region: 16 states & DC; Delaware to Texas
– SURA membership: 62 SE research universities
– SURA mission: Foster excellence in scientific research, strengthen capabilities, provide training opportunities
• Evolved from the NMI Testbed Grid project, part of the NMI Integration Testbed Program
– http://www1.sura.org/3000/NMI-Testbed.html
SURAgrid Goals
• SURAgrid: Organizations collaborating to bring grids to the level of seamless, shared infrastructure
• Goals:
– To develop grid infrastructure that is scalable and that leverages local identity and authorization while managing access to shared resources
– To promote use of this infrastructure for the broad research and education community
– To provide a forum for participants to share experience with grid technology, and participate in collaborative project development
SURAgrid Participants
• University of Alabama at Birmingham*
• University of Alabama in Huntsville*
• University of Arkansas*
• University of Florida*
• George Mason University*
• Georgia State University*
• Great Plains Network
• University of Kentucky*
• University of Louisiana at Lafayette*
• Louisiana State University*
• University of Michigan
• Mississippi Center for SuperComputing Research*
• University of North Carolina, Charlotte
• North Carolina State University*
• Old Dominion University*
• University of South Carolina*
• University of Southern California
• Southeastern Universities Research Association (SURA)**
• Texas A&M University*
• Texas Advanced Computing Center (TACC)*
• Texas Tech University
• Tulane University*
• Vanderbilt University*
• University of Virginia*
Legend: Resources on grid; * SURA member; ** Project planning
Focus Areas
• Authentication & Authorization
– Themes: maintain local autonomy, leverage enterprise infrastructure
• Grid-Building
– Themes: heterogeneity, flexibility, interoperability, scalability
• Application Development
– Themes: immediate benefit to applications, applications drive development
• Project Planning
– Themes: cooperative, representative, sustainable
In the Coming Months…
• Continue evolving key areas
– Grow and solidify grid infrastructure
– Continue expanding and exploring authN/authZ
– Identify & “grid-enable” new applications
• “Formal” work on organizational definition
– Charter, membership, policies, governance
• Develop funding & collaboration opportunities
– Some areas of interest: scalable mechanisms for shared, dynamic access; interoperability in grid products; grid-enabling applications; grids for education; broadening participation; support and management of large-scale grid operations
(Ashok Adiga, Texas Advanced Computing Ctr.)
Building SURAgrid & SURAgrid portal
SURAgrid Software Requirements
• SURAgrid supports dedicated & non-dedicated compute nodes
• Non-dedicated nodes are typically shared across multiple grids
– Could have constraints on the software that can be installed
– Must allow resource owner to set usage policies
• Dedicated nodes run only SURAgrid jobs
– Common software stack being defined for dedicated nodes
– Will consider using packaged Grid solutions
  • Virtual Data Toolkit (VDT)
  • NSF Middleware Initiative (NMI Grids)
Configuring Non-dedicated nodes
• Non-dedicated nodes support basic grid services
– Document simple process to add resources to the grid
– Job & data management
  • Install Globus (pre-web services GRAM & gridftp)
– Authentication
  • Cross sign CA certificates with Bridge CA
  • Work with individual resource owners to get authorized
– Resource monitoring
  • Install GPIR perl provider scripts on resource and add resource description to User Portal
SURAgrid Resource Status
• Number of Compute Clusters: 14
• Total number of CPUs: 611
• Peak GigaFlops: 1,367
• Memory (GigaBytes): 621
• Storage (GigaBytes): 5,645
Motivation for User Portals
• Make joining the SURAgrid easier for users
• Single place for users to find user information and get user support
• Certain information can be displayed better in a web page than in a command shell
• Allow novice users to start using grid resources securely through a Web interface
• Increase productivity of SURAgrid researchers – do more science!
What is a Grid User Portal?
• In general - a gateway to a set of distributed services accessible from a Web browser
• Provides
– Aggregation of different services as a set of Web pages
– Single URL
– Single Sign-On
– Personalization
– Customization
Characteristics of a User Portal
• A User Portal can include the following services:
– Documentation Services
– Notification Services
– User Support Services
  • Allocations
  • Accounts
  • Training
  • Consulting
User Portal Characteristics (cont’d)
– Collaborative Services
  • Calendar
  • Chat
  • Resource sharing
– Information Services
  • Resource
  • Grid-wide
– Interactive Services
  • Manage Jobs & Data
  • Doesn’t replace the command shell but provides a simpler, alternative interface
Service Aggregation
[Diagram: Client Browser, User Portal, and the aggregated services; connection labels: HTTP/SSL; HTTP/SSL/SOAP; GSI]
• Interactive: Job Submission, File Transfer
• Notification: User News
• User Support: Consulting
• Collaborative: Calendar, Chat
• Information: Resource, Grid
• Documentation: User Guides
Portal Built Using GridPort 4
• Developed at TACC & San Diego State
• Interface to grid technologies
– GRAM, GridFTP, MyProxy, WSRF, science applications
• Includes:
– Portal framework-independent “portlets”
  • Expose backend services as customizable web interfaces
  • Small changes allow portlets to run in any JSR-168 compliant portal framework (e.g., uPortal, WebSphere, Jetspeed; installs into Gridsphere by default)
– Portal services
  • Run in the same web container as portlets
  • Provide portlet cohesion and portal framework level support
• Single sign-on to access all grid resources
• Documentation tab has details on:
– Adding resources to the grid
– Setting up user ids and uploading proxy certificates
Information Services
• Resource-level view
– State information about individual resources
  • Queue, Status, Load, OS Version, Uptime, Software, etc.
• Grid-level view
– Grid-wide network performance
– Aggregated capability
• GPIR information Web Service
– Collects and provides information above
Resource Monitoring
http://gridportal.sura.org
Interactive Services
• Security
– Hidden from the user as much as possible
• File Management
– Upload
– Download
– Transfer between resources
• Job Submission to a single resource
• Job Submission to a grid meta-scheduler (future)
• Composite Job Sequencing (future)
Proxy Management
• Upload proxy certificates to MyProxy server
• Portal provides support for selecting a proxy certificate to be used in a user session
File Management
• List directories, move files between grid resources, upload/download files from local machine
Job Management
• Submit jobs for execution on remote grid resources
• Check status of, cancel and delete submitted jobs
Future Directions
• User Portal currently offers basic user, informational and interactive services.
– Build on other services such as user support
• Need to expand services as grid grows
– Resource broker to automatically select resource for job execution
– Workflow support for automation and better utilization of grid resources
– Reliable file transfer services
• Build customized application portlets
Jim Jokl, University of Virginia
SURAgrid authN/authZ
SURAgrid Authentication
• Goal
– Develop a scalable inter-campus solution
• Preferred mechanisms
– Leverage campus middleware activities
  • Researchers should not need to operate their own authentication systems
  • Use local campus credentials inter-institutionally
– Rely on existing higher education inter-institutional authentication efforts
Inter-campus Globus Authentication
• Globus uses PKI credentials for authentication
• Leverage native campus PKI credentials on SURAgrid
– Users do all of their work using local campus PKI credentials
• How do we create the inter-campus trust fabric?
• Standard inter-campus PKI trust mechanisms include
– Operating a single Grid CA or trusting other campus CAs
– Cross-certification and Bridge PKIs
• How well does Globus operate in a bridged PKI?
– OpenSSL PKI in Globus is not bridge-aware
– Known to work from NMI Testbed project
• Decision: intercampus trust based on a PKI Bridge
– Leverage EDUCAUSE Higher Education Bridge CA (HEBCA) when ready
Background: Cross-certification
• Top section
– Traditional hierarchical validation example
• Bottom section
– Validation using cross-certification example
– UVA signed a certificate request from the UAB CA
– UAB signed a certificate request from the UVA CA
– This pair of cross-certificates enables each school to trust certs from the other using only their own root as a trust anchor
– An n² problem
[Diagram certificates; I = issuer, S = subject]
• Hierarchical: I: UVA, S: UVA; I: UVA, S: User-1; I: UAB, S: UAB; I: UAB, S: User-2
• Cross certs: I: UAB, S: UVA; I: UVA, S: UAB
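The cross-certification example on this slide can be read as a path search: start at the relying party's own root and follow issuer-to-subject links until the target subject is reached. The Python sketch below is illustrative only. It models certificates as bare (issuer, subject) pairs using the names from the slide, and it performs no signature, validity-period, or policy checks, all of which a real PKI validation would require.

```python
from collections import deque

# Certificates from the slide, modelled as (issuer, subject) pairs.
# Toy model: no signatures, expiry dates, or policy constraints.
HIERARCHICAL = [
    ("UVA", "UVA"), ("UVA", "User-1"),
    ("UAB", "UAB"), ("UAB", "User-2"),
]
CROSS_CERTS = [("UAB", "UVA"), ("UVA", "UAB")]


def find_chain(certs, trust_anchor, subject):
    """Breadth-first search for a chain trust_anchor -> ... -> subject."""
    issued = {}
    for issuer, subj in certs:
        if issuer != subj:  # self-signed roots add no new links
            issued.setdefault(issuer, []).append(subj)
    queue = deque([[trust_anchor]])
    seen = {trust_anchor}
    while queue:
        path = queue.popleft()
        if path[-1] == subject:
            return path
        for nxt in issued.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Without cross certs, UAB's root cannot reach a UVA user.
no_cross = find_chain(HIERARCHICAL, "UAB", "User-1")
# With the cross-cert pair, a chain exists through the UVA CA.
with_cross = find_chain(HIERARCHICAL + CROSS_CERTS, "UAB", "User-1")
print(no_cross)
print(with_cross)
```

The second call finds the chain UAB, UVA, User-1, which is the "using only their own root as a trust anchor" behaviour described in the bullets.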
Background: Bridged PKI
• Used to enable trust between multiple hierarchical CAs
• Generally more infrastructure than just the cross-certificate pairs
• Typically involves strong policy & practices
• Solves the n² problem
• For SURAgrid we preload cross-certs
[Diagram: Campus A (Mid-A; User A1, User A2), Campus B (Mid-B; User B1, User B1), through Campus n; Bridge CA; cross-certificate pairs]
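A rough count shows why the bridge addresses the n² problem. The sketch below assumes that a full mesh needs one certificate in each direction for every pair of campus CAs, and that a bridge needs one certificate in each direction between each campus CA and the bridge. The function names are illustrative, and the sizes 2 and 20 are arbitrary; 8 matches the number of sites reported on the authentication status slide.

```python
def mesh_cross_certs(n: int) -> int:
    """Certificates needed when every CA cross-certifies with every other CA."""
    return n * (n - 1)  # one certificate in each direction per pair of CAs


def bridge_cross_certs(n: int) -> int:
    """Certificates needed when every CA cross-certifies only with a bridge."""
    return 2 * n  # one certificate in each direction per campus CA


for n in (2, 8, 20):
    print(n, mesh_cross_certs(n), bridge_cross_certs(n))
```

For two campuses the bridge is actually more work than a direct pair; the saving appears as the number of sites grows, since the mesh count grows quadratically and the bridge count linearly.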
SURAgrid Authentication Schematic
[Diagram: Campus A Grid through Campus F Grid, each with its own PKI (A’s PKI through F’s PKI), connected to the SURAgrid Bridge CA by cross-cert pairs]
SURAgrid Authentication Status
• SURAgrid Bridge CA
– Off-line system
– Used Linux and OpenSSL to build bridge
• Cross-certifications with the bridge complete or in progress for 8 SURAgrid sites
• Several more planned in near future
• SURAgrid Bridge Web Site
• Interesting PKI issues discussed in paper
Higher Education Bridge Certification Authority (HEBCA)
• A project of EDUCAUSE
– Implement a bridge for higher education based on the Federal PKI bridge model
– Support both campus PKIs and sector hierarchical PKIs
– Cross-certify with the Federal bridge (and others as appropriate)
• Should form an excellent permanent trust fabric for a bridge-based Grid
Model SURAgrid Authentication
[Diagram: Campus A Grid through Campus F Grid, each with its own PKI (A’s PKI through F’s PKI), connected to HEBCA by cross-cert pairs]
Bridge to Bridge Context
• A federal view on how the inter-bridge environment is likely to develop
• FBCA – Federal Bridge
• SAFE – Pharmaceutical
• HEBCA – Higher Ed
• Commercial – aerospace and defense
• Grid extensible across PKI bridges?
[Diagram: FBCA, HEBCA, SAFE, Commercial, Others]
SURAgrid AuthN/AuthZ Status
• Bridge CA and cross-certification process
– Forms the basic AuthN infrastructure
– Builds a trust fabric that enables each site to trust the certificates issued by the other sites
• The grid-mapfile
– Controls the basic (binary) AuthZ process
– Sites add certificate Subject DNs from remote sites to their grid-mapfile based on email from SURAgrid sites
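The manual step described here, adding a remote user's Subject DN to a local grid-mapfile, can be sketched in a few lines of Python. The sketch assumes the usual grid-mapfile layout of one double-quoted Subject DN followed by a comma-separated list of local account names per line. The DNs and account names are hypothetical, and the helper names `parse_gridmap` and `add_entry` are illustrative rather than part of Globus.

```python
import shlex


def parse_gridmap(text):
    """Return {subject DN: [local account names]} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The DN is double-quoted because it may contain spaces.
        dn, accounts = shlex.split(line)
        mapping[dn] = accounts.split(",")
    return mapping


def add_entry(text, dn, account):
    """Append a DN-to-account mapping unless the DN is already present."""
    if dn in parse_gridmap(text):
        return text
    if text and not text.endswith("\n"):
        text += "\n"
    return text + f'"{dn}" {account}\n'


# Hypothetical entries for illustration only.
existing = '"/C=US/O=Example University/CN=Jane Doe" jdoe\n'
updated = add_entry(existing, "/C=US/O=Sample College/CN=Sam Roe", "sc_sroe")
print(updated, end="")
```

Because the mapping is binary (a DN is either listed or not), calling `add_entry` a second time with the same DN leaves the file unchanged.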
SURAgrid AuthZ Development
• Grid-mapfile automation
– Sites that use a recent version of Globus will use an LDAP callout that replaces the grid-mapfile
– For other sites there will be some software that provides and updates a grid-mapfile for their gatekeeper
SURAgrid AuthZ Development
• LDAP AuthZ Directory
– Web interface for site administrators to add and remove their SURAgrid users
– Directory holds and coordinates
  • Certificate Subject DN
  • Unix login name (prefixed by school initials)
  • Allocated Unix UID (high numbers)
  • Some Unix GIDs? (high numbers)
  • Perhaps SSH public key, perhaps gsissh only
  • Other (tbd)
– Reliability
  • Replication to sites that want local copies
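To make the directory contents concrete, the following Python sketch models the fields listed above with an in-memory stand-in for the LDAP directory and renders the current users as grid-mapfile text. Everything specific here is an assumption for illustration: the UID base of 60000, the underscore between school initials and campus login, the example DNs, and the class and method names. It does not reflect the actual SURAgrid schema.

```python
from dataclasses import dataclass

UID_BASE = 60000  # assumption: start of the "high numbers" UID range


@dataclass(frozen=True)
class GridUser:
    subject_dn: str  # certificate Subject DN
    login: str       # Unix login name, prefixed by school initials
    uid: int         # allocated Unix UID


class AuthZDirectory:
    """In-memory stand-in for the LDAP AuthZ directory (illustrative only)."""

    def __init__(self):
        self._users = {}
        self._next_uid = UID_BASE

    def add_user(self, subject_dn, school, campus_login):
        if subject_dn in self._users:  # adding the same DN twice is a no-op
            return self._users[subject_dn]
        user = GridUser(subject_dn, f"{school}_{campus_login}", self._next_uid)
        self._next_uid += 1  # UIDs are never reused after removal
        self._users[subject_dn] = user
        return user

    def remove_user(self, subject_dn):
        self._users.pop(subject_dn, None)

    def gridmap(self):
        """Render the current users as grid-mapfile text."""
        return "".join(f'"{u.subject_dn}" {u.login}\n' for u in self._users.values())


directory = AuthZDirectory()
directory.add_user("/C=US/O=Example University/CN=Jane Doe", "eu", "jdoe")
directory.add_user("/C=US/O=Sample College/CN=Sam Roe", "sc", "sroe")
directory.remove_user("/C=US/O=Example University/CN=Jane Doe")
print(directory.gridmap(), end="")
```

Prefixing the login with school initials keeps names from different campuses from colliding, and allocating UIDs from a single counter keeps them unique across sites.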
SURAgrid AuthZ Development
• Sites contributing non-dedicated resources to SURAgrid greatly complicate the equation
• We will provide a code template for editing grid-mapfiles to manage SURAgrid users
• Publish our LDAP schema
– Sites may query LDAP to implement their own SURAgrid AuthZ/AuthN interface
Likely SURAgrid AuthZ Directions and Research
– User directory or directory access
  • Group management
  • Person attributes
  • VO names
  • Store per-person, per-group allocations
  • Integrate with accounting
  • Local and remote stop-lists
– Resource directory
  • Hold resource usage policies
  • Time of day, classifications, etc.
– Mapping users to resources within resource policy constraints
– We’ll learn a lot more about what is actually required as we work with the early user groups
Art Vandenberg, Georgia State University
Applications on SURAgrid
SURAgrid Applications
• Need applications to inform and drive development
• Want to be of immediate service to real applications
• Believe in grids as infrastructure
– but not “if you build it they will come”…
• Identifying & Fostering Applications
Proposed Application Process
• Continuing survey of applications
– Catalog of Grid Applications; similar agency and partner databases; survey of SURA membership
• Identify target applications
– Regional significance, multi-institutional, intersection with other e-Science
– Illustrating grid benefits
• Test it
– Globus, authN-Z/BridgeCA, compilers, portal… and more
• Implementation options
1) Immediate deployment
2) Demonstration deployment opportunities
3) Combined with proposal development
Catalog of Grid Applications
• http://art11.gsu.edu:8080/grid_cat/index5.jsp
• Researchers of grid, grid potential applications
• Initial intent just to see who's doing what
• Potentially larger resource (collaboration, regional perspective, overall trends)
• 20 sites, 475+ researchers
• Current focus:
– Automated maintenance
– Improved search, browse
Identify an Applications Base
• Build from application activities already underway in SURAgrid
• Integrate with regional strategy (SURA HPC-Grid Initiatives Planning Group)
• Apply additional resources– Seeking additional collaboration, external funding
• Achieve critical mass
• Seek FUNDING
SURAgrid Applications
• SCOOP/ADCIRC (UNC, RENCI, MCNC, SCOOP partners, SURAgrid partners)
• Multiple Genome Alignment (GSU, UAB, UVA)
• ENDYNE (TTU)
• Task Farming (LSU)
• Data Mining on the Grid (UAH)
• BLAST (UAB)
• … and more …
SCOOP/ADCIRC - UNC, RENCI, MCNC, SCOOP Partners, SURAgrid Participants
• SURA program to create infrastructure for distributed Integrated Ocean Observing System (IOOS) in the southeast
– Shared means for acquisition of observational data
– Enables modeling, analysis and delivery of real-time data
• SCOOP will serve as a model for national effort
• http://www1.sura.org/3000/3300_Coastal.html
• SCOOP/ADCIRC: forecast storm surge
1. resource selection (query MDS)
2. build package (application & data)
3. send package to resource (gridftp)
4. run adcirc in mpi mode (globus rsl & qsub)
5. retrieve results from resource (gridftp)
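The five steps can be sketched as a small planning function that only builds command lines and never executes them. This is a hedged illustration, not the SCOOP implementation: the host names, paths, package name, and the `run_bundle.sh` script are hypothetical; the dictionary of free CPUs stands in for an MDS query; and the `globus-url-copy` and `globusrun` invocations follow pre-web-services Globus Toolkit conventions that should be checked against the installed version.

```python
def build_rsl(executable, directory, count):
    """RSL job description for an MPI run (pre-WS GRAM attribute names)."""
    return (f"&(executable={executable})(directory={directory})"
            f"(count={count})(jobtype=mpi)")


def plan_run(resources, package, remote_dir, cpus):
    """Return the command lines for one forecast run, without executing them."""
    # 1. resource selection: a dict of free CPUs stands in for the MDS query
    candidates = [name for name, free in resources.items() if free >= cpus]
    if not candidates:
        raise RuntimeError("no resource has enough free CPUs")
    host = candidates[0]
    remote = f"gsiftp://{host}{remote_dir}/"
    return [
        # 2. build package (application & data); directory name is hypothetical
        ["tar", "czf", f"/tmp/{package}", "adcirc_run"],
        # 3. send package to resource (gridftp)
        ["globus-url-copy", f"file:///tmp/{package}", remote + package],
        # 4. run in MPI mode through GRAM; run_bundle.sh is a hypothetical wrapper
        ["globusrun", "-r", host,
         build_rsl(f"{remote_dir}/run_bundle.sh", remote_dir, cpus)],
        # 5. retrieve results from resource (gridftp)
        ["globus-url-copy", remote + "fort.63", "file:///tmp/fort.63"],
    ]


plan = plan_run(
    {"cluster-a.example.edu": 8, "cluster-b.example.edu": 48},
    package="bundle.tgz", remote_dir="/scratch/demo", cpus=32,
)
for command in plan:
    print(" ".join(command))
```

Keeping planning separate from execution makes it easy to inspect what would be submitted before a valid proxy certificate and real resources are involved.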
SCOOP/ADCIRC…
Left: ADCIRC max water level for 72 hr forecast starting 29 Aug 2005, driven by the “usual, always-available” ETA winds.
Right: ADCIRC max water level over ALL of UFL ensemble wind fields for 72 hr forecast starting 29 Aug 2005, driven by “UFL always-available” ETA winds.
Images credit: Brian O. Blanton, Dept of Marine Sciences, UNC Chapel Hill
SCOOP/ADCIRC Results
SURAgrid U. Kentucky (CCS-UKY, 48 CPU/230 Gflops/48G RAM, 500G Disk)
-rwx------ 1 howard howard 1458444 Sep 14 13:39 adcirc.x
-rwx------ 1 howard howard 12 Sep 14 13:39 adcpost.inp
-rwx------ 1 howard howard 843813 Sep 14 13:39 adcpost.x
-rw------- 1 howard howard 29 Sep 14 13:39 adcprep.inp
-rwx------ 1 howard howard 1150926 Sep 14 13:39 adcprep.x
-rwx------ 1 howard howard 915 Sep 14 13:39 execute_parallel_bundle.sh
-rwx------ 1 howard howard 3042520 Sep 14 13:39 fort.14
-rw------- 1 howard howard 64545 Sep 14 13:39 fort.15
-rw------- 1 howard howard 19804050 Sep 14 13:39 fort.22
-rw-rw-r-- 1 howard howard 1444457 Sep 14 16:17 fort.61
-rw-rw-r-- 1 howard howard 202457 Sep 14 16:17 fort.62
-rw-rw-r-- 1 howard howard 105626297 Sep 14 16:18 fort.63
-rw-rw-r-- 1 howard howard 169753697 Sep 14 16:19 fort.64
-rw------- 1 howard howard 1257568 Sep 14 13:39 fort.68
-rw-rw-r-- 1 howard howard 1326004 Sep 14 13:40 fort.80
-rw------- 1 howard howard 3940266 Sep 14 13:40 metis_graph.txt
-rwx------ 1 howard howard 1802370 Sep 14 13:39 padcirc.x
-rw-rw-r-- 1 howard howard 403 Sep 14 13:39 pbs_sub-howard
-rw-r--r-- 1 howard howard 1028 Sep 14 13:39 pbs_sub-howard.e125698
-rw-r--r-- 1 howard howard 91 Sep 14 13:39 pbs_sub-howard.o125698
drwxrwxr-x 2 howard howard 4096 Sep 14 13:41 PE0000
drwxrwxr-x 2 howard howard 4096 Sep 14 13:41 PE0001
drwxrwxr-x 2 howard howard 4096 Sep 14 13:41 PE0002
drwxrwxr-x 2 howard howard 4096 Sep 14 13:41 PE0003
• Results stored in fort.61 - 64
• Directories created by job: PE0000 - PE0003
SCOOP/ADCIRC - Challenges
1. resource selection (query MDS)
– Expect MDS to be hosted on resource being queried. CCS-UKY actually pointed to NCSA for their MDS; needed to implement MDS on CCS-UKY as well (essentially CCS-UKY part of multiple MDS)
2. build package (application & data)
– Must address incompatibility between GT3 and GT2 style proxies; must use “-old” option to GT3’s grid-proxy-init to get GT2 style proxy which ADCIRC currently expects
3. send package to resource (gridftp)
– Staff availability…
4. run adcirc in mpi mode (globus rsl & qsub)
5. retrieve results from resource (gridftp)
Multiple Genome Alignment - GSU, UAB, U. Virginia, U. Southern CA, TACC
• Demoed March 2005 SURA IT Comm (used BridgeCA)
• SMP cluster → UAB grid → SURAgrid
• Iteratively advance understanding (algorithm, UAB grid, Bridge CA, multiple clusters, SURAgrid portal…)
• USC baseline testing Mar-Jun 2005
• TACC Bandera MPI running, submit to Portal in process
[Chart: Computation time (sec), 0 to 500, versus Number of processors, 0 to 30; series: Single Cluster, Single Clustered Grid, Multi Clustered Grid. Diagram labels: Sequences 1-6, Sequences 7-12; Seq 1-2, Seq 3-4, Seq 5-6]
ENDYNE - Texas Tech
• Run on SURAgrid, September 2005
• Electron Nuclear Dynamics simulations
• Trajectory calculation in quantum phase space
• Using grid enables real-time solutions
Left: Simulation of the H+* + C2H2 reaction, CS END grid
Right: CS wave packets trajectory on X-Z plane predicting reaction
Task Farming - Louisiana State U.
• Demo Nov SC2004; Mar 2005 - SURA IT Comm (BridgeCA)
• Pluggable components to use different technologies
• Application independent = no need to recompile
• Grid enabled, supports task scheduling
• HTTP interface: monitor progress, steer individual TFM
Data Mining - U. Alabama at Huntsville
Linked Environments for Atmospheric Discovery (LEAD)
• NSF Program
• Grid-based cyber-infrastructure
• Real time, on-demand and dynamically-adaptive
• Mesoscale weather research
• Vastly disparate high volume, high bandwidth data
• Tremendous computational demand for models and data assimilation
BLAST - U. Alabama at Birmingham
• Nearing SURAgrid deployment
• Database search application for protein and nucleotide sequences
• Globus: job staging, submission, retrieval
• ncbiBLAST for computation
• Pubcookie initial login, myproxy grid login
• Simplified web interface
• Sequence database pre-staged on nodes
Funded applications…
• EnLIGHTened Computing - MCNC, RENCI, LSU, Cisco, AT&T, SURA & Naval Research Lab
– Funded project to develop advanced toolkits, Grid middleware and underlying optical control plane technologies. Provides awareness of the Grid environment so that applications have dynamic, adaptive and optimized use of networks connecting high-end resources.
• UCoMS Reservoir Simulation via Task Farming - CCT at LSU, Petroleum Department at LSU, CASCS at ULL, and CS Department at SUBR
– Ubiquitous Computing and Monitoring System. DOE funded project addresses key research issues for technical solutions in the areas of wireless networked systems, grid computing, and application software.
– The workflow of a typical reservoir simulation includes geostatistical modeling, reservoir simulation, and result analysis.
Potential apps & grants…
• NSF CRI proposal pending (SURAgrid team)
– Improved 2D gel statistical analysis
  • Dr. Alan Shih, ME, Dr. Sreelatha Meleth, Dept. Med., U. Alabama at Birmingham
– Asynchronous iterative algorithms
  • Dr. Jim Browne, CS, U. Texas at Austin
– Protein structure prediction
  • Dr. Yi Pan, CS, Dr. Robert Harrison, CS/Biol, Georgia State U.
– Configurable grid testbed
  • Dr. Ashok Adiga, Dr. Warren Smith, Texas Advanced Computing Center
– Application discovery and knowledge management
  • Dr. Vijay Vaishnavi, CIS/CS, Art Vandenberg, Georgia State U.
Potential applications…
• Turbomachinery Flow Field Simulation (Dr. Alan Shih, UAB)
• Computational fluid dynamics (Tulane, TACC, UAB…)
More Potential applications…
• Bioportal Phylip (PHYLogeny Inference Package) - RENCI
– Bioportal application that includes a variety of tools for determining the phylogenetic relationship between sets of related nucleic acid and protein sequences.
• GeoScience Grid - George Mason University
– Grid platform for supporting research, development, and operational needs of spatial computing infrastructure focusing on GeoScience interoperability
Applications drive infrastructure
• Contributed nodes
• Defining software stack (evolving)
• Bridge CA
• Portal
• Policy, meta-scheduling
• CHALLENGE: Pragmatic; Managed; Experimental; Production
Challenges
• Essentially: persistent application use
• Meeting broad objectives of SURAgrid in context of size & diversity of SURA
• Busy people, multiple priorities, tight resources
• “Application implementation template”
• Collaboration is key
SURAgrid Summary
• Fulfilling SURA mission to foster excellence in scientific research, strengthen capabilities, provide training opportunities
• Evolving beyond regional initiative
• Growing infrastructure moving to production
• Identifying applications & participants
• Collaborative research activities
– AuthN/Z, Portals, Applications, Metascheduling
– Grid middleware services
– Funding opportunities
Additional questions or comments?
For more information: http://www1.sura.org/SURAgrid.html