NAREGI Middleware V.1.0
Production-Quality, Industry-Strength Grid Middleware for Petascale Supercomputing Grids in Japan.
Satoshi Matsuoka
Tokyo Institute of Technology
National Institute of Informatics
SC07 Reno
NAREGI Software Stack
• Grid-Enabled Nano-Applications
• Grid PSE
• Grid Workflow Tool
• Grid Visualization
• Data Grid
• Information Service
• Grid Programming Libraries: GridRPC, GridMPI
• Super Scheduler
• Grid VM
• WSRF (NAREGI implementation + Globus 4)
• High-Performance & Secure Grid Networking, Certification
• SuperSINET
• Computing Resources (NII, IMS, Research Organizations, etc.)
Resource and Job Execution Management
1. Services
• Super Scheduler: Meta scheduler
• GridVM: Job manager on resources
• Info Service: Grid resource info and accounting
• Network Services: Network performance measurement & routing control
2. Features
• OGF/W3C/OASIS/DMTF etc. standards-based
• Automatic resource brokering
• Advance reservation based co-allocation (heterogeneous resources)
• Coexistence of non-reserve and reservation jobs (NEW)
• Coexistence of grid and local batch jobs (NEW)
• Bulk job submission (both native and non-native)
• Management tools
• Network measurement and control
• Interoperation with EGEE/gLite (prototype)
• Ease of installation, Server/VO operation packages (NEW)
3. Supporting Standards
• DMTF/CIM
• OASIS/WSRF
• OGF/JSDL
• OGF/OGSA-EMS
• OGF/OGSA-RUS
• OGSA-DAI
The First OGSA-EMS Incarnation
[Figure: OGSA-EMS components and the NAREGI modules. Labels: Information Services, Service Container, Accounting Services, Execution Planning Services, Candidate Set Generator, Reservation, Job Manager, Provisioning (Deployment, Configuration), Application Contents Service, Deployment; NAREGI GridVM, NAREGI Information Service, NAREGI WFT, NAREGI Super Scheduler, NAREGI PSE. Interactions: Query, Update, Submit, Reserve, Register, Discover & Select, with JSDL passed between components.]
• The Open Grid Services Architecture Execution Management Service will be standardized by the OGSA-EMS-WG.
• NAREGI will contribute its middleware as a reference implementation of OGSA-EMS.
Reservation Based Co-Allocation
[Figure: reservation based co-allocation. Labels: Client, Workflow, Abstract JSDL, Super Scheduler, Concrete JSDL, GridVM, Computing Resource, Distributed Information Service, DAI, Resource Query, Resource Info., Accounting, CIM, UR/RUS. Interactions: Reservation, Submission, Query, Control…]
• Co-allocation for heterogeneous architectures and applications
• Used for advanced science applications, huge MPI jobs, real-time visualization on the grid, etc.
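The co-allocation idea above can be sketched in a few lines. This is a minimal illustration, assuming each resource publishes a list of free time windows; it looks for the earliest start time at which every requested resource is free for the whole job. The function name `earliest_common_slot` and the data layout are hypothetical and are not taken from the NAREGI Super Scheduler.

```python
from typing import Dict, List, Optional, Tuple

Window = Tuple[int, int]  # (start, end) of a free period, in minutes


def earliest_common_slot(free: Dict[str, List[Window]], duration: int) -> Optional[int]:
    """Return the earliest start time at which every resource is free
    for `duration` minutes, or None if no such time exists."""
    # Only window start times need to be tried: if a feasible start exists,
    # the earliest one coincides with the start of some free window.
    candidates = sorted({start for windows in free.values() for start, _ in windows})
    for t in candidates:
        if all(
            any(start <= t and t + duration <= end for start, end in windows)
            for windows in free.values()
        ):
            return t
    return None


# Hypothetical free windows for two heterogeneous resources.
free_windows = {"smp": [(0, 60), (120, 300)], "cluster": [(30, 200)]}
print(earliest_common_slot(free_windows, 60))  # 120
```

A real meta scheduler would also have to hold the reservation on each resource and roll back if one of them fails; the sketch only covers the slot search.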
Bulk Job Submission (NEW)
• Support for the batch scheduler's native bulk function
• Accelerates parameter sweep jobs
[Figure: two panels, "Native Bulk Functions on Batch Schedulers" (batch scheduler with bulk) and "Non Native Bulk Functions on Batch Schedulers" (batch scheduler with no bulk). Each panel shows a 1,000-activity job passing through the SS, matchmaking against the IS, and job submission to GridVM. Labels: 1,000 activities job split by n, n splits, ×n queries, ×n submission, ×n sub-jobs, ×1,000 queries, ×1,000 submission, ×1,000 sub-jobs.]
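For the non-native case, where the 1,000-activity job is split by n, the splitting step can be sketched as follows. This is an illustration only: `split_bulk_job` is a hypothetical helper, and the near-equal chunking is an assumption, not the documented NAREGI behavior.

```python
from typing import List


def split_bulk_job(activities: List[str], n: int) -> List[List[str]]:
    """Split a bulk job into at most n sub-jobs of near-equal size."""
    if n <= 0:
        raise ValueError("n must be positive")
    n = min(n, len(activities)) or 1  # never create more sub-jobs than activities
    base, extra = divmod(len(activities), n)
    sub_jobs, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        sub_jobs.append(activities[start:start + size])
        start += size
    return sub_jobs


sweep = [f"param-{i}" for i in range(1000)]
print([len(s) for s in split_bulk_job(sweep, 3)])  # [334, 333, 333]
```

With this shape, the number of scheduler queries and submissions scales with n rather than with the number of activities.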
Data Grid Environment
1. Features
• File registration and distribution in the grid environment
• Shared file system in the grid environment, accessible using global names
• Import/export of files to/from the shared file system
• File staging to the computing environment in cooperation with the workflow tool
• Interoperation with EGEE/gLite (prototype)
2. Supporting Standards
• DMTF/CIM
• OASIS/WSRF
• OGF/GFS
Data Grid Environment Architecture
• Grid-wide data sharing services using extended Gfarm
• Data transfer service for Super Scheduler and WFT
[Figure: data grid architecture. Labels: Submission Portal, PSE, GVS, WFT, Super Scheduler, GridVM, Local Scheduler, Job, Local File System, DataGrid Resource Management, DataGrid Data Transfer, Grid File System, Gfarm Metadata Server, Gfarm File Server (×3), External File Server. Operations: File Staging, File Transfer, File Operations.]
Interoperation with EGEE/gLite (prototype)
• NAREGI and EGEE gLite clients can access both data resources (e.g., bi-directional file copy) using the SRM interface.
• GridFTP is used as the underlying file transfer protocol.
• File catalog (metadata) exchange is planned.
[Figure: NAREGI and gLite data interoperation. Labels: NAREGI Portal, NAREGI Client, gLite Client, LCG Utility, SRM Client, Gfarm Client, Computing Resource, Job, GridFTP Server, NAREGI Metadata Server, LFC (Metadata Server), Gfarm Server, DPM (SRM Server), Storage.]
Security
1. Features
• Production level CA configurable
• VOMS, MyProxy based identity management
• Access control through permission policy
• Proxy certificate renewal service (NEW)
• Authorization service (NEW)
2. Supporting Standards
• GT4 GSI
• X.509
• XACML
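Access control through a permission policy can be pictured with a small deny-by-default check keyed on VO attributes. This is a sketch under stated assumptions: the `Rule` structure, the VO and role names, and the action names are all hypothetical, and the policy is a plain Python list rather than the XACML format named on the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    vo: str                  # virtual organization the rule applies to
    role: str                # role within the VO ("*" matches any role)
    actions: FrozenSet[str]  # actions the rule permits


def is_permitted(policy: List[Rule], vo: str, role: str, action: str) -> bool:
    """Deny by default; permit only if some rule matches VO, role, and action."""
    return any(
        rule.vo == vo and rule.role in ("*", role) and action in rule.actions
        for rule in policy
    )


# Hypothetical site policy.
policy = [
    Rule("nano", "*", frozenset({"submit", "query"})),
    Rule("nano", "admin", frozenset({"cancel"})),
]
print(is_permitted(policy, "nano", "member", "submit"))  # True
print(is_permitted(policy, "nano", "member", "cancel"))  # False
```

In a deployed system the VO and role would come from the VOMS attributes carried in the proxy certificate rather than from function arguments.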
Job, VO, Certificates
• Certificate and VO management
– Production-level CA, APGrid PMA (V2.2: now available on NAREGI download site)
– VOMS based VO user management
[Figure: certificate and job flow. Labels: NAREGI CA, Certificate Management Server, User Certificate, Private Key, VOMS, MyProxy, Proxy Certificate with VO, Renewal Service, AuthZ service, AuthZ Policy, Site Admin, Client Environment (Portal, WFT, PSE, GVM, SS client), SS, GRAM/GridVM, Batch sched., Signed Job Description. Interactions: log-in, Get Certificate, ssh-login, ssh + voms-myproxy-init, voms-myproxy-init, Put Proxy Certificate with VO, Get Proxy Certificate with VO, Get VOMS Attribute, Set AuthZ Policy.]
User Environments
1. Modules
• Portal: Web based portal for NAREGI Environment
• Workflow Tool: GUI/CUI based job workflow management tool
• PSE: Problem Solving Environment
• GVS: Grid Visualization Service
2. Features
• Single Sign On
• GUI and command based workflow job management
• GUI based job workflow document management
• Compile, Test, Deployment service
• Application repository
• Visualization using grid resources
3. Supporting Standards
• OASIS/WSRF
• OGF/JSDL
Registration & Deployment of Applications
Application sharing in research communities
① Register Application (Application Summary, Program Source Files, Input Files, Resource Requirements, etc.)
② Select Compiling Host
③ Compile
④ Send back Compiled Application Environment
⑤ Select Deployment Host
⑥ Deploy
⑦ Register Deployment Info.
[Figure: Application Developer, PSE Server, ACS (Application Contents Service), Information Service (Resource Info.), Server#1, Server#2, Server#3. Status labels: Compiling OK!; Test Run NG!, Test Run OK!, Test Run OK!]
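The register, compile, test, and deploy steps can be sketched as a small driver. This is a hedged illustration: `deploy_application` and its callback parameters are hypothetical stand-ins for the PSE services, and the choice of which server fails its test run is an assumption made only to mirror the mixed "Test Run NG!" and "Test Run OK!" outcomes in the figure.

```python
from typing import Callable, Dict, List


def deploy_application(
    app: str,
    compile_host: str,
    candidate_hosts: List[str],
    compile_on: Callable[[str, str], bool],
    test_run_on: Callable[[str, str], bool],
) -> Dict[str, List[str]]:
    """Compile once, test on each candidate host, and record where deployment succeeded."""
    registry: Dict[str, List[str]] = {app: []}
    if not compile_on(app, compile_host):
        return registry  # nothing to deploy if compilation fails
    for host in candidate_hosts:
        if test_run_on(app, host):
            registry[app].append(host)  # stands in for registering deployment info
    return registry


hosts = ["Server#1", "Server#2", "Server#3"]
result = deploy_application(
    "appA", "Server#1", hosts,
    compile_on=lambda app, host: True,                # assume compilation succeeds
    test_run_on=lambda app, host: host != "Server#1"  # assume one host fails its test run
)
print(result)  # {'appA': ['Server#2', 'Server#3']}
```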
Description of Workflow and Job Submission Requirements
[Figure: workflow description and submission. Labels: applet (Data icon, Program icon, Appli-A, Appli-B, JSDL), http(s), Web server (apache), Workflow Servlet (tomcat), Workflow Description by NAREGI-WFML, Server, BPEL+JSDL, NAREGI JM I/F module, Super Scheduler, PSE, DataGrid, Information Service, Application Information, Global file information, /gfarm/.., GridFTP (Stdout, Stderr). BPEL example in the figure: <invoke name=EPS-jobA> ↓ JSDL-A, <invoke name=BES-jobA> ↓ JSDL-A, …]
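Each job in the workflow is described by a JSDL document. As a rough illustration of what such a document contains, the sketch below builds a minimal JSDL 1.0 job definition with the Python standard library. The job name, executable, and CPU count are hypothetical, and NAREGI-specific JSDL extensions and the surrounding BPEL are not shown.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def make_jsdl(job_name: str, executable: str, args: list, cpus: int) -> str:
    """Build a minimal JSDL job definition and return it as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in args:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    res = ET.SubElement(desc, f"{{{JSDL}}}Resources")
    count = ET.SubElement(res, f"{{{JSDL}}}TotalCPUCount")
    ET.SubElement(count, f"{{{JSDL}}}Exact").text = str(cpus)
    return ET.tostring(job, encoding="unicode")


print(make_jsdl("jobA", "/bin/echo", ["hello"], 4))
```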
Use case: RISM-FMO Coupled Simulation
• RISM (solvent distribution): suitable for SMP
• FMO (electronic structure): suitable for cluster
• Mediator: the solvent charge distribution is transformed from regular to irregular meshes.
• Mediator: the Mulliken charge is transferred as the partial charge of solute molecules.
• The two codes are coupled over GridMPI.
The electronic structure of nano-scale molecules in solvent is calculated self-consistently by exchanging the solvent charge distribution and the partial charge of solute molecules.
*Original RISM and FMO codes are developed by Institute of Molecular Science and National Institute of Advanced Industrial Science and Technology, respectively.
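The self-consistent exchange between the two codes follows the general pattern of a fixed-point iteration. The sketch below shows that pattern with scalar stand-ins: `solvent_step` and `solute_step` are hypothetical toy functions, not the RISM or FMO solvers, and the tolerance and iteration cap are assumed values.

```python
from typing import Callable, Tuple


def couple_self_consistently(
    solvent_step: Callable[[float], float],
    solute_step: Callable[[float], float],
    initial_partial_charge: float,
    tol: float = 1e-8,
    max_iter: int = 100,
) -> Tuple[float, float, int]:
    """Alternate the two solvers until the exchanged value stops changing."""
    partial_charge = initial_partial_charge
    for iteration in range(1, max_iter + 1):
        solvent_charge = solvent_step(partial_charge)     # RISM-like step
        new_partial_charge = solute_step(solvent_charge)  # FMO-like step
        if abs(new_partial_charge - partial_charge) < tol:
            return solvent_charge, new_partial_charge, iteration
        partial_charge = new_partial_charge
    raise RuntimeError("coupled iteration did not converge")


# Toy linear models; their composition is a contraction, so the loop converges.
solvent, partial, steps = couple_self_consistently(
    solvent_step=lambda q: 0.5 * q + 1.0,
    solute_step=lambda s: 0.5 * s,
    initial_partial_charge=0.0,
)
print(round(solvent, 6), round(partial, 6))  # 1.333333 0.666667
```

In the real coupled run the exchanged quantities are fields on different meshes, which is why a Mediator is needed between the two steps.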
Communication Libraries and Tools
1. Modules
• GridMPI: MPI-1 and 2 compliant grid ready MPI library
• GridRPC: OGF/GridRPC compliant GridRPC library
• Mediator: Communication tool for heterogeneous applications
• SBC: Storage based communication tool
2. Features
• GridMPI
– MPI for a collection of geographically distributed resources
– High performance, optimized for high bandwidth networks
• GridRPC
– Task parallel, simple, seamless programming
• Mediator
– Communication library for heterogeneous applications
– Data format conversion
• SBC
– Storage based communication for heterogeneous applications
3. Supporting Standards
• MPI-1 and 2
• OGF/GridRPC
Grid Ready Programming Libraries
• Standards compliant GridMPI and GridRPC
• GridMPI: data parallel, MPI compatibility
• GridRPC (Ninf-G2): task parallel, simple, seamless programming
[Figure labels: 100000 CPU, 100-500 CPU, RPC]
Communication Tools for Co-Allocation Jobs
• Mediator
[Figure: Application-1 and Application-2, each with a Mediator performing data format conversion, connected by GridMPI.]
• SBC (Storage Based Communication)
[Figure: Application-3 and Application-2, each with the SBC library, connected by the SBC protocol.]
Layered Structure of Installation
Three layered package installation and management over distributed resources
• Layer 1 (multiple-nodes-layer): many servers
• Layer 2 (node-layer): one server
• Layer 3 (package-layer): software, tool, component, daemon, etc.
All setups are executed from the central node, where configuration information on servers is defined.
APT: Advanced Packaging Tool, http://apt-rpm.org/
RPM: Redhat Package Manager, http://www.rpm.org/
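The three layers can be read as three nested loops driven from the central node. The sketch below only builds the command strings and does not execute anything; the host names, package names, configuration layout, and the use of `ssh` are hypothetical, and the actual NAREGI installation tool may work differently.

```python
from typing import Dict, List

# Hypothetical configuration kept on the central node: server role -> hosts and packages.
CONFIG: Dict[str, Dict[str, List[str]]] = {
    "gridvm": {"hosts": ["gvm01.example.org", "gvm02.example.org"],
               "packages": ["naregi-gridvm"]},
    "portal": {"hosts": ["portal.example.org"],
               "packages": ["naregi-portal", "naregi-wft"]},
}


def plan_installation(config: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Expand the layers into one command per (host, package) pair."""
    commands = []
    for role, spec in config.items():         # layer 1: multiple nodes
        for host in spec["hosts"]:            # layer 2: one node
            for package in spec["packages"]:  # layer 3: one package
                commands.append(f"ssh {host} apt-get install -y {package}")
    return commands


for command in plan_installation(CONFIG):
    print(command)
```

Keeping the plan as data before running it makes it easy to review what the central node would do on each server.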
APT-RPM-based NAREGI Setup
NAREGI installation
• Done by the use of APT-RPM packages and tools
• All the operations are invoked from the central node
[Figure: the installation tool on the Central Node, holding the RPMs, sets up the IS, DataGrid, CA/RA, Portal, Super Scheduler, GridVM, … servers. Package labels on the servers: NAREGI, Globus, Java.]
Summary
• NAREGI middleware builds a virtual single computing environment on geographically distributed computing and storage resources.
• NAREGI middleware covers the grid computing environment from the infrastructure level to the application programming library level.
• NAREGI services and modules are designed and developed using grid standards.
• NAREGI middleware will interoperate with other grid environments.
• NAREGI Version 1 will be distributed as open source in 2008.
Thank you!