SDMCenter
February 2, 2005
Progress on MPI-IO Access to Mass Storage System Using a Storage Resource Manager
Ekow J. Otoo, Arie Shoshani and Alex Sim
Lawrence Berkeley National Laboratory
Objective
• To allow near-online, transparent access to files on a mass storage system (e.g., HPSS) from MPI applications.
• To enable existing applications to dynamically access files from the MSS with little modification to the source code.
• To demonstrate the feasibility of the approach by running some applications on a Linux cluster, using PVFS as the local parallel file system and HPSS as the mass storage system.
MPI Application on a Cluster Accessing an MSS
[Figure: cluster diagram showing compute nodes (CN), I/O nodes (ION), SRM-Proxy + SRM-Svr, a Parallel File System, GridFTP, the Mass Storage System (MSS) and Other MSS.]
Legend:
• CN – Compute Node
• ION – IO Node
• SRM-Proxy
• SRM-Svr – SRM Server
MPI-IO-SRM Architecture
[Figure: layered architecture diagram. Layer labels: High Level Access and Control; Record Structured File Access; Low Level File System Access. Component labels: Data Intensive Applications; pNetCDF, HDF5; MPI-IO SRM; ADIO; PVFS, GPFS, UFS, XFS, Other; BDB, Other; HPSS, SAM, Jasmin, Castor.]
Main Functions of MpiioSrm Library
• This is a package of library functions called libMpiioSrm.a. The functions callable from MPI applications are:
1. MPI_Srm_proxy_init();
2. MPI_Srm_file_open(); [in place of MPI_File_open()]
3. MPI_Srm_file_close(); [in place of MPI_File_close()]
4. MPI_Srm_file_delete(); [in place of MPI_File_delete()]
5. MPI_Srm_proxy_close();
• Of all the MPI-IO library functions, (2), (3) and (4) are the only ones that take a file name as a parameter.
• These can be used to build complex execution models of an application.
Dependent SRM Functions
• Functions equivalent to SRM commands are: srm_abort(); srm_copy(); srm_get(); srm_ls(); srm_mkdir(); srm_ping(); srm_put(); srm_release(); srm_remove(); srm_status();
• Parameters used in these functions are either derived from a configuration file, “hrm.rc”, or communicated through an MPI_Info object.
How to Use the Library Functions
• Include MpiioSrm.h when compiling and link against libMpiioSrm.a.
• After MPI_Init(), we call
  MPI_Srm_proxy_init(MPI_Comm comm, int myrank, int srm_enabled, char *proxy_host, int proxy_rank, char *fileofurls, char fileofurl_fmt, char *srmuserid, char *hrmrcpath, MPI_Info *srminfo);
• The application then calls MPI_Srm_file_open(), invokes any MPI_* functions needed for processing, and calls MPI_Srm_file_close(), etc.
• Finally, it calls MPI_Srm_proxy_close(MPI_Comm comm, int proxy_rank) before calling MPI_Finalize().
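The call order described on this slide can be sketched as a small C fragment. Because neither an MPI installation nor libMpiioSrm can be assumed here, every call is replaced by a hypothetical stand-in stub (prefixed stub_) that only records its name; the stubs, the log buffer and run_lifecycle() are illustrative names, not part of the library, and the real functions take the arguments listed above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: each stub appends its name to a log so the
 * call order can be inspected without MPI or libMpiioSrm installed. */
static char call_log[256];

static void record(const char *name)
{
    /* Leave room for the separator and the terminating NUL. */
    assert(strlen(call_log) + strlen(name) + 2 < sizeof call_log);
    if (call_log[0] != '\0')
        strcat(call_log, ">");
    strcat(call_log, name);
}

static void stub_MPI_Init(void)            { record("MPI_Init"); }
static void stub_MPI_Srm_proxy_init(void)  { record("MPI_Srm_proxy_init"); }
static void stub_MPI_Srm_file_open(void)   { record("MPI_Srm_file_open"); }
static void stub_MPI_Srm_file_close(void)  { record("MPI_Srm_file_close"); }
static void stub_MPI_Srm_proxy_close(void) { record("MPI_Srm_proxy_close"); }
static void stub_MPI_Finalize(void)        { record("MPI_Finalize"); }

/* Runs the sequence from the slide and returns the recorded order. */
const char *run_lifecycle(void)
{
    call_log[0] = '\0';
    stub_MPI_Init();
    stub_MPI_Srm_proxy_init();   /* once, right after MPI_Init() */
    stub_MPI_Srm_file_open();    /* used in place of MPI_File_open() */
    /* ... any MPI_* processing calls on the opened file go here ... */
    stub_MPI_Srm_file_close();   /* used in place of MPI_File_close() */
    stub_MPI_Srm_proxy_close();  /* once, just before MPI_Finalize() */
    stub_MPI_Finalize();
    return call_log;
}
```

In a real application the stubs would be replaced by the actual calls, with the proxy initialization and shutdown bracketing all file operations.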
Some Special Features of MpiioSrm
• Methods of specifying data sources and targets:
  • Simple file name: a string of characters consisting of a directory path and basename
  • URLs
  • Indirectly, through a file containing either
    • a list of source and target URLs, or
    • a list of SRM commands
• Modes for opening files: in addition to the various MPI_MODE_* of the standard, we add the mode MPI_MODE_PROPCHG_ON_CLOSE to propagate any changes on the local file to the MSS.
• In our case, an application can set modes through the MPI_File_info_set() command.
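As a rough illustration of the two features above, the following C sketch distinguishes a simple file name from a URL and tests whether an open mode requests propagation on close. It rests on assumptions not stated on the slide: that a URL is recognized by a "scheme://" prefix, and that MPI_MODE_PROPCHG_ON_CLOSE is a bit flag OR-ed with the standard modes. The names classify_spec, propagates_on_close and the DEMO_* constants are hypothetical; the real values would come from mpi.h and MpiioSrm.h.

```c
#include <assert.h>
#include <string.h>

/* The two direct ways of naming a data source or target. */
typedef enum { SPEC_FILE_NAME, SPEC_URL } spec_kind;

/* Assumption: a specification with a non-empty scheme followed by "://"
 * is a URL; anything else is treated as a simple file name. */
spec_kind classify_spec(const char *spec)
{
    assert(spec != NULL);
    const char *sep = strstr(spec, "://");
    return (sep != NULL && sep != spec) ? SPEC_URL : SPEC_FILE_NAME;
}

/* Hypothetical bit values, chosen only so the sketch compiles on its own. */
#define DEMO_MODE_RDWR             0x0008
#define DEMO_MODE_PROPCHG_ON_CLOSE 0x1000

/* Returns nonzero when local changes should be propagated to the MSS
 * at close time, assuming the mode is carried as a bit in amode. */
int propagates_on_close(int amode)
{
    return (amode & DEMO_MODE_PROPCHG_ON_CLOSE) != 0;
}
```

For example, classify_spec("/data/run1/file.nc") yields SPEC_FILE_NAME, while a string such as "srm://host.example.org/path/file.nc" (a made-up URL) yields SPEC_URL.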
Status of the MpiioSrm library - 1
• The current status of the library includes:
  • For SRM services, we have srm_abort(), srm_copy(), srm_ls(), srm_ping(), srm_status(), srm_release() and srm_remove().
  • For MPI-IO calling SRMs, we have all five basic functions implemented, namely MPI_Srm_proxy_init(), MPI_Srm_file_open(), MPI_Srm_file_close(), MPI_Srm_file_delete() and MPI_Srm_proxy_close().
Status of the MpiioSrm library - 2
• Test Programs: the current status includes the following test programs:
  • TestSRM_S1RC – Static Single Request at a time.
  • TestSRM_SRSC – Static Requests with Single file processing at a time.
  • TestSRM_DRSOC – Dynamic Requests with Sequential Processing Order.
  • TestSRM_DRAOC – Dynamic Requests with Arbitrary Processing Order.
Looking to the Future
• Short Term
  • Add srm_put(), srm_get() and srm_mkdir().
  • Improve the method of communicating prefetched files with processes using MPI’s Remote Memory Access.
  • Release for general use.
• Long Term
  • Address scalability, fault tolerance issues and PVFS2 – use of multiple SRM servers.
  • Define an XML schema for input file formats.
  • GUI for:
    • Preparing input data files for applications
    • Monitoring files fetched and processed by applications through MPI’s Dynamic Process Management functions