1
SRM-Lite: Overcoming the firewall barrier for data movement
Arie Shoshani
Alex Sim
Viji Natarajan
Lawrence Berkeley National Laboratory
SDM Center All-Hands Meeting
November, 2007
2
Outline
• What are Storage Resource Managers (SRMs)
• Requirements for using SRM behind firewalls
• Satisfying the Requirements
• Architecture
• Potential uses
3
Storage Resource Managers
• SRMs are middleware components whose function is to provide:
  • Dynamic space allocation and file management in spaces
  • For storage components on the local or wide-area network
  • Based on a common standard
[Figure: Examples of storage systems currently supported by SRMs. Client/user applications reach the storage systems through SRM interfaces. SRM implementations shown: SRM (BeStMan), SRM (DPM), SRM (StoRM), SRM/dCache, SRM/CASTOR. Storage systems shown: Unix-based disk pools, dCache, CASTOR, CCLRC RAL, GPFS.]
4
Storage Resource Managers: Main concepts
• Non-interference with local policies
• Advance space reservations
• Dynamic space management
• Pinning files in spaces
• Support abstract concept of a file name: Site URL (SURL)
• Temporary assignment of file names for transfer: Transfer URL (TURL)
• Directory Management and ACLs
• Multi-file requests (srmRequestToPut, srmRequestToGet, srmCopy)
• Transfer protocol negotiation
• Peer to peer request support
• Support for asynchronous multi-file requests
• Support abort, suspend, and resume operations
• SRM relies on other services for data movement (GridFTP, HTTPS, SCP, …)
5
Concepts: Site URL and Transfer URL
• Provide: Site URL (SURL)
  • URL known externally, e.g. in Replica Catalogs
  • e.g. srm://ibm.cnaf.infn.it:8444/dteam/test.10193
• Get back: Transfer URL (TURL)
  • Path can be different from the SURL (SRM internal mapping)
  • Protocol chosen by SRM based on the request's protocol preference
  • e.g. gsiftp://ibm139.cnaf.infn.it:2811//gpfs/dteam/test.10193
• One SURL can have many TURLs
  • Files can be replicated in multiple storage components
  • Files may be in near-line and/or on-line storage
• In a light-weight SRM (a single file system on disk)
  • SURL can be the same as TURL except for the protocol
• File sharing is possible
  • Same physical file, but many requests
  • Needs to be managed by SRM
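The light-weight case above (SURL equal to TURL except for the protocol) can be sketched in a few lines. This is only an illustration: the function name `surl_to_turl` and its parameters are assumptions, not part of any SRM interface, and a full SRM would also consult its internal path mapping and the client's protocol preferences.

```python
from urllib.parse import urlsplit, urlunsplit


def surl_to_turl(surl, protocol, port=None):
    """Map a Site URL to a Transfer URL by swapping only the protocol.

    Hypothetical helper for the light-weight case: the host and path are
    kept, the scheme (and optionally the port) is replaced.
    """
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError("expected an srm:// Site URL, got: " + surl)
    host = parts.hostname
    netloc = f"{host}:{port}" if port is not None else host
    # Query and fragment are dropped; only scheme, host/port, and path remain.
    return urlunsplit((protocol, netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(surl_to_turl("srm://ibm.cnaf.infn.it:8444/dteam/test.10193", "gsiftp", 2811))
```

Note that the slide's own TURL example uses a different host and path (ibm139, /gpfs/...), which is exactly the internal mapping this sketch leaves out.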
6
Earth Science Grid Analysis Environment (in production for 4 years)
• >5000 users, 160 TBs managed
• SRMs are used and inter-communicate at several sites
[Figure: deployment diagram. Sites shown: LBNL, LLNL, ISI, NCAR, ORNL, ANL. Components shown: Tomcat servlet engine; MCS (Metadata Cataloguing Services) and MCS client; RLS (Replica Location Services) and RLS client; MyProxy server and MyProxy client; GRAM gatekeeper; CAS (Community Authorization Services) and CAS client; DRM and HRM (Storage Resource Management); gridFTP servers and a striped gridFTP server; openDAPg server; disks, MSS (Mass Storage System), and HPSS (High Performance Storage System); SOAP and RMI links.]
7
Robust Data Movement provided by SRMs and DataMover
• Problem: move thousands of files robustly
  • Takes many hours
  • Need error recovery
    • Mass storage system failures
    • Network failures
• Solution: Use Storage Resource Managers (SRMs)
  • File streaming paradigm
  • By reserving and releasing storage space automatically
• Problem: too slow
• Solution:
  • In GridFTP
    • Use parallel streams
    • Use large FTP windows
  • Pre-stage files from MSS
  • Use concurrent transfers
[Figure: Example setup for Earth System Grid (ESG). Labels shown: NCAR, LBNL, Anywhere; DataMover; Get list of files; SRM-COPY (thousands of files); SRM-GET (one file at a time); SRM (performs writes); SRM (performs reads); GridFTP GET (pull mode); Network transfer; stage files; archive files; Disk Cache (two); MSS.]
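The error-recovery requirement on this slide (transient mass storage and network failures during a long multi-file move) can be illustrated with a small retry loop. This is a minimal sketch under stated assumptions, not DataMover's actual logic: `TransientTransferError` and `transfer_with_retry` are hypothetical names, and the `transfer` callable stands in for whatever performs one file copy.

```python
import time


class TransientTransferError(Exception):
    """Hypothetical error type for failures worth retrying (network, MSS)."""


def transfer_with_retry(transfer, source, target, max_attempts=3, delay_s=0.0):
    """Call transfer(source, target), retrying on transient failures.

    Returns the number of attempts that were needed. The last transient
    error is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transfer(source, target)
            return attempt
        except TransientTransferError:
            if attempt == max_attempts:
                raise
            # Linear backoff before the next attempt (assumed policy).
            time.sleep(delay_s * attempt)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky(src, dst):
        # Simulated transfer that fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransientTransferError("simulated network failure")

    print(transfer_with_retry(flaky, "source_file", "target_file", max_attempts=5))
```

Only errors classified as transient are retried; anything else propagates immediately so that permanent failures are not masked.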
8
File tracking shows recovery from transient failures
Total: 45 GBs
9
Requirements for SRM-Lite
• Run SRM behind a firewall
  • Cannot have third party transfers (source/target is local)
• May not be able to run GridFTP
  • Remote site may not support it
  • Some communities choose not to use GSI
• Need support for multi-file transfer
  • Or entire directory
• Need support for asynchronous request
  • Also support for intermediate status of request
• Need to support concurrent file transfers
10
Satisfying the Requirements: SRM-Lite
• Run SRM behind a firewall
  • Must have a client tool (SRM-Lite)
• May not be able to run GridFTP
  • Support high-performance SCP: use HPN-SSH from Pittsburgh Supercomputing Center
  • But also use other transfer protocols (GridFTP, bbcp, https, …)
• Need support for multi-file transfer
  • Manage queues for large requests
• Need support for asynchronous request
  • SRM-Lite returns a "request token"; the token can be used for "request status"
• Need to support concurrent file transfers
  • Use multi-threading to manage concurrent transfers
  • Monitor transfers and recover from mid-transfer interruptions
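The request-token and multi-threading points above can be sketched with a thread pool. This is an illustration only, assuming a generic `transfer(source, target)` callable; the class `TransferManager` and its methods are hypothetical and are not SRM-Lite's actual interface.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class TransferManager:
    """Hypothetical sketch: asynchronous multi-file requests with a token."""

    def __init__(self, transfer, max_workers=4):
        self._transfer = transfer
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._requests = {}  # token -> {(source, target): Future}

    def submit(self, pairs):
        """Queue all (source, target) pairs and return a request token at once."""
        token = uuid.uuid4().hex
        self._requests[token] = {
            pair: self._pool.submit(self._transfer, *pair) for pair in pairs
        }
        return token

    def status(self, token):
        """Report the intermediate status of every file in the request."""
        report = {}
        for pair, future in self._requests[token].items():
            if not future.done():
                report[pair] = "pending"
            elif future.exception() is not None:
                report[pair] = "failed"
            else:
                report[pair] = "done"
        return report

    def wait(self, token):
        """Block until every file has finished, then return the final status."""
        for future in self._requests[token].values():
            future.exception()  # waits for completion without raising the error
        return self.status(token)
```

A caller would invoke `submit` with a list of (source, target) tuples, keep the returned token, and poll `status(token)` while the pool works through the queue; files still running or queued both show as "pending" in this sketch.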
11
Scenario A: firewall at one site
• Process Steps
  • Login to ORNL using OTP
  • At ORNL invoke SRM-Lite
  • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
  • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.xml or command line input to automatically
    • Push/Pull files to/from NERSC
    • Use multiple threads for concurrent transfers
[Figure: ORNL side: OTP Login, SRM-Lite, srmlite.xml, Local Commands and Protocols, Disk Cache. NERSC side: SSH Server, Disk Cache. Links: SSH Channel (SCP); GridFTP/FTP/BBCP/HTTP transfers.]
Put example: Source: file:////my_directory/file_foo Target: scp://host/target_dir/file_foo
Get example: Source: GridFTP://host/target_dir/file_foo Target: file:////my_directory/file_foo
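The slides do not show the layout of srmlite.xml, so the following is only a sketch of how such an input file could be read. The element names `request`, `file`, `source`, and `target` are assumptions made for illustration, not the real SRM-Lite schema; the URLs are the put and get examples from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout; the real srmlite.xml format is not given here.
EXAMPLE = """\
<request>
  <file>
    <source>file:////my_directory/file_foo</source>
    <target>scp://host/target_dir/file_foo</target>
  </file>
  <file>
    <source>GridFTP://host/target_dir/file_foo</source>
    <target>file:////my_directory/file_foo</target>
  </file>
</request>
"""


def parse_request(xml_text):
    """Return the list of (source, target) URL pairs found in the input."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter("file"):
        source = entry.findtext("source")
        target = entry.findtext("target")
        if not source or not target:
            raise ValueError("each <file> needs a <source> and a <target>")
        pairs.append((source.strip(), target.strip()))
    return pairs


if __name__ == "__main__":
    for source, target in parse_request(EXAMPLE):
        print(source, "->", target)
```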
12
Scenario B: one end has a firewall, the other end has SRM
[Figure: ORNL side: OTP Login, SRM-Lite, srmlite.txt, Disk Cache. NERSC side: SRM, Disk Cache, HPSS. Links: SRM Request; GridFTP/FTP/SCP transfers.]
Put example: Source: file:////my_directory/file_foo Target: srm://host/target_dir/file_foo
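In the put and get examples for Scenarios A and B, one end is always a local file: URL, which matches the earlier requirement that third party transfers are not possible behind the firewall. A minimal sketch of that check follows; the function name `classify` and its return convention are assumptions for illustration.

```python
from urllib.parse import urlsplit


def classify(source, target):
    """Return ("put" | "get", remote_protocol) for a source/target URL pair.

    A put pushes a local file to a remote URL; a get pulls a remote URL to a
    local file. Pairs with zero or two local ends are rejected.
    """
    src_scheme = urlsplit(source).scheme.lower()
    dst_scheme = urlsplit(target).scheme.lower()
    if src_scheme == "file" and dst_scheme != "file":
        return ("put", dst_scheme)
    if dst_scheme == "file" and src_scheme != "file":
        return ("get", src_scheme)
    raise ValueError("exactly one end must be a local file: URL")


if __name__ == "__main__":
    print(classify("file:////my_directory/file_foo", "srm://host/target_dir/file_foo"))
    print(classify("GridFTP://host/target_dir/file_foo", "file:////my_directory/file_foo"))
```

The returned protocol name (scp, gridftp, srm, and so on) is what a client could use to pick the transfer tool or to decide that an SRM request must be issued first.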
13
Scenario C: firewalls at both ends
• Process Steps
  • Login to site1 using OTP
  • At site1 invoke SRM-Lite
  • SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2
  • Use SSH channel for SCP
  • Same as before:
    • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
    • Or, user gives command line option for a selected file/directory
[Figure: site1: OTP Login, SRM-Lite, srmlite.xml, Disk Cache. site2: SSH Server, SRM-Lite, Disk Cache. Link: SSH Channel (SCP).]
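The step where SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2 amounts to building a remote command line. The sketch below shows one way to assemble it safely; the remote command name `srmlite` and whatever arguments are passed to it are placeholders, since the slides do not document the actual SRM-Lite command line.

```python
import shlex


def remote_srmlite_argv(user, host, remote_args, remote_cmd="srmlite"):
    """Build an argv list that runs a remote command over ssh.

    remote_cmd and remote_args are hypothetical placeholders. The remote part
    is shell-quoted because ssh hands it to the remote login shell as one string.
    """
    remote = shlex.join([remote_cmd, *remote_args])
    return ["ssh", f"{user}@{host}", remote]


if __name__ == "__main__":
    argv = remote_srmlite_argv("alice", "site2.example.org", ["my request.xml"])
    print(argv)
    # The list could then be passed to subprocess.run(argv, check=True).
```

Passing a list to `subprocess.run` avoids a local shell entirely, so only the remote side needs quoting.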
14
Scenario C: SRM-Lite manages MSS access
[Figure: site1: OTP Login, SRM-Lite, srmlite.xml, Disk Cache, HPSS. site2: SSH Server, SRM-Lite, Disk Cache, HPSS. Link: SSH Channel (SCP).]
15
GUI for SRM-Lite
• Used in ESG
  • Special version for data movement to user workstations
  • Called DataMover-Lite
  • Versions exist for Linux, PC, Mac
16
Usage
• Combustion project
  • The Applied Partial Differential Equations Center (APDEC)
  • John Bell
  • Efficient, robust data movement from sites behind firewalls
  • At DoE and DoD sites
• Kepler-SRM-Lite actor
  • To be used for managing multi-file transfers from sites behind firewalls
  • Launch SRM-Lite remotely through SSH
  • Initial version: help from NCSU (Pierre Mouallem)
  • Two modes
    • Entire request
    • Streaming file requests
  • To be used in CPES workflows first, with Norbert's help