SRM-Lite: overcoming the firewall barrier for data movement. Arie Shoshani, Alex Sim, Viji Natarajan. Lawrence Berkeley National Laboratory. SDM Center All-Hands Meeting, November 2007


SRM-Lite: overcoming the firewall barrier for data movement

Arie Shoshani

Alex Sim

Viji Natarajan

Lawrence Berkeley National Laboratory

SDM Center All-Hands Meeting

November, 2007


Outline

• What are Storage Resource Managers (SRMs)

• Requirements for using SRMs behind firewalls

• Satisfying the Requirements

• Architecture

• Potential uses


Storage Resource Managers

• SRMs are middleware components whose function is to provide:
  • dynamic space allocation AND file management in spaces
  • for storage components on the local or wide-area network
  • Based on a common standard

[Figure: Examples of storage systems currently supported by SRMs. Labels: client/user applications; SRM (BeStMan), SRM (DPM), SRM (StoRM), SRM/dCache, SRM/CASTOR; Unix-based disk pools, dCache, CASTOR, GPFS; CCLRC RAL.]


Storage Resource Managers: Main concepts

• Non-interference with local policies

• Advance space reservations

• Dynamic space management

• Pinning files in spaces

• Support abstract concept of a file name: Site URL (SURL)

• Temporary assignment of file names for transfer: Transfer URL (TURL)

• Directory Management and ACLs

• Multi-file requests (srmRequestToPut, srmRequestToGet, srmCopy)

• Transfer protocol negotiation

• Peer to peer request support

• Support for asynchronous multi-file requests

• Support abort, suspend, and resume operations

• SRM relies on other services for data movement (GridFTP, HTTPS, SCP, …)


Concepts: Site URL and Transfer URL

• Provide: Site URL (SURL)
  • URL known externally – e.g. in Replica Catalogs
  • e.g. srm://ibm.cnaf.infn.it:8444/dteam/test.10193

• Get back: Transfer URL (TURL)
  • Path can be different from the SURL – SRM internal mapping
  • Protocol chosen by SRM based on request protocol preference
  • e.g. gsiftp://ibm139.cnaf.infn.it:2811//gpfs/dteam/test.10193

• One SURL can have many TURLs
  • Files can be replicated in multiple storage components
  • Files may be in near-line and/or on-line storage

• In light-weight SRM (a single file system on disk)
  • SURL can be the same as TURL except protocol

• File sharing is possible
  • Same physical file, but many requests
  • Needs to be managed by SRM
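
The light-weight case above (SURL equal to TURL except for the protocol) can be sketched in a few lines of Python. This is only an illustration: the function name `surl_to_turl` and its `port` parameter are assumptions, not part of any SRM interface, and a full SRM may also change the host and path through its internal mapping, as the gsiftp example above shows.

```python
from urllib.parse import urlsplit


def surl_to_turl(surl, protocol, port=None):
    """Map a Site URL to a Transfer URL for the light-weight case.

    Hypothetical helper: only the protocol (and optionally the port)
    changes; host and path are copied from the SURL unchanged.
    """
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError(f"not a SURL: {surl!r}")
    # Keep the SURL host; replace the SRM service port if a transfer port is given.
    netloc = parts.hostname if port is None else f"{parts.hostname}:{port}"
    return f"{protocol}://{netloc}{parts.path}"


if __name__ == "__main__":
    print(surl_to_turl("srm://ibm.cnaf.infn.it:8444/dteam/test.10193", "gsiftp", port=2811))
```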


Earth Science Grid Analysis Environment (in production for 4 years)

>5000 users, 160 TB managed

SRMs are used and inter-communicate at several sites

[Figure: deployment diagram. Sites: LBNL, LLNL, ISI, NCAR, ORNL, ANL. Components shown: Tomcat servlet engine; MCS (Metadata Cataloguing Services) and MCS client; RLS (Replica Location Services) and RLS client; MyProxy server and MyProxy client; CAS (Community Authorization Services) and CAS client; GRAM gatekeeper; SOAP and RMI links; DRM and HRM Storage Resource Management instances; GridFTP servers, a striped GridFTP server, and an OpenDAPg server; disks, HPSS (High Performance Storage System), and an MSS (Mass Storage System).]


Robust Data Movement provided by SRMs and DataMover

• Problem: move thousands of files robustly
  • Takes many hours
  • Need error recovery
    • Mass storage systems failures
    • Network failures

• Solution: Use Storage Resource Managers (SRMs)
  • File streaming paradigm
  • By reserving and releasing storage space automatically

• Problem: too slow

• Solution:
  • in GridFTP
    • Use parallel streams
    • Use large FTP windows
  • Pre-stage files from MSS
  • Use concurrent transfers

[Figure: Example setup for Earth System Grid (ESG). Labels: NCAR, LBNL, Anywhere; DataMover; SRM-COPY (thousands of files); SRM-GET (one file at a time); SRM (performs writes); SRM (performs reads); GridFTP GET (pull mode); Get list of files; stage files; archive files; Network transfer; Disk Cache (at both ends); MSS.]


File tracking shows recovery from transient failures

Total: 45 GB
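
The slides do not say how SRMs recover from transient failures. As a generic illustration only, a per-file retry loop might look like the sketch below; `transfer_with_retry`, its parameters, and the choice of `OSError` as the "transient" error type are all assumptions made for this example.

```python
import time


def transfer_with_retry(transfer, source, target, attempts=3, delay=0.0):
    """Call transfer(source, target), retrying when it raises OSError.

    `transfer` is any callable supplied by the caller (hypothetical here).
    Returns the number of attempts used; re-raises after the last failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            transfer(source, target)
            return attempt
        except OSError:
            if attempt == attempts:
                raise
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
```

A caller would pass whatever function actually moves one file, so the retry policy stays independent of the transfer protocol.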


Requirements for SRM-Lite

• Run SRM behind a firewall
  • Cannot have third party transfers (source/target is local)

• May not be able to run GridFTP
  • Remote site may not support it
  • Some communities choose not to use GSI

• Need support for multi-file transfer
  • Or entire directory

• Need support for asynchronous requests
  • Also support for intermediate status of request

• Need to support concurrent file transfers


Satisfying the Requirements: SRM-Lite

• Run SRM behind a firewall
  • Must have a client tool (SRM-Lite)

• May not be able to run GridFTP
  • Support high-performance SCP: use HPN-SSH from Pittsburgh Supercomputing Center
  • But, also use other transfer protocols (GridFTP, bbcp, https, …)

• Need support for multi-file transfer
  • Manage queues for large requests

• Need support for asynchronous requests
  • SRM-Lite returns a “request token”; token can be used for “request status”

• Need to support concurrent file transfers
  • Use multi-threading to manage concurrent transfers
  • Monitor transfers and recover from mid-transfer interruptions
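
The request-token and multi-threading ideas above can be sketched with Python's standard thread pool. This is not SRM-Lite's implementation: the class `RequestManager`, its methods, and the status strings are hypothetical names chosen for illustration, and the `transfer` callable stands in for whatever protocol actually moves a file.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class RequestManager:
    """Queue multi-file requests and report per-file status by token."""

    def __init__(self, transfer, max_workers=4):
        self._transfer = transfer  # callable(source, target), supplied by the caller
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._requests = {}

    def submit(self, pairs):
        """Queue (source, target) tuples; return a request token immediately."""
        token = uuid.uuid4().hex
        self._requests[token] = {
            pair: self._pool.submit(self._transfer, *pair) for pair in pairs
        }
        return token

    def status(self, token):
        """Intermediate status: 'pending', 'done', or 'failed' per file."""
        report = {}
        for pair, future in self._requests[token].items():
            if not future.done():
                report[pair] = "pending"
            elif future.exception() is not None:
                report[pair] = "failed"
            else:
                report[pair] = "done"
        return report

    def wait(self, token):
        """Block until every file in the request has finished."""
        for future in self._requests[token].values():
            future.exception()  # waits for completion without re-raising
        return self.status(token)
```

The pool size bounds the number of concurrent transfers; files beyond that limit wait in the executor's queue, which corresponds to the queue management mentioned for large requests.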


Scenario A: firewall at one site

• Process Steps
  • Login to ORNL using OTP
  • At ORNL invoke SRM-Lite
  • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
    • Or, user gives command line option for a selected file/directory
  • SRM-Lite uses srmlite.xml or command line input to automatically
    • Push/Pull files to/from NERSC
    • Use multiple threads for concurrent transfers

[Figure: ORNL side: OTP Login, SRM-Lite, srmlite.xml, Local Commands and Protocols, Disk Cache. NERSC side: SSH Server, Disk Cache. Links: SSH Channel (SCP); GridFTP/FTP/BBCP/HTTP transfers.]

Put example: Source: file:////my_directory/file_foo Target: scp://host/target_dir/file_foo

Get example: Source: GridFTP://host/target_dir/file_foo Target: file:////my_directory/file_foo
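
The put and get examples above differ only in which end is the local file: URL. A minimal sketch of that classification follows, assuming hypothetical helper names (`local_path`, `plan_transfer`, `scp_argv`); it is not SRM-Lite's own input handling, and the srmlite.xml format is not shown in the slides. Only the standard `scp source target` argument order is relied on.

```python
from urllib.parse import urlsplit


def local_path(url):
    """Filesystem path of a file: URL, collapsing extra leading slashes."""
    return "/" + urlsplit(url).path.lstrip("/")


def plan_transfer(source, target):
    """Classify a source/target pair as a put or a get.

    Exactly one end must be local, since third-party transfers are
    not possible from behind the firewall.
    """
    src, dst = urlsplit(source), urlsplit(target)
    if src.scheme == "file" and dst.scheme != "file":
        return {"direction": "put", "protocol": dst.scheme,
                "local": local_path(source), "host": dst.hostname, "remote": dst.path}
    if dst.scheme == "file" and src.scheme != "file":
        return {"direction": "get", "protocol": src.scheme,
                "local": local_path(target), "host": src.hostname, "remote": src.path}
    raise ValueError("exactly one end must be a local file: URL")


def scp_argv(plan):
    """Build (but do not run) an scp command line for an scp plan."""
    remote = f"{plan['host']}:{plan['remote']}"
    if plan["direction"] == "put":
        return ["scp", plan["local"], remote]
    return ["scp", remote, plan["local"]]


if __name__ == "__main__":
    put = plan_transfer("file:////my_directory/file_foo", "scp://host/target_dir/file_foo")
    print(scp_argv(put))
```

Note that `urlsplit` lowercases the scheme, so `GridFTP://` in the get example is reported as `gridftp`.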


Scenario B: one end has a firewall, the other end has SRM

[Figure: ORNL side: OTP Login, SRM-Lite, srmlite.txt, Disk Cache. NERSC side: SRM, Disk Cache, HPSS. Links: SRM Request; GridFTP/FTP/SCP transfers.]

Put example: Source: file:////my_directory/file_foo Target: srm://host/target_dir/file_foo


Scenario C: firewalls at both ends

• Process Steps
  • Login to site1 using OTP
  • At site1 invoke SRM-Lite
  • SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2
  • Use SSH channel for SCP
  • Same as before:
    • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site
    • Or, user gives command line option for a selected file/directory

[Figure: site1: OTP Login, SRM-Lite, srmlite.xml, Disk Cache. site2: SSH Server, SRM-Lite, Disk Cache. Link: SSH Channel (SCP).]
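
The step "SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2" amounts to running a remote command over SSH. The sketch below only builds the argument list; the helper name `build_ssh_invocation` is an assumption, and the remote command in the usage line (`srmlite srmlite.xml`) is a placeholder, since the slides do not give SRM-Lite's actual command-line syntax.

```python
import shlex


def build_ssh_invocation(host, remote_argv):
    """Return an argv list that runs remote_argv on host via ssh.

    The remote arguments are shell-quoted and joined, because ssh
    passes the remote command to the remote shell as a single string.
    """
    return ["ssh", host, shlex.join(remote_argv)]


if __name__ == "__main__":
    # Placeholder remote command; the real SRM-Lite invocation is not shown in the slides.
    print(build_ssh_invocation("site2", ["srmlite", "srmlite.xml"]))
```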


Scenario C: SRM-Lite manages MSS access

[Figure: labels: site1, site2; SRM-Lite (at both sites); OTP Login; srmlite.xml; SSH Server; SSH Channel (SCP); Disk Cache and HPSS (at both sites).]


GUI for SRM-Lite

• Used in ESG
  • Special version for data movement to user workstations

• Called DataMover-Lite
  • Versions exist for Linux, PC, Mac


Usage

• Combustion project
  • The Applied Partial Differential Equations Center (APDEC)
  • John Bell
  • Efficient, robust data movement from sites behind firewalls
  • At DoE and DoD sites

• Kepler-SRM-Lite actor
  • To be used for managing multi-file transfers from sites behind firewalls
  • Launch SRM-Lite remotely through SSH
  • Initial version – help from NCSU: Pierre Mouallem
  • Two modes
    • Entire request
    • Streaming file requests
  • To be used in CPES workflows first with Norbert’s help