
Computing Sciences Directorate, LBNL

CHEP 2003

Storage Resource Management in the Grid Environment

Alex Sim
Junmin Gu
Arie Shoshani

Scientific Data Management Group
Lawrence Berkeley National Laboratory

http://sdm.lbl.gov/srm

Outline

• What are Storage Resource Managers - Motivation
• General Analysis Scenario and the use of SRMs
• SRM functionality
• Real examples of working SRMs
• Advantages of using SRMs
• Conclusions and Future Work

Motivation

• Grid architecture needs to include reservation & scheduling of:
  • Compute resources
  • Storage resources
  • Network resources
• Storage Resource Managers (SRMs) role in the data grid architecture
  • Shared storage resource allocation & scheduling
  • Especially important for data intensive applications
  • Often files are archived on a mass storage system (MSS)
  • Wide area networks – minimize transfers
  • Large scientific collaborations (100’s of nodes, 1000’s of clients) – opportunities for file sharing
  • File replication and caching may be used
  • Need to support non-blocking (asynchronous) requests

General Analysis Scenario

[Figure: general analysis scenario. Components shown: clients at the client's site, Request Interpreter, request planning, Metadata catalog, Replica catalog, Network Weather Service, Request Executer, and Sites 1 to N on the network, each with Storage Resource Managers, Compute Resource Managers, Compute Engines and Disk Caches, plus an MSS. Arrow labels: "logical query", "a set of logical files", "execution plan and site-specific files", "Execution DAG", "requests for data placement and remote computation", "result files".]

SRM is a Service

• SRM functionality
  • Manage space
    • Negotiate and assign space to users
    • Manage “lifetime” of spaces
  • Manage files on behalf of a user
    • Pin files in storage till they are released
    • Manage “lifetime” of files
    • Manage action when pins expire (depends on file types)
  • Manage file sharing
    • Policies on what should reside on a storage resource at any one time
    • Policies on what to evict when space is needed
  • Get files from remote locations when necessary
    • Purpose: to simplify client’s task
  • Manage multi-file requests
    • A brokering function: queue file requests, pre-stage when possible
  • Provide grid access to/from mass storage systems
    • HPSS (LBNL, ORNL, BNL), Enstore (Fermi), JasMINE (Jlab), Castor (CERN), MSS (NCAR), …
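
The pinning, lifetime, and eviction ideas on this slide can be illustrated with a small sketch. This is not SRM code: the class `DiskCacheSketch`, its methods, and the least-recently-used eviction order are all assumptions chosen for illustration, and the model keeps a single pin per file rather than one pin per client.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int
    pin_expiry: float   # time after which the pin no longer holds
    last_used: float


class DiskCacheSketch:
    """Hypothetical in-memory model of a disk cache managed by an SRM."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock
        self.files = {}

    def used(self):
        return sum(f.size for f in self.files.values())

    def _evict_for(self, needed):
        # Assumed policy: evict only unpinned files, least recently used first.
        now = self.clock()
        candidates = sorted(
            (f for f in self.files.values() if f.pin_expiry <= now),
            key=lambda f: f.last_used,
        )
        for f in candidates:
            if self.capacity - self.used() >= needed:
                break
            del self.files[f.name]
        return self.capacity - self.used() >= needed

    def add_and_pin(self, name, size, lifetime):
        """Place a file in the cache and pin it for `lifetime` seconds."""
        now = self.clock()
        if name in self.files:
            # File is already cached (sharing): extend the pin, no new copy.
            f = self.files[name]
            f.pin_expiry = max(f.pin_expiry, now + lifetime)
            f.last_used = now
            return True
        if not self._evict_for(size):
            return False  # no room; a real manager would queue the request
        self.files[name] = CachedFile(name, size, now + lifetime, now)
        return True

    def release(self, name):
        """Drop the pin; the file stays cached until space is needed."""
        if name in self.files:
            self.files[name].pin_expiry = float("-inf")
```

In this sketch a released file is not deleted immediately. It remains available for sharing and is evicted only when a later request needs the space.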

Types of SRMs

• Types of storage resource managers
  • Disk Resource Manager (DRM)
    • Manages one or more disk resources
  • Tape Resource Manager (TRM)
    • Manages access to a tertiary storage system (e.g. HPSS)
  • Hierarchical Resource Manager (HRM = TRM + DRM)
    • An SRM that stages files from tertiary storage into its disk cache
• SRMs and file transfers
  • SRMs DO NOT perform file transfer
  • SRMs DO invoke file transfer service if needed (GridFTP, FTP, HTTP, …)
  • SRMs DO monitor transfers and recover from failures
    • TRM: from/to MSS
    • DRM: from/to network
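
A minimal sketch of the "invoke, monitor, recover" division of labor described above, assuming a simple retry policy. The function `transfer_with_recovery`, the `TransferError` exception, and the `transfer` callable are hypothetical names; `transfer` stands in for an external service such as a GridFTP client, so the manager itself moves no bytes.

```python
import time


class TransferError(Exception):
    """Raised by the (hypothetical) transfer service on a failed attempt."""


def transfer_with_recovery(transfer, source, target, max_attempts=3,
                           delay=0.0, sleep=time.sleep):
    """Invoke `transfer(source, target)` and retry it on failure.

    Returns the number of attempts that were needed. The last failure is
    re-raised once `max_attempts` is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transfer(source, target)
            return attempt
        except TransferError:
            if attempt == max_attempts:
                raise
            sleep(delay)  # assumed fixed pause before the next attempt
```

A production manager would likely distinguish transient from permanent failures and resume partial transfers; the sketch retries every failure the same way.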

Analysis Scenario for Local Computation

[Figure: analysis scenario for local computation. Components shown: clients at the client's site, Request Interpreter, request planning, Metadata catalog, Replica catalog, Network Weather Service, Request Executer, Compute Engine, DRMs with Disk Caches, and an HRM in front of a tape system, all behind a uniform SRM interface across the network. Arrow labels: "logical query", "logical files", "site-specific files", "site-specific files requests", "pinning & file transfer requests".]

SRM works with disk caches as well as legacy systems

[Figure: SC 2001 demo for HENP – STAR Experiment. A client in Denver, with a BIT-MAP Index, Request Manager, File Transfer Monitoring, a DRM, GridFTP and a Disk Cache, reaches servers in Berkeley, Chicago and Livermore through GridFTP and FTP; the servers front Disk Caches, a DRM and an HRM. Legend: Logical Request, Data Path, Control path.]

Earth Science Grid Demo - SC 2002

[Figure: Earth Science Grid demo at SC 2002. Components shown: Tomcat servlet engine with MCS, RLS, MyProxy and CAS clients; MCS (Metadata Cataloguing Services); RLS (Replica Location Services); MyProxy server; CAS (Community Authorization Services); GRAM gatekeeper; SOAP and RMI links; DRM and HRM Storage Resource Management instances; gridFTP servers, a gridFTP striped server and an openDAPg server; disks, HPSS (High Performance Storage System) and NCAR-MSS (Mass Storage System). Sites: LBNL, LLNL, USC-ISI, NCAR, ORNL, ANL.]

Uniformity of Interface, Compatibility of SRMs

[Figure: a Client (USER/APPLICATIONS) and Grid Middleware talk to several SRMs, which sit in front of Enstore, JASMine, DCache, CASTOR and a Disk Cache.]

High Level View of SRM setup in SC 2002

[Figure: a Client (USER/APPLICATIONS) connected to three SRMs; the storage systems labeled are Enstore and JASMine.]

Screen Dump of Demo at Fermi Booth

[Figure: screen capture of the demo.]

Where do SRMs belong in the Grid architecture?

[Figure: layered Grid architecture diagram, based on the Grid Architecture paper by the Globus Team.
Layers: Fabric; Connectivity; Resource; Collective 1: General services for coordinating multiple resources; Collective 2: Services specific to application domain or virtual org.
Boxes: Compute Systems; Networks; Other Storage systems; Mass Storage System (HPSS); Communication Protocols (e.g., TCP/IP stack); Authentication and Authorization Protocols (e.g., GSI); Storage Resource Manager; Compute Resource Management; Database Management Services; File Transfer Service (GridFTP); Data Filtering or Transformation Services; Resource Monitoring/Auditing; General Data Discovery Services; Storage Management (Brokering); Compute Scheduling (Brokering); Data Transport Services; Monitoring/Auditing Services; Data Federation Services; Application-Specific Data Discovery Services; Request Interpretation and Planning Services; Workflow or Request Management Services; Consistency Services (e.g., Update Subscription, Versioning, Master Copies); Community Authorization Services.]

SRMs provide a brokering service by supporting multi-file requests

[Figure: the layered Grid architecture diagram from "Where do SRMs belong in the Grid architecture?", with the same layers (Fabric, Connectivity, Resource, Collective 1, Collective 2) and service boxes, including Storage Resource Manager and Storage Management (Brokering). The Resource layer is labeled "Resource: Sharing single resources". Based on the Grid Architecture paper by the Globus Team.]

SRMs use in STAR for Robust Multi-file replication

[Figure: file replication between BNL and LBNL. An HRM-Client command-line interface, which can run anywhere, issues HRM-COPY (thousands of files) after getting the list of files. The HRM that performs writes issues SRM-GET (one file at a time) to the HRM that performs reads, and files move by GridFTP GET (pull mode) between their Disk Caches. Steps labeled: stage files, network transfer, archive files.]

• Recovers from staging failures
• Recovers from file transfer failures
• Recovers from archiving failures

Web-Based File Monitoring Tool

Shows:
- Files already transferred
- Files during transfer
- Files to be transferred

Also shows for each file:
- Source URL
- Target URL
- Transfer rate

GridFTP-HPSS Access Provided through HRM

[Figure: two configurations side by side. "Using HRM protocol": Client with SRM-API and GridFTP-API, HRM, GridFTP. "New: GridFTP-HPSS through HRM": Client with SRM-API and GridFTP-API, GridFTP entry, GridFTP move, HRM.]

• No modifications to the MSS
• Managing queues of multiple requests to the MSS
• Minimizing tape mounts
• Recovers from MSS transient failures

GridFTP-HRM-Layer implementation detail

[Figure: Client, GridFTP-API, GridFTP (entry, move, exit), FTP-HRM Layer and HRM, with links labeled "Shared memory" and "Corba" and numbered steps 1a/1b, 2a/2b, 3a/3b.]

1a: stor/retv
1b: hrm_get/hrm_put
2b: call_back
2a: unblock semaphore
3a: success_code
3b: hrm_release
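
The numbered steps can be read as a blocking wait on an asynchronous callback. The sketch below follows that reading under stated assumptions: `HrmStub`, `retrieve`, `read_file`, and the `/cache/` path are hypothetical, and an in-process thread and semaphore replace the shared memory and Corba links of the real layer. Only the step names `hrm_get`, `call_back`, and `hrm_release` come from the slide, and their signatures here are invented.

```python
import threading


class HrmStub:
    """Hypothetical stand-in for the HRM; not the real HRM interface."""

    def __init__(self):
        self.released = []

    def hrm_get(self, path, call_back):
        # 1b: the request returns at once; staging happens asynchronously.
        def stage():
            local = "/cache/" + path.rsplit("/", 1)[-1]
            call_back(path, local)  # 2b: tell the layer the file is staged
        threading.Thread(target=stage).start()

    def hrm_release(self, path):
        self.released.append(path)  # 3b: the staged copy may now be evicted


def retrieve(hrm, path, read_file):
    """Handle a retrieve command (1a) by blocking until the file is staged."""
    staged = threading.Semaphore(0)
    result = {}

    def call_back(requested, local_path):
        result["local"] = local_path
        staged.release()  # 2a: unblock the waiting transfer

    hrm.hrm_get(path, call_back)    # 1b
    staged.acquire()                # wait until 2b and 2a have happened
    try:
        read_file(result["local"])  # the data movement itself
        return "success"            # 3a
    finally:
        hrm.hrm_release(path)       # 3b
```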

Types of Spaces and Files

• Space reservation services
  • Spaces and files: volatile, durable, permanent
  • Lifetime, action at end of lifetime
    • Volatile – SRM owned, files can be removed if space needed
    • Durable – files cannot be removed, but administrator notified
    • Permanent – can be removed by owner only
• Directory services
  • Usual unix semantics
    • any type of files in directory
• Access control services
  • Support owner/group/world permission
    • Can only be assigned by owner
  • File sharing for read-only files
    • check with source for shared file permission
  • File sharing for updatable files
    • check with “master copy” for time of last update
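
The three file types differ mainly in what happens when a lifetime ends. A small sketch of that rule follows; the enum, the function name, and the returned action strings are assumptions made for illustration, not part of the SRM interface.

```python
from enum import Enum


class FileType(Enum):
    VOLATILE = "volatile"
    DURABLE = "durable"
    PERMANENT = "permanent"


def on_lifetime_expired(file_type, space_needed):
    """Return the action taken when a file's lifetime ends (assumed names)."""
    if file_type is FileType.VOLATILE:
        # SRM-owned: removed only if the space is actually needed.
        return "remove" if space_needed else "keep"
    if file_type is FileType.DURABLE:
        # Never removed automatically; the administrator is told instead.
        return "notify_administrator"
    # Permanent: only the owner may remove the file.
    return "keep"
```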

File movement functionality: srmGet, srmPut, srmReplicate

[Figure: two cases. srmGet/srmPut: a client calls an SRM and moves the file itself, labeled Client-FTP-get (pull) and Client-FTP-put (push). srmReplicate: a client calls an SRM, which moves the file to or from another site (SRM or No-SRM), labeled SRM-FTP-get (pull), SRM-FTP-put (push) and FTP-get.]

SRM Methods

File Movement
  srm(Prepare)Get
  srm(Prepare)Put
  srmReplicate

Lifetime management
  srmReleaseFiles
  srmPutDone
  srmExtendFileLifeTime

Terminate/resume
  srmAbortRequest
  srmAbortFile
  srmSuspendRequest
  srmResumeRequest

Space management
  srmReserveSpace
  srmReleaseSpace
  srmUpdateSpace
  srmCompactSpace
  srmGetCurrentSpace

FileType management
  srmChangeFileType

Status/metadata
  srmGetRequestStatus
  srmGetFileStatus
  srmGetRequestSummary
  srmGetRequestID
  srmGetFilesMetaData
  srmGetSpaceMetaData

Advantages of using SRMs

• Synchronization between storage resources
  • Pinning files, releasing files
  • Allocating space dynamically on an “as needed” basis
• Insulate clients from storage and network system failures
  • Transient MSS failure
  • Network failures
  • Interruption of large file transfers
• Facilitate file sharing
  • Eliminate unnecessary file transfers
• Support “streaming model”
  • Use space allocation policies by SRMs: no reservations needed
  • Use explicit release by client for reuse of space
• Control number of concurrent file transfers
  • From/to MSS – avoid flooding MSS and thrashing
  • From/to network – avoid flooding and packet loss
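
Limiting the number of concurrent transfers, the last advantage listed, can be sketched with a bounded worker pool. The function `run_transfers` and the `transfer` callable are hypothetical; the bound of 2 is an arbitrary default, not a value from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def run_transfers(files, transfer, max_concurrent=2):
    """Run `transfer(f)` for every file, at most `max_concurrent` at a time.

    Extra requests wait in the pool's queue instead of flooding the
    mass storage system or the network. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(transfer, files))
```

The pool size plays the role of the per-resource limit; a manager could hold one pool for MSS staging and another for network transfers, each with its own bound.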

Ongoing and Future Work

• Ongoing work
  • Developing standard SRM interfaces
    • Particle Physics Data Grid (PPDG) project
      • LBNL, TJNAF, FNAL
    • European Data Grid (EDG) project
      • WP2 – data management
      • WP5 – mass storage (CASTOR)
  • Deployment
    • LBNL, BNL, ORNL, TJNAF, FNAL, CERN, (SE-England)
  • Use of SRM by other agents
    • Storage Resource Broker (SDSC) calling HRM to stage files from HPSS
    • GridFTP invoking HRM
• Future work
  • Access authorization – Community Authorization Services (CAS)
  • “On-demand” space allocation, accounting, and charging
  • Replica management – invoke SRMs and RLS as a single service
  • Request executer (e.g. DAGMAN) to invoke SRMs
  • SRMs over NeST (Network STorage)