Use of SRMs in Earth System Grid. Arie Shoshani, Alex Sim, Lawrence Berkeley National Laboratory



Page 1

Use of SRMs in Earth System Grid

Arie Shoshani

Alex Sim

Lawrence Berkeley National Laboratory

Page 2

[Figure: download volume in GB/day (axis from 0 to 600), plotted daily and as a 7-day average]

Earth System Grid

• Main ESG portal
  • 148.53 TB of data at four locations (NCAR, LBNL, ORNL, LANL)
  • 965,551 files
  • Includes the past 7 years of joint DOE/NSF climate modeling experiments
  • 4713 registered users from 28 countries
  • Downloads to date: 31 TB / 99,938 files

• IPCC AR4 ESG portal
  • 28 TB of data at one location
  • 68,400 files
  • Model data from 11 countries
  • Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change (IPCC)
  • 818 registered analysis projects from 58 countries
  • Downloads to date: 123 TB / 543,500 files, 300 GB/day on average

Courtesy: http://www.earthsystemgrid.org

Page 3

The Role of SRMs in ESG

• Data production
  • Run simulations
  • Generate data at compute sites -> move to archives
  • Need robust bulk data movement – use SRMs

• Data analysis
  • Replicate part of data to ESG portal sites
  • Get subsets of data to users/clients
  • Use SRMs to move data from any archive to portal site
  • Serve multiple files to users using an SRM client

Page 4

SRMs in ESG

[Figure: architecture diagram. Labels: Client; Portal; Files Selection and Request; download; DISK CACHE; HRM@LBNL; DRM@LANL; HRM@ORNL; DRM@LLNL; HRM@NCAR; NCAR MSS. Disk Cache boxes are drawn alongside the SRMs.]

DRM – Disk Storage Manager

HRM – Hierarchical Storage Manager

Page 5

SRM works in concert with other Grid components in ESG

[Figure: ESG component diagram.
Sites: LBNL, LLNL, ISI, NCAR, ORNL, ANL, LANL.
Components: ESG Portal; IPCC Portal; ESG Metadata DB; User DB; XML data catalogs; ESG CA; MyProxy; Globus Security Infrastructure; MCS (Metadata Cataloguing Services); RLS (Replica Location Services); Monitoring Discovery Services; GridFTP servers and GridFTP service; FTP server; OPeNDAP-g; LAHFS; DRM and HRM (Storage Resource Management); DISK; HPSS; MSS (Mass Storage System).]

Page 6

Page 7

DataMover: Robust Multi-File replication

• Multi-File Replication – why is it a problem?

• Tedious task – many files, repetitious

• Lengthy task – long time, can take hours, even days

• Error prone – need to monitor transfers

• Error recovery – need to restart file transfers

• Stage and archive from MSS – limited concurrency, down time, transient failures

• Use of FTP – large windows, concurrent transfer

• Security – both for local MSS and the network

• Firewalls – transfer from/to MSS must be internal to the site

• Specialized MSS – HPSS at NERSC, ORNL, …, MSS at NCAR

Page 8

Main Idea

• Take advantage of Storage Resource Managers

• What do you get?
  • SRMs queue multi-file requests
  • SRMs allocate space and release space automatically
  • SRMs request files from remote SRMs
  • Recover from network failures
  • SRMs invoke GridFTP – use large windows & parallel streams

• A special SRM in front of HPSS was developed by the SRM middleware project at LBNL and applied to PPDG
  • Called “Hierarchical Storage Manager” (HRM)
  • Queues multi-file requests to HPSS
  • Performs both staging and archiving
  • Recovers from failures during staging and archiving

• For MSS at NCAR
  • Replace the module that communicates with HPSS with one that communicates with NCAR-MSS
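
The queueing and space-management points above can be illustrated with a minimal Python sketch. It is a toy model, not the SRM implementation: `DiskCache`, `process_request`, and the `fetch` and `deliver` callables are hypothetical names invented for illustration. The sketch only shows the idea of serving a multi-file request through a bounded disk cache, allocating space before a file is brought in and releasing it once the file has been handed on.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DiskCache:
    """Toy disk cache with a fixed capacity in bytes (hypothetical model)."""
    capacity: int
    used: int = 0

    def allocate(self, size: int) -> bool:
        """Reserve space if it fits; report whether the reservation succeeded."""
        if self.used + size > self.capacity:
            return False
        self.used += size
        return True

    def release(self, size: int) -> None:
        """Give the space back once a file has left the cache."""
        self.used -= size


def process_request(files, cache, fetch, deliver):
    """Serve a multi-file request through a bounded cache.

    files   -- list of (name, size) pairs
    fetch   -- callable(name) that brings a file into the cache (assumed)
    deliver -- callable(name) that hands a cached file onward (assumed)
    Returns the file names in the order they were delivered.
    """
    queue = deque(files)
    delivered = []
    while queue:
        batch = []
        # Fill the cache with as many queued files as currently fit.
        while queue and cache.allocate(queue[0][1]):
            name, size = queue.popleft()
            fetch(name)
            batch.append((name, size))
        if not batch:
            # The cache is empty here, so the next file can never fit.
            raise ValueError(f"{queue[0][0]} is larger than the cache")
        # Hand the cached files onward, releasing space as each one leaves.
        for name, size in batch:
            deliver(name)
            cache.release(size)
            delivered.append(name)
    return delivered
```

With a 100-byte cache and three 40-byte files, the sketch brings in two files, delivers them, and only then brings in the third, so the caller never has to manage space by hand.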

Page 9

DataMover: SRMs use in ESG for Robust Multi-file replication

[Figure: DataMover flow diagram. Labels:
- Anywhere: HRM-Client command-line interface
- LBNL/ORNL: SRM (performs writes), disk cache
- NCAR: SRM (performs reads), disk cache, NCAR-MSS
- Requests: HRM-COPY (thousands of files); get list of files from directory; SRM-GET (one file at a time); GridFTP GET (pull mode)
- Data movement: stage files; network transfer; archive files
- Recovery: recovers from staging failures; recovers from file transfer failures; recovers from archiving failures
- Web-based file monitoring tool]
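
The diagram names three places where DataMover recovers from failures: staging, file transfer, and archiving. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not DataMover's code: `TransientError`, `with_retries`, `replicate`, and the `stage`, `transfer`, and `archive` callables are hypothetical stand-ins, and the sketch handles files serially rather than concurrently.

```python
import time


class TransientError(Exception):
    """Stand-in for a recoverable failure (for example MSS downtime)."""


def with_retries(step, name, max_attempts=3, delay=0.0):
    """Run step(name), retrying on TransientError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step(name)
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(delay)


def replicate(file_names, stage, transfer, archive, max_attempts=3):
    """Replicate files one at a time through three retried steps.

    stage    -- callable(name): bring the file from the source MSS to disk (assumed)
    transfer -- callable(name): pull the file over the network (assumed)
    archive  -- callable(name): write the file to the target archive (assumed)
    Returns (done, failed) lists of file names.
    """
    done, failed = [], []
    for name in file_names:
        try:
            for step in (stage, transfer, archive):
                with_retries(step, name, max_attempts)
        except TransientError:
            # Retries exhausted for this file; carry on with the rest.
            failed.append(name)
        else:
            done.append(name)
    return done, failed
```

Retrying each step separately means a dropped connection during transfer does not force the file to be staged again, and one file that keeps failing does not stop the remaining files in the request.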

Page 10

Web-Based File Monitoring Tool

Shows:
- Files already transferred
- Files during transfer
- Files to be transferred

Also shows for each file:
- Source URL
- Target URL
- Transfer rate
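
The per-file information listed above maps naturally onto a small record type. The Python sketch below is one way such tracking data could be modeled; it is not the monitoring tool's actual data model. `Status`, `FileRecord`, `summarize`, and the field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "to be transferred"
    ACTIVE = "during transfer"
    DONE = "already transferred"


@dataclass
class FileRecord:
    """Tracking data for one file (hypothetical fields)."""
    source_url: str
    target_url: str
    size_bytes: int
    status: Status = Status.PENDING
    seconds: Optional[float] = None  # elapsed transfer time, once known

    def rate_mb_per_s(self) -> Optional[float]:
        """Transfer rate in MB/s (10**6 bytes), or None if no time is recorded."""
        if not self.seconds:
            return None
        return self.size_bytes / 1e6 / self.seconds


def summarize(records):
    """Count how many files are in each of the three states."""
    counts = {status: 0 for status in Status}
    for record in records:
        counts[record.status] += 1
    return counts
```

Grouping records by status gives the three lists the tool displays, and comparing per-file rates across stages is what makes a slow stage stand out.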

Page 11

File tracking helps to identify bottlenecks

Shows that archiving is the bottleneck

Page 12

File tracking shows recovery from transient failures

Total: 45 GB