Reliable and Efficient Grid Data Placement
using Stork and DiskRouter
Tevfik Kosar University of Wisconsin-Madison
April 15th, 2004
A Single Project..
• LHC (Large Hadron Collider)
• Comes online in 2006
• Will produce 1 Exabyte of data by 2012
• Accessed by ~2000 physicists, 150 institutions, 30 countries
And Many Others..
• Genomic information processing applications
• Biomedical Informatics Research Network (BIRN) applications
• Cosmology applications (MADCAP)
• Methods for modeling large molecular systems
• Coupled climate modeling applications
• Real-time observatories, applications, and data-management (ROADNet)
The Same Big Problem..
Need for data placement:
• Locate the data
• Send data to processing sites
• Share the results with other sites
• Allocate and de-allocate storage
• Clean up everything
Do these reliably and efficiently.
Outline
• Introduction
• Stork
• DiskRouter
• Case Studies
• Conclusions
Stork
• A scheduler for data placement activities in the Grid
• What Condor is for computational jobs, Stork is for data placement
• Stork comes with a new concept: "Make data placement a first-class citizen in the Grid."
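As a rough illustration of the concept, a data placement job is described to Stork in the ClassAd-style submit format that appears later in these slides. The sketch below is a minimal, hypothetical example: the hostname, paths, and the file:// destination are placeholders, not values from the talk.

[
  dap_type = "transfer";                                // a data placement job of type "transfer"
  src_url  = "gsiftp://example-source.edu/data/x.dat";  // placeholder remote source
  dest_url = "file:///scratch/x.dat";                   // placeholder local destination
]

The job is submitted to Stork much as a computational job is submitted to Condor; Stork queues it, runs the transfer, and retries it if it fails.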
The Concept
• Stage-in
• Execute the job
• Stage-out
[Diagram: the three-step view above expanded into individual jobs: allocate space for input & output data, stage-in, execute the job, stage-out, release input space, release output space.]
The Concept
• Stage-in
• Execute the job
• Stage-out
[Diagram: the same steps split into data placement jobs (allocate space for input & output data, stage-in, stage-out, release input space, release output space) and computational jobs (execute the job); a hedged sketch of the allocate/release jobs follows.]
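As a hedged sketch, the allocate and release steps could themselves be written as data placement jobs in the same ClassAd style. The dap_type values "allocate" and "release" and every attribute below are assumptions made for illustration, not taken from these slides.

[
  dap_type = "allocate";                    // assumed type: reserve space before stage-in
  dest_host = "nest://turkey.cs.wisc.edu";  // hypothetical storage server
  size = "500MB";                           // hypothetical amount of space to reserve
]

[
  dap_type = "release";                     // assumed type: free the space after stage-out
  dest_host = "nest://turkey.cs.wisc.edu";  // hypothetical storage server
]

DAGMan (next slide) can then order these jobs so that allocation happens before stage-in and the release jobs run after the results are staged out.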
DAGMan
The Concept
DAG specification:

DaP A A.submit
DaP B B.submit
Job C C.submit
.....
Parent A child B
Parent B child C
Parent C child D, E
.....

[Diagram: DAGMan sends the data placement (DaP) nodes to the Stork job queue and the computational nodes to the Condor job queue.]
Why Stork?
• Stork understands the characteristics and semantics of data placement jobs.
• Can make smart scheduling decisions for reliable and efficient data placement.
Failure Recovery and Efficient Resource Utilization
• Fault tolerance: just submit a bunch of data placement jobs, and then go away.. (see the sketch below)
• Control the number of concurrent transfers from/to any storage system: prevents overloading
• Space allocation and de-allocation: make sure space is available
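To make "submit a bunch of data placement jobs and then go away" concrete, a submit file could simply list one transfer job per file, reusing the srb:// and nest:// endpoints from the job-representation example a couple of slides later; the file names here are hypothetical. Stork then controls how many of these transfers run concurrently against each storage system.

[
  dap_type = "transfer";
  src_url  = "srb://ghidorac.sdsc.edu/kosart.condor/file0001.dat";
  dest_url = "nest://turkey.cs.wisc.edu/kosart/file0001.dat";
]

[
  dap_type = "transfer";
  src_url  = "srb://ghidorac.sdsc.edu/kosart.condor/file0002.dat";
  dest_url = "nest://turkey.cs.wisc.edu/kosart/file0002.dat";
]

// ... one transfer job per remaining file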
Support for Heterogeneity
Protocol translation using Stork memory buffer.
Support for Heterogeneity
Protocol translation using Stork Disk Cache.
Flexible Job Representation and Multilevel Policy Support

[
  Type = "Transfer";
  Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
  Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
  ......
  Max_Retry = 10;
  Restart_in = "2 hours";
]
Run-time Adaptation

Dynamic protocol selection:

[
  dap_type = "transfer";
  src_url  = "drouter://slic04.sdsc.edu/tmp/test.dat";
  dest_url = "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat";
  alt_protocols = "nest-nest, gsiftp-gsiftp";
]

[
  dap_type = "transfer";
  src_url  = "any://slic04.sdsc.edu/tmp/test.dat";
  dest_url = "any://quest2.ncsa.uiuc.edu/tmp/test.dat";
]

In the first form, the listed alternative protocols are tried if the DiskRouter transfer fails; in the second, the "any" pseudo-protocol leaves the choice of protocol to Stork.
Run-time Adaptation
Run-time Protocol Auto-tuning

[
  link = "slic04.sdsc.edu – quest2.ncsa.uiuc.edu";
  protocol = "gsiftp";
  bs = 1024KB;      // block size
  tcp_bs = 1024KB;  // TCP buffer size
  p = 4;            // parallelism
]
Outline
• Introduction
• Stork
• DiskRouter
• Case Studies
• Conclusions
DiskRouter
• A mechanism for high-performance, large-scale data transfers
• Uses hierarchical buffering to aid in large-scale data transfers
• Enables an application-level overlay network for maximizing bandwidth
• Supports application-level multicast
Store and Forward
Improves performance when bandwidth fluctuation between A and B is independent of the bandwidth fluctuation between B and C
[Diagram: a transfer from A to C through an intermediate node B, compared with and without a DiskRouter at B buffering the data.]
DiskRouter Overlay Network
[Diagram: direct transfer from A to B over a 90 Mb/s path.]
DiskRouter Overlay Network
[Diagram: the same transfer routed through DiskRouter node C, using two 400 Mb/s links (A-C and C-B) instead of the 90 Mb/s direct path.]
Add a DiskRouter node C, which is not necessarily on the path from A to B, to enforce use of an alternative path.
Data Mover/Distributed Cache
Source writes to its closest DiskRouter and the destination picks the data up from its closest DiskRouter.
[Diagram: source and destination connected through a DiskRouter cloud.]
Outline
• Introduction
• Stork
• DiskRouter
• Case Studies
• Conclusions
Case Study I: SRB-UniTree Data Pipeline
• Transfer ~3 TB of DPOSS data from SRB @ SDSC to UniTree @ NCSA
• A data pipeline created with Stork and DiskRouter
[Diagram: the pipeline runs from the SRB server at SDSC, through the SDSC cache and the NCSA cache, to the UniTree server at NCSA, driven from a submit site.]
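A hedged sketch of how one file might move through this pipeline as a chain of Stork transfer jobs. All hostnames, local paths, and the unitree:// URL form are hypothetical; the srb:// and drouter:// forms follow the examples shown earlier, and the ordering of the stages could be expressed as DAGMan parent/child dependencies as in the earlier DAG example.

[
  dap_type = "transfer";                                             // SRB server at SDSC -> SDSC cache
  src_url  = "srb://srb.sdsc.edu/dposs/file0001.dat";                // hypothetical SRB path
  dest_url = "file:///sdsc-cache/dposs/file0001.dat";                // hypothetical cache path
]

[
  dap_type = "transfer";                                             // SDSC cache -> NCSA cache via DiskRouter
  src_url  = "drouter://sdsc-cache.sdsc.edu/dposs/file0001.dat";
  dest_url = "drouter://ncsa-cache.ncsa.uiuc.edu/dposs/file0001.dat";
]

[
  dap_type = "transfer";                                             // NCSA cache -> UniTree server at NCSA
  src_url  = "file:///ncsa-cache/dposs/file0001.dat";
  dest_url = "unitree://unitree.ncsa.uiuc.edu/dposs/file0001.dat";   // assumed unitree:// form
]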
Failure Recovery

[Plot: transfer progress over time, annotated with the failures the pipeline recovered from: UniTree not responding, DiskRouter reconfigured and restarted, SDSC cache reboot & UW CS network outage, and a software problem.]
Case Study II:
Dynamic Protocol Selection
Runtime Adaptation
Before Tuning:
• parallelism = 1
• block_size = 1 MB
• tcp_bs = 64 KB
After Tuning:
• parallelism = 4
• block_size = 1 MB
• tcp_bs = 256 KB
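Plugged back into the auto-tuning entry shown earlier, and assuming the same slic04.sdsc.edu – quest2.ncsa.uiuc.edu link, the tuned values would read:

[
  link = "slic04.sdsc.edu – quest2.ncsa.uiuc.edu";
  protocol = "gsiftp";
  bs = 1024KB;      // block size, unchanged (1 MB)
  tcp_bs = 256KB;   // TCP buffer size, tuned up from 64 KB
  p = 4;            // parallel streams, tuned up from 1
]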
Conclusions
• Regard data placement as a first-class citizen.
• Introduce a specialized scheduler for data placement.
• Introduce a high-performance data transfer tool.
• End-to-end automation, fault tolerance, run-time adaptation, multilevel policy support, reliable and efficient transfers.
Future work
• Enhanced interaction between Stork, DiskRouter, and higher-level planners: co-scheduling of CPU and I/O
• Enhanced authentication mechanisms
• More run-time adaptation
You don’t have to FedEx your data anymore.. We deliver it for you!
For more information

Stork:
• Tevfik Kosar
• Email: [email protected]
• http://www.cs.wisc.edu/condor/stork

DiskRouter:
• George Kola
• Email: [email protected]
• http://www.cs.wisc.edu/condor/diskrouter