Modeling and Optimizing Large-Scale Wide-Area Data Transfers
Raj Kettimuthu, Gayane Vardoyan, Gagan Agrawal, and P. Sadayappan
Exploding data volumes
100,000 TB
Astronomy: MACHO et al.: 1 TB; Palomar: 3 TB; 2MASS: 10 TB; GALEX: 30 TB; Sloan: 40 TB; Pan-STARRS: 40,000 TB
Climate: 2004: 36 TB; 2012: 2,300 TB
Genomics: 10^5 increase in data volumes in 6 years
Data movement
Datasets must frequently be transported over WAN
– Analysis, visualization, archival
Data movement bandwidths not increasing at same rate as dataset sizes
– Major constraint for data-driven sciences
File transfer - dominant data transfer mode
GridFTP - widely used by scientific communities
– 1000s of servers deployed worldwide move >1 PB per day
Characterize, control and optimize transfers
GridFTP
High-performance, secure data transfer protocol optimized for high-bandwidth wide-area networks
Based on FTP protocol - defines extensions for high-performance operation and security
Globus implementation of GridFTP is widely used
Globus GridFTP servers support usage statistics collection
– Transfer type, size in bytes, start time of the transfer, transfer duration etc. are collected for each transfer
GridFTP usage log
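The usage statistics listed for each transfer (type, size in bytes, start time, duration) are enough to derive a per-transfer throughput. The following is a minimal sketch under stated assumptions: the record layout, the field names, the "RETR" type string, and the sample values are all hypothetical and are not taken from the Globus usage log format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UsageRecord:
    """One logged transfer; field names are hypothetical, not the Globus schema."""
    transfer_type: str    # assumed encoding, e.g. "STOR" or "RETR"
    size_bytes: int
    start_time: datetime
    duration_s: float


def throughput_mbps(rec: UsageRecord) -> float:
    """Average throughput of one transfer in megabits per second."""
    if rec.duration_s <= 0:
        raise ValueError("duration must be positive")
    return rec.size_bytes * 8 / rec.duration_s / 1e6


# Made-up sample: 10 GB moved in 100 seconds.
rec = UsageRecord("RETR", 10_000_000_000, datetime(2014, 1, 1, 12, 0, 0), 100.0)
print(throughput_mbps(rec))
```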
Parallelism vs concurrency in GridFTP
[Figure: GridFTP Server Processes on a Data Transfer Node at Site A and a Data Transfer Node at Site B, with a Parallel File System, linked by TCP Connections. Two configurations are shown: Parallelism = 3 and Concurrency = 3.]
Parallelism vs concurrency
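To make the distinction concrete, here is a small sketch that assumes the usual GridFTP meaning of the two knobs: parallelism is the number of TCP streams used for one transfer, and concurrency is the number of transfers (server processes) running at once. The function names and the round-robin file assignment are illustrative assumptions, not part of any GridFTP API.

```python
def data_connections(concurrency: int, parallelism: int) -> int:
    """Total TCP data connections between two transfer nodes."""
    if concurrency < 1 or parallelism < 1:
        raise ValueError("concurrency and parallelism must be >= 1")
    return concurrency * parallelism


def assign_files(files, concurrency):
    """Spread files round-robin over `concurrency` processes (illustrative only)."""
    return [files[i::concurrency] for i in range(concurrency)]


# Parallelism = 3: one server process pair, three TCP streams.
print(data_connections(concurrency=1, parallelism=3))
# Concurrency = 3: three server process pairs, one TCP stream each.
print(data_connections(concurrency=3, parallelism=1))
# With concurrency, different files move at the same time.
print(assign_files(["f1", "f2", "f3", "f4"], 3))
```

Both settings above open three connections; the difference is whether they serve one transfer or three independent ones.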
Problem formulation
Objective - control bandwidth allocation for transfer(s) from a source to the destination(s)
Most large transfers between supercomputers
– Ability to both store and process large amounts of data
Site heavily loaded, most bandwidth consumed by small number of sites
Goal – develop simple model for GridFTP
– Source concurrency - total number of ongoing transfers between the endpoint A and all its major transfer endpoints
– Destination concurrency - total number of ongoing transfers between the endpoint A and the endpoint B
– External load - All other activities on the endpoints including transfers to other sites
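The two concurrency definitions can be read as simple counts over the set of ongoing transfers. The sketch below assumes each ongoing transfer is represented as a (source, destination) pair and that endpoint A is always the source; both the representation and the endpoint names are hypothetical.

```python
from collections import Counter


def concurrencies(active_transfers, source="A"):
    """Return (SC, DC per destination) for transfers leaving `source`.

    active_transfers: iterable of (source, destination) pairs, one per
    ongoing transfer (an assumed representation).
    """
    per_destination = Counter(dst for src, dst in active_transfers if src == source)
    sc = sum(per_destination.values())   # all ongoing transfers from A
    return sc, dict(per_destination)     # DC for each endpoint B


active = [("A", "B")] * 3 + [("A", "C")] * 2 + [("X", "B")]
sc, dc = concurrencies(active)
print(sc, dc)
```

The transfer from the unrelated endpoint X is not counted in SC or DC; in the slide's terms it would contribute to external load.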
Modeling throughput
Linear models
Y' = a1*X1 + a2*X2 + … + ak*Xk + b
Models that consider only source and destination CC
DT = a1*DC + a2*SC + b1
DT = a3*DC/SC + b2
Separate model for each destination
Data to train, validate models – load variation experiments
Errors >15% for most cases
Log models
DT = SC^a4 * DC^a5 * 2^b3
log(DT) = a4*log(SC) + a5*log(DC) + b3 (log base 2)
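The three model forms can be written as plain functions, which also makes it easy to check that the power form and the log form of the log model agree when the logarithm is taken base 2. This is a sketch only: the coefficient values are made up for illustration and are not fitted values from the experiments.

```python
import math


def linear_additive(dc, sc, a1, a2, b1):
    """DT = a1*DC + a2*SC + b1"""
    return a1 * dc + a2 * sc + b1


def linear_ratio(dc, sc, a3, b2):
    """DT = a3*DC/SC + b2"""
    return a3 * dc / sc + b2


def log_model(dc, sc, a4, a5, b3):
    """DT = SC^a4 * DC^a5 * 2^b3"""
    return sc ** a4 * dc ** a5 * 2 ** b3


# Made-up coefficients, chosen only to exercise the formulas.
dt = log_model(dc=4, sc=8, a4=-0.5, a5=0.75, b3=6)
lhs = math.log2(dt)
rhs = -0.5 * math.log2(8) + 0.75 * math.log2(4) + 6
print(dt, lhs, rhs)
```

Because the log form is linear in log(SC) and log(DC), the log model can be trained with the same least-squares machinery as the linear models.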
Modeling throughput
Log model better than linear models, still high errors
Model based on just SC and DC too simplistic
Incorporate external load
– External load - network, disk, and CPU activities outside transfers
– How to measure the external load?
– How to include external load in model(s)?
External load
Transfers stable over short duration but vary widely over entire day
Multiple training data – same SC, DC - different days & times
Throughput differences for same SC, DC attributed to difference in external load
Three different functions for external load (EL)
– EL1 = T − AT, T - throughput for transfer t, AT - average throughput of all transfers with same SC, DC as t
– EL2 = T − MT, MT - max throughput with same SC, DC as t
– EL3 = T/MT
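The three external-load functions can be computed directly from the throughputs of training transfers that share the same SC and DC. A minimal sketch, assuming the group of throughputs passed in includes transfer t itself; the sample numbers are made up.

```python
from statistics import mean


def external_loads(t, same_sc_dc):
    """EL1, EL2, EL3 for a transfer with throughput `t`.

    same_sc_dc: throughputs of all transfers with the same SC and DC as t
    (assumed here to include t itself).
    """
    at = mean(same_sc_dc)   # AT: average throughput of the group
    mt = max(same_sc_dc)    # MT: maximum throughput of the group
    return {"EL1": t - at, "EL2": t - mt, "EL3": t / mt}


group = [400.0, 500.0, 600.0]   # made-up throughputs, same SC and DC
loads = external_loads(400.0, group)
print(loads)
```

EL1 and EL2 are differences and can be negative or zero, while EL3 is a ratio in (0, 1] for positive throughputs.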
Models with external load
Linear
DT = a6*DC + a7*SC + a8*EL + b4
Log
DT = SC^a9 * DC^a10 * AEL{a11} * 2^b5
AEL{a11} = EL^a11 if EL > 0, |EL|^(−a11) otherwise
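The piecewise factor AEL{a11} and the two models with external load translate into short functions. This is a sketch under assumptions: the coefficient values are invented, and because the slide does not say how EL = 0 is handled (the "otherwise" branch would raise zero to a negative power for positive a11), the code simply rejects that case.

```python
def ael(el, a11):
    """Piecewise external-load factor: EL^a11 if EL > 0, |EL|^(-a11) otherwise."""
    if el > 0:
        return el ** a11
    if el == 0:
        # Not specified on the slide; rejecting EL = 0 is this sketch's choice.
        raise ValueError("EL = 0 is not covered by this sketch")
    return abs(el) ** (-a11)


def linear_with_el(dc, sc, el, a6, a7, a8, b4):
    """DT = a6*DC + a7*SC + a8*EL + b4"""
    return a6 * dc + a7 * sc + a8 * el + b4


def log_with_el(dc, sc, el, a9, a10, a11, b5):
    """DT = SC^a9 * DC^a10 * AEL{a11} * 2^b5"""
    return sc ** a9 * dc ** a10 * ael(el, a11) * 2 ** b5


# Made-up coefficients and loads, chosen only to exercise both branches.
print(ael(4.0, 0.5), ael(-4.0, 0.5))
print(linear_with_el(2, 5, 3.0, a6=100.0, a7=-10.0, a8=50.0, b4=200.0))
print(log_with_el(4, 4, 4.0, a9=-0.5, a10=0.5, a11=0.5, b5=3))
```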
Calculating external load in practice
Unlike SC and DC, external load is unknown
Multiple data points with same SC, DC used to train models
In practice, may not be any recent transfers with same SC, DC
Some recent transfers, no substantial change in external load over few minutes
– Most recent transfer’s load as current load
– Average load of transfers in past 30 minutes as current load
– Average load in the past 30 minutes with error correction
DT = a6*DC + a7*SC + a8*EL + b4 (terms labeled Given, Control, Unknown)
Recent transfers load with error correction
Transfers in past 30 minutes: DT = a6*DC + a7*SC + a8*EL + b4 (terms labeled Known, Compute)
Historic transfers: DT = a6*DC + a7*SC + a8*EL + b4 + e
Previous Transfer Method
Recent Transfers Method
Recent Transfers with Error Correction
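The three methods named above can be sketched for the linear model. The sketch assumes that, for a completed transfer, DT, DC, and SC are known and the model is solved for EL, which is one reading of the "Known / Compute" labels. The slides do not say how the correction term e is derived from historic transfers, so it is taken here as an input. The record layout, function names, and all numbers are hypothetical.

```python
from statistics import mean


def solve_el(dt, dc, sc, a6, a7, a8, b4):
    """Solve DT = a6*DC + a7*SC + a8*EL + b4 for EL."""
    return (dt - a6 * dc - a7 * sc - b4) / a8


def previous_transfer_el(history, coef):
    """Previous Transfer Method: load of the most recently finished transfer."""
    latest = max(history, key=lambda r: r[0])
    return solve_el(*latest[1:], *coef)


def recent_transfers_el(history, now, coef, window_s=1800):
    """Recent Transfers Method: mean load over the past 30 minutes."""
    recent = [r for r in history if 0 <= now - r[0] <= window_s]
    if not recent:
        raise ValueError("no transfers in the window")
    return mean(solve_el(*r[1:], *coef) for r in recent)


def predict(dc, sc, el, coef, e=0.0):
    """Linear model; pass a nonzero e for the error-corrected variant."""
    a6, a7, a8, b4 = coef
    return a6 * dc + a7 * sc + a8 * el + b4 + e


coef = (100.0, -10.0, 50.0, 200.0)   # made-up a6, a7, a8, b4
# Records are (end_time_s, DT, DC, SC); values are made up.
history = [(0.0, 750.0, 1, 5), (2000.0, 450.0, 2, 5), (2900.0, 640.0, 3, 6)]
el_prev = previous_transfer_el(history, coef)
el_recent = recent_transfers_el(history, now=3000.0, coef=coef)
print(el_prev, el_recent, predict(2, 5, el_recent, coef, e=5.0))
```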
Applying models to control bandwidth
Experimental setup: DTNs at 5 XSEDE sites (Source: TACC, Destinations: PSC, NCAR, NICS, Indiana, SDSC)
Goal – control bandwidth allocation to destinations when source is saturated
Models express throughput in terms of SC, DC, and EL
Given target throughput, determine DC to achieve target
– Often more than one destination transfers data, so SC is also unknown
Limit DC to 20 to narrow search space
– Even then, large number of possible DC combinations (20^n)
Heuristics to limit search space to (SCmax – ND + 1)
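With DC capped at 20, five destinations already give 20^5 = 3,200,000 DC combinations. The slides do not spell out the heuristic, so the following is only one plausible sketch of a search with SCmax – ND + 1 candidates: scan SC from ND to SCmax, and for each SC pick the smallest DC per destination whose predicted throughput reaches its target, keeping the SC whose DCs sum closest to it. It assumes ND is the number of destinations, uses the log model with the external-load factor precomputed, and uses made-up coefficients and targets.

```python
def dc_for_target(target, sc, el_factor, a9, a10, b5, dc_max=20):
    """Smallest DC in 1..dc_max whose predicted throughput reaches target."""
    for dc in range(1, dc_max + 1):
        if sc ** a9 * dc ** a10 * el_factor * 2 ** b5 >= target:
            return dc
    return None


def pick_concurrencies(targets, models, sc_max, dc_max=20):
    """Scan SC = ND..SCmax and return (SC, [DC per destination]) or None.

    models: one (el_factor, a9, a10, b5) tuple per destination (assumed layout).
    """
    nd = len(targets)
    best = None
    for sc in range(nd, sc_max + 1):          # SCmax - ND + 1 candidates
        dcs = [dc_for_target(t, sc, *m, dc_max) for t, m in zip(targets, models)]
        if None in dcs:
            continue
        gap = abs(sum(dcs) - sc)              # SC should equal the sum of DCs
        if best is None or gap < best[0]:
            best = (gap, sc, dcs)
    return None if best is None else (best[1], best[2])


model = (1.0, -0.5, 1.0, 6)                   # made-up el_factor, a9, a10, b5
choice = pick_concurrencies([40.0, 100.0], [model, model], sc_max=10)
print(20 ** 5, choice)
```

The consistency check relies on SC being the total of the per-destination concurrencies, which follows from the definitions of SC and DC when all counted transfers go to the modeled destinations.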
Experiments
Ratio experiments – allocate available bandwidth at source to destinations using predefined ratio
– Achieve specific fraction of bandwidth for each destination
– Four ratio combinations
Factoring experiments – increase destination’s throughput by a factor when source is saturated
– Bandwidth increase because of certain priorities
Four models/methods (log EL1/EL3 models and RT/RTEC methods) were used
– Effective in predicting the throughputs
– 83.6% of the errors are below 15%, and 65.5% of them are below 10%
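Statistics such as "83.6% of the errors are below 15%" can be computed from pairs of predicted and observed throughputs. The sketch below assumes the error is the absolute difference relative to the observed value, which the slides do not state; the sample values are made up and are not the experimental data.

```python
def percent_errors(predicted, observed):
    """Absolute prediction error as a percentage of the observed throughput."""
    return [100 * abs(p - o) / o for p, o in zip(predicted, observed)]


def share_below(errors, threshold):
    """Percentage of errors strictly below `threshold` percent."""
    return 100 * sum(e < threshold for e in errors) / len(errors)


predicted = [100.0, 112.0, 95.0, 130.0]   # made-up values
observed = [100.0, 100.0, 100.0, 100.0]
errors = percent_errors(predicted, observed)
print(errors, share_below(errors, 15), share_below(errors, 10))
```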
Results – Ratio experiments
Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone. Concurrencies picked by Algorithm were {1,3,3,1,1}. Model: log with EL1. Method: RTEC
Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone. Concurrencies picked by Algorithm were {1,4,3,1,1}. Model: log with EL3. Method: RT
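A ratio such as 4:5:6:8:9 translates into per-destination target throughputs once the bandwidth available at the source is known. In this sketch the ratio and system names come from the captions above, while the available bandwidth of 3200 (in arbitrary units) is a made-up number chosen to give round results.

```python
def targets_from_ratio(ratio, available):
    """Split `available` bandwidth across destinations in the given ratio."""
    total = sum(ratio.values())
    return {site: available * share / total for site, share in ratio.items()}


ratio = {"Kraken": 4, "Mason": 5, "Blacklight": 6, "Gordon": 8, "Yellowstone": 9}
targets = targets_from_ratio(ratio, available=3200.0)   # 3200 is hypothetical
print(targets)
```

The shares sum to 32, so Kraken receives 4/32 of the source bandwidth and Yellowstone 9/32.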
Results – Factoring experiments
Increasing Gordon’s baseline throughput by 2x. Concurrency picked by Algorithm for Gordon was 5
Increasing Yellowstone’s baseline throughput by 1.5x. Concurrency picked by Algorithm for Yellowstone was 3
Related work
Several models for predicting behavior & finding optimal parallel TCP streams
– Uncongested networks, simulations
Several studies developed models to find optimal streams, TCP buffer size for GridFTP
– Buffer size not needed with TCP autotuning
Major difference - attempt to model GridFTP throughput based on end-to-end behavior
– End-system load, destinations’ capabilities, concurrent transfers
Many studies on bandwidth allocation at router
– Our focus is application-level control
Summary
Understand performance of WAN transfers
Control bandwidth allocation at FTP level
Transfers between major supercomputing centers
Concurrency more powerful than parallelism
Models to help control bandwidth allocation
Log models that combine total source CC, destination CC, and a measure of external load are effective
Methods that utilize both recent and historical experimental data are better at estimating external load
Questions