Pipeline and Batch Sharing in Grid Workloads

Douglas Thain, John Bent, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau,
and Miron Livny
ADSL and Condor Projects
6 May 2003

www.cs.wisc.edu/condor
Goals
› Study a diverse range of scientific applications: measure CPU, memory, and I/O demands
› Understand relationships between applications; the focus is on I/O sharing
Batch-Pipelined workloads
› The behavior of single applications, both sequential and parallel, has been well studied
› But many applications are not run in isolation: the end result is the product of a group of applications, commonly found in batch systems and run 100s or 1000s of times
› The key is the sharing behavior between applications
Batch-Pipelined Sharing
Pip
elin
e
Batch width
Shared dataset
Pipeline sharing
Shared dataset
3 types of I/O
› Endpoint: unique input and output
› Pipeline: ephemeral data
› Batch: shared input data
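The three categories can be made concrete with a small sketch. The classification rules, file names, and job layout below are illustrative assumptions, not the authors' actual tracing tool.

```python
# Illustrative sketch: classify a job's file accesses into the three
# I/O types above. The rules and file names are hypothetical examples.

def classify_io(path, shared_dataset, pipeline_files):
    """Return 'batch', 'pipeline', or 'endpoint' for one file access."""
    if path in shared_dataset:      # read by every job in the batch
        return "batch"
    if path in pipeline_files:      # ephemeral, passed between stages
        return "pipeline"
    return "endpoint"               # unique input or output of this job

# Hypothetical BLAST-like job: the genomic database is batch-shared,
# and there is no pipeline data (single executable).
shared = {"nr.db"}
pipeline = set()
accesses = ["query.fasta", "nr.db", "matches.out"]
kinds = [classify_io(p, shared, pipeline) for p in accesses]
print(kinds)   # ['endpoint', 'batch', 'endpoint']
```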
Outline
› Goals and intro
› Applications
› Methodology
› Results
› Implications
Six (plus one) target scientific applications
› BLAST - biology
› IBIS - ecology
› CMS - physics
› Hartree-Fock - chemistry
› Nautilus - molecular dynamics
› AMANDA - astrophysics
› SETI@home - astronomy
Common characteristics
› Diamond-shaped storage profile
› Multi-level working sets: the logical collection may be greater than that used by the application
› Significant data sharing
› Commonly submitted in large batches
BLAST

  search string → blastp → matches
  (shared dataset: genomic database)

BLAST searches for matching proteins and nucleotides in a genomic database. It has only a single executable and thus no pipeline sharing.
IBIS

  inputs → analyze → forecast
  (shared dataset: climate data)

IBIS is a global-scale simulation of the earth’s climate used to study the effects of human activity (e.g. global warming). With only one application, there is no pipeline sharing.
CMS

  configuration → cmkin → raw events
  raw events (+ configuration, geometry) → cmsim → triggered events

CMS is a two-stage pipeline in which the first stage models accelerated particles and the second simulates the response of a detector. This is actually just the first half of a larger pipeline.
Hartree-Fock

  problem → setup → initial state → argos → integral → scf → solutions

HF is a three-stage simulation of the non-relativistic interactions between atomic nuclei and electrons. Aside from the executable files, HF has no batch sharing.
Nautilus

  initial state → nautilus → intermediate → bin2coord → coordinates → rasmol → visualization
  (shared dataset: physics)

Nautilus is a three-stage pipeline that solves Newton’s equation for each molecular particle in a three-dimensional space. The physics governing molecular interactions is expressed in a shared dataset. The first stage is often repeated multiple times.
AMANDA

  inputs → corsika → raw events → corama → standard events → mmc → noisy events → mmc → triggered events
  (shared datasets: physics, ice tables, geometry)

AMANDA is a four-stage astrophysics pipeline designed to observe cosmic events such as gamma-ray bursts. The first stage simulates neutrino production and the creation of muon showers. The second transforms the data into a standard format, and the third and fourth stages follow the muons’ paths through earth and ice.
SETI@home

  work unit → setiathome → analysis

SETI@home is a single-stage pipeline which downloads a work unit of radio telescope “noise” and analyzes it for any possible signs of extraterrestrial intelligent life. It has no batch data, but does have pipeline data, as it performs its own checkpointing.
Methodology
› CPU behavior tracked with hardware counters
› Memory tracked with usage statistics
› I/O behavior tracked with interposition (mmap was a little tricky)
› Data collection was easy; running the applications was the challenge
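Interposition places a tracing layer between the application and the I/O interface so every access is observed without modifying the application. The authors interposed on system calls; the Python stand-in below, which wraps `builtins.open`, is only a sketch of the same idea, not their tool.

```python
# Sketch of the interposition idea: wrap the I/O entry point so every
# access is recorded transparently. This wraps Python's built-in open
# purely to illustrate the technique.
import builtins

io_log = []                      # (path, mode) for every observed open
_real_open = builtins.open

def traced_open(path, mode="r", *args, **kwargs):
    io_log.append((str(path), mode))   # record the access
    return _real_open(path, mode, *args, **kwargs)

builtins.open = traced_open      # install the interposition layer

# Any "application" code now gets traced without being changed.
with open("demo.txt", "w") as f:
    f.write("hello")
with open("demo.txt") as f:
    data = f.read()

builtins.open = _real_open       # remove the layer
print(io_log)                    # [('demo.txt', 'w'), ('demo.txt', 'r')]
```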
Absolute Resources Consumed

[Chart: for SETI, BLAST, IBIS, CMS, HF, Nautilus, and AMANDA: real time in hours (0 to 30) on one axis, memory and I/O in MB (0 to 5000) on the other.]

• Wide range of runtimes. Modest memory usage.
Absolute I/O Mix

[Chart: I/O traffic in MB per application on a log scale (0.1 to 10000), split into endpoint, pipeline, and batch traffic.]

• Only IBIS has a significant ratio of endpoint I/O.
Relative I/O Mix

[Chart: bandwidth in MB/s (0 to 8) per application, split into endpoint, pipeline, batch, and total.]

• Modest bandwidth requirements. The maximum is below 8 MB/s.
Observations about individual applications
› Modest buffer cache sizes are sufficient; the maximum is AMANDA, which needs 500 MB
› Large proportion of random access: IBIS and CMS are close to 100%, HF ~80%
› Amdahl and Gray balances are skewed: drastically overprovisioned in terms of I/O bandwidth and memory capacity
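Amdahl's balanced-system rule of thumb calls for roughly 1 Mbit/s of I/O and 1 MB of memory per MIPS of CPU. A minimal sketch of the balance check, with made-up input numbers (not figures from the measurements):

```python
# Check a workload against Amdahl's balance rules of thumb:
# ~1 Mbit/s of I/O and ~1 MB of memory per MIPS of CPU.
# The inputs below are hypothetical, not measured values.

def amdahl_balance(mips, io_mbit_per_s, mem_mb):
    """Return (io_ratio, mem_ratio). Ratios near 1.0 mean a balanced
    system; ratios far below 1.0 mean the workload demands much less
    I/O or memory than a balanced machine provides, i.e. the machine
    is overprovisioned for it."""
    return io_mbit_per_s / mips, mem_mb / mips

# Hypothetical grid node: 1000 MIPS CPU, but the job only demands
# 8 Mbit/s of I/O and 200 MB of memory.
io_ratio, mem_ratio = amdahl_balance(1000, 8.0, 200.0)
print(round(io_ratio, 3), round(mem_ratio, 3))   # 0.008 0.2
```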
Observations about workloads
› These applications are NOT run in isolation; they are submitted in batches of 100s to 1000s
› Large degree of I/O sharing, with significant scalability implications
Scalability of batch width

[Chart: achievable batch width for a storage center (1500 MB/s) versus a commodity disk (15 MB/s).]
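The shape of this limit follows from dividing server bandwidth by per-pipeline demand. A sketch using the two bandwidths labeled on the chart and, as an assumption, the worst-case per-application bandwidth of about 8 MB/s noted earlier:

```python
# Estimate how many pipelines a shared storage server can feed before
# its bandwidth saturates: batch width ~ server bandwidth divided by
# per-pipeline I/O demand. The 1500 and 15 MB/s figures are the
# chart's labels; 8 MB/s per pipeline is an assumed worst case.

def max_batch_width(server_mb_per_s, per_pipeline_mb_per_s):
    return server_mb_per_s // per_pipeline_mb_per_s

storage_center = max_batch_width(1500, 8)   # dedicated storage center
commodity_disk = max_batch_width(15, 8)     # single commodity disk
print(storage_center, commodity_disk)       # 187 1
```

Even a dedicated storage center saturates in the low hundreds of pipelines if every byte must come from the server, which is why eliminating batch and pipeline traffic matters on the following slides.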
Batch elimination

[Chart: storage center (1500 MB/s) versus commodity disk (15 MB/s).]
Pipeline elimination

[Chart: storage center (1500 MB/s) versus commodity disk (15 MB/s).]
Endpoint only

[Chart: storage center (1500 MB/s) versus commodity disk (15 MB/s).]
Conclusions
› Grid applications do not run in isolation
› Relationships between applications must be understood
› Scalability depends on semantic information: the relationships between applications and an understanding of the different types of I/O
Questions?
› For more information: Douglas Thain, John Bent, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, and Miron Livny, “Pipeline and Batch Sharing in Grid Workloads”, in Proceedings of High Performance Distributed Computing (HPDC-12).
  – http://www.cs.wisc.edu/condor/doc/profiling.pdf
  – http://www.cs.wisc.edu/condor/doc/profiling.ps