Trace Generation to Simulate Large Scale Distributed Applications
Olivier Dalle, Emilio P. Mancini
Mar. 8th, 2012

Outline
Introduction
The trace collection
The hierarchical architecture
The components
An example
Conclusion

Introduction
Most distributed systems, such as Grids, offer massively parallel but loosely coupled resources: an accurate application model can help scheduling decisions
Simulators of parallel and distributed applications need an accurate model of application behavior, but the size of the traces for long-running parallel applications tends to explode

Introduction
One solution is to buffer data locally and gather it after the end of the program (post-mortem), but this raises scalability issues
We need to minimize the perturbation: the instrumentation competes with the application for system resources.
Notes: scalability issue, see Wolf and Mohr, 2006. Talk about the autonomic aspect.

Introduction
A distributed application is composed of a set of cooperating tasks
The connections between them are in general not homogeneous
Networks may present some hierarchy, e.g. fat trees, multiple switch hops ...
Can we exploit that hierarchy for trace generation and instrumentation purposes?

The Trace Collection: a Simplified Schema
The classical computational cluster execution model: several tasks on several nodes (e.g., MPI)
[Figure: cluster nodes with CPUs, GPUs, and cores]
Notes: GPU, multithreaded, multiprocess

The Trace Collection: a Simplified Schema
We need to measure some parameters on each task, collect local data, and gather them.

The Trace Collection: a Simplified Schema
We gather the data hierarchically, using local collectors that may decimate or preprocess the data locally. We use the locality principle.
In a Grid, it is common to have a low-quality link between the V.O. sites
In HPC, the bandwidth of the upper levels is shared among more hosts than that of the lower levels

The Trace Collection
1. Infrastructure initialization
2. Execution with instrumentation
   Environment update (e.g., LD_PRELOAD)
   Middleware launcher (e.g., mpiexec, qsub)
3. Data collection
   Overhead estimation
   Events measurement
4. Processing and Propagation
   Decimation
   Compression
   Buffering
5. Trace generation
   Data collection
   Post-processing
   Simulators trace generation

The architecture
Every subtree can join or leave at will

The sensors
The sensors:
Instrument the application tasks
Compute the instrumentation overhead
Collect the raw data
Send it to the first-level collectors

The sensors
We assume the system to be heterogeneous
Every sensor makes an overhead analysis
Then it propagates the information to the management unit

The collectors
The collectors gather data from sensors and from other collectors
Buffer incoming data
Process collected data before sending it to upper levels
   Decimation
   Compression

The Management Unit
Launches the collector daemons
Launches the application
Gathers the data from the top collector
Converts and stores the data in the required format
Is managed with scripts or a graphical interface

An Example of Data Collection
We are interested in analyzing the I/O of a parallel synthetic benchmark
We want to check the overhead
The benchmark is an MPI application of n tasks
Every task runs on a different node and writes random data on the local file system
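
To make the overhead check concrete, here is a minimal sketch of what a single benchmark task could look like. It is not the authors' benchmark: it omits MPI entirely, and the function names (`write_random`, `relative_overhead`) and the per-write event record are assumptions made for illustration. The idea is to run the same write loop with and without event recording and compare the elapsed times.

```python
import os
import tempfile
import time


def write_random(path, n_blocks, block_size, events=None):
    """Write n_blocks of random bytes to path; return elapsed seconds.

    If `events` is a list, one (timestamp, bytes_written) tuple is appended
    per write, standing in for what a sensor would record (hypothetical).
    """
    payload = os.urandom(block_size)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            written = f.write(payload)
            if events is not None:
                events.append((time.perf_counter(), written))
        f.flush()
        os.fsync(f.fileno())  # force the data to the local file system
    return time.perf_counter() - start


def relative_overhead(plain_s, instrumented_s):
    """Relative slowdown of the instrumented run versus the plain run."""
    return (instrumented_s - plain_s) / plain_s


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain = write_random(os.path.join(tmp, "plain.bin"), 256, 4096)
        events = []
        traced = write_random(os.path.join(tmp, "traced.bin"), 256, 4096, events)
        print(f"events recorded: {len(events)}")
        print(f"relative overhead: {relative_overhead(plain, traced):+.1%}")
```

A single pair of runs is noisy, so in practice the measurement would be repeated several times before drawing conclusions about the overhead.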

An Example of Data Collection
We use the management unit to:
Create a hierarchical schema
Create the MPI launch scripts
Launch the collectors and the instrumented application
Collect the results
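
The first step in the list above, creating a hierarchical schema, could be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how hosts are grouped, so the fixed `fan_out`, the `build_hierarchy` function, and the `collector-L<level>-<index>` naming are all hypothetical. A real schema would follow the actual network topology rather than a uniform fan-out.

```python
def build_hierarchy(hosts, fan_out):
    """Group hosts under collectors, level by level, up to a single root.

    Returns (root, tree), where tree maps each collector name to the list of
    its children: host names at the first level, collector names above it.
    """
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    if not hosts:
        raise ValueError("at least one host is required")
    tree = {}
    level, current = 1, list(hosts)
    while True:
        parents = []
        for i in range(0, len(current), fan_out):
            name = f"collector-L{level}-{i // fan_out}"  # hypothetical naming
            tree[name] = current[i:i + fan_out]
            parents.append(name)
        if len(parents) == 1:
            return parents[0], tree
        level, current = level + 1, parents


if __name__ == "__main__":
    root, tree = build_hierarchy([f"node{i}" for i in range(8)], fan_out=4)
    print("root:", root)
    for collector, children in tree.items():
        print(collector, "<-", children)
```

With eight hosts and a fan-out of four, this yields two first-level collectors and one top collector, which is the node the management unit would read from.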
    $MPIDIR/mpiexec.hydra \
        -env LD_PRELOAD $DTDIR/libdt_sensor.so \
        $HOME/bench/bench
[Figure labels: mpiexec, qsub]
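
A launch script generator for the command above might look like the sketch below. It only builds and prints the command line and does not run anything. The function name `build_launch_command`, the example paths, and the added `-n` task count are assumptions for illustration; only `mpiexec.hydra`, `-env LD_PRELOAD`, and `libdt_sensor.so` come from the slide.

```python
import shlex


def build_launch_command(mpi_dir, dt_dir, benchmark, n_tasks):
    """Return the argv list that launches the benchmark with the sensor preloaded."""
    return [
        f"{mpi_dir}/mpiexec.hydra",
        "-n", str(n_tasks),  # number of MPI tasks (not shown on the slide)
        "-env", "LD_PRELOAD", f"{dt_dir}/libdt_sensor.so",
        benchmark,
    ]


if __name__ == "__main__":
    # Hypothetical installation paths, used only to show the resulting command.
    argv = build_launch_command("/opt/mpich/bin", "/opt/dt/lib",
                                "/home/user/bench/bench", 4)
    print(shlex.join(argv))
```

Keeping the command as an argument list rather than a single string avoids quoting problems if it is later passed to a process launcher.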

Conclusion
Collecting large traces in distributed systems may perturb the application's execution.
We presented a system that efficiently collects traces at run-time or post-mortem.
We use a hierarchical schema that matches the capacity of the network links, with distributed buffering and processing.
Future improvements will include automatic discovery of the network topology.

Thank you