
A Framework for Particle Advection for Very Large Data

Hank Childs, LBNL/UCDavis
David Pugmire, ORNL
Christoph Garth, Kaiserslautern
David Camp, LBNL/UCDavis
Sean Ahern, ORNL
Gunther Weber, LBNL
Allen Sanderson, Univ. of Utah

Particle advection is a foundational visualization algorithm

• Advecting particles creates integral curves
• Particle advection is the "duct tape" of the visualization world
• Advecting particles is essential to understanding flow and other phenomena (e.g. magnetic fields)

Outline

A general system for particle-advection based analysis

Efficient advection of particles

Outline

A general system for particle-advection based analysis

Efficient advection of particles

Goal

Efficient code for a variety of particle advection workloads and techniques:
• Cognizant of use cases from 1 particle to >>10K particles
• Need handling of every particle, every evaluation to be efficient
• Want to support many diverse flow techniques: flexibility/extensibility is key
• Fit within a data flow network design (i.e. as a filter)

Design

PICS filter: parallel integral curve system

Execution:
• Instantiate particles at seed locations
• Step particles to form integral curves
• Analysis performed at each step
• Termination criteria evaluated for each step
• When all integral curves have completed, create final output
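The execution sequence above can be sketched as a small driver. This is a minimal 1-D stand-in, not VisIt's actual avt* API: the function name, the constant-velocity "field," and the step-count termination criterion are all illustrative assumptions.

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch of the PICS execution sequence for one particle:
// instantiate at the seed, step until the termination criterion fires,
// and record ("analyze") the position at every step.
std::vector<double> traceCurve(double seed, int maxSteps) {
    std::vector<double> curve{seed};      // instantiate particle at seed location
    double pos = seed;
    for (int step = 0; step < maxSteps; ++step) {
        pos += 0.1;                       // advect (stand-in for the IVP solver)
        curve.push_back(pos);             // analysis performed at each step
    }                                     // termination: step count reached
    return curve;                         // final output: the integral curve
}
```

A real run does this for every seed, with the analysis and termination steps supplied by the derived integral-curve type.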

Design

Five major types of extensibility:
1. How to parallelize?
2. How do you evaluate the velocity field?
3. How do you advect particles?
4. Initial particle locations?
5. How do you analyze the particle paths?

Inheritance hierarchy

avtPICSFilter
• StreamlineFilter
• Your derived type of PICS filter

avtIntegralCurve
• avtStreamlineIC
• Your derived type of integral curve

We disliked the "matching inheritance" scheme, but this achieved all of our design goals cleanly.
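The "matching inheritance" scheme might look like the sketch below: each derived filter instantiates its matching derived curve type through a factory method. The class shapes mirror the slide's hierarchy, but the method names and members are simplified stand-ins, not VisIt's real declarations.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Simplified stand-ins for the two parallel hierarchies.
struct avtIntegralCurve {
    virtual ~avtIntegralCurve() = default;
    virtual const char* Type() const = 0;
};

struct avtStreamlineIC : avtIntegralCurve {
    const char* Type() const override { return "streamline"; }
};

struct avtPICSFilter {
    virtual ~avtPICSFilter() = default;
    // Each derived filter creates its matching integral-curve type.
    virtual std::unique_ptr<avtIntegralCurve> CreateIntegralCurve() = 0;
};

struct StreamlineFilter : avtPICSFilter {
    std::unique_ptr<avtIntegralCurve> CreateIntegralCurve() override {
        return std::make_unique<avtStreamlineIC>();
    }
};
```

The cost of the scheme is that every new analysis requires deriving both classes; the benefit is that the base filter can drive any curve type through one interface.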

#1: How to parallelize?

avtICAlgorithm
• avtParDomICAlgorithm (parallel over data)
• avtSerialICAlgorithm (parallel over seeds)
• avtMasterSlaveICAlgorithm

#2: Evaluating the velocity field

avtIVPField
• avtIVPVTKField
• avtIVPVTKTimeVaryingField
• avtIVPM3DC1Field
• Your derived higher-order field type

IVP = initial value problem

#3: How do you advect particles?

avtIVPSolver
• avtIVPDopri5
• avtIVPEuler
• avtIVPLeapfrog
• avtIVPAdamsBashforth
• avtIVPM3DC1Integrator

IVP = initial value problem
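What a solver in this hierarchy implements is one step of the initial value problem dx/dt = v(x). A forward-Euler step (the simplest of the listed schemes) can be sketched as below; the `Field` type and step signature are simplified assumptions, not the avtIVPSolver interface.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// A 1-D velocity field evaluation, standing in for avtIVPField.
using Field = std::function<double(double)>;

// One forward-Euler step of the IVP dx/dt = v(x): advance x by h * v(x).
// Higher-order schemes (Dopri5, Adams-Bashforth, ...) differ only in how
// many field evaluations they combine per step.
double eulerStep(const Field& v, double x, double h) {
    return x + h * v(x);
}
```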

#4: Initial particle locations

avtPICSFilter::GetInitialLocations() = 0;

#5: How do you analyze the particle path?

avtIntegralCurve::AnalyzeStep() = 0;
• All AnalyzeStep implementations will evaluate termination criteria

avtPICSFilter::CreateIntegralCurveOutput(std::vector<avtIntegralCurve*> &) = 0;

Examples:
• Streamline: store location and scalars for current step in data members
• Poincaré: store location for current step in data members
• FTLE: only store location of final step, no-op for preceding steps

NOTE: these derived types create very different types of outputs.
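The contrast between the streamline and FTLE examples can be sketched as two per-step hooks; the class and member names below are illustrative stand-ins for the derived avtIntegralCurve types, not their real declarations.

```cpp
#include <cassert>
#include <vector>

// A streamline-style curve records every position it visits.
struct StreamlineIC {
    std::vector<double> path;
    void AnalyzeStep(double pos) { path.push_back(pos); }
};

// An FTLE-style curve only needs the final position; each call simply
// overwrites the previous one, so intermediate steps cost nothing.
struct FTLEIC {
    double finalPos = 0.0;
    void AnalyzeStep(double pos) { finalPos = pos; }
};
```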

Putting it all together

The PICS filter owns:
• avtICAlgorithm
• avtIVPSolver
• avtIVPField
• Vector<avtIntegralCurve>

Integral curves are sent to other processors with some derived types of avtICAlgorithm.

::CreateInitialLocations() = 0;
::AnalyzeStep() = 0;

http://www.visitusers.org/index.php?title=Pics_dev

Outline

A general system for particle-advection based analysis

Efficient advection of particles

Advecting particles

Decomposition of large data set into blocks on filesystem.

What is the right strategy for getting particle and data together?

Strategy: load blocks necessary for advection ("Parallelize over Particles")

Basic idea: particles are partitioned over PEs; blocks of data are read from the filesystem as needed.

Positives:
• Indifferent to data size
• Trivial parallelization (partition particles over processors)

Negative:
• Redundant I/O (both across PEs and within a PE) is a significant problem

Strategy: parallelize over blocks and pass particles ("Parallelize over Data")

Basic idea: data is partitioned over PEs; particles are communicated as needed.

Positive:
• Only load data one time

Negative:
• Starvation!

Contracts are needed to enable these different processing techniques:
• Parallelize-over-seeds: one execution per block; only limited data available at one time
• Parallelize-over-data: one execution total; entire data set available at one time

Both parallelization schemes have serious flaws.

Parallelizing over:   I/O    Efficiency
Data                  Good   Bad
Particles             Bad    Good

Hybrid algorithms sit between parallelize-over-particles and parallelize-over-data.

The master-slave algorithm is an example of a hybrid technique:
• Uses a "master-slave" model of communication: one process has unidirectional control over other processes
• Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles
• Nice property for production visualization tools (implemented in VisIt)

D. Pugmire, H. Childs, C. Garth, S. Ahern, G. Weber, "Scalable Computation of Streamlines on Very Large Datasets." SC09, Portland, OR, November 2009.

Master-Slave Hybrid Algorithm

• Divide PEs into groups of N
• Uniformly distribute seed points to each group

Master:
• Monitor workload
• Make decisions to optimize resource utilization

Slaves:
• Respond to commands from Master
• Report status when work is complete

[Diagram: 16 PEs (P0–P15) divided into four groups, each with one Master and three Slaves.]

Master Process Pseudocode

Master()
{
    while (!done)
    {
        if (NewStatusFromAnySlave())
        {
            commands = DetermineMostEfficientCommand()
            for cmd in commands
                SendCommandToSlaves(cmd)
        }
    }
}
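A runnable expansion of the master loop pseudocode above might look like this. The `Status` struct, the command strings, and the queue standing in for slave messages are all illustrative assumptions; the real decision logic is the subject of the heuristics later in the talk.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// A status message reported by a slave (hypothetical fields).
struct Status { int slave; int block; };

// Placeholder decision logic; the real algorithm weighs I/O cost
// against idleness when choosing commands.
std::vector<std::string> DetermineMostEfficientCommand(const Status& s) {
    return { "assign slave " + std::to_string(s.slave) };
}

// The master loop: drain slave status messages, decide, and "send"
// commands (collected in a vector here instead of over the network).
std::vector<std::string> RunMaster(std::queue<Status>& statuses) {
    std::vector<std::string> sent;
    while (!statuses.empty()) {          // "done" when no messages remain
        Status s = statuses.front();
        statuses.pop();
        for (const auto& cmd : DetermineMostEfficientCommand(s))
            sent.push_back(cmd);
    }
    return sent;
}
```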

What are the possible commands?

Commands that can be issued by master (OOB = out of bounds):

1. Assign / Loaded Block: the slave is given a particle that is contained in a block that is already loaded.
2. Assign / Unloaded Block: the slave is given a particle and loads the block.
3. Handle OOB / Load: the slave is instructed to load a block; particle advection in that block can then proceed.
4. Handle OOB / Send: the slave is instructed to send the particle to another slave that has loaded the block.


Driving factors…

• You don't want PEs to sit idle (the problem with parallelize-over-data)
• You don't want PEs to spend all their time doing I/O (the problem with parallelize-over-particles)

You want to do the "least amount" of I/O that will prevent "excessive" idleness.

Heuristics

• If no Slave has the block, then load that block!
• If some Slave does have the block:
  – If that Slave is not busy, then have it process the particle.
  – If that Slave is busy, then wait "a while."
  – If you've waited a "long time," have another Slave load the block and process the particle.
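These heuristics reduce to a small decision function. The command names echo the four master commands above (plus an explicit "wait" outcome); the `WorkerInfo` fields and the wait-time threshold are illustrative assumptions, not the real algorithm's state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Command { AssignLoaded, AssignUnloaded, HandleOOBLoad, HandleOOBSend, Wait };

// What the master knows about each slave (hypothetical fields).
struct WorkerInfo { bool hasBlock; bool busy; };

// Decide what to do with a particle that needs a given block, following
// the slide's heuristics. waited/longTime model "wait a while" vs.
// "waited a long time".
Command Decide(const std::vector<WorkerInfo>& slaves, int waited, int longTime) {
    int holder = -1;
    for (std::size_t i = 0; i < slaves.size(); ++i)
        if (slaves[i].hasBlock) { holder = static_cast<int>(i); break; }

    if (holder < 0)
        return Command::HandleOOBLoad;     // no Slave has the block: load it
    if (!slaves[holder].busy)
        return Command::HandleOOBSend;     // holder is idle: send it the particle
    if (waited >= longTime)
        return Command::HandleOOBLoad;     // waited too long: another Slave loads
    return Command::Wait;                  // holder is busy: wait "a while"
}
```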

Master-slave in action

[Diagram: five slaves S0–S4 reading blocks and passing particles.]

Iteration | Action
0         | S0 reads B0; S3 reads B1
1         | S1 passes points to S0; S4 passes points to S3; S2 reads B0

Algorithm Test Cases

• Core collapse supernova simulation
• Magnetic confinement fusion simulation
• Hydraulic flow simulation

Workload distribution in the supernova simulation, colored by PE doing integration:
• Parallelize-over-data: starvation
• Parallelize-over-particles: too much I/O
• Master-slave algorithm: just right

Astrophysics Test Case: Total time to compute 20,000 Streamlines

[Figure: seconds vs. number of procs, for uniform and non-uniform seeding; curves for Data, Hybrid, and Particles.]

Astrophysics Test Case: Number of blocks loaded

[Figure: blocks loaded vs. number of procs, for uniform and non-uniform seeding; curves for Data, Hybrid, and Particles.]

Summary: Master-Slave Algorithm

• First ever attempt at a hybrid algorithm for particle advection
• Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles; nice property for production visualization tools
• Implemented inside the VisIt visualization and analysis package

Final thoughts…

Summary:
• Particle advection is important for understanding flow, and efficiently parallelizing this computation is difficult.
• We have developed a freely available system for doing this analysis for large data.

Documentation:
• PICS: http://www.visitusers.org/index.php?title=Pics_dev
• VisIt: http://www.llnl.gov/visit

Future work:
• UI extensions, including Python
• Additional analysis techniques (FTLE & more)

Acknowledgements

Funding: This work was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 through the Scientific Discovery through Advanced Computing (SciDAC) program's Visualization and Analytics Center for Enabling Technologies (VACET).

Program Manager: Lucy Nowell

Master-Slave Algorithm: Dave Pugmire (ORNL), Hank Childs (LBNL/UCD), Christoph Garth (Kaiserslautern), Sean Ahern (ORNL), and Gunther Weber (LBNL)

PICS framework: Hank Childs (LBNL/UCD), Dave Pugmire (ORNL), Christoph Garth (Kaiserslautern), David Camp (LBNL/UCD), Allen Sanderson (Univ of Utah)

A Framework for Particle Advection for Very Large Data

Hank Childs, LBNL/UCDavis; David Pugmire, ORNL; Christoph Garth, Kaiserslautern; David Camp, LBNL/UCDavis; Sean Ahern, ORNL; Gunther Weber, LBNL; Allen Sanderson, Univ. of Utah

Thank you!!
