![Page 1: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/1.jpg)
On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data
Janine Bennett¹
William McLendon III¹
Gaurav Bansal²
Peer-Timo Bremer³
Jacqueline Chen¹
Hemanth Kolla¹
¹Sandia National Laboratories, ²Intel, ³Lawrence Livermore National Laboratory
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Approved for Unlimited Unclassified Release, SAND # 2012-9242 C
![Page 2: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/2.jpg)
HPC resources generate large, complex, multivariate data sets
Details: Lifted Ethylene Jet
  – 1.3 billion grid points
  – 22 chemical species, vector, & particle data
  – 7.5 million CPU hours on 30,000 processors
  – 112,500 time steps (data stored every 375th)
  – 240 TB of raw field data + 50 TB particle data
Recent data sets generated by S3D, developed at the Combustion Research Facility, Sandia National Laboratories
Efficiently characterizing & tracking intermittent features defined by multiple variables poses significant research challenges!
![Page 3: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/3.jpg)
Our contribution: a framework for characterizing complex events in large-scale multivariate data
• Introduce attributed relational graphs (ARGs) as an efficient encoding scheme for relationships between spatial features
  – Defined by multiple variables
  – Spanning an arbitrary number of time steps
  – Representation achieves drastic data reductions
• Provide a mechanism for querying ARGs
  – Identify events conditioned on a variety of metrics
• Demonstrate results on large-scale combustion simulation data
![Page 4: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/4.jpg)
Related work
Topology: Segment domain into features according to function behavior
  – Level-set behavior: Reeb graph, contour tree, and variants [Carr et al. 2003, Pascucci et al. 2007, Mascarenhas et al. 2006, van Kreveld et al. 2004]
  – Gradient behavior: Morse and Morse-Smale complex [Edelsbrunner 2003, Gyulassy et al. 2007, 2008, Gunther et al. 2011]
Multivariate feature analysis: Many correlation-based feature definitions [Gosink et al. 2007, Chen et al. 2011, Jaenicke et al. 2007, Sauber et al. 2006, Schneider et al. 2008, Bennett et al. 2011]
Feature tracking graphs: Capture spatial-temporal relationships [Edelsbrunner et al. 2004, Bremer et al. 2010, Muelder et al. 2009, Widanagamaachchi et al. 2012]
Graph search algorithms: Identify patterns in large-scale graphs [Barret et al. 2007, Berry et al. 2007, Gregor et al. 2005, Siek et al. 2002]
MTGL
![Page 5: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/5.jpg)
What is an attributed relational graph (ARG)?
• ARG nodes correspond to spatial features
  – Each ARG node encodes
    • Feature type
    • Time step
    • Optional per-feature statistics
• ARG edges encode relationships between features
  – Spatial overlap metric
  – Supports feature tracking over time
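As a rough illustration of the node and edge attributes listed above, an ARG could be held in a structure like the following. This is a minimal sketch, not the authors' implementation: the class names `ArgNode` and `Arg`, the field names, and the feature-type strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ArgNode:
    """One spatial feature (hypothetical layout)."""
    node_id: int
    feature_type: str                           # e.g. which variable defines the feature
    time_step: int
    stats: dict = field(default_factory=dict)   # optional per-feature statistics


@dataclass
class Arg:
    """Attributed relational graph: typed nodes plus overlap-weighted edges."""
    nodes: dict = field(default_factory=dict)   # node_id -> ArgNode
    edges: dict = field(default_factory=dict)   # (low_id, high_id) -> overlap value

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, a, b, overlap):
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must already be ARG nodes")
        # Store each undirected edge once, keyed by the ordered id pair.
        self.edges[(min(a, b), max(a, b))] = overlap

    def neighbors(self, node_id):
        """Yield (neighbor_id, overlap) for every edge touching node_id."""
        for (a, b), overlap in self.edges.items():
            if a == node_id:
                yield b, overlap
            elif b == node_id:
                yield a, overlap


arg = Arg()
arg.add_node(ArgNode(0, "reaction_rate_OH", time_step=1, stats={"size": 40}))
arg.add_node(ArgNode(1, "diffusion_OH", time_step=1))
arg.add_node(ArgNode(2, "reaction_rate_OH", time_step=2))
arg.add_edge(0, 1, overlap=25)   # two feature types overlapping at the same time step
arg.add_edge(0, 2, overlap=11)   # the same feature type tracked across time steps
```

Storing only feature type, time step, summary statistics, and overlap values is what keeps such a graph small relative to the raw field data.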
![Page 6: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/6.jpg)
ARG Nodes: Segment domain into relevant features
• Many options for segmenting the domain into features
• Often features of interest are defined by a threshold around minima or maxima of a particular variable
[Figure: function f over the x-y domain]
![Page 7: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/7.jpg)
ARG Nodes: Merge trees encode features of interest defined by a single variable for a range of thresholds
[Figure: function f over the x-y domain]
The tree encodes how the features behave as the function values are swept from the maximum to the minimum of the range of interest
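The sweep can be sketched for a one-dimensional sampled function with a union-find structure: samples are visited from highest to lowest value, a sample with no already-visited neighbor starts a new branch at a maximum, and a sample that joins two branches records a merge. This is a simplified sketch under those assumptions; the function name `merge_tree_1d` and the output layout are hypothetical, and the slides do not describe the authors' actual construction.

```python
def merge_tree_1d(values):
    """Sweep a 1D sampled function from its maximum to its minimum.

    Returns (maxima, merges):
      maxima - indices where a new branch is born
      merges - (saddle_index, surviving_maximum, absorbed_maximum) tuples
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    parent = {}  # union-find forest; each root is the maximum of its branch

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    maxima, merges = [], []
    for i in order:
        parent[i] = i
        # Branches of the neighbors that the sweep has already visited.
        roots = {find(j) for j in (i - 1, i + 1) if j in parent and j != i}
        if not roots:
            maxima.append(i)  # no higher neighbor: a new branch starts here
            continue
        # The branch with the highest maximum survives; the rest merge into it.
        ranked = sorted(roots, key=lambda r: -values[r])
        survivor = ranked[0]
        parent[i] = survivor
        for absorbed in ranked[1:]:
            parent[absorbed] = survivor
            merges.append((i, survivor, absorbed))
    return maxima, merges


samples = [1, 5, 2, 4, 0, 3]
maxima, merges = merge_tree_1d(samples)
print(maxima)  # [1, 3, 5]
print(merges)  # [(2, 1, 3), (4, 1, 5)]
```

In the example, branches are born at the three local maxima, and the branches rooted at indices 3 and 5 merge into the branch of the global maximum at the saddles at indices 2 and 4.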
![Page 13: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/13.jpg)
ARG Nodes: Refine the tree to increase granularity of possible segmentations
[Figure: function f over the x-y domain]
![Page 14: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/14.jpg)
ARG Nodes: Features are defined as all sub-trees above a user-specified threshold
[Figure: two panels showing function f over the x-y domain]
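For a single fixed threshold, the sub-trees above the threshold correspond to the connected components of the region where the function is at or above that threshold. The sketch below labels those components directly on a small 2D grid with a breadth-first flood fill. It illustrates the idea only: the function name `label_features`, the 4-neighbor connectivity, and the sample grid are assumptions, and the merge tree exists precisely so that the scan need not be repeated for every threshold.

```python
from collections import deque


def label_features(grid, threshold):
    """Label 4-connected regions where grid >= threshold.

    Returns (labels, count); label 0 means "not part of any feature".
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold or labels[r][c]:
                continue
            count += 1                     # unvisited cell above threshold: new feature
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:                   # flood-fill the rest of this feature
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] >= threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


grid = [
    [9, 8, 1, 1],
    [7, 2, 1, 6],
    [1, 1, 1, 7],
]
labels, count = label_features(grid, threshold=5)
print(count)   # 2
print(labels)  # [[1, 1, 0, 0], [1, 0, 0, 2], [0, 0, 0, 2]]
```

Lowering the threshold to 1 in this example joins everything into a single feature, which is the kind of change the tree records as a merge.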
![Page 15: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/15.jpg)
ARG Edges: An overlap-based metric is used to encode feature behavior over time
[Figure: function f over the x-y domain at time steps t = 1, 2, 3, 4]
![Page 16: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/16.jpg)
ARG Edges: The same metric is used to encode relationships between different types of features
[Figure: features at time steps t = 1, 2, 3, 4]
![Page 17: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/17.jpg)
ARG Edges: Relationships can span multiple time steps
[Figure: features at time steps t = 1, 2, 3, 4]
![Page 18: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/18.jpg)
ARG Edges: Edge labels indicate degree of overlap between associated features
[Figure: example edges labeled with overlap values 25 and 11]
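One simple way to obtain such edge labels is to count the grid cells shared by each pair of features in two labeled segmentations, which may come from consecutive time steps or from two different feature types. The slides do not give the exact overlap metric, so the shared-cell count used here is an assumption, as are the function name `overlap_edges` and the sample label grids.

```python
from collections import Counter


def overlap_edges(labels_a, labels_b):
    """Count shared cells for every pair of overlapping features.

    labels_a, labels_b: same-shaped grids of integer labels (0 = background).
    Returns {(label_in_a, label_in_b): number_of_shared_cells}.
    """
    counts = Counter()
    for row_a, row_b in zip(labels_a, labels_b):
        for a, b in zip(row_a, row_b):
            if a and b:            # the cell belongs to a feature in both grids
                counts[(a, b)] += 1
    return dict(counts)


# Two features at one time step ...
step_1 = [
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 0, 2],
]
# ... that have merged into one feature at the next.
step_2 = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
edges = overlap_edges(step_1, step_2)
print(edges)  # {(1, 1): 2, (2, 1): 2}
```

Each key would become one ARG edge and each value its label, so a merge shows up as two edges ending at the same later feature.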
![Page 19: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/19.jpg)
Once the ARG is constructed, we can search for patterns of interest
[Figure: example patterns: co-occurrence, time-lag features, multi-way co-occurrence]
![Page 20: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/20.jpg)
Searches are performed using a two-phase subgraph isomorphism heuristic: filtering & matching
• MTGL: Multi-Threaded Graph Library
  – Open source software
  – https://software.sandia.gov/trac/mtgl
• Given ARG and template
  – Filter: Remove all edges in ARG that cannot belong to template
  – Match: Find all possible template matches in filtered ARG
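The two phases can be illustrated with a small, single-threaded sketch. It does not use MTGL and is not MTGL's algorithm or API: the filter keeps only ARG edges whose endpoint feature types appear together on some template edge, and the match step is a plain backtracking search for (non-induced) subgraph matches. All names and the sample data are hypothetical.

```python
def filter_edges(arg_types, arg_edges, tmpl_types, tmpl_edges):
    """Phase 1: drop ARG edges whose endpoint types match no template edge."""
    allowed = {frozenset((tmpl_types[u], tmpl_types[v])) for u, v in tmpl_edges}
    return [(a, b) for a, b in arg_edges
            if frozenset((arg_types[a], arg_types[b])) in allowed]


def match(arg_types, arg_edges, tmpl_types, tmpl_edges):
    """Phase 2: backtracking search for template matches in the filtered ARG."""
    kept = filter_edges(arg_types, arg_edges, tmpl_types, tmpl_edges)
    adjacent = {frozenset(edge) for edge in kept}
    tmpl_nodes = list(tmpl_types)
    results = []

    def extend(assignment):
        if len(assignment) == len(tmpl_nodes):
            results.append(dict(assignment))
            return
        t = tmpl_nodes[len(assignment)]          # next template node to place
        for n, kind in arg_types.items():
            if kind != tmpl_types[t] or n in assignment.values():
                continue
            # Every template edge from t to an already placed node
            # must be backed by a surviving ARG edge.
            placed = [u if v == t else v for u, v in tmpl_edges if t in (u, v)]
            if all(frozenset((n, assignment[o])) in adjacent
                   for o in placed if o in assignment):
                assignment[t] = n
                extend(assignment)
                del assignment[t]

    extend({})
    return results


# Hypothetical ARG: node id -> feature type, plus undirected edges.
arg_types = {0: "reaction", 1: "diffusion", 2: "reaction", 3: "temperature"}
arg_edges = [(0, 1), (0, 2), (2, 3)]
# Hypothetical template: a reaction feature overlapping a diffusion feature.
tmpl_types = {"a": "reaction", "b": "diffusion"}
tmpl_edges = [("a", "b")]

print(filter_edges(arg_types, arg_edges, tmpl_types, tmpl_edges))  # [(0, 1)]
print(match(arg_types, arg_edges, tmpl_types, tmpl_edges))         # [{'a': 0, 'b': 1}]
```

Filtering first shrinks the graph the expensive matching step has to explore. A template containing two nodes of the same type would be reported once per symmetric assignment by this sketch, so a real tool would also need to deduplicate.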
![Page 21: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/21.jpg)
[Figure: the ARG, the template pattern, and the template walk]
![Page 31: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/31.jpg)
Case study: identification of deflagration fronts in HCCI combustion data
• Turbulent auto-ignitive mixture of dimethyl ether under homogeneous charge compression ignition (HCCI) conditions
• Deflagration fronts: spatially collocated extrema of chemical reaction rates and diffusive fluxes
[Figure panels: reaction rate of OH; diffusion of OH]
![Page 32: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/32.jpg)
Case study: ARG representation encodes complex relationships very compactly
• Raw output data size: 78.2 GB (grid size = 560 x 560 x 560)
  – 703 MB/variable * 6 variables for 19 time steps
• Meta-data: computed in parallel on ORNL’s Lens system
  – 3 feature families:
    • Each encoding size, minimum, maximum, mean, and variance of 6 different variables
    • Data-dependent costs: O(minutes) per time step
  – Structure geometries only needed for ARG construction (not queries)
  – Size of ARG: 504 KB
• Under 1 GB required for fully flexible exploration and search on commodity hardware
  – O(seconds) for searches

| Feature family | Structure geometries | Hierarchy & statistics |
|---|---|---|
| temperature | 4.0 GB | 319 MB |
| diffusion OH | 3.6 GB | 11 MB |
| reaction rate OH | 4.3 GB | 534 MB |
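The sizes quoted on this slide can be cross-checked with a few lines of arithmetic. Two assumptions are made here that the slide does not state: that 1 GB is taken as 1024 MB (which reproduces the 78.2 GB figure), and that the "under 1 GB" working set is the sum of the three hierarchy & statistics files plus the ARG itself.

```python
MB_PER_VARIABLE = 703
VARIABLES = 6
TIME_STEPS = 19

raw_mb = MB_PER_VARIABLE * VARIABLES * TIME_STEPS   # raw field data in MB
raw_gb = raw_mb / 1024                              # assumes 1 GB = 1024 MB

hierarchy_mb = 319 + 11 + 534    # temperature + diffusion OH + reaction rate OH
arg_mb = 504 / 1024              # ARG size, converted from KB to MB
working_set_mb = hierarchy_mb + arg_mb

print(raw_mb)                    # 80142
print(f"{raw_gb:.2f} GB raw")    # 78.26 GB raw
print(f"{working_set_mb:.1f} MB needed for exploration and search")
print(f"about {raw_mb / working_set_mb:.0f}x smaller than the raw data")
```

Under these assumptions the working set is roughly 864 MB, consistent with the "under 1 GB" claim, and the ARG alone is a tiny fraction of that.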
![Page 33: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/33.jpg)
Case study: Searching the ARG

| template | 1st | 2nd | 3rd |
|---|---|---|---|
| connected components | 14 | 6 | 2 |
| nodes | 487 | 310 | 352 |
| edges | 909 | 620 | 1005 |

A subset of the deflagration fronts identified
A subset of the full ARG (full size is 6563 nodes and 8903 edges)
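Counts of this kind (connected components, nodes, edges) can be computed from the set of edges that take part in any match. The sketch below assumes the per-template figures summarize the union of all matches for that template, which the slides do not state explicitly; the function name `summarize` and the sample edges are hypothetical.

```python
def summarize(edges):
    """Return component, node, and edge counts for an undirected edge list."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)          # union the two endpoints

    nodes = set(parent)
    return {
        "connected components": len({find(n) for n in nodes}),
        "nodes": len(nodes),
        "edges": len({frozenset(edge) for edge in edges}),  # ignore direction and repeats
    }


matched_edges = [(0, 1), (1, 2), (3, 4), (4, 3)]
print(summarize(matched_edges))
# {'connected components': 2, 'nodes': 5, 'edges': 3}
```

Each connected component would then correspond to one group of related matches.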
![Page 34: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/34.jpg)
Conclusion & future work
• Introduced attributed relational graphs (ARGs) as an efficient encoding scheme for relationships between spatial features
• Provided a mechanism for querying ARGs
• Demonstrated results on large-scale combustion simulation data
• Some domain knowledge required to construct ARG
  – Which variables define features of interest
  – Range of potential time-lags between features
• Opportunities for future work
  – GUI tool for specifying search template patterns
    • Leveraging per-feature statistics in queries
  – Linked views of ARG, search results, domain visualization
  – Dynamic ARGs
    • Don’t require feature thresholds to be specified in advance
    • Instead these are runtime parameters to be explored
![Page 35: On the use of Graph Search Techniques for the Analysis of Extreme-scale Combustion Simulation Data](https://reader036.vdocument.in/reader036/viewer/2022062411/56813e2a550346895da80a8b/html5/thumbnails/35.jpg)
Questions?
Janine Bennett