JavaSeis Parallel Arrays
• JavaSeis data structures
• Synchronous parallel model
• Arrays in Java
• mpiJava
• Parallel Distributed Arrays
• Examples
JavaSeis Data Structures
• View data as N-dimensional array
• Most common view is a series of 3D volumes
• Use generic names for each axis: Sample, Trace, Frame, Volume, Hypercube, …
• Associate “LogicalNames” with each axis:
– Time, Offset, X-Line, InLine
– Time, Channel, Shot, Swath
JavaSeis Dataset Logical View
[Figure: axis labels Sample, Trace, Frame, Volume; JavaSeis Bounding Box with X-Line and InLine axes]
Parallel Distributed Objects
[Figure: Node 0 through Node 3, each running pdoExec, pdoRead, and pdoWrite]
The Transpose Problem
[Figure: CPU 0 through CPU 3; labels: Local Access, Remote Access]
Data Parallel Transpose
[Figure: CPU 0 through CPU 3; label: Local Tile Transpose]
Arrays in Java
• 1D arrays or “arrays of arrays”
• Subarrays and multi-dimensional “views” of a 1D array are not supported by the language
• Subarrays are constructed by passing the full array and an upper and lower bound
• Example reference from Matuszek, University of Pennsylvania
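The bounds-passing convention above can be sketched as follows. The class and method names are hypothetical, chosen for illustration only, and the half-open range (lower bound inclusive, upper bound exclusive) is an assumption borrowed from java.util.Arrays rather than something the slides specify.

```java
// Hypothetical helper for illustration; not part of JavaSeis.
final class SubarraySketch {

    /**
     * Sums the "subarray" a[lo..hi) without copying it.
     * The subarray is expressed as the full array plus two bounds.
     */
    static float sum(float[] a, int lo, int hi) {
        if (lo < 0 || hi > a.length || lo > hi) {
            throw new IndexOutOfBoundsException("bad range " + lo + ".." + hi);
        }
        float total = 0f;
        for (int i = lo; i < hi; i++) {
            total += a[i];
        }
        return total;
    }

    public static void main(String[] args) {
        float[] trace = {1f, 2f, 3f, 4f, 5f, 6f};
        // Operates on the elements at indices 2, 3 and 4 only.
        System.out.println(sum(trace, 2, 5));
    }
}
```

Standard library methods such as java.util.Arrays.fill(a, fromIndex, toIndex, val) follow the same pattern of full array plus bounds.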
Design Sources
• NCAR / UCAR NetCDF
– University Corporation for Atmospheric Research
• High Performance Fortran
– Ken Kennedy, Rice University 1993
• Colorado School of Mines
– Dave Hale, Mines Java Toolkit
• Landmark ProWESS / SeisSpace
• ARCO Parallel Seismic Environment
DistributedArray Class Structure
[Class diagram: IMultiArray, MultiArray, DistributedArray, ITranspose, Transpose, TransposeType, IParallelContext, MPIContext, Decomposition, mpiJava, java.lang.Array]
“parallel” and “array” packages
• org.javaseis.parallel
– IParallelContext, ParallelContext
  • Message passing support
– Decomposition
  • Define decompositions for array dimensions across processors
• org.javaseis.array
– IMultiArray, MultiArray
  • Containers for Fortran style multidimensional arrays
– ITranspose, Transpose, TransposeType
  • Transpose operations for Fortran style arrays
– DistributedArray
  • Extends MultiArray to distribute across processors
MultiArray Design Targets
• 1D Java arrays of primitive elements or Objects
• A superimposed "shape" that follows Fortran conventions
• Access via "range" triplets (start,end,increment)
• Ranges for Java zero based indexing or Fortran 1 to N based indexing
• Access to the "native" storage array for more arbitrary access
• Array "elements" can have multiple values (e.g. complex, multi-component)
• Designed to be extended to provide JavaSeis DistributedArrays
• Allow use of other array utility classes (java.util.Arrays, edu.mines.jtk.dsp.Array)
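A minimal sketch of the first few design targets, assuming a float element type and zero-based indices. The class ShapedArraySketch and its methods are hypothetical and are not the MultiArray API; they only illustrate a Fortran-ordered shape laid over a 1D Java array and the length of an inclusive range triplet.

```java
// Hypothetical illustration; this is not the JavaSeis MultiArray class.
final class ShapedArraySketch {
    private final float[] data;  // the "native" 1D storage
    private final int[] shape;   // axis lengths, first axis varies fastest

    ShapedArraySketch(int... shape) {
        this.shape = shape.clone();
        int n = 1;
        for (int len : shape) {
            n *= len;
        }
        this.data = new float[n];
    }

    /** Flat offset of a zero-based index tuple in Fortran (column-major) order. */
    int offset(int... index) {
        int off = 0;
        for (int d = shape.length - 1; d >= 0; d--) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("axis " + d + ": " + index[d]);
            }
            off = off * shape[d] + index[d];
        }
        return off;
    }

    /** Direct access to the storage, e.g. for java.util.Arrays utilities. */
    float[] storage() {
        return data;
    }

    /** Number of indices in an inclusive triplet (start, end, increment > 0). */
    static int rangeLength(int start, int end, int increment) {
        return end < start ? 0 : (end - start) / increment + 1;
    }
}
```

For a shape of (4, 3, 2), the tuple (1, 2, 1) maps to offset 1 + 4 * (2 + 3 * 1) = 21, so the first axis is contiguous in memory as in Fortran.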
Transpose Operations
• TransposeType
– Java “enum” that defines the set of available transpose operations (e.g. T312, T1243, T21)
• ITranspose, Transpose
– Interface and “pure java” implementation
– In-place 2D transpose is the basic operation
– Extended to “132” transpose for 3D arrays
– Combinations yield full set of 3D transposes
– A single “1243” transpose provided for 4D
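The "21" and "132" operations can be sketched on flat Fortran-ordered arrays as follows. This is an out-of-place illustration chosen for readability; the slides describe the actual implementation as in-place, and the class and method names here are hypothetical.

```java
// Hypothetical out-of-place sketch; the JavaSeis Transpose class works in place.
final class TransposeSketch {

    /** "21" transpose: input shape (n1, n2), output shape (n2, n1). */
    static float[] t21(float[] in, int n1, int n2) {
        float[] out = new float[in.length];
        for (int j = 0; j < n2; j++) {
            for (int i = 0; i < n1; i++) {
                out[j + n2 * i] = in[i + n1 * j];
            }
        }
        return out;
    }

    /**
     * "132" transpose: input shape (n1, n2, n3), output shape (n1, n3, n2).
     * The fastest axis is untouched, so each n1-long vector moves as a block.
     */
    static float[] t132(float[] in, int n1, int n2, int n3) {
        float[] out = new float[in.length];
        for (int k = 0; k < n3; k++) {
            for (int j = 0; j < n2; j++) {
                System.arraycopy(in, n1 * (j + n2 * k), out, n1 * (k + n3 * j), n1);
            }
        }
        return out;
    }
}
```

With n1 = 1 the "132" transpose reduces to the 2D "21" transpose, which is one way to see why the 2D operation serves as the building block.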
Message Passing
• IParallelContext
– Interface for the minimal set of message passing needed to support JavaSeis Parallel Arrays
– Send, Receive, getSize, getRank
– Barrier, Broadcast (optional)
– Shift, Transpose, BinaryTree built from the above
– Init and Finish
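As an illustration of how a collective pattern can be composed from point-to-point Send and Receive, the sketch below computes one possible binary-tree broadcast schedule from rank 0. The slides do not describe how BinaryTree is implemented, so the doubling scheme, the class name, and the method name are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical schedule generator; not the IParallelContext implementation.
final class BinaryTreeSketch {

    /**
     * Returns, for each round, the {sender, receiver} pairs of a broadcast
     * rooted at rank 0. The set of ranks holding the data doubles each round.
     */
    static List<List<int[]>> schedule(int size) {
        List<List<int[]>> rounds = new ArrayList<>();
        for (int stride = 1; stride < size; stride *= 2) {
            List<int[]> round = new ArrayList<>();
            // Ranks 0..stride-1 already hold the data and forward it.
            for (int src = 0; src < stride && src + stride < size; src++) {
                round.add(new int[] {src, src + stride});
            }
            rounds.add(round);
        }
        return rounds;
    }

    public static void main(String[] args) {
        List<List<int[]>> rounds = schedule(4);
        for (int r = 0; r < rounds.size(); r++) {
            for (int[] pair : rounds.get(r)) {
                System.out.println("round " + r + ": " + pair[0] + " -> " + pair[1]);
            }
        }
    }
}
```

Under this scheme every rank other than 0 receives exactly once, and the number of rounds grows with the logarithm of the node count rather than linearly.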
MPI for Java
• mpiJava from Syracuse University (NPAC) selected for SeisSpace
• Java wrappers for native MPI calls
• Support for sending serialized objects
• MPIContext implements IParallelContext
• MPICH for native methods
• mpirun -np 16 -machinefile machines.txt java ClassName arguments
DistributedArray
• Extends MultiArray
• Requires IParallelContext for constructor
• Adds distributed tiled transpose (ttran)
• Last dimension is spread across processors (Decomposition, BLOCK or CIRCULAR)
• Transpose operations support arbitrary distribution of a single dimension
• Multiple decompositions possible but not currently supported
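The idea behind a distributed tiled transpose can be sketched in a single process by letting an array of local slabs stand in for the processors. This is an assumption-laden illustration, not the ttran implementation: it handles only a 2D array, uses BLOCK distribution of the last dimension, requires both lengths to divide evenly by the node count, and replaces real message passing with an in-memory copy.

```java
// Hypothetical single-process simulation; not the DistributedArray.ttran code.
final class TiledTransposeSketch {

    /**
     * local[r] is rank r's slab of a global (n1, n2) Fortran-ordered array,
     * holding columns r*b2 .. r*b2+b2-1. The result holds slabs of the
     * transposed (n2, n1) array, again distributed over the last dimension.
     */
    static float[][] transpose(float[][] local, int n1, int n2) {
        int p = local.length;
        if (n1 % p != 0 || n2 % p != 0) {
            throw new IllegalArgumentException("sketch requires even division");
        }
        int b1 = n1 / p;
        int b2 = n2 / p;
        float[][] result = new float[p][n2 * b1];
        for (int src = 0; src < p; src++) {
            for (int dst = 0; dst < p; dst++) {
                // Pack the b1 x b2 tile that src owes to dst (stands in for a send).
                float[] tile = new float[b1 * b2];
                for (int jl = 0; jl < b2; jl++) {
                    for (int il = 0; il < b1; il++) {
                        tile[il + b1 * jl] = local[src][(dst * b1 + il) + n1 * jl];
                    }
                }
                // dst transposes the received tile into its own slab.
                for (int jl = 0; jl < b2; jl++) {
                    for (int il = 0; il < b1; il++) {
                        result[dst][(src * b2 + jl) + n2 * il] = tile[il + b1 * jl];
                    }
                }
            }
        }
        return result;
    }
}
```

Each rank exchanges one tile with every other rank, and all transposition work happens locally on tiles, which is what makes the operation data parallel.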
Decomposition
• Design concept from High Performance Fortran
• Default decomposition is BLOCK
– Allocates a fixed number of array indices per node
– Remainder is “pushed” to the edge, NOT evenly allocated
– May result in zero elements on high rank nodes
– Simple start,end indexing with stride 1
• CIRCULAR decomposition
– Round robin allocation
– Remainder spread across nodes
– Good for load balancing
– Permutation logic required to keep track of indices
BLOCK vs CIRCULAR Decomposition: 13 array indices on 4 nodes
[Figure: two bar charts of array indices per node (nodes 0 to 3), one for Block and one for Circ]
Block ranges (start:end:increment): 1:4:1 5:8:1 9:12:1 13:13:1
Circ ranges (start:end:increment): 1:13:4 2:10:4 3:11:4 4:12:4
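The range triplets for 13 indices on 4 nodes can be reproduced with a short sketch. The class and method names are hypothetical, and the formulas are inferred from the listed triplets and the bullet descriptions rather than taken from the JavaSeis Decomposition class; indices are 1-based to match the figure.

```java
import java.util.Arrays;

// Hypothetical sketch; formulas inferred from the 13-on-4 example.
final class DecompositionSketch {

    /** BLOCK: fixed chunk per node, remainder left on the highest ranks. */
    static int[] block(int n, int nodes, int rank) {
        int chunk = (n + nodes - 1) / nodes;      // ceil(n / nodes)
        int start = rank * chunk + 1;
        int end = Math.min(n, start + chunk - 1); // end < start means empty
        return new int[] {start, end, 1};
    }

    /** CIRCULAR: round robin, so each node steps by the node count. */
    static int[] circular(int n, int nodes, int rank) {
        int start = rank + 1;
        int count = start > n ? 0 : (n - start) / nodes + 1;
        int end = start + (count - 1) * nodes;    // end < start means empty
        return new int[] {start, end, nodes};
    }

    /** Number of indices described by a {start, end, increment} triplet. */
    static int count(int[] range) {
        return range[1] < range[0] ? 0 : (range[1] - range[0]) / range[2] + 1;
    }

    public static void main(String[] args) {
        for (int rank = 0; rank < 4; rank++) {
            System.out.println("rank " + rank
                + " block " + Arrays.toString(block(13, 4, rank))
                + " circular " + Arrays.toString(circular(13, 4, rank)));
        }
    }
}
```

With 9 indices on 4 nodes the BLOCK chunk is 3, so rank 3 receives nothing, which illustrates the zero-element case on high rank nodes.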
Transform - Transpose Pattern
[Figure: two volumes, each split across nodes 0 to 3; the first has axes Time, InLine, X-Line and the second has axes X-Line, Frequency, InLine]
// Create a 3D distributed array
DistributedArray a = new DistributedArray( Seis.getParallelContext(), 3,
    float.class, new int[] {512,256,128}, Decomposition.BLOCK );
// Transform x axis of an array in xyz order
computeTransform1D( a );
// Transpose to yxz
a.tran213();
// Transform y axis
computeTransform1D( a );
// Transpose to zyx
a.tran132();
// Transform z axis
computeTransform1D( a );
// Transpose back to xyz
a.tran321();
Distributed Array Padding
• Decomposition will likely have a remainder that requires padding
• Constructor allocates an array that accounts for padding
• Use the constructor that takes an array of Decomposition objects if transpose operations will be used
• Index and range methods only traverse the “live” section of the array
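The relationship between the padded allocation and the "live" section can be sketched for a single BLOCK-distributed dimension. The assumption here, not confirmed by the slides, is that the padded length is the per-node chunk rounded up and multiplied by the node count; the class and method names are hypothetical.

```java
// Hypothetical sketch; assumes padding rounds up to a whole chunk per node.
final class PaddingSketch {

    /** Indices allocated on every node: ceil(n / nodes). */
    static int chunk(int n, int nodes) {
        return (n + nodes - 1) / nodes;
    }

    /** Global length after padding, so every node holds a full chunk. */
    static int paddedLength(int n, int nodes) {
        return chunk(n, nodes) * nodes;
    }

    /** Indices on this rank that hold real data; the rest is padding. */
    static int liveLength(int n, int nodes, int rank) {
        int c = chunk(n, nodes);
        return Math.max(0, Math.min(c, n - rank * c));
    }
}
```

For 13 indices on 4 nodes this gives a padded length of 16, with 4 live indices on ranks 0 to 2 and 1 live index plus 3 padding slots on rank 3.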
Distributed Array Padding
[Figure labels: Live Section, Decomposition Padding, Padded Array, Partial Array Section]
Planned Additions
• Support for other patterns:
– Transpose-Reduce
– Transpose-Overlap
• Arrays of Arrays – optional variable length
float[][] a = new float[10][];
for (int i=0; i<10; i++)
  a[i] = new float[i];
• Parallel Sorting
– Requires variable length “array of arrays”
Reduce - Transpose Pattern
[Figure: distributed objects labeled PDO ( x, y | f ), PDO ( x, y | n ), PDO ( x, n | y )]
The Overlap-Transpose Pattern
[Figure: three panels, each split across nodes 0 to 3: Distributed array; Overlap-Expand Locally; Transpose to Distributed Overlap]
Parallel Data Sort
[Figure labels: Sort; Variable length; Transpose and resort within tile; Block parallel output]