Post on 13-Jan-2016

JavaSeis Parallel Arrays

• JavaSeis data structures

• Synchronous parallel model

• Arrays in Java

• mpiJava

• Parallel Distributed Arrays

• Examples

JavaSeis Data Structures

• View data as N-dimensional array

• Most common view is a series of 3D volumes

• Use generic names for each axis: Sample, Trace, Frame, Volume, Hypercube, …

• Associate “LogicalNames” with each axis:
  – Time, Offset, X-Line, InLine
  – Time, Channel, Shot, Swath

JavaSeis Dataset Logical View

[Figure: dataset axes labeled Sample, Trace, Frame, Volume]

JavaSeis Bounding Box

[Figure: bounding box with axes labeled X-Line and InLine]

Parallel Distributed Objects

[Figure: Node 0 through Node 3, each with pdoExec, pdoRead, and pdoWrite]

The Transpose Problem

[Figure: array spread across CPU 0 to CPU 3, contrasting Local Access with Remote Access]

Data Parallel Transpose

[Figure: array spread across CPU 0 to CPU 3; label: Local Tile Transpose]

Arrays in Java

• 1D arrays or “arrays of arrays”

• Subarrays and multi-dimensional “views” of a 1D array are not supported by the language

• Subarrays are constructed by passing the full array and an upper and lower bound

• Example reference from Matuszek, University of Pennsylvania
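The "full array plus bounds" convention above can be sketched as follows. This is a minimal illustration, not JavaSeis code: the class `SubarrayDemo` and its `sum` method are hypothetical names, and the bounds follow the `java.util.Arrays` convention (lower bound inclusive, upper bound exclusive).

```java
import java.util.Arrays;

final class SubarrayDemo {
    /** Sums data[lo..hi) without copying: the "subarray" is the full array plus two bounds. */
    static float sum(float[] data, int lo, int hi) {
        float total = 0f;
        for (int i = lo; i < hi; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        float[] trace = {1f, 2f, 3f, 4f, 5f, 6f};

        // Zero elements 2, 3 and 4 in place; java.util.Arrays uses the same bounds convention.
        Arrays.fill(trace, 2, 5, 0f);
        System.out.println(Arrays.toString(trace));

        // Operate on the first two elements only.
        System.out.println(sum(trace, 0, 2));

        // A copy is required when a standalone array is needed.
        float[] head = Arrays.copyOfRange(trace, 0, 2);
        System.out.println(head.length);
    }
}
```

No view object is created in any of these calls; every routine that works on a "subarray" has to accept the bounds explicitly.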

Design Sources

• NCAR / UCAR NetCDF
  – University Corporation for Atmospheric Research

• High Performance Fortran
  – Ken Kennedy, Rice University 1993

• Colorado School of Mines
  – Dave Hale, Mines Java Toolkit

• Landmark ProWESS / SeisSpace

• ARCO Parallel Seismic Environment

DistributedArray Class Structure

[Figure: class diagram with boxes IMultiArray / MultiArray, DistributedArray, ITranspose / Transpose, IParallelContext / MPIContext, mpiJava, java.lang.Array, TransposeType, Decomposition]

“parallel” and “array” packages

• org.javaseis.parallel
  – IParallelContext, ParallelContext
    • Message passing support
  – Decomposition
    • Define decompositions for array dimensions across processors

• org.javaseis.array
  – IMultiArray, MultiArray
    • Containers for Fortran style multidimensional arrays
  – ITranspose, Transpose, TransposeType
    • Transpose operations for Fortran style arrays
  – DistributedArray
    • Extends MultiArray to distribute across processors

MultiArray Design Targets

• 1D Java arrays of primitive elements or Objects

• A superimposed "shape" that follows Fortran conventions

• Access via "range" triplets (start,end,increment)

• Ranges for Java zero-based indexing or Fortran 1 to N based indexing

• Access to the "native" storage array for more arbitrary access

• Array "elements" can have multiple values (e.g. complex, multi-component)

• Designed to be extended to provide JavaSeis DistributedArrays

• Allow use of other array utility classes (java.util.Arrays, edu.mines.jtk.dsp.Array)
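To make the design targets concrete, here is a minimal sketch of a Fortran-style "shape" laid over a 1D Java array, with access through a (start,end,increment) range. The class `ShapedArray` and its methods are hypothetical and are not the MultiArray API; the sketch assumes zero-based indices and an inclusive end (for Fortran 1 to N indexing, subtract 1 from each index first).

```java
final class ShapedArray {
    final float[] data;   // "native" 1D storage, exposed for arbitrary access
    final int[] shape;    // Fortran-style shape: the first index varies fastest

    ShapedArray(int... shape) {
        this.shape = shape.clone();
        int n = 1;
        for (int s : shape) {
            n *= s;
        }
        this.data = new float[n];
    }

    /** Offset of a zero-based multi-index in column-major (Fortran) order. */
    int offset(int... index) {
        int off = 0;
        for (int d = shape.length - 1; d >= 0; d--) {
            off = off * shape[d] + index[d];
        }
        return off;
    }

    /**
     * Sets elements along the first axis over the inclusive range
     * (start, end, increment); 'rest' fixes the indices of the remaining axes.
     */
    void fillRange(int start, int end, int increment, float value, int... rest) {
        int[] index = new int[shape.length];
        System.arraycopy(rest, 0, index, 1, rest.length);
        for (int i = start; i <= end; i += increment) {
            index[0] = i;
            data[offset(index)] = value;
        }
    }

    public static void main(String[] args) {
        ShapedArray a = new ShapedArray(4, 3, 2);   // 4 samples, 3 traces, 2 frames
        a.fillRange(0, 3, 2, 1f, 1, 1);             // samples 0 and 2 of trace 1, frame 1
        System.out.println(java.util.Arrays.toString(a.data));
    }
}
```

Because the storage stays a plain `float[]`, utilities such as `java.util.Arrays` can still be applied to it directly.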

Transpose Operations

• TransposeType
  – Java “enum” that defines the set of available transpose operations (e.g. T312, T1243, T21)

• ITranspose, Transpose
  – Interface and “pure java” implementation
  – In-place 2D transpose is the basic operation
  – Extended to “132” transpose for 3D arrays
  – Combinations yield full set of 3D transposes
  – A single “1243” transpose provided for 4D
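The way a 2D transpose and a “132” transpose combine into other 3D transposes can be sketched as below. This is an illustration under stated assumptions, not the JavaSeis Transpose class: the names are hypothetical, arrays are flat and column-major, the operations are out-of-place for clarity (the slide describes an in-place 2D transpose), and a name such as "213" is read as "the new axis order is old axis 2, old axis 1, old axis 3".

```java
final class TransposeSketch {
    /** 2D transpose of a column-major n1 x n2 array; the result is column-major n2 x n1. */
    static float[] t21(float[] a, int n1, int n2) {
        float[] b = new float[a.length];
        for (int j = 0; j < n2; j++) {
            for (int i = 0; i < n1; i++) {
                b[j + n2 * i] = a[i + n1 * j];
            }
        }
        return b;
    }

    /** "213": 2D transpose of the first two axes within every frame. Result shape (n2, n1, n3). */
    static float[] t213(float[] a, int n1, int n2, int n3) {
        float[] b = new float[a.length];
        int frame = n1 * n2;
        for (int k = 0; k < n3; k++) {
            for (int j = 0; j < n2; j++) {
                for (int i = 0; i < n1; i++) {
                    b[j + n2 * i + frame * k] = a[i + n1 * j + frame * k];
                }
            }
        }
        return b;
    }

    /** "132": swap axes 2 and 3 by moving whole length-n1 vectors. Result shape (n1, n3, n2). */
    static float[] t132(float[] a, int n1, int n2, int n3) {
        float[] b = new float[a.length];
        for (int k = 0; k < n3; k++) {
            for (int j = 0; j < n2; j++) {
                System.arraycopy(a, n1 * (j + n2 * k), b, n1 * (k + n3 * j), n1);
            }
        }
        return b;
    }

    /** "231" as a combination: 213 followed by 132. Result shape (n2, n3, n1). */
    static float[] t231(float[] a, int n1, int n2, int n3) {
        float[] b = t213(a, n1, n2, n3);   // shape is now (n2, n1, n3)
        return t132(b, n2, n1, n3);        // shape is now (n2, n3, n1)
    }
}
```

The “132” case is itself a 2D transpose whose "elements" are vectors of length n1, which is why the 2D operation can serve as the basic building block.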

Message Passing

• IParallelContext
  – Interface for the minimal set of message passing needed to support JavaSeis Parallel Arrays
  – Send, Receive, getSize, getRank
  – Barrier, Broadcast (optional)
  – Shift, Transpose, BinaryTree built from the above
  – Init and Finish
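As an illustration of how collective operations can be built from Send and Receive alone, the sketch below computes which rank talks to which for a circular shift and for a tree-style broadcast. The source does not say how JavaSeis implements Shift or BinaryTree; the binomial-tree schedule here is one common construction, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CollectiveSchedules {
    /** Rank that receives this rank's data in a circular shift by 'offset' positions. */
    static int shiftDest(int rank, int size, int offset) {
        return Math.floorMod(rank + offset, size);
    }

    /** Rank whose data arrives at this rank in the same shift. */
    static int shiftSource(int rank, int size, int offset) {
        return Math.floorMod(rank - offset, size);
    }

    /**
     * Point-to-point messages for a tree broadcast rooted at rank 0.
     * Each entry is {round, sender, receiver}. The number of ranks holding
     * the data doubles every round, so about log2(size) rounds are needed.
     */
    static List<int[]> treeBroadcast(int size) {
        List<int[]> sends = new ArrayList<>();
        int round = 0;
        for (int have = 1; have < size; have *= 2, round++) {
            for (int src = 0; src < have && src + have < size; src++) {
                sends.add(new int[] {round, src, src + have});
            }
        }
        return sends;
    }

    public static void main(String[] args) {
        for (int[] s : treeBroadcast(4)) {
            System.out.printf("round %d: %d -> %d%n", s[0], s[1], s[2]);
        }
    }
}
```

In a real run each `{sender, receiver}` pair would become one Send on the sender and one matching Receive on the receiver.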

MPI for Java

• mpiJava from Syracuse University (NPAC) selected for SeisSpace

• Java wrappers for native MPI calls

• Support for sending serialized objects

• MPIContext implements IParallelContext

• MPICH for native methods

• mpirun -np 16 -machinefile machines.txt java ClassName arguments

DistributedArray

• Extends MultiArray

• Requires IParallelContext for constructor

• Adds distributed tiled transpose (ttran)

• Last dimension is spread across processors (Decomposition, BLOCK or CIRCULAR)

• Transpose operations support arbitrary distribution of a single dimension

• Multiple decompositions possible but not currently supported
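The idea of a distributed tiled transpose can be sketched in a single process by treating each node's local block as one Java array. This is a simplified 2D illustration, not the ttran implementation: the names are hypothetical, the decomposition is BLOCK, and the node count is assumed to divide both dimensions so no padding is needed.

```java
final class TiledTransposeSketch {
    /**
     * local[p] is node p's block of an n1 x n2 column-major array: all n1 rows of
     * columns p*c2 .. (p+1)*c2 - 1, where c2 = n2 / nodes.
     * Returns out[q], node q's block of the n2 x n1 transpose: all n2 rows of
     * columns q*c1 .. (q+1)*c1 - 1, where c1 = n1 / nodes.
     */
    static float[][] transpose(float[][] local, int n1, int n2) {
        int nodes = local.length;
        int c1 = n1 / nodes;
        int c2 = n2 / nodes;
        float[][] out = new float[nodes][n2 * c1];
        for (int p = 0; p < nodes; p++) {        // sending node
            for (int q = 0; q < nodes; q++) {    // receiving node
                // Tile (p -> q) is c1 rows by c2 columns; in a real run it is one message.
                for (int j = 0; j < c2; j++) {
                    for (int i = 0; i < c1; i++) {
                        float v = local[p][(q * c1 + i) + n1 * j];
                        // Transposed position: row = global column p*c2 + j, local column = i.
                        out[q][(p * c2 + j) + n2 * i] = v;
                    }
                }
            }
        }
        return out;
    }
}
```

Every node sends one tile to every other node and transposes each tile locally, so all remote access is batched into the tile exchange.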

Decomposition

• Design concept from High Performance Fortran

• Default decomposition is BLOCK
  – Allocates a fixed number of array indices per node
  – Remainder is “pushed” to the edge, NOT evenly allocated
  – May result in zero elements on high rank nodes
  – Simple start,end indexing with stride 1

• CIRCULAR decomposition
  – Round robin allocation
  – Remainder spread across nodes
  – Good for load balancing
  – Permutation logic required to keep track of indices
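The two decompositions can be expressed as (start,end,increment) ranges per node. The sketch below is an assumption-laden illustration rather than the Decomposition class: the names are hypothetical, indices are 1-based and inclusive, and the BLOCK size is taken to be the ceiling of (indices / nodes).

```java
final class DecompositionSketch {
    /** BLOCK: {start, end, increment}, 1-based inclusive; end < start means the node is empty. */
    static int[] blockRange(int n, int nodes, int rank) {
        int chunk = (n + nodes - 1) / nodes;        // fixed number of indices per node
        int start = rank * chunk + 1;
        int end = Math.min(n, start + chunk - 1);   // remainder lands on the high-rank nodes
        return new int[] {start, end, 1};
    }

    /** CIRCULAR: round robin, node r takes r+1, r+1+nodes, r+1+2*nodes, ... */
    static int[] circularRange(int n, int nodes, int rank) {
        int start = rank + 1;
        int count = start > n ? 0 : (n - start) / nodes + 1;
        int end = start + (count - 1) * nodes;      // end < start when count is 0
        return new int[] {start, end, nodes};
    }

    public static void main(String[] args) {
        for (int rank = 0; rank < 4; rank++) {
            int[] b = blockRange(13, 4, rank);
            int[] c = circularRange(13, 4, rank);
            System.out.printf("node %d  block %d:%d:%d  circular %d:%d:%d%n",
                rank, b[0], b[1], b[2], c[0], c[1], c[2]);
        }
    }
}
```

For 13 indices on 4 nodes, BLOCK gives 4, 4, 4 and 1 indices per node, while CIRCULAR gives 4, 3, 3 and 3. With 9 indices on 4 nodes, BLOCK leaves the last node empty.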

BLOCK vs CIRCULAR Decomposition: 13 array indices on 4 nodes

[Figure: two bar charts (Block, Circ) over nodes 0 to 3, vertical axis 0 to 4]

Ranges (start:end:increment) per node:

Node   Block     Circ
0      1:4:1     1:13:4
1      5:8:1     2:10:4
2      9:12:1    3:11:4
3      13:13:1   4:12:4

Transform - Transpose Pattern

[Figure: two panels, each distributed across nodes 0 to 3. First panel axes: Time, InLine, X-Line. Second panel axes: X-Line, Frequency, InLine]

// Create a 3D distributed array
DistributedArray a = new DistributedArray(
    Seis.getParallelContext(), 3, float.class,
    new int[] {512, 256, 128}, Decomposition.BLOCK);

// Transform x axis of an array in xyz order
computeTransform1D(a);
// Transpose to yxz
a.tran213();
// Transform y axis
computeTransform1D(a);
// Transpose to zyx
a.tran132();
// Transform z axis
computeTransform1D(a);
// Transpose back to xyz
a.tran321();

Distributed Array Padding

• Decomposition will likely have a remainder that requires padding

• Constructor allocates an array that accounts for padding

• Use constructor with an array of Decompositions if transpose operations will be used

• Index and range methods only traverse the “live” section of the array
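The relation between the padded allocation and the "live" section can be sketched as follows. The source does not give the padding formula; this sketch assumes a BLOCK decomposition where each distributed dimension is rounded up to a multiple of the node count, and all names are hypothetical.

```java
final class PaddingSketch {
    /** Indices allocated on each node: the ceiling of n / nodes. */
    static int localLength(int n, int nodes) {
        return (n + nodes - 1) / nodes;
    }

    /** Global length after padding: a whole number of equal blocks. */
    static int paddedLength(int n, int nodes) {
        return localLength(n, nodes) * nodes;
    }

    /** Indices on 'rank' that hold real data; the rest of its block is padding. */
    static int liveLength(int n, int nodes, int rank) {
        int chunk = localLength(n, nodes);
        return Math.max(0, Math.min(chunk, n - rank * chunk));
    }

    /** Pads every dimension flagged as distributable up to a multiple of 'nodes'. */
    static int[] paddedShape(int[] shape, boolean[] distributable, int nodes) {
        int[] padded = shape.clone();
        for (int d = 0; d < shape.length; d++) {
            if (distributable[d]) {
                padded[d] = paddedLength(shape[d], nodes);
            }
        }
        return padded;
    }
}
```

Under this assumption, a transpose that changes which dimension is distributed needs both the old and the new distributed dimension padded, which is one reason to declare every decomposition up front.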

[Figure: padded array layout with regions labeled Live Section, Decomposition Padding, Padded Array, and Partial Array Section]

Planned Additions

• Support for other patterns:
  – Transpose-Reduce
  – Transpose-Overlap

• Arrays of Arrays – optional variable length
  float[][] a = new float[10][];
  for (int i = 0; i < 10; i++)
      a[i] = new float[i];

• Parallel Sorting
  – Requires variable length “array of arrays”

Reduce - Transpose Pattern

[Figure: labels PDO ( x, y | f ), PDO ( x, y | n ), PDO ( x, n | y )]

The Overlap-Transpose Pattern

[Figure: three panels across nodes 0 to 3; labels: Distributed array, Overlap-Expand Locally, Transpose to Distributed Overlap]

Parallel Data Sort

[Figure: labels Sort, Variable length, Transpose and resort within tile, Block parallel output]
