Post on 25-Feb-2016
HPF (High Performance Fortran)
What is HPF?
• HPF is a standard for data-parallel programming.
• Extends Fortran-77 or Fortran-90.
• Similar extensions exist for C and C++, but Fortran is really the focus.
Principle of HPF
• Extending sequential language with data distribution directives.
• Data distribution directives specify on which processor a certain part of an array should reside.
• The compiler then produces:
– a parallel program,
– communication between the processes.
What the Standard Says
• Can be used with both Fortran-77 and Fortran-90.
• Distribution directives are just a hint; the compiler may ignore them.
• HPF can be used on both shared memory and distributed memory hardware platforms.
In Commercial Use
• HPF is always used with Fortran-90.
• Distribution directives are a must.
• HPF is used on both shared memory and distributed memory platforms.
• But the truth is that the language was really meant for distributed memory platforms.
Not to Confuse You
• We will discuss commercial use:
– Fortran-90.
– Concurrency extensions to Fortran-90 in HPF.
– HPF data distribution directives.
– How HPF maps to a distributed memory platform.
• Afterwards, we will discuss what the standard allows in addition.
Fortran-90
• Fortran + a number of array features.
• Scalar operations are extended to arrays.
• Intrinsic functions are extended to arrays.
• Additional array-based intrinsic functions.
Array Assignment
Scalar assignment:
integer a, b, c
a = b + c
Array assignment:
integer A(10,10), B(10,10), C(10,10)
A = B + C
Requirements for Array Assignment
• Arrays must be conformable:
– have the same number of dimensions, and
– have the same size in each dimension.
• One major exception is allowed: a scalar may be combined with an array of any shape.
integer A(10,10), B(10,10), c
A = B + c
Intrinsic Functions Extended to Arrays
real A(10,10), B(10,10)
A = SQRT(A)
B = ABS(A)
Additional Array Intrinsic Functions
• MAXVAL, MINVAL
• MAXLOC, MINLOC
– return array of indices
• SUM, PRODUCT
• MATMUL, DOT_PRODUCT, TRANSPOSE
Examples
real A(100,100), B(100), D(100), s, c
integer i(1), j(2)
s = SUM(A)
i = MAXLOC(B)
j = MINLOC(A)
c = DOT_PRODUCT(B, D)
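The intrinsics above are easier to follow on arrays small enough to check by hand. The following is a minimal sketch, not part of the original slides: the program name, the array shapes, and the values are all chosen for illustration.

```fortran
program array_intrinsics
  implicit none
  real    :: A(2,3), B(3), D(3)
  integer :: loc1(1), loc2(2)

  ! Illustrative data. RESHAPE fills A column by column:
  ! A(1,1)=4, A(2,1)=9, A(1,2)=1, A(2,2)=16, A(1,3)=25, A(2,3)=36
  A = reshape((/ 4.0, 9.0, 1.0, 16.0, 25.0, 36.0 /), (/ 2, 3 /))
  B = (/ 2.0, 7.0, 5.0 /)
  D = (/ 1.0, 1.0, 2.0 /)

  print *, sum(A)              ! sum of all six elements: 91.0

  loc1 = maxloc(B)             ! rank-1 argument: one index, here 2
  loc2 = minloc(A)             ! rank-2 argument: (row, column), here 1 2
  print *, loc1, loc2

  print *, dot_product(B, D)   ! 2*1 + 7*1 + 5*2 = 19.0
  print *, sqrt(A(:,1))        ! elementwise on a section: 2.0 3.0
end program array_intrinsics
```

Note that MAXLOC and MINLOC return an array with one entry per dimension of their argument, which is why the result variables are declared with sizes 1 and 2.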
Array Sections
array( lower_bound : upper_bound : stride )
• Refers to the section of the array between lower_bound and upper_bound, with an optional stride specified.
• Multiple dimensions may be specified, with the obvious meaning.
• Array sections may be used wherever arrays may be used.
Examples
integer A(10), B(10), C(10)
integer D(50), E(100), F(100)
integer max
integer G(100), H(100,100)

A(1:8) = B(1:8) + C(2:9)
D = E(1:100:2) + F(2:100:2)
max = MAXVAL( G(1:100:10) )
max = MINVAL( H(1:100, 1:50) )
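Whether two strided sections conform depends on how many elements each one selects, and that is easy to get wrong by one. The sketch below counts the elements with the SIZE intrinsic. The program name and the fill values are illustrative assumptions.

```fortran
program section_sizes
  implicit none
  integer :: D(50), E(100), F(100)
  integer :: i

  ! Illustrative fill values: E(i) = i, F(i) = 10*i
  E = (/ (i, i = 1, 100) /)
  F = (/ (10 * i, i = 1, 100) /)

  ! E(1:100:2) selects elements 1, 3, ..., 99: 50 elements.
  print *, size(E(1:100:2))
  ! F(2:100:2) selects elements 2, 4, ..., 100: also 50 elements.
  print *, size(F(2:100:2))
  ! F(2:99:2) stops at element 98, so it has only 49 elements.
  print *, size(F(2:99:2))

  ! Both sections have 50 elements, so they conform with D(50).
  D = E(1:100:2) + F(2:100:2)
  print *, D(1), D(50)   ! E(1)+F(2) = 21 and E(99)+F(100) = 1099
end program section_sizes
```

An upper bound that does not land on the stride simply truncates the section, so the safest habit is to count elements rather than compare bounds.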
Semantics of Array Assignments
• First, the entire right hand side is evaluated.
• Then, assignments are made to the left hand side.
Example
integer :: A(4) = (/ 7, 8, 12, 14 /)
A(2:3) = A(1:2)
=> results in A being {7, 7, 8, 14}
=> not {7, 7, 7, 14}, which an element-by-element loop would produce
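The difference between the two outcomes can be reproduced directly. This is a small sketch under the same initial values as the example, comparing the array assignment with a hand-written sequential loop. The program name and the second array B are added for illustration.

```fortran
program rhs_first
  implicit none
  integer :: A(4), B(4), i

  ! Array assignment: the right-hand side A(1:2) is read in full
  ! before anything is stored into A(2:3).
  A = (/ 7, 8, 12, 14 /)
  A(2:3) = A(1:2)
  print *, A          ! 7 7 8 14

  ! Element-by-element loop: B(2) is overwritten before it is read
  ! in the next iteration, so the old value 8 is lost.
  B = (/ 7, 8, 12, 14 /)
  do i = 2, 3
    B(i) = B(i-1)
  end do
  print *, B          ! 7 7 7 14
end program rhs_first
```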
Sequential/Parallel Fortran-90
• Fortran-90 is a sequential language.
• However, its array assignment semantics make it easy to parallelize automatically.
Not Perfect, Though (1 of 2)
do i = 1,100
  X(i,i) = 0.0
enddo
• Obviously parallelizable.
• Not expressible as a Fortran-90 array assignment (only regular sections).
Not Perfect, Though (2 of 2)
integer D(50), E(100), F(100)
D = E(1:100:2) + F(2:100:2)
is correct, but
integer D(100), E(100), F(100)
D = E(1:100:2) + F(2:100:2)
is not, because D has 100 elements while the right hand side has only 50, so the arrays are not conformable.
HPF: Additional Expressions of Parallelism
• FORALL array assignment.
• INDEPENDENT construct.
FORALL Array Assignment
FORALL( subscript = lower_bound : upper_bound : stride, mask) array-assignment
• Execute all iterations of the subscript loop in parallel for the given set of indices, where mask is true.
• May have multiple dimensions.
• Same semantics: first compute right hand side, then assign to left hand side.
• Each element may be assigned at most once (not checked by the compiler!).
Examples (1 of 3)
do i = 1,100
  X(i,i) = 0.0
enddo
becomes
FORALL(i=1:100) X(i,i) = 0.0
Examples (2 of 3)
integer D(100), E(100), F(100)
D = E(1:100:2) + F(2:100:2)
becomes (correctly)
FORALL(i=1:50) D(i) = E(2*i-1) + F(2*i)
Examples (3 of 3)
• A multiple dimension example with use of the mask option.
• Set all the elements of X above the diagonal to the sum of their indices.
FORALL(i=1:100, j=1:100, i<j) X(i,j) = i+j
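To see which elements the mask selects, the same statement can be run on a matrix small enough to print. This is a sketch only: the size n = 4, the integer element type, and the zero initialization are assumptions made for illustration.

```fortran
program forall_mask
  implicit none
  integer, parameter :: n = 4     ! illustrative size; the slide uses 100
  integer :: X(n,n), i, j

  X = 0
  ! Only index pairs with i < j (strictly above the diagonal) are assigned;
  ! all other elements keep their previous value.
  forall (i = 1:n, j = 1:n, i < j) X(i,j) = i + j

  ! Print row by row:
  !   0 3 4 5
  !   0 0 5 6
  !   0 0 0 7
  !   0 0 0 0
  do i = 1, n
    print *, X(i,:)
  end do
end program forall_mask
```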
The INDEPENDENT Clause
!HPF$ INDEPENDENT
DO … ENDDO
• Specifies that the iterations of the loop can be executed in any order.
Examples (1 of 2)
!HPF$ INDEPENDENT
DO i=1, 100
  DO j = 1, 100
    IF(i.NE.j) A(i,j) = 1.0
    IF(i.EQ.j) A(i,j) = 0.0
  ENDDO
ENDDO
Examples (2 of 2): Nesting
!HPF$ INDEPENDENT
DO i=1, 100
!HPF$ INDEPENDENT
  DO j = 1, 100
    IF(i.NE.j) A(i,j) = 1.0
    IF(i.EQ.j) A(i,j) = 0.0
  ENDDO
ENDDO
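Because `!HPF$` lines begin with `!`, a compiler without HPF support treats them as ordinary comments, so the nested example can be tried as a plain sequential program. The sketch below assumes n = 4 and an illustrative program name. It also adds a `NEW(j)` clause on the outer directive, which the slides omit: this is the HPF way to state that each outer iteration has its own copy of the inner loop variable, and it is included here as an assumption about what a strict HPF compiler would expect.

```fortran
program independent_demo
  implicit none
  integer, parameter :: n = 4     ! illustrative size; the slide uses 100
  real    :: A(n,n)
  integer :: i, j

  ! NEW(j) is an addition relative to the slides (see the note above).
  !HPF$ INDEPENDENT, NEW(j)
  do i = 1, n
    !HPF$ INDEPENDENT
    do j = 1, n
      if (i /= j) A(i,j) = 1.0
      if (i == j) A(i,j) = 0.0
    end do
  end do

  ! Every element is written exactly once and no iteration reads what
  ! another iteration wrote, so the iteration order cannot change the result.
  print *, sum(A)     ! n*n - n ones: 12.0 for n = 4
end program independent_demo
```

The directive is an assertion by the programmer, not something the compiler verifies. If the loop body did carry a dependence between iterations, the directive would be wrong and the parallel result undefined.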
HPF/Fortran-90 Matrix Multiply (1 of 4)
C = MATMUL( A, B )
HPF Matrix Multiply (2 of 4)
C = 0.0
DO k=1,n
  FORALL(i=1:n, j=1:n) C(i,j) = C(i,j) + A(i,k) * B(k,j)
ENDDO
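One way to make the FORALL formulation concrete is to keep the loop over k sequential and update all elements of C in parallel for each k, then compare against MATMUL. This is a sketch under assumed inputs: n = 3, the fill values, and the names Cref and matmul_check are illustrative.

```fortran
program matmul_check
  implicit none
  integer, parameter :: n = 3     ! illustrative size
  real    :: A(n,n), B(n,n), C(n,n), Cref(n,n)
  integer :: i, j, k

  ! Small integer-valued inputs, so every product and sum is exact
  ! and the comparison below does not depend on summation order.
  A = reshape((/ (real(i),       i = 1, n*n) /), (/ n, n /))
  B = reshape((/ (real(n*n - i), i = 1, n*n) /), (/ n, n /))

  Cref = matmul(A, B)

  ! k runs sequentially; for each k, all (i,j) updates are independent,
  ! and each element C(i,j) is assigned exactly once per FORALL.
  C = 0.0
  do k = 1, n
    forall (i = 1:n, j = 1:n) C(i,j) = C(i,j) + A(i,k) * B(k,j)
  end do

  print *, all(C == Cref)   ! T
end program matmul_check
```

The sum over k cannot be folded into the FORALL index list, because that would assign the same C(i,j) more than once within a single FORALL.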
HPF Matrix Multiply (3 of 4)
!HPF$ INDEPENDENT
DO i=1,n
  DO j=1,n
    C(i,j) = 0.0
    DO k=1,n
      C(i,j) = C(i,j) + A(i,k) * B(k,j)
    ENDDO
  ENDDO
ENDDO
HPF Matrix Multiply (4 of 4)
!HPF$ INDEPENDENT
DO i=1,n
!HPF$ INDEPENDENT
  DO j=1,n
    C(i,j) = 0.0
    DO k=1,n
      C(i,j) = C(i,j) + A(i,k) * B(k,j)
    ENDDO
  ENDDO
ENDDO
HPF/Fortran-90 SOR (1 of 4)
TEMP(1:n,1:n) = 0.25 * ( GRID(1:n,0:n-1) + GRID(1:n,2:n+1) + GRID(0:n-1,1:n) + GRID(2:n+1,1:n) )
GRID(1:n,1:n) = TEMP(1:n,1:n)
HPF/Fortran-90 SOR (1’ of 4)
GRID(1:n,1:n) = 0.25 * ( GRID(1:n,0:n-1) + GRID(1:n,2:n+1) + GRID(0:n-1,1:n) + GRID(2:n+1,1:n) )
Also works, because of array assignment rules
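The claim that the in-place version is equivalent can be checked on a small grid, and contrasted with a plain loop nest that updates in place. This is a sketch with assumed values: n = 4, the initial grid i*i + j, and the names G1, G2, G3 are illustrative.

```fortran
program sor_inplace
  implicit none
  integer, parameter :: n = 4     ! illustrative interior size
  real    :: G1(0:n+1,0:n+1), G2(0:n+1,0:n+1), G3(0:n+1,0:n+1)
  real    :: TEMP(n,n)
  integer :: i, j

  ! Illustrative initial grid, boundary included.
  forall (i = 0:n+1, j = 0:n+1) G1(i,j) = real(i*i + j)
  G2 = G1
  G3 = G1

  ! Version with an explicit temporary.
  TEMP = 0.25 * ( G1(1:n,0:n-1) + G1(1:n,2:n+1) &
                + G1(0:n-1,1:n) + G1(2:n+1,1:n) )
  G1(1:n,1:n) = TEMP

  ! In-place version: the whole right-hand side is evaluated from the
  ! old values before any element of G2 is overwritten.
  G2(1:n,1:n) = 0.25 * ( G2(1:n,0:n-1) + G2(1:n,2:n+1) &
                       + G2(0:n-1,1:n) + G2(2:n+1,1:n) )
  print *, all(G1 == G2)   ! T

  ! A sequential loop nest updating in place reads neighbours that were
  ! already updated, so it computes something different.
  do j = 1, n
    do i = 1, n
      G3(i,j) = 0.25 * ( G3(i-1,j) + G3(i+1,j) + G3(i,j-1) + G3(i,j+1) )
    end do
  end do
  print *, all(G1 == G3)   ! F
end program sor_inplace
```

This is why the loop-based HPF versions need the separate TEMP array: without it, an INDEPENDENT assertion on the update loop would be false.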
HPF SOR (2 of 4)
FORALL(i=1:n,j=1:n) &
  TEMP(i,j) = 0.25 * ( GRID(i-1,j) + GRID(i+1,j) + GRID(i,j-1) + GRID(i,j+1) )
FORALL(i=1:n,j=1:n) GRID(i,j) = TEMP(i,j)
HPF SOR (3 of 4)
!HPF$ INDEPENDENT
DO i=1,n
  DO j=1,n
    TEMP(i,j) = 0.25 * ( GRID(i-1,j) + GRID(i+1,j) + GRID(i,j-1) + GRID(i,j+1) )
  ENDDO
ENDDO
!HPF$ INDEPENDENT
DO i=1,n
  DO j=1,n
    GRID(i,j) = TEMP(i,j)
  ENDDO
ENDDO
HPF SOR (4 of 4)
!HPF$ INDEPENDENT
DO i=1,n
!HPF$ INDEPENDENT
  DO j=1,n
    TEMP(i,j) = 0.25 * ( GRID(i-1,j) + GRID(i+1,j) + GRID(i,j-1) + GRID(i,j+1) )
  ENDDO
ENDDO
!HPF$ INDEPENDENT
DO i=1,n
!HPF$ INDEPENDENT
  DO j=1,n
    GRID(i,j) = TEMP(i,j)
  ENDDO
ENDDO