NATIONAL ENERGY RESEARCH SCIENTIFIC COMPUTING CENTER

Porting from the Cray T3E to the IBM SP

Jonathan Carter

NERSC User Services

Overview

• Focus is on Fortran programs using MPI for communication

• Outline common pitfalls:

– f90 vs. xlf Fortran compiler

– Cray vs. IBM MPI library

– Math libraries

– System libraries

– I/O

f90 vs. xlf - Main Differences

• f90

– compiles for parallel (MPI) automatically

– accepts file suffix .f90, .F90

– default optimization is -O2

– allows access to full memory on a PE by default

• xlf

– compiler is accessed by several names, each name “packages” options together

– by default, only file suffix .f and .F allowed

– default is no optimization

– restricted amount of memory available by default

xlf Compiler Options

• Compiler name can have three parts:

– optional prefix “mp” indicates MPI library is automatically linked

– compiler name, xlf, xlf90, or xlf95 indicates language mode

– optional postfix “_r” indicates threads, or OpenMP capability

• Example:

– mpxlf90 - Fortran 90 language compiler with MPI library available

– mpxlf_r - Fortran 77 language compiler with MPI library, threads, and OpenMP capability available.

• If you want to use MPI I/O, the thread capable compiler must be used.

xlf Compiler Options

• To use different file suffixes, e.g. .f90 and .F90:

– -qsuffix=f=f90:cpp=F90

• For optimization we recommend:

– -O3 -qtune=pwr3 -qarch=pwr3 -qstrict

• xlf defaults to 32 Kbytes for stack space and 128 Mbyte for heap space. To increase to maximums of 256 Mbyte for stack, and 2 Gbyte for heap:

– -bmaxstack:0x10000000 -bmaxdata:0x80000000

Default Datatypes

Type                T3E length (bytes)   SP length (bytes)
Character           1                    1
Complex             2 x 8                2 x 4
Double Complex      2 x 8                2 x 8
Double precision    8                    8
Integer / Logical   8                    4
Real                8                    4

• Double Complex is a language extension

• T3E lengths assume the -dp flag for f90

• The xlf compiler has -qrealsize=8 to promote all default reals and real constants to 8 bytes, and -qintsize=8 to promote all default integers and logicals.
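Besides the promotion flags, source code can ask for a given precision explicitly, so that the same declarations mean the same thing on both machines. The following is a minimal sketch of that idea using standard Fortran 90 kind inquiry; the names r8 and i8 are illustrative choices for this example and do not come from the slides.

```fortran
program kind_demo
  implicit none
  ! r8 and i8 are illustrative names, not taken from the slides.
  integer, parameter :: r8 = selected_real_kind(12)  ! real with at least 12 decimal digits
  integer, parameter :: i8 = selected_int_kind(18)   ! integer able to hold 10**18
  real        :: x_default    ! default real: length differs between T3E and SP
  real(r8)    :: x            ! explicitly sized real
  integer     :: n_default    ! default integer: length differs between T3E and SP
  integer(i8) :: n            ! explicitly sized integer

  x = 1.0_r8 / 3.0_r8         ! constants carry the kind too
  n = huge(n)

  ! kind(), precision() and range() are inquiry functions; they report
  ! what the compiler actually chose for each declaration.
  print *, 'default real:    kind', kind(x_default), ' digits', precision(x_default)
  print *, 'real(r8):        kind', kind(x), ' digits', precision(x)
  print *, 'default integer: kind', kind(n_default), ' range 10**', range(n_default)
  print *, 'integer(i8):     kind', kind(n), ' range 10**', range(n)
  print *, x, n
end program kind_demo
```

Compiling this with and without -qrealsize=8 or -qintsize=8 should change only the lines for the default types, which is a quick way to see what the flags do.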

Available Datatypes

• Fortran 77 “*” syntax is also available to explicitly define a datatype

Type                Kind   T3E length (bytes)   SP length (bytes)
Complex             4      2 x 4                2 x 4
Complex             8      2 x 8                2 x 8
Complex             16     NA                   2 x 16
Integer / Logical   1      4                    1
Integer / Logical   2      4                    2
Integer / Logical   4      4                    4
Integer / Logical   8      8                    8
Real                4      4                    4
Real                8      8                    8
Real                16     NA                   16

MPI Differences

• Different default datatypes between T3E and SP

• More error checking of arguments on the SP

• Default amount of buffering is different

• Different subset of MPI I/O implemented

Available MPI Datatypes

Type                   T3E length (bytes)   SP length (bytes)
MPI_Character          1                    1
MPI_Complex            2 x 8                2 x 4
MPI_Double_Complex     2 x 8                2 x 8
MPI_Double_Precision   8                    8
MPI_Integer            8                    4
MPI_Logical            8                    4
MPI_Real               8                    4

Default MPI Datatypes

Type            T3E length (bytes)   SP length (bytes)
MPI_Complex8    NA                   2 x 4
MPI_Complex16   NA                   2 x 8
MPI_Complex32   NA                   2 x 16
MPI_Integer1    4                    1
MPI_Integer2    4                    2
MPI_Integer4    4                    4
MPI_Integer8    8                    8
MPI_Logical1    NA                   1
MPI_Logical2    NA                   2
MPI_Logical4    NA                   4
MPI_Logical8    NA                   8
MPI_Real4       4                    4
MPI_Real8       8                    8
MPI_Real16      NA                   16
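Because MPI_Real describes 8 bytes on the T3E but 4 bytes on the SP, a call that passes real*8 data as MPI_Real moves only half of the data on the SP. A minimal sketch of one way around this is to name a sized MPI datatype that matches the declaration; MPI_Real8 is listed for both machines in the table above. The program name and array size are arbitrary choices for this example.

```fortran
program bcast_real8
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 4      ! arbitrary length for this sketch
  real*8  :: a(n)
  integer :: mype, ierr

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)

  ! Only PE 0 starts with meaningful data.
  if (mype == 0) then
     a = 1.0d0
  else
     a = 0.0d0
  end if

  ! The buffer is declared real*8, so the datatype is MPI_REAL8.
  ! MPI_REAL would describe 8-byte elements on the T3E but 4-byte
  ! elements on the SP.
  call mpi_bcast(a, n, MPI_REAL8, 0, MPI_COMM_WORLD, ierr)

  print *, 'PE', mype, ' a(1) =', a(1)
  call mpi_finalize(ierr)
end program bcast_real8
```

The same reasoning applies to integers and logicals: match MPI_Integer4 or MPI_Integer8 to the declared length rather than relying on MPI_Integer.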

MPI - Argument Checking

• T3E MPI library has several collective routines which do not check arguments in accordance with the MPI standard. The SP does check arguments.

• Examples:

– MPI_Bcast “count” argument is not checked for consistency on T3E

– MPI_Gatherv array of “counts” is not checked for consistency on T3E

MPI - Buffering

• If your program depends on the buffering of standard MPI Sends and Receives, you may see different behavior between the T3E and the SP.

• Classic case:

...
! Both PEs send first, so this completes only if the sends are buffered
if (mype.eq.0) then
   call mpi_send(buf,count,type,1,tag,MPI_COMM_WORLD,ierr)
   call mpi_recv(buf,count,type,1,tag,MPI_COMM_WORLD,status,ierr)
else if (mype.eq.1) then
   call mpi_send(buf,count,type,0,tag,MPI_COMM_WORLD,ierr)
   call mpi_recv(buf,count,type,0,tag,MPI_COMM_WORLD,status,ierr)
end if
...
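One way to remove the dependence on buffering in an exchange like the one above is to let MPI pair the send and the receive itself. The following is a minimal sketch using MPI_Sendrecv between PE 0 and PE 1; the buffer names, the message length n and the use of MPI_REAL8 are assumptions made for this example, not part of the original slide.

```fortran
program safe_exchange
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 100000   ! large enough to exceed typical eager limits
  integer, parameter :: tag = 1
  real*8  :: sendbuf(n), recvbuf(n)  ! separate buffers for outgoing and incoming data
  integer :: mype, npes, other, ierr
  integer :: status(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)
  call mpi_comm_size(MPI_COMM_WORLD, npes, ierr)

  ! Only PEs 0 and 1 take part, as in the classic case.
  if (npes >= 2 .and. mype < 2) then
     other   = 1 - mype              ! partner PE: 0 <-> 1
     sendbuf = dble(mype)

     ! MPI_Sendrecv performs the send and the receive as one operation,
     ! so it does not rely on the send being buffered.
     call mpi_sendrecv(sendbuf, n, MPI_REAL8, other, tag, &
                       recvbuf, n, MPI_REAL8, other, tag, &
                       MPI_COMM_WORLD, status, ierr)

     print *, 'PE', mype, ' received data from PE', other, ':', recvbuf(1)
  end if

  call mpi_finalize(ierr)
end program safe_exchange
```

Reversing the order of the send and receive on one of the two PEs, or using non-blocking calls, are other common ways to get the same effect.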

MPI - Buffering

• On the T3E, messages of up to 4 Kbyte are buffered. This can be changed by setting the environment variable MPI_BUFFER_MAX.

• On the SP, the default size depends on the number of processors:

Number of processors   Default size (bytes)
1 to 16                4096
17 to 32               2048
33 to 64               1024
65 to 128              512
129 to 256             256
257 and over           128

• This can be changed by setting the environment variable MP_EAGER_LIMIT.

Cray SciLib and IBM ESSL

• Both vendors provide libraries of commonly used Linear Algebra subroutines

• On the T3E the library is linked by default; on the SP, link with “-lessl”

• These libraries are faster than the public domain BLAS, LAPACK, etc.

Using BLAS

• BLAS levels 1 through 3 are completely compatible between the two machines

• Note which precision of BLAS is being called:

– On the T3E

   real*8 a(n), b(n), x

   x = sdot(n,a,1,b,1)

– On the SP

   real*8 a(n), b(n), x

   x = ddot(n,a,1,b,1)

Using BLAS

• Instead of changing program source, loader options can be used to map one routine to another

• To resolve a call to sdot by a call to ddot on the SP:

xlf -o a.out -brename:sdot,ddot b.f

• To resolve a call to ddot by a call to sdot on the T3E:

f90 -o a.out -Wl"-Dequiv(DDOT)=SDOT" b.f

LAPACK routines

• Most other linear algebra routines in Cray SciLib and IBM ESSL are compatible with LAPACK.

• In ESSL there are a few incompatibilities (x may be C, D, S, Z):

xGEEV

xSPEV

xSPSV

xHPEV

xHPSV

xGEGV

xSYGV

• Use installed LAPACK library for these.

ScaLAPACK library

• Cray SciLib and IBM PESSL support pieces of the standard ScaLAPACK library.

• Check precision of routines:

– For real*8 on the T3E, routine names start with “PS”

– For real*8 on the SP, routine names start with “PD”

• On the SP, you must call BLACS_GET followed by either BLACS_GRIDINIT or BLACS_GRIDMAP. On the T3E, only a call to one of the latter two routines is required.

• Public domain ScaLAPACK is also installed on both machines.
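The BLACS call order required on the SP can be sketched as follows. This is a minimal example that uses only the standard BLACS entry points; the 1 x nprocs grid shape and the program name are arbitrary choices for illustration, and the library names needed at link time are not shown.

```fortran
program blacs_setup
  implicit none
  integer :: iam, nprocs          ! this process and the total process count
  integer :: ictxt                ! BLACS context handle
  integer :: nprow, npcol         ! process grid shape
  integer :: myrow, mycol         ! this process's grid coordinates

  call blacs_pinfo(iam, nprocs)

  ! Illustrative grid shape: a single row holding every process.
  nprow = 1
  npcol = nprocs

  ! Required on the SP: obtain the default system context first ...
  call blacs_get(0, 0, ictxt)
  ! ... then build the process grid (row-major ordering).
  call blacs_gridinit(ictxt, 'R', nprow, npcol)

  call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)
  print *, 'process', iam, ' is at grid position', myrow, mycol

  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program blacs_setup
```

Since the T3E needs only the grid routine, a code that always calls BLACS_GET first follows the stricter SP ordering.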

System Libraries

• System libraries are, generally, routines that interact with the operating system and provide extensions to the Fortran language.

• Cray provides many such routines. Some are available on the SP, for example:

T3E      SP         Function
Abort    Abort      Ends program
Exit     Exit_      Ends program
Flush    Flush_     Flushes Fortran I/O buffer
System   Ishell     Executes a command
Trbk     Xl__trbk   Prints a traceback

System Libraries

• A more comprehensive list is available at: http://hpcf.nersc.gov/computers/SP/port.html

• Some routines have changed names and slightly different arguments.

• There are sometimes identically or similarly named routines on the SP which are designed to be called from C only. Calling them from Fortran will cause unexpected behavior.

• For example, calling exit instead of exit_ will cause the program to end without flushing any Fortran I/O buffer.
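The exit_ case can be illustrated with a short sketch. It assumes the xlf service routines flush_ and exit_ each take a single default integer argument (a unit number and an exit status respectively), which should be checked against the XL Fortran documentation; the routines are specific to xlf, so this example is not expected to build with other compilers.

```fortran
program clean_exit
  implicit none

  write (6, *) 'writing results before shutting down'

  ! Assumed interface: flush_(unit). Pushes buffered output for
  ! unit 6 out to the operating system.
  call flush_(6)

  ! Assumed interface: exit_(status). The trailing underscore selects
  ! the Fortran-callable routine; plain exit is the C routine and
  ! would end the program without flushing Fortran I/O buffers.
  call exit_(1)
end program clean_exit
```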

Fortran I/O

• Unformatted I/O

– The primitive datatypes on the T3E and SP are compatible (provided they are of the same length), but control words inserted by the Fortran language I/O layer prevent transferability of sequential access files.

– Direct access files can be freely transferred between the two machines, as can MPI I/O files.

• Namelist Input/Output

– Users familiar with the assign -f77 on the T3E, which causes an old-style namelist input to be written or read, can set the following environment variable on the SP to obtain the same effect:

setenv XLFRTEOPTS namelist=old
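For the direct access files mentioned under Unformatted I/O, the record length has to be the same on both machines. The following is a minimal sketch, assuming 8-byte reals and a scratch file name demo.dat chosen for illustration; it uses inquire(iolength=...) so that the record length is given in whatever units the compiler expects.

```fortran
program direct_access_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)  ! illustrative kind name
  integer, parameter :: n = 8                        ! values per record
  real(r8) :: rec_out(n), rec_in(n)
  integer  :: reclen, i

  ! Ask the compiler for the record length of one record, in its own
  ! units (these are processor dependent).
  inquire (iolength=reclen) rec_out

  open (unit=10, file='demo.dat', form='unformatted', access='direct', &
        recl=reclen, status='replace')

  ! Direct access records carry no control words, only the data.
  do i = 1, 3
     rec_out = real(i, r8)
     write (10, rec=i) rec_out
  end do

  ! Records can be read back in any order.
  read (10, rec=2) rec_in
  print *, 'record 2 starts with', rec_in(1)

  close (10, status='delete')
end program direct_access_demo
```

Declaring the data with an explicit kind keeps the record contents the same length on the T3E and the SP, which is the condition the slide places on compatibility.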

Further Information

• T3E and SP webpages and software webpages contain further information and links to vendor documentation:

http://hpcf.nersc.gov/computers

http://hpcf.nersc.gov/software