
Open MPI on the Cray XT

Presented by Richard L. Graham

Tech Integration
National Center for Computational Sciences


Why does Open MPI exist?

• Maximize all MPI expertise: research/academia, industry, …elsewhere.
• Capitalize on (literally) years of MPI research and implementation experience.
• The sum is greater than the parts.

[Figure: overlapping circles of Research/academia and Industry expertise.]


Current membership

• 8 universities
• 7 vendors
• 4 US DOE labs
• 1 individual

14 members, 6 contributors


Key design feature: Components

Formalized interfaces

• Specifies “black box” implementation

• Different implementations available at run-time

• Can compose different systems on the fly

[Figure: a caller choosing among Interface 1, Interface 2, and Interface 3 implementations behind the same formalized interface.]
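Because components are selected at run time, the same application binary can be switched between implementations with MCA parameters on the command line. A brief illustration using Open MPI's standard mpirun syntax (./app is a placeholder binary name):

  mpirun --mca pml ob1 -np 2 ./app   # run with the OB1 point-to-point component
  mpirun --mca pml cm  -np 2 ./app   # same binary, CM component instead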


Point-to-point architecture

[Figure: two interchangeable point-to-point stacks beneath the MPI layer.
OB1 path: MPI → PML-OB1/DR → BML-R2 → BTL-GM (MPool-GM, Rcache) or BTL-OpenIB (MPool-OpenIB, Rcache).
CM path: MPI → PML-CM → MTL-MX (Myrinet), MTL-Portals, or MTL-PSM (QLogic).]


Portals port: OB1 vs. CM

OB1
• Matching in main memory
• Short message: eager, buffered on receive
• Long message: rendezvous
  – Rendezvous packet: 0-byte payload
  – Get message after match

CM
• Matching may be on the NIC
• Short message: eager, buffered on receive
• Long message: eager
  – Send all data
  – If match: deliver directly to user buffer
  – If no match: discard payload and get() user data after match
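The difference is easiest to see against a concrete message pattern. The following C ping-pong is a minimal sketch, not code from the talk; the 8-byte and 1 MB sizes are illustrative assumptions, since the actual eager/rendezvous crossover is a tunable transport parameter.

  /* ping_pong.c: minimal sketch; run with: mpirun -np 2 ./ping_pong */
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      /* 8 bytes stays under typical eager limits; 1 MB is well past them. */
      const int sizes[2] = { 8, 1 << 20 };
      int rank, i;
      char *buf;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      buf = malloc(sizes[1]);
      memset(buf, 0, sizes[1]);

      for (i = 0; i < 2; i++) {
          if (rank == 0) {
              /* The long send uses rendezvous under OB1; under CM on Portals
                 all data is sent eagerly and discarded if there is no match. */
              MPI_Send(buf, sizes[i], MPI_CHAR, 1, i, MPI_COMM_WORLD);
              MPI_Recv(buf, sizes[i], MPI_CHAR, 1, i, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(buf, sizes[i], MPI_CHAR, 0, i, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, sizes[i], MPI_CHAR, 0, i, MPI_COMM_WORLD);
          }
      }

      if (rank == 0) printf("ping-pong complete\n");
      free(buf);
      MPI_Finalize();
      return 0;
  }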


Collective communications component structure

[Figure: MPI Component Architecture (MCA). A user application calls the MPI API, which rests on the MCA's component frameworks, each offering several interchangeable components: PML (OB1, CM, DR, CRCPW); Allocator (Basic, Bucket); BTL (TCP, Shared Mem., Infiniband); MTL (Myrinet MX, Portals, PSM); Topology (Basic, Utility); Collective (Basic, Tuned, Hierarchical, Intercomm., Shared Mem., Non-blocking); I/O (Portals).]
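Which frameworks and components are compiled into a given build can be checked with Open MPI's standard ompi_info tool; for example (output format illustrative):

  ompi_info | grep "MCA pml"     # list the PML components in this build
  ompi_info --param pml ob1      # show OB1's run-time tunable parameters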


Benchmark Results


NetPipe bandwidth data (MB/sec)

[Figure: NetPipe bandwidth (MBytes/sec, 0–2000) vs. data size (0.0001–10000 KBytes, log scale) for Open MPI—CM, Open MPI—OB1, and Cray MPI.]
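Curves like these are produced by running NetPIPE's MPI test once per stack; assuming NetPIPE's NPmpi binary, the runs would look like:

  mpirun --mca pml cm  -np 2 NPmpi
  mpirun --mca pml ob1 -np 2 NPmpi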


Zero byte ping-pong latency

Cray MPI: 4.78 µsec
Open MPI—CM: 4.91 µsec
Open MPI—OB1: 6.16 µsec
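For reference, numbers like these are typically measured with a timed zero-byte ping-pong loop. A minimal C sketch (hypothetical, not the benchmark used here; the iteration count is an arbitrary assumption):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      const int iters = 10000;
      int rank, i;
      double t0, t1;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      MPI_Barrier(MPI_COMM_WORLD);   /* line up both ranks before timing */
      t0 = MPI_Wtime();
      for (i = 0; i < iters; i++) {
          if (rank == 0) {
              MPI_Send(NULL, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(NULL, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(NULL, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(NULL, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      t1 = MPI_Wtime();

      /* One-way latency: half the round trip, averaged over all iterations. */
      if (rank == 0)
          printf("0-byte latency: %.2f usec\n", (t1 - t0) * 1e6 / (2.0 * iters));

      MPI_Finalize();
      return 0;
  }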


VH1—Total runtime

[Figure: VH-1 wall clock time (200–250 sec) vs. log2 processor count (3.5–8.5) for Open MPI—CM, Open MPI—OB1, and Cray MPI.]


GTC—Total runtime

[Figure: GTC wall clock time (800–1150 sec) vs. log2 processor count (1–11) for Open MPI—CM, Open MPI—OB1, and Cray MPI.]


POP—Step runtime

[Figure: POP time-step wall clock time (128–2048 sec, log scale) vs. log2 processor count (3–11) for Open MPI—CM, Open MPI—OB1, and Cray MPI.]


Summary and future directions

• Support for the XT (Catamount and Compute Node Linux) within the standard distribution
• Performance (application and micro-benchmarks) comparable to that of Cray MPI
• Support for recovery from process failure is being added


Contact

Richard L. Graham

Tech Integration
National Center for Computational Sciences
(865) [email protected]

www.open-mpi.org
