
Page 1: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

HPC Middleware (HPC-MW): Infrastructure for Scientific Applications on HPC Environments

Overview and Recent Progress

Kengo Nakajima, RIST.

3rd ACES WG Meeting, June 5th & 6th, 2003. Brisbane, QLD, Australia

Page 2: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

2

3rd ACES WG mtg. June 2003.

My Talk

HPC-MW (35 min.) Hashimoto/Matsu’ura Code on ES (5 min.)

Page 3: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

3

3rd ACES WG mtg. June 2003.

Table of Contents
- Background
- Basic Strategy
- Development
- On-Going Works/Collaborations
- XML-based System for User-Defined Data Structure
- Future Works
- Examples

Page 4: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

4

3rd ACES WG mtg. June 2003.

Frontier Simulation Software for Industrial Science (FSIS)
http://www.fsis.iis.u-tokyo.ac.jp

Part of "IT Project" in MEXT (Minitstry of Education, Culture, Sports, Science & Technology) HQ at Institute of Industrial Science, Univ. Tokyo 5 years, 12M USD/yr. (decreasing ...) 7 internal projects, > 100 people involved.

Focused on Industry/Public Use Commercialization

Page 5: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

5

3rd ACES WG mtg. June 2003.

Frontier Simulation Software for Industrial Science (FSIS) (cont.)
http://www.fsis.iis.u-tokyo.ac.jp

- Quantum Chemistry
- Quantum Molecular Interaction System
- Nano-Scale Device Simulation
- Fluid Dynamics Simulation
- Structural Analysis
- Problem Solving Environment (PSE)
- High-Performance Computing Middleware (HPC-MW)

Page 6: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

6

3rd ACES WG mtg. June 2003.

Background: Various Types of HPC Platforms
- Parallel computers: PC clusters, MPPs with distributed memory, SMP clusters (8-way, 16-way, 256-way): ASCI, Earth Simulator
- Processors: Power, HP-RISC, Alpha/Itanium, Pentium, vector PEs
- GRID environment -> various resources
- Parallel/single-PE optimization is important !!
- Portability under a GRID environment
- Machine-dependent optimization/tuning: everyone knows that ... but it's a big task, especially for application experts and scientists.

Page 7: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

7

3rd ACES WG mtg. June 2003.


Page 8: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

8

3rd ACES WG mtg. June 2003.

Reordering for SMP Cluster with Vector PEs: ILU Factorization
[Figure: PDJDS/CM-RCM and PDCRS/CM-RCM (short innermost loop) vs. CRS with no re-ordering.]

Page 9: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

9

3rd ACES WG mtg. June 2003.

3D Elastic Simulation: Problem Size vs. GFLOPS, Earth Simulator / SMP node (8 PEs)
●: PDJDS/CM-RCM, ■: PDCRS/CM-RCM, ▲: Natural Ordering
[Figure: GFLOPS (1.E-02 to 1.E+02, log scale) vs. DOF (1.E+04 to 1.E+07).]

Page 10: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

10

3rd ACES WG mtg. June 2003.

3D Elastic Simulation: Problem Size vs. GFLOPS, Intel Xeon 2.8 GHz, 8 PEs
●: PDJDS/CM-RCM, ■: PDCRS/CM-RCM, ▲: Natural Ordering
[Figure: GFLOPS (1.50 to 3.00) vs. DOF (1.E+04 to 1.E+07).]

Page 11: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

11

3rd ACES WG mtg. June 2003.

Parallel Volume Rendering: MHD Simulation of the Outer Core

Page 12: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

12

3rd ACES WG mtg. June 2003.

Volume Rendering Module using Voxels
- On PC Cluster: hierarchical background voxels, linked list
- On Earth Simulator: globally fine voxels, static array

Page 13: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

13

3rd ACES WG mtg. June 2003.

Background (cont.): Simulation methods such as FEM, FDM, etc. have several typical processes for computation.

Page 14: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

14

3rd ACES WG mtg. June 2003.

"Parallel" FEM Procedure

Initial Grid DataInitial Grid Data

PartitioningPartitioning

Post ProcPost Proc..Data Input/OutputData Input/Output

Domain Specific Domain Specific Algorithms/ModelsAlgorithms/Models

Matrix AssembleMatrix Assemble

Linear SolversLinear Solvers

VisualizationVisualization

Pre-ProcessingPre-Processing MainMain Post-ProcessingPost-Processing

Page 15: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

15

3rd ACES WG mtg. June 2003.

Background (cont.)
Simulation methods such as FEM, FDM, etc. have several typical processes for computation. How about "hiding" these processes from users with middleware that sits between applications and compilers?
Development becomes efficient, reliable, portable, and easy to maintain, which accelerates advancement of the applications (= physics).
HPC-MW = Middleware close to the "Application Layer"
[Diagram: Applications sitting directly on compilers, MPI etc. leave a big gap; HPC-MW inserts a middleware layer between applications and compilers, MPI etc.]

Page 16: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

16

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Simulation Methods Include Some Typical Processes
FEM: I/O, Matrix Assemble, Linear Solver, Visualization

Page 17: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

17

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Individual Processes can be Optimized for Various Types of MPP Architectures
FEM components (I/O, Matrix Assemble, Linear Solver, Visualization) optimized for MPP-A, MPP-B, MPP-C

Page 18: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

18

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Library-Type HPC-MW for Existing HW
FEM code developed on PC

Page 19: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

19

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Library-Type HPC-MW for Existing HW
HPC-MW for Xeon Cluster, HPC-MW for Earth Simulator, HPC-MW for Hitachi SR8000: each provides Vis., Linear Solver, Matrix Assemble, and I/O.
FEM code developed on PC

Page 20: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

20

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Library-Type HPC-MW for Existing HW
HPC-MW for Xeon Cluster, HPC-MW for Earth Simulator, HPC-MW for Hitachi SR8000: each provides Vis., Linear Solver, Matrix Assemble, and I/O.
FEM code developed on PC, with I/F for Vis., I/F for Solvers, I/F for Mat. Ass., and I/F for I/O

Page 21: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

21

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Parallel FEM Code Optimized for ES
The FEM code developed on the PC (with I/F for Vis., Solvers, Mat. Ass., and I/O) is linked with the HPC-MW for the Earth Simulator; HPC-MW for the Xeon Cluster and Hitachi SR8000 provide the same components.

Page 22: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

22

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Parallel FEM Code Optimized for Intel Xeon
The same FEM code, unchanged, is linked with the HPC-MW for the Xeon Cluster instead.

Page 23: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

23

3rd ACES WG mtg. June 2003.

Example of HPC Middleware: Parallel FEM Code Optimized for SR8000
The same FEM code, unchanged, is linked with the HPC-MW for the Hitachi SR8000 instead.

Page 24: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

24

3rd ACES WG mtg. June 2003.

Background Basic Strategy Development On-Going Works/Collaborations XML-based System for User-Defined Data Structure Future Works Examples

Page 25: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

25

3rd ACES WG mtg. June 2003.

HPC Middleware (HPC-MW)? Based on the idea of "Plug-in" in GeoFEM.

Page 26: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

26

3rd ACES WG mtg. June 2003.

System Config. of GeoFEM
[Diagram: a one-domain mesh is split by the Partitioner into partitioned meshes for the PEs. The GeoFEM platform provides the Solver I/F, Comm. I/F, and Vis. I/F, with equation solvers, a visualizer, and parallel I/O; visualization data can be viewed with GPPView. Pluggable analysis modules sit on top: structural analysis (static linear, dynamic linear, contact), structure, fluid, and wave.]
http://geofem.tokyo.rist.or.jp/

Page 27: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

27

3rd ACES WG mtg. June 2003.

Local Data Structure: Node-based Partitioning (internal nodes, elements, external nodes)
[Figure: a 5x5-node example mesh (nodes 1-25) partitioned into four domains, PE#0-PE#3. Each PE keeps its internal nodes, the elements that include them, and the remaining nodes of those elements as external nodes, all renumbered locally on that PE.]
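As a concrete illustration of this partitioning rule, here is a small, self-contained Fortran sketch for the 5x5 example above. The PE assignment (a 3x3 block of internal nodes for one PE) and all array names are illustrative, not HPC-MW's actual data structure: the sketch only shows that a PE keeps its internal nodes, every element containing at least one of them, and the remaining nodes of those elements as external nodes.

program node_based_partitioning_sketch
  implicit none
  integer, parameter :: NX= 5, NY= 5
  integer :: elem(4,(NX-1)*(NY-1))
  logical :: internal(NX*NY), used(NX*NY)
  integer :: i, j, icel, k, n

  icel= 0                                   ! build 4-node quad connectivity
  do j= 1, NY-1
    do i= 1, NX-1
      icel= icel + 1
      n= (j-1)*NX + i
      elem(:,icel)= (/ n, n+1, n+NX+1, n+NX /)
    enddo
  enddo

  internal= .false.                         ! internal nodes of this PE: a 3x3 block (illustrative)
  do j= 1, 3
    do i= 1, 3
      internal((j-1)*NX + i)= .true.
    enddo
  enddo

  used= .false.                             ! mark all nodes of elements touching the PE
  do icel= 1, (NX-1)*(NY-1)
    if (any(internal(elem(:,icel)))) then
      do k= 1, 4
        used(elem(k,icel))= .true.
      enddo
    endif
  enddo

  write (*,*) 'internal nodes:', count(internal)
  write (*,*) 'external nodes:', count(used .and. .not.internal)
end program node_based_partitioning_sketch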

Page 28: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

28

3rd ACES WG mtg. June 2003.

What can we do with HPC-MW? We can easily develop optimized, parallel code on HPC-MW from a user's code developed on a PC.
- Library-Type: the most fundamental approach; an optimized library for each individual architecture
- Compiler-Type: next-generation architectures, irregular data
- Network-Type: GRID environments (heterogeneous), large-scale computing (virtual petaflops), coupling

Page 29: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

29

3rd ACES WG mtg. June 2003.

HPC-MW Procedure
[Diagram: the initial entire mesh data and a control file (mesh generation, partitioning) go through the utility software for parallel mesh generation & partitioning to produce distributed local mesh data.
The user's FEM source code is combined with library-type HPC-MW, or with compiler-type / network-type HPC-MW source code (linear solver source code etc.) generated by the corresponding generators from HW data and network HW data, and is compiled and linked (MPICH, MPICH-G) into a parallel FEM executable.
The parallel FEM executable reads the FEM control data and the distributed local mesh data and produces distributed local results, plus patch and image files for visualization.]

Page 30: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

30

3rd ACES WG mtg. June 2003.

Library-Type HPC-MW: Parallel FEM Code Optimized for ES
The FEM code developed on the PC (with I/F for Vis., Solvers, Mat. Ass., and I/O) is linked with the HPC-MW for the Earth Simulator; the HPC-MW libraries for the Xeon Cluster and Hitachi SR8000 provide the same components (Vis., Linear Solver, Matrix Assemble, I/O).

Page 31: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

31

3rd ACES WG mtg. June 2003.

What can we do with HPC-MW? We can easily develop optimized, parallel code on HPC-MW from a user's code developed on a PC.
- Library-Type: the most fundamental approach; an optimized library for each individual architecture
- Compiler-Type: next-generation architectures, irregular data
- Network-Type: GRID environments, large-scale computing (virtual petaflops), coupling

Page 32: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

32

3rd ACES WG mtg. June 2003.

Compiler-Type HPC-MW
[Diagram: the FEM components (I/O, Matrix Assemble, Linear Solver, Visualization), data for the analysis model, and parameters of the H/W are fed to a special compiler, which produces components optimized for MPP-A, MPP-B, MPP-C.]
Optimized code is generated by a special language/compiler based on analysis data (cache blocking etc.) and H/W information, as sketched below.
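For illustration only, the following self-contained Fortran sketch shows the kind of loop transformation such a generator could apply, here cache blocking of a dense transpose; the block size and array names are made up for the example and are not part of HPC-MW.

program cache_blocking_sketch
  implicit none
  integer, parameter :: N= 512, NB= 64
  real(8) :: A(N,N), B(N,N)
  integer :: i, j, ii, jj

  call random_number (B)

  do jj= 1, N, NB                           ! work on NB x NB tiles that fit in cache
    do ii= 1, N, NB
      do j= jj, min(jj+NB-1, N)
        do i= ii, min(ii+NB-1, N)
          A(i,j)= B(j,i)                    ! blocked transpose
        enddo
      enddo
    enddo
  enddo

  write (*,*) 'checksum:', sum(A)
end program cache_blocking_sketch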

Page 33: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

33

3rd ACES WG mtg. June 2003.

What can we do with HPC-MW? We can easily develop optimized, parallel code on HPC-MW from a user's code developed on a PC.
- Library-Type: the most fundamental approach; an optimized library for each individual architecture
- Compiler-Type: next-generation architectures, irregular data
- Network-Type: GRID environments (heterogeneous), large-scale computing (virtual petaflops), coupling

Page 34: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

34

3rd ACES WG mtg. June 2003.

Network-Type HPC-MW: Heterogeneous Environment, "Virtual" Supercomputer
[Diagram: a single FEM analysis model space distributed across multiple heterogeneous resources.]

Page 35: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

35

3rd ACES WG mtg. June 2003.

What is new, what is nice?
- Application oriented (limited to FEM at this stage): various types of capabilities for parallel FEM are supported; NOT just a library
- Optimized for individual hardware: single-PE performance and parallel performance
- Similar projects: Cactus, GridLab (on Cactus)

Page 36: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

36

3rd ACES WG mtg. June 2003.

Background Basic Strategy Development On-Going Works/Collaborations XML-based System for User-Defined Data Structure Future Works Examples

Page 37: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

37

3rd ACES WG mtg. June 2003.

Schedule (FY.2002 - FY.2006)
[Chart: basic design and prototype first; then library-type HPC-MW (scalar, then vector), followed by compiler-type and network-type HPC-MW, together with FEM codes on HPC-MW; public release milestones along the way.]

Page 38: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

38

3rd ACES WG mtg. June 2003.

System for Development
[Diagram: HPC-MW (library-type, compiler-type, network-type; components: I/O, Vis., Solvers, Coupler, AMR, DLB, Mat. Ass.) serves as infrastructure for the FEM codes on HPC-MW (solid, fluid, thermal), which feed application info back. HW vendors supply HW info. Both HPC-MW and the FEM codes are publicly released, and public users return feedback and comments.]

Page 39: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

39

3rd ACES WG mtg. June 2003.

FY. 2003

- Library-Type HPC-MW: FORTRAN90, C, PC-cluster version, public release
  - Sept. 2003: prototype released
  - March 2004: full version for PC clusters
- Demonstration of Network-Type HPC-MW at SC2003, Phoenix, AZ, Nov. 2003
- Evaluation by FEM code

Page 40: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

40

3rd ACES WG mtg. June 2003.

Library-Type HPC-MW (mesh generation is not considered)
- Parallel I/O: I/F for commercial codes (NASTRAN etc.)
- Adaptive Mesh Refinement (AMR)
- Dynamic Load-Balancing using pMETIS (DLB)
- Parallel Visualization
- Linear Solvers (GeoFEM + AMG, SAI)
- FEM Operations (Connectivity, Matrix Assembling)
- Coupling I/F
- Utility for Mesh Partitioning
- On-line Tutorial

Page 41: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

41

3rd ACES WG mtg. June 2003.

AMR+DLB

Page 42: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

42

3rd ACES WG mtg. June 2003.

Parallel Visualization
- Scalar field: surface rendering, interval volume-fitting, volume rendering, topological map
- Vector field: streamlines, particle tracking, LIC, volume rendering
- Tensor field: hyperstreamlines
(Extension of functions, dimensions, and data types.)

Page 43: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

43

3rd ACES WG mtg. June 2003.

PMR (Parallel Mesh Relocator)
- Data size is potentially very large in parallel computing; handling the entire mesh is impossible.
- Parallel mesh generation and visualization are difficult due to the requirement for global information.
- Adaptive Mesh Refinement (AMR) and grid hierarchy.

Page 44: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

44

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
Initial Mesh
Prepare an initial mesh whose size is as large as a single PE can handle.

Page 45: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

45

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: the initial mesh is partitioned into local data for each PE.]
Partition the initial mesh into local data (potentially very coarse).

Page 46: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

46

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: the initial mesh is partitioned into local data for each PE; PMR follows.]
Parallel Mesh Relocation (PMR) by local refinement on each PE.

Page 47: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

47

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: the partitioned local data are refined on each PE.]
Parallel Mesh Relocation (PMR) by local refinement on each PE.

Page 48: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

48

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: the partitioned local data are refined repeatedly on each PE.]
Parallel Mesh Relocation (PMR) by local refinement on each PE.

Page 49: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

49

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: the partitioned local data are refined repeatedly on each PE.]
The hierarchical refinement history can be utilized for visualization & multigrid.

Page 50: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

50

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
[Diagram: results are mapped back from the local fine meshes to the initial coarse mesh.]
The hierarchical refinement history can be utilized for visualization & multigrid.
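A minimal 1D sketch of this reverse mapping idea (the uniform-refinement assumption and all names are illustrative, not HPC-MW's implementation): after one refinement that doubles the elements, coarse node i coincides with fine node 2*i-1, so coarse-mesh results are obtained by direct injection from the fine mesh.

program reverse_map_sketch
  implicit none
  integer, parameter :: NC= 5               ! coarse nodes (1D example)
  integer, parameter :: NF= 2*NC - 1        ! fine nodes after one uniform refinement
  real(8) :: val_fine(NF), val_coarse(NC)
  integer :: i

  do i= 1, NF                               ! stand-in for results on the fine mesh
    val_fine(i)= dble(i)
  enddo

  do i= 1, NC                               ! reverse mapping: coarse node i = fine node 2*i-1
    val_coarse(i)= val_fine(2*i-1)
  enddo

  write (*,'(5f6.1)') val_coarse
end program reverse_map_sketch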

Page 51: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

51

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation using AMR
Results are mapped back from the local fine meshes to the initial coarse mesh.
Visualization is possible on the initial coarse mesh using a single PE; various existing software packages can be used.

Page 52: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

52

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation and Visualization using AMR: Summary
Initial Coarse Mesh (Single)

Page 53: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

53

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation and Visualization using AMR: Summary
Initial Coarse Mesh (Single) -> [Partition + PMR] -> Local Fine Mesh (Distributed)

Page 54: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

54

3rd ACES WG mtg. June 2003.

Parallel Mesh Generation and Visualization using AMR: Summary
Initial Coarse Mesh (Single) -> [Partition + PMR] -> Local Fine Mesh (Distributed) -> [Reverse Mapping] -> Initial Coarse Mesh / Second-to-the-Coarsest

Page 55: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

55

3rd ACES WG mtg. June 2003.

Example (1/3): Initial Entire Mesh
nodes: 1,308; elements: 795

Page 56: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

56

3rd ACES WG mtg. June 2003.

Example (2/3): Initial Partitioning

PE   nodes  elements
0    224    100
1    184    99
2    188    100
3    207    100
4    203    99
5    222    99
6    202    99
7    194    99

Page 57: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

57

3rd ACES WG mtg. June 2003.

Example (3/3): Refinement

Page 58: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

58

3rd ACES WG mtg. June 2003.

Parallel Linear Solvers
[Figure: GFLOPS rate (0 to 4000) and parallel work ratio (40 to 100%) vs. number of nodes (0 to 192); ● Flat MPI, ○ Hybrid.]
On the Earth Simulator: 176 nodes, 3.8 TFLOPS

Page 59: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

59

3rd ACES WG mtg. June 2003.

Coupler
[Diagram 1: Fluid and Structure connected through a Coupler; the Coupler is the MAIN program (MpCCI).]
[Diagram 2: Fluid is the MAIN program; Structure is called from Fluid as a subroutine through the Coupler.]

Page 60: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

60

3rd ACES WG mtg. June 2003.

Coupler
[Diagram: Fluid and Structure connected through a Coupler.]
Fluid is the MAIN program; Structure is called from Fluid as a subroutine through the Coupler.

module hpcmw_mesh
  type hpcmw_local_mesh
    ...
  end type hpcmw_local_mesh
end module hpcmw_mesh

FLUID:
program fluid
  use hpcmw_mesh
  type (hpcmw_local_mesh) :: local_mesh_b
  ...
  call hpcmw_couple_PtoS_put
  call structure_main
  call hpcmw_couple_StoP_get
  ...
end program fluid

STRUCTURE:
subroutine structure_main
  use hpcmw_mesh
  type (hpcmw_local_mesh) :: local_mesh_b
  call hpcmw_couple_PtoS_get
  ...
  call hpcmw_couple_StoP_put
end subroutine structure_main
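The following toy, self-contained Fortran program illustrates the control flow sketched above; the put/get semantics (a coupler that simply holds interface buffers) and all names are assumptions made for the illustration, not HPC-MW's hpcmw_couple_* implementation.

module toy_coupler
  implicit none
  real(8) :: buf_PtoS(3), buf_StoP(3)       ! interface buffers held by the coupler
contains
  subroutine couple_PtoS_put (v)
    real(8), intent(in)  :: v(3)
    buf_PtoS= v
  end subroutine couple_PtoS_put
  subroutine couple_PtoS_get (v)
    real(8), intent(out) :: v(3)
    v= buf_PtoS
  end subroutine couple_PtoS_get
  subroutine couple_StoP_put (v)
    real(8), intent(in)  :: v(3)
    buf_StoP= v
  end subroutine couple_StoP_put
  subroutine couple_StoP_get (v)
    real(8), intent(out) :: v(3)
    v= buf_StoP
  end subroutine couple_StoP_get
end module toy_coupler

program fluid_toy
  use toy_coupler
  implicit none
  real(8) :: press(3), displ(3)
  press= (/1.d0, 2.d0, 3.d0/)
  call couple_PtoS_put (press)              ! Fluid -> coupler
  call structure_main_toy                   ! Structure runs as a subroutine
  call couple_StoP_get (displ)              ! coupler -> Fluid
  write (*,*) 'displacements seen by Fluid:', displ
end program fluid_toy

subroutine structure_main_toy
  use toy_coupler
  implicit none
  real(8) :: load(3), displ(3)
  call couple_PtoS_get (load)               ! coupler -> Structure
  displ= 0.1d0*load                         ! stand-in for a structural solve
  call couple_StoP_put (displ)              ! Structure -> coupler
end subroutine structure_main_toy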

Page 61: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

61

3rd ACES WG mtg. June 2003.

Coupler (cont.)
Fluid is the MAIN program; Structure is called from Fluid as a subroutine through the Coupler. Fluid and Structure read different mesh files.

Page 62: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

62

3rd ACES WG mtg. June 2003.

Coupler (cont.)
Fluid is the MAIN program; Structure is called from Fluid as a subroutine through the Coupler. Communication takes place inside the hpcmw_couple_PtoS / hpcmw_couple_StoP calls on both the Fluid and the Structure sides.

Page 63: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

63

3rd ACES WG mtg. June 2003.

FEM Codes on HPC-MW (primary target: evaluation of HPC-MW itself !)
- Solid mechanics: elastic, inelastic; static, dynamic; various types of elements and boundary conditions
- Eigenvalue analysis
- Compressible/incompressible CFD
- Heat transfer with radiation & phase change

Page 64: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

64

3rd ACES WG mtg. June 2003.

Release in Late September 2003: Library-Type HPC-MW
- Parallel I/O: original data structure, GeoFEM, ABAQUS
- Parallel visualization: PVR, PSR
- Parallel linear solvers: preconditioned iterative solvers (ILU, SAI)
- Utility for mesh partitioning: serial partitioner, viewer
- On-line tutorial
- FEM code for linear-elastic simulation (prototype)

Page 65: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

65

3rd ACES WG mtg. June 2003.

Technical Issues
- Common data structure: flexibility vs. efficiency. Our data structure is efficient ... how do we keep the user's original data structure?
- Interface to other toolkits: PETSc (ANL), Aztec/Trilinos (Sandia), ACTS Toolkit (LBNL/DOE), DRAMA (NEC Europe), Zoltan (Sandia)

Page 66: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

66

3rd ACES WG mtg. June 2003.

Background Basic Strategy Development On-Going Works/Collaborations XML-based System for User-Defined Data Structure Future Works Examples

Page 67: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

67

3rd ACES WG mtg. June 2003.

Public Use/Commercialization

Very important issues in this project.

Industry Education Research

Commercial Codes

Page 68: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

68

3rd ACES WG mtg. June 2003.

Strategy for Public Use
- General-purpose parallel FEM code
- Environment for development (1): for legacy codes
  - "Parallelization": F77 -> F90, COMMON -> module
  - Parallel data structure, linear solvers, visualization
- Environment for development (2): from scratch
- Education
- Various types of collaboration

Page 69: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

69

3rd ACES WG mtg. June 2003.

On-going Collaboration
Parallelization of legacy codes:
- CFD Grp. in the FSIS project
- Mitsubishi Material: groundwater flow
- JNC (Japan Nuclear Cycle Development Inst.): HLW
- others: research, education
Part of HPC-MW:
- Coupling interface for pump simulation
- Parallel visualization: Takashi Furumura (ERI/U.Tokyo)

Page 70: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

70

3rd ACES WG mtg. June 2003.

On-going Collaboration
Environment for development:
- ACcESS (Australian Computational Earth Systems Simulator) group
Research collaboration:
- ITBL/JAERI
- DOE ACTS Toolkit (Lawrence Berkeley National Laboratory)
- NEC Europe (dynamic load balancing)
- ACES/iSERVO GRID

Page 71: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

71

3rd ACES WG mtg. June 2003.

(Fluid+Vibration) Simulation for a Boiler Pump (Hitachi)
- Suppression of noise
- Collaboration in the FSIS project: Fluid, Structure, PSE; HPC-MW provides the coupling interface
- Experiment, measurement: accelerometers on the surface

Page 72: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

72

3rd ACES WG mtg. June 2003.

(Fluid+Vibration) Simulation for a Boiler Pump (Hitachi)
- Suppression of noise
- Collaboration in the FSIS project: Fluid, Structure, PSE; HPC-MW provides the coupling interface
- Experiment, measurement: accelerometers on the surface
[Diagram: Fluid and Structure connected through a Coupler.]

Page 73: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

73

3rd ACES WG mtg. June 2003.

Parallelization of Legacy Codes
Many cases !! Under investigation through real collaborations.
- Optimum procedure: documents, I/F, work assignment
- FEM is suitable for this type of procedure
Works:
- introduce the new matrix storage scheme for parallel iterative solvers in HPC-MW
- add subroutine calls for using the parallel visualization functions in HPC-MW
- changing the data structure is a big issue !! (flexibility vs. efficiency; problem-specific vs. general)

Page 74: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

74

3rd ACES WG mtg. June 2003.

Element Connectivity
In HPC-MW:

do icel= 1, ICELTOT
  iS= elem_index(icel-1)
  in1= elem_ptr(iS+1)
  in2= elem_ptr(iS+2)
  in3= elem_ptr(iS+3)
enddo

Sometimes (in user codes):

do icel= 1, ICELTOT
  in1= elem_node(1,icel)
  in2= elem_node(2,icel)
  in3= elem_node(3,icel)
enddo
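For legacy codes of the second kind, the fixed-width connectivity array has to be converted into the index/pointer form used above. A minimal, self-contained Fortran sketch of that conversion (assuming 3-node elements; the sample connectivity is made up for the example):

program build_connectivity_sketch
  implicit none
  integer, parameter :: ICELTOT= 2, NNODE= 3
  integer :: elem_node(NNODE,ICELTOT)
  integer :: elem_index(0:ICELTOT)
  integer, allocatable :: elem_ptr(:)
  integer :: icel, k, iS

  elem_node(:,1)= (/1, 2, 3/)               ! fixed-width user connectivity
  elem_node(:,2)= (/2, 4, 3/)

  elem_index(0)= 0                          ! offsets into the packed node list
  do icel= 1, ICELTOT
    elem_index(icel)= elem_index(icel-1) + NNODE
  enddo

  allocate (elem_ptr(elem_index(ICELTOT)))
  do icel= 1, ICELTOT                       ! pack node ids element by element
    iS= elem_index(icel-1)
    do k= 1, NNODE
      elem_ptr(iS+k)= elem_node(k,icel)
    enddo
  enddo

  write (*,*) elem_ptr
end program build_connectivity_sketch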

Page 75: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

75

3rd ACES WG mtg. June 2003.

Parallelization of Legacy Codes
[Diagram: the original serial FEM code consists of Input, Mat. Conn., Mat. Assem., Linear Solver, Visualization, and Output. One path replaces selected components with their HPC-MW optimized counterparts and adds Comm.; which components to replace is marked with "?". Another path keeps all original components and only adds Comm. to obtain a parallel code: not optimum, but easy to do.]

Page 76: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

76

3rd ACES WG mtg. June 2003.

Works with the CFD Grp. in FSIS
- Original CFD code: 3D finite-volume, serial, Fortran90
- Strategy:
  - use the Poisson solver in HPC-MW
  - keep the ORIGINAL data structure: we (HPC-MW) developed a new partitioner
  - the CFD people do matrix assembling using HPC-MW's format: the matrix assembling part is intrinsically parallel
- Schedule:
  - April 3, 2003: 1st meeting, overview
  - April 24, 2003: 2nd meeting, decision on strategy
  - May 2, 2003: showed the I/F for the Poisson solver
  - May 12, 2003: completed the new partitioner

Page 77: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

77

3rd ACES WG mtg. June 2003.

CFD Code: 2 types of comm.(1) Inter-Domain Communication

Page 78: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

78

3rd ACES WG mtg. June 2003.

CFD Code: 2 types of comm.(1) Inter-Domain Communication

Page 79: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

79

3rd ACES WG mtg. June 2003.

CFD Code: 2 types of comm.(2) Wall-Law Communication

Page 80: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

80

3rd ACES WG mtg. June 2003.

CFD Code: 2 types of comm.(2) Wall-Law Communication

Page 81: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

81

3rd ACES WG mtg. June 2003.

(Common) Data Structure
Users like to keep their original data structure, but they want to parallelize the code. The compromise at this stage:
- keep the original data structure
- we (I, more precisely) develop partitioners for individual users (one day of work once I understand the data structure).

Page 82: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

82

3rd ACES WG mtg. June 2003.

Domain Partitioning Utility for the User's Original Data Structure (very important for parallelization of legacy codes)
- Functions: input is the initial entire mesh in the ORIGINAL format; output is distributed local meshes in the ORIGINAL format plus communication information (separate file)
- Merit: original I/O routines can be utilized; operations for communication are hidden
- Technical issues: basically individual support (development) ... BIG WORK; various data structures in individual user codes; problem-specific information

Page 83: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

83

3rd ACES WG mtg. June 2003.

Mesh Partitioning for the Original Data Structure
[Diagram: the initial entire data (cntl., mesh) goes through the Partitioner and becomes local distributed data for each PE: a common cntl. file, a distributed mesh, and Comm. Info.]
- Original I/O is used for the local distributed data.
- The I/F for the Comm. Info. is provided by HPC-MW.

Page 84: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

84

3rd ACES WG mtg. June 2003.

Distributed Mesh + Comm. Info.
[Diagram: the local distributed data (common cntl., distributed mesh) are read by the original input subroutines; the Comm. Info. is read by HPCMW_COMM_INIT.]
This part is hidden except for "CALL HPCMW_COMM_INIT".

Page 85: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

85

3rd ACES WG mtg. June 2003.

Background Basic Strategy Development On-Going Works/Collaborations XML-based System for User-Defined Data Structure Future Works Examples

Page 86: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

86

3rd ACES WG mtg. June 2003.

XML-based I/O Func. Generator (K. Sakane, RIST; 8th JSCES Conf., 2003)
- The user's original data structure can be described by XML-based definition information.
- I/O subroutines (C, F90) for the partitioning utilities are generated according to the XML-based definition information.
- The existing I/O subroutines of the partitioning utilities in HPC-MW are substituted with the generated code.

Page 87: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

87

3rd ACES WG mtg. June 2003.

XML-based I/O Func. Generator (K. Sakane, RIST; 8th JSCES Conf., 2003)
The partitioning utility reads the initial entire mesh in HPC-MW format and writes distributed local mesh files in HPC-MW format together with communication information.
[Diagram: HPC-MW initial mesh -> Utility for Partitioning (I/O for mesh data) -> HPC-MW local mesh + comm for each PE.]
This is ideal for us ... but not all users are necessarily happy with that.

Page 88: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

88

3rd ACES WG mtg. June 2003.

XML-based I/O Func. Generator (K. Sakane, RIST; 8th JSCES Conf., 2003)
I/O subroutines (C, F90) for the partitioning utilities are generated according to the XML-based definition information.
[Diagram: definition data in XML -> XML-based system (I/O function generator for original data structures) -> I/O for mesh data in the user-defined data structure, plugged into the Utility for Partitioning.]

Page 89: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

89

3rd ACES WG mtg. June 2003.

XML-based I/O Func. Generator (K. Sakane, RIST; 8th JSCES Conf., 2003)
The existing I/O subroutines of the partitioning utilities in HPC-MW are substituted with the generated code.
[Diagram: as above, the generated I/O for the user-defined data structure replaces the HPC-MW I/O inside the Utility for Partitioning.]

Page 90: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

90

3rd ACES WG mtg. June 2003.

XML-based I/O Func. Generator (K. Sakane, RIST; 8th JSCES Conf., 2003)
The existing I/O subroutines of the partitioning utilities in HPC-MW are substituted with the generated code.
[Diagram: a user-defined initial mesh is read by the Utility for Partitioning through the generated I/O and written out as user-defined and HPC-MW local mesh files, together with communication information for each PE.]

Page 91: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

91

3rd ACES WG mtg. June 2003.

TAGS in the Definition Info. File
usermesh: starting point
parameter: parameter definition
define: sub-structure definition
token: smallest definition unit (number, label, etc.)
mesh: entire structure definition
ref: reference to a sub-structure

Page 92: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

92

3rd ACES WG mtg. June 2003.

Example (tetrahedron connectivity)

NODE 1 0. 0. 0.
NODE 2 1. 0. 0.
NODE 3 0. 1. 0.
NODE 4 0. 0. 1.
ELEMENT 1 TETRA 1 2 3 4

Page 93: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

93

3rd ACES WG mtg. June 2003.

Example (ABAQUS, NASTRAN)

** ABAQUS format
*NODE
1, 0., 0., 0.
2, 1., 0., 0.
3, 0., 1., 0.
4, 0., 0., 1.
*ELEMENT, TYPE=C3D4
1, 1, 2, 3, 4

$ NASTRAN format
GRID 1 0 0. 0. 0. 0
GRID 2 0 1. 0. 0. 0
GRID 3 0 0. 1. 0. 0
GRID 4 0 0. 0. 1. 0
CTETRA 1 1 1 2 3 4

Page 94: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

94

3rd ACES WG mtg. June 2003.

Ex.: Definition Info. File (1/2)

<?xml version="1.0" encoding="EUC-JP"?>
<usermesh>

<parameter name="TETRA">4</parameter>
<parameter name="PENTA">5</parameter>
<parameter name="HEXA">8</parameter>

<define name="mynode">
  <token means="node.start">NODE</token>
  <token means="node.id"/>
  <token means="node.x"/>
  <token means="node.y"/>
  <token means="node.z"/>
  <token means="node.end"/>
</define>

The parameters and the sub-structure above are what the user defines.

# User format:
NODE 1 0. 0. 0.
(node.start, node.id, node.x, node.y, node.z)

!! HPC-MW format:
!NODE
1, 0.0, 0.0, 0.0

Page 95: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

95

3rd ACES WG mtg. June 2003.

Ex.: Definition Info. File (2/2)

<define name="myelement">
  <token means="element.start">ELEMENT</token>
  <token means="element.id"/>
  <token means="element.type"/>
  <token means="element.node" times="$element.type"/>
  <token means="element.end"/>
</define>

<mesh>
  <ref name="mynode"/>
  <ref name="myelement"/>
</mesh>

</usermesh>

The sub-structure and the entire-structure definition above are what the user defines.

# User format:
ELEMENT 1 TETRA 1 2 3 4
(element.start, element.id, element.type, element.node ...; the number of nodes corresponds to the parameter referenced by "element.type")

!! HPC-MW format:
!ELEMENT, TYPE=311
1, 1, 2, 3, 4

Page 96: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

96

3rd ACES WG mtg. June 2003.

Background Basic Strategy Development On-Going Works/Collaborations XML-based System for User-Defined Data Structure Future Works Examples

Page 97: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

97

3rd ACES WG mtg. June 2003.

Further Study/Works
- Develop HPC-MW
- Collaboration: simplified I/F for non-experts; interaction is important !!; procedure for collaboration

Page 98: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

98

3rd ACES WG mtg. June 2003.

System for Development
[Diagram: HPC-MW (library-type, compiler-type, network-type; components: I/O, Vis., Solvers, Coupler, AMR, DLB, Mat. Ass.) serves as infrastructure for the FEM codes on HPC-MW (solid, fluid, thermal), which feed application info back. HW vendors supply HW info. Both HPC-MW and the FEM codes are publicly released, and public users return feedback and comments.]

Page 99: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

99

3rd ACES WG mtg. June 2003.

Further Study/Works
- Develop HPC-MW
- Collaboration: simplified I/F for non-experts; interaction is important !!; procedure for collaboration; extension to DEM etc.
- Promotion for public use: parallel FEM applications, parallelization of legacy codes, environment for development

Page 100: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

100

3rd ACES WG mtg. June 2003.

Current Remarks
If you want to parallelize your legacy code on a PC cluster ...
- Keep your own data structure: a customized partitioner will be provided (or automatically generated by the XML system).
- Rewrite your matrix assembling part.
- Introduce the linear solvers and visualization in HPC-MW.
[Diagram: the components Input, Mat. Conn., Mat. Assem., Linear Solver, Visualization, Output, and Comm., each marked as either the user's original or HPC-MW.]

Page 101: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

101

3rd ACES WG mtg. June 2003.

Current Remarks (cont.)
If you want to optimize your legacy code on the Earth Simulator ...
- Use HPC-MW's data structure: you have to rewrite your code for the ES anyway.
- Utilize all components of HPC-MW.
[email protected]
[Diagram: the components Input, Mat. Conn., Mat. Assem., Linear Solver, Visualization, Output, and Comm., each marked as either the user's original or HPC-MW.]

Page 102: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

102

3rd ACES WG mtg. June 2003.

Some Examples

Page 103: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

103

3rd ACES WG mtg. June 2003.

Simple Interface: Communication (GeoFEM's original I/F)

call SOLVER_SEND_RECV_3                                           &
 &  ( NP, NEIBPETOT, NEIBPE, STACK_IMPORT, NOD_IMPORT,            &
 &    STACK_EXPORT, NOD_EXPORT, WS, WR, WW(1,ZP), SOLVER_COMM,    &
 &    my_rank)

module solver_SR_3
contains
  subroutine SOLVER_SEND_RECV_3                                   &
 &    ( N, NEIBPETOT, NEIBPE, STACK_IMPORT, NOD_IMPORT,           &
 &      STACK_EXPORT, NOD_EXPORT,                                 &
 &      WS, WR, X, SOLVER_COMM, my_rank)
    implicit REAL*8 (A-H,O-Z)
    include 'mpif.h'
    include 'precision.inc'

    integer(kind=kint ), intent(in) :: N
    integer(kind=kint ), intent(in) :: NEIBPETOT
    integer(kind=kint ), pointer    :: NEIBPE (:)
    integer(kind=kint ), pointer    :: STACK_IMPORT(:)
    integer(kind=kint ), pointer    :: NOD_IMPORT (:)
    integer(kind=kint ), pointer    :: STACK_EXPORT(:)
    integer(kind=kint ), pointer    :: NOD_EXPORT (:)
    real (kind=kreal), dimension(3*N), intent(inout) :: WS
    real (kind=kreal), dimension(3*N), intent(inout) :: WR
    real (kind=kreal), dimension(3*N), intent(inout) :: X
    integer, intent(in) :: SOLVER_COMM
    integer, intent(in) :: my_rank
    ...
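As background, the communication pattern behind such a SEND/RECV routine is the usual gather / non-blocking exchange / scatter over export and import tables. The following self-contained two-PE Fortran sketch shows that pattern with plain MPI; it is illustrative only and is not the GeoFEM or HPC-MW implementation.

program halo_exchange_sketch
  implicit none
  include 'mpif.h'
  integer, parameter :: NIN= 4, NEX= 1           ! internal + external nodes per PE
  real(8) :: X(NIN+NEX), WS(NEX), WR(NEX)
  integer :: my_rank, nprocs, neib, ierr
  integer :: req(2), sta(MPI_STATUS_SIZE,2)

  call MPI_INIT (ierr)
  call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr)
  call MPI_COMM_SIZE (MPI_COMM_WORLD, nprocs, ierr)
  if (nprocs.ne.2) then
    if (my_rank.eq.0) write (*,*) 'run with 2 PEs'
    call MPI_FINALIZE (ierr)
    stop
  endif
  neib= 1 - my_rank                              ! the single neighboring PE

  X= dble(my_rank+1)                             ! values on the nodes owned by this PE
  WS(1)= X(NIN)                                  ! gather the export node into the send buffer

  call MPI_IRECV (WR, NEX, MPI_DOUBLE_PRECISION, neib, 0,              &
 &                MPI_COMM_WORLD, req(1), ierr)
  call MPI_ISEND (WS, NEX, MPI_DOUBLE_PRECISION, neib, 0,              &
 &                MPI_COMM_WORLD, req(2), ierr)
  call MPI_WAITALL (2, req, sta, ierr)

  X(NIN+1)= WR(1)                                ! scatter into the external (import) node
  write (*,*) 'PE', my_rank, 'received', X(NIN+1)
  call MPI_FINALIZE (ierr)
end program halo_exchange_sketch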

Page 104: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

104

3rd ACES WG mtg. June 2003.

Simple Interface: Communication (GeoFEM's original I/F, cont.)

Page 105: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

105

3rd ACES WG mtg. June 2003.

Simple Interface: Communication
With HPC-MW, the arrays in the GeoFEM-style argument list above are taken from the local mesh data structure:

use hpcmw_util
use dynamic_grid
use dynamic_cntl
type (hpcmw_local_mesh) :: local_mesh

N = local_mesh%n_internal
NP= local_mesh%n_node
ICELTOT= local_mesh%n_elem

NEIBPETOT= local_mesh%n_neighbor_pe
NEIBPE   => local_mesh%neighbor_pe

STACK_IMPORT=> local_mesh%import_index
NOD_IMPORT  => local_mesh%import_node
STACK_EXPORT=> local_mesh%export_index
NOD_EXPORT  => local_mesh%export_node

Page 106: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

106

3rd ACES WG mtg. June 2003.

Simple Interface: Communication

use hpcmw_util
type (hpcmw_local_mesh) :: local_mesh
...
call SOLVER_SEND_RECV_3 (local_mesh, WW(1,ZP))
...

module solver_SR_3
contains
  subroutine SOLVER_SEND_RECV_3 (local_mesh, X)
    use hpcmw_util
    type (hpcmw_local_mesh) :: local_mesh
    real(kind=kreal), dimension(:), allocatable, save :: WS, WR
    ...

Page 107: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

107

3rd ACES WG mtg. June 2003.

Preliminary Study in FY.2002
- FEM procedure for a 3D elastic problem: parallel I/O, iterative linear solvers (ICCG, ILU-BiCGSTAB), FEM procedures (matrix connectivity/assembling)
- Linear solvers: SMP cluster / distributed memory; vector/scalar processors; CM-RCM / MC reordering

Page 108: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

108

3rd ACES WG mtg. June 2003.

Method of Matrix Storage
- Scalar / distributed memory: CRS with natural ordering
- Scalar / SMP cluster: PDCRS/CM-RCM, PDCRS/MC
- Vector / distributed memory & SMP cluster: PDJDS/CM-RCM, PDJDS/MC
[Figure: PDJDS/CM-RCM and PDCRS/CM-RCM (short innermost loop) vs. CRS with no re-ordering.]
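For reference, a minimal self-contained sketch of a matrix-vector product in the baseline CRS storage listed above; the array names follow the index/item style used elsewhere in these slides, and the 3x3 matrix is made up for the example.

program crs_matvec_sketch
  implicit none
  integer, parameter :: N= 3, NNZ= 5
  integer :: index(0:N), item(NNZ), i, k
  real(8) :: A(NNZ), X(N), Y(N)

  ! 3x3 matrix [2 -1 0; -1 2 0; 0 0 1] stored in CRS form
  index= (/0, 2, 4, 5/)
  item = (/1, 2, 1, 2, 3/)
  A    = (/2.d0, -1.d0, -1.d0, 2.d0, 1.d0/)
  X    = (/1.d0, 1.d0, 1.d0/)

  do i= 1, N
    Y(i)= 0.d0
    do k= index(i-1)+1, index(i)
      Y(i)= Y(i) + A(k)*X(item(k))
    enddo
  enddo

  write (*,*) Y                             ! expected: 1.0, 1.0, 1.0
end program crs_matvec_sketch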

Page 109: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

109

3rd ACES WG mtg. June 2003.

Main Program (FEM program developed by users)

program SOLVER33_TEST

  use solver33
  use hpcmw_all

  implicit REAL*8(A-H,O-Z)

  call HPCMW_INIT
  call INPUT_CNTL
  call INPUT_GRID

  call MAT_CON0
  call MAT_CON1

  call MAT_ASS_MAIN (valA,valB,valX)
  call MAT_ASS_BC

  call SOLVE33 (hpcmwIarray, hpcmwRarray)
  call HPCMW_FINALIZE

end program SOLVER33_TEST

- call the same subroutines
- the interfaces are the same
- NO MPI !!
The procedure behind each subroutine differs in the individual libraries.

Page 110: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

110

3rd ACES WG mtg. June 2003.

HPCMW_INIT ?

for MPI:

      subroutine HPCMW_INIT
      use hpcmw_all
      implicit REAL*8 (A-H,O-Z)

      call MPI_INIT      (ierr)
      call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr)
      call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr)

      end subroutine HPCMW_INIT

for NO-MPI (SMP):

      subroutine HPCMW_INIT
      use hpcmw_all
      implicit REAL*8 (A-H,O-Z)
      ierr= 0
      end subroutine HPCMW_INIT
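
HPCMW_FINALIZE, called at the end of the main program above, presumably mirrors HPCMW_INIT; its body is not shown on the slides, so the following is only a plausible sketch of the two library variants.

for MPI (sketch):

!C--- Sketch only, not the actual HPC-MW source
      subroutine HPCMW_FINALIZE
      use hpcmw_all
      implicit REAL*8 (A-H,O-Z)
      call MPI_FINALIZE (ierr)
      end subroutine HPCMW_FINALIZE

for NO-MPI (SMP) (sketch):

!C--- Sketch only, not the actual HPC-MW source
      subroutine HPCMW_FINALIZE
      use hpcmw_all
      implicit REAL*8 (A-H,O-Z)
      ierr= 0
      end subroutine HPCMW_FINALIZE

The point of the pair is the same as for HPCMW_INIT: the user's program never touches MPI directly, so the same source links against either library.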

Page 111: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

111

3rd ACES WG mtg. June 2003.

Solve33 for SMP Cluster/Scalar

      module SOLVER33
      contains
      subroutine SOLVE33 (hpcmwIarray, hpcmwRarray)

      use hpcmw_solver_matrix
      use hpcmw_solver_cntl
      use hpcmw_fem_mesh

      use solver_CG_3_SMP_novec
      use solver_BiCGSTAB_3_SMP_novec

      implicit REAL*8 (A-H,O-Z)

      real(kind=kreal), dimension(3,3) :: ALU
      real(kind=kreal), dimension(3)   :: PW

      integer :: ERROR, ICFLAG
      character(len=char_length) :: BUF

      data ICFLAG/0/

      integer(kind=kint ), dimension(:) :: hpcmwIarray
      real   (kind=kreal), dimension(:) :: hpcmwRarray

!C
!C +------------+
!C | PARAMETERs |
!C +------------+
!C===
      ITER      = hpcmwIarray(1)
      METHOD    = hpcmwIarray(2)
      PRECOND   = hpcmwIarray(3)
      NSET      = hpcmwIarray(4)
      iterPREmax= hpcmwIarray(5)

      RESID     = hpcmwRarray(1)
      SIGMA_DIAG= hpcmwRarray(2)

      if (iterPREmax.lt.1) iterPREmax= 1
      if (iterPREmax.gt.4) iterPREmax= 4
!C===

!C
!C +-----------+
!C | BLOCK LUs |
!C +-----------+
!C===
      (skipped)
!C===

!C
!C +------------------+
!C | ITERATIVE solver |
!C +------------------+
!C===
      if (METHOD.eq.1) then
        call CG_3_SMP_novec                                             &
     &     ( N, NP, NPL, NPU, PEsmpTOT, NHYP, IVECT, STACKmc,           &
     &       NEWtoOLD, OLDtoNEW,                                        &
     &       D, AL, indexL, itemL, AU, indexU, itemU,                   &
     &       B, X, ALUG, RESID, ITER, ERROR,                            &
     &       my_rank, NEIBPETOT, NEIBPE,                                &
     &       NOD_STACK_IMPORT, NOD_IMPORT,                              &
     &       NOD_STACK_EXPORT, NOD_EXPORT,                              &
     &       SOLVER_COMM, PRECOND, iterPREmax)
      endif

      if (METHOD.eq.2) then
        call BiCGSTAB_3_SMP_novec                                       &
     &     ( N, NP, NPL, NPU, PEsmpTOT, NHYP, IVECT, STACKmc,           &
     &       NEWtoOLD, OLDtoNEW,                                        &
     &       D, AL, indexL, itemL, AU, indexU, itemU,                   &
     &       B, X, ALUG, RESID, ITER, ERROR,                            &
     &       my_rank, NEIBPETOT, NEIBPE,                                &
     &       NOD_STACK_IMPORT, NOD_IMPORT,                              &
     &       NOD_STACK_EXPORT, NOD_EXPORT,                              &
     &       SOLVER_COMM, PRECOND, iterPREmax)
      endif

      ITERactual= ITER
!C===

      end subroutine SOLVE33
      end module SOLVER33
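
From the parameter block above, a user drives SOLVE33 entirely through the two control arrays. An illustrative call follows; the numerical values and the PRECOND encoding are examples, and only the slot meanings are taken from the slide (the arrays may hold more than the five integer and two real slots read here).

!C--- inside the user's main program, after MAT_ASS_MAIN / MAT_ASS_BC
      integer(kind=kint ), dimension(5) :: hpcmwIarray
      real   (kind=kreal), dimension(2) :: hpcmwRarray

      hpcmwIarray(1)= 1000        ! ITER      : max. number of iterations
      hpcmwIarray(2)= 1           ! METHOD    : 1 = CG, 2 = BiCGSTAB
      hpcmwIarray(3)= 1           ! PRECOND   : preconditioner type (library-defined)
      hpcmwIarray(4)= 1           ! NSET
      hpcmwIarray(5)= 1           ! iterPREmax: clipped to 1-4 inside SOLVE33
      hpcmwRarray(1)= 1.d-8       ! RESID     : convergence criterion
      hpcmwRarray(2)= 1.d0        ! SIGMA_DIAG

      call SOLVE33 (hpcmwIarray, hpcmwRarray)

Because the interface is only these two arrays, the same calling code works with the scalar and the vector libraries.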

Page 112: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

112

3rd ACES WG mtg. June 2003.

Solve33 for SMP Cluster/Vector

      module SOLVER33
      contains
      subroutine SOLVE33 (hpcmwIarray, hpcmwRarray)

      use hpcmw_solver_matrix
      use hpcmw_solver_cntl
      use hpcmw_fem_mesh

      use solver_VCG33_DJDS_SMP
      use solver_VBiCGSTAB33_DJDS_SMP

      implicit REAL*8 (A-H,O-Z)

      real(kind=kreal), dimension(3,3) :: ALU
      real(kind=kreal), dimension(3)   :: PW

      integer :: ERROR, ICFLAG
      character(len=char_length) :: BUF

      data ICFLAG/0/

      integer(kind=kint ), dimension(:) :: hpcmwIarray
      real   (kind=kreal), dimension(:) :: hpcmwRarray

!C
!C +------------+
!C | PARAMETERs |
!C +------------+
!C===
      ITER      = hpcmwIarray(1)
      METHOD    = hpcmwIarray(2)
      PRECOND   = hpcmwIarray(3)
      NSET      = hpcmwIarray(4)
      iterPREmax= hpcmwIarray(5)

      RESID     = hpcmwRarray(1)
      SIGMA_DIAG= hpcmwRarray(2)

      if (iterPREmax.lt.1) iterPREmax= 1
      if (iterPREmax.gt.4) iterPREmax= 4
!C===

!C
!C +-----------+
!C | BLOCK LUs |
!C +-----------+
!C===
      (skipped)
!C===

!C
!C +------------------+
!C | ITERATIVE solver |
!C +------------------+
!C===
      if (METHOD.eq.1) then
        call VCG33_DJDS_SMP                                             &
     &     ( N, NP, NLmax, NUmax, NPL, NPU, NHYP, PEsmpTOT,             &
     &       STACKmcG, STACKmc, NLmaxHYP, NUmaxHYP, IVECT,              &
     &       NEWtoOLD, OLDtoNEW_L, OLDtoNEW_U, NEWtoOLD_U, LtoU,        &
     &       D, PAL, indexL, itemL, PAU, indexU, itemU, B, X,           &
     &       ALUG_L, ALUG_U, RESID, ITER, ERROR, my_rank,               &
     &       NEIBPETOT, NEIBPE, NOD_STACK_IMPORT, NOD_IMPORT,           &
     &       NOD_STACK_EXPORT, NOD_EXPORT,                              &
     &       SOLVER_COMM, PRECOND, iterPREmax)
      endif

      if (METHOD.eq.2) then
        call VBiCGSTAB33_DJDS_SMP                                       &
     &     ( N, NP, NLmax, NUmax, NPL, NPU, NHYP, PEsmpTOT,             &
     &       STACKmcG, STACKmc, NLmaxHYP, NUmaxHYP, IVECT,              &
     &       NEWtoOLD, OLDtoNEW_L, OLDtoNEW_U, NEWtoOLD_U, LtoU,        &
     &       D, PAL, indexL, itemL, PAU, indexU, itemU, B, X,           &
     &       ALUG_L, ALUG_U, RESID, ITER, ERROR, my_rank,               &
     &       NEIBPETOT, NEIBPE, NOD_STACK_IMPORT, NOD_IMPORT,           &
     &       NOD_STACK_EXPORT, NOD_EXPORT,                              &
     &       SOLVER_COMM, PRECOND, iterPREmax)
      endif

      ITERactual= ITER
!C===

      end subroutine SOLVE33
      end module SOLVER33

Page 113: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

113

3rd ACES WG mtg. June 2003.

Mat.Ass. for SMP Cluster/Scalar

      do ie= 1, 8
        ip= nodLOCAL(ie)
        do je= 1, 8
          jp= nodLOCAL(je)

          kk= 0
          if (jp.gt.ip) then
            iiS= indexU(ip-1) + 1
            iiE= indexU(ip  )
            do k= iiS, iiE
              if ( itemU(k).eq.jp ) then
                kk= k
                exit
              endif
            enddo
          endif

          if (jp.lt.ip) then
            iiS= indexL(ip-1) + 1
            iiE= indexL(ip  )
            do k= iiS, iiE
              if ( itemL(k).eq.jp ) then
                kk= k
                exit
              endif
            enddo
          endif

          PNXi= 0.d0
          PNYi= 0.d0
          PNZi= 0.d0
          PNXj= 0.d0
          PNYj= 0.d0
          PNZj= 0.d0

          VOL= 0.d0

          do kpn= 1, 2
          do jpn= 1, 2
          do ipn= 1, 2
            coef= dabs(DETJ(ipn,jpn,kpn))*WEI(ipn)*WEI(jpn)*WEI(kpn)

            VOL = VOL + coef
            PNXi= PNX(ipn,jpn,kpn,ie)
            PNYi= PNY(ipn,jpn,kpn,ie)
            PNZi= PNZ(ipn,jpn,kpn,ie)

            PNXj= PNX(ipn,jpn,kpn,je)
            PNYj= PNY(ipn,jpn,kpn,je)
            PNZj= PNZ(ipn,jpn,kpn,je)

            a11= valX*(PNXi*PNXj+valB*(PNYi*PNYj+PNZi*PNZj))*coef
            a22= valX*(PNYi*PNYj+valB*(PNZi*PNZj+PNXi*PNXj))*coef
            a33= valX*(PNZi*PNZj+valB*(PNXi*PNXj+PNYi*PNYj))*coef

            a12= (valA*PNXi*PNYj + valB*PNXj*PNYi)*coef
            a13= (valA*PNXi*PNZj + valB*PNXj*PNZi)*coef
            ...

            if (jp.gt.ip) then
              PAU(9*kk-8)= PAU(9*kk-8) + a11
              ...
              PAU(9*kk  )= PAU(9*kk  ) + a33
            endif

            if (jp.lt.ip) then
              PAL(9*kk-8)= PAL(9*kk-8) + a11
              ...
              PAL(9*kk  )= PAL(9*kk  ) + a33
            endif

            if (jp.eq.ip) then
              D(9*ip-8)= D(9*ip-8) + a11
              ...
              D(9*ip  )= D(9*ip  ) + a33
            endif
          enddo
          enddo
          enddo

        enddo
        endif
      enddo
      enddo

Page 114: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

114

3rd ACES WG mtg. June 2003.

Mat.Ass. for SMP Cluster/Vector

      do ie= 1, 8
        ip= nodLOCAL(ie)
        if (ip.le.N) then
        do je= 1, 8
          jp= nodLOCAL(je)

          kk= 0
          if (jp.gt.ip) then
            ipU= OLDtoNEW_U(ip)
            jpU= OLDtoNEW_U(jp)
            kp = PEon(ipU)
            iv = COLORon(ipU)
            nn = ipU - STACKmc((iv-1)*PEsmpTOT+kp-1)

            do k= 1, NUmaxHYP(iv)
              iS= indexU(npUX1*(iv-1)+PEsmpTOT*(k-1)+kp-1) + nn
              if ( itemU(iS).eq.jpU ) then
                kk= iS
                exit
              endif
            enddo
          endif

          if (jp.lt.ip) then
            ipL= OLDtoNEW_L(ip)
            jpL= OLDtoNEW_L(jp)
            kp = PEon(ipL)
            iv = COLORon(ipL)
            nn = ipL - STACKmc((iv-1)*PEsmpTOT+kp-1)

            do k= 1, NLmaxHYP(iv)
              iS= indexL(npLX1*(iv-1)+PEsmpTOT*(k-1)+kp-1) + nn
              if ( itemL(iS).eq.jpL ) then
                kk= iS
                exit
              endif
            enddo
          endif

          PNXi= 0.d0
          PNYi= 0.d0
          PNZi= 0.d0
          PNXj= 0.d0
          PNYj= 0.d0
          PNZj= 0.d0

          VOL= 0.d0

          do kpn= 1, 2
          do jpn= 1, 2
          do ipn= 1, 2
            coef= dabs(DETJ(ipn,jpn,kpn))*WEI(ipn)*WEI(jpn)*WEI(kpn)

            VOL = VOL + coef
            PNXi= PNX(ipn,jpn,kpn,ie)
            PNYi= PNY(ipn,jpn,kpn,ie)
            PNZi= PNZ(ipn,jpn,kpn,ie)

            PNXj= PNX(ipn,jpn,kpn,je)
            PNYj= PNY(ipn,jpn,kpn,je)
            PNZj= PNZ(ipn,jpn,kpn,je)

            a11= valX*(PNXi*PNXj+valB*(PNYi*PNYj+PNZi*PNZj))*coef
            a22= valX*(PNYi*PNYj+valB*(PNZi*PNZj+PNXi*PNXj))*coef
            a33= valX*(PNZi*PNZj+valB*(PNXi*PNXj+PNYi*PNYj))*coef

            a12= (valA*PNXi*PNYj + valB*PNXj*PNYi)*coef
            a13= (valA*PNXi*PNZj + valB*PNXj*PNZi)*coef
            ...

            if (jp.gt.ip) then
              PAU(9*kk-8)= PAU(9*kk-8) + a11
              ...
              PAU(9*kk  )= PAU(9*kk  ) + a33
            endif

            if (jp.lt.ip) then
              PAL(9*kk-8)= PAL(9*kk-8) + a11
              ...
              PAL(9*kk  )= PAL(9*kk  ) + a33
            endif

            if (jp.eq.ip) then
              D(9*ip-8)= D(9*ip-8) + a11
              ...
              D(9*ip  )= D(9*ip  ) + a33
            endif
          enddo
          enddo
          enddo

        enddo
        endif
      enddo
      enddo

Page 115: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

115

3rd ACES WG mtg. June 2003.

Hardware
- Earth Simulator: SMP cluster, 8 PE/node, vector processor
- Hitachi SR8000/128: SMP cluster, 8 PE/node, pseudo-vector
- Xeon 2.8 GHz cluster: 2 PE/node, Myrinet, flat MPI only
- Hitachi SR2201: pseudo-vector, flat MPI
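
The ES and SR8000 results that follow compare two programming models on these SMP nodes: flat MPI (one MPI process per PE) and hybrid (one MPI process per node, with the 8 PEs of the node sharing the loops). The sketch below is only a schematic of the hybrid style, using an OpenMP directive as a stand-in for whatever thread or microtasking mechanism is actually used inside HPC-MW.

!C--- Sketch only: a node-local vector update in the hybrid model.
!C    MPI is used only between SMP nodes; inside a node the 8 PEs
!C    share this loop as threads.
      subroutine update_hybrid (NP, ALPHA, P, X)
      implicit none
      integer, intent(in) :: NP
      double precision, intent(in)    :: ALPHA, P(3*NP)
      double precision, intent(inout) :: X(3*NP)
      integer :: i
!$omp parallel do private(i)
      do i= 1, 3*NP
        X(i)= X(i) + ALPHA*P(i)
      enddo
!$omp end parallel do
      end subroutine update_hybrid

In flat MPI the same loop runs unthreaded in each of the 8 MPI processes placed on the node.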

Page 116: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

116

3rd ACES WG mtg. June 2003.

Simple 3D Cubic Model

[Figure: cube of (Nx-1) x (Ny-1) x (Nz-1) hexahedral elements in x-y-z coordinates]

- Uz=0 @ z=Zmin
- Ux=0 @ x=Xmin
- Uy=0 @ y=Ymin
- Uniform distributed force in z-direction @ z=Zmax
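
These conditions are what MAT_ASS_BC in the main program imposes on the assembled block system. A common way to do that, shown here only as a generic sketch rather than the actual HPC-MW routine, is to clear the row of each fixed DOF, put 1 on its diagonal entry, and set the corresponding right-hand-side entry to the prescribed (zero) value; the symmetric treatment of the matching column is omitted for brevity, and the 9 entries of each 3x3 block are assumed to be stored row by row (a11, a12, ..., a33), consistent with the assembly slides.

!C--- Generic sketch: enforce U(idof)=0 at node "inode" of a 3x3-block
!C    system stored as D (diagonal blocks) and AL/AU (off-diagonal
!C    blocks in block CRS). Array names follow the scalar solver slide.
      subroutine fix_dof_row (inode, idof, NP, D, indexL, itemL, AL,    &
     &                        indexU, itemU, AU, B)
      implicit none
      integer, intent(in) :: inode, idof, NP
      integer, intent(in) :: indexL(0:NP), itemL(*)
      integer, intent(in) :: indexU(0:NP), itemU(*)
      double precision, intent(inout) :: D(9*NP), AL(*), AU(*), B(3*NP)
      integer :: j, k

!C-- diagonal block: clear row idof, then set a unit diagonal
      do j= 1, 3
        D(9*(inode-1) + 3*(idof-1) + j)= 0.d0
      enddo
      D(9*(inode-1) + 3*(idof-1) + idof)= 1.d0

!C-- off-diagonal blocks of this row (lower and upper parts)
      do k= indexL(inode-1)+1, indexL(inode)
        do j= 1, 3
          AL(9*(k-1) + 3*(idof-1) + j)= 0.d0
        enddo
      enddo
      do k= indexU(inode-1)+1, indexU(inode)
        do j= 1, 3
          AU(9*(k-1) + 3*(idof-1) + j)= 0.d0
        enddo
      enddo

!C-- prescribed value (zero here)
      B(3*(inode-1) + idof)= 0.d0
      end subroutine fix_dof_row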

Page 117: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

117

3rd ACES WG mtg. June 2003.

Earth Simulator, DJDS, 64x64x64/SMP node, up to 125,829,120 DOF
● : Flat MPI, ○ : Hybrid

[Figure: two panels vs. number of SMP nodes (0-192): GFLOPS rate (0-4000) and Parallel Work Ratio (40-100 %), for Flat MPI and Hybrid]

Page 118: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

118

3rd ACES WG mtg. June 2003.

Earth Simulator, DJDS, 100x100x100/SMP node, up to 480,000,000 DOF
● : Flat MPI, ○ : Hybrid

[Figure: two panels vs. number of SMP nodes (0-192): GFLOPS rate (0-4000) and Parallel Work Ratio (40-100 %), for Flat MPI and Hybrid]

Page 119: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

119

3rd ACES WG mtg. June 2003.

Earth Simulator, DJDS, 256x128x128/SMP node, up to 2,214,592,512 DOF
● : Flat MPI, ○ : Hybrid

[Figure: two panels vs. number of SMP nodes (0-192): GFLOPS rate (0-4000) and Parallel Work Ratio (40-100 %), for Flat MPI and Hybrid]

3.8 TFLOPS for 2.2G DOF on 176 nodes (33.8% of peak)

Page 120: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

120

3rd ACES WG mtg. June 2003.

Hitachi SR8000/128, 8 PEs/1 SMP node
● : PDJDS, ○ : PDCRS, ▲ : CRS-Natural
(PDJDS/CM-RCM; PDCRS/CM-RCM: short innermost loop; CRS: no re-ordering)

[Figure: two panels, SMP and Flat-MPI: GFLOPS (0-2.5) vs. problem size (1.0e+04 - 1.0e+07 DOF)]

Page 121: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

121

3rd ACES WG mtg. June 2003.

Hitachi SR8000/128, 8 PEs/1 SMP node, PDJDS
● : SMP, ○ : Flat-MPI

[Figure: GFLOPS (0-2.5) vs. problem size (1.0e+04 - 1.0e+07 DOF)]

Page 122: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

122

3rd ACES WG mtg. June 2003.

Xeon & SR2201, 8 PEs
● : PDJDS, ○ : PDCRS, ▲ : CRS-Natural
(PDJDS/CM-RCM; PDCRS/CM-RCM: short innermost loop; CRS: no re-ordering)

[Figure: two panels, Xeon and SR2201: GFLOPS vs. problem size (1.0e+04 - 1.0e+07 DOF)]

Page 123: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

123

3rd ACES WG mtg. June 2003.

Xeon: Speed Up, 1-24 PEs
16^3 nodes/PE and 32^3 nodes/PE
● : PDJDS, ○ : PDCRS, ▲ : CRS-Natural
(PDJDS/CM-RCM; PDCRS/CM-RCM: short innermost loop; CRS: no re-ordering)

[Figure: two panels (16^3 and 32^3 nodes/PE): GFLOPS (0-8) vs. number of PEs (0-32)]

Page 124: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

124

3rd ACES WG mtg. June 2003.

Xeon: Speed Up, 1-24 PEs
16^3 nodes/PE and 32^3 nodes/PE
● : PDJDS, ○ : PDCRS, ▲ : CRS-Natural
(PDJDS/CM-RCM; PDCRS/CM-RCM: short innermost loop; CRS: no re-ordering)

[Figure: two panels (16^3 and 32^3 nodes/PE): Speed Up (0-32) vs. number of PEs (0-32)]

Page 125: HPC Middleware (HPC-MW) Infrastructure for Scientific Applications on HPC Environments Overview and Recent Progress Kengo Nakajima, RIST. 3rd ACES WG Meeting,

125

3rd ACES WG mtg. June 2003.

Hitachi SR2201: Speed Up, 1-64 PEs
16^3 nodes/PE and 32^3 nodes/PE
● : PDJDS, ○ : PDCRS, ▲ : CRS-Natural
(PDJDS/CM-RCM; PDCRS/CM-RCM: short innermost loop; CRS: no re-ordering)

[Figure: two panels (16^3 and 32^3 nodes/PE): GFLOPS (0-6) vs. number of PEs (0-64)]