Parallel I/O
International HPC Summer School

July 11, 2018
Elsa Gonsiorowski
HPC I/O Specialist, LLNL

LLNL-PRES-751922
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC


Outline

Motivation
I/O in Parallel
Step 1: Recognize a need
Step 2: Existing I/O Libraries and Tools
Step 3: I/O Patterns
Step 4: Understand the File System
Step 6: Profit
Technical Details: MPI I/O
Pro-Tips!


Motivation


Types of I/O

Input
    Launching an executable & its linked libraries
    Reading a configuration file
    Loading data files

Output
    Checkpoints
    Results

Science
    Moving files from one machine to another
    Cleaning up after experiments

Everyone interacts with a file system, therefore everyone does I/O!


Why should I care?

Data movement is expensive and must be optimized

Total execution time = Computation time + Communication time + I/O time


HPC Storage Stack

GPU Memory (HBM2): 900 GB/s per GPU
CPU Memory (DDR4): 120 GB/s per socket
Node-local storage or /tmp (SSD): 1.1 GB/s per node
PFS (HDD + SSD + Magic): 40 GB/s shared by a system
    burst buffer
    "project" storage
    "campaign store"
HPSS (Tape + Robots): 0.2 GB/s shared by a center


File Systems

Laptop
    1 user
    1.1 GB/s

Network File System (NFS)
    m servers, n clients
    home directory
    2 GB/s throughput
    280K IOPS

Parallel File System (PFS)
    Used by HPC jobs
    System specific
    scratch or project storage
    40 GB/s throughput
    Millions of IOPS


Parallel File System


I/O in Parallel


Steps for Dealing with I/O

1. Recognize the need
    Get some data out of the application
    Get some data out of the application faster
    Deal with files efficiently
2. Investigate I/O libraries and tools; one may be common in your field.
3. Implement an I/O pattern
4. Understand the file system you are working on
5. ???
6. Profit!


Step 1: Recognize a need


Profiling

Darshan
Tau

Attend tomorrow’s performance analysis session!


Step 2: Existing Libraries + Tools


Parallel I/O Libraries and Tools

Reading & Writing Files:
    HDF5
    PnetCDF
    Others: ADIOS, TyphonIO, SILO
    MPI-IO

Managing Files:
    Spindle
    mpiFileUtils
    SCR


Library: HDF5

Hierarchical Data Format
File-system in a file
Datasets: multidimensional arrays of a homogeneous type
Groups: container structures which can hold datasets and other groups
Official support for C, C++, Fortran 77, Fortran 90, Java
Implementations in R, Perl, Python, Ruby, Haskell, Mathematica, MATLAB, etc.


Library: PnetCDF

Built on netCDF and MPI-IO
netCDF:
    self-describing, machine-independent format
    designed for arrays of scientific data
    netCDF is implemented in C, C++, Fortran 77, Fortran 90, Java, R, Perl, Python, Ruby, Haskell, Mathematica, MATLAB, etc.


Library: MPI-IO

API for interacting with files with MPI concepts
    blocking vs. non-blocking
    collective vs. non-collective
Lower level than other libraries
Fine-grain control of files and offsets
C and Fortran interfaces
Separate effort from regular MPI


Tool: Spindle

Scalable dynamic library and Python loading
Caches linked libraries
Life saver for NFS issues

https://github.com/hpc/spindle


Tool: mpiFileUtils

Use parallel processes to perform file operations
Executed within a job allocation
dbcast: broadcast a file from PFS to node-local storage
dcp: copy multiple files in parallel
drm: delete files in parallel
many more

https://github.com/hpc/mpifileutils


Library: SCR

Scalable Checkpoint Restart
Enable checkpointing applications to take advantage of system storage hierarchies
Efficient file movement between storage layers
Data redundancy operations


Step 3: I/O Patterns


Parallel I/O Patterns

Single file, accessed by 1 task
Single shared file, accessed by all tasks
Many shared files, accessed by groups of tasks
    Baton-passing
    Coordinated "View"
Many independent files, accessed by a subset of tasks
One file per process


Step 4: Understand the PFS


Parallel File System Policies

Allocation: how much space you have
Backups: if backups or snapshots are created
Purges: when data is deleted
Configuration: the I/O pattern the system is configured for


Parallel File Systems

Black Magic: IBM’s GPFS (general parallel file system)
    Closed source
    aka Elastic Scale Storage™ or Spectrum Scale™
    HPC users do not have knobs to tune

White Magic: Lustre
    Open source
    Users can deviate from default behavior


Lustre Striping

HDDs are logically grouped into OSTs (Object Storage Targets)
Users can stripe a file across multiple OSTs
    Explicitly take advantage of multiple OSTs
    Depends on the total amount of I/O you are doing
    There is a system default
Use the correct striping for your use case


Lustre Striping Commands

    $ lfs setstripe -c 4 -s 4M testfile2
    $ lfs getstripe ./testfile2
    ./testfile2
    lmm_stripe_count:   4
    lmm_stripe_size:    4194304
    lmm_stripe_offset:  21
        obdidx       objid       objid       group
            21     8891547    0x87ac9b           0
            13     8946053    0x888185           0
            57     8906813    0x87e83d           0
            44     8945736    0x888048           0

    $ lfs getstripe ./testfile
    ./testfile
    lmm_stripe_count:   2
    lmm_stripe_size:    1048576
    lmm_stripe_offset:  50
        obdidx       objid       objid       group
            50     8916056    0x880c58           0
            38     8952827    0x889bfb           0


Step 6: Profit


Steps for Dealing with I/O

1. Recognize the need
    Get some data out of the application
    Get some data out of the application faster
    Deal with files efficiently
2. Investigate I/O libraries and tools; one may be common in your field.
3. Implement an I/O pattern
4. Understand the file system you are working on
5. ???
6. Profit!


Technical Details: MPI I/O


Locking and Atomicity

$ export BGLOCKLESSMPIO_F_TYPE=1

int MPI_File_set_atomicity ( MPI_File mpi_fh, int flag );


Opening Files

    int MPI_File_open(MPI_Comm comm, const char *filename,
                      int amode, MPI_Info info, MPI_File *fh);

AMode                       Description
MPI_MODE_RDONLY             read only
MPI_MODE_RDWR               reading and writing
MPI_MODE_WRONLY             write only
MPI_MODE_CREATE             create the file
MPI_MODE_EXCL               error if file already exists
MPI_MODE_DELETE_ON_CLOSE    delete file on close
MPI_MODE_UNIQUE_OPEN        file will not be concurrently opened
MPI_MODE_SEQUENTIAL         file will only be accessed sequentially
MPI_MODE_APPEND             position of all file pointers to end


Organizing Data

Use MPI_Datatype to define the structure of your data
    Corresponds to C struct
    Read and write instances of this data
Use MPI_File_set_view for working with non-contiguous data in a shared file


Useful MPI Function

    offset = (long long) 0;
    MPI_Exscan(&contribute, &offset, 1, MPI_LONG_LONG,
               MPI_SUM, file_comm);

Rank        0  1  2  3  4
contribute  3  4  2  7  3
offset      0  3  7  9  16

LLNL-PRES-751922 36

Page 73: Parallel I/O - International HPC Summer School · ParallelI/O InternationalHPCSummerSchool July11,2018 ElsaGonsiorowski HPCI/OSpecialist,LLNL LLNL-PRES-751922 ThisworkwasperformedundertheauspicesoftheU.S

Accessing Files with MPI

Level 0: independent file ops, explicit offset, sequential data
Level 1: collective file ops, explicit offset, sequential data
Level 2: independent file ops, derived or non-contiguous data
Level 3: collective file ops, derived or non-contiguous data

MPI I/O & Lustre

MPI libraries can be built by HPC resource providers with Lustre integration
Tuning hints are passed through an MPI_Info object; hint values are strings

MPI_Info_set(myinfo, "striping_factor", stripe_count);
MPI_Info_set(myinfo, "striping_unit", stripe_size);
MPI_Info_set(myinfo, "cb_nodes", num_writers);
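Because MPI_Info_set takes both key and value as C strings, numeric settings have to be formatted as text first. The sketch below shows one way to do that in plain C. The `io_hint` struct and the `make_hint` and `build_lustre_hints` helpers are hypothetical names, not part of MPI or of the slides, and no MPI calls are made here.

```c
#include <assert.h>
#include <stdio.h>

#define HINT_VALUE_LEN 32

/* Hypothetical key/value pair holding one hint as text. */
struct io_hint {
    const char *key;
    char        value[HINT_VALUE_LEN];
};

/* Store a numeric setting as a decimal string next to its key. */
void make_hint(struct io_hint *h, const char *key, long value)
{
    assert(h != NULL && key != NULL);
    h->key = key;
    int n = snprintf(h->value, sizeof h->value, "%ld", value);
    assert(n > 0 && (size_t)n < sizeof h->value);
}

/* Fill hints[0..2] with the three hints named on the slide.
 * Returns the number of hints written. */
size_t build_lustre_hints(struct io_hint hints[3],
                          long stripe_count,
                          long stripe_size,
                          long num_writers)
{
    make_hint(&hints[0], "striping_factor", stripe_count);
    make_hint(&hints[1], "striping_unit",   stripe_size);
    make_hint(&hints[2], "cb_nodes",        num_writers);
    return 3;
}
```

In an MPI program each pair would then be handed to MPI_Info_set(myinfo, hints[i].key, hints[i].value), and the info object passed to MPI_File_open. Whether a given hint is honored depends on the MPI implementation and the file system.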

Pro-Tips!

Pro-Tip!

Step One
Profile your code. Fix up the I/O until it doesn’t suck.

Pro-Tip!

Be Smart
Don’t re-invent I/O, use an existing library or tool.

Pro-Tip!

Working with File Systems
Use the PFS for Parallel I/O, do NOT use NFS.

Pro-Tip!

I/O Pattern
Create 1 file per node and make this a tune-able parameter.
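One possible way to make the file count tunable is to map each rank to a file index through a single `ranks_per_file` parameter, as sketched below in plain C. The helper names and the file naming scheme are assumptions for illustration. Setting `ranks_per_file` to the number of ranks per node gives one file per node only if consecutive ranks are placed on the same node.

```c
#include <assert.h>
#include <stdio.h>

/* Index of the file a rank writes to. The same value could serve as the
 * color when splitting a communicator into one group per file. */
int file_index_for_rank(int rank, int ranks_per_file)
{
    assert(rank >= 0 && ranks_per_file > 0);
    return rank / ranks_per_file;
}

/* Build a name such as "dump.00002" for a given file index.
 * Returns 0 on success, -1 if the name does not fit in buf. */
int format_file_name(char *buf, size_t len, const char *prefix, int file_index)
{
    assert(buf != NULL && prefix != NULL && len > 0);
    int n = snprintf(buf, len, "%s.%05d", prefix, file_index);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

In an MPI program the index could be passed as the color to MPI_Comm_split to build a per-file communicator. Where true per-node grouping is needed regardless of rank placement, MPI_Comm_split_type with MPI_COMM_TYPE_SHARED is an alternative.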

Pro-Tip!

Ask an Expert
Find the "I/O person" at your HPC center and ask for guidance.

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.