
Data Management Challenges in HBPS

Jong Youl Choi (1), Michael Churchill (2), Robert Hager (2), Seung-Hoe Ku (2), E. D’Azevedo (1), Bill Hoffman (3), David Pugmire (1), Scott Klasky (1), C. S. Chang (2)

(1) ORNL, (2) PPPL, (3) Kitware

Acknowledgments: Work supported by U.S. DOE Office of Science, ASCR and FES. This research used resources of OLCF, ALCF, and NERSC, which are DOE Office of Science User Facilities.

XGC I/O Performance

We maintain cutting-edge I/O performance for XGC on various file systems, including SSDs and NVMe, on Theta, Cori, and Summit. We have also tested on Tsubame3.
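To make the storage-tier choice concrete, the following is a minimal sketch, using the ADIOS2 C++ API, of writing a rank-decomposed array to a node-local SSD/NVMe mount. The variable name "f0", the array sizes, and the path /mnt/bb are illustrative assumptions, not XGC's actual output schema.

```cpp
// Minimal sketch: writing an XGC-like array with the ADIOS2 BP4 engine to
// node-local NVMe. Output path, variable name, and sizes are assumptions.
#include <adios2.h>
#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0, nproc = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    adios2::ADIOS adios(MPI_COMM_WORLD);
    adios2::IO io = adios.DeclareIO("xgc_output");
    io.SetEngine("BP4"); // same code path targets GPFS, Lustre, or node-local SSD/NVMe

    const size_t nLocal = 100000;           // elements per rank (illustrative)
    std::vector<double> data(nLocal, 0.0);  // placeholder for XGC field/particle data

    // 1D global array decomposed by rank: global shape, start offset, local count
    adios2::Variable<double> var = io.DefineVariable<double>(
        "f0", {static_cast<size_t>(nproc) * nLocal},
        {static_cast<size_t>(rank) * nLocal}, {nLocal});

    // Pointing the output at a node-local mount (machine specific, e.g. /mnt/bb
    // on Summit) keeps this traffic off the shared parallel file system.
    adios2::Engine writer = io.Open("/mnt/bb/xgc.f0.bp", adios2::Mode::Write);
    writer.BeginStep();
    writer.Put(var, data.data());
    writer.EndStep();
    writer.Close();

    MPI_Finalize();
    return 0;
}
```

Because the engine and output target are selected by name and path, the same Put calls can be redirected to GPFS, Lustre, or a node-local device without touching the simulation code.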

Coupling Workflows

We develop 3+ way coupling workflows in HBPS:

1) XGC and hPIC
• The plasma-material-interaction code hPIC is coupled into XGC
• hPIC uses 6D marker particles, while XGC uses 5D marker particles

2) XGC and on-memory analysis routines
• XGC computes the 5D distribution function f and the electromagnetic field
• Computational reduction of the physics is handed off from XGC to the analysis routines

3) XGC and M3D-C1
• M3D-C1 will calculate the nonlinear evolution of the ELM
• XGC1 will use the MHD-produced non-axisymmetric electromagnetic field and calculate kinetic transport
• The kinetic information from XGC will be continuously fed back into M3D-C1 for a higher-fidelity MHD calculation

ADIOS coupling technology provides:
• An in situ / in-memory coupling framework
• Easy monitoring, data compression, and/or visualization services while XGC and the coupled application are running (a sketch of the staging Put/Get pattern follows below)
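The staging pattern referenced above can be sketched with the ADIOS2 C++ API. This is a minimal, illustrative example of the SST Put/Get exchange between a producer (an XGC-like code) and a consumer (an analysis or coupled code); the stream name "xgc_to_analysis", the variable "f0", and the array sizes are assumptions rather than the actual coupling schema.

```cpp
// Minimal sketch of ADIOS2 in situ staging (SST engine) for code coupling.
#include <adios2.h>
#include <mpi.h>
#include <string>
#include <vector>

// Producer side: an XGC-like code publishes one array per time step.
void produce(MPI_Comm comm)
{
    int rank = 0, nproc = 1;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nproc);

    adios2::ADIOS adios(comm);
    adios2::IO io = adios.DeclareIO("coupling_write");
    io.SetEngine("SST"); // in-memory staging; InSituMPI or DataMan are alternatives

    const size_t nLocal = 1024;
    std::vector<double> f(nLocal, static_cast<double>(rank));
    adios2::Variable<double> varF = io.DefineVariable<double>(
        "f0", {static_cast<size_t>(nproc) * nLocal},
        {static_cast<size_t>(rank) * nLocal}, {nLocal});

    adios2::Engine writer = io.Open("xgc_to_analysis", adios2::Mode::Write);
    for (int step = 0; step < 5; ++step)
    {
        writer.BeginStep();
        writer.Put(varF, f.data());
        writer.EndStep(); // data is handed to the staging engine, not the file system
    }
    writer.Close();
}

// Consumer side: an analysis or coupled code pulls each step as it arrives.
void consume(MPI_Comm comm)
{
    adios2::ADIOS adios(comm);
    adios2::IO io = adios.DeclareIO("coupling_read");
    io.SetEngine("SST");

    adios2::Engine reader = io.Open("xgc_to_analysis", adios2::Mode::Read);
    while (reader.BeginStep() == adios2::StepStatus::OK)
    {
        adios2::Variable<double> varF = io.InquireVariable<double>("f0");
        std::vector<double> f;
        reader.Get(varF, f); // whole global array; a SetSelection would read a slice
        reader.EndStep();
        // ... reduction, visualization, or hand-off to the coupled physics here ...
    }
    reader.Close();
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    const bool isReader = (argc > 1 && std::string(argv[1]) == "read");
    isReader ? consume(MPI_COMM_WORLD) : produce(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
```

Since the engine is chosen with SetEngine, the same code can switch among SST, InSituMPI, DataMan, or file-based BP4 without changing the Put/Get calls.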

[Diagram] XGC exchanges data via ADIOS Put/Get calls through in situ staging engines (SST, InSituMPI, DataMan) with hPIC, the F-total analysis, and M3D-C1; the ADIOS I/O system also provides performance monitoring, data reduction, coupling execution, in situ staging services, and in situ visualization.

I/O For Next Generation HPCs

I/O System           | Summit (ORNL)    | Theta (ANL)      | Cori (NERSC)
Locality             | Node local       | Node local       | Remote, shared
System               | Local filesystem | XFS filesystem   | Cray DataWarp
Capacity             | 800 GB per node  | 128 GB per node  | 288 servers, 50 TB limit per job
Parallel filesystem  | GPFS             | Lustre           | Lustre

The DM team continues to innovate to take full advantage of new memory and storage technologies and to provide the highest levels of performance.

Plasma distributions from XGC-hPIC

XGC Software Process

Agile XGC development:
• Incorporate a modern CMake build system
• Continuous integration (CI) testing system
• Git workflow integrated with the CI system
• Integrate CDash into GitHub