SAN DIEGO SUPERCOMPUTER CENTER

HDF5/SRB Integration
July 10, 2006

Mike Wan, mwan@sdsc.edu
SRB, SDSC

Peter Cao, xcao@ncsa.edu
HDF, NCSA

Sponsored by CIP/NLADR, NSF PACI Project in Support of NCSA-SDSC Collaboration


Current Status

• Present the work at TeraGrid '06
• Publish HDF5 AIP documents
  • White paper: http://hdf.ncsa.uiuc.edu/hdf-aip-html/
  • HDF5 METS template: http://hdf.ncsa.uiuc.edu/hdf-aip-html/hdf5_mets_template.xml
• Finish h5ingest command line tool
  • Create HDF5 METS template file
  • Validate HDF5 METS document
• Set up a demo server to support SCEC files


Current Status

• Work on test suite and bug fixes
  • Add code to separate HDF5 I/O time and SRB time
  • Test large files and datasets (>2 GB)
  • Fix bug in SRB client handler
• Work on performance improvement
  • Implement a fairly large set of changes to improve performance by transferring raw data as a byte stream
  • Need to test on large files
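
One way to separate HDF5 I/O time from SRB time is to accumulate wall-clock time per named phase. The sketch below is illustrative only and is not the project's actual instrumentation: `PhaseTimer`, `fake_hdf5_read`, and `fake_srb_transfer` are hypothetical names, and the two functions are stand-ins rather than real HDF5 or SRB calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the timed block raises.
            self.totals[name] += time.perf_counter() - start


def fake_hdf5_read(nbytes):
    # Assumption: stand-in for the server-side HDF5 read.
    return bytes(nbytes)


def fake_srb_transfer(payload):
    # Assumption: stand-in for the SRB transfer; returns bytes "sent".
    return len(payload)


timer = PhaseTimer()
for _ in range(3):
    with timer.phase("hdf5_io"):
        data = fake_hdf5_read(1_000_000)
    with timer.phase("srb_transfer"):
        sent = fake_srb_transfer(data)

for name, seconds in timer.totals.items():
    print(f"{name}: {seconds:.6f} s")
```

Keeping the two totals separate shows whether a slow request is dominated by reading the dataset or by moving the bytes.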


Next Month

• More performance tests for transferring raw data
• Add more features to HDFView for SRB support
• Integrate the software into the SRB configuration and distribution


Potential SAC Projects

• SDSC ENZO project
  • Enzo, a 3D cosmological hydrodynamics code, simulating the process of massive star formation and destruction
  • HDF5 is used as the file format and for parallel file I/O access
• FLASH Program
  • The UC/DOE collaboration on creating three-dimensional, virtual reality projections of cosmic explosions
  • HDF5 is used for storing the data and for high I/O access
• SCEC Terascale Earthquake Simulations
  • Over 100 TB data/year
  • Collections at SRB: 2.6 million files, 114 terabytes


TeraShake Surface Seismograms

• 4D Array (1.2 TB)
  • Time (22,728)
  • Horizontal (3,000)
  • Vertical (1,500)
  • Vector Component (3)
• Each file:
  • 22,728 x 3,000 x 5 x 1
  • 1,363,680,000 Bytes
• TeraShake scenario
  • 900 files
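
The sizes above are mutually consistent if each array element is a 4-byte (32-bit) float, the element type shown on the example file slide. A short check of the arithmetic, with variable names chosen only for illustration:

```python
# Assumption: every element is a 32-bit float (4 bytes).
BYTES_PER_FLOAT32 = 4

# Full 4D array: time x horizontal x vertical x vector component.
time_steps, horizontal, vertical, components = 22_728, 3_000, 1_500, 3
full_bytes = time_steps * horizontal * vertical * components * BYTES_PER_FLOAT32

# One file holds a 22,728 x 3,000 x 5 x 1 slab of the array.
file_bytes = 22_728 * 3_000 * 5 * 1 * BYTES_PER_FLOAT32

# Number of such slabs needed to cover the whole array.
n_files = full_bytes // file_bytes

print(f"full array: {full_bytes:,} bytes (~{full_bytes / 1e12:.1f} TB)")
print(f"one file:   {file_bytes:,} bytes")
print(f"files:      {n_files}")
```

Each file covers 5 of the 1,500 x 3 = 4,500 (vertical, component) slices, so 4,500 / 5 = 900 files make up the roughly 1.2 TB scenario.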


Example HDF5 File

xhist00001  hpss-scec  1363680000  2005-02-23-22.37
xhist00002  hpss-scec  1363680000  2005-02-23-22.38
xhist00003  hpss-scec  1363680000  2005-02-23-22.39
xhist00004  hpss-scec  1363680000  2005-02-23-22.40
xhist00005  hpss-scec  1363680000  2005-02-23-22.40

[Figure: HDF5 file containing a 32-bit float dataset; dimension labels 22,728 and 3,000]


File on SRB server


Select a Subset


HDFView