SAN DIEGO SUPERCOMPUTER CENTER
HDF5/SRB Integration
July 10, 2006
Mike Wan [email protected]
SRB, SDSC
Peter [email protected]
HDF, NCSA
Sponsored by CIP/NLADR, NSF PACI Project in Support of NCSA-SDSC Collaboration
Current Status
• Present the work at TeraGrid '06
• Publish HDF5 AIP documents
  • White paper: http://hdf.ncsa.uiuc.edu/hdf-aip-html/
  • HDF5 METS template: http://hdf.ncsa.uiuc.edu/hdf-aip-html/hdf5_mets_template.xml
• Finish h5ingest command line tool
  • Create HDF5 METS template file
  • Validate HDF5 METS document
• Set up a demo server to support SCEC files
Current Status
• Work on test suite and bug fixes
  • Add code to separate HDF5 I/O time and SRB time
  • Test large files and datasets (>2 GB)
  • Fix bug in the SRB client handler
• Work on performance improvement
  • Implement a fairly large set of changes to improve performance by transferring raw data as a byte stream
  • Need to test on large files
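One way to separate the two timings is to accumulate wall-clock time per phase. The following is a minimal sketch only, not the project's code: `timed`, `read_dataset`, and `send_to_client` are hypothetical names, and the two functions are stand-ins for the HDF5 read and the SRB transfer.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase, e.g. "hdf5_io" and "srb_transfer".
elapsed = defaultdict(float)


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the block to elapsed[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed[phase] += time.perf_counter() - start


def read_dataset():
    # Hypothetical stand-in for the HDF5 library read on the server side.
    return bytes(1024)


def send_to_client(payload):
    # Hypothetical stand-in for the SRB network transfer.
    return len(payload)


with timed("hdf5_io"):
    data = read_dataset()
with timed("srb_transfer"):
    sent = send_to_client(data)

for phase, seconds in elapsed.items():
    print(f"{phase}: {seconds:.6f} s")
```

Keeping one accumulator per phase lets repeated reads and transfers add up, so the totals can be compared at the end of a test run.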
Next Month
• More tests on performance for transferring raw data
• Add more features to HDFView for SRB support
• Integrate the software into the SRB configuration and distribution
Potential SAC Projects
• SDSC ENZO project
  • Enzo, a 3D cosmological hydrodynamics code, simulating the process of massive star formation and destruction
  • HDF5 is used as the file format and for parallel file I/O access
• FLASH Program
  • The UC/DOE collaboration on creating three-dimensional, virtual reality projections of cosmic explosions
  • HDF5 is used for storing the data and for high I/O access
• SCEC Terascale Earthquake Simulations
  • Over 100 TB data/year
  • Collections at SRB: 2.6 million files, 114 Terabytes
TeraShake Surface Seismograms
• 4D Array (1.2 TB)
  • Time (22,728)
  • Horizontal (3,000)
  • Vertical (1,500)
  • Vector Component (3)
• Each file:
  • 22,728 x 3,000 x 5 x 1
  • 1,363,680,000 Bytes
• TeraShake scenario
  • 900 files
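The sizes above are mutually consistent if each value occupies 4 bytes, which matches the 32-bit float type shown on the example file slide. A short check of the arithmetic, with `nbytes` as an illustrative helper name:

```python
ITEMSIZE = 4  # bytes per value, assuming 32-bit floats

# Dimension order: time, horizontal, vertical, vector component.
full_shape = (22_728, 3_000, 1_500, 3)
file_shape = (22_728, 3_000, 5, 1)


def nbytes(shape, itemsize=ITEMSIZE):
    """Return the size in bytes of a dense array with the given shape."""
    total = itemsize
    for n in shape:
        total *= n
    return total


full_bytes = nbytes(full_shape)
file_bytes = nbytes(file_shape)
n_files = full_bytes // file_bytes

print(file_bytes)  # 1363680000
print(full_bytes)  # 1227312000000
print(n_files)     # 900
```

The file count also follows from the shapes directly: 1,500 / 5 = 300 vertical slabs for each of the 3 vector components gives 900 files. That decomposition is inferred from the numbers, not stated on the slide.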
Example HDF5 File
xhist00001 hpss-scec 1363680000 2005-02-23-22.37
xhist00002 hpss-scec 1363680000 2005-02-23-22.38
xhist00003 hpss-scec 1363680000 2005-02-23-22.39
xhist00004 hpss-scec 1363680000 2005-02-23-22.40
xhist00005 hpss-scec 1363680000 2005-02-23-22.40
[Figure: HDF5 file holding a 32-bit float dataset; figure labels: 22,728, 3,000, 25]
File on SRB server
Select a Subset
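To give a sense of why selecting a subset matters for remote access, the sketch below compares the data volume of a whole TeraShake file with that of a partial selection. It is an illustration only: the start/count style of selection, the helper `selection_bytes`, and the chosen subset (1,000 time steps of one vertical index) are assumptions, not details from the slides.

```python
from math import prod

ITEMSIZE = 4                        # bytes per value, assuming 32-bit floats
FILE_SHAPE = (22_728, 3_000, 5, 1)  # per-file dataset shape from the slides


def selection_bytes(shape, start, count, itemsize=ITEMSIZE):
    """Bytes covered by a block selection given per-dimension start and count."""
    if not (len(shape) == len(start) == len(count)):
        raise ValueError("shape, start and count must have the same rank")
    for dim, s, c in zip(shape, start, count):
        if s < 0 or c < 1 or s + c > dim:
            raise ValueError("selection falls outside the dataset")
    return prod(count) * itemsize


whole = selection_bytes(FILE_SHAPE, (0, 0, 0, 0), FILE_SHAPE)
subset = selection_bytes(FILE_SHAPE, (0, 0, 0, 0), (1_000, 3_000, 1, 1))

print(whole)   # 1363680000
print(subset)  # 12000000
```

Under these assumptions the subset is under 1% of the file, so only about 12 MB rather than about 1.36 GB would need to cross the network.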
HDFView