Slide 1: NCSA HDF XML Activities
Robert E. McGrath ([email protected]), Mike Folk ([email protected])
National Center for Supercomputing Applications, University of Illinois, Urbana-Champaign
October 18, 2000
Slide 2: Summary
• This year we began to implement what is planned to be a complete set of tools for using XML and HDF5 to describe and manage scientific data.
Slide 3: Input and output for HDF5 tools
[Figure: diagram of the HDF5 tools (Viewer/Editor, Generator, Dumper, DIFF) and their inputs and outputs (HDF5 file, XML, DDL, ASCII, binary, source code, SRB, diff file). Tools are marked as Java or non-Java, and each path is marked by status (Done, Doing, or ?) as of October 2000.]
Slide 4: Work To Date
• The completed work includes:
  – A formal description of the HDF5 Data Model [1]
  – A design study of "Use Cases" for XML and HDF5 [2]
  – A Document Type Definition (DTD) for the HDF5 Data Model [3]
  – 'h5gen', a Java tool that generates an HDF5 file from an XML description [4]
Slide 5: Current Work
• Our current work includes:
  – Adding XML output to the 'h5dump' utility, to generate an XML description of an HDF5 file
  – Adding XML ingest and output to the 'h5view' visual editor [4]
  – Revisions of the HDF5 DTD
Slide 6: Lessons Learned
• In this work, we have learned some important lessons. [5]
  – The "Use Cases" were especially instructive: it was clear that there are many possible uses of XML, with different requirements.
  – XML DTDs are unable to model numeric data effectively.
  – An HDF5 file may be structured as a rooted, directed graph, which is difficult to represent in an XML tree structure.
Slide 7: XML Use Cases
• The HDF5 DTD will support a variety of uses:
  – Viewing the structure and contents of an HDF5 file using a Web browser
  – XML as a catalog record for locating datasets
  – XML as an intermediate form for programs
  – Generation, validation, and reconstruction of HDF5 files
  – XML as an intermediate to other formal languages and file formats
  – Storing XML in an archive or in a dataset as machine-readable documentation
  – Templates, skeleton files, etc.
See: [3]
Slide 8: Numeric Data
• E.g., an HDF5 dataset: 10 x 10, with 32-bit, signed, big-endian integers.
  – XML can describe this datatype (see the HDF5 DTD)
  – but the contents of the array (100 numbers) can’t be expressed in any simple, standard way
• XML has no data types except “characters”
• Solution: Use XML Schema?
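The numeric-data problem can be illustrated with a minimal Python sketch. The element and attribute names (`Dataset`, `IntegerType`, `Data`, `dims`, and so on) are hypothetical and are not taken from the HDF5 DTD; the whitespace-separated decimal layout is likewise only one possible convention. The point is that the datatype can be described in markup, while the 100 values come back from the parser as a single string that the reader must know how to decode.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, chosen for illustration only;
# they are NOT taken from the HDF5 DTD.
values = list(range(100))  # stand-in contents for a 10 x 10 dataset

dataset = ET.Element("Dataset", {"name": "example", "dims": "10 10"})
ET.SubElement(
    dataset,
    "IntegerType",
    {"bits": "32", "signed": "true", "byte_order": "big-endian"},
)
data = ET.SubElement(dataset, "Data")
# The 100 numbers can only be stored as one run of characters.
data.text = " ".join(str(v) for v in values)

xml_text = ET.tostring(dataset, encoding="unicode")

# Reading back: the XML parser hands over a string. Turning it into
# integers relies on a convention agreed outside of XML itself
# (here: whitespace-separated decimal numbers, row-major order).
parsed = ET.fromstring(xml_text)
raw = parsed.find("Data").text
recovered = [int(token) for token in raw.split()]
rows = [recovered[i * 10:(i + 1) * 10] for i in range(10)]
```

Nothing in the XML itself tells a generic tool that `Data` holds integers rather than arbitrary text, which is the gap the slide suggests XML Schema might fill.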
Slide 9: Expressing a Graph
• HDF5 file structure is not a tree; it is a rooted, directed graph. It may have ‘loops’, and objects may have more than one parent, more than one name, etc.
• XML enclosure maps nicely to a tree, but can’t express a graph or multiple parentage.
• Solution: several possibilities, see [5]
• Our approach: include the object in the tree once, use pointers for subsequent instances
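The "include once, point afterwards" approach can be sketched in Python as follows. This is an illustration under stated assumptions, not the layout used by the HDF5 DTD: the element names `Object` and `ObjectRef`, the `id` and `ref` attributes, and the toy graph are all hypothetical.

```python
import xml.etree.ElementTree as ET

# A small rooted, directed graph: object id -> list of (link name, target id).
# "shared" has two parents, and "b" links back to the root (a loop).
graph = {
    "root": [("a", "a"), ("b", "b")],
    "a": [("data", "shared")],
    "b": [("alias", "shared"), ("up", "root")],
    "shared": [],
}


def to_xml(graph, root="root"):
    """Serialize the graph as an XML tree (hypothetical element names)."""
    seen = set()

    def build(obj_id, link_name):
        if obj_id in seen:
            # Subsequent instance: emit a pointer instead of the object.
            return ET.Element("ObjectRef", {"name": link_name, "ref": obj_id})
        # First instance: mark it before descending so loops terminate.
        seen.add(obj_id)
        elem = ET.Element("Object", {"name": link_name, "id": obj_id})
        for child_name, child_id in graph[obj_id]:
            elem.append(build(child_id, child_name))
        return elem

    return build(root, "/")


tree = to_xml(graph)
xml_text = ET.tostring(tree, encoding="unicode")
```

Each object appears exactly once as an `Object` element; the second path to `shared` and the link back to the root become `ObjectRef` elements, so a reader can rebuild the graph by resolving each `ref` against the matching `id`.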
Slide 10: Future Possibilities
• Additional future work may include:
  – Create an XML Schema for the HDF5 Data Model, which should be able to handle numbers effectively.
  – Explore the use of style sheets to transform between HDF5 and other data formats, such as netCDF.
Slide 11: References
1. HDF5 Abstract Data Model, http://hdf.ncsa.uiuc.edu/HDF5/ADM_990506/
2. Suggested Use Cases for XML with HDF-5, http://hdf.ncsa.uiuc.edu/HDF5/XML/UseCases/use-cases.html
3. HDF5-File.dtd, http://hdf.ncsa.uiuc.edu/HDF5-File.dtd
4. HDF Java Tools, http://hdf.ncsa.uiuc.edu/java-hdf5-html
5. A DTD for HDF5: Design Notes, http://hdf.ncsa.uiuc.edu/HDF5/XML/dtd-report.html