
Page 1: HDF 1 NCSA HDF XML Activities Robert E. McGrath (mcgrath@ncsa.uiuc.edu) Mike Folk (mfolk@ncsa.uiuc.edu) National Center for Supercomputing Applications

NCSA HDF XML Activities

Robert E. McGrath (mcgrath@ncsa.uiuc.edu)
Mike Folk (mfolk@ncsa.uiuc.edu)

National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign

October 18, 2000


Summary

• This year we began to implement what is planned to be a complete set of tools for using XML and HDF5 to describe and manage scientific data.


Input and output for HDF5 tools

[Diagram: the HDF5 tools (Viewer/Editor, Generator, Dumper, DIFF) with their inputs and outputs (HDF5 file, XML, DDL, ASCII, binary, source code, SRB, diff file). Legend: Java and Non-Java tools; status marked Done, Doing, or ?. As of: October, 2000.]


Work To Date

• The completed work includes:
– A formal description of the HDF5 Data Model [1]
– A design study of "Use Cases" for XML and HDF5 [2]
– A Document Type Definition (DTD) for the HDF5 Data Model [3]
– 'h5gen', a Java tool that generates an HDF5 file from an XML description [4]


Current Work

• Our current work includes:
– Adding XML output to the 'h5dump' utility, to generate an XML description of an HDF5 file
– Adding XML ingest and output to the 'h5view' visual editor [4]
– Revisions of the HDF5 DTD


Lessons Learned

• In this work, we have learned some important lessons. [5]
– The "Use Cases" were especially instructive: it was clear that there are many possible uses of XML, with different requirements.
– XML DTDs are unable to model numeric data effectively.
– An HDF5 file may be structured as a rooted directed graph, which is difficult to represent in an XML tree structure.


XML Use Cases

• The HDF5 DTD will support a variety of uses:
– Viewing structure and contents of HDF5 file using a Web browser
– XML as a catalog record for locating datasets
– XML as an intermediate form for programs
– Generation, validation, and reconstruction of HDF5 files
– XML as intermediate to other formal languages and file formats
– Store XML in archive or in dataset as machine readable documentation
– Templates, skeleton files, etc.

See: [3]
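
As an illustration of the "catalog record" and "intermediate form" use cases, the following minimal sketch walks an XML description of a file and lists the datasets it contains. The element and attribute names (`File`, `Group`, `Dataset`, `Name`, `Dims`) and the sample description are assumptions made for this example; they are not taken from the HDF5 DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML description of an HDF5 file (names are illustrative only).
description = """
<File>
  <Group Name="/">
    <Group Name="model">
      <Dataset Name="temperature" Dims="10 10"/>
      <Dataset Name="pressure" Dims="10 10"/>
    </Group>
    <Dataset Name="notes" Dims="1"/>
  </Group>
</File>
"""


def dataset_paths(elem, prefix=""):
    """Return (path, dims) pairs for every Dataset element below elem."""
    found = []
    for child in elem:
        name = child.get("Name", "")
        # The root group is named "/"; everything else is appended to its parent path.
        path = name if name == "/" else f"{prefix.rstrip('/')}/{name}"
        if child.tag == "Dataset":
            found.append((path, child.get("Dims")))
        elif child.tag == "Group":
            found.extend(dataset_paths(child, path))
    return found


catalog = dataset_paths(ET.fromstring(description))
for path, dims in catalog:
    print(path, dims)
```

A program can search such a listing for a dataset without opening the HDF5 file itself, which is the point of using the XML description as a catalog record.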


Numeric Data

• E.g., an HDF5 dataset: 10 x 10, with 32-bit, signed integers, big-endian.
– XML can describe this datatype (see the HDF5 DTD)
– but the contents of the array (100 numbers) can’t be expressed in any simple, standard way

• XML has no data types except “characters”

• Solution: Use XML Schema?
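
The problem can be seen in a small sketch. It takes 100 integers stored as 32-bit, signed, big-endian values and writes them into an XML element as whitespace-separated text, which is one ad hoc convention among many. The element and attribute names (`Dataset`, `IntegerType`, `Data`, and so on) are hypothetical and are not the names used in the HDF5 DTD.

```python
import struct
import xml.etree.ElementTree as ET

ROWS, COLS = 10, 10
values = list(range(ROWS * COLS))  # 100 sample integers

# The raw bytes as a binary file would hold them: 32-bit signed, big-endian.
raw = struct.pack(f">{ROWS * COLS}i", *values)

# Describe the datatype in XML (hypothetical element and attribute names).
dataset = ET.Element("Dataset", Name="example", Dims=f"{ROWS} {COLS}")
ET.SubElement(dataset, "IntegerType", Size="4", Sign="true", ByteOrder="BE")

# The contents can only be written as characters; the separator is our own choice.
data = ET.SubElement(dataset, "Data")
data.text = " ".join(str(v) for v in struct.unpack(f">{ROWS * COLS}i", raw))

xml_text = ET.tostring(dataset, encoding="unicode")

# Reading back: the XML parser hands over a string and nothing more.
parsed = ET.fromstring(xml_text)
tokens = parsed.find("Data").text.split()
recovered = [int(t) for t in tokens]  # the reader must apply the type itself
```

The parser has no way to check that `Data` holds exactly 100 values or that each fits in 32 bits; that knowledge lives only in the program reading the file.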


Expressing a Graph

• HDF5 file structure is not a tree; it is a rooted, directed graph. It may have ‘loops’, and objects may have more than one parent, more than one name, etc.
• XML enclosure maps nicely to a tree, but can’t express a graph or multiple parentage.
• Solution: several possibilities, see [5]
• Our approach: include the object in the tree once, use pointers for subsequent instances
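
The "include once, point afterwards" approach can be sketched as follows. This is an illustration under assumed names, not the actual DTD: the `Group`, `Dataset`, `GroupPtr`, and `DatasetPtr` elements and the `Id` and `Ref` attributes are hypothetical, as is the sample graph.

```python
import xml.etree.ElementTree as ET

# A small rooted, directed graph: object id -> list of (link name, target id).
# None marks a dataset (a leaf). "d1" has two parents; "loop" points back to root.
objects = {
    "root": [("A", "g1"), ("B", "g2")],
    "g1": [("data", "d1"), ("loop", "root")],
    "g2": [("same_data", "d1")],
    "d1": None,
}


def to_xml(objects, root_id):
    """Emit each object once; later links to it become pointer elements."""
    seen = set()

    def emit(parent, name, obj_id):
        links = objects[obj_id]
        kind = "Dataset" if links is None else "Group"
        if obj_id in seen:
            # Already in the tree: write a pointer instead of a second copy.
            ET.SubElement(parent, kind + "Ptr", Name=name, Ref=obj_id)
            return
        seen.add(obj_id)  # mark before recursing so loops terminate
        elem = ET.SubElement(parent, kind, Name=name, Id=obj_id)
        for child_name, child_id in links or []:
            emit(elem, child_name, child_id)

    top = ET.Element("File")
    emit(top, "/", root_id)
    return top


tree = to_xml(objects, "root")
print(ET.tostring(tree, encoding="unicode"))
```

A reader of such a file resolves each `Ref` against the `Id` values to rebuild the original graph, including the loop back to the root group.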


Future Possibilities

• Additional future work may include:
– Create an XML Schema for the HDF5 Data Model, which should be able to handle numbers effectively.
– Explore the use of style sheets to transform between HDF5 and other data formats, such as netCDF.


References

1. HDF5 Abstract Data Model, http://hdf.ncsa.uiuc.edu/HDF5/ADM_990506/
2. Suggested Use Cases for XML with HDF-5, http://hdf.ncsa.uiuc.edu/HDF5/XML/UseCases/use-cases.html
3. HDF5-File.dtd, http://hdf.ncsa.uiuc.edu/HDF5-File.dtd
4. HDF Java Tools, http://hdf.ncsa.uiuc.edu/java-hdf5-html
5. A DTD for HDF5: Design Notes, http://hdf.ncsa.uiuc.edu/HDF5/XML/dtd-report.html