Integrating netCDF and OPeNDAP
(The DrNO Project)
Dr. Dennis Heimbigner
Unidata
Go-ESSP Workshop
Seattle, WA, Sept 17-19 2008
Overview
  Primary goal: Integrate the client-side DAP protocol into the netCDF C library
    Access any DAP data source (through a DAP server) using the netCDF API
    Initially netCDF-3, later netCDF-4
  Rationale: Combine two commonly used APIs for access to scientific datasets
  Issues:
    Data model translation
    DAP dataset URL support
    (Server side: transparent access to netCDF-4 data)
DAP Data Model
  Primitive types: byte, (u)int16, (u)int32, float32, float64, string
  Arrays: FORTRAN style rectangular arrays with bounded dimensions
    Limited naming of dimensions
  Structure: heterogeneous collection of fields
    Analog to C/C++ structs
  Sequence: variable length array of Structures
    Allows relational constraints
DAP Data Model (cont.)
  Grid: combination of an n-dimensional array with n 1-dimensional mapping arrays
    In effect a structure for an array plus its coordinate variables (in netCDF-speak)
  Structures, Grids, and Sequences may be arbitrarily nested with each other
  All types are “singletons”
    Type reuse requires repeating the definition
Specifying a DAP Data Source
  A DAP data source is specified using an extended URL syntax that refers to the DAP server containing that data
  Format: <clientparams><baseurl>?<projection>&<selection>
  Client parameters: [name=value]…
    URL extension specific to the DAP/netCDF integration
  Base URL: e.g. http://.../...
    Points to the DAP server
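The URL format above can be sketched as a small string-assembly helper. This is an illustration only, not the netCDF library's own code: the function name `build_dap_url`, the client parameter `cache`, and the host `example.org` are hypothetical. It assumes multiple projections are joined with commas and each selection clause is prefixed with `&`, as in DAP2 constraint expressions.

```python
def build_dap_url(base_url, client_params=None, projections=None, selections=None):
    """Assemble <clientparams><baseurl>?<projection>&<selection> (hypothetical helper)."""
    # Client parameters are written as [name=value] pairs in front of the base URL.
    prefix = "".join(f"[{name}={value}]" for name, value in (client_params or {}).items())
    # Assumption: several projections are comma-separated.
    projection = ",".join(projections or [])
    # Assumption: every selection clause is introduced by '&'.
    selection = "".join(f"&{clause}" for clause in (selections or []))
    constraint = projection + selection
    return prefix + base_url + (f"?{constraint}" if constraint else "")


# Hypothetical dataset URL and parameter names, for illustration only.
url = build_dap_url(
    "http://example.org/dap/sst.nc",
    client_params={"cache": "1"},
    projections=["sst[0:2:10]"],
    selections=["x>5"],
)
print(url)
```

With no client parameters and no constraints, the helper returns the base URL unchanged, which matches the idea that the extended syntax is a superset of an ordinary DAP URL.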
Specifying a DAP source (cont.)
  DAP URL also specifies constraints on the data to be returned by the server
  Projection: variable-name[first:stride:last]
    Returns a slice of a rectangular array
  Selection: boolean expression over variables
    E.g. x > 5 or y < 6
    Only applies to sequences
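To make the projection notation concrete, here is a minimal sketch of which indices `[first:stride:last]` selects along one dimension. It assumes `last` is inclusive, as in DAP2 (unlike a Python slice, whose stop is exclusive). The function names are hypothetical, and a real server applies the projection on its side before sending data.

```python
def projected_indices(first, stride, last):
    """Indices selected by [first:stride:last], assuming 'last' is inclusive."""
    if first < 0 or stride < 1 or last < first:
        raise ValueError("expected 0 <= first <= last and stride >= 1")
    # range() excludes its stop value, so add 1 to include 'last'.
    return list(range(first, last + 1, stride))


def apply_projection(values, first, stride, last):
    """Return the slice of a 1-d list that the projection would select."""
    return [values[i] for i in projected_indices(first, stride, last)]


print(projected_indices(0, 2, 10))
print(apply_projection(list("abcdefgh"), 1, 3, 7))
```

For an n-dimensional array the same rule would apply independently to each bracketed dimension.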
netCDF-3 (aka classic) Data Model
  Primitive types: char, byte, short, int, float, double
  Named shared dimensions
  N-dimensional FORTRAN style arrays
  Single unlimited dimension
    May only occur as first (slowest changing) dimension
    E.g. int var(unlimited,lat,long)
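The unlimited-dimension rule can be expressed as a small check on a variable's dimension list. This is a sketch under stated assumptions: dimensions are modeled as `(name, length)` pairs with `None` standing for unlimited, the dimension name `time` is hypothetical, and the check covers only a single variable's shape, not the file-wide rule that there is just one unlimited dimension.

```python
UNLIMITED = None  # marker for an unlimited length in this sketch


def is_legal_classic_shape(dims):
    """True if an unlimited dimension, when present, appears only in first position."""
    unlimited_positions = [i for i, (_, length) in enumerate(dims) if length is UNLIMITED]
    # Legal cases: no unlimited dimension at all, or exactly one at index 0.
    return unlimited_positions in ([], [0])


# Mirrors "int var(unlimited,lat,long)" with a hypothetical name for the unlimited dimension.
print(is_legal_classic_shape([("time", UNLIMITED), ("lat", 4), ("long", 8)]))
print(is_legal_classic_shape([("lat", 4), ("time", UNLIMITED)]))
```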
netCDF-3 Translation Issues
  Result must conform to legal classic model
    E.g. no nested sequences or arrays of sequences
  Synthesize shared dimensions
    Infer from DAP dimension name and value
  Convert grids to equivalent netCDF-3 coordinate variable convention
    Coordinate variable = 1-d variable with same name as a dimension
    Contains coordinate values for that dimension
  Flatten non-dimensioned structures and grids
  Sequence = unlimited dimension 1-d array
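Two of the translation steps above, grid conversion and structure flattening, can be sketched with plain dictionaries. Everything here is illustrative rather than the library's actual translation code. The dict layout, the function names, the dotted naming scheme for flattened fields, and the choice to raise an error when a dimension name is reused with a different size are all assumptions.

```python
def translate_grid(grid, dimensions, variables):
    """Add one DAP Grid to netCDF-3 style dimension and variable tables."""
    dim_names = []
    for name, size in grid["dims"]:
        # Shared dimension inferred from the DAP dimension name and size.
        if dimensions.setdefault(name, size) != size:
            raise ValueError(f"dimension {name!r} already defined with another size")
        dim_names.append(name)
    # The n-dimensional array becomes an ordinary variable over those dimensions.
    variables[grid["name"]] = {"dims": tuple(dim_names), "data": grid["array"]}
    # Each 1-d map becomes a coordinate variable named after its dimension.
    for name, values in grid["maps"].items():
        variables[name] = {"dims": (name,), "data": values}


def flatten_structure(name, fields, prefix=""):
    """Flatten nested, non-dimensioned structures into dotted variable names."""
    flat = {}
    for field, value in fields.items():
        if isinstance(value, dict):
            # A nested structure: recurse, extending the dotted prefix.
            flat.update(flatten_structure(field, value, prefix=f"{prefix}{name}."))
        else:
            flat[f"{prefix}{name}.{field}"] = value
    return flat


dimensions, variables = {}, {}
grid = {
    "name": "sst",  # hypothetical variable name
    "dims": [("lat", 2), ("long", 3)],
    "array": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "maps": {"lat": [10.0, 20.0], "long": [100.0, 110.0, 120.0]},
}
translate_grid(grid, dimensions, variables)
print(dimensions)
print(sorted(variables))
print(flatten_structure("station", {"id": 7, "pos": {"lat": 1.0, "long": 2.0}}))
```

A second grid that reuses `lat` and `long` with the same sizes would share the existing dimensions, which is the point of synthesizing them from name and size.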
netCDF-4 (aka enhanced) Data Model
  Derived from the HDF5 data model
  netCDF-3 model plus:
    More primitives: ubyte, ushort, uint, (u)int64, string, enums, opaque (fixed length byte strings)
    Named user defined types:
      Compound (=Structure)
      Vlen – variable length 1-d array
    Arbitrary use of unlimited dimensions
    Groups: similar to file system directory tree
      Group can contain types, dimensions, and variables
netCDF-4 Translation Issues
  netCDF-4 is effectively a superset of the current OPeNDAP data model
  Carryover issues from netCDF-3:
    Inference of shared dimensions
    Grid translation to coordinate variable convention
  Translate structures, grids, and sequences to compound types or maybe groups?
  Explore DAP data model extensions to include selected netCDF Enhanced concepts
    Esp. groups and shared dimensions
Server-side issues
  Desirable to be able to pass a netCDF-4 file through a DAP server to a DAP client, then through the translation, and get the same file back
  Information is currently lost in translation
  Solutions:
    Add various attribute tags to restore missing information
    Extend OPeNDAP data model
Status
  netCDF 4.1-alpha: available now
    Libdap+libnc-dap version integrated into current netCDF snapshot build
    Supports translation of subset of the DAP protocol to netCDF-3
    Requires C++
  netCDF 4.1-beta: end of 2008
    Utilizes Ocapi + modified netCDF => no C++
    Limited translation similar to libnc-dap
  netCDF 4.1: 2009
    Utilizes Ocapi + modified netCDF
    Complete support for translating DAP to netCDF-4
  Java version also exists now
    Uses somewhat different translation rules
Acknowledgement
  NSF Award #0721628
  Title: SDCI NMI Improvement: OPeNDAP and NetCDF Integration
  Principal Investigators: James Gallagher (opendap.org) and Russell Rew (Unidata)