Towards a common data model and a more efficient ocean data Archive
Steve Rutz
NOAA/NESDIS National Oceanographic Data Center
NODC Observing Systems Team Leader
June 21, 2011
Overview
Background and Terminology
Past Data Models
New Data Models
Recent activities toward a more efficient archive
  Archive automation
  Archive access services
  NetCDF templates
Outreach
So, what does NODC do?
Acquire: receive ocean and coastal data from U.S. and foreign sources
Archive: preserve those data assets for the long term
Access: provide access to archival data and data products for business, federal, science, and many other customers
Add Value: assemble easy-to-use, long-term collections for science and applications
Scientific Stewardship
The OAIS Reference Model
Open Archival Information System (OAIS)
ISO Standard (14721) for Digital Archives
Applies to all organizations that need to preserve digital information for the long term
Does NOT specify any particular implementation
An organization conforms to the OAIS RM by discharging a minimal set of responsibilities and supporting basic information concepts
The OAIS Environment
Producer provides information to be preserved
Management sets overall policy
Consumer seeks and acquires preserved information
[Figure: The OAIS Environment from 30,000 ft. Labels: OAIS Archive, Management, Producer, Consumer]
Data Model (NODCentric)
An abstract model that documents and organizes environmental data for communication between data producers, the Archive (NODC), and data consumers so that applications can be written to access and store data.
Data Producer ↔ NODC ↔ Data Consumer
Past Data Model: MULDARS
MULti-disciplinary Data Archive and Retrieval System (aka NODC Master Data Files)
NODC developed MULDARS in the 1970s
Dozens of file formats
  One for each type of data
  ASCII/text, 80 or 120 character records
  Example: High-resolution CTD/STD Data (F022)
  Example: Benthic Organism Data (F132)
Most data stored/accessed in one or more of these formats
Ended in 1994 – Difficult to maintain
  Converting new data
  Adding new data types
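To illustrate why one fixed-width format per data type is costly to maintain, here is a minimal sketch of reading a single 80-character record. The column layout, field names, and sample record are hypothetical and are NOT the real F022 layout; each real format would need its own table like this one.

```python
RECORD_LENGTH = 80  # the slides mention 80- or 120-character records

# Hypothetical column layout (name, start, end). This is NOT the real
# F022 layout; it only shows the shape of a fixed-width definition.
LAYOUT = [
    ("file_type", 0, 3),
    ("station_id", 3, 11),
    ("depth_m", 11, 17),
    ("temperature_c", 17, 23),
]


def parse_record(line: str) -> dict:
    """Slice one fixed-width record into named, whitespace-stripped fields."""
    record = line.rstrip("\n").ljust(RECORD_LENGTH)
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected at most {RECORD_LENGTH} characters")
    return {name: record[start:end].strip() for name, start, end in LAYOUT}


# Made-up sample record that follows the hypothetical layout above.
sample = "022" + "STN00042" + "  10.5" + " 18.25"
parsed = parse_record(sample)
```

Adding a new data type under this approach means defining and maintaining another layout table and another converter, which matches the maintenance burden the slide describes.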
New Data Models: netCDF
Network Common Data Form (netCDF)
Developed by the Unidata program at the University Corporation for Atmospheric Research (UCAR)
In different contexts: data model, file format, or API
NODC started receiving data in netCDF – 1990s
Two Data Models
  Classic
  Enhanced
Classic netCDF Data Model
Enhanced netCDF Data Model
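The difference between the two data models can be sketched in plain Python. This is an illustration of the concepts only, not the netCDF library's API: the classic model has dimensions, variables, and attributes with a small set of primitive types and at most one unlimited dimension, while the enhanced model adds nested groups (among other features). All class, function, and dataset names below are hypothetical.

```python
from dataclasses import dataclass, field

# Primitive types available in the classic model.
CLASSIC_TYPES = {"char", "byte", "short", "int", "float", "double"}


@dataclass
class Variable:
    dtype: str
    dimensions: tuple
    attributes: dict = field(default_factory=dict)


@dataclass
class Group:
    dimensions: dict = field(default_factory=dict)  # name -> size; None = unlimited
    variables: dict = field(default_factory=dict)   # name -> Variable
    attributes: dict = field(default_factory=dict)
    groups: dict = field(default_factory=dict)      # name -> Group (enhanced only)


def fits_classic_model(root: Group) -> bool:
    """True if the dataset uses only features of the classic model."""
    unlimited = [name for name, size in root.dimensions.items() if size is None]
    classic_types = all(v.dtype in CLASSIC_TYPES for v in root.variables.values())
    return not root.groups and len(unlimited) <= 1 and classic_types


# Hypothetical flat dataset: fits the classic model.
classic = Group(
    dimensions={"time": None, "station": 1},
    variables={"sea_surface_temperature": Variable("float", ("time", "station"))},
    attributes={"title": "hypothetical buoy file"},
)

# Hypothetical dataset with a nested group: needs the enhanced model.
enhanced = Group(
    attributes={"title": "hypothetical buoy file"},
    groups={
        "waves": Group(
            dimensions={"time": None, "frequency": 47},
            variables={"spectral_density": Variable("float", ("time", "frequency"))},
        )
    },
)
```

Here `fits_classic_model(classic)` is true and `fits_classic_model(enhanced)` is false, because the second dataset relies on groups.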
Submission Information Form – a Submission Agreement reached between NODC and the Producer that specifies a data model for the Data Submission Sessions
  Identifies points of contact (POCs), environmental data types, file formats, etc.
Submission Information Package (SIP) – data and metadata files packaged by the Producer and acquired by NODC
Archival Information Package (AIP)
  Accession Tracking Database – tracks metadata relevant to data discovery and records version control for long-term stewardship
  Producer's SIP – data are preserved as submitted to NODC
  Archive's supplementary files – browse graphic of geographic coordinates, FGDC metadata records, and more interoperable file formats of the data for long-term stewardship
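The relationship between a SIP and an AIP can be sketched as follows. This is a conceptual illustration, not NODC's ingest code: the class names, the `ingest` function, the accession identifier, and the use of SHA-256 checksums are all assumptions made for the example. The one point taken from the slides is that the producer's files are kept exactly as submitted, with archive-generated files stored alongside them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SubmissionInformationPackage:
    producer: str
    files: dict  # filename -> bytes, exactly as submitted


@dataclass
class ArchivalInformationPackage:
    accession_id: str  # hypothetical identifier format
    version: int       # version control for long-term stewardship
    originals: dict    # the producer's SIP, preserved unchanged
    checksums: dict    # assumption: fixity information per original file
    supplementary: dict = field(default_factory=dict)  # browse graphics, FGDC records, etc.


def ingest(sip: SubmissionInformationPackage, accession_id: str,
           version: int = 1) -> ArchivalInformationPackage:
    """Wrap a SIP in an AIP without modifying the submitted files."""
    checksums = {name: hashlib.sha256(data).hexdigest()
                 for name, data in sip.files.items()}
    return ArchivalInformationPackage(accession_id, version,
                                      dict(sip.files), checksums)


# Hypothetical submission with made-up file contents.
sip = SubmissionInformationPackage(
    producer="Example Producer",
    files={"buoy.nc": b"data", "readme.txt": b"notes"},
)
aip = ingest(sip, accession_id="0000001")
```

Supplementary files would be added to `aip.supplementary` after ingest, leaving `aip.originals` untouched.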
Archive Automation
Example: NDBC Data
Archival buoy data from NDBC in F291 format (Meteorology, Oceanography, and Wave Spectra Data from Buoys)
NODC “manually” archived since 1970
Difficult for NDBC to add new data types (last one added in 2005) and maintain code
NODC-NDBC Modernization Project – 1st phase completed 2nd Quarter 2011
NODC automated acquisition and archiving of data from NDBC’s moored (weather) buoys and Coastal-Marine Automated Network (C-MAN) stations
Data in netCDF-4 – uses Enhanced Data Model
Next phase – Tropical Atmosphere-Ocean (TAO) buoys and CTD casts taken during TAO maintenance cruises
Example: GHRSST Project
NODC serves as GHRSST Long Term Stewardship and Reanalysis Facility
First data stream automatically ingested into NODC Ocean Archive System
Over 1.6 million netCDF files and 33 TB of SST data
Transitioning from netCDF-3 to netCDF-4 Classic by 2013
http://ghrsst.nodc.noaa.gov
Discovery
[Figure: discovery interfaces built on the NODC Ocean Archive (AIPs). Labels: Human to Machine Interfaces; Machine to Machine Interfaces; OAS, Google, Data.gov, GOS, CSW, OpenSearch, SRU/ISO23950, Geoportal Server Web App, Geoportal Server REST API]
Discovery is enabled through numerous interfaces designed for both humans and their machine clients.
Human-to-machine interfaces include government-mandated generalized portals like Data.gov and Geospatial One-Stop (GOS).
Discovery services are available for ALL of the NODC Archive holdings, but better metadata supports better discovery!
Access and Use
[Figure: access tiers built on the NODC Ocean Archive (AIPs), as labeled: LAS, GIS, KML; WCS, WMS, SOS; DAP; FTP and HTTP]
Enhanced online access, visualization, and analysis tools: These capabilities require more structured metadata and standardized file formats, so they are available for the fewest archive holdings.
Data Access Protocol (DAP): Requires standard file formats, so it is available for fewer archive holdings.
Basic FTP/HTTP access for all Archival Information Packages (AIP) in the NODC Ocean Archive: These distribution methods have no format or metadata requirements, so they work for all archive holdings, but they provide only basic download capability.
Get your DIPs!
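The tiering described above can be written as a small decision rule. This is a sketch under stated assumptions: the function name and the two boolean flags are hypothetical, and grouping WCS, WMS, SOS, LAS, GIS, and KML into a single "enhanced" tier is a simplification of the figure.

```python
def access_services(standard_format: bool, structured_metadata: bool) -> list:
    """Return the access methods an archive holding can support.

    Assumption: the requirements are cumulative, so the enhanced tier
    needs both a standard file format and structured metadata.
    """
    services = ["FTP", "HTTP"]  # no format or metadata requirements
    if standard_format:
        services.append("DAP")  # requires a standard file format
        if structured_metadata:
            # enhanced access, visualization, and analysis tools
            services += ["WCS", "WMS", "SOS", "LAS", "GIS", "KML"]
    return services


legacy_text_file = access_services(standard_format=False, structured_metadata=False)
plain_netcdf = access_services(standard_format=True, structured_metadata=False)
template_netcdf = access_services(standard_format=True, structured_metadata=True)
```

Each step up the tiers adds requirements, which is why the richest services cover the fewest holdings.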
Access Services
Ocean Archive System: http://www.nodc.noaa.gov/Archive/Search
HTTP and FTP: ftp://data.nodc.noaa.gov/ and http://data.nodc.noaa.gov/
OPeNDAP’s Data Access Protocol via Hyrax: http://data.nodc.noaa.gov/opendap
OGC’s WMS and WCS via THREDDS Data Server: http://data.nodc.noaa.gov/thredds
Web Accessible Folder of metadata harvested by Google, geodata.gov (aka Geospatial One-Stop), and Data.gov: http://data.nodc.noaa.gov/NESDIS_DataCenters/metadata/index.html
Live Access Server: http://data.nodc.noaa.gov/las
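As an illustration of machine-to-machine access, the sketch below builds DAP2-style request URLs against the OPeNDAP base address listed above. The dataset path and variable name are hypothetical, and the helper `dap_url` is made up for this example. It relies only on the DAP2 convention of appending a response suffix (`.dds`, `.das`, `.dods`, `.ascii`) and an optional constraint expression after `?`. No network request is made.

```python
from urllib.parse import quote

OPENDAP_BASE = "http://data.nodc.noaa.gov/opendap"  # base URL from the slide

DAP2_RESPONSES = {"dds", "das", "dods", "ascii"}


def dap_url(dataset_path: str, response: str = "dds", constraint: str = "") -> str:
    """Build a DAP2 request URL for a dataset (hypothetical helper)."""
    if response not in DAP2_RESPONSES:
        raise ValueError(f"unsupported DAP2 response: {response}")
    url = f"{OPENDAP_BASE}/{quote(dataset_path.lstrip('/'))}.{response}"
    if constraint:
        # Keep the bracket and colon syntax of constraint expressions readable.
        url += "?" + quote(constraint, safe="[]:,")
    return url


# Hypothetical dataset path and variable name, for illustration only.
structure_url = dap_url("example/sst.nc")
subset_url = dap_url("example/sst.nc", "ascii", "sst[0][0:10]")
```

A client would typically fetch the `.dds` response first to learn the dataset's structure, then request a subset with a constraint expression.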
Access Services
Stay tuned … More to Come!
OGC’s Catalog Service for the Web (CSW)
Search/Retrieval via URL (SRU)
Geoportal Server
ArcGIS Server
and someday Sensor Observation Service (SOS)
In other words, this is not your father’s NODC! (or your adviser’s, or the NODC you knew while in grad school…)
NODC’s netCDF templates
Follow standard conventions
  NetCDF Climate and Forecast (CF) Metadata
  Unidata’s netCDF Attribute Convention for Dataset Discovery (ACDD)
Recommend best practices for variables and attributes – e.g., uuid, platform, instrument, and expand acronyms
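A template of this kind lends itself to a simple completeness check. The sketch below is an assumption-laden illustration, not NODC's validation tool: `title`, `summary`, and `keywords` are taken from ACDD's highly recommended attributes, while `uuid`, `platform`, and `instrument` come from the slide. Treating all six as global attributes, and the function name itself, are assumptions.

```python
# Highly recommended discovery attributes from ACDD.
ACDD_ATTRS = ("title", "summary", "keywords")

# Names mentioned on the slide; treating them as global attributes is an assumption.
SLIDE_ATTRS = ("uuid", "platform", "instrument")


def missing_attributes(global_attrs: dict) -> list:
    """List the checked attributes that are absent or blank."""
    return [name for name in ACDD_ATTRS + SLIDE_ATTRS
            if not str(global_attrs.get(name, "")).strip()]


# Hypothetical global attributes of a file under review.
attrs = {
    "title": "Hypothetical moored buoy observations",
    "summary": "Example record used to illustrate the check.",
    "keywords": "",
    "platform": "moored_buoy",
}
gaps = missing_attributes(attrs)
```

For the example above, the check reports `keywords`, `uuid`, and `instrument` as missing, which is the kind of feedback a producer could act on before submission.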
NODC’s netCDF templates
Based on "feature types" by Unidata and CF
  trajectory (“done”)
  profile (“done”)
  grid (“done”)
  point (started)
  trajectory profile (started)
  time series
  time series profile
  swath
NetCDF and Conventions stack
[Figure: each layer of the stack with the software that works at that layer]
File Format: netCDF-3, netCDF-4(/HDF-5). Software: low-level netCDF, HDF API
Data Model: netCDF-classic, netCDF-enhanced. Software: netCDF API, OPeNDAP servers (e.g. THREDDS), Matlab
Community Conventions: CF, ACDD. Software: libcf, nc-ISO, GIS software (e.g. GDAL), LAS
Extended Community Conventions: NODC Best Practices. Software: custom software
CF Feature Types: point, time series, trajectory, profile, time series profile, trajectory profile, swath, grid
Other figure labels: "GHRSST, WOD, GTSPP, etc. standards"; "This is now an OGC standard"
NODC – National Oceanographic Data Center
CF – Climate and Forecast
ACDD – Attribute Convention for Dataset Discovery
GDAL – Geospatial Data Abstraction Library
ISO – International Organization for Standardization
LAS – Live Access Server
ERDDAP – Environmental Research Division’s Data Access Program
netCDF – Network Common Data Form
HDF – Hierarchical Data Format
GIS – Geographic Information Systems
DAP – Data Access Protocol
API – Application Programming Interface
Outreach
NODC Archive Training Session for IOOS Regional Association Data Managers (April 2011)
NOAA EDMC (June 2011)
Earth Science Information Partners (ESIP) Federation meeting (August 2011)
Next steps with netCDF templates
Post to NOAA's Global Earth Observation - Integrated Data Environment (GEO-IDE) for comment
Propose to CF
Send us an e-mail!
To: [email protected]
From: Stakeholder
Subject: NODC’s netCDF templates
Information? Collaboration?