Towards a common data model and a more efficient ocean data Archive Steve Rutz NOAA/NESDIS National Oceanographic Data Center NODC Observing Systems Team Leader June 21, 2011


Page 1: Steve Rutz NOAA/NESDIS National Oceanographic Data Center NODC Observing Systems Team Leader June 21, 2011

Towards a common data model and a more efficient ocean data Archive

Steve Rutz
NOAA/NESDIS National Oceanographic Data Center
NODC Observing Systems Team Leader
June 21, 2011

Page 2:

Overview

Background and Terminology
Past Data Models
New Data Models
Recent activities toward a more efficient archive
  Archive automation
  Archive access services
  NetCDF templates
Outreach

Page 3:

So, what does NODC do?

Acquire: receive ocean and coastal data from U.S. and foreign sources

Archive: preserve those data assets for the long term

Access: provide access to archival data and data products for business, federal, science, and many other customers

Add Value: assemble easy-to-use, long-term collections for science and applications

Scientific Stewardship

Page 4:

The OAIS Reference Model

Open Archival Information System (OAIS)
ISO Standard (14721) for Digital Archives
Applies to all organizations that need to preserve digital information for the long term
Does NOT specify any particular implementation
An organization conforms to the OAIS RM by discharging a minimal set of responsibilities and supporting basic information concepts

Page 5:

The OAIS Environment

Producer provides information to be preserved
Management sets overall policy
Consumer seeks and acquires preserved information

[Diagram: The OAIS Environment from 30,000 ft, showing Producer, Management, and Consumer around the OAIS Archive]

Page 6:

Data Model (NODCentric)

An abstract model that documents and organizes environmental data for communication between data producers, the Archive (NODC), and data consumers so that applications can be written to access and store data.

Data Producer ↔ NODC ↔ Data Consumer

Page 7:

Past Data Model: MULDARS

MULti-disciplinary Data Archive and Retrieval System (aka NODC Master Data Files)
NODC developed MULDARS in the 1970s
Dozens of file formats
  One for each type of data
  ASCII/text, 80- or 120-character records
  Example: High-resolution CTD/STD Data (F022)
  Example: Benthic Organism Data (F132)
Most data stored/accessed in one or more of these formats
Ended in 1994 – difficult to maintain
  Converting new data
  Adding new data types
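To make the maintenance burden of fixed-width formats concrete, here is a minimal sketch of how an 80-character record might be read. The column layout (`EXAMPLE_LAYOUT`) and the sample values are invented for illustration; they are not the real F022 or F132 specification, which the slides do not give.

```python
# Hypothetical column layout, invented for illustration only.
# It is NOT the real F022 or F132 specification.
# Each entry: (field name, first column, last column), 1-based and inclusive.
EXAMPLE_LAYOUT = (
    ("file_type", 1, 3),
    ("station", 4, 9),
    ("depth_m", 10, 15),
    ("temp_c", 16, 21),
)


def split_records(text, record_length=80):
    """Return non-blank lines padded or truncated to a fixed record length."""
    return [
        line.ljust(record_length)[:record_length]
        for line in text.splitlines()
        if line.strip()
    ]


def parse_record(record, layout=EXAMPLE_LAYOUT):
    """Slice one fixed-width record into named, whitespace-trimmed fields."""
    return {name: record[first - 1:last].strip() for name, first, last in layout}


# Made-up sample data: two records in the hypothetical layout above.
sample = "022000123  10.0 12.34\n022000123  20.0 11.87\n"
records = split_records(sample)
parsed = [parse_record(r) for r in records]
```

Every new data type in a scheme like this needs its own layout table and its own conversion code, which is one way to read the "difficult to maintain" point on the slide.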

Page 8:

New Data Models: netCDF

Network Common Data Form (netCDF)
Developed by the Unidata program at the University Corporation for Atmospheric Research (UCAR)
In different contexts: data model, file format, or API
NODC started receiving data in netCDF – 1990s
Two data models
  Classic
  Enhanced

Page 9:

Classic netCDF Data Model

Page 10:

Enhanced netCDF Data Model
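The diagrams for the two data models did not survive in this transcript. As a rough stand-in, the sketch below models the main building blocks in plain Python: dimensions, variables, and attributes, plus the nested groups that the enhanced model adds. The class names and the `fits_classic_model` helper are illustrative only and are not part of any netCDF library. The check is also simplified and ignores user-defined types.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# External data types of the classic model.
CLASSIC_TYPES = {"char", "byte", "short", "int", "float", "double"}


@dataclass
class Dimension:
    name: str
    size: Optional[int]  # None marks an unlimited dimension


@dataclass
class Variable:
    name: str
    dtype: str
    dims: Tuple[str, ...]
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Group:
    name: str
    dims: List[Dimension] = field(default_factory=list)
    variables: List[Variable] = field(default_factory=list)
    attrs: Dict[str, object] = field(default_factory=dict)
    groups: List["Group"] = field(default_factory=list)  # enhanced model only


def fits_classic_model(root: Group) -> bool:
    """Simplified check: no subgroups, at most one unlimited dimension,
    and only classic data types."""
    unlimited = [d for d in root.dims if d.size is None]
    return (
        not root.groups
        and len(unlimited) <= 1
        and all(v.dtype in CLASSIC_TYPES for v in root.variables)
    )


classic = Group(
    "/",
    dims=[Dimension("time", None), Dimension("depth", 10)],
    variables=[Variable("temperature", "float", ("time", "depth"),
                        {"units": "degree_Celsius"})],
)
enhanced = Group(
    "/",
    groups=[Group("buoy_a", variables=[Variable("station_name", "string", ())])],
)
```

In this sketch the classic model is just the special case of a single root group, which is why a netCDF-4 file can still be written in a "classic" style.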

Page 11:

Archive Automation

Submission Information Form – a Submission Agreement reached between NODC and the Producer that specifies a data model for the Data Submission Sessions
  Identifies points of contact (POCs), environmental data types, file formats, etc.

Submission Information Package – data and metadata files packaged by Producer and acquired by NODC

Archival Information Package
  Accession Tracking Database – tracks metadata relevant to data discovery and records version control for long-term stewardship
  Producer's SIP – data are preserved as submitted to NODC
  Archive's supplementary files – browse graphic of geographic coordinates, FGDC metadata records, and more interoperable file formats of the data for long-term stewardship
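The relationship between a SIP and an AIP can be sketched as a small data structure. Everything named here (the classes, `ingest`, `add_supplementary`, the file names, the accession number, and the rule that each supplementary file bumps the version) is an assumption made for illustration, not a description of NODC's actual software.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SubmissionInformationPackage:
    """Producer's files; frozen so they stay exactly as submitted."""
    producer: str
    files: Tuple[str, ...]


@dataclass
class ArchivalInformationPackage:
    """The untouched SIP plus whatever the archive adds alongside it."""
    accession_id: int
    sip: SubmissionInformationPackage
    version: int = 1
    supplementary_files: List[str] = field(default_factory=list)


def ingest(sip: SubmissionInformationPackage, accession_id: int) -> ArchivalInformationPackage:
    """Wrap a SIP in a new AIP without modifying the submitted files."""
    return ArchivalInformationPackage(accession_id=accession_id, sip=sip)


def add_supplementary(aip: ArchivalInformationPackage, filename: str) -> ArchivalInformationPackage:
    """Attach an archive-generated file; the version bump is an assumed policy."""
    aip.supplementary_files.append(filename)
    aip.version += 1
    return aip


# Hypothetical producer files and accession number.
sip = SubmissionInformationPackage("NDBC", ("buoy_data.nc", "readme.txt"))
aip = ingest(sip, accession_id=1)
add_supplementary(aip, "browse_graphic.png")
add_supplementary(aip, "fgdc_metadata.xml")
```

The design point is the separation: the producer's files are never edited, and anything the archive generates sits next to them rather than replacing them.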

Page 12:

Example: NDBC Data

Archival buoy data from NDBC in F291 format (Meteorology, Oceanography, and Wave Spectra Data from Buoys)
NODC “manually” archived since 1970
Difficult for NDBC to add new data types (last one added in 2005) and maintain code
NODC-NDBC Modernization Project – 1st phase completed 2nd Quarter 2011
NODC automated acquisition and archiving of data from NDBC’s moored (weather) buoys and Coastal-Marine Automated Network (C-MAN) stations
Data in netCDF-4 – uses Enhanced Data Model
Next phase – Tropical Atmosphere-Ocean (TAO) buoys and CTD casts taken during TAO maintenance cruises

Page 13:

Example: GHRSST Project

NODC serves as GHRSST Long Term Stewardship and Reanalysis Facility

First data stream automatically ingested into NODC Ocean Archive System

Over 1.6 million netCDF files and 33 TB of SST data

Transitioning from netCDF-3 to netCDF-4 Classic by 2013

http://ghrsst.nodc.noaa.gov

Page 14:

Discovery

Discovery is enabled through numerous interfaces designed for both humans and their machine clients.

Human-to-machine interfaces include government-mandated generalized portals like Data.gov and Geospatial One-Stop (GOS)

Discovery services are available for ALL of the NODC Archive holdings, but better metadata supports better discovery!

[Diagram: Human to Machine Interfaces and Machine to Machine Interfaces sitting above the NODC Ocean Archive (AIP AIP AIP AIP AIP). Interfaces shown: OAS, Google, Data.gov, GOS, CSW, OpenSearch, SRU/ISO23950, Geoportal Server Web App, Geoportal Server REST API]

Page 15:

Access and Use

[Diagram: access tiers above the NODC Ocean Archive (AIP AIP AIP AIP AIP): LAS, GIS, KML; WCS, WMS, SOS; DAP; FTP and HTTP]

Enhanced online access, visualization, and analysis tools: These capabilities require more structured metadata and standardized file formats, so are available to the fewest archive holdings.

Data Access Protocol (DAP): Requires standard file formats so is available to fewer archive holdings.

Basic FTP/HTTP access for all Archival Information Packages (AIP) in the NODC Ocean Archive: These distribution methods have no format or metadata requirements so they work for all archive holdings, but they provide only basic download capability.

Get your DIPs!
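The tiering described on this slide can be written as a simple rule. The sketch below is a toy model only: the `Holding` class, its two boolean flags, and the tier labels are assumptions introduced here to mirror the slide's wording, not fields of the actual archive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Holding:
    # Hypothetical flags summarizing one archive holding.
    accession_id: int
    standard_format: bool
    structured_metadata: bool


def access_tiers(holding: Holding) -> List[str]:
    """Return the access tiers a holding qualifies for, most basic first."""
    tiers = ["basic download (FTP/HTTP)"]  # no format or metadata requirements
    if holding.standard_format:
        tiers.append("Data Access Protocol (DAP)")
        if holding.structured_metadata:
            tiers.append("enhanced access, visualization, and analysis")
    return tiers


legacy = Holding(accession_id=1, standard_format=False, structured_metadata=False)
gridded = Holding(accession_id=2, standard_format=True, structured_metadata=True)
```

Each tier adds a requirement on top of the one below it, which is why the most capable services reach the fewest holdings.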

Page 16:

Access Services

Ocean Archive System
  http://www.nodc.noaa.gov/Archive/Search
HTTP and FTP
  ftp://data.nodc.noaa.gov/
  http://data.nodc.noaa.gov/
OPeNDAP’s Data Access Protocol via Hyrax
  http://data.nodc.noaa.gov/opendap
OGC’s WMS and WCS via THREDDS Data Server
  http://data.nodc.noaa.gov/thredds
Web Accessible Folder of metadata harvested by Google, geodata.gov (aka Geospatial One Stop), and Data.gov
  http://data.nodc.noaa.gov/NESDIS_DataCenters/metadata/index.html
Live Access Server
  http://data.nodc.noaa.gov/las
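As a rough illustration of what DAP access looks like from a client's side, the sketch below builds DAP2-style request URLs: a response suffix such as `.dds`, `.das`, or `.ascii` is appended to the dataset URL, optionally followed by a constraint expression with `[start:stride:stop]` index ranges. The dataset path and the variable name `sst` are hypothetical and chosen only for the example. The code constructs strings and makes no network request.

```python
def dap_request_url(dataset_url, response="dds", variable=None, slices=()):
    """Build a DAP2-style URL: <dataset>.<response>[?<variable>[start:stride:stop]...]."""
    url = f"{dataset_url}.{response}"
    if variable is not None:
        hyperslab = "".join(
            f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
        )
        url += f"?{variable}{hyperslab}"
    return url


# Server root from the slide; the path below it is a made-up placeholder.
DATASET = "http://data.nodc.noaa.gov/opendap/hypothetical/path/sst_example.nc"

structure_url = dap_request_url(DATASET, "dds")
subset_url = dap_request_url(
    DATASET, "ascii", variable="sst", slices=[(0, 1, 0), (0, 1, 9), (0, 1, 9)]
)
```

The subset request is the main attraction of DAP over plain FTP/HTTP: a client can ask for part of one variable instead of downloading the whole file.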

Page 17:

Access Services

Stay tuned … More to Come!

OGC’s Catalog Service for the Web (CSW)
Search/Retrieval via URL (SRU)
Geoportal Server
ArcGIS Server
and someday Sensor Observation Service (SOS)

In other words, this is not your father’s NODC! (or your adviser’s, or the NODC you knew while in grad school…)

Page 18:

NODC’s netCDF templates

Follows standard conventions
  NetCDF Climate and Forecast (CF) Metadata
  Unidata’s netCDF Attribute Convention for Dataset Discovery (ACDD)
Recommends best practices for variables and attributes – e.g., uuid, platform, instrument, and expand acronyms
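A template of this kind can be checked mechanically. The sketch below reports which recommended names are missing from a file's metadata, represented here as a plain dictionary. The list mixes the `Conventions` attribute from CF, three discovery attributes from ACDD (`title`, `summary`, `keywords`), and the three names mentioned on the slide. Treating all of them as global attributes is a simplifying assumption, and the sample values are made up.

```python
# Illustrative checklist only, not the full NODC template.
RECOMMENDED = (
    "Conventions",
    "title",
    "summary",
    "keywords",
    "uuid",
    "platform",
    "instrument",
)


def missing_attributes(global_attrs, recommended=RECOMMENDED):
    """Return recommended names that are absent or blank, in checklist order."""
    return [
        name
        for name in recommended
        if not str(global_attrs.get(name, "")).strip()
    ]


# Made-up metadata for one file.
example_attrs = {
    "Conventions": "CF-1.6",
    "title": "Example sea surface temperature profile",
    "summary": "",
    "platform": "example buoy",
}
gaps = missing_attributes(example_attrs)
```

A check like this is what makes "better metadata supports better discovery" enforceable at submission time rather than after the data are archived.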

Page 19:

NODC’s netCDF templates

Based on "feature types" by Unidata and CF
  trajectory (“done”)
  profile (“done”)
  grid (“done”)
  point (started)
  trajectory profile (started)
  time series
  time series profile
  swath
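The template list maps naturally onto a small lookup table. In the sketch below, the second column holds the `featureType` attribute value used by CF's discrete sampling geometries for each template; `None` there means that CF does not define a discrete-sampling `featureType` for that kind of data (grid and swath). The status column repeats the slide, with `None` where the slide gives no status. The table and helper names are illustrative.

```python
# template name -> (CF featureType value or None, status from the slide or None)
TEMPLATES = {
    "trajectory": ("trajectory", "done"),
    "profile": ("profile", "done"),
    "grid": (None, "done"),
    "point": ("point", "started"),
    "trajectory profile": ("trajectoryProfile", "started"),
    "time series": ("timeSeries", None),
    "time series profile": ("timeSeriesProfile", None),
    "swath": (None, None),
}


def templates_with_status(status):
    """Return template names whose slide status matches, sorted alphabetically."""
    return sorted(
        name for name, (_, state) in TEMPLATES.items() if state == status
    )


done = templates_with_status("done")
started = templates_with_status("started")
not_reported = templates_with_status(None)
```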

Page 20:

NetCDF and Conventions stack

[Diagram: four layers, each paired with its software stack]

File Format: netCDF-3, netCDF-4(/HDF-5)
  Software stack: low-level netCDF, HDF API
Data Model: netCDF-classic, netCDF-enhanced
  Software stack: netCDF API, OPeNDAP servers (e.g. THREDDS), Matlab
Community Conventions: CF, ACDD
  Software stack: libcf, nc-ISO, GIS software (e.g. GDAL), LAS
Extended Community Conventions: NODC Best Practices; GHRSST, WOD, GTSPP, etc. standards
  Software stack: custom software

CF Feature Types: point, time series, trajectory, profile, time series profile, trajectory profile, swath, grid

Annotation on the diagram: This is now an OGC standard

Abbreviations
NODC – National Oceanographic Data Center
CF – Climate and Forecast
ACDD – Attribute Conventions for Data Discovery
GDAL – Geospatial Data Abstraction Library
ISO – International Organization for Standardization
LAS – Live Access Server
ERDDAP – Environmental Research Division’s Data Access Program
netCDF – Network Common Data Form
HDF – Hierarchical Data Format
GIS – Geographic Information Systems
DAP – Data Access Protocol
API – Application Programming Interface

Page 21:

Outreach

NODC Archive Training Session for IOOS Regional Association Data Managers (April 2011)
NOAA EDMC (June 2011)
Earth Science Information Partner (ESIP) Federation meeting (August 2011)

Next steps with netCDF templates
  Post to NOAA's Global Earth Observation - Integrated Data Environment (GEO-IDE) for comment
  Propose to CF

Page 22:

Send us an e-mail!

To: [email protected]
From: Stakeholder
Subject: NODC’s netCDF templates

Information? Collaboration?