1P Opening Plenary
Time: Jan 9, 08:30-10:00
Location: Salon F-H
Description: Conference welcome, NOAA EDM Update, Panel Session with EDM Committee Members
Chair: Jeff de La Beaujardière
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEVlhFbUV0SnVIZ0E

Agenda (see abstracts below):
1P.1 (15 min) Welcome & Logistics - Jeff de La Beaujardière
1P.2 (15 min) NOAA EDM Update - Jeff de La Beaujardière
1P.3 (60 min) EDMC Panel - EDMC LO Reps

Abstracts

1P.1 Welcome & Logistics - Jeff de La Beaujardière (NOAA/EDMC). Conference opening and logistical information.

1P.2 NOAA EDM Update - Jeff de La Beaujardière (NESDIS/EDMC). This talk will review the current status of NOAA EDMC Directives and data management activities.

1P.3 EDMC Panel - EDMC LO Reps (NOAA/EDMC): Scott Hausmann (NESDIS), Steve Olson (NWS), Michele Jacobi (NOS), Laura Letson (OAR), Nathan Wilson (NMFS). Environmental Data Management Committee (EDMC) representatives from each NOAA Line Office will summarize some data management activities in their domain, followed by Q&A.

2A Linking Data Visualization to Societal Benefits
Time: Jan 9, 10:30-12:00
Location: Salon F-H
Description: NOAA produces terabytes of data, and the challenge is to make those data accessible to decision-makers and their decision support systems in ways that are useful for their needs. Data visualization offers a solution for making data more useful because it allows us to identify patterns, trends, and correlations that might otherwise go unnoticed. We propose a session with a series of talks that demonstrate the data value chain, from development to use and societal benefits. Dr. Monica Grasso, NOAA’s Chief Economist, will moderate the session. This session will serve as the beginning of a broader conversation at NOAA about how to use data visualization, and as an opportunity to encourage participants to join a Data Visualization Community of Practice that is in its early stages.
Chair: Monica Grasso or Jeff Adkins
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEaGJyREhLMkJKdGs

Agenda (see abstracts below):
2A.1 (12 min) Using Data Visualization to Tell NOAA’s Story - Denna Geppi
2A.2 (12 min) Enabling Event-Driven Data Discovery - Aaron Sweeney
2A.3 (12 min) Developing User Tools for Integration and Visualization of Multidisciplinary Coastal and Ocean Data - Rob Bochenek
2A.4 (12 min) Designing visualizations and user interfaces that reduce noise and increase understanding - Dan Pisut
2A.5 (12 min) User Driven Data Mining, Visualization and Decision Making for NOAA Observing System and Data Investments - Matthew Austin
2A.6 (12 min) Using Environmental Big Data to Improve Outcomes - Michelle Lapinski
2A.7 (12 min) How does data get from the source to the end user? Old School vs New School - Bob Simons
2A.8 (6 min) Open Discussion

Abstracts

2A.1 Using Data Visualization to Tell NOAA’s Story - Denna Geppi (CFO/Office of Performance, Risk and Social Science), with Jeff Adkins, Valerie Were, Rajendra Poudel. Social science is integral to achieving NOAA’s goals. The strategy of the Social Science team is to integrate social, behavioral, and economic (SBE) science end-to-end in NOAA’s mission and priorities. To do that, we must articulate and communicate the societal benefits generated by NOAA’s science, service, and stewardship activities to NOAA leadership and to decision makers. One of our major challenges has been getting a natural science agency to fully embrace and integrate social science in its operations to better relay the societal impacts of the agency’s work. Through data visualization, we are finding more effective ways to tell our story. This talk will discuss how the Social Science team is starting to use data visualization to communicate the societal benefits of our work, linked to all of NOAA’s priorities.

2A.2 Enabling Event-Driven Data Discovery - Aaron Sweeney (NESDIS/NCEI), with Paula Dunbar (NCEI), Nicolas Arcos (NCEI). People naturally want to learn more about potential impacts to their lives. At the National Centers for Environmental Information (NCEI), we inspect and archive multiple sources of information regarding natural hazards, especially tsunamis. We have developed a visual aid to data discovery related to specific tsunami events using a timeline. The timeline is created using an open-source JavaScript library (TimelineJS) originally designed for news stories. The timeline depends on information expressed in JavaScript Object Notation (JSON) format. The approach is quite flexible, using images to draw the reader’s attention and providing links to multiple data sources for further exploration. We initially focused on data and products that we steward at NCEI. We’ll discuss how this approach might be extended to data held by other entities. See https://www.ngdc.noaa.gov/hazard/recenttsunamis.shtml.
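To make the data dependency concrete: TimelineJS consumes a JSON document with a top-level `events` array, each event carrying a `start_date` and a `text` block. The sketch below follows TimelineJS’s published JSON shape, but the event content and media URL are invented for illustration and are not taken from the NCEI timeline:

```python
import json

# Minimal sketch of a TimelineJS-style JSON document. The field names
# (events, start_date, text, media) follow TimelineJS's documented
# format; the values below are purely illustrative.
timeline = {
    "title": {"text": {"headline": "Recent Tsunami Events"}},
    "events": [
        {
            "start_date": {"year": "2011", "month": "3", "day": "11"},
            "text": {
                "headline": "Tohoku, Japan",
                "text": "Summary with links to archived data and products.",
            },
            # Hypothetical image URL used to draw the reader's attention.
            "media": {"url": "https://example.gov/tsunami.png"},
        }
    ],
}

# The browser-side TimelineJS library is then pointed at this JSON.
print(json.dumps(timeline, indent=2))
```

In the approach the abstract describes, a script would generate one such document per tsunami event, with each event linking back to the relevant archived datasets.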

2A.3 Developing User Tools for Integration and Visualization of Multidisciplinary Coastal and Ocean Data - Rob Bochenek (Axiom Data Science LLC/IOOS), with Derrick Snowden (NOS/IOOS). The presenter will demonstrate a suite of discovery and visualization tools developed to improve the communication of ocean data and information products to the broader user base. The presentation will begin with a brief summary of the architecture and approach used to integrate a wide breadth of data types (sensors, model and remotely sensed grids, GIS, time series, and project/field study data) and domains (geophysical, ecological, and socioeconomic). The techniques described here form the foundation of the cyberinfrastructure that supports three of the eleven US IOOS regional associations and several national data integration systems funded through the centralized IOOS office. The talk will then showcase a series of modern spatial/temporal visualization techniques developed to improve accessibility to complex scientific ocean and coastal data sets. Additionally, examples of cyberinfrastructure developed with funding from NOAA and others will be demonstrated, which enable data resources (GIS, numerical models and remotely sensed grids, sensor networks, time series, and research project assets) to be discovered, integrated, and visualized.

2A.5 User Driven Data Mining, Visualization and Decision Making for NOAA Observing System and Data Investments - Matthew Austin (NESDIS/OPPA/TPIO), with Sabrina Taijeron (NESDIS/TPIO), Lauraleen O'Connor, Brant Priest (NESDIS/TPIO), Adam Neiss (NESDIS/TPIO), Rohit Arora (NESDIS/TPIO). The National Oceanic and Atmospheric Administration (NOAA) observing system enterprise represents a $2.4B annual investment. Earth observations from these systems are foundational to NOAA’s mission to describe, understand, and predict the Earth’s environment. NOAA’s decision makers are charged with managing this complex portfolio of observing systems to serve the national interest effectively and efficiently. The Technology Planning & Integration for Observation (TPIO) Office currently maintains an observing system portfolio of NOAA’s validated user observation requirements, observing capabilities, and resulting data products and services. TPIO performs data analytics to provide NOAA leadership with business case recommendations for making sound budgetary decisions. Over the last year, TPIO has moved from massive spreadsheets to intuitive dashboards that give Federal agencies as well as the general public the ability to explore user observation requirements and the environmental observing systems that monitor and predict changes in the environment. This change has led to an organizational data management shift to analytics and visualizations, allowing analysts more time to focus on understanding the data, discovering insights, and effectively communicating the information to decision makers. Moving forward, the next step is to facilitate a cultural change toward self-serve data sharing across NOAA, other Federal agencies, and the public, using intuitive data visualizations that answer relevant business questions for users of NOAA’s Observing System Enterprise. Users and producers of environmental data will become aware of the need for enhanced communication that simplifies information exchange to achieve multipurpose goals across a variety of disciplines. NOAA cannot achieve its goal of producing environmental intelligence without data that can be shared by multiple user communities. This presentation will describe where we are on this journey and will provide examples of these visualizations, promoting a better understanding of NOAA’s environmental sensing capabilities and enabling improved communication to decision makers in an effective and intuitive manner.

2A.4 Designing visualizations and user interfaces that reduce noise and increase understanding - Dan Pisut (NESDIS/Visualization Lab). With the ubiquity of software and toolkits, almost anyone can create data visualizations and user interfaces. However, all too often we inject unintentional flaws into our designs that weaken their veracity or make navigation less straightforward. This talk will address some of these pitfalls and provide solutions drawn from best practices in cognitive and social science, all with the goal of making a better connection to your end user.


2A.7 How does data get from the source to the end user? Old School vs New School - Bob Simons (NMFS/SWFSC ERD). This talk will contrast the old-school systems for data distribution, which are still widely used, with the new systems, which can be tremendously easier to use and more efficient. The new systems offer catalogs to make datasets discoverable, web services to make the actual data accessible and subsettable, and improved metadata to make the data understandable and usable. The federal government’s PARR requirements mandate that all government-funded data be made freely available via these new systems.

2A.6 Using Environmental Big Data to Improve Outcomes - Michelle Lapinski (The Earth Genome/VP). All of the world’s most influential decision makers are potential users of environmental data, but today that use is limited. Why? They lack an easy way to integrate data, science, and visualization to help them evaluate the outcomes (social, environmental, and financial) of their own decisions. Today that all feels too hard. At Earth Genome, our vision is for the world’s key decision makers to consistently and routinely take into account the full value of nature and the consequences of their activities on natural systems, thereby averting economic and social disruptions due to the mismanagement of natural capital and environmental risk. We build on data, applied science, and the latest in GIS visualization to create highly visual decision-support tools that help businesses and governments evaluate how solutions and interventions could be applied at greater scale to improve real outcomes on the ground. We are working with companies and governments to provide place-based, temporal scientific assessments of natural resource challenges, ranging from securing water supply to increasing agricultural yield while improving environmental or social conditions. This presentation will include a demo of the Green Infrastructure Support Tool, which enables evaluation of green infrastructure solutions to address water quantity issues (being piloted by Dow Chemical at their largest global facility, located in Freeport, Texas), and a preview of the Groundwater Recharge Assessment Tool, under development, which will enable California irrigation districts to optimize investments in groundwater recharge. Both tools draw on NOAA datasets and enable end users to take into account the impacts of their own decisions. These apps are part of Earth Genome’s broader efforts to increase the availability and usefulness of planetary data for decision-making through a data/tool platform that dramatically lowers the cost and time needed to analyze data and translate it into implementable insights.

2B Improved solutions for metadata publication to the NOAA Data Catalog
Time: Jan 9, 10:30-12:00
Location: Glen Echo
Description: Problem-solving session. There are common problems across NOAA that need common solutions, and PARR has recently brought a few prevalent needs to the surface. This session will focus on two aspects. 1) Metadata record publication methods. The problem: there is no defined workflow for getting metadata collections into the NOAA Data Catalog, the NOAA OneStop Project, or other discovery portals. Each line office or project must define its own process and determine how to set up publication. How do funded NOAA partners become involved? There should be a well-defined process and end-user resources for facilitating publication of all NOAA metadata. Chris will go over the current process and propose ideas for discussion. 2) Scope of collection-level metadata records. If the granule is the minimum retrievable unit and the collection is the minimum citable unit, how do you know what should be citable? Current collection-level definitions vary widely depending on the data type, the communities involved, and the technical implementations available. These varied implementations clutter search results, confuse inventory results, and often don’t make sense for the end user. The purpose of this discussion is to develop a decision tree that helps determine the appropriate level of aggregation and could even result in a defined set of ISO metadata templates appropriate for each collection type.
Chair: Chris MacDermaid & Anna Milan
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEV29IMFhheGtSYjA

Agenda (see abstracts below):
2B.1 (90 min) Design a Decision Tree for Scoping Metadata Aggregation - Anna Milan

Abstracts

2B.1 Design a Decision Tree for Scoping Metadata Aggregation - Anna Milan (NESDIS/NCEI). If the granule is the minimum retrievable unit and the collection is the minimum citable unit, how do you know what should be citable? Current collection-level definitions vary widely depending on the data type, the communities involved, and the technical solutions available. These varied implementations clutter the NOAA Catalog’s search results, confuse inventory results, and often don’t make sense for the end user. The purpose of this discussion is to develop a decision tree that helps determine the appropriate level of aggregation and could even result in a defined set of ISO metadata templates appropriate for each collection type.
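To make the idea concrete, a decision tree of this kind could eventually be expressed as a short series of ordered questions about a dataset. The sketch below is purely hypothetical: the questions and outcomes are invented for illustration, and designing the real tree is the goal of the session:

```python
# Hypothetical sketch of a metadata-aggregation decision tree.
# The criteria (citability, platform, processing version) are invented
# for illustration only; the session's purpose is to design the actual tree.
def suggested_aggregation(citable_as_one_unit: bool,
                          same_platform: bool,
                          same_processing: bool) -> str:
    """Suggest the level at which to write a collection-level record."""
    if not citable_as_one_unit:
        return "cite granules individually"
    if same_platform and same_processing:
        return "one collection record"
    if same_processing:
        return "one collection record per platform"
    return "one collection record per processing version"

# Example: homogeneous processing across several platforms.
print(suggested_aggregation(True, False, True))
```

A tree like this could in turn map each outcome to one of the ISO metadata templates the abstract envisions.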

2C ERDDAP: Introduction and Use
Time: Jan 9, 10:30-12:00
Location: White Flint
Description: The session will revolve around the software tool known as ERDDAP. The first half of the session will include presentations introducing ERDDAP and what it offers data providers, examples of how ERDDAP is being used in the scientific community, and tools for installing and using ERDDAP. The second half of the session will focus on hands-on configuration of datasets in ERDDAP. Ideally, we will work with data providers who are interested in using ERDDAP but have struggled to properly configure their datasets. This part of the session will address those issues, and offer insights and tips from the ERDDAP developer on how to better configure and run an ERDDAP server. By the end of the session, we hope attendees will have a much clearer understanding of the value ERDDAP provides data producers, and will have received help configuring their own data in an ERDDAP server.
Chair: Kevin O'Brien
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEUmxvT0ZqS0tQM1E

Agenda (see abstracts below):
2C.1 (15 min) An Introduction to ERDDAP - Bob Simons
2C.2 (30 min) Hands-on with ERDDAP: Generating dataset configurations - Bob Simons
2C.3 (15 min) Deploying and Managing ERDDAP servers - Kyle Wilcox
2C.4 (15 min) Using Erddap as a building block in Ireland’s Integrated Digital Ocean - Adam Leadbetter
2C.5 (15 min) ERDDAP and the International Ocean Community - Kevin O'Brien

Abstracts

2C.1 An Introduction to ERDDAP - Bob Simons (NMFS/SWFSC ERD). ERDDAP is a free, open-source data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP has been installed at over 50 organizations around the world. NOAA's Data Access Procedural Directive includes ERDDAP in its list of recommended data servers for use by groups within NOAA. This talk will offer a quick introduction to ERDDAP and describe some features that have recently been added to ERDDAP.
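The "simple, consistent way to download subsets" is ERDDAP's RESTful URL scheme: a dataset ID, a requested file type, and an optional list of variables and constraints. A minimal sketch of building such a request, assuming a hypothetical server and dataset ID (the URL grammar itself follows ERDDAP's documented tabledap convention):

```python
from urllib.parse import quote

def tabledap_url(server, dataset_id, variables, constraints, filetype="csv"):
    """Build an ERDDAP tabledap request URL of the form
    <server>/tabledap/<datasetID>.<fileType>?<vars>&<constraint>&...
    """
    query = ",".join(variables)
    for c in constraints:
        # ERDDAP constraints use operators like >=, <=, =; other special
        # characters (e.g. the colons in timestamps) are percent-encoded.
        query += "&" + quote(c, safe="=<>!")
    return f"{server}/tabledap/{dataset_id}.{filetype}?{query}"

# Hypothetical example: a CSV subset of a sea-surface-temperature dataset.
url = tabledap_url(
    "https://example.gov/erddap",        # hypothetical server
    "sst_buoys",                         # hypothetical dataset ID
    ["time", "latitude", "longitude", "sst"],
    ["time>=2017-01-09T00:00:00Z"],
)
print(url)
```

Changing `filetype` to `nc`, `json`, `htmlTable`, or one of ERDDAP's graph formats returns the same subset in a different representation, which is the consistency the talk highlights.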

2C.3 Deploying and Managing ERDDAP servers - Kyle Wilcox (Axiom Data Science/NOAA/IOOS). This technical talk will focus on automation techniques for easing deployment and management of ERDDAP servers, drawn from the experience of managing and deploying dozens of ERDDAP instances. We will highlight improvements we have made to ERDDAP based on modern software principles, techniques for streamlining the deployment of ERDDAP into a production environment, and Python tools for managing ERDDAP's content files, and explain the different ways a data manager can reload datasets without ever restarting the ERDDAP instance.

2C.2 Hands-on with ERDDAP: Generating dataset configurations - Bob Simons (NOAA/NMFS/SWFSC), with Kevin O'Brien (UW/JISAO and NOAA/PMEL). This hands-on session will demonstrate how to configure datasets in an ERDDAP server, including tips and tricks on how best to configure data, as well as tools to help create dataset configurations.
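Dataset configuration in ERDDAP lives in its datasets.xml file, and the bundled GenerateDatasetsXml tool produces a draft entry that sessions like this one teach providers to refine. The fragment below is an illustrative sketch only: the datasetID, paths, and variable names are hypothetical, and a real entry carries many more attributes than shown here.

```xml
<dataset type="EDDTableFromAsciiFiles" datasetID="myBuoyTemps" active="true">
  <reloadEveryNMinutes>60</reloadEveryNMinutes>
  <!-- hypothetical source directory and file pattern -->
  <fileDir>/data/buoys/</fileDir>
  <fileNameRegex>.*\.csv</fileNameRegex>
  <dataVariable>
    <!-- column name in the source files mapped to a standard name -->
    <sourceName>temp_c</sourceName>
    <destinationName>sea_water_temperature</destinationName>
    <dataType>float</dataType>
  </dataVariable>
</dataset>
```

Once an entry like this validates, ERDDAP serves the dataset through all of its access forms and file types without further per-format work.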

2C.5 ERDDAP and the International Ocean Community - Kevin O'Brien (UW/JISAO and NOAA/PMEL/OAR/UAF, OAR/OSMC), with Bob Simons (NOAA/NMFS), Kevin Kern (NOAA/NDBC), Bill Smith (NOAA/NDBC), Eugene Burger (NOAA/PMEL). Currently in the ocean observing community, a wide variety of platforms provide measurements on a variety of timescales, both in real time and through delayed-mode processes. These distinct data streams also tend to have disparate methods of managing and serving the data and information. As the ocean community evolves to support the concepts of Essential Ocean Variables (EOVs) and Essential Climate Variables (ECVs), it becomes critical to provide access to integrated collections of well-documented observations, which often originate from different platform observing networks.

NOAA’s Observing System Monitoring Center (OSMC), in partnership with the Unified Access Framework (UAF) project, is utilizing ERDDAP as the framework to improve integration of data and information across the global ocean observing networks. Several observing networks, such as Argo and the Global Drifter Program, are utilizing ERDDAP to serve their delayed-mode data. In addition, the OSMC project is utilizing ERDDAP to provide easy access to real-time data distributed via the Global Telecommunications System (GTS). Often these data are distributed in binary BUFR format, which is difficult for the non-expert to decode and understand. By using ERDDAP to serve the data, even novice users can access near-real-time ocean data through an assortment of interoperable services and formats.

In this presentation, we will discuss the ongoing efforts to improve integration and interoperability of ocean in situ observations using ERDDAP.

2C.4 Using Erddap as a building block in Ireland’s Integrated Digital Ocean - Adam Leadbetter (Marine Institute, Ireland/Integrated Digital Ocean), with Eoin O'Grady (Marine Institute). The Marine Institute is Ireland’s national agency with responsibility for marine research, technology development and innovation in Ireland. The Institute provides scientific and technical advice to the Irish Government to help inform policy and to support the sustainable development of Ireland’s marine resource. Under this remit, and recognising that the Marine Institute is only one of many organisations collecting data in Ireland’s marine region, the Institute has over the last two years developed and promoted the concept of Ireland’s Integrated Digital Ocean. This activity has included the development of a data portal, one of the building blocks of which is NOAA’s Erddap software.

The Integrated Digital Ocean programme promotes the publication of marine data through openly accessible web services, which can be combined into applications for various end-user groups. The demonstrator portal currently presents data published online by fifteen organisations in a single website. The portal delivers real-time (on the order of one second delay), near-real-time (on the order of ten minutes delay) and static (Geographical Information Systems type) data. A map interface allows users to drill down to a selected location, either an overview of a bay-scale region or an individual point, where Erddap is used to deliver a summary of the latest results to cards or a map pin. From the overview, users may drill down further to a graph viewer where observed and modelled results delivered from Erddap are combined into web graphics using the HighCharts.js library. Finally, users may navigate to a data download area where, again, the backend is powered by Erddap.

In this architecture Erddap is used as a brokering layer over multiple data sources, which include a Cassandra database storing real-time data from a subsea observatory; netCDF files provided by the global Argo drifting-float programme; a THREDDS server delivering modelled data; and a traditional SQL relational database storing other observational data. In some dashboards, the Erddap data outputs and graphs have been combined with MQTT protocol outputs to update the graphs in real time.

Further developments in progress for the Integrated Digital Ocean platform include: incorporation of remotely sensed and modelled data on the main map interface; incorporation of research vessel data outputs; and development of a standardised API for discovery and access to the Digital Ocean data.

3A NOAA PARR Implementation
Time: Jan 9, 13:30-17:00 (double session, with break)
Location: Salon F-H
Description: Federal and NOAA "Public Access to Research Results" (PARR) requirements [https://doi.org/10.7289/V5F47M2H] present implementation issues across NOAA, in particular the resources needed to properly track, archive, and provide access to scientific observations, models, and data-rich decision support tools, in addition to scientific publications (including technical reports and peer-reviewed publications) and the data supporting those publications. We would like to invite NOAA Program Offices to present their current and future implementation approaches, and discuss common and unique capabilities, needs, and challenges.
Chair: Jessica Morgan
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEVUxhOFZURzdqN00

Agenda (see abstracts below):
3A.1 (15 min) PARR - What you need to know - Katharine Weathers
3A.2 (15 min) Implementing PARR at NOAA NCCOS: Providing Data Access via Archiving and Web Services - Jessica Morgan
3A.3 (15 min) Beyond PARR - PMEL’s Integrated Data Management Strategy - Eugene F. Burger
3A.4 (15 min) NOAA Institutional Repository Update - Stanley Elswick
3A.5 (15 min) Use of DOIs at the NOAA Central Library - Stanley Elswick
3A.6 (15 min) Open Discussion
3A.7 (30 min) NWFSC PARR Implementation Plan - A Work in Progress, Part 1: Overview - Richard Kang
3A.8 (30 min) NWFSC PARR Implementation Plan - A Work in Progress, Part 2: Live Demo - Brendan Sylvander
3A.9 (30 min) Open Discussion

Abstracts

3A.2 Implementing PARR at NOAA NCCOS: Providing Data Access via Archiving and Web Services - Jessica Morgan (NOS/National Centers for Coastal Ocean Science (NCCOS)). The NOAA National Centers for Coastal Ocean Science (NCCOS) has a responsibility to share federally funded scientific data according to the Federal Public Access to Research Results (PARR) requirements. Using the NOAA Environmental Data Management Committee (EDMC) Procedural Directives on Data Management as a guide, along with best practices of the NOAA data management community, we are developing an implementation approach for NCCOS, including data management planning, data documentation, archiving, and web services. We are providing basic data access by archiving GIS and tabular data at the NOAA National Centers for Environmental Information (NCEI) using a modified S2N workflow that includes additional steps to improve data discovery. We are also providing enhanced access to high-priority datasets through mapping (OGC/MapServer) and tabular (ERDDAP) web services.

3A.1 PARR - What you need to know - Katharine Weathers (NESDIS/NCEI), with Jacqueline Mize (NCEI). The Public Access to Research Results (PARR) Memo was issued in 2013. Since its inception, many programs have struggled with confusion in implementing the memo and the directives presented by NOAA. The National Centers for Environmental Information (NCEI) has a long-standing history of providing expertise in the long-term preservation and stewardship of scientific data, which positions it well to assist with meeting PARR requirements. As part of Scientific Data Stewardship, NCEI built a web page, http://www.ngdc.noaa.gov/parr.html, to help users better understand the PARR. This page answers questions that data providers may have about the PARR in a concise manner, with links to helpful resources. This presentation will describe activities undertaken by NOAA to increase the public accessibility of publications and digital data, and exactly what federal researchers or recipients of federal funds need to do in order to meet these goals. Along with PARR compliance and stewardship, this presentation will discuss how resources within the NCEI Science Centers can aid data providers in implementing the Tiers of Stewardship.

3A.5 Use of DOIs at the NOAA Central Library - Stanley Elswick (OAR/CFO-CAO/NCRL), with Anna Fiolek. The NOAA Central Library has minted DOIs for publications for a couple of years and has established procedures for handling them. The process is an integral part of ingesting many of these publications into the NOAA Institutional Repository, and also involves work with NOAA librarians and publication offices.

The presentation will describe the procedures, the format and list of fields the Library includes in the EZID metadata, and which publications receive a DOI.

3A.4 NOAA Institutional Repository Update - Stanley Elswick (OAR/CFO-CAO/NCRL). The NOAA Institutional Repository fulfills the requirement of the NOAA Public Access to the Results of Research (PARR) Plan to make available publications resulting from NOAA research. The Repository includes peer-reviewed journal manuscripts and NOAA series documents, as called for in the NOAA publications policy.

This presentation will give an overview of the Repository at its current stage of development and the plans for the coming year. The presentation will include a demo of the system. It will also cover the submission process with a demo of the submission page.

3A.3 Beyond PARR - PMEL’s Integrated Data Management Strategy - Eugene F. Burger (OAR/PMEL/SDIG), with Kevin M. O'Brien (JISAO, Univ. of Washington), Ansley B. Manke (NOAA/OAR/PMEL), Roland Schweitzer (WeatherTop Consulting), Karl M. Smith (JISAO, Univ. of Washington). NOAA’s Pacific Marine Environmental Laboratory (PMEL) hosts a wide range of scientific projects that span a number of scientific and environmental research disciplines. Each of these 14 research projects has its own data streams, which are as diverse as the research. With its requirements for public access to federally funded research results and data, the 2013 White House Office of Science and Technology Policy memo on Public Access to Research Results (PARR) changed the data management landscape for Federal agencies. In 2015, PMEL’s Science Data Integration Group (SDIG) initiated a multi-year effort to formulate and implement an integrated data management strategy for PMEL research efforts.

Instead of letting external requirements such as PARR define our approach, we focused on strategies to provide PMEL science projects with a unified framework for data submission, interoperable data access, data storage, and easier archival to national data centers. This improves data access for PMEL scientists, their collaborators, and the public, and also provides a unified lab framework that allows our projects to meet their own data management objectives as well as those required by PARR.

We are implementing this solution in stages that allow us to test technology and architecture choices before committing to a large-scale implementation. SDIG developers have completed the first year of development, in which our approach has been to reuse and leverage existing frameworks and standards. This presentation will describe our data management strategy, our phased implementation approach, and our software and framework choices, and explain how these elements help us meet the objectives of the strategy. We will share lessons learned in dealing with diverse and complex datasets in this first year of implementation, and how these outcomes will shape our decisions for this ongoing effort. We will also describe the data management capabilities now available to scientific projects, and other services being developed to manage and preserve PMEL’s scientific data assets for our researchers, their collaborators, and future generations.

3A.7 NWFSC PARR Implementation Plan - A Work in Progress, Part 1: Overview - Richard Kang (NOAA/NMFS/NWFSC/OMI). The NMFS Northwest Fisheries Science Center in Seattle has been actively engaged in meeting NOAA PARR compliance. A number of steps have been taken at the NMFS level, including the formation of the NMFS PARR Transition Team in addition to the Fisheries Management Advisory Committee, along with significant efforts at the local level to improve information management at the field offices. The Center has progressively met a series of objectives: tracking the Center's strategic research plan goals; documenting our NOAA and NMFS Data Management Plans; completing the PARR waiver process; capturing our data inventory; batch-publishing the metadata into NMFS InPort/Data.gov; providing a pilot portal of web services and user ad-hoc query access; and piloting steps to exchange Center data for archival with the NCEI S2N/ATRAC systems without making our PIs and data managers re-enter information more than once throughout the entire life cycle, following the Center's PARR Implementation Plan. The goal over the coming months is to automate the complete data workflow cycle, report performance metrics, and truly provide an environment of data sharing for our internal and external customers. With the help of PARR, the Center is finally moving toward a culture where data is a shared resource, internal processes are integrated, and opportunities for improved research open up.

3A.8 NWFSC PARR Implementation Plan - A Work in Progress. Part 2: Live Demo
Brendan Sylvander (NOAA/NMFS/NWFSC/OMI)
Jeff Cowen (NWFSC)
This is a live demo of the workflow process developed at the NWFSC for its PARR Implementation Plan. The demo brings together various sources of data, documents, web services, and application websites for internal and external users and public access; connects various target systems such as the Center public website, InPort/Data.gov, NCEI S2N/ATRAC for archive, Socrata, GenBank, NOAA Publications, and ESRI GIS Server via direct connection; and uses the web services in custom applications for both consumers and producers of the data.

3B

Session: NOAA OneStop Metadata Requirements and Tools
Time Schedule: Jan 9, 13:30-15:00
Location: Glen Echo
Chair: Anna Milan
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tES3NsdjI4VWoyLU0

Description: The OneStop Project’s mission is to improve discoverability and access to NOAA data. We can do this through high quality collection level and granule metadata descriptions and online access. This session will comprise two parts that provide insight into the OneStop activities from the metadata team’s perspective. The first part will provide an overview of our requirements, our progress, lessons learned and next steps. The second part will provide a deeper dive into the metadata team’s workflows and solutions to support Data Stewardship Maturity assessments, granule metadata generation and collection level metadata updates. There will be some time at the end of the session to solicit feedback about how to make these processes and requirements more accessible to all NOAA data providers.

Agenda (ID # | Duration | Title (see abstracts below) | Presenter):
3B.1 | 15 | Overview of OneStop Project from a Metadata Point of View | Anna Milan
3B.2 | 15 | Decomposition of Collection Metadata and Dynamic Regeneration | John Relph
3B.3 | 15 | Light under ISOLite - OneStop granule level discovery metadata | Yuanjie Li
3B.4 | 45 | Open Discussion |

Abstracts

3B.1 Overview of OneStop Project from a Metadata Point of View
Anna Milan (NESDIS/NCEI)
Yuanjie Li, John Relph, Phil Jones, Nancy Ritchey, OneStop Metadata Team
The OneStop Metadata Team was formed during the spring of 2016. Since then, we have improved and created metadata for five data groups comprising 322 collection-level records and hundreds of thousands of granule-level records. In this short amount of time, following the iterative spirit of an agile approach, we developed requirements and best practices while producing metadata, even as the design and development of the discovery system unfolded. This presentation will discuss our requirements, progress so far, lessons learned, and next steps.

3B.3 Light under ISOLite - OneStop granule level discovery metadata
Yuanjie Li (NESDIS/NCEI)
Phil Jones (NESDIS/NCEI), Anna Milan (NESDIS/NCEI)
This presentation will introduce the new ISOLite metadata template, including real examples of using ISOLite for granule discovery metadata and the tailored validation tools built around it, such as the new Metadata Rubric tool. The ISOLite template was created by the NOAA OneStop Metadata Team from the ISO 19115-2 schema validation file. Of the required fields, 41% can be mapped to ACDD attributes in netCDF data, and 48% are static template fields or can be sourced from the parent collection metadata. ISOLite uses the minimum content necessary for granule data discovery, making it lightweight but efficient. A comparison between the old ISO granule metadata and the ISOLite granule metadata shows a clear decrease in metadata file size. The reduced file size has also improved the performance of metadata indexing into the discovery portal.
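
The ACDD-to-ISO mapping described above can be pictured with a small, hypothetical sketch. The element names below are simplified illustrations only (real ISO 19115-2 records use the gmd/gmi namespaces and a far richer structure), and the attribute values are invented:

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of ACDD global attributes read from a netCDF granule.
acdd = {
    "title": "Sea surface temperature granule, 2016-06-01",
    "time_coverage_start": "2016-06-01T00:00:00Z",
    "time_coverage_end": "2016-06-01T23:59:59Z",
    "geospatial_lat_min": "-10.0",
    "geospatial_lat_max": "10.0",
}

def acdd_to_iso_sketch(attrs):
    """Build a simplified ISO-like granule record from ACDD attributes.

    Element names are illustrative, not the actual ISOLite template.
    """
    root = ET.Element("granuleMetadata")
    ET.SubElement(root, "title").text = attrs["title"]
    extent = ET.SubElement(root, "temporalExtent")
    ET.SubElement(extent, "begin").text = attrs["time_coverage_start"]
    ET.SubElement(extent, "end").text = attrs["time_coverage_end"]
    bbox = ET.SubElement(root, "boundingBox")
    ET.SubElement(bbox, "southLatitude").text = attrs["geospatial_lat_min"]
    ET.SubElement(bbox, "northLatitude").text = attrs["geospatial_lat_max"]
    return ET.tostring(root, encoding="unicode")

print(acdd_to_iso_sketch(acdd))
```

The point of the mapping is that these fields come for free from a well-attributed netCDF file, so granule records can be generated mechanically rather than hand-authored.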

3B.2 Decomposition of Collection Metadata and Dynamic Regeneration
John Relph (NESDIS/NCEI/OneStop)
The descriptive information (metadata) for many data sets contains similar information: data sets might use the same keywords, have some of the same contributors, or refer to the same scientific papers. To increase consistency between the representations of this information across data sets, it can be useful to decompose the information into references to external components, or to identified terms in standard vocabularies. Automated processes can then regenerate the descriptive information from its decomposed components. This reduces the effort data managers must expend to update the information in all metadata records, because a component can be updated once and the change will be reflected in all affected records. It also increases consistency between those records, because common components are rendered identically and every component of a similar type is rendered in the same way. The metadata management system being developed by OneStop is designed to support this data model.
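
The decompose-and-regenerate idea can be sketched in a few lines. The component IDs, record structure, and values below are invented for illustration and are not the OneStop data model:

```python
# Shared components (contacts, vocabulary terms) stored once, keyed by ID.
components = {
    "contact:ncei-info": {"name": "NCEI Information Services", "email": "info@example.gov"},
    "keyword:sst": {"term": "SEA SURFACE TEMPERATURE", "vocabulary": "GCMD"},
}

# Each metadata record holds only references to the shared components.
records = [
    {"id": "dataset-A", "title": "SST Analysis", "refs": ["contact:ncei-info", "keyword:sst"]},
    {"id": "dataset-B", "title": "Buoy SST Obs", "refs": ["contact:ncei-info"]},
]

def regenerate(record, components):
    """Expand component references into a full metadata record."""
    full = {k: v for k, v in record.items() if k != "refs"}
    full["components"] = [components[r] for r in record["refs"]]
    return full

regenerated = [regenerate(r, components) for r in records]
```

Because the contact exists once in `components`, editing it there and re-running `regenerate` updates every affected record, which is exactly the maintenance saving the abstract describes.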

3C

Session: Data Access: GIS and Web Services
Time Schedule: Jan 9, 13:30-15:00
Location: White Flint
Chairs: Randy Warren & Charlie Menza
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEYTdxTmRtTzBzZUk

Description: What does "data access" mean for the GIS community? GIS tools such as ArcGIS Online, MapServer web services, and GeoPlatform allow users to "access" (explore and visualize) GIS datasets in a fundamentally different way from the more traditional NCEI archives (e.g., buoy data). We would like to invite the NOAA GIS community to discuss "access" to geospatial data and products, and we suggest that while archiving these data at NCEI is important for long-term preservation, commonly used geospatial tools and resources are ultimately what provide practitioners the data and information they need to conduct their work.

Agenda (ID # | Duration | Title (see abstracts below) | Presenter):
3C.1 | 15 | The NOAA View Portal: streamlining development and access via Esri software | Dan Pisut
3C.2 | 15 | Can I Weather My Map? | Nipa Parikh
3C.3 | 15 | DEMO: Making Web Services Discoverable | Jacqueline Mize
3C.4 | 15 | Map Services: A Two Way Flow of Data | Ken Buja
3C.5 | 15 | Geospatial workflows and processes used to develop and visualize the United States’ Extended Continental Shelf | Finn Dahl
3C.6 | 15 | ERMA as an OGC web service catalog | Robb Wright

Abstracts

3C.1 The NOAA View Portal: streamlining development and access via Esri software
Dan Pisut (NESDIS/Visualization Lab)
Tim Loomis (NESDIS), Vivek Goel (NESDIS)
Developed four years ago as an educational data exploration tool, NOAA View has undergone significant backend upgrades over the past year. Now built on Esri ArcGIS Server and Portal and hosted by the Visualization Lab, the NOAA View Portal hosts around 200 data services (mostly tiled or cached images), a series of story maps being developed by contributors from across NOAA and its external partners, web applications, and other utilities. This demonstration will walk through some of the NOAA View Portal’s offerings, while also discussing future plans for the system and how others may access or contribute to these services.

3C.3 DEMO: Making Web Services Discoverable
Jacqueline Mize (NESDIS/NCEI)
Katharine Weathers (NCEI), David Moffitt (NCEI)
Many powerful and useful web services, such as ERDDAP and ArcGIS map services, often go undiscovered. This demonstration will show how NCEI has used ISO metadata and ESRI Geoportal software to provide discoverability of, and access to, various web services created for archived data.

3C.2 Can I Weather My Map?
Nipa Parikh (NWS/Office of Dissemination)
Donald Rinker
NOAA’s portfolio of geospatially enabled, highly available (24x7) products is growing rapidly. This portfolio of geospatial web services is part of the larger Integrated Dissemination Program (IDP), a major consolidation and update of existing NOAA dissemination systems with a strong focus on weather and environmental decision support systems. Increasing the reliability and support of delivering weather information as Geographic Information Systems (GIS) services was one of the objectives of IDP. The IDP GIS projects included migrating the nowCOAST services and application to the IDP virtual machine infrastructure and consolidating existing NWS geospatial services onto a fully supported dissemination infrastructure. As a result of the implementation of these systems, over 200 NOAA products are available as geospatial map services. Some of the data included in these services are current NWS hazardous weather watches, warnings, and advisories; NESDIS’ GOES cloud imagery; NOS’ nautical charts; NOS’ Environmental Sensitivity Index; and the National Marine Fisheries Service’s (NMFS) Essential Fish Habitat data. Geospatial data clients can access maps of NOAA data and products as REST map services (some time-enabled), OGC-compliant Web Map Services, and KML.

IDP GIS continues to work to keep our infrastructure technology up to date while increasing the available products and application functionality. 2017 is an exciting year for the projects: as we all know, the weather and coasts continually change, so IDP GIS is working on the best ways to deliver time-dependent products, incorporate advances in technology to better handle scientific data, and expand the available OGC-compliant data formats. Additionally, the IDP GIS product suite will be expanded to include Sea, Lake, and Overland Surges from Hurricanes (SLOSH) data, restructured NWS tropical services, and NDGD air quality data. This talk will focus on data access best practices, challenges, and future plans, with time for you to provide feedback and suggestions for how IDP GIS can better serve your geospatial needs.
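
As a hedged illustration of how a client consumes one of these OGC-compliant services, the sketch below assembles a WMS 1.3.0 GetMap request. The endpoint, layer name, and time value are hypothetical; real IDP GIS service URLs and layers differ:

```python
from urllib.parse import urlencode

# Hypothetical service endpoint; not a real IDP GIS URL.
base = "https://example.noaa.gov/arcgis/services/watches_warnings/MapServer/WMSServer"

params = {
    "service": "WMS",
    "version": "1.3.0",
    "request": "GetMap",
    "layers": "1",
    "crs": "EPSG:4326",
    # WMS 1.3.0 uses lat,lon axis order for EPSG:4326: minLat,minLon,maxLat,maxLon
    "bbox": "24.0,-125.0,50.0,-66.0",
    "width": "800",
    "height": "400",
    "format": "image/png",
    # Time-enabled layers accept a TIME parameter (value invented here).
    "time": "2017-01-09T12:00:00Z",
}

url = base + "?" + urlencode(params)
print(url)
```

Any standard WMS client (or a plain HTTP GET of this URL) would return a rendered map image for the requested bounding box and time.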

3C.5 Geospatial workflows and processes used to develop and visualize the United States’ Extended Continental Shelf
Finn Dahl (NESDIS/CIRES)
Barry Eakins (CIRES), Jennifer Jencks (NCEI), Erin LeFevre (CIRES), Elliot Lim (CIRES), Jesse Varner (CIRES)

The U.S. Extended Continental Shelf (ECS) Project is a multi-agency collaboration to establish the full extent of the continental shelf of the United States, beyond 200 nm, consistent with international law. The process to determine this new maritime outer limit requires the collection and analysis of data that describe the depth, shape, and geophysical characteristics of the seabed and sub-sea floor. In 2014, the ECS Project Office, led by the Department of State, was established at NOAA’s National Centers for Environmental Information (NCEI) in Boulder, Colorado, to efficiently and effectively guide the U.S. ECS Project to completion and final documentation. To meet the project's GIS documentation requirements, the ECS GIS team is developing a geospatial database used for stewarding project-specific assets; analyzing over 30 ECS-funded cruises that collected a variety of marine geophysical data (>2 million km² of bathymetric data, along with seismic, geological sample, magnetic, and gravity data); generating nearly 1,000 maps, profiles, illustrations, and figures; storing ECS maritime boundary delineation products in the NCEI archive; and sharing ECS bathymetric and analytical products using custom web mapping services. We will describe how cutting-edge analytical geospatial software, custom Python-driven automation tools, and collaborative web technologies come together to meet the demands of the project, and how those technologies may be extended beyond it.

3C.4 Map Services: A Two Way Flow of Data
Ken Buja (NOS/NCCOS)
NCCOS has been publishing map services for use in a number of applications, not only to share our data with the public but also to gather data from the public. In addition to publishing services with IDP and NCEI using ArcGIS Server, we have published several on the NOAA GeoPortal to create editable feature services. This talk will cover the use of map services in three different types of projects. The first is a simple data viewer using Esri’s Web AppBuilder with a map service utilizing a complex nested grouping structure. The second is a custom-built application showcasing benthic habitat data and imagery in dynamic and tiled map services, as well as external video and photography. The final project is a custom-built application that uses a NOAA GeoPortal feature service to give users the ability to edit existing data.

3C.6 ERMA as an OGC web service catalog
Robb Wright (NOS/OR&R/ERMA)
Jay Coady (NOS/OR&R)
NOAA’s Office of Response and Restoration’s Environmental Response Management Application (ERMA) is an online, open-source-based mapping application providing centralized access to, and visualization of, integrated data. Thousands of data layers are visible within ERMA. The data come from a variety of sources, including federal, state, and local governments, along with NGOs and partner companies. OGC connections are available and visible for all layers, with ERMA providing local WMS links for data held locally as well as remote WMS links for data brought in from external web services. ERMA’s tiered security structure allows data to be secured behind logins, and ERMA has recently made these secured data layers available through a tokenized access system for partners who need quick and easy access to data in ERMA without requiring them to download and duplicate data sources. These full web references to local ERMA data or remote data services ensure that end users are able to find and connect to all data viewable in ERMA.

3D

Session: Biodiversity and ecological data integration and interoperability
Time Schedule: Jan 9, 13:30-17:00 (double session, with break)
Location: Forest Glen
Chair: Hassan Moustahfid
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEU0t0dnA3UEwwUDQ

Description: Today’s most pressing ecological and biodiversity questions, such as how climate change will affect species distributions and ecosystem processes or how to manage fisheries and protect biodiversity, depend on the integration of data from many sources. This session will bring together biodiversity and ecological data managers and scientists from a diverse set of federal organizations and academia who focus on data integration and interoperability, to discuss progress made, opportunities, and challenges in data integration. Talks will follow three main themes: 1) progress made on data integration and challenges reconciling data and metadata standards; 2) tools for finding, managing, sharing, and visualizing ecological data; and 3) research using integrated data to address today’s current challenges such as climate change and ocean acidification. The goal of this session is to find opportunities for collaboration on informatics in biodiversity and ecology between federal agencies and academia.

Agenda (ID # | Duration | Title (see abstracts below) | Presenter):
3D.1 | 15 | Biological and Ecological Data Management Operations in the Foundation for IOOS Success | Jennifer Bosch
3D.2 | 15 | Integrating marine species observations and making them accessible to the world through OBIS | Abigail Benson
3D.3 | 15 | Developing a Biodiversity Data Management Toolkit through the Marine Biodiversity Observation Network | Rob Bochenek
3D.4 | 15 | EML, KNB, and ERDDAP | Bob Simons
3D.5 | 15 | Beginning a comparison of new animal telemetry data exchange standards to existing database architecture in NOAA OR&R’s DIVER Explorer - applying animal telemetry lessons learned from Deepwater Horizon | Troy L. Baker
3D.6 | 15 | The benefits and challenges of data integration and visualization for the NCEI Water Column Sonar Data Archive | Carrie Wall
3D.7 | 15 | Managing Deep-Sea Coral and Sponge Biodiversity Data | Matt Dornback
3D.8 | 75 | Open Discussion |

Abstracts

3D.1 Biological and Ecological Data Management Operations in the Foundation for IOOS Success
Jennifer Bosch (NOS/US IOOS)
Hassan Moustahfid
Data accessibility and effective data management operations are the foundation of success for any observing system. The United States Integrated Ocean Observing System (IOOS) Data Management and Communications (DMAC) subsystem has taken a strong lead in formalizing data standards and encouraging the use of interoperable web standards for sharing data and metadata. For almost a decade, IOOS has rolled out regional applications of biological and ecological data science related to fish species absence and abundance. Additional biological data capabilities have expanded with growing levels of participation, evolving infrastructure, and new advances in the adoption of global standards, such as those emerging from the Global Biodiversity Information Facility (GBIF) and the Ocean Biogeography Information System (OBIS). The foundation of IOOS accomplishments to date relies on adoption of, and careful compliance with, global standards such as the ratified Darwin Core (DwC) and the Climate & Forecast (CF) conventions. These standards do not simply assure that biological data content, quality, and data flow are effective; they are also fundamental to integrating biological data with other oceanographic data types to enable modeling and synthesis. Building on this foundation, and leveraging the technical and organizational infrastructure of IOOS regional participants, IOOS and OBIS are developing more application capabilities for standard biological web services, such as animal telemetry. We have begun technical investigation of mechanisms that will address more IOOS biological variables and GOOS Essential Biological Variables (EBVs), and offer solutions to local, regional, and global efforts such as the Convention on Biological Diversity's biodiversity assessments.
These data integrations can drive transformative research and enable the implementation of US IOOS.

Keywords: biology, ecology, ocean observing, data management, standards, IOOS, OBIS, GBIF, GOOS, EBVs, Darwin Core, Climate forecasts, bioinformatics.

3D.3 Developing a Biodiversity Data Management Toolkit through the Marine Biodiversity Observation Network
Rob Bochenek (Axiom Data Science, LLC/NOS/IOOS)

The Marine Biodiversity Observation Network (MBON) is a series of three regional efforts aiming to advance knowledge and understanding of the patterns and drivers of change in marine biodiversity. These regions encompass a wide range of ecosystems, including the deep sea, reefs, estuaries, and the continental shelf, and bring together remote sensing, genomics, ecology, biogeochemistry, and physical data, which are essential for studying ecosystems, biodiversity, and oceanographic conditions over time. The MBON demonstration projects are working together to collaboratively refine best practices for increased interoperability (metadata, data sharing, services, and discovery) of these diverse, distributed, and oftentimes massive datasets. An integrated MBON demonstration data portal enables visualization of biodiversity data sets in concert with the host of physical, environmental, and biological data housed at the regional and national data centers. This dedicated demonstration portal includes biological data sets and products from the Monterey Bay, Florida Keys, and Arctic regions and provides a centralized, easy-to-use interface to discover, visualize, analyze, and download data. Axiom Data Science and a team of Monterey Bay and Florida scientists are developing a generalized biodiversity indices tool that allows researchers and managers to perform data analyses online. MBON data management and product development efforts, such as the real-time biodiversity indices tool, demonstrate both large-scale interoperability and enhanced performance and effectiveness in scientific data analysis workflows.
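
As a hedged illustration of the kind of index such a biodiversity indices tool might compute (the abstract does not specify the tool's actual methods), the Shannon diversity index H' = -Σ p_i ln(p_i) over observed species counts:

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with count > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Example: counts of four species at a hypothetical monitoring station.
print(round(shannon_index([40, 30, 20, 10]), 3))
```

Higher values indicate more species spread more evenly; a single-species sample scores zero, and n equally abundant species score ln(n).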

3D.2 Integrating marine species observations and making them accessible to the world through OBIS
Abigail Benson (USGS)
The Ocean Biogeographic Information System (OBIS), which is part of the International Oceanographic Data and Information Exchange (IODE) of the Intergovernmental Oceanographic Commission (IOC), works to integrate and make accessible observations of marine life. Through recent developments, OBIS has implemented a new API and an R library (rOBIS), as well as a new website. These developments increase accessibility and usability for the Darwin Core-aligned data managed by OBIS. In addition to these recent improvements, OBIS has been working with the Global Ocean Observing System (GOOS), through its Biology and Ecosystems Panel, to help draft the Essential Ocean Variables, and with the Marine Biodiversity Observation Network (MBON) of the Group on Earth Observations Biodiversity Observation Network (GEO BON), to advance the goals of all three groups toward a better understanding of marine ecosystems.

3D.5 Beginning a comparison of new animal telemetry data exchange standards to existing database architecture in NOAA OR&R’s DIVER Explorer - applying animal telemetry lessons learned from Deepwater Horizon
Troy L. Baker (NOS/OR&R/Assessment & Restoration Division)
Benjamin Shorr (NOS/OR&R), Nicolas Eckardt (NOS/OR&R), Dr. Hassan Moustahfid (NOS/ASTADM), Dr. Matthew
Multiple animal telemetry studies in the Gulf of Mexico began after the Deepwater Horizon oil spill during the Natural Resource Damage Assessment (NRDA). These field investigations started in 2010 and continued into 2015, and until recently the data were embargoed because of an agency litigation hold. A new data exchange standard for acoustic, archival, and satellite tags is an opportunity for GCOOS/OR&R to evaluate how existing and future telemetry data collected for NRDAs can more easily be shared with the Animal Telemetry Network. The authors have begun an extensive side-by-side evaluation of data fields in the data exchange standard and OR&R’s DIVER Explorer system. Although the schema evaluation and data transfer are not fully complete, results will be presented, with implications for how OR&R’s architecture may be modified to aid data transfer with GCOOS Data Management and Communications (DMAC) services or other entities. OR&R is also beginning to explore the concept of a more formal data specification package to give to Principal Investigators working on future NRDA studies involving telemetry, similar to the approach OR&R uses for other data types in NRDAs, such as the Electronic Data Deliverable specifications given to analytical laboratories. Project goals related to sharing specific animal datasets will be discussed.

3D.4 EML, KNB, and ERDDAP
Bob Simons (NMFS/SWFSC ERD)
ERDDAP is a free, open-source data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and to make graphs and maps. ERDDAP is well suited to working with ecological data. ERDDAP administrators can now quickly and easily add datasets to ERDDAP based on the information in Ecological Metadata Language (EML) files, which are widely used in the ecological community. This opens up the possibility of quickly and easily setting up an ERDDAP that has both environmental data (from NOAA, NASA, and the USGS) and ecological data (from the Knowledge Network for Biocomplexity, KNB, and other groups) and that would serve as a bridge between the two communities.
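
A minimal sketch of the kind of information an EML-driven setup needs to pull from an EML file, assuming a tiny invented fragment (real EML documents are namespaced and far richer, and this is not ERDDAP's actual loader):

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical EML-like fragment for illustration only.
eml = """
<eml>
  <dataset>
    <title>Kelp forest fish counts, 2000-2015</title>
    <dataTable>
      <physical><objectName>fish_counts.csv</objectName></physical>
    </dataTable>
  </dataset>
</eml>
"""

root = ET.fromstring(eml)
title = root.findtext("dataset/title")
data_file = root.findtext("dataset/dataTable/physical/objectName")
print(title, data_file)
```

Because EML already records the dataset title, attribute list, and data file locations, a server can turn such a document into a ready-to-serve dataset configuration with little or no manual editing.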

3D.6 The benefits and challenges of data integration and visualization for the NCEI Water Column Sonar Data Archive
Carrie Wall (NESDIS/NCEI/CCOG and University of Colorado at Boulder)
Charles Anderson (NESDIS)
Scientific echosounders aboard NOAA fishery survey vessels are used to estimate biomass, measure fish school morphology, and characterize habitat. These surveys produce large volumes of data that are costly and difficult to maintain due to their size, complexity, and proprietary formats that require specific software and extensive knowledge. With proper management, however, they can deliver valuable information beyond their original collection purpose. To maximize the benefit to the public, the data must be easily discoverable and accessible. Access to ancillary data is also essential for complete environmental context and ecosystem assessment. NOAA’s National Centers for Environmental Information, in partnership with the National Marine Fisheries Service and the University of Colorado, created a national archive for the stewardship and distribution of water column sonar data. A data access web page allows users to query the metadata and access the raw sonar data. Visualization products allow researchers and the public to understand the quality and content of large volumes of archived data. Such products transform the complex raw data into a digestible image and are highly valuable for a broad audience of varying backgrounds. Links to concurrently collected oceanographic and bathymetric data are being integrated into the data access page to provide an ecosystem-wide understanding of the area surveyed. The need for, and benefit of, having associated oceanographic data readily available for each cruise, along with visual imagery for the raw acoustic data, is apparent. Efficiently finding and linking to disparately archived data, and processing large volumes of acoustic data, remain challenges.

3D.7 Managing Deep-Sea Coral and Sponge Biodiversity Data
Matt Dornback (NESDIS/NCEI/NMFS-DSCRTP)
Tom Hourigan (NMFS, DSCRTP), Scott Cross (NESDIS, NCEI), Peter Etnoyer (NOS, CCEHBR), Robert McGuinn
Coral and sponge habitats represent hotspots of biodiversity in the deep sea and are a focus of conservation efforts. NOAA’s Deep Sea Coral Research and Technology Program (DSCRTP) collects and manages deep-sea coral and sponge occurrence records and makes them available to the scientific and management communities. To meet the challenge of managing and distributing these datasets and associated environmental information from diverse sources, DSCRTP, in partnership with NOAA’s National Centers for Environmental Information (NCEI) and National Centers for Coastal Ocean Science, developed a custom data management system. This system includes the National Deep-Sea Coral and Sponge Database, a data stewardship workflow for related oceanographic data, and a web-based discovery and distribution portal. The National Database schema is based on the internationally accepted Darwin Core data standard; occurrence observations are quality controlled through scripted data checkers and expert taxonomic review. The National Database is visualized through a web map that allows users to query the points by taxonomy, location, depth, and year to find exactly what they are looking for, and to download the desired data through a seamless ERDDAP transaction. In the spirit of disseminating the data as widely as possible, DSCRTP makes map services available for advanced GIS users to plug directly into the data, and the data are routinely submitted to the NOAA archives and the Ocean Biogeographic Information System for long-term preservation and global distribution. The oceanographic cruise data management system is based at the event level and tracks all of the desired information coming from field work until it reaches NCEI or another national archive.
The Deep-Sea Coral Data Portal provides a central location for deep-sea coral and sponge data, as well as more complete ecological descriptions and analyses from the DSCRTP’s field research, made available through archival links.
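
The Darwin Core occurrence records underlying the National Database can be pictured as flat term/value rows. The sketch below uses standard Darwin Core term names with invented values (the occurrence ID and coordinates are hypothetical, not real DSCRTP data):

```python
import csv
import io

# One occurrence record keyed by standard Darwin Core term names.
occurrence = {
    "occurrenceID": "DSCRTP:example-0001",  # hypothetical identifier
    "scientificName": "Lophelia pertusa",
    "decimalLatitude": "29.1550",
    "decimalLongitude": "-88.0103",
    "minimumDepthInMeters": "450",
    "eventDate": "2014-05-07",
    "basisOfRecord": "HumanObservation",
}

# Serialize as the kind of CSV row a tabular data service might deliver.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(occurrence))
writer.writeheader()
writer.writerow(occurrence)
print(buf.getvalue())
```

Because every provider maps onto the same term names, records from different surveys can be concatenated, queried, and quality-checked uniformly.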

4B

Session: Is a One-NOAA metadata management system possible?
Time Schedule: Jan 9, 15:30-17:00
Location: Glen Echo
Chairs: Tyler Christensen and Jeff de La Beaujardière
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tES2llNmdvNTJHcEU

Description: There has recently been some discussion of whether NOAA could build an enterprise metadata management system, supporting data documentation through the whole data lifecycle. But is it possible that one system could meet the diverse needs of everyone at NOAA: field scientists, GIS professionals, the satellite ground segment, line offices, the archive, etc.?

The session will begin with a short presentation of one vision for what the system might look like, to get the conversation started. Attendees will then discuss whether it could work, potential pitfalls, and whether or not this is something we should be working toward. And especially: is there some way we could start working toward this now? The outcome could be a general outline of an enterprise metadata system and a draft plan to make it happen.

Agenda (ID # | Duration | Title (see abstracts below) | Presenter):
4B.1 | 15 | Retrospective on “Enterprise” Metadata Systems in NOAA | Anna Milan
4B.2 | 15 | Concerns to address about a One-NOAA metadata management system | Katharine Weathers
4B.3 | 15 | Straw-man proposal for a One-NOAA metadata system | Tyler Christensen
4B.4 | 45 | Open Discussion |

Abstracts

4B.1 Retrospective on “Enterprise” Metadata Systems in NOAA
Anna Milan (NESDIS/NCEI)

Ten years ago I started managing metadata with the NOAA Metadata Manager Repository (NMMR). The NMMR was our enterprise solution, but it was clunky and expensive to maintain. I then learned about other enterprise options, such as the Other NMMR, MerMAID, InPort, and XML editors. Each approach carried its own costs, benefits, and cultural differences. This presentation will summarize the pros and cons of previous and current metadata management systems and approaches.

“...in order to move forward into the future, you need to know where you've been.” - Charles Williams

4B.3 Straw-man proposal for a One-NOAA metadata system
Tyler Christensen (NOS/IMO)
Jeff de La Beaujardiere (NOAA/EDMC)
There has been some talk about designing an integrated metadata management system for all of NOAA, from planning data acquisitions all the way to the archives. But can one system really support the diverse needs throughout the data lifecycle? This talk will describe what such a system might look like. It is not intended to be a complete proposal, but rather a conversation-starter. The talk will be followed by an open discussion of the straw-man proposal, potential roadblocks, and opportunities.

4B.2 Concerns to address about a One-NOAA metadata management system
Katharine Weathers (NESDIS/NCEI)
Jacqueline Mize (NCEI), Matt Dornback (NCEI), Kathy Martinolich (NCEI), Lauren Jackson (NCEI)
A One-NOAA metadata management system is a fine concept. Its feasibility is complicated by the existing metadata tools that frame the unique situations of the current line offices and individual programs. Tools such as DIVER, InPort, and CIMS help their respective offices with data management as well as metadata creation. Given that these programs all take different approaches, and that their data differ in magnitude and scope from other NOAA data types such as satellite data, is there concern that one size doesn’t fit all?

4C

Session: netCDF/HDF - A Foundation for Sharing NOAA Data
Time Schedule: Jan 9, 15:30-17:00
Location: White Flint
Chair: Ajay Krishnan
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEZHcweWg4bklQUkU

Description: NetCDF/HDF is the standard format for important NOAA satellite programs (JPSS and GOES-R) and for data exchange in AWIPS, a key element in climate models and results, an international standard in the hydrographic community (BAG), and a standard for all STAR satellite products. It is an important part of the NOAA-wide environmental data foundation. This session has two goals: to improve cross-line-office awareness and communication among developers and users of netCDF/HDF, and to initiate a discussion with the creators of netCDF/HDF about new capabilities and enhancements that would improve the format and related tools for the NOAA user community.

Agenda (ID # | Duration | Title (see abstracts below) | Presenter):
4C.1 | 15 | The HDF Data Format: High Performance Interoperability for Earth Science Communities | Ted Habermann
4C.2 | 15 | Advancing netCDF-CF for the Geoscience Community | Ethan Davis
4C.3 | 15 | NetCDF Gold Standard Examples: A walkthrough of the NCEI templates | Mathew Biddle
4C.4 | 15 | IOOS Compliance Checker | Luke Campbell
4C.5 | 15 | Accessing netCDF Quality -- Tools for catalogs and files | Roland Schweitzer
4C.6 | 15 | Open Discussion |

Abstracts

4C.1 The HDF Data Format: High Performance Interoperability for Earth Science Communities
Ted Habermann (The HDF Group/Earth Science)
John Kozimor (The HDF Group), Sean Gordon (The HDF Group)
Effective interdisciplinary research requires sharing of well-described data in formats that are usable across platforms and languages. Many communities in the earth sciences have turned to HDF5 to provide a stable and flexible format that they can customize with their own community conventions and standards. The HDF format serves these communities especially well because it is a high-performance, self-describing format that has implementations and associated tools across many platforms and popular languages. The Common Framework for Earth Observations recommends HDF5 and netCDF4 (an API and data structure built on HDF5) as standard formats for numerical data and associated metadata. We describe several examples of community conventions and standards used to customize the HDF5 format to address specific needs, in order to illustrate the similarities among communities and data types. The intended outcome of this exposure is to foster expertise sharing among communities, resulting in increased data and convention interoperability.

4C.2 Advancing netCDF-CF for the Geoscience Community
Ethan Davis (UCAR/Earthcube)
The Climate and Forecast (CF) metadata convention for netCDF (netCDF-CF) is a community-developed convention for encoding geoscience data stored in the netCDF binary data format in a self-describing manner. Now an OGC standard, it can encode information that describes coordinate systems, the geophysical meaning and units of each variable, and how the data were collected. It can capture this information for a number of scientific feature types (e.g., station, sounding, and gridded data). It is widely used by weather forecasters, climate scientists, and remote-sensing researchers. Numerous open source and commercial software tools are able to explore and analyze netCDF-CF datasets.

This presentation will provide an overview and update on work to extend the existing netCDF-CF metadata convention in ways that will broaden the range of earth science domains whose data can be represented. It will include discussion of the enhancements to netCDF-CF that are underway; the current CF community-based standards development process (including validation); and integration with existing and emerging tools.

4C.3 NetCDF Gold Standard Examples: A walkthrough of the NCEI templates
Mathew Biddle (NESDIS/NCEI/IOOS/CICS)
The NOAA National Centers for Environmental Information (NCEI) has developed netCDF templates based on what Unidata and CF call "feature types". These templates conform to Unidata's netCDF Attribute Convention for Dataset Discovery (ACDD) and the netCDF Climate and Forecast (CF) conventions. Adding to these established conventions, NCEI also provides several recommendations for both netCDF variables and attributes. These best practices capture NCEI's experience in providing long-term preservation, scientific quality control, product development, and re-use of data beyond their original intent. In conjunction with the templates, NCEI has provided ‘gold standard’ example files which follow the templates and a collection of reports which document the results of testing the files in various compliance checkers. This presentation will review the NCEI templates and provide information regarding the gold standard examples and subsequent reports.

4C.5 Accessing netCDF Quality -- Tools for catalogs and files
Roland Schweitzer (OAR/Contractor for NOAA/PMEL)
Kevin O'Brien (JISAO), Sean Arms (Unidata), Dave Neufeld (NCEI)
The ncISO tool traverses THREDDS catalogs, reads dataset documentation, and translates that documentation into different views using Extensible Stylesheet Language Transformations (XSLT), which allow users to assess the quality of the data source with regard to its coverage of the Attribute Convention for Data Discovery (ACDD). We have also developed a tool for traversing THREDDS catalogs (the Catalog Evaluator) which collects information about the underlying netCDF data sources and produces a display of quality information regarding whether the data complies with known conventions and whether it is aggregated appropriately. Recently we completed a project to update ncISO to support the latest ACDD specification (from 1.2 to 1.3) and have incorporated ncISO output in our Catalog Evaluator output. We will report on our experiences updating ncISO, incorporating its output, and the implications for data quality across NOAA and beyond.

4C.4 IOOS Compliance Checker
Luke Campbell (Applied Science Associates/IOOS Program Office)
As the volume of scientific data collection grows, adhering to standard protocols for data discovery and distribution becomes increasingly important. High-quality metadata is critical to improving searchability of and access to these scientific datasets. Standardizing metadata allows integration with data catalogs and facilitates data distribution. Several community standards help to improve the quality of metadata and thus prepare datasets for cataloging, including the Attribute Convention for Dataset Discovery (ACDD) and the Climate and Forecast (CF) metadata conventions.

To assist data providers with generation of quality metadata, the IOOS Program Office developed the Compliance Checker. The Compliance Checker provides an efficient means for data providers to determine how well a dataset meets various standards and provides guidance for improving the quality of metadata. The tool, written in Python, can be run either from the command line or via a web-based interface. The core checker provides scores for the ACDD and CF standards; a series of plug-ins allows data providers to check compliance with other standards such as the NOAA National Centers for Environmental Information (NCEI) 2.0 standard. The results include not only a compliance score but also a report detailing actions that users can take to improve metadata. Use of the Compliance Checker has already directly benefited data providers wanting to submit data to the IOOS Catalog, the IOOS Glider Data Assembly Center, or NCEI for permanent archiving.
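The kind of scoring and advice described above can be illustrated with a toy attribute-completeness checker. This is a hedged sketch in the spirit of such tools, not the Compliance Checker's actual API; the attribute lists are abbreviated samples of ACDD global attributes, and the scoring scheme is our own simplification.

```python
# Toy ACDD-style metadata completeness check (illustrative only; NOT the
# IOOS Compliance Checker API). Attribute lists are abbreviated examples.
REQUIRED = ("title", "summary", "keywords")
RECOMMENDED = ("license", "creator_name", "time_coverage_start")

def score_metadata(attrs):
    """Return a 0-1 completeness score plus advice for missing attributes."""
    advice, hits = [], 0
    groups = [("required", REQUIRED), ("recommended", RECOMMENDED)]
    total = sum(len(g) for _, g in groups)
    for level, group in groups:
        for name in group:
            if attrs.get(name):          # present and non-empty
                hits += 1
            else:
                advice.append(f"add {level} attribute '{name}'")
    return hits / total, advice

# A dataset with only a title and summary scores 2 of 6 checks:
score, advice = score_metadata({"title": "SST analysis", "summary": "Daily SST"})
```

A real checker would open the netCDF file itself and apply far richer rules (units, standard names, severity levels); the dictionary here just stands in for a file's global attributes.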

5A

Agenda (ID, duration in minutes, title, presenter):
5A.1 (15 min) Leveraging Geographic Response Plans in NOAA’s ERMA (Rachel Fox)
5A.2 (15 min) Managing the Deepwater Horizon Natural Resource Damage Assessment Environmental Data (Ben Shorr)
5A.3 (15 min) Use cases for climate.data.gov (Cathy Smith)
5A.4 (15 min) Improving data stewardship through the use of a maturity matrix: a success story (Ge Peng)
5A.5 (15 min) OneStop Usability Testing Report (Ken Casey)
5A.6 (15 min) Open Discussion

Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEaUs2NHhEeWFtT2M

Session: Data Successes and Challenges: Examples of Real-World Applications and Problems

Time: Jan 10, 09:00-10:30
Location: Salon F-H

Description: This session includes talks on successful applications of NOAA data to real-world use cases, preferably highlighting instances in which good metadata, interoperable data access services, and/or other positive attributes were beneficial. It also includes examples of problems with NOAA data: instances where datasets could not easily be found, accessed, understood, or used. The intent is not to shame the data provider, but rather to point out areas in which further progress is needed and to highlight candidates for possible near-term improvement.

Chair: Jeff de La Beaujardière

Abstracts

5A.1 Leveraging Geographic Response Plans in NOAA’s ERMA
Rachel Fox (NOS/Office of Response & Restoration (OR&R))
Geographic Response Plans (GRPs) aid responders in the first 24 to 48 hours of an oil spill response. They identify sensitive environmental and socioeconomic sites as well as the strategies available to protect these sites based on their location and proximity to an incident. GRPs have historically been paper products taken out into the field to assist the Coast Guard, EPA, and state and local responders at the onset of an incident, but they can now be mapped online in a central, collaborative environment that allows for common visualization across multiple state and sector GRPs. The Environmental Response Mapping Application (ERMA) was developed by NOAA’s Office of Response and Restoration (OR&R) to provide a common operational picture for responding to or coordinating emergency response efforts and situational awareness for oil spills and damage assessment efforts. ERMA is an online mapping tool that integrates both static and real-time data in a centralized, easy-to-use format for environmental responders and decision makers. ERMA is a secure environment where data can be discovered, obtained, or downloaded based on a user's need for data and appropriate role in response.

5A.2 Managing the Deepwater Horizon Natural Resource Damage Assessment Environmental Data
Ben Shorr (NOS/Damage Assessment Remediation and Restoration Program)
Dr. Amy Merten (ORR)
The Deepwater Horizon DIVER data warehouse and query application was designed and built by the Trustees responsible for assessing damage and implementing restoration in the Deepwater Horizon Natural Resource Damage Assessment (NRDA), and for public access to environmental datasets collected and used for the NRDA and restoration. The DIVER data warehouse and Explorer query tools were begun during the second year of the five-year effort to support and build the damage assessment case that was ultimately settled in March 2016 with a consent decree that includes $8.1 billion in natural resource damages. My talk will focus on the challenges and solutions that our team encountered as we addressed this unprecedented magnitude and complexity of data. This real-world experience with legacy data management systems, building new data management systems, and interoperability has informed our entire approach to managing environmental data and is a key part of our Office's approach to current and future cases, data transparency, and communication.

5A.3 Use cases for climate.data.gov
Cathy Smith (OAR/ESRL/PSAD)
Scientists and applied science researchers at NOAA/ESRL PSD examined the climate.data.gov website to see how well it provides the datasets needed to address a set of climate-related questions. Those questions ranged from research-specific ones to questions similar to those we are asked by the general public. For example, how do SSTs over the North Pacific Ocean impact drought over the Midwest? And, for an applied user, what is the range of wind speeds over a location for use in generating wind energy? We have strong climate/weather expertise in our lab and have experience with datasets that can be used to answer these sorts of questions. The results of our searches indicate opportunities for improvement in the data.gov website. In particular, we see a need for geophysical search parameters that would not necessarily be needed for searching other types of data.gov webpages.

5A.5 OneStop Usability Testing Report
Ken Casey (NESDIS/OneStop)
Nancy Ritchie, David Neufeld, Michael Chapman, John Relph, David Fischman
NOAA OneStop is a pathfinder effort to provide improved public discovery, access, and visualization for all NOAA data. OneStop is a two-pronged effort: one prong is a user interface adept at presenting data with improved relevancy; the second is improving the quality of the metadata associated with a given data group. Both of these efforts build on each other and require your feedback! Please stop by the OneStop kiosk to explore the beta site and provide your feedback on bugs, enhancements, or new feature requests. A summary of results collected from the kiosk will be presented at the session on Data Successes and Challenges.

5A.4 Improving data stewardship through the use of a maturity matrix: a success story
Ge Peng (CICS-NC/NCEI/OneStop Project)
Christina Lief (NESDIS/NCEI), Steve Ansari (NESDIS/NCEI)
This presentation will use the highly utilized monthly land surface temperature data product derived from the Global Historical Climatology Network (GHCN-M) to demonstrate how a data stewardship maturity matrix (DSMM) can help identify potential areas of improvement in both stewardship practices and system integration. This success story shows how people from multiple disciplines utilized the DSMM to address topics needing improvement by integrating interoperable, high-quality metadata with product-specific descriptive information, resulting in enhanced product accessibility and usability.

5B

Agenda (ID, duration in minutes, title, presenter):
5B.1 (15 min) Facilitating Data Submission and Archive with the NCEI Water Column Sonar Data Packager (Charles Anderson)
5B.2 (15 min) Design and implementation of automation tools for DSMM diagrams and reports (Sonny Zinn)
5B.3 (15 min) Implementing Data Stewardship Maturity Matrix in ISO metadata (Anna Milan)
5B.4 (45 min) Open Discussion

Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEdzEySXNWYmJyUmc

Session: Toward systematically curating and integrating data product descriptive information

Time: Jan 10, 09:00-10:30
Location: Glen Echo

Description: Complete, consistent, and easy-to-understand information about data products is critical for meeting user needs of data discoverability, improved accessibility and usability, and interoperability.

In the Big Data and Open Data era, with an ever-increasing variety and number of data products, it becomes increasingly impractical to complete data descriptions and documentation manually. The most effective way to ensure the completeness and quality of metadata and descriptive documentation is to curate data products in a systematic, consistent, and automated manner based on standards, community best practices, and defined frameworks.

This session invites presentations describing and sharing work and progress on systems, tools, frameworks, workflows, etc. that enable systematic generation of descriptive information about data products for improved discoverability, usability, and interoperability. Additionally, this session will discuss gaps that still need to be addressed.

Chair: Nancy Ritchey & Ge Peng

Abstracts

5B.1 Facilitating Data Submission and Archive with the NCEI Water Column Sonar Data Packager
Charles Anderson (NESDIS/NCEI)
Carrie C Wall (NCEI)
Archiving large volumes of complex environmental data requires both efficient archive systems for ingest and thorough metadata documentation to ensure data usefulness now and into the future. The NCEI Water Column Sonar Data (WCSD) Archive addresses these needs with a data packaging tool built for data providers. Developed in collaboration with our NMFS Fisheries Science Center partners, the WCSD Packager is a stand-alone executable with a simple user interface to control packager operation and facilitate entry of metadata by the user. Using the packager, data providers specify data source and destination locations, easily enter basic metadata aided by drop-down lists and other controlled-vocabulary fields, and click “package data”. From there, data packaging is fully automatic. The packager copies the sonar and ancillary data files, generates ISO-standard cruise-, dataset-, and file-level metadata records, and creates an md5 checksum manifest file, all contained in a structured data package conforming to the Library of Congress BagIt specification. Due to the size of WCSD, the data packages are created on external hard drives that are then shipped to NCEI for ingest and archive. An individual drive can contain dozens of packages comprising several TB of data. The consistent structure of the data packages facilitates an automated archiving system that performs a checksum validation to ensure file integrity, archives the data files, populates the WCSD metadata database, and updates the WCSD data discovery and ordering portal without data manager intervention once the ingest is initiated. The WCSD Packager and automated ingest system have enabled the ingest and archival of 385 data packages comprising 31.6 TB of WCSD since January 2014. The WCSD Packager serves as a model for facilitating data submission and automated archiving of other data streams at NCEI.

5B.3 Implementing Data Stewardship Maturity Matrix in ISO metadata
Anna Milan (NESDIS/NCEI)
Knowing the data and stewardship maturity is essential in making informed decisions about which data product is best suited for a particular application. The Data Stewardship Maturity Matrix (DSMM) is a framework for assessing the maturity of data and is being used by NOAA OneStop. This presentation will describe the implementation of DSMM information in ISO-compliant metadata.

5B.2 Design and implementation of automation tools for DSMM diagrams and reports
Sonny Zinn (NESDIS/OneStop)
John Relph (NCEI), Ge Peng (CICS-NC), Anna Milan (NCEI), Aaron Rosenberg (ERT)
The OneStop project aims to make NOAA environmental data easily discoverable and usable by improving metadata and providing a user-friendly interface [1]. Providing transparent dataset quality information to users is part of the OneStop-ready requirement. To help meet this requirement, the stewardship maturity of each dataset is thoroughly evaluated under nine categories using a consistent assessment framework, namely the NCEI/CICS-NC Data Stewardship Maturity Matrix (DSMM) [2]. The evaluation process requires extensive research by metadata content editors who are specialized in information science. An evaluation produces a DSMM report which includes two figures, a scoreboard and a star rating chart, drawn from nine assessment scores ranging from 1 to 5. Creation of these diagrams turns out to be tedious and laborious, as it entails coloring 45 elements and shading 45 stars for the scoreboard and star rating chart, respectively. To ease the effort of writing a report, we created a software program that reads an existing DSMM report and automatically generates and embeds the diagrams within the report. To provide a more efficient workflow, the program was extended to take inputs from a spreadsheet with DSMM information; it can generate over 700 reports in less than three hours. Currently we are further improving the workflow by adopting an existing web application called CEdit for collecting and storing DSMM information and by interfacing our automation program with CEdit.

References

[1] Casey, K.S., 2016: OneStop: Project Overview. Improving NOAA’s Data Discovery and Access Framework. 2016 ESIP Winter Meeting.

[2] Peng, G., J.L. Privette, E.J. Kearns, N.A. Ritchey, and S. Ansari, 2015: A unified framework for measuring stewardship practices applied to digital environmental datasets. Data Science Journal, 13, 231-253. doi: http://dx.doi.org/10.2481/dsj.14-049
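The nine-scores-to-stars rendering that such automation performs can be sketched as follows. The category names follow the DSMM key components in Peng et al. [2]; the text-based star chart itself is our own illustration, not the OneStop tooling.

```python
# Illustrative DSMM-style star chart: nine category scores (1-5) rendered as
# filled/empty stars. A rendering sketch, not the actual OneStop program.
CATEGORIES = [
    "Preservability", "Accessibility", "Usability", "Production Sustainability",
    "Data Quality Assurance", "Data Quality Control/Monitoring",
    "Data Quality Assessment", "Transparency/Traceability", "Data Integrity",
]

def star_chart(scores):
    """Render one 'Category  ***..'-style line per DSMM category."""
    width = max(len(c) for c in CATEGORIES)
    lines = []
    for cat in CATEGORIES:
        s = scores[cat]
        if not 1 <= s <= 5:
            raise ValueError(f"{cat}: score must be 1-5, got {s}")
        lines.append(f"{cat:<{width}}  {'★' * s}{'☆' * (5 - s)}")
    return "\n".join(lines)

# A dataset scoring 3 in every category:
chart = star_chart({c: 3 for c in CATEGORIES})
```

Feeding the function from a spreadsheet row per dataset would reproduce the batch workflow the abstract describes, with the tedious shading of 45 stars done programmatically.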

5C

Agenda (ID, duration in minutes, title, presenter):
5C.1 (45 min) Progress and the Future of NOAA's Big Data Project (Ed Kearns)
5C.2 (45 min) Open Discussion

Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEblFJNHc3NWc5QjQ

Session: NOAA Big Data Project

Time: Jan 10, 09:00-10:30
Location: White Flint

Description: The progress of the NOAA Big Data Project (BDP) will be discussed, including the benefits that have been realized thus far, and the best practices and lessons learned from the first two years of the BDP CRADA will be shared. Options for the future of the project will be discussed among participants as the BDP's CRADA phase enters its final year.

Chair: Edward Kearns

Abstracts

5C.1 Progress and the Future of NOAA's Big Data Project
Ed Kearns (OCIO/NOAA Big Data Project)
Jed Sundwall (AWS), James Stevenson (IBM)
Ed Kearns will give an overview briefing and moderate a panel discussion with Cloud collaborators from the NOAA Big Data Project. An open discussion will follow.

5D

Agenda (ID, duration in minutes, title, presenter):
5D.1 (30 min) OER Video Portal - A Video Data Management Success Story (Susan Gottfried)
5D.2 (15 min) Deep Sea Video Acquisition (Brendan Reser)
5D.3 (15 min) Legacy Video - A Practical Application in Video Rescue and Management (Fred Katz)
5D.4 (30 min) Open Discussion
5D.5 (15 min) Report on the 2016 Workshop to Establish Community Standards for Underwater Video Data Collection and Management (Sharon Mesick)
5D.6 (15 min) Standardizing metadata for NOAA video and imagery data (Mashkoor Malik)
5D.7 (15 min) Application of Coastal and Marine Ecological Classification Standard (CMECS) to ROV Video Data for Enhanced Geospatial Analysis of Deep Sea Habitats (Caitlin Ruby)
5D.8 (45 min) Open Discussion

Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEY3pMeXJsdTR5dlk

Session: Video Data Management

Time: Jan 10, 09:00-12:30 (double session, with break)
Location: Forest Glen

Description: A number of NOAA Programs have requirements for managing video data in compliance with NOAA Directives. The complexity and volume of video data collections create management challenges, and best practices for video data management have not been clearly defined. Anecdotally, we find that many Programs' video management needs overlap, regardless of collection method. A common dialog is needed to find enterprise solutions.

NOAA's Office of Ocean Exploration and Research partnered with NCEI in a series of multiyear projects collectively called the Video Data Management Modernization Initiative (VDMMI). VDMMI was designed to investigate and test methods of video data management, and has developed both long-term archival solutions and an easy-access portal. Best practices, video metadata templates, and reusable methods have been developed and documented for reuse by others.

This session will focus on two main video data management themes: 1) VDMMI: best practices and operational video management solutions will be demonstrated and discussed; 2) informal discussion of different video projects in NOAA will highlight the need for common solutions and areas for collaboration.

The outcome will be a greater understanding of the different video data management requirements around NOAA, available tools, and greater opportunities for interested groups to contribute towards the development of an enterprise solution.

Chair: Sharon Mesick

Abstracts

5D.1 OER Video Portal - A Video Data Management Success Story
Susan Gottfried (NESDIS/OER)
NOAA’s Office of Ocean Exploration and Research (OER) has a large and growing inventory of environmental data on video media or in video formats. The OER Data Management Team has been involved in a Video Data Management Modernization Initiative (VDMMI) project to come up with a solution to not only preserve these valuable video data assets, but to make them discoverable and accessible in a self-service model. In this session presentation, the VDMMI project lead will demonstrate the resulting OER Video Portal and discuss the elements that make it work.

5D.2 Deep Sea Video Acquisition
Brendan Reser (OAR/NCEI / Okeanos Explorer)
At-sea high-quality video acquisition from remotely operated vehicles (ROVs) is both a costly and a challenging enterprise. The Office of Ocean Exploration and Research (OER) Okeanos Explorer program is a world leader in acquiring, curating, automating, and processing these video datasets from cameras up to 6,000 meters (19,685 feet) below sea level. This session presentation will focus on the acquisition strategies, lessons learned, and the data pathway from 6,000 meters to public access.

5D.3 Legacy Video - A Practical Application in Video Rescue and Management
Fred Katz (NESDIS/NCEI/Video Data Management Modernization Initiative)
Jonathan Blythe (Bureau of Ocean Energy Management)
Since the 1980s, DOI's Minerals Management Service, now the Bureau of Ocean Energy Management (BOEM), has supported studies using research submersibles to make video recordings of hydrocarbon seep sites in the Gulf of Mexico. The size and extent of this legacy video dataset, over 900 gigabytes in volume and comprising 3,000 files, has presented a challenge to BOEM to make the collection preservable and discoverable. BOEM and NOAA’s Office of Ocean Exploration and Research (OER) have been partnering to apply key elements of the OER Video Data Management Modernization Initiative (VDMMI) to preserve and steward this dataset, and also to demonstrate the utility of the OER model in successful technology transfer. In this session presentation, the lead on the BOEM video project will discuss progress made towards creating an effective solution for BOEM.

5D.5 Report on the 2016 Workshop to Establish Community Standards for Underwater Video Data Collection and Management
Sharon Mesick (NESDIS/NCEI)
Vickie Ferrini (Columbia University), Dwight Coleman (University of Rhode Island), Adam Soule (WHOI)
In June 2016, the National Science Foundation (NSF) sponsored a workshop focused on development of a community-driven strategy for managing underwater video data collections. The workshop brought together the community of stakeholders - including scientists, data management professionals, vehicle operators and system designers, and education and outreach professionals - to define current practices and needs, and to begin to develop consensus and best practices recommendations for underwater video acquisition, tagging, archiving and access. This presentation will provide an overview of the high-level priorities and recommendations resulting from the workshop, with the intent of starting a broader dialog within the Environmental Data Management community.

5D.6 Standardizing metadata for NOAA video and imagery data
Mashkoor Malik (OAR/Office of Ocean Exploration and Research)
Chris Beaverson (NOAA OAR, Office of Ocean Exploration and Research), Laura Kracker (NOAA NOS, Center for Coastal Monitoring and Assessment, Biogeography), William Michaels (NOAA Fisheries, Office of Science & Technology), Elizabeth Clarke (NOAA Fisheries, Northwest Fisheries Science Center), Susan Gottfried (NOAA NCEI, National Coastal Data Development Center)
Video and still images are critical for several NOAA missions. Imagery is collected from various NOAA platforms including aircraft, vessels, remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), unmanned aerial systems (UAS), and trawl cameras. Irrespective of how the video and images are collected, users of these datasets require primary information about them, including time, position, attitude, and camera variables (pan/tilt, focal distance, and aperture), that can answer the basic questions of when, where, and how the dataset was collected so that interpretive observations can be made. This presentation will focus on three NOAA use cases to identify metadata requirements: NOAA’s Office of Ocean Exploration and Research (OER), deep sea exploration using ROVs; National Marine Fisheries Service (NMFS), fish stock assessment using AUVs; and National Centers for Coastal Ocean Science (NCCOS), habitat characterization using towed cameras and ROVs. Examples of how metadata are developed and used will be explained. The presentation will explore commonalities among NOAA use cases; results of a recent National Science Foundation (NSF) workshop on underwater video imagery data (https://github.com/underwatervideo/UnderwaterVideoWorkingGroup/tree/master/Meetings/2016_Workshop/Documents); and a discussion on developing metadata standards for NOAA imagery data.

The metadata standards will enable NOAA-wide development of tools that can provide better access to and interpretative analysis of NOAA video data, and assimilation of observations within metadata. High-quality metadata will help to document the existence of video content and enable a unified approach to the long-term archiving and accessibility challenges. The NOAA enterprise strives to develop an inventory of distributed video resources that can be effectively queried for resource discovery and access. This NOAA-wide effort will solicit input from the wider expert community to develop and implement best-practice guidelines for metadata standards to aid in archiving, accessing, and analyzing video and imagery data.

5D.7 Application of Coastal and Marine Ecological Classification Standard (CMECS) to ROV Video Data for Enhanced Geospatial Analysis of Deep Sea Habitats
Caitlin Ruby (NESDIS/MSU / NGI / NCEI)
Adam Skarke (Mississippi State University), Sharon Mesick (NCEI)
The Coastal and Marine Ecological Classification Standard (CMECS) is a network of common nomenclature that provides a comprehensive framework for organizing physical, chemical, biological, and geological information about marine ecosystems. It was developed by the NOAA Coastal Services Center and NatureServe, in collaboration with other federal agencies and academic institutions. This classification standard serves as a means for scientists to more easily access, compare, and integrate marine environmental data from a wide range of sources and time frames. CMECS has been endorsed by the Federal Geographic Data Committee (FGDC) as a national metadata standard. The research presented here is focused on the application of CMECS to deep sea video and environmental data collected by the NOAA ROV Deep Discoverer and the NOAA Ship Okeanos Explorer while exploring the northern Gulf of Mexico in 2014. Research objectives focus on determining the extent to which CMECS can be applied to deep sea benthic habitats in the northern Gulf of Mexico, assessing the feasibility of annotating ROV video data through CMECS identifiers, and developing the geospatial processing techniques necessary to spatially analyze and cartographically represent the classified deep sea habitats. Video classifications were accomplished using a hot key pad generated by the Video Annotation tool within Mashkoor Malik’s ROV Data Analyzer software (still under development). This Video Annotation tool extracts the ROV coordinates based on the embedded time within the ROV video being classified, which allows the annotations to be ingested within a GIS.

Geospatial processing techniques were used to interpolate environmental parameters (higher spatial resolution of local bathymetry, temperature, salinity, and dissolved oxygen content), an array of CMECS-compliant ecosystem surfaces (physical, geological, and biological), viewed areas along the seafloor (ROV viewsheds), holiday regions in which no ROV observations were made, as well as CMECS-compliant habitat surfaces within the ROV viewsheds representing visually observable seafloor characteristics (physical, geological, and biological). The resulting geospatial data products support spatiotemporal analysis of the surrounding seafloor based on CMECS classifications. Furthermore, attributing ROV video and ancillary data with CMECS notations within the metadata documentation increases and refines the search options for end-users.

6A

Agenda (ID, duration in minutes, title, presenter):
6A.1 (15 min) Connecting Communities of Practice to Advance Environmental Data Management (Leslie Hsu)
6A.2 (15 min) How do stakeholder requirements fit into EDM? (Karsten Shein)
6A.3 (60 min) Open Discussion

Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEdUFSaG1pVEU5aG8

Session: What is missing from the EDM conversation at NOAA?

Time: Jan 10, 11:00-12:30
Location: Salon F-H

Description: Most EDM sessions are very focused on the details of a particular topic. We wanted to leave space for some creativity. Talks should focus on topics that we are not yet talking about but SHOULD be talking about; critical needs that we haven't identified yet; so-crazy-it-just-might-work ideas; looking at the bigger picture or the interconnection between topics; or even innovative data management examples that don't fit into another workshop session. Bring us your most creative ideas!

Chair: Tyler Christensen & Nancy Ritchey

Abstracts

6A.1 Connecting Communities of Practice to Advance Environmental Data Management
Leslie Hsu (US Geological Survey/CDI)
Data communities of practice are working in parallel across different federal agencies and disciplines. Although their overall goals of data curation and distribution may overlap with each other, the nature of the data, the data policies, and the data sharing culture may be very different. Where and when should connection points be made between groups for the greatest mutual benefit? This presentation will use the USGS Community for Data Integration (CDI) as a discussion point for how connections with external partners could reduce redundancy, build networks, and advance the field of environmental data management.

6A.2 How do stakeholder requirements fit into EDM?
Karsten Shein (NESDIS/NCEI/CCOG)
Sharon Mesick (NOAA/NCEI/CCOG)
Environmental Data Management (EDM) traditionally considers the stewardship of data to occur between receipt and access, but data collection and end-use considerations affect how those data must be managed. What efforts are undertaken by EDM stewards to ensure that data are collected and provisioned in ways that optimize their utility based on the needs of a broad consumer base? Engaging data consumers and collectors is an essential but oft-overlooked component of true end-to-end (E2E) EDM stewardship. When we ignore these steps, define them too narrowly, under-resource them, or attempt to fulfill them after the fact, EDM stewardship becomes suboptimal and data potential becomes unnecessarily limited. Optimizing stewardship requires identifying and cultivating ongoing dialog with stakeholders on both ends of the data chain to ensure the needs of those stakeholders are considered and integrated, where possible, into data collection and stewardship processes to maximum effect. To be effective, such engagement must, like all other aspects of stewardship, be proactively included in the process as a clearly defined activity. This presentation discusses the development, implementation, and benefits of stakeholder engagement as an integral part of an E2E EDM stewardship process.

6B

Session: Ship Automatic Identification System (AIS) for everyone
Schedule: Jan 10, 11:00-12:30
Location: Glen Echo
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEUDNmamFFb3VxYm8

Agenda (see abstracts below):
  6B.1 (15 min)  Helping the Coastal Management Community Access AIS - Daniel R Martin
  6B.2 (15 min)  Using AIS to determine Hydrographic Survey Priorities - Lucy Hick
  6B.3 (15 min)  Developing Scalable Data Management Solutions for Large Scale AIS Data and Beyond - Rob Bochenek
  6B.4 (45 min)  Open Discussion

Description:

The Automatic Identification System (AIS) is a system of shipboard transmitters and land-based and satellite-based receivers that allow vessel locations to be broadcast and recorded. AIS is a critical part of ship-board navigation and vessel traffic services (VTS) around the world. Increasingly, different groups are identifying ways to use AIS that expand beyond the traditional usage in marine navigation. However, because these shipboard AIS transponders are capable of transmitting updates as often as every two seconds, the generated datasets quickly become massive and unwieldy, making data storage and access incredibly challenging.
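To make the scale concrete, here is a back-of-envelope estimate of the data volume implied by that reporting rate. The vessel count and per-message size below are illustrative assumptions, not measured figures from any NOAA system:

```python
# Rough estimate of AIS data volume. Only the 2-second maximum reporting
# interval comes from the text above; the other inputs are assumptions
# chosen purely for illustration.
REPORT_INTERVAL_S = 2      # fastest Class A reporting rate cited above
MSG_BYTES = 50             # assumed size of one decoded position report
VESSELS = 10_000           # assumed number of vessels in view of a network
SECONDS_PER_DAY = 86_400

msgs_per_vessel_per_day = SECONDS_PER_DAY // REPORT_INTERVAL_S
daily_bytes = msgs_per_vessel_per_day * MSG_BYTES * VESSELS

print(f"{msgs_per_vessel_per_day:,} reports per vessel per day")
print(f"~{daily_bytes / 1e9:.1f} GB/day under these assumptions")
```

Even with these conservative assumptions the archive grows by tens of gigabytes per day, which is why single-node storage and processing quickly become impractical.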

A number of groups within NOAA (OCM, OCS, ONMS, NMFS, IOOS, etc.) have requirements for AIS data, and a recent informal survey found that many of these requirements overlap. While OCM in particular has made headway in deriving and sharing AIS products, there is still progress to be made.

This session will focus on projects across NOAA that highlight the need for an enterprise tool for handling AIS data and will explore potential solutions.

The intended outcome is a greater understanding of the different requirements for AIS data across NOAA and greater opportunity for interested groups to contribute to the development of an enterprise solution.

Chair: Lucy Hick

Abstracts

6B.1 Helping the Coastal Management Community Access AIS
Daniel R Martin (NOS/Office for Coastal Management - contractor), Anna Verrill
For Federal and state agencies to effectively manage human use of the coastal zone and the outer Continental Shelf, they need to understand patterns of vessel traffic and the potential for location-based conflicts. The Marine Cadastre project, led by NOAA and BOEM, leverages the U.S. Coast Guard’s National Automatic Identification System (NAIS) to help NOAA constituents address some of these needs. The expansive geographic scope of the NAIS, the volume and frequency of record collection, changing carriage requirements, privacy concerns, and the breadth of end-user needs are just a few of the challenges the project team has managed over the past six years of operations. This presentation will share some of our lessons learned in designing data products and tools and serving countless customers. Some of our observations indicate a need for a more scalable or enterprise design in order to meet the growing data science demands of our community.

6B.3 Developing Scalable Data Management Solutions for Large Scale AIS Data and Beyond
Rob Bochenek (Axiom Data Science LLC/Partner with NOS/OCS)

This talk will present progress on a NOAA-funded effort to significantly reduce the execution time for handling and analyzing extremely large collections of Automatic Identification System (AIS) vessel tracking data. This capability is enabling the investigating team to produce a variety of analytical products that serve as primary inputs for NOAA OCS’s Hydrographic Health Model, which identifies areas that may pose a risk to surface navigation due to inaccurate depths or unknown hazards. In addition, it provides a framework for ad hoc querying and granular processing of the underlying AIS dataset. Traditional database and single-node computing systems could not keep pace with current data volumes; to overcome those limitations, we assembled a cluster computing stack using Apache Spark as the computing engine. The stack is horizontally scalable, highly available, and built on a suite of high-performance open-source technologies to provide a general processing engine for a wide variety of high-volume datasets.
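As an illustrative sketch of the kind of aggregation such a cluster parallelizes, the core map/reduce shape behind many AIS analytical products is binning position reports into grid cells and counting them. The record layout and grid resolution below are assumptions for illustration, not the project's actual schema:

```python
from collections import Counter

CELL_DEG = 0.1  # assumed grid resolution in degrees

def cell_of(lat: float, lon: float) -> tuple:
    """Snap a position to the lower-left corner of its grid cell."""
    return (round(lat // CELL_DEG * CELL_DEG, 4),
            round(lon // CELL_DEG * CELL_DEG, 4))

def density(reports):
    """Count AIS position reports per grid cell.

    Each report is a hypothetical (mmsi, lat, lon) tuple. On a cluster,
    a framework like Apache Spark would apply the same map (cell_of) and
    reduce (count per key) steps in parallel across partitions.
    """
    return Counter(cell_of(lat, lon) for _mmsi, lat, lon in reports)

# Fabricated example records, purely for demonstration.
reports = [
    (367001230, 47.61, -122.34),
    (367001230, 47.62, -122.33),
    (366999999, 47.05, -122.90),
]
counts = density(reports)
```

Because the per-record map step is independent and the counting step is associative, this computation distributes naturally, which is the property the cluster stack described above exploits.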

6B.2 Using AIS to determine Hydrographic Survey Priorities
Lucy Hick (NOS/Office of Coast Survey)
Corey Allen (NOAA), Christina Fandel (NOAA), Barry Gallagher (NOAA), Mike Gonsalves (NOAA), Patrick Keown (NOAA)
Vessel traffic, or Automatic Identification System (AIS), data is important for a number of initiatives within NOAA’s Office of Coast Survey, including determining the adequacy of navigational charts and prioritizing the acquisition of hydrographic data. In particular, AIS is one of the main parameters in Coast Survey’s new Hydrographic Health model.

Coast Survey has amassed several years’ worth of AIS data (both satellite and terrestrial) for the entire United States. AIS data is also available from MarineCadastre.gov. However, due to the immense size of these datasets and limitations in our infrastructure and computing power, AIS data must currently be processed in small temporal or spatial subsets. This has proven inadequate for the Hydro Health Model, which requires analysis of the entire national AIS dataset over a full year.

In order to resolve these issues, Coast Survey has partnered with Axiom Data Science and the U.S. Integrated Ocean Observing System (IOOS) to develop a system that allows easy interaction with AIS data.

This talk will highlight how AIS is being used to determine NOAA's hydrographic survey priorities, some of the issues encountered while working with AIS data, and the progress that has been made on the development of new tools and methods.

6C

Session: Leveraging Long-Term Environmental Archives
Schedule: Jan 10, 11:00-12:30
Location: White Flint
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEa19EWC1aOXp5aVU

Agenda (see abstracts below):
  6C.1 (15 min)  Improving NESDIS Data Management Planning - Helen Wood
  6C.2 (15 min)  Improved Data Curation and Exploration: A Timeline for Visualizing Data Inventory - Aaron Sweeney
  6C.3 (15 min)  Establishing a NOAA Passive Acoustic Data Archive for Long-Term Storage and Access - Carrie Wall
  6C.4 (15 min)  Access to 50 years of Scientific Ocean Drilling Data. But that’s not all! - Kelly Stroker
  6C.5 (15 min)  Scientific Stewardship of Ocean Satellite Data - Sheekela Baker-Yeboah
  6C.6 (15 min)  Open Discussion

Description:

The purpose of this session is to discuss how users synthesize large quantities of long-term archived environmental data - from multiple disciplines capturing important sun-to-ocean-floor earth parameters - to develop new products and spur research that addresses compelling scientific questions and societal challenges and provides decision makers with actionable information. This session will also discuss ways to leverage NOAA and other environmental archives and archive-related activities, including long-term observations. Mining long-term observations provides new revelations, reference baselines for studying dynamics, and touchstones representing current scientific understanding. Integrative data approaches that incorporate all of the information content, utilizing sophisticated methods (e.g., new statistical methods; data and physics assimilation; feature classification; search, discovery, and federation of diverse sources) and other emerging “Big Data” analysis and fusion techniques, are a powerful way to overcome these challenges, and they rely on the many scientifically curated datasets carefully managed by NOAA NCEI and other like-minded international institutions.
Chair: Kelly Stroker

Abstracts

6C.1 Improving NESDIS Data Management Planning
Helen Wood (NESDIS/OSAAP)
Scott Hausman (NESDIS)
NESDIS has developed a policy for the management of data and information generated by its Observing Systems and associated Data Management Systems. The policy is intended to ensure that NESDIS Environmental Data are properly planned for and supported throughout their lifecycle. In addition to presenting the key elements of the policy, this talk will describe some of the challenges encountered during its development.

6C.2 Improved Data Curation and Exploration: A Timeline for Visualizing Data Inventory
Aaron Sweeney (NESDIS/NCEI)
We report on the effectiveness of visualizing data inventory via timelines to improve the curation and exploration of archived ocean-bottom pressure data and coastal tide gauge data at the National Centers for Environmental Information (NCEI). Metadata about the inventories are expressed in JavaScript Object Notation (JSON) format and visualized on a timeline using an open-source JavaScript library (vis.js). Through this timeline, gaps in coverage immediately become apparent. Within the first two months of the timeline going live, our primary data provider used this inventory visualization to identify and submit for archive 17 at-risk data packages from the backlog of data collection that had not previously been submitted. Given the high cost of collecting these data, this represents a significant return on investment. Along with the inventory timeline, instrument deployment pages were also published, providing time-series plots and direct access to unassessed data, quality-controlled data products, modeled tidal constituents, supporting metadata, and a list of associated tsunami events. These products support the research forecasting efforts of the Pacific Marine Environmental Laboratory (OAR/PMEL), which transfers products to the NOAA Tsunami Warning Centers for operational use under the auspices of the US NOAA Tsunami Program. These products also support the broader tsunami modeling community. This timeline adds a new dimension to data stewardship and discovery. See https://www.ngdc.noaa.gov/hazard/dart.
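A minimal sketch of the pattern this abstract describes: inventory metadata expressed as JSON items in the shape a timeline library such as vis.js can render, so coverage gaps between deployments become visible at a glance. The field names and station label below are illustrative assumptions, not NCEI's actual schema:

```python
import json
from datetime import date

# Hypothetical deployment inventory records (fabricated for illustration).
deployments = [
    {"station": "Station A", "start": date(2015, 6, 1), "end": date(2016, 5, 30)},
    {"station": "Station A", "start": date(2016, 9, 15), "end": date(2017, 8, 1)},
]

# Convert each deployment to a timeline "item"; vis.js accepts ISO 8601
# date strings for the start/end fields. The gap between the two ranges
# (Jun 2016 - Sep 2016) would show up visually as missing coverage.
items = [
    {
        "id": i,
        "content": d["station"],
        "start": d["start"].isoformat(),
        "end": d["end"].isoformat(),
    }
    for i, d in enumerate(deployments)
]

payload = json.dumps(items)
# In the browser, this payload would feed something like:
#   new vis.Timeline(container, JSON.parse(payload), options)
```

The value of the approach is that the JSON items are cheap to regenerate from archive metadata, so the visualization stays current with the inventory.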

6C.3 Establishing a NOAA Passive Acoustic Data Archive for Long-Term Storage and Access
Carrie Wall (NESDIS/NCEI/University of Colorado)
Charles Anderson (NESDIS), Sofie Van Parijs (NMFS), Leila Hatch (NOS), Jason Gedamke (NMFS)
Passive acoustic monitoring of the ocean sound field is a critical aspect of NOAA’s mandate for ocean and coastal data stewardship. Sound can travel vast distances underwater (e.g., across ocean basins), making passive acoustic monitoring a powerful observational tool that is used across NOAA to detect and characterize: (1) sounds produced and used by living marine resources (e.g., endangered marine mammals, commercially important fish species); (2) natural sources of noise from physical oceanographic processes; and (3) anthropogenic noise sources that contribute to the overall ocean noise environment. The NOAA Ocean Noise Reference Station Network, through a unique collaborative effort across NOAA’s OAR, NMFS, and NOS offices, is the first acoustic monitoring system deployed broadly throughout the US EEZ, and it allows NOAA to collect consistent and comparable multi-year acoustic datasets covering all major regions of the U.S. NOAA's recently released Ocean Noise Strategy Roadmap highlighted the importance of establishing a centralized, long-term archive for these passive acoustic data, and the development of the archive is designated as a flagship project of the Ocean Noise Strategy. Toward that end, and in compliance with PARR, NMFS and collaborative partners are proactively seeking mechanisms to efficiently maintain and store passive acoustic data. A 2016 BEDI proposal is underway to advance a 2014-2015 pilot project and establish passive acoustic data stewardship at NCEI. A new web-based map viewer has been built to allow the public to discover, query, and access archived passive acoustic data. Additional datasets collected as part of the Ocean Noise Reference Station Network will be included in the data made available to the public. The current and future implementation plans for an operational archive of passive acoustic data for NOAA’s OAR, NMFS, and NOS offices, as well as the challenges of managing large-volume and multi-platform data, will be discussed.

6C.5 Scientific Stewardship of Ocean Satellite Data
Sheekela Baker-Yeboah (NESDIS/NCEI/CICS-MD)
Korak Saha (NCEI/CICS), Yongsheng Zhang (NCEI/CICS), Dexin Zhang (NCEI/STC)
Part of NOAA’s mission is the archive and stewardship of oceanographic data, and the NOAA National Centers for Environmental Information (NCEI) play an important institutional role by serving as the authoritative source within the US and abroad, providing rigorous long-term archival services. NCEI provides scientific stewardship of remotely sensed oceanographic data, which consists of the application of an integrated suite of functions designed to preserve and exploit the full scientific value of environmental data and information over the long term (decades). NCEI also develops satellite data products and provides authoritative records. Some examples of scientific stewardship and product development will be presented.

6C.4 Access to 50 years of Scientific Ocean Drilling Data. But that’s not all!
Kelly Stroker (NESDIS/Cooperative Institute for Research in Environmental Sciences (CIRES))
Barry Eakins (CIRES), Jennifer Jencks (NOAA/NCEI), Ken Tanaka (CIRES), Erin Reeves (CIRES), Chris Esterlein
Geophysical data have long been stewarded safely and securely by NOAA’s National Centers for Environmental Information (NCEI), formerly the National Geophysical Data Center (NGDC). This includes, but is certainly not limited to, digital data and geologic sample photographs from 50 years of drilling into the sea floor to recover sediment cores and rocks in order to answer fundamental questions about Earth’s geologic history and the long-term evolution of ocean biota and global climate. This stewardship mission has been reinforced and strengthened through participation in a variety of international programs, including hosting and operating the World Data Service for Geophysics, the International Hydrographic Organization’s Data Center for Digital Bathymetry, and the long-term archive for geologic samples data collected by the international scientific ocean drilling program. While the mission to provide long-term scientific data stewardship ensuring quality, integrity, and accessibility has remained largely unchanged for more than 200 years, the methods and technologies used to collect data have changed substantially, as have user requests and expectations regarding the quality and volume of the data users receive. NCEI has been developing a common NCEI Extract System (NEXT) to deliver the variety of geophysical data that NCEI stewards and to handle the volume of data requested. NEXT handles every aspect from receiving data requests to collecting and packaging results and delivering them to the user. The system is easily extensible to new data and provides a programmatic interface so that external users and services can access it.
In this talk we will describe an NSF-funded, one-year project to provide easy Internet access to over 50 years of scientific ocean drilling (SOD) data in NCEI’s deep tape archive through the NEXT graphical user interface, as well as an API enabling our partners to build their own web applications to access these data.

7P

Session: Closing Plenary
Schedule: Jan 10, 14:00-16:00
Location: Salon F-H
Folder: https://drive.google.com/open?id=0B_DpNPgLE0tEWE55MVJnMzhTYU0

Agenda (see abstracts below):
  7P.1 (10 min)  Closing Plenary Intro - Jeff de La Beaujardière
  7P.2 (10 min)  ESIP Introduction - Erin Robinson
  7P.3 (45 min)  Breakout Session Highlights - Session Chairs
  7P.4 (45 min)  Keynote - Metamorphosis: Environmental Data Management Transitions from Adolescence to Adulthood - Dr William Hooke
  7P.5 (10 min)  Final Wrap-Up

Description:

Breakout session highlights, keynote speaker, workshop wrap-up.
Chair: Jeff de La Beaujardière

Abstracts

7P.1 Closing Plenary Intro
Jeff de La Beaujardière (NOAA/EDMC)
Call to order of the Closing Plenary of the 2017 NOAA EDMW.

7P.3 Breakout Session Highlights
Session Chairs (NOAA)

Breakout session chairs or rapporteurs will brief one slide of highlights per breakout, two minutes maximum, in rapid sequence.

7P.2 ESIP Introduction
Erin Robinson (ESIP)
As Executive Director of the Earth Science Information Partnership (ESIP), Erin Robinson will provide a brief overview of ESIP and its Winter Meeting, held immediately after EDMW.

7P.4 Keynote - Metamorphosis: Environmental Data Management Transitions from Adolescence to Adulthood
Dr William Hooke (AMS/Assoc Exec Dir)
Four big trends are forcing environmental data management (EDM) to mature. Two are the challenges of adulthood. Big jobs need doing. Two are the new tools of adulthood. They require mastering, but with that mastery comes opportunity. Environmental data management has never mattered more. But though the EDM of the future will share the DNA of the EDM of the past, the two will be no more similar than a butterfly is to the caterpillar it came from.