Technical Issues of Connecting GeoData within and Between
Governmental Agencies: Focus on NSF Research Data
CYNDY CHANDLER
BIOLOGICAL AND CHEMICAL OCEANOGRAPHY DATA MANAGEMENT OFFICE
WOODS HOLE OCEANOGRAPHIC INSTITUTION
GeoData 2014 ~ 18 June 2014 ~ NCAR Center Green Campus, Boulder, Colorado
Scope: NSF GeoData
• NSF funded, hypothesis-driven, ocean science research projects from
  Division of Ocean Sciences (OCE)
  • OCE Biology and Chemistry
  Division of Polar Programs (PLR)
  • Antarctic Research
    ANT Antarctic Organisms and Ecosystems
Connectivity Challenges
• Goals:
  linking content at distributed repositories
  improved interoperability
• Technical strategies/solutions:
  metadata content standards
  controlled vocabularies
  Linked Data
• Not just technical:
  cultural conditions, behaviors
  research data lifecycle: “proposal to preservation”
An example
• A researcher reads a paper
  We have already assumed they have found and are able to retrieve the paper
  http://www.pnas.org/content/111/22/8089.full
  Patrick Martin, Sonya T. Dyhrman, Michael W. Lomas, Nicole J. Poulton, and Benjamin A. S. Van Mooy (2014) “Accumulation and enhanced cycling of polyphosphate by Sargasso Sea plankton in response to low phosphorus” PNAS 2014 111 (22) 8089-8094; published ahead of print April 21, 2014, doi:10.1073/pnas.1321719111
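The DOI in the citation above is the kind of persistent identifier a machine client can act on. As a minimal sketch (the function name `doi_to_url` and the loose validation pattern are assumptions for illustration, not part of the talk), a DOI printed in a citation can be normalized into a resolver URL:

```python
import re

# Loose shape check: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an illustrative pattern, not a complete DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes commonly seen in front of a DOI in citations and links.
KNOWN_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(raw: str) -> str:
    """Normalize a DOI as printed in a citation into a resolver URL."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return "https://doi.org/" + doi


# The DOI of the PNAS paper cited on this slide.
print(doi_to_url("doi:10.1073/pnas.1321719111"))
```

The same helper accepts the bare form (`10.1073/pnas.1321719111`), so identifiers copied from different catalogs end up as one comparable string.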
Example (cont’d)
there is a data supplement
DOI
What do I Know?
• Publication: PNAS, has a DOI, has data suppl.
• Person name (author): Benjamin Van Mooy
• Dates of activity: 2010 and 2012
• Location keywords: Sargasso Sea
• Cruise: on vessel Knorr
• Data keywords: plankton, polyphosphate, lipid
general knowledge
domain specific
Research is a game of Connect the Dots
• The dots are entities of information and data from distributed repositories
• Some catalogs or repositories are already connected making it easier to “connect the dots”
Connect the Dots
• Dot #3 is a piece of information held in common (e.g. cruise ID)
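A shared identifier such as a cruise ID is what lets two independent catalogs be joined. The sketch below assumes two hypothetical catalogs; every record, cruise ID, and dataset name in it is invented for illustration, and only the vessel name and region come from the slides:

```python
from collections import defaultdict

# Hypothetical cruise catalog (cruise IDs are invented placeholders).
cruise_catalog = [
    {"cruise_id": "CRUISE-A", "vessel": "Knorr", "region": "Sargasso Sea"},
    {"cruise_id": "CRUISE-B", "vessel": "Knorr", "region": "Sargasso Sea"},
]

# Hypothetical dataset catalog held at a different repository.
dataset_catalog = [
    {"dataset": "polyphosphate concentrations", "cruise_id": "CRUISE-A"},
    {"dataset": "lipid profiles", "cruise_id": "CRUISE-A"},
    {"dataset": "plankton counts", "cruise_id": "CRUISE-C"},
]


def connect(cruises, datasets):
    """Join the two catalogs on the identifier they hold in common."""
    by_cruise = defaultdict(list)
    for entry in datasets:
        by_cruise[entry["cruise_id"]].append(entry["dataset"])
    # Cruises with no matching dataset map to an empty list.
    return {c["cruise_id"]: by_cruise.get(c["cruise_id"], []) for c in cruises}


links = connect(cruise_catalog, dataset_catalog)
for cruise_id, names in links.items():
    print(cruise_id, "->", names)
```

The join only works because both catalogs spell the cruise ID the same way, which is the case for negotiated, shared IDs.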
Connect the Dots
Persistent identifiers
• for publications (DOI)
• for data (DOI)
• for people (ORCID)
• metadata
• negotiated, shared, common IDs
• persistent IDs from authoritative sources
• controlled vocabularies
  local terms mapped to community-wide terms identified by URIs
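Mapping local terms to community-wide terms can be pictured as a two-step lookup. This is a minimal sketch: the local spellings, the community terms, and the `vocab.example.org` URIs are all hypothetical placeholders, not entries from any real vocabulary server:

```python
# Hypothetical community vocabulary: preferred term -> URI.
COMMUNITY_VOCAB = {
    "polyphosphate": "http://vocab.example.org/parameter/polyphosphate",
    "lipid": "http://vocab.example.org/parameter/lipid",
}

# Hypothetical local spellings used by individual projects,
# mapped to the community-wide preferred term.
LOCAL_TO_COMMUNITY = {
    "polyp": "polyphosphate",
    "poly-p": "polyphosphate",
    "lipids": "lipid",
}


def resolve(local_term):
    """Return the community URI for a local term, or None if unmapped."""
    key = local_term.strip().lower()
    community_term = LOCAL_TO_COMMUNITY.get(key, key)
    return COMMUNITY_VOCAB.get(community_term)


for term in ["PolyP", "lipids", "chlorophyll"]:
    print(term, "->", resolve(term))
```

Unmapped terms come back as `None` rather than a guessed URI, so gaps in the mapping stay visible to whoever curates it.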
Connect the Dots
• metadata
• negotiated, shared, common IDs
• persistent IDs from authoritative sources
• controlled vocabularies
• semantic markup to provide context and establish relationships
context matters
Semantic Web technologies can help
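One way to picture semantic markup is a JSON-LD style record in which relationships are expressed as links to identifiers. The sketch below uses schema.org terms (`Dataset`, `citation`, `spatialCoverage`, `keywords`); the dataset URL and dataset name are hypothetical, while the paper DOI and keywords come from the example earlier in the talk. The choice of schema.org is an assumption for illustration, not necessarily the markup used by the author's office:

```python
import json

PAPER = "https://doi.org/10.1073/pnas.1321719111"   # DOI from the example paper
DATASET = "https://data.example.org/dataset/1"       # hypothetical identifier

record = {
    "@context": "https://schema.org/",
    "@id": DATASET,
    "@type": "Dataset",
    "name": "Example plankton polyphosphate dataset",  # hypothetical title
    "keywords": ["plankton", "polyphosphate", "lipid"],
    "spatialCoverage": {"@type": "Place", "name": "Sargasso Sea"},
    # The relationship "this dataset is tied to that paper" is itself a link.
    "citation": {"@id": PAPER, "@type": "ScholarlyArticle"},
}


def linked_ids(node):
    """Collect every @id found in a nested JSON-LD style structure."""
    found = []
    if isinstance(node, dict):
        if "@id" in node:
            found.append(node["@id"])
        for value in node.values():
            found.extend(linked_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(linked_ids(item))
    return found


print(json.dumps(record, indent=2))
print(linked_ids(record))
```

A machine client that reads this record can follow the `citation` link without parsing any free text, which is the sense in which context is carried by the markup.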
Connect the Dots
• Technical strategies/solutions:
  metadata … more metadata
  standards-compliant metadata
  globally unique persistent identifiers from authoritative sources
  controlled vocabularies (local & community-wide)
  semantic markup
  Linked Data*
• Support transition from human to machine clients
*Linked Data: Bizer, Heath, Berners-Lee, 2009; doi:10.4018/jswis.2009081901
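For machine clients, a persistent identifier is most useful when it can be checked before it is stored. ORCID iDs, mentioned earlier for people, end in a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below validates that character; the function names are assumptions for illustration, and the sample iD is the fictitious-researcher example from ORCID's own documentation:

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check character of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.strip().replace("-", "").upper()
    if len(compact) != 16 or not compact.isascii():
        return False
    if not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # ORCID's documented sample iD
print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

A passing check only shows the iD is well formed; confirming that it belongs to a given person still requires the authoritative source.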
Progress since 2011
What has made the difference?
Program manager involvement
• Consequences for PIs for not making data available
• Long-term commitment (funding, active engagement)
Changing expectations from originators
• Marine ecosystem research requires access to many different kinds of data
Progress since 2011
What has made the difference?
Community organizations
• NSF EarthCube: funding to establish partnerships with other data managers, computer scientists and geoscientists
• ESIP: opportunity to work with people from other communities doing similar work
  discussions focus on challenges, activities deliver results
• RDA: global organization to foster data sharing
• International efforts with a domain focus (e.g. ocean)
Modern data infrastructure requires
Semantic Web Technologies involve
inspired by (2013)