Environmental Information Data Centre: enabling the discovery of CEH-held data
John Watkins, Deputy Director, EIDC
CEH monitoring and data collection
• As diverse as our science
• Micro- to macro-scale
• Many sources:
  • Monitoring campaigns
  • 180+ field sites
  • State-of-the-art facilities
  • Regulator networks
  • Volunteers
  • Model outputs
• Long-term and unique
10µm
River Lambourn, Boxford
National River Flow Archive
Environmental Change Network
NERC Environmental Bioinformatics Centre
Biological Records Centre
Other Data
NERC Designated Data Centre Data
CEH data
Web Access
Users
CEH Information Gateway
Metadata catalogue (data discovery)
Linked data and integration
Long-term Storage and Curation
View & download (data access)
Query & visualisation tools
NERC Catalogue
UK Gov Catalogue
Data Transfer Process
EIDC Data Hub
Data citation via the Data Hub
“... the data have been allocated a digital object identifier (http://dx.doi.org/10.5285/1a91c7d1-ec44-4858-9af2-98d80f169bbd).”
Making definitions open access
CEH Analytical Services Thesaurus (CAST)
• Created to the Simple Knowledge Organization System (SKOS) W3C standard
• Designed to describe the whole process
• Top concepts:
  • determinands
  • machine descriptions
  • measurement units
  • methods
  • filtration
  • preservation
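As an illustration of what a SKOS-based thesaurus looks like in practice, the following minimal Python sketch serialises the top concepts listed above as SKOS in Turtle syntax. The base URI `http://example.org/cast/`, the `slug` helper and the `to_turtle` function are assumptions made for this example; the slides do not give the real CAST namespace or how the thesaurus is actually built.

```python
from typing import Iterable

# Hypothetical base URI: the real CAST namespace is not given in the slides.
BASE = "http://example.org/cast/"

# Top concepts as listed on the slide.
TOP_CONCEPTS = [
    "determinands",
    "machine descriptions",
    "measurement units",
    "methods",
    "filtration",
    "preservation",
]


def slug(label: str) -> str:
    """Turn a human-readable label into a URI-safe local name."""
    return label.strip().lower().replace(" ", "-")


def to_turtle(base: str, labels: Iterable[str]) -> str:
    """Serialise a concept scheme and its top concepts as SKOS Turtle."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{base}scheme> a skos:ConceptScheme ;",
        '    skos:prefLabel "CEH Analytical Services Thesaurus"@en .',
    ]
    for label in labels:
        lines += [
            "",
            f"<{base}{slug(label)}> a skos:Concept ;",
            f'    skos:prefLabel "{label}"@en ;',
            f"    skos:topConceptOf <{base}scheme> .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_turtle(BASE, TOP_CONCEPTS))
```

Each top concept becomes a `skos:Concept` attached to the scheme through `skos:topConceptOf`; narrower terms (for example individual determinands) would hang beneath these with `skos:broader`.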
Resource oriented discovery
CEH Analytical Services Thesaurus (CAST)
• SKOS allows links to externally hosted vocabularies, e.g. ChEBI
• This adds further value to datasets tagged using CAST, as they can be integrated with datasets tagged using concepts from linked vocabularies
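The integration benefit described above can be sketched in a few lines of Python. This is an illustrative sketch only: the `cast:` and `ext:` identifiers, the dataset names and the `integrate` function are all hypothetical placeholders (the `ext:` terms stand in for concepts from an externally hosted vocabulary such as ChEBI), and the mapping plays the role of `skos:exactMatch` links.

```python
# Stand-in for skos:exactMatch links from CAST concepts to an external
# vocabulary. All identifiers here are hypothetical placeholders.
EXACT_MATCH = {
    "cast:nitrate": "ext:nitrate",
    "cast:phosphate": "ext:phosphate",
}

# Datasets tagged with CAST concepts (hypothetical names).
cast_tagged = [
    {"dataset": "river-chemistry", "tag": "cast:nitrate"},
    {"dataset": "lake-survey", "tag": "cast:phosphate"},
]

# Datasets tagged directly with the external vocabulary (hypothetical).
external_tagged = [
    {"dataset": "partner-soil-cores", "tag": "ext:nitrate"},
]


def integrate(cast_records, external_records, mapping):
    """Group dataset names by external concept, following the mapping links."""
    groups = {}
    for rec in cast_records:
        target = mapping.get(rec["tag"])
        if target is not None:  # skip CAST tags with no external match
            groups.setdefault(target, []).append(rec["dataset"])
    for rec in external_records:
        groups.setdefault(rec["tag"], []).append(rec["dataset"])
    return groups


if __name__ == "__main__":
    for concept, datasets in integrate(
        cast_tagged, external_tagged, EXACT_MATCH
    ).items():
        print(concept, datasets)
```

Because both collections resolve to the same external concept, a query for that concept finds datasets from either source without the providers having agreed on a shared tagging scheme in advance.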
Issues & challenges
• Researchers can ask complex questions across diverse data sources using linked open data (LOD)
• How to incentivise data providers to document & tag data => buy-in (e.g. DOIs)!
• Tools to automate the process, tagging at source/time of creation (e.g. LIMS)
• Automating the creation of semantic information for legacy data using diverse information sources (e.g. text mining of past reports and science papers)
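As a rough idea of how tagging of legacy text might be automated, the sketch below scans a passage for vocabulary labels and suggests matching concepts. It is a deliberately simple assumption-laden example: the `VOCAB` entries, the `cast:` identifiers and the `suggest_tags` function are hypothetical, and real text mining of reports and papers would need far more than whole-word matching (synonyms, units, context).

```python
import re

# Hypothetical label-to-concept lookup; not taken from the real thesaurus.
VOCAB = {
    "nitrate": "cast:nitrate",
    "dissolved organic carbon": "cast:dissolved-organic-carbon",
    "filtration": "cast:filtration",
}


def suggest_tags(text, vocab):
    """Return concepts whose label occurs as a whole word or phrase in text."""
    found = []
    for label, concept in vocab.items():
        # \b anchors avoid matching inside longer words.
        pattern = r"\b" + re.escape(label) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(concept)
    return sorted(found)


if __name__ == "__main__":
    report = "Samples underwent filtration before Nitrate analysis."
    print(suggest_tags(report, VOCAB))
```

Suggestions produced this way would still need review by the data provider before being stored as semantic metadata.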