Urmas Kõljalg: Data sources (WP1) Task 1.1 Assessment and evaluation of biodiversity data sources...


Post on 15-Dec-2015



Urmas Kõljalg: Data sources (WP1)

Task 1.1 Assessment and evaluation of biodiversity data sources

Lead UTARTU; MfN, UEF, UFZ, SGN, HCMR, BGBM, UCPH, MRAC, Plazi, NRM, IBSAS, EBCC, NBIC, WCMC

The evaluation will assess important data characteristics, such as coverage, accessibility, quality, and format. Targeted data types are: 1) remote sensing data (incl. derived products e.g. vegetation and habitat maps, habitat classification schemes); 3) taxonomic backbone data; 4) ecological data; 5) specimen data from scientific collections; 6) species profile data, including descriptions and functional traits, conservation status, distribution and abundance data; 7) DNA sequence data.


Urmas Kõljalg: Data sources (WP1)

Task 1.2 Harmonization of European taxonomic backbone and analysis of taxonomic coverage

Lead BGBM; MfN, UTARTU, UEF, UFZ, SGN, FIN, HCMR, UCPH, MRAC, Plazi, NRM, IBSAS, NBIC

A unified taxonomic backbone will be built on the Pan-European Species directories Infrastructure (PESI, www.eu-nomen.eu), and harmonized with ongoing attempts towards a global Catalogue of Life (CoL). This task will integrate the Fauna Europaea and Euro+Med PlantBase databases into the EDIT Platform for Cybertaxonomy, set up a mechanism for regular updates, develop a full set of GEO BON and LifeWatch compliant services for integration of taxonomic backbone data into the overall EU BON framework, and integrate these services with INSPIRE conformant national data sources that include taxonomic names.


Urmas Kõljalg: Data sources (WP1)

Task 1.3 Gap analysis of available biodiversity information sources and identifying priorities

Lead MfN; UTARTU, GBIF, FIN, BGBM, MRAC, WCMC

Based on the assessments of data sources undertaken for task 1.1 and the review of policy requirements in WP6 (task 6.1), identified gaps in data coverage and quality for different information layers will be evaluated against scientific interests and capacities in the biodiversity information stakeholder communities, focusing on the European level. Taxonomic, geographic, thematic, and other areas of bias will be analyzed in assessed datasets, and priority levels for closing gaps will be provided, also comparing European to global level coverage.


Urmas Kõljalg: Data sources (WP1)

Task 1.4 Integrated approaches for focused biodiversity data mobilization

Lead NRM; UTARTU, Pensoft, BGBM, UCPH, MRAC, Plazi, GlueCAD, IBSAS, NBIC;

With this task, EU BON will advance data mobilization efforts targeting collection-based and molecular data. In particular, three activities will be pursued: i) The open-source DINA initiative, led by NRM, is building a web-based collection management system; through it, an integrated solution for digitizing, managing and mobilizing specimen data in natural history collections will be developed. ii) The open access system JACQ (Virtual Herbaria) for capturing botanical data will be integrated with other web services, to allow a significant number of European herbaria to feed their data into GBIF, BioCASE, and other networks relevant for GEO BON. iii) For DNA and genomic datasets, a set of web-based services integrating distributional and molecular information will be provided, also linking to molecular identification.


Urmas Kõljalg: Data sources (WP1)

Task 1.5 Exploring citizen science-based approaches for mobilizing and generating biodiversity data

Lead UTARTU; MfN, Pensoft, FIN, HCMR, UCPH, Plazi, GlueCAD, NRM, IBSAS, NBIC;

This task will explore the potential of citizen science-based approaches for biodiversity assessment and monitoring, particularly for achieving more comprehensive data coverage and with a view to future GEO BON developments. Using highly successful technology-based citizen science recording schemes from EU BON partners and associates, the status of citizen science and its links to curricula and environmental education will be evaluated for best practice examples from at least five EU countries. Elements for an action plan for pan-European citizen science networking for biodiversity information will be developed, including support for an education network that will provide links to international and global initiatives and programs.


Data Integration and Interoperability

WP2

Hannu Saarenmaa


Objectives

• Establish an information architecture for the EU BON project that will be compatible with the global GEO BON, INSPIRE, other European projects, and the LifeWatch research infrastructure

• Develop data integration and interoperability between the various networks and, with a new generation of data sharing tools, enhance linking between observational data, ecosystem monitoring data, and remote sensing data

• Develop new web service interfaces for data holdings using state-of-the-art standards and protocols. Register the networks on the GEOSS Common Infrastructure (GCI) using harmonised metadata

• Develop a new portal to enable fast access to EU BON integrated data and products by researchers, decision makers and other stakeholders

• Ensure global coordination of development efforts through an international data interoperability task force, and adoption of the results through a helpdesk and a comprehensive training programme


Backdrop: GEO BON wants to achieve an operational system by 2015

• What do we mean by “operational”?
– data flow from observations through various aggregation and processing services, end-to-end to Essential Biodiversity Variables (EBV) and indicators;
– automated and streamlined, as appropriate;
– using a plug-and-play service-oriented approach;
– coordinated through the GEO BON registry system and linked to the GEOSS Common Infrastructure;
– transparent to users through portals.
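
The end-to-end flow described above (observations, then aggregation and processing services, then EBVs and indicators) can be pictured as a chain of interchangeable functions. The following is only a minimal sketch under stated assumptions: the record shape, the grid-cell codes, the function names, and the choice of species richness as an EBV-like variable are all hypothetical illustrations and are not taken from EU BON or GEO BON specifications.

```python
from collections import defaultdict

# Hypothetical record shape: (species, year, grid_cell). Not an EU BON schema.
observations = [
    ("Parus major", 2012, "A1"),
    ("Parus major", 2012, "A2"),
    ("Erithacus rubecula", 2012, "A1"),
    ("Parus major", 2013, "A1"),
]

def aggregate_by_cell(records):
    """Aggregation service: set of species seen per (grid cell, year)."""
    grouped = defaultdict(set)
    for species, year, cell in records:
        grouped[(cell, year)].add(species)
    return dict(grouped)

def species_richness(grouped):
    """Processing service: an EBV-like variable, species count per cell and year."""
    return {key: len(species) for key, species in grouped.items()}

def mean_richness_by_year(richness):
    """Indicator service: average richness over all cells for each year."""
    per_year = defaultdict(list)
    for (_cell, year), value in richness.items():
        per_year[year].append(value)
    return {year: sum(values) / len(values) for year, values in per_year.items()}

def run_pipeline(data, services):
    """Plug-and-play: each service consumes the previous service's output."""
    for service in services:
        data = service(data)
    return data

indicator = run_pipeline(
    observations,
    [aggregate_by_cell, species_richness, mean_richness_by_year],
)
print(indicator)
```

Because each step only depends on the output of the previous one, a step could be swapped for a remote service call without changing the rest of the chain, which is the sense of "plug-and-play" assumed in this sketch.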


Streamlined data flow, end-to-end

[Figure: data flow diagram. Source data labels: Occurrence (Observation, Specimen, Taxonomy); Ecological (Plot, Individual, Measurement); Geospatial (Scenarios, Remote sensing, In Situ); Ecosystem Services (Surveys, Spatial data, National Statistics); Other inputs. Each of the four groups is labelled "Analysable dataset". Further labels: Integration; Processing service-1, Processing service-2, Processing service-3; EBV-1, EBV-2, EBV-x; Indicator-1, Indicator-2; GEOSS & GEO BON Registries.]


EU BON WP2 Deliverables

1. Review (January 2014)
– Design of information architecture
– Review of data standards
2. New data sharing tools (2014/2015)
3. Registry and metadata catalogue (2014/2015)
4. Portal (2015/2016)
5. Assessment of training activities (2016)


EU BON WP2 Milestones in year 1

• A global Informatics Task force
– Invited (March)
– First meeting (May/June)
• Helpdesk opened (April)
• Review
– Draft outlines (May)
– Deliverable (January 2014)
• Initial informatics workshop (May/June)
• Specifications for data sharing tools (January)
• First training workshop (January)


GEOSS Infrastructure Components

[Figure: GEOSS architecture diagram. Client Tier: GEO Web Portals, Community Portals, Client Applications. Mediation Tier: GEOSS Clearinghouses, Community Catalogues, Alert Servers, Workflow Management, Processing Servers. Access Tier: GEONETCast, Product Access Servers, Sensor Web Servers, Model Access Servers. GEOSS Common Infrastructure: Main GEO Web Site, Registries, Components & Services, Standards and Interoperability, Best Practices Wiki, User Requirements. Other labels: Registered Community Resources, Test Facility, Mediation Servers. Interface labels: CSW, WMS, WFS, WPS, SOS, SAS, SPS, W*S.]


Task 2.1 Design of information architecture for EU BON

• Starting from the information architectures of relevant infrastructures, i.e., GBIF, LTER, GEOSS, GEO BON, LifeWatch, and INSPIRE, adopt a coherent architecture that will guide the development, integration and interoperability efforts within the EU BON project.

• The architecture will highlight the relevant components of registry, portal, semantic mediation, workflows, and e-services as envisaged in the GEO BON Detailed Implementation Plan and open access as recommended by the GEOSS Data Sharing Principles.

• Link to, and adopt informatics components and approaches of other relevant EU projects.

• The task will address heterogeneity of projects and networks by ensuring that the developments of EU BON can be migrated to permanent infrastructures.

• In particular, the architecture will map GCI components to European and global biodiversity infrastructure.

• (Lead CSIC; UTARTU, UEF, GBIF, MRAC, GlueCAD, IBSAS, NBIC, TerraData; Months 4-14)


Task 2.2 Improving data standards and interoperability

• Starting from the GEO BON Detailed Implementation Plan and the architecture (task 2.1) as well as relevant European projects (ALTERNet, EBONE, LifeWatch), review the state-of-the-art and needs for improvement of the current data standards of TDWG, OGC, BioCASE, GBIF, LTER-Europe, PESI, and INSPIRE.

• Consider how the available protocols and mechanisms for interoperability can be best used for integrating different data layers (i.e., genetic data, primary occurrence data, monitoring data, ecological measurements, remote sensing data) in the European context.

• Consider reasons for heterogeneity of biodiversity information and make recommendations for use of standards by the various networks.

• (Lead GBIF; UTARTU, UEF, CSIC, Pensoft, MRAC, Plazi, GlueCAD, INPA, IBSAS, NBIC, TerraData; Months 4-51)
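
One way to picture what a shared standard buys when integrating heterogeneous sources is a field mapping onto common terms. The sketch below renames fields from two hypothetical providers onto Darwin Core terms (`scientificName`, `decimalLatitude`, `decimalLongitude`, `eventDate`, `basisOfRecord`). The provider names, their source field names, and the sample records are assumptions made up for illustration; only the Darwin Core term names come from the standard itself.

```python
# Hypothetical source-specific field names mapped onto Darwin Core terms.
FIELD_MAPS = {
    "monitoring_scheme": {
        "species": "scientificName",
        "lat": "decimalLatitude",
        "lon": "decimalLongitude",
        "date": "eventDate",
    },
    "museum_collection": {
        "taxon_name": "scientificName",
        "latitude": "decimalLatitude",
        "longitude": "decimalLongitude",
        "collected_on": "eventDate",
    },
}

# Darwin Core basisOfRecord value assumed for each hypothetical source.
BASIS_OF_RECORD = {
    "monitoring_scheme": "HumanObservation",
    "museum_collection": "PreservedSpecimen",
}

def to_darwin_core(source, record):
    """Rename known fields to Darwin Core terms; unmapped fields are dropped."""
    mapping = FIELD_MAPS[source]
    dwc = {mapping[key]: value for key, value in record.items() if key in mapping}
    dwc["basisOfRecord"] = BASIS_OF_RECORD[source]
    return dwc

field_record = {"species": "Parus major", "lat": 58.38, "lon": 26.72,
                "date": "2013-05-14", "observer_id": 17}
specimen_record = {"taxon_name": "Parus major", "latitude": 52.53,
                   "longitude": 13.38, "collected_on": "1921-06-02"}

harmonised = [
    to_darwin_core("monitoring_scheme", field_record),
    to_darwin_core("museum_collection", specimen_record),
]
for row in harmonised:
    print(row)
```

After mapping, both records share the same keys and can be queried together, while `basisOfRecord` still distinguishes an observation from a specimen.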


Task 2.3 Tools for data sharing

• This task will work with international partners (task 2.7) to scope the requirements and build new releases of data sharing tools for relevant data providers.

• These open source tools implement the selected interoperability mechanisms (task 2.2) and data publishing mechanisms (task 8.5) for use by the relevant networks, and provide registration and query functions towards the GCI.

• As the basis of development, existing tools for metadata, occurrence data and ecological data from GBIF and LTER will be used.

• New tools for sharing habitat data will be investigated.

• A model for distributed development will be adopted.

• (Lead MRAC; UTARTU, UEF, GBIF, Pensoft, Plazi, GlueCAD, INPA, IBSAS; Months 9-51)


Task 2.4 Metadata registry and catalogue

• Building on the existing GBIF and LTER registry and metadata catalogues, an enhanced and integrated metadata system will be developed for EU BON.

• The various entities such as networks, projects, sites, and datasets identified in the analysis and mobilization efforts of WP1 will be described in the new registry/catalogue.

• The entity descriptions should include web service interfaces or other access points, and will also be registered at the GCI and other indexing services.

• In order to overcome heterogeneity of data, accommodate multilingualism, enhance discoverability and interoperability, and facilitate querying in portals, the use of Knowledge Organisation Systems (KOS; e.g., thesauri) will be explored.

• (Lead GBIF; UEF, CSIC, Pensoft, MRAC, INPA, IBSAS; Months 9-51)
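
As a rough illustration of the ideas in this task, the sketch below models a registry entry that carries its access points, plus a toy thesaurus used to expand a search term across languages. Everything here is a hypothetical assumption: the `RegistryEntry` fields, the `example.org` URLs, and the four-language thesaurus are invented for the example and do not reflect the GBIF or LTER registry schemas.

```python
from dataclasses import dataclass, field

@dataclass
class RegistryEntry:
    """Hypothetical description of a registered entity."""
    identifier: str
    entity_type: str  # e.g. "network", "project", "site", "dataset"
    title: str
    keywords: set = field(default_factory=set)
    access_points: list = field(default_factory=list)  # service URLs

# Toy thesaurus: one concept, labels in English, German, Estonian, Spanish.
THESAURUS = {
    "forest": {"forest", "wald", "mets", "bosque"},
    "bird": {"bird", "vogel", "lind", "ave"},
}

def expand_term(term):
    """Return all labels of the concept containing the term, else the term itself."""
    term = term.lower()
    for labels in THESAURUS.values():
        if term in labels:
            return set(labels)
    return {term}

def search(entries, term):
    """Match entries whose keywords share any label with the expanded term."""
    labels = expand_term(term)
    return [e for e in entries if labels & {k.lower() for k in e.keywords}]

entries = [
    RegistryEntry("ds-1", "dataset", "Forest plot survey",
                  {"forest"}, ["https://example.org/ds-1/wfs"]),
    RegistryEntry("ds-2", "dataset", "Breeding bird counts",
                  {"Vogel"}, ["https://example.org/ds-2/csw"]),
]

hits = search(entries, "bird")
print([(e.identifier, e.access_points) for e in hits])
```

Here a query for "bird" finds a dataset tagged only with the German keyword "Vogel", which is the kind of multilingual discoverability a thesaurus is meant to provide.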


Task 2.5 European Biodiversity Portal

• A European Biodiversity Portal (EBP) will be developed as the main GEO BON information hub.

• It will link to relevant databases and information systems, policy contacts and recommendations, and structured advice for assessing relevant distributed information/datasets for different user groups, including contributions from citizen science data gathering gateways.

• The EBP will technically integrate the various data sources under one search facility and spatially/temporally oriented user interface.

• The portal will build on the tools developed by task 2.3 and the functions developed by task 2.4.

• It will provide access to full detailed data, geographic visualisation, and remotely sensed data. It will be closely linked to the GCI and GEO Portal, and access layers and data from GEOSS sources.

• The portal would also act as showcase for the products from the analytical and modelling activities of other WPs and support workflows for building such products using the registered e-services.

• The portal will also serve general dissemination functions for WP8.

• (Lead CSIC; UEF, GBIF, UnivLeeds, Pensoft, FIN, MRAC, Plazi, GlueCAD, NBIC; Months 1-54)


Task 2.6 Technical support and helpdesk

• This task will set up a technical coordination unit and helpdesk for European networks and for global outreach as envisaged in the GEO BON Detailed Implementation Plan.

• The helpdesk will also support developments in other WPs, and will play a key role in reviewing all design documents, testing the developments and giving feedback to developers.

• The helpdesk will actively promote and assist national BONs and other users in installing and using the tools that enable the interoperability mechanisms (task 2.3).

• It will promote open access and assist in registering EU BON services at the GCI, and help populating the portal with content from other WPs and partners.

• The helpdesk facility will be set in place in collaboration and synergy with the common Helpdesk platform of the MRAC, which is currently active for six projects.

• (Lead MRAC; UEF, Pensoft; Months 4-54)


Task 2.8 Training programme

• This task will develop and operate a training programme in data and metadata integration strategies, use of standards, and use of data tools, within the EU BON consortium and beyond, thereby also contributing to WP8 and the long-term impact of the project.

• The training events will start with introductory courses and will first target the consortium members. As the project evolves and the tools become available, the training will also target external users. The plan is to invite stakeholders from Europe and beyond. We target an audience of up to 25 participants per event.

• The training programme will be organized in collaboration with the DEST (the Distributed European School of Taxonomy) set up under the European project EDIT and currently maintained by RBINS, MRAC and NBGB. Initially mainly dedicated to taxonomy courses, the scope has now been broadened to other biodiversity and environment related training activities (http://www.taxonomytraining.eu/).

• (Lead MRAC; UEF, Pensoft; Months 4-54)


Task 2.7 Informatics task force and global cooperation

• In order to ensure integration of the various initiatives and reduction of heterogeneity of biodiversity information, EU BON will need to liaise broadly on technical issues.

• In this task an informatics and data standards task force and advisory board will be organized and operated.

• In particular, the task will cooperate with, and build on technical solutions developed by the LifeWatch infrastructure and several FP7 infrastructure projects for biodiversity.

• Close and regular contact with the GEOSS Infrastructure Implementation Board and the GEO BON Steering Committee will be ensured.

• It will also be necessary to link to and cooperate with related initiatives in other parts of the world (such as ALA, DataOne, regional BONs, etc.), and help to ensure their compatibility.

• (Lead UEF; MfN, GBIF, CSIC, MRAC, IBSAS, NBIC; Months 4-54)


Summary

• EU BON WP2 aligns very closely to GEO BON WG8 – global task force will ensure this

• We have a lot to learn on how to be part of the GEOSS Common Infrastructure

• We need to pilot structure and function for LifeWatch

• We will need to build on tools already being worked on in the community


LifeWatch Component Architecture


Some Generic Data Standards, Interoperability Requirements, and Tools

[Table residue from slide. Rows recoverable in full:
Occurrence: GBIF IPT; EML, DwC; XYZ, T, Tx
Genome: GenBank; FTP/ASN.1; XYZ, T, Al
Remaining data types: Multi-dimensional, Traditional Spatial, Signals, Ecosystem. Remaining labels: NetCDF; S-DB; O&M; MetaCat; WFS, WMS, …; SOS; EML, CSV; XYZ, t, P; XY, t, P; XYZ, t, P/B]


Meta-Data Landscape

[Figure labels: EML; ISO 19115/p2; FGDC; Dublin Core; Darwin Core; OPeNDAP/NetCDF/HDF-5; Semantic Web / Linked Open Data]


What is the role of a portal in a Service Oriented Architecture?

• and in distributed communities?

[Figure labels: Project-ware; EAI; SOA; Pre-GBIF; GBIF; LifeWatch, GEOSS, SEIS. Source: Sam Gentile]


Service Architecture

A Service Centric Portal

SOA structures the business and its systems as a set of capabilities that are offered as Services, organized into a Service Architecture.

Service virtualizes how that capability is performed, and where and by whom the resources are provided, enabling multiple providers and consumers to participate together in shared business activities.

[Figure labels: Biodiversity science; Shared Services; four Service boxes; Ecosystem Services; Remote sensing; Ecological measurements; Occurrence; Multiple Service Consumers; Multiple Business Processes; Multiple Discrete Resources; Multiple Service Providers. Source: TietoEnator AB, Kurts Bilder]