Using e-Infrastructures for Biodiversity Conservation - Module 1

Gianpaolo Coro, ISTI-CNR, Pisa, Italy


Post on 15-Aug-2015



Aims of the course

1. Overview of approaches to biodiversity data management and analysis

2. Explain how to support a specific community of practice using a general-purpose system

3. Show a collection of approaches/models/interfaces that are also applicable to other domains

Module 1 - Outline

• E-Infrastructures
• Virtual Research Environments
• The i-Marine Web Portal
• Biodiversity Catalogues
• Management of heterogeneous data
• Tools for Biodiversity data access

e-Infrastructures

“e-Infrastructures enable researchers in different locations across the world to collaborate in the context of their home institutions or in national or multinational scientific initiatives. They can work together by having shared access to unique or distributed scientific facilities (including data, instruments, computing and communications)*.”

* Belief, http://www.beliefproject.org/

Examples:
• OpenAire, http://www.openaire.eu/
• i-Marine, http://www.i-marine.eu/
• EU-Brazil OpenBio, http://www.eubrazilopenbio.eu/

e-Infrastructures

• Data e-Infrastructure: an e-Infrastructure promoting data sharing and consumption. Addresses the needs of the research activity performed by a certain community.

• Computational e-Infrastructure: an e-Infrastructure offering computational resources distributed in a network environment. Uses Cloud computing to execute calculations with a large number of connected computers. Offers collaboration facilities for scientists to share experimental results.


Virtual Research Environments

Virtual Research Environments (VREs): virtual organizations of communities of researchers that help them collaborate.

• Define sub-communities inside an e-Infrastructure;

• Allow temporary dedicated assignment of computational, storage, and data resources to a group of people;

• Very important in fields where research is carried out in several teams which span institutions and countries.

[Figure: an e-Infrastructure containing several VREs]

D4Science

D4Science is both a Data and a Computational e-Infrastructure

• Used by several projects: i-Marine, EUBrazil OpenBio, ENVRI;

• Implements the notion of e-Infrastructure as-a-Service: it offers on-demand access to data management services and computational facilities;

• Hosts several VREs for Fisheries Managers, Biologists, Statisticians… and Students.

D4Science Social

A continuously updated list of events and news produced by users and applications

[Figure labels: User-shared News; Application-shared News; Share News]

D4Science Workspace

A folder-based file system for managing complex information objects in a seamless way

Information objects can be:
• files, datasets, workflows, experiments, etc.
• organized into folders and shared
• disseminated via URIs
• accessed via WebDAV
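Because Workspace objects can be accessed via WebDAV, a client could list a folder with a standard PROPFIND request. The following is only a minimal sketch using the Python standard library: the folder URL is hypothetical, the function names are illustrative, and the sample response is hand-written rather than captured from D4Science.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List

DAV_NS = {"d": "DAV:"}  # namespace used by WebDAV responses


def build_propfind(folder_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a PROPFIND request for one folder level."""
    return urllib.request.Request(
        folder_url,
        method="PROPFIND",
        headers={"Depth": "1"},  # the folder itself plus its direct children
    )


def list_hrefs(multistatus_xml: str) -> List[str]:
    """Extract the resource paths from a 207 Multi-Status response body."""
    root = ET.fromstring(multistatus_xml)
    return [href.text for href in root.findall("d:response/d:href", DAV_NS)]


# Hand-written sample response, for illustration only.
SAMPLE = """<d:multistatus xmlns:d="DAV:">
  <d:response><d:href>/workspace/</d:href></d:response>
  <d:response><d:href>/workspace/occurrences.csv</d:href></d:response>
</d:multistatus>"""

# Hypothetical URL: replace with the WebDAV address of a real workspace.
request = build_propfind("https://example.org/webdav/workspace/")
print(request.get_method(), request.get_header("Depth"))
print(list_hrefs(SAMPLE))
```

Sending the request would additionally require credentials, which are omitted here.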

D4Science - Resources

• A large set of connected Biodiversity and Taxonomic Datasets

• A Network to distribute and access Geospatial Data

• A Distributed Storage System to store datasets and documents

• A Social Network to share opinions and useful news

• Algorithms for Biology-related experiments


i-Marine

i-Marine is a European-funded project. It aims at establishing and operating a Data and Computational e-Infrastructure supporting the principles of the Ecosystem Approach to Fisheries Management and Conservation of Marine Living Resources.

i-Marine Community

• Biodiversity: build and analyse species distribution and biodiversity maps
• Geospatial: store, discover, access, and process geospatial data
• Statistical: exchange and process statistical data
• Semantic: discover and bridge across knowledge providers

[Figure labels: Physical and chemical features; Inventories of biological information; Habitat types; Socio-economic aspects; Marine resource assessment; Fishery operation, processing and trade; Marine Planning]

Online examples: the i-Marine Web Portal and basic functions

http://portal.i-marine.d4science.org/


An authoritative use case

Coelacanth (Latimeria chalumnae, Smith 1939)

Coelacanths were thought to have gone extinct in the Late Cretaceous, but were rediscovered in 1938 off the coast of South Africa.

Its current form closely resembles the form it had 400 million years ago. It is related to lungfishes and tetrapods.

Biodiversity Data: Taxonomies

In biology, a taxon (plural taxa) is a group of one or more populations of an organism or organisms seen by taxonomists to form a unit.

Introduced by Linnaeus's system in Systema Naturae (10th edition, 1758).

• A taxon is usually known by a particular name and given a particular ranking, especially if (and when) it is accepted or becomes established

• An accepted taxon is given a formal scientific name, according to nomenclature codes, e.g. Gadus morhua (Linnaeus, 1758)*

• A "good" or "useful" taxon is one that reflects evolutionary relationships.

* More on scientific names here: http://wiki.i-marine.eu/index.php/Taxa_Merging_Discussion
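A formal scientific name such as Gadus morhua (Linnaeus, 1758) packs several pieces of information: genus, specific epithet, author and year. The sketch below shows one simple way to split them apart in Python. It is an illustrative assumption rather than the parser used by i-Marine, it handles plain binomials only, and the names `ScientificName` and `parse_scientific_name` are invented for this example.

```python
import re
from typing import NamedTuple, Optional


class ScientificName(NamedTuple):
    genus: str
    epithet: str
    authorship: Optional[str]  # e.g. "Linnaeus, 1758", without parentheses
    year: Optional[int]


# Capitalised genus, lower-case epithet, then whatever follows (authorship).
_BINOMIAL = re.compile(r"^([A-Z][a-z]+)\s+([a-z-]+),?\s*(.*)$")


def parse_scientific_name(name: str) -> ScientificName:
    """Split a binomial name into its parts; raises ValueError if it does not fit."""
    match = _BINOMIAL.match(name.strip())
    if match is None:
        raise ValueError(f"not a simple binomial name: {name!r}")
    genus, epithet, rest = match.groups()
    authorship = rest.strip("() ") or None
    year_match = re.search(r"\b(\d{4})\b", authorship) if authorship else None
    year = int(year_match.group(1)) if year_match else None
    return ScientificName(genus, epithet, authorship, year)


cod = parse_scientific_name("Gadus morhua (Linnaeus, 1758)")
print(cod)
```

Real names are messier (subspecies, hybrids, multiple authors), so a production system would rely on a dedicated name parser rather than a single regular expression.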

Taxa Representations

[Figure: taxa representations in Biology and in Computer science]
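On the computer-science side, one common way to hold a taxonomy is a tree whose nodes carry a rank and a name. The sketch below is an assumption about what such a representation could look like, not necessarily the one shown on the slide; the class and function names are illustrative, and the lineage is the Centropyge classification that appears in the Darwin Core example of this module.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Taxon:
    rank: str
    name: str
    children: Dict[str, "Taxon"] = field(default_factory=dict)


def insert_lineage(root: Taxon, lineage: List[Tuple[str, str]]) -> Taxon:
    """Walk down from root, creating missing nodes; return the last node."""
    node = root
    for rank, name in lineage:
        node = node.children.setdefault(name, Taxon(rank, name))
    return node


def find_path(node: Taxon, name: str, path: Tuple[str, ...] = ()) -> Optional[List[str]]:
    """Depth-first search returning the names from the root down to `name`."""
    path = path + (node.name,)
    if node.name == name:
        return list(path)
    for child in node.children.values():
        found = find_path(child, name, path)
        if found:
            return found
    return None


root = Taxon("kingdom", "Animalia")
shared = [("phylum", "Chordata"), ("class", "Osteichthyes"),
          ("order", "Perciformes"), ("family", "Pomacanthidae"),
          ("genus", "Centropyge")]
insert_lineage(root, shared + [("species", "Centropyge flavicauda")])
insert_lineage(root, shared + [("species", "Centropyge fisheri")])
print(find_path(root, "Centropyge fisheri"))
```

Both species end up as children of the same genus node, so shared higher ranks are stored only once.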

Biodiversity Data: Occurrence data

Records of species presence, usually provided by scientific surveys

Specimens, human observations (direct/indirect)

Biodiversity Data Providers

i-Marine hosts biodiversity datasets coming from several data providers:
• Some are accessed remotely and are maintained by their respective owners;
• Others reside in the e-Infrastructure.

Currently, the accessible datasets are:
• Catalogue of Life (CoL)
• Global Biodiversity Information Facility (GBIF)
• Integrated Taxonomic Information System (ITIS)
• Interim Register of Marine and Nonmarine Genera (IRMNG)
• Ocean Biogeographic Information System (OBIS)
• World Register of Marine Species (WoRMS)
• World Register of Deep-Sea Species (WoRDSS)

Some data providers aggregate data from other providers, but alignment between them is not guaranteed!

The datasets allow retrieving:
• Occurrence points (presence points or specimens)
• Taxa names


Biodiversity Data Representation

Darwin Core:

• An extension of Dublin Core

• Used in Biodiversity Informatics

• Its terms are part of vocabularies and technical specifications developed and maintained by the Taxonomic Databases Working Group (TDWG)

• Centred on taxa: its terms refer to species occurrence in nature as documented by observations, specimens, samples, and related information

• The Simple Darwin Core is a commonly used specification to share data about taxa and their occurrences in a simply structured way

Example of DwC document:

<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
    xmlns:dc="http://purl.org/dc/terms/"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
  <SimpleDarwinRecord>
    <dc:modified>2006-05-04T18:13:51.0Z</dc:modified>
    <dc:language>en</dc:language>
    <dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
    <dwc:scientificNameID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:scientificNameID>
    <dwc:acceptedNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=22010</dwc:acceptedNameUsageID>
    <dwc:originalNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:originalNameUsageID>
    <dwc:nameAccordingToID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=22764</dwc:nameAccordingToID>
    <dwc:namePublishedInID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=671</dwc:namePublishedInID>
    <dwc:scientificName>Centropyge flavicauda Fraser-Brunner 1933</dwc:scientificName>
    <dwc:acceptedNameUsage>Centropyge fisheri (Snyder 1904)</dwc:acceptedNameUsage>
    <dwc:parentNameUsage>Centropyge Kaup, 1860</dwc:parentNameUsage>
    <dwc:originalNameUsage>Centropyge flavicauda Fraser-Brunner 1933</dwc:originalNameUsage>
    <dwc:nameAccordingTo>Allen, G.R. 1980. Butterfly and angelfishes of the world. Volume II. Mergus Publishers. Pp. 149-352.</dwc:nameAccordingTo>
    <dwc:namePublishedIn>Fraser-Brunner, A. 1933. A revision of the chaetodont fishes of the subfamily Pomacanthinae. Proceedings of the General Meetings for Scientific Business of the Zoological Society of London 1933 (pt 3, no.30): 543-599, Pl. 1.</dwc:namePublishedIn>
    <dwc:higherClassification>Animalia;Chordata;Vertebrata;Osteichthyes;Actinopterygii;Neopterygii;Teleostei;Acanthopterygii;Perciformes;Percoidei;Pomacanthidae;Centropyge</dwc:higherClassification>
    <dwc:kingdom>Animalia</dwc:kingdom>
    <dwc:phylum>Chordata</dwc:phylum>
    <dwc:class>Osteichthyes</dwc:class>
    <dwc:order>Perciformes</dwc:order>
    <dwc:family>Pomacanthidae</dwc:family>
    <dwc:genus>Centropyge</dwc:genus>
    <dwc:specificEpithet>flavicauda</dwc:specificEpithet>
    <dwc:scientificNameAuthorship>Fraser-Brunner 1933</dwc:scientificNameAuthorship>
    <dwc:taxonRank>species</dwc:taxonRank>
    <dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
    <dwc:taxonomicStatus>accepted</dwc:taxonomicStatus>
  </SimpleDarwinRecord>
</SimpleDarwinRecordSet>
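A Simple Darwin Core record set like the one above can be read with any namespace-aware XML parser. The following minimal sketch uses Python's standard `xml.etree.ElementTree`; the namespaces are the ones declared in the example, while the function name `read_records` and the shortened sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SDWC_NS = "http://rs.tdwg.org/dwc/xsd/simpledarwincore/"  # record-set namespace
DWC_NS = "http://rs.tdwg.org/dwc/terms/"                  # Darwin Core terms


def read_records(xml_text: str) -> List[Dict[str, str]]:
    """Return one dict per SimpleDarwinRecord, keyed by Darwin Core term name."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{SDWC_NS}}}SimpleDarwinRecord"):
        fields = {}
        for child in record:
            # ElementTree tags look like "{namespace}localname".
            namespace, _, local = child.tag[1:].partition("}")
            if namespace == DWC_NS:
                fields[local] = (child.text or "").strip()
        records.append(fields)
    return records


# Shortened version of the example document (a few terms only).
SAMPLE = """<SimpleDarwinRecordSet
    xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <SimpleDarwinRecord>
    <dwc:scientificName>Centropyge flavicauda Fraser-Brunner 1933</dwc:scientificName>
    <dwc:family>Pomacanthidae</dwc:family>
    <dwc:genus>Centropyge</dwc:genus>
    <dwc:taxonRank>species</dwc:taxonRank>
  </SimpleDarwinRecord>
</SimpleDarwinRecordSet>"""

records = read_records(SAMPLE)
print(records[0]["scientificName"])
```

Dublin Core elements such as `dc:modified` are skipped here on purpose; they could be collected the same way by also accepting the `http://purl.org/dc/terms/` namespace.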

Biodiversity Data Representation: Data provisioning

[Figure labels: data providers (OBIS, FishBase, SeaLifeBase, GBIF, SpeciesLink, ITIS, …); RESTful Web Services; Web Interfaces; Client programs; Usage in other applications]
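As an illustration of a client program consuming a provider's RESTful Web Service, the sketch below builds an occurrence query and extracts coordinates from the JSON reply. It assumes GBIF's public occurrence search endpoint and the field names `results`, `decimalLatitude` and `decimalLongitude` as the author of this sketch recalls them; none of these appear in the slides, so check the provider's documentation before relying on them. The sample payload is hand-written with made-up coordinates.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the GBIF occurrence search service.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_query(scientific_name: str, limit: int = 20) -> str:
    """Compose the request URL for one species."""
    params = {"scientificName": scientific_name, "limit": limit}
    return GBIF_OCCURRENCE_SEARCH + "?" + urlencode(params)


def extract_points(payload: Dict) -> List[Tuple[float, float]]:
    """Keep (latitude, longitude) for records that carry both coordinates."""
    points = []
    for record in payload.get("results", []):
        lat = record.get("decimalLatitude")
        lon = record.get("decimalLongitude")
        if lat is not None and lon is not None:
            points.append((lat, lon))
    return points


def fetch_points(scientific_name: str, limit: int = 20) -> List[Tuple[float, float]]:
    """Call the service (needs network access) and return occurrence points."""
    with urlopen(build_query(scientific_name, limit), timeout=30) as response:
        return extract_points(json.load(response))


# Hand-written payload with made-up values, for offline illustration only.
SAMPLE_PAYLOAD = {
    "results": [
        {"scientificName": "Latimeria chalumnae",
         "decimalLatitude": -12.0, "decimalLongitude": 44.0},
        {"scientificName": "Latimeria chalumnae"},  # no coordinates
    ]
}

print(build_query("Latimeria chalumnae", limit=5))
print(extract_points(SAMPLE_PAYLOAD))
```

Keeping URL construction and response parsing in separate functions makes it easy to swap in another provider whose service uses different parameter or field names.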


Species Products Discovery

Species Products Discovery retrieves detailed information from several data providers.

We can visualize the occurrence points on a map and visually detect errors.

We can inspect the points' metadata.
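Visual inspection can be complemented by a few automatic sanity checks on the coordinates. The sketch below is not a feature of Species Products Discovery; it only illustrates two simple checks (coordinates outside the valid range, and points at exactly 0°, 0°, which often indicate missing values). The function name and the sample points are made up.

```python
from typing import List, Tuple


def flag_suspicious(points: List[Tuple[float, float]]) -> List[Tuple[int, str]]:
    """Return (index, reason) for occurrence points that deserve a closer look."""
    flagged = []
    for index, (lat, lon) in enumerate(points):
        if not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
            flagged.append((index, "coordinates out of range"))
        elif lat == 0 and lon == 0:
            flagged.append((index, "point at (0, 0)"))
    return flagged


# Made-up (latitude, longitude) pairs for illustration.
sample_points = [(-12.0, 44.0), (0.0, 0.0), (120.5, 30.0)]
print(flag_suspicious(sample_points))
```

Points flagged this way still need an expert's judgement; for a marine species, for example, a valid-looking coordinate on land is also an error, and detecting that requires a land/sea mask not covered here.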

Online example: the i-Marine Species Products Discovery

https://i-marine.d4science.org/group/biodiversitylab/species-data-discovery


Species View

Species View allows discovering species information from FishBase.

Images and GIS maps may also be attached to the species.

Online example: the i-Marine Species View

https://i-marine.d4science.org/group/biodiversitylab/species-visualisation