open research data: taxonomy

Post on 15-Apr-2017

Donat Agosti, Plazi, http://plazi.org

Linked Data Switzerland Workshop, 8th October, HES-SO, Sierre

Open Access to Scientific Data

Map slides (Google Maps, Search.ch) with the following annotations:

Everyday labels: food, coffee, sleep

Walking times: 4 min, 7 min, 8 min, 9 min

Coordinates: 46°16'57.80"N 7°32'23.11"E; 46°16'56.81"N 7°32'13.59"E; 46°17'3.98"N 7°32'13.00"E; 46°17'5.18"N 7°32'18.34"E

Distances: 203 m, 228 m, 253 m

Generic types: location, distance

Categories: host, welfare, gastronomy
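
The map slides move from place labels to coordinates and distances. As a rough illustration of how a distance follows from two such coordinate pairs, the sketch below converts the degree/minute/second notation to decimal degrees and applies the haversine formula. The function names, the mean Earth radius, and the choice of which two points to compare are assumptions made for illustration; the slides do not say how their distances were derived or which points they connect.

```python
import math
import re

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (an approximation)


def dms_to_decimal(dms: str) -> float:
    """Convert a string such as 46°16'57.80"N to signed decimal degrees."""
    match = re.fullmatch(r"""(\d+)°(\d+)'([\d.]+)"([NSEW])""", dms.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {dms!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in "SW" else value


def parse_point(text: str) -> tuple[float, float]:
    """Parse 'LAT LON' in DMS notation into (latitude, longitude)."""
    lat_text, lon_text = text.split()
    return dms_to_decimal(lat_text), dms_to_decimal(lon_text)


def haversine_m(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    d_phi = phi2 - phi1
    d_lambda = math.radians(b[1] - a[1])
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# Two of the coordinate pairs shown on the slides; pairing them is an assumption.
point_a = parse_point("46°16'57.80\"N 7°32'23.11\"E")
point_b = parse_point("46°16'56.81\"N 7°32'13.59\"E")
distance_ab = haversine_m(point_a, point_b)
print(f"{distance_ab:.0f} m")
```

Once a place is expressed as coordinates, "distance" becomes something a machine can compute rather than a label on a picture.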

Latin Names as a Gateway to Science

Treatment

XML

RDF

cites

same as

refers to

The «Mehlwurm» lives in dry bread... or the Potential of LOD

has traits

is part of

refers to

“The larva of the mealworm lives in dry bread and can be eaten in Switzerland”
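
The sentence above is assembled from separate linked statements. A minimal sketch of that idea follows, using plain Python tuples as subject/predicate/object triples and the link labels from the slides ("refers to", "has traits"). Every identifier in it (the `vernacular:`, `taxon:`, `source:` and `trait:` names, and the helper functions) is hypothetical and only illustrates the pattern; Tenebrio molitor, the scientific name of the mealworm beetle, does not appear in the slides and is added here for the example. This is not Plazi's actual data model.

```python
# Triples as (subject, predicate, object); all identifiers are illustrative.
TRIPLES = [
    ("vernacular:Mehlwurm", "refers to", "taxon:Tenebrio_molitor"),
    ("vernacular:mealworm", "refers to", "taxon:Tenebrio_molitor"),
    ("source:treatment", "refers to", "taxon:Tenebrio_molitor"),
    ("source:treatment", "has traits", "trait:larva_lives_in_dry_bread"),
    ("source:food_regulation", "refers to", "taxon:Tenebrio_molitor"),
    ("source:food_regulation", "has traits", "trait:edible_in_Switzerland"),
]


def objects(subject: str, predicate: str) -> set[str]:
    """All objects linked from `subject` via `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate: str, obj: str) -> set[str]:
    """All subjects linked to `obj` via `predicate`."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def traits_for_name(name: str) -> set[str]:
    """Follow name -> taxon -> sources about that taxon -> their traits."""
    traits: set[str] = set()
    for taxon in objects(name, "refers to"):
        for source in subjects("refers to", taxon):
            traits |= objects(source, "has traits")
    return traits


mehlwurm_traits = traits_for_name("vernacular:Mehlwurm")
print(sorted(mehlwurm_traits))
```

The point of the design is that neither source knows about the other: the two facts meet only because both refer to the same taxon name.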

The Scientific Challenge

  1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
 61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat
181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca
241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc
301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga
361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca
421 acatttattt

LOD

PDF

HNS

The Political Vision: Create the LOD-Cloud

The Plazi Vision: The Giant Global Biodiversity Graph

Plazi’s intended uses of LOD

• Public discovery, by people not otherwise connected to discovery services such as Plazi’s own, the GBIF data repository, …, of data about species extracted from the publications in which they are described
• Public facility for citation of Plazi’s data by arbitrary internet users
• Plazi creates a new dataset from literature/publications rather than republishing existing data sets

The Plazi Vision: The Giant Global Biodiversity Graph

Legal, Social, Technical, Ontologies, Infrastructure

500 M pages 5*

What does this mean?

The Linking Open Data cloud diagram

Linked Open Data Cloud

Text

<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus....</tax:p>
  </tax:div>
</tax:treatment>

Semantically enhanced text (TaxonX/TaxPub)

… alternatives: From human to machine readable text

RDF

Countries (Region): Australia (Queensland)

Export species materials citations (DwC)
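
As a rough idea of what a Darwin Core (DwC) export of materials citations could look like, the sketch below writes one record to CSV using the Darwin Core terms `scientificName`, `country`, and `stateProvince`. The input dictionary keys, the function name `to_dwc_csv`, and the placeholder taxon name "Aus bus" are hypothetical; only the country and region values come from the slide above. It is not a description of Plazi's actual export.

```python
import csv
import io

# Darwin Core terms used as column headers.
DWC_FIELDS = ["scientificName", "country", "stateProvince"]


def to_dwc_csv(citations: list[dict[str, str]]) -> str:
    """Serialise materials citations as CSV with Darwin Core column names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    for citation in citations:
        writer.writerow({
            "scientificName": citation["taxon"],
            "country": citation["country"],
            "stateProvince": citation["region"],
        })
    return buffer.getvalue()


# "Aus bus" is a placeholder name, not a taxon from the slides.
dwc_csv = to_dwc_csv([
    {"taxon": "Aus bus", "country": "Australia", "region": "Queensland"},
])
print(dwc_csv)
```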

Treatment Content Visualization

10.3897/BDJ.3.e5063

Treatment Graph for the Malagasy Ants Aphaenogaster

Graph nodes: Original description, Re-description

Graph links: cites, cites / synonymizes

Treatment Graph for the Ant Azteca alfari

https://github.com/plazi/TreatmentOntologies

Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology.

allenii

melanoceras

ruddiae

chiapensis

collinsii

cookii

cornigera

globulifera

hindsii

janzenii

mayana

sphaerocephala

boopis

flavicornis

hesperius

ita

janzeni

kuenckeli

mixtecus

nigrocinctus

nigropilosus

opaciceps

particeps

peperi

reconditus

satanicus

simulans

spinicola

subtilissimus

veneficus

ferrugineus

gentlei

gracilis

Transbiotic link network: associated species linked through references in taxonomic treatments

Acacia-ant species: Pseudomyrmex gracilis

Treatment: redescription

Associated ant-acacia: Acacia gentlei

Ants Plants

Photo credits: Alex Wild

Treatment

Treatments linked through citations

Treatment opportunities

Treatment

Linking the data to external references

5*2014

NCBI

Access to scientific literature: DOI via Zenodo/CERN

Open Access as Necessity

Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present.

Through antbase.org's digital library, this body of literature is accessible worldwide, and it is actively used (more than 10,000 visits in a single month).

Online catalogue, open access, online library, 2004

Conversion Workflows: Plazi

Plazi SRS

find → scan → «OCR» → markup → store + access

Swiss copyright-law exceptions that permit data extraction are an advantage for the sciences in Switzerland

Conversion Workflows: Plazi: Scientific Names

Conversion Workflows: Plazi: Tables

«Treatment», scientific species name, distribution record, bibliographic records, external links

ENVO? Names

Cataglyphis tartessica workers
Variable          mean ± SD
Head length       11.23 ± 0.12
Head width        11.15 ± 0.12
Scape length      11.47 ± 0.12
Mesosoma length   11.94 ± 0.16
Femur length      12.03 ± 0.14
Cephalic index    0 93.60 ± 3.940
Scape index       128.10 ± 7.660
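
A minimal sketch of how rows like these could be turned into structured values: one regular expression that splits a row into variable name, mean, and standard deviation. The function name `parse_measurement` and the pattern are assumptions for illustration, not Plazi's table-conversion code. The sample rows are copied as they appear in the transcript, so the digits themselves may carry extraction errors.

```python
import re

# variable name, then "mean ± SD"; the name is matched lazily so it stops
# at the last whitespace before the number that precedes "±".
ROW_PATTERN = re.compile(
    r"^(?P<variable>.+?)\s+(?P<mean>\d+(?:\.\d+)?)\s*±\s*(?P<sd>\d+(?:\.\d+)?)$"
)


def parse_measurement(row: str) -> tuple[str, float, float]:
    """Split 'Variable mean ± SD' into (variable, mean, sd)."""
    match = ROW_PATTERN.match(row.strip())
    if match is None:
        raise ValueError(f"not a 'mean ± SD' row: {row!r}")
    return (
        match.group("variable"),
        float(match.group("mean")),
        float(match.group("sd")),
    )


# Rows as they appear in the transcript (values not verified).
rows = ["Head length 11.23 ± 0.12", "Scape index 128.10 ± 7.660"]
parsed = [parse_measurement(row) for row in rows]
print(parsed)
```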

Conversion Workflows: Plazi: Bibliographic References

Conversion Workflows: Plazi: Geographic data

Conversion Workflows: Plazi: Pipelines

Status quo

• 50,000+ treatments live
• RDF in beta version
• GoldenGate Imagine (text mining tool) in beta version
• Data provider for NCBI, GBIF, EOL, antweb
• Biodiversity Literature Repository functional

Trait information machine ready

BioDiP, Resolution, Reconciliation

TreatmentBank

NAMES MANAGEMENT

CITATION MANAGEMENT

REFBANK

TREATMENT MANAGEMENT

ATOMIZATION & SEMANTICIZATION OF CONTENT

MARKUP / initial trait extraction

Specialist taxonomic databases

Next steps: HES-SO, HEG Geneva

Next steps: ContentMine

Planned collaboration with ContentMine to extract treatments on a daily basis

http://www.slideshare.net/petermurrayrust/?

BioDiv

article

treatment

cites (HTTP URI)

cites (DOI)

Scientific name

https://www.wikidata.org/wiki/Property:P1992

Feed Wikipedia with taxonomic data

Publications, one of our footprints

Next steps

• 1 million treatments live
• RDF version 1
• GoldenGate Imagine (text mining tool) version 1
• Data provider for NCBI, GBIF, EOL, antweb
• Biodiversity Literature Repository with 100,000 bibliographic references and digital versions

Thank you!

Donat Agosti, agosti@plazi.org
