NMBU, April 2015 GBIF data use Dag Endresen GBIF Norway UiO Natural History Museum in Oslo University of Oslo Wednesday, April 29 th , 2015 Slides: CC-BY-4.0


Page 1

NMBU, April 2015

GBIF data use

Dag Endresen
GBIF Norway
UiO Natural History Museum in Oslo
University of Oslo

Wednesday, April 29th, 2015

Slides: CC-BY-4.0

Page 2

Status 29th April 2015

GBIF enables free and open access to biodiversity data online. We are an international government-initiated and -funded initiative focused on making biodiversity data available to all and anyone, for scientific research, conservation and sustainable development.


Page 3

GBIF provides a data discovery system (global registry and data portal) that depends on resolvable, stable identifiers to function efficiently.


Page 4

GBIF and GEO (the intergovernmental Group on Earth Observations)

Data Integration & Interoperability: GBIF provides the infrastructure delivering species occurrence data in GEO.

GEO BON: Biodiversity Observation Network

Page 5

Page 6

Data distribution in GBIF

Last updated: 2014-07-09

Page 7

GBIF portal:

18.2 million occurrences with locations in Norway. Published from 31 countries worldwide.

Page 8

Status for Nordic GBIF nodes (data hosted by…), April 2015

[Map: Denmark, Finland, Norway, Sweden, Iceland]

Country   Datasets   Occurrences
Denmark   55         10 250 100
Finland   58         19 626 137
Iceland   4          458 705
Norway    93         17 425 011
Sweden    34         50 997 763

http://www.gbif.org/country/NO

Page 9

Download data

Page 10

GBIF DATA PORTAL

Page 11

SPECIES SEARCH

Page 12

Portal API

Page 13

GBIF DATA PORTAL API

An interface to access data published through the GBIF network using web services.

Page 14

PORTAL API

GBIF Data Portal API: http://api.gbif.org/v1/ (+parameters)

Summary and information: http://www.gbif.org/developer/summary

The RESTful API takes search parameters as key=value pairs and responds with JSON content.

RESTful query format, JSON response type
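As a minimal sketch of the key=value query format described above, the following Python builds a request URL and reads a paged JSON body. The helper name `build_url` and the sample response text are illustrative assumptions; only the base URL comes from the slide, and no network call is made.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://api.gbif.org/v1/"  # base URL from the slide


def build_url(path, **params):
    """Join the API base, a resource path and key=value search parameters."""
    query = urlencode(params)
    return BASE_URL + path.lstrip("/") + ("?" + query if query else "")


# Hypothetical, trimmed response body, used only to show the JSON handling.
sample_body = '{"offset": 0, "limit": 20, "endOfRecords": true, "count": 0, "results": []}'
page = json.loads(sample_body)

url = build_url("dataset/search", publishingCountry="NO")
print(url)
print(page["count"], page["endOfRecords"])
```

In practice the body would come from an HTTP GET on the printed URL rather than from a literal string.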

Page 15

GBIF API sections

• Registry: information about the datasets, organizations (e.g. data publishers), networks and the means to access them (technical endpoints)

• Species: information about species and higher taxa, and utility services for interpreting names and looking up the identifiers (access to all published checklists in the GBIF checklist bank)

• Occurrence: occurrence information crawled and indexed by GBIF, with search services for real-time paged search and asynchronous download services for large batch downloads

• Maps: simple services to show maps of GBIF-mobilized content

Page 16

API example : dataset

Search for datasets by publishing country: http://api.gbif.org/v1/dataset/search?publishingCountry=NO

Dataset information (UiO NHM Lichens): http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252

Contacts for a dataset: http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/contact

Dataset endpoint (get the download URL): http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/endpoint

http://www.gbif.org/developer/registry
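The three dataset URLs above differ only in their suffix, so they can be derived from the dataset key. This is a small sketch under that assumption; `dataset_urls` is a hypothetical helper name, and the key is the UiO NHM Lichens key from the slide.

```python
import uuid

REGISTRY_URL = "http://api.gbif.org/v1/dataset"  # from the slide


def dataset_urls(dataset_key):
    """Return the detail, contact and endpoint URLs for one dataset key."""
    key = str(uuid.UUID(dataset_key))  # raises ValueError if the key is not a UUID
    base = f"{REGISTRY_URL}/{key}"
    return {
        "detail": base,
        "contact": base + "/contact",
        "endpoint": base + "/endpoint",
    }


lichens = dataset_urls("7948250c-6958-4a29-a670-ed1015b26252")  # UiO NHM Lichens
for name, url in lichens.items():
    print(name, url)
```

Validating the key before building the URL catches copy-and-paste errors before any request is sent.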

Page 17

API example : species

List all name usages (across all checklists): http://api.gbif.org/v1/species?name=Beta%20vulgaris

Name usage across checklists (Beta vulgaris, 5383920): http://api.gbif.org/v1/species/5383920/related

Name parsed into epithets and author etc.: http://api.gbif.org/v1/parser/name?name=Abies%20alba%20Mill.%20sec.%20Markus%20D.

{
  "scientificName": "Abies alba Mill. sec. Markus D.",
  "type": "SCINAME",
  "genusOrAbove": "Abies",
  "specificEpithet": "alba",
  "authorsParsed": true,
  "authorship": "Mill.",
  "sensu": "sec. Markus D.",
  "canonicalName": "Abies alba",
  "canonicalNameWithMarker": "Abies alba",
  "canonicalNameComplete": "Abies alba Mill."
}

http://www.gbif.org/developer/species
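To show how the parsed name can be used, here is a short sketch that loads the response shown on the slide and reassembles the binomial from its parts. The body is copied from the slide; the live service may wrap it differently (for example in a list), so the shape is an assumption, and `binomial` is a hypothetical helper.

```python
import json

# Response body as shown on the slide for "Abies alba Mill. sec. Markus D."
body = """{
  "scientificName": "Abies alba Mill. sec. Markus D.",
  "type": "SCINAME",
  "genusOrAbove": "Abies",
  "specificEpithet": "alba",
  "authorsParsed": true,
  "authorship": "Mill.",
  "sensu": "sec. Markus D.",
  "canonicalName": "Abies alba",
  "canonicalNameWithMarker": "Abies alba",
  "canonicalNameComplete": "Abies alba Mill."
}"""

parsed = json.loads(body)


def binomial(name):
    """Rebuild 'Genus epithet' from the parsed name parts."""
    return f'{name["genusOrAbove"]} {name["specificEpithet"]}'


print(binomial(parsed))
print(parsed["authorship"], "|", parsed["sensu"])
```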

Page 18

API example : occurrence

List occurrences of Beta vulgaris (look up the taxonKey first):
http://api.gbif.org/v1/species/match?name=Beta+vulgaris => taxonKey
http://api.gbif.org/v1/occurrence/search?taxonKey=5383920

List occurrences published from Norway (all taxa, then Beta vulgaris only):
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO&taxonKey=5383920

Information about a single occurrence record (interpreted record, raw fragment, verbatim record):
http://api.gbif.org/v1/occurrence/1040970640
http://api.gbif.org/v1/occurrence/1040970640/fragment
http://api.gbif.org/v1/occurrence/1040970640/verbatim

List occurrence counts for datasets of country (or taxon):
http://api.gbif.org/v1/occurrence/counts/datasets?country=NO

http://www.gbif.org/developer/occurrence
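The two-step pattern above (match a name to a key, then search by that key) can be sketched as follows. The function names and the trimmed `match` dictionary are hypothetical; the `usageKey` value 5383920 is the one shown on the slides. No request is sent.

```python
from urllib.parse import urlencode

API = "http://api.gbif.org/v1"


def match_url(name):
    """URL for step 1: match a scientific name against the GBIF backbone."""
    return f"{API}/species/match?{urlencode({'name': name})}"


def occurrence_search_url(match, publishing_country=None):
    """URL for step 2, built from a parsed species/match response."""
    params = {"taxonKey": match["usageKey"]}
    if publishing_country:
        params["publishingCountry"] = publishing_country
    return f"{API}/occurrence/search?{urlencode(params)}"


# Hypothetical, trimmed match response for Beta vulgaris.
match = {"usageKey": 5383920, "scientificName": "Beta vulgaris L."}

print(match_url("Beta vulgaris"))
print(occurrence_search_url(match))
print(occurrence_search_url(match, publishing_country="NO"))
```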

Page 19

API example : download data

Look up speciesKey (1) and download occurrences (2):

http://api.gbif.org/v1/species/match?verbose=false&kingdom=Plantae&name=Beta+vulgaris

=> usageKey/speciesKey = 5383920

http://api.gbif.org/v1/occurrence/search?taxonKey=5383920 [&limit=1000&offset=0]

=> notice: count = 25 513
=> then: page through results… (using offset & limit)

http://api.gbif.org/v1/occurrence/download/request [POST] => downloadKey (see next slide)
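Paging with offset and limit comes down to simple arithmetic. The sketch below uses the count and limit from the slide; both function names and the `first_page` dictionary are illustrative, and the paging fields (`offset`, `limit`, `endOfRecords`) are assumed to be present in each response.

```python
def page_offsets(count, limit):
    """Offsets of every page needed to cover `count` records at `limit` per page."""
    return list(range(0, count, limit))


def next_offset(page):
    """Offset of the following page, or None once the service reports the end."""
    if page["endOfRecords"]:
        return None
    return page["offset"] + page["limit"]


offsets = page_offsets(25513, 1000)  # count and limit as shown on the slide
print(len(offsets), offsets[0], offsets[-1])

# Hypothetical first-page header, trimmed to the paging fields.
first_page = {"offset": 0, "limit": 1000, "endOfRecords": False, "count": 25513}
print(next_offset(first_page))
```

Reading `limit` back from each response, as `next_offset` does, keeps the loop correct if the service applies a smaller page size than the one requested.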

Page 20

API example : asynchronous (1)

Request asynchronous download:

$ curl -i --user yourGbifUserName:yourGbifPassword -H "Content-Type: application/json" -H "Accept: application/json" -X POST -d @filter.json http://api.gbif.org/v1/occurrence/download/request >> log.txt

Search parameters in a JSON text file, filter.json (curl reads it from the current directory unless a path is given after the @):

{
  "creator": "yourGbifUserName",
  "notification_address": ["[email protected]"],
  "predicate": {
    "type": "and",
    "predicates": [
      {"type": "equals", "key": "HAS_COORDINATE", "value": "false"},
      {"type": "equals", "key": "TAXON_KEY", "value": "5383920"}
    ]
  }
}
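Hand-typed JSON is easy to break (curly quotes copied from a slide will make it invalid), so one option is to generate the request body instead. This is a sketch only: `download_request` is a hypothetical helper, `you@example.org` is a placeholder address, and the predicate mirrors the structure shown on the slide.

```python
import json


def download_request(creator, email, taxon_key, has_coordinate):
    """Build the download request body with the same structure as filter.json."""
    return {
        "creator": creator,
        "notification_address": [email],
        "predicate": {
            "type": "and",
            "predicates": [
                {"type": "equals", "key": "HAS_COORDINATE",
                 "value": str(has_coordinate).lower()},
                {"type": "equals", "key": "TAXON_KEY", "value": str(taxon_key)},
            ],
        },
    }


body = download_request("yourGbifUserName", "you@example.org", 5383920, False)
text = json.dumps(body, indent=2)  # save this text as filter.json
print(text)
```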

Page 21

Downloads are available in the portal (from your user profile)

Page 22

API example : asynchronous (2a)

Request asynchronous download:

# $1 = taxon key, $2 = species name (only written to the log)
function gbifapi {
  curl -i --user yourGbifUserName:yourGbifPassword \
    -H "Content-Type: application/json" -H "Accept: application/json" \
    -X POST \
    -d "{\"creator\":\"yourGbifUserName\", \"notification_address\": [\"[email protected]\"], \"predicate\": {\"type\":\"and\", \"predicates\": [{\"type\":\"equals\",\"key\":\"HAS_COORDINATE\",\"value\":\"true\"}, {\"type\":\"equals\", \"key\":\"TAXON_KEY\", \"value\":\"$1\"}] }}" \
    http://api.gbif.org/v1/occurrence/download/request >> log.txt
  echo -e "\r\n$1 $2\r\n\r\n----------------\r\n\r\n" >> log.txt
}

$ gbifapi 4140730 "Aciachne acicularis"
$ gbifapi 4140704 "Aciachne flagellifera"
$ gbifapi 5289784 "Aegilops comosa"
…

Page 23

API example : asynchronous (2b)

(…clean log.txt with the downloadKeys using regular expressions…)

# $1 = downloadKey, $2 = taxon key (used as file name), $3 = species name (only logged)
function gbifwget {
  mkdir -p ./dwca  # make sure the target directory exists
  echo -e "\n\n----------------\n$1 $2 $3\n" >> log_wget.txt
  wget "http://api.gbif.org/v1/occurrence/download/request/$1.zip" 2>&1 | tee /dev/tty >> log_wget.txt
  mv "$1.zip" "./dwca/$2.zip" 2>&1 | tee /dev/tty >> log_wget.txt
}

$ gbifwget 0006050-141024112412452 4140730 "Aciachne acicularis"
$ gbifwget 0006053-141024112412452 4140704 "Aciachne flagellifera"
$ gbifwget 0006056-141024112412452 5289784 "Aegilops comosa"
…

(work in progress…)
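The step of cleaning log.txt with regular expressions is not spelled out on the slide. One possible approach, sketched here, is to pick out anything shaped like the download keys shown above (seven digits, a hyphen, fifteen digits). The pattern is inferred from those three examples only, and the sample log text is hypothetical.

```python
import re

# Pattern inferred from the keys on the slide, e.g. 0006050-141024112412452.
DOWNLOAD_KEY = re.compile(r"\b\d{7}-\d{15}\b")

DOWNLOAD_URL = "http://api.gbif.org/v1/occurrence/download/request/{key}.zip"


def download_urls(log_text):
    """Find every download key in the log and return the matching zip URLs."""
    return [DOWNLOAD_URL.format(key=key) for key in DOWNLOAD_KEY.findall(log_text)]


# Hypothetical excerpt of log.txt: response headers, the key, then the echo line.
sample_log = """HTTP/1.1 201 Created
Content-Type: application/json

0006050-141024112412452
4140730 Aciachne acicularis

----------------
"""

for url in download_urls(sample_log):
    print(url)
```

The taxon key on the echo line is not matched because it is not followed by a hyphen and fifteen digits.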

Page 24

MAPPING API v1.0

You can easily overlay GBIF content on your own maps. http://www.gbif.org/developer/maps

Slide by Daniel Amariles, 2013

Page 25

MAPPING API v1.0

This service is intended for use with commonly used map clients such as the Google Maps API, the Leaflet JS library or the Modest Maps JS library.

These libraries allow the GBIF layers to be visualized with other content, such as layers from Web Map Service (WMS) providers. Note that the mapping API is not a WMS service, nor does it support WFS capabilities.

http://leafletjs.com/
http://modestmaps.com/

Slide by Daniel Amariles, 2013

Page 26

Useful Tools (JSON & REST)

• REST client …
• JSON client/parser …
• JSONView (Firefox, Chrome, …)
  • http://jsonview.com/
  • Display formatted JSON in browser
• R CRAN : jsonlite
  • http://cran.r-project.org/web/packages/jsonlite/
  • E.g. read JSON into a data frame
• OpenRefine
  • http://openrefine.org/

Page 27

Page 28

R CRAN

rOpenSci provides programmatic access to scientific data with R (rgbif, taxize, EML, geonames, …).

https://github.com/ropensci
http://ropensci.org/packages/
http://ropensci.org/tutorials/rgbif_tutorial.html
http://ropensci.org/tutorials/taxize_tutorial.html

Page 29

rOpenSci : rgbif

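library(rgbif)
# look up the GBIF backbone key for the species
key <- name_backbone(name='Beta vulgaris', kingdom='Plantae')$speciesKey
# fetch up to 1000 georeferenced occurrence records
bv <- occ_search(taxonKey=key, return='data', hasCoordinate=TRUE, limit=1000)
# plot the records on a world map
gbifmap(bv)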

Page 30

raster : WorldClim, BioClim layers

# using GBIF data (bv) from the previous slide…
library(raster)
xy <- cbind('lon'=bv$decimalLongitude, 'lat'=bv$decimalLatitude)
env <- getData('worldclim', var='bio', res=10) # bioclim (pkg raster)
plot(env, 1) # plot the first bioclim layer
points(xy[,'lon'], xy[,'lat'], col='red') # plot points
bio <- extract(env, xy) # extract environment to points (pkg raster)
bv_bio <- cbind(bv, bio) # column-bind GBIF-data and bioclim

Page 31

rOpenSci : rWBclimate

library(rWBclimate)
library(ggplot2) # library() loads one package per call
country_dat <- get_historical_temp(c("NOR", "SWE", "DNK", "FIN"), "year")
ggplot(country_dat, aes(x = year, y = data, group = locator)) +
  theme_bw(base_size = 18) +
  geom_point() +
  geom_path() +
  labs(y = "Average annual temperature of Nordic countries", x = "Year") +
  stat_smooth(se = FALSE, colour = "black") +
  facet_wrap(~locator, scales = "free")

Page 32

Resolve taxonomic names

library(taxize) # rOpenSci Taxize
gnr <- gnr_resolve(names = "Beta vuulgariss") # Misspelled name
gnr$results # display suggested names

  submitted_name   matched_name                         data_source_title         score
1 Beta vuulgariss  Beta vulgaris L.                     Catalogue of Life         0.75
2 Beta vuulgariss  Beta vulgaris L.                     ITIS                      0.75
3 Beta vuulgariss  Beta vulgaris                        NCBI                      0.75
4 Beta vuulgariss  Beta vulgaris var.-gr. crassa Alef.  GRIN Taxonomy for Plants  0.75

specieslist <- c("Beta vulgaris", "Phleum pratense", "Nicotiana glauca")
classification(specieslist, db = 'itis') # lookup higher taxonomy (db = 'col' or db = 'itis')

Global Names Resolver: http://resolver.globalnames.org/
rOpenSci Taxize: http://ropensci.org/tutorials/taxize_tutorial.html

Page 33

rOpenSci : EML

library(EML)
library(rfigshare) # library() loads one package per call

description <- "My dataset published in GBIF"

eml_write(dat = dat, meta, title = "My Dataset", description = description, creator = "Your Name <[email protected]>", file = "dataset.xml")

eml_publish("dataset.xml", description = description, categories = "Ecology", tags = "biodiversity", destination = "figshare", visibility = "public")

meta <- eml_read("eml_example.xml")

Page 34

GBIF API support

Subscribe to the mailing-list for help and information messages:

[email protected]

Page 35

Data use in research

Page 36

Use citations, by country of authors (15 JAN 2015)

Total 2014

Rank  Country         Publications
1     United States   114
2     Spain           41
3     United Kingdom  40
4     Germany         36
5     Australia       32
6     Italy           22
7     Mexico          20
8     Brazil          19
9     France          18
10    South Africa    17

Number of research publications from January to December 2014 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 10 countries shown.

Relationship line represents collaboration between authors affiliated in different countries.

December 2014

Rank  Country         Publications
1     United States   22
2     United Kingdom  9
3     Spain           8
4     Germany         5
4     Italy           5
4     South Africa    5
7     Switzerland     4
7     China           4
7     Mexico          4

Number of research publications in December 2014 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 9 countries shown.

Page 37

Citations in peer-reviewed research (2008-2014)

03 MAR 2015

Annual number of peer-reviewed publications using GBIF-mediated data

Page 38

Darwin Core data standard

Page 39

Darwin Core – a vocabulary of terms

Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, De Giovanni R, Robertson T, and Vieglais D (2012) Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715. (doi:10.1371/journal.pone.0029715)

Page 40

Unifying species data

Integrated access for records of the occurrence of any species:

• What?
• When?
• Where?
• What evidence?
• Data owner?
• Link to full record

Presence only data

Collections

Genetics Ecological Monitoring

Darwin Core

2015: Survey data compatible with existing Darwin Core data, plus:

• Which species were recorded together?

• Which sets of data are directly comparable?

• Which species were most abundant in each sample?

Presence/absence

Darwin Core Occurrences
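To make the questions on this slide concrete, the sketch below shows one occurrence record keyed by Darwin Core term names, plus a check for which questions a record leaves unanswered. The values are invented for illustration, and the mapping from each question to a term is an interpretation rather than part of the standard.

```python
# Illustrative record; the keys are Darwin Core term names, the values are made up.
record = {
    "occurrenceID": "urn:example:occurrence:1",  # hypothetical identifier
    "scientificName": "Beta vulgaris L.",
    "eventDate": "2014-07-09",
    "decimalLatitude": 59.92,
    "decimalLongitude": 10.77,
    "basisOfRecord": "PreservedSpecimen",
    "institutionCode": "EXAMPLE",
}

# Assumed mapping from the slide's questions to Darwin Core terms.
QUESTIONS = {
    "What?": ["scientificName"],
    "When?": ["eventDate"],
    "Where?": ["decimalLatitude", "decimalLongitude"],
    "What evidence?": ["basisOfRecord"],
    "Data owner?": ["institutionCode"],
    "Link to full record": ["occurrenceID"],
}


def unanswered(rec):
    """Return the questions for which at least one term is missing or empty."""
    return [
        question
        for question, terms in QUESTIONS.items()
        if any(rec.get(term) in (None, "") for term in terms)
    ]


print(unanswered(record))
print(unanswered({"scientificName": "Beta vulgaris L."}))
```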

Page 41

Publish your own data!

Page 42

Publish and archive your own species occurrence data

• You can easily publish your species occurrence data by sending an email to [email protected]

• The GBIF Norway helpdesk will assist with data publishing (to GBIF and Artskart)!

• You can also install data publishing software such as the GBIF Integrated Publishing Toolkit (IPT).

• Citizen Science portals such as Artsobservasjoner, iNaturalist, Anymals + Plants, …

• We recommend always using a long-term data archiving platform such as B2SHARE (EUDAT) or NorStore (Norwegian research data, EUDAT).

Page 43

http://artsobservasjoner.no/

Page 45

Android & iPhone

Page 46

Node team at UiO NHM:
Dag Endresen, Node Manager
Christian Svindseth, Database Manager
Fridtjof Mehlum, Research Director
Einar Timdal, Associate Professor
Geir Søli, Associate Professor

Artsdatabanken Trondheim:
Wouter Koch, Advisor
Nils Valland, Senior advisor

The Research Council of Norway:
Per Backe-Hansen, Head of delegation
