Post on 01-Apr-2015

Page 1: Scratchpads Virtual Research Environments for taxonomic and biodiversity related data Dr Dimitrios Koureas Department of Life Sciences | Biodiversity Informatics

Scratchpads

Virtual Research Environments for taxonomic and biodiversity related data

Dr Dimitrios Koureas

Department of Life Sciences | Biodiversity Informatics Group

The Natural History Museum London

Page 2

Where to find and how to cite this presentation

Scratchpads introductory presentation. Dimitrios Koureas, Laurence Livermore. figshare. 2013. doi:10.6084/m9.figshare.640101

Page 3

Current taxonomic data production

Publications based on countless specimens, images, maps, keys and datasets

Typically generated by small communities for “local” research projects

Figure from Costello M.J. et al., 2013. doi:10.1126/science.1230318

Page 4

On the other hand:

Estimates of 7.5 million species still undescribed [1]

[1] Mora C. et al. How Many Species Are There on Earth and in the Ocean? doi:10.1371/journal.pbio.1001127

Page 5

Expected volume of taxonomic and biodiversity data

Need for extracting, aggregating and linking data on a global level

Page 6

The four nodes of data cycle

1. We collect and generate data

2. We curate, link and structure data

3. We analyse data

4. We publish data

Page 7

The four nodes of data cycle

Data collection & generation

Data curation

Data analysis

Data publishing

What are the bottlenecks in the workflow?

Page 8

What we need is… a seamless workflow

Data collection & generation

Data curation

Data analysis

Data publishing

Page 9

This requires data, information & knowledge to be…

• Digital: not printed paper

• Openly accessible: not behind barriers (e.g. paywalls)

• Linked-up: not in silos

“Link together evolutionary data… by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses”

Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi:10.1016/j.tree.2011.11.001

To achieve this…

Page 10

Scratchpads

Virtual Research Environments

Making taxonomy digital, open & linked

Page 11

so… what are the Scratchpads?

Page 12

What are Scratchpads?

Hosted websites for biodiversity data

Virtual research & publication platform

Completely open access & open source

Modular & flexible

Page 13

What are Scratchpads?

Scratchpads facilitate the development of online research communities through a standardized environment for entering and curating data that allows sharing and interlinking, and dissemination of research products.

Page 14

The Scratchpads concept

A Scratchpad is a website that holds data for you and your community

Your data

External data & services

Page 15

The Scratchpads concept

Page 16

Examples of use:

Taxa (classifications, taxon profiles, specimens, literature, images, maps, phenotypic, genotypic & morphometric datasets, keys, phylogenies)

Projects

Conservation

Regions

Societies

Page 17

Examples of use:

Red List conservation assessments

Page 18

Examples of use:

Bulbous monocot genera listed in CITES

Page 19

Examples of use:

Global Invasive Alien Species Information Partnership

Page 20

Examples of use:

Belgian Network for DNA Barcoding

Page 21

Major integrated projects

• Online resource for monocot plants

• Collaboration between Kew, Oxford University and NHM

• Data to be open and usable by other scientists

Page 22

Major integrated projects

• 21+ open community sites and growing

• Over 45 internationally collaborating scientists

• Site data feeds into a “Portal”

Site List: http://about.e-monocot.org/list-emonocot-scratchpads

Page 23

Major integrated projects

• Retrieve information on any Monocot plant

• Rich downloadable data

• Identification keys

• Model example of linked attributed data

eMonocot Portal: http://e-monocot.org/

Page 24

Are Scratchpads sustainable?

512 Scratchpads communities by 6,500 active registered users covering 73,444 taxa in 515,189 pages.

65,000 unique visitors per month to Scratchpads sites

In total more than 1,300,000 visitors

Page 26

Are Scratchpads sustainable?

Marker Portal: a project in the making

Unified, comprehensive access to public marker data across the tree of life

Mine genome and other submitted data for MLST targets in addition to the data submitted explicitly as MLST

Support for bioprospecting and biomonitoring

Page 27

The main features

Page 28

The main features

Classification: term-oriented system

Biological classifications

Non-biological classifications

Taxonomies

Hierarchical controlled vocabularies
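
As a rough illustration of what a hierarchical controlled vocabulary involves, the sketch below stores classification terms with parent links and walks from a term up to the root. The `Term` class, the `lineage` function and the three example terms are assumptions made for illustration; they are not taken from the Scratchpads data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Term:
    """One term in a hierarchical vocabulary (hypothetical structure)."""
    name: str
    rank: str
    parent: Optional[str] = None  # name of the parent term, None for the root


def lineage(vocabulary: Dict[str, Term], name: str) -> List[str]:
    """Return the path from the root term down to `name`."""
    path: List[str] = []
    current: Optional[str] = name
    while current is not None:
        path.append(current)
        current = vocabulary[current].parent
    return list(reversed(path))


# Illustrative terms only; a real classification would be imported or entered.
vocabulary = {
    term.name: term
    for term in [
        Term("Liliopsida", "class"),
        Term("Asparagales", "order", parent="Liliopsida"),
        Term("Orchidaceae", "family", parent="Asparagales"),
    ]
}

print(" > ".join(lineage(vocabulary, "Orchidaceae")))
```

The same structure works for non-biological vocabularies (regions, projects), since nothing in it depends on taxonomic ranks.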

Page 29

The main features

Dynamic Biological Classifications

Manually entered or imported

Auto generated

Page 30

The main features

Taxon pages

Overview of data related to taxon

Generated from tagged content
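
The idea of a taxon page generated from tagged content can be sketched as a simple grouping step: every content item carries taxon tags, and a page is the collection of items sharing a tag. The tuple layout, the function name and the sample items below are hypothetical and only illustrate the principle.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (content type, title, taxon tags); the layout is an assumption for this sketch.
ContentItem = Tuple[str, str, List[str]]


def build_taxon_pages(items: Iterable[ContentItem]) -> Dict[str, Dict[str, List[str]]]:
    """Group content titles by taxon tag, then by content type."""
    pages: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for content_type, title, taxa in items:
        for taxon in taxa:
            pages[taxon][content_type].append(title)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {taxon: dict(sections) for taxon, sections in pages.items()}


# Made-up content items tagged with placeholder taxon names.
items = [
    ("image", "Flower close-up", ["Taxon A"]),
    ("literature", "Revision of the genus", ["Taxon A", "Taxon B"]),
    ("specimen", "Herbarium sheet 1", ["Taxon B"]),
]

pages = build_taxon_pages(items)
print(pages["Taxon A"])
```

An item tagged with several taxa appears on each of their pages, which is why tagging rather than filing into folders is the organising principle here.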

Page 31

The main features

Bibliography management

An inbuilt bibliography manager

Faceted browsing

Taxon tagging and free keywords

Import from and export to all major formats

Page 32

The main features

Specimen/Observation data

Annotated full specimen/observation records

Linked to images and georeferenced

Linked to GenBank accession numbers

Page 33

Page 34

The main features

Distribution maps

Google Maps based

Data layers

Occurrence data

Distribution data (TDWG regions)

GBIF data

Page 35

The main features

Example regional distribution

Page 36

Create phylogenetic trees

Based on Newick/NeXML

Different views
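
For readers unfamiliar with Newick, the sketch below parses a simple Newick string into nested dictionaries and lists its leaves. It is a minimal illustration only: it handles node names and branch lengths but not quoted labels or comments, and the function names and the example tree are assumptions, not part of Scratchpads.

```python
from typing import Dict, List, Tuple


def parse_newick(text: str) -> Dict:
    """Parse a simple Newick string into nested dicts (name, length, children)."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    node, pos = _parse_node(text, 0)
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return node


def _parse_node(text: str, pos: int) -> Tuple[Dict, int]:
    """Parse one node starting at `pos`; return the node and the next position."""
    children: List[Dict] = []
    if text[pos] == "(":
        pos += 1  # skip '('
        while True:
            child, pos = _parse_node(text, pos)
            children.append(child)
            if text[pos] == ",":
                pos += 1
            elif text[pos] == ")":
                pos += 1
                break
            else:
                raise ValueError(f"expected ',' or ')' at position {pos}")
    # Read the optional label and branch length up to the next delimiter.
    start = pos
    while text[pos] not in ",();":
        pos += 1
    name, _, length = text[start:pos].partition(":")
    node = {
        "name": name.strip(),
        "length": float(length) if length else None,
        "children": children,
    }
    return node, pos


def leaf_names(node: Dict) -> List[str]:
    """Collect leaf names in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("((A:0.1,B:0.2)AB:0.3,C:0.4);")
print(leaf_names(tree))
```

A nested structure like this is enough to drive different tree views, since each view is just a different traversal or layout of the same nodes.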

Page 37

The main features

Character matrices – Key construction

Quantitative or qualitative characters

Taxon based matrices [Specimen based character matrices]

Auto generation of keys
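
One way to picture automatic key generation from a taxon-by-character matrix is a recursive split: pick the character that divides the remaining taxa most evenly, then repeat within each group. The sketch below does this for qualitative characters only. The splitting rule, the function name and the example matrix are assumptions chosen for illustration; the slides do not say how Scratchpads builds its keys.

```python
from typing import Dict, List, Optional

Matrix = Dict[str, Dict[str, str]]  # taxon -> character -> state


def build_key(matrix: Matrix, taxa: List[str], indent: int = 0) -> List[str]:
    """Return indented key lines that separate `taxa` by qualitative characters."""
    pad = "  " * indent
    if len(taxa) == 1:
        return [f"{pad}-> {taxa[0]}"]
    characters = sorted({c for taxon in taxa for c in matrix[taxon]})
    best: Optional[str] = None
    best_groups: Dict[str, List[str]] = {}
    for character in characters:
        groups: Dict[str, List[str]] = {}
        for taxon in taxa:
            # A missing score is treated as its own state (a simplification).
            state = matrix[taxon].get(character, "unknown")
            groups.setdefault(state, []).append(taxon)
        if len(groups) < 2:
            continue  # this character does not separate anything
        # Prefer the character whose largest group is smallest (most even split).
        largest = max(len(group) for group in groups.values())
        if best is None or largest < max(len(g) for g in best_groups.values()):
            best, best_groups = character, groups
    if best is None:
        return [f"{pad}-> {' / '.join(taxa)} (not separable)"]
    lines: List[str] = []
    for state in sorted(best_groups):
        lines.append(f"{pad}{best}: {state}")
        lines.extend(build_key(matrix, best_groups[state], indent + 1))
    return lines


# Made-up matrix with placeholder taxa and characters.
matrix = {
    "Taxon A": {"leaf shape": "linear", "flower colour": "white"},
    "Taxon B": {"leaf shape": "linear", "flower colour": "yellow"},
    "Taxon C": {"leaf shape": "ovate", "flower colour": "white"},
}

key = build_key(matrix, sorted(matrix))
print("\n".join(key))
```

Quantitative characters would first need to be binned into ranges before a rule like this could use them.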

Page 38

The main features

Media handling

Bulk upload

Metadata (EXIF & Audubon Core)

Media galleries

Page 39

The main features

Generation of custom pages

Tagged or not

External RSS

Twitter feeds

Media files

Page 40

The main features

Enhanced communication tools

Working groups

Forums

Blog entries

Webforms

Newsletters

RSS syndication

Inbuilt comments

Page 41

The main features

Analytical tools

OBOE service

i.a. ecological informatics, phylogenetics, sequence alignment

Page 42

MCMC methods to estimate the posterior distribution of model parameters

Phylogenies

Sequence alignment

Multiple sequence alignment

Microsatellite repeats finder
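
To give a sense of what a microsatellite repeats finder does, the sketch below scans a DNA sequence for short motifs repeated in tandem using a regular expression. The function name, the default thresholds and the test sequence are assumptions for illustration; this is not the tool offered through the service listed above.

```python
import re
from typing import List, Tuple


def find_microsatellites(
    sequence: str, min_unit: int = 2, max_unit: int = 6, min_repeats: int = 4
) -> List[Tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for tandem repeats of short motifs."""
    # The lazy quantifier makes the shortest repeating unit win, e.g. "CA"
    # rather than "CACA"; the backreference requires further exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits: List[Tuple[int, str, int]] = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Synthetic sequence: a (CA)5 repeat and an (AGAT)4 repeat with short flanks.
sequence = "TTG" + "CA" * 5 + "GGC" + "AGAT" * 4 + "C"

for start, unit, count in find_microsatellites(sequence):
    print(f"position {start}: ({unit}){count}")
```

Only perfect repeats are reported; real finders usually also tolerate interrupted or imperfect repeats, which a plain backreference cannot express.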

Page 43

External services integration

Data mobilisation

more on the way…

Page 44

IUCN data integration

Page 45

GBIF data integration

Page 46

Page 47

Help & Support

• In-site Support

• Wiki

• Training Courses (12 in 2012)

• Ambassadors Programme

• Embedded Issues Queue

• Sandbox Site

http://help.scratchpads.eu

Page 48

Scratchpads are an integrated system to

Enter, Curate, Mark-up, Link and Publish data

Taxonomic workflow in a single virtual environment

Page 49

Acknowledgements

Scratchpads technical development: Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton, Katherine Boutton, Khalid Almaini

Scratchpads outreach: Laurence Livermore, Isa van de Velde & Dimitris Koureas

e-Monocot: Paul Wilkin & the Kew team, Charles Godfray & the Oxford team

ViBRANT: Vince Smith, Dave Roberts & Lucy Reeve

Pensoft: Lyubomir Penev and the Pensoft team

Our 7000 users

Page 50

Thank you

Data collection & generation

Data curation

Data analysis

Data publishing

Page 51

Page 52

Authors and Contributors

Manuscript ready to submit