BIEN Confederated DB(s), Analytical DB(s), Heterogeneous source database(s) of...


Upload: richard-jackson

Post on 02-Jan-2016


Page 1: BIEN Confederated DB (S) Analytical DB(s) Heterogeneous source database(s) of Plots/Specimens/Occurrences Synonymy Names Reference taxonomy *** *** Feedback

BIEN Confederated DB(s)

Analytical DB(s)

Heterogeneous source database(s) of Plots/Specimens/Occurrences

Synonymy

Names

Reference taxonomy

***

*** Feedback to original data providers. Filtered push/BiSciCol?-- annotation and feedback, QC

Provider feedback

Errors, annotation, etc.

Data-source Steward

Modified by NS/SD from MS/BB

Later changes to taxon names applied to BIEN

Staging DB

2.0 BIEN harvester and loader

4.0 Transform, Audit change, Integrate, Rule set, Rule framework

5.0 Populating Analytical DB

6.0 User/web interface API

VegX

DwC

1.0 Data mapping and provider services

3.0 Internal validation, Taxon scrubbing, Geospatial validation

Native BIEN Traits

Mediated through TNRS & Geo scrubbing

Other data associated with taxon

Integrate geospatial & environmental data associated with location

Geospatial data linking

There may be linkages to other data BIEN wants to integrate into core data.

7.0 Reapplication of updated Names

Phylogenies

Sequences

Traits

External data sources

How are linkages to external resources made? If it is through Names, is a TNRS mediation step then required against the external resources?
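One way to picture the name-mediated option raised in this question is the sketch below. It is only an illustration: the `SYNONYMS` table, `resolve`, and `link_by_name` are hypothetical stand-ins, and the lookup is a plain dictionary, not the actual TNRS service. The point is that names on both the BIEN side and the external side pass through the same resolution step before they are matched.

```python
# Hypothetical synonymy table standing in for a name resolution service.
SYNONYMS = {"Acer rubrum var. trilobum": "Acer rubrum"}


def resolve(name, synonyms):
    """Normalise whitespace, then map a synonym to its accepted name."""
    cleaned = " ".join(name.split())
    return synonyms.get(cleaned, cleaned)


def link_by_name(core_names, external, synonyms):
    """Attach external values to core names via their resolved names.

    core_names: names as stored on the BIEN side.
    external:   mapping of external-source name -> associated value.
    """
    resolved_external = {}
    for name, value in external.items():
        # Resolve the external name with the same rules as the core names.
        resolved_external.setdefault(resolve(name, synonyms), []).append(value)
    return {
        name: resolved_external.get(resolve(name, synonyms), [])
        for name in core_names
    }
```

Without the shared resolution step, "Acer rubrum var. trilobum" in an external source would never match "Acer rubrum" in the core data, which is the concern the question points at.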

Provider BIEN

Full BIEN Workflow

There may be trait data BIEN wants to integrate into core data

Page 2

Provider BIEN

‘Manual’ schema-mapping tools

1.0 Data mapping and provider services

2.0 BIEN Harvest process

1.1 Data mapping tool

Harvesting protocol: OAI-PMH
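As a rough illustration of harvesting over OAI-PMH, the sketch below builds a `ListRecords` request and reads the resumption token from a response. `ListRecords`, `metadataPrefix`, and `resumptionToken` come from the OAI-PMH 2.0 protocol; the base URL and the `vegx` metadata prefix used in the usage note are assumptions, as the slides do not say what a provider endpoint would expose.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token of a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()
```

A harvester would call `list_records_url("http://provider.example/oai", metadata_prefix="vegx")` once, then keep requesting with the token from `next_token` until it returns `None`.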

1.3 Provider data web service

Provider Cache

VegX

Veg Obs multi-dataset DB

Single datasets

Provider Data

Would follow the TDWG specifications and architecture for harvesting DwC.

This workflow represents a provider using a manual process to map and transfer data to BIEN.

VegX

1.2 VegX programmatic mapping

1.4 DwC programmatic mapping

DwC

VegX

Specimen multi-dataset DB

1.5 TAPIR/IPT Provider web service

Provider Cache

DwC/A

VegX

TAPIR or IPT protocol

DwC

This workflow represents a provider using an automated harvesting process to map and transfer data to BIEN.

Transfer may occur via a website or FTP process?

VegX

Notification services?

Page 3

Temporary XML file store

2.0 BIEN harvester and loader

2.1 BIEN harvester: VegX & DwC

2.3 Import into staging DB

Parse XML to RDBMS

Assign IDs, gather metadata, set status flags, initialise versions, insert data into staging tables, update audit tracking.
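The import step listed above could look roughly like the following sketch, which uses SQLite in place of the real staging RDBMS. Everything specific here is an assumption for illustration: the `staging_record` and `audit_log` tables, the `<record>`/`<taxonName>` element names, and the `unvalidated` status value are hypothetical and are not taken from VegX, DwC, or the BIEN schema.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical staging schema; the real BIEN staging tables are not shown here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS staging_record (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- assigned ID
    provider   TEXT NOT NULL,                      -- gathered metadata
    taxon_name TEXT,
    status     TEXT NOT NULL,                      -- status flag
    version    INTEGER NOT NULL                    -- initial version
);
CREATE TABLE IF NOT EXISTS audit_log (
    record_id INTEGER NOT NULL,
    action    TEXT NOT NULL,
    logged_at TEXT NOT NULL
);
"""


def load_into_staging(conn, xml_text, provider):
    """Parse an XML document and insert each record into the staging tables."""
    conn.executescript(SCHEMA)
    root = ET.fromstring(xml_text)
    new_ids = []
    for rec in root.iter("record"):
        cur = conn.execute(
            "INSERT INTO staging_record (provider, taxon_name, status, version) "
            "VALUES (?, ?, 'unvalidated', 1)",
            (provider, rec.findtext("taxonName")),
        )
        # Audit tracking: one log row per inserted record.
        conn.execute(
            "INSERT INTO audit_log (record_id, action, logged_at) VALUES (?, 'insert', ?)",
            (cur.lastrowid, datetime.now(timezone.utc).isoformat()),
        )
        new_ids.append(cur.lastrowid)
    conn.commit()
    return new_ids
```

Keeping the status flag and version on the staging row means later validation and scrubbing steps can update them in place while the audit log records each change.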

2.2 First-level schema validation

Is the document well formed? Are mandatory data present?
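A minimal sketch of the two first-level checks, assuming a simplified document layout: the `<record>` element and the `plotName`/`taxonName` mandatory fields are hypothetical placeholders, not the actual VegX or DwC requirements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mandatory fields; the real list depends on the exchange schema.
MANDATORY = ("plotName", "taxonName")


def first_level_check(xml_text, mandatory=MANDATORY):
    """Return a list of problems; an empty list means the document passes."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Check 1: the document must be well formed before anything else.
        return [f"not well formed: {exc}"]
    errors = []
    for i, rec in enumerate(root.iter("record"), start=1):
        for field in mandatory:
            # Check 2: every mandatory field must be present and non-empty.
            if not (rec.findtext(field) or "").strip():
                errors.append(f"record {i}: missing {field}")
    return errors
```

The returned messages are the kind of content that could be sent back through the provider feedback path when a document is rejected.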

Provider BIEN

Provider feedback

BIEN staging DB

BIEN feedback

VegX files transferred to BIEN file system

Transfer: website or FTP process?

1.0 Data mapping and provider services

Transfer: Web services – REST, SOAP …

Manual process to map and transfer data to BIEN

Automated harvesting process to map and transfer data to BIEN.

BIEN harvester retrieves VegX files from file system

Page 4

3.0 Internal validation, Taxon scrubbing, Geospatial validation

Synonymy

Names

Reference taxonomy

3.1 Internal validation

3.2 Taxon scrubbing

3.3 Geospatial validation

Validation rule sets

Geo-validation rule sets
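The slide leaves the rule sets unspecified, so the following is only a sketch of how a geo-validation rule set might be expressed as data. The `Rule` type, the three example rules, and the `lat`/`lon` record keys are all assumptions made for illustration.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """A named check that returns True when a record passes."""
    name: str
    check: Callable[[dict], bool]


# Hypothetical geo-validation rule set on decimal-degree coordinates.
GEO_RULES = [
    Rule("latitude in range", lambda r: -90 <= r["lat"] <= 90),
    Rule("longitude in range", lambda r: -180 <= r["lon"] <= 180),
    Rule("not at 0,0", lambda r: not (r["lat"] == 0 and r["lon"] == 0)),
]


def failed_rules(record, rules=GEO_RULES):
    """Return the names of all rules the record fails."""
    return [rule.name for rule in rules if not rule.check(record)]
```

Holding the rules in a list keeps the rule set separate from the code that applies it, so a different set can be passed in for internal validation without changing `failed_rules`.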

TNRS Geo Resolution Service

To be completed

Page 5

4.0 Transform, Audit change, Integrate, Rule set, Rule framework

To be done

Page 6

Confederated DB (S) DW

5.1 Create dimensional extracts

5.2 Create individual observation extracts

Geo

Time

Taxon

Raw plot

Agg plot

Specimen

5.3 Transform and load

5.4 Build and compute aggregate tables
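To make the aggregate step concrete, here is a minimal sketch that derives an aggregated plot table from a raw plot table, again using SQLite as a stand-in. The `raw_plot` and `agg_plot` table names echo the "Raw plot" and "Agg plot" boxes on this slide, but their columns (`plot_id`, `taxon`) and the chosen summary measures are assumptions.

```python
import sqlite3


def build_plot_aggregates(conn):
    """Rebuild agg_plot from raw_plot: one summary row per plot."""
    conn.executescript("""
        DROP TABLE IF EXISTS agg_plot;
        CREATE TABLE agg_plot AS
        SELECT plot_id,
               COUNT(*)              AS n_observations,
               COUNT(DISTINCT taxon) AS n_taxa
        FROM raw_plot
        GROUP BY plot_id;
    """)
```

Dropping and rebuilding the table on each run keeps the aggregate consistent with the raw extract, at the cost of recomputing everything; an incremental refresh would be a separate design decision.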

DM

DM

5.6 Archiving

5.5 Build analytical DBs

Archived DM

Archived DM

Archived DW

Other data sources

5.0 Populating Analytical DB

To be completed

Page 7

6.0 User/web interface API

To be done

Page 8

7.0 Reapplication of updated Names

To be done