TRANSCRIPT
BIEN Confederated DB(s)
Analytical DB(s)
Heterogeneous source database(s) of Plots/Specimens/Occurrences
Synonymy
Names
Reference taxonomy
***
*** Feedback to original data providers. Filtered push/BiSciCol? Annotation and feedback, QC
Provider feedback
Errors, annotation, etc.
Data-source Steward
Modified by NS/SD from MS/BB
Later changes to taxon names applied to BIEN
Staging DB
2.0 BIEN harvester and loader
4.0 Transform, Audit change, Integrate, Rule set, Rule framework
5.0 Populating Analytical DB
6.0 User/web interface API
VegX
DwC
1.0 Data mapping and provider services
3.0 Internal validation, Taxon scrubbing, Geospatial validation
Native BIEN Traits
Mediated through TNRS & Geo scrubbing
Other data associated with taxon
Integrate geospatial & environmental data associated with location
Geospatial data linking
There may be linkages to other data BIEN wants to integrate into core data.
7.0 Reapplication of updated Names
Phylogenies
Sequences
Traits
External data sources
How are linkages to external resources made? If it is through Names, is a TNRS mediation step required against the external resources?
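One way to picture the name-mediated linkage raised in the question above is the sketch below. It assumes that both BIEN core records and external records are first resolved to an accepted name before being joined. The synonymy table, the record fields, and the function names are all hypothetical, the taxon names are invented placeholders, and the lookup is only a stand-in for a call to a name-resolution service such as TNRS.

```python
# Hypothetical synonymy table: synonym -> accepted name (invented placeholder names).
SYNONYMY = {
    "Aus cus": "Aus bus",
}


def resolve_name(raw_name, synonymy):
    """Return the accepted name for raw_name, or None if it cannot be resolved."""
    cleaned = " ".join(raw_name.split())  # collapse stray whitespace
    if cleaned in set(synonymy.values()):
        return cleaned  # already an accepted name
    return synonymy.get(cleaned)  # synonym lookup; None when unknown


def link_by_name(core_records, external_records, synonymy):
    """Pair each core record with the external records sharing its accepted name."""
    index = {}
    for ext in external_records:
        accepted = resolve_name(ext["name"], synonymy)
        if accepted is not None:
            index.setdefault(accepted, []).append(ext)

    links = []
    for rec in core_records:
        accepted = resolve_name(rec["name"], synonymy)
        matches = index.get(accepted, []) if accepted is not None else []
        links.append((rec, matches))
    return links
```

Under this assumption, an external trait record stored under a synonym still links to the core record, because both sides pass through the same resolution step before the join.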
Provider BIEN
Full BIEN Workflow
There may be trait data BIEN wants to integrate into core data.
Provider BIEN
‘Manual’ schema-mapping tools
1.0 Data mapping and provider services
2.0 BIEN Harvest process
1.1 Data mapping tool
Harvesting protocol: OAI-PMH
1.3 Provider data web service
Provider Cache
VegX
Veg Obs multi-dataset DB
Single datasets
Provider Data
Would follow the TDWG specifications and architecture for harvesting DwC.
This workflow represents a provider using a manual process to map and transfer data to BIEN.
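The OAI-PMH harvesting protocol named on this slide works by issuing HTTP requests with a `verb` parameter and paging through results with a resumption token. The sketch below shows how a harvester might build `ListRecords` requests and read the token from a response. The base URL and the `vegx` metadata prefix are assumptions for illustration only; the slides do not say which prefixes a BIEN provider would expose.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token in a ListRecords response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    element = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()
```

A harvester would call `list_records_url` once with a metadata prefix, then keep requesting with the token returned by `next_token` until it returns `None`.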
VegX
1.2 VegX programmatic mapping
1.4 DwC programmatic mapping
DwC
VegX
Specimen multi-dataset DB
1.5 TAPIR/IPT Provider web service
Provider Cache
DwC/A
VegX
TAPIR or IPT protocol
DwC
This workflow represents a provider using an automated harvesting process to map and transfer data to BIEN.
Transfer may occur via a website or FTP process?
VegX
Notification services?
Temporary XML file store
2.0 BIEN harvester and loader
2.1 BIEN harvester: VegX & DwC
2.3 Import into staging DB
Parse XML to RDBMS
Assign IDs, gather metadata, set status flags, initialise versions, insert data into staging tables, update audit tracking.
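The staging import steps listed above (parse XML, assign IDs, set status flags, initialise versions, insert into staging tables, update audit tracking) can be sketched as follows. This is a minimal illustration using SQLite, not the BIEN schema: the table names, column names, and the `'new'` status value are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def create_staging(conn):
    """Create hypothetical staging and audit tables."""
    conn.executescript(
        """
        CREATE TABLE staging_record (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- assigned ID
            source_file TEXT NOT NULL,             -- gathered metadata
            tag TEXT NOT NULL,
            payload TEXT NOT NULL,
            status TEXT NOT NULL,                  -- status flag
            version INTEGER NOT NULL               -- initial version
        );
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            record_id INTEGER NOT NULL,
            action TEXT NOT NULL,
            logged_at TEXT NOT NULL
        );
        """
    )


def import_document(conn, xml_text, source_file):
    """Load each top-level child of the XML document as one staging row."""
    root = ET.fromstring(xml_text)
    record_ids = []
    for child in root:
        cursor = conn.execute(
            "INSERT INTO staging_record (source_file, tag, payload, status, version) "
            "VALUES (?, ?, ?, 'new', 1)",
            (source_file, child.tag, ET.tostring(child, encoding="unicode")),
        )
        record_id = cursor.lastrowid
        # Audit tracking: one log row per inserted staging row.
        conn.execute(
            "INSERT INTO audit_log (record_id, action, logged_at) VALUES (?, 'insert', ?)",
            (record_id, datetime.now(timezone.utc).isoformat()),
        )
        record_ids.append(record_id)
    conn.commit()
    return record_ids
```

Storing the raw XML fragment in `payload` is a simplification; a real loader would map elements to typed columns.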
2.2 First-level schema validation
Is the document well formed? Are mandatory data present?
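The two first-level checks, well-formedness and presence of mandatory data, could look like the sketch below. The element paths passed in as mandatory are hypothetical; the slides do not list which VegX or DwC elements BIEN treats as required, and full validation against a schema is outside this sketch.

```python
import xml.etree.ElementTree as ET


def first_level_check(xml_text, mandatory_paths):
    """Return a list of error messages; an empty list means the document passed.

    mandatory_paths are ElementTree paths, relative to the root element,
    of leaf elements that must exist and contain non-blank text.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # A document that is not well formed cannot be checked any further.
        return [f"not well formed: {exc}"]

    errors = []
    for path in mandatory_paths:
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            errors.append(f"missing mandatory element: {path}")
    return errors
```

The returned messages are the kind of content that could be sent back to the provider as feedback before anything reaches the staging database.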
Provider BIEN
Provider feedback
BIEN staging DB
BIEN feedback
VegX files transferred to BIEN file system
Transfer: website or FTP process?
1.0 Data mapping and provider services
Transfer: web services (REST, SOAP, …)
Manual process to map and transfer data to BIEN
Automated harvesting process to map and transfer data to BIEN.
BIEN harvester retrieves VegX files from the file system
3.0 Internal validation, Taxon scrubbing, Geospatial validation
Synonymy
Names
Reference taxonomy
3.1 Internal validation
3.2 Taxon scrubbing
3.3 Geospatial validation
Validation rule sets
Geo-validation rule sets
TNRS Geo Resolution Service
To be completed
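Since this step is still marked as to be completed, the following is only a sketch of what a geo-validation rule set might look like: a list of named predicates applied to each record. The rule names, the record fields `latitude` and `longitude`, and the rules themselves are assumptions for illustration, not BIEN's actual rules.

```python
def _is_number(value):
    return isinstance(value, (int, float))


def latitude_in_range(record):
    value = record.get("latitude")
    return _is_number(value) and -90 <= value <= 90


def longitude_in_range(record):
    value = record.get("longitude")
    return _is_number(value) and -180 <= value <= 180


# A rule set is an ordered list of (rule name, predicate) pairs.
GEO_RULES = [
    ("latitude within [-90, 90]", latitude_in_range),
    ("longitude within [-180, 180]", longitude_in_range),
]


def apply_rules(record, rules):
    """Return the names of all rules the record fails."""
    return [name for name, predicate in rules if not predicate(record)]
```

Keeping rules as data rather than hard-coded checks would let internal validation and geospatial validation share one `apply_rules` function with different rule sets.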
4.0 Transform, Audit change, Integrate, Rule set, Rule framework
To be done
Confederated DB(s)
DW
5.1 Create dimensional extracts
5.2 Create individual observation extracts
Geo
Time
Taxon
Raw plot
Agg plot
Specimen
5.3 Transform and load
5.4 Build and compute aggregate tables
DM
DM
5.6 Archiving
5.5 Build analytical DBs
Archived DM
Archived DM
Archived DW
Other data sources
5.0 Populating Analytical DB
To be completed
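As a rough illustration of steps 5.1 to 5.4, the sketch below builds two dimension tables (Taxon and Geo), loads an observation fact table, and computes one aggregate table, using SQLite. All table and column names are hypothetical, the Time dimension is reduced to a plain `year` column for brevity, and the taxon names in the usage example are invented placeholders.

```python
import sqlite3


def build_mart(conn, observations):
    """Build dimension, fact, and aggregate tables from (taxon, region, year) tuples."""
    conn.executescript(
        """
        CREATE TABLE dim_taxon (taxon_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_geo (geo_id INTEGER PRIMARY KEY, region TEXT UNIQUE);
        CREATE TABLE fact_observation (taxon_id INTEGER, geo_id INTEGER, year INTEGER);
        """
    )
    for name, region, year in observations:
        # 5.1 Dimensional extracts: one row per distinct taxon and region.
        conn.execute("INSERT OR IGNORE INTO dim_taxon (name) VALUES (?)", (name,))
        conn.execute("INSERT OR IGNORE INTO dim_geo (region) VALUES (?)", (region,))
        # 5.2 and 5.3 Observation extract, transformed to dimension keys and loaded.
        conn.execute(
            "INSERT INTO fact_observation "
            "SELECT t.taxon_id, g.geo_id, ? FROM dim_taxon t, dim_geo g "
            "WHERE t.name = ? AND g.region = ?",
            (year, name, region),
        )
    # 5.4 Aggregate table: observation counts per taxon and region.
    conn.execute(
        """
        CREATE TABLE agg_taxon_region AS
        SELECT t.name AS taxon, g.region AS region, COUNT(*) AS n_obs
        FROM fact_observation f
        JOIN dim_taxon t USING (taxon_id)
        JOIN dim_geo g USING (geo_id)
        GROUP BY t.name, g.region
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
build_mart(conn, [("Aus bus", "R1", 2001), ("Aus bus", "R1", 2002), ("Aus cus", "R2", 2001)])
```

Precomputing `agg_taxon_region` trades storage for query speed, which is the usual motivation for aggregate tables in a data mart.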
6.0 User/web interface API
To be done
7.0 Reapplication of updated Names
To be done