![Page 1: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/1.jpg)
The Process of Data Ingestion in ÆKOS
Andrew Graham and Matt Schneider
TERN Ecoinformatics Data Analysts
Logos used with consent. Content of this presentation, except logos, is released under TERN Attribution Licence Data Licence v1.0.
![Page 2: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/2.jpg)
Introduction
The Data Analyst Role with TERN Ecoinformatics
• Analysis of source data and methods
• ÆKOS system development and domain modelling
• Contextual description of the data
• Publication of data into ÆKOS
![Page 3: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/3.jpg)
The AEKOS Framework
1. Upper Context: Party, Project, Scope, etc.
2. Domain Model (Ontology): Observed entities, their features and relationships
3. Description Model: Methods and definitions
4. Indexing Model: Search and federation
![Page 4: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/4.jpg)
Upper Context
Provides context for datasets:
• Contact details
• High-level objectives of the program
• Licensing details and conditions of use
• Statement of scope
• Alignment with national metadata standards (ANDS)
• Statement of curation processes applied to data
![Page 5: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/5.jpg)
Understanding Field Sampling
Schematic view of sampling configuration
![Page 6: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/6.jpg)
Methodological work-flow
[Flow diagram. Steps shown:]
• Study Location Selection
• Study Location Visit
• Study Location Establishment
• Sampling Unit Selection
• Vegetation Assessment
• Physical Assessment
• Landscape Assessment
• Soil Assessment
• Fire Evidence
• Surface Cover
• Disturbance Evidence
• Vertebrate Evidence
• Climate Evidence
• Species Assessment
• Species Life Stage
• Vegetation Assemblage
• Voucher Collection
• Canopy Age-class
• Canopy Assessment
• Structural Formation
• Overstorey Measurement
![Page 7: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/7.jpg)
Authored Method Descriptions
• Start with published method manuals
• Enrich existing method descriptions (protocols) with external web links and other resources
• Clarify questions about methods
• Divide the protocol into smaller method descriptions
![Page 8: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/8.jpg)
Authored Method Descriptions
• Use a consistent format across datasets to allow comparison
• Direct linkage between the data value and the specific method of measurement
• Allows rapid assessment of suitability of data for re-use
• Eventually a method catalogue for researchers
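The direct linkage between a data value and its method of measurement could be sketched as follows. This is only an illustration: the class names, field names, the `cover-visual-estimate` identifier and the catalogue contents are all hypothetical and are not taken from ÆKOS itself.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MethodDescription:
    """One small, authored method description (hypothetical structure)."""
    method_id: str
    title: str
    source_manual: str


@dataclass(frozen=True)
class Measurement:
    """A data value that carries the id of the method used to obtain it."""
    feature: str
    value: str
    method_id: str


def method_for(measurement: Measurement,
               catalogue: Dict[str, MethodDescription]) -> MethodDescription:
    """Look up the method description a measurement was taken under."""
    return catalogue[measurement.method_id]


# Hypothetical catalogue entry and measurement, for illustration only.
catalogue = {
    "cover-visual-estimate": MethodDescription(
        method_id="cover-visual-estimate",
        title="Visual estimate of cover/abundance",
        source_manual="(hypothetical) published method manual",
    )
}
measurement = Measurement(
    feature="cover/abundance",
    value="5-25%",
    method_id="cover-visual-estimate",
)
print(method_for(measurement, catalogue).title)
```

Because every value points at exactly one method description, a researcher assessing suitability for re-use can go from the value to the method without reading the whole protocol.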
![Page 9: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/9.jpg)
Definition of source datasets
Analysis and definition of source data types:
• Observation data
• Taxonomic concepts (a specific type of ref. data)
• Reference data (i.e. lookup tables)
• Images and other artefacts
![Page 10: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/10.jpg)
Mapping to the ÆKOS Domain Model
[Diagram of entities, their attributes and relationships]
Entities: Study Location, Sampling Unit, Study Location Visit, Spatial Point, Species Organism Group, Voucher Specimen, Landscape
Attributes shown:
• mudmap, comment
• visit date, observers, disturbance
• datum, x coord, y coord
• identifier, marker, type
• Voucher Specimen: determined identity, accession No., determiner
• field identity, life form, cover/abundance, life stage, phenology, dominance
• Landscape: slope, aspect, landform pattern
Relationships shown: selects, represents, contains, contains, represented by
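As a rough sketch of what mapping a flat source record onto domain entities might look like: the entity names follow the diagram, but the source column names (`SITE_ID`, `EASTING`, and so on), which entity owns which attribute, and the way entities nest are assumptions made for illustration, not the actual ÆKOS model or its DSL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpatialPoint:
    datum: str
    x_coord: float
    y_coord: float


@dataclass
class StudyLocationVisit:
    visit_date: str
    observers: List[str]


@dataclass
class StudyLocation:
    identifier: str
    point: SpatialPoint
    visits: List[StudyLocationVisit] = field(default_factory=list)


def map_row(row: Dict[str, str]) -> StudyLocation:
    """Map one flat source row (hypothetical columns) onto domain entities."""
    point = SpatialPoint(
        datum=row["DATUM"],
        x_coord=float(row["EASTING"]),
        y_coord=float(row["NORTHING"]),
    )
    visit = StudyLocationVisit(
        visit_date=row["VISIT_DATE"],
        # Assumes observers are stored as a semicolon-separated string.
        observers=[name.strip() for name in row["OBSERVERS"].split(";")],
    )
    return StudyLocation(identifier=row["SITE_ID"], point=point, visits=[visit])


# Hypothetical source row, for illustration only.
row = {
    "SITE_ID": "SITE-001",
    "DATUM": "GDA94",
    "EASTING": "512345.0",
    "NORTHING": "6123456.0",
    "VISIT_DATE": "2012-10-03",
    "OBSERVERS": "A. Observer; B. Observer",
}
location = map_row(row)
print(location.identifier, location.point.datum, location.visits[0].observers)
```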
![Page 11: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/11.jpg)
Indexing
Enrichment of data with common indexes:
• Project level traits
• Data management traits
• Ecological process traits (disturbance and land-use)
• Measurement details
• Species taxonomy
• Vegetation Assemblage (e.g. NVIS Major Veg. Groups)
• Jurisdictional and Bio-geographic boundaries
• Spatially derived features (e.g. distance from road, slope, aspect, etc.)
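Index enrichment of this kind can be pictured as attaching derived traits alongside a record without altering the record's own values. The sketch below is an assumption-laden illustration: the field names `aspect_deg` and `slope_deg`, the octant labels and the 20-degree slope threshold are all hypothetical and are not ÆKOS index definitions.

```python
from typing import Any, Callable, Dict


def aspect_class(record: Dict[str, Any]) -> str:
    """Bucket an aspect in degrees into one of eight compass octants."""
    labels = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    octant = int(((record["aspect_deg"] % 360) + 22.5) // 45) % 8
    return labels[octant]


def slope_class(record: Dict[str, Any]) -> str:
    """Classify slope using a hypothetical 20-degree threshold."""
    return "steep" if record["slope_deg"] >= 20 else "gentle"


def enrich_with_indexes(
    record: Dict[str, Any],
    index_functions: Dict[str, Callable[[Dict[str, Any]], str]],
) -> Dict[str, Any]:
    """Return a copy of the record with an added 'indexes' dictionary.

    The input record is not modified, so source values stay unchanged.
    """
    indexes = {name: fn(record) for name, fn in index_functions.items()}
    return {**record, "indexes": indexes}


# Hypothetical record, for illustration only.
record = {"site": "SITE-001", "aspect_deg": 350, "slope_deg": 5}
enriched = enrich_with_indexes(
    record, {"aspect_class": aspect_class, "slope_class": slope_class}
)
print(enriched["indexes"])
```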
![Page 12: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/12.jpg)
Federated Taxonomy
![Page 13: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/13.jpg)
The AEKOS Ingestion “DSL”
Screen cap of Eclipse...
• Source data query
• Vocabulary management
• Method description
• Mapping to the common model
• Populate indexes
• Upper context authoring
• Sandbox testing
![Page 14: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/14.jpg)
Data Work-flow
• Point of truth is always the source database
• Data values are not changed
• Data issues fed back to Data Providers
• Automatic data refresh mechanism developed
• Corrections made in source database and fed back to AEKOS on next “push”
• Just new records and edits after the first load
• Update frequency defined for each dataset
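The refresh behaviour described above (a full first load, then only new records and edits on each later push, with values copied unchanged from the source) might be sketched as below. The `id` and `modified` fields, the function names and the in-memory dictionaries are assumptions for illustration; the slides do not describe how the actual mechanism detects changes.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def select_changes(source_rows: List[Row], last_push: Optional[datetime]) -> List[Row]:
    """Return all rows on the first load, otherwise rows modified since last_push."""
    if last_push is None:
        return list(source_rows)
    return [row for row in source_rows if row["modified"] > last_push]


def apply_push(published: Dict[int, Row], changes: List[Row]) -> Dict[int, Row]:
    """Insert or overwrite published rows by id, copying source values as-is."""
    for row in changes:
        published[row["id"]] = dict(row)
    return published


# First load: everything in the (hypothetical) source is published.
source = [
    {"id": 1, "modified": datetime(2013, 1, 1), "cover": "10%"},
    {"id": 2, "modified": datetime(2013, 1, 1), "cover": "20%"},
]
published = apply_push({}, select_changes(source, None))
last_push = datetime(2013, 2, 1)

# The provider corrects record 2 and adds record 3 in the source database.
source[1] = {"id": 2, "modified": datetime(2013, 3, 1), "cover": "25%"}
source.append({"id": 3, "modified": datetime(2013, 3, 2), "cover": "5%"})

# Next push: only the edit and the new record are transferred.
changes = select_changes(source, last_push)
apply_push(published, changes)
print([row["id"] for row in changes], published[2]["cover"])
```

Corrections are made only in `source` here, mirroring the principle that the source database remains the point of truth.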
![Page 15: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/15.jpg)
Quality Assurance
ÆKOS QA and review:
• Team review domain modelling of every dataset ingested
• “Sandbox” test ingestion before publishing to ÆKOS
• Review of method description by other team members
• Internal code validation and error checking
![Page 16: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/16.jpg)
Quality Assurance
Data Providers QA:
• Review method descriptions
• Review upper context
Portal feedback:
• Review data content in the portal
• Use the portal and suggest enhancements and changes
  • Look and feel
  • Index traits
  • Data accuracy and representation
• Feedback survey and email facility on portal
![Page 17: The Process of Data Ingestion in ÆKOS](https://reader038.vdocument.in/reader038/viewer/2022110101/56812b7b550346895d8f99ac/html5/thumbnails/17.jpg)
Thank you
Contact Details
Data Analyst – Matt Schneider [email protected]
Data Analyst – Andrew Graham [email protected]
Website www.aekos.org.au