![Page 1: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/1.jpg)
Accessing the original observation data captured during plant exploration missions for collecting crop diversity
Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy
Hannes Gaisberger, Massimo Buonaiuto, Federico Mattei, Andrea De Pirro, Valentina Barbiero, Simone Mori, Imke Thormann, Tom Hazekamp, Elizabeth Arnaud
![Page 2: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/2.jpg)
Agenda
• Part 1: Safeguarding the original paper documents by scanning and digitizing the data – Hannes Gaisberger
• Part 2: Creation of a public repository of full scanned documents enabling access to the full text – Massimo Buonaiuto
![Page 3: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/3.jpg)
Bioversity supported germplasm collecting missions
• Since 1974, Bioversity International has supported more than 550 germplasm collecting missions yielding 225,875 samples and covering 4,300 species from 137 countries
• Samples were sent to several genebanks worldwide for safety duplication, conservation and potential distribution
• Other CGIAR centers organized various collecting missions for their mandate crops
![Page 4: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/4.jpg)
Original observation data is essential for:
• Identifying duplicates between collections and gaps in diversity – value for genebank curators and collecting actions
• Tracking the original sample and country of origin in pedigrees – value for breeders and benefit sharing
![Page 5: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/5.jpg)
Scanning of field notebooks and related documents
• Collectors recorded key sample information (passport data) and other observation data in field books
![Page 6: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/6.jpg)
Original observation: a treasure for genebanks and breeders
• Genus and Species
• Collecting Number
• Site Information: Admin boundaries, Latitude, Longitude and Elevation
• Collecting Source and Sample Status
The collecting form contains the botanical classification along with location details, environment, cultural practices, disease and pest presence and symptoms, and traditional uses
![Page 7: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/7.jpg)
Identification and quality-checking in databases
• Different publicly available genebank inventories are checked in order to track corresponding samples and complete missing passport data
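The tracking step above can be illustrated with a minimal sketch. It assumes that field-book records and inventory rows are plain dictionaries and that the collecting number is the matching key; the field names (`collecting_number`, `latitude`, `accession_id`) and the normalization rule are hypothetical and are not taken from the project's actual database.

```python
def normalize(value):
    """Normalize a collecting number for comparison (whitespace and case only)."""
    return "".join(str(value).split()).upper() if value else ""


def complete_passport(field_record, inventory):
    """Return a copy of field_record with empty fields filled from a matching inventory row.

    Matching on the collecting number alone is an assumption made for illustration;
    real matching would likely need further criteria.
    """
    key = normalize(field_record.get("collecting_number"))
    match = None
    if key:
        for row in inventory:
            if normalize(row.get("collecting_number")) == key:
                match = row
                break

    completed = dict(field_record)
    if match is not None:
        for name, value in match.items():
            # Only fill gaps; values recorded in the field book are never overwritten.
            if completed.get(name) in (None, ""):
                completed[name] = value
    return completed
```

Values from the field book take precedence here because the field book is the original observation; an inventory value is used only where the field-book record is empty.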
![Page 8: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/8.jpg)
Integration of quality passport data
• Data extracted from field books and databases is integrated into a sample-level database of collecting missions
![Page 9: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/9.jpg)
Results in figures
• To date, the quality of 101,171 passport data records from 375 collecting missions has been improved through data extracted from scanned documentation
• 56,454 of these collected samples are linked to genebank accessions in 51 institutes worldwide
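| Priority crops / use group | Number of collected samples |
|---|---|
| Forages | 44056 |
| Rice | 25022 |
| Maize | 16484 |
| Beans | 10976 |
| Wheat | 7507 |
| Cowpea | 7473 |
| Potato | 7146 |
| Pearl millet | 6662 |
| Barley | 4429 |
| Groundnut | 2928 |
| Finger millet | 2850 |
| Chickpea | 1467 |
| Banana | 1326 |
| Pigeon pea | 999 |
| Others | 86550 |
| Total | 225875 |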
• A total of 43,637 scanned pages are saved as 1,063 PDF files and stored in an online repository alongside the 26,000 other files scanned by CGIAR centers and partners
![Page 10: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/10.jpg)
Publishing the data and attached information
• End of 2010: work must be finished for Bioversity supported missions
• Make the full text available in the online repository and publish the collecting mission database
• Visualization: Map sites where diversity was collected (after georeferencing with Biogeomancer)
• Support projects that address gap analysis and diversity analysis, such as Genesys; encourage partners to perform the same work and share the full text and data – links to CWR information, museum herbaria information and literature
![Page 11: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/11.jpg)
Public access to the scanned collecting missions documents
A Repository that presently contains 27,000 Collecting Missions Files from CGIAR Centers and partners:
• Agricultural Research Centre (ARC) of Lao People’s Democratic Republic
• AfricaRice
• Agricultural Research for Development in Africa (IITA)
• Bioversity International
• International Rice Research Institute (IRRI)
![Page 12: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/12.jpg)
Typology of the documents produced by Collectors
1) Mission Reports
2) Summary Forms
3) Sample lists
4) Collecting Forms
5) Accession Vouchers
6) Newsletters
7) Factsheets
8) Distribution lists
9) Field Books
![Page 13: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/13.jpg)
Documents Types Hierarchy
![Page 14: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/14.jpg)
Analysis of Metadata (1/5)
![Page 15: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/15.jpg)
Analysis of Metadata (2/5)
![Page 17: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/17.jpg)
Analysis of Metadata – Darwin Core for Germplasm (4/5)
![Page 18: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/18.jpg)
Analysis of Metadata (5/5)
Darwin Core Germplasm metadata+
Collecting Missions metadata =
Metadata for Collecting Missions Documents
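As a rough illustration of the combination above, the two metadata sets can be thought of as dictionaries merged into one record per document. This is only a sketch: the key names in the usage example are hypothetical, and treating disagreeing values for a shared key as an error is an assumption rather than the repository's documented behaviour.

```python
def build_document_metadata(germplasm_metadata, mission_metadata):
    """Combine germplasm metadata and mission metadata into one document record.

    Keys present in both inputs must agree; otherwise the conflict is reported
    instead of being resolved silently (an illustrative design choice).
    """
    shared = germplasm_metadata.keys() & mission_metadata.keys()
    conflicts = sorted(k for k in shared if germplasm_metadata[k] != mission_metadata[k])
    if conflicts:
        raise ValueError(f"conflicting values for: {', '.join(conflicts)}")
    return {**germplasm_metadata, **mission_metadata}
```

For example, `build_document_metadata({"genus": "Oryza"}, {"document_type": "Field Book"})` yields a single record carrying both the germplasm field and the document type.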
![Page 19: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/19.jpg)
How users will access the Repository
Alfresco DMS
Typo3 CMS
![Page 20: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/20.jpg)
Import of 27,000 PDF Files
The PDF files are imported in 3 phases:
1. Conversion of institutional metadata into Darwin Core Germplasm metadata
2. Association of metadata with all PDF files, using heterogeneous sources (databases, Excel files, filenames, etc.)
3. Batch upload of all PDF files, each with its associated metadata file in the DC-Germplasm standard
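The three phases can be sketched as a small pipeline. Everything specific here is an assumption made for illustration: the `FIELD_MAP` column names, the target field names, the convention that a filename begins with a mission identifier followed by an underscore, and the shape of the batch entries. The actual upload to the document management system is left out.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from institutional column names to target metadata fields.
FIELD_MAP = {
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "COLLNUMB": "recordNumber",
}


def convert(institutional_record):
    """Phase 1: keep only mapped fields and rename them to the target names."""
    return {FIELD_MAP[k]: v for k, v in institutional_record.items() if k in FIELD_MAP}


def mission_id_from_filename(filename):
    """Phase 2 helper: assume filenames look like '<missionId>_<anything>.pdf'."""
    return PurePosixPath(filename).stem.split("_")[0]


def build_batch(filenames, records_by_mission):
    """Phases 2 and 3: pair every PDF with its converted metadata, ready for upload."""
    batch = []
    for name in filenames:
        mission_id = mission_id_from_filename(name)
        metadata = convert(records_by_mission.get(mission_id, {}))
        batch.append({"file": name, "metadata": metadata})
    return batch
```

A file whose mission identifier is unknown still enters the batch, with empty metadata, so that it can be completed by hand later rather than being dropped.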
![Page 22: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/22.jpg)
Public Search Mask (2/3)
![Page 23: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/23.jpg)
Public Search Mask (3/3)
![Page 24: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/24.jpg)
How users will manage and publish documents
• Simple workflow to publish into the Repository:
1. Upload the file to the private user Home Space
2. Edit metadata
3. Approve the document for the public repository with a click
... and the file becomes public
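The publishing workflow can be modelled as three small steps on a document object. This is a toy model, not the Alfresco implementation: the `Document` class, the function names and the rule that approval requires some metadata are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    filename: str
    metadata: dict = field(default_factory=dict)
    public: bool = False  # a freshly uploaded file stays in the private space


def upload(filename):
    """Step 1: the file lands in the user's private space."""
    return Document(filename)


def edit_metadata(doc, **fields):
    """Step 2: add or change metadata fields on the document."""
    doc.metadata.update(fields)
    return doc


def approve(doc):
    """Step 3: publish the document (assumed here to require non-empty metadata)."""
    if not doc.metadata:
        raise ValueError("metadata must be edited before approval")
    doc.public = True
    return doc
```

A typical run would be `approve(edit_metadata(upload("fieldbook.pdf"), genus="Oryza"))`, after which the document's `public` flag is set.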
![Page 25: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/25.jpg)
Summary
• Improved quality of passport data for about 100,000 collected samples from 137 countries
• 56,454 of these collected samples are linked to genebank accessions in 51 institutes worldwide
• 27,000 documents collected and classified, with metadata, into 9 document types
• Metadata extracted and parsed using the Darwin Core Germplasm standard
![Page 26: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/26.jpg)
Open questions and challenges
- Interaction with Open Archive standards and the Protocol for Metadata Harvesting
- Integration with Crop Terminizer, University of Manchester
- Web Analytics for detailed monitoring of downloads (referrers, visits, etc.) and web marketing
- CMIS protocol used to interact with content management systems
- Metadata validation with crop scientists, collectors
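For the metadata harvesting point, a harvester following OAI-PMH would issue requests of roughly the following shape. The sketch only builds the request URL; `ListRecords`, `metadataPrefix` and `oai_dc` come from the OAI-PMH specification, while the base URL passed in and the assumption that the repository exposes such an endpoint at all are hypothetical, since this is listed as an open question.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given endpoint.

    base_url is a placeholder supplied by the caller; no real endpoint is implied.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set, as defined by the protocol.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"
```

Calling `list_records_url("http://example.org/oai")` produces a URL that asks for all records in unqualified Dublin Core, the one metadata format every OAI-PMH repository is required to support.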
http://www.central-repository.cgiar.org/
![Page 27: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/27.jpg)
Guidelines for collecting samples
- Being revised, to be published in a new section of the Crop Genebank Knowledge Base
- Adding guidelines on illustrating samples with photos that support the tentative taxonomy, captured data and GPS
![Page 28: Bioversity International, Via dei Tre Denari 472/a, Maccarese, Rome, Italy](https://reader036.vdocument.in/reader036/viewer/2022081503/568162c0550346895dd35051/html5/thumbnails/28.jpg)
THANK YOU!