ifla art of life presentation final
IFLA Section on Science & Technology, August 19, 2014
Merging the Worlds of Art and Science
Trish Rose-Sandler, Center for Biodiversity Informatics, Missouri Botanical Garden, St. Louis, MO, USA
Nancy E. Gwinn, Smithsonian Libraries, Washington, DC, USA
Constance Rinaldo, Ernst Mayr Library, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
What is Biodiversity Content?
Digital Repository of Books
Field Notebooks
Journals
• Extensive
• Open
• Global
• Linked
Digital Library
78,947 Titles
141,770 Volumes
44,275,666 Pages
What is the Biodiversity Heritage Library?
• American Museum of Natural History Library
• California Academy of Sciences Library
• Cornell University Library
• Harvard University Botany Library
• Ernst Mayr Library of the Museum of Comparative Zoology
• Library of Congress
• Marine Biological Laboratory Library
• Missouri Botanical Garden Library
• Natural History Museum, London Library
• The New York Botanical Garden Library
• Royal Botanic Garden, Kew, Library
• Smithsonian Libraries
• Washington University at St. Louis Library
• University of Illinois Library
• United States Geological Survey Libraries
15 MEMBERS
4 AFFILIATES
BHL Central
• Academy of Natural Sciences Library
• Chicago Botanical Garden Library
• The Field Museum Library
• Los Angeles County Natural History Museum Library
Global Partners
Europe
China
Australia
Egypt
Brazil
Africa
Singapore
• Scientific descriptions of animals, plants, nature in general (Taxonomic literature)
• Type in a taxon name and find all detected occurrences in the text corpus.
• Systematic biology is easier for scientists—the taxonomic impediment is resolved.
BHL = Libraries + Technology + Science
BHL Book Viewer
Challenges
Millions of natural history illustrations
Very few have metadata that describe their content or where they are in the book
BHL and Flickr—90,000 Images
Only pretty pictures
Still too manual
Many illustrations not included
o Full title—The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library (BHL)
o Grant given to Missouri Botanical Garden in St. Louis, Missouri
o Funded by U.S. National Endowment for the Humanities
o Runs May 2012-April 2015
Art of Life?
o Define metadata schema for natural history illustrations
o Build algorithms to automate identification of illustrations
o Enhance existing tools to classify illustrations
o Crowdsource adding descriptive metadata
o Integrate metadata into BHL and share images and metadata more widely
Goals/Objectives
Algorithms to automate identification of images tested:
Picture blocks: 87-88% effective
Contrast: 87-88% effective
Color: ineffective
Compression: ineffective
How to Identify Illustrations?
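The slides do not describe how the contrast test works, so the following is only a minimal sketch of one contrast-style heuristic, not the project's algorithm. It assumes a page is available as rows of grayscale values (0 to 255) and that printed text is mostly pure black on white while illustrations contain many mid-tone pixels. The function names, the mid-tone band (64 to 192), and the 0.2 threshold are all hypothetical.

```python
def midtone_fraction(page, low=64, high=192):
    """Return the share of pixels whose gray value lies strictly between low and high.

    `page` is a list of rows, each row a list of grayscale values (0-255).
    The band limits are illustrative assumptions, not project settings.
    """
    pixels = [value for row in page for value in row]
    if not pixels:
        return 0.0
    midtones = sum(1 for value in pixels if low < value < high)
    return midtones / len(pixels)


def looks_illustrated(page, threshold=0.2):
    """Flag a page as a likely illustration when enough pixels are mid-tones."""
    return midtone_fraction(page) >= threshold


# Tiny made-up pages: one black-and-white "text" page, one shaded "plate".
text_page = [[255, 255, 0, 255], [255, 0, 255, 255]]
plate_page = [[120, 140, 90, 255], [100, 160, 130, 80]]

print(looks_illustrated(text_page))   # False
print(looks_illustrated(plate_page))  # True
```

A heuristic like this only produces candidates; the deck's workflow still routes flagged pages to a classification step for human review.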
Web Application
For analyzing & visualizing algorithm results
Basic Workflow
Macaw
An interface for classifying pages identified by algorithms
Crowdsourcing
Flickr tagging
Wikimedia Commons
apply schema
For adding description of illustrations
Describing Natural History Illustrations
Schema = VRA Core with elements from Darwin Core
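As a rough illustration of what a combined record might look like, the sketch below pairs a few VRA Core element names (title, worktype, agent, subject) with two Darwin Core terms (scientificName, vernacularName). Which elements the project actually selected is not stated in the slides, so the field choice, the class name, the `page_id` field, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class IllustrationRecord:
    # Element names borrowed from VRA Core (describing the image as a work).
    title: str
    worktype: str
    agent: str = ""
    subject: List[str] = field(default_factory=list)
    # Term names borrowed from Darwin Core (describing the organism shown).
    scientificName: str = ""
    vernacularName: str = ""
    # Hypothetical locator tying the record back to a scanned page.
    page_id: str = ""


# Made-up example values, for illustration only.
record = IllustrationRecord(
    title="Plate 12",
    worktype="engraving",
    subject=["botany", "trees"],
    scientificName="Quercus alba",
    vernacularName="white oak",
    page_id="example-page-001",
)

print(asdict(record))
```

Keeping the art-description and taxon-description fields in one record is what lets the same image be found by an art historian searching on technique and by a scientist searching on a species name.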
o Running algorithms across BHL corpus
o 1.5 million pages processed
o 300,000 pages with images
o Estimate: pages with images will make up about 15% of the corpus
o Classifying results—78,000 pages so far
o Testing bulk upload to Wikimedia Commons
o Testing extraction of metadata from tool
o BHL architecture modified and ready to store and preserve newly created metadata
Art of Life Status
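The status figures can be combined in a short back-of-the-envelope calculation. The sketch below uses only numbers quoted in the slides; applying the 15% estimate to the 44,275,666-page total from the Digital Library slide is an assumption about what "corpus" refers to, and the variable names are illustrative.

```python
# Figures quoted in the slides.
pages_processed = 1_500_000
pages_with_images = 300_000
corpus_pages = 44_275_666      # total from the Digital Library slide (assumed to be "the corpus")
estimated_share_percent = 15   # the project's stated estimate

# Share of illustrated pages in the portion processed so far.
observed_share = pages_with_images / pages_processed

# Illustrated pages implied by the 15% estimate over the whole corpus.
projected_illustrated_pages = round(corpus_pages * estimated_share_percent / 100)

print(f"Observed so far: {observed_share:.0%}")
print(f"Projected illustrated pages: {projected_illustrated_pages:,}")
```

Under these assumptions the processed sample shows 20% of pages with images, and the 15% estimate corresponds to roughly 6.6 million illustrated pages across the full corpus.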
o Images serve multiple audiences
  o Scientists
  o Artists
  o Historians
  o Teachers
o Rich cultural heritage content
o If specimens are gone, may be the only way to document a species
o Image finding algorithms and analysis tools on GitHub: github.com/IMAmuseum/artoflife
o Image classification tool (Macaw) on GitHub: github.com/cajunjoel/macaw-neh
Benefits
For more information:
Trish Rose-Sandler, Principal Investigator
trish.rose-sandler@mobot.org
biodivlib.wikispaces.com/Art+of+Life
Questions?