Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Upload: martin-kalfatovic

Post on 07-May-2015

DESCRIPTION

Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects by Cathy Norton, Marine Biological Laboratory / Woods Hole Oceanographic Institution Library. 33rd IAMSLIC: Changes on the Horizon. October 7-11, 2007. Sarasota, FL

TRANSCRIPT

Page 1: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

MBL WHOI Library

Marine Biological Laboratory

Woods Hole Oceanographic Institution

© 2007 MBLWHOI Library www.mblwhoilibrary.org

Biodiversity Heritage Library Project “Underway”

Taxonomic Intelligence for LARGE Scale Digitization Projects

33rd IAMSLIC Conference

Sarasota, FL

Cathy Norton, MBLWHOI Library Director

Oct. 7-11, 2007

Page 2: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Woods Hole Scientific Community

This library serves the MBL, WHOI, USGS, NMFS, SEA, WHRC, and other scientific groups in the area.

Facing a new dynamic phase

• NMFS - 1871

• MBL - 1888

• WHOI - 1930

• USGS - 1960

• SEA - 1971

• WHRC - 1985

Page 3: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

TOPICS

• Biodiversity Heritage Libraries

• Open Content Alliance, Principles

• Internet Archive Partner

• Northeast Regional Digitizing Center @ Boston Public Library

• Taxonomic Intelligence - modernizing the literature

Page 4: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 5: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Vision

Build a Digital Open Access Library for Biodiversity Literature

Page 6: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Meetings in Colorado, 2005

London, 2005 laboratories and libraries

Washington BHL 2006

Simultaneous Meetings in Woods Hole for BHL & EOL 2006

Page 7: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Members

• American Museum of Natural History

• Botany Library - Harvard

• British Natural History Museum, UK

• Field Museum

• MBLWHOI Library

• Missouri Botanical Garden

• Museum of Comparative Zoology - Harvard

• New York Botanical Garden

• Royal Botanic Gardens, Kew, UK

• Smithsonian Museum of Natural History

– University of Illinois, contributing member

Page 8: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Why BHL now?

• Legacy taxonomic literature held in museums has limited access

• Much of it is rare

• Systematic literature depends on the historic literature

• The cited half-life of natural history literature is longer than that of any other scientific domain

• 90% of biodiversity information is in these libraries

• 90% of biodiversity is in developing regions such as Africa and South America

Page 9: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

The Open Content Alliance (OCA) represents the collaborative efforts of a group of cultural, technology, nonprofit, and governmental organizations from around the world that will help build a permanent archive of multilingual digitized text and multimedia content.

Page 10: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Principles of OCA

• The OCA will encourage the greatest possible degree of access to and reuse of collections in the archive, while respecting the rights of content owners and contributors.

INTERNET ARCHIVE

• Contributors will determine the terms and conditions under which their collections are distributed and how attribution should be made.

• IA need not be obligated to accept all content that is offered to it and may give preference to that which can be made widely accessible.

• IA will offer collection and item-level metadata of its hosted collections in a variety of formats.

• IA welcomes efforts to create and offer tools (including finding aids, catalogs, and indexes) that will enhance the usability of the materials in the archive.

• Copies of IA collections will reside in multiple archives internationally to ensure their long-term preservation and accessibility to all.

Page 11: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Web Presence! Where to begin?

Name: BioDiversity Heritage Library

Wiki - for all involved

Page 12: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 13: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 14: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 15: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 16: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 17: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 18: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 19: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 20: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 21: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 22: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 23: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

In the end… simplicity…

• http://bhl.si.edu/

Page 24: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

• BHL invited to be a part of the EOL project.

• EOL - build one web page for each known species… 1.8 million!

• Alfred P. Sloan and MacArthur Foundations

Page 25: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 26: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 27: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Northeast Digitization Center

• Boston Public Library

– Space infrastructure

• 10 Scanning Stations

• 10¢ per page

• 50 Books per day

• Journals - metadata, foldouts

• Transportation

– ILL delivery

– Moving company

– 15 rolling carts per trip

Photo by Lesveilleux 9/20/07

Cathy Norton, Bernie Margolis, Brewster Kahle

Page 28: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Economies of Scale

• North East Regional Digitization Center

• Agreements made with the Boston Public Library to include the Boston Library Consortium and NE BHL members.

• Smithsonian and Library of Congress

• Field Museum, Illinois

• BNH UK and Kew UK

* Bernie Margolis - Boston Public Library

Judy Warnement - Harvard Botany Library

Brewster Kahle - Internet Archive

Cathy Norton - MBLWHOI Library

Doron Weber - Alfred P. Sloan Foundation

Page 29: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Page 30: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

10 scribes

BPL

Page 31: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Biology Digitization Projects

Problems, Dilemmas, Puzzles, Difficulties

• Copyright - Pre 1923, 1923-1964, orphan works, out-of-print

– Stanford University Copyright Renewal Database

• Permissions

• Collaboration with publisher, societies, institutions, etc.

• Duplicates, journals 85,000 - 14,000 BID LIST

• Monographs, collection analysis-- Ref Works

Page 32: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Name Changes over Time

Taxonomic Intelligence

Page 33: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

“All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge.”

~ Grimaldi & Engel, 2005, Evolution of the Insects

Page 34: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

The challenge for contemporary DIGITAL libraries

Goal:

Use one name to find the content for all names

Page 35: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Names are even misspelled, such as Loligo pealei

• Loligo pealeii

• Loligo pealii

• Loligo pealei
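
Spelling variants like these could be grouped with approximate string matching. What follows is a minimal sketch, assuming Python's standard `difflib` as the similarity measure and a hypothetical threshold of 0.9. It is not the method used by uBio, and the function names are illustrative only. String similarity alone cannot separate homonyms, so it would only be a first pass.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score for two name strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_spelling_variants(names, threshold=0.9):
    """Greedily group names whose similarity to a group's first member
    meets the threshold. The threshold is an assumed, untuned value."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough: start a new one.
            groups.append([name])
    return groups


names = ["Loligo pealeii", "Loligo pealii", "Loligo pealei", "Escherichia coli"]
groups = group_spelling_variants(names)
for group in groups:
    print(group)
```

The three Loligo spellings fall into one group and the unrelated name stays on its own. A production service would likely combine this kind of matching with curated name lists.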

Page 36: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

One name can refer to more than one organism

Peranema – the fern

Peranema – the euglenid

Yet, despite this, taxonomists have used names and hierarchies to manage information about organisms very effectively for 250 years

Page 37: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Who is affected by these problems?

Libraries

Publishers

Museums

Federal Agencies

Page 38: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Serious challenges in federated environments

One organism

4 scientific names

4 maps

We want one map

Page 39: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Reconciliation – linking alternative names for the same organism

A query initiated with any name can be expanded to all names and will unify the data associated with each
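
The query expansion described here can be sketched as a lookup from any one name to the full set of reconciled names. This is an illustrative sketch only: the groups are hand-written from names that appear in this talk, and `build_name_lookup`, `expand_query`, and `unified_search` are hypothetical helpers, not part of any uBio or BHL interface.

```python
# Hypothetical reconciliation groups: each set lists names treated as
# referring to the same organism (synonyms and misspellings alike).
RECONCILED_GROUPS = [
    {"Escherichia coli", "Bacterium coli", "Bacillus coli", "Escheria coli"},
    {"Loligo pealeii", "Loligo pealii", "Loligo pealei"},
]


def build_name_lookup(groups):
    """Map every name (lowercased) to the full group it belongs to."""
    lookup = {}
    for group in groups:
        frozen = frozenset(group)
        for name in group:
            lookup[name.lower()] = frozen
    return lookup


def expand_query(name, lookup):
    """Expand one name to all reconciled names; unknown names expand to themselves."""
    return set(lookup.get(name.lower(), {name}))


def unified_search(name, lookup, records_by_name):
    """Collect records filed under any reconciled variant of the queried name."""
    results = []
    for variant in sorted(expand_query(name, lookup)):
        results.extend(records_by_name.get(variant, []))
    return results


lookup = build_name_lookup(RECONCILED_GROUPS)
records = {"Escherichia coli": ["doc A"], "Bacillus coli": ["doc B"]}
print(unified_search("Bacterium coli", lookup, records))
```

A search that starts from "Bacterium coli" returns the records filed under both "Escherichia coli" and "Bacillus coli", which is the "one map" behaviour the slides ask for.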

Page 40: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

• All names & all Classifications (ClassificationBank)

• Alternative names reconciled

• Similar names disambiguated

• Exploit hierarchies to browse and search, build a comprehensive classification

• Improve performance with federated systems

• Read documents, web sites, and databases and taxonomically index the content

• Create a unified portal to information about organisms on the internet

Taxonomic intelligence is the inclusion of taxonomic practices, skills and knowledge within informatics services to manage information about organisms

Page 41: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

• There are many resources out there, but no single comprehensive resource for species information

• Rather than building another big database, we can create a new way to link existing information using an aggregation portal

• This places little or no burden on data providers

• Protecting ownership and diversity of initiatives

Taxonomically intelligent aggregation technology builds portals to distribute information about organisms

Page 42: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Alternative names

Vernacular names

Expert view

More or less specific

Suggestions & corrections

Indexing power from NameBank

Page 43: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Results from an array of resources

Page 44: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

• data from various sources may be merged

• red dots on the map link back to the website that provided the geographical co-ordinates

Specimen distribution data from remote sources

Page 45: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

FindIT - uBio’s Scientific Name Recognition Algorithm

Page 46: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Training and Improving the Algorithm

Page 47: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

uBioRSS Taxonomically Intelligent RSS Feed Aggregator

Page 48: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

uBioRSS Taxonomically Intelligent RSS Feed Aggregator

Page 49: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

MBL WHOI Library – Woods Hole authors’ publications

Page 50: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

MBL WHOI Library – Woods Hole species publications

Page 51: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Taxonomically intelligent scientific text parsing

Page 52: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

• Documents go to Internet Archive for OCR and storage

• The documents are added to the BHL collection

• uBio checks the BHL collection for new documents

• The documents are scanned for names

• TaxonFinder adds new strings to NameBank

• Document markup with anchors

• TaxonFinder adds all NameBank IDs to Taxonomic Index

• This index is called upon by various applications...
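
The workflow above can be sketched in a few lines. This is a toy illustration under stated assumptions: a naive regular expression plus a list of known genera stands in for name recognition (it is not TaxonFinder or FindIT), the `name_bank` dictionary is a hypothetical stand-in for NameBank, and the integer IDs are invented for the example.

```python
import re
from collections import defaultdict

# Naive stand-in for name recognition: a capitalized word followed by a
# lowercase word of three or more letters, e.g. "Loligo pealei".
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def find_candidate_names(text, known_genera):
    """Return binomial-looking strings whose first word is a known genus."""
    return [f"{genus} {epithet}"
            for genus, epithet in BINOMIAL.findall(text)
            if genus in known_genera]


def index_documents(documents, known_genera):
    """Scan OCR text for names, assign each new name string an ID, and
    record which documents mention each ID."""
    name_bank = {}                       # name string -> integer ID (hypothetical)
    taxonomic_index = defaultdict(set)   # ID -> set of document identifiers
    for doc_id, text in documents.items():
        for name in find_candidate_names(text, known_genera):
            name_id = name_bank.setdefault(name, len(name_bank) + 1)
            taxonomic_index[name_id].add(doc_id)
    return name_bank, taxonomic_index


documents = {
    "doc1": "The squid Loligo pealei is common in Woods Hole.",
    "doc2": "Loligo pealei and Escherichia coli appear here.",
}
name_bank, taxonomic_index = index_documents(documents, {"Loligo", "Escherichia"})
print(name_bank)
print(dict(taxonomic_index))
```

An application can then look up a name's ID and retrieve every document that mentions it, without rescanning the text.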

Page 53: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Biological Data Revolution

Biomedical Knowledge

Biodiversity Knowledge

Page 54: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Scientific Names*

No Complete List of Scientific Names

Published Variants

Escherichia coli

112,133 741,872 49,382

Objective Synonyms: Bacterium coli, Bacillus coli

Mis-spellings: Escheria coli

*Scientific Names ≠ Species

Page 55: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Taxonomic Knowledge

Page 56: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

BHL Vision

Information coming from Everywhere

Page 57: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

What role will libraries play once the scanning is done?

Will you be negotiators like you are now with serials?

Public domain publications restricted FOREVER by contract…. or open?

Page 58: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Road map

• US libraries -- 12 billion per year (OCLC)

• Acquisitions -- 3-4 billion per year

• 1% - could scan 1 million books/volumes per year

• Librarians will create informatics tools that will enhance indexing and organizing, not only for their own users but worldwide
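
The road map figures can be checked with a little arithmetic. The sketch below assumes the spending figure is in US dollars and reads the per-page rate quoted for the Northeast Digitization Center as ten cents; both are assumptions, and the function names are illustrative.

```python
# Figures quoted on the slides (currency assumed to be US dollars).
US_LIBRARY_SPENDING_PER_YEAR = 12_000_000_000  # "12 billion per year (OCLC)"
SCANNING_SHARE_PERCENT = 1                      # the "1%" on the road map
VOLUMES_PER_YEAR = 1_000_000                    # "1 million" books/volumes per year
COST_PER_PAGE_CENTS = 10                        # per-page rate, read as ten cents


def scanning_budget(total_spending: int, share_percent: int) -> int:
    """Annual scanning budget in dollars for a given percentage share."""
    return total_spending * share_percent // 100


def budget_per_volume(budget: int, volumes: int) -> int:
    """Dollars available per volume if the budget is spread over all volumes."""
    return budget // volumes


def pages_covered_per_volume(dollars_per_volume: int, cents_per_page: int) -> int:
    """Pages that the per-volume budget would pay for at the per-page rate."""
    return dollars_per_volume * 100 // cents_per_page


budget = scanning_budget(US_LIBRARY_SPENDING_PER_YEAR, SCANNING_SHARE_PERCENT)
per_volume = budget_per_volume(budget, VOLUMES_PER_YEAR)
pages = pages_covered_per_volume(per_volume, COST_PER_PAGE_CENTS)
print(f"budget: ${budget:,}; per volume: ${per_volume}; pages covered: {pages:,}")
```

Under these assumptions, 1% of spending is $120 million, or $120 per volume for one million volumes, which would pay for 1,200 pages per volume at ten cents a page. The one-million-volume estimate therefore leaves room for costs beyond scanning itself.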

Page 59: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

Acknowledgments

A.W. Mellon Foundation

Alfred P. Sloan Foundation

MacArthur Foundation

Martin Kalfatovic, Tom Garnett, Graham Higley, Connie Rinaldo

Neil Sarkar, David Remsen, David Patterson, Diane Rielinger

Lesveilleux.com

Page 60: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

URLS

www.biodiversitylibrary.org

www.eol.org

www.collections.stanford.edu/copyrightrenewal

www.ubio.org

The Public Domain: How to Find & Use Copyright-Free Writings, Music, Art & More by Attorney Stephen Fishman

Page 61: Biodiversity Heritage Library Project “Underway” Taxonomic Intelligence for LARGE Scale Digitization Projects

BHL Bid List for Serials

http://obsidian.nhm.ac.uk/test/library/bhlseriallist/

Taxa Toy