digitizing the legacy literature of biodiversity: an introduction to the biodiversity heritage...

TDWG 2006 Conference, St Louis. Digitizing the legacy literature of biodiversity: An introduction to the Biodiversity Heritage Library (BHL). Neil Thomson, Natural History Museum, London.

Upload: martin-kalfatovic

Post on 07-Jun-2015


DESCRIPTION

Neil Thomson, Natural History Museum, London. TDWG Meeting, October 17, 2006. St. Louis, Missouri.

TRANSCRIPT

Page 1: Digitizing the legacy literature of biodiversity: An introduction to the Biodiversity Heritage Library (BHL)

TDWG 2006 Conference, St Louis

Digitizing the legacy literature of biodiversity

An introduction to the Biodiversity Heritage Library (BHL)

Neil Thomson, Natural History Museum, London

Page 2

BHL origins and objectives

Encyclopedia of Life meeting at Telluride, 2003

Cost and storage possibilities

Natural history literature is an ideal digitization candidate

Aim: Available at point of use

Page 3

Scope and IPR

Public domain (pre-1923 in USA)

Legacy literature as complement to current material

Negotiation with societies and Not-For-Profits

Creative Commons licensing – some rights reserved

Page 4

Partners

10 Library partners

American Museum of Natural History

Field Museum

Harvard University Botany Library

Missouri Botanical Garden

Museum of Comparative Zoology, Ernst Mayr Library

National Museum of Natural History, Smithsonian Institution

Natural History Museum, London

New York Botanical Garden

Royal Botanic Gardens, Kew

Woods Hole Oceanographic Institution

Page 5

Associates

OCLC http://www.oclc.org/

Internet Archive http://www.archive.org/index.php

Others in negotiation

Page 6

Structure & funding

BHL is a founder member of the Open Content Alliance www.opencontentalliance.org/

Charitable status

English-language project

Register of intent

Funding

Page 7

Digitization phases

Bibliographic record pooling

Internet Archive Pod of 10 cameras

Boutique scanning of rare, fragile or oversize material

Metadata enhancement

Service building

Page 8

Digitization process

Pooled bibliographic records used for selection, matching and status

Page images and OCR

Addition of identifiers

Quality check

Return or offsite storage

Page 9

Metadata repository

Bibliographic record pool

Monographs

Serial titles

Article-level metadata

OCLC analysis

Page 10

Statistics - 1

Initial analysis showed:

We have 1.3 million catalogue records

73% are monographs (remainder are serials at title-level)

63% is English language material. The next most popular language (9%) is German.

About 30% of material was published before 1923.
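As a rough worked example, the percentages on this slide can be turned into approximate record counts. The sketch below assumes the rounded total of 1.3 million, so the results are indicative only; the function name and dictionary keys are illustrative and not part of the BHL analysis.

```python
# Rough arithmetic on the figures quoted on this slide.
# Assumption: the total is the rounded "1.3 million"; exact counts are not given.
TOTAL_RECORDS = 1_300_000

# Shares quoted on the slide, as whole percentages.
SHARES_PERCENT = {
    "monographs": 73,
    "english": 63,
    "german": 9,
    "pre_1923": 30,  # "about 30%" on the slide
}


def approximate_counts(total, shares_percent):
    """Convert each percentage share into an approximate record count."""
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


counts = approximate_counts(TOTAL_RECORDS, SHARES_PERCENT)
for name, count in counts.items():
    print(f"{name}: about {count:,} records")
```

Because both the total and the percentages are rounded, these estimates will not reproduce exact counts quoted elsewhere in the talk; the monograph estimate, for instance, comes out somewhat below the 981,000 monograph records used in the overlap analysis.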

Page 11

Statistics - 2

Overlap analysis

Of the 981,000 monograph records from all institutions, 378,000 matching pairs were found.

616,000 had no matches at all and were unique to one institution.

After de-duplication of the matching pairs, the final file contains 757,000 records.
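The overlap figures can be cross-checked with simple arithmetic, and the de-duplication step can be illustrated with a toy example. The sketch below is an illustration under stated assumptions, not the method used in the actual analysis: the match key (normalized title plus year), the function names, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Figures quoted on the slide.
MONOGRAPH_RECORDS = 981_000
UNIQUE_RECORDS = 616_000      # no match at any other institution
DEDUPLICATED_TOTAL = 757_000  # size of the final file

# Records that matched at least one other record.
matched_records = MONOGRAPH_RECORDS - UNIQUE_RECORDS
# Distinct titles those matched records collapse into.
merged_titles = DEDUPLICATED_TOTAL - UNIQUE_RECORDS
# Records dropped as duplicates.
duplicates_removed = MONOGRAPH_RECORDS - DEDUPLICATED_TOTAL


def match_key(record):
    """Hypothetical match key: lower-cased title with collapsed whitespace, plus year."""
    title = " ".join(record["title"].lower().split())
    return (title, record["year"])


def deduplicate(records):
    """Keep one entry per match key and list every holding institution."""
    holders = defaultdict(list)
    for record in records:
        holders[match_key(record)].append(record["institution"])
    return [
        {"title": title, "year": year, "institutions": sorted(insts)}
        for (title, year), insts in holders.items()
    ]


# Made-up sample records, for illustration only.
sample = [
    {"title": "Genera Plantarum", "year": 1754, "institution": "A"},
    {"title": "genera  plantarum", "year": 1754, "institution": "B"},
    {"title": "Systema Naturae", "year": 1758, "institution": "A"},
]
merged = deduplicate(sample)
print(len(sample), "records ->", len(merged), "after de-duplication")
```

On the slide's figures, 365,000 records matched at least one other record and collapsed into 141,000 titles, so 224,000 duplicate records were removed. Real bibliographic matching usually needs fuzzier keys than the exact match shown here.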

Page 12

Metadata development

Data standards

METS

DOIs

LSIDs

Indexes and taxonomic intelligence

Page 14

The future

What do scientists want from a digital library?

What will the BHL look like?

Page 15

http://bhl.si.edu/index.cfm