The Luminary Library Experience: Large scale digitization at Toronto Public Library
Agenda
• Introduction
• Background
• The project
• Current status
• Implementation
• Issues and lessons
• Next steps
Uncharted waters
• Microfilming
• The Internet
• Digitization of collections
• Scaling up digitization – Large scale versus mass
Definitions
• Mass digitization
- Conversion of whole libraries
- Minimal human intervention
- Google, Internet Archive
• Large-scale digitization
- Creation of collections
- Complete document sets
- JSTOR, ECO
Robotic equipment
New opportunities
• Treventus
• Qidenus
• 4DigitalBooks
• Kirtas
http://digitisation.jiscinvolve.org/files/2008/10/automated-book-scanners-munich-2008-final.pdf
Before new technology
• Books needed to be unbound or flattened
• Preservation issues
• Labour intensive and high cost
With new technology
• Provides open-book capture, robot page-turning, dual camera capture, processing software
• Reduced preservation issues
• Less human intervention on standard items
The Luminary Library Project
• Partnership 2007
• Kirtas, Ristech and Amazon with various university and public libraries (initially University of Maine, Emory University, Cincinnati Public Library, and TPL)
• Digitize for Print-on-Demand (POD)
Our goal
To digitize and make available, both freely online and for print-on-demand, 10,000 volumes of our pre-Confederation Canadian imprints over the next five years.
• Driver is access
• Primary sources exposed in new ways to new users
• Users can search for and purchase titles via Amazon
• Library receives royalties for each copy reprinted
• Challenges of the collection
• Workflow
• Web-ready versus print-ready
Print on Demand (POD)
• Library digitizes and sends books with metadata to Kirtas
• Kirtas processes and sends content to Amazon
• Amazon loads content on “Historical Reproductions” – printed books are available for purchase
• TPL retains copies of files, can repurpose its content as desired
Current status
• Orientation to production
• 1,000 books delivered as of end 2008
• 650 books online on Amazon.com
• Languages – English 78%, French 18%, German 1.7%, Other 2.3%
Project Implementation
• Kirtas video
• Collection Selection
• Digitization Process
• Issues and Lessons Learned
Collection Selection
• Focus on Special Collections – Baldwin Room
• Copyright-free, public domain
• Pre-Confederation Canadian imprints
• Almost 10,000 titles documented in A bibliography of Canadiana (5 vols., 1934-89)
• Many well known titles 1512-1867
• Detailed guidelines for selecting – constraints from Amazon, Kirtas, TPL
• Many challenges with historical books (e.g. fragile, heterogeneous, foldouts, ...)
Digitization Process
Before Digitization
• Acquiring ISBNs
• Access database designed for:
- tracking items through the workflow
- entering metadata (Special Collections and Digitization staff)
- identifying each item uniquely (barcode, bibliography number)
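The three roles of the tracking database listed above (workflow tracking, metadata entry, unique identification) can be illustrated with a small sketch. The slides do not show the actual Access schema, so the table name, column names, and stage names below are hypothetical, and SQLite stands in for Access purely for illustration.

```python
import sqlite3

# Hypothetical workflow stages, loosely based on the slide titles in this deck.
STAGES = ["selected", "scanned", "edited", "quality_checked", "delivered"]


def create_tracking_db(conn):
    """Create a one-table tracking database (schema is an assumption)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS items (
            barcode             TEXT PRIMARY KEY,   -- unique item identifier
            bibliography_number TEXT NOT NULL,      -- bibliography entry number
            isbn                TEXT,               -- assigned before digitization
            title               TEXT,               -- descriptive metadata
            stage               TEXT NOT NULL DEFAULT 'selected'
        )
        """
    )


def add_item(conn, barcode, bibliography_number, title, isbn=None):
    """Register a book; a duplicate barcode raises sqlite3.IntegrityError."""
    conn.execute(
        "INSERT INTO items (barcode, bibliography_number, isbn, title) "
        "VALUES (?, ?, ?, ?)",
        (barcode, bibliography_number, isbn, title),
    )


def current_stage(conn, barcode):
    """Return the workflow stage of an item, or raise KeyError if unknown."""
    row = conn.execute(
        "SELECT stage FROM items WHERE barcode = ?", (barcode,)
    ).fetchone()
    if row is None:
        raise KeyError(barcode)
    return row[0]


def advance(conn, barcode):
    """Move an item to the next stage; the final stage is left unchanged."""
    position = STAGES.index(current_stage(conn, barcode))
    next_stage = STAGES[min(position + 1, len(STAGES) - 1)]
    conn.execute(
        "UPDATE items SET stage = ? WHERE barcode = ?", (next_stage, barcode)
    )
    return next_stage
```

Using the barcode as the primary key is one way to enforce that each item is identified uniquely; the real database may have combined barcode and bibliography number differently.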
Digitization Process
Access Database
Digitization Process
Workflow – Digital Conversion
• Examination of books
• Setup in Kirtas APT BookScan 2400
• Software setup – APT Manager
• Scanning – automated vs. manual
• Operator tasks
• Troubleshooting problems (e.g. multiple pages turned at the same time)
Digitization Process
Kirtas APT BookScan 2400 – TPL workstation
Digitization Process
Kirtas APT Manager software
Digitization Process
Workflow – Image Editing
• Output of scanning = raw JPEG images
• Templating right and left-hand pages – BookScan Editor software
• Batch processing multiple books – SuperBatch
• Output of batching = bitonal TIFF images
Digitization Process
Kirtas BookScan Editor software – templating raw images
Raw JPEG image
cropping raw image
Digitization Process
Workflow – Image Quality Control
• Initial processing includes crop, contrast, deskew
• Editing bitonal TIFFs after SuperBatch
• Global and individual page adjustments in BookScan Editor software
• Historical books require more editing!
Digitization Process
Kirtas BookScan Editor software – editing bitonal TIFF images
BEFORE and AFTER
Digitization Process
Workflow – Delivery
• Metadata generation – ONIX XML file
• Conversion of bitonal TIFFs to a single PDF
• Delivery of PDFs, metadata to Kirtas weekly via FTP
Statistics Snapshot for 2008 - averages
• No. pages/book = 159
• No. books delivered/week = 27.4
• No. staff FTE devoted to project = 1.7
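As a rough worked example, the 2008 averages above can be combined into weekly throughput figures. This is only back-of-the-envelope arithmetic: it assumes the averages hold steady from week to week, and the function names are illustrative. The 10,000-volume goal and the 1,000 books delivered by end 2008 come from earlier slides.

```python
# 2008 averages as stated on the slide.
PAGES_PER_BOOK = 159
BOOKS_PER_WEEK = 27.4
STAFF_FTE = 1.7


def pages_per_week():
    """Average number of pages delivered per week."""
    return PAGES_PER_BOOK * BOOKS_PER_WEEK


def books_per_fte_per_week():
    """Average number of books delivered per week per full-time equivalent."""
    return BOOKS_PER_WEEK / STAFF_FTE


def weeks_to_reach(target_books, already_delivered=0):
    """Weeks needed to reach a target, assuming the 2008 rate stays constant."""
    remaining = max(target_books - already_delivered, 0)
    return remaining / BOOKS_PER_WEEK


if __name__ == "__main__":
    print(f"Pages per week: {pages_per_week():.1f}")
    print(f"Books per FTE per week: {books_per_fte_per_week():.1f}")
    print(f"Weeks to 10,000 volumes: {weeks_to_reach(10_000, 1_000):.0f}")
```

At the stated averages this comes to about 4,357 pages per week. The weeks-to-target figure is only as good as the constant-rate assumption, which the ramp-up challenges described in the deck suggest may not hold.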
Issues and Lessons Learned
New Technology
• Adopting new hardware and software
• Steep learning curve for staff
Historical Collections
• Many scanning challenges: fragility, size, foldouts, faded text, bleedthrough, ...
• Numerous books did not meet guidelines
• Order of production did not match original plan
• More manual than automated page turning
Historical Collections - Issues
Issues and Lessons Learned
Cataloguing and Metadata
• Missing information in MARC records
• Spine title a librarian decision for each book
• Pricing model decisions
Production Levels
• Numerous ramp-up challenges 2007-08
• Nature of materials prevents higher levels of production using automated page turning
• Average unit time to produce one book is 2.2 hours
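The 2.2-hour unit time can be set against the delivery statistics reported earlier in the deck (27.4 books per week, 1.7 staff FTE). The sketch below combines those figures. It assumes the unit time is staff time rather than elapsed time, which the slides do not state, and the function names are illustrative.

```python
# Figures taken from the slides; combining them is an assumption of this sketch.
HOURS_PER_BOOK = 2.2   # average unit time to produce one book
BOOKS_PER_WEEK = 27.4  # average books delivered per week in 2008
STAFF_FTE = 1.7        # staff FTE devoted to the project


def production_hours_per_week():
    """Total production hours implied by the unit time and weekly output."""
    return HOURS_PER_BOOK * BOOKS_PER_WEEK


def production_hours_per_fte():
    """Weekly production hours per FTE, if all unit time is staff time."""
    return production_hours_per_week() / STAFF_FTE


if __name__ == "__main__":
    print(f"Production hours per week: {production_hours_per_week():.2f}")
    print(f"Production hours per FTE per week: {production_hours_per_fte():.2f}")
```

Under these assumptions the weekly output implies roughly 60 production hours, or about 35 hours per FTE.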
Issues and Lessons Learned
Communication
• Open communication, planning is key
• Alignment across departments of understanding, resources, expectations
• Staff dedicated to both collections and access
Partnership
• Library/vendor relationship is still new
• Different kind of partnership – centralized
• Mutual support among libraries is growing
• Challenges of working with a large commercial organization
Implications/Next steps
• User group
• User interface and OCR
• Production 2009
• Explore DOD options (digitization-on-demand)
• Repurposing of Kirtas content on TPL website
Questions?
• Johanna Wellheiser
Manager Preservation and Digitization Services, Toronto Public Library
[email protected]
• Andrew Lofft
Dept. Head, Canadiana Dept., Toronto Public Library