Austrian Books Online - The Public Private Partnership of the Austrian National Library with Google
DESCRIPTION
Presentation at the European Business Press Editors’ Seminar, Vienna, 26 March 2014
TRANSCRIPT
@maxkaiser
Austrian Books Online
Max Kaiser
Head of Research and Development, Austrian National Library
[email protected]
European Business Press Editors’ Seminar
Vienna, 26 March 2014
The Public Private Partnership of the Austrian National Library with Google
history back to the 14th century
one of the world's most significant collections
Source: http://commons.wikimedia.org/wiki/File:Austria_Hungary_ethnic_de.svg
"legal deposit"
→ Picture Archives and Graphics Department
→ Map Department
→ Music Department
→ Literary Archives
→ Papyri Department
→ Department of Planned Languages
→ Department of Rare Books and Manuscripts
→ State Hall
→ Papyrus Museum
→ Globe Museum
→ Esperanto Museum
collect, preserve, describe, make available, foster research
September 2012
http://www.onb.ac.at/vision2025
Vision 2025: Knowledge for the world of tomorrow
Our holdings are digitized
We collect and sustain knowledge
Access to our knowledge is simple
With us, research is more faceted and effective
We enrich cultural and social life
→ substantial part of our book collections digitised
→ full-text search
→ important parts of other collections digitised
→ all our services are digital
our holdings are digitised (2025)
→ focal point of our collection policy is digital
→ collect user-generated content and new digital formats
→ scalable system for digital long-term preservation
we collect and sustain knowledge (2025)
→ enrich metadata and connect with semantic web
→ link with external metadata (e.g. geo data)
→ build innovative (e.g. visual) interfaces
→ Open (Linked) Data
access to knowledge is simple (2025)
→ digital content integrated into virtual research environments
→ tailored digital services for researchers
→ digital humanities
→ crowdsourcing
with us, research is more faceted and effective (2025)
we enrich cultural and social life
→ digital services and reading rooms and museums
→ reinforce library as social space
→ foster user participation with our digital resources
→ user-generated content
we enrich cultural and social life (2025)
access for everyone from anywhere
Austrian Books Online
Austrian Books Online: www.onb.ac.at/ev/austrianbooksonline/
digitisation of the entire historical book holdings of the Austrian National Library
largest Austrian public private partnership in the cultural sector
600,000 volumes, 200 million pages
Google Books
Digital Library
Austrian National Library
Google Books: Partner Program, Library Program
13 libraries in Europe
5 national libraries: Italy, Austria, the Netherlands, Czech Republic, Great Britain
> 20 million books
> 50% non-English
~ 75% from libraries
~ 2 million books from European libraries
> 3 million public domain books
→ long duration of the cooperation
→ substantial investment by both partners
→ distribution of responsibilities and risks
→ intellectual property rights
→ public domain works only
→ non-exclusivity
→ ONB free to digitise material with other partners
→ transparency of process and agreement
→ public tender
→ detailed online FAQs
→ access
→ all files available free of charge for non-commercial use
→ access via platforms like Europeana
→ provision to research partners
who is paying for what?
http://www.bildarchivaustria.at/downl/1148453/layout/CE%2043_3.jpg
Google:
→ transport
→ insurance
→ scanning
→ OCR
→ image processing
→ quality control
→ Google Books
Austrian National Library:
→ provision of metadata
→ selection
→ internal logistics
→ conservational assessment
→ barcoding
→ metadata adjustments
→ data download and control
→ data storage & digital preservation
→ Digital Library
70+ ONB staff members
20+ exclusively for project
→ book logistics
→ metadata adaptation
→ cataloguing
→ conservation / restoration
→ quality control
→ software implementation
→ project management
entire historical book holdings
16th–19th century
200,000 volumes
State Hall
Source: http://deu.archinform.net/projekte/10734.htm
Department of Manuscripts and Rare Books
Map Department
Department of Music
Source: http://commons.wikimedia.org/wiki/File:Palais_Lobkowitz_Vienna_Oct._2006_006.jpg
Theatre Museum
Fidei Commiss Library
Workflow
"book flow"
"digital flow"
book flow
no individual selection …
size
condition
conservational evaluation
value
logistics in the State Hall
challenges …
logistics in the "Aurum" Depot
preparation for digitisation
manipulation area …
adaptation of metadata
8 minutes / volume
600,000 books
80,000 hours
10,256 working days
48.8 person years
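The effort figures above (8 minutes per volume, 600,000 books) can be retraced with a short calculation. This is only a sketch, not the library's own planning model: the 7.8-hour working day and the 210 working days per person year are assumptions, inferred backwards from the slide figures rather than stated in the presentation.

```python
# Effort estimate for metadata adaptation, retracing the slide figures.
# HOURS_PER_DAY and DAYS_PER_PERSON_YEAR are assumptions inferred from
# the slides (80,000 hours -> 10,256 working days -> 48.8 person years).
MINUTES_PER_VOLUME = 8
VOLUMES = 600_000
HOURS_PER_DAY = 7.8           # assumed length of a working day
DAYS_PER_PERSON_YEAR = 210    # assumed working days per person year


def effort(volumes: int, minutes_per_volume: float) -> dict:
    """Convert a per-volume handling time into hours, days and person years."""
    hours = volumes * minutes_per_volume / 60
    working_days = hours / HOURS_PER_DAY
    person_years = working_days / DAYS_PER_PERSON_YEAR
    return {
        "hours": hours,
        "working_days": working_days,
        "person_years": person_years,
    }


estimate = effort(VOLUMES, MINUTES_PER_VOLUME)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the result matches the slides after rounding; changing the minutes per volume shows how sensitive the total effort is to small per-book savings.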
complex cases …
bound-togethers …
conservational protection
cataloguing the Fidei Commiss Library
ready for digitisation …
digitisation
→ scanning center in Germany
→ procedures agreed
→ Austrian Federal Office for Monuments involved
→ each volume checked after return
→ books unavailable to users for ~ 3 months
book flow / digital flow
digitisation
data download
book logistics
quality control
storage
access
ADOCO (Austrian Books Online Download & Control)
quality control
→ goal: automated jobs
→ representative samples
→ IT-assisted discovery of error clusters
→ error candidates checked manually
→ detect systematic and critical errors
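The quality control steps listed above could be sketched roughly as follows. This is a hypothetical illustration, not the ADOCO implementation: the function names, the page identifiers, the threshold of three flagged pages and the stand-in cropping check are all assumptions made for the example.

```python
import random
from collections import Counter


def error_clusters(flagged_pages, min_pages=3):
    """Group automatically flagged pages by volume. A volume with many
    flagged pages hints at a systematic error (assumed threshold)."""
    per_volume = Counter(volume for volume, _page in flagged_pages)
    return {vol: n for vol, n in per_volume.items() if n >= min_pages}


def sample_pages(pages, sample_size, seed=0):
    """Draw a reproducible random sample of pages for manual checking."""
    rng = random.Random(seed)
    return rng.sample(pages, min(sample_size, len(pages)))


# Hypothetical data: (volume_id, page_number) pairs for five volumes.
pages = [(f"vol{v}", p) for v in range(1, 6) for p in range(1, 101)]


def looks_cropped(volume, page):
    """Stand-in for an automated check; here we simply pretend that the
    first ten pages of vol3 were cropped badly."""
    return volume == "vol3" and page <= 10


# 1. automated job flags error candidates
flagged = [(v, p) for v, p in pages if looks_cropped(v, p)]
# 2. discovery of error clusters
candidates = error_clusters(flagged)
# 3. a sample of the candidates goes to manual checking
in_clusters = [(v, p) for v, p in flagged if v in candidates]
manual_queue = sample_pages(in_clusters, sample_size=5)

print(candidates)
print(manual_queue)
```

The point of the design is that staff only look at a small queue of likely problems, while the clustering step separates isolated glitches from systematic errors that would justify re-processing a whole volume.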
bleed-through: non-critical
cropping error: critical!
quality control via sampling
re-processing
re-download
cropping error: fixed!
~215,000 volumes digitised (March 2013)
~68.5 million pages (March 2013)
centuries, Austrian Books Online:
16th century: 10%
17th century: 13%
18th century: 31%
19th century: 44%
no year: 2%
languages, Austrian Books Online:
eng: 3%
ita: 12%
fre: 14%
lat: 29%
ger: 33%
others: 9%
Chart, Austrian Books Online: share of languages (eng, ita, fre, lat, ger) per century (16th to 19th), axis from 0% to 70%
Catalogue / “Quick Search”
full-text search
ABO Book Viewer
ANNO newspaper portal
ABO Book Viewer
outlook
→ full-text: new possibilities for research
→ e.g. named entities search
→ data enrichment
→ linked data
→ new data-centric research in the Humanities & Social Sciences
critical mass of digitally available texts and (meta)data
new research questions to textual material?
Data
ÖNB Hadoop cluster
close reading: interpretation / analysis / edition of individual texts
distant reading: analysis of Big Data, text mining
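As a minimal illustration of what "distant reading" means in practice, the sketch below counts word frequencies across several texts in a map and reduce style, the processing pattern a Hadoop cluster distributes over many servers. It runs in plain Python on a single machine and is not the ÖNB's actual pipeline; the two sample texts and all function names are invented for the example.

```python
import re
from collections import Counter
from functools import reduce


def map_tokens(text: str) -> Counter:
    """Map step: count lower-cased word tokens in one OCR text."""
    return Counter(re.findall(r"\w+", text.lower()))


def reduce_counts(a: Counter, b: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return a + b


# Hypothetical OCR snippets standing in for digitised volumes.
corpus = {
    "vol1": "Wien ist die Hauptstadt. Wien liegt an der Donau.",
    "vol2": "Die Donau fließt durch Wien.",
}

# Each volume is counted independently (map), then the partial
# results are merged (reduce).
partial_counts = (map_tokens(text) for text in corpus.values())
totals = reduce(reduce_counts, partial_counts, Counter())

print(totals.most_common(3))
```

Because each volume is counted on its own before merging, the map step can be spread across machines, which is what makes this kind of analysis feasible for hundreds of thousands of volumes.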
Diagram: metadata, digitised collections, data; servers (data processing); tools
thank you!
[email protected]
www.onb.ac.at
twitter.com/maxkaiser
www.linkedin.com/in/maxkaiser
plus.google.com/+maxkaiser1