Digital Documents and Libraries
Adolf Knoll
National Library of the Czech Republic
© Adolf Knoll, National Library of the Czech Republic
Communication
[Diagram: a creator of information delivers information to a receiver of information; multiplying information delivers it to many receivers.]
What is transmitted?
Information that can be perceived by human senses:
• smell (we can smell cooked meals)
• taste (we can taste wine)
• touch (we can touch various objects to discover their properties)
• hearing (we can hear information, e.g. music)
• sight (we can see objects)
What is emitting information?
Physical Object
Artefact
• It smells good
• It feels cool when touched
• It can be seen
• It tastes good
• It can be heard when opened and even when poured into glasses
However, what remains of this information if the artefact and the action are encoded in a document?
What is a document?
A surrogate of reality: its reflection fixed (written) on a physical carrier.
… and if it is sound, how does it work for me?
It makes me remember specific moments of my personal life.
What can a document transmit?
The document bridges two worlds: that of its creator and that of the receiver of the information message.
Each receiver processes the same information message differently: during its interpretation, he uses his knowledge, his experience, his own feelings.
Thus, the results differ from one another.
[Diagram: original document and recreated original; labels: basic message, added information.]
Multiplying information
Growing possibilities:
• manual copying (from inscriptions in stone to manuscripts produced in scriptoria)
• printing (the number of copies grows)
• analogue copying (Xerox, audio recordings, motion pictures, ...)
• digital copying and delivery
Problems
• Fidelity of copied information
• Durability of information carrier
• Decoding process
These factors have a strong influence on our understanding of what the creator wanted to share with us.
Fidelity
Digital copying can solve the fidelity problem, because we are able to copy all the bits.
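One way to check that all the bits were copied is to compare checksums of the master and the copy. The sketch below is only an illustration: the file names, the sample data, and the choice of SHA-256 are assumptions, not something prescribed by these slides.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, target):
    """Copy `source` to `target` and report whether the copy is bit-identical."""
    shutil.copyfile(source, target)
    return sha256_of(source) == sha256_of(target)


def demo():
    """Create a small sample file (hypothetical data), copy it, verify the copy."""
    with tempfile.TemporaryDirectory() as folder:
        source = os.path.join(folder, "master.bin")
        target = os.path.join(folder, "copy.bin")
        with open(source, "wb") as handle:
            handle.write(bytes(range(256)) * 100)
        return copy_and_verify(source, target)
```

Calling `demo()` returns `True` when the copy matches the master bit for bit; a mismatch would indicate a faulty copy or a damaged carrier.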
However, digitization is not copying, it is re-encoding of information into another system.
Digitization is conversion of the analogue scheme into the digital scheme.
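The re-encoding can be illustrated with a minimal sketch of sampling and quantization, assuming a signal with values between -1 and 1. The function names, the test tone, and the parameters are hypothetical; the point is only that the digital values differ slightly from the analogue ones, and that the difference depends on the chosen parameters.

```python
import math


def digitize(signal, duration, sample_rate, bits):
    """Sample `signal` (a function of time with values in [-1, 1]) and
    quantize every sample to an integer code of `bits` bits."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)            # distance between two quantization levels
    count = int(duration * sample_rate)  # number of samples taken
    codes = [round((signal(n / sample_rate) + 1.0) / step) for n in range(count)]
    return codes, step


def reconstruct(codes, step):
    """Turn the integer codes back into values in [-1, 1]."""
    return [code * step - 1.0 for code in codes]


def max_error(signal, duration, sample_rate, bits):
    """Largest difference between the analogue value and its digital
    representation at the sampled instants."""
    codes, step = digitize(signal, duration, sample_rate, bits)
    values = reconstruct(codes, step)
    return max(abs(signal(n / sample_rate) - v) for n, v in enumerate(values))


def tone(t):
    """Hypothetical analogue source: a 5 Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)
```

Comparing `max_error(tone, 1.0, 100, 4)` with `max_error(tone, 1.0, 100, 8)` shows that more bits per sample give a smaller error, but the error does not disappear.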
Any change of parameters of the digital format or any conversion
from one digital format into another can affect the fidelity.
Change of parameters
• Digital audio recording, reducing the bit rate: MP3 128 kbit/s, MP3 34 kbit/s
• Digital image, reducing the number of colours: 256 colours, 16 colours, 2 colours
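The effect of reducing the number of colours can be sketched on a single channel. The example assumes 8-bit grey values (0 to 255) as a simplification of a colour image; `reduce_levels` is a hypothetical helper, not part of any image library.

```python
def reduce_levels(pixels, levels):
    """Map 8-bit values (0-255) onto `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    # Snap every value to the nearest of the allowed levels.
    return [round(round(p / step) * step) for p in pixels]


gradient = list(range(256))          # hypothetical image: a smooth gradient

as_256 = reduce_levels(gradient, 256)
as_16 = reduce_levels(gradient, 16)
as_2 = reduce_levels(gradient, 2)

# Number of distinct values that survive each reduction.
surviving = [len(set(as_256)), len(set(as_16)), len(set(as_2))]
```

With 256 levels the gradient is unchanged; with 16 or 2 levels many different original values collapse into one, and the original can no longer be recovered from the reduced version.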
Converting between digital formats
Information can be lost in this conversion.
To avoid this, we must know the characteristics of the most widely used digital formats.
E.g. when converting from JPEG into GIF, we reduce the number of colours.
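Knowledge of format characteristics can be written down and checked before a conversion. The table below is a deliberately simplified assumption (typical maximum colour depth without transparency, and whether the compression is lossy); `conversion_risks` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageFormat:
    name: str
    max_bits_per_pixel: int    # simplified: typical maximum colour depth
    lossy_compression: bool


# Simplified characteristics of three common formats.
FORMATS = {
    "JPEG": ImageFormat("JPEG", 24, True),   # 8 bits per channel, lossy
    "GIF": ImageFormat("GIF", 8, False),     # palette of at most 256 colours
    "PNG": ImageFormat("PNG", 48, False),    # up to 16 bits per channel
}


def conversion_risks(source, target):
    """List the ways in which converting `source` into `target` can lose information."""
    src, dst = FORMATS[source], FORMATS[target]
    risks = []
    if dst.max_bits_per_pixel < src.max_bits_per_pixel:
        risks.append(
            f"colour depth drops from {src.max_bits_per_pixel} "
            f"to {dst.max_bits_per_pixel} bits per pixel"
        )
    if dst.lossy_compression:
        risks.append(f"{dst.name} uses lossy compression")
    return risks
```

For example, `conversion_risks("JPEG", "GIF")` reports the reduced colour depth, while `conversion_risks("GIF", "PNG")` returns an empty list.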
Durability of information carriers
In the analogue world, this is the most critical problem.
In the digital world, it is a problem of good organization of the digital archiving system.
The problem can be the large number of carriers to be monitored. CDs offer a certain reliability: the redundancy of the recorded information can be measured, and the copy should be made before the critical point is reached.
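Monitoring many carriers can be organized as a simple rule: recopy a carrier as soon as its measured error rate approaches a critical value. In the sketch below the carrier identifiers, the error rates, the critical value, and the safety margin are all hypothetical; real limits depend on the medium and on the measuring equipment.

```python
def carriers_to_recopy(error_rates, critical_rate, safety_margin=0.8):
    """Return the ids of carriers whose measured error rate has reached
    `safety_margin` times the critical rate, worst carrier first."""
    limit = critical_rate * safety_margin
    flagged = [(rate, carrier) for carrier, rate in error_rates.items() if rate >= limit]
    flagged.sort(reverse=True)   # highest error rate first
    return [carrier for rate, carrier in flagged]


# Hypothetical measurements: carrier id -> measured error rate.
measurements = {"CD-001": 20, "CD-002": 180, "CD-003": 210}

urgent = carriers_to_recopy(measurements, critical_rate=220)
```

Here `CD-003` and `CD-002` are flagged for copying, because both are above 80 % of the assumed critical rate, while `CD-001` can stay in the monitoring cycle.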
Decoding process
The decoding process is the most critical point in both analogue and digital worlds.
The only difference is that the development of digital schemes and platforms is faster.
This is an Egyptian papyrus; we can still read it and understand it.
Latin
Iste liber pertinet ad dominam abbatissam de convento Georgio Chunegundem filiam Othakari Regis Bohemie.
Digital resources
Digital Library:
• On-line external documents
• In-house digitization products
• Physical media documents
Physical media documents
• Issued by various publishers
• CD, DVD, diskette
• Great variety of application/access software versus uniformity of audio CD
• National libraries face their long-term preservation
• The documents also include software releases and games
Physical media documents: critical factors
• Information carrier: unreliable diskette and relatively reliable CD (possibility to measure the data redundancy)
• Access software: very critical
• Data and metadata formats: very critical
• SW/HW platform dependence
We must try to keep these factors as independent of one another as possible.
On-line external documents
• We do not acquire them physically; we access them over networks
• Should we, as a national library, try to preserve them?
• What is their storage format?
• What we see is frequently a transient document formatted for us (often into HTML), or only a part of a static source.
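The idea of a transient document can be sketched as a page that is generated from a data source at the moment of the request. The records, the field names, and the `render_page` function are hypothetical and serve only as an illustration.

```python
import html

# Hypothetical data source: the stored records are not HTML at all.
RECORDS = [
    {"title": "Chronicle of Bohemia", "year": 1541},
    {"title": "Map of Prague", "year": 1562},
    {"title": "Chronicle of Moravia", "year": 1593},
]


def render_page(records, query):
    """Build an HTML page for one request; it exists only as a response."""
    matches = [r for r in records if query.lower() in r["title"].lower()]
    items = "".join(
        f"<li>{html.escape(r['title'])} ({r['year']})</li>" for r in matches
    )
    return (
        f"<html><body><h1>Results for {html.escape(query)}</h1>"
        f"<ul>{items}</ul></body></html>"
    )


page_a = render_page(RECORDS, "chronicle")
page_b = render_page(RECORDS, "map")
```

The two pages differ although the data source is the same, and neither of them is stored anywhere. Saving such a page preserves one view of a part of the source, not the source itself.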
On-line external documents
[Diagram: a request is sent to the data source; the delivery is a partial or transient document.]
In-house digitization and publications
• A great chance to act wisely
• Consider all the possible long-term access problems before starting
• This can cost more now, but it will surely save much more money in the future
• The most general recommendation: STICK TO THE MOST WIDESPREAD STANDARDS for developing your applications.