key terms and concepts: introducing first principles
Overview
digital imaging
resolution and bit depth
file types
colour management
metadata
digital libraries
digital preservation
Reminder: what can be delivered digitally
Born digital content
Paper
Text content
Bound volumes or manuscripts
Photographs – prints, slides, transparencies
Microfilm, microfiche and aperture cards
Video and audio
Maps, drawings and large paper formats
Original art works, textiles etc.
Physical 3-dimensional objects or views
Digitization is?
Digitization is the process of converting analogue originals to computer-readable form
Digital imaging and scanning are mechanisms for capturing a digital picture
A digital image is sampled and mapped as a grid of squares known as picture elements (pixels)
Digital files use a binary format with a series of ‘1’ and ‘0’, ‘on’ and ‘off’, to represent data
like a light switch ...
Tech joke: there are 10 types of people, those who understand binary and those who don’t
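As a small illustration of the binary idea above, the sketch below uses Python built-ins to write a number as ones and zeros and read it back. The value 202 is an arbitrary choice made for this example.

```python
# Every value in a digital file is ultimately stored as binary digits (bits).
value = 202                      # arbitrary example value
bits = format(value, "08b")      # the same value written out as 8 bits
print(bits)                      # 11001010

# Reading the bits back gives the original number.
assert int(bits, 2) == value

# The joke: "10" read as a binary number is two.
print(int("10", 2))              # 2
```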
Digitization processes: Scanning
Capturing lines of pixels moving across the object
Used in flatbed scanners, slide scanners and scanning back cameras for instance
Digitization processes: Digital Photography
Digital photography or direct digital capture: captures all the pixels in a single matrix
Used in digital cameras and bookscanners for instance
Digital text
Digital text requires additional processes:
Optical Character Recognition (OCR)
Rekeying
Mark-up: XML, SGML
Pixels redux
Pixels are picture elements
They are usually square
They are the smallest component of the digital image
By combining pixels in different orientations and density we get shapes and content
By changing the tonal values of pixels we get colour
i.e. resolution and bit depth
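To make the grid idea concrete, here is a minimal sketch of a bitonal image held as rows of pixels. The 5 x 5 pattern and the `render` helper are invented for this illustration only.

```python
# A tiny 5 x 5 bitonal image: 1 = black pixel, 0 = white pixel.
image = [
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]

def render(grid):
    """Draw the grid as text so the shape formed by the pixels is visible."""
    return "\n".join("".join("#" if px else "." for px in row) for row in grid)

print(render(image))
```

Changing which pixels are set changes the shape; allowing more than two values per pixel is what adds tone and colour.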
Resolution
describes the density of spatial detail
is usually expressed as dots-per-inch (dpi) or pixels-per-inch (ppi)
these terms are synonymous, but dpi usually refers to printed images and ppi to screen images
remember the spatial detail is in relation to the original item imaged
Resolution
It is often more useful to use absolute terms for resolution
actual pixel dimensions are given
2490 x 3510 for example
Equals the pixel dimensions of an A4 sheet of paper scanned at 300 dpi
But also equals (within 5%) the dimensions of:
A5 page at 425 dpi
A3 page at 200 dpi
A2 page at 150 dpi
8.7 Megapixel digital camera image of a landscape @ 96 dpi
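The relationship between paper size, dpi and pixel dimensions can be sketched as below. The paper sizes are the standard ISO 216 dimensions in millimetres; the function name is made up for this example. The results land close to the 2490 x 3510 quoted above rather than matching it exactly, which fits the approximate figures on the slide.

```python
MM_PER_INCH = 25.4

def pixel_dimensions(width_mm, height_mm, dpi):
    """Pixel width and height of an original scanned at the given dpi."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))

# ISO 216 paper sizes in millimetres (width, height).
PAPER_MM = {"A5": (148, 210), "A4": (210, 297), "A3": (297, 420), "A2": (420, 594)}

for name, dpi in [("A4", 300), ("A5", 425), ("A3", 200), ("A2", 150)]:
    w, h = pixel_dimensions(*PAPER_MM[name], dpi)
    print(f"{name} at {dpi} dpi: {w} x {h} pixels ({w * h / 1e6:.1f} megapixels)")
```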
Bit depth
Defines the colour space for each image and pixel
this is the number of bits (binary digits) used to define each pixel's tonal value
Black and white (bitonal) = 1-bit per pixel
Greyscale = 8-bit (256 shades of grey)
RGB Colour = 24-bit (16.7 million colour tones)
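The figures above follow from the fact that n bits can represent 2 to the power n distinct values. A minimal check, with a helper name chosen for this example:

```python
def tonal_values(bit_depth):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bit_depth

for label, bits in [("Bitonal", 1), ("Greyscale", 8), ("RGB colour", 24)]:
    print(f"{label}: {bits}-bit = {tonal_values(bits):,} possible values")
```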
Some rules of thumb
Resolution:
capture the smallest significant detail
the smaller the original the higher the resolution
double the resolution - quadruple the filesize
Bit depth:
1 bit = Black and white
8 bit = greyscale (x8 filesize)
24 bit = full RGB colour (x24 filesize)
CMYK: avoid using for scanning or storage
Select the right colour space for your original
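The file size rules of thumb can be checked with a rough calculation. This sketch assumes uncompressed pixel data only and ignores file headers and any row padding a real format may add; the function name and the 1000 x 1000 test image are choices made for this example.

```python
import math

def uncompressed_bytes(width_px, height_px, bit_depth):
    """Approximate size of raw pixel data: one value of bit_depth bits per pixel."""
    return math.ceil(width_px * height_px * bit_depth / 8)

base = uncompressed_bytes(1000, 1000, 8)       # greyscale reference image
doubled = uncompressed_bytes(2000, 2000, 8)    # same original at double the resolution
print(doubled / base)                          # 4.0

bitonal = uncompressed_bytes(1000, 1000, 1)
colour = uncompressed_bytes(1000, 1000, 24)
print(base / bitonal, colour / bitonal)        # 8.0 24.0
```

Doubling the resolution doubles both the width and the height in pixels, which is why the size quadruples rather than doubles.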
Digitization Basics: Tutorials
Cornell Digital Imaging Tutorial
www.library.cornell.edu/preservation/tutorial/contents.html
Digital files
Can use compression to reduce file sizes. There are 2 main types:
Lossy
there is irrecoverable loss of data with inevitable worsening of quality, but can achieve considerable size reductions
e.g. JPEG
Lossless
no loss of data, but not such great size reductions
e.g. LZW, ITU-T T.6 (formerly CCITT Group 4)
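The difference between the two types can be shown with stand-ins from the Python standard library. Note the assumptions: `zlib` (Deflate) is used here in place of LZW or Group 4, and discarding low-order bits is only a crude stand-in for what JPEG does. The synthetic data is invented for the example.

```python
import zlib

# Synthetic 8-bit greyscale data: a ramp of tones repeated many times.
original = bytes(i % 256 for i in range(10_000))

# Lossless: after decompression the data is byte-for-byte identical.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
assert restored == original
print(len(original), "->", len(packed), "bytes, nothing lost")

# Lossy (crude stand-in): throw away the 4 least significant bits of each value.
quantised = bytes(v & 0xF0 for v in original)
assert quantised != original      # the discarded detail cannot be recovered
print(len(zlib.compress(quantised)), "bytes after discarding detail")
```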
Some common file formats
There are many, many file formats
The commonest you will meet are probably:
TIFF
GIF
JPEG
PDF
TIFF: Tagged Image File Format
De facto standard
Needs plug-in or external application for web display although some browsers now accept it
Can be tagged with basic metadata
Can be used for files up to a bit depth of 64
The format of choice for long-term archiving
JPEG:
Joint Photographic Experts Group/JFIF (JPEG File Interchange Format)
De facto standard for web display
Native to web browsers (i.e. no plug-ins needed)
Has free-text comment field for metadata
Can be used for files up to 24 bit
Commonly used for web display images
JPEG2000 – enables zooming and more metadata
GIF: Graphics Interchange Format
De facto standard for web display
Native to web browsers (i.e. no plug-ins needed)
Has free-text comment field for metadata
Can be used for files up to 8 bit
Commonly used for web display images
Likely to be replaced by PNG (Portable Network Graphics)
PDF: Portable Document Format
Proprietary (Adobe) format, but now a de facto standard for document delivery
Needs plug-in or external application for web display
Can be used for files up to 64 bit
Used for printing and viewing multipage documents
Comes in 3 versions:
Image only
Image and text
Full text
Colour Management: What is it?
Colour is device dependent and looks different when:
printed on different printers
viewed on different monitors
printed on a printer and viewed on a monitor
viewed in a light booth and under office lighting
Colour Management Systems (CMS) maintain the consistent and accurate "appearance" of a colour on different devices (e.g. scanners, monitors, printers, etc.) throughout an imaging workflow
"Colour" Workflow
[Diagram: the RGB scanner passes scanner RGBs to the original application, which displays them on the RGB display; the driver sends RGBs or CMYKs to the CMYK printer]
Colour Management: components
Use a consistent colour space
Apply an independent colour profile
International Color Consortium: www.color.org
Monitor calibration
Colour targets
GretagMacbeth: www.gretagmacbeth.com
What is metadata?
Tony Gill – ARTstor
Metadata refers to structured descriptions, stored as computer data, that attempt to describe the essential properties of other discrete computer data objects.
Big picture definition: the sum total of what can be said about any information object at any level of aggregation
What is metadata for?
The World Wide Web Consortium says metadata is:
to provide a means to discover that the data set exists and how it might be obtained or accessed
to document the content, quality, and features of a data set, indicating its fitness for use.
Therefore we need to think: content, context and structure
What characterises a digital library
1. A digital library is a managed collection of digital objects
2. The digital objects are created or collected according to principles of collection development
3. The digital objects are made available in a cohesive manner, supported by services necessary to allow users to retrieve and exploit the resources just as they would any other library materials
4. The digital objects are treated as long-term stable resources and appropriate processes are applied to them to ensure their quality and survivability.
What is collection development?
American Library Association's definition:
"A term which encompasses a number of activities related to the development and determination of the collection, including the determination and coordination of selection policy, assessment of needs of users and potential users, collection evaluation, identification of collection needs, selection of materials, planning for resource sharing, collection maintenance, and weeding."
(ALA Glossary of Library & Information Science)
Digital Preservation: digital lifecycle approach
‘The major implications for lifecycle management of digital resources, whatever their form or function, is the need to actively manage the resource at each stage of its lifecycle and to recognise the interdependencies between each stage and commence preservation activities as early as practicable. This represents a major difference with traditional preservation, where management is largely passive until detailed conservation work is required, typically many years after creation and rarely, if ever, involving the creator. There is an active and interlinked lifecycle to digital resources which has prompted many to promote the term 'continuum' to distinguish it from the more traditional and linear flow of the lifecycle for traditional analogue materials.’
Preservation Management of Digital Materials: A Handbook - Neil Beagrie & Maggie Jones www.jisc.ac.uk/dner/preservation/dpc/