Key terms and concepts: introducing first principles


Upload: emory-dickerson

Post on 26-Dec-2015


Overview

digital imaging

resolution and bit depth

file types

colour management

metadata

digital libraries

digital preservation

Reminder: what can be delivered digitally

Born digital content

Paper

Text content

Bound volumes or manuscripts

Photographs – prints, slides, transparencies

Microfilm, microfiche and aperture cards

Video and audio

Maps, drawings and large paper formats

Original art works, textiles etc.

Physical 3-dimensional objects or views

Digitization is?

Digitization is the process of converting analogue originals to computer-readable form

Digital imaging and scanning are mechanisms for capturing a digital picture

A digital image is sampled and mapped as a grid of squares known as picture elements (pixels)

Digital files use a binary format with a series of ‘1’ and ‘0’, ‘on’ and ‘off’, to represent data

like a light switch ...

Tech joke: there are 10 types of people, those who understand binary and those who don’t
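As a minimal sketch of the two ideas above (a grid of pixels, each stored as binary digits), the snippet below holds a tiny 5 x 5 image as rows of '1' and '0'. The pattern and the function names are made up for illustration only.

```python
# A tiny 5 x 5 image stored as rows of binary digits:
# 1 = "on" (dark pixel), 0 = "off" (light pixel).
# The pattern is invented purely for illustration.
ROWS = ["01110", "10001", "10001", "10001", "01110"]


def render(rows):
    """Return the grid as text, one character per pixel."""
    return "\n".join(
        "".join("#" if bit == "1" else "." for bit in row) for row in rows
    )


def binary_to_int(digits):
    """Interpret a string of binary digits as a whole number."""
    return int(digits, 2)


print(render(ROWS))
print(binary_to_int("10"))  # the "10" in the joke is the number two
```

Each character in ROWS is one pixel, so this is also the simplest possible 1-bit image.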

Digitization processes: Scanning

Capturing lines of pixels moving across the object

Used in flatbed scanners, slide scanners and scanning back cameras for instance

Digitization processes: Digital Photography

Digital photography or direct digital capture: captures all the pixels in a single matrix

Used in digital cameras and book scanners for instance

Digital imaging = digital pictures

All we get is a digital picture, not digital text

Digital text

Digital text requires additional processes:

Optical Character Recognition (OCR)

Rekeying

Mark-up: XML, SGML
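To give a feel for what mark-up adds, here is a small sketch using Python's standard xml.etree.ElementTree module. The element names (page, heading, paragraph) are hypothetical and not taken from any particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen only for illustration.
page = ET.Element("page", attrib={"number": "1"})
ET.SubElement(page, "heading").text = "Digitization is?"
ET.SubElement(page, "paragraph").text = (
    "Digitization is the process of converting analogue originals "
    "to computer-readable form."
)

# Serialise the marked-up page to an XML string.
xml_text = ET.tostring(page, encoding="unicode")
print(xml_text)

# Unlike a page image, marked-up text can be parsed and queried.
parsed = ET.fromstring(xml_text)
print(parsed.find("heading").text)
```

The text itself would still have to come from OCR or rekeying; the mark-up only records its structure.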

Pixels redux

Pixels are picture elements

They are usually square

They are the smallest component of the digital image

By combining pixels in different orientations and density we get shapes and content

By changing the tonal values of pixels we get colour

i.e. resolution and bit depth

Resolution

describes the density of spatial detail

is usually expressed as dots-per-inch (dpi) or pixels-per-inch (ppi)

these terms are often used interchangeably, but dpi usually refers to printed images and ppi to screen images

remember the spatial detail is in relation to the original item imaged

Resolution

It is often more useful to use absolute terms for resolution

actual pixel dimensions are given

2490 x 3510 for example

Equals the pixel dimensions of an A4 sheet of paper scanned at 300 dpi

But also equals* the dimensions of:

A5 page at 425 dpi

A3 page at 200 dpi

A2 page at 150 dpi

8.7 Megapixel digital camera image of a landscape @ 96 dpi

* within 5%
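The arithmetic behind these figures can be sketched as below. The paper sizes in millimetres are the standard ISO 216 values and are an addition, not part of the slides; the function names are illustrative. Results are rounded, so they only approximate the 2490 x 3510 quoted above.

```python
# ISO 216 paper sizes in millimetres (width, height), portrait orientation.
PAPER_MM = {"A2": (420, 594), "A3": (297, 420), "A4": (210, 297), "A5": (148, 210)}
MM_PER_INCH = 25.4


def pixel_dimensions(paper, dpi):
    """Pixel width and height of a sheet scanned at the given resolution."""
    w_mm, h_mm = PAPER_MM[paper]
    return round(w_mm / MM_PER_INCH * dpi), round(h_mm / MM_PER_INCH * dpi)


def megapixels(width_px, height_px):
    """Total pixel count expressed in millions of pixels."""
    return width_px * height_px / 1_000_000


for paper, dpi in [("A4", 300), ("A5", 425), ("A3", 200), ("A2", 150)]:
    w, h = pixel_dimensions(paper, dpi)
    print(f"{paper} at {dpi} dpi: {w} x {h} pixels ({megapixels(w, h):.1f} megapixels)")
```

The point of the comparison is that the same pixel dimensions can describe very different originals, which is why absolute pixel counts are often the more useful figure.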

Resolution is spatial density

Bit depth

Defines the colour space for each image and pixel

this is the number of bits (binary digits) used to define each pixel's tonal value

Black and white (bitonal) = 1-bit per pixel

Greyscale = 8-bit (256 shades of grey)

RGB Colour = 24-bit (16.7 million colour tones)
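The numbers of shades quoted above follow from the number of bits: each extra bit doubles the number of values a pixel can take. A minimal sketch (the function name is illustrative):

```python
def tonal_values(bits_per_pixel):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bits_per_pixel


for name, bits in [("Bitonal", 1), ("Greyscale", 8), ("RGB colour", 24)]:
    print(f"{name}: {bits}-bit = {tonal_values(bits):,} possible values")
```

24-bit RGB is three 8-bit channels (red, green, blue), which is where the 16.7 million figure comes from: 256 x 256 x 256.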

Some rules of thumb

Resolution:

capture the smallest significant detail

the smaller the original the higher the resolution

double the resolution - quadruple the filesize

Bit depth:

1 bit = Black and white

8 bit = greyscale (x8 filesize)

24 bit = full RGB colour (x24 filesize)

CMYK: avoid using for scanning or storage
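These rules of thumb can be checked with a rough size estimate for an uncompressed image. This is a sketch only: it ignores file headers, row padding and compression, and the function name and the 2400 x 3600 example dimensions are assumptions chosen to keep the numbers round.

```python
def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Approximate size of the raw pixel data, rounded up to whole bytes."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


base = uncompressed_bytes(2400, 3600, 1)    # bitonal
grey = uncompressed_bytes(2400, 3600, 8)    # greyscale
rgb = uncompressed_bytes(2400, 3600, 24)    # full RGB colour
doubled = uncompressed_bytes(4800, 7200, 24)  # same original, twice the resolution

print(f"1-bit:  {base:,} bytes")
print(f"8-bit:  {grey:,} bytes ({grey // base}x)")
print(f"24-bit: {rgb:,} bytes ({rgb // base}x)")
print(f"24-bit at double resolution: {doubled:,} bytes ({doubled // rgb}x)")
```

Doubling the resolution doubles both the width and the height in pixels, which is why the size goes up fourfold rather than twofold.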

Select the right colour space for your original

Digitization Basics: Tutorials

Cornell Digital Imaging Tutorial: www.library.cornell.edu/preservation/tutorial/contents.html

Digital files

Can use compression to reduce file sizes. There are 2 main types:

Lossy: there is irrecoverable loss of data with inevitable worsening of quality, but considerable size reductions can be achieved

Example: JPEG

Lossless: no loss of data, but not such great size reductions

Examples: LZW, ITU-T T.6 (formerly CCITT Group 4)
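The difference between the two types can be sketched with Python's standard zlib module. Note the substitutions: zlib uses DEFLATE, standing in here for lossless schemes such as LZW, and the simple quantisation step only illustrates the idea of discarding detail; it is not how JPEG works. The pixel values are made up.

```python
import zlib

# A made-up row of 8-bit greyscale pixel values (a gradient with small variations).
pixels = bytes((x // 4 + (x * 7) % 5) % 256 for x in range(1000))


def lossless_round_trip(data):
    """Compress with DEFLATE (zlib) and decompress again."""
    packed = zlib.compress(data)
    return packed, zlib.decompress(packed)


def quantise(data, step=16):
    """Lossy step: snap every value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


packed, restored = lossless_round_trip(pixels)
lossy_packed = zlib.compress(quantise(pixels))

print(len(pixels), len(packed), len(lossy_packed))  # original, lossless, lossy sizes
print(restored == pixels)          # lossless: the original comes back exactly
print(quantise(pixels) == pixels)  # lossy: the fine detail has been discarded
```

Data that has had detail discarded typically compresses further, which is the trade-off described above: smaller files in exchange for quality that cannot be restored.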

Some common file formats

There are many, many file formats

The commonest you will meet are probably:

TIFF

GIF

JPEG

PDF

TIFF: Tagged Image File Format

De facto standard

Needs plug-in or external application for web display although some browsers now accept it

Can be tagged with basic metadata

Can be used for files up to a bit depth of 64

The format of choice for long-term archiving

JPEG:

Joint Photographic Experts Group/JFIF (JPEG File Interchange Format)

De facto standard for web display

Native to web browsers (i.e. no plug-ins needed)

Has free-text comment field for metadata

Can be used for files up to 24 bit

Commonly used for web display images

JPEG2000 – enables zooming and more metadata

GIF: Graphics Interchange Format

De facto standard for web display

Native to web browsers (i.e. no plug-ins needed)

Has free-text comment field for metadata

Can be used for files up to 8 bit

Commonly used for web display images

Likely to be replaced by PNG (Portable Network Graphics)

PDF: Portable Document Format

Proprietary (Adobe) format, but now a de facto standard for document delivery

Needs plug-in or external application for web display

Can be used for files up to 64 bit

Used for printing and viewing multipage documents

Comes in 3 versions:

Image only

Image and text

Full text

Colour Management: What is it?

Colour is device dependent and looks different when:

printed on different printers

viewed on different monitors

printed on a printer and viewed on a monitor

viewed in a light booth and under office lighting

Colour Management Systems (CMS) maintain the consistent and accurate "appearance" of a colour on different devices (e.g. scanners, monitors, printers, etc.) throughout an imaging workflow

"Colour" Workflow

[Diagram: an RGB scanner feeds the original application, which displays the scanner RGBs on an RGB display; the driver sends RGBs or CMYKs to a CMYK printer.]

Colour Management: components

Use a consistent colour space

Apply an independent colour profile

International Color Consortium: www.color.org

Monitor calibration

Colour targets

GretagMacbeth: www.gretagmacbeth.com

Metadata

What is metadata?

What is metadata for?

What is metadata?

Tony Gill – ARTstor

Metadata refers to structured descriptions, stored as computer data, that attempt to describe the essential properties of other discrete computer data objects.

Big picture definition: the sum total of what can be said about any information object at any level of aggregation

What is metadata for?

The World Wide Web Consortium says metadata is:

to provide a means to discover that the data set exists and how it might be obtained or accessed

to document the content, quality, and features of a data set, indicating its fitness for use.

Therefore we need to think: content, context and structure
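One way to picture "content, context and structure" is as three groups of fields in a single record. The sketch below is illustrative only: the field names and values are hypothetical and do not follow any particular metadata standard.

```python
import json

# Hypothetical record for one digitized item; every field name and value
# here is invented for illustration.
record = {
    "content": {"title": "Example map sheet", "description": "Hand-coloured survey map"},
    "context": {"creator": "Unknown", "collection": "Example collection"},
    "structure": {"format": "TIFF", "width_px": 2490, "height_px": 3510, "bit_depth": 24},
}


def missing_groups(rec, required=("content", "context", "structure")):
    """List the required groups that are absent or empty in a record."""
    return [group for group in required if not rec.get(group)]


print(json.dumps(record, indent=2))
print(missing_groups(record))
```

A check like missing_groups is one simple way to confirm that each item carries all three kinds of description before it goes into a collection.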

What characterises a digital library?

1. A digital library is a managed collection of digital objects

2. The digital objects are created or collected according to principles of collection development

3. The digital objects are made available in a cohesive manner, supported by services necessary to allow users to retrieve and exploit the resources just as they would any other library materials

4. The digital objects are treated as long-term stable resources and appropriate processes are applied to them to ensure their quality and survivability.

What is collection development?

American Library Association's definition:

"A term which encompasses a number of activities related to the development and determination of the collection, including the determination and coordination of selection policy, assessment of needs of users and potential users, collection evaluation, identification of collection needs, selection of materials, planning for resource sharing, collection maintenance, and weeding."

(ALA Glossary of Library & Information Science)

Digital Preservation: digital lifecycle approach

‘The major implications for lifecycle management of digital resources, whatever their form or function, is the need to actively manage the resource at each stage of its lifecycle and to recognise the interdependencies between each stage and commence preservation activities as early as practicable. This represents a major difference with traditional preservation, where management is largely passive until detailed conservation work is required, typically many years after creation and rarely, if ever, involving the creator. There is an active and interlinked lifecycle to digital resources which has prompted many to promote the term 'continuum' to distinguish it from the more traditional and linear flow of the lifecycle for traditional analogue materials.’

Preservation Management of Digital Materials: A Handbook - Neil Beagrie & Maggie Jones www.jisc.ac.uk/dner/preservation/dpc/