Dave Pearson: The Adventures of Digi

The Adventures of Digi: Ideas, Requirements and Reality David Pearson National Library of Australia Future Perfect 2012 Digi By Imogene Pearson (7 years) (March 2012)

Upload: future-perfect-2012

Post on 17-May-2015


Page 1: Dave Pearson The Adventures of Digi

The Adventures of Digi: Ideas, Requirements and Reality

David Pearson

National Library of Australia

Future Perfect 2012

Digi, by Imogene Pearson (7 years) (March 2012)

Page 2: Dave Pearson The Adventures of Digi

1.) Some Context

Digi, by Imogene Pearson (7 years) (March 2012)

Page 3: Dave Pearson The Adventures of Digi

From a preservation point of view, the Library’s digital collections present:

• A mix of materials needing to be kept in perpetuity, along with materials that can be discarded after specified periods or events;

• Mixed levels of complexity in terms of object structure, relationships and dependencies;

• Mixed levels of intellectual control;

• A wide range of file formats (and carrier formats);

• Different levels of complexity in preservation planning and processing;

• Different timetables for preservation action;

• A need for different preservation approaches, often at different scales; and

• A need for recurring (and possibly changing) preservation action cycles over time, using a changing suite of tools.

Page 4: Dave Pearson The Adventures of Digi

NLA Image

Page 5: Dave Pearson The Adventures of Digi

2.) A caveat

NAA Image

Page 6: Dave Pearson The Adventures of Digi

Ecology, or layers of consciousness of the need for digital preservation intervention

(Given some need to access content over time)

Unaware:

• I am unaware if I have any digital content; or

• I am unaware if I may have a problem accessing any of my digital content.

Aware – no response:

• I don’t think that I have a problem accessing any of my digital content;

• I recognise that I have a problem accessing some of my digital content;

• I recognise that I have a problem accessing some of my digital content. However, the problem is not my problem; or

• I recognise that I have a problem, but have no response in place, not even a limited one.

Aware – taking some action:

• I accept that I may have a problem accessing some of my digital content. I am taking limited actions to manage this problem; or

• I accept that I may have a problem accessing some of my digital content. The preservation mandate is a part of my enterprise or system ecology.

Page 7: Dave Pearson The Adventures of Digi

Another way of looking at it might be:

David Pearson 2012

Page 8: Dave Pearson The Adventures of Digi

3.) What we have come to understand over time.

http://www.motifake.com/79532 via Google Images

Page 9: Dave Pearson The Adventures of Digi

Preservation responsibilities:

Preservation of the Library's digital collections involves three main goals:

• Maintaining access to reliable data at bit-stream level;

• Maintaining access to content encoded in the bit streams; and

• Maintaining access to the intended and available meaning of the content.

While specific preservation activities may focus on one or more of these goals, the Library’s preservation responsibility is only fulfilled when all three goals have been adequately addressed.

This responsibility applies across all digital collections, subject to curatorial and policy decisions for specific groups of digital objects.

Page 10: Dave Pearson The Adventures of Digi

Mission: The primary objective of preservation activities within the NLA is to maintain the ability to meaningfully access digital collection content over time.

[Diagram, David Pearson 2012 (image via Google Images). Labels: ‘Logical on Physical Stuff’ (A and B); time; ‘Stuffed?’; Systems to Ingest, Manage, Report and take Actions; Contextual Information – About Content; Dependency Information – About Formats etc.; Systems to Access – Master or Derivative.]

Page 11: Dave Pearson The Adventures of Digi

Required preservation processes

The Library must be able to:

• Understand what it holds in its collections;

• Understand what its preservation intentions are for every digital object and what it is entitled to do to realise its intentions;

• Understand what is required to provide access, existing inhibitors to access, and the current level of support the Library is able to provide;

• Evaluate and monitor the degree of risk arising from collection composition, preservation intentions and available level of support within the Library for digital collection content, and monitor for risk conditions arising during general Digital Library operations;

• Anticipate the effects of changes in support;

• Recognise planning triggers, and plan and take appropriate action on a scale appropriate to the size of the target; and

• Audit the effectiveness of its preservation arrangements and modify the arrangements if necessary.

Page 12: Dave Pearson The Adventures of Digi

Risk or ‘Risk-on’ (are you a splitter or a lumper?)

• ‘parameter-based’ risks: a match against a criterion defined by Library staff to indicate a preservation risk – for example, video encoded with a codec considered to be problematic;

• ‘exception’ risks: the value of a monitored parameter is outside a set of acceptable values;

• ‘change’ risks: there has been a change in status for a monitored parameter for content – for example, the confidence in format identification for a particular file has changed;

• ‘conflict’ risks: conflicting values for the parameter are reported by one or more tools – for example, file format identification returns conflicting values;

• ‘unknown value’ risks: undetermined values for defined parameters – for example, undetermined values for file format and version;

• ‘access support’ risks: changes in level of support which affect the Library’s ability to access content in accordance with preservation intent and significance – for example, reduction below an acceptable threshold in the availability of supporting software for a particular file format; and

• ‘content-based’ risks: characteristics of content that may not be identifiable from metadata – for example, presence of deprecated HTML tags.
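The first five risk types above all depend on the values that tools report for a monitored parameter, so they can be sketched as one small classifier. This is an illustrative sketch only, not the NLA's implementation: the `Reading` record, the `classify` function and the rule that a conflict or unknown value is reported before anything else are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One monitored parameter for one file (hypothetical record layout)."""
    parameter: str                          # e.g. "format" or "codec"
    tool_values: List[Optional[str]]        # one value per tool; None = undetermined
    previous_value: Optional[str] = None    # value recorded at the last check


def classify(reading, acceptable=None, flagged=frozenset()):
    """Return the risk labels (named as on the slide) that apply to a reading.

    acceptable: set of values considered in range (None = no range defined).
    flagged:    values staff have defined as indicating a preservation risk.
    """
    known = {v for v in reading.tool_values if v is not None}
    if not known:
        return ["unknown value"]        # no tool could determine a value
    if len(known) > 1:
        return ["conflict"]             # tools disagree, so no single value to test
    value = next(iter(known))
    risks = []
    if value in flagged:
        risks.append("parameter-based")
    if acceptable is not None and value not in acceptable:
        risks.append("exception")
    if reading.previous_value is not None and reading.previous_value != value:
        risks.append("change")
    return risks


# Usage with invented values: two tools disagree about a file's format.
print(classify(Reading("format", ["format-A", "format-B"])))
```

‘Access support’ and ‘content-based’ risks are left out of the sketch because they need information beyond a single parameter reading (the availability of supporting software, and the content itself).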

Page 13: Dave Pearson The Adventures of Digi

Likely preservation treatment actions

Broad preservation action approaches that are likely to be required will include:

• Format migration at the point of collecting;

• Format migration on recognition of risks;

• Format migration at the point of delivery;

• Emulation of various levels of software and hardware environments;

• Maintenance or supply of appropriate software or hardware;

• Documenting known problems for which no other action can be taken; and

• Deaccessioning or deletion.

Page 14: Dave Pearson The Adventures of Digi

Prioritising Preservation Treatment:

The Library expects to take into account indicators of ‘preservation intent’, ‘significance’, and ‘level of support’ within monitoring and reporting activities, and in evaluations of risk and prioritisation for preservation planning and action.

http://callmemilo.deviantart.com/art/Thunderbirds-are-GO-20717927

Page 15: Dave Pearson The Adventures of Digi

Preservation intent – indicates the expectations for preservation for content:

• whether content is to be preserved;

• who is responsible for preservation of the content;

• the period over which content must be preserved;

• the required level of support for access to the content over time; for example, that the Library intends to actively maintain the ability to both present and modify content, or only to present content, or does not intend to actively maintain access to content beyond its expected useful life.

• Preservation intent may also extend to include more specific characteristics to be supported, based on curatorial input or constraints imposed by rights policies or agreements with rights holders.
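The elements of preservation intent listed above could be captured as a structured record. The following is a minimal sketch under stated assumptions: the class name, field names and the three-value `AccessSupport` scale are invented for illustration and are not taken from the NLA prototype.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AccessSupport(Enum):
    """Assumed three-level scale, mirroring the example given on the slide."""
    PRESENT_AND_MODIFY = "present and modify"
    PRESENT_ONLY = "present only"
    NOT_MAINTAINED = "not actively maintained beyond expected useful life"


@dataclass
class PreservationIntent:
    collection: str                   # set of content the statement covers
    preserve: bool                    # whether content is to be preserved
    responsible: str                  # who is responsible for preservation
    retention_years: Optional[int]    # None is used here to mean "in perpetuity"
    access_support: AccessSupport     # required level of support for access
    specific_characteristics: List[str] = field(default_factory=list)

    def requires_active_access(self) -> bool:
        """True when access must be actively maintained over time."""
        return self.preserve and self.access_support is not AccessSupport.NOT_MAINTAINED


# Usage with invented values for a hypothetical collection.
intent = PreservationIntent(
    collection="Example collection",
    preserve=True,
    responsible="Example curatorial area",
    retention_years=None,
    access_support=AccessSupport.PRESENT_ONLY,
)
print(intent.requires_active_access())
```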

Page 16: Dave Pearson The Adventures of Digi

Significance – indicates the relative priority required for taking preservation action to maintain access to content, as determined by collection curators; for example, content rated as highly significant would be prioritised for preservation planning and action before content of lower significance.

Level of support – indicates how well a digital collection object is supported within the Library, based on a combination of how much is known about the object and its components (including their file formats), and the degree to which supporting software or hardware environments are available.
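One way to combine these two indicators is to order content by significance first and, within the same significance, by how poorly it is supported. The sketch below assumes a three-level significance rating and a 0.0 to 1.0 level-of-support score; both scales, and the names `Item` and `prioritise`, are hypothetical.

```python
from dataclasses import dataclass

# Assumed rating scale: lower rank means higher priority.
SIGNIFICANCE_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Item:
    identifier: str
    significance: str        # curator-assigned rating: "high", "medium" or "low"
    level_of_support: float  # assumed scale: 0.0 (unsupported) to 1.0 (fully supported)


def prioritise(items):
    """Most significant first; within a rating, least supported first."""
    return sorted(
        items,
        key=lambda item: (SIGNIFICANCE_RANK[item.significance], item.level_of_support),
    )


# Usage with invented items.
queue = prioritise([
    Item("item-1", "low", 0.1),
    Item("item-2", "high", 0.9),
    Item("item-3", "high", 0.2),
])
print([item.identifier for item in queue])
```

Sorting on a tuple keeps significance as the dominant factor; a weighted score would be an alternative if the Library wanted poorly supported, less significant content to be able to overtake well supported, highly significant content.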

NLA Image

Page 17: Dave Pearson The Adventures of Digi

4.) This got us thinking

Colin Webb 2009

Page 18: Dave Pearson The Adventures of Digi

Which turned into this

NLA 2011

Page 19: Dave Pearson The Adventures of Digi

Preservation assessment and reporting

The Library must be able to review the composition and characteristics of its digital collections to assess trends that may affect preservation management, to aid setup of preservation monitoring, planning and action, and to report on specific aspects of content when necessary.

A solution must enable staff to define and request, on both an ad hoc and scheduled basis:

• summary reports of content, metadata characteristics and risks across collections or defined sets of managed content;

• detailed metadata reports for individual items or sets of items; and

• audit trail history reports for individual items or sets of items.
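A summary report of the first kind can be as simple as counting items by one metadata characteristic. The sketch below is an assumption-laden illustration: items are represented as plain dictionaries, and the function name and the "unknown" placeholder are invented.

```python
from collections import Counter


def summary_report(items, characteristic):
    """Count items by one metadata characteristic; missing values count as 'unknown'."""
    return Counter(item.get(characteristic, "unknown") for item in items)


# Usage with invented metadata records.
records = [
    {"collection": "Example A", "format": "format-x"},
    {"collection": "Example A", "format": "format-y"},
    {"collection": "Example B"},
]
for value, count in sorted(summary_report(records, "format").items()):
    print(value, count)
```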

Page 20: Dave Pearson The Adventures of Digi

Reference knowledgebases (General)

Enable staff to create, update and maintain reference information knowledgebases on:

• File formats and versions;

• Software and hardware components that support access to file formats and versions, for maintaining access to managed content;

• The level of support available for particular file formats and versions:

– i. sets of software or hardware components available to support access to formats;

– ii. functions supported, both for providing access to content and for use in preservation action – for example, presentation, modification, batch processing;

– iii. fidelity of support – how well functions are supported; and

– iv. known risks, including potential inhibitors to preservation, associated with formats or supporting software or hardware; and

• Preservation intent descriptions and parameters for sets of content.
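The level-of-support information in items i to iv above suggests a simple record structure: one entry per supporting component, attached to a format and version. This is a sketch only; the class and field names are hypothetical and do not describe the schema of the NLA's knowledgebases.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SupportEntry:
    """One software or hardware component that supports a format (item i)."""
    component: str
    functions: Set[str]                                     # item ii, e.g. {"presentation"}
    fidelity: str                                           # item iii, how well functions are supported
    known_risks: List[str] = field(default_factory=list)    # item iv


@dataclass
class FormatRecord:
    name: str
    version: str
    support: List[SupportEntry] = field(default_factory=list)

    def supported_functions(self) -> Set[str]:
        """All functions available across every supporting component."""
        result: Set[str] = set()
        for entry in self.support:
            result |= entry.functions
        return result


# Usage with invented format and component names.
record = FormatRecord("Example format", "1.0", [
    SupportEntry("Viewer X", {"presentation"}, "good"),
    SupportEntry("Editor Y", {"presentation", "modification"}, "partial"),
])
print(sorted(record.supported_functions()))
```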

Page 21: Dave Pearson The Adventures of Digi

Other systems are also required to interrelate in this ecosystem such as:

• Preservation monitoring, reporting and prioritisation

• Preservation options and preservation action planning

• Preservation action evaluation

Page 22: Dave Pearson The Adventures of Digi

5.) Pres Intent (current NLA prototype)

David Pearson 2012

Page 23: Dave Pearson The Adventures of Digi

NLA 2011

Page 24: Dave Pearson The Adventures of Digi

Collections

• Preservation Intent - Asian Collections and Overseas Collections Management (Version 1.0)

• Preservation Intent - Australian Books and Serials (Version 1.0)

• Preservation Intent - Dance (Version 1.0)

• Preservation Intent - Manuscripts (Draft)

• Preservation Intent - Maps (Version 1.0)

• Preservation Intent - Music (Draft)

• Preservation Intent - Newspaper Digitisation (Version 1.0)

• Preservation Intent - Oral History (Unknown)

• Preservation Intent - Pictures (Version 1.0)

• Preservation Intent - Selective Web Harvesting (Version 1.0)

• Preservation Intent - Web Domain Harvests (Version 1.0)

Page 25: Dave Pearson The Adventures of Digi
Page 26: Dave Pearson The Adventures of Digi

An attempt to systematise Pres Intent (requires some additional thinking)

Page 27: Dave Pearson The Adventures of Digi

This is how collections thought about it.

Page 28: Dave Pearson The Adventures of Digi

This is how we tended to think about it (a job for a new system).

Page 29: Dave Pearson The Adventures of Digi


6.) Info on Formats, software and level of support (some prototyping)

NLA 2011

Page 30: Dave Pearson The Adventures of Digi
Page 31: Dave Pearson The Adventures of Digi
Page 32: Dave Pearson The Adventures of Digi
Page 33: Dave Pearson The Adventures of Digi
Page 34: Dave Pearson The Adventures of Digi
Page 35: Dave Pearson The Adventures of Digi
Page 36: Dave Pearson The Adventures of Digi

7.) Level of support and Prioritisation

NLA 2011

Page 37: Dave Pearson The Adventures of Digi

Level of support (an early concept model)

DP 2011

Page 38: Dave Pearson The Adventures of Digi
Page 39: Dave Pearson The Adventures of Digi

Prioritising preservation treatment based on level of support

In evaluations of risk and prioritisation for preservation planning and action, we must take into account the Level of Support/Access Risks and:

• Any constraints imposed by rights policies or agreements; and

• The amount of resources available.

Based on these factors, the Library (Management, Collections and Digi Pres) should be able to prioritise material to be preserved.

Page 40: Dave Pearson The Adventures of Digi

8.) Preservation actions and options generation

NLA 2011

Page 41: Dave Pearson The Adventures of Digi

Options for preservation actions

We would like to be able to enable staff to:

• define types of preservation actions for use within preservation planning and evaluation;

• update and delete reference information on options for preservation action, both in general and for particular formats or format types;

• link to information available from the software KB, which indicates what actions specific software might be useful for and the proximity of the software to the format; and

• link to other linked data sources.

Page 42: Dave Pearson The Adventures of Digi

Pres action options generation

The Library must be able to test and evaluate preservation action plans to determine if they satisfactorily achieve the preservation intent for managed content. For example, a solution should:

• Enable staff to develop and test executable preservation action plans for sets of managed content, including:

– Single and multiple step actions (combining manual and automated workflows)

– Replacing file/s and linkages in complex objects

– Linking to a specific emulation environment (if available)

– Replacing access software

– Specifying that no action is required

• Support simulations or testing of preservation actions against a content testbed. For example, enable staff to perform 'what if' simulations to determine the impact of changes to availability of support for access, including:

– a. Removal of software or hardware sets supporting access, to assess risks or impacts on access; and

– b. Addition or revision of software or hardware sets supporting access, to assess proposed remedial preservation action plans.

• Enable staff to define quality assurance criteria for preservation action plan outcomes.
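The 'what if' simulation of removing supporting software can be sketched as a set calculation over a mapping from formats to the components that support them. This is a minimal illustration under assumptions: the mapping, the function name and all format and component names are invented, and a real simulation would also consider functions and fidelity of support.

```python
def formats_losing_access(support, removed):
    """Return formats that are supported now but would have no support left.

    support: dict mapping a format name to the set of components supporting it.
    removed: set of components whose withdrawal is being simulated.
    """
    return sorted(
        fmt for fmt, components in support.items()
        if components and not (components - removed)
    )


# Usage with invented names: simulate withdrawing "Viewer X".
current_support = {
    "format-a": {"Viewer X"},
    "format-b": {"Viewer X", "Viewer Y"},
    "format-c": set(),               # already unsupported, so not a new loss
}
print(formats_losing_access(current_support, {"Viewer X"}))
```

The addition or revision case (b) could be simulated the same way, by building a revised mapping and comparing which formats have support before and after.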

Page 43: Dave Pearson The Adventures of Digi

9.) Preservation Options Evaluation

NLA 2011

Page 44: Dave Pearson The Adventures of Digi

Preservation options evaluation

• support import and integration of preservation-treated content and metadata, from either internal or external processes, including:

– a. Verifying that preservation-treated digital content conforms to acceptance criteria for preservation outcomes for designated sets of digital content;

– b. Enabling staff to quality assure and approve preservation-treated digital content for incorporation into the collection; and

– c. After approval, sending the content to the preservation action scheduler for treatment of file/s, metadata and associated relationships.

• support ‘rollback’ of updated versions of content, metadata and associated relationships to restore previous versions, if necessary.

• enable staff to define and approve acceptance criteria for preservation action outcomes for sets of managed content.
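Checking treated content against acceptance criteria and keeping earlier versions for rollback can be sketched together. The example below is a hedged illustration, not a description of any system the Library procured: `VersionedObject`, its methods and the criteria are hypothetical, content is represented as a plain dictionary, and the staff quality assurance and approval step is not modelled.

```python
class VersionedObject:
    """Keeps every accepted version so that an update can be rolled back."""

    def __init__(self, initial):
        self._versions = [initial]

    @property
    def current(self):
        return self._versions[-1]

    def submit(self, candidate, criteria):
        """Accept the candidate only if every acceptance criterion passes.

        criteria: dict mapping a criterion name to a function(candidate) -> bool.
        Returns the names of failed criteria (an empty list means accepted).
        """
        failed = [name for name, check in criteria.items() if not check(candidate)]
        if not failed:
            self._versions.append(candidate)
        return failed

    def rollback(self):
        """Restore the previous version, if there is one, and return it."""
        if len(self._versions) > 1:
            self._versions.pop()
        return self.current


# Usage with invented criteria for a hypothetical format migration.
acceptance = {
    "non-empty": lambda c: c["size"] > 0,
    "target format": lambda c: c["format"] == "new-format",
}
obj = VersionedObject({"format": "old-format", "size": 10})
print(obj.submit({"format": "new-format", "size": 12}, acceptance))
print(obj.rollback()["format"])
```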

Page 45: Dave Pearson The Adventures of Digi

10.) So what!

Currently, these ideas and requirements have become ‘partially real’. They still need to be implemented.

They formed the basis for the preservation requirements in a subsequent:

• RFP (Request for Proposal) process; and

• RFT (Request for Tender) process.

http://www.wildsound-filmmaking-feedback-events.com/images/austin_powers_dr_evil.jpg

Page 46: Dave Pearson The Adventures of Digi

RFP

So all of these ideas were consolidated as requirements for a Request for Proposal, which went to the market in July 2011.

A number of responses were received for:

• Core systems

• Preservation

• Digitisation

• Other Workflows

These were evaluated and some of the vendors were invited to participate in the next stage.

http://www.melbournesumos.com.au/pics/twister/Twister078.jpg

Page 47: Dave Pearson The Adventures of Digi

RFT

Based on the RFP, the NLA clarified the requirements for the next process.

A select group from the RFP process was invited to participate in a Request for Tender, which closed in late December 2011.

http://simpro.co/wp-content/uploads/2010/10/paperwork2.jpg

Page 48: Dave Pearson The Adventures of Digi

What version of reality have we decided upon?

http://www.flickr.com/photos/ricksmit/15671245/

Everything, for Everyone, Forever

Digi, by Imogene Pearson (7 years) (March 2012)