Digital Preservation
The Past is Prologue
Developing Preservation Approaches
Diagram by Nancy Y. McGovern based on PhD Research, March 2001
5 Stages of Digital Preservation
1. Digitization leads to understanding that digital content needs to be managed and protected
2. Digital Preservation Projects are initiated
3. Digital Preservation Projects segue into Programs
4. Digital Preservation Programs become comprehensive and coordinated
5. Institutional Programs embrace Inter-institutional Collaboration
Digital Preservation Officer
• First DPO appointed January 2002
http://www.library.cornell.edu/iris/dpo/
• coordinates digital preservation policy development and implementation
• serves as the liaison to digital preservation initiatives and projects
• develops a conceptual framework for a cohesive digital preservation program
Models and Standards
• Attributes of a Trusted Digital Repository (RLG-OCLC)
http://www.rlg.org/longterm/attributes01.pdf
• OAIS Reference Model (CCSDS)
http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-2.pdf
Models and Standards
• SIP Transfer Issues:
  • Producer-Archive Interface Methodology Abstract Standard (CCSDS)
    http://ssdoo.gsfc.nasa.gov/nost/isoas/CCSDS-651.0-W-1.pdf
• AIP Components (OCLC/RLG PMWG):
  • Content Information
  • Preservation Description Information
    http://www.oclc.org/research/pmwg/
• Format Issues:
  • Draft Standard - Data Dictionary - Technical Metadata for Digital Still Images (NISO)
    http://www.niso.org/committees/committee_au.html
Attributes of a Trusted Repository
1. Administrative responsibility
• Provide evidence of fundamental commitment to standards, best practices
• Commit to OAIS model
• Meet standards on environment (6)
• Share measurements with depositors (6)
• Involve external community experts in validating/certifying practices (6)
• Commit to transparency and accountability (6)
2. Organizational viability
• Demonstrate viability and trustworthiness (3)
• Reflect commitment to long-term retention/management in mission statements
• Have appropriate legal status, staff and professional development (1)(3)
• Establish transparent business practices, effective management policies (6)(3)
• Define inclusive agreements with depositors (6)
• Review/maintain policies and procedures (6)
• Undertake risk management, contingency and succession (trusted inheritors) planning (6)(3)
3. Financial sustainability
• Establish/maintain good business practices and an auditable business plan (1)(2)
• Demonstrate financial fitness and ongoing financial commitment (1)(2)
• Balance risk, benefit, investment, expenditure
• Maintain adequate budget and reserves and actively seek potential funding sources
4. Technological suitability
• Consider/adopt appropriate preservation strategies (6)
• Ensure appropriate infrastructure for acquisition, storage, access (5)
• Establish technology management policy for repository (2)(3)
• Comply with relevant standards and best practices, adequate expertise (6)
• Undergo regular external audits on system components and performance (6)
5. System security
• Assure security of systems for digital assets (3)
• Establish policies and procedures to meet requirements (4)(6)
• Stress processes that will detect, avoid and repair loss, document and notify of changes and resulting actions (4)(6)
6. Procedural accountability
• Enact policies and procedures for tasks and functions, document practices (1)(2)
• Establish monitoring mechanisms to ensure continued operation of systems and procedures (4)(5)
• Record/justify preservation strategies (1)(2)
• Set up feedback mechanisms for problem resolution; negotiate evolving requirements between providers and consumers (1)(2)
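The System security attribute above asks for processes that detect loss. One common way to do that is a periodic fixity check. The sketch below is only an illustration: the choice of SHA-256, the manifest layout (relative path mapped to hex digest), and the names `sha256_of` and `audit` are assumptions, not something the RLG-OCLC report prescribes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare stored files against a manifest of expected digests.

    The manifest format (relative path -> hex digest) is hypothetical.
    Each path is reported as "ok", "missing", or "changed".
    """
    report = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) != expected:
            report[relative_path] = "changed"
        else:
            report[relative_path] = "ok"
    return report
```

A repository would typically record the digests at ingest and rerun `audit` on a schedule, so that any "missing" or "changed" entry can be documented and repaired from another copy.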
Framework Components
• Administrative Responsibility
• Organizational Viability
• Financial Sustainability
• Technological Suitability
• System Security
• Procedural Accountability
Diagram by Nancy Y. McGovern based upon the RLG-OCLC Attributes of a Trusted Repository
Open Archival Information System (OAIS)
Framework to Model
Overview of the OAIS Model
from Reference Model for an Open Archival Information System [4]
OAIS Categories
• [Data Object]
• Representation Information (Structure, Semantic, and Other Information)
• Content Information [1] (Data Object + Representation Information)
• Preservation Description Information [2] (Reference, Context, Provenance and Fixity Information)
• Descriptive Information (Content Information + PDI)
• Packaging Information [physically and logically binds]
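The way these categories nest can be sketched as a small data model. This is an illustrative reading of the list above, not an OAIS-defined schema: every class name, field name, and sample value is hypothetical, and SHA-256 is assumed for the fixity value.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class RepresentationInformation:
    structure: str          # e.g. the file format
    semantics: str          # what the data means
    other: str = ""


@dataclass
class ContentInformation:
    data_object: bytes      # the bits being preserved
    representation: RepresentationInformation


@dataclass
class PreservationDescriptionInformation:
    reference: str          # identifier for the content
    context: str            # relationship to its environment
    provenance: list[str]   # history of custody and processing
    fixity: str             # digest guarding against undocumented change


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging: str          # how the parts are physically and logically bound

    def descriptive_information(self) -> dict[str, str]:
        """Derive a discovery record from Content Information and PDI."""
        return {
            "identifier": self.pdi.reference,
            "format": self.content.representation.structure,
            "context": self.pdi.context,
        }

    def fixity_is_valid(self) -> bool:
        """Recompute the digest of the data object and compare it to the PDI."""
        actual = hashlib.sha256(self.content.data_object).hexdigest()
        return actual == self.pdi.fixity


# Hypothetical example values, for illustration only.
data = b"<html><body>Hello</body></html>"
package = InformationPackage(
    content=ContentInformation(
        data_object=data,
        representation=RepresentationInformation("HTML", "a sample Web page"),
    ),
    pdi=PreservationDescriptionInformation(
        reference="example:0001",
        context="part of a sample Web site",
        provenance=["captured from a sample server"],
        fixity=hashlib.sha256(data).hexdigest(),
    ),
    packaging="single directory holding data and metadata files",
)
```

Keeping the Data Object and its Representation Information in one structure mirrors the "Data Object + Representation Information" definition of Content Information in the list.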
OAIS at Cornell
Preserving Essential Elements
• Content
• Context
• Structure
• Appearance
• Behavior
Emulation
• Jeff Rothenberg
• Dutch National Library
• IBM
• CAMiLEON Project
• David Bearman
Migration
• Risk Management of Digital Information: A File Format Investigation
• Charles Dollar
• Margaret Hedstrom
• CAMiLEON Project
• Dutch Testbed Project
XML and Object-Based
• NARA and SDSC
• Dutch Testbed Project
• Victoria Electronic Records Project (VERS)
• Harvard SIP proposal
Project Prism
CUL Research Team:
Anne R. Kenney
Nancy Y. McGovern
Peter Botticelli
Richard Entlich
Risk Management Stages
Typical Stages               Prism Stages
1. Risk identification       1. Data gathering, characterization
2. Risk classification
3. Risk assessment           2. Simple risk declaration
4. Risk analysis             3. Contextualized declaration/detection
5. Program implementation    4. Automated enforcement
Levels of Context
• Web page
  • as a stand-alone object, ignoring its hyperlinks
  • in local context, considering the links into it and out from it
• Web site
  • as a semantically coherent set of linked Web pages
  • as an entity in a broader technical and organizational context
Page-level Monitoring
• Formatting: TIDY
• Standards compliance
• Document structure
• Metadata:
  • HTTP headers
  • HTML headers
• Changes
  • Content
  • Location
• Links
  • Out-link structure
  • In-link structure
  • Intra-site
  • Hub
  • Volatility
• Page provenance
  • URL parsing
• Log analysis
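Two of the page-level checks listed above, content change and out-link structure, can be sketched with the Python standard library alone. This is a minimal illustration, not Project Prism's tooling: the names `LinkCollector`, `out_links`, and `content_digest`, the example URLs, and the use of SHA-256 for change detection are all assumptions.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def out_links(html: str, base_url: str) -> dict[str, list[str]]:
    """Split a page's out-links into intra-site and external groups."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    site = urlparse(base_url).netloc
    result: dict[str, list[str]] = {"intra_site": [], "external": []}
    for link in collector.links:
        key = "intra_site" if urlparse(link).netloc == site else "external"
        result[key].append(link)
    return result


def content_digest(html: str) -> str:
    """Digest of the page text; a different digest means the content changed."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()
```

Comparing `content_digest` values from successive visits flags a content change, and comparing successive `out_links` results gives a rough view of link volatility.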
Site-level Monitoring
• Graph analysis
• Static site analysis and Longitudinal study
• Aggregate page analyses
• Site maintenance indicators
  • Backup and archiving policies and procedures
  • Hardware and software environment
  • Network configuration and maintenance
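For the graph analysis item above, a site can be treated as a directed graph of pages and links. The sketch below assumes a hypothetical adjacency mapping (page mapped to the pages it links to) and computes simple in-link and out-link counts. The function names are illustrative, and the choice of "pages with no in-links" and "targets missing from the site" as indicators is an assumption rather than anything the slides specify.

```python
from collections import Counter


def link_degrees(site_graph: dict[str, list[str]]) -> tuple[Counter, Counter]:
    """Return (in_degree, out_degree) counters, ignoring self-links and duplicates."""
    in_degree: Counter = Counter()
    out_degree: Counter = Counter()
    for page, targets in site_graph.items():
        unique_targets = set(targets) - {page}
        out_degree[page] = len(unique_targets)
        for target in unique_targets:
            in_degree[target] += 1
    return in_degree, out_degree


def pages_without_in_links(site_graph: dict[str, list[str]]) -> list[str]:
    """Pages that no other page in the site links to."""
    in_degree, _ = link_degrees(site_graph)
    return sorted(page for page in site_graph if in_degree[page] == 0)


def unresolved_targets(site_graph: dict[str, list[str]]) -> list[str]:
    """Link targets that are not themselves pages in the graph."""
    targets = {t for links in site_graph.values() for t in links}
    return sorted(targets - set(site_graph))
```

Running the same counts on snapshots taken at different times is one way to approach the longitudinal study mentioned in the list.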
Research Plan
• Preservation Risk Management for Web Resources: Virtual Remote Control in Cornell’s Project Prism
By Anne R. Kenney, Nancy Y. McGovern, Peter Botticelli, Richard Entlich, Carl Lagoze, and Sandra Payette
D-Lib Magazine, January 2002
http://www.dlib.org/dlib/january02/kenney/01kenney.html
Publisher-Based Digital Archives
Subject-Based Digital Archives
Intersection of Digital Archives
Format-based
Relevant Initiatives
• Metadata Encoding and Transmission Standard (METS)
http://www.loc.gov/standards/mets/
[highlighted Web site in RLG DigiNews February 2002]
• Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
• Mellon Fedora Project
http://fedora.comm.nsdlib.org
Slides from January 2002 briefing: http://www.cs.cornell.edu/payette/presentations
Relevant External Projects
• NEDLIB
  • http://www.kb.nl/coop/nedlib/
• CAMiLEON (CEDARS)
  • http://www.si.umich.edu/CAMILEON/index.htm
  • http://www.leeds.ac.uk/cedars/
• PANDORA
  • http://pandora.nla.gov.au/index.html
• Harvard University LDI
  • http://hul.harvard.edu/ldi/
• NARA & SDSC
  • http://www.nara.gov/era/