Post on 28-Mar-2015
Digital Preservation and Management
Preserving Digital Resources: Why is it an Issue?
Technology obsolescence
Digital media life expectancy
Variety of file formats
Digital rights management
Costs
Organizational resistance
Assumptions
Digital preservation is more challenging and complex than preservation of analog objects
Digital preservation is more than a technical preservation strategy
“THE” solution doesn’t exist
Digital preservation needs to be integrated into organizational culture
Assumptions
Change happens
File formats matter
Non-proprietary is best; de facto standards are good
System architecture and documentation matter
Open systems that can be moved to other platforms
Technology isn’t the whole solution
Policies, planning, and resources
The community is just beginning to work on these issues – and everything is new and is changing
Terms
Digital Object: Any resource that can be stored or manipulated by a computer
Digitized Resources: Any resource that has been digitized from an analog source
Born Digital: Any resource that was created digitally and will be managed and preserved digitally
Terms
Digital preservation/archiving: Storage, maintenance, and access to a digital object over the long term, usually as a consequence of applying one or more preservation strategies
Terms
Viability: maintenance of the bitstream
Renderability: viewable by humans and “processable” by computers
Understandability: interpretable by humans
Fixity: the state or quality of being fixed or unchanged
Reliability: the digital objects are created in a trustworthy way; they are what they say they are
Authenticity: the digital object remains reliable over time
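Fixity is commonly checked by recording a digest of the bitstream and recomputing it later. The sketch below is only an illustration of that idea: the slides do not name an algorithm, so SHA-256 is an assumption, and the function names are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a bitstream."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """True if the bitstream still matches the digest recorded earlier."""
    return sha256_of(data) == recorded_digest


# Record a digest when the object is first stored (sample bytes are made up).
original = b"digital object bitstream"
recorded = sha256_of(original)

# Later, recompute and compare: any change to the bytes changes the digest.
unchanged = fixity_ok(original, recorded)
altered = fixity_ok(original + b"!", recorded)
```

A matching digest shows only that the bits are unchanged (viability and fixity); it says nothing about whether the object can still be rendered or understood.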
Digital Preservation Strategies
Bitstream Copying
Refreshing
Durable/Persistent Media
Technology Preservation
Digital Archaeology
Analog Backups
Migration
Replication
Reliance on Standards
Normalization
Canonicalization
Emulation
Encapsulation
Universal Virtual Computer
Trusted Digital Repositories
A repository whose mission is to provide reliable, long term access to managed digital resources to a community, now and in the future.
Trusted Digital Repositories
Attributes
Administrative responsibility
Organizational viability
Financial sustainability
Technological suitability
System security
Procedural accountability
OAIS compliant
Trusted Digital Repositories
Implementation approaches will vary
Approach will depend on:
Context
Users (designated community)
Underlying issue remains constant
Functionality
Reliability and authenticity
Open Archival Information System (OAIS) Reference Model
Conceptual framework for an archival system dedicated to preserving and maintaining access to digital information over the long term
Consists of people and systems
http://ssdoo.gsfc.nasa.gov/nost/isoas/overview.html (overview)
http://ssdoo.gsfc.nasa.gov/nost/wwwclassic/documents/pdf/CCSDS-650.0-B-1.pdf (standard)
OAIS: What is it?
Any organization or system charged with the task of preserving information over the long term and making it accessible to a specific group of users
An OAIS archive is expected to meet certain minimum responsibilities
OAIS: Minimum Responsibilities
Negotiate and accept appropriate information from information creators
Obtain sufficient control over the information to ensure preservation
Determine the scope of the “Designated Community” (the users)
Ensure that users can understand the information without assistance from the information creators
OAIS: Minimum Responsibilities
Follow documented policies and procedures
Ensure preservation
Authenticate information
Disseminate (provide access to) information
Make the information available to the Designated Community
Preservation Planning
Monitoring technology and users; developing preservation actions
Preservation planning is part of the administration functions of any archival program; OAIS has highlighted it as a distinct function
The importance of constant, ongoing management and planning for digital preservation calls for this
Components of a Digital Preservation Program
TDR and OAIS imply that there are three components of a digital preservation program
Resources Framework (trust)
Organizational Infrastructure (policy)
Technological Infrastructure (technology)
Resource Framework
Nothing is sustainable without ongoing commitment of resources
A high level commitment to digital preservation must demonstrate an adequate resource commitment
Deliverables that meet the goals
Line item budgets
Staff commitment
Strategic planning
Projections for costs and funding scenarios
Resource Framework
Commitment of resources (time, money, staff) implies organizational commitment and reflects organizational priorities
Staffing is the expensive part!
Curatorial functions
Appraising, acquiring, processing, metadata creation, ongoing management, access
Technical functions
Computer operation, system administrator, database administrator, storage administrator, application programmer, preservation expertise
Planning
Identify stakeholders and their roles
Educate
All partners need a desired outcome
Tangible or intangible
Buy-in
Mission, goals, outcomes
Organizational Infrastructure
Organizational and Curatorial Responsibilities
Policy framework
Operational Responsibilities
Planning framework
Functions and roles
Organizational and Curatorial Responsibilities – Policy Framework
Strategic Plan
Collection Policy
Security Policy
Preservation Policy
Access Policy
Strategic Plan
Overview and scope of the digital preservation program and its context
Mission/Purpose
High level goals and objectives
Commitment to OAIS and community best practices
Related documentation and who is responsible
Administrative/Oversight structure
High level audience statement
Audience (Designated Community)
OAIS requirement
Explicit
All collections
Per collection
Audience = assumed knowledge and resources
Impacts of Audience Identification
The kinds of collections you will accept
The kind of descriptive information (metadata) you will provide
The kind of services you will offer
Software, translators
The kind of preservation actions chosen
Significant properties
The access mechanisms you need to provide
Collection Policy
What kinds of digital resources are you going to collect and digitally preserve?
Content considerations
Are you focusing on a specific content area?
Rights management considerations
Metadata responsibilities and requirements
Requirements for documenting acquisitions
Collection Policy
Technical considerations
Digitization with no physical counterpart
Digitization with a physical counterpart
Anything born digital
Born digital that can’t be reformatted to eye readable
Collection Policy
Are there further limitations on what you will collect? (examples)
Non-proprietary formats only
Specific formats only (TIFF)
Systems/databases only
Distinct documents only
Minimum amount of metadata required at time of acquisition
Materials that can be digitally reformatted in a specific way
Move everything to TIFF?
Move everything to XML?
Documenting Acquisitions
OAIS requires agreements with depositors that address acquisition, maintenance, access and withdrawal
Should already be using these kinds of agreements
May need to revise for digital materials, to include
What happens if functionality is lost?
Is reformatting to eye readable an acceptable preservation option?
What kind of access can you provide and is it acceptable?
Are there digital-specific copyright issues to consider?
Documenting Acquisitions
May need to revise for digital materials, to include
Metadata creation responsibilities
Rights management
What level of functionality will be available from the digital repository?
Security Policy
System security
Physical environment
Backup and recovery
Fixity of the data (reliability)
Disaster preparedness and response
Planning and documentation requirements
Assign responsibility
Preservation Policy
Commitment to digital preservation
Goals of digital preservation
Scope of materials
Formats
Metadata suppliers
Access commitments
Preservation Policy
Definition of overall preservation strategy
Are there limitations?
What happens if preservation actions go wrong?
Is reformatting to eye-readable an acceptable preservation action? Under what circumstances?
Planning and documentation requirements
Responsibilities assigned
Operational Responsibilities
Based on work done by the OAIS community to define the principal obligations of an OAIS compliant repository
Appropriate planning documentation will be necessary to carry out operations
Specific planning based on strategic plan and policies
Operational Responsibilities
Acquisition
Physical and intellectual control
Determines audience (designated community)
Follows policies and procedures to assure preservation of authentic information
Access
Promotes development of best practices and standards
Acquisition
Development of collection policies
Includes specific required formats, if appropriate
Procedures and workflows for copyright clearance for access and preservation
Metadata specifications and implementation
Procedures to ensure the authenticity of submitted material
Assessment of the completeness of the submission
Documentation of all acquisition transactions
Control
Preparing the materials for storage
Content analysis
Significant properties
Verification of metadata
Unique and persistent identifier assigned
Authenticity and integrity check
Move to archival storage
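Two of the control steps above, assigning a unique identifier and running an integrity check, can be sketched in a few lines. This is a hedged illustration, not a prescribed workflow: the `ArchivalRecord` type, the `accept_submission` function, the use of a depositor-supplied SHA-256 digest, and the `urn:uuid:` identifier scheme are all assumptions made for the example.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass
class ArchivalRecord:
    """Hypothetical record kept for each object moved to archival storage."""
    identifier: str
    sha256: str
    size_bytes: int


def accept_submission(data: bytes, depositor_sha256: str) -> ArchivalRecord:
    """Check integrity against the depositor's digest, then assign an identifier."""
    digest = hashlib.sha256(data).hexdigest()
    if digest != depositor_sha256:
        # The bytes received differ from what the depositor said they sent.
        raise ValueError("integrity check failed: digest mismatch")
    return ArchivalRecord(
        identifier=f"urn:uuid:{uuid.uuid4()}",  # unique; persistence is a policy matter
        sha256=digest,
        size_bytes=len(data),
    )
```

A random UUID is unique, but it only becomes a persistent identifier if the organization commits to maintaining it and resolving it over time.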
Preservation Actions
Monitoring of technology and the digital materials
Technology watch
Preservation planning
Classes of material
Actions to be taken
Documentation of actions and results
Functionality considerations
Access
A system for resource discovery
Mechanism for authenticity check
Access control mechanisms
User support
Standards and Best Practices
Promote and utilize
Results in economies of scale
Creation of high quality digital resources that are more amenable to preservation
Work with software suppliers, potential depositors, designated communities
In-house
Significant investment
Technical expertise
Workflow impacts
Maintain physical control
Outsource
Can the service provider meet your needs and requirements?
Less investment?
No cost models to show if this is accurate
Less reliance on in-house technical expertise and infrastructure necessary
What happens if the service provider goes out of business?
Combination
Build what you can
Build what you need that can’t be outsourced
Buy what you can’t build
Now, digital repositories…
OAIS Metadata Implications
Metadata is data that facilitates the management, description, and preservation of a digital object or aggregation of digital objects. Standards and best practices are developed to promote the creation of metadata so that it supports interoperability and collaboration.
Metadata sets
Metadata encoding schema
Types of Metadata
Descriptive
Technical
Structural
Administrative
Preservation
Metadata
Each type of metadata will be needed to facilitate the preservation and usability of born digital material
Use standards and best practice metadata sets
Think interoperability
Technologically
Element sets
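One way to picture the five metadata types is as sections of a single record kept alongside each digital object. The sketch below is a minimal illustration only; the `MetadataRecord` class and every field value are hypothetical, and a real program would use an established element set rather than free-form dictionaries.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical container with one section per metadata type."""
    descriptive: Dict[str, str] = field(default_factory=dict)
    technical: Dict[str, str] = field(default_factory=dict)
    structural: Dict[str, str] = field(default_factory=dict)
    administrative: Dict[str, str] = field(default_factory=dict)
    preservation: Dict[str, str] = field(default_factory=dict)

    def missing_types(self) -> List[str]:
        """Names of the metadata types that have not been filled in yet."""
        return [name for name, section in asdict(self).items() if not section]


# Example values are invented for illustration.
record = MetadataRecord(
    descriptive={"title": "Annual report"},
    technical={"format": "image/tiff"},
)
gaps = record.missing_types()
```

A check like `missing_types` could back a collection policy that requires a minimum amount of metadata at the time of acquisition.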
Immediate Actions
Get your team together
Identify your needs
Do you really need a digital repository right NOW?
Is there an interim solution until the field is more settled?
Agree on vision and goals
Plan
Immediate Actions
Discuss strategy
Communication
Any institutional repository depends on a relationship with IT staff
Priorities
Language barriers
Immediate Actions
Identify the organizational infrastructure changes that need to be made
Investigate existing tools and digital repositories
Learn and experiment with existing tools
Make high level decisions
What kind of digital materials are we going to commit to preserving?
Immediate Actions
Funding
Inventories of digital resources
Establish metadata standards and practices
Identify and understand users
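A first inventory of digital resources can be as simple as counting files by extension under a storage location. The sketch below assumes the resources sit in an ordinary directory tree; the function name is hypothetical, and extensions are only a rough proxy for actual file formats.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root: str) -> Counter:
    """Count the files under root, grouped by lower-cased file extension."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder key.
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `inventory_by_extension` on a shared drive and reviewing `counts.most_common()` gives a quick view of which formats dominate, which feeds directly into decisions about what to commit to preserving.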
Take Home Concepts
Use standards and best practices
The solution is complex; the tools are incomplete
Organizational and technological challenges
Learn about what others are doing and build on it
Don’t reinvent the wheel
Take Home Concepts
Resources are the issue
People, not computers!
Expect and plan for change
This is all a work in progress
“First generation” technologies, tools, understanding of issues
You will redo work
Existing Tools
Tools
Technical tools
Interfaces, infrastructure and technologies that allow you to do the work necessary to create, manage and preserve digital resources
Examples might include:
Metadata creation
File format verification
Algorithms for fixity checks
Appraisal/processing tools
Access tools: indexing, finding aids, etc.
Acquisition tools
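File format verification usually means checking what a file actually contains rather than trusting its name. The sketch below shows the basic idea using the well-known leading bytes of a few formats; the signature table is deliberately tiny, the function name is hypothetical, and production tools consult much larger format registries.

```python
from typing import Dict, Optional, Tuple

# Leading "magic" bytes for a handful of common formats.
SIGNATURES: Dict[str, Tuple[bytes, ...]] = {
    "tiff": (b"II*\x00", b"MM\x00*"),  # little-endian and big-endian TIFF
    "pdf": (b"%PDF",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def identify_format(header: bytes) -> Optional[str]:
    """Return the format whose signature starts the header, or None if unknown."""
    for name, prefixes in SIGNATURES.items():
        if header.startswith(prefixes):
            return name
    return None
```

In practice the header would come from reading the first few bytes of the submitted file, and a mismatch between the identified format and the file extension would be flagged during acquisition.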
Tools
Few currently exist
Options
Wait
Build your own
Modify existing tools
Use what there is
Tools
DSpace
Fedora™
LOCKSS
Greenstone
OCLC Digital Archive
DSpace
A specialized content management system that:
manages and distributes digital items
allows for creation, indexing and searching of metadata
supports long term preservation of material
is designed to make submission and administration easy
DSpace
Developed by MIT and Hewlett Packard
Based on freely available software
can use proprietary software as well with minor modifications
Customizable
Academic community is especially active in the use of this implementation
UNIX based; written in Java
DSpace
No support available
Preservation is done locally and is not inherent in the system
Downloads and specific information at http://www.dspace.org
DSpace Demo - MIT Press: https://hpds1.mit.edu/handle/1721.1/1776
Fedora™
Flexible Extensible Digital Object and Repository Architecture
“An Open-Source Digital Repository Management System” – the architectural underpinning or plumbing
Used to support institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation
Fedora™
Cornell and University of Virginia, funded by Mellon
Freely available
Based on open source software and web based technologies
Limited interfaces
Management
Access
Access Lite
[Figure: Fedora™ Architectural Model]
Behavior Definition Object: Persistent ID (PID); System Metadata; Datastreams (specs): Method Definition Metadata
Behavior Mechanism Object: Persistent ID (PID); System Metadata; Datastreams (executables): Method Implementation Metadata
Data Object: Persistent ID (PID); Disseminators; System Metadata; Datastreams
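The three object types in the architectural model can be mirrored as plain data structures to make their relationships easier to see. This is a loose sketch built only from the labels in the figure. It is not Fedora's API, and the class layout, the idea that a disseminator points at one behavior definition and one behavior mechanism by PID, and the `available_methods` helper are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BehaviorDefinition:
    pid: str
    system_metadata: Dict[str, str]
    method_definitions: List[str]  # the "specs" datastreams


@dataclass
class BehaviorMechanism:
    pid: str
    system_metadata: Dict[str, str]
    method_implementations: Dict[str, str]  # the "executables" datastreams


@dataclass
class Disseminator:
    # Assumed linkage: each disseminator names a definition and a mechanism.
    behavior_definition_pid: str
    behavior_mechanism_pid: str


@dataclass
class DataObject:
    pid: str
    system_metadata: Dict[str, str]
    datastreams: Dict[str, bytes]
    disseminators: List[Disseminator] = field(default_factory=list)


def available_methods(
    obj: DataObject, definitions: Dict[str, BehaviorDefinition]
) -> List[str]:
    """List the method names a data object exposes through its disseminators."""
    methods: List[str] = []
    for disseminator in obj.disseminators:
        definition = definitions[disseminator.behavior_definition_pid]
        methods.extend(definition.method_definitions)
    return methods
```

The point of the separation is that the content (datastreams) stays put while the behaviors attached to it can be replaced, which is one way a repository can adapt access as technology changes.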
Fedora™
Installs on Windows PC
Packaged to get up and running quickly
Demo set of objects
Scales with hardware in a production environment
No support available
Plumbing only; no inherent preservation
Downloads and information available at http://www.fedora.info
LOCKSS
Lots of Copies Keeps Stuff Safe
Safeguards the web journals that libraries subscribe to
Mimics the way libraries manage paper collections
Redundant, distributed, decentralized
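The value of redundant, distributed copies is that damage to one copy can be detected by comparing it with the others. The sketch below illustrates that general idea with a simple majority comparison of digests. It is not the actual LOCKSS polling protocol; the function name, the site names, and the choice of SHA-256 are all assumptions.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def find_damaged_copies(copies: Dict[str, bytes]) -> List[str]:
    """Return the sites whose copy disagrees with the most common digest.

    Ties are broken by whichever digest was seen first, so this toy version
    is only meaningful when a clear majority of copies agree.
    """
    digests = {site: hashlib.sha256(data).hexdigest() for site, data in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(site for site, digest in digests.items() if digest != majority_digest)


# Three hypothetical libraries hold the same article; one copy has decayed.
copies = {
    "library_a": b"<html>article</html>",
    "library_b": b"<html>article</html>",
    "library_c": b"<html>artXcle</html>",
}
damaged = find_damaged_copies(copies)
```

Once a damaged copy is identified, it can be repaired from any site that holds the majority version, which is why more independent copies make the collection safer.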
LOCKSS
Works only for HTTP/HTML standard file types (html, jpeg, gif, pdf, etc.)
Open source code
It can be modified
Designed to be low cost, low time
Will run on a dedicated PC
PC specs available on the LOCKSS site
LOCKSS
Publishers can prevent LOCKSS from caching their content
Publishers must give libraries permission
Licensing language available on the LOCKSS web site
Freely available
No support (ease of use is highlighted)
Preservation is not inherent
http://lockss.stanford.edu/
Greenstone
A suite of software for building and distributing digital library collections
Produced by the New Zealand Digital Library Project at the University of Waikato
Developed and distributed in cooperation with UNESCO and the Human Info NGO
Open-source, multilingual software, issued under the terms of the GNU General Public License
Greenstone
“Should in fact work on any Windows or Unix system.”
“Local library”
“Web library”
Greenstone Librarian Interface
The “Organizer”
Greenstone
Documentation is available
Installer's Guide
Developer's Guide
Paper to Collection
Inside Greenstone Collections
MG/MG++
Workshops are also held
Listservs for implementors
Some technical support available
Not preservation oriented
http://www.greenstone.org/cgi-bin/library
OCLC Digital Archive
Standards based
OAIS compliant
METS encoded dissemination packages
Phased support for various formats and material types
Currently text and still image
Can integrate with current library selection and cataloging activities
Content owner manages the archived objects and determines access
Known costs
Offers bit preservation
OCLC Digital Archive Functions
Harvest from web
Preview and review
Metadata creation
Ingest
From web or batch
Access management
Public or restricted
Viewing
Dissemination
Reports
Periodic audits of objects in the archive
Frequent backups and disaster prevention
Digital Archive Web Services
End User Access
OCLC Digital Archive Development
Preservation policy and plans in progress
Expanding formats and object types accepted
Active in development of preservation metadata standard and will comply
Active in developing digital repository certification
Additional information available at:
http://www.oclc.org/support/training/digitalarchive/
http://www.oclc.org/support/documentation/digitalarchive/
Other Tools
Australian PANDAS-PANDORA
CONTENTdm (content management)
SDSC Data Grid Technology
Web harvesting tools
E-records management software
Document management systems
Data warehousing technology
XML parsing tools
SDSC and others