ARTstor
Digital Preservation
Shared Challenges, Gaps, and Needs
ARTstor
DJW, 3/2/05: 2
Shared Challenges, Gaps, and Needs
Reflections on the anniversary of the Report of the CLIR/RLG Task Force on Digital Archiving
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
The Task Force on Archiving of Digital Information
May 1st was the tenth anniversary of the report
Full disclosure
The test of time:
Key findings
Recommendations
Key findings of the Task Force
The first line of defense rests with the creators, providers and owners of digital information.
Long-term preservation requires a deep infrastructure capable of supporting a distributed system of digital archives.
A critical component of the digital archiving infrastructure is the existence of a sufficient number of trusted organizations.
A process of certification for digital archives is needed to create an overall climate of trust about the prospects of preserving digital information.
Certified digital archives must have the right and duty to exercise an aggressive rescue function as a fail-safe mechanism for preserving valuable digital information that is in jeopardy of destruction, neglect or abandonment by its current custodian.
Task Force Recommendations (1)
Pilot projects
Preserve objects of the early digital age
Identify and address economic and legal barriers
Applications of technologies and services (emulation, IP transactions, object authentication mechanisms)
Task Force Recommendations (2)
Support structures
Preservation-friendly national policies (network pricing, security, tax incentives and accounting structures)
Legal foundations for effective fail-safe archives
Community- and discipline-based archives
Standards, criteria, and mechanisms for certification
Points of contact for international cooperation
Task Force Recommendations (3)
Best practice case studies
Facilitating archiving at creation
Massive storage
Metadata standards
Migration paths
Preservation state of the art
From prayers …
James Gleick reported in The New York Times (“The Digital Attic: An Archive of Everything,” 12 April 1998) that “the Daiho Temple of Rinzai Zen Buddhism held a ‘memorial service for lost information’ in Kyoto and online.”
“After the effort of transforming all this knowledge into electronic information has been completed, is it enough then to say that we are finished? ... There are many 'living' documents and softwares that are thoughtlessly discarded or erased without even a second thought. It is this thoughtlessness that has drawn the concern and attention of Head Priest Shokyu Ishiko. Head Priest Ishiko hopes that through holding an "Information Service" and by teaching the words of Buddha, that this 'information void' will cease to exist” (http://www.thezen.or.jp/jomoh/kuyo.html).
We need to help them!
… To this workshop
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
Intellectual property
Common complaint: it is very difficult to preserve intellectual property because copyright law limits copying for such purposes.
The issue is especially complicated because of the popular formula—“preservation is access”
This equation arose mainly to mobilize efforts to preserve out-of-copyright brittle books.
But if preservation is a backdoor for redistributing “at-risk” in-copyright materials, then rights-holders would resist
An aggressive rescue function
The Task Force subcommittee addressing the IP topic had high aspirations for detailed recommendations, but generated little consensus and so dealt with intellectual property in a limited way.
It focused on a simple call for the law, and particularly Section 108 of the copyright law, to allow for an aggressive rescue function.
The Task Force formulation:
“No distributed system of digital archives will afford effective protection of electronic information unless it provides for a powerful rescue function allowing one agency, acting in the long-term public interest of protecting the cultural record, to override another’s neglect of or active interest in abandoning or destroying parts of that record.”
Neglect and willful abandonment and destruction are prominent attributes of the digital environment
Neglect and willful abandonment or destruction
The loss of the Task Force correspondence archives and the preservation prayer.
The judgments against Morgan Stanley for failing to preserve email.
Publishers appear to be highly vulnerable to legal demands, editorial second-guessing, and other activities that result in the removal of materials from their own archives.
The Elsevier “Vanishing act” was documented on the LIBLICENSE listserv in 2004
Such “acts” produce a “Swiss cheese” effect in the cultural record
Jim O’Donnell: The “Vanishing Act” discussion “is disturbing, because it is the tip of the iceberg, I think: if for fairly transient reasons, publishers will pull articles, when might not publishers prove unreliable for other reasons?”
Solutions?
Jane Ginsburg and June Besek are engaged in a Mellon-funded study of legal strategies for protecting archives that are preserving parts of the cultural record from being subject to takedown demands
Emergence of discussion of light versus dark archives, where “dark archives” restrict access
Will restricted archives attract investment?
The experience of LOCKSS and especially Portico is relevant
The Portico business model had to be changed to exclude deep investment in access, because publishers would not contribute and the access investment proved too expensive for libraries
The Section 108 Study Group recently took public comments on its version of the “aggressive rescue function”: a “preservation-only” exception to the copyright law
Preservation-only exception
An exception would allow qualified institutions to copy “at risk” materials, with access to those copies severely restricted
Access would be articulated in the definition of “trigger conditions” allowing varying types of access depending on conditions such as:
Monitoring by staff and “auditors”
Need for use by qualified researchers
Creating a “replacement copy,” if the work is no longer available on the market at a fair price
If the work is abandoned or orphaned
If permission is given
When copyright expires
Conclusion on IP issues
There are no easy answers
Although preservation may not equal access, the “triggers” suggest that there is a relationship between preservation and access, and that it is subtle and nuanced.
In the preservation arena, IP may need to be addressed by attention to nuance.
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
A network of trusted institutions
The Task Force suggested that:
“Repositories claiming to serve an archival function must be able to prove that they are who they say they are by meeting or exceeding the standards and criteria of an independently-administered program for archival certification.”
Certification has attracted considerable attention.
But trust is the central issue.
The Need for Trust
Trust is important mainly in its absence:
When there is a lack of confidence
When there are multiple interests in an object or activity and one or more parties lack control
When there is the potential for those in control to cause loss or harm, either deliberately or inadvertently
In the preservation context, there is a need for trust because:
There is uncertainty about how to preserve information in a digital environment
Those interested in preserving the scientific and cultural record do not control it
The potential for loss is great and growing
Preservation features that build trust
Technical ability:
The repository must be able to maintain the authenticity and integrity of the information
The information must demonstrably be what it purports to be.
Organizational design
A commitment to the preservation mission,
Protections and controls against loss
Well-defined preservation services
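The integrity requirement under “Technical ability” above is often approached with fixity checks: a digest is recorded when an object is ingested and recomputed later to detect change. The sketch below is illustrative only. The function names `sha256_digest` and `verify_fixity` are hypothetical, the choice of SHA-256 is an assumption, and nothing here is drawn from the Task Force report. A matching digest shows only that the bits are unchanged; authenticity also depends on provenance and organizational controls.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the object's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """Check that the bytes still match the digest recorded at ingest."""
    return sha256_digest(data) == recorded_digest


# Record a digest at ingest time (hypothetical sample object).
original = b"preserved object"
recorded = sha256_digest(original)

# Later audits recompute the digest and compare it with the record.
intact = verify_fixity(original, recorded)
altered = verify_fixity(original + b"!", recorded)
```

In this sketch `intact` is true and `altered` is false, so a single changed byte is enough to fail the audit.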
A significant body of work on trust in archives
RLG/OCLC Working Group on Digital Archive Attributes (2002)
RLG-NARA Task Force on Digital Repository Certification (2005)
Center for Research Libraries is testing the feasibility of a certification process against a variety of existing or emerging archives
Meanwhile:
The National Science Board report on “Long-Lived Data Collections” (2005) observed layers of responsibility (local, group, national) within scientific communities with the need for oversight growing as the collections are more important to each community
The Digital Preservation Coalition study, Mind the gap: Assessing digital preservation needs in the UK (2006) has generalized the point to suggest that different communities have different needs for regulation and certification of preservation activity.
Enhancing trust through the law?
Preservation repositories certainly need to be protected against “takedown” orders
Specific, highly regulated industries might require preservation certification as a condition of doing business
In general, a consensus is emerging that certification should be community-driven rather than mandated centrally for all repositories, because a unified process would be too difficult and cumbersome to administer across communities with differing requirements and needs
Open questions
Where are the communities of common interest?
How do they encourage openness and transparency in the interest of establishing trust?
What standards need to be enforced?
How general are these standards across communities?
When is it in the self-interest of preservation agents to submit to certification and audit processes?
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
The interoperability gap
There is room for extensive cooperation among archives, including various kinds of divisions of labor for more efficient operations
An example: preservation requires bulk transfer of content across and among repositories
What standards and services are needed?
PREMIS data model and data dictionary
Format registries
Need for practical experiments and testing
NDIIPP Archive Ingest and Handling Test
Preliminary results of the New York workshop on “Augmenting Interoperability across Scholarly Repositories”
Augmenting Interoperability
Focus on complex digital objects
Key service functions need to be represented in repository interfaces
Get/Obtain/Harvest
Put, or request for submission
Notion of “surrogate” as a representation of the complex object: Does it reference the object or contain it?
Generalized data model:
Unique identifier (need standard ways to locate repository)
Other features: lineage or provenance
Need for substantial inter-repository experiments in multiple domains (e.g. chemistry; museums; archaeology; preservation) to formulate and test the general data model.
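The ideas on this slide (get and put service functions, a surrogate that either references or contains the object, a unique identifier, and lineage) can be made concrete with a small sketch. This is a hypothetical illustration, not the workshop's data model: the class names `Surrogate` and `Repository`, the field names, and the in-memory store are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Surrogate:
    """Representation of a complex object exchanged between repositories."""
    identifier: str                  # unique identifier for the object
    repository: str                  # where this copy is currently held
    lineage: List[str] = field(default_factory=list)  # earlier holders, oldest first
    content: Optional[bytes] = None  # None means the surrogate only references the object

    def contains_object(self) -> bool:
        """True if the surrogate carries the object itself, not just a reference."""
        return self.content is not None


class Repository:
    """Toy in-memory repository exposing the two service functions."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._store: Dict[str, Surrogate] = {}

    def put(self, surrogate: Surrogate) -> Surrogate:
        """Accept a submission and extend its lineage with the previous holder."""
        accepted = Surrogate(
            identifier=surrogate.identifier,
            repository=self.name,
            lineage=surrogate.lineage + [surrogate.repository],
            content=surrogate.content,
        )
        self._store[accepted.identifier] = accepted
        return accepted

    def get(self, identifier: str) -> Surrogate:
        """Return the surrogate held under the given identifier."""
        return self._store[identifier]


# Transfer one object from a depositor through repo-a to repo-b.
source = Repository("repo-a")
target = Repository("repo-b")
source.put(Surrogate("obj-1", "depositor", content=b"image bytes"))
copy = target.put(source.get("obj-1"))
```

After the transfer, `copy.lineage` is `["depositor", "repo-a"]` and `copy.repository` is `"repo-b"`. A real exchange would also need a standard way to locate the repository that holds a given identifier, which this sketch leaves out.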
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
Business models
Preservation depends on commitment AND sustained resources
How to generate sustained resources?
Preservation is a public good and is subject to free-riding problems
Government taxation and funding address the problem, but are they enough?
Libraries, archives, and museums may be able to take responsibility as part of their mission, but economies of scale are an issue
Service-bureau solutions work at scale but may be operating in special classes of enterprise, such as those in a two-sided market
Two-sided markets
Credit cards are a classic case
How to crack the chicken-egg problem: customers won’t use the card if stores don’t accept it; stores won’t accept the card if there aren’t enough customers
The Visa solution: charge the merchant; the AMEX “prestige” solution: charge both
LOCKSS and Portico both have struggled with this problem of publisher and library participation
Portico required several rounds of intensive negotiations and marketing analyses on both sides
Abandoned initial interest in access provisions and lowered costs in what might be characterized as an insurance model
Are other business models appropriate in other arenas?
Resources depend on interaction with commercial entities
Commercial versus public and not-for-profit solutions?
Commercial interests in preservation are powerful if the IP is profitable
Government funding depends on competition in a political process
Not-for-profit missions may be divided; in libraries, for example, between preservation and access
There need to be imaginative approaches to developing shared vision and mutual support across sectors
Learn and draw from commercial preservation expertise in film, sound recording, pharmaceutical, and other industries
Commercial uses of preservation archives: indexing; content from one publisher to another; addressing perpetual access claims
More imagination needed?
Next steps
The grand challenges of Digital Preservation
Intellectual property and the relation between preservation and access
A network of trusted institutions and the question of certification
The interoperability gap
Business models and models of cooperation
These are some suggested areas of priority