Post on 09-May-2018
Preserving our digital present for future memories: technical and social challenges
XXXIV Reunión Nacional de Archivos 2012
Villahermosa, Tabasco, México
Francisco Barbedo
Direção Geral do Livro, Arquivos e Bibliotecas
francisco.barbedo@dgarq.gov.pt
So… What is digital preservation?
• Keep digital information available (usable, authentic, reliable)
• For as long as it is operationally required and socially relevant (maybe 20 years, or forever)
• Independently of the technology originally used
▫ The objective of preservation is not only to transmit our heritage to future generations and maintain the capability to understand and reuse what we have preserved, but also to permit the ongoing use of information inside institutions
▫ We’ll get back to this…
Digital information, digital object
• Digital information is functionally similar to paper, in that both support business activities. But its features turn it into a completely different object.
• 0s and 1s are difficult to read
• Things change, disappear, cease to exist…
▫ Software, hardware, knowledge, people
Digital information, digital object
• Digital information depends on an intermediary system. It cannot be used or accessed directly by a human being.
Intermediary system
• The intermediary system is the software and hardware with which the information was produced. It also includes other components such as the operating system, applets, Java, browsers, etc.
• Greater complexity and richness of information.
• It’s everywhere. Everybody has it and produces it.
Digital information, digital object
• IT industry
• Very quickly evolving market
• Fast pace of obsolescence
• On average, 7 years of backwards compatibility
• If no preservation actions are performed, the risk of obsolescence increases considerably after 7 years
• This means, of course, that the need to preserve digital information has become part of the agenda of institutions
Digital information, digital object
• The problem with obsolescence is that we can no longer access the information because the intermediary system (IS) in which it was produced no longer exists
• We may still hold the information we can no longer access, probably stored on equally archaic media for which we no longer have adequate devices…
A real case: Gabinete da Área de Sines
• Year: 1971
• Period of existence: 1971-1989
• Place: Sines, Portugal
• Mission: manage the construction of an international deep-water harbour, urbanise the surrounding region, and build and run a large power station (the biggest in Portugal at that time)
• Big project, lots of resources
• IT resources acquired (UNIVAC mainframe computer)
A real case: Gabinete da Área de Sines
• The documentation was ingested into the National Archives
• Among 13,864 boxes, 88 magnetic tapes containing data
• Actions taken:
• Find a company with the devices and knowledge to read the tapes. None was found in Portugal. Eventually a company in the UK committed to the job
• Data still existed on the tapes: refreshed to a DVD
Case study
• We received a lot of files containing data.
• Was the problem solved?
• NO
• Because the data appeared like this
[Slide shows a raw dump of the recovered tape data: long runs of unprintable bytes, with only a few recognisable fragments, such as personal names in capitals and reference codes like "244645/OA"]
Digital information, digital object
• What do we need to perform digital preservation?
• Standards. We actually have some, and they are pretty effective (OAIS - ISO 14721)
• Certification.
▫ ISO 16363 (Audit and certification of trustworthy digital repositories)
▫ European Framework for Audit and Certification of Digital Repositories
• Strategies and methods. We have fairly good methods of preserving information, although not all information can be totally preserved.
▫ E.g. migration
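Migration, the example strategy named here, can be sketched in a few lines. This is only an illustration and not a procedure described in the talk: it assumes the legacy file is plain text in an EBCDIC code page (Python's `cp037` codec), and the function name `migrate_text` and the fields of the returned event record are hypothetical.

```python
import hashlib


def migrate_text(legacy_bytes: bytes, source_encoding: str = "cp037") -> dict:
    """Migrate legacy-encoded text to UTF-8 and record what was done.

    The source encoding is an assumption supplied by the caller; if it is
    wrong, the result is readable bytes carrying the wrong meaning.
    """
    text = legacy_bytes.decode(source_encoding)  # interpret the old encoding
    migrated = text.encode("utf-8")              # write the new encoding
    return {
        "migrated": migrated,
        # Checksums of both versions let a later audit link them together.
        "source_sha256": hashlib.sha256(legacy_bytes).hexdigest(),
        "target_sha256": hashlib.sha256(migrated).hexdigest(),
        "source_encoding": source_encoding,
        "target_encoding": "utf-8",
    }


if __name__ == "__main__":
    legacy = "SINES 1971".encode("cp037")  # stand-in for a legacy file
    event = migrate_text(legacy)
    print(event["migrated"].decode("utf-8"))
    print(event["source_sha256"] != event["target_sha256"])
```

The migrated file is a new bit sequence, so it is the event record, not the bytes themselves, that ties it back to what was originally received.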
Digital information, digital object
• Anything else??
• Oh, that’s right…
• MONEY… lots of it and forever…
DP problems: review
1. Technological
2. Financial
3. Legal
4. Trustworthiness of repositories
5. Loss of knowledge
6. Social issues
DP: technological issues
• Software upgrades fail to support legacy files.
• The format itself is superseded by another or evolves in complexity.
• The format "take up" is low or industry fails to create compatible software.
• The format fails, stagnates, or is no longer compatible with the current environment.
• Software supporting the format fails in the marketplace or is bought by a competitor and withdrawn.
• Hardware also evolves very quickly. New software does not run on old hardware and vice versa.
• Storage media and systems, including compression algorithms and backup technology, are highly volatile
• In any case the information becomes technologically “isolated”, and becomes impossible to access and to use.
DP: formats
• Without a format specification, a file is just a meaningless string of ones and zeros. The specification indicates the proper subdivision, encoding, sequence, arrangement, size, and internal relationships that uniquely identify the particular format and allow it to be properly interpreted and rendered.
• If the specification is open, i.e. everybody can consult it, the problem is mitigated.
• BUT… most of the time vendors keep their specifications closed, even for formats that have been discontinued.
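A small sketch may make the role of the specification concrete. The record layout below is entirely hypothetical (it is not the layout of any real format, nor of the Sines tapes); it only shows that the same bytes are opaque without the layout and become named, typed fields with it.

```python
import struct

# Hypothetical specification of one record (an assumption for illustration):
#   30 bytes  name, ASCII, space-padded
#    4 bytes  record number, unsigned big-endian integer
#    2 bytes  category code, ASCII
RECORD_SPEC = struct.Struct(">30sI2s")


def parse_record(raw: bytes) -> dict:
    """Interpret one raw record according to RECORD_SPEC."""
    name, number, code = RECORD_SPEC.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(),
        "number": number,
        "code": code.decode("ascii"),
    }


if __name__ == "__main__":
    raw = RECORD_SPEC.pack(b"EXAMPLE NAME".ljust(30), 12345, b"AB")
    print(raw)                # without the spec: an opaque byte string
    print(parse_record(raw))  # with the spec: named, typed fields
```

If the layout were closed or lost, a reader would have to guess the field widths, the byte order and the text encoding, and each wrong guess would yield plausible-looking but incorrect data.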
DP: legal issues
• What can we really preserve?
• The object or a representation of the object?
▫ Every time we act on the information for preservation purposes, we change it, and it becomes, in a certain way, different from the original information. That’s what happens with migration…
• The fact is that law is still being made with the paper world in mind. But paper is not digital… We cannot expect digital information to behave like paper, although it is all information.
• Plus, we must consider the digital rights attached to the material to be preserved
DP: financial issues
• Factors that affect DP costs:
The cost of the digital archival system (a digital depot or repository) and functionality for the long-term preservation of digital records
+ Personnel costs
+ The cost of the development (or procurement) of software and methods for the preservation of digital records, e.g. converters
+ The cost of the actual storage of digital records
+ Other factors that influence the total cost, e.g. communications
DP: financial issues
• As DP is a permanent process, money must keep supporting that effort.
• The amount of information to be preserved grows cumulatively (just like in the paper world).
• DP is a highly expensive business, and there is no guarantee whatsoever that any organisation will be financially able to support DP in the long term.
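The effect of cumulative growth on a permanent cost can be sketched with a toy model. Everything here is an assumption made for illustration: the function `cumulative_cost`, its parameters, and the figures in the usage example do not come from the talk, and a real cost model would include the other factors listed above.

```python
def cumulative_cost(years: int,
                    initial_tb: float,
                    yearly_intake_tb: float,
                    cost_per_tb_year: float,
                    fixed_cost_year: float) -> list[float]:
    """Return the running total spent at the end of each year.

    Holdings only grow (nothing is discarded), so the yearly bill rises
    even when the intake and the unit prices stay constant.
    """
    totals = []
    holdings = initial_tb
    spent = 0.0
    for _ in range(years):
        holdings += yearly_intake_tb
        spent += holdings * cost_per_tb_year + fixed_cost_year
        totals.append(spent)
    return totals


if __name__ == "__main__":
    # Illustrative figures only; none of them come from the talk.
    for year, total in enumerate(cumulative_cost(5, 10, 2, 100, 1000), start=1):
        print(year, total)
```

Because the holdings never shrink, each year's bill is larger than the last, which is why funding has to be planned as a recurring commitment rather than a one-off project.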
DP: trustworthiness of repositories
• It is difficult to prove the effectiveness of our preservation methods in the long term, simply because not enough time has elapsed since we started preserving digitally
▫ Computing has existed for c. 61 years (1951: UNIVAC, the first commercial computer). Paper has existed for many centuries.
• How can we certify that we really are effectively preserving the information that was delivered to us?
▫ How is trust built?
Certification of digital repositories helps. The concept is identical to ISO 9000: if a repository is certified according to a recognised standard, we expect that it runs its business effectively
• ISO 16363 is the standard for certification of digital repositories.
What do we expect from repositories?
• Long-term preservation of readability and accessibility in a way that is independent of any specific software or hardware
• Reliability and authenticity of digital records while carrying them across successive generations of information technologies. Because… we are talking of archival information and evidence
• Scalability to accommodate a huge amount of data and records
• Users expect… low cost, low trouble
But is all this possible?
DP: trustworthiness of repositories
• What commitment to DP can/should we take on? Because we shouldn’t make promises we are not certain to keep.
• We can attest that a bit sequence has not been physically changed or corrupted
• We must be able to preserve metadata about that bit sequence, which means data about its structure, digital rights, original process, intermediary system, social environment, etc…
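One common way to show that a bit sequence has not changed is a fixity check: a cryptographic digest is recorded at ingest and recomputed later. The sketch below uses SHA-256 from Python's standard `hashlib`; the choice of algorithm and the helper names `sha256_of` and `fixity_ok` are assumptions made for illustration, not something the talk prescribes.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """True if the bytes still match the digest recorded at ingest."""
    return sha256_of(data) == recorded_digest


if __name__ == "__main__":
    original = b"record content as received at ingest"
    digest_at_ingest = sha256_of(original)  # kept as part of the metadata

    print(fixity_ok(original, digest_at_ingest))   # unchanged copy
    corrupted = b"record content as received at ingesT"
    print(fixity_ok(corrupted, digest_at_ingest))  # one byte altered
```

A passing check only says the bits are the same; it says nothing about whether they can still be interpreted, which is why the metadata in the last bullet is needed as well.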
DP: Preserving knowledge
• Because…
• The social and organisational environments disappear, so how are we supposed to be able to recreate them, i.e. to preserve knowledge?
• Metadata helps a lot, because it documents all aspects of the information to be preserved so that it can be recreated in the future.
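As a rough sketch of what such documentation might look like, the record below stores the aspects the talk lists for a bit sequence (structure, digital rights, original process, intermediary system, social environment) next to an identifier. The class, its field names and the example values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreservationMetadata:
    # Field names are hypothetical; they mirror the aspects listed in the talk.
    identifier: str
    structure: str            # format or layout of the bit sequence
    digital_rights: str
    original_process: str     # business activity that produced the record
    intermediary_system: str  # software/hardware needed to render it
    social_environment: str   # organisational context at creation time


def to_json(record: PreservationMetadata) -> str:
    """Serialise the record so it can be stored alongside the object."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    example = PreservationMetadata(
        identifier="example-0001",
        structure="fixed-width records (layout documented separately)",
        digital_rights="public record, no access restrictions",
        original_process="personnel management",
        intermediary_system="mainframe application, magnetic tape",
        social_environment="public project office, 1970s",
    )
    print(to_json(example))
```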
DP: Social issues
• To better understand the social issues of digital preservation, we must take a short trip through globalisation and information growth
globalisation
• Globalisation is about things, people, countries getting closely connected.
▫ In good and bad times!
• The actions we (individual or collective agents) perform have a global impact on everyone’s life (whether minor or major)
• This is due in part to the massification of information, which is largely explained by the possibilities offered by technology
▫ Information production increases; information is more complex, richer, and disseminates quickly.
• It is pervasive => global (everywhere and everyone).
• It affects the whole range of human and social activities.
globalisation
• In 2020, the total estimated amount of information created = 35,000 exabytes! (in 2009 = 800 exabytes) (source: Oracle, 2012)
[Diagram: scale of storage units, from gigabyte through terabyte and petabyte up to exabyte]
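The two figures quoted above imply a growth factor and a yearly growth rate that are easy to work out. The sketch below assumes constant compound growth between 2009 and 2020, which is a simplification; the function names are illustrative.

```python
def growth_factor(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    return end / start


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Constant yearly rate that turns `start` into `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    factor = growth_factor(800, 35_000)                  # exabytes, 2009 vs 2020
    rate = annual_growth_rate(800, 35_000, 2020 - 2009)
    print(f"growth factor: {factor:.2f}x")
    print(f"implied yearly growth: {rate:.1%}")
```

Under that assumption the volume grows by a factor of about 44 in eleven years, or roughly 41% per year.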
Globalisation: some aspects of information
• Access to devices and media that allow anyone to produce information (cameras, video)
• Institutional information relies heavily on database systems, sometimes mixed with more complex data such as multimedia (GIS, medical imaging)
• Archivists must bring into their area of influence all information that supports business and constitutes evidence of it, not only “records” or “documents”
globalisation
• Powerful and widely available mechanisms for information search and retrieval (Google…)
• Open and accessible mechanisms for knowledge sharing (e.g. Wikipedia)
• Social and professional networks available to everyone… connections.
▫ Facebook, LinkedIn
globalisation
• New ways to interconnect, socially and at work.
• More potential for cooperation and the establishment of networks of activities, people, work, organisations, etc.
• Opportunity for sharing resources, maybe assets, creating horizontal structures (national or international) instead of vertical ones.
Social habits
• Remote social interconnection
• Heavy dependence on the internet for common professional and social activities (booking hotels, reservations, trips, e-commerce, social and professional relations through networks such as Facebook and LinkedIn, etc.)
• New information users: people under 25 who grew up with the internet. They have no idea that once there were no computers.
• The concept of physically visiting a specific place to get information, such as a library or archive, might seem a bit weird…
• What services do they expect from archives?
DP: Social issues
• New users. What will their attitude be towards preserving information that has always been easy to get?
• Will they be willing to spend money on keeping something whose value they might not see?
• We value things that are difficult to obtain more than those that are easy.
DP: Social issues
• Digital information also brings the need for DP to the core of organisational issues.
▫ Or, at least, it should…
• Unless they accept becoming informationless in the short term, institutions must formally recognise DP as an issue to deal with.
• That’s a new situation that requires adaptation from organisations and people
• They must include DP in their planning processes and budget.
Let’s build a community!
• A community is a network of actors that share interests in a specific domain. In this case: domain = digital preservation
• All kinds of actors can exist in the network: institutions (public/private), developers, businesses, consultants, citizens
community
• It can be based on a specific platform that must be open and freely reusable
• Advantages must be clearly perceived by the community-to-be.
• Sharing services and development
• Sharing costs
▫ Holders want low cost/low trouble, so let’s divide the effort and make our job easier.
Sustainable network
• Other issues must be considered for a community to work:
• a/ Identification of problems regarding particular realities at a national level
▫ Political commitment (for public administration)
▫ Installed base (social and technological)
▫ Installed skills and knowledge
▫ Existing platform: level of IT use
That’s because, for a network to function, its agents should be at comparable levels.
network
b/ Training and capacity building
• In order to achieve common levels of expertise and knowledge in DP: a platform of convergence
▫ In all domains: IT, archival, organisational, etc.
network
c/ Development
• Around an accepted common platform
▫ (why not RODA? Anyone can use it for free)
• That is open and reusable
• Within a strategic development plan accepted by the community
• Broad and inclusive, but specific enough that every development can be connected to the platform and developments do not overlap
▫ Quality control is necessary
Platform
• Although all actors can, and actually should, build add-ons and other IT functionality, it is best that every individual effort is gathered into a common, planned development strategy
• Advantages:
▫ No overlapping
▫ Usable add-ons
▫ Everyone can use those add-ons and integrate them into their specific solutions
network
• d/ Involvement and dynamics
• The network must be motivated and stimulated so that it can remain sustainable and keep evolving
• If the community is not actively stimulated, it will certainly disappear in the short term.
network
• Possibility of defining the best architecture for the community in terms of building digital repositories. What is the best distribution of digital repositories, for example?
and
• Shared storage capabilities
▫ Common, shared
my way or our way?
• “El camino se hace caminando…” (“The path is made by walking…”)
• We all have the same problems. Why not manage them together?
• But we cannot stick only to models and planning.
• We must act to build paths
• It may not be the best way, but it is surely better than not acting.
• Video on YouTube: Preserva Digital DGARQ
http://www.youtube.com/watch?v=47BZ6rXNcsQ