Digital Preservation at HUL & DRS 2
![Page 1: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/1.jpg)
Digital Preservation at HUL & DRS 2
HMS Countway Library
Andrea Goethals
July 20, 2009
![Page 2: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/2.jpg)
Agenda
1. The problem
2. What are we doing about it?
3. DRS 2
4. Open for questions
![Page 3: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/3.jpg)
1. The problem …
![Page 4: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/4.jpg)
The problem is twofold
1. Keeping the bits safe
2. Keeping the bits useful to people
![Page 5: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/5.jpg)
Keeping the bits safe
- Digital things are amazingly easy to destroy
  - Bad people
  - Software or hardware failure
  - Human mistakes
- Destruction is not always apparent
  - Data not used frequently is at risk of unnoticed damage
  - Some damage is not noticeable to human eyes and ears
![Page 6: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/6.jpg)
Keeping the bits useful to people
- Digital material is fragile
  - Humans are dependent on technology to interpret the content...
  - Technologies must understand the format of the content
  - Technologies age and disappear!
![Page 7: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/7.jpg)
Using information content
[Diagram: two stacks compared side by side.]
- Analog book (unmediated use): information content; symbols, language; HW (paper)
- Digital book (technology-mediated use): information content; bits, formats; SW, HW
![Page 8: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/8.jpg)
Formats are key to determining usability
[Diagram: the stack information content, bits, formats, SW, HW, with brackets labeled "digital content" and "supporting technologies".]
Formats are the bridge between the content we want to preserve and supporting technologies
![Page 9: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/9.jpg)
2. What are we doing about it?
![Page 10: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/10.jpg)
Keeping the bits safe
- Store the bits in multiple copies, in multiple places
- Make sure the bits are not corrupt
- Replace media periodically
- Restrict who can access the bits
- Be able to recover the bits!
![Page 11: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/11.jpg)
Keeping the bits safe at HUL
- 3-4 copies of each file, 2 different media
  - 1-2 (tape and sometimes disk): 60 Oxford Street, Cambridge
  - 1 (disk): Summer Street, Boston
  - 1 (tape): Southborough
![Page 12: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/12.jpg)
Keeping the bits safe at HUL
- Automated integrity monitoring
  - Drscheck script
    - Compares the MD5 of each file at the Summer Street location to the MD5 stored in a database
    - Also checks the 60 Oxford Street disk copy
    - A copy of each file checked ~every 2 weeks
  - Recent enhancement: Trigger on database update of MD5
- Storage media replaced every 4-5 years
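The integrity check described on this slide can be illustrated with a minimal sketch. It is not the Drscheck script itself, whose implementation the slides do not show: the function names, the `expected` dictionary (standing in for the database of stored MD5 values), and the status strings are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(expected: dict[str, str], root: Path) -> dict[str, str]:
    """Compare recorded digests with freshly computed ones.

    `expected` maps a relative file path to its recorded MD5 (a hypothetical
    stand-in for the database). Returns a status per file: "ok", "corrupt",
    or "missing".
    """
    report = {}
    for rel_path, recorded in expected.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif md5_of(target) == recorded:
            report[rel_path] = "ok"
        else:
            report[rel_path] = "corrupt"
    return report
```

Running such a check on a schedule is what catches the "unnoticed damage" risk for rarely used data: a file reported as "corrupt" or "missing" can then be restored from one of the other copies.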
![Page 13: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/13.jpg)
Keeping the bits safe at HUL
- Overseen by OIS and UIS IT staff
- Just-in-case plans
  - Disaster recovery
  - Server fail-overs
  - Software failure
  - Tape libraries
  - Fabric switches
  - Lost or damaged tapes
  - Data recovery (corruption)
![Page 14: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/14.jpg)
It’s safe - but is it usable???
- It’s not enough to preserve the bits if the format of the bits is obsolete!
  - WordStar? AppleWorks? Excel 1.0?
- For digital content we are dependent on software that can understand the format…
![Page 15: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/15.jpg)
The importance of format
- Understanding formats is fundamental to preservation
ffd8ffe000104a46494600010201008300830000ffed0fb050686f746f73686f7020332e30003842494d03e90a5072696e7420496e666f000000007800000000004800480000000002f40240ffeeffee030602520347052803fc00020000004800480000000002d80228000100000064000000010003030300000001270f0001000100000000000000000000000060080019019000000000000000000000000000000000000000000000000000000000000000003842494d03ed0a5265736f6c7574696f6e0000000010008313a3000200 ...
![Page 16: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/16.jpg)
The importance of format
- Understanding formats is fundamental to preservation
ffd8ffe000104a46494600010201008300830000ffed0fb050686f746f73686f7020332e30003842494d03e90a5072696e7420496e666f000000007800000000004800480000000002f40240ffeeffee030602520347052803fc00020000004800480000000002d80228000100000064000000010003030300000001270f0001000100000000000000000000000060080019019000000000000000000000000000000000000000000000000000000000000000003842494d03ed0a5265736f6c7574696f6e0000000010008313a3000200 ...
SOI, APP0 (JFIF 1.2), APP13 (IPTC), APP2 (ICC), DQT, SOF0 (183x512), DRI, DHT, SOS, ECS0, RST0, ECS1, RST1, ECS2, ...
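To show how knowledge of the format turns the raw bytes into the segment list, here is a small sketch that walks the JPEG marker structure. It is an illustration only, not a tool from the talk: the function names and the `SAMPLE` constant are assumptions, and `SAMPLE` holds just the first 24 bytes of the hex dump on the slide, so the APP13 segment is reported by its declared length even though its payload is cut off.

```python
from typing import Optional

# Marker codes for the segments named on the slide (second byte after 0xFF).
MARKER_NAMES = {
    0xD8: "SOI", 0xE0: "APP0", 0xE2: "APP2", 0xED: "APP13", 0xDB: "DQT",
    0xC0: "SOF0", 0xDD: "DRI", 0xC4: "DHT", 0xDA: "SOS", 0xD9: "EOI",
}

# First 24 bytes of the hex dump shown on the slide.
SAMPLE = bytes.fromhex("ffd8ffe000104a46494600010201008300830000ffed0fb0")


def list_segments(data: bytes) -> list[tuple[str, int]]:
    """Return (marker name, declared length) pairs for the header segments."""
    segments = []
    pos = 0
    while pos + 2 <= len(data) and data[pos] == 0xFF:
        code = data[pos + 1]
        name = MARKER_NAMES.get(code, f"0x{code:02X}")
        pos += 2
        if code == 0xD8:  # SOI carries no length field
            segments.append((name, 0))
            continue
        if code == 0xD9:  # EOI ends the stream
            segments.append((name, 0))
            break
        if pos + 2 > len(data):  # length field itself is truncated
            break
        length = int.from_bytes(data[pos:pos + 2], "big")
        segments.append((name, length))
        if code == 0xDA:  # entropy-coded data follows SOS; stop here
            break
        pos += length  # the declared length includes the two length bytes
    return segments


def jfif_version(data: bytes) -> Optional[tuple[int, int]]:
    """Return (major, minor) if the stream opens with SOI and an APP0 'JFIF' segment."""
    if len(data) >= 13 and data[:4] == b"\xff\xd8\xff\xe0" and data[6:11] == b"JFIF\x00":
        return data[11], data[12]
    return None
```

On `SAMPLE`, `list_segments` yields SOI, APP0 (length 16) and APP13 (length 4016), and `jfif_version` yields `(1, 2)`, matching the "JFIF 1.2" label on the slide.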
![Page 18: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/18.jpg)
Keeping the bits useful to people
- Know what formats you have
- Make sure there’s technology to support the formats!
- Provide ways for people to find it
- Provide ways for curators to manage it
- Keep records of significant events
- Repair, replace
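"Know what formats you have" can be approximated, very roughly, by looking at the leading bytes of each file. The sketch below is an assumption-laden illustration, not how FITS or JHOVE work: the signature table covers only a handful of well-known magic numbers, and the names `SIGNATURES`, `identify`, and `format_census` are hypothetical.

```python
from collections import Counter
from pathlib import Path

# A few well-known file signatures (magic numbers); far from exhaustive.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),      # little-endian TIFF
    (b"MM\x00*", "TIFF"),      # big-endian TIFF
    (b"PK\x03\x04", "ZIP container"),
]


def identify(header: bytes) -> str:
    """Guess a format name from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def format_census(root: Path) -> Counter:
    """Count files under `root` by guessed format."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            with path.open("rb") as handle:
                counts[identify(handle.read(16))] += 1
    return counts
```

A large "unknown" count in such a census is itself useful information: it points at the content most at risk of becoming unusable.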
![Page 19: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/19.jpg)
Can we approach the problem differently?
- In a way that’s more proactive?
- And more efficient?
- And less expensive?
Yes…
![Page 20: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/20.jpg)
The content production matters!
- The least expensive and most effective preservation measure is to think about the future when digital content is created!
- It makes good sense to try to influence the content creation process
![Page 21: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/21.jpg)
Preservation lifecycle
- Create digital content
- Ingest into a preservation repository
- Continuous cycle of:
  - Monitoring
  - Planning
  - Intervention
- Subject to collection management decisions
- Transfer to next generation of the repository or to a different repository
![Page 22: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/22.jpg)
Keeping the bits useful to people at HUL
- Guidelines
  - More ‘preservable’ file formats: standard, well-understood, well-supported, open
  - Recommended supplementary documentation (metadata)
- Tools
  - FITS, JHOVE: check quality of files, automated metadata extraction
- Staff available to consult
![Page 23: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/23.jpg)
Keeping the bits useful to people at HUL
- Collection management applications
- Discoverable content
  - Catalogs
  - Persistent names
  - Search engines
- Extensive metadata
  - Administrative, Technical, Structural, Provenance
- Suite of delivery applications…
![Page 24: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/24.jpg)
Keeping the bits useful to people at HUL
- Suite of delivery services
  - Delivery applications created and maintained at OIS
    - IDS, PDS, SDS, ADS, FTS
  - Third party middle-ware maintained at OIS
    - RealServer, Luratech JPEG 2000 Server
  - Third party rendering applications on users’ desktops
    - Web browsers, RealAudio Players, TIFF viewers, ZIP utilities
![Page 25: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/25.jpg)
Involvement in broader preservation community efforts
- E-journal archiving
- Technical metadata
  - Still images, audio, documents
- METS (package for metadata and digital objects)
- PDF/A
- PREMIS (preservation metadata)
- AIHT (repository interaction demonstration)
- Registry of digital masters
- Repository certification
- Formats registry (UDFR)
![Page 26: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/26.jpg)
3. DRS 2 …
![Page 27: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/27.jpg)
DRS 2 changes
Why?
1. To better support digital preservation
2. To better support needs of DRS depositors, curators and collection managers
![Page 28: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/28.jpg)
DRS 2 changes
1. New conceptual foundation
   - Objects, content models
2. User improvements
   - Opaque objects, new file formats, tools, guidance
3. A new approach to metadata
4. Increased preservation planning and activities
![Page 29: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/29.jpg)
Objects
- Currently only a file level in the DRS
  - All management has to be done at the individual file level
- Objects are aggregations of files
  - Page-turned object
  - Still image object
- More intuitive unit for management, reporting and searching
  - Example: How many Page-turned objects do I have in the DRS?
![Page 30: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/30.jpg)
Content models
- Types of objects
- Example: audio content model
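The idea of objects as aggregations of files, each typed by a content model, can be sketched as a small data structure. This is a hypothetical illustration and not the DRS 2 schema: the class names, field names, and content model labels are all assumptions chosen to mirror the wording of the slides.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DrsFile:
    """One deposited file (hypothetical fields)."""
    name: str
    file_format: str


@dataclass
class DrsObject:
    """An aggregation of files, typed by a content model (hypothetical fields)."""
    object_id: str
    content_model: str
    files: list[DrsFile] = field(default_factory=list)


def count_by_content_model(objects: list[DrsObject]) -> Counter:
    """Answer questions such as 'How many page-turned objects do I have?'."""
    return Counter(obj.content_model for obj in objects)


holdings = [
    DrsObject("obj-1", "page-turned", [DrsFile("p001.tif", "TIFF"), DrsFile("p002.tif", "TIFF")]),
    DrsObject("obj-2", "still image", [DrsFile("photo.jp2", "JPEG 2000")]),
    DrsObject("obj-3", "page-turned", [DrsFile("p001.tif", "TIFF")]),
]
```

With management at the object level, the example question from the slide becomes a single lookup, `count_by_content_model(holdings)["page-turned"]`, instead of a scan over individual files.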
![Page 31: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/31.jpg)
Support for opaque objects
- A special content model
- Allows files in any format
- Digital equivalent of buying time at HD
  - Content can be minimally processed, or can be fully processed by depositors but not yet supported by the DRS
- Must be intended for long-term preservation
- Will receive some preservation services
- Will be on a path to fuller DRS preservation
![Page 32: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/32.jpg)
Support for new file formats
- PDF
- Audio
  - MP3, MP4/AAC
- Drawings
  - AutoCAD
  - Adobe Illustrator
- Video
- What’s next?
![Page 33: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/33.jpg)
Deposit, management & delivery tools
- Enhanced Batch Builder
  - Integrated with File Information Tool Set (FITS)
- Enhanced DRS Web Admin
  - Better searching
  - Richer management and reporting
  - Ability to perform batch updates
- File Delivery Service (FDS)
  - Created for PDF delivery
  - Delivers a file to user’s web browser
![Page 34: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/34.jpg)
Future of http://hul.harvard.edu/ois/
![Page 35: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/35.jpg)
Guidance & user community
New website for digital preservation
- Formats central
- Content models
- DRS practices
- HUL digital preservation projects
- Emerging standards and best practices
- Tools, services, registries
- Resources & Experts
![Page 36: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/36.jpg)
A new approach to metadata
- Moving towards community-standard schemas
  - PREMIS, MODS, MIX, textMD, etc.
- Metadata files on the file system alongside content files
  - “object descriptor files”
  - Preservation, rights, descriptive metadata
- More reliance on embedded metadata
  - Automatic extraction at deposit time by FITS
  - Third party delivery applications are becoming aware of file-embedded metadata
![Page 37: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/37.jpg)
Increased preservation planning and activities
- More granular format identification
- Sub-file characterization
- Preservation plans per content model
- Digital first aid (content & metadata)
  - “Localization,” migrations, normalizations
- Technology watch
- Virus checking
![Page 38: Digital Preservation at HUL & DRS 2](https://reader036.vdocument.in/reader036/viewer/2022062519/568150a0550346895dbe9dab/html5/thumbnails/38.jpg)
4. Open questions …