Beyond the Repository: Integrating Local Preservation Systems with National Distribution Services
@DIGITIZED_LAURA
LG-72-16-0135-16
Beyond the Repository: Goals
• Investigate common problems in digital object curation, versioning, and interoperability between local repositories and distributed preservation systems
• Identify broadly applicable use cases and design patterns
• Propose high-level technical solutions
Beyond the Repository: People and institutions
Northwestern University: Evviva Weinraub (PI), Carolyn Caizzi, Laura Alagna, Brendan Quinn, Gina Petersen
University of California San Diego: Sibyl Schaefer
Advisory Board: Mike Giarlo (Stanford), Bert Lyons (AVPreserve), Mary Molinaro (DPN), Mike Ritter (University of Maryland), Justin Simpson (Artefactual), David Wilcox (Fedora/DuraSpace), Andrew Woods (Fedora/DuraSpace)
Beyond the Repository: Research questions
• How does one curate objects to ingest into a long-term dark preservation system?
• How does versioning of objects and metadata play out in long-term dark preservation systems, and how can these actions be automated?
• How can systems that store data differently be made more interoperable?
Beyond the Repository: Methodology
1. Gather information on the first two research questions via a survey of practitioners
   a. Understand the breadth of implemented local systems
   b. Identify local workarounds and metadata fixes in place to address these issues
   c. Gather data about local preferences around versioning
   d. Identify preservation policies and rights issues
2. Hold a series of in-depth interviews to gather additional qualitative information
3. Using this data, work with the Advisory Board to design high-level requirements for increased interoperability between local and distributed systems
4. Disseminate findings
Preliminary results: survey metrics
• 170 valid responses
• 65% have collected 10 TB or more
• More than 80% expected their content to grow by at least 10 TB in the coming year
• Wide geographic distribution represented, including 15 international responses
• Mostly academic libraries (77%)
• 73 people were willing to discuss further with us
Survey results: Systems used
Survey results: Distributed storage & number of copies
• Respondents who reported not keeping multiple copies cited funding as the most common barrier
• 85% of respondents reported keeping multiple copies in multiple locations
• Of these, the vast majority keep three copies
[Chart: number of copies kept (2, 3, 4, 5, 6, 7+)]
Survey results: Where copies of data are stored
Survey results: How copies are tracked
Automatic
Don’t keep track
Homegrown tool
IT support does it
MetaArchive Conspectus
Spreadsheet, database, or other manual method
Survey results: Versioning & curatorial decisions
When versioning distributed copies:
• 85% of respondents reported keeping all versions
• 20% reported only keeping the newest version
• 20% were unsure
• Many indicated that versioning practices are dependent on the type of materials
In terms of selection:
• 48% of respondents say they select a subset of materials to go to a distributed repository
• The top two selection criteria for these materials were:
  • Mandate (legal, grant, or other)
  • Intrinsic value
Survey results: What is lacking in current tools and services
Systems are too specialized
Systems not well-integrated
Content tracking and reporting
Preservation metadata/events
Migration/versioning support
Scalability
Technology requires too much expertise
Automation
It’s not the technology, it’s how we use it
Support for multiple content types
Fixity
Nothing is lacking
Storage options
Access control/rights statements
Technology requires too much staff time
Next steps
July/August: Interviews
August: Interview analysis
September/October: Report writing
October: Advisory board meeting
December: Report dissemination
Thank you