SAA 2015 Web Archiving Roundtable
TRANSCRIPT
WAS to Archive-It Migration
Visualization of linking between websites of different languages, Babel 2012 project
Rosalie Lack [email protected]
Who?
WAS is …
• A service of University of California’s California Digital Library
• 2004: Funded by National Digital Information Infrastructure Preservation Program (NDIIPP)
• 2006: Launched with partner institutions
• 2009: Transition to subscription model
• 2015: 21 UC institutions; 12 external
Archive-It
A subscription service from the Internet Archive, which allows institutions to build, manage and search their own web archive
Over 300 partner orgs in the U.S. and worldwide
www.archive-it.org
Why?
Flickr by Daniel Foster
Lean Books in Wikimedia Commons
Flickr by James
How?
CUL-hosted Web Archiving Policies and Practice in the US summit
“… an articulation of a small number of model programs for web archiving, and development of ‘best practices’ for documenting program elements”
May 2012
Attendees: CDL, Columbia, CRL, Cornell, Duke, Georgetown, Frick, Harvard, Indiana, IA, LC, Michigan, North Texas, NYU, Sloan, Stanford, UC Irvine, UT Austin, Virginia Tech
https://webarch.cul.columbia.edu/
CDL-hosted meeting
“… more robust collaboration was desirable in order to collectively address these challenges [research use, intensive resource requirements, the pace of change, fragmented collection development, etc.] and went so far as to brainstorm the benefits and risks of an all-in, formal association”
June 2014
Attendees: CDL, Columbia, George Washington, Harvard, IA, LC, North Texas, Stanford
http://bit.ly/1N1GgGj
Collections/Access/QA Opportunities
• Federation/aggregation/collocation
• Collaboration on collection development
• Crowdsourced selection and QA
• Education and advocacy
• Create and endorse policies, best practices and standards
Supporting Research Opportunities
• Outreach
• Pilot projects
• Computational analysis tools
• Tools, tools, tools
Technology Opportunities
• Shared infrastructure/operations
• Data capture tools
• Collaborate on API development
• Preservation solutions
• Tools, tools, tools
Steps toward collaboration: Community Principles for Web Archiving at Scale
“… a lightweight structure by which web archiving institutions can work collectively in order to achieve significant functional goals and operational efficiencies that they are unlikely to achieve individually”
September 2014
CDL, Columbia, George Washington, Harvard, IA, LC, North Texas, Stanford
http://bit.ly/1NoB2l1
“… rely on external service providers whenever possible, and restrict local efforts to areas in which institutions can uniquely add value.”
Value-added services locally or collaboratively developed
Next Steps
• Complete the migration
• Conduct user research into researcher needs
• Define, build and share APIs to meet specialized needs
• Explore feasibility of a national collaborative model for web archiving
• Continue to look for funding opportunities to help facilitate this effort (IMLS 2016)
Flickr by stu_spivack
Questions?
Rosalie Lack [email protected]
Potential architecture