was uc3-nov2012wkshps-final
Imagine a world …
This is our world …
WAS … is
A service of the UC Curation Center to collect, manage, preserve and publish websites and documents.
WAS Snapshot
53 public archives
120+ archives total
7,500+ sites
50+ TB
23 institutions
WAS Institutions
• Institute of Governmental Studies Library, UCB
• UC Berkeley Office of Public Affairs
• UC Davis Libraries
• UC Irvine Libraries
• UC Los Angeles Libraries
• UC Riverside Libraries
• UC San Diego Libraries
• UC San Francisco Libraries
• UC Santa Barbara
• UC Santa Cruz McHenry Library
• Emory University Library
• Institute for Research on Labor and Employment
• New York University
• Northwestern University Library
• Purdue University
• Stanford University Libraries
• Temple University
• University of Arkansas Libraries
• University of Illinois at Urbana-Champaign Libraries
• University of Michigan, Bentley Historical Library
• USDA Economic Research Service
• Water Resources Collections and Archives
WAS Overview
A) Curator Tools
Curator Workflow
1. Create Site
• Enter site name, URL and description
• Scope
• Capture frequency
• Robots.txt
2. Capture Sites
3. View Captures
• View captures
• QA
• Compare
4. Public Access
• Customize the archive
• Write description
• Create custom banner and icon
WAS Overview
B) Public Archives
Web Archive ‘home page’
Browse: Site List + Tags
Search: All Sites in an Archive
Integration with your Systems
How are people using WAS?
Institution’s website
• Preserve institutional history
• Capture university news and events
Geographically focused
Topical
Support special research collections
Event
• Sudden action required
• May need many selectors
• Start date / end date
Researcher’s Perspective
• Building collections for research
  – Study the topic / event
  – Study site change or web-based communication
  – Websites are datasets for analysis and data mining
• Preservation of research
  – Archive grant-funded websites
  – Selected sites
• Create stable citations for publications
Get started!
• Each library has WAS administrator(s)
• Unlimited number of curators per account
• What’s the cost?
  – UC does not pay a service fee
  – Storage only: $1,040 per TB (the average site costs $1.46 annually); storage costs are expected to go down
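As a rough illustration of the storage pricing quoted above, the sketch below converts between site size and annual cost. It assumes the $1,040 per TB rate is charged per year and that 1 TB = 1,000 GB; the slides state neither, and the function names are hypothetical, so treat the numbers as estimates only.

```python
# Assumptions (not stated in the slides): the $1,040 per TB rate is annual,
# and 1 TB = 1,000 GB (decimal units).
RATE_USD_PER_TB = 1040.0
GB_PER_TB = 1000.0


def annual_storage_cost(size_gb: float) -> float:
    """Estimated yearly storage cost in USD for a site of the given size in GB."""
    return size_gb / GB_PER_TB * RATE_USD_PER_TB


def size_for_cost(cost_usd: float) -> float:
    """Site size in GB that would cost `cost_usd` per year at the quoted rate."""
    return cost_usd / RATE_USD_PER_TB * GB_PER_TB


# Size implied by the quoted average cost of $1.46 per site per year.
implied_avg_gb = size_for_cost(1.46)

if __name__ == "__main__":
    print(f"Implied average site size: {implied_avg_gb:.2f} GB")
    print(f"Estimated annual cost of a 10 GB site: ${annual_storage_cost(10):.2f}")
```

Under these assumptions, the quoted average cost corresponds to a site of roughly 1.4 GB.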
Challenges
• Shared collection development
• Metadata issues
• Workflow and cost models for faculty projects
• Time!
• Limitations of web crawlers
• Websites are messy
Contact me!
Rosalie Lack
WAS Service [email protected]
Imagine a world …
“Imagine a world in which libraries and archives
had never existed. No institutions had ever
systematically collected or preserved our
collective cultural past: every book, letter, or
document was created, read and then
immediately thrown away. What would we know
about our past?”
This is our world …
“Yet, that is precisely what is happening with the
web: more and more of our daily lives occur
within the digital world, yet more than two
decades after the birth of the modern web, the
“libraries” and “archives” of this world are still
just being formed.”
A Vision Of The Role And Future Of Web Archives
Kalev H. Leetaru, Graduate School of Library and Information Science, University of Illinois.
Presented as the keynote address at the 2012 IIPC General Assembly in Washington, DC.
http://netpreserve.org/sites/default/files/resources/VisionRoles.pdf