web archiving collaborations: a presentation for colleagues working in the libraries of the...

Post on 21-Jun-2015


DESCRIPTION

These slides were used to support a presentation on web archiving collaborations for colleagues working in the Libraries of the Metropolitan Museum of Art.

TRANSCRIPT

Web archiving collaborations at Columbia University Libraries

Anna Perricci

Columbia University Libraries

Metropolitan Museum of Art (August 19, 2014)

Web Resources Archiving Collaboration

Many thanks to the Mellon Foundation

Building collaborations among

• The web archiving community

• Other research libraries

• Users and potential users of web archives

• Website creators

Incentive grants to advance web archiving tools

Image source: http://imgur.com/gallery/vG7KE48

Incentive awards projects

Warcbase: Building a Scalable Web Archiving Platform on HBase and Hadoop. (Jimmy Lin, University of Maryland)

Archiving Transactions Towards Uninterruptible Web Service (Zhiwu Xie and Edward A. Fox, Virginia Tech University)

Incentive awards projects

Visualizing Digital Collections of Web Archives (Michele Weigle, Old Dominion University)

Tools for Managing Seed URLs (Michael Nelson, Old Dominion University)

Incentive awards projects

Perma.cc: Mitigating the Pervasive Problem of Link Rot in Scholarly Works and Preserving Online Content (Kim Dulin, The Harvard Library Innovation Lab)

Free Law Project

Providing free access to primary legal materials, developing legal research tools, and supporting academic research on legal corpora

Building an efficient and scalable national framework for collecting web content

Image source: http://imgur.com/gallery/1m5MBKf

Designated space for collaborative collecting

Collaborative Architecture, Urbanism and Sustainability Web Archive (CAUSEWAY)

https://archive-it.org/collections/4638

Collaboration with music librarians

Contemporary composers: the perfect storm?

Contemporary Composers Web Archive

Selectors

• Borrow Direct Music Librarians Group: music librarians at Brown, Columbia, Cornell, Dartmouth, Harvard, Johns Hopkins, Princeton, and Yale universities, MIT, and the universities of Chicago and Pennsylvania

Cataloging expertise

• Russell Merritt (cataloger specializing in music resources)

• Kate Harcourt (Director of Original and Special Materials Cataloging)

• Alex Thurman (Web Resources Collection Coordinator)

Contemporary Composers Web Archive https://archive-it.org/collections/4019

Quality Assurance

Creating MARC records for web archives

• Creating MARC records for archived websites is standard practice at CUL

  – MARC records make web archives discoverable in CLIO (Columbia Libraries Information Online)

• Collection level and seed level records

• Will use Archive-It interface to make Dublin Core records

Patron view of record in CLIO

Cataloger’s view of record in CLIO

Anticipating wider use of MARC records

• Records have been released to WorldCat

• Collaborators on cataloging were attentive to which fields will ordinarily be stripped out when a MARC record is imported to another institution’s OPAC

CCWA MARC records

• So far, a sample of 10 records has taught us…

• Positive feedback from music librarians

• Next, we will add another 44 records for the archived sites in CCWA

Project tracking

Use cases

Who are the web archives for? Are they being used? Could we encourage more effective use?

http://hrwa.cul.columbia.edu

Using the Human Rights Web Archive & learning from human rights scholars’ work (publications, citations)

Citations scraped from articles published in 2010 in select scholarly journals

Isolating URLs from list of citations (approximately 10% of citations scraped have URLs in them)
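The slides do not say how the URLs were isolated. As a rough illustration only, the step could be sketched in Python along these lines, assuming each scraped citation is a single plain-text string. The function names, the regular expression, and the sample citations are hypothetical and are not taken from the project.

```python
import re

# Assumption: a citation is one plain-text string, and a URL starts with
# http:// or https:// and runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_urls(citation):
    """Return the URLs found in one citation, trimming trailing punctuation.

    Trimming is a heuristic: a URL that legitimately ends in ')' or '.'
    would be shortened, so results may need manual review.
    """
    return [match.rstrip(".,;:)]") for match in URL_PATTERN.findall(citation)]


def share_with_urls(citations):
    """Return the fraction of citations that contain at least one URL."""
    if not citations:
        return 0.0
    with_urls = sum(1 for citation in citations if extract_urls(citation))
    return with_urls / len(citations)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        "Doe, J. (2010). Example report. Retrieved from http://example.org/report.pdf.",
        "Roe, R. (2009). An example monograph. City: Example Press.",
    ]
    for citation in sample:
        print(extract_urls(citation))
    print(share_with_urls(sample))
```

A count like the one `share_with_urls` returns is the kind of figure behind the "approximately 10%" estimate, although the actual method used for that estimate is not described in the slides.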

Best Practices for site creators: working with website creators

Image source: http://imgur.com/gallery/NWJ12Pl

Open issues: division and maintenance of cooperative efforts

(communication, software and more)

Process over next 16 months

• Further planning (revision as needed) and user interviews

• Maintain group communication

• Ongoing growth (scale of collecting and distribution of effort)

• Present shared costs and sustainability models (currently in development)

• 3-5 year plan for Borrow Direct collaborations (collections strategy, finances, workflows and governance)

• If collaboration persists, identify themes for further collecting

• Catalog resources to high standards

• Quality Assurance and ongoing evaluation

Web archiving initiatives focusing on art resources

An initiative designed to address the “urgent need to document the dynamic web-based versions of auction catalogues, catalogues raisonnés, and scholarly research projects, as well as artist, gallery, and museum websites” (http://www.nyarc.org/content/web-archiving)

Artists Files Special Interest Group

Questions?

Image source: http://imgur.com/gallery/qoCqQoh

Resources that came up in the Q & A

• Internet Archive "Save a Page" Plug-In for Chrome https://github.com/lintool/chrome-archive-this-page

• SAA Web Archiving Roundtable http://webarchivingrt.wordpress.com/

Thanks!

Anna Perricci

alp2198@columbia.edu @AnnaPerricci

Columbia University Libraries
