building archivable websites - nullhandle.org · 4/19/2014  · nicholas taylor web archiving...

Building Archivable Websites Nicholas Taylor Web Archiving Service Manager Digital Library Systems and Services Drupal Camp April 19, 2014

Building Archivable Websites

Nicholas Taylor

Web Archiving Service Manager

Digital Library Systems and Services

Drupal Camp

April 19, 2014

Why Build ARCHIVABLE WEBSITES?

“Frosted Spiders' Web” by Jess Wood under CC BY 2.0

future users are users, too

“a connection between past and future” by Gioia De Antoniis under CC BY-NC-ND 2.0

maintain web usability

“Broken Web Connections? Welcome to 2009...” by Paul:Ritchie under CC BY-NC-ND 2.0

improve temporal web usability

Internet Archive: “Wayback Machine”

recover your lost website

“Warrick”

refer to earlier website versions

“The Iraq War: Wikipedia Historiography” by STML under CC BY-SA 2.0

institutional history

Internet Archive Wayback Machine: “Stanford University Homepage”

websites are cultural artifacts

“The World Wide Web project”

facilitate compliance

optimize for other crawlers

“SEO on a railway platform” by superboreen under CC BY-NC-ND 2.0

How to IMPROVE ARCHIVABILITY

“metal web” by paul:74 under CC BY-NC-SA 2.0

follow web standards and accessibility guidelines

“Web Standards Fortune Cookie” by Matt Herzberger under CC BY-SA 2.0

use a site map, transparent links, and contiguous navigation

“Card sorting” by Manchester Library under CC BY-SA 2.0
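
One way to give crawlers a complete list of pages is an XML sitemap. The following is a minimal sketch, assuming the sitemaps.org 0.9 schema; the function name `build_sitemap` and the example.org URLs are hypothetical and not from the talk.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; a real site would enumerate its own content.
pages = ["https://example.org/", "https://example.org/about"]
print(build_sitemap(pages))
```

A crawler that reads the sitemap can reach pages even when the on-site navigation depends on scripts or forms that it cannot follow.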

maintain stable URLs and redirect when necessary

“San Francisco-Oakland Bay Bridge 1442a” by Don Barrett under CC BY-NC-ND 2.0
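
When a URL does have to change, a permanent redirect lets both visitors and crawlers find the new location. Below is a small sketch written as a plain WSGI application; the `MOVED` mapping, the paths, and the `call` helper are hypothetical and only illustrate the idea.

```python
# Hypothetical mapping from retired paths to their current locations.
MOVED = {"/old-about.html": "/about"}


def app(environ, start_response):
    """Minimal WSGI app: answer 301 for moved paths, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path in MOVED:
        start_response("301 Moved Permanently", [("Location", MOVED[path])])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"current page\n"]


def call(path):
    """Invoke the app directly, without a server; return (status, headers)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    app({"PATH_INFO": path}, start_response)
    return captured["status"], captured["headers"]


print(call("/old-about.html"))
print(call("/about"))
```

In practice the same mapping would more likely live in the web server or CMS configuration; the point of the sketch is only that the old URL keeps answering with a pointer to the new one.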

be careful w/ robot exclusion rules

“drupal/robots.txt at 7.x”
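
A quick way to see what a rule set does to a polite crawler is to evaluate it with Python's standard `urllib.robotparser`. The rules below are a hypothetical excerpt written for illustration, not the actual Drupal file, and `example-archive-bot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /themes/
Disallow: /admin/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "https://example.org/node/1",
    "https://example.org/themes/site/style.css",
):
    print(url, allowed(ROBOTS_TXT, "example-archive-bot", url))
```

Under these assumed rules the page itself may be fetched but its stylesheet may not, so a crawler that honors robots.txt would capture the text without the presentation.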

minimize reliance on external assets necessary for presentation

Internet Archive Wayback Machine: “Stanford Department of English”

minimize reliance on external assets necessary for presentation

“Stanford Department of English”

serve reusable assets from a single, common location

Google Images: “stanford university seal site:stanford.edu”

specify HTTP response headers for caching and content encoding

“time capsule on Alcatraz” by inajeep under CC BY 2.0
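
As a rough illustration of the headers in question, the sketch below builds a gzip-compressed HTML response together with explicit `Content-Type`, `Content-Encoding`, `Last-Modified`, and `Cache-Control` headers. The helper name `build_response` and the default `max_age` value are assumptions chosen for the example.

```python
import gzip
from email.utils import formatdate


def build_response(text, modified_ts, max_age=3600):
    """Return (headers, body) for a gzip-compressed UTF-8 HTML response."""
    body = gzip.compress(text.encode("utf-8"))
    headers = [
        # State the media type and character encoding explicitly.
        ("Content-Type", "text/html; charset=utf-8"),
        # Tell clients the body must be gunzipped before use.
        ("Content-Encoding", "gzip"),
        ("Content-Length", str(len(body))),
        # HTTP dates are expressed in GMT.
        ("Last-Modified", formatdate(modified_ts, usegmt=True)),
        ("Cache-Control", f"max-age={max_age}"),
    ]
    return headers, body


headers, body = build_response("<p>hello</p>", modified_ts=0)
for name, value in headers:
    print(f"{name}: {value}")
```

With these headers recorded alongside the capture, replay software does not have to guess how to decode the stored bytes or when the resource last changed.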

embed metadata, especially character encoding

“Keep the Packaging!” by davidd under CC BY 2.0
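
To check whether a page actually declares its character encoding in the markup, something like the following sketch could be run over a site's templates. It uses only the standard `html.parser` module; `CharsetFinder` and `declared_charset` are hypothetical names, and the check covers only the two common `<meta>` forms.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first character encoding declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # HTML5 form: <meta charset="utf-8">
            self.charset = attrs["charset"].lower()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Older form: <meta http-equiv="Content-Type" content="...; charset=...">
            content = (attrs.get("content") or "").lower()
            if "charset=" in content:
                self.charset = content.split("charset=", 1)[1].strip()


def declared_charset(html):
    """Return the declared encoding in lowercase, or None if absent."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


print(declared_charset('<head><meta charset="UTF-8"></head>'))
print(declared_charset("<head><title>no declaration</title></head>"))
```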

use durable data formats

“Lascaux cave painting” by Christine McIntosh under CC BY-ND 2.0

prefer responsive design over user-agent personalization

“«Responsive web design» - 217/366” by Roger Ferrer Ibáñez under CC BY-NC-SA 2.0

examine your site in the Internet Archive Wayback Machine

Internet Archive Wayback Machine: “Welcome to A Multidimensional Perception ~/*\= & PCGuru”
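
Wayback Machine captures are addressed by a timestamp followed by the original URL, so links to them can be built by hand. The helper below is a small sketch assuming the public `web.archive.org/web/<timestamp>/<url>` pattern; the function name `wayback_url` and the example.org address are hypothetical.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, when=None):
    """Build a Wayback Machine URL for url.

    With no date, return the address of the capture listing; with a date,
    return the address for the capture closest to that moment.
    """
    if when is None:
        return f"{WAYBACK_PREFIX}/*/{url}"
    # Timestamps use the form YYYYMMDDhhmmss.
    return f"{WAYBACK_PREFIX}/{when:%Y%m%d%H%M%S}/{url}"


print(wayback_url("http://example.org/"))
print(wayback_url("http://example.org/", datetime(2014, 4, 19)))
```

Opening the resulting address in a browser shows how the archived copy renders, which is the quickest way to spot missing stylesheets, images, or navigation.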

Web Archiving TOOLS AND SERVICES

“giant mechanical spider & crowd” by mjtmail (tiggy) under CC BY 2.0

Heritrix

Wikimedia Commons: “File:Heritrix-screenshot.png”

HTTrack

“HTTrack Website Copier”

Wayback

“Internet Archive Wayback Machine”

Web Archiving Integration Layer

“Web Archiving Integration Layer”

Memento

“Memento”
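
Memento (RFC 7089) lets a client ask for a resource as it existed at a given time by sending an `Accept-Datetime` header to a TimeGate. The sketch below only prepares such a request and does not send it; the TimeGate address is a placeholder, and appending the original URL to it is an assumption, since each TimeGate defines its own URL layout.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Placeholder TimeGate; a real archive publishes its own address.
TIMEGATE = "https://timegate.example.org/"


def memento_request(timegate, url, when):
    """Return (request_url, headers) for a datetime-negotiated lookup.

    `when` must be a timezone-aware UTC datetime, because Accept-Datetime
    carries an HTTP date expressed in GMT.
    """
    headers = {"Accept-Datetime": format_datetime(when, usegmt=True)}
    return timegate + url, headers


request_url, headers = memento_request(
    TIMEGATE,
    "http://example.org/",
    datetime(2014, 4, 19, tzinfo=timezone.utc),
)
print(request_url)
print(headers)
```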

assess archivability w/ Archive Ready

“Archive Ready”