(Web) Archiving Online Media

Nicholas Taylor

Web Archiving Service Manager

Digital Library Systems and Services

Stanford Media Group

December 10, 2014

overview

• services for media archiving

• web archiving use case

• web archiving mechanics

• technical challenges

• approaches for online media archiving

“LAX on take off” by Doug under CC BY-NC-ND 2.0

Services for MEDIA ARCHIVING

SUL media archiving services

• analog and digital reformatting

• long-term preservation

• discovery and referenceability

• access interfaces and research tools

scope

• broad range of library collections

– institutional legacy

– creative works

– oral histories, documentary

– research outputs

– lectures, events

– broadcasts and podcasts

– rare commercial works

SUL web archiving services

• one-time and repeating collection

• long-term preservation

• discovery and referenceability

• access interfaces and research tools

scope

• select SU sub-sites

– institutional legacy

– compliance

– scholarly outputs

• third-party content

– government information

– scholarly inputs

Web Archiving USE CASE

whole greater than sum of parts

>

“Stanford University”

web archiving not always best approach

FAS: “Congressional Research Service Reports - Space Policy”

video or webpage?

PayPal

Web Archiving MECHANICS

web page composed of files

Constituent Files

Web Page
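
To make "web page composed of files" concrete, here is a minimal sketch that lists the files an HTML page references. The tag-to-attribute table and the names `ConstituentFileParser` and `constituent_files` are illustrative assumptions, not part of the talk, and the table is deliberately incomplete (it ignores CSS `url()` references and anything loaded by JavaScript).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ConstituentFileParser(HTMLParser):
    """Collect URLs of files a page references (images, scripts, styles, media)."""

    # Tag -> attribute that holds a URL; an illustrative subset, not exhaustive.
    URL_ATTRS = {
        "img": "src",
        "script": "src",
        "link": "href",
        "video": "src",
        "audio": "src",
        "source": "src",
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.files = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.files.append(urljoin(self.base_url, value))


def constituent_files(html, base_url):
    """Return the absolute URLs of files referenced by the given HTML."""
    parser = ConstituentFileParser(base_url)
    parser.feed(html)
    return parser.files
```

A crawler would fetch each returned URL in addition to the page itself; a video that is linked with a plain `src` attribute shows up here like any other file.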

collect and store w/ metadata

Constituent Files

Collect Data

Web Page

Web Archives
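
The "collect and store w/ metadata" step can be sketched as wrapping each fetched payload in a record that notes what was captured and when. The record layout below (`CaptureRecord`, `make_record`, and the field names) is a simplified assumption for illustration; production crawlers typically write WARC files rather than ad hoc records like this.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureRecord:
    """One captured file plus the metadata needed to find and verify it later."""
    url: str
    captured_at: str   # 14-digit UTC timestamp, e.g. 20141210000000
    content_type: str
    digest: str        # SHA-1 of the payload, for fixity checks
    length: int
    payload: bytes


def make_record(url, payload, content_type, when=None):
    """Build a record for a payload that has already been fetched."""
    when = when or datetime.now(timezone.utc)
    return CaptureRecord(
        url=url,
        captured_at=when.strftime("%Y%m%d%H%M%S"),
        content_type=content_type,
        digest="sha1:" + hashlib.sha1(payload).hexdigest(),
        length=len(payload),
        payload=payload,
    )
```

Repeating the capture over time then amounts to calling `make_record` again for the same URL with a later timestamp; the digest shows whether the content changed between captures.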

repeat over time

make accessible via SWAP

Constituent Files

Collect Data

Stanford Web Archive Portal

Index Data

Web Page

Web Archives

make accessible via SWAP

Stanford University Libraries: “Stanford Web Archive Portal”
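
The "Index Data" step can be illustrated with a small in-memory index that maps each URL to its capture timestamps and returns the capture closest to a requested date. This is a sketch under assumptions: the slides do not describe how SWAP's index is built, and `CaptureIndex` and its methods are hypothetical names.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime


def _parse(ts):
    """Parse a 14-digit timestamp such as 20141210000000."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


class CaptureIndex:
    """Map each URL to a sorted list of capture timestamps."""

    def __init__(self):
        self._by_url = defaultdict(list)

    def add(self, url, timestamp):
        stamps = self._by_url[url]
        pos = bisect_left(stamps, timestamp)
        # Keep the list sorted and skip exact duplicates.
        if pos == len(stamps) or stamps[pos] != timestamp:
            stamps.insert(pos, timestamp)

    def closest(self, url, timestamp):
        """Return the capture timestamp nearest to the requested one, or None."""
        stamps = self._by_url.get(url)
        if not stamps:
            return None
        pos = bisect_left(stamps, timestamp)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = stamps[max(pos - 1, 0):pos + 1]
        target = _parse(timestamp)
        return min(candidates, key=lambda t: abs(_parse(t) - target))
```

A replay interface can use `closest` to pick which stored copy of a page, and of each of its constituent files, to serve for a given date.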

Web Archiving Online Media TECHNICAL CHALLENGES

easy: linked video files

TED: “TED | Talks”

hard: streaming video

YouTube: “Steve Jobs' 2005 Stanford Commencement Address (with intro by President John Hennessy)”

capture challenges

• obfuscated or short-lived links

• ties up crawler, harming crawl quality

• hard to delimit scope

• few objects, big data volume
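
One hypothetical way a crawl configuration might respond to the scope and data-volume challenges is a rule that decides, per URL, whether a media file is worth fetching. The sketch below is an assumption for illustration only; `MediaBudget`, its thresholds, and the extension list are not from the talk and do not describe any particular crawler's settings.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Illustrative list of media file extensions; real scoping rules vary.
MEDIA_EXTENSIONS = (".mp4", ".webm", ".mov", ".mp3", ".m4a", ".flv")


@dataclass
class MediaBudget:
    """Decide whether to fetch a URL given host scope and media size limits."""
    allowed_hosts: frozenset
    max_bytes_per_file: int
    max_bytes_total: int
    used_bytes: int = 0

    def should_fetch(self, url, declared_length):
        parts = urlsplit(url)
        if parts.hostname not in self.allowed_hosts:
            return False  # outside the crawl scope
        if not parts.path.lower().endswith(MEDIA_EXTENSIONS):
            return True   # ordinary page resource, not budgeted here
        if declared_length is None:
            return False  # unknown size (for example a stream); skip it
        if declared_length > self.max_bytes_per_file:
            return False  # a single file that would tie up the crawler
        if self.used_bytes + declared_length > self.max_bytes_total:
            return False  # media budget for this crawl is exhausted
        self.used_bytes += declared_length
        return True
```

A rule like this keeps a few large objects from consuming the crawl, at the cost of leaving the largest media uncaptured.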

“I have the light in my hands” by Ashley Campbell under CC BY 2.0

replay challenges

• relating capture context

• re-inserting embeddable players

• platform-specific engineering

“G. She told me to look at the light.” by John Twohig under CC BY-NC 2.0

Approaches for ONLINE MEDIA ARCHIVING

mix-and-match approaches

• media archiving

– none

– master

– derivatives

• web archiving

– none

– exclude derivatives only

– low-resolution version only

– highest-resolution version

– best effort
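
Reading this slide as two independent option lists, one for media archiving and one for web archiving (an assumption about the flattened layout), "mix-and-match" means any pairing of one option from each list. A minimal sketch that enumerates the pairings, with hypothetical names:

```python
from itertools import product

# Option labels as they appear on the slide.
MEDIA_ARCHIVING = ("none", "master", "derivatives")
WEB_ARCHIVING = (
    "none",
    "exclude derivatives only",
    "low-resolution version only",
    "highest-resolution version",
    "best effort",
)


def approach_combinations():
    """Return every (media archiving, web archiving) pairing."""
    return list(product(MEDIA_ARCHIVING, WEB_ARCHIVING))
```

Under this reading there are 3 × 5 = 15 pairings; which one suits a given collection depends on the considerations listed on the next slide.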

archiving approach considerations

• technical affordances

• cost and budget

• formats of extant media

• nature of work

• what else?

“A bit on the left...” by Federhirn under CC BY-NC 2.0

Josh Schneider

Assistant University Archivist

Stanford University

University Archives Use Cases

• Media Only

Stanford Technology Ventures Program

• Entrepreneurship Corner

Describing Digital Audio And Video Content for the SDR

Preparing Digital Audio And Video Content for the SDR

Daniel Hartwig

dhartwig@stanford.edu

Josh Schneider

josh.schneider@stanford.edu

Contact the University Archives
