
1/22/2006

Columbia University

Notable New Yorkers …

Project objective

– Digitally preserve oral history recordings on a variety of media, along with paper or electronic transcripts of interviews

Notable New Yorkers include

» John B. Oakes
» Bennett Cerf
» Kenneth Clark
» Ed Koch
» Mary Lasker
» Frances Perkins
» Mamie Clark

Source files

– Analog cassettes
  • 5” and 7” reels
– Lengths: 7 minutes to in excess of 67 hours
– Typed transcripts
  • 100 pages to maximum 5,566 pages
– Binders
– MS Word files

Recorded media to be re-recorded to a common digital file format

– Create preservation masters and access copies
  » Preservation masters: 96 kHz/24-bit WAV files
  » Access copies: 44.1 kHz/16-bit WAV files
  » Web-accessible copies: .mp3 format
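The storage cost of the two WAV formats can be estimated from sample rate × bytes per sample × channels. The following is a rough sketch only: the slide does not state the channel count, so mono is assumed for illustration, and WAV header overhead is ignored.

```python
def pcm_bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Uncompressed PCM payload for one hour of audio, in bytes.

    channels=1 (mono) is an assumption; the slides do not give a channel count.
    """
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


preservation = pcm_bytes_per_hour(96_000, 24)  # preservation master
access = pcm_bytes_per_hour(44_100, 16)        # access copy

print(f"Preservation master: {preservation / 1e9:.2f} GB per hour")
print(f"Access copy:         {access / 1e9:.2f} GB per hour")
```

Under these assumptions a preservation master needs a little over 1 GB per recorded hour, roughly three times the access copy.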

Transcripts

– OCLC to provide to CUL
  • Archival TIFF images
  • Re-keyed files
  • Electronically formatted interviews

Physical condition assessment

– Interview transcript quality varied
  • Moderate to extensive written revisions & edits
– Audio files varied in quality & format
  • None provided for preview
  • Known life-expectancies & recovery issues for media (audio cassette, reel-to-reel, etc.)
  • No evidence of ‘sticky shed’ or vinegar syndrome identified by library

Recommended workflow

– OCLC to pick up materials at CUL
– OCLC to deliver audio masters to Safe Sound Archive
– OCLC to return material to CUL

Digitization specifications

– OCLC delivered bi-tonal text pages
  • 1-bit TIFF
  • Group IV TIFF compression
  • 600 dpi
– XML mark-up of full text
  • OCLC facilitated
– Re-keying with 99.95% accuracy
– TEI-Lite DTD mark-up
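To put the figures above in perspective, the sketch below works out two quantities: the uncompressed size of a 1-bit page at 600 dpi (before Group IV compression), and the number of errors a 99.95% re-keying accuracy would allow. The page dimensions (8.5 × 11 inches) and the 2,000 characters per page are illustrative assumptions, not values from the project.

```python
def bitonal_page_bytes(width_in: float, height_in: float, dpi: int = 600) -> int:
    """Uncompressed size of a 1-bit page image, each row padded to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px + 7) // 8  # 8 pixels per byte, rounded up
    return row_bytes * height_px


def expected_rekey_errors(characters: int, errors_per_10k: int = 5) -> float:
    """99.95% accuracy allows 5 wrong characters per 10,000 keyed."""
    return characters * errors_per_10k / 10_000


# Assumed US Letter page; the slides do not give page dimensions.
print(bitonal_page_bytes(8.5, 11))   # bytes before compression
# Assumed 2,000 characters per typed page (hypothetical figure).
print(expected_rekey_errors(2_000))  # allowed errors on one such page
```

With these assumptions the accuracy target works out to about one permitted error per typed page.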

Audio reformatting specifications & workflow

– Transmittal and trafficking policies
  • Database customization
  • Material log-in
  • Cross-checking against packing list
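Cross-checking logged-in material against a packing list amounts to a two-way set comparison. A minimal sketch follows; the item identifiers are hypothetical, and the vendor's actual database is not described in the slides.

```python
def cross_check(packing_list: list[str], logged_in: list[str]) -> dict[str, list[str]]:
    """Compare the items a shipment should contain with the items logged on arrival."""
    expected = set(packing_list)
    received = set(logged_in)
    return {
        "missing": sorted(expected - received),     # listed but never logged in
        "unexpected": sorted(received - expected),  # logged in but not listed
    }


# Hypothetical identifiers, for illustration only.
report = cross_check(
    packing_list=["reel-001", "reel-002", "cass-001"],
    logged_in=["reel-001", "cass-001", "cass-002"],
)
print(report)
```

An empty report in both categories means the shipment matches its packing list.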

Evaluation & engineers’ notes

– Compact audio cassettes played back on Nakamichi cassette decks w/mechanical & electrical playback alignments
– Digitization w/Prism Sound analog-to-digital converters
  • Output 96 kHz/24-bit preservation master & 44.1 kHz access copy concurrently

Reel-to-reel tapes played back on Studer tape decks

– With mechanical & electrical playback alignments

– Digitization w/Prism Sound analog to digital converters

• Output 96kHz/24bit preservation master & 44.1kHz access copy concurrently

CUL elected semi-monitored approach

– Up to 3 originals transferred simultaneously

– SSA guarantees 1:1 representation
– Monitoring alternative: 100% monitored

Quality Control Procedures

• OCLC performs 100% quality assurance of all original TIFF images
  – Reviewed for completeness, alignment, illumination regularity, and detail consistency throughout image
    » SW allows 1:1 viewing, zooming @ 100%+ and reduced full-page view

Quality Control Procedures

• SSA expects zero returns or rework
  – Achieved via database automation of file naming (reduces human error)
  – Each file on delivery medium opened & auditioned on separate computer to assure recoverability
  – Each file checked @ beginning & end to assure completeness
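The idea behind automated file naming is that names are derived from database fields rather than typed by hand. The sketch below illustrates that idea only: the record fields, the naming pattern, and the `pm` role suffix are all hypothetical and do not reflect the scheme actually used on the project.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRecord:
    """Hypothetical database row describing one side of one tape."""
    interviewee: str
    session: int
    tape: int
    side: str


def build_filename(rec: TransferRecord, role: str = "pm") -> str:
    """Derive a file name from record fields so that nobody types it manually."""
    # Lowercase the name and collapse anything non-alphanumeric into underscores.
    slug = re.sub(r"[^a-z0-9]+", "_", rec.interviewee.lower()).strip("_")
    return f"{slug}_s{rec.session:02d}_t{rec.tape:03d}{rec.side.lower()}_{role}.wav"


print(build_filename(TransferRecord("Example Interviewee", session=1, tape=2, side="A")))
```

Because the same record always yields the same name, a second person can regenerate names from the database and compare them with what is on the delivery medium.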

Quality Control Procedures

• SSA expects zero returns or rework
  – And spot-checked throughout for consistency
  – File names checked by separate person for naming and contents against original recordings
    » Person also proofs file headers
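Part of the header and completeness checking described above can be approximated in software. The sketch below uses Python's standard `wave` module to compare a file's header against the expected specification and to confirm that audio data can be read at the beginning and end. It is an illustration under stated assumptions, not the vendor's procedure, and it does not replace auditioning by a person. The function name and the `probe_frames` parameter are hypothetical.

```python
import wave


def check_wav(path: str, expected_rate: int, expected_width_bytes: int,
              probe_frames: int = 1024) -> list[str]:
    """Return a list of problems found in a PCM WAV file (empty list = none found)."""
    problems = []
    with wave.open(path, "rb") as wav:
        # Header proofing: sample rate and sample width against the specification.
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {expected_rate}")
        if wav.getsampwidth() != expected_width_bytes:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected {expected_width_bytes}")

        total = wav.getnframes()
        if total == 0:
            problems.append("file contains no audio frames")
            return problems

        # Completeness: read a short stretch at the beginning and at the end.
        frame_bytes = wav.getsampwidth() * wav.getnchannels()
        n = min(probe_frames, total)
        head = wav.readframes(n)
        wav.setpos(total - n)
        tail = wav.readframes(n)
        if len(head) != n * frame_bytes or len(tail) != n * frame_bytes:
            problems.append("audio data is shorter than the header claims")
    return problems
```

For a preservation master the call would be `check_wav(path, 96_000, 3)`; for an access copy, `check_wav(path, 44_100, 2)`.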

File Delivery

– Text files: XML as per CUL specifications
  • Challenges encountered due to variations in the interview format/transcript format (some memoir style, some strict Q&A format)
– Audio files:
  • Portable hard drives
    – ftp impractical due to file size
      » Est. 270GB = 1,000 hours connect time
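The estimate of 1,000 hours for 270GB implies a particular sustained transfer rate, which the sketch below works out. Decimal gigabytes (10^9 bytes) are assumed, and protocol overhead is ignored; the slide does not say which connection speed the estimate was based on.

```python
def implied_throughput_kbit_s(size_gb: float, hours: float) -> float:
    """Sustained rate, in kbit/s, needed to move size_gb within the given hours."""
    bits = size_gb * 10**9 * 8  # decimal GB assumed
    seconds = hours * 3600
    return bits / seconds / 1000


def transfer_hours(size_gb: float, kbit_per_s: float) -> float:
    """Hours needed to move size_gb at a sustained rate of kbit_per_s."""
    bits = size_gb * 10**9 * 8
    return bits / (kbit_per_s * 1000) / 3600


print(implied_throughput_kbit_s(270, 1_000))  # rate implied by the slide's estimate
print(transfer_hours(270, 600))               # the same figures, inverted
```

The slide's numbers correspond to a sustained rate of about 600 kbit/s, or roughly six weeks of continuous connection, which explains the choice of portable hard drives.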

Digital Archive

– CUL files delivered for uploading to the OCLC Digital Archive repository

Client acceptance of deliverables

– OCLC standard: 30-day image file retention

– Planned 30-day customer acceptance period

Q&A