Beyond NDNP: Technical Specifications Working Group


Upload: karen-estlund

Post on 11-Apr-2017


Page 1: Beyond NDNP: Technical Specifications Working Group

Beyond NDNP Technical Specifications Working Group

NDNP 2015

Karen Estlund
Assoc. Dean for Technology & Digital [email protected]

Page 2: Beyond NDNP: Technical Specifications Working Group

Working Group Membership
● Chair: Karen Estlund (Pennsylvania/Oregon)
● Luis Baquera (California)
● Brian Geiger (California)
● Mark Phillips (Texas)
● Shawn Schollmeyer (Washington)
● Kopana Terry (Kentucky)
● Laura Weakly (Nebraska)
● Eric Weig (Kentucky)
● Frederick Zarndt (Independent)

Page 3: Beyond NDNP: Technical Specifications Working Group

Activities
1. Metadata Application Profile
   a. Functional Requirements
   b. Data Model
   c. Metadata Schema
2. File Formats & Resolution Recommendations
3. Directory Structure Recommendations
4. Backup & Storage Recommendations

https://sites.google.com/site/digitalnewspaperspractices/technical-specifications

Page 4: Beyond NDNP: Technical Specifications Working Group

Metadata Application Profile

Page 5: Beyond NDNP: Technical Specifications Working Group

Functional Requirements
1. Newspapers should be retrieved based on issues
2. Items may be sorted and retrieved by date of issue
3. Multiple editions for particular issues may be related to an issue
4. Aggregated and common titles can be used to retrieve user-friendly results beyond the serials catalog record for a title
5. The model must use the NDNP model as a baseline
6. Identifiers should be present to correspond with additional metadata resources whenever possible
7. Newspaper content should be retrievable based on copyright associated with the work at an issue level
8. Full-text searching is assumed and not represented in the descriptive model
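Requirements 1 to 3 can be illustrated with a small sketch. The `Issue` record and its field names are hypothetical and are not taken from the working group's profile; the sketch only shows issues ordered by date, with multiple editions of the same date grouped under one issue.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue record; field names are illustrative only."""
    lccn: str
    issue_date: date
    edition_order: int


def editions_by_date(issues):
    """Group issues by (lccn, issue date), ordered by date, then edition."""
    grouped = defaultdict(list)
    ordered = sorted(issues, key=lambda i: (i.lccn, i.issue_date, i.edition_order))
    for issue in ordered:
        grouped[(issue.lccn, issue.issue_date)].append(issue)
    # dict preserves insertion order, so keys come out sorted by title and date
    return dict(grouped)
```

Grouping on the pair of title identifier and date keeps every edition of a given day attached to the same issue, which is one way to read requirement 3.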

Page 6: Beyond NDNP: Technical Specifications Working Group

Mandatory Metadata Properties
● Digital Responsible Institution
● Edition Order
● [at least one identifier]:
  ○ LCCN
  ○ ISSN
  ○ OCLC
  ○ Local Identifier
● Issue Date
● Title
● Rights
● Publication Location
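A check for the mandatory properties might look like the sketch below. The dictionary keys are hypothetical names chosen for illustration; the profile itself defines the properties, not these spellings. The "at least one identifier" rule is treated as satisfied when any one of the four identifiers is present.

```python
# Hypothetical key names for the mandatory properties listed on the slide.
MANDATORY = (
    "digital_responsible_institution",
    "edition_order",
    "issue_date",
    "title",
    "rights",
    "publication_location",
)
IDENTIFIERS = ("lccn", "issn", "oclc", "local_identifier")


def is_present(record, key):
    """A property counts as present if it is neither missing nor empty."""
    return record.get(key) not in (None, "")


def missing_properties(record):
    """Return the mandatory properties that are absent from `record`."""
    missing = [key for key in MANDATORY if not is_present(record, key)]
    if not any(is_present(record, key) for key in IDENTIFIERS):
        missing.append("identifier (one of: " + ", ".join(IDENTIFIERS) + ")")
    return missing
```

An empty list means the record carries every mandatory property and at least one identifier.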

Page 7: Beyond NDNP: Technical Specifications Working Group

Metadata Properties Added to Profile
● Digital Responsible Institution*
● Common Title / Curated Title
● ISSN
● Common Title / Curated ID
● OCLC Number
● Rights*
● Local Identifier
● Language
● [Original Object Information]

Page 8: Beyond NDNP: Technical Specifications Working Group

File Format & Resolution Recommendations
1. Microfilm
   a. 300-400 ppi
   b. 8-bit grayscale
2. Paper
   a. 250 ppi
   b. 8-bit grayscale (even for color)
3. Born Digital
   a. PDF -> PDF/A
   b. PDF -> TIFF images
   c. Websites -> harvest to WARC

Preservation Formats:
● TIFF 6.0 Uncompressed
● PDF/A Flavor of your choice
● WARC

Access Formats:
● JP2
● JPEG, and/or
● PDF
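The recommendations above could be kept as a small lookup table in a processing script. This is only a sketch: the values are transcribed from the slide, while the key names and the `recommendation` function are hypothetical.

```python
# Values transcribed from the slide; the key names are hypothetical.
SCAN_SETTINGS = {
    "microfilm": {"ppi": "300-400", "color_mode": "8-bit grayscale"},
    "paper": {"ppi": "250", "color_mode": "8-bit grayscale"},
}
BORN_DIGITAL_CONVERSIONS = {
    "pdf": ["PDF/A", "TIFF images"],
    "website": ["WARC"],
}


def recommendation(source_type):
    """Look up the slide's recommendation for a source type."""
    key = source_type.lower()
    if key in SCAN_SETTINGS:
        return {"scan": SCAN_SETTINGS[key]}
    if key in BORN_DIGITAL_CONVERSIONS:
        return {"convert_to": BORN_DIGITAL_CONVERSIONS[key]}
    raise ValueError(f"no recommendation for source type: {source_type!r}")
```

For example, `recommendation("paper")` returns the scan settings for paper originals, and `recommendation("website")` returns the WARC harvest target.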

Page 9: Beyond NDNP: Technical Specifications Working Group

Directory Structure Recommendations
● University of Kentucky Libraries
  ○ [collection uniquecode]/[lccn]/issues/[YYYY]/[uniquecode][YYYYMMDDED]/
  ○ lvc/sn86069643/issues/2012/lvc2012030101/
● Center for Bibliographical Studies and Research (CBSR) at the University of California, Riverside
  ○ [batch_directory]/[pub_code]/YYYYMMDD[_EE]/
  ○ batch_curiv_eagle/SFC/19101226/
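Both directory patterns can be generated from an issue date, as in the sketch below. The function names are hypothetical. The sketch assumes, based only on the two examples shown, that the Kentucky folder name reuses the collection code, that `ED` is a two-digit edition number, and that the optional CBSR `_EE` suffix is also a two-digit edition number.

```python
from datetime import date


def kentucky_issue_path(collection_code, lccn, issue_date, edition=1):
    """[collection uniquecode]/[lccn]/issues/[YYYY]/[uniquecode][YYYYMMDDED]/"""
    # Assumption: ED is a zero-padded, two-digit edition number.
    stamp = f"{issue_date:%Y%m%d}{edition:02d}"
    return f"{collection_code}/{lccn}/issues/{issue_date:%Y}/{collection_code}{stamp}/"


def cbsr_issue_path(batch_directory, pub_code, issue_date, edition=None):
    """[batch_directory]/[pub_code]/YYYYMMDD[_EE]/"""
    # Assumption: the optional _EE suffix is a zero-padded edition number.
    suffix = "" if edition is None else f"_{edition:02d}"
    return f"{batch_directory}/{pub_code}/{issue_date:%Y%m%d}{suffix}/"
```

Called with the values from the slide, `kentucky_issue_path("lvc", "sn86069643", date(2012, 3, 1))` and `cbsr_issue_path("batch_curiv_eagle", "SFC", date(1910, 12, 26))` reproduce the two example paths.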

Page 10: Beyond NDNP: Technical Specifications Working Group

Storage & Backup Recommendations
1. External Hard Drives
2. Networked Local Server
3. Engineering Backup Servers
4. Cloud Hosting

Preservation Best Practices
Following preservation best practices for digital newspaper content is encouraged. More information about digital preservation best practices is available from the Library of Congress: http://www.loc.gov/preservation/.

Page 11: Beyond NDNP: Technical Specifications Working Group

Scripts & Software
Scripts and software to help with processing, hosting, or preserving digital newspapers. For additional resources, see PaperVault "Tools for Working with Digital News".

● Open Source Newspaper Viewer, chronam/LC Newspaper Viewer: https://github.com/LibraryOfCongress/chronam
● Open Source JP2 Image Server, RAIS: https://github.com/uoregon-libraries/rais-image-server
● ALTO-like XML
  ○ PDF2ALTO, https://github.com/cokernel/pdf2alto
  ○ PDF to Text, https://github.com/uoregon-libraries/pdftotext
● PDFs to NDNP-like technical specification, https://github.com/uoregon-libraries/pdf-to-chronam