![Page 1: 1 Metadata Tools for JISC Digitisation Projects of still images and text Ed Fay BOPCRIS, Hartley Library University of Southampton](https://reader036.vdocument.in/reader036/viewer/2022070305/5515c656550346c6278b45a0/html5/thumbnails/1.jpg)
Metadata Tools for JISC Digitisation Projects of still images and text
Ed Fay
BOPCRIS, Hartley Library
University of Southampton

Overview: BOPCRIS today

Move to work natively with standards
• Interoperability
• Preservation
Design project procedures from the ground up with metadata in mind
• File-naming and directory structuring
• Metadata capture processes
Production workflow that automates where possible
Minimize possibility for human error / subjectivity
“Final package” of digital object that records preservation information on the “digital shelf” and aims for maximum interoperability between systems, all in one place

Overview: technical details

File-naming / directory structure
• Incorporating project-specific “unique ids”
Final package (digital object)
• Internally consistent “tarball” [*.TAR]
• Relative path-naming conventions
• METS wrapper
• Extension formats for metadata: descriptive (MODS); technical (MIX); process (PREMIS)
Production workflow
• Automated production of final package
Metadata recording
• Dynamic input by scanner operators
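
The "final package" idea (a tarball whose member names are relative paths) might be sketched as below. This is a minimal illustration, not the BOPCRIS production code: the function name `build_package` and the example directory layout are assumptions made for the sketch. Only Python's standard `tarfile` and `pathlib` modules are used.

```python
import tarfile
from pathlib import Path


def build_package(object_dir: Path, tar_path: Path) -> list[str]:
    """Write every file under object_dir into an uncompressed tar.

    Member names are stored relative to the parent of object_dir, so the
    archive unpacks to a single self-contained directory wherever it lands.
    tar_path should sit outside object_dir (hypothetical convention) so the
    archive does not try to include itself.
    """
    members = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(object_dir.rglob("*")):
            if path.is_file():
                # Relative, forward-slash name: no absolute paths in the package.
                arcname = path.relative_to(object_dir.parent).as_posix()
                tar.add(path, arcname=arcname)
                members.append(arcname)
    return members
```

Because every member name is relative, references inside a METS wrapper stored in the same package can point at sibling files without depending on where the tarball is unpacked.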

History

Eighteenth Century Parliamentary Papers
• Project under Phase 1 of JISC Digitisation Programme
• Proprietary system and data formats (Agora)
• Manual input of metadata
  o Descriptive and Structural
• Advantages and Disadvantages

History: Advantages

Proprietary system with advanced functionality:
• OCR workflow
• Web presentation
Highly customizable
• Metadata fields specified and modified at will

History: Disadvantages

Non-standard metadata fields
• No mapping to standard formats → difficulties: interoperability; metadata harvesting
Translation
• Between systems, or between “use” and “archive” formats → introduces possibility of versioning issues
No scope for preservation metadata
• Separation between workflow / presentation system and preservation strategy
Resulted in a disparate collection of scripts and tools to manage data

Present: Metadata Standards

Bibliographic database export
File-system level
• Directory structure
• File-naming conventions
Scanning level
• TIFF headers
• Additional descriptive metadata
METS profile
• Tailored to project needs
• Extension formats (MODS, MIX, PREMIS)
Checksums (MD5)
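
As an illustration of the checksum step, an MD5 digest for each file could be computed and later re-verified along the following lines. This is a sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify` are hypothetical and not taken from the project's tooling.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks so that
    large TIFF masters do not have to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, recorded: str) -> bool:
    """Compare a file against a previously recorded checksum
    (for example, one stored alongside the file's entry in the METS)."""
    return md5_of_file(path) == recorded.lower()
```

Recording the digest at scan time and checking it again after packaging or transfer gives a simple fixity test for each file in the digital object.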

Present: Metadata Origins

[Diagram relating metadata precursors to generated metadata. Labels:]
Precursors
• Scanned images: TIFF headers
• File-naming; directory structure (TAR)
• Other metadata: process; additional descriptive
• Bibliographic metadata: MARC21 / MODS / etc.
• OCR (Agora / ABBYY)
• File formats: TIFF master / derived JPEG; flat text (TXT) & word-co-ordinated OCR
Generated
• METS
• MIX (Z39.87)
• PREMIS
• Custom dmdSec

Future

One tool for entire process, from scanned images to METS
Tool would:
• Extract technical metadata
• Include descriptive metadata
• Build flat-structure METS
Tool would require:
• File-naming, directory-structuring conventions
• Image file sources
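
To make "build flat-structure METS" from file-naming conventions concrete, here is a rough sketch. The naming pattern (`<object id>_<4-digit sequence>.tif`), the `tiff/` sub-directory, the `USE` and `TYPE` values, and the function name `build_flat_mets` are all assumptions for illustration; the element and attribute names (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) are standard METS. The output is a bare skeleton with no descriptive or technical metadata sections and is not claimed to satisfy any particular METS profile.

```python
import re
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

# Hypothetical convention: <object id>_<4-digit page sequence>.tif
NAME_RE = re.compile(r"^(?P<obj>[A-Za-z0-9]+)_(?P<seq>\d{4})\.tif$")


def _tag(name: str) -> str:
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_flat_mets(object_id: str, filenames: list[str]) -> ET.Element:
    """Build a METS skeleton with one file entry and one page div per image."""
    pages = []
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None or match["obj"] != object_id:
            raise ValueError(f"file name breaks the convention: {name}")
        pages.append((int(match["seq"]), name))

    mets = ET.Element(_tag("mets"), {"OBJID": object_id})
    file_sec = ET.SubElement(mets, _tag("fileSec"))
    group = ET.SubElement(file_sec, _tag("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(mets, _tag("structMap"), {"TYPE": "physical"})
    volume = ET.SubElement(struct_map, _tag("div"), {"TYPE": "volume"})

    # Flat structure: every page hangs directly off the single top-level div.
    for seq, name in sorted(pages):
        file_id = f"FILE_{seq:04d}"
        entry = ET.SubElement(
            group, _tag("file"), {"ID": file_id, "MIMETYPE": "image/tiff"}
        )
        ET.SubElement(
            entry,
            _tag("FLocat"),
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": f"tiff/{name}"},
        )
        page = ET.SubElement(volume, _tag("div"), {"TYPE": "page", "ORDER": str(seq)})
        ET.SubElement(page, _tag("fptr"), {"FILEID": file_id})
    return mets
```

Calling `ET.tostring(build_flat_mets("bopc0001", names), encoding="unicode")` would serialize the result. Rejecting any file name that breaks the convention reflects the point that such a tool depends on consistent naming and directory structure.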

Future: Advantages

Abstraction = standardization
All digitization projects will produce metadata in similar formats → interoperability
Certain technical base-standards will be present → preservation
Any centrally developed preservation or presentation systems would be able to ingest output from any project
Saves wasted effort developing similar solutions many times, when one solution can be developed once and adapted

Future: Questions…

Usefulness of such a tool?
Relevance to your project?
Problems / obstacles?
How much flexibility is necessary?
Manual input / editing?
Main points: Abstraction, functionality, flexibility

Further information

Ed Fay, Software Developer
• BOPCRIS, Hartley Library
• University of Southampton
• [email protected]
• 023 8059 3575