systems, processes & how we stop the wheels falling off

Systems, processes & how we stop the wheels falling off Digitisation Open Day, September 2013 Dave Thompson Digital Curator, Wellcome Library

Upload: wellcome

Post on 29-Nov-2014

DESCRIPTION

Presentation from Digital Curator Dave Thompson on systems and processes for digitisation at the Wellcome Library for our second Digitisation Open Day.

TRANSCRIPT

Page 1: Systems, processes & how we stop the wheels falling off

Systems, processes & how we stop the wheels falling off

Digitisation Open Day, September 2013 Dave Thompson

Digital Curator, Wellcome Library

Page 2: Systems, processes & how we stop the wheels falling off

Digitisation – process overview

Plan project

Catalogue

Identify material

Identify resources

Plan process

Review as you go

Digitise/process

Deliver

Refine processes

Document/share

Document/share

Document/share

Funding, staff, equipment, IT, storage, data management

planning

Open source player

Page 3: Systems, processes & how we stop the wheels falling off

Meanwhile, at the coal face…

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

Page 4: Systems, processes & how we stop the wheels falling off

Thinking conceptually … OAIS

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

In OAIS speak this is a SIP (Submission Information Package): an aggregation of an object & its metadata in a form that is acceptable to the repository, e.g. JPEG2000 images and MARC XML.

The Open Archival Information System (OAIS) Reference Model is an ISO standard (ISO 14721) that describes a conceptual model of an archive. It sets out the activities of an archive & the processes involved in submission, storage & access. Developed by NASA after they ‘lost’ space data through obsolescence.

Page 5: Systems, processes & how we stop the wheels falling off

Thinking conceptually… OAIS

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

In OAIS speak this is an AIP (Archival Information Package). This is the object & its metadata stored in a repository.

OAIS talks of 3 information packages:

1. Submission Information Package (SIP) = what is ingested

2. Archival Information Package (AIP) = what is stored

3. Dissemination Information Package (DIP) = what is made available
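The flow between the three OAIS packages can be sketched in a few lines of Python. This is only an illustration of the concept: the class and function names (`InformationPackage`, `ingest`, `disseminate`) and the metadata keys are assumptions made for the example, not part of OAIS or of any repository product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InformationPackage:
    """A digital object plus its metadata at one stage of the OAIS flow."""
    kind: str                      # "SIP", "AIP" or "DIP"
    files: tuple                   # file names that make up the object
    metadata: dict = field(default_factory=dict)


def ingest(sip: InformationPackage, admin_metadata: dict) -> InformationPackage:
    """SIP -> AIP: the repository stores the object and adds its own metadata."""
    if sip.kind != "SIP":
        raise ValueError("only a SIP can be ingested")
    return InformationPackage("AIP", sip.files, {**sip.metadata, **admin_metadata})


def disseminate(aip: InformationPackage, public_fields: set) -> InformationPackage:
    """AIP -> DIP: expose only the metadata that may be made available."""
    if aip.kind != "AIP":
        raise ValueError("only an AIP can be disseminated")
    visible = {k: v for k, v in aip.metadata.items() if k in public_fields}
    return InformationPackage("DIP", aip.files, visible)
```

For example, a SIP holding one image and a title becomes an AIP once a checksum is added on ingest, and a DIP that exposes only the title on dissemination.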

Page 6: Systems, processes & how we stop the wheels falling off

Thinking conceptually …OAIS

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

In OAIS speak this is a DIP (Dissemination Information Package). These are the parts of the object & its metadata that we are able to make available.

As defined in the Digital Preservation Coalition (DPC) handbook, access is assumed to mean continued, ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for.

Page 7: Systems, processes & how we stop the wheels falling off

Let’s tackle the basics… processing

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

Administrative metadata (AMD): a technical description of the files. Automatically created by Safety Deposit Box (SDB) on ingest into our repository. Used by the player for display purposes.

Administrative metadata is typically created automatically. It could be:

• File size

• Image HxW (height × width)

• File format

• Checksum
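As a rough illustration of what automatically created administrative metadata looks like, the sketch below gathers size, format and checksum for one file using only the Python standard library. It is not how SDB does it; the function name and dictionary keys are assumptions, the format is guessed from the file extension alone, and image dimensions are left out because reading them would need an imaging library.

```python
import hashlib
import mimetypes
from pathlib import Path


def administrative_metadata(path: Path) -> dict:
    """Collect basic technical facts about one file."""
    data = path.read_bytes()
    # guess_type looks only at the file name, not at the file contents.
    mime_type, _ = mimetypes.guess_type(path.name)
    return {
        "file_name": path.name,
        "file_size_bytes": len(data),
        "file_format": mime_type or "unknown",
        "checksum_sha256": hashlib.sha256(data).hexdigest(),
    }
```

A checksum recorded at ingest can be recomputed later and compared, which is what makes it useful for detecting silent corruption.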

Page 8: Systems, processes & how we stop the wheels falling off

Let’s tackle the basics… processing

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

Descriptive metadata (DMD): MARC records, converted to MARC XML, which becomes MODS in the METS. Material must be catalogued before we can store it & make it available.

Descriptive metadata (DMD) is typically human generated, AKA cataloguing metadata: ISAD(G) for archival material, MARC for bibliographic material. In the METS it is expressed as MODS (Metadata Object Description Schema).
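To make the MARC XML to MODS step concrete, here is a minimal sketch that maps a single field, the MARC 245 $a title, to a MODS title element. A real conversion covers many more fields and is normally done with an established crosswalk; the function name and the sample record are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
MODS_NS = "http://www.loc.gov/mods/v3"


def marcxml_title_to_mods(marcxml: str) -> ET.Element:
    """Map MARC field 245 subfield a (title) to mods:titleInfo/mods:title."""
    record = ET.fromstring(marcxml)
    subfield = record.find(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']",
        {"marc": MARC_NS},
    )
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    if subfield is not None and subfield.text:
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
        title.text = subfield.text.strip()
    return mods


# Hypothetical one-field record, just enough to exercise the function.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>"""
```

Calling `marcxml_title_to_mods(SAMPLE)` returns a `mods` element whose `titleInfo/title` carries the text of the 245 $a subfield.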

Page 9: Systems, processes & how we stop the wheels falling off

Let’s tackle the basics… processing

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

Safety Deposit Box (SDB), the place where we store digital stuff. Ingest is automatically initiated by Goobi. Database that associates objects with DMD & AMD. Source for dissemination.

Digital Repositories offer a convenient infrastructure through which to store, manage, re-use and curate digital materials. They are used by a variety of communities, may carry out many different functions, and can take many forms.

Page 10: Systems, processes & how we stop the wheels falling off

Let’s tackle the basics… processing

[Diagram: Administrative metadata, Descriptive metadata, Digitised images, Ingestion into repository, Creation of METS, Access]

The structure & pagination metadata in the METS is created by humans; the METS file itself is built automatically.

A Metadata Encoding & Transmission Standard (METS) file is an aggregated collection of DMD & AMD (a file list with structure) that provides a mechanism for managed access. A METS file allows metadata from different systems to be combined into a portable format.
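The idea of a METS file as "a file list with structure" can be sketched as follows. The example builds a skeleton with a descriptive section, an administrative section, a file list and a page-by-page structure map. The element and attribute names come from the METS schema, but the IDs, the `USE` and `TYPE` values and the function name are assumptions, and the result is not claimed to match what Goobi produces or to be a complete, validated METS document.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(image_files: list) -> ET.Element:
    mets = ET.Element(m("mets"))
    # Descriptive metadata: in practice the MODS record sits inside xmlData.
    dmd = ET.SubElement(mets, m("dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, m("mdWrap"), MDTYPE="MODS")
    ET.SubElement(wrap, m("xmlData"))
    # Administrative metadata: left empty in this sketch.
    ET.SubElement(mets, m("amdSec"), ID="AMD1")
    # File list.
    file_grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"), USE="MASTER")
    # Physical structure: one div per page, in reading order.
    struct = ET.SubElement(mets, m("structMap"), TYPE="PHYSICAL")
    book = ET.SubElement(struct, m("div"), TYPE="physSequence", DMDID="DMD1")
    for order, name in enumerate(image_files, start=1):
        file_id = f"FILE_{order:04d}"
        file_el = ET.SubElement(file_grp, m("file"), ID=file_id)
        ET.SubElement(file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name})
        page = ET.SubElement(book, m("div"), TYPE="page", ORDER=str(order))
        ET.SubElement(page, m("fptr"), FILEID=file_id)
    return mets
```

`ET.tostring(build_mets(["0001.jp2", "0002.jp2"]), encoding="unicode")` serialises the result, with each page `div` pointing at its image through a `fptr` element.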

Page 11: Systems, processes & how we stop the wheels falling off

The formats

• JPEG2000 is our master image format.

• We create dissemination images (JPEG) on the fly.

• Also use PDF, MPEG2, MP3

Page 12: Systems, processes & how we stop the wheels falling off

The systems

• Goobi. Manages & tracks the production of digitised content.

• SDB. Repository that stores digitised content along with its DMD & AMD.

• Player. User interface to view digitised material.

Page 13: Systems, processes & how we stop the wheels falling off

How Goobi works – the basics

• Project based.

• Workflow driven.

• Users accept ‘tasks’.

• A user’s role determines what projects they belong to & what roles they have.
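The project, workflow and task model described above can be pictured with a small sketch. This is not Goobi's data model or API; every class, field and role name here is hypothetical and exists only to show tasks being accepted in workflow order by users with the right project and role.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    required_role: str
    done: bool = False


@dataclass
class Process:
    """One item moving through a project's workflow, step by step."""
    project: str
    tasks: list = field(default_factory=list)

    def next_task(self):
        """Return the first task that is not yet done, or None."""
        return next((t for t in self.tasks if not t.done), None)


@dataclass
class User:
    name: str
    projects: set
    roles: set

    def accept_and_complete(self, process: Process) -> Task:
        """Accept the next open task if this user is allowed to take it."""
        task = process.next_task()
        if task is None:
            raise ValueError("workflow already finished")
        if process.project not in self.projects:
            raise PermissionError(f"{self.name} is not in project {process.project}")
        if task.required_role not in self.roles:
            raise PermissionError(f"{self.name} lacks role {task.required_role}")
        task.done = True
        return task
```

A user with only a scanning role could complete a "Scan" task but would be refused the following "Edit METS" task, which stays open for someone with the matching role.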

Page 14: Systems, processes & how we stop the wheels falling off

How Goobi works – a workflow

Page 15: Systems, processes & how we stop the wheels falling off

How Goobi works – METS editing

Pagination as per original

Descriptive metadata

Structure

Page 16: Systems, processes & how we stop the wheels falling off

Lessons from Goobi

• Design your workflows in advance. But be flexible.

• Automate as much as possible; it saves time & is more efficient.

• Document processes & procedures.

• Share what you learn.

Page 17: Systems, processes & how we stop the wheels falling off

How SDB works – the basics

• Workflow based; easily ‘talks’ to other systems.

• Content agnostic.

• Creates administrative metadata on ingest.

• Preservation orientated.

Page 18: Systems, processes & how we stop the wheels falling off

How SDB works

Page 19: Systems, processes & how we stop the wheels falling off

How SDB works – behind the scenes

• No public access to SDB.

• Little direct staff access to SDB content.

• High levels of automation of ingest, initiated by Goobi.

• Platform for dissemination mediated by the player.

Page 20: Systems, processes & how we stop the wheels falling off

Lessons from SDB

• Plan your systems integration, which system talks to which, and how.

• Plan workflows & processes.

• Have a data management plan. Your eggs are in one basket.

• Plan what you’ll do when it all turns to custard.

Page 21: Systems, processes & how we stop the wheels falling off

How the player works – the basics

Page 22: Systems, processes & how we stop the wheels falling off

How the player works

• Makes HTTP request to SDB for content.

• Draws access conditions from METS file.

• Permitted actions drawn from METS.

• Draws DMD from live catalogue.
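How a player might draw access conditions from a METS file can be sketched like this. The slides do not say where in the METS the conditions live, so this example assumes they are held in a MODS `accessCondition` element; the condition values (`Open`, `Requires registration`), the policy and the function names are likewise assumptions, and the HTTP request to the repository is left out.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/", "mods": "http://www.loc.gov/mods/v3"}


def access_condition(mets_xml: str) -> str:
    """Return the first mods:accessCondition found in the METS, or 'Unknown'."""
    root = ET.fromstring(mets_xml)
    element = root.find(".//mods:accessCondition", NS)
    if element is None or not (element.text or "").strip():
        return "Unknown"
    return element.text.strip()


def may_display(condition: str, user_logged_in: bool) -> bool:
    """Hypothetical policy: open items for everyone, registered items after login."""
    if condition == "Open":
        return True
    if condition == "Requires registration":
        return user_logged_in
    return False  # anything unrecognised is withheld


# Hypothetical METS fragment carrying one access condition.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                           xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMD1"><mets:mdWrap MDTYPE="MODS"><mets:xmlData>
    <mods:mods><mods:accessCondition>Requires registration</mods:accessCondition></mods:mods>
  </mets:xmlData></mets:mdWrap></mets:dmdSec>
</mets:mets>"""
```

Withholding anything with an unrecognised condition is a deliberately cautious default: a missing or misspelt value then fails closed rather than exposing restricted material.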

Page 23: Systems, processes & how we stop the wheels falling off
Page 24: Systems, processes & how we stop the wheels falling off

Summary

• Digitisation is an end-to-end process that brings together objects & metadata.

• You have to think about the whole system to deliver results. The process is one of combining metadata from different systems.

• Document plans & document process.

• Be prepared to be flexible & to change as necessary. But try to stick to the plan!

Page 25: Systems, processes & how we stop the wheels falling off

Further reading

• Wellcome Library – http://wellcomelibrary.org

• Metadata Encoding & Transmission Standard at the Library of Congress - http://www.loc.gov/standards/mets/

• Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012 - http://public.ccsds.org/publications/RefModel.aspx

• Tessella, Safety Deposit Box - http://www.tessella.com/tag/safety-deposit-box/

• Data management planning - http://www.dcc.ac.uk/resources/data-management-plans

• Repository Software Comparison: Building Digital Library Infrastructure at LSE - http://www.ariadne.ac.uk/issue64/fay

Page 26: Systems, processes & how we stop the wheels falling off

Thank you

Questions now, questions later…?

Dave Thompson, Digital Curator, Wellcome Library

[email protected] - #welldigi

http://wellcomelibrary.org/