Optimising metadata workflows in a distributed information environment R. John Robertson & Jane Barton Centre for Digital Library Research University of Strathclyde, UK

Optimising metadata workflows in a distributed information environment

R. John Robertson & Jane Barton

Centre for Digital Library Research

University of Strathclyde, UK

Overview

Introductions & definitions:

Metadata, workflow & optimisation

Diversity & the distributed information environment

Models and frameworks:

Generic models: repositories, objects & metadata

Existing models & frameworks

Developing a metadata lifecycle model

Using the metadata lifecycle model to optimise workflow

Moving forward

Metadata, workflow & optimisation

Metadata = good quality metadata = metadata that meets repository requirements

Metadata workflow = quality assured metadata by design = metadata creation & QA processes designed to meet repository requirements with available resources

Metadata workflow optimisation = refining metadata workflow to improve quality & enhance metadata

Critical to functionality, interoperability & sustainability of repositories
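
The definition of good quality metadata as "metadata that meets repository requirements" can be made concrete as a QA check. The sketch below is illustrative only: the `RepositoryRequirements` class, the element names and the vocabulary values are assumptions, not part of the original presentation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RepositoryRequirements:
    """Hypothetical statement of what one repository needs from a record."""
    required_elements: set[str]
    controlled_vocabularies: dict[str, set[str]] = field(default_factory=dict)


def quality_problems(record: dict[str, str], reqs: RepositoryRequirements) -> list[str]:
    """List the ways a record falls short of the repository's requirements."""
    problems = []
    # A required element counts as missing if it is absent or blank.
    for element in sorted(reqs.required_elements):
        if not record.get(element, "").strip():
            problems.append(f"missing required element: {element}")
    # Values of controlled elements must come from the agreed vocabulary.
    for element, allowed in sorted(reqs.controlled_vocabularies.items()):
        value = record.get(element)
        if value is not None and value not in allowed:
            problems.append(f"{element}: {value!r} not in controlled vocabulary")
    return problems


# Illustrative requirements and record (assumed values).
reqs = RepositoryRequirements(
    required_elements={"title", "creator", "subject"},
    controlled_vocabularies={"type": {"Text", "Image", "Dataset"}},
)
record = {"title": "Optimising metadata workflows", "creator": "", "type": "Slides"}
problems = quality_problems(record, reqs)
```

A record with an empty `problems` list meets the stated requirements; a workflow could run such a check at ingest and route failing records to manual review.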

Optimising metadata workflow

[Diagram of the metadata workflow optimisation process. Stages: Determine purpose of metadata; Determine required metadata quality; Determine target metadata quality; Design & implement workflow; Refine workflow; Review. Context: Local environment; Wider environment.]

Barton, J. & Robertson, R.J. Designing workflows for quality assured metadata. CETIS Metadata & Digital Repositories SIG Meeting, Edinburgh, 10th March 2005.

Diversity & the distributed information environment (dIE)

In the wider environment, there is considerable diversity

• of purpose

• of metadata requirements

• of metadata creation processes & priorities

Diversity presents challenges for interoperability between repositories

Diversity also offers potential for refinement of metadata workflow among repositories

Assumes/requires persistent object identifiers

Optimising metadata workflow in the dIE

Workflow optimisation requires a model of the dIE

• to facilitate strategic partnerships

• to inform allocation of resources

• to foster holistic approach to creation, augmentation & enhancement of metadata

To achieve this, two conditions must be met:

• local workflow must be articulated

• local workflow must be placed in context of wider environment

Reference models for workflow optimisation

Ecology of repositories

• provides a typology of repositories & associated services

• models the relationships between them & between their domains

Object lifecycle model

• profiles objects within repositories & their movement, transformation & adaptation within the dIE

Metadata lifecycle model

• profiles metadata within repositories & its movement, augmentation & enhancement within the dIE

Existing models & frameworks

Existing models that relate to (parts of) the reference models:

the E-Learning Framework

McLean & Blinco’s cosmic view

the JISC Information Environment

CORDRA

the work of Gonçalves et al

The E-Learning Framework (ELF)

A common approach to service oriented architectures for education via:

• a definitional model of service components

• standards & tools to support their interoperability

Addresses a specific domain & provides a typology of functions within that domain

(The E-Learning Framework. http://www.elframework.org)

McLean & Blinco’s cosmic view

A service domain typology of repositories

• more comprehensive than ELF but less detailed

• highlights potential for cross-domain approach

• identifies need for better articulation of context & methodologies to deal with complex contextual issues

(McLean, N. The ecology of repository services: a cosmic view. ECDL, 2004. http://www.ecdl2004.org/presentations/mclean/)

The JISC Information Environment

Provides convenient access to a comprehensive collection of scholarly & educational materials

• can be viewed as a specific implementation of ELF

• provides a superstructure to inform & co-ordinate technical infrastructure development

• focuses on technical solutions to support structural & syntactical interoperability

• taking a lead in addressing unresolved issues in the object lifecycle

(JISC. Strategic activities: Information Environment. 2004. http://www.jisc.ac.uk/about_info_env.html)

CORDRA

Enables access to wide range of learning object repositories through federated searching:

• high common denominator for participating LORs

• creates community of repositories behind interoperability boundary

• assumes federation as method of interaction, with metadata integration rather than interoperability, so little potential for metadata workflow optimisation

(Kraan, W. & Mason, J. Issues in federating repositories: a report on the first International CORDRA Workshop. D-Lib Magazine, 11(3), 2005.)

Gonçalves et al’s 5S

Complex formal taxonomy of repositories:

• comprehensively catalogues repositories from five perspectives

• engages with all three reference models but does not engage with interactions & offers only a static view

(Gonçalves, M.A. et al. Streams, structures, spaces, scenarios, societies (5S): a formal model for digital libraries. ACM Transactions on Information Systems, 22(2), 2004.)

Existing models & frameworks

In general, existing models

address structural & syntactic interactions to a degree but do not address semantic interactions

provide voices, vocabularies & grammar for repositories

could usefully be extended to profile not only what repositories do but how they might interact with each other

Developing a metadata lifecycle model

A metadata lifecycle model (MLM) must:

• include profiles of each repository’s metadata, ideally at element level, more realistically in terms of structure, semantics & syntax

• distinguish between local requirements & those of the wider community

• enable clusters of similar repositories to be identified & relationships established

• include processes carried out as a result of these relationships, formal or informal
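
As a rough illustration of the first and third requirements, a repository's metadata can be profiled by structure, syntax & semantics, and the profiles compared to find clusters of similar repositories. The `MetadataProfile` class, the repository names and the grouping rule (same structure, overlapping vocabularies) are assumptions made for this sketch; the presentation does not prescribe a concrete representation.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataProfile:
    """Hypothetical repository-level profile (not element level)."""
    repository: str
    structure: str               # metadata schema in use, e.g. "LOM"
    syntax: str                  # encoding or binding, e.g. "XML"
    semantics: frozenset[str]    # vocabularies used for values


def cluster_by_structure(profiles: list[MetadataProfile]) -> dict[str, list[str]]:
    """Group repository names by the schema they use."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for profile in profiles:
        clusters[profile.structure].append(profile.repository)
    return dict(clusters)


def shared_semantics(a: MetadataProfile, b: MetadataProfile) -> frozenset[str]:
    """Vocabularies two repositories have in common."""
    return a.semantics & b.semantics


# Illustrative profiles (assumed repositories and vocabularies).
lor_a = MetadataProfile("LOR-A", "LOM", "XML", frozenset({"MeSH", "LCSH"}))
lor_b = MetadataProfile("LOR-B", "LOM", "XML", frozenset({"MeSH"}))
ir_c = MetadataProfile("IR-C", "Dublin Core", "XML", frozenset({"LCSH"}))

clusters = cluster_by_structure([lor_a, lor_b, ir_c])
common = shared_semantics(lor_a, lor_b)
```

Here `clusters` groups LOR-A and LOR-B under "LOM", and `common` shows that they also share a vocabulary, which marks them as candidates for a formal or informal metadata relationship.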

Components of the model

Using the MLM to optimise workflow

MLM enables repositories to optimise workflow by:

• exploiting known metadata sources elsewhere in the dIE via intelligent import or harvesting

• exploiting formal metadata relationships between repositories & services via negotiation & establishment of minimum standards

• providing a framework for assessing the cost/benefit of, e.g., implementing particular metadata elements or participating in consortia

Using the MLM: example

The NSDL is a centralised service harvesting metadata from multiple sources:

• breaks harvested metadata into elements & assigns provenance metadata to them

• creates optimum records by combining metadata elements from various sources

• creates metadata profiles of sources to enable these processes to be automated

• demonstrates that metadata workflow optimisation & intelligent harvesting can yield real benefits
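
The steps listed above (split records into elements, attach provenance, combine the best elements using source profiles) can be sketched as follows. This is not the NSDL's implementation: the `Statement` class, the per-element scores in `source_profiles` and the sample records are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One metadata element value plus its provenance."""
    element: str
    value: str
    source: str


def split_record(record: dict[str, str], source: str) -> list[Statement]:
    """Break a harvested record into element-level statements."""
    return [Statement(e, v, source) for e, v in record.items() if v]


def combine(statements: list[Statement],
            source_profiles: dict[str, dict[str, float]]) -> dict[str, Statement]:
    """Keep, per element, the statement from the best-scoring source."""
    def score(st: Statement) -> float:
        # Unknown sources or elements score 0.0 (an assumption of this sketch).
        return source_profiles.get(st.source, {}).get(st.element, 0.0)

    best: dict[str, Statement] = {}
    for st in statements:
        if st.element not in best or score(st) > score(best[st.element]):
            best[st.element] = st
    return best


# Assumed profiles: how well each source is judged to handle each element.
source_profiles = {
    "ir": {"title": 0.9, "subject": 0.4},
    "subject_repo": {"title": 0.5, "subject": 0.9},
}
statements = (
    split_record({"title": "Optimising metadata workflows", "subject": "Libraries"}, "ir")
    + split_record({"title": "optimising metadata workflows", "subject": "Metadata workflow"},
                   "subject_repo")
)
merged = combine(statements, source_profiles)
```

Because every statement keeps its `source`, the combined record can still report where each element came from.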

Using the MLM: use cases

LOR using LOM wants to harvest metadata records, has crosswalks & mappings for structure & syntax, seeks repositories with similar semantic approach

federated search service wants to dynamically select search targets that can support MeSH

departmental repository enhances its metadata by re-harvesting general subject terms from its IR & specialist subject terms from a subject repository

centralised service augments metadata automatically & original source re-harvests improved record
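
The federated search use case reduces to filtering repository profiles by the vocabularies they support. A minimal sketch, assuming profiles are available as a mapping from repository name to supported vocabularies (the names and the mapping are illustrative, not from the presentation):

```python
def select_targets(profiles: dict[str, set[str]], vocabulary: str) -> list[str]:
    """Return, in name order, the repositories whose profile lists the vocabulary."""
    return sorted(name for name, vocabularies in profiles.items()
                  if vocabulary in vocabularies)


# Assumed profile data for three hypothetical repositories.
profiles = {
    "Health LOR": {"MeSH", "LCSH"},
    "Engineering IR": {"LCSH"},
    "Nursing repository": {"MeSH"},
}
targets = select_targets(profiles, "MeSH")
```

In practice the profiles would come from a registry rather than a hard-coded dictionary, so that the selection stays current as repositories change.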

Moving forward…

In context of rapid repository development with limited resources, must use available resources as effectively as possible

Optimising metadata workflow across the dIE can enable repositories to:

• expand element sets without compromising on quality

• expand functionality

• improve ingest processes

• support more automatic metadata transformation & enhancement

Moving forward…

Development of the MLM to support metadata workflow optimisation requires:

• standard way of profiling repositories at repository, object & metadata level

• integration with registry projects for repositories, standards, application profiles & vocabularies

• at individual repository level, a method for the design of metadata workflows that makes reference to & exploits workflows elsewhere in the dIE

Optimising metadata workflow

[Diagram of the metadata workflow optimisation process. Stages: Determine purpose of metadata; Determine required metadata quality; Determine target metadata quality; Design and implement workflow; Refine workflow; Review. Context: Local environment; Wider environment.]

Barton, J. & Robertson, R.J. Designing workflows for quality assured metadata. CETIS Metadata & Digital Repositories SIG Meeting, Edinburgh, 10th March 2005.