The paper trail: steps towards a reference model for the metadata ecology
DESCRIPTION
The paper trail: steps towards a reference model for the metadata ecology, presentation at the CoLIS5 workshop. Presentation with Jane Barton. http://mwi.cdlr.strath.ac.uk/Colisworkshop.htm Archiving: from June 2005. Please note that this presentation is currently all rights reserved until I contact the other author.
TRANSCRIPT
The paper trail: steps towards a reference model for the metadata ecology
R. John Robertson & Jane Barton
Centre for Digital Library Research
University of Strathclyde, UK
Overview
The paper trail:
tracking an object and its metadata
views of the object’s metadata lifecycle
analysis of metadata quality
Modelling the metadata ecology:
metadata lifecycle, extended lifecycle & ecology
a continuum of reference models
components of the metadata ecology model
existing models & frameworks
applications of the model
Tracking an object & its metadata
Introduction to exercise
The first thing you do
Looking for Barton, J. Currier, S. and Hey, J. M. N. (2003) Building quality assurance into metadata creation: an analysis based on the learning objects and e-prints communities of practice. In Sutton, S. and Greenberg, J. and Tennis, J., Eds. Proceedings 2003 Dublin Core Conference: Supporting Communities of Discourse and Practice - Metadata Research and Applications, Seattle, Washington (USA), 39-48.
Highlight simple object, simple purpose of metadata
Key to diagrams
Example nodes: E-Prints Soton, E-Prints UK, Worldcat, zniff
Copy of paper and metadata
OAI harvested record
Automatically created metadata
Manually created metadata
Tracking the object
DC2003/ DCMI
E-lis
Strathprints
E-Prints Soton
E-Prints UK
Arc (ODU)
Oaister
Metalis
HAIRST
Citebase
Worldcat
CDLR pubs
Erpanet
Stephen’s Web
zniff
Tardis list
CIS pubs
Erpanet
Other resource lists
Tracking the object’s metadata
DC2003/ DCMI
E-lis
Strathprints
E-Prints Soton
E-Prints UK
Arc (ODU)
Oaister
Metalis
HAIRST
Citebase
Worldcat
CDLR pubs
Erpanet
Stephen’s Web
zniff
Tardis list
CIS pubs
Erpanet
Other resource lists
Four metadata lifecycles for the object
Zoom in to look at metadata activity in sections of previous diagram.
What do these relationships imply for the metadata lifecycles associated with this paper?
What does this look like in terms of metadata workflows?
Author deposit
Worldcat
Resource list
Harvester
Author metadata lifecycle for the object
DC2003
Strathprints
CDLR pubs
CIS pubs
Erpanet
Title, Authors, Abstract, Publisher, Date, URL, Pages
Title, Authors, Publisher, Date, URL
bibtex
Review process
Review process
Worldcat metadata lifecycle for the object
E-Prints Soton
Worldcat
Title, Authors, Publisher, Date, URL
DC
MARC
Yahoo search
Authority files
Controlled vocabulary tools
Subject
Resource list metadata lifecycle for the object
DC2003
Erpanet
Stephen’s Web
Tardis
zniff
Other resource lists
Title, Authors, Date, URL
Title, URL
Harvester metadata lifecycle for the object
DC2003
E-lis
Strathprints
E-Prints Soton
E-Prints UK
Oaister
Metalis
HAIRST
Citebase
Erpanet
GET (label repeated eight times, on the links in the diagram)
Review process
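The GET labels in this diagram can be illustrated with a small sketch. It assumes the harvesters issue OAI-PMH requests over HTTP GET (the diagram key mentions OAI harvested records, but the slides do not name the protocol version or the parameters used). The function name and the base URL are hypothetical; `ListRecords`, `metadataPrefix`, `oai_dc` and `from` are standard OAI-PMH request arguments.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the GET URL for an OAI-PMH ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
url = list_records_url("https://repository.example.org/oai", from_date="2005-01-01")
print(url)
```

A harvester would fetch this URL and parse the returned XML; that step is omitted here because the slides say nothing about how each service processes the records it receives.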
Analysis of metadata quality
The return on the metadata investment in this paper
What metadata do we look for when searching for a doc?
Author
Title
Date
URL
Searching for a citation
Analysis of discovery metadata
Element               DC2003  e-prints Soton  e-lis  strathprints  e-Prints UK  metalis
Title                 y       y               y      y             y            y
Author                y       y               y      n             y            y
Date                  y       y               p      n             n            p
conference paper url  y       p               y      n             n            n

Element               HAIRST  worldcat  cdlr  zniff  stephen's web  erpanet
Title                 y       y         y     n      y              y
Author                p       y         y     n      n              n
Date                  n       y         y     n      y              y
conference paper url  n       n         y     y      y              y
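The values in the two tables above can be tallied per element to see how often each one is present across the twelve services. The sketch below transcribes the values as given and counts them; it does not interpret "y", "n" and "p", since the slides give no legend for them. The names `TABLE` and `count_values` are introduced here for illustration.

```python
# Column order: DC2003, e-prints Soton, e-lis, strathprints, e-Prints UK, metalis,
# then HAIRST, worldcat, cdlr, zniff, stephen's web, erpanet.
REPOSITORIES = 12

# One character per repository, transcribed from the two tables.
TABLE = {
    "Title": "yyyyyy" + "yyynyy",
    "Author": "yyynyy" + "pyynnn",
    "Date": "yypnnp" + "nyynyy",
    "conference paper url": "ypynnn" + "nnyyyy",
}


def count_values(row):
    """Count how many repositories carry each value for one element."""
    return {value: row.count(value) for value in "ynp"}


for element, row in TABLE.items():
    assert len(row) == REPOSITORIES  # guard against transcription slips
    print(element, count_values(row))
```

On these figures Title is marked "y" in eleven of the twelve services, while Date and the conference paper URL are each marked "y" in only six.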
Analysis of Citation completeness
Repository      HAIRST  worldcat  cdlr  zniff  stephen's web  erpanet  resource list
Citation score  1       5         6     2      5              4        2 (typically)

Repository      DC2003  e-prints Soton  e-lis  strathprints  e-Prints UK  metalis  Tardis list
Citation score  6       5               9      2             2            2        6

Citation elements: Author, Title, Date, Conference, Conference date / location, Publisher, Editors, Publication place, Pages, URL
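The slides do not say how the citation score is calculated. A minimal sketch, assuming the score is simply the number of the ten listed citation elements that a record supplies, might look like this. The function name and the record layout are hypothetical; the example values come from the citation quoted at the start of the exercise.

```python
CITATION_ELEMENTS = [
    "Author", "Title", "Date", "Conference", "Conference date / location",
    "Publisher", "Editors", "Publication place", "Pages", "URL",
]


def citation_score(record):
    """Count the citation elements with a non-empty value (assumed scoring rule)."""
    return sum(1 for element in CITATION_ELEMENTS if record.get(element))


# Elements present in the citation given earlier for the tracked paper.
example = {
    "Author": "Barton, J., Currier, S. and Hey, J. M. N.",
    "Title": "Building quality assurance into metadata creation",
    "Date": "2003",
    "Conference": "2003 Dublin Core Conference",
    "Conference date / location": "Seattle, Washington (USA)",
    "Editors": "Sutton, S., Greenberg, J. and Tennis, J.",
    "Pages": "39-48",
}
print(citation_score(example))
```

Under this assumed rule the quoted citation scores 7, because it carries no publisher, publication place or URL.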
Reflecting on this paper trail
Duplication of effort
Confusion rather than good diversity
Points in the system are capable of metadata exchange or augmentation, but this is not happening; nor are the available tools in use.
Joining the lifecycles up, and so addressing these issues, depends on being able to locate and understand the relevant sections of the ecology. This in turn depends on having a model for that ecology.
Defining the metadata ecology
The ‘metadata ecology’ captures all metadata activity associated with a single object:
the object’s metadata lifecycle at any given point in the system
extended metadata lifecycles for the object integrated across several points in the system
the relationships between all metadata lifecycles associated with the object throughout the system
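The three parts of this definition can be pictured as a small data structure. The sketch below is one possible reading, not the authors' model: the class and method names are hypothetical, and the link between Strathprints and HAIRST in the usage lines is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataLifecycle:
    """Metadata held for one object at one point in the system."""
    point: str
    elements: dict = field(default_factory=dict)


@dataclass
class MetadataEcology:
    """All metadata activity associated with a single object."""
    object_id: str
    lifecycles: dict = field(default_factory=dict)    # point name -> MetadataLifecycle
    relationships: set = field(default_factory=set)   # (source point, target point)

    def add_lifecycle(self, point, **elements):
        self.lifecycles[point] = MetadataLifecycle(point, dict(elements))

    def relate(self, source, target):
        # Only points already recorded in the ecology can be related.
        for point in (source, target):
            if point not in self.lifecycles:
                raise KeyError(point)
        self.relationships.add((source, target))

    def extended_lifecycle(self, points):
        """Lifecycles integrated across several named points."""
        return [self.lifecycles[p] for p in points]

    def related_points(self, point):
        """Points linked to `point` by a relationship, in either direction."""
        linked = {t for s, t in self.relationships if s == point}
        linked |= {s for s, t in self.relationships if t == point}
        return sorted(linked)


ecology = MetadataEcology("Barton, Currier & Hey (2003)")
ecology.add_lifecycle("Strathprints", title="Building quality assurance into metadata creation")
ecology.add_lifecycle("HAIRST")
ecology.relate("Strathprints", "HAIRST")  # assumed link, for illustration only
print(ecology.related_points("HAIRST"))
```

Keeping relationships as plain pairs of point names leaves room for formal and informal relationships alike; a fuller model would presumably record the kind of relationship as well.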
Illustrating the metadata ecology: metadata lifecycle
Strathprints
Review process
HAIRST
GET
Author/depositor
Illustrating the metadata ecology: extended metadata lifecycle
Strathprints
CDLR pubs
CIS pubs
Review process
HAIRST
GET
Illustrating the metadata ecology: metadata relationships
Strathprints
CDLR pubs
CIS pubs
Review process
HAIRST
GET
E-lis
Metalis
Review process
GET
Metadata relationship
Reference models
The ‘metadata ecology’ is part of a continuum of reference models for the distributed information environment at various levels of granularity:
ecology of repositories
object ecology
metadata ecology
Ecology of repositories
provides a typology of repositories and associated services
models the relationships between them and between their domains
requires an understanding of the purpose(s) of repositories locally and in the wider community, as well as their technical profiles and interactions
Object ecology
profiles objects within repositories
maps their movement, transformation and adaptation within individual repositories and in the wider environment
goes beyond object lifecycle to include extended object lifecycle and associated relationships
requires resolution of persistent object identification and digital rights issues
possible parallels with the learning object economy or the scholarly publishing model
Metadata ecology
profiles metadata within repositories
maps the movement, augmentation and enhancement of metadata in the wider system
distinguishes between local metadata requirements and those of the wider system
enables clusters of similar repositories to be identified and relationships established
includes metadata activity resulting from these relationships, formal or informal
Components of the ecology model
Existing models & frameworks
Existing models that relate to (parts of) the reference models:
the E-Learning Framework
McLean & Blinco’s cosmic view
the JISC Information Environment
CORDRA
the work of Gonçalves et al
CIDOC Conceptual Reference Model (CRM)
FRBR data model
The E-Learning Framework (ELF)
A common approach to service oriented architectures for education via:
a definitional model of service components
standards & tools to support their interoperability
Addresses a specific domain & provides a typology of functions within that domain
(The E-Learning Framework. http://www.elframework.org)
McLean & Blinco’s cosmic view
A service domain typology of repositories:
more comprehensive than ELF but less detailed
highlights potential for cross-domain approach
identifies need for better articulation of context & methodologies to deal with complex contextual issues
(McLean, N. The ecology of repository services: a cosmic view. ECDL, 2004. http://www.ecdl2004.org/presentations/mclean/)
The JISC Information Environment
Provides convenient access to a comprehensive collection of scholarly & educational materials
can be viewed as a specific implementation of ELF
provides a superstructure to inform & co-ordinate technical infrastructure development
focuses on technical solutions to support structural & syntactical interoperability
taking a lead in addressing unresolved issues in the object lifecycle
(JISC. Strategic activities: Information Environment. 2004. http://www.jisc.ac.uk/about_info_env.html)
CORDRA
Enables access to a wide range of learning object repositories through federated searching:
high common denominator for participating repositories
creates a community of repositories behind an interoperability boundary
assumes federation as the method of interaction, with metadata integration rather than interoperability
(Kraan, W. & Mason, J. Issues in federating repositories: a report on the first International CORDRA Workshop. D-Lib Magazine, 11(3), 2005.)
Gonçalves et al’s 5S
Complex formal taxonomy of repositories:
comprehensively catalogues repositories from five perspectives
engages with all three reference models but does not engage with interactions & offers only a static view
(Gonçalves, M.A. et al. Streams, structures, spaces, scenarios, societies (5S): a formal model for digital libraries. ACM Transactions on Information Systems, 22(2), 2004.)
Existing models & frameworks
In general, existing models
address structural & syntactic interactions to a degree but do not address semantic interactions
provide voices, vocabularies & grammar for repositories
could usefully be extended to profile not only what repositories do but how they might interact with each other
Moving forward…
Development and exploitation of the metadata ecology requires:
a standard way of profiling repositories at repository, object and metadata level
clear articulation of metadata requirements, in terms of structure, semantics and syntax, and of associated metadata workflows
integration with registries of repositories, standards, application profiles and vocabularies
Potential applications
The metadata ecology enables repositories to optimise metadata workflow and quality by:
exploiting known metadata sources via intelligent import or harvesting
exploiting formal metadata relationships between repositories via negotiation and establishment of minimum standards
providing a framework for assessing the cost/benefit of e.g. implementing metadata elements or participating in consortia
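As an illustration of the first application, importing from a known metadata source could be as simple as filling the gaps in a local record from a fuller record held elsewhere. This is a deliberately naive sketch: the function name and both records are hypothetical, and a real import would also need to handle conflicting values, provenance and differing element sets.

```python
def augment(local, source):
    """Return a copy of `local` with missing or empty elements filled from `source`."""
    merged = dict(local)
    for element, value in source.items():
        # Never overwrite a value the local repository already holds.
        if value and not merged.get(element):
            merged[element] = value
    return merged


# Hypothetical records for the same paper in two repositories.
local_record = {"Title": "Building quality assurance into metadata creation", "Author": "", "Date": None}
fuller_record = {"Title": "Building Quality Assurance", "Author": "Barton, J.", "Date": "2003", "Pages": "39-48"}

print(augment(local_record, fuller_record))
```

The local title is kept unchanged, while the empty Author and Date and the absent Pages are taken from the fuller record.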