
A Logical Framework for Metadata Interoperability

16th August 2007

The Advanced Digital Library Seminar 2007, Guilin, China


Contents

• 1. Metadata interoperability in libraries
• 2. Metadata interoperability
• 3. Two interoperability prototypes
• 4. Consistency issues
• 5. Models and interoperability
• 6. Semantics and syntax of metadata
• 7. Interoperability across models
• 8. Dublin Core application profile
• 9. Discussion: library metadata practices


1. Metadata interoperability in libraries

Metadata interoperability is increasingly accepted as something that libraries have to deal with.


New challenges for libraries

• A new information environment
  – Variety of resources
  – Distributed locations
  – Heterogeneous structures


New challenges for libraries

• New library services – putting everything together


New challenges for libraries

• New library system architecture


WorldCat

• WorldCat achieves interoperability across geographically distributed library catalogues with a single metadata system, MARC21.

• Centralized architecture.


Z39.50

• Z39.50 achieves interoperability with a decentralized architecture.


2. Metadata interoperability

When we talk about interoperability in relation to metadata, we are generally talking about search interoperability, or the ability to perform a search over a diverse set of metadata records to obtain meaningful results.

(Caplan, 2003)

Caplan, P. (2003). Metadata Fundamentals for All Librarians. American Library Association.


Levels of metadata interoperability

• Schema level – focused on interoperability between elements of the schemas.
• Record level – integrating and converting metadata records.
• Repository level – harvested or integrated records from varying sources, mapping value strings.

(Chan & Zeng, 2006)

Chan, L. M., & Zeng, M. L. (2006). Metadata Interoperability and Standardization – A Study of Methodology Part I. D-Lib Magazine, 12(6).


Machine-understandable

• When does “metadata interoperability” become a problem?
  – Human understandable
  – Machine understandable

• The core issue of metadata interoperability is how to make the relationships among metadata systems machine-understandable.


Machine-understandable

[Diagram: metadata links the real world (human understandable) and the machine world (machine understandable).]


3. Two interoperability prototypes

• There are two basic mechanisms to achieve interoperability between metadata systems: mapping and integrating.

• Mapping is a process of matching original metadata systems to target metadata systems. It assigns a binary relationship between a pair of members in different metadata systems.

• Integrating is a process of combining different metadata systems into a single metadata system. The principle is to federate the systems, at both the record level and the element level, so that they work together.


Mapping and Integrating

Metadata interoperability

• Mapping
  – Crosswalk
  – Switching schema
• Integrating
  – Metadata packages: Warwick Framework
  – Element reusing: Application Profile


Crosswalk

[Diagram: elements 1 to n of Metadata A mapped to elements 1 to n of Metadata B.]
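A crosswalk of this kind can be sketched as a lookup table of element pairs. The sketch below is illustrative only: `DC_TO_MARC` and `crosswalk` are hypothetical names, and the three pairs are simplified assumptions rather than an authoritative Dublin Core to MARC crosswalk.

```python
# Illustrative crosswalk: each entry is a binary relationship between one
# element of Metadata A (Dublin Core) and one element of Metadata B (MARC).
# The pairs are simplified assumptions, not an official crosswalk.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
    "subject": "650",
}


def crosswalk(record, mapping):
    """Convert a flat record element by element; report what cannot be mapped."""
    converted, unmapped = {}, []
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)
        else:
            converted[target] = value
    return converted, unmapped


dc_record = {"title": "Arithmetic", "creator": "Sandburg, Carl", "rights": "unknown"}
marc_fields, lost = crosswalk(dc_record, DC_TO_MARC)
print(marc_fields)  # {'245': 'Arithmetic', '100': 'Sandburg, Carl'}
print(lost)         # ['rights']
```

Returning the unmapped elements separately makes the information loss of a one-way crosswalk visible instead of silently dropping it.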


Switching schema

[Diagram: MARC-XML, GEM, Dublin Core, MARC, ONIX and EAD, each connected to an Interoperable Core.]

Godby, C. J., Smith, D., & Childress, E. (2003). Two paths to interoperable metadata. Retrieved 31 July, 2005, from http://www.oclc.org/research/publications/archive/2003/godby-dc2003.pdf
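The benefit of a switching schema is that each schema needs only one mapping to and from the core: with n schemas, 2n directed mappings replace the n(n-1) needed for pairwise crosswalks. A minimal sketch, assuming a made-up core vocabulary (`core:title`, `core:agent`) and simplified element names:

```python
# Each schema is mapped once to a hypothetical interoperable core.
# Element names and core terms are illustrative assumptions.
TO_CORE = {
    "DC": {"title": "core:title", "creator": "core:agent"},
    "MARC": {"245": "core:title", "100": "core:agent"},
    "ONIX": {"TitleText": "core:title", "PersonName": "core:agent"},
}


def from_core(schema):
    """Invert a schema's mapping so core terms lead back to local elements."""
    return {core: local for local, core in TO_CORE[schema].items()}


def switch(record, source, target):
    """Convert a record by passing through the core instead of a direct crosswalk."""
    to_core = TO_CORE[source]
    back = from_core(target)
    result = {}
    for element, value in record.items():
        core_term = to_core.get(element)
        if core_term in back:
            result[back[core_term]] = value
    return result


print(switch({"245": "Arithmetic", "100": "Sandburg, Carl"}, "MARC", "ONIX"))
# {'TitleText': 'Arithmetic', 'PersonName': 'Sandburg, Carl'}
```

Adding a new schema here means adding one entry to `TO_CORE`; no existing mapping has to change.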


Metadata packages: Warwick Framework

• Metadata module
  – Container
  – Packages
    • Metadata set
    • Indirect
    • Container

Lagoze, C. (1996). The Warwick Framework [Electronic Version]. D-Lib Magazine. Retrieved 8 July, 2007 from http://www.dlib.org/dlib/july96/lagoze/07lagoze.html.


Application profiles

• The principle of the application profile approach is the reuse of existing metadata schemas.

• Application profiles consist of data elements drawn from one or more namespace schemas combined together by implementors and optimised for a particular local application. (Heery & Patel, 2000)

Heery, R., & Patel, M. (2000). Application profiles: mixing and matching metadata schemas [Electronic Version]. Ariadne. Retrieved July 8, 2007 from http://www.ariadne.ac.uk/issue25/app-profiles/intro.html
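The "mixing and matching" idea can be sketched as a profile that lists terms by namespace instead of defining a new schema. This is a minimal illustration: the `local` namespace, the `shelfLocation` element and the function names are hypothetical.

```python
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "local": "http://example.org/local/",  # hypothetical local namespace
}

# The profile reuses existing terms and adds one local element.
PROFILE = [
    ("dc", "title"),
    ("dc", "creator"),
    ("local", "shelfLocation"),  # hypothetical local element
]


def term_uris(profile, namespaces):
    """Expand each (prefix, name) pair into a full term URI."""
    return [namespaces[prefix] + name for prefix, name in profile]


def conforms(record, profile):
    """True if the record uses only terms that the profile allows."""
    allowed = {f"{prefix}:{name}" for prefix, name in profile}
    return set(record) <= allowed


print(term_uris(PROFILE, NAMESPACES)[0])  # http://purl.org/dc/elements/1.1/title
print(conforms({"dc:title": "Arithmetic", "local:shelfLocation": "A3"}, PROFILE))  # True
print(conforms({"dc:format": "text"}, PROFILE))  # False
```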


Government of Canada metadata project: an example of application profiles

Devey, M., & Côté, M. (2006). The development and use of metadata application profiles: the Government of Canada experience. The Serials Librarian, 51(2).


4. Consistency issues

• “Two communities may agree about the meaning of the term title or creator or identifier, but until they have a shared convention for identifying and encoding values, they cannot easily exchange their metadata.”

(Duval, Hodgins, Sutton, & Weibel, 2002)

Duval, E., Hodgins, W., Sutton, S., & Weibel, S. L. (2002). Metadata Principles and Practicalities [Electronic Version]. D-Lib Magazine, 8. Retrieved July 10, 2007 from http://www.dlib.org/dlib/april02/weibel/04weibel.html.


Barriers to interoperability

• Semantic inconsistency, e.g. dc.Title vs. MARC.245; dc.Creator vs. EAD.Author

• Syntax inconformity, e.g. ISO 2709 vs. XML

• Representation inconformity, e.g. Hillmann, Diane I. vs. Hillmann, D. I.

• Vocabulary inconformity, e.g. LCSH vs. MeSH
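The representation barrier can be made concrete with the name example above. The sketch below reduces both forms to a comparison key of surname plus initials; `name_key` is a hypothetical helper, and the rule is an assumption that will wrongly merge different people who share a surname and initials.

```python
def name_key(name):
    """Reduce 'Surname, Forename M.' to (surname, initials) for loose matching."""
    surname, _, rest = name.partition(",")
    initials = "".join(part[0] for part in rest.replace(".", " ").split())
    return surname.strip().lower(), initials.lower()


print(name_key("Hillmann, Diane I."))  # ('hillmann', 'di')
print(name_key("Hillmann, D. I."))     # ('hillmann', 'di')
```

A shared authority file or identifier would be a more reliable fix than string normalisation; the key above only shows why two correct strings fail to match as they stand.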


5. Models and interoperability

• A metadata model is an abstract construct that represents metadata by a set of components and a set of logical relationships between them.

• A metadata model may help us to gain a better understanding of metadata and its relationship with the real world, as well as the relationships within an encoding system that is machine-understandable.

• A metadata model facilitates the development of better interoperability between metadata systems.


Models and interoperability

[Diagram: a metadata model sits between the real world and the machine world (i.e. computer). It supports better understanding of each metadata system and interoperability between metadata systems.]


DC Abstract Model (DCAM)

• The Dublin Core Abstract Model (DCAM) is a model for encoding metadata. It acts as a grammar for Dublin Core.

• DCAM is intended to be independent of any specific encoding syntax, such as RDF, and to be a universal meta-syntax model for metadata encoding.


Principles of DC Abstract Model (DCAM)

• One-to-one principle: a description describes exactly one resource.

• Dumb-down principle: a refinement principle that keeps narrower-broader relationships consistent.

• Appropriate values: well-typed values ensure that the usage of metadata elements in a particular context is well guided.
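The dumb-down principle can be sketched as replacing each refined property by the broader element it refines, so that an application that only knows the broad element can still use the value. The three refinement pairs below follow DCMI Metadata Terms; `REFINES` and `dumb_down` are hypothetical names.

```python
# Narrower property -> broader property it refines.
REFINES = {
    "alternative": "title",
    "created": "date",
    "abstract": "description",
}


def dumb_down(statements):
    """Replace refined properties by their broader element; keep the rest."""
    return [(REFINES.get(prop, prop), value) for prop, value in statements]


qualified = [
    ("title", "Arithmetic"),
    ("alternative", "Arithmetic: a poem"),
    ("created", "2007-01-12"),
]
print(dumb_down(qualified))
# [('title', 'Arithmetic'), ('title', 'Arithmetic: a poem'), ('date', '2007-01-12')]
```

The result is less precise but still correct, which is the consistency the principle asks for.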


One to one principle: Based on FRBR Model

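[Diagram: one book described by three separate descriptions, following the FRBR entity groups.
Description A: A Book, with Title, Publisher, Subject, ISBN, Author (Group 1: work)
Description B: Subject, with Scheme, Terms (Group 3: Subject)
Description C: Author, with Name, Address (Group 2: People, Co.)]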


DCAM resources model

• Any object that the metadata is intended to describe is a resource.

• A resource is described by one or more property/value pairs.

• Each value is either a literal value or a non-literal value.

[Diagram: a Resource has Property/Value pairs; a Value is either literal or non-literal, and a non-literal value is itself a Resource.]


Property/value pair with different formats

• Property/value pairs are the core of the DCAM. They are the minimal semantic units of metadata.


DCAM description set model

• Property/value pairs are represented by a statement which is made up of a property URI/value surrogate pair.

• A set of statements, which describes a resource, is a description.

• A set of descriptions is a description set.

[Diagram: a Property URI and a Value Surrogate form a Statement; Statements form a Description; Descriptions form a Description Set.]
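The nesting of statement, description and description set can be sketched with plain data classes. This is a simplified reading of the model, not the DCAM specification itself: the class names `Literal` and `NonLiteral` stand in for the two kinds of value surrogate, and all `example.org` URIs are made up.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class Literal:
    text: str  # a plain string value


@dataclass(frozen=True)
class NonLiteral:
    value_uri: str  # the value is another resource, identified by URI


@dataclass
class Statement:
    property_uri: str
    value: Union[Literal, NonLiteral]


@dataclass
class Description:
    resource_uri: str  # one description describes exactly one resource
    statements: List[Statement] = field(default_factory=list)


@dataclass
class DescriptionSet:
    descriptions: List[Description] = field(default_factory=list)


DC = "http://purl.org/dc/elements/1.1/"
book = Description(
    "http://example.org/book/1",
    [
        Statement(DC + "title", Literal("Arithmetic")),
        Statement(DC + "creator", NonLiteral("http://example.org/person/sandburg")),
    ],
)
author = Description(
    "http://example.org/person/sandburg",
    [Statement("http://example.org/terms/name", Literal("Carl Sandburg"))],
)
description_set = DescriptionSet([book, author])
print(len(description_set.descriptions))  # 2
```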


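Description set model

[Diagram: nested boxes, with a Description Set containing Descriptions, a Description containing Statements, and a Statement containing a Property-Value pair.]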


DCAM vocabulary model

• Vocabulary is a set of terms. The terms denote the semantics both of the property and the value.

• In the DCAM, a vocabulary may be one of three things: a value vocabulary, a property vocabulary, or a class.

• A property vocabulary includes a metadata element set and its definitions.

• A value vocabulary includes vocabulary encoding schemes, e.g. LCSH and DDC, and syntax encoding schemes, e.g. those that distinguish the date forms 2007-1-12, 1-12-2007 and 12-1-2007.
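The date example shows why a syntax encoding scheme has to travel with the value. In the sketch below the scheme labels in `SCHEMES` are hypothetical; only the standard-library `datetime.strptime` call is real.

```python
from datetime import datetime

# Hypothetical scheme labels mapped to strptime format strings.
SCHEMES = {
    "year-month-day": "%Y-%m-%d",
    "month-day-year": "%m-%d-%Y",
    "day-month-year": "%d-%m-%Y",
}


def parse_date(value, scheme):
    """Interpret a date string according to the declared syntax encoding scheme."""
    return datetime.strptime(value, SCHEMES[scheme]).date()


a = parse_date("2007-1-12", "year-month-day")
b = parse_date("1-12-2007", "month-day-year")
c = parse_date("12-1-2007", "day-month-year")
print(a == b == c)  # True: three strings, one date

# Read under the wrong scheme, the same string becomes a different date.
print(parse_date("12-1-2007", "month-day-year"))  # 2007-12-01
```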


DCAM : a summary

[Diagram: Domain Model and Syntax Model.]


6. Semantics and syntax of metadata

• The semantics of metadata concerns the machine-understandable meanings of the metadata.

• All machine-understandable meanings are inherited from human-understandable meanings, so the human-understandable meaning space of the metadata is also a significant part of the metadata model.

• The vocabulary model of metadata is the human understandable semantic space that defines and identifies the attributes and content of the resource. It is also a mechanism which transforms the meanings between humans and machines.


Domain models

• The meanings of the two functional components, property and value, are based on a general view of the resource which the metadata intends to describe.

• A property denotes an attribute of the resource; the definition of a property therefore relies on how the resource is structured into a set of attributes.

• The value is extracted from the resource under guidelines for selecting, formulating and presenting values.

• The general view of the resource can be indicated by domain models, such as FRBR in the bibliographic domain and CIDOC CRM in the cultural heritage domain.


Syntax of metadata

• The syntax of metadata is about machine processable formats.

• Metadata should be encoded in a machine-readable format, e.g. MARC, XML or RDF, following encoding standards. These standards are designed to provide a common way to describe metadata and to make it easy for computer applications to read and understand.

• The framework for encoding metadata into a particular kind of format is the syntax of the metadata.


Metadata model structure

Semantic
• Domain Model: FRBR, ABC, CRM
• Vocabulary Model
  – Property Model: metadata element schemes
  – Value Model: vocabulary encoding schemes and syntax encoding schemes, AACRII, RDA
• Resources Model: property/value pairs

Syntax
• Description Set Model
• Syntax Model: RDF, SKOS


7. Interoperability across models

• Metadata interoperability may occur between each layer of the metadata models.

[Diagram: two metadata model stacks side by side. Each has a semantic part (Domain Model; Vocabulary Model, comprising Property Model and Value Model; Resource Model) and a syntax part (Description Set Model; Syntax Model).]


Semantic interoperability

• Interoperability at the semantic layer means agreement between semantic models, such as metadata element sets, encoding schemes and, at the top level, domain models. It must be considered seriously when pursuing interoperability at the schema level.

• Terminology harmonization
• Elements mapping
• Common ontology


Terminology harmonization

• The simplest way to achieve semantic interoperability is to extract a common set of terminology for metadata standards, which is called harmonization (St. Pierre & LaPlant, 1998).

• However, terminology alone cannot carry the whole meaning needed for semantic interoperability. Sometimes the same term is defined or implemented differently.

St. Pierre, M., & LaPlant, W. P. (1998). Issues in Crosswalking Content Metadata Standards [Electronic Version]. Retrieved July 8, 2007 from http://www.niso.org/press/whitepapers/crsswalk.html.


Elements mapping


Interoperability between domain models

• To achieve perfect semantic interoperability, the properties in each of the metadata systems should be defined under the same conceptual framework. This common conceptual framework is a common ontology.

• An ontology is a formal explicit description of concepts. It consists of classes, slots, facets, and instances. An ontology of metadata, together with a set of individual instances of its concepts, constitutes a knowledge base for metadata interoperability.

• A common ontology can adequately represent the semantics of metadata using knowledge representation languages such as OWL, and achieve a common understanding of metadata across different domains.


ABC model, example of common ontology

• The ABC ontology, developed within the Harmony International Digital Library Project, provides a common conceptual model to facilitate interoperability between metadata ontologies from different domains.

Lagoze, C., & Hunter, J. (2001). The ABC Ontology and Model [Electronic Version]. Journal of Digital Information, 2. Retrieved July 2007 from http://journals.tdl.org/jodi/issue/view/10.


Interoperability between domain models

• Ontology mapping between ABC and CRM (Doerr, Hunter, & Lagoze, 2003)

Doerr, M., Hunter, J., & Lagoze, C. (2003). Towards a Core Ontology for Information Integration [Electronic Version]. Journal of Digital Information, 4. Retrieved July, 2007 from http://journals.tdl.org/jodi/article/view/jodi-109/91.


Interoperability between domain models

• Extending ABC with MPEG-7 (Hunter, 2003)

Hunter, J. (2003). Enhancing the semantic interoperability of multimedia through a core ontology. IEEE Transactions on Circuits and Systems for Video Technology, 13(1), 49-58.


Syntax interoperability

• Interoperability at the syntax layer ensures format conformity. Syntax or format inconformity often arises during metadata record conversion at the record level of metadata interoperability. For example, when we convert a MARC record into MARCXML format, we must follow the syntax of MARCXML exactly.

e.g. 24510 |a Arithmetic / |c Carl Sandburg.

Correct: <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Arithmetic /</subfield><subfield code="c">Carl Sandburg.</subfield></datafield>

Incorrect: <datafield tag="245" ind1="1" ind2="0">a Arithmetic / |c Carl Sandburg.</datafield>
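The conversion in this example can be sketched with the standard library. The function below is a hypothetical helper that assumes the human-readable display form used on the slide (tag, two indicators, then `|`-delimited subfields), not raw ISO 2709 bytes, and it omits the MARCXML namespace for brevity.

```python
import xml.etree.ElementTree as ET


def marc_line_to_datafield(line):
    """Convert a display-form field such as '24510 |a Arithmetic / |c Carl Sandburg.'"""
    head, _, rest = line.partition(" ")
    tag, ind1, ind2 = head[:3], head[3], head[4]
    datafield = ET.Element("datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    # Each chunk after a '|' starts with the subfield code, then the content.
    for chunk in rest.split("|")[1:]:
        subfield = ET.SubElement(datafield, "subfield", {"code": chunk[0]})
        subfield.text = chunk[1:].strip()
    return datafield


title_field = marc_line_to_datafield("24510 |a Arithmetic / |c Carl Sandburg.")
print(ET.tostring(title_field, encoding="unicode"))
```

Each subfield becomes its own element, which is exactly what the incorrect version above fails to do when it copies the delimited string into the `datafield` element.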


Syntax interoperability

• The approach to metadata interoperability at the syntax level is now more focused on developing syntax-independent data models, which are equally applicable in a variety of syntax contexts. Under such models, metadata property/value pairs and their statements can be encoded with different encoding frameworks, such as RDF, <META>, SHOE, etc.


URI and interoperability

• In DCAM, a value in a property/value pair can be another resource, identified by a URI. This means that a resource can be described by another resource. This principle provides a framework that links different metadata description sets together.

[Diagram: in Metadata A, a property of Resource A takes a value URI; that URI identifies Resource B, which Metadata B describes with its own property URI and value.]
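The linking role of the URI can be sketched with two small description sets held as dictionaries. Everything here is hypothetical: the `example.org` URIs, the property names and the `follow` helper.

```python
# Metadata A describes a book; its creator value is a URI, not a string.
metadata_a = {
    "http://example.org/book/1": {
        "title": "Arithmetic",
        "creator": "http://example.org/person/sandburg",
    }
}
# Metadata B, maintained elsewhere, describes the resource behind that URI.
metadata_b = {
    "http://example.org/person/sandburg": {"name": "Carl Sandburg"}
}


def follow(value, *description_sets):
    """Return the description a value URI points to, or the value itself."""
    for description_set in description_sets:
        if value in description_set:
            return description_set[value]
    return value


creator = metadata_a["http://example.org/book/1"]["creator"]
print(follow(creator, metadata_a, metadata_b))       # {'name': 'Carl Sandburg'}
print(follow("Arithmetic", metadata_a, metadata_b))  # Arithmetic
```

Neither description set needs to know the other's schema; the shared URI is the only point of agreement.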


8. Dublin Core application profile

• The CEN MMI-DC Workshop, a research group of the European Committee for Standardization, developed application profile guidelines that give guidance on how to create a customized or adapted metadata schema for a particular application. The Dublin Core Application Profile (DCAP) is based on Dublin Core. It allows people to extend Dublin Core by drawing elements from other existing metadata schemas or by creating new elements to meet particular needs.

• DCAP follows DCAM, and in particular the Principle of Appropriate Identification and the Principle of Readability. (CEN MMI-DC Workshop, 2003)

CEN MMI-DC Workshop. (2003). Dublin Core Application Profile Guidelines. Retrieved July, 2007, from ftp://ftp.cenorm.be/PUBLIC/CWAs/e-Europe/MMI-DC/cwa14855-00-2003-Nov.pdf


Dublin Core application profile

Name of Term: title

Term URI: http://purl.org/dc/elements/1.1/title

Label: Title

Defined By: http://dublincore.org/documents/dcmi-terms/

Source Definition: A name given to the resource

DC-Lib Definition: -

Source Comments: Typically, a title will be a name by which the resource is formally known.

DC-Lib Comments: A parallel/transliterated title is considered a main title, i.e. the Title element is repeated.

  Either a title or identifier is mandatory. If no title is available, best practice is to give a constructed title, derive a title from the resource or supply [no title]. If using qualified Dublin Core, an element refinement for titles other than the main title(s) should be included.

  Retain initial articles and use local sorting algorithms based on language. A language qualifier may be used to indicate language of title if appropriate. (For example, see: Initial Definite and Indefinite Articles for a list of articles in various languages).

Type of term: element

Refines:

Refined By: alternative

Has Encoding Scheme:

Obligation: M

Occurrence:

CEN MMI-DC Workshop. (2003). Dublin Core Application Profile Guidelines. Retrieved July, 2007, from ftp://ftp.cenorm.be/PUBLIC/CWAs/e-Europe/MMI-DC/cwa14855-00-2003-Nov.pdf
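The obligation stated in the DC-Lib comments ("either a title or identifier is mandatory") is the kind of rule an application profile lets software check. A minimal sketch, assuming records are plain dictionaries of repeatable values; `check_obligation` is a hypothetical helper, not part of any DC-Lib tooling.

```python
def check_obligation(record):
    """Return a list of problems; an empty list means the rule is satisfied."""
    has_title = bool(record.get("title"))
    has_identifier = bool(record.get("identifier"))
    if has_title or has_identifier:
        return []
    return ["either title or identifier is required"]


# The Title element may be repeated, e.g. for a parallel title.
print(check_obligation({"title": ["Arithmetic", "Arithmétique"]}))  # []
print(check_obligation({"identifier": ["urn:example:1"]}))          # []
print(check_obligation({"creator": ["Sandburg, Carl"]}))
# ['either title or identifier is required']
```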


9. Discussion: library metadata practices

• The information organizing paradigm in the library community is shifting.
• From a single metadata schema to multiple metadata schemas.
• The application profile becomes a cornerstone of information organization in the library community.
• Working for both users and machines: the new library professional.


Questions and Thanks!

Special thanks to

Dr. MARCIA LEI ZENG

Professor of School of Library and Information Science, Kent State University, U.S.A.

Dr. LIU WEI

Senior Research Librarian of Shanghai Library, China

Ms. KAAREN HIYAMA

Asian Languages Librarian, Asian Studies Subject Librarian of University of Auckland Library, New Zealand

for their help and contribution with this presentation.