1

Document Ontologies in Library and Information Science: An Introduction and Critical Analysis

Allyson Carlyle

iSchool, University of Washington, Seattle, WA, USA

acarlyle@u.washington.edu

http://purl.oclc.org/net/acarlyle

Knowledge Technologies Conference 2002, Seattle, WA, USA

2

Overview

Where I’m coming from: knowledge and document organization tasks in LIS (Library and Information Science)

Factors affecting the organization of knowledge and documents

Ontologies and ontological assumptions in LIS

IFLA ontology: physical/abstract status of documents

Hirons & Graham ontology: temporal status of documents

3

Where I’m coming from: organizing tasks in LIS

Creating document representations, e.g., cataloging records;

Arranging documents, e.g., in Dewey number order on a library bookshelf;

Creating organizational standards (Dewey) and techniques (alphabetical ordering) to use in representing and arranging documents;

Creating organizational standards and techniques to provide pathways (via titles, author names, taxonomies, classifications, etc.) that guide people to documents and organized knowledge.

4

Factors affecting the organization of knowledge and documents

People: individuals and groups (e.g., social, cultural, occupational orientations);

Systems: retrieval, display/organization, interface;

Knowledge/documents: delivery mode/format, subject content, disciplinary aspect, artifactual importance;

Administration & environment: costs, other constraints.

5

Ontology?

In philosophy: “the branch of metaphysics that deals with the nature of being”

In computer-related communities: “a specification of a conceptualization” (Tom Gruber); or “a set of vocabulary definitions that expresses a community’s consensus knowledge about a domain. This knowledge is meant to be stable over time, and reused to solve multiple problems.” (Peter Weinstein)

6

Ontological assumptions in LIS

Documents have a simultaneous existence as both physical and abstract entities; the library cataloging community refers to this as “content vs. carrier.”

7

8

9

Content vs. Carrier

The physical/abstract dichotomy presents the following problem – if we are creating document representations, what should we represent? The carrier? The content? Both?

Whatever decision is made, the physical/abstract dichotomy may result in complications for people when they are searching, navigating, and trying to determine relevance in systems.

10

IFLA ontology of the physical/abstract status of documents

The International Federation of Library Associations (IFLA) charged a study group to identify “functional requirements for bibliographic records” (in other words, an explanation of an optimum model for creating document representations).

Functional Requirements for Bibliographic Records is available at: http://www.ifla.org/VII/s13/frbr/frbr.pdf

11

IFLA ontology of the physical/abstract status of documents

Proposes that documents are single physical entities representing multiple abstract entities, each with its own distinct, and sometimes contradictory, attributes:

work (an intellectual or artistic creation)

expression (a realization of a work in alpha-numeric, musical, image, etc. form)

manifestation (a physical embodiment of an expression of a work)

item (a single exemplar of a manifestation)

12

Alternative definitions for IFLA entities

work: a set of items embodying a distinct intellectual or artistic content

expression: a set of items embodying a realization of a work

manifestation: a set of “identical” items; items sharing many intellectual and physical attributes

item: a single item
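Under these set-based definitions, works, expressions, and manifestations are simply different groupings of the same items. A minimal sketch, assuming each item has already been labeled with its work, expression, and manifestation; the item identifiers, edition labels, and the `group_by` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Item(NamedTuple):
    item_id: str
    work: str
    expression: str
    manifestation: str


def group_by(items: Iterable[Item], level: str) -> Dict[str, Set[str]]:
    """Return, for one level of the model, the set of item ids under each label."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for it in items:
        groups[getattr(it, level)].add(it.item_id)
    return dict(groups)


# Illustrative holdings; identifiers and edition labels are made up.
items = [
    Item("copy-1", "La Mare au Diable", "The Devil's Pool", "hypothetical ed. A"),
    Item("copy-2", "La Mare au Diable", "The Devil's Pool", "hypothetical ed. A"),
    Item("copy-3", "La Mare au Diable", "The Haunted Pool", "hypothetical ed. B"),
    Item("copy-4", "La Mare au Diable", "La Mare au Diable", "hypothetical ed. C"),
]

works = group_by(items, "work")
expressions = group_by(items, "expression")
manifestations = group_by(items, "manifestation")
```

Here the work is the set of all four copies, each expression is a subset of it, and each manifestation is a subset of one expression.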

13

Items

Item attributes: condition, access restrictions on item, history (provenance), marks or inscriptions present

14

Manifestations

Manifestation attributes: edition designation (3rd edition), publisher/distributor, date of publication/distribution, physical medium, access restrictions on manifestation, file characteristics (electronic document)

What is a manifestation in the web environment? What you see on the screen or what is stored in a file on a server?

If a manifestation is defined as what you see on a screen, how useful is it to describe web page “manifestations”?

15

Expressions

Expression attributes: expression title (The Haunted Pool, The Devil’s Pool, La Mare au Diable), expression creator (e.g., translator), type of score (musical notation), projection or scale (cartographic expression), etc.

Do all expressions have unique attributes?

Dune vs. Dune: some interpretations would make manifestation attributes into expression attributes

16

Works

Work attributes: creator, work title (La Mare au Diable), date of creation / date of publication or appearance, key (for a musical work), coordinates (for a cartographic work), etc.

Problem: What is a work? When does a version (expression) of a work become different enough to become its own “distinct intellectual or artistic creation”?

Charles Dickens’ A Christmas Carol vs. Scrooged – the same work? different works that are related?

17

Solutions to the Physical/Abstract Multiple Entity Problem

The IFLA ontology is one approach; others, both simpler and more complex, are possible. See the Indecs Framework, a variation of the IFLA ontology for “intellectual property” e-commerce (http://www.indecs.org/).

Standardized approaches or ontologies are possible that:

recognize multiple abstract entities embodied in a single physical item;

represent each entity using a particular set of attributes, clearly distinguished;

display relationships among items to users in an unambiguous and consistent manner.

18

Hirons & Graham ontology: temporal status of documents

Some documents, such as magazines, annual reports, and websites, may be seen as distinct works that accumulate or change as time passes. Hirons and Graham identify these as “ongoing entities.” How do we best create representations for ongoing entities?

19

Hirons & Graham ontology: temporal status of documents

With their ontology, Hirons and Graham clarify the nature of ongoing entities to improve library cataloging rules.

However, their ontology may also be used to improve identification of metadata in web documents.

20

Hirons & Graham ontology

21

Strengths of the Hirons & Graham ontology

Recognizes both similarities and differences between documents such as serials that are “successive with discrete parts” and those that are “integrating”, such as websites.

Recognizes the fundamental nature of “integrating” documents: they are not made up of parts, but are wholes that are updated or changed.
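The difference between the two kinds of ongoing entity can be sketched in code. This is a rough illustration, assuming a successive entity keeps every part it has issued while an integrating entity keeps only its current whole; the class names, fields, and sample values are hypothetical and do not come from Hirons and Graham.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SuccessiveEntity:
    """Ongoing entity issued in discrete parts, e.g. a magazine."""
    title: str
    parts: List[str] = field(default_factory=list)

    def publish(self, part: str) -> None:
        # A new issue is added; earlier issues remain as separate parts.
        self.parts.append(part)


@dataclass
class IntegratingEntity:
    """Ongoing entity updated as a whole, e.g. a website."""
    title: str
    current_state: str = ""
    revision: int = 0

    def update(self, new_state: str) -> None:
        # The update replaces the whole; the earlier state is not kept as a part.
        self.current_state = new_state
        self.revision += 1


magazine = SuccessiveEntity("Time Magazine")
magazine.publish("issue 1")
magazine.publish("issue 2")

site = IntegratingEntity("Example website")
site.update("home page, version A")
site.update("home page, version B")
```

After two changes the magazine has two parts to describe, while the website still has one whole whose description must be revised.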

22

Complications

How do we maintain attribute values for ongoing entities? See Carl Lagoze et al. for a possible solution, using “event aware” metadata: http://www.cs.cornell.edu/lagoze/papers/ev.pdf
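One way to picture the maintenance problem is to record each attribute change as a dated event and look up the value in force on a given day. The sketch below is loosely inspired by the idea of event-aware metadata and is not the model described in the cited paper; `ChangeEvent`, `value_as_of`, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """A dated change to one attribute of an ongoing entity."""
    when: date
    attribute: str
    new_value: str


def value_as_of(events: List[ChangeEvent], attribute: str, on: date) -> Optional[str]:
    """Return the most recent value of `attribute` set on or before `on`."""
    relevant = [e for e in events if e.attribute == attribute and e.when <= on]
    if not relevant:
        return None
    return max(relevant, key=lambda e: e.when).new_value


# Illustrative history of a made-up e-journal.
events = [
    ChangeEvent(date(1995, 1, 1), "title", "Hypothetical Journal"),
    ChangeEvent(date(2000, 6, 1), "title", "Hypothetical Journal Online"),
    ChangeEvent(date(1995, 1, 1), "publisher", "Example Press"),
]

title_in_1999 = value_as_of(events, "title", date(1999, 12, 31))
title_in_2001 = value_as_of(events, "title", date(2001, 1, 1))
```

Keeping the events rather than overwriting a single value means an earlier title remains retrievable after the entity changes.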

23

Complications

Can the Hirons & Graham ontology and the IFLA ontology be successfully integrated? How can we talk about an integrating work or expression? What attributes are associated with them?

For example, are “serials”, such as magazines or e-journals, really “works”? If they are works (Time Magazine) composed of other works (Time articles), what are the implications for representation?

24

References

IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications, New Series, vol. 19. München: K.G. Saur, 1998. http://www.ifla.org/VII/s13/frbr/frbr.pdf

The Indecs (INteroperability of Data in E-Commerce Systems) Framework. At: http://www.indecs.org/ [Used the IFLA ontology as an initial framework.]

Jean Hirons and Crystal Graham. “Issues Related to Seriality.” From: The Principles and Future of AACR. Jean Weihs, ed. Ottawa: Canadian Library Association, 1998. [Written for the library cataloging community, so parts may be difficult to understand.]

Carl Lagoze, Jane Hunter, and Dan Brickley. “An Event-Aware Model for Metadata Interoperability.” At: http://www.cs.cornell.edu/lagoze/papers/ev.pdf

25

“Ontology” References

Tom Gruber. (2001) “What is an Ontology?” At: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html.

Peter Weinstein. “Ontology-Based Metadata: Transforming the MARC Legacy”, from Digital Libraries 98, Pittsburgh, PA, USA: pp. 254-263.
