ELAG 2006. Matei: PML 1

Cataloguing the Romanian Cultural Heritage, or yet another schema for heritage assets

Dan Matei (CIMEC)

A new schema: why?

• (CIMEC) the difficulty of managing many databases with overlapping content;

• limitations of the MARC formats and of the non-standard (and too simple) museum metadata;

• new insights via FRBR and CRM;

• the need for a data model for the (future) Romanian Shared Catalogue.

Why is MARC not enough?

ELAG 2004 (Trondheim) WS 10:

• the "1 to 1 principle" is not observed (i.e. metadata about several resources in the same record);

• it is not flexible enough, i.e. it is almost flat: it allows only 2 (let's say 3) hierarchical levels, with no good control of the granularity of data;

• some tags (e.g. those for the headings) express two different things: a) the nature of the related resource, b) the kind of relationship;

• it does not (naturally) allow multilingual data within a record (for the fields with values in the language of the cataloguing).

Functions of the catalogue

FRBR: the Frankfurt Principles (2003) [the Buenos Aires version (2004)] wording: ... to enable a user:

• to find bibliographic resources in a collection (real or virtual) as the result of a search using attributes or relationships of the resources:
  – to locate a single resource
  – to locate sets of resources

• to identify a bibliographic resource or agent;

• to select a bibliographic resource that is appropriate to the user's needs;

• to acquire or obtain access to an item described;

• to navigate a catalogue.

(Extra) functional requirements for the shared catalogues

• FR1: language neutrality, i.e. the textual elements could be expressed in several languages, and the language of an element could be detected automatically;

• FR2: traceability of changes, i.e. the modifications could be tracked, dated and attributed (thus, reversed);

• FR3: opinion neutrality, i.e. different opinions could coexist in the metadata, that is the elements could have alternative values, with clearly assigned intellectual responsibilities.

PML = Panizzi Markup Language

Sir Anthony Panizzi (1797-1879)

• Principal Librarian of the British Museum (1856 – 1866);

• the famous 91 cataloguing rules (1839).

Other XML-based formats

• MARCXML (LC) – 2003;

• MODS: Metadata Object Description Schema (LC) – 2005;

• MADS: Metadata Authority Description Schema (LC) – 2005;

• BiblioML (French Ministry of Culture) – 1999;

• rdfs:frbr (Stefan Gradmann) – 2005.

PML: "design principles"

• P1: a data model based on FRBR & CRM, i.e. accommodating library and museum resources;

• P2: to observe the three (extra) functional requirements for the shared catalogues;

• P3: to enhance the (lexicographic and chronological) browsing of the access points;

• P4: to make the simple easy and the complex possible (corollary: to accommodate a scalable granularity of data);

• P5: descriptions could include "elements not required for the stated objectives" (i.e. only half of Svenonius' "Principle of sufficiency and necessity").

Two (contradictory) ambitions

a) to have specific elements for the frequent resources (e.g. books, articles, paintings, coins), but also generic ones for the many less frequent types of resources (e.g. artifact); a new element is imposed by the specific mixture of the resource's properties;

b) to come up with an elegant language (i.e. with economy of means).

a) is much easier than b)!

PML: outline (the catalog)

<catalog ...>
  <records>
    <book guid="g1" ...>
    <coin guid="g2" ...>
    ...
  </records>
  <cataloguers>
    <cataloguer guid="g3" ...>
    ...
  </cataloguers>
  <archive>
    <replacedElement replacedBy="g4" ...>
    ...
  </archive>
</catalog>

PML: outline (the vocabulary)

<vocabulary>
  <vocabularyClass name="languages">
    <term canonical="English">
      <version languageRef="Romanian">engleză</version>
      <version languageRef="English">English</version>
    </term>
    <term canonical="Romanian">
      <version languageRef="Romanian">română</version>
      <version languageRef="English">Romanian</version>
    </term>
    ......
  </vocabularyClass>
  .....
</vocabulary>
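A minimal Python sketch of how such a vocabulary might be consulted to display a term in the user's language. The elided parts of the outline are left out so that the fragment is well-formed; the function name `term_label` and the fallback to the canonical form are assumptions made for illustration, not part of PML.

```python
import xml.etree.ElementTree as ET

# Well-formed fragment modelled on the outline; the elided parts are omitted.
VOCABULARY = """
<vocabulary>
  <vocabularyClass name="languages">
    <term canonical="English">
      <version languageRef="Romanian">engleză</version>
      <version languageRef="English">English</version>
    </term>
    <term canonical="Romanian">
      <version languageRef="Romanian">română</version>
      <version languageRef="English">Romanian</version>
    </term>
  </vocabularyClass>
</vocabulary>
"""


def term_label(root, class_name, canonical, language):
    """Return the label of a term in the requested language.

    Falls back to the canonical form when no version exists for that
    language (this fallback is an assumption, not specified in the slides).
    """
    for vocab_class in root.iter("vocabularyClass"):
        if vocab_class.get("name") != class_name:
            continue
        for term in vocab_class.iter("term"):
            if term.get("canonical") != canonical:
                continue
            for version in term.iter("version"):
                if version.get("languageRef") == language:
                    return version.text
            return canonical
    raise KeyError((class_name, canonical))


root = ET.fromstring(VOCABULARY)
print(term_label(root, "languages", "English", "Romanian"))
print(term_label(root, "languages", "Romanian", "French"))
```

The second call has no French version to return, so it yields the canonical form; a real catalogue might prefer a different fallback policy.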

PML: a sample

<catalog>
  <records>
    <book cataloguerId="dm" guid="b001" timestamp="2006-03-11">
      <titlePage>
        <responsibility>Gellu Naum</responsibility><br/>
        <title>Zenobia</title><br/>
        <publisher>Humanitas</publisher><br/>
        <publishingPlace>Bucureşti</publishingPlace>
      </titlePage>
      <publication>
        <place><statement>Bucureşti</statement></place>
      </publication>
      <ISBN-10><number>973-50-0324-4</number></ISBN-10>
      <language><languageRef>Romanian</languageRef></language>
      <responsibility main="true" doubtful="false">
        <targetId>p1</targetId><typeRef>author</typeRef>
      </responsibility>
    </book>

PML: a sample (cont.)

    <person guid="p1" timestamp="2006" cataloguerId="dm">
      <appelation>
        <signature>
          <name typeRef="real name" guid="gn">
            <segment guid="g" classRef="first name">Gellu</segment>
            <segment guid="n" classRef="last name">Naum</segment>
          </name>
          <version languageRef="English">
            <qualifier>
              <segment>Romanian poet and playwright</segment>
            </qualifier>
          </version>
          <dates guid="t" type="life">
            <segment guid="y">1915</segment>-2001
          </dates>
        </signature>
        <indexEntry>
          <alphaKey1 ref="n"/><alphaKey2 ref="g"/><dateKey2 ref="y"/>
        </indexEntry>
      </appelation>
    </person>
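The index entry in the sample points at segments by guid rather than repeating their values. The Python sketch below shows one way such references could be resolved into actual keys. It assumes that each `ref` resolves to the element with the same `guid` inside the same record; the function name `resolve_index_entry` is hypothetical, and the English qualifier is left out for brevity.

```python
import xml.etree.ElementTree as ET

# The person record from the sample, without the English qualifier.
PERSON = """
<person guid="p1" timestamp="2006" cataloguerId="dm">
  <appelation>
    <signature>
      <name typeRef="real name" guid="gn">
        <segment guid="g" classRef="first name">Gellu</segment>
        <segment guid="n" classRef="last name">Naum</segment>
      </name>
      <dates guid="t" type="life">
        <segment guid="y">1915</segment>-2001
      </dates>
    </signature>
    <indexEntry>
      <alphaKey1 ref="n"/><alphaKey2 ref="g"/><dateKey2 ref="y"/>
    </indexEntry>
  </appelation>
</person>
"""


def resolve_index_entry(record):
    """Map each key of the record's indexEntry to the text it refers to."""
    # Collect every element of the record that carries a guid.
    by_guid = {el.get("guid"): el for el in record.iter() if el.get("guid")}
    keys = {}
    for key in record.find(".//indexEntry"):
        target = by_guid[key.get("ref")]
        keys[key.tag] = (target.text or "").strip()
    return keys


keys = resolve_index_entry(ET.fromstring(PERSON))
print(keys)
```

Here the last name becomes the first alphabetic key, the first name the second, and the birth year the date key, without any value being stored twice.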

PML: updates

<note guid="111">
  <version languageRef="French">
    <update cataloguerId="dm" timestamp="2006-04-21">
      <deleted>ancien</deleted>
      <inserted>nouveau</inserted>
    </update>
    en francais
  </version>
  <version languageRef="English">in English</version>
</note>
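The update markup is what makes FR2 (traceability) possible: both the old and the new wording stay in the record. A minimal Python sketch of how a version could be rendered either in its current state or as it stood before a given date follows. The reading of `<deleted>`/`<inserted>` used here, the function name `render`, the `as_of` parameter and the use of full ISO dates in `timestamp` are all assumptions; a space has been added before "en francais" so that the rendered text reads naturally.

```python
import xml.etree.ElementTree as ET

NOTE = """
<note guid="111">
<version languageRef="French"><update cataloguerId="dm" timestamp="2006-04-21"><deleted>ancien</deleted><inserted>nouveau</inserted></update> en francais</version>
<version languageRef="English">in English</version>
</note>
"""


def render(version, as_of=None):
    """Render the text of a version element.

    With as_of=None every update is applied. Otherwise only updates whose
    timestamp is not later than as_of are applied; later ones are reversed
    by showing the deleted text instead. ISO dates compare correctly as
    plain strings.
    """
    parts = [version.text or ""]
    for child in version:
        if child.tag == "update":
            applied = as_of is None or child.get("timestamp") <= as_of
            wanted = "inserted" if applied else "deleted"
            parts.append(child.findtext(wanted, default=""))
        parts.append(child.tail or "")
    return "".join(parts).strip()


note = ET.fromstring(NOTE)
french = note.find("version[@languageRef='French']")
print(render(french))
print(render(french, as_of="2006-04-20"))
```

The first call gives the note after the update; the second gives it as it read the day before the update was made.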

Abstractions

Items

The index: problems (1)

The index: problems (2)

The index: problems (3)

The index: keys

1. alphaKey1
2. alphaKey1Type
3. numKey1
4. dateKey1
5. dateKey1Precision
6. alphaKey2
7. numKey2
8. dateKey2
9. dateKey2Precision

The index: key types/ranks

Terms               1xx
Temporal entities   2xx
Places              3xx
Corporate bodies    4xx
Persons             5xx
Titles              6xx
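One possible reading of these ranks, sketched in Python: the hundreds digit of alphaKey1Type names the family of the heading, and the code breaks ties between headings that share the same alphaKey1. Both points are assumptions, as are the concrete codes 101, 301 and 501 and the function name `key_type_family`; the slides give only the ranges.

```python
# Families by hundreds digit, as listed on the slide.
KEY_TYPE_FAMILIES = {
    1: "Terms",
    2: "Temporal entities",
    3: "Places",
    4: "Corporate bodies",
    5: "Persons",
    6: "Titles",
}


def key_type_family(code):
    """Return the family of a three-digit key type code, e.g. 512 -> 'Persons'."""
    if not 100 <= code <= 699:
        raise ValueError(f"key type code out of range: {code}")
    return KEY_TYPE_FAMILIES[code // 100]


# Three headings with the same alphabetic key but different (hypothetical) types.
headings = [("Paris", 501), ("Paris", 301), ("Paris", 101)]

# Sort by the alphabetic key first, then by type code.
ordered = sorted(headings, key=lambda h: (h[0], h[1]))
for alpha_key, code in ordered:
    print(alpha_key, code, key_type_family(code))
```

Under this reading, a term "Paris" would file before the place, and the place before a person of that name.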

The index: date precision

Precision   Meaning   Code
ante        <         -3
non post    <=        -2
circa       ~         -1
exact       =          0
non ante    >=         1
post        >          2
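The numeric codes make the precision values sortable. A minimal Python sketch, assuming the code is used as a secondary sort key after the date itself (the slides do not state how the two are combined); the names `DatePrecision` and `date_sort_key` are hypothetical.

```python
from enum import IntEnum


class DatePrecision(IntEnum):
    """Precision codes as listed on the slide."""
    ANTE = -3      # before the date (<)
    NON_POST = -2  # not after the date (<=)
    CIRCA = -1     # around the date (~)
    EXACT = 0      # exactly the date (=)
    NON_ANTE = 1   # not before the date (>=)
    POST = 2       # after the date (>)


def date_sort_key(year, precision):
    """Sort by year first, then by precision code (an assumed convention)."""
    return (year, int(precision))


entries = [
    (1500, DatePrecision.POST),
    (1500, DatePrecision.EXACT),
    (1499, DatePrecision.CIRCA),
    (1500, DatePrecision.ANTE),
]

ordered = sorted(entries, key=lambda e: date_sort_key(*e))
for year, precision in ordered:
    print(year, precision.name)
```

With this convention "ante 1500" files before "1500", which in turn files before "post 1500", so chronological browsing stays intuitive even for uncertain dates.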

PML-based database: a suggestion

Tables:

– resources:
  • guid,
  • XML document;

– relations:
  • guid,
  • sourceId,
  • targetId,
  • XML document;

– index:
  • keys.
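The three tables could be sketched as follows, using Python's built-in `sqlite3` module. Only the table contents come from the suggestion above; the column names and types, the name `index_keys` (chosen because INDEX is an SQL keyword), the link from an index row to a resource, and the sample rows (including the relation guid "r1" and the type code 500) are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE resources (
    guid TEXT PRIMARY KEY,
    xml  TEXT NOT NULL              -- the PML document of the resource
);
CREATE TABLE relations (
    guid      TEXT PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES resources(guid),
    target_id TEXT NOT NULL REFERENCES resources(guid),
    xml       TEXT NOT NULL         -- the PML document of the relation
);
CREATE TABLE index_keys (
    resource_guid       TEXT NOT NULL REFERENCES resources(guid),
    alpha_key1          TEXT,
    alpha_key1_type     INTEGER,
    num_key1            INTEGER,
    date_key1           INTEGER,
    date_key1_precision INTEGER,
    alpha_key2          TEXT,
    num_key2            INTEGER,
    date_key2           INTEGER,
    date_key2_precision INTEGER
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Sample rows based on the book and person from the PML sample.
conn.execute("INSERT INTO resources VALUES (?, ?)", ("b001", "<book guid='b001'/>"))
conn.execute("INSERT INTO resources VALUES (?, ?)", ("p1", "<person guid='p1'/>"))
conn.execute(
    "INSERT INTO relations VALUES (?, ?, ?, ?)",
    ("r1", "b001", "p1", "<responsibility><typeRef>author</typeRef></responsibility>"),
)
conn.execute(
    "INSERT INTO index_keys (resource_guid, alpha_key1, alpha_key1_type,"
    " alpha_key2, date_key2, date_key2_precision) VALUES (?, ?, ?, ?, ?, ?)",
    ("p1", "Naum", 500, "Gellu", 1915, 0),
)

# Find the resources related to the heading "Naum" through the index.
rows = conn.execute(
    """
    SELECT r.guid
    FROM index_keys k
    JOIN relations rel ON rel.target_id = k.resource_guid
    JOIN resources r ON r.guid = rel.source_id
    WHERE k.alpha_key1 = ?
    """,
    ("Naum",),
).fetchall()
print(rows)
```

Keeping the full XML in the resources and relations tables while exposing only the keys as columns leaves the relational layer small; everything else stays in PML.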

Doubts and open problems

• how to handle multiple views (interfaces)?

• how to handle an "original", i.e. an object which is work, expression, manifestation and item (e.g. Mona Lisa)?

• how to handle a concept which is also a UDC class (e.g. 'hysteria')?

• is it a sound approach?