© Keith G Jeffery & Anne Asserson
CERIF Course: Data Model 1 20021024

CERIF COURSE
Session 3: Data Model 1

Keith G Jeffery, Director, IT, CLRC, k.g.jeffery@rl.ac.uk
Anne Asserson, University of Bergen [email protected]



Structure of Session

• Full, exchange and metadata models
• Full model – overview (nutshell)
• The concept of binary relations, linking relations and recursion
• The concept of character / language variants
• The concept of enumerated lists – dictionaries, thesauri, ontologies


Full, exchange and metadata models

The Metadata Model is a subset of the Exchange Model, which is in turn a subset of the Full Model.

The Full Model is the intersection of existing CRISs, excluding uncommon variants.


CERIF2000 Data model

– Extended relational model
– Linking relations with attributes (roles and time stamp)
– 3 base entities: Person, Organisation, Project
– 12 secondary base entities (linked to base entities)
– 36 lookup tables (to ensure data quality)
– 39 link tables (flexibility)
– All text fields have multiple language fields
– Maximum representativity with minimum complexity


CERIF2000 in a Nutshell


[Diagram: the entities PROJECT, ORGUNIT and PERSON shown with Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Classification and Prize/Award]


Three Primary Entities

[Diagram: PROJECT, ORGUNIT, PERSON]

Concepts:
(1) entities that reflect the main ‘views of entry’ into CRISs
(2) entities with no direct functional dependency on each other
(3) entities that can refer to themselves (recursion)
(4) entities linked in pairs by ‘linking relations’
(5) ‘linking relations’ represent temporally-bound roles
(6) ‘linking relations’ have the primary key of each entity, role, date/time start, date/time end and any other constraints


Linking Relations

As an example: PERSON-ORGUNIT

Concepts:
(1) There may be many instances of the relationship for each instance of PERSON and ORGUNIT, because of role and temporal bounding
(2) Role: the purpose of the relationship, e.g. employee | head | …
(3) Temporal: <Start Date/Time> and <End Date/Time> define the duration of the relationship

Analogous for PROJECT_ORGUNIT and PERSON_PROJECT
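A minimal sketch of a PERSON-ORGUNIT linking relation, assuming a simple in-memory representation. The class name `PersonOrgUnit`, the identifiers `p1`, `ou1`, `ou2`, the dates and the convention that a missing end date means "still ongoing" are all hypothetical and not taken from the CERIF specification. It shows how role and temporal bounding allow several link instances for the same pair.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonOrgUnit:
    """One instance of the linking relation between a person and an orgunit."""
    person_id: str
    orgunit_id: str
    role: str
    start: date
    end: Optional[date] = None  # assumption of this sketch: None means ongoing

    def active_on(self, day: date) -> bool:
        # The link holds from start to end, both inclusive.
        return self.start <= day and (self.end is None or day <= self.end)


# Hypothetical data: the same person/orgunit pair appears twice, in two roles.
links = [
    PersonOrgUnit("p1", "ou1", "employee", date(1995, 1, 1), date(1999, 12, 31)),
    PersonOrgUnit("p1", "ou1", "head", date(2000, 1, 1)),
    PersonOrgUnit("p1", "ou2", "employee", date(1998, 6, 1), date(1998, 12, 31)),
]


def roles_on(person_id: str, orgunit_id: str, day: date) -> list:
    """Roles that the person holds in the orgunit on the given day."""
    return sorted(
        link.role
        for link in links
        if link.person_id == person_id
        and link.orgunit_id == orgunit_id
        and link.active_on(day)
    )
```

Calling `roles_on("p1", "ou1", date(1997, 5, 1))` and then the same call with a date in 2001 returns different roles for the same pair, which is the point of binding each link to a role and a time interval.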


Primary Base Entity: ORGUNIT

Concepts:
(1) ORGUNIT may have an organisationally subordinate relationship to another ORGUNIT, e.g. a Group within a Department
(2) ORGUNIT may have a symbiotic relationship to another ORGUNIT, e.g. two Groups that have a cooperation agreement
(3) ORGUNIT may have a financial relationship to another ORGUNIT, e.g. customer - contractor

[Diagram label: Orgunit-Orgunit]


Primary Base Entity: PROJECT

Concepts:
(1) PROJECT may have an organisationally subordinate relationship to another PROJECT, e.g. a sub-Project
(2) PROJECT may have a symbiotic relationship to another PROJECT, e.g. two Projects that cooperate by agreement
(3) PROJECT may have a temporal relationship to another PROJECT, e.g. one project follows on from another


Primary Base Entity: PERSON

Concepts:
(1) PERSON may have a socially subordinate relationship to another PERSON, e.g. a child of a parent
(2) PERSON may have a symbiotic relationship to another PERSON, e.g. two researchers that cooperate by agreement
(3) PERSON may have a temporal relationship to PERSON, e.g. a lecturer (dates) becomes a reader (dates)


Secondary Base Entities: FUNDING PROGRAMME

Concepts:
(1) Funding Programme is related to (a) ORGUNIT and / or (b) PROJECT
(2) A Person is only funded via (a) ORGUNIT and / or (b) PROJECT
(3) Any other entities are only funded via (a) ORGUNIT and / or (b) PROJECT


Secondary Base Entities: example: CONTACT

Concepts:
(1) all contacts in one place: no replication, no update problems
(2) >1 contact, dependent on role, e.g. private address | work address
(3) the PROJECT contact is usually the project leader: a PERSON
(4) the ORGUNIT contact is usually the head: a PERSON
(5) but may have a generic address, e.g. project URI | Orgunit email ([email protected])

Analogous for Publication, Product, Patent, Event, Prize/Award…


Secondary Base Entities: example: RESULT_PUBLICATION

Concepts:
(1) temporally-bound role linking relations
(2) >1 linking relation: Result_Publication and other entities
(3) PERSON role may be author, co-author, editor, reviewer…
(4) ORGUNIT role may be publisher, IPR or copyright owner…
(5) PROJECT role may be the source of the idea


Secondary Base Entities: example: RESULT_PUBLICATION

Can express (where DT = date/time):
Person A (DT1 - DT2) (is author of) Publication X
Orgunit O (DT1 - DT2) (is owner of IPR in) Publication X
Person A (DT1 - DT2) (is employee of) Orgunit O
Person A (DT1 - DT2) (is project leader of) Project P
Person A (DT1 - DT2) (is member of) Orgunit M
Person A (DT1 - DT2) (is member of) Orgunit N
Orgunit M (DT1 - DT2) (is part of) Orgunit O
Orgunit N (DT1 - DT2) (is part of) Orgunit O


PERSON Links

[Diagram: PERSON links. Entities shown: PROJECT, ORGUNIT, PERSON, Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Prize/Award]


PROJECT Links

[Diagram: PROJECT links. Entities shown: PROJECT, ORGUNIT, PERSON, Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Prize/Award]


ORGUNIT Links

[Diagram: ORGUNIT links. Entities shown: PROJECT, ORGUNIT, PERSON, Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Prize/Award]


Classification

[Diagram: Classification shown with the entities PROJECT, ORGUNIT, PERSON, Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Prize/Award]


The Whole Thing

[Diagram: the complete model. Entities shown: PROJECT, ORGUNIT, PERSON, Skills, CV, GeneralFacility, ParticularEquipment, Contact, ResultsPublication, ResultsPatent, ResultsProduct, Service, FundingProgramme, Event, Classification, Prize/Award]


End of CERIF2000 in a Nutshell


Binary Relations: The Problem

• Wish to link flexibly
  – An instance in an entity to a related instance in another entity (relationship)
  – An instance in an entity to another instance in the same entity (recursion)
• Examples
  – Person <-> Project, e.g. x is leader of z
  – Person <-> Person, e.g. x is boss of y


Binary Relations: Relationship

Usual Relation

[Diagram: the PROJECT and PERSON relations with attributes Project and Person, joined by key labels PK, PK and FK]


Binary Relations: Relationship

Problem
• Supports only 1 (Project) to n (Persons)
• i.e. the persons on any 1 project, with all their attributes (dependencies)
• In many cases need to indicate that
  – The same person works on several projects
  – In different roles (e.g. leader, programmer)
  – At different (or the same) time periods
• i.e. 1 (Person) to n (Projects)


Binary Relations: Relationship

Binary Relation

[Diagram: PROJECT and PERSON joined n : m by a binary relation with attributes Project and Person]


Binary Relations: Relationship

Binary Relation

[Diagram: PROJECT and PERSON joined by a binary relation with attributes Project, Person, Role, StartDate, EndDate]

In practice the binary relation usually has more attributes than Project / Person.
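As a rough illustration of the binary relation above, the following sketch builds it as a link table in an in-memory SQLite database. The table and column names (`project_person`, `role`, `start_date`, `end_date`), the choice of composite key and the sample rows are assumptions made for this example, not the CERIF2000 schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE person  (person_id  TEXT PRIMARY KEY, name  TEXT);

-- The binary (linking) relation: keys of both entities plus role and dates.
-- SQLite only enforces the REFERENCES clauses if PRAGMA foreign_keys is on.
CREATE TABLE project_person (
    project_id TEXT NOT NULL REFERENCES project(project_id),
    person_id  TEXT NOT NULL REFERENCES person(person_id),
    role       TEXT NOT NULL,
    start_date TEXT NOT NULL,
    end_date   TEXT,
    PRIMARY KEY (project_id, person_id, role, start_date)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [("z", "Project Z"), ("w", "Project W")])
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("x", "Person X"), ("y", "Person Y")])
conn.executemany("INSERT INTO project_person VALUES (?, ?, ?, ?, ?)", [
    ("z", "x", "leader", "2001-01-01", "2002-12-31"),
    ("w", "x", "programmer", "2001-06-01", None),
    ("z", "y", "programmer", "2001-01-01", "2001-12-31"),
])

# One person on several projects (1 Person to n Projects) ...
projects_of_x = conn.execute(
    "SELECT project_id, role FROM project_person "
    "WHERE person_id = ? ORDER BY project_id", ("x",)).fetchall()

# ... and several persons on one project (1 Project to n Persons).
people_on_z = conn.execute(
    "SELECT person_id, role FROM project_person "
    "WHERE project_id = ? ORDER BY person_id", ("z",)).fetchall()
```

Both directions of the n : m relationship are answered from the same link table, and the role and date columns travel with each link rather than with either entity.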


Binary Relations: Recursion

Usual Relation

[Diagram: the PERSON relation with attribute Person as both PK and FK, followed by a second view of PERSON captioned "Actually works like this"]


Binary Relations: Recursion

Binary Relation

[Diagram: PERSON linked to itself by a binary relation with attributes Person and Person]


Binary Relations: Recursion

Binary Relation

[Diagram: PERSON linked to itself by a binary relation with attributes Person and Person, showing how the tuples from Person are represented in the binary relation]


Binary Relations: Recursion

Binary Relation

[Diagram: PERSON linked to itself by a binary relation with attributes Person, Person, Role, StartDate, EndDate]

In practice the binary relation usually has more attributes than Person / Person.
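The recursive case can be sketched in the same way: a Person <-> Person binary relation held as plain tuples, with a small traversal that follows "boss of" links valid on a given day. The tuple layout, the role strings, the helper name `reports_under` and the sample data are hypothetical choices for this illustration.

```python
from datetime import date

# Each tuple reads "person_a <role> person_b" and is valid from start to end.
person_person = [
    ("x", "boss of", "y", date(2000, 1, 1), date(2002, 12, 31)),
    ("y", "boss of", "z", date(2001, 1, 1), date(2002, 12, 31)),
    ("x", "co-author with", "z", date(2001, 3, 1), date(2001, 9, 30)),
]


def reports_under(boss: str, day: date) -> list:
    """Everyone below `boss` in the 'boss of' chain on the given day."""
    found = []
    frontier = [boss]
    while frontier:
        current = frontier.pop()
        for a, role, b, start, end in person_person:
            is_valid = role == "boss of" and start <= day <= end
            if a == current and is_valid and b not in found:
                found.append(b)
                frontier.append(b)  # follow the link one level further down
    return sorted(found)
```

Because the relation points from PERSON back to PERSON, the same lookup can be repeated on its own results, which is what the loop does; the date test keeps only links that hold on the day asked about.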


Binary Relations: Binary Relation

• Flexible
• Allows n : m
• With added attributes, e.g. role, date/time
• Thus permitting
  – Conditional relationships
  – Temporal relationships
  – i.e. rich semantics


Character / Language Variants: Character Sets

• Character sets
  – Not only ‘Latin-1’ (need also to handle Greek, Arabic, Chinese…)
  – Can use the escape codes technique, but it only works in linear data streams
  – Better to use a rich code that can handle any character from any language (including mathematics, financial currencies) as an atomic item: Unicode
  – But it requires more storage
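The storage point can be checked with a short sketch that encodes sample strings in Latin-1 and in two Unicode encodings. The sample words and the helper name `encoded_sizes` are chosen for this illustration only.

```python
# Sample strings written with escapes so the code points are unambiguous.
samples = {
    "Latin": "Bergen",
    "Greek": "\u0391\u03b8\u03ae\u03bd\u03b1",  # 5 Greek letters
    "Chinese": "\u5317\u4eac",                  # 2 Chinese characters
}


def encoded_sizes(text: str) -> dict:
    """Bytes needed per encoding, or None if the encoding cannot represent the text."""
    sizes = {}
    for codec in ("latin-1", "utf-8", "utf-16-le"):
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            sizes[codec] = None
    return sizes


for label, text in samples.items():
    print(label, encoded_sizes(text))
```

Latin-1 cannot represent the Greek or Chinese samples at all, while the Unicode encodings represent all three at the cost of more bytes per character for some scripts.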


Character / Language Variants: Language

• CERIF has many text fields
• Each field may exist in multiple languages
• For retrieval or update need to know the language (for text-matching)
• So have within the logical record multiple sub-records differentiated by language for each text field
• Example: Project.Abstract will usually exist in (US) English and original language and maybe language of country/region where stored
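One way to picture the language sub-records is a text field stored as a mapping from language code to text. This is only a sketch: the record layout, the language codes used as keys, the sample abstracts and the helper `text_in` are assumptions, not the CERIF representation.

```python
# Hypothetical project record: the abstract exists once per language.
project = {
    "project_id": "p1",
    "abstract": {
        "en": "A study of research information systems.",
        "no": "En studie av forskningsinformasjonssystemer.",
    },
}


def text_in(record: dict, field: str, preferred: list, fallback: str = "en"):
    """Return (language, text) for the first preferred language that is stored.

    Falls back to `fallback` if none of the preferred languages exists, and
    returns None if that is missing as well.
    """
    variants = record[field]
    for lang in preferred:
        if lang in variants:
            return lang, variants[lang]
    if fallback in variants:
        return fallback, variants[fallback]
    return None
```

Returning the language together with the text matters for retrieval and update, since text matching needs to know which language variant it is working on.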


Enumerated Lists, Dictionaries, Thesauri, Ontologies

• Purpose
  – Higher quality data: data validation
  – More accurate retrieval: query keywords limited and stored words (for any attribute) limited
  – Classification: allowing grouping and ranking by value of attribute


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Enumerated List

• Example: Country Code
• There is an ISO standard list of valid 2-character and 3-character country codes
• On input, can validate that the country code is from this list (commonly with a pull-down)
• If countries change, update the list in one place and the whole system is reconfigured
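A minimal sketch of validation against an enumerated list. The set below holds only a handful of 2-character codes as a stand-in for the full ISO list, and the names `VALID_COUNTRY_CODES` and `validate_country_code` are hypothetical.

```python
# Stand-in for the full ISO list: a small subset of 2-character country codes.
VALID_COUNTRY_CODES = {"DE", "FR", "GB", "GR", "NO"}


def validate_country_code(code: str) -> str:
    """Return the normalised code, or raise ValueError if it is not in the list."""
    normalised = code.strip().upper()
    if normalised not in VALID_COUNTRY_CODES:
        raise ValueError(f"unknown country code: {code!r}")
    return normalised
```

Every input path calls the same function, so adding or retiring a code means changing `VALID_COUNTRY_CODES` in one place only.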


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Dictionaries

• Example: meaning of a word (term)
  – Used in ensuring correct use of a value in an attribute
  – For explanation of result output
• Example: multilingual
  – Used in multilingual query (query in language 1 and retrieve from records stored in languages 2…n)
  – Used in result output: translate (crudely) to single language as required
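The multilingual query case might be sketched as follows, assuming a tiny hand-made dictionary. The table `TERM_TRANSLATIONS`, its entries and the function `expand_query` are illustrative only and far cruder than a real multilingual dictionary.

```python
# Hypothetical multilingual dictionary: concept -> term per language.
TERM_TRANSLATIONS = {
    "project": {"en": "project", "de": "Projekt", "fr": "projet"},
    "publication": {"en": "publication", "de": "Publikation", "fr": "publication"},
}


def expand_query(term: str, query_lang: str) -> set:
    """Return the term in every stored language, given the term in one language.

    A term that is not in the dictionary is returned unchanged.
    """
    for variants in TERM_TRANSLATIONS.values():
        if variants.get(query_lang, "").lower() == term.lower():
            return set(variants.values())
    return {term}
```

A query entered in language 1 is expanded to the equivalent terms in the other stored languages, so records held in languages 2…n can be matched by the same search.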


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Thesauri

• Provide the structural relationships of words (terms)
  – Synonym (different word, same meaning)
  – Homonym (same word, different meaning)
  – Antonym (word with opposite meaning)
  – Super-term (a word whose meaning includes the word being used, e.g. person includes {student | worker | …})
  – Sub-term (a word whose meaning is included in a Super-term)
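A small sketch of how super-term, sub-term and synonym relationships could be used to widen a search term. Only "person includes student | worker" comes from the slide; the other terms, the two tables and the function name `expand` are made up for the example.

```python
# Super-term -> sub-terms. Only the "person" entry is taken from the slide.
SUB_TERMS = {
    "person": ["student", "worker"],
    "worker": ["researcher", "technician"],  # hypothetical
}

# Term -> synonyms (hypothetical).
SYNONYMS = {"worker": ["employee"]}


def expand(term: str) -> set:
    """Return the term plus all of its sub-terms and synonyms, transitively."""
    result = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in result:
            continue  # already visited, also guards against cycles
        result.add(current)
        stack.extend(SUB_TERMS.get(current, []))
        stack.extend(SYNONYMS.get(current, []))
    return result
```

Searching for a super-term such as "person" then also retrieves records indexed under any of its sub-terms.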


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Ontologies

• Ontology: philosophical study of existence and nature of reality
• In practice a resource of terms, their definitions and their logical inter-relationships
• E.g. for a publication to exist it is necessary to have a title and at least 1 author
• Publication [title AND >=1 author]


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Ontologies

• Domain Ontology: ontology covering a domain (subject area of interest)
• Example: Publication
• Publication [title AND author]
• Collection [title + >1 author + editor]
• If a Publication has a title, >1 author and an editor, it is a Collection
• Publication is_part_of Collection
• Collection is_a_kind_of Publication
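The two rules can be written down as a small classification function. This is a sketch of the slide's example rules only; the record layout (a dict with `title`, `authors` and `editor` keys) and the function name `classify` are assumptions.

```python
def classify(record: dict):
    """Apply the example rules to a record.

    Publication: a title and at least 1 author.
    Collection:  a title, more than 1 author and an editor.
    Returns "Collection", "Publication", or None if neither rule holds.
    """
    has_title = bool(record.get("title"))
    authors = record.get("authors", [])
    if not (has_title and len(authors) >= 1):
        return None  # does not even qualify as a Publication
    if len(authors) > 1 and record.get("editor"):
        return "Collection"  # a Collection is a kind of Publication
    return "Publication"
```

The same rules serve both for validation on input (a record returning None is rejected) and for deducing a more specific kind from stored facts.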


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Ontologies

• Domain Ontologies in IT
• A representation in first order logic allowing
  – Facts to be expressed
  – Relationships to be expressed
  – Constraints to be expressed
  – New facts and relationships to be deduced or induced


Enumerated Lists, Dictionaries, Thesauri, Ontologies: Ontologies

• Used for
  – Data validation on input
  – Clarification and improvement of a query
  – Resolving heterogeneity of terms to homogeneity
  – Expanding super-terms to sub-terms and vice versa, conditionally
  – Deducing or inducing new facts and relationships from stored facts and relationships


Conclusion

• CERIF is a data model with ‘levels’
  – Primary base entities
    • e.g. Person
  – Secondary base entities
    • e.g. Result_Publication
  – Language-base entities
    • e.g. Abstract
  – Lookup Tables
    • e.g. Role of Person
  – Linking Relations
    • e.g. Project <-> Person