introduction to the semantic webelite.polito.it/files/courses/01lhv/2009/1-knowledge.pdf ·...

85
Introduction to the Semantic Web

Upload: others

Post on 16-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

Introduction tothe Semantic Web

Page 2: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 2

Semantic Web

Web second generationWeb 3.0

“Conceptual structuring of the Web in an explicit machine-readable way” (Tim Berners-Lee)In other words…

…let the machine do most of the work!!!

http://www.w3.org/2001/sw/

Page 3: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 3

“Official” introductionThe Semantic Web is a web of data. There is lots of data we all use every day, and its not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar?Why not? Because we don’t have a web of data. Because data is controlled by applications, and each application keeps it to itself

Page 4: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 4

Example

Page 5: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 5

“Official” introductionThe Semantic Web is about two thingsIt is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of documents.It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing

Page 6: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 6

An example …

How can a machine distinguish the meanings … ?

“I am a professor of computer science.”

“I am a professor of computer science, you may think. Well,…”

Page 7: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 7

Key principlesThe Semantic Web is the Web

Same base technologies, evolutionaryDecentralized (incomplete, inconsistent)

Provide explicit statements regarding web resources

Authors, original information providersIntermediaries (humans and/or machines)

Information consumers determine consequences of the statements

Distributed ‘reasoning’

Page 8: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 8

1989: WWW original proposal

Page 9: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 9

Technology stack (old: pre-2008)

Page 10: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 10

Technology stack (current: 2008)

Page 11: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 11

Goal of the semantic Web

The Semantic Web will enable machines to COMPREHEND semantic documents and data, NOT human speech and writing

Then, how???Semantic Web foundation: metadata

Page 12: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

Metadata andMetadata Standards

Page 13: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 13

Resource and description

ResourceContent, format, …Access method dependent on format (I can read it if I “know” its language)

Resource descriptionIndependent of the format (I can read “people’s comments” about the resource… provided that I know the language in which the comment is written)

Page 14: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 14

Resource and descriptiondescription

resource

this resource was created on April 14th, 2009

the title of this resource is

“Introduction to the Semantic

Web”

the author of this resourceis L. Farinetti

this resource is related to computer science,

knowledge representation and metadata

the quality of this resource

is high, according to

F. Corno

this resource is suitable for PhD students

Page 15: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 15

Resource and descriptionResource

Content, format, …Access method dependent on format (I can read it if I “know” its language)

Standardization (i.e. common language for applications) ???

Practically impossible …Huge amount of existing informationHundreds of human languagesHundreds of computer languages (other word for formats)

Page 16: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 16

Resource and descriptionResource description

Independent of the format (I can read “people’s comments” about the resource… provided that I know the language in which the comment is written)

Standardization (i.e. common language for applications) ???

FeasibleSmaller amount of information, possibly newSolution: define a standard language for writing comments (“metadata” in semantic web terminology)

Page 17: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 17

Resource and description

this resource was created on April 14th, 2009

the title of this resource is

“Introduction to the Semantic

Web”

the author of this resourceis L. Farinetti

this resource is related to computer science,

knowledge representation and metadata

the quality of this resource

is high, according to

F. Corno

this resource is suitable for PhD students

Metadata

Field name = field value

Page 18: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 18

Resource and descriptiondescription

resource

Date =2009-04-14

Title =“Introduction to

the Semantic Web”

Author =L. Farinetti

Topic ={computer science,

knowledge representation,

metadata}

Quality = high

Level = PhD students

Rated by F. CornoRated by F. Corno

Page 19: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 19

Semantic Web main tasksMetadata annotation

Description of resources using standard languages

SearchRetrieve relevant information according to user’s query / interest / intentionUse metadata (and possibly content) in a “smart” way (i.e. “reasoning” about the meaning of annotations)

Page 20: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 20

Meaningful metadata annotations

Common language for describing resourcesResource description standards

Common language for description field namesMetadata standards

Common language for description field valuesMetadata standards + controlled vocabularies

Semantically rich descriptions to support searchKnowledge representation techniques, ontologies

Page 21: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 21

Common language for describing resources

Page 22: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 22

Common language for describing resources

Resource Description Framework (RDF)Resource = URI (retrievable, or not)RDF is structured in statements

A statement is a tripleSubject – predicate – objectSubject: a resourcePredicate: a verb / property / relationshipObject: a resource, or a literal string

Page 23: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 23

Common language for describing resources

Diagram:

Simple RDF assertion (triple):

triple (hasAuthor, URI, L.Farinetti)triple (hasAuthor, URI, L.Farinetti)

URIURI L.FarinettiL.FarinettihasAuthor

Author =L. Farinetti

Page 24: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 24

Common language for describing resources

RDF in XML syntax:

Author =L. Farinetti

<RDF xmlns=“http://www.w3.org/TR/ … ” ><Description about=“http://www.polito.it/semweb/intro”>

<Author>L.Farinetti</Author></Description>

</RDF>

<RDF xmlns=“http://www.w3.org/TR/ … ” ><Description about=“http://www.polito.it/semweb/intro”>

<Author>L.Farinetti</Author></Description>

</RDF>

Page 25: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 25

Common language for field names

Title = ...

Problem

Author = …

Creator, Maker, Contributor …

Synonymy

Topic = …

Topics, Subject, Subjects, Argument, Arguments

Singular / plural

Level = …

Difficult to clearly define concept in a few words

Educational level, destination, suitability, …

Date = …

Date of creation, date of last modification, date of revision, …

Different concepts: need for more details

Page 26: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 26

Common language for field names

Solution: metadata standardsMany standardization bodies are involvedStandards may be general

e.g. Dublin Core (DC)or may depend on goal, context, domain, …

e. g. educational resources (IEEE LOM), multimedia resources (MPEG- 7), images (VRA), people (FOAF, IEEE PAPI), geospatial resources (GSDGM), bibliographical resources (MARC, OAI), cultural heritage resources (CIDOC CRM)

Page 27: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 27

Metadata standards examples

Page 28: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 28

Dublin Core

Dublin Core Metadata Element Set (DCMES)

Building blocks to define metadata for the Semantic Web15 elements, or categories, general enough to describe most of the published resourcesExtra elements and element refinements

Page 29: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 29

DC metadata element set

Page 30: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 30

Example of description using Dublin Core (in RDF)

A paper in the “Ariadne” journal

Page 31: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 31

Common language for field values

ProblemsValue type

Title =“Introduction to

the Semantic Web”

type = string

Date =2009-04-14

type = date

Author =L. Farinetti

type = string“standard” format? Laura Farinetti, FarinettiLaura, Farinetti L., …

Page 32: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 32

Common language for field values

ProblemsValue typeValue restrictions?

freedom vs shared understanding

Quality = high

High, medium, low?1 to 5?any value?

Level = PhD students

any value?list of possible values?

Topic ={computer science,

knowledge representation,

metadata}

any value?any number of values?

Page 33: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 33

Common language for field values

Solution: metadata standards + controlled vocabulariesMetadata standards

Only some, and partiallyControlled vocabularies

Explicit list of possible values

Page 34: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 34

Examples from IEEE LOM

1484.12.1 - 2002 Learning Object Metadata (LOM) Standard

Developed by the IEEE Learning Technology Standards Committee (LTSC)

Standard to describe the “Learning Objects” in order to guarantee their interoperability

Page 35: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 35

Examples from IEEE LOM

Page 36: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 36

Examples from IEEE LOM

Page 37: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 37

Examples from IEEE LOM

Page 38: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 38

… + controlled vocabularies

A closed list of named subjects, which can be used for classificationMetadata field values arerestricted to a list of terms(selected by experts)

Topic ={computer science,

informatics, knowledge

representation, metadata}

Page 39: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 39

Semantically rich descriptions to support search

Page 40: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 40

Semantically rich descriptions to support search

http://dictybase.org/db/html/help/GO.htmlhttp://dictybase.org/db/html/help/GO.html

Topic ={metabolism, …}

Page 41: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

KnowledgeRepresentation

Page 42: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 42

Need for knowledge representation

Semantically rich descriptions need “understanding” the meaning of a resource and the domain related to the resource

Disambiguation of termsShared agreement on meaningsDescription of the domain, with concepts and relations among concepts

Page 43: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 43

Example: Dublin Core metadataMetadata of a single paper

Page 44: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 44

ProblemsTitle usually offers good clues, but

it does not necessarily mention all names of all subjects the user is interested init may presuppose knowledge the user does not actually possess

Subject is meant to convey precisely what the document is about, but

much depends on how extensive the set of keywords is, whether all related subjects are mentioned, and whether too many subjects are listed

Metadata does not say much about “how related” a resource is to a given subject

Page 45: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 45

Search results for “topic maps”

Page 46: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 46

Problems

Authors were free to define their own subject keywords

Results are not “about” topic maps, but “related to” topic mapsIf an author forgets to list “topic maps”, his paper will never be found

Page 47: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 47

Subject-based classificationAny form of content classification that groups objects by their subjects

e.g the use of keywords to classify papers Metadata fields describe what the objects are about by listing discrete subjects inside a subject-based classificationImportant: difference between describing the objects being classified and describing the subjects used to classify them

Metadata describe objects Subject- based classification is the approach to describe subject

Page 48: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 48

Subject-based classification ...“On those remote pages it is written that animals are divided into:a. those that belong to the Emperor b. embalmed ones c. those that are trained d. suckling pigse. mermaids f. fabulous ones g. stray dogs h. those that are included in this classificationi. those that tremble as if they were mad j. innumerable ones k. those drawn with a very fine camel's hair brush l. others m. those that have just broken a flower vase n. those that resemble flies from a distance"

From The Celestial Emporium of Benevolent Knowledge, Borges

Page 49: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 49

Subject-based classification techniques

Controlled vocabulariesTaxonomiesThesauriFaceted classificationOntologiesFolksonomiesOthers

… Most come from library science

Page 50: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 50

Controlled vocabularyA closed list of named subjects, which can be used for classificationComposed of terms: particular name for a particular concept

similar to keywordsTerms are not concepts

A single term may be the name of one or more conceptsA single concept may have multiple namesAmbiguity avoided by forbidding duplicate terms

Page 51: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 51

Controlled vocabulary

GoalPrevent authors from defining terms that are meaningless, too broad or too narrowPrevent authors from misspelling Prevent different authors from choosing slightly different forms of the same term

The simplest form of controlled vocabulary is a list of terms (or “pick list”)

Topic ={computer science,

knowledge representation, mtadata, RDF,

topic navigation maps}topic maps

Page 52: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 52

Controlled vocabularyReduce ambiguity inherent in normal human languages Solve the problems of homographs, homonyms, synonyms and polysemes by ensuring

That each concept is described using only one authorized term That each authorized term in the controlled vocabulary describes only one concept

Page 53: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 53

Problems solved

Synonymdifferent words with identical or very similar meanings

Page 54: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 54

Problems solved

Synonymdifferent words with identical or very similar meanings

close“Will you please close that door!”

“The tiger was now so close that I could smell it...”

pupilstudent

opening in the iris of the eye

axes('æk.səz) plural of axe

('æk.siz) plural of axis

Page 55: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 55

Problems solved

Synonymdifferent words with identical or very similar meanings

student and pupil (noun) buy and purchase (verb) sick and ill (adjective)

to get

take (I'll get the drinks)

become (she got scared)

wood

understand (I get it)

a piece of a tree

a geographical area with many trees

Page 56: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 56

Controlled vocabulary examples

Practically no “real” examplesWith very little extra effort: taxonomies and thesauri!

Circuit theoryElectronic circuitsMicrowave technology Electron tubes Semiconductor materials and devicesDielectric materials and devicesMagnetic materials and devicesSuperconducting materials and devices…

Circuit theoryElectronic circuitsMicrowave technology Electron tubes Semiconductor materials and devicesDielectric materials and devicesMagnetic materials and devicesSuperconducting materials and devices…

BloodCord bloodErythrocyte Leukocyte BasophilEosynophilLymphoblastLymphocyte MonocyteNeutrophil…

BloodCord bloodErythrocyte Leukocyte BasophilEosynophilLymphoblastLymphocyte MonocyteNeutrophil…

Page 57: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 57

Taxonomy

Subject-based classification that arranges the terms in the controlled vocabulary into a hierarchy

Dates back to Carl Linnæus’s work on zoological and botanical classification (18th century)

Page 58: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 58

TaxonomyAllow related terms to be grouped together

It is clear that “topic maps” and “XTM” are relatedEasier to classify documentsEasier to choose search keywords

Page 59: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 59

Taxonomies and metadata

Metadata are stored as usual with the resourceThe “subject” will contain only controlled termsControlled terms belong to a hierarchy, shared by all papers

Page 60: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 60

Taxonomy example: INSPEC

http://www.theiet.org/publishing/inspec/index.cfmhttp://www.theiet.org/publishing/inspec/index.cfm

Page 61: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 61

Taxonomy example: INSPEC

Page 62: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 62

INSPEC journal article database

INSPEC journal article database

Page 63: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 63

Taxonomy example: anatomy terms

http://www.cbil.upenn.edu/anatomy.php3http://www.cbil.upenn.edu/anatomy.php3

Page 64: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 64

Taxonomy example

Page 65: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 65

Taxonomy example

http://www.acm.org/class/1998/ccs98.htmlhttp://www.acm.org/class/1998/ccs98.html

Page 66: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 66

Taxonomy limitsOnly two kinds of relationships between terms

Parent = broader termChild = narrower term

topic navigation mapssynonym

no more in use

difference?

synonymXML topic map

difference?

Page 67: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 67

Thesaurus

Extends taxonomiessubjects are arranged in a hierarchy

Other statements can be made about the subjectsTwo ISO standards

ISO2788 for monolingual thesauriISO5964 for multilingual thesauri

Page 68: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 68

Thesaurus relationshipsBT – broader term

Refers to a term with wider or less specific meaningSome systems allow multiple BTs for one term, while others do notInverse property: NT - narrower termA taxonomy only uses BT and NT

SN – scope noteString explaining its meaning within the thesaurusUseful when the precise meaning of the term is not obvious from context

Page 69: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 69

Thesaurus relationshipsUSE

Another term that is to be preferred instead of this termImplies that the terms are synonymousInverse property: UF

TT – top termThe topmost ancestor of this termThe BT of the BT of the BT...

RT – related termA term that is related to this term, without being a synonym of it or a broader/narrower term

Page 70: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 70

Thesaurus example

http://www.ukat.org.uk/thesaurus/http://www.ukat.org.uk/thesaurus/

Page 71: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 71

Thesaurus example

http://www.swinburne.edu.au/corporate/registrar/rms/keywords.htmhttp://www.swinburne.edu.au/corporate/registrar/rms/keywords.htm

Page 72: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 72

http://www.loc.gov/cds/lcsh.htmlhttp://www.loc.gov/cds/lcsh.html

Thesaurus exampleLibrary of Congress Subject Heading

Page 73: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 73

Faceted classification

Proposed by S.R. Ranganathan in the ‘30sFacets are the different axes along which documents can be classifiedEach facet contains a number of terms

Usually with a thesaurus organizationUsually a term belongs to one facet onlyA document is classified by selecting one term from each facet

Page 74: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 74

Faceted classification example

http://flamenco.berkeley.edu/http://flamenco.berkeley.edu/

Page 75: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 75

Advantages

Multi-dimensionalityPersistenceScalabilityFlexibility

http://freeable.polito.it/http://freeable.polito.it/

Page 76: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 76

Ontology

Model for describing the world that consists of a set of types, properties, and relationships Extends the other subject-based classification approaches

Has open vocabulariesHas open relationship types (not just BT/NT, RT and USE/UF)

Page 77: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 77

Ontology structure

ConceptsRelationships

Is-aOther

Instances

Page 78: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 78

Ontology example

http://onto.stanford.edu:8080/wino/index.jsphttp://onto.stanford.edu:8080/wino/index.jsp

Page 79: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 79

FolksonomyInternet-mediatedsocial environmentsTags compiledthrough social taggingSocial tagging

Decentralized practice where individuals and groups create, manage and share tags to annotate digital resources in an online social environmentGenerally characterized by non-standard tagging

Page 80: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

Knowledge Management - Intro 80

Example (flickr - del.icio.us)

Page 81: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 81

Other subject-based techniques

Synonym ringsConnect together a set of terms as being equivalent for search purposeSimilar to UF/USE relationship of thesauri, but no preferred term

Page 82: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 82

Other subject-based techniques

Authority fileSimilar to a synonym ring, but consists of UF/USE relationships instead of synonym relationshipsOne term in each synonym ring is indicated as the preferred term for that subjecte.g. Library ofCongress NameAuthority File

Page 83: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 83

Subject-based classification summary

Terminology is rarely used in a consistent way

Controlled vocabularies are thesauri, thesauri are ontologies, …

Terminology is rarely used in a consistent way

Controlled vocabularies are thesauri, thesauri are ontologies, …

http://www.iesr.ac.uk/profile/vocabs/index.html/#CtrldVocabsListhttp://www.iesr.ac.uk/profile/vocabs/index.html/#CtrldVocabsList

Page 84: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 84

Subject-based classification summary

Page 85: Introduction to the Semantic Webelite.polito.it/files/courses/01LHV/2009/1-Knowledge.pdf · “Official” introduction The Semantic Web is about two things It is about common formats

F. Corno, L. Farinetti - Politecnico di Torino 85

Coming next …

More on RDFMore on Ontologies