www.ontopedia.net
O N T O P E D I A
The Identity of Everything
Creating Topic Maps + Topic Maps and Knowledge Organization
Steve Pepper [email protected]
Oslo University College, 2007-09-15
Course agenda
Week 37 – 09-08 Introduction to Topic Maps – Part 1
Week 38 – 09-15 Creating a topic map
Week 39 – 09-22 Introduction to Topic Maps – Part 2
Week 42 – 10-13 Ontology-driven editing
Week 43 – 10-20 The machinery of Topic Maps
Week 46 – 11-10 (Semantic Web)
Week 48 – 11-24 (Ontologies)
Terminology:
– Topic Maps: The technology and the standard
– topic maps: The artefacts (documents) we create
Today’s agenda
Quick recap: basic concepts and building blocks
Topic Maps and Knowledge Organization
– Metadata, taxonomies, thesauri, faceted classification
Interchange syntaxes
– XTM, LTM and CTM
Demo: Creating a topic map using LTM
– Pay close attention...
Recap: Core concepts
A pool of information or data, and a knowledge layer consisting of
• Topics – a set of topics representing the key subjects of the domain in question
• Associations – representing relationships between subjects
• Occurrences – links to information that is somehow relevant to a given subject
= The TAO of Topic Maps
(Diagram: a "knowledge" layer above an "information" layer; topic labels Puccini, Tosca, Lucca, Madame Butterfly; association labels "composed by", "born in", "composed by")
Recap: Basic building blocks
Basic building blocks are
– Topics: e.g. “Puccini”, “Lucca”, “Tosca”
– Associations: e.g. “Puccini was born in Lucca”
– Occurrences: e.g. “http://www.opera.net/puccini/bio.html is a biography of Puccini”
Each of these constructs can be typed
– Topic types: “composer”, “city”, “opera”
– Association types: “born in”, “composed by”
– Occurrence types: “biography”, “street map”, “synopsis”
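The three building blocks and their types can be sketched as plain data structures. This is only an illustration, not the Topic Maps data model itself: the class names, field names, and the helper `associations_of` are assumptions made for the example, and the sample values are the ones used on this slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject of discourse, e.g. Puccini or Lucca."""
    topic_id: str
    name: str
    types: list[str] = field(default_factory=list)  # ids of topic types


@dataclass
class Association:
    """A typed relationship between two or more topics."""
    assoc_type: str
    players: list[str]  # ids of the topics taking part


@dataclass
class Occurrence:
    """A typed link from a topic to a relevant information resource."""
    topic_id: str
    occ_type: str
    locator: str


topics = {
    t.topic_id: t
    for t in [
        Topic("puccini", "Puccini", ["composer"]),
        Topic("lucca", "Lucca", ["city"]),
        Topic("tosca", "Tosca", ["opera"]),
    ]
}
associations = [
    Association("born-in", ["puccini", "lucca"]),
    Association("composed-by", ["tosca", "puccini"]),
]
occurrences = [
    Occurrence("puccini", "biography", "http://www.opera.net/puccini/bio.html"),
]


def associations_of(topic_id: str) -> list[Association]:
    """Return every association in which the given topic takes part."""
    return [a for a in associations if topic_id in a.players]
```

Calling `associations_of("puccini")` returns both the "born in" and the "composed by" association, which is the kind of navigation a topic map is meant to support.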
Topic Maps and Knowledge Organization
Keywords & controlled vocabularies
Taxonomies, thesauri & classifications
Indexes & glossaries
Ontologies
Bibliographic languages
Work language
– Author language
– Title language
– Edition language
– Subject language: Classification language, Index language
Document language
– Production language
– Carrier language
– Location language
Source: Svenonius, Elaine (2000): The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press (p.54)
Work languages
– “Work languages describe information entities, their intellectual (as opposed to physical) attributes, and relationships among them.” (p.87)
Document languages
– “A document is a particular space-time embodiment of information: a document language describes and provides access to this embodiment.” (p.107)
Subject languages
– “A subject language is used to depict what a document is about.” (p.127)
Two perspectives
Works have tended to be conflated with documents
– So in practice there have been two kinds of language
Document languages
– describe the work and its manifestations
– document-centric (or resource-centric), e.g. document metadata (Dublin Core), bibliographic records (MARC)
Subject languages
– describe the subject space in which the work exists
– subject-centric, e.g. thesauri, taxonomies (ICD), classification schemes (LCSH, DDC), faceted classification (Colon)
Metadata
“Data about data”
– Information about documents
– e.g. author, title, publisher, date, format, keywords
Useful for managing the content
– Especially suitable for librarians
Somewhat useful for searching
– Especially for experts
Less useful for end-users
– the user starts out wanting to know more about a subject
– traditional metadata, however, focuses on the document
– if aboutness is provided at all, it gets squeezed into a single field
Example record:
Title: Creating Topic Maps
Author: Steve Pepper
Date: 2007-09-13
Format: appl/ppt
Keywords: topic maps, syntax, knowledge organization
Keywords
Primitive form of subject-based classification
– The keywords are used to describe the subject
– Cheap and simple… Folksonomies and tagging.
But also problematic because authors
– misspell keywrods,
– use different keywords/terms/tags for the same thing, and
– use keywords that make no sense
Secondary problem
– No way for the user to find out what keywords have been used
A keyword is a topic name
Controlled vocabularies
Solution: create a list of legal keywords!
– Requires somewhere to keep the list, and a process for new terms
Benefits
– Solves problems of misspelling and duplicates (synonyms)
Disadvantages
– Introduces some overhead (a flat list is difficult to manage)
– Users can still search using the wrong terms
– Users (and authors) still have difficulty finding terms
A controlled vocabulary is a well-defined set of topics with one name per topic
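A minimal sketch of how a list of legal keywords catches misspellings: each keyword an author supplies is checked against the list, and anything not on it is rejected. The vocabulary contents and the function name `check_keywords` are assumptions made for illustration.

```python
# Hypothetical controlled vocabulary: one preferred name per topic.
CONTROLLED_VOCABULARY = {"topic maps", "syntax", "knowledge organization"}


def check_keywords(keywords: list[str]) -> tuple[list[str], list[str]]:
    """Split keywords into (accepted, rejected) against the vocabulary."""
    accepted, rejected = [], []
    for keyword in keywords:
        normalized = keyword.strip().lower()
        if normalized in CONTROLLED_VOCABULARY:
            accepted.append(normalized)
        else:
            # Misspellings and unlisted terms end up here for review.
            rejected.append(keyword)
    return accepted, rejected
```

Note that this only helps authors; as the slide points out, users can still search with terms that are not on the list.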
Taxonomies
Organize the keywords into a tree
– Most general at the top, more specific further down
– Common structure used by Yahoo!, etc.
– The folder metaphor: file systems, email, favourites
Requires relationships between terms
– Relationships state that one term is more specific than another
– Advantage: terms somewhat easier to find
– Disadvantage: real world does not fit neatly into a hierarchy
A taxonomy is a set of topics related through a specific type of hierarchical association
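A taxonomy can be sketched as a single "more specific than" relation from each term to its parent. The terms below are hypothetical sample data, and the sketch assumes a strict tree (one parent per term, no cycles).

```python
# Maps each term to its parent (the more general term).
# Hypothetical sample data; assumes a strict tree without cycles.
BROADER = {
    "opera": "music",
    "music": "culture",
    "film": "culture",
}


def ancestors(term: str) -> list[str]:
    """Walk up the hierarchy from a term to the most general term."""
    path = []
    while term in BROADER:
        term = BROADER[term]
        path.append(term)
    return path
```

The single-parent restriction is exactly the limitation named on the slide: a term that belongs under two parents cannot be represented without breaking the tree.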
Thesauri
Like a taxonomy, but with some extensions
– Also better defined: there are ISO standards for thesauri
Relationship types:
– BT Broader term
– NT Narrower term
– USE Preferred term
– UF Used for (non-preferred terms)
– RT Related term
– SN Scope note
A thesaurus is a set of topics related through particular, predefined association types
– BT/NT (hierarchical) and RT (untyped, associative)
– (Scope notes are a kind of occurrence)
– (USE and UF represent multiple names for the same concept/topic)
Faceted classification
Invented by S. R. Ranganathan in the 1930s
– Defines a number of facets or dimensions
– Defines a set of terms within each facet
– Sometimes these terms are arranged in a taxonomy
– Documents are classified against each facet separately
A faceted classification is a collection of topic “hierarchies”
– Each “hierarchy” contains topics whose names are used as terms within a particular facet
– XFML: An XML interchange syntax for faceted classification inspired by Topic Maps
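Classifying documents against each facet separately can be sketched as one term per facet per document, with retrieval as a filter over any combination of facets. The facet names, document ids and the function `select` are hypothetical.

```python
# Each document is classified once per facet (hypothetical sample data).
documents = {
    "tosca-score": {"genre": "opera", "period": "1900s", "country": "Italy"},
    "boheme-review": {"genre": "opera", "period": "1890s", "country": "Italy"},
    "ring-guide": {"genre": "opera", "period": "1870s", "country": "Germany"},
}


def select(**facets: str) -> list[str]:
    """Return ids of documents matching every requested facet term."""
    return sorted(
        doc_id
        for doc_id, classification in documents.items()
        if all(classification.get(f) == term for f, term in facets.items())
    )
```

For example, `select(genre="opera", country="Italy")` narrows by two facets at once without requiring a single hierarchy that combines them.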
Expressivity progression
Topic maps
– use any types, properties, and relationships you like
Faceted classification
– multiple vocabularies, taxonomies or thesauri (one per facet)
Thesauri
– more formal taxonomy; still no topic types; two association types
Taxonomy
– terms arranged in a hierarchy; no topic types; single association type
Controlled vocabulary, folksonomies
– just a list of terms; no topic types; no associations
(Diagram labels: open model; fixed model; no model)
Document-centric approaches
Traditional metadata is document-centric
– Provides substantial descriptive power for documents
– Allows connection into subject-based classification
– Crucial for the management of content
– However, users are most interested in the subjects
Taxonomies, thesauri, and faceted classification are also document-centric
– These are methods for subject-based classification
– They provide hardly any descriptive power for subjects
Subject-centric approaches
Topic maps are subject-centric
– They provide great descriptive power for subjects
– Good as finding aids, because subjects are what users care about
Documents can be treated as subjects
– This enables topic maps to capture metadata as well
– It also enables topic maps to stitch metadata and subject-based classification together into one seamless whole
Topic Maps is the knowledge model par excellence:
– A subject-centric knowledge model that encompasses every other kind of knowledge organization model
– Topic Maps can therefore be used to relate and combine taxonomies, indexes, thesauri, classifications, etc.
Syntaxes
XTM, LTM and CTM
What are they?
When should I use which?
Topic Maps Syntaxes
HyTM (HyTime Topic Maps)
– Original syntax, expressed in terms of SGML and HyTime
– No longer part of ISO 13250
XTM (XML Topic Maps Syntax)
– Later, XML-based syntax, recently moved to version 2.0
– Easy to understand but very verbose
LTM (Linear Topic Map Notation)
– Defined by Ontopia in 2001 and supported by other products
– A simple ASCII syntax for rapid prototyping
CTM (Compact Topic Maps Syntax)
– ISO standard replacement for LTM
– Complete draft exists, but no implementations yet
Topic Map – XTM 1.0 Syntax
<!ELEMENT topicMap ( topic | association | mergeMap )* >
<!ATTLIST topicMap
  id          ID     #IMPLIED
  xmlns       CDATA  #FIXED 'http://www.topicmaps.org/xtm/1.0/'
  xmlns:xlink CDATA  #FIXED 'http://www.w3.org/1999/xlink'
  xml:base    CDATA  #IMPLIED
>

<?xml version="1.0" encoding="ISO-8859-1"?>
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- topics, associations, and mergeMap elements go here -->
</topicMap>
Topic Map – LTM Syntax
/* topics, associations, and occurrences go here */
Topic – XTM 1.0 Syntax
<!ELEMENT topic ( instanceOf*, subjectIdentity?, ( baseName | occurrence )* )>
<!ATTLIST topic id ID #REQUIRED>

<topic id="italy">
  ...
</topic>
<topic id="puccini">
  ...
</topic>
Topic – LTM Syntax
[topic-id]
[italy]
[puccini]
Topic Name – XTM 1.0 Syntax (1 of 2)
<!ELEMENT baseName ( scope?, baseNameString, variant* ) >
<!ATTLIST baseName id ID #IMPLIED >
<!ELEMENT baseNameString ( #PCDATA ) >
<!ATTLIST baseNameString id ID #IMPLIED >
<!ELEMENT variant ( parameters, variantName?, variant* ) >
<!ATTLIST variant id ID #IMPLIED >
<!ELEMENT variantName ( resourceRef | resourceData ) >
<!ATTLIST variantName id ID #IMPLIED >
Topic Name – XTM 1.0 Syntax (2 of 2)
<topic id="la-boheme">
  <baseName>
    <baseNameString>La Bohème</baseNameString>
    <variant>
      <parameters>
        <subjectIndicatorRef
          xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
      </parameters>
      <variantName>
        <resourceData>Bohème, La</resourceData>
      </variantName>
    </variant>
  </baseName>
</topic>
Topic Name – LTM Syntax
[topic-id = basename; sortname?; dispname?]
[la-boheme = "La Bohème"; "Bohème, La"]
Topic Type – XTM 1.0 Syntax
Use <instanceOf> subelement
<topic id="opera">
  ...
</topic>
<topic id="tosca">
  <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
</topic>
<topic id="boito">
  <instanceOf><topicRef xlink:href="#composer"/></instanceOf>
  <instanceOf><topicRef xlink:href="#librettist"/></instanceOf>
</topic>
Topic Type – LTM Syntax
[topic-id : topic-type]
[tosca : opera]
[boito : composer librettist]
Occurrence – XTM 1.0 Syntax
Use <occurrence> subelement
External/internal resources: <resourceRef> or <resourceData>

<!ELEMENT occurrence ( instanceOf?, scope?, ( resourceRef | resourceData ) )>
<!ATTLIST occurrence id ID #IMPLIED>

<topic id="la-boheme">
  <occurrence>
    <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
    <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
    <resourceData>1896 (1 Feb)</resourceData>
  </occurrence>
</topic>
Occurrence – LTM Syntax
{topic-id, occurrence-type, [URL | data]}
{la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
{la-boheme, premiere-date, [[1896 (1 Feb)]]}
Topic – Complete XTM 1.0 Syntax
<topic id="la-boheme">
  <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
  <baseName>
    <baseNameString>La Bohème</baseNameString>
    <variant>
      <parameters>
        <subjectIndicatorRef
          xlink:href="http://www.topicmaps.org/xtm/1.0/core.xtm#sort"/>
      </parameters>
      <variantName><resourceData>Boheme, La</resourceData></variantName>
    </variant>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#homepage"/></instanceOf>
    <resourceRef xlink:href="http://www.opera.it/Opere/La-Boheme/La-Boheme.html"/>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="#premiere-date"/></instanceOf>
    <resourceData>1896 (1 Feb)</resourceData>
  </occurrence>
</topic>
Topic – Complete LTM Syntax
[la-boheme : opera = "La Bohème"; "Boheme, La"]
{la-boheme, homepage, "http://www.opera.it/Opere/La-Boheme/La-Boheme.html"}
{la-boheme, premiere-date, [[1896 (1 Feb)]]}
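When a topic map is generated from existing data rather than typed by hand, the LTM lines above can be produced by small string-building helpers. This is a sketch covering only the LTM constructs shown on these slides; the function names are hypothetical, and it assumes that names and values contain no double quotes or `]]` sequences, since no escaping is attempted.

```python
def ltm_topic(topic_id, name, types=(), sort_name=None):
    """Build an LTM topic: [id : type ... = "name"; "sortname"]."""
    header = topic_id
    if types:
        header += " : " + " ".join(types)
    names = f'"{name}"'
    if sort_name:
        names += f'; "{sort_name}"'
    return f"[{header} = {names}]"


def ltm_occurrence(topic_id, occ_type, value, is_url=True):
    """Build an LTM occurrence: a quoted URL, or [[inline data]]."""
    body = f'"{value}"' if is_url else f"[[{value}]]"
    return f"{{{topic_id}, {occ_type}, {body}}}"


lines = [
    ltm_topic("la-boheme", "La Bohème", ["opera"], "Boheme, La"),
    ltm_occurrence(
        "la-boheme",
        "homepage",
        "http://www.opera.it/Opere/La-Boheme/La-Boheme.html",
    ),
    ltm_occurrence("la-boheme", "premiere-date", "1896 (1 Feb)", is_url=False),
]
```

Joining `lines` with newlines reproduces the three LTM statements on this slide.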
Association – XTM 1.0 Syntax
<!ELEMENT association (instanceOf?, scope?, member+)>
<!ATTLIST association id ID #IMPLIED>
<!ELEMENT member (roleSpec?, (topicRef | ...)+) >
<!ATTLIST member id ID #IMPLIED>
<!ELEMENT roleSpec (topicRef | ...) >

<association>
  <instanceOf><topicRef xlink:href="#composed-by"/></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#composer"/></roleSpec>
    <topicRef xlink:href="#puccini"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#work"/></roleSpec>
    <topicRef xlink:href="#tosca"/>
  </member>
</association>
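To see how the nesting of association, member and roleSpec is read by software, here is a minimal sketch that extracts the association type and the role-to-player mapping using Python's standard `xml.etree.ElementTree`. It assumes every association has an `instanceOf` and every member has a `roleSpec` and a single `topicRef` player (the DTD makes these optional or repeatable); the function name `read_associations` is hypothetical.

```python
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"xtm": XTM_NS}
HREF = f"{{{XLINK_NS}}}href"  # qualified name of the xlink:href attribute

DOCUMENT = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <association>
    <instanceOf><topicRef xlink:href="#composed-by"/></instanceOf>
    <member>
      <roleSpec><topicRef xlink:href="#composer"/></roleSpec>
      <topicRef xlink:href="#puccini"/>
    </member>
    <member>
      <roleSpec><topicRef xlink:href="#work"/></roleSpec>
      <topicRef xlink:href="#tosca"/>
    </member>
  </association>
</topicMap>"""


def read_associations(xml_text):
    """Return a list of (association type, {role type: player}) pairs."""
    root = ET.fromstring(xml_text)
    result = []
    for assoc in root.findall("xtm:association", NS):
        assoc_type = assoc.find("xtm:instanceOf/xtm:topicRef", NS).get(HREF)
        roles = {}
        for member in assoc.findall("xtm:member", NS):
            role = member.find("xtm:roleSpec/xtm:topicRef", NS).get(HREF)
            # Direct child only, so the roleSpec's topicRef is not matched.
            player = member.find("xtm:topicRef", NS).get(HREF)
            roles[role] = player
        result.append((assoc_type, roles))
    return result
```

A real XTM processor would also resolve the `#...` references to topics and handle merging; this sketch only shows the structure.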
Association – LTM Syntax
assoc-type ( role-player, role-player, ... )
composed-by( puccini , tosca )
Note 1: There can be more than two role-players in an association. We’ll talk about that next week.
Note 2: The above is a simplification, because we have not yet covered role types. We’ll do that next week.
The full syntax is as follows:
assoc-type ( role-player : role-type, role-player : role-type, ... )
composed-by( puccini : composer, tosca : work )
When omitted, the role type will be assumed to be identical to the type of the role-playing topic. This can be a useful short-hand and we will use it for now, but it is not always what you want...
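Both forms of the association syntax, with and without role types, can be produced by one small helper. This is a sketch with a hypothetical function name; a role type of `None` stands for the short-hand in which the role type is omitted.

```python
def ltm_association(assoc_type, players):
    """Build an LTM association from (player, role_type or None) pairs."""
    parts = [
        player if role is None else f"{player} : {role}"
        for player, role in players
    ]
    return f"{assoc_type}( {', '.join(parts)} )"


full = ltm_association("composed-by", [("puccini", "composer"), ("tosca", "work")])
short = ltm_association("composed-by", [("puccini", None), ("tosca", None)])
```

Here `full` is the exact form with role types and `short` is the short-hand form shown first on this slide.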
Subject Identity – XTM 1.0 Syntax
<!ELEMENT topic (instanceOf*, subjectIdentity?,...)>
<!ELEMENT subjectIdentity (resourceRef?, (topicRef | subjectIndicatorRef)*) >

<!-- Refer to a resource as subject: -->
<topic id="foo">
  <subjectIdentity>
    <resourceRef xlink:href="http://www.ontopia.net"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>The Ontopia Website</baseNameString>
  </baseName>
</topic>

<!-- Refer to a subject indicator: -->
<topic id="bar">
  <subjectIdentity>
    <subjectIndicatorRef xlink:href="http://www.ontopia.net/about.html"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>Ontopia</baseNameString>
  </baseName>
</topic>
Subject Identity – LTM Syntax
[topic-id = names %subject-address-URL]
[topic-id = names @subject-indicator-URL]

/* Refer to a resource as subject: */
[foo = "The Ontopia Website" %"http://www.ontopia.net"]

/* Refer to a subject indicator: */
[bar = "Ontopia" @"http://www.ontopia.net/about.html"]
Scope – XTM 1.0 Syntax
<!-- "scope" subelements on baseName, occurrence, and association (also "parameters" on variantName) -->
<topic id="composed-by">
  <baseName>
    <baseNameString>composed by</baseNameString>
  </baseName>
  <baseName>
    <scope><topicRef xlink:href="#composer"/></scope>
    <baseNameString>composer of</baseNameString>
  </baseName>
</topic>

<topic id="la-boheme2">
  <baseName>
    <baseNameString>La Bohème (Leoncavallo)</baseNameString>
  </baseName>
  <baseName>
    <scope><topicRef xlink:href="#leoncavallo"/></scope>
    <baseNameString>La Bohème</baseNameString>
  </baseName>
</topic>
Scope – LTM syntax
(name or occurrence or association) / scoping-topic(s)
[composed-by = "composed by" = "composer of" / composer ]
[la-boheme1 = "La Bohème (Puccini)" = "La Bohème" / puccini ]
[la-boheme2 = "La Bohème (Leoncavallo)" = "La Bohème" / leoncavallo ]
Home assignment
1. Prerequisites
– You have installed Java and the OKS Samplers
– You know the basics of LTM (see http://www.ontopia.net/download/ltm.html)
2. Create your first topic map
– Decide what domain you want to cover
– Write LTM in a text editor (Notepad, TextPad, emacs, ...)
– Keep it in its own directory
– Copy to .../apache-tomcat/webapps/omnigator/WEB-INF/topicmaps for testing in the Omnigator
– Use the Reload function
Your own topic map
Choose something that really interests you
– It’s much more fun than something boring!
Some ideas:
– Sport (football, cricket, ...)
– Culture (music, film, literature, theatre, ...)
– Study courses
– Project management
– Conference website
– Languages
– Geography
This first topic map is your own personal one
– The next one will be a group project for term assessment
Requirements:
– Minimum 4 topic types, 4 association types, 4 occurrence types
– Minimum 10 topics, 20 associations, 10 occurrences
– Send to [email protected] by Monday 29 September
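Before sending the file, the minimum counts can be checked roughly with a script. The sketch below is not an LTM parser: it assumes one construct per line, understands only the subset of LTM shown in these slides, and counts only topics that have their own `[...]` line. The function names are hypothetical; loading the file in the Omnigator remains the real test.

```python
import re

# Minimums taken from the assignment requirements.
MINIMUMS = {
    "topic types": 4,
    "association types": 4,
    "occurrence types": 4,
    "topics": 10,
    "associations": 20,
    "occurrences": 10,
}

TOPIC = re.compile(r"^\[\s*([\w-]+)\s*(?::\s*([^=@%\]]*))?")
OCCURRENCE = re.compile(r"^\{\s*[\w-]+\s*,\s*([\w-]+)\s*,")
ASSOCIATION = re.compile(r"^([\w-]+)\s*\(")


def count_constructs(ltm_text):
    """Roughly count constructs in LTM text written one per line."""
    topics, topic_types = set(), set()
    assoc_types, occ_types = set(), set()
    associations = occurrences = 0
    for raw in ltm_text.splitlines():
        line = raw.strip()
        if m := TOPIC.match(line):
            topics.add(m.group(1))
            topic_types.update((m.group(2) or "").split())
        elif m := OCCURRENCE.match(line):
            occurrences += 1
            occ_types.add(m.group(1))
        elif m := ASSOCIATION.match(line):
            associations += 1
            assoc_types.add(m.group(1))
    return {
        "topic types": len(topic_types),
        "association types": len(assoc_types),
        "occurrence types": len(occ_types),
        "topics": len(topics),
        "associations": associations,
        "occurrences": occurrences,
    }


def unmet(counts):
    """List the requirements that the counts do not yet satisfy."""
    return [key for key, minimum in MINIMUMS.items() if counts[key] < minimum]
```

Running `unmet(count_constructs(text))` on a draft returns the names of the requirements still to be met, or an empty list when all minimums are reached.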