It’s all semantics! The premises and promises of the semantic web.
Tony Ross
Centre for Digital Library Research, University of Strathclyde
Email: [email protected]
What is the Semantic Web?
“The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in co-operation.” (Berners-Lee et al., 2001)
“There is realization now, ‘It's not the documents, it is the things they are about which are important’. Obvious, really.” (Berners-Lee, 2007)
The Semantic Web: basic ideas [1]
The Web evolved largely as a platform for the linking and sharing of documents.
Simplicity was key: a largely syntactic rather than semantic framework.
Hence browsers display data without actually being aware of its ‘meaning’.
The Semantic Web: basic ideas [2]
Currently, intermediate programs must be built to allow interoperability between specific applications, e.g. insurance price comparison sites.
Web data is controlled by applications; the structure and format of that data is therefore particular rather than universal.
Wouldn’t it be better if machines were able to interpret and process the content of documents?
The Semantic Web: basic ideas [3]
But, this will require a lot of metadata and a lot of accompanying mark up!
Plus a lot of common infrastructural services and standards of application …
Q. How can automated technologies deal with subjectivity of language (e.g. context, intention, tone, etc.) or linguistic quirks (homonymy, synonymy, etc.)?
A. Only if we (humans) explicitly mark them up as such…
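To make the answer concrete, here is a minimal sketch of what explicit human markup gives a machine. The concept identifiers (example.org URIs), the labels and the function names are all hypothetical and chosen purely for illustration. The point is only that once each label is tied to an identifier, homonymy and synonymy become something a program can test for.

```python
# Hypothetical concept identifiers; the example.org URIs are illustrative only.
RIVER_BANK = "http://example.org/concepts/river-bank"
MONEY_BANK = "http://example.org/concepts/financial-institution"

# Human-supplied markup: each label is tied explicitly to the concept(s) it may denote.
LABELS = {
    "bank": {RIVER_BANK, MONEY_BANK},        # homonymy: one word, two meanings
    "financial institution": {MONEY_BANK},   # synonymy: two words, one meaning
    "riverbank": {RIVER_BANK},
}


def concepts_for(label):
    """Return the set of concept identifiers a label may denote."""
    return LABELS.get(label.lower(), set())


def is_ambiguous(label):
    """A label is ambiguous when it points at more than one concept."""
    return len(concepts_for(label)) > 1


def same_meaning(label_a, label_b):
    """Two labels can share a meaning if they point at a common concept."""
    return bool(concepts_for(label_a) & concepts_for(label_b))
```

Without the LABELS table, which a person has to write, the program has no way of knowing that ‘bank’ is ambiguous or that ‘financial institution’ is one of its senses.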
Building a common framework
It’s the same old problem as in cataloguing and indexing, i.e. we need to ensure we are describing things in the same way!
We must register (and thus control): vocabularies; services; names, etc.
And construct (and agree upon) common frameworks for the way such metadata is to be applied.
The NSDL Metadata Registry
Aims to make possible:
(1) “the unambiguous identification of metadata schemas (attribute spaces or element/property sets) and schemes (value spaces or controlled vocabularies);
(2) the machine declaration for encoding those schemes and schemas; and
(3) the publication of those schemes and schemas to communities and applications” (Hillmann et al., 2006)
Metadata Registries
Provide a common, openly accessible site for the registration of metadata schemas.
Thus, a locally produced vocabulary – e.g. JISC IE Vocabulary – is remotely accessible to all.
This means it can be referred to and reused both within JISC and across communities.
Promotes interoperability!
eXtensible Markup Language (XML)
Enables users to annotate (markup) documents with their own locally-defined elements.
The document then points to a namespace, a URI identifying the schema in which those elements are declared.
Other users and other documents can then use these elements and point to the same namespace.
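As an illustration of the namespace idea above, the following sketch uses Python's standard xml.etree.ElementTree module to read locally defined elements out of a small document. The namespace URI, the "lv" prefix, the element names and the function name are hypothetical; only the parsing calls are real library features.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for a locally defined vocabulary.
LOCAL_NS = "http://example.org/local-vocab#"

# A document that declares the namespace once and then uses its elements.
DOCUMENT = """<record xmlns:lv="http://example.org/local-vocab#">
  <lv:title>Romeo and Juliet</lv:title>
  <lv:audience>Undergraduate</lv:audience>
</record>"""


def read_local_elements(xml_text, namespace):
    """Return {local element name: text} for child elements in the given namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree expands a prefixed tag such as lv:title to "{namespace-uri}title".
    prefix = "{" + namespace + "}"
    found = {}
    for child in root:
        if child.tag.startswith(prefix):
            found[child.tag[len(prefix):]] = child.text
    return found


elements = read_local_elements(DOCUMENT, LOCAL_NS)
print(elements)
```

Any other document that declares the same namespace URI can reuse lv:title with the same intended meaning, whatever prefix it happens to choose.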
Resource Description Framework (RDF) [1]
Official W3C Recommendation
Published 2004
Result of work by the RDF Core Working Group
Resource Description Framework (RDF) [2]
A framework to allow commonly interpretable specifications of relations
Simple logical assertions based on:
{subject} {predicate} {object} e.g. {Document A} {has title} {“Romeo and Juliet”}
Thus, semantic metadata can be attached to a document (as XML). The ‘meaning’ of a document becomes machine processable.
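The {subject} {predicate} {object} pattern can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a full RDF implementation: the identifier for Document A is hypothetical, the Dublin Core title URI stands in for {has title}, and the output only approximates the N-Triples format (it escapes backslashes and quotes but nothing else).

```python
# Dublin Core 'title' property, used here for the {has title} predicate.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
# Hypothetical identifier for "Document A".
DOCUMENT_A = "http://example.org/docs/document-a"

# Each statement is a (subject, predicate, object) tuple.
# Objects are tagged so URIs and literal strings stay distinguishable.
triples = [
    (DOCUMENT_A, DC_TITLE, ("literal", "Romeo and Juliet")),
]


def to_ntriples(statements):
    """Serialise statements in N-Triples style, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        if kind == "literal":
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + escaped + '"'
        else:
            obj = "<" + value + ">"
        lines.append("<" + subject + "> <" + predicate + "> " + obj + " .")
    return "\n".join(lines)


def objects_of(statements, subject, predicate):
    """Answer 'what is the {predicate} of {subject}?' by matching triples."""
    return [value for s, p, (_, value) in statements if s == subject and p == predicate]


print(to_ntriples(triples))
print(objects_of(triples, DOCUMENT_A, DC_TITLE))
```

Because subject and predicate are identifiers rather than free text, a program can answer ‘what is the title of Document A?’ without parsing any natural language.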
Resource Description Framework (RDF) [3]
RDF doesn’t itself specify attributes or vocabularies – it is an enabling framework
Hence it can be used in conjunction with emergent standards such as RDFS, OWL, FOAF, SKOS, Dublin Core.
Simple Knowledge Organisation Systems (SKOS) [1]
Has W3C Working Draft status
SKOS-Core Guide published 2005
Developed to allow expression of the basic structure of controlled vocabularies (thesauri, classification schemes, subject heading lists, taxonomies, ‘folksonomies’, etc.)
Simple Knowledge Organisation Systems (SKOS) [2]
Divides resources into (5) classes, e.g.: skos:ConceptScheme, skos:Concept
And defines (26) properties of those classes, e.g.: skos:prefLabel, skos:broader
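A rough sketch of how two of these properties describe a vocabulary's structure: each concept carries a preferred label (skos:prefLabel) and an optional link to a broader concept (skos:broader). The concept scheme, the "ex:" identifiers and the function name below are hypothetical, and a plain dictionary stands in for real RDF data.

```python
# Hypothetical concept scheme; identifiers and labels are illustrative only.
SCHEME = {
    "ex:literature": {"skos:prefLabel": "Literature", "skos:broader": None},
    "ex:drama": {"skos:prefLabel": "Drama", "skos:broader": "ex:literature"},
    "ex:tragedy": {"skos:prefLabel": "Tragedy", "skos:broader": "ex:drama"},
}


def broader_chain(scheme, concept_id):
    """Follow skos:broader links upward, returning preferred labels in order."""
    labels = []
    seen = set()  # guards against accidental cycles in the data
    current = concept_id
    while current is not None and current not in seen:
        seen.add(current)
        labels.append(scheme[current]["skos:prefLabel"])
        current = scheme[current]["skos:broader"]
    return labels


print(broader_chain(SCHEME, "ex:tragedy"))
```

A search service could use such a chain to widen a query from ‘Tragedy’ to ‘Drama’ when too few results come back.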
Demonstration
The JISC Information Environment vocabulary, developed in support of CDLR project Resource Discovery iKit
As declared using the NSDL Metadata Registry
JISC Information Environment
Persistence: the responsibilities of ownership
In order for this to work, we need stable indicators reliably pointing to resources.
The responsibilities of ownership: who will assume responsibility for issues such as persistence, security and version control? Funding becomes an issue (especially as project funding dries up).
DDC has OCLC, LCSH has LoC, AAT has Getty, etc.
Metadata Registry
http://www.metadataregistry.org
References
Hillmann, D., Phipps, J., Sutton, S.A. and Laundry, R. (2006). A metadata registry from the vocabularies up: the NSDL Registry project.
Berners-Lee, T., Hendler, J. and Lassila, O. (2001). The Semantic Web. Scientific American, 284(5).
Berners-Lee, T. (2007). timbl's blog: Giant Global Graph. Posted 21st November 2007. Available: http://dig.csail.mit.edu/breadcrumbs/node/215