lecture: ontologies and the semantic web

Semantic Analysis in Language Technology http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm

Ontologies and the Semantic Web

Marina Santini santinim@stp.lingfil.uu.se

Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden

Spring 2016



• Most slides are based on Horrocks (2008).

• The Semantic Web

• Ontologies

Chronology
http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web

• On August 6, 1991, Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup, inviting collaborators. This date also marked the debut of the Web as a publicly available service on the Internet, although new users could only access it after August 23.

• Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS, rapidly gained acceptance on the Web. This new model for information exchange, primarily featuring user-generated and user-edited websites, was dubbed Web 2.0.

• Popularized by Berners-Lee's book Weaving the Web (2000) and a Scientific American article by Berners-Lee, James Hendler, and Ora Lassila, the term Semantic Web describes an evolution of the existing Web in which the network of hyperlinked human-readable web pages is extended by machine-readable metadata about documents and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users.

• In 2006, Berners-Lee and colleagues stated that the idea "remains largely unrealized".

Web 1.0

• Web 1.0 is a retronym referring to an early stage of the World Wide Web's evolution.

• Some design elements of a Web 1.0 site include:

– Personal web pages were common, consisting mainly of static pages

– Static pages instead of dynamic HTML.

– The use of HTML 3.2-era elements such as frames and tables to position and align elements on a page (now we use CSS, and frames are deprecated)

– GIF buttons...

Web 2.0

• Web 2.0 describes World Wide Web sites that use technology beyond the static pages of earlier Web sites.

• The key features of Web 2.0 include:

– Tagging - allows users to collectively classify and find information

– Rich User Experience - dynamic content; responsive to user input

– User Participation - information flows two ways between site owner and site user by means of evaluation, review, and commenting.

– Site users add content for others to see

– Mass Participation - universal web access leads to differentiation of concerns from the traditional internet userbase.

– etc.

Web 3.0

• “Web 3.0, a phrase coined by John Markoff of the New York Times in 2006, refers to a supposed third generation of Internet-based services that collectively comprise what might be called ‘the intelligent Web’ - such as those using semantic web, microformats, natural language search, data-mining, machine learning, recommendation agents, and artificial intelligence technologies - which emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience.”

• Web 3.0 will be more connected, open, and intelligent, with semantic Web technologies, distributed databases, natural language processing, machine learning, machine reasoning, and autonomous agents.
– http://lifeboat.com/ex/web.3.0

This has yet to happen.

• "The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.

• One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the Web.

• Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form."

– Tim Berners-Lee, The Semantic Web Roadmap, 1998
– http://www.w3.org/DesignIssues/Semantic.html

The web: present and future

• The web is relatively simple:
– Hypertexts and hypermedia
– Access is engineered via a combination of keyword-based search and link navigation.

This simplicity has been one of the great strengths of the web, and has been an important factor in its popularity: even naive users can use it and create their own content.

Examples of where this simplicity falls short:

• Finding information about people with very common names can be a frustrating experience.

• Answering more complex queries, along with more general information retrieval, integration, sharing and processing, can be difficult … We have seen that…

Some solutions

• Software glue: Mashups

– Location information from one source might be combined with map information from another source in order to show the location of, and provide directions to, points of interest such as hotels and restaurants.

• Tagging via social networks (Web 2.0)

– Harness the power of user communities in order to share and annotate information.

– Examples include image and video sharing sites such as Flickr and YouTube, and auction sites such as eBay.

– In these applications, annotations usually take the form of simple tags, such as "beach", "birthday", "family" and "friends". The meaning of tags is, however, typically not well defined, and may be impenetrable even to human users: typical examples (from Flickr) include "sasquatchmusicfestival", "celebritylookalikes", and "wab08".

The "travel agent"

• The classic example of a semantic web application is an automated travel agent that, given various constraints and preferences, would offer the user suitable travel or vacation suggestions.

• A key feature of such a "software agent" is that it would not simply exploit a predetermined set of information sources, but would search the web for relevant information in much the same way that a human user might do when planning a vacation.

The goal

• The goal of the Semantic Web is to allow web information and services to be more effectively exploited by humans and automated tools.

Semantic Web

• The focus of the semantic web is to share data instead of documents.

• In other words, it is a project that should provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.

• It is a collaborative effort led by the World Wide Web Consortium (W3C).

Semantic Web & Ontologies

• How are we going to represent meaning and knowledge on the web?

• A key idea behind the semantic web is to address this problem by giving machine-accessible semantics via annotation.

• Knowledge is represented in the form of rich conceptual schemas called ontologies.

• Ontologies are the backbone of the Semantic Web.

• Ontologies are rich conceptual schemas that give formally defined meanings to the terms used in annotations, transforming them into semantic annotations.

• They provide the knowledge that is required for semantic applications of all kinds.

Main Difficulty

• Current web content is intended for humans (HTML markup with layout, images and other presentational features).

• Humans understand this content, but machines can't.

Basically...

• Ontologies provide a shared understanding of a domain.

• They provide background knowledge to automate certain tasks.

• By the process of annotation, knowledge can be linked to ontologies.
– Example: "Angelina Jolie" (text) linked to the concept Actress
– In our ontology we also know that an actress is always female and a …

• Ontologies allow the creation of annotations → machine-readable and machine-understandable content.

• If machines can understand content, they can also perform more meaningful and intelligent queries.
– Distinguishing Jaguar the animal from Jaguar the car.
– Combining information that is distributed on the Web.
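A minimal sketch of the idea, assuming a made-up namespace (`http://example.org/onto#`) and made-up concept names: annotations link a piece of text to a concept, and the ontology supplies the background knowledge that lets a query separate the two Jaguars or find "Angelina Jolie" when asking for Female.

```python
# Hypothetical concept IRIs; nothing here is a real vocabulary.
EX = "http://example.org/onto#"

# Background knowledge from the ontology: each concept lists its direct superconcepts.
SUPERCONCEPTS = {
    EX + "Actress": {EX + "Female"},
    EX + "JaguarAnimal": {EX + "Animal"},
    EX + "JaguarCar": {EX + "Car"},
}

# Semantic annotations: (document id, surface text) -> concept.
ANNOTATIONS = {
    ("doc1", "Angelina Jolie"): EX + "Actress",
    ("doc2", "Jaguar"): EX + "JaguarAnimal",
    ("doc3", "Jaguar"): EX + "JaguarCar",
}


def mentions_of(concept):
    """Return mentions annotated with `concept` or with a direct subconcept of it."""
    hits = []
    for mention, annotated in ANNOTATIONS.items():
        if annotated == concept or concept in SUPERCONCEPTS.get(annotated, set()):
            hits.append(mention)
    return sorted(hits)


cars = mentions_of(EX + "Car")        # only the doc3 mention of "Jaguar"
females = mentions_of(EX + "Female")  # "Angelina Jolie", found via Actress
print(cars, females)
```

A keyword search for "Jaguar" would return both doc2 and doc3; the annotated version returns only the one whose concept fits the query.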

Old and New Issues

Old ones:
• Knowledge representation
• Reasoning
• Harnessing the idiosyncrasies of natural languages
• …

New ones:
• Integrating different ontologies may prove to be at least as hard as integrating the resources that they describe
• Creation of suitable annotations
• …

Regardless of these issues…

• … considerable progress has been made in the development of the infrastructure needed to support the semantic web.

• In particular, there has been impressive progress in the development of languages and tools for content annotation and for the design and deployment of ontologies.

Semantic Annotation

• To facilitate the process of semantic annotation, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge.

• RDF and OWL are standards:
– RDF (Resource Description Framework)
– OWL (Web Ontology Language)

Ontologies (Metaphysics)

• Ontology, in its original philosophical sense, is a fundamental branch of metaphysics focusing on the study of existence.

• Its objective is to determine what entities and types of entities actually exist, and thus to study the structure of the world.

• The study of ontology can be traced back to the work of Plato and Aristotle, and includes the development of hierarchical categorisations of different kinds of entities and the features that distinguish them.

Tree of Porphyry

Tree of Porphyry, 3rd century AD

• The Porphyrian tree, Tree of Porphyry or Arbor Porphyriana is a classic device for illustrating what is also called a "scale of being". It was suggested by the 3rd century AD Greek neoplatonist philosopher and logician Porphyry.

Ontology (Computer Science, AI, LT, IR…)

• Engineering artefact, usually a model of some aspect of the world.

• It introduces vocabulary describing various aspects of the domain being modelled, and provides an explicit specification of the intended meaning of the vocabulary.

• This specification often includes classification-based information, not unlike that in Porphyry's tree.

What is an ontology (i)?


“An ontology is a formal, explicit specification of a shared conceptualization”

Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25 (1998) 161-197

“An ontology is an explicit specification of a conceptualization”

Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. Vol. 5. 1993. 199-220

• Conceptualization: an abstract model and simplified view of some phenomenon in the world that we want to represent

• Explicit: concepts, properties, relations, functions, constraints and axioms are explicitly defined

• Shared: consensual knowledge

What is an ontology (ii)?

• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base

B. Swartout; R. Patil; K. Knight; T. Russ. Toward Distributed Use of Large-Scale Ontologies. Ontological Engineering. AAAI-97 Spring Symposium Series. 1997. 138-148

• An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary

Neches, R.; Fikes, R.; Finin, T.; Gruber, T.; Patil, R.; Senator, T.; Swartout, W.R. Enabling Technology for Knowledge Sharing. AI Magazine. Winter 1991. 36-56

• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base

A. Bernaras; I. Laresgoiti; J. Correra. Building and Reusing Ontologies for Electrical Network Applications. ECAI96. 12th European Conference on Artificial Intelligence. Ed. John Wiley & Sons, Ltd.


Examples

• Top level ontology: Standard Upper Ontology

– In information science, an upper ontology (also known as a top-level ontology or foundation ontology) is an ontology which describes very general concepts that are the same across all knowledge domains.

• Linguistic ontology: WordNet

• General ontology: Cyc, UNSPSC, eCl@ss

• Domain ontology: MeSH (Medical Subject Headings), CHEMICALS, UMLS

• Research ontology: KA2 (Knowledge Acquisition Community Ontology)

Resource Description Framework (i)

• A language that has been developed in order to provide an extensible mechanism for describing web resources and relationships between them.

• A key feature of RDF is the use of Internationalized Resource Identifiers (IRIs), a generalisation of Uniform Resource Locators (URLs), to refer to resources.

• RDF is a very simple language: its underlying data structure is a labelled directed graph, and its only syntactic construct is the triple.

• A triple consists of three components, referred to as the subject, predicate and object.

A directed graph is a set of nodes connected by edges, where the edges have a direction associated with them.


RDF (ii)

• More formally, a triple represents a single edge (labelled with the predicate) connecting two nodes (labelled with the subject and object); it describes a binary relationship between the subject and object via the predicate.

• The predicate of a triple is always an IRI, and an IRI that is used in the predicate position of a triple is called a property.

• A set of triples is called an RDF graph.

• In order to facilitate the sharing and exchanging of graphs on the web, an XML serialisation has also been defined.
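A minimal sketch of these definitions in plain Python, without any RDF library. The `http://example.org/hp#` namespace and the resource names are made up for illustration; only the `rdf:type` IRI is the real one from the RDF vocabulary.

```python
from collections import defaultdict

EX = "http://example.org/hp#"  # hypothetical namespace for the example
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# An RDF graph is a set of (subject, predicate, object) triples.
graph = {
    (EX + "HarryPotter", EX + "hasPet", EX + "Hedwig"),
    (EX + "Hedwig", RDF_TYPE, EX + "Owl"),
    (EX + "HarryPotter", EX + "attendsSchool", EX + "Hogwarts"),
}


def outgoing_edges(triples):
    """View the triples as a labelled directed graph: node -> [(edge label, target)]."""
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        edges[subject].append((predicate, obj))
    return {node: sorted(out) for node, out in edges.items()}


# Every IRI used in predicate position is a property.
properties = {predicate for _, predicate, _ in graph}

for node, out in sorted(outgoing_edges(graph).items()):
    for label, target in out:
        print(node, "--", label, "->", target)
```

Each triple is exactly one edge, so the number of edges printed equals the number of triples in the set.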

"Harry Potter has a pet called Hedwig…"



RDF graph

Lect 09: Relation Extraction: DBpedia, a relation database that draws from Wikipedia

• Resource Description Framework (RDF) triples:

subject                    predicate             object
Golden Gate Park           location              San Francisco
dbpedia:Golden_Gate_Park   dbpedia-owl:location  dbpedia:San_Francisco

• DBpedia: The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information and consists of 3 billion RDF triples, 580 million extracted from the English edition of Wikipedia and 2.46 billion from other language editions (Wikipedia, March 2016).
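The prefixed names in the Golden Gate Park triple are shorthand for full IRIs. A small sketch of the expansion follows; the two prefix bindings are the ones DBpedia conventionally uses, but they are not stated on the slide, so treat them as an assumption.

```python
# Assumed prefix bindings (conventional for DBpedia, not given in the lecture).
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'dbpedia:San_Francisco' into a full IRI."""
    prefix, local_name = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local_name


triple = ("dbpedia:Golden_Gate_Park", "dbpedia-owl:location", "dbpedia:San_Francisco")
full_triple = tuple(expand(term) for term in triple)

for term in full_triple:
    print(term)
```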

… but … not enough…

• The capabilities of RDF as an ontology language are limited:
– No cardinality
– Not possible to describe conjunctions of classes
– …

RDF is a very simple language

The cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3.
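To make the cardinality point concrete, here is a small sketch. The facts and the "at most one pet" restriction are invented for illustration. The check simply counts distinct names, which is a simplification: an OWL reasoner may instead conclude that two names refer to the same individual.

```python
A = {2, 4, 6}
cardinality = len(A)  # the set from the slide has 3 elements

# Hypothetical facts, written as (subject, predicate, object) triples.
facts = [
    ("HarryPotter", "hasPet", "Hedwig"),
    ("Hagrid", "hasPet", "Fang"),
    ("Hagrid", "hasPet", "Fluffy"),
]


def fillers(subject, predicate, triples):
    """The set of objects related to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def within_max_cardinality(subject, predicate, triples, maximum):
    """Check a hypothetical 'at most `maximum` fillers' restriction by counting names."""
    return len(fillers(subject, predicate, triples)) <= maximum


print(cardinality)
print(within_max_cardinality("HarryPotter", "hasPet", facts, 1))
print(within_max_cardinality("Hagrid", "hasPet", facts, 1))
```

Plain RDF can state the three `hasPet` triples, but it has no construct for stating the restriction itself.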

Need for a more expressive ontology language: OWL (Web Ontology Language)

• Since the architecture of the web depends on agreed standards, the World Wide Web Consortium (W3C) set up a standardisation working group to develop a standard for a web ontology language.

• The result of this activity was the OWL ontology language standard.

• The integration of OWL with RDF has the advantage of making OWL ontologies directly accessible to web based applications.

Back story: http://ileriseviye.wordpress.com/2011/11/01/why-web-ontology-language-is-abbreviated-as-owl-and-not-wol/

Description Logics (DLs)

• A key feature of OWL is its basis in Description Logics, a family of logic-based knowledge representation formalisms that have a formal semantics based on first-order logic (FOL).

Description Logics

• We can use DLs to model an application domain. The focus is then on:

– Representation of knowledge about categories

– The set of categories in an application domain is called a terminology

– The terminology is arranged in a hierarchical organization called an ontology, which captures superset and subset relations among categories/concepts.

– In order to specify a hierarchical structure, we can use subsumption relations between the appropriate concepts in a terminology

– Subsumption is a form of inference. It determines whether a superset/subset relation (based on the facts asserted in a terminology) exists between two concepts.
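Subsumption checking can be sketched as follows, assuming a tiny made-up terminology that contains only direct "is a kind of" assertions. The sketch just follows those links transitively; a real DL reasoner also has to handle complex class descriptions, which this does not attempt.

```python
# Hypothetical terminology: each concept maps to its directly asserted superconcepts.
ASSERTED = {
    "SnowyOwl": {"Owl"},
    "Owl": {"Bird"},
    "Phoenix": {"Bird"},
    "Bird": {"Animal"},
}


def superconcepts(concept):
    """All concepts that subsume `concept` (including itself), found by following links."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in ASSERTED.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subsumed_by(sub, sup):
    """True if every instance of `sub` must also be an instance of `sup`."""
    return sup in superconcepts(sub)


print(is_subsumed_by("SnowyOwl", "Animal"))  # inferred, never asserted directly
print(is_subsumed_by("Phoenix", "Owl"))
```

"SnowyOwl is an Animal" is not in the terminology; it is inferred from three asserted facts, which is the sense in which subsumption is a form of inference.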

In short, DLs are…

• … formalisms based on object-oriented modelling, in which the domain is described in terms of individuals (instances), concepts (classes), and roles (properties/predicates):

– individuals, e.g., "Hedwig", are the basic elements of the domain;

– concepts, e.g., "Owl", describe sets of individuals having similar characteristics;

– roles, e.g., "hasPet", describe relationships between pairs of individuals, such as "HarryPotter hasPet Hedwig".

Axioms

• An OWL ontology consists of a set of axioms

• Example:

– Given the axiom C equivalentClass D, an individual is an instance of C if and only if it is an instance of D.

– Combining axioms with class descriptions allows for easy extension of the vocabulary by introducing new names as abbreviations for descriptions.

See the following axiom:

    Class: HogwartsStudent
        EquivalentTo: Student and attendsSchool value Hogwarts

It introduces the class name HogwartsStudent, and asserts that its instances are just those Students who attend Hogwarts.
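The "if and only if" reading of the HogwartsStudent axiom can be sketched in plain Python. The individuals and the school "Smeltings" are illustrative data, not part of the lecture, and the function hard-codes this one axiom rather than implementing OWL semantics in general.

```python
# Illustrative data: asserted types and asserted attendsSchool facts.
types = {
    "Harry": {"Student"},
    "Dudley": {"Student"},
    "Neville": {"HogwartsStudent"},
}
attends_school = {
    "Harry": {"Hogwarts"},
    "Dudley": {"Smeltings"},
}


def apply_equivalence(types, attends_school):
    """Apply HogwartsStudent EquivalentTo (Student and attendsSchool value Hogwarts)."""
    for individual in set(types) | set(attends_school):
        individual_types = types.setdefault(individual, set())
        schools = attends_school.setdefault(individual, set())
        if "HogwartsStudent" in individual_types:
            # Left to right: the name abbreviates the description.
            individual_types.add("Student")
            schools.add("Hogwarts")
        elif "Student" in individual_types and "Hogwarts" in schools:
            # Right to left: anything matching the description gets the name.
            individual_types.add("HogwartsStudent")


apply_equivalence(types, attends_school)
print(types)
print(attends_school)
```

After the call, Harry is classified as a HogwartsStudent, Neville is known to be a Student who attends Hogwarts, and Dudley is unchanged.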

TBox & ABox

• Axioms describe constraints on the structure of the domain:
– in DLs such a set of axioms is called a TBox (Terminology Box).

• OWL also allows for axioms asserting facts about some concrete situation, similar to data in a database setting:
– in DLs such a set of axioms is called an ABox (Assertion Box).
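One way to picture the split, as a sketch with invented axioms: the TBox holds statements about concepts, the ABox holds statements about individuals, and a query for the instances of a concept has to consult both.

```python
# Hypothetical TBox: structural axioms about concepts.
TBOX = [
    ("Owl", "SubClassOf", "Bird"),
    ("Bird", "SubClassOf", "Animal"),
]

# Hypothetical ABox: facts about a concrete situation.
ABOX = [
    ("Hedwig", "type", "Owl"),
    ("HarryPotter", "hasPet", "Hedwig"),
]


def instances_of(concept):
    """Individuals asserted to belong to `concept` or to any of its subconcepts."""
    wanted = {concept}
    changed = True
    while changed:  # collect subconcepts until no new one is found
        changed = False
        for sub, _, sup in TBOX:
            if sup in wanted and sub not in wanted:
                wanted.add(sub)
                changed = True
    return sorted(s for s, p, o in ABOX if p == "type" and o in wanted)


print(instances_of("Animal"))  # Hedwig, via Owl and Bird
print(instances_of("Wizard"))  # nothing asserted or derivable
```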

Decidability (i)

• Description Logics are fully-fledged logics and so have a formal semantics.

• DLs can be seen as decidable fragments of FOL with:
– individuals being equivalent to constants,
– concepts to unary predicates,
– roles to binary predicates.
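The correspondence can be written out for the running examples. This is the standard translation, shown here only for the individual Hedwig, the concept Owl, the role hasPet, and the HogwartsStudent axiom from the Axioms slide:

```latex
\begin{align*}
\text{individual } \mathit{Hedwig} &\;\mapsto\; \text{constant } \mathit{Hedwig} \\
\text{concept } \mathit{Owl} &\;\mapsto\; \mathit{Owl}(x) \\
\text{role } \mathit{hasPet} &\;\mapsto\; \mathit{hasPet}(x, y) \\[4pt]
\mathit{HogwartsStudent} \equiv \mathit{Student} \sqcap \exists \mathit{attendsSchool}.\{\mathit{Hogwarts}\}
  &\;\mapsto\; \forall x\, \bigl( \mathit{HogwartsStudent}(x) \leftrightarrow
     \mathit{Student}(x) \land \mathit{attendsSchool}(x, \mathit{Hogwarts}) \bigr)
\end{align*}
```

The left-hand side of the last line is the DL notation for the same axiom that the slide gives in OWL syntax.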

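FOL … undecidable (sometimes)

• Church and Turing showed independently in 1936 that first-order logic is in general undecidable. (Gödel's incompleteness theorems, published in 1931, are a related but different result about formal arithmetic.)

• That means there is no algorithm that, given an arbitrary FOL formula, always terminates and correctly decides whether the formula is valid.

• Ex: a decision procedure for FOL would let us solve the Halting Problem, which is impossible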

Halting Problem

• In 1936 Alan Turing proved that it is not possible to decide, in general, whether an arbitrary program will eventually halt or run forever.

• Formally, the problem asks for a program (actually, a Turing machine*) that takes as input another program together with that program's input, and decides, in finite time, whether the given program will ever halt when run on that input.

• The halting problem is a cornerstone problem in computer science. It is used mainly as a way to prove that a given task is impossible, by showing that solving that task would allow one to solve the halting problem.

*A Turing machine is a hypothetical device that manipulates symbols according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm.
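The usual argument for why no such decider can exist can be sketched in Python. The names `make_contrarian` and `says_never_halts` are invented for this illustration, and the sketch only runs the case that terminates.

```python
def make_contrarian(halts):
    """Build a program that does the opposite of what `halts` predicts about it."""
    def contrarian():
        if halts(contrarian):
            while True:  # predicted to halt, so loop forever instead
                pass
        # predicted to run forever, so halt immediately instead
    return contrarian


def says_never_halts(program):
    """A candidate 'decider' that always answers: this program runs forever."""
    return False


contrarian = make_contrarian(says_never_halts)
contrarian()  # returns at once
actually_halted = True  # reached only because the call above returned
prediction = says_never_halts(contrarian)

print("predicted to halt:", prediction, "| actually halted:", actually_halted)
```

A candidate that answered True would be wrong in the other direction, because `contrarian` would then loop forever (that case is not executed here). The same construction defeats any candidate decider, which is the core of the impossibility proof.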

Decidability (ii)

• DLs give a precise and unambiguous meaning to descriptions of the domain

• This also allows for the development of reasoning algorithms that can provide correct answers to arbitrarily complex queries about the domain.

Reasoning: OWL vs Databases

OWL axioms behave like inference rules rather than database constraints.

    Class: Phoenix
        SubClassOf: isPetOf only Wizard

    Individual: Fawkes
        Types: Phoenix
        Facts: isPetOf Dumbledore

• Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix.

• In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer.

• In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore being already known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
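The contrast can be sketched with two small functions that receive the same update. Both hard-code the Phoenix axiom and are illustrations only; neither is how an actual OWL reasoner or database management system is implemented.

```python
class ConstraintViolation(Exception):
    """Raised by the database-style check when an update breaks the schema."""


def owl_style_add(types, facts, pet, owner):
    """Treat 'Phoenix SubClassOf isPetOf only Wizard' as an inference rule."""
    facts.append((pet, "isPetOf", owner))
    if "Phoenix" in types.get(pet, set()):
        types.setdefault(owner, set()).add("Wizard")  # new knowledge is derived


def database_style_add(types, facts, pet, owner):
    """Treat the same statement as an integrity constraint on incoming data."""
    if "Phoenix" in types.get(pet, set()) and "Wizard" not in types.get(owner, set()):
        raise ConstraintViolation(owner + " is not known to be a Wizard")
    facts.append((pet, "isPetOf", owner))


owl_types, owl_facts = {"Fawkes": {"Phoenix"}}, []
owl_style_add(owl_types, owl_facts, "Fawkes", "Dumbledore")

db_types, db_facts = {"Fawkes": {"Phoenix"}}, []
try:
    database_style_add(db_types, db_facts, "Fawkes", "Dumbledore")
    rejected = False
except ConstraintViolation:
    rejected = True

print("OWL style infers:", owl_types["Dumbledore"])
print("Database style rejected the update:", rejected)
```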

Ontology Development Tools

• State of the art ontology development tools, such as SWOOP, Protégé, and TopBraid Composer, use DL reasoners to provide feedback to the user about the logical implications of their design:
– e.g. warnings about inconsistencies and synonyms.

WebProtégé http://protege.stanford.edu/products.php#web-protege


VOWL: Visual Notation for OWL Ontologies
http://vowl.visualdataweb.org/v2/


Domain-specific ontologies

• The availability of tools has contributed to the increasingly widespread use of OWL, and it has become the de facto standard for ontology development in fields as diverse as
– Biology
– Medicine
– Geography
– Geology
– Agriculture
– Defence
– etc.

Complex Queries

• The use of DL reasoners allows OWL ontology applications to answer complex queries and to provide guarantees about the correctness of the result.

• Reliability and correctness are clearly important features of any information system.

• They are particularly important if ontology based systems are to be used in safety-critical applications such as medicine, where incorrect reasoning could adversely impact patient care.

Standard Query Language

• It has long been recognised that the semantic web, and semantic web knowledge representation languages such as RDF and OWL, would also benefit from the availability of a standardised query language, comparable to SQL for relational databases.

• A W3C standardisation working group was set up, and has completed its work on the SPARQL query language standard.

SPARQL Protocol and RDF Query Language …

• … is an RDF query language, i.e. a query language that can retrieve and manipulate data stored in RDF format (i.e. triples).

• SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns.
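To show what matching a conjunction of triple patterns involves, here is a toy matcher in plain Python. It is not a SPARQL engine: it handles neither disjunctions nor optional patterns, and the data and names are invented. The roughly corresponding SPARQL query appears as a comment, with the `ex:` prefix assumed.

```python
# Roughly corresponds to (prefix declarations omitted, ex: is hypothetical):
#   SELECT ?owner ?pet WHERE { ?owner ex:hasPet ?pet . ?pet rdf:type ex:Owl . }

TRIPLES = [
    ("HarryPotter", "hasPet", "Hedwig"),
    ("Hedwig", "type", "Owl"),
    ("Dumbledore", "hasPet", "Fawkes"),
    ("Fawkes", "type", "Phoenix"),
]


def is_variable(term):
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new_binding = dict(binding)
    for pattern_term, value in zip(pattern, triple):
        if is_variable(pattern_term):
            if new_binding.setdefault(pattern_term, value) != value:
                return None  # variable already bound to something else
        elif pattern_term != value:
            return None  # constant does not match
    return new_binding


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns, one pattern at a time."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


owl_owners = query([("?owner", "hasPet", "?pet"), ("?pet", "type", "Owl")], TRIPLES)
print(owl_owners)
```

The shared variable `?pet` is what joins the two patterns: a binding survives only if the same value satisfies both.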

Tags & Ontologies

• Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of hierarchically organised sets of tags, often called folksonomies…

• Currently hard to combine:
– Increased expressive power (by using more sophisticated logics) with scalability (large ontologies)

Ontology Learning

• Ontology learning (ontology extraction, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between those concepts from a corpus of natural language text, and encoding them with an ontology language for easy retrieval.

• As building ontologies manually is extremely labor-intensive and time consuming, there is great motivation to automate the process.

• Typically, the process starts by extracting terms and concepts or noun phrases from plain text using linguistic processors such as part-of-speech tagging and phrase chunking. Then statistical techniques are used to extract relations, often based on Machine Learning.
– http://en.wikipedia.org/wiki/Ontology_learning
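As a very rough illustration of relation extraction from text, the sketch below uses a single hand-written lexical pattern ("X such as Y and Z") in place of the part-of-speech tagging, chunking and machine learning steps described above. The sentences and the pattern are invented for the example.

```python
import re

TEXT = (
    "Birds such as owls and phoenixes are kept as pets. "
    "Schools such as Hogwarts teach magic."
)

# One lexical pattern: "<general term> such as <term> [and <term>]".
PATTERN = re.compile(r"(\w+) such as (\w+)(?: and (\w+))?")


def extract_is_a(text):
    """Return candidate (specific term, general term) pairs found by the pattern."""
    pairs = []
    for general, first, second in PATTERN.findall(text):
        pairs.append((first, general))
        if second:  # the optional second term is '' when absent
            pairs.append((second, general))
    return pairs


candidates = extract_is_a(TEXT)
print(candidates)
```

The output is only a list of candidates. A real pipeline would normalise the terms, filter the pairs statistically, and then encode the surviving ones in an ontology language.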

In summary…

Why build an ontology?
• To share common understanding of the structure of information among people or software agents
• To enable reuse of domain knowledge
• To make domain assumptions explicit
• To analyze domain knowledge

How to build an ontology

• Classes: concepts of the domain or tasks, which are usually organized in taxonomies
Ex: in a university ontology, student and professor are two classes

• Relations: a type of interaction between concepts of the domain
Ex: subclass-of or is-a are relations

• Axioms: assertions that are always true for the domain of interest
Ex: if a student attends both the "Math" and "Basic text processing" courses, then he or she must be a 1st year student.

• Instances: represent specific elements
Ex: a student called Peter is an instance of the Student class
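The four ingredients of the university example can be put together in a small sketch. The class name `Ontology`, its attributes and the superclass "Person" are invented for illustration; a real ontology would be written in a language such as OWL.

```python
class Ontology:
    """A toy container for classes, relations, one axiom, and instances."""

    def __init__(self):
        self.classes = set()    # concepts of the domain
        self.is_a = set()       # relations: (subclass, superclass) pairs
        self.instance_of = {}   # instances: individual -> class
        self.attends = {}       # facts: individual -> set of courses

    def first_year_students(self):
        """Axiom from the example: attending both courses implies 1st year."""
        required = {"Math", "Basic text processing"}
        return sorted(
            name
            for name, cls in self.instance_of.items()
            if cls == "Student" and required <= self.attends.get(name, set())
        )


university = Ontology()
university.classes |= {"Person", "Student", "Professor"}  # "Person" is hypothetical
university.is_a |= {("Student", "Person"), ("Professor", "Person")}
university.instance_of["Peter"] = "Student"
university.attends["Peter"] = {"Math", "Basic text processing"}

print(university.first_year_students())
```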

• There is no single correct class hierarchy for any given domain.

• The hierarchy depends on the possible uses of the ontology.

• The level of detail depends on the applications and purposes.

The  end  

