Semantic Web: Customers and Suppliers

Rudi Studer
Institut AIFB, Universität Karlsruhe (TH) & FZI Forschungszentrum Informatik & Ontoprise GmbH

Invited Talk @ ISWC2006, Athens, GA, USA
November 9th, 2006

„The Semantic Web: Customers and Suppliers“, Rudi Studer, 2006 Slide 2

ISWC: Looking Back

• ISWC2002: Sardinia, IT
– 95/27 sub/acc, 4 tutorials

• ISWC2003: Sanibel Island, FL, US
– 262/49 sub/acc, 6 workshops, 4 tutorials

• ISWC2004: Hiroshima, JP
– 205/48 sub/acc, 8 workshops, 6 tutorials

• ISWC2005: Galway, IE
– 217/54 sub/acc, 9 workshops, 4 tutorials


ISWC: Connecting Communities

• 2002: Tutorial on Description Logic (KR)

• 2003: Workshops on Practical and Scalable Semantic Systems (DB) and on Human Language Technology (NLP) for the Semantic Web and Web Services

• 2004: Paper session on Semantic Web Mining (ML)

• 2005: Workshop on Semantic Web Enabled Software Engineering (SE)


Agenda

• Presence of Semantic Web at Top Events of Other Communities

• Customers and Suppliers
– Knowledge Representation (KR)
– Databases (DB)
– Software Engineering (SE)
– Natural Language Processing (NLP)
– Machine Learning (ML)

• Business Aspects

• Trends and Take Home Messages


IJCAI, KR

• IJCAI2001: Workshops on Ontology Learning, on Ontologies and Information Sharing, and on IEEE Standard Upper Ontology

• IJCAI2003: Tutorials on "Ontologies - Representation, Engineering and Applications" and "Ontology-Based Information Integration"; Workshop on Ontologies and Distributed Systems

• IJCAI2007: Invited talk by Carole Goble on "The e-Scientist is the Semantic Web's Friend (or a Friend Of A Friend)"; Workshop on Semantic Web for Collaborative Knowledge Acquisition

• KR2002: Invited talk by Jim Hendler on "The Semantic Web: KR's Worst Nightmare?"; Workshops on …

• KR2004: Invited talk by Peter Patel-Schneider on "What Is OWL (and Why Should I Care)?"; Workshops on …

• KR2006: Invited talk by Alon Halevy on "Dataspaces: Co-existence with Heterogeneity"

• …

Status: Synergetic


VLDB, SIGMOD/PODS

• VLDB2001: Invited talk by Pierre-Paul Sondag on "The Semantic Web Paving the Way to the Knowledge Society"

• VLDB2003: Tutorial on "The Semantic Web: Semantics for the Data on the Web"

• VLDB2004: Invited talk by Alon Halevy on "Structures, Semantics and Statistics"

• Semantic Web and Databases (SWDB) workshops 2003-2006; VLDB workshops in 2005 and 2006 on Ontologies-based techniques for DataBases and Information Systems

• SIGMOD/PODS2006: Invited talk by Alon Halevy on "Principles of Dataspace Systems"; invited tutorial by Enrico Franconi on "The Logic of the Semantic Web"

Status: Knowledgeable


ACL, SIGIR, ECML, ICML

• COLING/ACL 2006: Workshop on Ontology Learning and Population

• EACL 2006: Tutorial on Ontology Learning from Text

• LREC 2006: Invited talk by Enrico Motta on "The role of language and mining technologies in engineering and utilizing the semantic web"

• ICML 2005: Tutorial on Machine Learning and the Semantic Web

• ECML/PKDD 2004: Workshop on Knowledge Discovery and Ontologies

• SIGIR 2003: Workshop on the Semantic Web

Status: Informed


ICSE, SEKE

• ICSE2004: Tutorial on Software Modeling Techniques and the Semantic Web

• Specialized conference SEKE: Software Engineering and Knowledge Engineering, e.g. sessions on Ontologies in 2005, 2006, Workshop on Ontology in Action in 2004

Status: Aware


Customers and Suppliers

• Customers: what does Semantic Web research deliver to other communities?

• Suppliers: what do other communities deliver to Semantic Web research?


Agenda

• Presence of Semantic Web at Top Events of Other Communities

• Customers and Suppliers
– Knowledge Representation (KR)
– Databases (DB)
– Software Engineering (SE)
– Natural Language Processing (NLP)
– Machine Learning (ML)

• Business Aspects

• Trends and Take Home Messages


Knowledge Representation (KR)

We establish ontology languages for knowledge representation on the Web

We deliver only the finest KR formalisms and deduction algorithms


KR as Customer

• Semantic Web delivers challenges beyond Tweety
– E.g. in application domains such as e-Science


KR as Customer

- Initiation of standardization processes
  - RDF
  - OWL
  - RIF WG


KR as Supplier

- Semantic Web requirements on KR:
  - efficient algorithms
  - tractable language fragments
  - handling uncertainty and inconsistency
  - non-monotonic reasoning

- KR delivers representation formalisms and reasoning algorithms
  - Description Logic (FaCT, Racer, Pellet, KAON2)
  - Logic Programming (Ontobroker)
  - Integration of DL/LP ([Motik et al., 2004 and 2006])


Integrating DLs and Logic Programming
Acknowledgements: Boris Motik, Riccardo Rosati

• DLs are good at…
– …representing taxonomical knowledge
– …representing incomplete information
  • unknown individuals and disjunctive knowledge

• But we also want…
– …to represent arbitrary relationships between objects
– …to represent database-like constraints
– …to represent exceptions

• Logic programming addresses many of these issues

• Hybrid MKNF knowledge bases…
– …consist of a DL knowledge base + a logic program
– …are fully compatible with DLs
– …are fully compatible with logic programming
– …bring together the best of both worlds


Databases (DB)

We deliver models for query answering in open, heterogeneous environments

We deliver efficient management of large data sets


Databases vs. Semantic Web

Databases

• Scalability
– Performance
– Performance
– Performance
[B. Lindsay, IBM Fellow]

• Controlled settings

• Closed world assumption

Semantic Web

• Expressive KR languages
– Description logics
– Uncertainty, heterogeneity and openness of the WWW

• Decentralized, ad-hoc settings

• Open world assumption
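
The contrast between the closed and the open world assumption can be illustrated with a minimal Python sketch. The fact base and the helpers `ask_closed` and `ask_open` are hypothetical names chosen for illustration; real database engines and OWL reasoners are far more involved.

```python
# Hypothetical fact base of (subject, predicate, object) statements.
FACTS = {
    ("ISWC2006", "locatedIn", "Athens, GA"),
    ("ISWC2005", "locatedIn", "Galway"),
}
# Statements explicitly known to be false (only meaningful in an open world).
NEGATED_FACTS = {
    ("ISWC2006", "locatedIn", "Galway"),
}


def ask_closed(fact):
    """Closed world: whatever is not stated is assumed to be false."""
    return fact in FACTS


def ask_open(fact):
    """Open world: True or False only if known, otherwise None (unknown)."""
    if fact in FACTS:
        return True
    if fact in NEGATED_FACTS:
        return False
    return None


query = ("ISWC2004", "locatedIn", "Hiroshima")
closed_answer = ask_closed(query)  # False: the fact is simply missing
open_answer = ask_open(query)      # None: the fact may hold, we just do not know
```

Under the closed world assumption a missing statement is treated as false, while under the open world assumption it remains undetermined until more information arrives.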


Trends in Databases
Acknowledgement: Alon Halevy

• DB trends
– From DBs to dataspaces
– From integration to co-existence

• Dataspaces [Franklin, Halevy, Maier 2005]
– "Pay-as-you-go" data management
– Dataspace querying, evolution and reflection
– Need for KR services

• Example scenarios
– Personal Information Management
– Enterprise Information Integration
– Querying the WWW


Example: Personal Information Management [Semex, CALO, Haystack]

[Figure: graph of personal information items linked by relations such as OriginatedFrom, PublishedIn, ConfHomePage, ExperimentOf, ArticleAbout, BudgetOf, CourseGradeIn, AddressOf, Cites, CoAuthor, FrequentEmailer, HomePage, Sender, EarlyVersion, Recipient, AttachedTo, PresentationFor]


DB as Customer: Challenge 1: Integration in Dataspaces

• Formal models for query answering
– Semantic integration of data sources
– Integration of structured and unstructured data
– Ranking of answers

"Recent developments in the field of knowledge representation (and the Semantic Web) offer two main benefits as we try to make sense of heterogeneous collections of data in a dataspace: simple but useful formalisms for representing ontologies, and the concept of URI (uniform resource identifiers) as a mechanism for referring to global constants on which there exists some agreement among multiple data providers." [Abiteboul et al., The Lowell Database Research Self-Assessment, CACM, May 2005, Vol. 48, No. 5]


DB as Customer: Challenge 2: Semantic Mappings

• Methods for answering queries from multiple sources without a set of pre-defined correct semantic mappings
– Approximate mappings
– Measuring the accuracy of mappings
– Emergent semantics: gossip-based algorithms
– Inferring mappings, reasoning about mappings

"A semantic heterogeneity solution capable of deployment at Web scale remains elusive. […] The same problem is being investigated in the context of the Semantic Web. Collaboration between groups working on these and other related problems, both inside and outside the database community, is important." [Abiteboul et al., The Lowell Database Research Self-Assessment, CACM, May 2005, Vol. 48, No. 5]


DB as Customer: Challenge 3: Uncertainty and Inconsistency

• Life is imperfect with dataspaces:
– Semantic relationships are uncertain
– Data sources may be imprecise
– Data will often be inconsistent

• Reasoning with inconsistent knowledge

• Diagnosis and repair, belief revision

• Languages for modeling fuzzy and uncertain knowledge

Workshop on Uncertainty Reasoning for the Semantic Web (URSW @ ISWC2005, 2006)


DB as Supplier: Deductive Database Techniques in KAON2

• Efficient reasoning with large datasets (ABox) is hard with standard methods for OWL reasoners (tableaux algorithms)

• Deductive databases can efficiently handle large data quantities

• Idea: apply techniques from the field of (disjunctive) deductive databases
– join-order optimization
– magic sets optimization

DL knowledge base KB → disjunctive datalog program DD(KB)

Query: KB |= α if and only if DD(KB) |= α, for a ground fact α
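
To make the idea of answering a ground query over a datalog program concrete, here is a minimal Python sketch of naive bottom-up (forward-chaining) evaluation. It is not KAON2's reduction and uses none of its optimizations (no disjunction, no magic sets, no join ordering); the rule format, the predicate names and the `entails` helper are assumptions made purely for illustration.

```python
# A ground fact is a pair (predicate, constant).
# A rule (body, head) reads: body(x) -> head(x).
# Only unary, non-disjunctive rules are handled in this sketch.


def saturate(facts, rules):
    """Apply all rules repeatedly until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Iterate over a snapshot because the set grows during the loop.
            for pred, const in list(derived):
                if pred == body and (head, const) not in derived:
                    derived.add((head, const))
                    changed = True
    return derived


def entails(facts, rules, query):
    """Check whether the ground fact `query` follows from facts and rules."""
    return query in saturate(facts, rules)


# Hypothetical example data.
FACTS = {("PhDStudent", "alice")}
RULES = [("PhDStudent", "Student"), ("Student", "Person")]

is_person = entails(FACTS, RULES, ("Person", "alice"))
is_professor = entails(FACTS, RULES, ("Professor", "alice"))
```

A production system would avoid computing the full fixpoint for every query; techniques such as magic sets restrict the derivation to facts relevant for the query at hand.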


Software Engineering

We deliver formal models which enable e.g. reasoning about resources

We deliver architectures, tool support, visual modelling techniques


Ontologies vs. Models
Acknowledgements: Colin Atkinson

• Ontologies
  • originated from the artificial intelligence world for the purpose of precisely structuring "knowledge"
  • new "knowledge" derived by automated reasoning
  • characterized by OWL as the flagship language
    – formal semantics (description logic)

• Models (à la MDA)
  • originated from the software engineering world for structuring the specification of software, abstracting from platform-specific aspects
  • information defined prescriptively for construction
  • characterized by UML as the flagship language
    – semi-formal semantics (metamodels)

Models / Ontologies?


Ontology Definition Metamodel

[Figure: layered diagram with the levels Meta-Meta-Model, Meta-Model and Model; labels: Metaobject Facility (MOF), UML Profile (Visual Syntax), Ontology Metamodel, Mappings, UML 2.0 Model, Ontology]


SE as Supplier: MOF-based Ontology Development

• MDA enables interoperability

• MDA-based tool support (modeling tools, model management)

• Independence of specific formalisms
– Definition of the ontology model in an abstract form, independent of the particularities of specific logical formalisms
– Language mappings (groundings) define the transformation to particular formalisms

• Reuse of UML for visual modeling

see NeOn approach for networked ontology model


SE as Customer: MOF and Semantic Web
Acknowledgements: Elisa F. Kendall

• MOF technology streamlines the mechanics of managing models and model transformation

• Semantic Web technologies provide reasoning about resources
– Semantic alignment among differing vocabularies and nomenclatures
– Consistency checking and model validation, e.g. business rule analysis
– Ask questions over multiple resources that one could not answer previously
– Policy-driven applications to leverage existing knowledge and policies

• Example: A Formal Framework for Reasoning on UML Class Diagrams [Lenzerini et al. 2002]


Natural Language Processing (NLP)

We deliver all kinds of ontologies and reasoning support, e.g. to improve disambiguation

We deliver solid information extraction methods and tools


NLP as Customer

• Domain ontologies for disambiguation, e.g.
– Compound interpretation (see OntoQuery project)
– Lexical ambiguities:
  • "corner" has 11(!) meanings (synsets) in WordNet
  • but in specific domains far fewer meanings are typically relevant, e.g. in the soccer domain (SmartWeb)
    – "corner" as location on the playing field
    – "corner" as a player action
– Syntactic ambiguities (PP-attachment, …)
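
The effect of a domain ontology on lexical ambiguity can be sketched in a few lines of Python: candidate senses of a word are restricted to those that correspond to concepts of the domain. The sense labels, the abbreviated sense inventory and the soccer concept set below are hypothetical and do not reproduce WordNet or the SmartWeb ontology.

```python
# Hypothetical, abbreviated sense inventory (WordNet lists more senses).
SENSES = {
    "corner": [
        "street_corner",
        "room_corner",
        "market_corner",
        "field_location",
        "corner_kick_action",
    ],
}

# Hypothetical set of concepts defined in a soccer domain ontology.
SOCCER_CONCEPTS = {"field_location", "corner_kick_action", "goal", "penalty"}


def candidate_senses(word, domain_concepts):
    """Keep only those senses of `word` that exist as domain concepts."""
    return [s for s in SENSES.get(word, []) if s in domain_concepts]


soccer_senses = candidate_senses("corner", SOCCER_CONCEPTS)
```

The remaining ambiguity (location versus player action) still has to be resolved from context, but the search space is much smaller than the full sense inventory.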


NLP as Customer

• Foundational ontologies for capturing domain-independent aspects of meaning – see [Cimiano and Reyle 2006]

• Spatial and temporal ontologies to support NL interpretation by reasoning


NLP as Supplier

• Many methods for information extraction (IE) from text have been developed in the past
– see Message Understanding Conferences (MUC)

• Use the Web as a corpus of evidence
– A-Box (PANKOW [Cimiano et al. 2004], KnowItAll [Etzioni et al. 2004])
– T-Box (synonym discovery [Turney 2001])

• Automating ontology evaluation
– e.g. w.r.t. OntoClean (see AEON [Völker et al. 2005])


NLP for Ontology Evaluation

• Understanding OntoClean requires (at least …) philosophical, modelling and particular domain knowledge

• Even for experts, applying OntoClean is tedious and time-consuming

• Automatic Evaluation of ONtologies (AEON) facilitates tagging w.r.t. OntoClean meta-properties

• Nature of concepts is reflected by human language and what is said about instances of these concepts
– "He is no longer a student." (student is not rigid)
– "Connecting more than two computers requires a hub." (computer is countable and thus carries identity)

• Pattern-based approach

• Detect positive and negative evidence for meta-properties

• Use WWW as corpus: overcome data sparseness

Ahh … and how do I evaluate the ontology?
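
A pattern-based search for meta-property evidence can be sketched as follows. The regular expressions, the two evidence categories and the function `count_evidence` are hypothetical and are not taken from the AEON pattern library; a plain list of sentences stands in for the Web corpus.

```python
import re
from collections import Counter

# Hypothetical lexical patterns; "{c}" is replaced by the concept label.
# Statements that something stopped being a {c} hint at non-rigidity.
ANTI_RIGIDITY_PATTERNS = [
    r"\bis no longer an? {c}\b",
    r"\bused to be an? {c}\b",
]
# Counting instances ("two computers") hints that the concept carries identity.
IDENTITY_PATTERNS = [
    r"\b(?:two|three|four|several|many) {c}s\b",
]


def count_evidence(concept, sentences):
    """Count sentences matching each pattern group for the given concept."""
    counts = Counter()
    groups = (
        ("anti_rigid", ANTI_RIGIDITY_PATTERNS),
        ("identity", IDENTITY_PATTERNS),
    )
    for name, patterns in groups:
        for pattern in patterns:
            regex = re.compile(pattern.format(c=re.escape(concept)), re.IGNORECASE)
            counts[name] += sum(1 for s in sentences if regex.search(s))
    return counts


# The two example sentences from the slide serve as a tiny corpus.
CORPUS = [
    "He is no longer a student.",
    "Connecting more than two computers requires a hub.",
]

student_evidence = count_evidence("student", CORPUS)
computer_evidence = count_evidence("computer", CORPUS)
```

In a realistic setting the counts would come from many Web hits and feed a classifier, since individual pattern matches are noisy.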


AEON – Architecture

[Figure: AEON architecture. Input: Ontology; Output: Tagged Ontology (+R-I..). Labels shown: QuickTag, Pattern Library, Web Search Engine, Linguistic Analyser, Evaluation Component, Classifier, World, WWW]


Machine Learning (ML)

We improve bag-of-words models with semantics

We deliver learned A-Boxes and T-Boxes


ML as Customer

• Inclusion of semantics in bag-of-words models
– Text clustering and classification ([Bloehdorn et al. 2005])
– Information Retrieval ([Gonzalo et al. 1999])

• Semantics in image recognition:
– Fuse information from different classifiers (see ACEMEDIA Project)
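
One simple way to add semantics to a bag-of-words model is to extend the term counts with the more general concepts of each term. The sketch below shows this general idea only; it is not the specific method of the cited papers, and the toy taxonomy and the `concept:` feature prefix are assumptions.

```python
from collections import Counter

# Hypothetical toy taxonomy: term -> more general concept.
HYPERNYMS = {
    "laser": "printer",
    "inkjet": "printer",
    "printer": "device",
    "scanner": "device",
}


def ancestors(term):
    """Return all more general concepts of a term, nearest first."""
    result = []
    while term in HYPERNYMS:
        term = HYPERNYMS[term]
        result.append(term)
    return result


def semantic_bag(tokens):
    """Bag of words extended with one feature per ancestor concept."""
    bag = Counter(tokens)
    for token in tokens:
        for concept in ancestors(token):
            bag["concept:" + concept] += 1
    return bag


doc_a = semantic_bag(["laser", "paper"])
doc_b = semantic_bag(["inkjet", "toner"])
shared_features = set(doc_a) & set(doc_b)
```

The two documents share no word, yet they overlap on the concept features, which is what lets clustering or retrieval treat them as related.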


ML as Supplier

• Use of machine learning methods for A-Box and T-Box learning:
– Inductive Logic Programming (ILP) for induction of concept definitions, e.g. for restructuring concept hierarchies ([Esposito et al. 2004])
– Discover new associations between concepts (e.g. via association rules) ([Maedche & Staab 2000])
– Learning taxonomies by unsupervised clustering techniques, e.g. OntoGen ([Grobelnik et al., 2006])


Disclaimer

Other important areas which I could not mention here include Agents, Blogs, Grids, Peer-to-Peer Systems, Social Networks, Web Services, …


Agenda

• Presence of Semantic Web at Top Events of Other Communities

• Customers and Suppliers
– Knowledge Representation (KR)
– Databases (DB)
– Software Engineering (SE)
– Natural Language Processing (NLP)
– Machine Learning (ML)

• Business Aspects

• Trends and Take Home Messages


The Semantic Technology Market Offers High Growth Potential

Application areas drive the market for semantic technologies:
• Enterprise Information Integration (EII)
• Enterprise Content Management (ECM)
• Enterprise Resource Planning (ERP)
• Product Lifecycle Management (PLM)

Central functionalities can be transferred to a semantic middleware layer (carve-out effect).

An estimated 5-20% of the ERP, PLM and ECM markets will be filled by semantic technology applications.

NeOn Reference Architecture

[Figure: layered architecture with Web Service Access. Layer 1: Distributed Repository; Layer 2: Distributed Components; Layer 3: NeOn Toolkit (Eclipse-based). Labels shown: Texts; Databases, Catalogues; Ontologies, Instances; Semantic Annotations; NeOn Reasoning Service; NeOn Annotation Service; External; NeOn Editor; NeOn Browser; …]


Market Estimation: Semantic Access & Integration market

In million €                       2006    2007    2008    2009    2010    CAGR
ECM Worldwide                     2,754   3,277   3,900   4,641   5,523  19.00%
Semantic ECM (%)                     5%      7%     10%     15%     20%
Semantic ECM                        138     229     390     696   1,105
EII Worldwide                     4,580   5,985   7,116   8,461  10,060  18.90%
Semantic Info. Integration (%)       5%     10%     15%     20%     25%
Semantic Info. Integration          229     599   1,067   1,692   2,515
ERP Worldwide                    17,470  18,309  19,187  20,108  21,074   4.80%
Semantic ERP (%)                     0%    0.5%      2%     10%     20%
Semantic ERP                          0      92     384   2,011   4,215
PLM Worldwide                     6,600   7,920   9,108  10,474  12,045  15.00%
Semantic PLM (%)                     5%      8%      8%      8%      8%
Semantic PLM                        330     634     729     838     964
Semantic Access & Integration       697   1,553   2,570   5,237   8,798
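
The arithmetic behind the table can be retraced with a short Python sketch, shown here for the ECM rows: each semantic segment is the worldwide market multiplied by the assumed semantic share, and the CAGR follows from the first and last year. The function names and the half-up rounding are assumptions; the slide does not state how the figures were rounded.

```python
# Figures in million EUR, taken from the ECM rows of the table.
ECM_MARKET = {2006: 2754, 2007: 3277, 2008: 3900, 2009: 4641, 2010: 5523}
SEMANTIC_SHARE_PCT = {2006: 5, 2007: 7, 2008: 10, 2009: 15, 2010: 20}


def semantic_segment(market, share_pct):
    """Market size times share in percent, rounded half up (integer math)."""
    return (market * share_pct + 50) // 100


def cagr(first, last, periods):
    """Compound annual growth rate over the given number of yearly steps."""
    return (last / first) ** (1 / periods) - 1


semantic_ecm = {
    year: semantic_segment(ECM_MARKET[year], SEMANTIC_SHARE_PCT[year])
    for year in ECM_MARKET
}
ecm_growth = cagr(ECM_MARKET[2006], ECM_MARKET[2010], periods=4)
```

The same two functions apply to the EII, ERP and PLM rows; the bottom line of the table is the sum of the four semantic segments per year, up to rounding.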


Information Integrator

[Figure: integration architecture with the components Heterogeneous Sources, Ontologies of Sources, Declarative Mappings, Automated Mapping, Business Ontology and Views]

Example source record:

<article>
  <articleid>a-5634</articleid>
  <category>printer</category>
  <name>hp81</name>
  <price currency="USD">500</price>
  <producer>hp</producer>
  <resolution>1960 dpi</resolution>
  <type>laser</type>
  ….
</article>
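
How a declarative mapping lifts a source record into a business ontology can be sketched with Python's standard XML parser. The record is the one shown on this slide, with the elided fields left out; the `product:` property names, the `MAPPING` table and the `integrate` function are hypothetical and only illustrate the principle.

```python
import xml.etree.ElementTree as ET

# Source record from the slide (elided fields omitted).
SOURCE_RECORD = """<article>
  <articleid>a-5634</articleid>
  <category>printer</category>
  <name>hp81</name>
  <price currency="USD">500</price>
  <producer>hp</producer>
  <resolution>1960 dpi</resolution>
  <type>laser</type>
</article>"""

# Hypothetical declarative mapping: source element -> ontology property.
MAPPING = {
    "articleid": "product:id",
    "name": "product:label",
    "producer": "product:manufacturer",
    "price": "product:listPrice",
}


def integrate(xml_text, mapping):
    """Translate mapped source elements into (subject, property, value) triples."""
    root = ET.fromstring(xml_text)
    subject = root.findtext("articleid")
    triples = []
    for child in root:
        if child.tag in mapping:
            triples.append((subject, mapping[child.tag], child.text))
    return triples


triples = integrate(SOURCE_RECORD, MAPPING)
```

Because the mapping is data rather than code, adding a further source only requires a new mapping table, while queries are posed against the shared business ontology.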


A Look at REAL Customers
Acknowledgement: Richard Benjamins, iSOCO

• Ontologies are the key differentiating feature of Semantic Web technologies
– Semantic integration of heterogeneous sources
– Automatic processing of unstructured information

• One of the current main obstacles for Semantic Web technologies is the need for ontologies
– They are hard to construct and maintain
– They may involve many stakeholders
– Their costs are difficult to estimate and control

• Before Semantic Web technology goes to the mainstream market, potential customers (businesses and governments) need to perceive that ontologies are
– Doable, controllable, manageable and affordable
– An asset for creating/maintaining competitive advantage
– An asset that can be sold as high-value content

• Understanding and controlling cost factors of ontology engineering is critical (see OntoCom [Paslaru et al., 2006])


Agenda

• Presence of Semantic Web at Top Events of Other Communities

• Customers and Suppliers
– Knowledge Representation (KR)
– Databases (DB)
– Software Engineering (SE)
– Natural Language Processing (NLP)
– Machine Learning (ML)

• Business Aspects

• Trends and Take Home Messages


Take Home Messages

• KR: Synergetic collaboration

• DB: Similar strategic challenges and goals

• NLP+ML: Huge potential, not yet exploited

• SE: Potential recognized, but still in early stage

• Business Aspects: Existing and growing market for Corporate Semantic Web applications


Trends


Trends

• Web Science (Berners-Lee et al.)
– studies the scientific, technical and social challenges underlying the growth of the Web
– Semantic Web as important building block

• Convergence of Web 2.0 and Semantic Web
– Web 2.0: collaborative development of content, community effects → The Social Web
– Semantic Web: structuring principles, well-defined and reusable meaning for metadata, mash-ups on the fly


Semantic MediaWiki

• Enable wiki authors to structure information

• RDF export of this structure

• Knowledge reusable inside the wiki

[Figure: graph with the node ISWC2006 linked via "location" to Athens, GA and via "starts" to 11/07/2006]

• Typed links
– ISWC2006 is in [[Athens, GA]]
– ISWC2006 is in [[location::Athens, GA]]

• Typed attributes
– ISWC2006 starts November 7
– → [[starts:=November 7, 2006]]

• Query for all US conferences in autumn 2006
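
The typed-link and typed-attribute syntax shown above can be turned into simple statements with a regular expression. This is a minimal Python sketch and not the Semantic MediaWiki parser; the function `extract_statements` and the tuple layout are assumptions made for illustration.

```python
import re

# [[property::value]] is a typed link, [[property:=value]] a typed attribute,
# following the syntax shown on the slide. Plain links such as
# [[Athens, GA]] contain no "::" or ":=" and are ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:]+?)(::|:=)([^\[\]]+?)\]\]")


def extract_statements(page_title, wikitext):
    """Return (subject, property, value, kind) tuples found in the wikitext."""
    statements = []
    for prop, separator, value in ANNOTATION.findall(wikitext):
        kind = "link" if separator == "::" else "attribute"
        statements.append((page_title, prop.strip(), value.strip(), kind))
    return statements


WIKITEXT = (
    "ISWC2006 is in [[location::Athens, GA]] "
    "and [[starts:=November 7, 2006]]. See also [[Athens, GA]]."
)

statements = extract_statements("ISWC2006", WIKITEXT)
```

Statements of this shape map directly onto RDF triples with the page as subject, which is what makes an RDF export and queries over the annotations possible.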


Semantic MediaWiki

• Collaborative management of semantically enriched content, tailored towards usability and simplicity
– Edit & Annotate: annotation as easy as wiki editing: unconstrained, collaborative, version-controlled
– Search & Explore: semantic search, novel browsing, and easier maintenance as instant rewards
– Share & Reuse: content exported as browsable OWL DL/RDF, reusing existing vocabularies and ontologies

• Thousands of real users in many languages:

Semantics to the people!

http://ontoworld.org/wiki/Semantic_MediaWiki


Thank You!

Rudi Studer
Institut AIFB, Universität Karlsruhe (TH)

http://www.aifb.uni-karlsruhe.de/

with contributions from Philipp Cimiano, Peter Haase, Pascal Hitzler, Markus Krötzsch, Hans-Peter Schnurr, York Sure, Denny Vrandecic & Semantic Karlsruhe