lessons and requirements from a decade of deployed semantic web apps

© Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute www.deri.ie Enabling Networked Knowledge Lessons and requirements from a decade of deployed Semantic Web apps Benjamin Heitmann, Richard Cyganiak, Conor Hayes, Stefan Decker Funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2)

Upload: benjamin-heitmann

Post on 28-Jan-2015


Page 1: Lessons and requirements from a decade of deployed Semantic Web apps

© Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Lessons and requirements from a decade of deployed Semantic Web apps

Benjamin Heitmann, Richard Cyganiak, Conor Hayes, Stefan Decker

Funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2)

Page 2: Lessons and requirements from a decade of deployed Semantic Web apps

Input for this workshop

– LEDP workshop CfP calls for:
  – requirements
  – patterns
  – gaps in Linked Data standards + guidelines
– Where should this input come from?

Page 3: Lessons and requirements from a decade of deployed Semantic Web apps

The Semantic Web: a decade is a long time

(Figure labels: 2001, 2011)

Page 4: Lessons and requirements from a decade of deployed Semantic Web apps

Choice of methodology?

– Goal: patterns, requirements and gaps regarding LD
– Data: 10 years of Semantic Web research
– Which scientific approach fits? Empirical software engineering
– Full IEEE transactions journal paper: http://tinyurl.com/semweblessons

Page 5: Lessons and requirements from a decade of deployed Semantic Web apps

Overview

– Empirical survey
– Architecture: arch. pattern
– LD standards: gaps
– Software Eng. process: shortcomings
– Software engineering solutions

Page 6: Lessons and requirements from a decade of deployed Semantic Web apps

Empirical survey

– Sources: 124 apps total
  – Semantic Web Challenge (ISWC): 2003-2009, 101 apps
  – Scripting for SemWeb Challenge (ESWC): 2006-2009, 23 apps
  – includes industry & research apps
– Checklist (12 questions)
– Data collection:
  1. own analysis of paper
  2. validation by email

Page 7: Lessons and requirements from a decade of deployed Semantic Web apps

Empirical survey results

– widespread support for SemWeb-specific features
  – clear difference to database-driven apps
– big uptake of Linked Data principles and eco-system
– integration requires human intervention
– top 3 standards: RDF, OWL, SPARQL
– top 3 vocabularies: FOAF, DC, SIOC

Page 8: Lessons and requirements from a decade of deployed Semantic Web apps

Conceptual architecture

– Conceptual architecture:
  – describes major design elements of a system (+ relations)
  – domain specific (e.g. the Semantic Web)
  – provides architectural pattern
  – documents community consensus

Page 9: Lessons and requirements from a decade of deployed Semantic Web apps

Components of conceptual architecture

(Figure: component diagram. Labels: RDF data handling; starting point; Data integration; User interface; decouple + specialise)

– Graph access layer (100%)
– RDF store (88%)
– Graph-based navigation interface (91%)
– Graph query language service (77%)
– Data homogenisation service (74%)
– Data discovery service (30%)
– Structured data authoring interface (29%)
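To make the decoupling of components concrete, here is a minimal sketch of how a graph access layer, an RDF store and a data homogenisation service might fit together. It is an illustration only, not the architecture of any surveyed app: the class names, the in-memory set of triples and the `http://example.org/` URIs are all assumptions made for this example.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class RDFStore:
    """Persistence component: here just an in-memory set of triples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()


class GraphAccessLayer:
    """Single entry point through which other components touch the graph."""

    def __init__(self, store: RDFStore) -> None:
        self._store = store

    def add(self, s: str, p: str, o: str) -> None:
        self._store.triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples that fit the pattern; None acts as a wildcard."""
        for triple in self._store.triples:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


class HomogenisationService:
    """Rewrites source-specific predicates to one agreed vocabulary."""

    def __init__(self, predicate_map: Dict[str, str]) -> None:
        self._map = predicate_map

    def normalise(self, triple: Triple) -> Triple:
        s, p, o = triple
        return (s, self._map.get(p, p), o)


FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
# Hypothetical source vocabulary that has to be mapped onto FOAF.
LOCAL_NAME = "http://example.org/vocab#fullName"

graph = GraphAccessLayer(RDFStore())
homogeniser = HomogenisationService({LOCAL_NAME: FOAF_NAME})

incoming = [
    ("http://example.org/alice", LOCAL_NAME, "Alice"),
    ("http://example.org/bob", FOAF_NAME, "Bob"),
]
for triple in incoming:
    graph.add(*homogeniser.normalise(triple))

names = sorted(o for _, _, o in graph.match(p=FOAF_NAME))
print(names)  # ['Alice', 'Bob']
```

Because every component goes through `GraphAccessLayer`, the store could be swapped for a real RDF store without changing the homogenisation or query code.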

Page 10: Lessons and requirements from a decade of deployed Semantic Web apps

LD gaps: publishing/consuming

– all applications consume RDF
  – 73% import API, 69% export API
  – but: incompatible implementations
  – LD principles in 2006 led to consolidation
– embedding RDF: web for humans vs. web for machines
  – 2008: introduction of RDFa
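One way import and export APIs stay compatible is to agree on a standard serialisation. The sketch below round-trips triples through a simplified subset of N-Triples (URIs and plain literals only; no blank nodes, language tags or datatypes). The function names and the `Literal` marker class are assumptions made for this example, not an existing library API.

```python
import re
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


class Literal(str):
    """Marks an object as a plain literal rather than a URI."""


_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\": "\\", '"': '"', "n": "\n", "r": "\r"}
_LINE = re.compile(
    r'^<([^>]*)> <([^>]*)> (?:<([^>]*)>|"((?:[^"\\]|\\.)*)") \.$'
)


def export_ntriples(triples: Iterable[Triple]) -> str:
    """Serialise triples, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            obj = '"' + "".join(_ESCAPES.get(ch, ch) for ch in o) + '"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines) + "\n"


def import_ntriples(text: str) -> List[Triple]:
    """Parse the subset written by export_ntriples; skip anything else."""
    triples: List[Triple] = []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if not m:
            continue  # blank line or unsupported syntax
        s, p, uri, lit = m.groups()
        if uri is not None:
            obj = uri
        else:
            obj = Literal(re.sub(
                r"\\(.)",
                lambda e: _UNESCAPES.get(e.group(1), e.group(1)),
                lit,
            ))
        triples.append((s, p, obj))
    return triples


data = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name",
     Literal('Alice "Al" Smith')),
]
restored = import_ntriples(export_ntriples(data))
print(restored == data)  # True
```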

Page 11: Lessons and requirements from a decade of deployed Semantic Web apps

LD gaps: beyond open data

– writing/changing/updating RDF data is difficult
  – 71% of apps do not support data changes
– Writing to remote RDF store:
  – draft status in 2011: SPARQL Update
– Restricting access (read/write):
  – no standards
  – no interoperability
  – closest ideas (?): R/W design note, WebID
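As an illustration of writing to a remote RDF store, the sketch below builds (but does not send) an HTTP request that carries a SPARQL Update `INSERT DATA` operation as a URL-encoded `update` parameter. SPARQL Update was a draft when these slides were written and has since been standardised as part of SPARQL 1.1. The endpoint URL and the helper name `build_insert_request` are assumptions for this example, and the literal escaping is deliberately simplistic.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_insert_request(endpoint: str, subject: str, predicate: str,
                         literal: str) -> Request:
    """Build a POST request that inserts one triple with a plain literal."""
    # Minimal escaping for a double-quoted SPARQL string literal.
    escaped = (literal.replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("\n", "\\n"))
    update = f'INSERT DATA {{ <{subject}> <{predicate}> "{escaped}" . }}'
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_insert_request(
    "http://example.org/sparql/update",  # hypothetical endpoint
    "http://example.org/alice",
    "http://xmlns.com/foaf/0.1/name",
    "Alice",
)
sent_update = parse_qs(request.data.decode("utf-8"))["update"][0]
print(sent_update)
```

Sending the request (for example with `urllib.request.urlopen`) would still need whatever authentication the store requires, which is exactly the access-restriction gap noted on this slide.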

Page 12: Lessons and requirements from a decade of deployed Semantic Web apps

Software Eng. process shortcomings (1)

– Integrating noisy RDF data:
  – 60% semi-automatic integration
  – this involves human intervention
  – only 20% use automatic heuristics
  – major part of Semantic Web specific code
– Distribution of application logic:
  – multiple components and standards
  – queries (41%), rules (52%) or formal vocabularies
  – hard to maintain
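A minimal sketch of what semi-automatic integration can look like: a string-similarity heuristic merges confident matches automatically and queues uncertain ones for a human. The thresholds, function names and sample labels are assumptions chosen for illustration; the surveyed apps used a variety of techniques.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Sequence, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def integrate(candidates: Sequence[str], known: Sequence[str],
              accept: float = 0.9, review: float = 0.6
              ) -> Tuple[Dict[str, str], List[Tuple[str, str]], List[str]]:
    """Split incoming labels into merged, needs-review and new entities."""
    merged: Dict[str, str] = {}
    needs_review: List[Tuple[str, str]] = []
    new: List[str] = []
    for label in candidates:
        best = max(known, key=lambda k: similarity(label, k))
        score = similarity(label, best)
        if score >= accept:
            merged[label] = best                 # automatic heuristic
        elif score >= review:
            needs_review.append((label, best))   # human intervention
        else:
            new.append(label)                    # treated as a new entity
    return merged, needs_review, new


known_entities = ["Alice Smith", "Bob Jones"]
incoming_labels = ["alice smith", "A. Smith", "Zoe Quark"]
merged, needs_review, new = integrate(incoming_labels, known_entities)
print(merged)
print(needs_review)
print(new)
```

With these sample labels, "alice smith" is merged automatically, "A. Smith" lands in the review queue and "Zoe Quark" is treated as new. Raising `accept` shifts more work to the reviewer; lowering it risks wrong merges.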

Page 13: Lessons and requirements from a decade of deployed Semantic Web apps

Software Eng. process shortcomings (2)

– Mismatch of data models between components
  – graph versus relational or object oriented (90%)
  – overhead in communication
  – inconsistent round-trip conversion
  – 3 way ORM needed?

(Figure labels: relational, object oriented, graph-based)
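The round-trip problem can be shown with a small sketch: a graph that holds two names and a homepage for one resource is mapped to an object with a single `name` field and back. The `Person` class, the mapping functions and the `http://example.org/` URIs are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Set, Tuple

Triple = Tuple[str, str, str]

NAME = "http://xmlns.com/foaf/0.1/name"
HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"
ALICE = "http://example.org/alice"


@dataclass
class Person:
    uri: str
    name: str  # single-valued, unlike a property in the graph


def graph_to_object(triples: Set[Triple], uri: str) -> Person:
    """Keep one name and silently ignore everything the class cannot hold."""
    names = sorted(o for s, p, o in triples if s == uri and p == NAME)
    return Person(uri=uri, name=names[0])


def object_to_graph(person: Person) -> Set[Triple]:
    """Write back only what the object model knows about."""
    return {(person.uri, NAME, person.name)}


original = {
    (ALICE, NAME, "Alice"),
    (ALICE, NAME, "Alice Smith"),
    (ALICE, HOMEPAGE, "http://example.org/~alice"),
}
round_tripped = object_to_graph(graph_to_object(original, ALICE))
lost = original - round_tripped
print(len(lost))  # 2
```

The second name and the homepage disappear because the object model has no place for them, which is the kind of inconsistency a mapping layer would have to guard against.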

Page 14: Lessons and requirements from a decade of deployed Semantic Web apps

Software Eng. solutions (1)

– More guidelines, best practices and design patterns:
  – current examples:
    – Linked Data principles and publishing guidelines
    – guidelines for naming of URIs
    – Linked Data patterns collection
  – result: more interoperability, more coherent Web of Data

Page 15: Lessons and requirements from a decade of deployed Semantic Web apps

Software Eng. solutions (2)

– More software libraries (beyond RDF storage!)
  – guidelines can be hardcoded in reusable libraries
  – good libraries can make complicated guidelines easy to use (see HTTP, SSL, SMTP and DNS lookups)
  – current examples: any23, d2r server, Semantic Web Client Library

Page 16: Lessons and requirements from a decade of deployed Semantic Web apps

Software Eng. solutions (3)

– More software factories:
  – create complete applications
  – requires patterns + libraries
  – or: “opinionated software”
  – components can be customised for domain
– Interface, homogenisation and data discovery usually made from scratch

https://developers.facebook.com/docs/beta/opengraph/tutorial/

Page 17: Lessons and requirements from a decade of deployed Semantic Web apps

Summary

– Empirical survey
– Architecture: arch. pattern
– LD standards: gaps
– Software Eng. process: shortcomings
– Software engineering solutions

Full article: http://tinyurl.com/semweblessons

Page 18: Lessons and requirements from a decade of deployed Semantic Web apps

Appendix: threats to validity

– Representativeness:
  – only complete applications were part of the challenges (not tools or libraries)
  – apps needed to use real-world data
  – submission of a paper describing the app was required
  – the challenge extends over multiple years, which allows trends to be seen
– Number of authors who verified checklist (65%):
  – academic email addresses expire quickly
  – we manually tried to find new email addresses
– No source code was used:
  – source code was not required for challenges due to e.g. IP issues

Page 19: Lessons and requirements from a decade of deployed Semantic Web apps

Table: Impl. details

Year     Programming languages      RDF libraries                     SemWeb standards               Schemas/vocabularies/ontologies
2003     Java 60%, C 20%            -                                 RDF 100%, OWL 30%              RSS 20%, FOAF 20%, DC 20%
2004     Java 56%, JS 12%           Jena 18%, Sesame 12%, Lucene 18%  RDF 87%, RDFS 37%, OWL 37%     DC 12%, SWRC 12%
2005     Java 66%                   -                                 RDF 66%, OWL 66%, RDFS 50%     -
2006     Java 10%, JS 15%, PHP 26%  RAP 15%, RDFLib 10%               RDF 89%, OWL 42%, SPARQL 15%   FOAF 26%, RSS 15%, Bibtex 10%
2007     Java 50%, PHP 25%          Sesame 33%, Jena 8%               RDF 100%, SPARQL 50%, OWL 41%  FOAF 41%, DC 20%, SIOC 20%
2008     Java 43%, PHP 21%          Sesame 17%, ARC 17%, Jena 13%     RDF 100%, SPARQL 17%, OWL 10%  FOAF 30%, DC 21%, DBpedia 13%
2009     Java 46%, JS 23%, PHP 23%  Sesame 23%                        RDF 100%, SPARQL 69%, OWL 46%  FOAF 34%, DC 15%, SKOS 15%
overall  Java 48%, PHP 19%, JS 13%  Sesame 19%, Jena 9%               RDF 96%, OWL 43%, SPARQL 41%   FOAF 27%, DC 13%, SIOC 7%

Page 20: Lessons and requirements from a decade of deployed Semantic Web apps

Tables: Data integration and other properties

Data integration         2003  2004  2005  2006  2007  2008  2009
manual                    30%   13%    0%   16%    9%    5%    4%
semi-automatic            70%   31%  100%   47%   58%   65%   61%
automatic                  0%   25%    0%   11%   13%    4%   19%
not needed                 0%   31%    0%   26%   20%   26%   16%

Property                 2003  2004  2005  2006  2007  2008  2009
Data creation             20%   37%   50%   52%   37%   52%   76%
Data import               70%   50%   83%   52%   70%   86%   73%
Data export               70%   56%   83%   68%   79%   86%   73%
Inferencing               60%   68%   83%   57%   79%   52%   42%
Decentralised sources     90%   75%  100%   57%   41%   95%   96%
Multiple owners           90%   93%  100%   89%   83%   91%   88%
Heterogeneous formats     90%   87%  100%   89%   87%   78%   88%
Data updates              90%   75%   83%   78%   45%   73%   50%
Linked Data principles     0%    0%    0%    5%   25%   26%   65%

Page 21: Lessons and requirements from a decade of deployed Semantic Web apps

Table: architectural analysis

Component                            2003  2004  2005  2006  2007  2008  2009  total
number of applications                 10    16     6    19    24    23    26    124
graph access layer                   100%  100%  100%  100%  100%  100%  100%   100%
RDF store                             80%   94%  100%   95%   92%   87%   77%    88%
graph-based navigation interface      90%  100%  100%   89%   96%   83%   88%    91%
data homogenisation service           90%   50%   83%   63%   88%   70%   80%    74%
graph query language service          80%   88%   83%   68%   88%   78%   65%    77%
structured data authoring interface   20%   38%   33%   37%   33%   26%   19%    29%
data discovery service                50%   25%   33%   16%   54%   30%   15%    30%
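The totals in the architectural analysis table can be cross-checked as an average of the yearly percentages weighted by the number of applications per year. The sketch below does that with the figures from the table; the variable and function names are assumptions for this example, and it assumes the totals were computed over all 124 apps.

```python
from typing import Dict, List

# Number of surveyed applications per year, 2003 to 2009.
apps_per_year: List[int] = [10, 16, 6, 19, 24, 23, 26]

# Yearly share (in percent) of apps implementing each component.
share_per_year: Dict[str, List[int]] = {
    "graph access layer": [100, 100, 100, 100, 100, 100, 100],
    "RDF store": [80, 94, 100, 95, 92, 87, 77],
    "graph-based navigation interface": [90, 100, 100, 89, 96, 83, 88],
    "data homogenisation service": [90, 50, 83, 63, 88, 70, 80],
    "graph query language service": [80, 88, 83, 68, 88, 78, 65],
    "structured data authoring interface": [20, 38, 33, 37, 33, 26, 19],
    "data discovery service": [50, 25, 33, 16, 54, 30, 15],
}


def weighted_share(percentages: List[int], counts: List[int]) -> float:
    """Average of yearly percentages, weighted by apps per year."""
    weighted_sum = sum(p * n for p, n in zip(percentages, counts))
    return weighted_sum / sum(counts)


overall = {
    component: round(weighted_share(percentages, apps_per_year))
    for component, percentages in share_per_year.items()
}
for component, percent in overall.items():
    print(f"{component}: {percent}%")
```

Rounded to whole percent, the weighted averages reproduce the total column (100, 88, 91, 74, 77, 29 and 30), and the yearly counts sum to 124.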