The Semantic Web from 30.000ft
Frank van Harmelen
Creative Commons License: allowed to share & remix, but must attribute & non-commercial
http://www.youtube.com/watch?v=tBSdYi4EY3s
The Semantic Web = a big engineering effort +
a set of information structuring principles
People
Web
Machines
HOW?
The Current Web of text and pictures
a web page in English
about Frank
And this page is about
LarKC
and another web page
about Frank
And this page is about
Stefano
This page is about the Vrije Universiteit
linked web-pages, written by people, written for people, used only by people...
Many of these pages already come from data that is usable by computers! But we can’t link the data...
The Future Web of Data
linked data, usable by computers! useful for people!
P1. Give all things a name
P2. Relations form a graph between things
P3. The names are addresses on the Web
x T
[<x> IsOfType <T>]
different owners & locations
<village>
P1+P2+P3 = Giant Global Graph
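The three principles above can be sketched in a few lines of Python. This is only an illustration, not how any particular triple store works: the `http://example.org/` namespace and the relation names `worksOn` and `worksAt` are assumptions made up here; only `IsOfType` and the names Frank, Stefano, LarKC and Vrije Universiteit come from the slides.

```python
from collections import defaultdict

# Hypothetical namespace (assumption): any Web address the publisher controls would do.
EX = "http://example.org/"

# P1 + P3: every thing and every relation is named by a Web address.
# P2: each fact is a (subject, relation, object) triple, i.e. a labelled edge.
# worksOn and worksAt are invented relation names for this sketch.
triples = {
    (EX + "Frank", EX + "worksAt", EX + "VrijeUniversiteit"),
    (EX + "Frank", EX + "worksOn", EX + "LarKC"),
    (EX + "Stefano", EX + "worksOn", EX + "LarKC"),
    (EX + "Frank", EX + "IsOfType", EX + "Person"),
}


def build_graph(facts):
    """Index the triples by subject so outgoing edges can be followed."""
    graph = defaultdict(list)
    for subject, relation, obj in facts:
        graph[subject].append((relation, obj))
    return graph


def subjects_with(facts, relation, obj):
    """Return all subjects that have the given relation to the given object."""
    return sorted(s for s, p, o in facts if p == relation and o == obj)


graph = build_graph(triples)
colleagues = subjects_with(triples, EX + "worksOn", EX + "LarKC")
```

Because the names are global addresses, triples published by different owners at different locations can be merged into one set without renaming anything, which is what makes the graph "giant" and "global".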
P4. explicit & formal semantics
• assign types to things
• assign types to relations
• organise types in a hierarchy
• impose constraints on possible interpretations
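A minimal sketch of what typing things, typing relations and organising types in a hierarchy could look like. The class names `Settlement` and `Place`, the relation `locatedIn`, and the individuals `x` and `y` are hypothetical; only "village" appears on the slides. Real Semantic Web languages express this far more richly.

```python
# All class and relation names below are assumptions made up for illustration.
SUBCLASS_OF = {"Village": "Settlement", "Settlement": "Place"}  # type hierarchy
DOMAIN = {"locatedIn": "Place"}  # type of the subject of a relation
RANGE = {"locatedIn": "Place"}   # type of the object of a relation


def superclasses(cls):
    """Walk up the hierarchy and collect every ancestor of cls."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def infer_types(type_facts, relation_facts):
    """Derive all (thing, type) pairs implied by the declarations above."""
    types = set(type_facts)
    # Typed relations: using a relation tells us the types of both ends.
    for subject, relation, obj in relation_facts:
        if relation in DOMAIN:
            types.add((subject, DOMAIN[relation]))
        if relation in RANGE:
            types.add((obj, RANGE[relation]))
    # Hierarchy: a member of a type is also a member of every supertype.
    for thing, cls in list(types):
        for ancestor in superclasses(cls):
            types.add((thing, ancestor))
    return types


inferred = infer_types({("x", "Village")}, {("x", "locatedIn", "y")})
```

Here a single stated type and a single stated relation are enough for a machine to conclude that `x` is also a settlement and a place, and that `y` is a place.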
What’s it like to be a computer on the web?
Examples of “semantics”
Semantics = predictable inference
Frank married-to Lynda
• Frank is male
• married-to relates males to females
• married-to relates 1 male to 1 female
• Lynda = Hazel
lower bound, upper bound
Frank married-to Hazel
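The marriage example can be replayed as a tiny inference routine. This is a hand-written sketch under stated assumptions, not a real reasoner: it assumes the two stated facts are "Frank married-to Lynda" and "Frank married-to Hazel" (the second is read from a garbled diagram label), and it hard-codes the two rules about `married-to` instead of reading them from an ontology.

```python
# Assumed input facts; the second triple is a reading of the slide's diagram.
facts = {
    ("Frank", "married-to", "Lynda"),
    ("Frank", "married-to", "Hazel"),
}
types = {("Frank", "Male")}


def infer(facts, types):
    """Apply the two married-to rules and return (types, equalities)."""
    types = set(types)
    same = set()

    # Rule 1: married-to relates males to females.
    for subject, relation, obj in facts:
        if relation == "married-to":
            types.add((subject, "Male"))
            types.add((obj, "Female"))

    # Rule 2: married-to relates 1 male to 1 female, so two spouses
    # recorded for the same subject must denote the same individual.
    first_spouse = {}
    for subject, relation, obj in sorted(facts):
        if relation != "married-to":
            continue
        if subject in first_spouse and first_spouse[subject] != obj:
            same.add(frozenset((first_spouse[subject], obj)))
        else:
            first_spouse.setdefault(subject, obj)

    return types, same


inferred_types, inferred_same = infer(facts, types)
```

The point of "predictable inference" is that any agent applying the same rules to the same facts reaches the same conclusions: both Lynda and Hazel are female, and Lynda = Hazel.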
Did you get anywhere? (1/2)
already many billions of facts & rules
Encyclopedia
Geographic names (millions)
names of artists & art works (10.000’s)
scientific bibliographies
hierarchical dictionaries (UK, FR, NL)
life-science databases
any CD ever recorded (almost)
May ‘09 estimate > 4.2 billion triples + 140 million interlinks
basic facts on every country on the planet
common sense rules & facts (100.000’s)
It gets bigger every month
25 billion facts & relations…
Real life examples
• handcrafted
– music: CDnow (2410/5), MusicMoz (1073/7)
– biomedical: SNOMED (200k), GO (15k), Emtree (45k+190k), Systems biology
• ranging from lightweight (Yahoo, UNSPSC, Open directory (400k)) to heavyweight (Cyc (300k))
• ranging from small (METAR) to large (UNSPSC)
Did you get anywhere? (2/2)
Did you get anywhere? (3/2)
used by
• media: BBC, Reuters, New York Times
• governments
• retail (10K companies): BestBuy, Sears, Kmart, Volkswagen, Renault
• IT: IBM, Oracle
• Search: Bing, Yahoo, Google (before May 2012, after May 2012)
Any lessons?
heterogeneity is unavoidable
Much heterogeneity is solved socially
knowledge obeys a long-tail distribution
Types
Semantic Web from 30.000ft ?
It’s not yet very well understood
It’s a surprisingly successful engineering effort
It’s a handful of principles