evolving the web into a giant global database

34
Evolving the Web into a Giant Global Database Marko A. Rodriguez T-5, Center for Nonlinear Studies Los Alamos National Laboratory http://markorodriguez.com February 12, 2009

Upload: marko-rodriguez

Post on 10-May-2015

1.836 views

Category:

Technology


1 download

DESCRIPTION

The Web as we know it today will not be the Web as we know it tomorrow. The Web of today is oriented towards the universal accessibility of files (e.g. web pages, images). The Web of today can be thought of as a large-scale, distributed file system. The Web of tomorrow will encode any datum (e.g. strings, integers, dates). The Web of tomorrow can be thought of as a large-scale, distributed database. This talk will discuss the the future Web with special focus on the supporting standards and application visions.

TRANSCRIPT

Page 1: Evolving the Web into a Giant Global Database

Evolving the Web into a Giant Global Database

Marko A. RodriguezT-5, Center for Nonlinear StudiesLos Alamos National Laboratory

http://markorodriguez.com

February 12, 2009

Page 2: Evolving the Web into a Giant Global Database

Abstract

The Web as we know it today will not be the Web as we know ittomorrow. The Web of today is oriented towards the universal accessibilityof files (e.g. web pages, images). The Web of today can be thought of asa large-scale, distributed file system. The Web of tomorrow will encodeany datum (e.g. strings, integers, dates). The Web of tomorrow can bethought of as a large-scale, distributed database. This talk will discuss thethe future Web with special focus on the supporting standards andapplication visions.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 3: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 4: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 5: Evolving the Web into a Giant Global Database

Internet Address Spaces

• The Uniform Resource Identifier (URI) is the superclass of the UniformResource Locator (URL) and Uniform Resource Name (URN).

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 6: Evolving the Web into a Giant Global Database

The Uniform Resource Locator

• The set of all URLs is the address space of all resources that can belocated and retrieved on the Web. URLs denote where a resource is.

? http://markorodriguez.com/index.html∗ Domain name server (DNS): markorodriguez.com→ 216.251.43.6∗ http:// means GET at port 80,∗ /index.html means the resource to get at that Internet location.

markorodriguez.com216.251.43.6

Web Server

index.html

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 7: Evolving the Web into a Giant Global Database

The Uniform Resource Name

• The set of all URNs is the address space of all resources within the urn:namespace.

? urn:uuid:bd93def0-8026-11dd-842be54955baa12? urn:issn:0892-3310? urn:doi:10.1016/j.knosys.2008.03.030

• Named resources need not be retrievable through the Web.

• URNs denote what a resource is.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 8: Evolving the Web into a Giant Global Database

The Uniform Resource Identifier

• The URI address space is an infinite space for all Internet resources.

? http://markorodriguez.com/index.html? urn:issn:0892-3310? ftp://markorodriguez.com/private/markos_secrets.txt? http://www.lanl.gov#fluffy

• Imporant: URIs can denote concepts, instances, and datum.

lanl:fluffy lanl:fluffy_legs

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 9: Evolving the Web into a Giant Global Database

The “Uniform Resource Graph”

• We can denote where something is, what something is, but how do wedenote how something relates to something else?

• How can we denote what something means, where meaning is determinedby its place within a larger relational structure?

? URIs are like words. They denote things in the real or imaginary world.? Linking URIs is like defining words. Similar to how a dictionary defines

words in terms of other words.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 10: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 11: Evolving the Web into a Giant Global Database

Undirected Single-Relational Network

Human-B

Human-C

Human-D

Human-E

Human-F

Human-A

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 12: Evolving the Web into a Giant Global Database

Directed Single-Relational Network

Article-B

Article-C

Article-D

Article-E

Article-F

Article-A

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 13: Evolving the Web into a Giant Global Database

From the World Wide Web to the Semantic Web

• The World Wide Web is primarily concerned with the Hyper-TextTransfer Protocol (HTTP) and with retrievable resources in the URLaddress space.

• These retrievable resources are files: HTML documents, images, audio,etc. The “web” is created when HTML documents contain URLs.

index.html

Home.html Research.htmlResume.html hrefhref

href

http://markorodriguez.com/

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 14: Evolving the Web into a Giant Global Database

Directed Multi-Relational Network

Article-A

Journal-A

Publisher-A

Article-B

Human-B

Human-A

authored

authored

authoredcontainedIn

editorOf

publishedBy

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 15: Evolving the Web into a Giant Global Database

From the World Wide Web to the Semantic Web

• The Semantic Web is primarily concerned with URIs. If the WorldWide Web is the web of files, the Semantic Web is the web of concepts.In other words, for the World Wide Web, the level of granularity isthe retrievable file. For the Semantic Web, it is the ideas in that file.Moreover, these ideas are not necessarily contained in a file. Thereexistence is predicated on their URI. Their meaning is predicated on theirrelationship to other URIs. The web of URIs is the Semantic Web.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 16: Evolving the Web into a Giant Global Database

The Resource Description Framework

• The Resource Description Framework (RDF) is the standard forrepresenting the relationship between URIs and literals (e.g. float, string,date time, etc.). I would have preferred the name “Uniform ResourceGraph” (URG).

• Relationships are directed, labeled links between URIs. A subject URIpoints to an object URI or literal by means of a predicate URI.

lanl:marko lanl:jhwfoaf:knows

foaf:name

"Marko A. Rodrigez"^^xsd:string

foaf:name

"Jennifer H. Watkins"^^xsd:string

subject objectpredicate

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 17: Evolving the Web into a Giant Global Database

lanl:marko lanl:jhwfoaf:knows

foaf:name

"Marko A. Rodrigez"^^xsd:string

foaf:name

"Jennifer H. Watkins"^^xsd:string

foaf:member

lanl:lanl

foaf:member

foaf:name

"Los Alamos National Laboratory"^^xsd:string

unm:unm

foaf:member

foaf:name

"University of New Mexico"^^xsd:string

urn:doi:10.1016/j.joi.2008.04.002

foaf:publicationsrdf:type

foaf:Person

rdf:type

foaf:Document

rdf:type

foaf:Organization

rdf:type rdf:type

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 18: Evolving the Web into a Giant Global Database

The RDF Data Model and its Serializations

• RDF is a data model. As such, there exists many serializations(encodings) of that model.

• RDF/XML is not RDF. It is a serialization of RDF. It is smart to, at allcosts, avoid learning RDF/XML as it is an unintuitive standard. Otherserializations include: N-TRIPLE, N3, TRIX, TRIG, ...

<http://www.lanl.gov/uc33c7c98> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.mesur.org/schemas/2007-01/mesur#Journal> .<http://www.lanl.gov/uc33c7c98> <http://xmlns.com/foaf/0.1/name> "Journal of Neuroscience Research"^^<http://www.w3.org/2001/XMLSchema#string> .<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasDoi> "urn:doi:10.1002/(issn)1097-4547"^^<http://www.w3.org/2001/XMLSchema#anyURI> .<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasIssn> "urn:issn:0360-4012"^^<http://www.w3.org/2001/XMLSchema#anyURI> .<http://www.lanl.gov/uc33c7c98> <http://www.mesur.org/schemas/2007-01/mesur#hasIssn> "urn:issn:1097-4547"^^<http://www.w3.org/2001/XMLSchema#anyURI> .

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 19: Evolving the Web into a Giant Global Database

The Semantic Web is a Distributed Database

• The URI address space is distributed.

• URIs can denote datum.

• RDF denotes the relationships URIs.

• The Semantic Web’s foundational standard is RDF.

• Therefore, the Semantic Web is a distributed database.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 20: Evolving the Web into a Giant Global Database

The World Wide Web vs. the Semantic Web

Web Server

127.0.0.1

HTML

Web Server

127.0.0.2

HTMLhref

Web Server

127.0.0.1

RDF

Triple Store

127.0.0.2

foaf:knows

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 21: Evolving the Web into a Giant Global Database

Linked Data Cloud1

SWConference

Corpus

DBpedia RDF Book Mashup

DBLPBerlin

Revyu

Project Guten-berg

FOAFprofiles

Geo-names

Music-brainz

Magna-tuneJamendo

World Fact-book

DBLPHannover

SIOCprofiles

Sem-Web-

Central

Euro-stat

ECS South-ampton

BBCLater +TOTP

Doap-space

Open-Guides

Gov-Track

US Census Data

W3CWordNet

flickrwrapprWiki-

company

OpenCyc

lingvoj

Onto-world

BBCJohnPeel

Flickrexporter

Audio-Scrobbler QDOS

updated

RKB Explorer

NEW!riese

NEW!

1provided by Richard Cyganiak ([email protected])

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 22: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 23: Evolving the Web into a Giant Global Database

Relational Databases vs. Triple Stores: Technology

• A relational databases’ (e.g. MySQL, PostgreSQL, Oracle) naturalrepresentation is a collection interlinked tables.

• A triple stores’ (e.g. OpenSesame, AllegroGraph, Neo4j) naturalrepresentation is a multi-relational network, or graph.

Triple Store

127.0.0.2

Relational Database

127.0.0.1

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 24: Evolving the Web into a Giant Global Database

Relational Databases vs. Triple Stores: Culture

• Relational databases tend to not maintain public access points.

• Relational database users tend to not publish their schemas.

• Triple stores maintain public access points called SPARQL end-points.

• Triple store users tend to reuse and extend public schemas calledontologies.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 25: Evolving the Web into a Giant Global Database

SQL vs. SPARQL

SELECT ?x WHERE {?y foaf:name "Los Alamos National Laboratory"^^xsd:string .?y foaf:member ?x .?x foaf:knows ?z .?z foaf:name "Marko A. Rodriguez"^^xsd:string }

SELECT p1.idFROM person p1, organization o1 AS r1, person p2 WHEREo1.name="Los Alamos National Laboratory" ANDo1.id = p1.member ANDp1.id = r1.id ANDr1.knows=p2.id ANDp2.name="Marko A. Rodriguez";

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 26: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 27: Evolving the Web into a Giant Global Database

lanl:marko lanl:jhwfoaf:knows

foaf:name

"Marko A. Rodrigez"^^xsd:string

foaf:name

"Jennifer H. Watkins"^^xsd:string

foaf:member

lanl:lanl

foaf:member

foaf:name

"Los Alamos National Laboratory"^^xsd:string

unm:unm

foaf:member

foaf:name

"University of New Mexico"^^xsd:string

urn:doi:10.1016/j.joi.2008.04.002

foaf:publicationsrdf:type

foaf:Person

rdf:type

foaf:Document

rdf:type

foaf:Organization

rdf:type rdf:type

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 28: Evolving the Web into a Giant Global Database

Ontologies

• An ontology defines your domain of discourse.

• An ontology helps to define the types of abstract classes that exist inyour domain and the types of relationships that exist between instancesof those classes.

• An ontology allows you to infer implicit knowledge from explicitknowledge.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 29: Evolving the Web into a Giant Global Database

lanl:marko lanl:fluffylanl:hasOwner

foaf:Person

rdf:type

lanl:Dog

owl:Restriction

lanl:jhw

lanl:hasOwner

lanl:hasOwner owl:onProperty

owl:maxCardinality

"1"^^xsd:int

rdfs:subClassOf

_:12345

rdf:type

owl:differentFrom

rdf:type

lanl:Mammal

rdf:type

rdf:subClassOf

rdf:typerdf:type

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 30: Evolving the Web into a Giant Global Database

Outline

• The Space of Uniform Resource Identifiers

• The World Wide Web vs. the Semantic Web

• Relational Databases vs. Triple Stores

• Ontologies and Reasoning

• General-Purpose Computing on the Semantic Web

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 31: Evolving the Web into a Giant Global Database

General-Purpose Computing on the Semantic Web

• It is possible to represent a virtual computing machine and software inthe Semantic Web.

• Given that the URI address space is distributed, the computing structuresare inherently distributed.

• Thus, the Semantic Web can be used as a giant computer – data,programs, and virtual machines all within the same address space.

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 32: Evolving the Web into a Giant Global Database

x = (3 ∗ 2) + 1

urn:uuid:361604bc neno:hasSymbol

"x"^^xsd:string

neno:nextInst

"1"^^xsd:intlanl:Push urn:uuid:3cff4d2e neno:hasValue

"2"^^xsd:inturn:uuid:403d632c neno:hasValue

"3"^^xsd:inturn:uuid:47fe91e2 neno:hasValue

neno:nextInst

urn:uuid:7c08528e

neno:nextInst

urn:uuid:7c08528e

neno:nextInst

neno:nextInst

lanl:Push

lanl:Multiply

rdf:type

rdf:type

lanl:Push rdf:type

lanl:Add

lanl:Set

rdf:type

rdf:type

rdf:type

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 33: Evolving the Web into a Giant Global Database

halt

Fhat

Instruction

programLocation

Frame

hasFrame

[0..*]

[0..1]

returnTop

ReturnStack

Instruction

rdf:firstrdf:rest

[0..1][0..1]

blockTop

[0..*]

FrameVariable

rdf:li

hasValue

rdfs:Resource

operandTop

OperandStack

rdfs:Resource

rdf:firstrdf:rest

[0..1]

[0..1]

[0..1]

RVM

[0..*]

hasSymbol

xsd:string

[1]

xsd:boolean[1]

forFrame[1]

fromBlock

Block

[1]

currentFrame

[0..1]

methodReuse

xsd:boolean[1]

[0..1]

BlockStack

Block

rdf:firstrdf:rest

[0..1]

[0..1]

[0..1]

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009

Page 34: Evolving the Web into a Giant Global Database

Conclusion

• Thank you for your time...

? My homepage: http://markorodriguez.com? Neno/Fhat: http://neno.lanl.gov

New Mexico Internet Professionals Association Lecture (NMIPA) – Santa Fe, New Mexico – February 12, 2009