Linking Open Data with Drupal
Emmanuel Jamin
Drupal.cat October 4th, 2012 Citilab, Cornellá
Who am I?
Emmanuel Jamin
– PhD
• At Paris XI university (LIMSI-CNRS, Orsay)
– Research and development (EU projects)
• At Edelweiss (INRIA, Sophia Antipolis)
• At the Knowledge Lab (ATOS, Barcelona)
– Now
• Semantic Web consultant in Barcelona
• www.OpenData-consulting.com
• @openDataC
Plan
• Introduction to Open Data
• Introduction to the Semantic Web
• From Open Data to Linked Data
A – OD - Definition
“Open data is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.”
http://OpenDefinition.org
A – OD - Principles
Availability and Access
Reuse and Redistribution
Universal Participation
A – OD – A short history
• 1957-1958: 1st concept
“open access to scientific data”
• 2001: 1st definition
“the web of data” (Tim Berners-Lee)
• 2004-05: 1st foundation
Open Knowledge Foundation (http://okfn.org/)
• 2009-05: 1st Open Government platform in the US
http://data.gov
• 2012-09: 1st Open Knowledge Festival
http://okfestival.org
Image by Peter Ito (2009): http://www.flickr.com/photos/peterito/3054501076/lightbox/
A – OD - Platforms
Open Government
Open Science
Open Cities
Open Education
Open Health
Open Culture
…
Transparency
Participation
Collaboration
A – OD – Status of OD
Topics
From: http://okfn.org/opendata/
A – OD – Status of OD
Types of Data
Documents
Raw data
Structured data
Linked data
Geo data
Database
A – OD – Status of OD
Heterogeneous standards (Open Standard)
PDF - DOC
CSV
XML
RDF
JSON
TXT
ODT
XSL
ZIP
KML-KMZ
A – OD – Comparison
• datos.gob.es / gencat.cat / barcelona.cat
Barcelona
– Website: http://w20.bcn.cat/opendata/
– Topics: Economy, Cartography, Population, Environment, Administration
– Formats: CSV, PDF, XLS, XML, RDF, TXT, ZIP
Catalunya
– Website: http://www20.gencat.cat/portal/site/dadesobertes/
– Topics: Cartography and maps, Facilities, Statistics, Meteorology, Nomenclators, Health, Public transport, Tourism
– Formats: TMX, ZIP, PDF, CSV, KML-KMZ, DOC, XLS, XML, JSON, RDF, SHP, SPARQL
España
– Website: http://datos.gob.es/datos/
– Topics: Public sector, Culture and hobbies, Science and technologies, Environment, Education, Transport
– Formats: XHTML, HTML, PDF, XLS, XML, ZIP
A – OD – Why open up data?
Why open up the data?
A – OD – Why open up data?
Faceted search and browsing
Data integration
– to compare easily
http://civio.es
A – OD – Why open up data?
http://manybills.researchlabs.ibm.com/
A – OD – Why open up data?
Graphic representation of datasets
– to visualize them easily
Data reuse and combination
Data integration
– to compare easily
Faceted search and browsing
– to contextualize information easily
Data visualization
Data contextualization
Data mapping
Statistics
Big Data analysis
A – OD – Why open up data?
Open Data
Reuse it …
Mix it …
Analyze it …
Visualize it …
for a better comprehension!
OD – The big challenge
The OD movement has:
The energy
The Open Mind philosophy
The public resources
Etc.
But something is missing ...
From: http://www.nathan.com/thoughts/unified/3.html
OD – The big challenge
Opening the data is great!
But it is not enough …
Linking Open Data!
Semantic Web
From: http://salesenablement.wordpress.com/2010/09/07/the-importance-of-context/
B – SW - Principles
Do not read the next slide!
B – SW - Principles
You lose!
B – SW - Principles
Humans identify and interpret information
Machines don't
B – Towards the structured web
Separate the content and the form
XML and metadata
B –
Arbitrary metadata
<book/>
|
<chapter/>
|
<paragraph/>
XML and the metadata
Towards the Structured Web
B –
Arbitrary metadata
<hbskm/>
|
<rzañokt/>
|
<kmcsuhdd/>
What do machines really understand?
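A minimal Python sketch of the point made by the two trees above: a parser sees exactly the same structure whether the tag names are readable or not. The tag names are copied from the slides; the helper `shape` is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# The two documents from the slides: same nesting, different tag names.
readable = "<book><chapter><paragraph/></chapter></book>"
opaque = "<hbskm><rzañokt><kmcsuhdd/></rzañokt></hbskm>"


def shape(element):
    """Return the nesting structure of an element, ignoring tag names."""
    return [shape(child) for child in element]


readable_root = ET.fromstring(readable)
opaque_root = ET.fromstring(opaque)

# Structurally the two trees are identical ...
same_shape = shape(readable_root) == shape(opaque_root)

# ... and to the parser the tag names are just strings without meaning.
tags_readable = [el.tag for el in readable_root.iter()]
tags_opaque = [el.tag for el in opaque_root.iter()]

print(same_shape, tags_readable, tags_opaque)
```

Only a human reader attaches meaning to "book" or "chapter"; for the machine both documents are equally arbitrary.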
B –
What is the last document you read?
B –
Document
{ book, newspaper, paper, post-card … }
Document?
B –
The answer is based on shared knowledge:
a Shared Ontology
We can understand
You can reason
B –
Document
Book
Roman / Novel
B –
“An ontology is a specification of a conceptualization”
(i.e. the logical description of the concepts and relationships that can exist for an agent or a community of agents).
Tom Gruber (1993)
B –
Towards the Semantic Web
B – SW - Definition
The Semantic Web is
"a web of data that can be processed directly and indirectly by machines."
Tim Berners-Lee (2001)
B –
The W3C standardization / stack
From: http://mmt.me.uk/slides/london011209/#(2)
B – SW – Resources
Everything is a resource
– Person: Berners-Lee
– Organisation: W3C
– Document: paper.html
– Event: SW conference 2012
– … etc.
Each resource is identified with a URI
– www.w3c.org/people/timbl.html#this → Berners-Lee
– www.w3c.org/index.html#this → W3C
– www.w3c.org/papers/paper.html#this → paper.html
– www.w3c.org/events/swcon12.html#this → SW con'12
B – SW – Resources
Each resource is identified with a unique reference.
Namespace to simplify URI
Namespace: www.w3c.org/people/timbl.html#
Prefix: tbl: → www.w3c.org/people/timbl.html#
CURIE: tbl:this
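The prefix mechanism can be sketched in a few lines of Python. This is only an illustration: the prefix `tbl` and its namespace come from the slide, while the function name `expand_curie` and the `PREFIXES` table are assumptions made for the example.

```python
# Prefix table: each prefix stands for a namespace URI.
# "tbl" and its namespace are the illustrative values from the slide.
PREFIXES = {
    "tbl": "www.w3c.org/people/timbl.html#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a CURIE such as 'tbl:this' into the full URI it abbreviates."""
    prefix, _, reference = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + reference


full_uri = expand_curie("tbl:this")
print(full_uri)  # www.w3c.org/people/timbl.html#this
```

The CURIE is only a shorthand: two parties that share the prefix table always recover the same full URI.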
B – SW – Resources
Namespace to simplify URI
w3c:timbl → foaf:Person
w3c:this → foaf:Organization
dblp:this → foaf:Document
event:this → foaf:Event
B – SW – Resources
CURIE to simplify the URI
B – SW – Triples
RDF
(Subject, predicate, object)
B – SW – Triples
RDF triples
• web.html has author Tim Berners-Lee
• LinkedData.html has author Hausenblas
• W3C has employee Tim Berners-Lee
• web.html is published at SW conference
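As a rough sketch, the four statements above can be held as (subject, predicate, object) tuples and queried with a wildcard pattern. Plain strings stand in for URIs here, and the helper name `match` is an assumption of this example, not part of RDF.

```python
# Each triple is a (subject, predicate, object) tuple, as on the slide.
triples = [
    ("web.html", "has author", "Tim Berners-Lee"),
    ("LinkedData.html", "has author", "Hausenblas"),
    ("W3C", "has employee", "Tim Berners-Lee"),
    ("web.html", "is published at", "SW conference"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Everything stated about web.html
about_web = match(triples, subject="web.html")

# Who authored what?
authorship = match(triples, predicate="has author")

print(about_web)
print(authorship)
```

Because every statement has the same three-part shape, one generic function can answer questions about any subject, predicate or object.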
B – SW – Ontologies
RDF-S → RDF-Schema
Definition of the
• Classes (concepts)
• and Properties (conceptual relations)
Hierarchical organisation through conceptual relations
B – SW – Ontologies
– Book is sub-type of Document
– Novel is sub-type of Book
– Roman is sub-type of Book
RDFS
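A small Python sketch of what such a hierarchy allows a machine to deduce. It assumes each concept has at most one direct super-type, which is true of the slide's example but not required by RDFS; the names `SUB_TYPE_OF`, `ancestors` and `is_a` are hypothetical.

```python
# Direct "is sub-type of" statements from the slide (child -> parent).
# Assumption: one parent per concept; RDFS itself allows several.
SUB_TYPE_OF = {
    "Book": "Document",
    "Novel": "Book",
    "Roman": "Book",
}


def ancestors(concept, sub_type_of=SUB_TYPE_OF):
    """Return every type above `concept`, nearest first."""
    chain = []
    while concept in sub_type_of:
        concept = sub_type_of[concept]
        chain.append(concept)
    return chain


def is_a(concept, candidate):
    """True if `concept` is `candidate` or one of its sub-types."""
    return concept == candidate or candidate in ancestors(concept)


print(ancestors("Novel"))         # ['Book', 'Document']
print(is_a("Roman", "Document"))  # True
```

Nobody stated that a Novel is a Document; the machine derives it by following the sub-type relation.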
B – SW – RDF graph
– W3C.html has author Tim Berners-Lee
– W3C.html is type of Document
– Tim Berners-Lee is type of Person
– W3C.html is presented at Web Conference 2012
– Web Conference 2012 is type of Conference
– Conference is sub class of Event
RDF triples => Linked Data
B – SW – RDF graph
RDF triples => RDF graph
[Figure: RDF graph connecting web.html, SW conference, Tim Berners-Lee and W3C to the classes Document, Person, Organisation, Conference and Event]
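How a list of triples becomes a graph can be sketched as follows, reusing the six statements of the previous slide. The predicate strings and the helper names `build_graph` and `types_of` are assumptions made for this illustration.

```python
from collections import defaultdict

# The six statements from the previous slide, as plain-string triples.
TRIPLES = [
    ("W3C.html", "has author", "Tim Berners-Lee"),
    ("W3C.html", "type", "Document"),
    ("Tim Berners-Lee", "type", "Person"),
    ("W3C.html", "presented at", "Web Conference 2012"),
    ("Web Conference 2012", "type", "Conference"),
    ("Conference", "sub class of", "Event"),
]


def build_graph(triples):
    """Index triples as node -> list of (predicate, neighbour) edges."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return graph


def types_of(graph, node):
    """Collect the declared types of a node plus all their super-classes."""
    found = []
    pending = [o for p, o in graph[node] if p == "type"]
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.append(cls)
            pending.extend(o for p, o in graph[cls] if p == "sub class of")
    return found


graph = build_graph(TRIPLES)
print(graph["W3C.html"])
print(types_of(graph, "Web Conference 2012"))  # ['Conference', 'Event']
```

Subjects and objects are the nodes, predicates are the labelled edges; following the edges yields facts such as "Web Conference 2012 is an Event" that no single triple states.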
B – SW – Federated Dataset
All resources are connected over the Web
Federated dataset
[Figure: resources tim:this, w3c:this, ivan:this, doc1:this, doc2:this and doc3:this linked across LOD site 1 and LOD site 2]
B – SW – SPARQL
Search and retrieve information from the graph with SPARQL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Documents together with the name of the person who made them
SELECT ?document ?authorName
WHERE {
  ?person rdf:type foaf:Person .
  ?person foaf:name ?authorName .
  ?person foaf:made ?document .
}
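To show what such a query does, here is a toy Python evaluation of the same three-pattern WHERE clause over a handful of in-memory triples. It is not a SPARQL engine: the data, the `ex:` identifiers and the function names are assumptions for this sketch, and the person (not the name) is taken as the subject of foaf:made.

```python
RDF_TYPE = "rdf:type"

# Hypothetical data, written as CURIE-like strings for readability.
DATA = [
    ("ex:timbl", RDF_TYPE, "foaf:Person"),
    ("ex:timbl", "foaf:name", "Tim Berners-Lee"),
    ("ex:timbl", "foaf:made", "ex:web.html"),
    ("ex:w3c", RDF_TYPE, "foaf:Organization"),
    ("ex:w3c", "foaf:name", "W3C"),
]


def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def select(patterns, data):
    """Join all patterns: keep only bindings that satisfy every one."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in data:
                new = match_pattern(pattern, triple, binding)
                if new is not None:
                    extended.append(new)
        bindings = extended
    return bindings


rows = select(
    [
        ("?person", RDF_TYPE, "foaf:Person"),
        ("?person", "foaf:name", "?authorName"),
        ("?person", "foaf:made", "?document"),
    ],
    DATA,
)
result = [(row["?document"], row["?authorName"]) for row in rows]
print(result)  # [('ex:web.html', 'Tim Berners-Lee')]
```

The shared variable ?person is what joins the three patterns; W3C has a name but is not a foaf:Person, so it drops out of the result.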
B – SW – Giant Global Graph
• The web becomes one giant database
B – SW
Is this fiction?
B – SW
Google Rich Snippets
From: http://openspring.net/blog/2011/09/30/schemaorg-rich-snippets-drupal-7-rdfa
B – SW
Open Graph
B – SW
Google Knowledge Graph
C – OD + LD
From Open Data to Linked Data
– Open Data
– Structured Data: CSV, XML, JSON
– Linked Data: RDF, RDFS
C – OD + LD
From PDF to RDF
1. Document engineering
• Content extraction
• Content format
• Multimedia extraction
2. Knowledge engineering
• Term extraction (indexation)
• Recognition of Named Entities
• Ontology engineering
• Conceptual recognition and mapping
C – OD + LD
Synthesis of data formats (table)
C – Getting to LOD
Linking Open Data
1. Data formalization
• Create or reuse ontologies (RDF, RDFS, OWL)
2. Data annotation
• Associate semantic metadata (RDF, RDFa, Microdata)
3. Data publication
• Publish your semantic data (RDFa, Microdata)
4. Data consumption
• Reuse all available data (SPARQL endpoints)
To succeed with Linked Data
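Steps 1 to 3 can be sketched for a CSV source: every row becomes a resource with a URI and every column a property, serialized as N-Triples. The sample data, the `example.org` base URI and vocabulary, and the function names are all assumptions made for this illustration; a real project would reuse an existing ontology instead of an ad hoc vocabulary.

```python
import csv
import io

# Hypothetical open-data extract; columns and URIs are assumptions.
CSV_TEXT = """id,name,district
1,Library A,Gracia
2,Library B,Sants
"""
BASE = "http://example.org/facility/"
VOCAB = "http://example.org/vocab#"


def row_to_ntriples(row):
    """Turn one CSV row into N-Triples lines: one triple per non-key column."""
    subject = f"<{BASE}{row['id']}>"
    lines = []
    for column, value in row.items():
        if column == "id":
            continue
        # Escape backslashes and quotes; values are assumed to be single-line.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{VOCAB}{column}> "{escaped}" .')
    return lines


def csv_to_ntriples(text):
    """Convert a whole CSV document into a list of N-Triples lines."""
    reader = csv.DictReader(io.StringIO(text))
    return [line for row in reader for line in row_to_ntriples(row)]


ntriples = csv_to_ntriples(CSV_TEXT)
print("\n".join(ntriples))
```

Once the rows carry URIs, other datasets can link to them, which is the step that turns structured Open Data into Linked Open Data.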
C – OD + LD
From Open Data to Linked Data
Data quality
B – SW – Big Giant Graph
Open Data + Data Interconnection
Linked Open Data
25 billion RDF triples over the web
From: http://www.w3.org/DesignIssues/diagrams/lod/2010-color.png
B – SW – Big Giant Graph
Linked Open Data
http://dbpedia.org
B – SW – Big Giant Graph
The Web 3.0
is already here ...
Linking Open Data with Drupal
D – LODrupal - Drupal
Entities ↔ Resources
RDF in Drupal Core
and Semantic Web modules
LOD and Drupal
Main Semantic Web modules
RDFx
schema.org
SPARQL
SPARQL Views
Microdata
Import Linked Data
D – LODrupal – Mod1 ... RDFx
From: http://drupal.org/project/rdfx
D – LODrupal – Mod1 ... schemaorg
From: http://drupal.org/project/schemaorg
D – LODrupal – Mod1 ... SPARQL
From: http://drupal.org/project/sparql
D – LODrupal – Drupal Prototype
Demonstration
E – LODrupal Hackathon
LOD + Drupal hackathon
E – LODrupal Hackathon
Sprint 1: Saturday, 10/11/2012
A1 - Consume OD
A2 - OD Integration
Sprint 2: Saturday, 08/12/2012
B1 - Publish LOD
B2 - Build LOD applications
References
− http://okfn.org/opendata/
− http://www.slideshare.net/fabien_gandon/web-smantique-donnes-lies-et-smantique-des-schmas-2184768
− http://www.slideshare.net/scorlosquet/how-to-build-linked-data-sites-with-drupal-7-and-rdfa
− http://www20.gencat.cat/portal/site/dadesobertes/
− http://w20.bcn.cat/opendata/
− http://datos.gob.es/datos/
− http://drupal.org/project/odv
Questions?
Thanks!