KIT – University of the State of Baden-Wuerttemberg and
National Research Center of the Helmholtz Association
Institute of Applied Informatics and Formal Description Methods (AIFB)
www.kit.edu
SPARTIQULATION Verbalizing SPARQL queries
Basil Ell, Denny Vrandečić, Elena Simperl
International Workshop on Interacting with Linked Data, Extended Semantic Web Conference 2012
28 May 2012
Institute of Applied Informatics and Formal Description Methods 2 29.05.2012
MOTIVATION
Basil Ell – Verbalizing SPARQL queries
Motivation
[Figure: build-up diagram with boxes labeled SPARQL, SPARQL and Text; cited systems: [QALD 2011], [Haase et al., 2009], [Shekarpour et al., 2011]]
APPROACH
The pipeline architecture
SPARQL -> Document Planner -> DP (document plan) -> Microplanner -> TS (text specification) -> Surface Realizer -> Text
[Reiter and Dale, 2000]
Document Planner
1. Content determination: select the information to communicate
2. Document structuring: construct messages and decide on their ordering and structure
Microplanner
3. Lexicalization: decide which words to use in order to express the content
4. Referring expression generation: decide how to refer to an entity
5. Aggregation: map to linguistic structures
Surface Realizer
6. Linguistic realization: create natural language
7. Surface realization: add structure to the text, such as HTML elements
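The three pipeline modules can be read as a chain of functions that each consume the previous module's output. The following is only a minimal sketch of that data flow: the `Artifact` class, the function names and the placeholder payload are assumptions made for illustration, not the interfaces of the SPARTIQULATION implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artifact:
    """Data handed from one pipeline stage to the next (hypothetical type)."""
    kind: str                      # "SPARQL", "DP", "TS" or "Text"
    payload: str                   # placeholder for the real data structure
    tasks: List[str] = field(default_factory=list)


def document_planner(query: Artifact) -> Artifact:
    # Tasks 1-2: decide what to say and structure it into messages.
    done = query.tasks + ["content determination", "document structuring"]
    return Artifact("DP", query.payload, done)


def microplanner(plan: Artifact) -> Artifact:
    # Tasks 3-5: choose words, referring expressions and linguistic structures.
    done = plan.tasks + ["lexicalization",
                         "referring expression generation",
                         "aggregation"]
    return Artifact("TS", plan.payload, done)


def surface_realizer(spec: Artifact) -> Artifact:
    # Tasks 6-7: produce natural language and add structure such as HTML.
    done = spec.tasks + ["linguistic realization", "surface realization"]
    return Artifact("Text", spec.payload, done)


def verbalize(sparql: str) -> Artifact:
    """Run the three modules in the order shown on the slide."""
    return surface_realizer(microplanner(document_planner(Artifact("SPARQL", sparql))))


result = verbalize("SELECT ?uri WHERE { ?uri a ?type }")
print(result.kind, len(result.tasks))
```

Each stage only appends the names of the tasks it is responsible for, which makes the ordering of the seven tasks explicit without claiming anything about how they are implemented.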
Restrictions
SPARQL 1.0 SELECT
UNION and GROUP BY queries
"Disconnected" query graphs
Regular expressions etc.
Example – SPARQL query
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT DISTINCT ?uri ?string
WHERE {
  ?states rdf:type yago:AfricanCountries .
  ?states dbo:capital ?uri .
  ?uri dbp:population ?population .
  FILTER ( ?population < 1000000 ) .
  OPTIONAL { ?uri rdfs:label ?string . FILTER (lang(?string) = 'en') }
}
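Example query – graph representation
[Figure: the example query drawn as a graph.
Nodes: ?uri (selected variable), ?string (selected variable, optional, filter LANG=en), ?states (variable), ?population (variable, filter <1000000), yago:AfricanCountries (resource).
Edges: ?states rdf:type yago:AfricanCountries; ?states dbo:capital ?uri; ?uri dbp:population ?population; ?uri rdfs:label ?string (optional).
Legend: selected variable, variable, resource, filter.]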
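The graph representation of the example query can be written down as plain data. This is a sketch under assumptions: the `Edge` tuple, the `FILTERS` dictionary and the helper names are hypothetical and only mirror the node kinds named in the figure legend (selected variable, variable, resource, filter).

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    subject: str
    predicate: str
    obj: str
    optional: bool = False   # True if the triple pattern sits inside OPTIONAL


# Triple patterns of the example query.
EDGES = [
    Edge("?states", "rdf:type", "yago:AfricanCountries"),
    Edge("?states", "dbo:capital", "?uri"),
    Edge("?uri", "dbp:population", "?population"),
    Edge("?uri", "rdfs:label", "?string", optional=True),
]
SELECTED = {"?uri", "?string"}                       # variables in the SELECT clause
FILTERS = {"?population": "< 1000000", "?string": "LANG = en"}


def node_kind(node: str) -> str:
    """Classify a node the way the figure legend does."""
    if not node.startswith("?"):
        return "resource"
    return "selected variable" if node in SELECTED else "variable"


def all_nodes(edges: List[Edge]) -> List[str]:
    """Return every node once, in order of first appearance."""
    seen: List[str] = []
    for edge in edges:
        for node in (edge.subject, edge.obj):
            if node not in seen:
                seen.append(node)
    return seen


for node in all_nodes(EDGES):
    note = f", filter {FILTERS[node]}" if node in FILTERS else ""
    print(f"{node}: {node_kind(node)}{note}")
```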
Document structuring – 4 Steps
1. Main entity identification
2. Graph transformation
3. Message creation
4. Create Document Plan
Example – identify main entity
Select a variable that is verbalized as subject:
?string: "Labels if available of capitals of African countries ..." Bad: subject is optional.
?population: "Population < 10^6 of capitals of African countries ..." Bad: variable is not selected.
?states: "African countries having capitals that have populations < 10^6 ..." Bad: variable is not selected.
?uri: "Capitals of African countries having population < 10^6 ..." Good: label for main entity is requested.
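The judgments in this example (optional subject, non-selected variable, requested label) can be turned into a small rule-based check. The sketch below is an assumption about how such rules might be ordered, not the authors' algorithm; `judge`, `choose_main_entity` and `LABEL_PROPERTIES` are hypothetical names.

```python
from typing import List, Optional, Set, Tuple

# (subject, predicate, object, inside OPTIONAL?)
Triple = Tuple[str, str, str, bool]

TRIPLES: List[Triple] = [
    ("?states", "rdf:type", "yago:AfricanCountries", False),
    ("?states", "dbo:capital", "?uri", False),
    ("?uri", "dbp:population", "?population", False),
    ("?uri", "rdfs:label", "?string", True),
]
SELECTED: Set[str] = {"?uri", "?string"}
LABEL_PROPERTIES = {"rdfs:label"}   # assumption: properties treated as labels


def judge(var: str, triples: List[Triple], selected: Set[str]) -> str:
    """Return a verdict for using `var` as the main entity."""
    # Nodes that occur in at least one non-optional triple pattern.
    required = {n for s, _, o, opt in triples if not opt for n in (s, o)}
    if var not in selected:
        return "bad: variable is not selected"
    if var not in required:
        return "bad: subject is optional"
    label_requested = any(
        s == var and p in LABEL_PROPERTIES and o in selected
        for s, p, o, _ in triples
    )
    if label_requested:
        return "good: label for main entity is requested"
    return "acceptable"


def choose_main_entity(triples: List[Triple], selected: Set[str]) -> Optional[str]:
    """Prefer a 'good' candidate, fall back to an 'acceptable' one."""
    variables = sorted({n for s, _, o, _ in triples for n in (s, o) if n.startswith("?")})
    verdicts = {v: judge(v, triples, selected) for v in variables}
    for prefix in ("good", "acceptable"):
        for v in variables:
            if verdicts[v].startswith(prefix):
                return v
    return None


print(choose_main_entity(TRIPLES, SELECTED))
```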
Graph transformation
Idea: Reduce the set of message types
to simplify verbalization
Main entity is transformed into root node
Reversal of some edges necessary
Example – transformed graph
[Figure: the example graph with the main entity ?uri as root node. The dbo:capital edge is reversed (shown as dbo:capital-) so that it leads from ?uri to ?states; the remaining nodes, filters and legend are unchanged.]
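Rooting the graph at the main entity and reversing edges can be sketched as a breadth-first traversal that flips every edge pointing towards the root. This is an illustrative assumption about the procedure; the function name `root_at` is hypothetical, and the trailing "-" on a reversed property follows the notation used in the figure (dbo:capital-).

```python
from collections import deque
from typing import List, Tuple

Triple = Tuple[str, str, str]   # subject, predicate, object

TRIPLES: List[Triple] = [
    ("?states", "rdf:type", "yago:AfricanCountries"),
    ("?states", "dbo:capital", "?uri"),
    ("?uri", "dbp:population", "?population"),
    ("?uri", "rdfs:label", "?string"),
]


def root_at(triples: List[Triple], root: str) -> List[Triple]:
    """Orient all edges away from `root`, marking reversed properties with '-'."""
    remaining = list(triples)
    oriented: List[Triple] = []
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        untouched = []
        for s, p, o in remaining:
            if s == node:                      # edge already points away from node
                oriented.append((s, p, o))
                neighbour = o
            elif o == node:                    # edge points towards node: reverse it
                oriented.append((o, p + "-", s))
                neighbour = s
            else:
                untouched.append((s, p, o))
                continue
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
        remaining = untouched
    # Edges not reachable from the root (a disconnected graph) are left out.
    return oriented


for triple in root_at(TRIPLES, "?uri"):
    print(triple)
```

Edges that cannot be reached from the root are dropped by this sketch, which is consistent with disconnected query graphs being listed among the restrictions.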
Message creation
Cut graph into independently verbalizable parts
Messages (1-9) represent paths, message types are path classes
Filters are stored in VAR messages
Example – messages
[Figure: the transformed example graph cut into messages]
(7) M(RV)*RtR
(3) M(RV)*RV
(5) M(RV)*RlV
+ 4 x (10) VAR
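Since message types are path classes, a path from the main entity can be encoded as a string of symbols and matched against the type patterns. The sketch below rests on an assumed reading of the notation, which the slides do not spell out: M is the main entity, V a variable, R a resource, Rt the rdf:type property and Rl a label property. Only the three path types shown in the example are included; `encode` and `classify` are hypothetical helpers.

```python
import re
from typing import List, Optional, Tuple

# Message types from the example, keyed by their number on the slide.
# The pattern strings happen to be valid regular expressions.
MESSAGE_TYPES = {3: "M(RV)*RV", 5: "M(RV)*RlV", 7: "M(RV)*RtR"}

Step = Tuple[str, str]   # (property, node reached via that property)


def encode(path: List[Step]) -> str:
    """Encode a path that starts at the main entity as a symbol string."""
    symbols = ["M"]
    for prop, node in path:
        base = prop.rstrip("-")              # ignore the reversal marker
        if base == "rdf:type":
            symbols.append("Rt")
        elif base == "rdfs:label":           # assumption: the only label property
            symbols.append("Rl")
        else:
            symbols.append("R")
        symbols.append("V" if node.startswith("?") else "R")
    return "".join(symbols)


def classify(path: List[Step]) -> Optional[int]:
    """Return the number of the first message type whose pattern matches."""
    code = encode(path)
    for number, pattern in sorted(MESSAGE_TYPES.items()):
        if re.fullmatch(pattern, code):
            return number
    return None


type_path = [("dbo:capital-", "?states"), ("rdf:type", "yago:AfricanCountries")]
population_path = [("dbp:population", "?population")]
label_path = [("rdfs:label", "?string")]

print(classify(type_path), classify(population_path), classify(label_path))
```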
Document Plan
The document plan (DP) has three parts:
1. Constraints: constraints for the main entity, e.g. its class, having population < 10^6
2. Requests: requested information, e.g. its name
3. Modifiers: e.g. LIMIT, ORDER BY ...
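The three-part document plan can be sketched as a container into which messages are sorted. The sorting rule used here (a message that ends in a selected variable is a request, anything else is a constraint) is an assumption that happens to fit the example; the slides do not state the actual rule. `Message`, `DocumentPlan` and `build_document_plan` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Message:
    message_type: str   # path class, e.g. "M(RV)*RtR"
    end_node: str       # last node on the path


@dataclass
class DocumentPlan:
    constraints: List[Message] = field(default_factory=list)
    requests: List[Message] = field(default_factory=list)
    modifiers: List[str] = field(default_factory=list)   # e.g. LIMIT, ORDER BY


def build_document_plan(messages: Iterable[Message],
                        selected: Set[str],
                        modifiers: Iterable[str] = ()) -> DocumentPlan:
    """Sort messages into constraints and requests (assumed rule, see lead-in)."""
    plan = DocumentPlan(modifiers=list(modifiers))
    for message in messages:
        target = plan.requests if message.end_node in selected else plan.constraints
        target.append(message)
    return plan


MESSAGES = [
    Message("M(RV)*RtR", "yago:AfricanCountries"),
    Message("M(RV)*RV", "?population"),
    Message("M(RV)*RlV", "?string"),
]
plan = build_document_plan(MESSAGES, selected={"?uri", "?string"})
print(len(plan.constraints), len(plan.requests), len(plan.modifiers))
```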
Example - verbalization
M(RV)*RtR (cons): ?uri dbo:capital- ?states, ?states rdf:type yago:AfricanCountries
    "Capitals of African countries"
M(RV)*RV (cons): ?uri dbp:population ?population <1000000
    "that are having populations that are less than 1000000"
M(RV)*RlV (req): ?uri rdfs:label ?string, LANG=en, optional
    "and where available their English labels."
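The way the message verbalizations combine into one sentence can be imitated with simple templates. This is a stand-in for the microplanner and surface realizer, which the slides list as future work: the `TEMPLATES` dictionary and its slot names are hypothetical and were written only to reproduce the wording of the example.

```python
from typing import Dict, List, Tuple

# Hypothetical templates, one per message type used in the example.
TEMPLATES: Dict[str, str] = {
    "M(RV)*RtR": "{main} of {cls}",
    "M(RV)*RV": "that are having {prop} that are less than {value}",
    "M(RV)*RlV": "where available their {lang} labels",
}

Filled = Tuple[str, Dict[str, str]]   # (message type, slot values)


def realize(constraints: List[Filled], requests: List[Filled]) -> str:
    """Verbalize constraints first, then requests, joined by 'and'."""
    parts = [TEMPLATES[kind].format(**slots) for kind, slots in constraints]
    text = " ".join(parts)
    requested = [TEMPLATES[kind].format(**slots) for kind, slots in requests]
    if requested:
        text += " and " + " and ".join(requested)
    text += "."
    return text[0].upper() + text[1:]


sentence = realize(
    constraints=[
        ("M(RV)*RtR", {"main": "capitals", "cls": "African countries"}),
        ("M(RV)*RV", {"prop": "populations", "value": "1000000"}),
    ],
    requests=[("M(RV)*RlV", {"lang": "English"})],
)
print(sentence)
```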
SUMMARY AND FUTURE WORK
Summary and Future Work
Summary:
Presented an approach for explaining SPARQL
SELECT queries in natural language
Schema-agnostic
Directions for future work:
Tackle challenges in the two missing pipeline
components
Exploitation of linguistic features of labels
Evaluation
QUESTIONS?
http://km.aifb.kit.edu/projects/spartiqulator/
The work presented here is supported by the European Union's 7th
Framework Programme (FP7/2007-2013) under Grant Agreement 257790.
http://bit.ly/KGuDTL
REFERENCES
References
S. Shekarpour, S. Auer, A.-C. Ngonga Ngomo, D. Gerber, S. Hellmann,
and C. Stadler. Keyword-driven SPARQL Query Generation Leveraging
Background Knowledge. In International Conference on Web Intelligence,
2011.
E. Reiter and R. Dale. Building Natural Language Generation Systems.
Natural Language Processing. Cambridge University Press, 2000.
P. Haase, D. M. Herzig, M. Musen, and D. T. Tran. Semantic Wiki Search.
In L. A. P. et al., editor, 6th Annual European Semantic Web Conference,
ESWC2009, Heraklion, Crete, Greece, volume 5554 of LNCS, pages 445-460.
Springer Verlag, June 2009.
QALD 2011: http://www.sc.cit-ec.uni-bielefeld.de/qald-1