

Page 1: QALD-7 Question Answering over Linked Data Challenge

QALD-7
Question Answering over Linked Data Challenge

Presenter: Giulio Napolitano

QALD-7 @ ESWC 2017
Portoroz, Slovenia

Horizon 2020, GA No 688227

May 30th, 2017

Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 1 / 16

Page 2: QALD-7 Question Answering over Linked Data Challenge

Overview

Question answering systems mediate between
  a user expressing an information need in natural language
  and RDF-modelled data


Page 3: QALD-7 Question Answering over Linked Data Challenge

Overview

QALD is a series of evaluation campaigns that provide a benchmark for comparing different approaches and systems
  get a picture of their strengths and shortcomings
  gain insight into how we can develop approaches that deal with Semantic Web data as a knowledge source

QALD-1 @ ESWC 2011 (3)
QALD-2 @ ESWC 2012 (4)
QALD-3 @ CLEF 2013 (6)
QALD-4 @ CLEF 2014 QA track (9)
QALD-5 @ CLEF 2015 QA track (7)
QALD-6 @ ESWC 2016 (13)
QALD-7 @ ESWC 2017 (3)


Page 4: QALD-7 Question Answering over Linked Data Challenge

Tasks

Overall task: Given a natural language question, retrieve the correct answer(s) from a given RDF repository.

Types of challenges (specific tasks):
  1. Multilingual
  2. Hybrid
  3. Large scale
  4. Wikidata


Page 5: QALD-7 Question Answering over Linked Data Challenge

Task 1 - Multilingual questions

Dataset: DBpedia 2016-04 (with multilingual labels)

Questions: 215 training, 50 test
  provided in 8 languages: English, German, Spanish, Italian, French, Dutch, Romanian, Farsi
  can be answered with respect to the provided RDF data
  annotated with corresponding SPARQL queries and answers

Challenge: Lexical and structural gap between natural language expressions and data, e.g.

high → elevation

have inhabitants → populationTotal

graduate from → almaMater
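
The lexical gap above can be illustrated with a minimal Python sketch. It is a toy lookup, not a description of how any participating system works: the three phrase-to-property pairs come from the slide, while the function names `map_phrase` and `find_properties` and the verbatim substring matching are assumptions made for illustration.

```python
from typing import List, Optional

# Toy lexicon: surface phrase -> property name (the three pairs listed on the slide).
LEXICON = {
    "high": "elevation",
    "have inhabitants": "populationTotal",
    "graduate from": "almaMater",
}


def map_phrase(phrase: str) -> Optional[str]:
    """Return the property for a phrase, or None if the lexicon has no entry."""
    return LEXICON.get(phrase.strip().lower())


def find_properties(question: str) -> List[str]:
    """Return the properties whose trigger phrase occurs verbatim in the question."""
    lowered = question.lower()
    return [prop for phrase, prop in LEXICON.items() if phrase in lowered]
```

Verbatim matching already fails on inflected forms such as "graduated from", which is one reason the gap is described as a challenge rather than a lookup problem.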


Page 6: QALD-7 Question Answering over Linked Data Challenge

Example

Which book has the most pages?
Welches Buch hat die meisten Seiten?
Quale libro ha il maggior numero di pagine?
Quel livre a le plus de pages?
¿Qué libro tiene el mayor número de páginas?
...
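
For orientation, here is a hedged Python sketch of the kind of SPARQL query such a superlative question could be annotated with. The helper `superlative_query` is hypothetical, and the class and property names `dbo:Book` and `dbo:numberOfPages` are assumptions about the DBpedia ontology rather than something stated on the slide; the `dbo:` prefix URI is the one used in the Task 2 example.

```python
def superlative_query(rdf_class: str, prop: str) -> str:
    """Build a SPARQL query for the instance of rdf_class with the largest prop value."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?uri\n"
        "WHERE {\n"
        f"    ?uri a {rdf_class} .\n"
        f"    ?uri {prop} ?n .\n"
        "}\n"
        "ORDER BY DESC(?n)\n"
        "LIMIT 1"
    )


# Assumed mapping: "book" -> dbo:Book, "pages" -> dbo:numberOfPages.
QUERY = superlative_query("dbo:Book", "dbo:numberOfPages")
```

The same query would serve every language variant of the question; only the mapping from words to the class and property differs.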


Page 7: QALD-7 Question Answering over Linked Data Challenge

Task 2 - Hybrid questions

Dataset: DBpedia 2016-04 (with free text abstracts)

Questions: 105 training, 50 test
  provided in English
  can be answered only by integrating structured data (RDF) and unstructured data (free text abstracts)
  annotated with pseudo-queries and answers

Challenge: find information in several sources, process both structured and unstructured information, and combine them into one answer.


Page 8: QALD-7 Question Answering over Linked Data Challenge

Example

Who is the front man of the band that wrote Coffee & TV?

PREFIX res: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?uri
WHERE {
    res:Coffee_&_TV dbo:musicalArtist ?x .
    ?x dbo:bandMember ?uri .
    ?uri text:"is" text:"frontman" .
}

http://dbpedia.org/resource/Damon_Albarn
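
A minimal Python sketch of how such a pseudo-query might be evaluated: the two structured patterns are resolved against triples, and the `text:` pattern is treated as a keyword check on the candidate's free-text abstract. That reading of `text:` is an assumption, and everything in the toy data except `res:Coffee_&_TV`, the two `dbo:` properties, and `res:Damon_Albarn` is a hypothetical placeholder (`ex:TheBand`, `ex:OtherMember`, and both abstracts).

```python
# Toy stand-in for the RDF data; ex:* resources are hypothetical placeholders.
TRIPLES = [
    ("res:Coffee_&_TV", "dbo:musicalArtist", "ex:TheBand"),
    ("ex:TheBand", "dbo:bandMember", "res:Damon_Albarn"),
    ("ex:TheBand", "dbo:bandMember", "ex:OtherMember"),
]

# Toy stand-in for the free-text abstracts (illustrative wording only).
ABSTRACTS = {
    "res:Damon_Albarn": "This person is the frontman of the band.",
    "ex:OtherMember": "This person is the guitarist of the band.",
}


def objects(subject, predicate):
    """Structured part: all objects of the triples matching subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def text_matches(resource, *keywords):
    """Unstructured part: True if every keyword occurs in the resource's abstract."""
    abstract = ABSTRACTS.get(resource, "").lower()
    return all(keyword.lower() in abstract for keyword in keywords)


def answer():
    """Combine both parts in the order of the pseudo-query's three patterns."""
    results = []
    for band in objects("res:Coffee_&_TV", "dbo:musicalArtist"):
        for member in objects(band, "dbo:bandMember"):
            if text_matches(member, "is", "frontman") and member not in results:
                results.append(member)
    return results
```

On this toy data, `answer()` keeps only the band member whose abstract contains both keywords.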


Page 9: QALD-7 Question Answering over Linked Data Challenge

Task 3 - Large scale

Dataset: DBpedia 2016-04

Questions: 100 training, 2M test
  provided in English
  automatically generated
  questions sent every minute, n+1 questions asked at minute n

Challenge: deal with high volume requests in a short time
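
The load pattern can be worked through in a short Python sketch. It assumes that minutes are counted from n = 0, which the slide does not state; the function names are hypothetical. With that assumption, the cumulative number of questions up to and including minute n is (n + 1)(n + 2) / 2.

```python
def questions_at_minute(n: int) -> int:
    """Number of questions asked at minute n (n + 1, as stated on the slide)."""
    return n + 1


def cumulative_questions(n: int) -> int:
    """Questions asked from minute 0 through minute n: sum of (k + 1) for k = 0..n."""
    return (n + 1) * (n + 2) // 2


def first_minute_reaching(total: int) -> int:
    """Smallest minute n at which the cumulative count reaches at least total."""
    n = 0
    while cumulative_questions(n) < total:
        n += 1
    return n
```

Because the per-minute load grows linearly, the cumulative load grows quadratically, so a system that keeps up early on can still fall behind later.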


Page 10: QALD-7 Question Answering over Linked Data Challenge

Task 4 - Wikidata

Dataset: Wikidata 2017-01

Questions: 100 training, 50 test
  provided in English
  questions based on DBpedia but performed on Wikidata

Challenge: formulate generic approaches, adapting to new data sources


Page 11: QALD-7 Question Answering over Linked Data Challenge

Participants

Dennis Diefenbach, Kamal Singh, Pierre Maret
WDAqua-core0: A Question Answering Component for the Research Community
Task 1 and Task 4

Nikolay Radoev, Mathieu Tremblay, Michel Gagnon, Amal Zouaq
Answering Natural Language Questions on RDF Knowledge base in French
Task 1

Daniil Sorokin, Iryna Gurevych
End-to-end Representation Learning for Question Answering with Weak Supervision
Task 4


Page 12: QALD-7 Question Answering over Linked Data Challenge

Organization committee

Ricardo Usbeck - Universität Leipzig, Germany
Axel-Cyrille Ngonga Ngomo - Universität Leipzig, Germany
Bastian Haarmann - Fraunhofer Institute IAIS, Germany
Anastasia Krithara - National Center for Scientific Research "Demokritos", Greece


Page 13: QALD-7 Question Answering over Linked Data Challenge

Data experts

Harsh Takkar - Universität Bonn, Germany
Henning Petzka - Fraunhofer Institute IAIS, Germany
Jens Jehmann - Fraunhofer Institute IAIS, Germany


Page 14: QALD-7 Question Answering over Linked Data Challenge

Program committee

Corina Forascu - Alexandru Ioan Cuza University, Iasi, Romania
Sebastian Walter - CITEC, Universität Bielefeld, Germany
Bernd Müller - ZBMed, Germany
Christoph Lange - Fraunhofer Gesellschaft, Germany
Dennis Diefenbach - Université de Saint-Étienne, France
Edgard Marx - Universität Leipzig, Germany
Hady Elsahar - Université de Saint-Étienne, France
Ioanna Lytra - Universität Bonn, Germany
John McCrae - INSIGHT - The Centre for Data Analytics, Ireland
Konrad Höffner - Universität Leipzig, Germany
Kuldeep Singh - Universität Bonn, Germany
Saeedeh Shekarpour - Kno.e.sis Center, Ohio Center of Excellence in Knowledge-enabled Computing, USA
Sherzod Hakimov - CITEC, Universität Bielefeld, Germany


Page 15: QALD-7 Question Answering over Linked Data Challenge

The End

Thank You!

Thanks to Christina Unger for sharing her slides!!!!


Page 16: QALD-7 Question Answering over Linked Data Challenge

Don’t forget

Thursday 1st June

9:00-11:00 Poster session

17:30 Awards at closing ceremony

