WDAqua introduction presentation
TRANSCRIPT
![Page 1: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/1.jpg)
Handling Dynamicity and Temporality of Web Data
Hady [email protected]
Jean Monnet University, Saint-Étienne, France
![Page 2: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/2.jpg)
First try with Question Answering
Weet it: Natural language interface for Linked Data (ElSahar et al. ‘11)
![Page 3: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/3.jpg)
● Most of the current knowledge bases focus on static facts and ignore the temporal dimension of facts.
● Aspects of Temporality and Dynamicity of Datasets:
○ Aspect 1: Many facts are valid only during a particular time period.
○ Aspect 2: Newly extracted facts can contradict, verify or modify existing ones.
○ Aspect 3: Some facts are collectively induced from a series of events.
Handling Dynamicity of Data
![Page 4: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/4.jpg)
Challenges and Motivations (1):
Stephen Hawking
Many facts are valid only during a particular time period.
Use Case : Questions about Temporal facts
● Who is the first wife of Stephen Hawking?
● Who is the 10th President of France?
● Who is the past CEO of Google?
![Page 5: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/5.jpg)
Extraction and Representation of Temporal data
Extraction and representation of Temporal Facts and Events
❏ Representation:
    ❏ Keeping the last updated fact is not enough (DBpedia)
    ❏ Higher order fact (Erdal and Weikum ‘11)
        ❏ f1: Bill_Clinton isPresidentOf USA.
        ❏ f2: f1 startedOnDate 20-01-1993
    ❏ Wikidata Qualifiers (Vrandečić ‘12)
❏ Temporal fact and event extraction:
    ❏ Free text and structured data from Wikipedia (patterns and pattern induction) (Erdal and Weikum ‘11)
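The higher-order representation (a fact whose subject is another fact) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `facts` and `meta_facts` containers, the `endedOnDate` predicate and the helpers `valid_on` and `facts_on` are hypothetical names, and the end date is added for the example (it is not on the slide).

```python
from datetime import date

# Base facts, each with an identifier so that other facts can refer to it.
facts = {
    "f1": ("Bill_Clinton", "isPresidentOf", "USA"),
}

# Higher-order facts: the subject is the identifier of a base fact.
meta_facts = [
    ("f1", "startedOnDate", date(1993, 1, 20)),
    ("f1", "endedOnDate", date(2001, 1, 20)),  # assumption: added for illustration
]


def valid_on(fact_id, day):
    """Return True if `day` falls inside the validity interval of the fact."""
    start = end = None
    for fid, predicate, value in meta_facts:
        if fid != fact_id:
            continue
        if predicate == "startedOnDate":
            start = value
        elif predicate == "endedOnDate":
            end = value
    if start is not None and day < start:
        return False
    if end is not None and day >= end:
        return False
    return True


def facts_on(day):
    """Return all base facts that hold on the given day."""
    return [triple for fid, triple in facts.items() if valid_on(fid, day)]
```

Keeping only the latest value would lose the interval; here `facts_on(date(1995, 6, 1))` still returns the presidency fact, while a date after the end of the interval returns nothing.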
![Page 6: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/6.jpg)
Annotation of temporal facts in documents for Question answering
SemEval-2015 Task 5: QA TempEval
![Page 7: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/7.jpg)
SemEval-2015 Task 5: QA TempEval
Question Examples in the Evaluation Dataset:
Yes / No:
● “Did the Indonesian stock market rise again after its last fall?”
List:
● “What happened after the crash?”
● “What happened between the crash and yesterday?”
When (Factoid):
● “When did the Oscar ceremony end yesterday?”
Applications ?
![Page 8: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/8.jpg)
Challenges and Motivations (2):
Stephen Hawking
In highly dynamic datasets, newly extracted facts can contradict, verify or modify existing ones.
Existing facts:
Matt Smith is dbo:starring of
■ dbr:Womb_(film)
■ dbr:Lost_River_(film)
■ dbr:Bert_and_Dickie
■ dbr:The_Science_of_Doctor_Who
New Extracted Fact:
“Matt Smith is the doctor”
(Matt Smith, occupation, Medicine)
confidence: 0.1
![Page 9: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/9.jpg)
(Frank Sinatra, profession, Singer) confidence : 0.9
(Jared Leto, influenced_by, Frank Sinatra) confidence : 0.8
● People influenced by writers are probably writers as well
● People are probably born in the same place as their siblings
Challenges and Motivations (2):
Stephen Hawking
In highly dynamic datasets, newly extracted facts can contradict, verify or modify existing ones.
![Page 10: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/10.jpg)
Evaluation of new facts using Link prediction
Link Prediction
● Add new facts without extra knowledge
● Assess the validity of an unknown fact
![Page 11: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/11.jpg)
Embedding Models for knowledge bases
TransE: Modeling Relations as Translations (Bordes et al. ’13):
● Modeling facts as translations between vectors of entities: V_Subject + V_Relation ≈ V_Object
● Distance is used to quantify confidence in facts
● Training objective: find the representations that minimize distances across all true facts and maximize them across “corrupted” facts (s’, o’):
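A common way to write this objective is a margin-based ranking loss, max(0, γ + d(s + r, o) − d(s’ + r, o’)), summed over true facts and their corrupted counterparts. The sketch below illustrates one term of that sum, assuming an L2 distance and a margin γ = 1.0. The vectors are hand-picked for illustration, not learned embeddings, and the function names `distance` and `margin_loss` are hypothetical.

```python
import math


def distance(s, r, o):
    """L2 distance between the translated subject (s + r) and the object o."""
    return math.sqrt(sum((si + ri - oi) ** 2 for si, ri, oi in zip(s, r, o)))


def margin_loss(true_triple, corrupted_triple, margin=1.0):
    """Hinge loss: zero once the true triple is closer than the corrupted one by `margin`."""
    return max(0.0, margin + distance(*true_triple) - distance(*corrupted_triple))


# Hand-picked 2-d vectors, purely illustrative (assumption: not trained values).
v_subject = [1.0, 0.0]
v_relation = [0.0, 1.0]
v_object = [1.0, 1.0]    # equals v_subject + v_relation, so the distance is 0
v_corrupt = [4.0, 5.0]   # a wrong object o'

true_fact = (v_subject, v_relation, v_object)
corrupted_fact = (v_subject, v_relation, v_corrupt)

loss_when_separated = margin_loss(true_fact, corrupted_fact)
loss_when_swapped = margin_loss(corrupted_fact, true_fact)
```

With these vectors the true fact has distance 0 and the corrupted one distance 5, so the loss is already zero; swapping the two roles gives a positive loss, which is the signal that training would reduce by moving the embeddings.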
![Page 12: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/12.jpg)
Other Embedding Models:
● Structured Embeddings (SE) (Bordes et al. ‘11)
● Collective Matrix Factorization (RESCAL) (Nickel et al. ’11)
● Neural Tensor Networks (Socher et al. ‘13)
● TATEC (Garcia-Duran et al. ’14)
Embedding Models for Text + Knowledge bases:
● Joint Learning of Words and Meaning Representations (Bordes et al. ‘12)
● Knowledge Graph and Text Jointly Embedding (Wang et al. ‘14)
Link prediction using Embedding Models
![Page 13: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/13.jpg)
Applications?
● Verification of newly extracted facts
● Completeness of newly added datasets
● Modeling literal datatypes (length, date, etc.), not only relations and entities.
Other benefits of Embedding Models? (collaboration potential)
● Entity Disambiguation for Fact Extraction and QA (Bordes et al. ‘12)
● Paraphrase Detection for Questions (PARALEX) (Fader et al. ‘13)
![Page 14: WDAqua introduction presentation](https://reader031.vdocument.in/reader031/viewer/2022030318/5a68a1207f8b9a4a258b6acd/html5/thumbnails/14.jpg)
Challenges and Motivations (3) :
Reasoning with more than one supporting fact
● Reasoning about positions (e.g. Geo Data)
● Reasoning about counts
● Reasoning about sizes
Fact 1 : 55 passengers crammed into the smuggler’s boat.
Fact 2 : The boat made it to the Greek island.
Question : Where are the passengers ?
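The boat example needs both facts to be chained: the passengers are in the boat, and the boat is at the island. A toy sketch of that chaining is shown below. It is a hand-written rule, not a Memory Network or any learned model, and the relation names `moved_into` and `arrived_at` as well as the `locate` helper are made up for illustration.

```python
# Each event is (subject, relation, object); the relation names are assumptions.
events = [
    ("passengers", "moved_into", "boat"),     # Fact 1
    ("boat", "arrived_at", "Greek island"),   # Fact 2
]


def locate(entity, events):
    """Chain containment and arrival events to find where `entity` is."""
    inside = {}  # entity -> container it is currently in
    place = {}   # entity -> last known place
    for subject, relation, obj in events:
        if relation == "moved_into":
            inside[subject] = obj
        elif relation == "arrived_at":
            place[subject] = obj
    # Walk up the containment chain until an entity with a known place is found.
    current = entity
    seen = set()
    while current not in place and current in inside and current not in seen:
        seen.add(current)
        current = inside[current]
    return place.get(current)


answer = locate("passengers", events)
```

Neither fact alone answers the question; `locate` returns the island only because it follows the passengers into the boat and then looks up where the boat went.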
Stephen Hawking
Facts induced from a series of Events
● Towards AI-Complete QA: A Set of Prerequisite Toy Tasks (Weston et al. ‘15)
● Memory Networks (Weston et al. ‘14)