text as a virtual knowledge base - stanford nlp group · 2019-10-30 · text as a virtual knowledge...
![Page 1: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/1.jpg)
Text as a Virtual Knowledge Base
Bhuwan Dhingra
Language Technologies Institute Carnegie Mellon University
Work done with: Haitian Sun, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Russ Salakhutdinov, William Cohen
Question Answering
When did Kendrick Lamar’s first album come out?
A. July 2, 2011
Source: Natural Questions (https://ai.google.com/research/NaturalQuestions/dataset)
Sources of Information
Text:
[Moldovan, 2002], [Voorhees, 1999], [Ferrucci, 2012], [Yih, 2013],
[Hermann, 2016], [Chen, 2017], [Seo, 2017], [Peters, 2018], [Devlin, 2018] …
• Lexical pattern matching
• No grounding
• High recall
Knowledge Bases:
[Kwiatkowski, 2013], [Berant, 2013], [Reddy, 2014], [Pasupat, 2015],
[Bordes, 2015], [Yih, 2015], [Jain, 2016], [Liang, 2017], [Das, 2017] …
• Clear semantics via parsing
• Grounded
• High precision
Text + KBs:
[Gardner & Krishnamurthy, 2017], [Ryu, 2017], Universal Schema [Riedel, 2013], [Verga, 2016], [Das, 2017] …
Our Work
Why Text + KBs?
• Engineering motivation - QA performance
  • Text can complete missing information in KBs
  • KBs can provide background context for understanding text
• Scientific motivation - Knowledge Representation
  • Text is expressive
  • KBs support reasoning
This Talk
1. “Reading” heterogeneous graphs of facts and text [EMNLP’18]
2. Traversing text corpora like KBs [ongoing]
Open-domain QA using Early Fusion of KBs & Text
Haitian Sun*, Bhuwan Dhingra*, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, William Cohen
EMNLP 2018
*equal contribution
Entity-Relation Knowledge Bases
[Figure: example KB graph. Entity nodes: Meg Griffin, Mila Kunis, Lacey Chabert, David Trainer. Relation edges: network, has-character, voiced-by, starred in, directed-by.]
• Collection of (subject, relation, object) facts
• Organize information around entity nodes
• Freebase: 44M entities, >2B facts
Reasoning in KBs
Relation following operation:
• Given a set of entities X
• Follow edges labeled with the relation R
• To arrive at a set of entities Y
$$Y = X.\mathrm{follow}(R) = \{x' : \exists x \in X \text{ s.t. } R(x, x') \text{ holds}\}$$
E.g. {Family_Guy, That_70s_Show}.follow(network) = {Fox}
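The relation-following operation can be illustrated with a few lines of Python. This is a minimal sketch over a toy in-memory KB, not the system described in the talk: the fact list, the `build_index` helper and the `follow` function name are assumptions made for illustration, with entity and relation names borrowed from the example graph.

```python
from collections import defaultdict

# Toy KB of (subject, relation, object) facts. Illustrative only; the
# names mirror the example graph on the slides.
FACTS = [
    ("Family_Guy", "network", "Fox"),
    ("That_70s_Show", "network", "Fox"),
    ("Family_Guy", "has-character", "Meg_Griffin"),
    ("Meg_Griffin", "voiced-by", "Mila_Kunis"),
    ("Meg_Griffin", "voiced-by", "Lacey_Chabert"),
]


def build_index(facts):
    """Index facts as (subject, relation) -> set of objects."""
    index = defaultdict(set)
    for subj, rel, obj in facts:
        index[(subj, rel)].add(obj)
    return index


def follow(entities, relation, index):
    """Return {x' : some x in `entities` satisfies relation(x, x')}."""
    result = set()
    for x in entities:
        # .get avoids inserting empty entries into the defaultdict
        result |= index.get((x, relation), set())
    return result


INDEX = build_index(FACTS)
print(follow({"Family_Guy", "That_70s_Show"}, "network", INDEX))  # {'Fox'}
```

Both shows map to the same network, so the result set collapses to a single entity, which is the set semantics the formula above describes.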
Question Answering
Who voiced Meg in Family Guy?
Semantic parse: {Meg_Griffin}.follow(voiced-by)
Answer: {Mila_Kunis, Lacey_Chabert}
Annotating semantic parses of questions is expensive
Question Answering
Who voiced Meg in Family Guy?
Semantic parse: ???
Answer: {Mila_Kunis, Lacey_Chabert}
Search for parses which lead to the correct answer in the KB
Learning from Denotations [Liang, 2011], [Berant, 2013], [Reddy, 2014], [Krishnamurthy, 2017]
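The idea of searching for parses that reach the correct answer can be sketched as a brute-force enumeration. The snippet below is an illustrative assumption, not the method of the cited papers: it only considers one-hop parses of the form {entity}.follow(relation) over a toy fact list, and `consistent_parses` is a hypothetical helper name. Real systems search far larger program spaces and use the surviving parses as a training signal.

```python
from itertools import product

# Toy KB of (subject, relation, object) facts, illustrative only.
FACTS = [
    ("Family_Guy", "network", "Fox"),
    ("Family_Guy", "has-character", "Meg_Griffin"),
    ("Meg_Griffin", "voiced-by", "Mila_Kunis"),
    ("Meg_Griffin", "voiced-by", "Lacey_Chabert"),
]


def follow(entities, relation, facts):
    """Objects reachable from `entities` via edges labeled `relation`."""
    return {o for s, r, o in facts if s in entities and r == relation}


def consistent_parses(question_entities, relations, answer, facts):
    """Keep the one-hop parses whose denotation equals the gold answer set."""
    hits = []
    for entity, relation in product(sorted(question_entities), sorted(relations)):
        if follow({entity}, relation, facts) == answer:
            hits.append((entity, relation))
    return hits


relations = {r for _, r, _ in FACTS}
parses = consistent_parses(
    {"Meg_Griffin", "Family_Guy"},      # entities linked in the question
    relations,
    {"Mila_Kunis", "Lacey_Chabert"},    # gold answer set (the denotation)
    FACTS,
)
print(parses)  # [('Meg_Griffin', 'voiced-by')]
```

Only the answer set is needed as supervision here; no parse is annotated by hand.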
But KBs are often incomplete
Min et al, NAACL 2013
Graphs of Facts and Text
Who voiced Meg in Family Guy?
Inject relevant text into the graph
• TF-IDF Retrieval
• Entity Linking
Megatron "Meg" Griffin is a character from the television series Family Guy.
[Figure: KB subgraph (entities Meg Griffin, Mila Kunis, Lacey Chabert; relations has-character, voiced-by) with the retrieved sentence attached by links-to edges.]
Search via Representation Learning
Who voiced Meg in Family Guy?
Graph Neural Networks
• Embedding Vector
• “Pagerank” score - Initially uniform over entities in the question
Search via Representation Learning
Who voiced Meg in Family Guy?
[Equation not captured in the transcript. Its terms are labeled: Self connection, Entity neighbors, Text mentions.]
• Only propagate embeddings from nodes with non-zero pagerank score
• This constrains learning along valid paths in the graph
Search via Representation Learning
Who voiced Meg in Family Guy?
[Equation not captured in the transcript. Its labels are: Pagerank update, Learned weights.]
Search via Representation Learning
Who voiced Meg in Family Guy?
Classify each entity as Answer / Not-Answer:
[Equation not captured in the transcript; only the fragment "log" survives.]
$h_v^T$ : Entity Representation
Search via Representation Learning
Summary:
1. Inject text into KB using retrieval
2. Propagate entity embeddings along paths starting from question entities
3. Classify each entity as answer / no-answer
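The propagation step in the summary can be illustrated with a PageRank-style score update. The sketch below is a simplification under stated assumptions, not the update used in the talk (that equation is not in the transcript): fixed hand-set edge weights stand in for the learned weights, the mixing parameter `lam` is hypothetical, and only scalar scores are propagated rather than embeddings.

```python
def propagate_scores(out_edges, question_entities, num_steps, lam=0.5):
    """PageRank-style propagation starting from the question entities.

    out_edges: dict mapping node -> list of (neighbor, weight) pairs.
    lam: hypothetical mixing weight between a node's old score and the
         score arriving along its incoming edges.
    """
    nodes = set(out_edges) | set(question_entities)
    nodes |= {v for nbrs in out_edges.values() for v, _ in nbrs}

    # Initially uniform over entities in the question, zero elsewhere.
    scores = {n: 0.0 for n in nodes}
    for e in question_entities:
        scores[e] = 1.0 / len(question_entities)

    for _ in range(num_steps):
        incoming = {n: 0.0 for n in nodes}
        for u, nbrs in out_edges.items():
            if scores[u] == 0.0:
                # Zero-score nodes pass nothing on. For scalar scores this
                # skip changes no value; it mirrors the rule that only
                # nodes with non-zero score propagate.
                continue
            for v, w in nbrs:
                incoming[v] += w * scores[u]
        scores = {n: (1 - lam) * scores[n] + lam * incoming[n] for n in nodes}
    return scores


# Toy graph; the weights are assumptions standing in for learned weights.
OUT_EDGES = {
    "Family_Guy": [("Meg_Griffin", 1.0)],
    "Meg_Griffin": [("Mila_Kunis", 0.5), ("Lacey_Chabert", 0.5)],
    "That_70s_Show": [("David_Trainer", 1.0)],
}

scores = propagate_scores(OUT_EDGES, {"Family_Guy", "Meg_Griffin"}, num_steps=1)
active = {n for n, s in scores.items() if s > 0}
print(sorted(active))
```

Nodes that cannot be reached from the question entities keep a score of zero, so they never take part in propagation, which is how the score restricts learning to valid paths.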
Evaluation — WebQuestionsSP & WikiMovies
• WebQuestionsSP [Berant, 2013; Yih, 2016]:
  E.g. “What language do they speak in Afghanistan?” – Pashto language
  • KB - Freebase, Text - Wikipedia
  • 5K QA pairs for training
• WikiMovies [Miller, 2016]:
  E.g. “What movies did Quentin Tarantino direct?” – Reservoir Dogs, Pulp Fiction, …
  • KB - Subset of OMDB, Text - Subset of Wikipedia
  • 10K QA pairs for training
• Both datasets are answerable using KBs
  • But we simulate an incomplete KB setting
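One simple way to simulate an incomplete KB is to keep only a fraction of its facts. The slides do not say how facts were dropped, so the uniform random sampling below is an assumption, and `subsample_kb` is a hypothetical helper name.

```python
import random


def subsample_kb(facts, keep_fraction, seed=0):
    """Keep a uniformly random `keep_fraction` of the facts (assumed scheme)."""
    rng = random.Random(seed)  # seeded so the subset is reproducible
    k = round(keep_fraction * len(facts))
    return rng.sample(facts, k)


# Toy fact list, illustrative only.
FACTS = [
    ("Family_Guy", "network", "Fox"),
    ("That_70s_Show", "network", "Fox"),
    ("Meg_Griffin", "voiced-by", "Mila_Kunis"),
    ("Meg_Griffin", "voiced-by", "Lacey_Chabert"),
]

half_kb = subsample_kb(FACTS, 0.5)
print(len(half_kb))  # 2
```

Setting `keep_fraction` to 0.0, 0.5 or 1.0 corresponds to the 0% KB, 50% KB and 100% KB settings that appear in the results charts.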
WikiMovies Results
[Figure: bar chart of Hits @1 performance with 10% training data, at 0% KB, 50% KB and 100% KB, for Universal Schema (Das et al, ACL’17), Ours (KB-only) and Ours (KB+Text). Reference lines: KB-SOTA 97.0 (Das et al, ICLR’18) and Text-SOTA 85.8 (Watanabe et al, arxiv’17). Bar labels in the order the slide builds add them: 93.8, 75.3, 80.3; then 97.0, 67.7; then 96.8, 88.4, 86.6. The transcript does not preserve which bar each value belongs to.]
WebQuestionsSP Results
[Figure: bar chart of Hits @1 performance at 0% KB, 50% KB and 100% KB, for Universal Schema (Das et al, ACL’17), Ours (KB-only) and Ours (KB+Text). Reference lines: KB-SOTA ~75 (Liang et al, ACL’17) and Text-SOTA 21.5 (Chen et al, ACL’17). Bar labels in the order the slide builds add them: 40.5, 32.5, 23.2; then 66.7, 47.7; then 68.7, 52.3, 25.3. The transcript does not preserve which bar each value belongs to.]
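Both results charts report Hits @1. As commonly defined, this is the percentage of questions whose top-ranked prediction is one of the gold answers. The sketch below assumes that definition; the function name and the example predictions are made up for illustration and are not results from the talk.

```python
def hits_at_1(ranked_predictions, gold_answers):
    """Percentage of questions whose top-ranked prediction is a gold answer.

    ranked_predictions: one list per question, best prediction first.
    gold_answers: one set of acceptable answers per question.
    """
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("need one prediction list per question")
    correct = sum(
        1
        for preds, gold in zip(ranked_predictions, gold_answers)
        if preds and preds[0] in gold
    )
    return 100.0 * correct / len(gold_answers)


# Made-up predictions for two questions taken from the slides.
predictions = [
    ["Mila_Kunis", "Fox"],       # top prediction is a gold answer
    ["Fox", "Pashto_language"],  # gold answer is ranked second: a miss
]
gold = [
    {"Mila_Kunis", "Lacey_Chabert"},
    {"Pashto_language"},
]
print(hits_at_1(predictions, gold))  # 50.0
```

Because only the first prediction counts, a question with several gold answers is scored as correct as soon as any one of them is ranked first.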
Analysis
• Pagerank propagation leads to ~10% improvement
• Errors:
  • Retrieval (both text + facts) has only 90% recall
  • Complex questions: “Who first voiced Meg in Family Guy?”, “Which club did Cristiano Ronaldo play for in 2007?”
Differentiable Reasoning over a Virtual Knowledge Base
Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William Cohen
In preparation
Title and authors removed for anonymity
Multi-Hop Question Answering
Q. Which TV Networks has Mila Kunis appeared on?
Y1 = {Mila_Kunis}.follow(starred-in).follow(network)
Y2 = {Mila_Kunis}.follow(voiced)....follow(network)
Ans = Y1 UNION Y2
A. Fox
$$Y = X.\mathrm{follow}(R) = \{x' : \exists x \in X \text{ s.t. } R(x, x') \text{ holds}\}$$
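Chaining the follow operation gives a multi-hop query, and the union of two chains gives the final answer. The sketch below uses a toy fact list that is assumed for illustration. In particular, the slide elides the middle of the second path, so the `character-in` relation used here is a hypothetical stand-in, not the relation from the talk.

```python
from functools import reduce

# Toy KB, illustrative only. "character-in" is a hypothetical relation
# name; the slide elides the middle hops of the second path.
FACTS = [
    ("Mila_Kunis", "starred-in", "That_70s_Show"),
    ("That_70s_Show", "network", "Fox"),
    ("Mila_Kunis", "voiced", "Meg_Griffin"),
    ("Meg_Griffin", "character-in", "Family_Guy"),
    ("Family_Guy", "network", "Fox"),
]


def follow(entities, relation, facts=FACTS):
    """Objects reachable from `entities` via edges labeled `relation`."""
    return {o for s, r, o in facts if s in entities and r == relation}


def follow_path(entities, relations, facts=FACTS):
    """Apply follow once per relation, left to right."""
    return reduce(lambda ents, rel: follow(ents, rel, facts), relations, set(entities))


y1 = follow_path({"Mila_Kunis"}, ["starred-in", "network"])
y2 = follow_path({"Mila_Kunis"}, ["voiced", "character-in", "network"])
answer = y1 | y2  # Ans = Y1 UNION Y2
print(answer)  # {'Fox'}
```

Each hop takes a set and returns a set, so paths of different lengths can be combined with an ordinary set union.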
Can we do this with a Text Corpus?
Q. Which TV Networks has Mila Kunis appeared on?
Since 1999, she has voiced Meg Griffin on the animated series Family Guy.
Family Guy is an American animated sitcom created by Seth MacFarlane for the Fox Broadcasting Company.
A. Fox
Information can be spread out across the corpus
![Page 67: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/67.jpg)
MetaQA Benchmark
• Expands WikiMovies dataset to 2-hop and 3-hop questions
• 1-hop: What movies did Quentin Tarantino direct?
• 2-hop: Which movies have the same director as that of Pulp Fiction?
• 3-hop: Who acted in the movies which have the same director as Pulp Fiction?
• Text corpus: 17K first paragraphs from Wikipedia articles about the movies
Example passage: Pulp Fiction is a 1994 American crime film written and directed by Quentin Tarantino, who conceived it with Roger Avary. Starring John Travolta, Samuel L. Jackson, Bruce Willis, Tim Roth, Ving Rhames, and Uma Thurman, it tells several stories of criminal Los Angeles. The title refers to the pulp magazines and hardboiled crime novels popular during the mid-20th century, known for their graphic violence and punchy dialogue.
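The hop counts above can be read as chained relation lookups. The sketch below is a minimal illustration over a hand-written toy triple store; the relation names (`directed_by`, `starred_actors`), the `_inv` suffix for inverse edges, and the `follow` helper are assumptions made for this example and are not the MetaQA data format.

```python
from collections import defaultdict

# Toy triples (subject, relation, object), written by hand for illustration only.
TRIPLES = [
    ("Pulp Fiction", "directed_by", "Quentin Tarantino"),
    ("Kill Bill", "directed_by", "Quentin Tarantino"),
    ("Pulp Fiction", "starred_actors", "Uma Thurman"),
    ("Pulp Fiction", "starred_actors", "John Travolta"),
    ("Kill Bill", "starred_actors", "Uma Thurman"),
    ("Kill Bill", "starred_actors", "Lucy Liu"),
]


def build_index(triples):
    """Index triples in both directions; inverse edges get an '_inv' suffix."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
        index[(obj, rel + "_inv")].add(subj)
    return index


def follow(entities, relation, index):
    """Return every entity reachable from `entities` via one `relation` edge."""
    reached = set()
    for entity in entities:
        reached |= index.get((entity, relation), set())
    return reached


index = build_index(TRIPLES)

# 1-hop: What movies did Quentin Tarantino direct?
one_hop = follow({"Quentin Tarantino"}, "directed_by_inv", index)

# 2-hop: movie -> director -> movies (the seed movie is dropped, an assumption here).
directors = follow({"Pulp Fiction"}, "directed_by", index)
two_hop = follow(directors, "directed_by_inv", index) - {"Pulp Fiction"}

# 3-hop: movie -> director -> movies -> actors.
three_hop = follow(two_hop, "starred_actors", index)
```

Each extra hop is one more call to `follow`, which is the operation the rest of the talk replaces with a soft version over text.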
![Page 71: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/71.jpg)
Graph Networks on MetaQA
Hits @1 Performance (bar chart, Using KB vs. Using Text):
• Using KB: 97.0 (1-hop), 94.8 (2-hop), 77.7 (3-hop)
• Using Text: 82.5 (1-hop), 36.2 (2-hop), 40.2 (3-hop)
![Page 73: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/73.jpg)
Problem: Retrieval
Q. Which TV Networks has Mila Kunis appeared on?
Since 1999, she has voiced Meg Griffin on the animated series Family Guy.
Family Guy is an American animated sitcom created by Seth MacFarlane for the Fox Broadcasting Company.
A. Fox
Information can be spread out across the corpus
• Shallow information retrieval does not work
• Can expand retrieval using e.g. pseudo-relevance feedback
• But “reading” a large number of documents is expensive
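A small sketch of why a single shallow retrieval pass misses the second passage, and how pseudo-relevance feedback recovers it. Everything here is an assumption for illustration: passages are prefixed with a hypothetical article title, scoring is plain term overlap rather than TF-IDF, and the stopword list is ad hoc.

```python
import re

# Ad hoc stopword list, chosen only for this toy example.
STOPWORDS = {"a", "an", "the", "on", "has", "is", "by", "for", "she", "which", "since"}


def tokens(text):
    """Lowercased word tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def overlap_scores(query_terms, passages):
    """Score each passage by the number of terms it shares with the query."""
    return [len(query_terms & tokens(p)) for p in passages]


# Passages from the slide, each prefixed with an assumed article title.
passages = [
    "Mila Kunis: Since 1999, she has voiced Meg Griffin on the animated series Family Guy.",
    "Family Guy: Family Guy is an American animated sitcom created by Seth MacFarlane "
    "for the Fox Broadcasting Company.",
]
question = "Which TV Networks has Mila Kunis appeared on?"

# First pass: the question alone shares no terms with the passage that mentions Fox.
query_terms = tokens(question)
first_pass = overlap_scores(query_terms, passages)

# Pseudo-relevance feedback: add the terms of the top passage to the query and retry.
top = max(range(len(passages)), key=lambda i: first_pass[i])
expanded_terms = query_terms | tokens(passages[top])
second_pass = overlap_scores(expanded_terms, passages)
```

After expansion the second passage is reachable through "family", "guy" and "animated", but the expanded query also pulls in unrelated terms, which is why reading everything it retrieves gets expensive.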
![Page 77: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/77.jpg)
Our Approach: Read offline, Reason online
Reading (offline and slow)
• Pre-train contextual representations into an index
• ELMo [Peters, 2018], BERT [Devlin, 2018], Phrase-Indexed QA [Seo, 2018]
Reasoning (online and fast)
• Use inner product search and sparse operations
• MIPS [Andoni, 2015; Johnson, 2017], Ragged Tensors [TensorFlow]
Differentiable Reasoning over a KB of Indexed Text
![Page 82: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/82.jpg)
Reading: Offline Index of Entity Mentions
Entity Mentions:
… Family Guy is an American …
… their children, Meg, Chris, Stewie …
… In 1999, Kunis replaced Lacey Chabert …
… created by Seth MacFarlane for Fox …
… cast members were Topher Grace, Mila Kunis …
… originally aired on Fox from August 23 …
… wanted the show to have a 1970s feel …
Sparse: Entity Linking (Fixed)
Dense: Contextual Representations (Pre-trained)
Based on Phrase-Indexed QA (Seo et al, EMNLP’18, ACL’19)
![Page 84: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/84.jpg)
Reasoning: A “Soft” Textual Follow Op
X1.follow(R1): Retrieve Mention Spans from the “Virtual” Knowledge Base (Dense + Sparse)
• vX1: A soft set of entities
• qR1: A relation vector
• vX2: A soft set of entities
![Page 87: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/87.jpg)
Reasoning: A “Soft” Textual Follow Op
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1) -> X2.follow(R2) -> Fox
Retrieve Mention Spans from the “Virtual” Knowledge Base (Dense + Sparse)
Inputs per hop: vX1, qR1 for X1.follow(R1); vX2, qR2 for X2.follow(R2)
Key Idea: We can do this with sparse matrix-vector products and inner product search
![Page 89: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/89.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities (non-zero entries: Mila_Kunis, Mila_(film))
Entities are represented as a sparse vector
• Size = # entities in the corpus
• Non-zero values are confidence scores
• E.g. from an entity linking model
![Page 90: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/90.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities
qR1: dense relation vector
Relations are represented as a dense feature vector
• We use a 5-layer Transformer Network over the question
![Page 93: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/93.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities; qR1: dense relation vector
Mentions are retrieved in two steps:
1. TF-IDF against the surface form of entities
Average TF-IDF
![Page 94: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/94.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities; qR1: dense relation vector
Sparse vector x sparse matrix: $v_{X_1} A_{E \to M}$, where $A_{E \to M}$ is a #entities x #mentions sparse matrix
This can be pre-computed
If the max number of non-zero entries in each row is bounded (n), we can implement this efficiently using Ragged Tensors [1]: O(k max(k, n))
[1] https://www.tensorflow.org/guide/ragged_tensors
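One way to picture the bounded-row product is to store each row of the entity-to-mention matrix as a short list of (mention, weight) pairs, which is the kind of layout a ragged tensor holds. The sketch below uses plain Python rather than TensorFlow, and the helper name, ids, and weights are all made up for illustration; with k non-zero entities and at most n entries per row, the loop visits at most k * n pairs.

```python
def sparse_vec_times_ragged(entity_scores, rows):
    """Multiply a sparse entity vector by a row-wise ragged sparse matrix.

    entity_scores: {entity_id: score} holding only the non-zero entries.
    rows: rows[entity_id] is a list of (mention_id, weight) pairs.
    Returns {mention_id: score} for the mentions that receive any weight.
    """
    mention_scores = {}
    for entity_id, score in entity_scores.items():
        for mention_id, weight in rows[entity_id]:
            mention_scores[mention_id] = mention_scores.get(mention_id, 0.0) + score * weight
    return mention_scores


# Toy matrix with 3 entities and 4 mentions; each row has at most n = 2 entries.
rows = [
    [(0, 0.5), (2, 0.25)],  # entity 0
    [(1, 1.0)],             # entity 1
    [(2, 0.5), (3, 0.5)],   # entity 2
]

# k = 2 non-zero entities with made-up confidence scores.
entity_scores = {0: 1.0, 2: 0.5}

mention_scores = sparse_vec_times_ragged(entity_scores, rows)
```

Rows for entities with zero score are never read, so the cost does not grow with the total number of entities or mentions.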
![Page 96: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/96.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities; qR1: dense relation vector
Mentions are retrieved in two steps:
1. TF-IDF against the surface form of entities
2. Dot product against relation vector
Approximate Max Inner Product Search: O(polylog(#mentions))
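The second step can be sketched as an exact, brute-force top-K inner product search. This is a stand-in only: the slide refers to approximate search with polylogarithmic cost, which a linear scan like this one does not achieve. The embeddings and query are toy values and `top_k_inner_product` is a hypothetical helper.

```python
import numpy as np


def top_k_inner_product(query, mention_embeddings, k):
    """Return (indices, scores) of the k mentions with the largest inner product."""
    scores = mention_embeddings @ query
    top = np.argsort(-scores)[:k]  # sort descending by score, keep the first k
    return top, scores[top]


# Four toy mention embeddings in a 2-dimensional space.
mention_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [-1.0, 0.0],
])
relation_query = np.array([1.0, 0.5])

top_ids, top_scores = top_k_inner_product(relation_query, mention_embeddings, k=2)
```

An approximate index would replace the full matrix product with a lookup that inspects only a small part of the mention table.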
![Page 97: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/97.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities; qR1: dense relation vector
Retrieve mentions which score high using both components
- Dense part checks whether type is correct
- Sparse part checks whether it co-occurs with an input entity
![Page 99: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/99.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1)
vX1: k-hot sparse vector of entities; qR1: dense relation vector
Aggregate mentions to entities (take maximum score of all coreferent mentions)
vX2: Family_Guy, That_70s_Show
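The aggregation step can be sketched as a max over the mentions that refer to the same entity. The mention ids, scores, and the `aggregate_max` helper below are invented for illustration; only the entity names come from the slide.

```python
def aggregate_max(mention_scores, mention_to_entity):
    """Collapse mention scores to entity scores, keeping the best mention per entity."""
    entity_scores = {}
    for mention_id, score in mention_scores.items():
        entity = mention_to_entity[mention_id]
        best_so_far = entity_scores.get(entity, float("-inf"))
        entity_scores[entity] = max(best_so_far, score)
    return entity_scores


# Hypothetical coreference map: mentions 0 and 1 both refer to Family_Guy.
mention_to_entity = {0: "Family_Guy", 1: "Family_Guy", 2: "That_70s_Show"}

# Made-up retrieval scores for the three mentions.
mention_scores = {0: 0.9, 1: 0.4, 2: 0.7}

entity_scores = aggregate_max(mention_scores, mention_to_entity)
```

The result plays the role of vX2: a sparse set of entities with scores, ready to be the input of the next follow step.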
![Page 102: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/102.jpg)
Which TV Networks has Mila Kunis appeared on?
X1.follow(R1): vX1 (k-hot sparse vector of entities), qR1 (dense relation vector)
X2.follow(R2): vX2, qR2
![Page 103: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/103.jpg)
Summary
• Every operation is differentiable: can learn from denotations!
• Complexity depends only on log of #entities / #mentions!

$$v_X.\mathrm{follow}(R) = \left[ v_X A_{E \to M} \odot T_K(q_R) \right] A_{M \to E}$$

• $v_X$: Sparse vector of entities
• $q_R$: Dense relation vector
• $A_{E \to M}$: Entity -> Mention TF-IDF matrix
• $A_{M \to E}$: Mention -> Entity Coreference Matrix
• $T_K$: Top-K inner product search
• $\odot$: Element-wise Product
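The follow formula on this slide can be traced on a toy example. The sketch below uses small dense NumPy arrays for readability, whereas the talk describes sparse operations and approximate search; the entities, mentions, matrices, and embeddings are all invented for illustration, and aggregation is the matrix product from the formula.

```python
import numpy as np

ENTITIES = ["Mila_Kunis", "Family_Guy", "Fox"]

# A_EM[e, m] > 0 when entity e co-occurs with mention m (all weights set to 1 here).
A_EM = np.array([
    [1.0, 0.0, 0.0],  # Mila_Kunis co-occurs with mention 0
    [0.0, 1.0, 1.0],  # Family_Guy co-occurs with mentions 1 and 2
    [0.0, 0.0, 0.0],  # Fox co-occurs with no mention in this toy corpus
])

# A_ME[m, e] = 1 when mention m refers to entity e.
A_ME = np.array([
    [0.0, 1.0, 0.0],  # mention 0 refers to Family_Guy
    [0.0, 0.0, 1.0],  # mention 1 refers to Fox
    [1.0, 0.0, 0.0],  # mention 2 refers to Mila_Kunis
])

# Toy dense mention embeddings: axis 0 stands for "show", axis 1 for "network".
MENTION_EMB = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])


def top_k_scores(q, mention_emb, k):
    """T_K(q): the k largest inner products kept in place, zeros elsewhere."""
    scores = mention_emb @ q
    keep = np.argsort(-scores)[:k]
    out = np.zeros_like(scores)
    out[keep] = scores[keep]
    return out


def follow(v, q, k=2):
    """[v A_EM (element-wise *) T_K(q)] A_ME, as in the summary formula."""
    return ((v @ A_EM) * top_k_scores(q, MENTION_EMB, k)) @ A_ME


v_x1 = np.array([1.0, 0.0, 0.0])  # start from Mila_Kunis
q_r1 = np.array([1.0, 0.0])       # toy relation vector for "appeared in"
q_r2 = np.array([0.0, 1.0])       # toy relation vector for "network of"

v_x2 = follow(v_x1, q_r1)
v_x3 = follow(v_x2, q_r2)
answer = ENTITIES[int(np.argmax(v_x3))]
```

Because each step is a product or a top-K selection of scores, the output of one hop can be fed directly into the next, which is what allows chaining two follow operations for the Mila Kunis question.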
![Page 108: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/108.jpg)
Pre-training
Entity Mentions (Dense):
… Family Guy is an American …
… their children, Meg, Chris, Stewie …
… In 1999, Kunis replaced Lacey Chabert …
… created by Seth MacFarlane for Fox …
… were Topher Grace, Mila Kunis …
… originally aired on Fox from August 23 …
… wanted the show to have a 1970s feel …
• Distantly align KB facts to text passages:
Text: Pulp Fiction is a 1994 film written and directed by Quentin Tarantino
KB fact: (Quentin Tarantino, directed, Pulp Fiction)
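Distant alignment can be sketched as pairing a KB fact with any passage that mentions both its subject and its object. The slides do not say how the alignment was actually done, so the exact string matching below is an assumption, and `distantly_align` is a hypothetical helper.

```python
def distantly_align(facts, passages):
    """Pair each (subject, relation, object) fact with passages naming both ends."""
    aligned = []
    for subject, relation, obj in facts:
        for passage in passages:
            # Assumption: plain substring match stands in for entity linking.
            if subject in passage and obj in passage:
                aligned.append(((subject, relation, obj), passage))
    return aligned


facts = [("Quentin Tarantino", "directed", "Pulp Fiction")]
passages = [
    "Pulp Fiction is a 1994 film written and directed by Quentin Tarantino",
    "Family Guy is an American animated sitcom created by Seth MacFarlane "
    "for the Fox Broadcasting Company.",
]

aligned = distantly_align(facts, passages)
```

Pairs produced this way are noisy, since a passage can mention both entities without expressing the relation, which is the usual caveat of distant supervision.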
![Page 113: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/113.jpg)
MetaQA Results
Hits @1 Performance (bar chart):
• KB-SOTA: 97.0 (1-hop), 99.9 (2-hop), 91.4 (3-hop)
• PullNet: 84.4 (1-hop), 81.0 (2-hop), 78.2 (3-hop)
• Ours (end-to-end): 84.4 (1-hop), 86.0 (2-hop), 87.6 (3-hop)
• Ours (strong sup.): 84.5 (1-hop), 87.1 (2-hop), 87.1 (3-hop)
Sun et al (EMNLP’19)
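For reference, Hits@1 counts a question as correct when the top-ranked prediction is one of its gold answers. The sketch below computes it as a percentage over a toy set of predictions; the data and the `hits_at_1` helper are made up for illustration.

```python
def hits_at_1(ranked_predictions, gold_answers):
    """Percentage of questions whose top-ranked prediction is a gold answer."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_answers)
        if ranked and ranked[0] in gold
    )
    return 100.0 * hits / len(gold_answers)


# Four toy questions: ranked predictions (best first) and sets of gold answers.
ranked_predictions = [["Fox", "NBC"], ["NBC"], ["Antwerp"], []]
gold_answers = [{"Fox"}, {"Fox"}, {"Antwerp"}, {"Rotterdam"}]

score = hits_at_1(ranked_predictions, gold_answers)
```

An empty prediction list counts as a miss, so the denominator is always the number of questions.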
![Page 116: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/116.jpg)
MetaQA Results
Queries / sec on a single 6-core CPU (bar chart):
• PullNet: 16.3 (1-hop), 3.8 (2-hop), 1.7 (3-hop)
• Ours: 33.0 (1-hop), 19.0 (2-hop), 13.0 (3-hop)
Sun et al (EMNLP’19)
PullNet does iterative retrieval + Graph Neural Network to read
![Page 120: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/120.jpg)
New Dataset: WikiData Slot-filling
• Constructed by aligning WikiData facts to Wikipedia
• 1-3 hop semi-synthetic slot-filling queries over ~300 relations
Q. Marcel de Graaff, place of birth, twinned administrative body?
Ans. Rotterdam -> Antwerp
Q. Muhammad Sanya, member of political party, chairperson, date of birth?
Ans. Civic United Front -> Ibrahim Lipumba -> 6 June 1952
• Task is to extract answer from an unseen corpus of 10K Wikipedia articles (120K passages)
- ~200K entities
- ~1.2M mentions
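To make the query format concrete, here is a minimal sketch of how a multi-hop slot-filling query (a subject followed by a chain of relations) resolves hop by hop. It assumes a toy fact table holding only the facts from the two example queries; the names `FACTS` and `answer_slot_filling` are illustrative and not from the talk. In the actual task the answers must be extracted from Wikipedia text rather than looked up in a table.

```python
from typing import Dict, List, Set, Tuple

# Toy fact table: (subject, relation) -> set of objects.
# Only the facts needed for the two example queries are included.
FACTS: Dict[Tuple[str, str], Set[str]] = {
    ("Marcel de Graaff", "place of birth"): {"Rotterdam"},
    ("Rotterdam", "twinned administrative body"): {"Antwerp"},
    ("Muhammad Sanya", "member of political party"): {"Civic United Front"},
    ("Civic United Front", "chairperson"): {"Ibrahim Lipumba"},
    ("Ibrahim Lipumba", "date of birth"): {"6 June 1952"},
}


def answer_slot_filling(subject: str, relations: List[str]) -> List[Set[str]]:
    """Follow each relation in turn and return the entity set reached after every hop."""
    current: Set[str] = {subject}
    hops: List[Set[str]] = []
    for relation in relations:
        # Union of the objects reachable from every entity in the current set.
        current = set().union(*(FACTS.get((e, relation), set()) for e in current))
        hops.append(current)
    return hops


hops = answer_slot_filling(
    "Muhammad Sanya",
    ["member of political party", "chairperson", "date of birth"],
)
print(" -> ".join(", ".join(sorted(h)) for h in hops))
# prints: Civic United Front -> Ibrahim Lipumba -> 6 June 1952
```

Keeping a set of entities at each hop, rather than a single entity, lets a query fan out when a relation has several objects.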
![Page 122: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/122.jpg)
Baselines
• Cascaded versions of two open-domain QA models - PIQA [1] and DrQA [2]
• Relations were converted to natural language questions using templates
[1] Seo, Minjoon, et al. "Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index." ACL (2019).
[2] Chen, Danqi, et al. "Reading Wikipedia to Answer Open-Domain Questions." ACL (2017).
Q. Marcel de Graaff, place of birth, twinned administrative body?
Q1. Where was Marcel de Graaff born? -> PIQA / DrQA -> Rotterdam
Q2. Name a sister city of Rotterdam. -> PIQA / DrQA -> Antwerp
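The cascade can be sketched as a loop that turns each relation into a templated question and feeds the previous answer into the next question. This is an illustration under stated assumptions, not the baseline's actual code: the two templates are taken from the example questions on the slide, while `TEMPLATES`, `cascade`, and `toy_qa_model` are hypothetical names, and `toy_qa_model` is a canned stand-in for a real single-hop QA system such as PIQA or DrQA.

```python
from typing import Callable, Dict, List

# Relation -> question template. Wording follows the example on the slide;
# the full template set used in the experiments is not given there.
TEMPLATES: Dict[str, str] = {
    "place of birth": "Where was {entity} born?",
    "twinned administrative body": "Name a sister city of {entity}.",
}


def cascade(entity: str, relations: List[str], qa_model: Callable[[str], str]) -> List[str]:
    """Ask one templated question per hop, feeding each answer into the next question."""
    answers: List[str] = []
    for relation in relations:
        question = TEMPLATES[relation].format(entity=entity)
        entity = qa_model(question)  # top answer of the single-hop QA model
        answers.append(entity)
    return answers


# Hypothetical stand-in for PIQA / DrQA: returns canned answers for the example.
CANNED_ANSWERS: Dict[str, str] = {
    "Where was Marcel de Graaff born?": "Rotterdam",
    "Name a sister city of Rotterdam.": "Antwerp",
}


def toy_qa_model(question: str) -> str:
    return CANNED_ANSWERS[question]


print(cascade("Marcel de Graaff", ["place of birth", "twinned administrative body"], toy_qa_model))
# prints: ['Rotterdam', 'Antwerp']
```

Because only the single top answer is passed along, a mistake at the first hop cannot be corrected at the second.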
![Page 128: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/128.jpg)
WikiData Slot-Filling Results
Hits @1 Performance

| Model | 1-hop | 2-hop | 3-hop |
|---|---|---|---|
| DrQA (off-the-shelf) | 28.7 | 14.1 | 7.0 |
| PIQA (re-trained) | 67.0 | 36.9 | 18.2 |
| Ours (cascaded) | 81.6 | 40.4 | 19.8 |
| Ours (end-to-end) | 83.4 | 46.9 | 24.4 |
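For reference, Hits @1 is the share of queries whose top-ranked prediction is a correct answer. The sketch below assumes each query may have several acceptable gold answers and reports a percentage; the function name `hits_at_1` and the placeholder data are illustrative, not taken from the evaluation code behind these results.

```python
from typing import List, Set


def hits_at_1(ranked_predictions: List[List[str]], gold_answers: List[Set[str]]) -> float:
    """Percentage of queries whose top-ranked prediction is one of the gold answers."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_answers)
        if ranked and ranked[0] in gold  # an empty ranking counts as a miss
    )
    return 100.0 * hits / len(gold_answers)


# Placeholder data: four queries, three answered correctly at rank 1.
predictions = [["a", "b"], ["c"], ["x", "d"], ["e"]]
gold = [{"a"}, {"c"}, {"d"}, {"e", "f"}]
print(hits_at_1(predictions, gold))
# prints: 75.0
```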
![Page 134: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/134.jpg)
Analysis - Intermediate Predictions
• Inspected 100 correctly answered 2-hop questions
• 83% had at least one correct intermediate prediction
• For 80% the most confident intermediate prediction was correct
• This drops to 47% for incorrectly answered questions
• For the remaining 17%, the model learned to answer in a single hop:
What are the genres of the films directed by Justin Simien?
Intermediate = drama, Final = drama
What genres are the movies acted by Jeremy Lin in?
Intermediate = documentary, Final = documentary
[Diagram labels: question, X1.follow(R1), X2.follow(R2), Intermediate Predictions]
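The two statistics above (any correct intermediate prediction vs. the most confident one being correct) can be computed as in the following sketch. It assumes each inspected question comes with a dictionary of scored intermediate predictions and a set of gold intermediate entities; the function name and the placeholder entities (`film_a`, ...) are hypothetical, and the numbers it prints are for the toy data only, not the figures reported in the talk.

```python
from typing import Dict, List, Set, Tuple

# (scored intermediate predictions, gold intermediate entities) for one question.
Example = Tuple[Dict[str, float], Set[str]]


def intermediate_stats(examples: List[Example]) -> Tuple[float, float]:
    """Return (% with any correct intermediate, % whose top-scored intermediate is correct)."""
    any_correct = 0
    top_correct = 0
    for scores, gold in examples:
        if any(entity in gold for entity in scores):
            any_correct += 1
        if scores and max(scores, key=scores.get) in gold:
            top_correct += 1
    n = len(examples)
    return 100.0 * any_correct / n, 100.0 * top_correct / n


toy_examples: List[Example] = [
    ({"film_a": 0.9, "film_b": 0.1}, {"film_a"}),  # top prediction correct
    ({"film_c": 0.6, "film_d": 0.4}, {"film_d"}),  # correct, but not the top one
    ({"drama": 1.0}, {"film_e"}),                  # single-hop shortcut: already an answer type
    ({"film_f": 0.7}, {"film_f"}),                 # top prediction correct
]
print(intermediate_stats(toy_examples))
# prints: (75.0, 50.0)
```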
![Page 135: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/135.jpg)
Conclusion
• Word embeddings provide linguistic knowledge to NLP systems
• Can mention / span embeddings provide world knowledge?
![Page 136: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/136.jpg)
Future Directions
Enriching the Knowledge Base
• Beyond entities and relations
• Beyond English
• Beyond Text
Expanding to tasks other than QA
• Language Modeling
• Dialog
• Translation
![Page 137: Text as a Virtual Knowledge Base - Stanford NLP Group · 2019-10-30 · Text as a Virtual Knowledge Base Bhuwan Dhingra Language Technologies Institute Carnegie Mellon University](https://reader035.vdocument.in/reader035/viewer/2022070914/5fb5608e61f10013ce18792e/html5/thumbnails/137.jpg)
Thank You.