connecting large-scale knowledge bases and natural language · (score_nn_2, _is_a,...

45
Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion Connecting Large-Scale Knowledge Bases and Natural Language Antoine Bordes & Jason Weston CNRS - Univ. Tech. Compiègne & Google UW - MSR Summer Institute 2013, Alderbrook Resort, July 23, 2013 Connecting Large-Scale Knowledge Bases and Natural Language 1

Upload: others

Post on 06-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion

Connecting Large-Scale KnowledgeBases and Natural Language

Antoine Bordes & Jason WestonCNRS - Univ. Tech. Compiègne & Google

UW - MSR Summer Institute 2013,Alderbrook Resort, July 23, 2013

Connecting Large-Scale Knowledge Bases and Natural Language 1

Page 2: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion

Knowledge Bases (KBs)

l Data is structured as a graph.l Each node = an entity.l Each edge = a relation.l A relation = (sub, rel , obj):

m sub =subject,m rel = relation type,m obj = object.

l Nodes w/o features.

Connecting Large-Scale Knowledge Bases and Natural Language 2

Page 3: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion

Example of KB: WordNet

l WordNet: dictionary where each entity is a sense (synset).

l Popular in NLP.l Statistics:

m 117k entities;m 20 relation types;m 500k relations.

l Examples:(car_NN_1, _has_part, _wheel_NN_1)(score_NN_1, _is_a, _rating_NN_1)(score_NN_2, _is_a, _sheet_music_NN_1)

Connecting Large-Scale Knowledge Bases and Natural Language 3

Page 4: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion

Example of KB: Freebase

l Freebase: huge collaborative (hence noisy) KB.

l Part of the Google Knowledge Graph.l Statistics:

m > 80M of entities;m > 20k relation types;m > 1.2B relations.

l Examples:(Lil Wayne, _born_in, New Orleans)(Seattle, _contained_by, USA)(Machine Learning, _subdiscipline, Artificial Intelligence)

Connecting Large-Scale Knowledge Bases and Natural Language 4

Page 5: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language Conclusion

Connecting KBs and Natural Language

l Why?m Text → KB: information extraction;m KB → Text: interpretation (NER, semantic parsing), summary.

l Main issue: KBs are hard to manipulate.m Very large dimensions: 105 −108 entities, 107 −109 rel. types;m Sparse: few valid links;m Noisy/incomplete: missing/wrong relations/entities.

l How?1. Encode KBs into low-dimensional vector spaces;2. Use these representations as KB data in text applications.

Connecting Large-Scale Knowledge Bases and Natural Language 5

Page 6: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Menu

l Context

l Modeling Knowledge Basesm Embedding-based Modelsm Experiments on Freebase

l Connecting KBs & Natural Languagem Wordnet+Text for Word-Sense Disambiguationm Freebase+Text for Relation Extraction

l Conclusionm What now?

Connecting Large-Scale Knowledge Bases and Natural Language 6

Page 7: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Statistical Relational Learning

l Framework:

m ns subjects {subi }i∈�1;ns�m nr relation types {relk }k∈�1;nr �m no objects {objj }j∈�1;no�→ For us, ns = no = ne and ∀i ∈ �1;ne�,subi = obji .

m A relation exists for (subi , relk ,objj) if relk (subi ,objj)=1

l Goal: We want to model, from data,

P[relk(subi ,objj)= 1]

(equivalent to approximate the binary tensor X ∈ {0,1}ns×no×nr )

Connecting Large-Scale Knowledge Bases and Natural Language 7

Page 8: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Energy-based Learning

Two main ideas:

1. Models based on low-dimensional continuous vectorembeddings for entities and relation types, learned to define asimilarity criterion.

2. Stochastic training with sub-sampling of unknown relations.

Connecting Large-Scale Knowledge Bases and Natural Language 8

Page 9: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Learning Representations

l Subjects and objects are represented by vectors in Rd .m {subi }i∈�1;ns� → [s1, . . . ,sns ] ∈Rd×ns

m {objj }j∈�1;no� → [o1, . . . ,ono ] ∈Rd×no

For us, ns = no = ne and ∀i ∈ �1;ne�,si =oi .

l Rel. types = similarity operators between subjects/objects.m {relk }k∈�1;nr � → operateurs {rk }k∈�1;nr �

l Learning similarities depending on rel → d(sub, rel ,obj).(we can retrieve a probability using a transfer function)

Connecting Large-Scale Knowledge Bases and Natural Language 9

Page 10: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Modeling Relations as Translations

Intuition: we would like that s+ r ≈o.

We define the similarity measure:

d(h, r , t)= ||h+ r − t ||22

We learn s,r and o that verify that.

Connecting Large-Scale Knowledge Bases and Natural Language 10

Page 11: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Modeling Relations as Translations

Intuition: we would like that s+ r ≈o.

We define the similarity measure:

d(h, r , t)= ||h+ r − t ||22

We learn s,r and o that verify that.

Connecting Large-Scale Knowledge Bases and Natural Language 11

Page 12: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Modeling Relations as Translations

Intuition: we would like that s+ r ≈o.

We define the similarity measure:

d(sub, rel ,obj)= ||s+ r −o||22We learn s,r and o that verify that.

Connecting Large-Scale Knowledge Bases and Natural Language 12

Page 13: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Stochastic Training

l Learning by stochastic gradient descent: one observed(true?) relation after the other.

l For each relation from the training set:1. we sub-sample unobserved relations (false?).2. we check if the similarity of the true relation is lower.3. if not, we update parameters of the considered relations.

l Stopping criterion: performance on a validation set.

Connecting Large-Scale Knowledge Bases and Natural Language 13

Page 14: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Menu

l Context

l Modeling Knowledge Basesm Embedding-based Modelsm Experiments on Freebase

l Connecting KBs & Natural Languagem Wordnet+Text for Word-Sense Disambiguationm Freebase+Text for Relation Extraction

l Conclusionm What now?

Connecting Large-Scale Knowledge Bases and Natural Language 14

Page 15: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Chunks of Freebase

l Data statistics:Entities (ne) Rel. (nr ) Train. Ex. Valid. Ex. Test Ex.

Freebase15k 14,951 1,345 483,142 50,000 59,071Freebase1M 1×109 23,382 17.5×109 50,000 177,404

l Experimental setup:m Embedding dimension: 50.m Training time:

l on Freebase15k: ≈5h (on 1 CPU),l on Freebase1M : ≈1j (on 16 cores).

Connecting Large-Scale Knowledge Bases and Natural Language 15

Page 16: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Visualization of 1,000 Entities

Connecting Large-Scale Knowledge Bases and Natural Language 16

Page 17: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Visualization of 1,000 Entities - Zoom 1

Connecting Large-Scale Knowledge Bases and Natural Language 17

Page 18: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Visualization of 1,000 Entities - Zoom 2

Connecting Large-Scale Knowledge Bases and Natural Language 18

Page 19: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Visualization of 1,000 Entities - Zoom 3

Connecting Large-Scale Knowledge Bases and Natural Language 19

Page 20: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Link Prediction

"Who influenced J.K. Rowling?"

J. K. Rowling _influenced_by ?

C. S. LewisLloyd AlexanderTerry PratchettRoald DahlJorge Luis BorgesStephen KingIan Fleming

Connecting Large-Scale Knowledge Bases and Natural Language 20

Page 21: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Link Prediction

"Who influenced J.K. Rowling?"

J. K. Rowling _influenced_by G. K. ChestertonJ. R. R. TolkienC. S. LewisLloyd AlexanderTerry PratchettRoald DahlJorge Luis BorgesStephen KingIan Fleming

Connecting Large-Scale Knowledge Bases and Natural Language 21

Page 22: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Link Prediction

"Which genre is the movie WALL-E?"

WALL-E _has_genre ?Computer animationComedy filmAdventure filmScience FictionFantasyStop motionSatireDrama

Connecting Large-Scale Knowledge Bases and Natural Language 22

Page 23: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Link Prediction

"Which genre is the movie WALL-E?"

WALL-E _has_genre AnimationComputer animationComedy filmAdventure filmScience FictionFantasyStop motionSatireDrama

Connecting Large-Scale Knowledge Bases and Natural Language 23

Page 24: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Link Prediction

On Freebase15k:

On Freebase1M, our model predicts 34% in the Top-10.

Connecting Large-Scale Knowledge Bases and Natural Language 24

Page 25: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionEmbedding-based Models Experiments on Freebase

Learn Unknown Relation Types

Learning embeddings of 40 unkowns relation types.

Connecting Large-Scale Knowledge Bases and Natural Language 25

Page 26: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Menu

l Context

l Modeling Knowledge Basesm Embedding-based Modelsm Experiments on Freebase

l Connecting KBs & Natural Languagem Wordnet+Text for Word-Sense Disambiguationm Freebase+Text for Relation Extraction

l Conclusionm What now?

Connecting Large-Scale Knowledge Bases and Natural Language 26

Page 27: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Disambiguation within a Specific Framework

Disambiguation → connect free text and the KB WordNet.

Towards open-text semantic parsing:

Connecting Large-Scale Knowledge Bases and Natural Language 27

Page 28: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Joint Modeling of Text and WordNet

l Text is converted into relations (sub,rel ,obj).l We learn a vector for any symbol: words, entities and relation

types from WordNet.l Our system can label 37,141 words with 40,943 synsets.

Train. Ex. Test Ex. Labeled? SymbolWordNet 146,442 5,000 No synsetsWikipedia 2,146,131 10,000 No wordsConceptNet 11,332 0 Non wordsExt. WordNet 42,957 5,000 Yes words+synsetsUnamb. Wikip. 981,841 0 Yes words+synsetsTOTAL 3,328,703 20,000 - -

Connecting Large-Scale Knowledge Bases and Natural Language 28

Page 29: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Experimental Results

F1-score on 5,000 test sentences to disambiguate.

Connecting Large-Scale Knowledge Bases and Natural Language 29

Page 30: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Enrich WordNet

We create similarities going beyond WordNet.

"what does an army attack?"

army_NN_1 attack_VB_1 ?armed_service_NN_1ship_NN_1territory_NN_1military_unit_NN_1

Connecting Large-Scale Knowledge Bases and Natural Language 30

Page 31: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Enrich WordNet

We create similarities going beyond WordNet.

"what does an army attack?"

army_NN_1 attack_VB_1 troop_NN_4armed_service_NN_1ship_NN_1territory_NN_1military_unit_NN_1

Connecting Large-Scale Knowledge Bases and Natural Language 31

Page 32: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Enrich WordNet

We create similarities going beyond WordNet.

"Who or what earns money"

? earn_VB_1 money_NN_1business_firm_NN_1

family_NN_1payoff_NN_3

card_game_NN_1

Connecting Large-Scale Knowledge Bases and Natural Language 32

Page 33: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Enrich WordNet

We create similarities going beyond WordNet.

"Who or what earns money"

person_NN_1 earn_VB_1 money_NN_1business_firm_NN_1family_NN_1payoff_NN_3card_game_NN_1

Connecting Large-Scale Knowledge Bases and Natural Language 33

Page 34: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Menu

l Context

l Modeling Knowledge Basesm Embedding-based Modelsm Experiments on Freebase

l Connecting KBs & Natural Languagem Wordnet+Text for Word-Sense Disambiguationm Freebase+Text for Relation Extraction

l Conclusionm What now?

Connecting Large-Scale Knowledge Bases and Natural Language 34

Page 35: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Relation Extraction

Given a bunch of sentences.

Connecting Large-Scale Knowledge Bases and Natural Language 35

Page 36: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Relation Extraction

Given a bunch of sentences concerning the same pair of entities.

Connecting Large-Scale Knowledge Bases and Natural Language 36

Page 37: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Relation Extraction

Goal: identify if there is a relation between them to add to the KB.

Connecting Large-Scale Knowledge Bases and Natural Language 37

Page 38: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Relation Extraction

And from which type, to enrich an existing KB.

Connecting Large-Scale Knowledge Bases and Natural Language 38

Page 39: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Jointly use Text and Freebase

l Standard Method: a classifier is trained to predict therelation type, given txts and (sub, obj):

r(txts,sub,obj)= argmaxrel ′

Stxt2rel(txts, rel ′)

l Idea: extract relations by using both text + availableknowledge (= current KB).

l Our model of the KB forces extracted relations to agree with it:

r(txts,sub,obj)= argmaxrel ′

(Stxt2rel(txts, rel ′)−dBC(sub, rel ′,obj))

Connecting Large-Scale Knowledge Bases and Natural Language 39

Page 40: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWordnet+Text for Word-Sense Disambiguation Freebase+Text for Relation Extraction

Experiments on NYT+Freebase

We learn on New York Times papers and on Freebase.

recall

pre

cisi

on

0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10.4

0.5

0.6

0.7

0.8

0.9

Wsabie M2R+FBMIMLREHoffmannWsabie M2RRiedelMintz

Precision/recall curve for predicting relations.

Connecting Large-Scale Knowledge Bases and Natural Language 40

Page 41: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWhat now?

Menu

l Context

l Modeling Knowledge Basesm Embedding-based Modelsm Experiments on Freebase

l Connecting KBs & Natural Languagem Wordnet+Text for Word-Sense Disambiguationm Freebase+Text for Relation Extraction

l Conclusionm What now?

Connecting Large-Scale Knowledge Bases and Natural Language 41

Page 42: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWhat now?

Encode KBs into vector spaces

l KBs are rich but need attention.

l Learn to project into vector spaces:m Ease their visualization;m Allows for link prediction (with/without ext. data);m Facilitate their use int other systems;m Compact format.

l Is that all?

Connecting Large-Scale Knowledge Bases and Natural Language 42

Page 43: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWhat now?

Challenges

We’re just getting started:l How to reason: combine logic, deduction.l Evaluate confidence in predictions.l Summarize KBs.l Fusion KBs.l Connect text and KBs: mutual interactions.l etc.

Connecting Large-Scale Knowledge Bases and Natural Language 43

Page 44: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWhat now?

Fin

Data/code available from my webpage.

Thanks!

[email protected]://www.hds.utc.fr/~bordesan

Connecting Large-Scale Knowledge Bases and Natural Language 44

Page 45: Connecting Large-Scale Knowledge Bases and Natural Language · (score_NN_2, _is_a, _sheet_music_NN_1) Connecting Large-Scale Knowledge Bases and Natural Language 3. ContextModeling

Context Modeling Knowledge Bases Connecting KBs & Natural Language ConclusionWhat now?

References

1. Learning Structured Embeddings of Knowledge Bases.A. Bordes, J. Weston, R. Collobert & Y. Bengio. AAAI, 2011.

2. Joint Learning of Words and Meaning Representations forOpen-Text Semantic Parsing.A. Bordes, X. Glorot, J. Weston & Y. Bengio. AISTATS, 2012.

3. A Latent Factor Model for Highly Multi-relational Data.R. Jenatton, N. Le Roux, A. Bordes & G. Obozinski. NIPS, 2012.

4. A Semantic Matching Energy Function for Learning withMulti-relational Data.A. Bordes, X. Glorot, J. Weston & Y. Bengio. MLj, 2013.

5. Irreflexive and Hierarchical Relations as Translations.A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston & O. Yakhnenko.ICML Workshop on Structured Learning, 2013

Connecting Large-Scale Knowledge Bases and Natural Language 45