TRANSCRIPT
Jose Camacho-Collados
Cardiff University, 18 March 2019
Word, Sense and Contextualized Embeddings: Vector Representations of Meaning in NLP
Outline
❖ Background
  ➢ Vector Space Models (word embeddings)
  ➢ Lexical resources
❖ Sense representations
➢ Knowledge-based: NASARI, SW2V
➢ Contextualized: ELMo, BERT
❖ Applications
Word vector space models
Words are represented as vectors: semantically similar words are close in the vector space.
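This closeness is usually measured with cosine similarity. Below is a minimal sketch with made-up 3-dimensional vectors; real embeddings have hundreds of dimensions learned from corpora:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy "embeddings" (illustrative values, not learned ones).
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(cosine(vectors["cat"], vectors["dog"]))  # high: similar words are close
print(cosine(vectors["cat"], vectors["car"]))  # low: dissimilar words are far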
Neural networks for learning word vector representations from text corpora -> word embeddings
Why word embeddings?
Embedded vector representations:
• are compact and fast to compute
• preserve important relational information between words (actually, meanings)
• are geared towards general use
Applications for word representations
• Syntactic parsing (Weiss et al. 2015)
• Named Entity Recognition (Guo et al. 2014)
• Question Answering (Bordes et al. 2014)
• Machine Translation (Zou et al. 2013)
• Sentiment Analysis (Socher et al. 2013)
… and many more!
AI goal: language understanding
Limitations of word embeddings
• Word representations cannot capture ambiguity. For instance, bank can refer to a financial institution or to a river bank.
Problem 1: word representations cannot capture ambiguity
NAACL 2018 Tutorial: The Interplay between Lexical Resources and Natural Language Processing (Camacho-Collados, Espinosa-Anke, Pilehvar)
Word representations and the triangle inequality
Example from Neelakantan et al. (2014):
[Figure: a single vector for plant lies close to both pollen and refinery]
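The problem can be made concrete: angular distance obeys the triangle inequality, so a single plant vector that stays close to both pollen and refinery forces pollen and refinery to be spuriously close to each other. A minimal sketch with made-up vectors:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def d(u, v):
    """Angular distance, a proper metric obeying the triangle inequality."""
    return math.acos(max(-1.0, min(1.0, cosine(u, v))))

# Made-up vectors: one 'plant' vector forced to sit between its two meanings.
pollen   = [1.0, 0.0, 0.0]
refinery = [0.0, 1.0, 0.0]
plant    = [0.6, 0.6, 0.5]

# d(pollen, refinery) <= d(pollen, plant) + d(plant, refinery):
# if 'plant' stays close to both, pollen and refinery cannot be pushed far apart.
print(d(pollen, refinery) <= d(pollen, plant) + d(plant, refinery))  # True
```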
[Figure: with separate sense vectors, plant1 lies close to pollen and plant2 close to refinery]
Limitations of word representations
• They cannot capture ambiguity (for instance, bank) and neglect rare senses and infrequent words.
• They do not exploit knowledge from existing lexical resources.
Motivation: Model senses instead of only words
He withdrew money from the bank.
[Figure: the ambiguous word bank maps to candidate senses bank#1 and bank#2]
NASARI: a Novel Approach to a Semantically-Aware Representation of Items
http://lcl.uniroma1.it/nasari/
Key goal: obtain sense representations
We want to create a separate representation for each entry of a given word
Idea
Lexicographic knowledge (WordNet) + encyclopedic knowledge
+ information from text corpora
WordNet
WordNet
Main unit: synset (concept). Each synset groups the word senses that share a meaning:
- electronic device: television, telly, television set, tv, tube, tv set, idiot box, boob tube, goggle box
- the middle of the day: noon, twelve noon, high noon, midday, noonday, noontide
WordNet semantic relations, around the synset {plant, flora, plant life}: '(botany) a living organism lacking the power of locomotion':
- Hypernymy (is-a): {organism, being}: 'a living thing that has (or can develop) the ability to act or function independently'
- Hyponymy (has-kind): {houseplant}: 'any of a variety of plants grown indoors for decorative purposes'
- Meronymy (part of): {hood, cap}: 'a protective covering that is part of a plant'
- Domain: {botany}: 'the branch of biology that studies plants'
Knowledge-based Representations (WordNet)
X. Chen, Z. Liu, M. Sun: A Unified Model for Word Sense Representation and Disambiguation (EMNLP 2014)
S. Rothe and H. Schütze: AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes (ACL 2015)
M. Faruqui, J. Dodge, S. K. Jauhar, C. Dyer, E. Hovy, N. A. Smith: Retrofitting Word Vectors to Semantic Lexicons (NAACL 2015)*
S. K. Jauhar, C. Dyer, E. Hovy: Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models (NAACL 2015)
M. T. Pilehvar and N. Collier: De-Conflated Semantic Representations (EMNLP 2016)
Wikipedia
High coverage of named entities and specialized concepts from different domains.
Wikipedia hyperlinks
Thanks to an automatic mapping algorithm, BabelNet integrates Wikipedia and WordNet, among other resources (Wiktionary, OmegaWiki, WikiData…).
Key feature: multilinguality (271 languages)
BabelNet
[Figure: BabelNet covers both concepts and entities]
BabelNet follows the same structure as WordNet: synsets are the main units.
In this case, synsets are multilingual.
NASARI (Camacho-Collados et al., AIJ 2016)
Goal
Build vector representations for multilingual BabelNet synsets.
How?
We exploit the Wikipedia semantic network and the WordNet taxonomy to construct a subcorpus (contextual information) for any given BabelNet synset.
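The subcorpus construction can be sketched as follows. The pages, links, and weighting below are simplified stand-ins (NASARI itself weights words by lexical specificity rather than raw frequency, and all page names and texts here are made up):

```python
from collections import Counter

# Hypothetical mini "Wikipedia": page text for a synset and for its neighbours
# in the semantic network (names and texts are invented for illustration).
pages = {
    "Plant":  "plant organism leaf soil grow green leaf",
    "Tree":   "tree plant leaf trunk forest",
    "Botany": "botany science plant study",
}
network = {"Plant": ["Tree", "Botany"]}   # hypothetical taxonomy/hyperlink edges

def synset_subcorpus(synset_page, pages, network):
    """Contextual information for a synset: its own page plus linked pages."""
    texts = [pages[synset_page]] + [pages[p] for p in network.get(synset_page, [])]
    return " ".join(texts).split()

def lexical_vector(tokens):
    """A frequency-weighted lexical vector (plain counts stand in for
    lexical specificity in this sketch)."""
    return Counter(tokens)

vec = lexical_vector(synset_subcorpus("Plant", pages, network))
print(vec.most_common(3))
```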
Pipeline
Process of obtaining contextual information for a BabelNet synset, exploiting the BabelNet taxonomy and Wikipedia as a semantic network.
Three types of vector representations:
- Lexical (dimensions are words)
- Unified (dimensions are multilingual BabelNet synsets)
- Embedded (latent dimensions)
Human-interpretable dimensions
[Figure: the unified vector for plant (living organism), with sense dimensions such as organism#1, tree#1, leaf#14, soil#2, garden#2, food#2, carpet#2, table#3, dictionary#3, refinery#1]
Three types of vector representations:
- Lexical (dimensions are words)
- Unified (dimensions are multilingual BabelNet synsets)
- Embedded: low-dimensional vectors exploiting word embeddings obtained from text corpora
Word and synset embeddings share the same vector space!
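Because words and synsets live in the same space, a word can be compared directly with candidate senses. A minimal sketch with invented vectors and sense labels:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical embeddings: words and senses in one shared vector space.
space = {
    "bank":            [0.6, 0.5, 0.1],
    "bank#1(finance)": [0.7, 0.4, 0.0],   # made-up sense labels and values
    "bank#2(river)":   [0.1, 0.2, 0.9],
    "money":           [0.8, 0.3, 0.0],
}

def closest(query, space, k=2):
    """Nearest neighbours of a word among all other words and senses."""
    others = [(name, cosine(space[query], v))
              for name, v in space.items() if name != query]
    return sorted(others, key=lambda t: t[1], reverse=True)[:k]

print(closest("bank", space))
```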
Embedded vector representation: closest senses
SW2V (Mancini et al., CoNLL 2017)
A word is the surface form of a sense: we can exploit this intrinsic relationship for jointly training word and sense embeddings.
How?
By updating the representation of the word and its associated senses interchangeably.
SW2V: Idea
Given as input a corpus and a semantic network:
1. Use a semantic network to link to each word its associated senses in context.
He withdrew money from the bank.
2. Use a neural network where the update of word and sense embeddings is linked, exploiting virtual connections.
[Figure: network over the sentence 'He withdrew money from the bank', where word and linked sense embeddings are updated jointly from the same error signal]
In this way it is possible to learn word and sense/synset embeddings jointly in a single training.
Full architecture of W2V (Mikolov et al., 2013)
E = -log p(w_t | W_t)
Full architecture of SW2V
Words and associated senses are used both as input and output.
E = -log p(w_t | W_t, S_t) - Σ_{s∈S_t} log p(s | W_t, S_t)
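The SW2V objective can be evaluated on toy numbers. The sketch below assumes softmax outputs over a tiny vocabulary; all scores, words, and sense labels are invented purely to show how the two terms of E combine:

```python
import math

def softmax(scores):
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}

# Toy output scores for one training position (made-up values).
word_scores  = {"bank": 2.0, "money": 0.5, "tree": -1.0}
sense_scores = {"bank#1": 1.5, "bank#2": -0.5, "money#1": 0.3}

target_word   = "bank"
target_senses = ["bank#1"]   # senses linked to the word in context (S_t)

p_w = softmax(word_scores)
p_s = softmax(sense_scores)

# E = -log p(w_t | W_t, S_t) - sum over s in S_t of log p(s | W_t, S_t)
E = -math.log(p_w[target_word]) - sum(math.log(p_s[s]) for s in target_senses)
print(round(E, 3))
```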
Word and sense connectivity: example 1
Ten closest word and sense embeddings to the sense company (military unit)
Word and sense connectivity: example 2
Ten closest word and sense embeddings to the sense school (group of fish)
Applications of knowledge-based sense representations
- Taxonomy learning (Espinosa-Anke et al., EMNLP 2016)
- Open Information Extraction (Delli Bovi et al., EMNLP 2015)
- Lexical entailment (Nickel & Kiela, NIPS 2017)
- Word Sense Disambiguation (Rothe & Schütze, ACL 2015)
- Sentiment analysis (Flekova & Gurevych, ACL 2016)
- Lexical substitution (Cocos et al., SENSE 2017)
- Computer vision (Young et al., ICRA 2017)
…
❖ Domain labeling/adaptation
❖ Word Sense Disambiguation
❖ Downstream NLP applications (e.g. text classification)
Applications
Domain labeling (Camacho-Collados and Navigli, EACL 2017)
Annotate each concept/entity with its corresponding domain of knowledge.
To this end, we use the Wikipedia featured articles page, which includes 34 domains and a number of Wikipedia pages associated with each domain (Biology, Geography, Mathematics, Music, etc.).
Domain labeling: Wikipedia featured articles
Domain labeling
How to associate a concept with a domain?
1. Learn a NASARI vector for the concatenation of all Wikipedia pages associated with a given domain.
2. Exploit the semantic similarity between knowledge-based vectors and graph properties of the lexical resources.
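The similarity step can be sketched as follows; the domain names mirror the examples above, but all vectors and the concept ('photosynthesis') are invented:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical NASARI-style vectors: one per domain (learned from that
# domain's concatenated featured articles) and one for a concept.
domain_vectors = {
    "Biology":     [0.9, 0.1, 0.1],
    "Mathematics": [0.1, 0.9, 0.1],
    "Music":       [0.1, 0.1, 0.9],
}
concept_vector = [0.8, 0.2, 0.1]   # e.g. the synset for 'photosynthesis'

def label_domain(concept_vec, domain_vectors):
    """Assign the domain whose vector is most similar to the concept's."""
    return max(domain_vectors, key=lambda d: cosine(concept_vec, domain_vectors[d]))

print(label_domain(concept_vector, domain_vectors))  # Biology
```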
BabelDomains (Camacho-Collados and Navigli, EACL 2017)
As a result: a unified resource with information about domains of knowledge.
BabelDomains is available for BabelNet, Wikipedia and WordNet at http://lcl.uniroma1.it/babeldomains
Already integrated into BabelNet (online interface and API).
BabelDomains
[Figure: example domain labels: Physics and astronomy, Computing, Media]
Domain filtering for supervised distributional hypernym discovery (Espinosa-Anke et al., EMNLP 2016; Camacho-Collados and Navigli, EACL 2017)
Task: given a term, predict its hypernym(s) (e.g. Apple is-a Fruit).
Model: distributional supervised system based on the transformation matrix of Mikolov et al. (2013).
Idea: training data filtered by domain of knowledge.
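The transformation-matrix idea can be sketched as a small linear fit: learn a matrix M such that M·v(term) ≈ v(hypernym) from (term, hypernym) training pairs. The vectors, pairs, and learning rate below are all made up; real systems learn M from corpus-derived embeddings:

```python
# (term vector, hypernym vector) training pairs; values are invented.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),   # e.g. apple -> fruit
    ([0.0, 1.0], [-1.0, 0.0]),  # e.g. oak   -> tree
]

# 2x2 matrix fitted by plain gradient descent on squared error.
M = [[0.0, 0.0], [0.0, 0.0]]
lr = 0.1
for _ in range(500):
    for x, y in pairs:
        pred = [M[0][0] * x[0] + M[0][1] * x[1],
                M[1][0] * x[0] + M[1][1] * x[1]]
        err = [pred[0] - y[0], pred[1] - y[1]]
        for i in range(2):
            for j in range(2):
                M[i][j] -= lr * 2 * err[i] * x[j]

# After training, M maps each term vector close to its hypernym vector.
print([round(M[0][0], 2), round(M[0][1], 2),
       round(M[1][0], 2), round(M[1][1], 2)])  # → [0.0, -1.0, 1.0, 0.0]
```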
Domain filtering for supervised distributional hypernym discovery
[Chart: results on the hypernym discovery task for five domains, comparing domain-filtered vs. non-filtered training data]
Conclusion: filtering training data by domain proves to be clearly beneficial.
Word Sense Disambiguation
Kobe, which is one of Japan's largest cities, [...]
[Figure: candidate senses of Kobe; the context discards the wrong candidate and selects the correct one]
Word Sense Disambiguation (Camacho-Collados et al., AIJ 2016)
Basic idea
Select the sense which is semantically closest to the semantic representation of the whole document (global context).
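This basic idea can be sketched directly: build a vector for the global context and pick the closest sense. All vectors and sense labels below are invented for illustration:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def centroid(vectors):
    """Average of a list of vectors: a simple document representation."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Hypothetical sense vectors for 'bank' and word vectors for the rest of a
# document about finance (all values made up).
senses = {
    "bank#1(finance)": [0.9, 0.1],
    "bank#2(river)":   [0.1, 0.9],
}
document_words = {"money": [0.8, 0.2], "withdraw": [0.9, 0.3], "account": [0.7, 0.1]}

def disambiguate(senses, document_words):
    """Pick the sense closest to the representation of the whole document."""
    doc = centroid(list(document_words.values()))
    return max(senses, key=lambda s: cosine(senses[s], doc))

print(disambiguate(senses, document_words))  # bank#1(finance)
```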
Word Sense Disambiguation on textual definitions (Camacho-Collados et al., LREC 2016; LREV 2018)
Combination of a graph-based disambiguation system (Babelfy) with NASARI to disambiguate the concepts and named entities of over 35M definitions in 256 languages.
Sense-annotated corpus freely available at http://lcl.uniroma1.it/disambiguated-glosses/
Context-rich WSD
Interchanging the positions of the king and a rook.
castling (chess)
Castling is a move in the game of chess involving a player's king and either of the player's original rooks.
A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.
The same concept is also defined in many other languages:
- (French) Manœuvre du jeu d'échecs ['A manoeuvre in the game of chess']
- (German) Spielzug im Schach, bei dem König und Turm einer Farbe bewegt werden ['A move in chess in which the king and a rook of one colour are moved']
- (Spanish) El enroque es un movimiento especial en el juego de ajedrez que involucra al rey y a una de las torres del jugador. ['Castling is a special move in the game of chess involving the king and one of the player's rooks.']
- (Czech) Rošáda je zvláštní tah v šachu, při kterém táhne zároveň král a věž. ['Castling is a special move in chess in which the king and a rook move at the same time.']
- (Turkish) Rok İngilizce'de kaleye rook denmektedir. ['In English, the castle piece is called a rook.']
- (Norwegian) Rokade er et spesialtrekk i sjakk. ['Castling is a special move in chess.']
- (Greek) Το ροκέ είναι μια ειδική κίνηση στο σκάκι που συμμετέχουν ο βασιλιάς και ένας από τους δυο πύργους. ['Castling is a special move in chess involving the king and one of the two rooks.']
Context-rich WSD exploiting parallel corpora (Delli Bovi et al., ACL 2017)
Applying the same method to provide high-quality sense annotations from parallel corpora (Europarl): 120M+ sense annotations for 21 languages.
http://lcl.uniroma1.it/eurosense/
Extrinsic evaluation: improved performance of a standard supervised WSD system using this automatically sense-annotated corpus.
81
Towards a seamless integration of senses in downstream NLP applications
(Pilehvar et al., ACL 2017)
Question: What if we apply WSD and inject sense embeddings to a
standard neural classifier?
Problems:
- WSD is not perfect -> Solution: High-confidence disambiguation
- WordNet lacks coverage -> Solution: Use of Wikipedia
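The high-confidence backoff idea can be sketched in a few lines. This is a toy illustration with hypothetical vectors, sense identifiers, and threshold, not the actual implementation of Pilehvar et al. (2017): keep a token's sense embedding only when the disambiguator is confident enough, otherwise fall back to the plain word embedding.

```python
# Toy sense-injection with a confidence backoff (hypothetical data).
# A disambiguator is assumed to return (sense_id, confidence); below a
# threshold we back off to the original word embedding.

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cut-off

word_vectors = {"bank": [0.1, 0.2], "river": [0.3, 0.1]}
sense_vectors = {"bank%financial": [0.9, 0.8], "bank%geo": [-0.5, 0.4]}

def embed(token, sense_id, confidence):
    """Use the sense vector only for high-confidence disambiguations."""
    if confidence >= CONFIDENCE_THRESHOLD and sense_id in sense_vectors:
        return sense_vectors[sense_id]
    return word_vectors[token]

print(embed("bank", "bank%financial", 0.95))  # confident: sense vector
print(embed("bank", "bank%geo", 0.55))        # not confident: word vector
```

The same backoff also covers the coverage problem: a sense missing from the inventory simply falls back to the word vector.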
Tasks: Topic categorization and sentiment analysis (polarity detection)
Topic categorization: Given a text, assign it a topic (e.g. politics, sports, etc.).
Polarity detection: Predict the sentiment of the sentence/review as either positive or negative.
Classification model
Standard CNN classifier, inspired by Kim (2014)
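The core of a Kim-style CNN is a 1-D convolution over the sequence of (word or sense) embeddings followed by max-over-time pooling. A dependency-free toy of that operation, with illustrative values rather than the talk's actual model:

```python
# Toy 1-D convolution + max-over-time pooling, the core operation of a
# Kim-style CNN text classifier (pure Python, illustrative only).

def conv_max_pool(embeddings, filt):
    """Slide a filter over a sequence of token vectors, keep the max score."""
    width = len(filt)            # filter width in tokens
    dim = len(embeddings[0])     # embedding dimensionality
    scores = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        score = sum(filt[i][d] * window[i][d]
                    for i in range(width) for d in range(dim))
        scores.append(score)
    return max(scores)           # max-over-time pooling

sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 tokens, dim 2
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]          # width-2 filter
print(conv_max_pool(sentence, bigram_filter))
```

In the real model many such filters are learned, and the pooled scores feed a softmax layer over the topic or polarity labels.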
Sense-based vs. word-based: Conclusions
Sense-based better than word-based... when the input text is large enough
Why does the input text size matter?
- Word sense disambiguation works better in larger texts (Moro et al. 2014; Raganato et al. 2017)
- Disambiguation increases sparsity
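The sparsity point can be made concrete: disambiguation splits one word form's occurrences across several sense types, so each sense type is observed less often than the word was. A toy count, with hypothetical sense labels:

```python
from collections import Counter

# The same tokens counted at the word level vs. the sense level
# (hypothetical sense labels): disambiguation splits "bank" into two
# rarer sense types, increasing sparsity.
words  = ["bank", "bank", "bank", "river"]
senses = ["bank%financial", "bank%financial", "bank%geo", "river%geo"]

print(Counter(words))   # "bank" observed 3 times
print(Counter(senses))  # each bank sense observed at most 2 times
```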
Contextualized word embeddings: ELMo/BERT
Peters et al. (NAACL 2018); Devlin et al. (NAACL 2019)

Like word embeddings, they are learned by leveraging language models trained on massive amounts of text.
New: each word vector depends on its context; it is dynamic.
These models have brought important improvements in many NLP tasks.
Contextualized word embeddings: ELMo/BERT (examples)

He withdrew money from the bank. -> 0.25, 0.32, -0.1 …
The bank remained closed yesterday. -> 0.22, 0.30, -0.08 …
We found a nice spot by the bank of the river. -> -0.8, 0.01, 0.3 …

The first two occurrences of "bank" receive similar vectors; the riverbank occurrence receives a clearly different one.
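Using the illustrative three-dimensional vectors from the slide, a quick cosine-similarity check shows the two financial-institution uses of "bank" land close together while the riverbank use ends up far away:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

bank_money1 = [0.25, 0.32, -0.1]   # "withdrew money from the bank"
bank_money2 = [0.22, 0.30, -0.08]  # "the bank remained closed"
bank_river  = [-0.8, 0.01, 0.3]    # "bank of the river"

print(cosine(bank_money1, bank_money2))  # close to 1: similar vectors
print(cosine(bank_money1, bank_river))   # negative: dissimilar
```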
How well do these models capture “meaning”?

Good enough for many applications, but there is still room for improvement. No noticeable gains in:
➢ Winograd Schema Challenge (requires commonsense reasoning): BERT ~65% vs. humans ~95%
➢ Word-in-Context Challenge (requires abstracting the notion of sense): BERT ~65% vs. humans ~85%
For more information on meaning representations (embeddings):
❖ ACL 2016 Tutorial on “Semantic representations of word senses and concepts”: http://josecamachocollados.com/slides/Slides_ACL16Tutorial_SemanticRepresentation.pdf
❖ EACL 2017 workshop on “Sense, Concept and Entity Representations and their Applications”: https://sites.google.com/site/senseworkshop2017/
❖ NAACL 2018 Tutorial on “Interplay between lexical resources and NLP”: https://bitbucket.org/luisespinosa/lr-nlp/
❖ “From Word to Sense Embeddings: A Survey on Vector Representations of Meaning” (JAIR 2018): https://www.jair.org/index.php/jair/article/view/11259