word re-embedding via manifold learning · 2020. 3. 13. · word re-embedding: related work –...
TRANSCRIPT
![Page 1: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/1.jpg)
Word Re-Embedding via Manifold Learning
Souleiman Hasan Lecturer, Maynooth University, School of Business and
Hamilton Institute
Machine Learning Dublin Meetup, Dublin, Ireland, 26 Feb 2018
![Page 2: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/2.jpg)
Based On
• Souleiman Hasan and Edward Curry. "Word Re-Embedding via Manifold Dimensionality Retention." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)
http://aclweb.org/anthology/D17-1033
2
![Page 3: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/3.jpg)
Outline
• Motivation
– NLP tasks, Semantics, Word embeddings
• Background
– Mathematical Structures, Manifold Learning
• Word Re-embedding
– Methodology and Related Work
– Approach
– Results
3
![Page 4: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/4.jpg)
NLP Tasks
– Named entity recognition
4
![Page 5: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/5.jpg)
NLP Tasks
– Sentiment Analysis
5
![Page 6: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/6.jpg)
Generic NLP Supervised Model
6
ML Training Data
Model W
ord
Fe
atu
re
Extr
acti
on
![Page 7: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/7.jpg)
Generic NLP Supervised Model
7
ML Training Data
Model W
ord
Fe
atu
re
Extr
acti
on
Distributed Word Representations
E.g. (Collobert, 2011), (Turian et al., 2010), (Devlin et al., 2014)
![Page 8: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/8.jpg)
Word Distributed Representations capture Semantics
• Semantics: “The relation between the words or expressions of a language and their meaning.” (Gardenfors, 2004)
Mean
ing
s
Sym
bo
ls
Orange Yellow
{sunset, the tray, my table, Joe’s laptop, that towel, …}
{my car, the sticker, her pen,
my bag, …}
Extensional e.g. Tarski model theory
Conceptual Spaces (Gardendors, 2004)
Orange Yellow
Distributional ESA (Gabrilovich & Markovitch, 2007)
8
Orange Yellow
http://en.wikipedia.org /wiki/Fire
http://en.wikipedia.org / wiki/Sun v1 v2
topology, statistical topology
Image CC BY-SA 3.0 by Michael Horvath https://commons.wikimedia.org/wiki/User:SharkD
![Page 9: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/9.jpg)
Word Embeddings
• Word2Vec (Mikolov et al., 2013)
• GloVe (Pennington et al., 2014)
9
![Page 10: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/10.jpg)
Word Embeddings
• Word2Vec (Mikolov et al., 2013)
• GloVe (Pennington et al., 2014)
10 Image is from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.
![Page 11: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/11.jpg)
Why Called Embedding?
11 Image CC BY-SA 3.0 by Lars H. Rohwedder, https://commons.wikimedia.org/wiki/User:RokerHRO
![Page 12: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/12.jpg)
Why Called Embedding?
12 Derivative image CC BY-SA 3.0 by Ryan Wilson from Jitse Niesen https://commons.wikimedia.org/wiki/User:Pbroks13; https://en.wikipedia.org/wiki/User:Jitse_Niesen
![Page 13: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/13.jpg)
Mathematical Structures
13
Set
Topological Space
Metric Space
Vector Space
Mo
re Structu
re
{Dublin, Paris, LA}
{Dublin, Paris, LA} {{}, {Dublin, Paris}, {Dublin, Paris, LA}}
{Dublin, Paris, LA} Dublin Paris LA
Dublin 0 779 8314
Paris 779 0 9096
LA 8314 9096 0
{Dublin, Paris, LA} Dublin=53.3498° N+6.2603° W Paris= 48.8566° N+ 2.3522° E LA= 34.0522° N+ 118.2437° W
![Page 14: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/14.jpg)
Mathematical Structures and ML
14
Set
Topological Space
Metric Space
Vector Space
{Dublin, Paris, LA}
{Dublin, Paris, LA} {{}, {Dublin, Paris}, {Dublin, Paris, LA}}
{Dublin, Paris, LA} Dublin Paris LA
Dublin 0 779 8314
Paris 779 0 9096
LA 8314 9096 0
{Dublin, Paris, LA} Dublin=53.3498° N+6.2603° W Paris= 48.8566° N+ 2.3522° E LA= 34.0522° N+ 118.2437° W N dim-> k dim
Dimensionality Reduction
Kernel Methods
Graph Kernels
Emb
edd
ing
![Page 15: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/15.jpg)
e.g. Earth Surface 2D Embedding
15 Image CC BY-SA 3.0 by Lars H. Rohwedder, https://commons.wikimedia.org/wiki/User:RokerHRO
![Page 16: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/16.jpg)
e.g. Shape of the Universe
16
![Page 17: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/17.jpg)
Manifold Learning
• Embedding while preserving the neighbourhood
17 Saul, Lawrence K., and Sam T. Roweis. "An introduction to locally linear embedding.“ Available at: http://www. cs. toronto. edu/~ roweis/lle/publications. html(2000).
![Page 18: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/18.jpg)
Manifold Learning for Dimensionality Reduction
18
![Page 19: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/19.jpg)
Manifold Learning- Various Algorithms
19 http://scikit-learn.org/stable/auto_examples/manifold/plot_compare_methods.html
![Page 20: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/20.jpg)
Word Re-Embedding: Problem
20
Wo
rd P
airs
Em
be
dd
ing
Sim
ilari
ty
bas
ed o
n G
loV
e 4
2B
30
0d
em
bed
din
g
Word Pairs Ground Truth Similarity By WS353 ground truth similarity score
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 6 7 8 9 10
Pair: “shore”
“woodland”
Embedding sim(“shore”, “woodland”) > sim(“physics”, “proton”)
Despite an opposite order with a big difference in the ground truth
Pair: “physics”
“proton”
![Page 21: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/21.jpg)
Word Re-Embedding: Problem
21
Wo
rd P
airs
Em
be
dd
ing
Sim
ilari
ty
bas
ed o
n G
loV
e 4
2B
30
0d
em
bed
din
g
Word Pairs Ground Truth Similarity By WS353 ground truth similarity score
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 6 7 8 9 10
Pair: “shore”
“woodland”
Embedding sim(“shore”, “woodland”) > sim(“physics”, “proton”)
Despite an opposite order with a big difference in the ground truth
Pair: “physics”
“proton”
Can we improve an existing embedding
space?
![Page 22: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/22.jpg)
Word Re-Embedding: Methodology
22
![Page 23: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/23.jpg)
Word Re-Embedding: Related Work
– Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a)
– Unified metric recovery framework for word embedding and manifold learning (Hashimoto et al., 2016)
– Manifold learning for dimensionality reduction and embedding: Locally Linear Embedding (LLE) (Roweis and Saul, 2000), Isomap (Balasubramanian and Schwartz, 2002), t-SNE (Maaten and Hinton, 2008), etc.
– Word embedding post-processing: (Labutov and Lipson, 2013), (Lee et al., 2016), (Mu et al., 2017)
– Need for generic, unsupervised, nonlinear, and theoretically-founded model for post-processing
23
![Page 24: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/24.jpg)
Approach
24 Original Word
Embedding Space
Sample
Manifold
Fitting Window
fit
Manifold
LearningNew
Embedding
tra
nd
form
window start
window
length
Test Vector
Re-Embedded
Test Vector
dimensionality d
dimensionality d
1
2
3
4
Vo
ca
bu
lary
(fr
eq
ue
ncy o
rde
red
)
![Page 25: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/25.jpg)
Results
25
Wo
rd P
airs
Sim
ilari
ty
bas
ed o
n G
loV
e 4
2B
30
0d
em
bed
din
g, a
nd
n
orm
aliz
ed t
o u
nit
mea
ns
Ori
gin
al E
mb
edd
ing
Man
ifo
ld R
e-Em
bed
din
g
Word Pairs Ground Truth Similarity By WS353 ground truth similarity score
-0.005
-1E-17
0.005
0.01
0.015
0.02
0.025
0.03
0 1 2 3 4 5 6 7 8 9 10
Pair: “shore” “woodland”
Pair: “physics” “proton”
Re-embedding by Manifold learning increased the similarity estimation
between nearby words, and decreased the similarity estimation between distant words, resulting in a higher correlation with human judgement
![Page 26: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/26.jpg)
Word Re-Embedding: Results
26
![Page 27: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/27.jpg)
Word Re-Embedding: Results
27
![Page 28: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/28.jpg)
Word Re-Embedding: Results
28
![Page 29: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/29.jpg)
Word Re-Embedding: Results
29
![Page 30: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/30.jpg)
Word Re-Embedding: Results
30
![Page 31: Word Re-Embedding via Manifold Learning · 2020. 3. 13. · Word Re-Embedding: Related Work – Word embedding: Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014a) –](https://reader036.vdocument.in/reader036/viewer/2022063008/5fbeba064608cd22466fa13e/html5/thumbnails/31.jpg)
Conclusions
– Word re-embedding improves performance on word similarity tasks
– The sample window start should be chosen just after the stop words
– The sample length should be close or equal to the number of local neighbours, which in turn can be chosen from a wide range
– The dimensionality of the original embedding space should be retained
31