TRANSCRIPT
Word Embeddings: Deep Learning for NLP
Kevin Patel
ICON 2017
December 21, 2017
Kevin Patel Word Embeddings 1/100
Outline
1. Introduction
2. Word Embeddings
   - Count Based Embeddings
   - Prediction Based Embeddings
   - Multilingual Word Embeddings
   - Interpretable Word Embeddings
3. Evaluating Word Embeddings
   - Intrinsic Evaluation
   - Extrinsic Evaluation
   - Evaluation Frameworks
   - Visualizing Word Embeddings
4. Discussion on Lower Bounds
5. Applications of Word Embeddings
   - Are Word Embeddings Useful for Sarcasm Detection?
   - Iterative Unsupervised Most Frequent Sense Detection using Word Embeddings
6. Conclusion
Layman(ish) Intro to ML

In simple terms, Machine Learning comprises:
- Representing data in some numeric form
- Learning some function on that representation
Layman(ish) Intro to ML
In simple terms, Machine Learning comprises ofRepresenting data in some numeric formLearning some function on that representation
How to place words to learn,say, Binary SentimentClassification?
Good: PositiveAwesome: PositiveBad: Negative
Kevin Patel Word Embeddings 4/100
Representations for Learning Algorithms

Detect whether the following image is a dog or not:
- Basic idea: feed raw pixels as the input vector
- Works well, since there is inherent structure in the image

Detect whether a word denotes a dog or not, e.g., labrador:
- Nothing in the spelling of "labrador" connects it to dog
- We need a representation of "labrador" which indicates that it is a dog
Local Representations

- Information about a particular item is located solely in the corresponding representational element (dimension)
- Effectively, one unit is turned on in a network while all the others are off
- No sharing between represented data
- Each feature is independent
- No generalization on the basis of similarity between features
Distributed Representations

- Information about a particular item is distributed among a set of (not necessarily mutually exclusive) representational elements (dimensions)
- One item is spread over multiple dimensions
- One dimension contributes to multiple items
- A new input is processed similarly to the training samples it resembles
- Better generalization
Distributed Representations: Example

| Number | Local Representation | Distributed Representation |
|---|---|---|
| 0 | 1 0 0 0 0 0 0 0 | 0 0 0 |
| 1 | 0 1 0 0 0 0 0 0 | 0 0 1 |
| 2 | 0 0 1 0 0 0 0 0 | 0 1 0 |
| 3 | 0 0 0 1 0 0 0 0 | 0 1 1 |
| 4 | 0 0 0 0 1 0 0 0 | 1 0 0 |
| 5 | 0 0 0 0 0 1 0 0 | 1 0 1 |
| 6 | 0 0 0 0 0 0 1 0 | 1 1 0 |
| 7 | 0 0 0 0 0 0 0 1 | 1 1 1 |
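The two encodings above can be generated with a minimal Python sketch: the local code is a one-hot vector, and the distributed code is simply the 3-bit binary encoding of the number.

```python
def local_representation(n, size=8):
    """One-hot vector: a single dimension is on, all others are off."""
    vec = [0] * size
    vec[n] = 1
    return vec

def distributed_representation(n, bits=3):
    """Binary code: each number is spread over several dimensions,
    and each dimension contributes to several numbers."""
    return [(n >> i) & 1 for i in reversed(range(bits))]

for n in range(8):
    print(n, local_representation(n), distributed_representation(n))
```

Note that 8 items need 8 local dimensions but only 3 distributed ones, and similar numbers share dimensions in the distributed code.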
Word Embeddings: Intuition

Word embeddings are distributed vector representations of words such that similarity among the vectors correlates with semantic similarity among the corresponding words.
- Given that sim(dog, cat) is greater than sim(dog, furniture), cos(v_dog, v_cat) is greater than cos(v_dog, v_furniture)
- Such similarity information is uncovered from context

Consider the following sentences:
- I like sweet food.
- You like spicy food.
- They like xyzabc food.

What is xyzabc? The meaning of a word can be inferred from its neighbors (context) and from words that share its neighbors.
- Neighbors of xyzabc: like, food
- Words that share neighbors with xyzabc: sweet, spicy
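The cosine comparison above can be sketched in a few lines of Python; the 3-dimensional vectors here are made up purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors (hypothetical): "dog" and "cat" point in similar
# directions, "furniture" does not.
dog = [0.9, 0.8, 0.1]
cat = [0.8, 0.9, 0.2]
furniture = [0.1, 0.2, 0.9]

assert cosine(dog, cat) > cosine(dog, furniture)
```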
Modelling Meaning via Word Embeddings

- Geometric metaphor of meaning (Sahlgren, 2006): meanings are locations in a semantic space, and semantic similarity is proximity between the locations.
- Distributional Hypothesis (Harris, 1970): words with similar distributional properties have similar meanings.
- Only differences in meaning can be modelled.
Entire Vector vs. Individual Dimensions

- Only proximity in the entire space is represented
- In the majority of algorithms, individual dimensions of the high-dimensional space have no phenomenological correlates
- Models whose dimensions do have such correlates are known as interpretable models
Modelling Meaning via Word Embeddings

- Co-occurrence matrix (Rubenstein and Goodenough, 1965): a mechanism to capture distributional properties; the rows of the co-occurrence matrix can be directly used as word vectors
- Neural word embeddings: vector representations learnt using neural networks (Bengio et al., 2003; Collobert and Weston, 2008a; Mikolov et al., 2013b)
Co-occurrence Matrix

- Originally proposed by Schütze (1992)
- Foundation of the count based approaches that follow
- Automatic derivation of vectors: collect co-occurrence counts in a matrix; the rows or columns are the vectors of the corresponding words
- If counting in both directions, the matrix is symmetric
- If counting on one side only, the matrix is asymmetric, and is known as directional co-occurrence
Co-occurrence Matrix (contd.)

Corpus: `<> I like cats <> I love dogs <> I hate rats <> I rate bats <>`

| word | <> | I | like | love | hate | rate | rats | cats | dogs | bats |
|---|---|---|---|---|---|---|---|---|---|---|
| <> | 0 | 4 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| I | 4 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| like | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| love | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| hate | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| rate | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| rats | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| cats | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| dogs | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| bats | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
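Counts like the ones in the matrix above can be collected with a minimal Python sketch (window of 1, counting in both directions, with `<>` treated as an ordinary boundary token):

```python
from collections import defaultdict

corpus = "<> I like cats <> I love dogs <> I hate rats <> I rate bats <>".split()

# counts[w][c]: how often c occurs immediately to the left or right of w.
counts = defaultdict(lambda: defaultdict(int))
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            counts[w][corpus[j]] += 1
```

Each row `counts[w]` is then the (symmetric) co-occurrence vector of word `w`.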
Word Embeddings: Count Based Embeddings
LSA

- Latent Semantic Analysis
- Originally developed as Latent Semantic Indexing (LSI) (Dumais et al., 1988)
- Adapted for word-space models
- Developed to tackle the inability of plain co-occurrence matrix models to handle synonymy: a query about hotels cannot retrieve results about motels
- Word and document dimensions → latent dimensions
- Uses Singular Value Decomposition (SVD) for dimensionality reduction
LSA (contd.)

- Words-by-documents matrix
- Entropy-based weighting of the co-occurrences:

$$f_{ij} = (\log(TF_{ij}) + 1) \times \left(1 - \sum_j \frac{p_{ij}\,\log p_{ij}}{\log D}\right) \qquad (1)$$

where $D$ is the number of documents, $TF_{ij}$ is the frequency of term $i$ in document $j$, $f_i$ is the frequency of term $i$ in the document collection, and $p_{ij} = TF_{ij}/f_i$.

- Truncated SVD to reduce dimensionality
- Cosine measure to compute vector similarities
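Equation (1) transcribes directly into Python; this sketch computes the weight of one cell given the term's counts across all documents (the function and argument names are my own, not from the slides).

```python
import math

def log_entropy_weight(tf, term_doc_counts):
    """Log-entropy weight of Equation (1).
    tf is TF_ij; term_doc_counts holds TF_ij for term i over all D documents."""
    D = len(term_doc_counts)
    f_i = sum(term_doc_counts)  # frequency of term i in the collection
    entropy = sum((c / f_i) * math.log(c / f_i)
                  for c in term_doc_counts if c > 0) / math.log(D)
    return (math.log(tf) + 1) * (1 - entropy)
```

A term spread evenly over all documents gets a low-information entropy term (sum equal to -1, so the factor is 2 times a small log term), while a term concentrated in one document keeps its full log frequency.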
HAL

- Hyperspace Analogue to Language (Lund and Burgess, 1996a)
- Developed specifically for word representations
- Uses directional co-occurrence
HAL (contd.)

- Directional word-by-word matrix
- Distance weighting of the co-occurrences
- Concatenation of row and column vectors
- Dimensionality reduction is optional: discard low-variance dimensions
- Vectors normalized to unit length
- Similarities computed through either Manhattan or Euclidean distance
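The directional, distance-weighted counting can be sketched as follows. The particular weighting scheme (a neighbour at distance d contributes window − d + 1) is an assumption based on common descriptions of HAL, not a transcription of the paper.

```python
from collections import defaultdict

def hal_counts(tokens, window=4):
    """Directional distance-weighted co-occurrence counts (left context only).
    A neighbour at distance d contributes window - d + 1, so adjacent
    words count the most; counting one side only keeps the matrix
    asymmetric (directional), as in HAL."""
    counts = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(tokens):
        for d in range(1, window + 1):
            if i - d >= 0:
                counts[w][tokens[i - d]] += window - d + 1
    return counts
```

Row vectors then encode left contexts and column vectors right contexts; HAL concatenates the two.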
Word Embeddings: Prediction Based Embeddings
NNLM

- Neural Network Language Model
- Proposed by Bengio et al. (2003)
- Predicts a word given its context
- Word vectors are learnt as a by-product of language modelling
NNLM: Original Model
NNLM: Simplified (1)
NNLM: Simplified (2)
Skip Gram

- Proposed by Mikolov et al. (2013b)
- Predicts the context given a word
Skip Gram (contd.)

Given a sequence of training words $w_1, w_2, \ldots, w_T$, maximize

$$\frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c,\; j \ne 0} \log p(w_{t+j} \mid w_t) \qquad (2)$$

where

$$p(w_O \mid w_I) = \frac{\exp(u_{w_O}^{T} v_{w_I})}{\sum_{w=1}^{W} \exp(u_{w}^{T} v_{w_I})} \qquad (3)$$
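Equation (3) is a softmax over dot products between output vectors $u_w$ and the input word's vector $v_{w_I}$. A minimal sketch (the 2-dimensional vectors are made up for illustration):

```python
import math

def context_probs(w_input, U, V):
    """Softmax of Equation (3): probability of each output word given
    the input word. U maps words to output vectors u_w, V maps words
    to input vectors v_w."""
    v = V[w_input]
    scores = {w: math.exp(sum(a * b for a, b in zip(u, v)))
              for w, u in U.items()}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

# Toy vectors (hypothetical): "cat" aligns with "dog", "furniture" opposes it.
U = {"cat": [1.0, 0.0], "furniture": [-1.0, 0.0]}
V = {"dog": [1.0, 0.0]}
probs = context_probs("dog", U, V)
```

In practice this denominator over the whole vocabulary is too expensive, which is why word2vec trains with approximations such as negative sampling.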
Global Vectors (GloVe)

- Proposed by Pennington et al. (2014)
- Predicts the context given a word
- Similar to Skip-gram, but the objective function is different:

$$J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{T} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2 \qquad (4)$$

where $X_{ij}$ counts how often the $i$-th and $j$-th words occur together, and $f$ is a weighting function.
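A sketch of Equation (4). The weighting function uses the defaults reported in the GloVe paper (x_max = 100, alpha = 0.75); the data structures (dicts of vectors and biases) are my own illustration.

```python
import math

def glove_weight(x, x_max=100, alpha=0.75):
    """GloVe weighting function f: dampens rare pairs, caps frequent ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_loss(X, w, w_ctx, b, b_ctx):
    """Objective J of Equation (4), summed over nonzero counts X[i][j].
    w / w_ctx hold word and context vectors, b / b_ctx the bias terms."""
    total = 0.0
    for i, row in X.items():
        for j, x in row.items():
            if x > 0:
                dot = sum(a * c for a, c in zip(w[i], w_ctx[j]))
                total += glove_weight(x) * (dot + b[i] + b_ctx[j]
                                            - math.log(x)) ** 2
    return total
```

When the model fits perfectly (dot product plus biases equals log X_ij), each term of the sum vanishes.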
Tuning Word Embeddings

- Techniques that tune already-trained word embeddings for various tasks using additional information
- Ling et al. (2015) improve the quality of word2vec for syntactic tasks such as POS tagging: they take word positioning into account; their Structured Skip-Gram and Continuous Window models are available as wang2vec
- Levy and Goldberg (2014) use dependency parse trees: linear windows capture broad topical similarities, while dependency contexts capture functional similarities
- Patel et al. (2017) use a medical code hierarchy to improve medical-domain-specific word embeddings
Word Embeddings: Multilingual Word Embeddings
Objective

- English data >> data for other languages
- Language-independent phenomena learnt on English should be applicable to other languages
- Solution via word embeddings: project words of different languages into a common subspace
- Goal of multilingual word embeddings: a shared subspace for all languages
- Neural MT learns such embeddings implicitly by optimizing the MT objective; here we discuss explicit models
- These models are for MT what word2vec and GloVe are for NLP, at a much lower training cost than Neural MT
- Applications: machine translation, automated bilingual dictionary generation, cross-lingual information retrieval, etc.
Types of Cross-lingual Embeddings

Based on the underlying approach:
- Monolingual mapping
- Cross-lingual training
- Joint optimization

Based on the resource used:
- Word-aligned data
- Sentence-aligned data
- Document-aligned data
- Lexicon
- No parallel data
Monolingual Mapping

Learning happens in two steps:
- Train separate embeddings w_e and w_f on large monolingual corpora of the corresponding languages e and f
- Learn transformations g1 and g2 such that w_e = g1(w_f) and w_f = g2(w_e)

The transformations are learnt using bilingual word mappings (a lexicon).
Monolingual Mapping (contd.)

Linear projection, proposed by Mikolov et al. (2013a): learn a matrix $W$ such that $w_e \approx W w_f$, i.e., $W$ minimizes

$$\sum_{i=1}^{n} \left\| W w_f^{(i)} - w_e^{(i)} \right\|^2$$

We adapted this method for automatic synset linking in multilingual wordnets (accepted at GWC 2018).
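The least-squares objective above can be minimized with a toy gradient-descent sketch; a real implementation would use a closed-form least-squares solver, and the 2-dimensional "lexicon" below (source vectors that are a 90-degree rotation of the target vectors) is hypothetical.

```python
def fit_projection(pairs, dim=2, lr=0.1, steps=500):
    """Learn W minimizing sum ||W wf - we||^2 over (wf, we) vector pairs
    from a bilingual lexicon, by plain gradient descent."""
    W = [[0.0] * dim for _ in range(dim)]
    for _ in range(steps):
        for wf, we in pairs:
            pred = [sum(W[r][c] * wf[c] for c in range(dim)) for r in range(dim)]
            err = [p - e for p, e in zip(pred, we)]
            # Gradient of ||W wf - we||^2 w.r.t. W[r][c] is 2 * err[r] * wf[c]
            for r in range(dim):
                for c in range(dim):
                    W[r][c] -= lr * err[r] * wf[c]
    return W

# Toy lexicon: target-language vectors are the source vectors rotated
# by 90 degrees, so W should recover that rotation.
pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [-1.0, 0.0])]
W = fit_projection(pairs)
```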
Monolingual Mapping (contd.)
- Linear projection (Mikolov et al., 2013a): Lexicon
- Projection via CCA (Faruqui and Dyer, 2014b): Lexicon
- Alignment-based projection (Guo et al., 2015): Word-aligned data
- Adversarial auto-encoder (Barone, 2016): No parallel data
Cross Lingual Training
- Goal: optimize a cross-lingual objective
- Mainly relies on sentence alignments
- Requires a parallel corpus for training
Cross Lingual Training (contd.)
Bilingual Compositional Sentence Model, proposed by Hermann and Blunsom (2013)
- Train two models to produce sentence representations of aligned sentences in two languages
- Minimize the distance between the sentence representations of aligned sentences
Cross Lingual Training (contd.)
- Bilingual compositional sentence model (Hermann and Blunsom, 2013): Sentence-aligned data
- Distributed word alignment (Kočiskỳ et al., 2014): Sentence-aligned data
- Translation-invariant LSA (Huang et al., 2015): Lexicon
- Inverted indexing on Wikipedia (Søgaard et al., 2015): Document-aligned data
Joint Optimization
- Jointly optimize both monolingual (M) and cross-lingual (Ω) constraints
- Objective: minimize M_l1 + M_l2 + λ·(Ω_l1→l2 + Ω_l2→l1), where λ decides the weightage of the cross-lingual constraints
Joint Optimization (contd.)
Multitask Language Model, proposed by Klementiev et al. (2012):
- Train a neural language model (NNLM)
- Jointly optimize the monolingual maximum likelihood (M) with a word-alignment-based MT regularization term (Ω)
Joint Optimization (contd.)
- Multi-task language model (Klementiev et al., 2012): Word-aligned data
- Bilingual skip-gram (Luong et al., 2015): Word-aligned data
- Bilingual bag-of-words without alignment (Gouws et al., 2015): Sentence-aligned data
- Bilingual sparse representations (Vyas and Carpuat, 2016): Word-aligned data
Word Embeddings: Interpretable Word Embeddings
Interpretability and Explainability
- A model is interpretable if a human can make sense of it; example: decision trees
- Interpretable models enable one to explain the performance of the system and tune it accordingly
- However, in practice, interpretable models generally perform poorly compared to other systems
Interpretable Word Embeddings
Dimensions are interpretable by ordering words based on their value along that dimension

Word (d205)   Value      Word (d272)   Value
iguana        0.599371   thigh         0.875286
bueller       0.584335   knee          0.872282
chimpanzee    0.577834   shoulder      0.866209
wasp          0.556845   elbow         0.857403
chimp         0.553980   wrist         0.852959
hamster       0.534810   ankle         0.851555
giraffe       0.532316   groin         0.841347
unicorn       0.529533   forearm       0.837988
caterpillar   0.528376   leg           0.836661
baboon        0.526324   pelvis        0.777564
gorilla       0.521590   neck          0.758420
tortoise      0.519941   spine         0.754774
sparrow       0.516842   torso         0.707458
lizard        0.515716   hamstring     0.701921
cockroach     0.505015   buttocks      0.689092
crocodile     0.491139   knees         0.676485
alligator     0.486275   ankles        0.658485
moth          0.471682   jaw           0.653126
kangaroo      0.469284   biceps        0.650972
toad          0.463514   hips          0.647000

Examples from NNSE embeddings (Murphy et al., 2012)
NNSE
- Non-Negative Sparse Embeddings, proposed by Murphy et al. (2012)
- Word embeddings that are interpretable and cognitively plausible
- Perform a mixture of topical and taxonomical semantics
- Computation:
  - Dependency co-occurrence adjusted with PPMI (to normalize for word frequency) and reduced with sparse SVD
  - Document co-occurrence adjusted with PPMI and reduced with sparse SVD
  - Their union factorized using a variant of non-negative sparse coding
- The resulting word embeddings have both topical neighbors (judge is near prison) and taxonomical neighbors (judge is near referee)
- Code unavailable; embeddings available at http://www.cs.cmu.edu/~bmurphy/NNSE/
OIWE
- Online Interpretable Word Embeddings, proposed by Luo et al. (2015)
- Main idea: apply sparsity to skip-gram
- Achieve sparsity by setting to 0 any dimension of a vector that falls below 0
- Propose two techniques to do this via gradient descent
- Outperforms NNSE at the word intrusion task
- Code available on GitHub at https://github.com/SkTim/OIWE
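The "set negative dimensions to 0" idea can be sketched as projected gradient descent (an illustrative approximation of the constraint, not the authors' exact update rules):

```python
import numpy as np

# Hypothetical sketch: after each SGD step on some skip-gram loss,
# project the word vector back onto the non-negative orthant by
# clamping negative entries to zero. This yields non-negative,
# typically sparser vectors.
rng = np.random.default_rng(0)
v = rng.normal(size=8)        # a word vector being trained
grad = rng.normal(size=8)     # gradient from a (stand-in) skip-gram loss
lr = 0.1

v = v - lr * grad             # ordinary SGD step
v = np.maximum(v, 0.0)        # projection: zero out negative dimensions

print((v >= 0).all())         # True
```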
Evaluating Word Embeddings: Intrinsic Evaluation
Word Pair Similarity
- Evaluates the generalizability of word embeddings
- One of the most widely used evaluations
- Many datasets available: WS353, RG65, MEN, SimLex, SCWS, etc.

Word1   Word2     Human Score  Model1 Score  Model2 Score
street  street    10.00         1.00          1.00
street  avenue     8.88         0.04          0.38
street  block      6.88         0.14          0.26
street  place      6.44         0.21          0.18
street  children   4.94        -0.08          0.15
Spearman Correlation             0.6           1.0
Word Analogy task
- Proposed by Mikolov et al. (2013b)
- Tries to answer questions of the form: man is to woman as king is to ?
- Often discussed in the media
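The analogy is answered by vector arithmetic: find the vocabulary word (excluding the query words) closest to king − man + woman. A toy sketch with made-up 3-d vectors:

```python
import numpy as np

# Illustrative vectors only; real analogy tests use trained embeddings.
emb = {
    "man":   np.array([1.0, 0.0, 0.1]),
    "woman": np.array([1.0, 1.0, 0.1]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
    "apple": np.array([0.0, 0.2, 0.0]),
}
target = emb["king"] - emb["man"] + emb["woman"]

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Nearest neighbor of the target, excluding the three query words
answer = max((w for w in emb if w not in {"king", "man", "woman"}),
             key=lambda w: cos(emb[w], target))
print(answer)  # queen
```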
Categorization
- Evaluates the ability of embeddings to form proper clusters
- Given sets of words with different labels, cluster them and check the correspondence between the clusters and the sets
- The purer the clusters, the better the embeddings
- Datasets available: BLESS, Battig, etc.
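The cluster-then-score procedure can be sketched as follows: run k-means over the word vectors, then report purity (each cluster's majority label count, summed and divided by the number of words). Toy 2-d vectors stand in for embeddings:

```python
import numpy as np
from collections import Counter

# Two well-separated toy "categories"
labels = ["animal"] * 3 + ["tool"] * 3
X = np.array([[1.0, 0.1], [0.9, 0.2], [1.1, 0.0],
              [0.0, 1.0], [0.1, 0.9], [0.2, 1.1]])

# Tiny k-means (k=2), a few Lloyd iterations
centroids = X[[0, 3]].copy()
for _ in range(10):
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = d.argmin(1)
    for k in range(2):
        if (assign == k).any():
            centroids[k] = X[assign == k].mean(0)

# Purity: majority label per cluster, summed over clusters
purity = sum(
    Counter(l for l, a in zip(labels, assign) if a == k).most_common(1)[0][1]
    for k in range(2)
) / len(labels)
print(purity)  # 1.0 for this well-separated toy data
```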
Word Intrusion Detection
- Proposed by Murphy et al. (2012)
- Provides a way to interpret dimensions
- Most approaches do not report results on this task
- Experiments done by us suggest many of them are not interpretable
Word Intrusion Detection (contd.)
The approach:
1. Select a dimension
2. Reverse-sort all word vectors based on this dimension
3. Select the top 5 words
4. Select a word which is in the bottom half of this list and in the top 10th percentile on some other dimension
5. Give a random permutation of these 6 words to a human evaluator
   Example: bathroom, closet, attic, balcony, quickly, toilet
6. Check precision
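Constructing one intrusion instance can be sketched as below (a hypothetical sketch on random data; the intruder here is simply the bottom-half word scoring highest on another dimension, approximating the percentile criterion above):

```python
import numpy as np
import random

rng = np.random.default_rng(0)
vocab = [f"w{i}" for i in range(200)]    # illustrative vocabulary
E = rng.random((200, 50))                # stand-in embedding matrix

dim = 0
order = np.argsort(-E[:, dim])           # reverse-sort words by this dimension
top5 = [vocab[i] for i in order[:5]]     # 5 most characteristic words

# Intruder: low on this dimension (bottom half), high on another one
bottom = order[len(order) // 2:]
other_dim = 1
intruder = vocab[int(bottom[np.argmax(E[bottom, other_dim])])]

instance = top5 + [intruder]
random.Random(0).shuffle(instance)       # random permutation for the judge
print(len(instance))  # 6
```

A human evaluator is then asked to spot the intruder; precision over many such instances measures interpretability.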
Evaluating Word Embeddings: Extrinsic Evaluation
Extrinsic Evaluations
- Evaluating word embeddings on downstream NLP tasks such as part-of-speech tagging, named entity recognition, etc.
- Makes more sense, as we ultimately want to use embeddings for such tasks
- However, performance does not rely solely on the embeddings
- Improvement/degradation could be due to other factors such as network architecture, hyperparameters, etc.
- If an embedding E1 is better than another embedding E2 when used with some network architecture for NER, does that mean E1 will be better for all NER architectures?
Evaluations on Unified Architectures
- Unified architectures such as that of Collobert and Weston (2008b) are used for extrinsic evaluations
- For different tasks, the architecture remains the same except for the last layer, whose output neurons are changed according to the task at hand
- If an embedding E1 is better than another embedding E2 on all tasks on such a unified architecture, then we can expect it to be truly better
Evaluating Word Embeddings: Evaluation Frameworks
WordVectors.org
- Proposed by Faruqui and Dyer (2014a)
- A web interface for evaluating your embeddings on a collection of word-pair similarity datasets, available at http://wordvectors.org/
- Also provides visualization for common sets of words such as (Male, Female) and (Antonym, Synonym) pairs
VecEval
- Proposed by Nayak et al. (2016)
- A web-based tool for performing extrinsic evaluations: http://www.veceval.com/
- Claimed to support six different tasks: POS, NER, chunking, sentiment analysis, natural language inference, question answering
- Has never worked for me
- Web interface no longer active; code available on GitHub at https://github.com/NehaNayak/veceval
Anago
- A Keras implementation of sequence labelling based on Lample et al. (2016)'s architecture
- Can perform POS, NER, SRL, etc.
- Used in our lab for extrinsic evaluation
- Code available on GitHub at https://github.com/Hironsan/anago
Evaluating Word Embeddings: Visualizing Word Embeddings
Visualizing Word Embeddings
- Various ways to visualize word embeddings: PCA, Isomap, t-SNE, etc., available in scikit-learn

    from sklearn import decomposition
    vis = decomposition.TruncatedSVD(n_components=2)  # PCA-style projection to 2d
    E_vis = vis.fit_transform(E)  # E is the embedding matrix
    # plot E_vis here

- Check out http://scikit-learn.org/stable/auto_examples/manifold/plot_lle_digits.html for many methods applied to MNIST visualization
Related Work
- Baroni et al. (2014): neural word embeddings are better than traditional methods such as LSA, HAL, and RI (Landauer and Dumais, 1997; Lund and Burgess, 1996b; Sahlgren, 2005)
- Levy et al. (2015): the superiority of neural word embeddings is due not to the embedding algorithm but to certain design choices and hyperparameter optimizations; varies other hyperparameters but keeps the number of dimensions fixed at 500
- Schnabel et al. (2015); Zhai et al. (2016); Ghannay et al. (2016): no justification for the number of dimensions chosen in their evaluations
- Melamud et al. (2016): the optimal number of dimensions differs across evaluations of word embeddings
Why Do Dimensions Matter?: A Practical Example
- Various app developers want to utilize word embeddings
- Example memory limit for an app: 200 MB
- Size of the Google pre-trained vectors file: 3.4 GB
- Natural thought process: decrease the number of dimensions
- To what value? 100? 50? 20?
- It depends on the words/entities we want to place in the space
Number of Dimensions and Equidistant Points
- The number of dimensions of a vector space imposes a restriction on the number of equidistant points it can contain
- If the distance is Euclidean and the number of dimensions λ = N, then the maximum number of equidistant points E in the corresponding space is N + 1 (Swanepoel, 2004)
- If the distance is cosine, no closed-form solution exists
Dimensions λ and max. no. of equiangular lines E (Barg and Yu, 2014)

λ            E     λ            E
3            6     18           61
4            6     19           76
5            10    20           96
6            16    21           126
7 ≤ λ ≤ 13   28    22           176
14           30    23           276
15           36    24 ≤ λ ≤ 41  276
16           42    42           288
17           51    43           344
Objective
Problem Statement: Does the number of pairwise equidistant words enforce a lower bound on the number of dimensions for word embeddings?

- 'Equidistance' determined using the co-occurrence matrix
Plan of action:
- Verify using a toy corpus
- Evaluate on an actual corpus
Motivation (1/4)
Consider the following toy corpus:
    <> I like cats <> I love dogs <> I hate rats <> I rate bats <>
Corresponding co-occurrence matrix:

word  <>  I  like  love  hate  rate  rats  cats  dogs  bats
like   0  1   0     0     0     0     0     1     0     0
love   0  1   0     0     0     0     0     0     1     0
hate   0  1   0     0     0     0     1     0     0     0
rate   0  1   0     0     0     0     0     0     0     1

Distance between any pair of words = √2
The words form a regular tetrahedron
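This claim is easy to verify numerically from the four rows above:

```python
import numpy as np
from itertools import combinations

# The four co-occurrence rows from the toy corpus; each pair differs
# in exactly two positions by 1, so every pairwise distance is sqrt(2).
rows = {
    "like": [0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    "love": [0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
    "hate": [0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
    "rate": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1],
}
V = {w: np.array(v, dtype=float) for w, v in rows.items()}
dists = [np.linalg.norm(V[a] - V[b]) for a, b in combinations(V, 2)]
print(sorted({round(d, 6) for d in dists}))  # [1.414214], i.e. sqrt(2)
```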
Motivation (2/4)
Mean and std. dev. of the mean of a point's distance to the other points

Dimension  Mean  Std dev
1          0.94  0.94
2          1.77  0.80
3          2.63  0.10
[Figure: the words hate, like, love, and rate plotted in 1d, 2d, and 3d, before and after training; panels: 1d(Before), 2d(Before), 3d(Before), 1d(After), 2d(After), 3d(After)]
Motivation (3/4)
Hypothesis:
- If the learning algorithm of word embeddings does not get enough dimensions, then it will fail to uphold the equality constraint
  - The standard deviation of the mean of all pairwise distances will be higher
- As we increase the dimensions, the algorithm gets more degrees of freedom to model the equality constraint better
  - There will be statistically significant changes in the standard deviation
- Once the lower bound of dimensions is reached, the algorithm has enough degrees of freedom
  - From this point onwards, even if we increase the dimensions, there will not be any statistically significant difference in the standard deviation
Motivation (4/4)
Dim  σ      P-value     Dim  σ      P-value
7    0.358  -           12   0.154  0.0058
8    0.293  0.0020      13   0.111  0.0001
9    0.273  0.0248      14   0.044  0.0001
10   0.238  0.0313      15   0.047  0.3096
11   0.189  0.0013      16   0.054  0.1659

Avg. standard deviation (σ) for 15 pairwise equidistant words (along with two-tail p-values of Welch's unpaired t-test for statistical significance)
Approach (1/5)
1. Compute the word × word co-occurrence matrix from the corpus

<> I like cats <> I love dogs <> I hate rats <> I rate bats <>

word  <>  I  like  love  hate  rate  rats  cats  dogs  bats
<>     0  4   0     0     0     0     1     1     1     1
I      4  0   1     1     1     1     0     0     0     0
like   0  1   0     0     0     0     0     1     0     0
love   0  1   0     0     0     0     0     0     1     0
hate   0  1   0     0     0     0     1     0     0     0
rate   0  1   0     0     0     0     0     0     0     1
rats   1  0   0     0     1     0     0     0     0     0
cats   1  0   1     0     0     0     0     0     0     0
dogs   1  0   0     1     0     0     0     0     0     0
bats   1  0   0     0     0     1     0     0     0     0
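A minimal sketch of this step (assuming a symmetric context window of size 1, which reproduces the counts above; not the author's code):

```python
import numpy as np

corpus = "<> I like cats <> I love dogs <> I hate rats <> I rate bats <>".split()
vocab = sorted(set(corpus), key=corpus.index)   # first-seen order
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts over adjacent token pairs
C = np.zeros((len(vocab), len(vocab)), dtype=int)
for a, b in zip(corpus, corpus[1:]):
    C[idx[a], idx[b]] += 1
    C[idx[b], idx[a]] += 1

print(C[idx["<>"], idx["I"]])  # 4, matching the table
```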
Approach (2/5)
2. Create the corresponding word × word cosine similarity matrix

      <>   I    like love hate rate rats cats dogs bats
<>    1.0  0.0  0.8  0.8  0.8  0.8  0.0  0.0  0.0  0.0
I     0.0  1.0  0.0  0.0  0.0  0.0  0.8  0.8  0.8  0.8
like  0.8  0.0  1.0  0.5  0.5  0.5  0.0  0.0  0.0  0.0
love  0.8  0.0  0.5  1.0  0.5  0.5  0.0  0.0  0.0  0.0
hate  0.8  0.0  0.5  0.5  1.0  0.5  0.0  0.0  0.0  0.0
rate  0.8  0.0  0.5  0.5  0.5  1.0  0.0  0.0  0.0  0.0
rats  0.0  0.8  0.0  0.0  0.0  0.0  1.0  0.5  0.5  0.5
cats  0.0  0.8  0.0  0.0  0.0  0.0  0.5  1.0  0.5  0.5
dogs  0.0  0.8  0.0  0.0  0.0  0.0  0.5  0.5  1.0  0.5
bats  0.0  0.8  0.0  0.0  0.0  0.0  0.5  0.5  0.5  1.0
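Step 2 can be sketched by row-normalizing the count matrix and taking dot products; the snippet rebuilds the toy counts so it runs standalone, and its values match the table (e.g. like/love = 0.5):

```python
import numpy as np

corpus = "<> I like cats <> I love dogs <> I hate rats <> I rate bats <>".split()
vocab = sorted(set(corpus), key=corpus.index)
idx = {w: i for i, w in enumerate(vocab)}

C = np.zeros((len(vocab), len(vocab)))
for a, b in zip(corpus, corpus[1:]):
    C[idx[a], idx[b]] += 1
    C[idx[b], idx[a]] += 1

# Cosine similarity: normalize each row, then take all dot products
norms = np.linalg.norm(C, axis=1, keepdims=True)
S = (C / norms) @ (C / norms).T
print(round(S[idx["like"], idx["love"]], 1))  # 0.5
```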
Approach (3/5)
3. For each similarity value sk, create a graph where the words are nodes, with an edge between node i and node j if sim(i, j) = sk
Sim=0.0Kevin Patel Word Embeddings 72/100
![Page 84: Word Embeddings - Deep Learning for NLPshad.pcs15/data/we-kevin.pdf · Word Embeddings Deep Learning for NLP Kevin Patel ICON 2017 December 21, 2017 Kevin Patel Word Embeddings 1/100](https://reader033.vdocument.in/reader033/viewer/2022053004/5f07efb17e708231d41f80e1/html5/thumbnails/84.jpg)
Introduction Word Embeddings Evaluating WordEmbeddings
Discussion on LowerBounds
Applications ofWord Embeddings
Conclusion References
.
Approach (3/5)3. For each similarity value sk, create a graph, where the words are
nodes, and an edge between node i and node j if sim( i, j) = sk
<> I like cats <> I love dogs <> I hate rats <> I rate bats <>
Sim=0.5Kevin Patel Word Embeddings 72/100
![Page 85: Word Embeddings - Deep Learning for NLPshad.pcs15/data/we-kevin.pdf · Word Embeddings Deep Learning for NLP Kevin Patel ICON 2017 December 21, 2017 Kevin Patel Word Embeddings 1/100](https://reader033.vdocument.in/reader033/viewer/2022053004/5f07efb17e708231d41f80e1/html5/thumbnails/85.jpg)
Introduction Word Embeddings Evaluating WordEmbeddings
Discussion on LowerBounds
Applications ofWord Embeddings
Conclusion References
.
Approach (3/5)3. For each similarity value sk, create a graph, where the words are
nodes, and an edge between node i and node j if sim( i, j) = sk
<> I like cats <> I love dogs <> I hate rats <> I rate bats <>
Sim=0.8Kevin Patel Word Embeddings 72/100
![Page 86: Word Embeddings - Deep Learning for NLPshad.pcs15/data/we-kevin.pdf · Word Embeddings Deep Learning for NLP Kevin Patel ICON 2017 December 21, 2017 Kevin Patel Word Embeddings 1/100](https://reader033.vdocument.in/reader033/viewer/2022053004/5f07efb17e708231d41f80e1/html5/thumbnails/86.jpg)
Introduction Word Embeddings Evaluating WordEmbeddings
Discussion on LowerBounds
Applications ofWord Embeddings
Conclusion References
.
Approach (3/5)3. For each similarity value sk, create a graph, where the words are
nodes, and an edge between node i and node j if sim( i, j) = sk
<> I like cats <> I love dogs <> I hate rats <> I rate bats <>
Sim=1.0Kevin Patel Word Embeddings 72/100
Approach (4/5)

4. Find the maximum clique in each graph. The number of nodes in this clique is Ek, the maximum number of pairwise equidistant points for similarity value sk. For the toy corpus:

Sim   Ek
0.5   4
0.8   0
1.0   0
Approach (5/5)

5. Reverse lookup the maximum Ek against known bounds on equiangular lines to get the number of dimensions λ.

Sim   Ek
0.5   4
0.8   0
1.0   0
Max   4

λ             E      λ              E
3             6      18             61
4             6      19             76
5             10     20             96
6             16     21             126
7 <= n <= 13  28     22             176
14            30     23             276
15            36     24 <= n <= 41  276
16            42     42             288
17            51     43             344
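Steps 3 and 4 can be sketched as follows. This is an illustrative toy implementation of my own (the `cosine` helper, the Bron-Kerbosch clique search, and the one-hot test vectors are not the authors' code); exact maximum-clique search is only feasible for tiny vocabularies, which is precisely the tractability issue noted under Limitations below.

```python
import itertools
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def max_clique_size(n, edges):
    # Bron-Kerbosch maximal-clique enumeration; exponential worst case.
    if not edges:
        return 0  # no pair of points at this similarity -> Ek = 0
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    best = 0
    def bk(r, p, x):
        nonlocal best
        if not p and not x:
            best = max(best, len(r))
            return
        for v in list(p):
            bk(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)
    bk(set(), set(range(n)), set())
    return best

def equidistant_counts(vectors, sims, tol=1e-6):
    # Step 3: one graph per similarity value sk (edge iff sim(i, j) = sk).
    # Step 4: Ek = size of the maximum clique in that graph.
    n = len(vectors)
    pairs = list(itertools.combinations(range(n), 2))
    return {
        sk: max_clique_size(
            n, [(i, j) for i, j in pairs if abs(cosine(vectors[i], vectors[j]) - sk) < tol]
        )
        for sk in sims
    }

# Toy example: four mutually orthogonal vectors are pairwise equidistant
# at similarity 0.0, so E = 4 there and 0 elsewhere.
one_hot = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
print(equidistant_counts(one_hot, [0.0, 0.5]))  # {0.0: 4, 0.5: 0}
```

Step 5 is then a table lookup: find the smallest λ whose bound E accommodates the observed maximum Ek.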
Evaluation
Used Brown CorpusFound 19 as lower bound using our approachContext window: 1 to the left and 1 to the rightNumber of dimensions: 1 to 355 randomly initialized models for each configuration (averageresults reported)Intrinsic Evaluation
Word Pair Similarity: Predicting sim(wa,wb) usingcorresponding word embeddingsWord Analogy: Finding missing wd in the relation: a is to b asc is to dCategorization: Checking the purity of clusters formed by wordembeddings
Kevin Patel Word Embeddings 75/100
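The word pair similarity task above is typically scored by rank-correlating model similarities with human judgements. A minimal Spearman correlation sketch (my own illustrative code, with no tie handling; a production evaluation would use a library routine):

```python
def spearman(xs, ys):
    """Spearman rank correlation without tie handling: correlate the
    model's sim(wa, wb) scores with human similarity judgements."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Identical ranking -> perfect correlation of 1.0
print(spearman([0.9, 0.5, 0.1], [10, 6, 2]))  # 1.0
```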
Results
Performance for the Word Pair Similarity task with respect to the number of dimensions
Results
Performance for the Word Analogy task with respect to the number of dimensions
Results
Performance for the Categorization task with respect to the number of dimensions
Analysis

Found lower bound consistent with experimental evaluation.

Set of pairwise equiangular points (vectors) from the Brown corpus:
Poltawa, snakestrike, burnings, Tsar's, miswritten, brows, maintained, South-East, far-famed, 27%, non-dramas, octagonal, boatyards, U-2, Devol, mourners, Hearing, sideshow, third-story, upcoming, pram, dolphins, Croydon, neuromuscular, Gladius, pvt, littered, annoying, vuhranduh, athletes, eraser, provincialism, Daly, wreaths, villain, suspicious, nooks, fielder, belly, Gogol's, interchange, two-to-three, resemble, discounted, kidneys, Hangman's, commend, accordion, summarizing, optimality, Orlando, Leamington, swift, Taras-Tchaikovsky, puts, groomed, spit, firmer, rosy-fingered, Bechhofer, campfire, Tomas
Limitations

- The max-clique-finding component renders the approach intractable for larger corpora
- Need to find an alternative
Applications of Word Embeddings
Are Word Embeddings Useful for Sarcasm Detection?
Problem Statement
Detect whether a sentence is sarcastic or not, especially among sentences that do not contain sentiment-bearing words.

Example: "A woman needs a man just like a fish needs a bicycle."
Motivation
Similarity between word embeddings serves as a proxy for measuring contextual incongruity.

Example: "A woman needs a man just like a fish needs a bicycle"
  similarity(man, woman) = 0.766
  similarity(fish, bicycle) = 0.131

Such an imbalance in similarities is an indication of contextual incongruity.
Approach
The gist of the approach is adding word-embedding-similarity-based features to existing sarcasm-detection feature sets, such as:
- Maximum similarity between all pairs of words in a sentence
- Minimum similarity between all pairs of words in a sentence
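As a sketch, these two features might be computed as follows. The helper names and the toy 2-d embedding values are invented for illustration (real experiments would use pretrained vectors), so the printed numbers only reflect this toy setup:

```python
import itertools
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def incongruity_features(tokens, emb):
    """Max and min pairwise cosine similarity over the words of a sentence
    that have an embedding; these get appended to a baseline feature set."""
    known = [t for t in tokens if t in emb]
    sims = [cosine(emb[a], emb[b]) for a, b in itertools.combinations(known, 2)]
    if not sims:
        return {"max_sim": 0.0, "min_sim": 0.0}
    return {"max_sim": max(sims), "min_sim": min(sims)}

# Toy 2-d embeddings (invented values, for illustration only)
emb = {"man": (1.0, 0.0), "woman": (0.8, 0.6), "fish": (0.0, 1.0), "bicycle": (1.0, 0.0)}
sentence = "a woman needs a man just like a fish needs a bicycle".split()
print(incongruity_features(sentence, emb))
```

A large gap between `max_sim` and `min_sim` is the imbalance that the Motivation slide treats as a signal of contextual incongruity.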
Evaluation

Baselines whose feature sets were augmented:

1. Liebrecht et al. (2013): unigrams, bigrams, and trigrams as features.
2. González-Ibánez et al. (2011): two sets of features: unigrams and dictionary-based.
3. Buschmeier et al. (2014):
   - Hyperbole (captured by three positive or negative words in a row)
   - Quotation marks and ellipsis
   - Positive/negative sentiment words followed by an exclamation or question mark
   - Positive/negative sentiment scores followed by ellipsis ('...')
   - Punctuation, interjections, and laughter expressions
4. Joshi et al. (2015): in addition to unigrams, features based on implicit and explicit incongruity:
   - Implicit incongruity features: patterns with implicit sentiment, extracted in a pre-processing step
   - Explicit incongruity features: number of sentiment flips, length of positive and negative sub-sequences, and lexical polarity
Results
Word Embedding   Average F-score Gain
LSA              0.453
GloVe            0.651
Dependency       1.048
Word2Vec         1.143

Average gain in F-scores for the four types of word embeddings. These values are computed for a subset of these embeddings consisting of words common to all four.
Applications of Word Embeddings
Iterative Unsupervised Most Frequent Sense Detection using Word Embeddings
WordNet
- Groups synonymous words into synsets
- Synset example:
  - Synset ID: 02139199
  - Synset members: bat, chiropteran
  - Gloss: nocturnal mouselike mammal with forelimbs modified to form membranous wings and anatomical adaptations for echolocation by which they navigate
  - Example: "Bats are creatures of the night."
- Relations with other synsets (hypernym/hyponym: parent/child; meronym/holonym: part/whole)
Introduction
- Word Sense Disambiguation (WSD): one of the relatively hard problems in NLP
- Both supervised and unsupervised ML explored in the literature
- Most Frequent Sense (MFS) baseline: a strong baseline for WSD
  - Given a WSD problem instance, simply assign the most frequent sense of that word
  - Ignores context, yet gives really strong results, due to the skew in the sense distribution of the data
- Computing MFS:
  - Trivial for sense-annotated corpora, which are not available in large amounts
  - Need to learn from raw data
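The baseline itself is trivial once a sense-annotated corpus exists; a minimal sketch of my own (assuming a hypothetical data format of (word, sense) pairs):

```python
from collections import Counter

def mfs_baseline(tagged_corpus):
    """Count sense occurrences per word and always predict the most
    frequent one, ignoring context entirely."""
    counts = {}
    for word, sense in tagged_corpus:
        counts.setdefault(word, Counter())[sense] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

corpus = [("bank", "finance"), ("bank", "finance"), ("bank", "river")]
print(mfs_baseline(corpus))  # {'bank': 'finance'}
```

The hard part, which the rest of this section addresses, is estimating the same mapping when no sense annotations are available.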
Problem Statement

Given a raw corpus, estimate the most frequent sense of different words in that corpus.

- Bhingardive et al. (2015) showed that pretrained word embeddings can be used to compute the most frequent sense
- Our work further strengthens the claim by Bhingardive et al. (2015) that word embeddings indeed capture the most frequent sense
- Our approach outperforms others at the task of MFS extraction
- To compute MFS using our approach:
  1. Train word embeddings on the raw corpus.
  2. Apply our approach to the trained word embeddings.
Intuition
- Strive for consistency in the assignment of senses to maintain semantic congruity
- Example:
  - If cricket and bat co-occur a lot, then cricket taking the insect sense and bat taking the reptile sense is less likely
  - If cricket and bat co-occur a lot, and cricket's MFS is sports, then bat taking the reptile sense is extremely unlikely
- Key point: solve easy words first, then use them for difficult words
- In other words, iterate over the degree of polysemy from 2 onward
Algorithm
The vote cast for sense sj by a neighbouring word wi has two components:
- WordNet similarity between mfs(wi) and sj
- Embedding-space similarity between wi and the current word
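A rough sketch of the iterative voting scheme as described above. The callables `senses`, `wn_sim`, `emb_sim`, and `neighbours` are stand-ins for WordNet and embedding-space lookups (not a published API), and details such as normalization may differ from the actual algorithm:

```python
def iterative_mfs(words, senses, wn_sim, emb_sim, neighbours, k=10):
    # Seed with monosemous words: their single sense is trivially the MFS.
    mfs = {w: senses(w)[0] for w in words if len(senses(w)) == 1}
    max_degree = max(len(senses(w)) for w in words)
    # Iterate over degree of polysemy from 2 onward: solve easy words
    # first, then use them for difficult words.
    for degree in range(2, max_degree + 1):
        for w in words:
            if w in mfs or len(senses(w)) != degree:
                continue
            votes = {s: 0.0 for s in senses(w)}
            # Each already-solved neighbour wi votes for sense sj with
            # weight wn_sim(mfs(wi), sj) * emb_sim(wi, w).
            for wi in neighbours(w, k):
                if wi not in mfs:
                    continue
                for s in votes:
                    votes[s] += wn_sim(mfs[wi], s) * emb_sim(wi, w)
            mfs[w] = max(votes, key=votes.get)
    return mfs
```

Replacing `emb_sim` with a constant yields the unweighted variant (no vector-space component) listed under Parameters.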
Parameters

- K
- Similarity measure
- Unweighted (no vector space component) vs. weighted
Evaluation
Two setups:
- Evaluating MFS as a solution for WSD
- Evaluating MFS as a classification task
MFS as solution for WSD
Method                   Senseval2   Senseval3
Bhingardive (reported)   52.34       43.28
SemCor (reported)        59.88       65.72
Bhingardive              48.27       36.67
Iterative                63.2        56.72
SemCor                   67.61       71.06

Accuracy of WSD using MFS (nouns)
MFS as solution for WSD (contd.)
Method                   Senseval2   Senseval3
Bhingardive (reported)   37.79       26.79
Bhingardive (optimal)    43.51       33.78
Iterative                48.1        40.4
SemCor                   60.03       60.98

Accuracy of WSD using MFS (all parts of speech)
MFS as classification task
Method        Nouns   Adjectives   Adverbs   Verbs   Total
Bhingardive   43.93   81.79        46.55     37.84   58.75
Iterative     48.27   80.77        46.55     44.32   61.07

Percentage match between predicted MFS and WFS
MFS as classification task (contd.)
Method        Nouns     Verbs     Adjectives   Adverbs   Total
              (49.20)   (26.44)   (19.22)      (5.14)
Bhingardive   29.18     25.57     26.00        33.50     27.83
Iterative     35.46     31.90     30.43        47.78     34.19

Percentage match between predicted MFS and true SemCor MFS. The numbers in the column headers indicate what percent of total words belong to that part of speech.
Analysis

- Better than Bhingardive et al. (2015); not able to beat SemCor and WFS
- There are words for which WFS does not give the proper dominant sense. Consider the following examples:
  - tiger: an audacious person
  - life: characteristic state or mode of living (social life, city life, real life)
  - option: right to buy or sell property at an agreed price
  - flavor: general atmosphere of a place or situation
  - season: period of year marked by special events
- Tagged words rank too low to make a significant impact. For example:
  - While detecting MFS for a bisemous word, the first monosemous neighbour actually ranks 1101, i.e., about a thousand polysemous words are closer than this monosemous word
  - Such a distant monosemous word may not be able to influence the MFS
Summary
- Proposed an iterative approach for unsupervised most frequent sense detection using word embeddings
- Similar trends, yet better overall results than Bhingardive et al. (2015)
- Future work: apply the approach to other languages
Conclusion
- Discussed why we need word embeddings
- Briefly looked at classical word embeddings
- Discussed a few cross-lingual and interpretable word embeddings
- Mentioned evaluation mechanisms and tools
- Argued for the existence of lower bounds on the number of dimensions of word embeddings
- Discussed some in-house applications
Thank You
References
Barg, A. and Yu, W.-H. (2014). New bounds for equiangular lines. Contemporary Mathematics, 625:111–121.

Barone, A. V. M. (2016). Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. arXiv preprint arXiv:1608.02996.

Baroni, M., Dinu, G., and Kruszewski, G. (2014). Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 238–247. Association for Computational Linguistics.

Bengio, Y., Ducharme, R., Vincent, P., and Janvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155.
Bhingardive, S., Singh, D., V, R., Redkar, H., and Bhattacharyya, P. (2015). Unsupervised most frequent sense detection using word embeddings. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1238–1243, Denver, Colorado. Association for Computational Linguistics.

Buschmeier, K., Cimiano, P., and Klinger, R. (2014). An impact analysis of features in a classification approach to irony detection in product reviews. In Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 42–49.
Collobert, R. and Weston, J. (2008a). A unified architecture for natural language processing: deep neural networks with multitask learning. In Cohen, W. W., McCallum, A., and Roweis, S. T., editors, ICML, volume 307 of ACM International Conference Proceeding Series, pages 160–167. ACM.

Collobert, R. and Weston, J. (2008b). A unified architecture for natural language processing: deep neural networks with multitask learning. In Cohen, W. W., McCallum, A., and Roweis, S. T., editors, ICML, volume 307 of ACM International Conference Proceeding Series, pages 160–167. ACM.

Dumais, S. T., Furnas, G. W., Landauer, T. K., Deerwester, S., and Harshman, R. (1988). Using latent semantic analysis to improve access to textual information. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 281–285. ACM.
Faruqui, M. and Dyer, C. (2014a). Community evaluation and exchange of word vectors at wordvectors.org. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, USA. Association for Computational Linguistics.
Faruqui, M. and Dyer, C. (2014b). Improving vector space word representations using multilingual correlation. Association for Computational Linguistics.
Ghannay, S., Favre, B., Estève, Y., and Camelin, N. (2016). Word embedding evaluation and combination. In Calzolari, N., Choukri, K., Declerck, T., Goggi, S., Grobelnik, M., Maegaard, B., Mariani, J., Mazo, H., Moreno, A., Odijk, J., and Piperidis, S., editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Paris, France. European Language Resources Association (ELRA).
González-Ibáñez, R., Muresan, S., and Wacholder, N. (2011). Identifying sarcasm in Twitter: a closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2, pages 581–586. Association for Computational Linguistics.
Gouws, S., Bengio, Y., and Corrado, G. (2015). BilBOWA: Fast bilingual distributed representations without word alignments. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 748–756.
Guo, J., Che, W., Yarowsky, D., Wang, H., and Liu, T. (2015). Cross-lingual dependency parsing based on distributed representations. In ACL (1), pages 1234–1244.
Harris, Z. S. (1970). Distributional structure. Springer.
Hermann, K. M. and Blunsom, P. (2013). Multilingual distributed representations without word alignment. arXiv preprint arXiv:1312.6173.
Huang, K., Gardner, M., Papalexakis, E., Faloutsos, C., Sidiropoulos, N., Mitchell, T., Talukdar, P. P., and Fu, X. (2015). Translation invariant word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1084–1088.
Joshi, A., Sharma, V., and Bhattacharyya, P. (2015). Harnessing context incongruity for sarcasm detection. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, volume 2, pages 757–762.
Klementiev, A., Titov, I., and Bhattarai, B. (2012). Inducing crosslingual distributed representations of words. In Proceedings of COLING 2012, pages 1459–1474.
Kočiský, T., Hermann, K. M., and Blunsom, P. (2014). Learning bilingual word representations by marginalizing alignments. arXiv preprint arXiv:1405.0947.
Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., and Dyer, C. (2016). Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.
Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211–240.
Levy, O. and Goldberg, Y. (2014). Dependency-based word embeddings. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 2: Short Papers, pages 302–308.
Levy, O., Goldberg, Y., and Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3:211–225.
Liebrecht, C., Kunneman, F., and van den Bosch, A. (2013). The perfect solution for detecting sarcasm in tweets #not. Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, WASSA.
Ling, W., Dyer, C., Black, A. W., and Trancoso, I. (2015). Two/too simple adaptations of word2vec for syntax problems. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1299–1304.
Lund, K. and Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2):203–208.
Luo, H., Liu, Z., Luan, H., and Sun, M. (2015). Online learning of interpretable word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1687–1692.
Luong, T., Pham, H., and Manning, C. D. (2015). Bilingual word representations with monolingual quality in mind. In VS@HLT-NAACL, pages 151–159.
Melamud, O., McClosky, D., Patwardhan, S., and Bansal, M. (2016). The role of context types and dimensionality in learning word embeddings. In Knight, K., Nenkova, A., and Rambow, O., editors, NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California, USA, June 12-17, 2016, pages 1030–1040. The Association for Computational Linguistics.
Mikolov, T., Le, Q. V., and Sutskever, I. (2013a). Exploiting similarities among languages for machine translation. CoRR, abs/1309.4168.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). Distributed representations of words and phrases and their compositionality. In Burges, C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K., editors, Advances in Neural Information Processing Systems 26, pages 3111–3119. Curran Associates, Inc.
Murphy, B., Talukdar, P., and Mitchell, T. (2012). Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding, pages 1933–1949. Association for Computational Linguistics.
Nayak, N., Angeli, G., and Manning, C. D. (2016). Evaluating word embeddings using a representative suite of practical tasks. ACL 2016, page 19.
Patel, K., Patel, D., Golakiya, M., Bhattacharyya, P., and Birari, N. (2017). Adapting pre-trained word embeddings for use in medical coding. In BioNLP 2017, pages 302–306, Vancouver, Canada. Association for Computational Linguistics.
Pennington, J., Socher, R., and Manning, C. D. (2014). GloVe: Global vectors for word representation. Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014), 12.
Rubenstein, H. and Goodenough, J. B. (1965). Contextual correlates of synonymy. Commun. ACM, 8(10):627–633.
Sahlgren, M. (2005). An introduction to random indexing. In Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE 2005.
Sahlgren, M. (2006). The Word-Space Model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. PhD thesis, Institutionen för lingvistik.
Schnabel, T., Labutov, I., Mimno, D., and Joachims, T. (2015). Evaluation methods for unsupervised word embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 298–307.
Schütze, H. (1992). Dimensions of meaning. In Supercomputing '92: Proceedings, pages 787–796. IEEE.
Søgaard, A., Agić, Ž., Alonso, H. M., Plank, B., Bohnet, B., and Johannsen, A. (2015). Inverted indexing for cross-lingual NLP. In The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015).
Swanepoel, K. J. (2004). Equilateral sets in finite-dimensional normed spaces. In Seminar of Mathematical Analysis, volume 71, pages 195–237. Secretariado de Publicaciones, Universidad de Sevilla, Seville.
Vyas, Y. and Carpuat, M. (2016). Sparse bilingual word representations for cross-lingual lexical entailment. In HLT-NAACL, pages 1187–1197.
Zhai, M., Tan, J., and Choi, J. D. (2016). Intrinsic and extrinsic evaluations of word embeddings. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI'16, pages 4282–4283. AAAI Press.
Web References
http://www.indiana.edu/~gasser/Q530/Notes/representation.html