Lecture 8: NLP and Word Embeddings (fall97.class.vision/slides/8.pdf): Transfer learning and word embeddings
TRANSCRIPT
Lecture 8 - SRTTU - A. Akhavan (Saturday, 19 Aban 1397 / November 10, 2018)
Lecture 8: NLP and Word Embeddings
Alireza Akhavan Pour
CLASS.VISION
NLP and Word Embeddings: Word representation
V = [a, aaron, ..., zulu, <UNK>], |V| = 10,000 (1-hot representation)
I want a glass of orange ______ . → juice
I want a glass of apple ______ . → ?

Each word is a one-hot vector over V (e.g. O_5391, O_9853). The problem? The Euclidean distance between any two one-hot vectors is the same (and their inner product is zero), so the model cannot generalize from words it has seen in training ("orange juice") to related words it has not ("apple ___").
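The problem stated above can be checked numerically. A minimal sketch (vocabulary size and word indices follow the slides; NumPy only): every pair of distinct one-hot vectors has inner product 0 and the same Euclidean distance, so no notion of word similarity survives.

```python
import numpy as np

V = 10_000  # vocabulary size from the slide

def one_hot(i, size=V):
    """One-hot column vector O_i: all zeros except a 1 at index i."""
    o = np.zeros(size)
    o[i] = 1.0
    return o

# Indices as used on the slides (orange = 6257, apple = 456, man = 5391).
orange, apple, man = one_hot(6257), one_hot(456), one_hot(5391)

# Any two distinct one-hot vectors: inner product 0, distance sqrt(2).
print(orange @ apple)                  # 0.0
print(np.linalg.norm(orange - apple))  # 1.4142135...
print(np.linalg.norm(orange - man))    # 1.4142135... (identical)
```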
Featurized representation: word embedding
Each word is described by ~300 learned features rather than a one-hot index. Illustrative values (reconstructed from the slide; rows are features, columns are words):

| Feature | Man | Woman | King | Queen | Apple | Orange |
|---------|-----|-------|------|-------|-------|--------|
| Gender  | -1   | 1    | -0.95 | 0.97 | 0.00  | 0.01   |
| Royal   | 0.01 | 0.02 | 0.93  | 0.94 | -0.01 | 0.00   |
| Age     | 0.03 | 0.02 | 0.69  | 0.71 | 0.03  | -0.02  |
| Food    | 0.01 | 0.00 | 0.02  | 0.00 | 0.96  | 0.97   |

Further features: Size, Alive, Price, Verb, ... up to 300 in all. The resulting 300-dimensional vectors are written e_456 (apple), e_6257 (orange), and so on.
I want a glass of orange ______ . → juice
I want a glass of apple ______ . → juice

Because the embeddings of apple and orange are close, the model generalizes and predicts "juice" in both cases.
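This generalization can be seen in the numbers. A sketch using the four feature rows from the table above (illustrative values, not learned ones): orange and apple are close in feature space, while orange and king are far apart.

```python
import numpy as np

# Features: [gender, royal, age, food], illustrative values from the slide.
e = {
    "man":    np.array([-1.00,  0.01,  0.03, 0.01]),
    "woman":  np.array([ 1.00,  0.02,  0.02, 0.00]),
    "king":   np.array([-0.95,  0.93,  0.69, 0.02]),
    "queen":  np.array([ 0.97,  0.94,  0.71, 0.00]),
    "apple":  np.array([ 0.00, -0.01,  0.03, 0.96]),
    "orange": np.array([ 0.01,  0.00, -0.02, 0.97]),
}

dist = lambda a, b: float(np.linalg.norm(e[a] - e[b]))
print(dist("orange", "apple"))  # small: the two fruits nearly coincide
print(dist("orange", "king"))   # much larger
```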
Visualizing word embeddings
[van der Maaten and Hinton, 2008. Visualizing data using t-SNE]
Using word embeddings: Named entity recognition example
Robert Lin is an apple farmer
Sally Johnson is an orange farmer
Robert Lin is a durian cultivator
Now if you test your model on the sentence "Robert Lin is a durian cultivator", the network should still recognize the name even if it has never seen the word durian during training. That is the power of word representations: the embedding of durian is close to those of apple and orange, so what was learned about "orange farmer" transfers to "durian cultivator".

The algorithms used to learn word embeddings can examine billions of words of unlabeled text (for example, 100 billion words) and learn the representation from them.
Transfer learning and word embeddings
I. Learn word embeddings from a large text corpus (1-100 billion words), or download a pre-trained embedding online.
II. Transfer the embedding to a new task with a smaller training set (say, 100k words).
III. Optional: continue to fine-tune the word embeddings with the new data. This is only worth doing if the training set from step II is reasonably large.
Another advantage is the reduced input dimensionality: instead of a 10,000-dimensional one-hot vector, we work with a 300-dimensional vector.
Relation to face encoding (Embeddings)
[Taigman et al., 2014. DeepFace: Closing the gap to human-level performance]
Word embeddings have an interesting relationship to the face recognition task:
o In face recognition, we encode each face into a vector and then check how similar these vectors are.
o "Encoding" and "embedding" mean much the same thing here.

In the word embedding task, we learn a representation for each word in a fixed vocabulary, unlike in face encoding, where the network must map each new, unseen image to an n-dimensional vector.
Properties of word embeddings
• Analogies
[Mikolov et al., 2013. Linguistic regularities in continuous space word representations]
Can we conclude this relation: Man ==> Woman King ==> ??
Using the four features above for e_Man, e_Woman, e_King, e_Queen:

e_Man - e_Woman ≈ (-2, 0, 0, 0)ᵀ
e_King - e_Queen ≈ (-2, 0, 0, 0)ᵀ

The two difference vectors are nearly equal, and the only feature that differs is gender, so the analogy Man:Woman :: King:Queen holds in embedding space.
Analogies using word vectors
[Figure: man, woman, King, Queen as vectors in 300-D space, shown via a 2-D t-SNE projection]

Find word w: argmax_w sim(e_w, e_King - e_Man + e_Woman)
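The argmax above can be run on the toy four-feature vectors from the earlier table (cosine similarity as defined on the next slide; all values illustrative). The query words themselves are excluded from the search, as is standard:

```python
import numpy as np

# Features: [gender, royal, age, food], illustrative values from the slide.
e = {
    "man":    np.array([-1.00,  0.01,  0.03, 0.01]),
    "woman":  np.array([ 1.00,  0.02,  0.02, 0.00]),
    "king":   np.array([-0.95,  0.93,  0.69, 0.02]),
    "queen":  np.array([ 0.97,  0.94,  0.71, 0.00]),
    "apple":  np.array([ 0.00, -0.01,  0.03, 0.96]),
    "orange": np.array([ 0.01,  0.00, -0.02, 0.97]),
}

def sim(u, v):
    """Cosine similarity: u.v / (||u|| ||v||)."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

target = e["king"] - e["man"] + e["woman"]
candidates = [w for w in e if w not in {"king", "man", "woman"}]
best = max(candidates, key=lambda w: sim(e[w], target))
print(best)  # queen
```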
Cosine similarity
sim(u, v) = uᵀv / (‖u‖₂ ‖v‖₂)

used as sim(e_w, e_King - e_Man + e_Woman).

Cosine similarity is the cosine of the angle between u and v: 1 for vectors pointing the same way, 0 for orthogonal vectors, -1 for opposite ones.

Euclidean distance (an alternative dissimilarity measure): ‖u - v‖²
Ottawa:Canada as Tehran:Iran
Man:Woman as Boy:Girl
Big:Bigger as Tall:Taller
Yen:Japan as Ruble:Russia
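The two measures on this slide can be written in a few lines of NumPy (a minimal sketch; the function names are mine):

```python
import numpy as np

def cosine_sim(u, v):
    """u^T v / (||u||_2 ||v||_2): 1 = same direction, 0 = orthogonal, -1 = opposite."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def sq_euclidean(u, v):
    """||u - v||^2: a dissimilarity measure (larger = less similar)."""
    return float(np.sum((u - v) ** 2))

u = np.array([1.0, 0.0])
print(cosine_sim(u, np.array([3.0, 0.0])))    # 1.0: same direction, length ignored
print(cosine_sim(u, np.array([0.0, 2.0])))    # 0.0: orthogonal
print(cosine_sim(u, np.array([-1.0, 0.0])))   # -1.0: opposite
print(sq_euclidean(u, np.array([3.0, 0.0])))  # 4.0: length does matter here
```

Cosine similarity ignores vector length while Euclidean distance does not; that is one reason cosine is the more common choice for comparing word vectors.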
Embedding matrix
The embedding matrix E has shape 300 × 10,000: one 300-dimensional column per vocabulary word.

| | here | … | orange | example | … | <UNK> |
|---|------|---|--------|---------|---|-------|
| | -0.2  | … | -0.67 | -0.2 | … | 0.2  |
| | 0.7   | … | 0.3   | -0.5 | … | 0.1  |
| | 0.85  | … | 0.25  | 0.3  | … | 1    |
| | -0.04 | … | -0.18 | 0.33 | … | -0.1 |
| | ⋮     |   | ⋮     | ⋮    |   | ⋮    |
| | 0.5   | … | 1     | 0.3  | … | 0.2  |

The word "orange" has index 6257, so its one-hot vector O_6257 = (0, …, 0, 1, 0, …, 0)ᵀ has a single 1 in position 6257.
E · O_6257 = e_6257
E · O_6257 = e_6257
(300 × 10k) · (10k × 1) = (300 × 1)

If O_6257 is the one-hot encoding of the word "orange", with shape (10000, 1), then np.dot(E, O_6257) = e_6257, whose shape is (300, 1).

Generally, np.dot(E, O_j) = e_j, the embedding of word j. In practice this multiplication is wasteful: it touches all 10,000 columns just to select one, so implementations simply look up column j of E.
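The product E · O_j just selects column j of E, so indexing the matrix gives the identical result without the multiplication. A sketch with a random E of the slide's shape:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim, vocab = 300, 10_000
E = rng.standard_normal((n_dim, vocab))  # embedding matrix, 300 x 10k

j = 6257                                 # index of "orange" on the slide
O_j = np.zeros((vocab, 1))
O_j[j, 0] = 1.0                          # one-hot vector, 10k x 1

e_j = np.dot(E, O_j)                     # shape (300, 1), as on the slide
print(e_j.shape)                         # (300, 1)

# Identical result from a plain column lookup, with no multiplications:
assert np.array_equal(e_j[:, 0], E[:, j])
```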
The Embedding layer is best understood as a dictionary mapping integer indices (which stand for specific words) to dense vectors. It takes integers as input, looks them up in an internal dictionary, and returns the associated vectors. It is effectively a dictionary lookup.
https://keras.io/layers/embeddings/
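That lookup behavior is easy to imitate in NumPy, which may make the quote concrete. The sketch below only mimics the Embedding layer's forward pass (Keras additionally treats the table as trainable weights, stored row-per-word with shape (vocab, dim)):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, n_dim = 10_000, 300
E = rng.standard_normal((vocab, n_dim))  # one row per word, as Keras stores it

def embedding_forward(indices):
    """Map integer word indices to dense vectors: a pure dictionary lookup."""
    return E[np.asarray(indices)]        # NumPy fancy indexing

batch = [[6257, 456], [5391, 9853]]      # 2 sequences of 2 word indices each
out = embedding_forward(batch)
print(out.shape)                         # (2, 2, 300): batch, timestep, embedding
```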
References
• https://www.coursera.org/specializations/deep-learning
• https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-using-word-embeddings.ipynb
• https://github.com/mbadry1/DeepLearning.ai-Summary/