NLP and Word Embeddings - cs230.stanford.edu · Andrew Ng
deeplearning.ai
NLP and Word Embeddings
Word representation
Andrew Ng
Word representation
V = [a, aaron, …, zulu, <UNK>]
1-hot representation: each word is a vector with a 1 at its vocabulary index and 0 everywhere else.

| Word | Index of the 1 |
|---|---|
| Man | 5391 |
| Woman | 9853 |
| King | 4914 |
| Queen | 7157 |
| Apple | 456 |
| Orange | 6257 |

I want a glass of orange ______.
I want a glass of apple ______.
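A minimal sketch of why the 1-hot representation cannot carry "orange juice" over to "apple juice": the inner product of any two different 1-hot vectors is 0, so no pair of words looks more related than any other. The vocabulary size of 10,000 and the helper names `one_hot` and `dot` are assumptions for illustration.

```python
VOCAB_SIZE = 10_000  # assumed vocabulary size, for illustration only


def one_hot(index, size=VOCAB_SIZE):
    """Return a list with a 1 at `index` and 0 everywhere else."""
    vec = [0] * size
    vec[index] = 1
    return vec


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


o_apple = one_hot(456)
o_orange = one_hot(6257)
o_king = one_hot(4914)

# Apple is exactly as "close" to orange as it is to king: not at all.
apple_orange = dot(o_apple, o_orange)
apple_king = dot(o_apple, o_king)
apple_apple = dot(o_apple, o_apple)
print(apple_orange, apple_king, apple_apple)
```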
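Featurized representation: word embedding

| | Man (5391) | Woman (9853) | King (4914) | Queen (7157) | Apple (456) | Orange (6257) |
|---|---|---|---|---|---|---|
| Gender | -1 | 1 | -0.95 | 0.97 | 0.00 | 0.01 |
| Royal | 0.01 | 0.02 | 0.93 | 0.95 | -0.01 | 0.00 |
| Age | 0.03 | 0.02 | 0.70 | 0.69 | 0.03 | -0.02 |
| Food | 0.09 | 0.01 | 0.02 | 0.01 | 0.95 | 0.97 |

I want a glass of orange ______.
I want a glass of apple ______.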
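A small sketch of what the featurized representation buys: with the values from this slide, apple and orange end up close together while apple and king do not. The feature order (gender, royal, age, food) is taken from the labels on the later analogies slide, and cosine similarity is one assumed choice of closeness measure.

```python
import math

# Feature order assumed to be (gender, royal, age, food).
embeddings = {
    "king":   [-0.95, 0.93, 0.70, 0.02],
    "apple":  [0.00, -0.01, 0.03, 0.95],
    "orange": [0.01, 0.00, -0.02, 0.97],
}


def cosine(u, v):
    """Cosine similarity: u.v / (||u|| * ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_apple_orange = cosine(embeddings["apple"], embeddings["orange"])
sim_apple_king = cosine(embeddings["apple"], embeddings["king"])
print(sim_apple_orange, sim_apple_king)
```

The first similarity is close to 1 and the second is close to 0, which is the property that lets a model trained on "orange juice" generalize to "apple juice".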
Visualizing word embeddings
[Figure: 2-D t-SNE plot of word embeddings. Words shown: man, woman, king, queen; dog, cat, fish; apple, grape, orange; one, two, three, four.]
[van der Maaten and Hinton, 2008. Visualizing data using t-SNE]
Using word embeddings
Named entity recognition example
Sally Johnson is an orange farmer
1 1 0 0 0 0
Robert Lin is an apple farmer
Transfer learning and word embeddings
1. Learn word embeddings from large text corpus. (1-100B words)
(Or download pre-trained embedding online.)
2. Transfer embedding to new task with smaller training set. (say, 100k words)
3. Optional: Continue to finetune the word embeddings with new data.
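The three steps above can be sketched on a toy scale. This is an illustration under stated assumptions, not the course's implementation: the "pre-trained" vectors are the four-feature values from the slides, the downstream task (is the word a fruit?) and the `train` and `predict` helpers are hypothetical, and the classifier is a plain logistic regression trained with SGD.

```python
import math

# Step 1 (assumed already done): embeddings learned elsewhere.
pretrained = {
    "king":   [-0.95, 0.93, 0.70, 0.02],
    "apple":  [0.00, -0.01, 0.03, 0.95],
    "orange": [0.01, 0.00, -0.02, 0.97],
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(vec, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)


def train(embeddings, examples, finetune=False, lr=0.5, epochs=200):
    """Step 2: fit a small classifier on top of the embeddings.
    Step 3 (optional): if finetune is True, also update the embeddings."""
    emb = {w: list(v) for w, v in embeddings.items()}  # work on a copy
    dim = len(next(iter(emb.values())))
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for word, label in examples:
            e = emb[word]
            err = predict(e, weights, bias) - label  # d(log loss)/d(logit)
            grad_e = [err * w for w in weights]       # uses weights before update
            weights = [w - lr * err * x for w, x in zip(weights, e)]
            bias -= lr * err
            if finetune:
                emb[word] = [x - lr * g for x, g in zip(e, grad_e)]
    return emb, weights, bias


# Tiny labeled set: "orange" never appears in it.
examples = [("apple", 1), ("king", 0)]
emb_frozen, w, b = train(pretrained, examples, finetune=False)
emb_tuned, _, _ = train(pretrained, examples, finetune=True)
p_orange = predict(emb_frozen["orange"], w, b)
print(p_orange)
```

Because orange sits next to apple in the embedding space, the classifier labels it as a fruit even though it was never in the small training set, which is the point of transferring the embedding.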
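Relation to face encoding
[Figure: two face images $x^{(i)}$ and $x^{(j)}$ are passed through the same network to produce encodings $f(x^{(i)})$ and $f(x^{(j)})$, which are compared to produce $\hat{y}$.]
[Taigman et al., 2014. DeepFace: Closing the gap to human level performance]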
Properties of word embeddings
Analogies

| | Man (5391) | Woman (9853) | King (4914) | Queen (7157) | Apple (456) | Orange (6257) |
|---|---|---|---|---|---|---|
| Gender | -1 | 1 | -0.95 | 0.97 | 0.00 | 0.01 |
| Royal | 0.01 | 0.02 | 0.93 | 0.95 | -0.01 | 0.00 |
| Age | 0.03 | 0.02 | 0.70 | 0.69 | 0.03 | -0.02 |
| Food | 0.09 | 0.01 | 0.02 | 0.01 | 0.95 | 0.97 |

[Mikolov et al., 2013. Linguistic regularities in continuous space word representations]
Analogies using word vectors
[Figure: 2-D plot of word embeddings. Words shown: man, woman, king, queen; dog, cat, fish; apple, grape, orange; one, two, three, four.]
$e_{man} - e_{woman} \approx e_{king} - e_{?}$
Cosine similarity
$\text{sim}(e_w, e_{king} - e_{man} + e_{woman})$
Man:Woman as Boy:Girl
Ottawa:Canada as Nairobi:Kenya
Big:Bigger as Tall:Taller
Yen:Japan as Ruble:Russia
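A short sketch of solving "man is to woman as king is to ?" with the four-feature vectors from the analogies slide: find the word $w$ that maximizes the cosine similarity between $e_w$ and $e_{king} - e_{man} + e_{woman}$. The helper names `cosine` and `solve_analogy` are assumptions, and excluding the three query words from the candidates is a common convention rather than something stated on the slide.

```python
import math

# Feature order: (gender, royal, age, food), values from the analogies slide.
E = {
    "man":    [-1.0, 0.01, 0.03, 0.09],
    "woman":  [1.0, 0.02, 0.02, 0.01],
    "king":   [-0.95, 0.93, 0.70, 0.02],
    "queen":  [0.97, 0.95, 0.69, 0.01],
    "apple":  [0.00, -0.01, 0.03, 0.95],
    "orange": [0.01, 0.00, -0.02, 0.97],
}


def cosine(u, v):
    """Cosine similarity: u.v / (||u|| * ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def solve_analogy(a, b, c, embeddings):
    """Return the word w maximizing sim(e_w, e_c - e_a + e_b)."""
    target = [ec - ea + eb
              for ea, eb, ec in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


answer = solve_analogy("man", "woman", "king", E)
print(answer)
```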
Embedding matrix
Embedding matrix
In practice, use a specialized function to look up an embedding.
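A sketch of why a lookup is preferred, assuming the embedding matrix $E$ stores one column per word so that $E \cdot o_j = e_j$: multiplying by a 1-hot vector touches every entry of $E$ only to select one column, while indexing reads that column directly. The tiny 4 x 4 matrix reuses the slide values for man, woman, king, and queen; `matvec` and `lookup` are hypothetical helper names.

```python
# Rows are features (gender, royal, age, food); columns are words.
WORDS = ["man", "woman", "king", "queen"]
E = [
    [-1.0, 1.0, -0.95, 0.97],
    [0.01, 0.02, 0.93, 0.95],
    [0.03, 0.02, 0.70, 0.69],
    [0.09, 0.01, 0.02, 0.01],
]


def matvec(matrix, vec):
    """Full matrix-vector product: cost grows with rows * columns."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def lookup(matrix, j):
    """Read column j directly: cost grows with rows only."""
    return [row[j] for row in matrix]


j = WORDS.index("king")
o_king = [1 if k == j else 0 for k in range(len(WORDS))]

via_product = matvec(E, o_king)
via_lookup = lookup(E, j)
print(via_product == via_lookup)
```

With a 10,000-word vocabulary the product does 10,000 multiplications per feature to recover values the lookup reads in one step.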
Learning word embeddings
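Neural language model
I want a glass of orange ______.

| Word | Index | One-hot vector | Embedding ($E \cdot o$) |
|---|---|---|---|
| I | 4343 | $o_{4343}$ | $e_{4343}$ |
| want | 9665 | $o_{9665}$ | $e_{9665}$ |
| a | 1 | $o_{1}$ | $e_{1}$ |
| glass | 3852 | $o_{3852}$ | $e_{3852}$ |
| of | 6163 | $o_{6163}$ | $e_{6163}$ |
| orange | 6257 | $o_{6257}$ | $e_{6257}$ |

[Bengio et al., 2003. A neural probabilistic language model]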
Other context/target pairs
I want a glass of orange juice to go along with my cereal.
Context: Last 4 words.
4 words on left & right
Last 1 word
Nearby 1 word
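The four context choices can be sketched for the target word "juice" in the example sentence. The helper names and the window of 5 positions used for "nearby 1 word" are assumptions for illustration.

```python
import random

sentence = "I want a glass of orange juice to go along with my cereal".split()


def last_n(words, t, n):
    """The n words immediately before position t."""
    return words[max(0, t - n):t]


def left_and_right(words, t, n):
    """The n words on each side of position t."""
    return words[max(0, t - n):t], words[t + 1:t + 1 + n]


def nearby_one(words, t, window, rng):
    """One randomly chosen word within `window` positions of t (not t itself)."""
    lo, hi = max(0, t - window), min(len(words), t + window + 1)
    candidates = [i for i in range(lo, hi) if i != t]
    return words[rng.choice(candidates)]


t = sentence.index("juice")
ctx_last4 = last_n(sentence, t, 4)
ctx_both = left_and_right(sentence, t, 4)
ctx_last1 = last_n(sentence, t, 1)
ctx_nearby = nearby_one(sentence, t, 5, random.Random(0))

print(ctx_last4)   # ['a', 'glass', 'of', 'orange']
print(ctx_both)    # (['a', 'glass', 'of', 'orange'], ['to', 'go', 'along', 'with'])
print(ctx_last1)   # ['orange']
print(ctx_nearby)  # some word within 5 positions of 'juice'
```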
Word2Vec
Skip-grams
I want a glass of orange juice to go along with my cereal.
[Mikolov et al., 2013. Efficient estimation of word representations in vector space]
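A minimal sketch of building skip-gram training pairs from the example sentence: pick a context word, then pair it with each word that falls within a fixed window around it. The window size of 2 and the function name `skip_gram_pairs` are assumptions; in practice a target is usually sampled at random from the window rather than enumerated.

```python
sentence = "I want a glass of orange juice to go along with my cereal".split()


def skip_gram_pairs(words, window):
    """Enumerate every (context, target) pair within `window` positions."""
    pairs = []
    for c, context in enumerate(words):
        lo = max(0, c - window)
        hi = min(len(words), c + window + 1)
        for t in range(lo, hi):
            if t != c:  # a word is not its own target
                pairs.append((context, words[t]))
    return pairs


pairs = skip_gram_pairs(sentence, window=2)
orange_targets = [target for context, target in pairs if context == "orange"]
print(orange_targets)  # ['glass', 'of', 'juice', 'to']
```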
Model
Vocab size = 10,000
Problems with softmax classification
$p(t \mid c) = \dfrac{e^{\theta_t^T e_c}}{\sum_{j=1}^{10{,}000} e^{\theta_j^T e_c}}$
How to sample the context $c$?
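A sketch of the softmax above, to make the cost visible: the denominator needs one dot product per vocabulary word for every single prediction. The toy sizes (1,000 words, 8 dimensions), the random parameters, and the name `softmax_probs` are assumptions for illustration; subtracting the maximum score is a standard numerical-stability step that leaves the probabilities unchanged.

```python
import math
import random


def softmax_probs(theta, e_c):
    """p(t | c) for every target t, given context embedding e_c."""
    # One dot product per vocabulary word: this is the expensive part.
    scores = [sum(t * x for t, x in zip(theta_j, e_c)) for theta_j in theta]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


rng = random.Random(0)
VOCAB, DIM = 1000, 8  # toy sizes; the slides use a 10,000-word vocabulary
theta = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
e_c = [rng.uniform(-1, 1) for _ in range(DIM)]

probs = softmax_probs(theta, e_c)
print(len(probs), sum(probs))
```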
Negative sampling
Defining a new learning problem
I want a glass of orange juice to go along with my cereal.
[Mikolov et al., 2013. Distributed representations of words and phrases and their compositionality]
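Model
Softmax: $p(t \mid c) = \dfrac{e^{\theta_t^T e_c}}{\sum_{j=1}^{10{,}000} e^{\theta_j^T e_c}}$

| context | word | target? |
|---|---|---|
| orange | juice | 1 |
| orange | king | 0 |
| orange | book | 0 |
| orange | the | 0 |
| orange | of | 0 |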
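One way to read the table: each row is a separate binary classification example, scored with a sigmoid, $P(y = 1 \mid c, t) = \sigma(\theta_t^T e_c)$, instead of a 10,000-way softmax. The sketch below computes the resulting loss for the five rows. The 3-dimensional parameter values are made up for illustration and `binary_loss` is a hypothetical helper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical parameters, not learned values.
e_orange = [0.5, -0.2, 0.8]  # embedding of the context word
theta = {
    "juice": [1.0, 0.0, 1.5],
    "king":  [-0.5, 0.3, -0.4],
    "book":  [-0.2, 0.1, -0.6],
    "the":   [0.0, 0.0, -0.1],
    "of":    [0.1, -0.1, -0.2],
}
rows = [("juice", 1), ("king", 0), ("book", 0), ("the", 0), ("of", 0)]


def binary_loss(rows, theta, e_c):
    """Sum of logistic losses over one positive and k negative rows."""
    loss = 0.0
    for word, y in rows:
        p = sigmoid(dot(theta[word], e_c))
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss


loss = binary_loss(rows, theta, e_orange)
p_juice = sigmoid(dot(theta["juice"], e_orange))
p_king = sigmoid(dot(theta["king"], e_orange))
print(loss, p_juice, p_king)
```

Each training step touches only the five classifiers in the table rather than all 10,000 output units.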
Selecting negative examples

| context | word | target? |
|---|---|---|
| orange | juice | 1 |
| orange | king | 0 |
| orange | book | 0 |
| orange | the | 0 |
| orange | of | 0 |
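A sketch of drawing the negative rows. It assumes the heuristic from the cited Mikolov et al. paper, which samples words in proportion to their corpus frequency raised to the 3/4 power; this sits between sampling by raw frequency (dominated by "the", "of") and sampling uniformly. The word counts and helper names are made up for illustration.

```python
import random

# Hypothetical corpus counts.
counts = {"the": 1000, "of": 800, "book": 30, "king": 20, "juice": 10}


def negative_sampling_distribution(counts, power=0.75):
    """P(w) proportional to count(w) ** power."""
    weights = {w: c ** power for w, c in counts.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


def make_examples(context, target, dist, k, rng):
    """One positive row plus k sampled negative rows."""
    words = list(dist)
    probs = [dist[w] for w in words]
    rows = [(context, target, 1)]
    # A sampled word may occasionally be a true neighbor; this sketch
    # does not filter those cases out.
    for negative in rng.choices(words, weights=probs, k=k):
        rows.append((context, negative, 0))
    return rows


dist = negative_sampling_distribution(counts)
rows = make_examples("orange", "juice", dist, k=4, rng=random.Random(0))
unigram_the = counts["the"] / sum(counts.values())
print(rows)
```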
GloVe word vectors
GloVe (global vectors for word representation)
I want a glass of orange juice to go along with my cereal.
[Pennington et al., 2014. GloVe: Global vectors for word representation]
Model
A note on the featurization view of word embeddings
minimize $\sum_{i=1}^{10{,}000} \sum_{j=1}^{10{,}000} f(X_{ij}) \left( \theta_i^T e_j + b_i - b'_j - \log X_{ij} \right)^2$
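A sketch of evaluating this objective on a tiny co-occurrence matrix, reading it as $\sum_i \sum_j f(X_{ij})(\theta_i^T e_j + b_i - b'_j - \log X_{ij})^2$, where $X_{ij}$ counts how often word $i$ appears in the context of word $j$. The weighting function follows the form in the cited Pennington et al. paper, $f(x) = \min(1, (x / x_{max})^{\alpha})$ with $f(0) = 0$; the defaults `x_max=100` and `alpha=0.75` and the function names are assumptions for illustration.

```python
import math


def weight(x, x_max=100.0, alpha=0.75):
    """Weighting term f(X_ij); zero counts contribute nothing."""
    if x <= 0:
        return 0.0
    return min(1.0, (x / x_max) ** alpha)


def glove_loss(X, theta, e, b, b_prime):
    """Weighted squared error between the model score and log X_ij."""
    total = 0.0
    n = len(X)
    for i in range(n):
        for j in range(n):
            if X[i][j] == 0:
                continue  # f(0) = 0, so log(0) is never evaluated
            score = sum(t * v for t, v in zip(theta[i], e[j]))
            # Sign of b'_j follows the slide; since b'_j is a free learned
            # parameter, writing it with a plus sign gives the same model.
            diff = score + b[i] - b_prime[j] - math.log(X[i][j])
            total += weight(X[i][j]) * diff ** 2
    return total


# Two-word toy vocabulary with all parameters at zero.
X = [[0, 100], [100, 0]]
theta = [[0.0, 0.0], [0.0, 0.0]]
e = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
b_prime = [0.0, 0.0]

loss = glove_loss(X, theta, e, b, b_prime)
print(loss)
```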
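
| | Man (5391) | Woman (9853) | King (4914) | Queen (7157) |
|---|---|---|---|---|
| Gender | -1 | 1 | -0.95 | 0.97 |
| Royal | 0.01 | 0.02 | 0.93 | 0.95 |
| Age | 0.03 | 0.02 | 0.70 | 0.69 |
| Food | 0.09 | 0.01 | 0.02 | 0.01 |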
Sentiment classification
Sentiment classification problem ($x \rightarrow y$)
The dessert is excellent.
Service was quite slow.
Good for a quick meal, but nothing special.
Completely lacking in good taste, good service, and good ambience.
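Simple sentiment classification model
The dessert is excellent

| Word | Index | One-hot vector | Embedding ($E \cdot o$) |
|---|---|---|---|
| The | 8928 | $o_{8928}$ | $e_{8928}$ |
| dessert | 2468 | $o_{2468}$ | $e_{2468}$ |
| is | 4694 | $o_{4694}$ | $e_{4694}$ |
| excellent | 3180 | $o_{3180}$ | $e_{3180}$ |

“Completely lacking in good taste, good service, and good ambience.”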
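A sketch of one simple way to combine the word embeddings, averaging them before a classifier, and of why the quoted review is hard for such a model: the average ignores word order, so three occurrences of "good" outweigh a single "lacking". The 2-dimensional vectors, whose first coordinate plays the role of a sentiment feature, are made up for illustration.

```python
import math

# Hypothetical embeddings: (sentiment, other).
emb = {
    "completely": [0.0, 0.1],
    "lacking":    [-0.8, 0.0],
    "in":         [0.0, 0.0],
    "good":       [0.9, 0.1],
    "taste":      [0.1, 0.5],
    "service":    [0.0, 0.4],
    "and":        [0.0, 0.0],
    "ambience":   [0.1, 0.6],
}


def average_embedding(words, emb):
    """Element-wise mean of the embeddings of `words`."""
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for w in words:
        for k, value in enumerate(emb[w]):
            total[k] += value
    return [t / len(words) for t in total]


review = "completely lacking in good taste good service and good ambience".split()
avg = average_embedding(review, emb)
avg_reversed = average_embedding(review[::-1], emb)

# The sentiment coordinate comes out positive for a negative review,
# and reversing the word order changes nothing.
print(avg[0] > 0)
print(all(math.isclose(a, b) for a, b in zip(avg, avg_reversed)))
```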
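RNN for sentiment classification
[Figure: RNN over the review "Completely lacking in good … ambience". Each word is mapped through $E$ to its embedding ($e_{1852}$, $e_{4966}$, $e_{4427}$, $e_{3882}$, …, $e_{330}$); the embeddings feed the activations $a^{<0>}, a^{<1>}, a^{<2>}, a^{<3>}, a^{<4>}, \dots, a^{<10>}$, and a softmax outputs $\hat{y}$.]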
Debiasing word embeddings
The problem of bias in word embeddings
Man:Woman as King:Queen
Man:Computer_Programmer as Woman:Homemaker
Father:Doctor as Mother:Nurse
Word embeddings can reflect gender, ethnicity, age, sexual orientation, and other biases of the text used to train the model.
[Bolukbasi et al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]
Addressing bias in word embeddings
1. Identify bias direction.
2. Neutralize: For every word that is not definitional, project to get rid of bias.
3. Equalize pairs.
[Bolukbasi et al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]
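Steps 1 and 2 can be sketched as a vector projection. This is a simplified illustration under stated assumptions: the bias direction $g$ is taken from a single pair, $e_{woman} - e_{man}$, using the four-feature values from the earlier slides, whereas the cited paper combines several definitional pairs; the `e_doctor` vector is hypothetical; step 3 (equalize) is not shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Step 1: identify a bias direction (assumed: one definitional pair).
e_man = [-1.0, 0.01, 0.03, 0.09]
e_woman = [1.0, 0.02, 0.02, 0.01]
g = [w - m for w, m in zip(e_woman, e_man)]


def neutralize(e, g):
    """Step 2: remove the component of e that lies along g."""
    coeff = dot(e, g) / dot(g, g)
    return [x - coeff * y for x, y in zip(e, g)]


# Hypothetical embedding of a non-definitional word with a gender component.
e_doctor = [-0.3, 0.05, 0.4, 0.02]
e_doctor_neutral = neutralize(e_doctor, g)

before = dot(e_doctor, g)
after = dot(e_doctor_neutral, g)
print(before, after)
```

After the projection the word has no component along $g$, so it is equally similar to both ends of the bias direction.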