TRANSCRIPT
CS11-747 Neural Networks for NLP
Introduction, Bag-of-words, and Multi-layer Perceptron
Graham Neubig
Site: https://phontron.com/class/nn4nlp2021/
Language is Hard!
Are These Sentences OK?
• Jane went to the store.
• store to Jane went the.
• Jane went store.
• Jane goed to the store.
• The store went to Jane.
• The food truck went to Jane.
Engineering Solutions
• Jane went to the store.
• store to Jane went the.
• Jane went store.
• Jane goed to the store.
• The store went to Jane.
• The food truck went to Jane.
→ Create a grammar of the language
→ Consider morphology and exceptions
→ Semantic categories, preferences
→ …and their exceptions
Are These Sentences OK?
• ジェインは店へ行った。(Jane went to the store.)
• は店行ったジェインは。(scrambled word order)
• ジェインは店へ行た。(misconjugated verb)
• 店はジェインへ行った。(The store went to Jane.)
• 屋台はジェインのところへ行った。(The food stall went to Jane.)
Phenomena to Handle
• Morphology
• Syntax
• Semantics/World Knowledge
• Discourse
• Pragmatics
• Multilinguality
Neural Nets for NLP
• Neural nets are a tool to do hard things!
• This class will give you the tools to handle the problems you want to solve in NLP.
Class Format/Structure
(Special Remote) Class Format
• Before class: Watch the lecture video, often do the reading
• During class:
  • Discussion: Gather in Zoom to discuss some questions presented in the video
  • Code/Data Walk: The TAs (or instructor) will sometimes walk through some demonstration code, data, or model predictions
• After class: Do a quiz about the material
Scope of Teaching
• Basics of general neural network knowledge → covered briefly (see the reading and ask TAs if you are not familiar); there will be a recitation
• Advanced training techniques for neural networks → some coverage (e.g. VAEs and adversarial training), mostly from the scope of NLP; not as much as other DL classes
• Advanced NLP-related neural network architectures → covered in detail
• Structured prediction and structured models in neural nets → covered in detail
• Implementation details salient to NLP → covered in detail
Assignments
• Assignment 1 - Build-your-own Neural Network Toolkit: individually implement some parts of a neural network
• Assignment 2 - Text Classifier / Questionnaire: individually implement a text classifier and fill in a questionnaire on topics of interest
• Assignment 3 - SOTA Survey / Re-implementation: re-implement and reproduce results from a recently published paper
• Assignment 4 - Final Project: perform a unique project that either (1) improves on the state of the art, or (2) applies neural net models to a unique task
Instructors
• Instructor: Graham Neubig (natural language analysis, multilingual NLP, ML for NLP)
• Co-Instructor: Pengfei Liu (text summarization, information extraction, and interpretable evaluation)
• TAs:
  • Shuyan Zhou (natural language command and control)
  • Zhisong Zhang (syntax and shallow semantic analysis)
  • Divyansh Kaushik (robustness, causality, human-in-the-loop)
  • Zhengbao Jiang (knowledge and large language models)
  • Ritam Dutt (AI for social good, discourse and pragmatics)
• Piazza: http://piazza.com/cmu/spring2021/cs11747/home
Neural Networks: A Tool for Doing Hard Things
An Example Prediction Problem: Sentence Classification
Given a sentence such as "I hate this movie" or "I love this movie", predict a label on a five-point sentiment scale: very good, good, neutral, bad, very bad.
A First Try: Bag of Words (BOW)
For "I hate this movie": look up a score vector for each word, sum the vectors together with a bias to get scores, then apply a softmax to get probabilities.
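A minimal sketch of this BOW scorer in plain Python, using hypothetical hand-set word score vectors (a real model would learn these values):

```python
# Toy BOW classifier: each word has a 5-dim score vector for
# [very good, good, neutral, bad, very bad]; sum them with a bias,
# then softmax. All weights below are invented for illustration.
import math

LABELS = ["very good", "good", "neutral", "bad", "very bad"]

# Hypothetical lookup table (not learned).
WORD_SCORES = {
    "i":     [0.0, 0.0, 0.0, 0.0, 0.0],
    "hate":  [-1.0, -0.5, 0.0, 0.5, 1.0],
    "love":  [1.0, 0.5, 0.0, -0.5, -1.0],
    "this":  [0.0, 0.0, 0.0, 0.0, 0.0],
    "movie": [0.0, 0.0, 0.1, 0.0, 0.0],
}
BIAS = [0.0, 0.0, 0.2, 0.0, 0.0]  # e.g. a slight prior toward "neutral"

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def bow_probs(sentence):
    scores = list(BIAS)
    for word in sentence.lower().split():
        for k, v in enumerate(WORD_SCORES.get(word, [0.0] * 5)):
            scores[k] += v
    return softmax(scores)

probs = bow_probs("I hate this movie")
print(LABELS[max(range(5), key=lambda k: probs[k])])  # prints "very bad"
```

With these toy weights, "hate" dominates the sum and "very bad" wins; swapping in "love" flips the prediction to "very good".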
What do Our Vectors Represent?
• Each word has its own 5-element vector, with elements corresponding to [very good, good, neutral, bad, very bad]
• “hate” will have a high value for “very bad”, etc.
Build It, Break It
• There's nothing I don't love about this movie
• I don't love this movie
(Both sentences are scored on the same five-point scale; a bag-of-words model misreads them, because word-level scores cannot capture negation.)
Combination Features
• Does it contain “don’t” and “love”?
• Does it contain “don’t”, “i”, “love”, and “nothing”?
Basic Idea of Neural Networks (for NLP Prediction Tasks)
For "I hate this movie": look up a vector for each word, feed the vectors through some complicated function that extracts combination features (the neural net) to get scores, then apply a softmax to get probabilities.
Continuous Bag of Words (CBOW)
For "I hate this movie": look up a dense vector for each word, sum the vectors, multiply the sum by a weight matrix W, and add a bias to get scores.
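A minimal CBOW sketch with invented tiny dimensions and random, unlearned weights, just to show the shape of the computation:

```python
# Toy CBOW: each word maps to a dense 3-dim feature vector; the sum is
# multiplied by a 5x3 weight matrix W and a bias is added to give 5 label
# scores. All dimensions and values are illustrative, not learned.
import random

random.seed(0)
EMB_DIM, N_LABELS = 3, 5

vocab = ["i", "hate", "this", "movie"]
embeddings = {w: [random.uniform(-1, 1) for _ in range(EMB_DIM)] for w in vocab}
W = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(N_LABELS)]
bias = [0.0] * N_LABELS

def cbow_scores(sentence):
    # Sum the word embeddings (the "continuous bag of words")...
    h = [0.0] * EMB_DIM
    for word in sentence.lower().split():
        for k, v in enumerate(embeddings[word]):
            h[k] += v
    # ...then project to label scores: scores = W h + bias.
    return [sum(W[j][k] * h[k] for k in range(EMB_DIM)) + bias[j]
            for j in range(N_LABELS)]

print(cbow_scores("I hate this movie"))  # 5 label scores
```

Note the contrast with BOW: the word vectors are low-dimensional features rather than per-label scores, and the projection W maps features to labels.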
What do Our Vectors Represent?
• Each vector has "features" (e.g., is this an animate object? is this a positive word?)
• We sum these features, then use these to make predictions
• Still no combination features: only the expressive power of a linear model, but dimension reduced
Deep CBOW
For "I hate this movie": sum the word vectors into h, pass h through tanh(W1*h + b1) and then tanh(W2*h + b2), then multiply by W and add a bias to get scores.
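The same idea extended to Deep CBOW, again with invented random weights; the two tanh layers are what add nonlinearity and let the model form combination features:

```python
# Toy Deep CBOW: the summed word vectors h pass through two tanh hidden
# layers before the final linear scoring layer. Weights are random and
# purely illustrative.
import math, random

random.seed(1)
D = 4  # embedding/hidden size (illustrative)

def rand_mat(r, c):
    return [[random.uniform(-1, 1) for _ in range(c)] for _ in range(r)]

def affine(W, b, v):
    return [sum(W[i][j] * v[j] for j in range(len(v))) + b[i]
            for i in range(len(b))]

vocab = ["i", "hate", "this", "movie"]
emb = {w: [random.uniform(-1, 1) for _ in range(D)] for w in vocab}
W1, b1 = rand_mat(D, D), [0.0] * D
W2, b2 = rand_mat(D, D), [0.0] * D
W, bias = rand_mat(5, D), [0.0] * 5  # 5 sentiment labels

def deep_cbow_scores(sentence):
    h = [sum(emb[w][k] for w in sentence.lower().split()) for k in range(D)]
    h = [math.tanh(x) for x in affine(W1, b1, h)]  # tanh(W1*h + b1)
    h = [math.tanh(x) for x in affine(W2, b2, h)]  # tanh(W2*h + b2)
    return affine(W, bias, h)                      # W*h + bias = scores

print(deep_cbow_scores("I hate this movie"))
```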
What do Our Vectors Represent?
• Now things are more interesting!
• We can learn feature combinations (a node in the second layer might be “feature 1 AND feature 5 are active”)
• e.g. capture things such as “not” AND “hate”
What is a Neural Net?: Computation Graphs
"Neural" Nets
Original motivation: neurons in the brain (image credit: Wikipedia).
Current conception: computation graphs, built from nodes such as f(u) = uᵀ, f(U, V) = UV, f(M, v) = Mv, f(u, v) = u · v, and f(x₁, x₂, x₃) = Σᵢ xᵢ, applied to inputs x, A, b, c.
expression: y = xᵀAx + b · x + c
graph: starts from the value node x.
A node is a {tensor, matrix, vector, scalar} value.
expression: y = xᵀAx + b · x + c
An edge represents a function argument (and also a data dependency); edges are just pointers to nodes. A node with an incoming edge is a function of that edge's tail node, e.g. f(u) = uᵀ.
A node knows how to compute its value, and the value of its derivative w.r.t. each argument (edge) times the derivative of an arbitrary input F. For f(u) = uᵀ:
∂f(u)/∂u · ∂F/∂f(u) = (∂F/∂f(u))ᵀ
expression: y = xᵀAx + b · x + c
graph: adds the nodes f(u) = uᵀ and f(U, V) = UV over x and A.
Functions can be nullary, unary, binary, … n-ary. Often they are unary or binary.
expression: y = xᵀAx + b · x + c
graph: adds the node f(M, v) = Mv.
Computation graphs are directed and acyclic (in DyNet).
expression: y = xᵀAx + b · x + c
graph: the nodes above can be fused into a single node f(x, A) = xᵀAx, with derivatives
∂f(x, A)/∂A = xxᵀ
∂f(x, A)/∂x = (Aᵀ + A)x
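As a sanity check on the derivative above, a small script (with a hand-picked toy A and x, not from the slides) can compare the analytic gradient (Aᵀ + A)x against finite differences:

```python
# Numerical check: for f(x, A) = x^T A x, the gradient w.r.t. x is
# (A^T + A) x. The matrix and vector below are arbitrary toy values.
A = [[1.0, 2.0],
     [3.0, 4.0]]
x = [0.5, -1.0]

def f(x):
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

# Analytic gradient: (A^T + A) x
grad = [sum((A[j][i] + A[i][j]) * x[j] for j in range(2)) for i in range(2)]

# Central finite-difference gradient
eps = 1e-6
num = []
for i in range(2):
    xp = list(x); xp[i] += eps
    xm = list(x); xm[i] -= eps
    num.append((f(xp) - f(xm)) / (2 * eps))

print(grad, num)  # the two gradients agree closely
```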
expression: y = xᵀAx + b · x + c
graph: the full graph, adding f(u, v) = u · v over b and x, the constant c, and the sum f(x₁, x₂, x₃) = Σᵢ xᵢ.
expression: y = xᵀAx + b · x + c
graph: the output of the sum node is labeled y; variable names are just labelings of nodes.
Algorithms (1)
• Graph construction
• Forward propagation
• In topological order, compute the value of the node given its inputs
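The two steps above (graph construction, then forward propagation in topological order) can be sketched by hand-building a tiny graph for y = xᵀAx + b · x + c; all values here are made-up toy numbers:

```python
# Hand-built computation graph for y = x^T A x + b.x + c (2-dim x).
# Nodes are listed in topological order: inputs first, each operation
# after its arguments; forward propagation just evaluates them in order.
x = [1.0, 2.0]
A = [[1.0, 0.0], [0.0, 1.0]]  # identity, so x^T A x = 5
b = [0.5, 0.5]
c = 1.0

def vecmat(v, M):  # v^T M
    return [sum(v[i] * M[i][j] for i in range(2)) for j in range(2)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

graph = [
    ("x", None), ("A", None), ("b", None), ("c", None),  # input nodes
    ("xTA",  lambda vals: vecmat(vals["x"], vals["A"])),
    ("xTAx", lambda vals: dot(vals["xTA"], vals["x"])),
    ("bx",   lambda vals: dot(vals["b"], vals["x"])),
    ("y",    lambda vals: vals["xTAx"] + vals["bx"] + vals["c"]),
]

vals = {"x": x, "A": A, "b": b, "c": c}
for name, fn in graph:
    if fn is not None:          # compute each node given its inputs
        vals[name] = fn(vals)

print(vals["y"])  # 5 + 1.5 + 1.0 -> prints 7.5
```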
Forward Propagation
Stepping through the graph for y = xᵀAx + b · x + c in topological order, the node values are computed one at a time: first xᵀ, then xᵀA, then b · x, then xᵀAx, and finally xᵀAx + b · x + c.
Algorithms (2)
• Back-propagation:
  • Process examples in reverse topological order
  • Calculate the derivatives of the parameters with respect to the final value (this is usually a "loss function", a value we want to minimize)
• Parameter update: move the parameters in the direction of this derivative: W -= α * dl/dW
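The update rule W -= α * dl/dW can be illustrated on a one-parameter toy problem (the loss function and numbers below are invented for illustration):

```python
# Minimize l(w) = (w*x - t)^2 by repeated parameter updates.
# The gradient is dl/dw = 2*(w*x - t)*x.
x, t = 2.0, 6.0       # one training example; the true w is 3
w, alpha = 0.0, 0.05  # initial parameter and learning rate

for _ in range(100):
    grad = 2 * (w * x - t) * x  # derivative of the loss w.r.t. w
    w -= alpha * grad           # move against the gradient: w -= α * dl/dw
print(round(w, 3))  # prints 3.0
```

Each step shrinks the error by a constant factor here, so w converges to the minimizer of the loss.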
Back Propagation
The same graph for y = xᵀAx + b · x + c, with derivatives propagated backward from y through each node in reverse topological order.
Concrete Implementation Examples
Neural Network Frameworks
Basic Process in (Dynamic) Neural Network Frameworks
• Create a model
• For each example
• create a graph that represents the computation you want
• calculate the result of that computation
• if training, perform back propagation and update
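A minimal sketch of this per-example loop, using an invented two-example toy dataset and a bag-of-words logistic classifier; hand-written gradients stand in for the backprop a real framework would run automatically:

```python
# Per-example training loop: for each example, "build the graph"
# (here: just compute the score), run forward, compute the gradient,
# and update the parameters. Data and model are toy inventions.
import math

data = [("i love this movie", 1), ("i hate this movie", 0)]
vocab = {"i": 0, "love": 1, "hate": 2, "this": 3, "movie": 4}
w = [0.0] * len(vocab)  # model parameters
alpha = 0.5             # learning rate

for epoch in range(50):
    for sentence, label in data:
        idxs = [vocab[t] for t in sentence.split()]
        score = sum(w[i] for i in idxs)   # forward: compute the result
        p = 1 / (1 + math.exp(-score))    # probability of label 1
        grad = p - label                  # backward: dloss/dscore
        for i in idxs:                    # update the parameters
            w[i] -= alpha * grad

print(w[vocab["love"]] > 0, w[vocab["hate"]] < 0)  # prints True True
```

The shared words ("i", "this", "movie") receive roughly canceling updates, while "love" and "hate" accumulate positive and negative weight respectively.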
Bag of Words (BOW) (review)
For "I hate this movie": look up a score vector for each word, sum the vectors together with a bias to get scores, then apply a softmax to get probabilities.
Continuous Bag of Words (CBOW) (review)
For "I hate this movie": look up a dense vector for each word, sum the vectors, multiply the sum by a weight matrix W, and add a bias to get scores.
Deep CBOW (review)
For "I hate this movie": sum the word vectors into h, pass h through tanh(W1*h + b1) and then tanh(W2*h + b2), then multiply by W and add a bias to get scores.
Things to Remember Going Forward
Things to Remember
• Neural nets are powerful! They are universal function approximators: they can approximate any continuous function.
• But language is hard, and data is limited.
• We need to design our networks to have inductive bias, to make it easy to learn the things we'd like to learn.
Class Plan
Topic 1: Models of Sentences/Sequences
• Bag of words, bag of n-grams
• Convolutional nets
• Recurrent neural networks and variations
• Modeling documents and longer texts
(Example figure: a neural net predicting "undeserved" after the context "this movie's reputation is".)
Topic 2: Implementing, Debugging, and Interpreting
• Implementation: How to efficiently and effectively implement your models
• Debugging: How to find problems in your implemented models
• Interpretation: How to find out why your model made a particular prediction
Example: [Ribeiro+ 16]
Topic 3: Conditioned Generation
• Encoder-decoder models
• Attentional models, self-attention (Transformers)
(Example figure: an LSTM encoder reads "I hate this movie", and an LSTM decoder generates its Japanese translation "この 映画 が 嫌い" token by token via argmax.)
Topic 4: Pre-trained Embeddings
• Pre-training word embeddings, contextualized word embeddings, sentence embeddings
• Design decisions in pre-training: model, data, objective
Topic 5: Structured Prediction Models
• CRFs and other marginalization-based training
• REINFORCE, minimum risk training
• Margin-based and search-based training methods
• Advanced search algorithms
(Example figure: a bidirectional LSTM tagging "I hate this movie" with the POS tags PRP VB DT NN.)
Topic 6: Models of Tree/Graph Structures
• Shift-reduce, minimum spanning tree parsing
• Tree-structured compositions
• Models of graph structures
(Example figure: RNNs composing "I hate this movie" recursively into a tree.)
Topic 7: Advanced Learning Techniques
• Models with Latent Random Variables
• Adversarial Networks
• Semi-supervised and Unsupervised Learning
Topic 8: Knowledge-based and Text-based QA
• Learning and QA over knowledge graphs
• Machine reading and text-based QA
(Example figure: a small knowledge graph with "dog is-a animal" and "cat is-a animal".)
Topic 9: Multi-task and Multilingual Learning
• Multi-task and transfer learning
• Multilingual learning of representations
(Example figure: shared learning across "I hate this movie", its Japanese translation "この 映画 が 嫌い", and the POS tags PRP VB DT NN.)
Any Questions?