![Page 1: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/1.jpg)
Yulia Tsvetkov
Algorithms for NLP
IITP, Fall 2019
Lecture 21: Machine Translation I
![Page 2: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/2.jpg)
Machine Translation
![Page 3: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/3.jpg)
![Page 4: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/4.jpg)
from Dream of the Red Chamber, Cao Xue Qin (1792)
![Page 5: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/5.jpg)
![Page 6: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/6.jpg)
English: leg, foot, paw
French: jambe, pied, patte, étape
![Page 7: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/7.jpg)
Challenges
Ambiguities
▪ words
▪ morphology
▪ syntax
▪ semantics
▪ pragmatics
Gaps in data
▪ availability of corpora
▪ commonsense knowledge
+ Understanding of context, connotation, social norms, etc.
![Page 8: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/8.jpg)
Classical Approaches to MT
▪ Direct translation: word-by-word dictionary translation
▪ Transfer approaches: word dictionary + rules
Pipeline: source language text → morphological analysis → lexical transfer using bilingual dictionary → local reordering → morphological generation → target language text
![Page 9: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/9.jpg)
![Page 10: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/10.jpg)
Levels of Transfer
![Page 11: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/11.jpg)
Levels of Transfer
▪ Syntactic transfer
![Page 12: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/12.jpg)
Levels of Transfer
▪ Syntactic transfer
![Page 13: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/13.jpg)
Levels of Transfer
▪ Syntactic transfer
![Page 14: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/14.jpg)
Levels of Transfer
▪ Semantic transfer
![Page 15: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/15.jpg)
Levels of Transfer
![Page 16: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/16.jpg)
Levels of Transfer
![Page 17: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/17.jpg)
Levels of Transfer
![Page 18: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/18.jpg)
The Vauquois Triangle (1968)
![Page 19: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/19.jpg)
![Page 20: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/20.jpg)
Levels of Transfer: The Vauquois triangle
![Page 21: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/21.jpg)
![Page 22: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/22.jpg)
Statistical approaches
![Page 23: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/23.jpg)
Statistical MT
Modeling correspondences between languages
Sentence-aligned parallel corpus:
▪ Yo lo haré mañana = I will do it tomorrow
▪ Hasta pronto = See you soon
▪ Hasta pronto = See you around
Novel sentence: Yo lo haré pronto
Candidate translations:
▪ I will do it soon
▪ I will do it around
▪ See you tomorrow
Machine translation system:
Model of translation
![Page 24: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/24.jpg)
Research Problems
▪ How can we formalize the process of learning to translate from examples?
▪ How can we formalize the process of finding translations for new inputs?
▪ If our model produces many outputs, how do we find the best one?
▪ If we have a gold standard translation, how can we tell if our output is good or bad?
![Page 25: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/25.jpg)
Two Views of MT
![Page 26: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/26.jpg)
MT as Code Breaking
![Page 27: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/27.jpg)
The Noisy-Channel Model
source → w → noisy channel → a
![Page 28: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/28.jpg)
source → w → noisy channel → a (observed) → decoder → best w
The Noisy-Channel Model
![Page 29: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/29.jpg)
source → w → noisy channel → a (observed) → decoder → best w
▪ We want to predict a sentence given acoustics:
The Noisy-Channel Model
![Page 30: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/30.jpg)
▪ We want to predict a sentence given acoustics:
▪ The noisy-channel approach:
The Noisy-Channel Model
![Page 31: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/31.jpg)
▪ The noisy-channel approach:
The two factors: channel model and source model
source → w → noisy channel → a (observed) → decoder → best w
The Noisy-Channel Model
![Page 32: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/32.jpg)
▪ The noisy-channel approach:
source → w → noisy channel → a (observed) → decoder → best w
▪ Likelihood: acoustic model (HMMs) in speech recognition; translation model in MT
▪ Prior: language model, i.e. distributions over sequences of words (sentences)
The Noisy-Channel Model
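The slide equations did not survive extraction. As a hedged reconstruction, the standard noisy-channel derivation via Bayes' rule is sketched below. The symbols `w` (sentence) and `a` (acoustics) follow the diagram labels; the second line, with `e` for the target sentence and `f` for the source sentence, is the usual MT analogue and is an assumption about notation rather than a copy of the slide.

```latex
\begin{align*}
\hat{w} &= \arg\max_{w} P(w \mid a)
         = \arg\max_{w} \frac{P(a \mid w)\, P(w)}{P(a)}
         = \arg\max_{w} \underbrace{P(a \mid w)}_{\text{likelihood (channel model)}}
                        \underbrace{P(w)}_{\text{prior (source model)}} \\
\hat{e} &= \arg\max_{e} P(e \mid f)
         = \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}}
                        \underbrace{P(e)}_{\text{language model}}
\end{align*}
```

The denominator can be dropped because it does not depend on the quantity being maximized.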
![Page 33: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/33.jpg)
Noisy Channel Model
![Page 34: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/34.jpg)
MT as Direct Modeling
▪ one model does everything
▪ trained to reproduce a corpus of translations
![Page 35: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/35.jpg)
Two Views of MT
▪ Code breaking (aka the noisy channel, Bayes rule)
  ▪ I know the target language
  ▪ I have example translated texts (example enciphered data)
▪ Direct modeling (aka pattern matching)
  ▪ I have really good learning algorithms and a bunch of example inputs (source language sentences) and outputs (target language translations)
![Page 36: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/36.jpg)
Which is better?
▪ Noisy channel
  ▪ easy to use monolingual target language data
  ▪ search happens under a product of two models (individual models can be simple, product can be powerful)
  ▪ obtaining probabilities requires renormalizing
▪ Direct model
  ▪ directly model the process you care about
  ▪ model must be very powerful
![Page 37: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/37.jpg)
Where are we in 2019?
▪ Direct modeling is where most of the action is
  ▪ Neural networks are very good at generalizing and conceptually very simple
  ▪ Inference in “product of two models” is hard
▪ Noisy channel ideas are incredibly important and still play a big role in how we think about translation
![Page 38: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/38.jpg)
Two Views of MT
![Page 39: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/39.jpg)
Noisy Channel: Phrase-Based MT
[Diagram: parallel corpus → translation model (source phrase f, target phrase e, translation features); monolingual corpus → language model; held-out parallel corpus → reranking model (feature weights)]
![Page 41: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/41.jpg)
A common problem
Both models must assign probabilities to how a sentence in one language translates into a sentence in another language.
![Page 42: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/42.jpg)
Learning from Data
![Page 43: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/43.jpg)
Parallel corpora
![Page 44: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/44.jpg)
![Page 45: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/45.jpg)
![Page 46: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/46.jpg)
![Page 47: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/47.jpg)
Mining parallel data from microblogs (Ling et al., 2013)
![Page 48: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/48.jpg)
http://opus.nlpl.eu
![Page 49: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/49.jpg)
![Page 50: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/50.jpg)
▪ There is a lot more monolingual data in the world than translated data
▪ Easy to get about 1 trillion words of English by crawling the web
▪ With some work, you can get 1 billion translated words of English-French
▪ What about Japanese-Turkish?
![Page 51: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/51.jpg)
Phrase-Based MT
[Diagram: parallel corpus → translation model (source phrase f, target phrase e, translation features); monolingual corpus → language model; held-out parallel corpus → reranking model (feature weights)]
![Page 52: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/52.jpg)
Phrase-Based System Overview
Sentence-aligned corpus → Word alignments → Phrase table (translation model)
Example phrase table entries:
cat ||| chat ||| 0.9
the cat ||| le chat ||| 0.8
dog ||| chien ||| 0.8
house ||| maison ||| 0.6
my house ||| ma maison ||| 0.9
language ||| langue ||| 0.9
…
Many slides and examples from Philipp Koehn or John DeNero
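The phrase table entries on this slide use a `source ||| target ||| score` layout. A minimal sketch of reading that layout into a lookup structure might look like the following. The function name `parse_phrase_table` and the dictionary layout are illustrative assumptions, not part of any particular MT toolkit, and real phrase tables usually carry several feature scores per entry.

```python
from collections import defaultdict


def parse_phrase_table(lines):
    """Parse 'source ||| target ||| score' lines into {source: [(target, score), ...]}."""
    table = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target, score = (field.strip() for field in line.split("|||"))
        table[source].append((target, float(score)))
    return dict(table)


# Entries copied from the slide; the single score is treated as one translation feature.
PHRASE_TABLE_LINES = [
    "cat ||| chat ||| 0.9",
    "the cat ||| le chat ||| 0.8",
    "dog ||| chien ||| 0.8",
    "house ||| maison ||| 0.6",
    "my house ||| ma maison ||| 0.9",
    "language ||| langue ||| 0.9",
]

phrase_table = parse_phrase_table(PHRASE_TABLE_LINES)
print(phrase_table["my house"])
```

Keying on the source phrase makes it cheap to look up every candidate translation of a span during decoding.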
![Page 53: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/53.jpg)
Word Alignment Models
![Page 54: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/54.jpg)
Lexical Translation
▪ How do we translate a word? Look it up in the dictionary
Haus — house, building, home, household, shell
▪ Multiple translations
  ▪ some more frequent than others
  ▪ different word senses, different registers, different inflections (?)
  ▪ house, home are common
  ▪ shell is specialized (the Haus of a snail is a shell)
![Page 55: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/55.jpg)
How common is each?
Look at a parallel corpus (German text along with English translation)
![Page 56: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/56.jpg)
Estimate Translation Probabilities
Maximum likelihood estimation
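As a rough sketch of maximum likelihood estimation for a single source word: count how often each English word appears as its translation, then divide by the total. The counts below are made up for illustration and do not come from a real corpus; `mle_translation_probs` is a hypothetical helper name.

```python
from collections import Counter


def mle_translation_probs(counts):
    """Relative-frequency (maximum likelihood) estimate of t(e | f) for one source word f."""
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}


# Hypothetical counts of how "Haus" was translated in some aligned corpus.
haus_counts = Counter(
    {"house": 8000, "building": 1600, "home": 200, "household": 150, "shell": 50}
)

t_given_haus = mle_translation_probs(haus_counts)
for english_word, prob in sorted(t_given_haus.items(), key=lambda kv: -kv[1]):
    print(f"t({english_word} | Haus) = {prob:.3f}")
```

This only works when the word-level correspondences are observed, which is exactly what a raw parallel corpus lacks.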
![Page 57: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/57.jpg)
▪ Goal: a model p(e | f)
  ▪ where e and f are complete English and Foreign sentences
Lexical Translation
![Page 58: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/58.jpg)
Lexical Translation
▪ Goal: a model p(e | f)
  ▪ where e and f are complete English and Foreign sentences
▪ Lexical translation makes the following assumptions:
  ▪ Each word e_i in e is generated from exactly one word in f
  ▪ Thus, we have an alignment a_i that indicates which word e_i “came from”; specifically, it came from f_{a_i}
  ▪ Given the alignments a, translation decisions are conditionally independent of each other and depend only on the aligned source word f_{a_i}
![Page 59: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/59.jpg)
Lexical Translation
▪ Putting our assumptions together, we have:
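The slide equation was an image and is not in the text. One common way to write the result of these assumptions is sketched below. The notation is assumed: `l_e` is the length of the English sentence, `t(e_i | f_{a_i})` is a lexical translation probability, and `p(a | f)` is an alignment distribution left unspecified at this point.

```latex
\begin{align*}
p(\mathbf{e} \mid \mathbf{f}, \mathbf{a}) &= \prod_{i=1}^{l_e} t(e_i \mid f_{a_i}) \\
p(\mathbf{e} \mid \mathbf{f}) &= \sum_{\mathbf{a}} p(\mathbf{a} \mid \mathbf{f}) \prod_{i=1}^{l_e} t(e_i \mid f_{a_i})
\end{align*}
```

The first line uses conditional independence given the alignment; the second marginalizes over the alignment because it is not observed.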
![Page 60: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/60.jpg)
The Alignment Function
▪ Alignments can be visualized by drawing links between two sentences, and they are represented as vectors of positions:
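To make the "vector of positions" representation concrete, here is a small sketch. The sentence pair is an illustrative example chosen to show reordering, not one taken from the slide image, and `aligned_pairs` is a hypothetical helper. Position 0 is reserved for a NULL source token by convention.

```python
NULL = "NULL"  # conventional placeholder occupying source position 0


def aligned_pairs(english, foreign, alignment):
    """Return (english_word, source_word) pairs for an alignment vector.

    alignment[j] is the source position that generated english[j];
    position 0 is NULL and positions 1..len(foreign) are the source words.
    """
    if len(alignment) != len(english):
        raise ValueError("need exactly one alignment entry per English word")
    source = [NULL] + list(foreign)
    return [(e, source[a]) for e, a in zip(english, alignment)]


# Illustrative sentence pair with reordering.
english = ["the", "house", "is", "small"]
foreign = ["klein", "ist", "das", "Haus"]
alignment = [3, 4, 2, 1]  # one source position per English word

pairs = aligned_pairs(english, foreign, alignment)
print(pairs)
```

Because the vector has one entry per English word, it can express reordering, dropped source words, and one-to-many links, but not several source words generating a single English word as a unit.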
![Page 61: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/61.jpg)
Reordering
▪ Words may be reordered during translation.
![Page 62: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/62.jpg)
Word Dropping
▪ A source word may not be translated at all
![Page 63: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/63.jpg)
Word Insertion
▪ Words may be inserted during translation
  ▪ The English word “just” does not have an equivalent in the source sentence
  ▪ But it must be explained: we typically assume every source sentence contains a NULL token
![Page 64: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/64.jpg)
One-to-many Translation
▪ A source word may translate into more than one target word
![Page 65: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/65.jpg)
Many-to-one Translation
▪ Several source words cannot translate as a single unit in lexical translation, since each target word is generated from exactly one source word
![Page 66: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/66.jpg)
IBM Model 1
▪ Generative model: break up translation process into smaller steps
▪ Simplest possible lexical translation model
▪ Additional assumptions
  ▪ All alignment decisions are independent
  ▪ The alignment distribution for each a_i is uniform over all source words and NULL
![Page 67: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/67.jpg)
IBM Model 1
▪ Translation probability
  ▪ for a foreign sentence f = (f_1, ..., f_{l_f}) of length l_f
  ▪ to an English sentence e = (e_1, ..., e_{l_e}) of length l_e
  ▪ with an alignment of each English word e_j to a foreign word f_i according to the alignment function a : j → i
  ▪ parameter ϵ is a normalization constant
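The formula itself was an image on the slide. In the standard statement of IBM Model 1, the joint probability is p(e, a | f) = ϵ / (l_f + 1)^{l_e} × ∏_j t(e_j | f_{a(j)}), where the `+ 1` accounts for the NULL token. The sketch below computes it for one sentence pair. The t-table values are hypothetical, and `model1_joint_prob` is an illustrative name.

```python
import math


def model1_joint_prob(english, foreign, alignment, t, epsilon=1.0):
    """p(e, a | f) under IBM Model 1.

    alignment[j] indexes into [NULL] + foreign, so 0 means "generated by NULL".
    t maps (english_word, source_word) pairs to lexical translation probabilities.
    """
    if len(alignment) != len(english):
        raise ValueError("need exactly one alignment entry per English word")
    source = ["NULL"] + list(foreign)
    # Uniform alignment term: each English word picks one of (l_f + 1) positions.
    uniform = epsilon / (len(source) ** len(english))
    lexical = math.prod(t.get((e, source[a]), 0.0) for e, a in zip(english, alignment))
    return uniform * lexical


# Hypothetical lexical translation probabilities, for illustration only.
t_table = {
    ("the", "das"): 0.7,
    ("house", "Haus"): 0.8,
    ("is", "ist"): 0.8,
    ("small", "klein"): 0.4,
}

prob = model1_joint_prob(
    english=["the", "house", "is", "small"],
    foreign=["das", "Haus", "ist", "klein"],
    alignment=[1, 2, 3, 4],
    t=t_table,
)
print(prob)
```

Unseen word pairs are given probability zero here, which is a simplification; a trained t-table would normally cover every co-occurring pair.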
![Page 68: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/68.jpg)
Example
![Page 69: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/69.jpg)
Learning Lexical Translation Models
We would like to estimate the lexical translation probabilities t(e|f) from a parallel corpus
▪ ... but we do not have the alignments
▪ Chicken and egg problem
  ▪ if we had the alignments, we could estimate the parameters of our generative model (MLE)
  ▪ if we had the parameters, we could estimate the alignments
![Page 70: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/70.jpg)
EM Algorithm
▪ Incomplete data
  ▪ if we had complete data, we could estimate the model
  ▪ if we had the model, we could fill in the gaps in the data
▪ Expectation Maximization (EM) in a nutshell
1. initialize model parameters (e.g. uniform, random)
2. assign probabilities to the missing data
3. estimate model parameters from completed data
4. iterate steps 2–3 until convergence
![Page 71: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/71.jpg)
EM Algorithm
▪ Initial step: all alignments equally likely
▪ Model learns that, e.g., la is often aligned with the
![Page 72: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/72.jpg)
EM Algorithm
▪ After one iteration
▪ Alignments, e.g., between la and the are more likely
![Page 73: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/73.jpg)
EM Algorithm
▪ After another iteration
▪ It becomes apparent that alignments, e.g., between fleur and flower are more likely (pigeon hole principle)
![Page 74: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/74.jpg)
EM Algorithm
▪ Convergence
▪ Inherent hidden structure revealed by EM
![Page 75: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/75.jpg)
EM Algorithm
▪ Parameter estimation from the aligned corpus
![Page 76: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/76.jpg)
IBM Model 1 and EM
EM Algorithm consists of two steps
▪ Expectation-Step: Apply model to the data
  ▪ parts of the model are hidden (here: alignments)
  ▪ using the model, assign probabilities to possible values
▪ Maximization-Step: Estimate model from data
  ▪ take assigned values as fact
  ▪ collect counts (weighted by lexical translation probabilities)
  ▪ estimate model from counts
▪ Iterate these steps until convergence
![Page 77: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/77.jpg)
IBM Model 1 and EM
▪ We need to be able to compute:
  ▪ Expectation-Step: probability of alignments
  ▪ Maximization-Step: count collection
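The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's reference implementation: the toy corpus is invented, `train_ibm_model1` is a hypothetical name, a NULL token is prepended to every source sentence, and a fixed iteration count stands in for a convergence test.

```python
from collections import defaultdict


def train_ibm_model1(corpus, iterations=10):
    """EM training of t(e | f) for IBM Model 1.

    corpus: list of (english_tokens, foreign_tokens) pairs.
    Returns a dict-like t keyed by (english_word, source_word).
    """
    corpus = [(e, ["NULL"] + f) for e, f in corpus]
    english_vocab = {w for e_sent, _ in corpus for w in e_sent}
    # Initialization: every English word is equally likely for every source word.
    t = defaultdict(lambda: 1.0 / len(english_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) links
        total = defaultdict(float)  # expected count of links leaving f
        for e_sent, f_sent in corpus:
            for e in e_sent:
                # E-step: posterior over which source word generated e.
                z = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    posterior = t[(e, f)] / z
                    count[(e, f)] += posterior
                    total[f] += posterior
        # M-step: re-estimate t(e | f) from the fractional counts.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t


# Invented toy corpus for illustration.
toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]

t_table = train_ibm_model1(toy_corpus, iterations=10)
print(round(t_table[("the", "das")], 3), round(t_table[("house", "das")], 3))
```

The per-word normalizer `z` is what keeps the E-step cheap: the posterior for each English word can be computed independently instead of enumerating whole alignments.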
![Page 78: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/78.jpg)
IBM Model 1 and EM
t-table
![Page 79: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/79.jpg)
IBM Model 1 and EM
t-table
![Page 80: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/80.jpg)
IBM Model 1 and EM
t-table
![Page 81: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/81.jpg)
IBM Model 1 and EM
Applying the chain rule:
t-table
![Page 82: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/82.jpg)
IBM Model 1 and EM: Expectation Step
![Page 83: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/83.jpg)
IBM Model 1 and EM: Expectation Step
![Page 84: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/84.jpg)
The Trick
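The trick usually presented at this point (assumed here, since the slide itself is an image) is that the sum over all alignments of a product of t-table entries factorizes into a product of per-word sums, which turns an exponential computation into one that is linear in each sentence length. A minimal numerical check, using a made-up t-table whose values are purely illustrative:

```python
from itertools import product
from math import prod, isclose


def sum_over_alignments(e_sent, f_sent, t):
    """Brute force: enumerate every alignment function a: j -> i.

    There are len(f_sent) ** len(e_sent) of them, so this is exponential
    in the English sentence length.
    """
    return sum(
        prod(t[(e, f_sent[i])] for e, i in zip(e_sent, a))
        for a in product(range(len(f_sent)), repeat=len(e_sent))
    )


def product_of_sums(e_sent, f_sent, t):
    """The trick: one sum per English word, then a single product."""
    return prod(sum(t[(e, f)] for f in f_sent) for e in e_sent)


# Hypothetical t(e|f) values; position 0 of f_sent plays the role of NULL.
f_sent = ["NULL", "la", "maison"]
e_sent = ["the", "house"]
t = {
    ("the", "NULL"): 0.2, ("the", "la"): 0.7, ("the", "maison"): 0.1,
    ("house", "NULL"): 0.1, ("house", "la"): 0.1, ("house", "maison"): 0.8,
}

assert isclose(sum_over_alignments(e_sent, f_sent, t),
               product_of_sums(e_sent, f_sent, t))
```

Both functions omit the constant factor $\epsilon / (l_f + 1)^{l_e}$, which is the same on both sides and cancels in the E-step posterior.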
![Page 85: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/85.jpg)
IBM Model 1 and EM: Expectation Step
![Page 86: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/86.jpg)
IBM Model 1 and EM: Expectation Step
E-step
t-table
![Page 87: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/87.jpg)
IBM Model 1 and EM: Maximization Step
![Page 88: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/88.jpg)
IBM Model 1 and EM: Maximization Step
E-step
M-step
t-table
![Page 89: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/89.jpg)
IBM Model 1 and EM: Maximization Step
![Page 90: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/90.jpg)
IBM Model 1 and EM: Maximization Step
Update t-table:
p(the|la) = c(the|la)/c(la)
E-step
M-step
t-table
![Page 91: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/91.jpg)
IBM Model 1 and EM: Pseudocode
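The pseudocode on this slide is an image and is not reproduced in the text. As a stand-in, here is a minimal Python sketch of EM training for IBM Model 1 under the usual assumptions (uniform initialization, a NULL token prepended to each foreign sentence, a fixed number of iterations instead of a convergence test). The function name `train_ibm_model1` and the toy corpus are illustrative and not taken from the lecture.

```python
from collections import defaultdict


def train_ibm_model1(corpus, iterations=10):
    """Estimate t(e|f) from sentence pairs with EM.

    corpus: list of (english_tokens, foreign_tokens) pairs.
    Returns a dict-like table mapping (e, f) -> t(e|f).
    """
    # Assumption: every foreign sentence gets an extra NULL word.
    corpus = [(e_sent, ["NULL"] + f_sent) for e_sent, f_sent in corpus]
    e_vocab = {e for e_sent, _ in corpus for e in e_sent}

    # Uniform initialization of the t-table.
    uniform = 1.0 / len(e_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(e|f)
        total = defaultdict(float)  # expected counts c(f)

        # E-step: distribute each English word over the foreign words
        # of its sentence, in proportion to the current t(e|f).
        for e_sent, f_sent in corpus:
            for e in e_sent:
                z = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    posterior = t[(e, f)] / z
                    count[(e, f)] += posterior
                    total[f] += posterior

        # M-step: re-estimate t(e|f) = c(e|f) / c(f).
        t = defaultdict(float, {(e, f): c / total[f]
                                for (e, f), c in count.items()})
    return t


# Toy usage: three sentence pairs (English, foreign).
toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t_table = train_ibm_model1(toy_corpus, iterations=10)
```

After training, `t_table[("the", "das")]` should dominate the other entries for "das", and for any foreign word the entries sum to one over the English vocabulary.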
![Page 92: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/92.jpg)
Convergence
![Page 93: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/93.jpg)
IBM Model 1
▪ Generative model: break up translation process into smaller steps
▪ Simplest possible lexical translation model
▪ Additional assumptions
  ▪ All alignment decisions are independent
  ▪ The alignment distribution for each $a_i$ is uniform over all source words and NULL
![Page 94: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/94.jpg)
IBM Model 1
▪ Translation probability
  ▪ for a foreign sentence $f = (f_1, \ldots, f_{l_f})$ of length $l_f$
  ▪ to an English sentence $e = (e_1, \ldots, e_{l_e})$ of length $l_e$
  ▪ with an alignment of each English word $e_j$ to a foreign word $f_i$ according to the alignment function $a : j \to i$
  ▪ parameter $\epsilon$ is a normalization constant
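The formula these bullets annotate is an image on the slide and is missing from the text. In the standard formulation of IBM Model 1, which is assumed here, it reads as follows, with $t(e_j \mid f_{a(j)})$ the lexical translation probability and $a(j) = 0$ denoting alignment to NULL:

```latex
p(e, a \mid f) = \frac{\epsilon}{(l_f + 1)^{l_e}}
                 \prod_{j=1}^{l_e} t\bigl(e_j \mid f_{a(j)}\bigr)
```

The factor $(l_f + 1)^{l_e}$ is the number of possible alignment functions: each of the $l_e$ English words picks one of the $l_f$ foreign words or NULL, uniformly.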
![Page 95: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/95.jpg)
Example
![Page 96: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/96.jpg)
Evaluating Alignment Models
▪ How do we measure quality of a word-to-word model?
▪ Method 1: use in an end-to-end translation system
  ▪ Hard to measure translation quality
  ▪ Option: human judges
  ▪ Option: reference translations (NIST, BLEU)
  ▪ Option: combinations (HTER)
  ▪ Actually, no one uses word-to-word models alone as TMs
▪ Method 2: measure quality of the alignments produced
  ▪ Easy to measure
  ▪ Hard to know what the gold alignments should be
  ▪ Often does not correlate well with translation quality (like perplexity in LMs)
![Page 97: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/97.jpg)
Alignment Error Rate
![Page 98: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/98.jpg)
Alignment Error Rate
![Page 99: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/99.jpg)
Alignment Error Rate
![Page 100: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/100.jpg)
Alignment Error Rate
![Page 101: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/101.jpg)
Alignment Error Rate
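The definition on these slides is an image, so as a reference point here is a small sketch of Alignment Error Rate in its commonly used form, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the gold "sure" links, and P the gold "possible" links (with S treated as a subset of P). The function name and the link representation as `(english_index, foreign_index)` pairs are assumptions for illustration.

```python
def alignment_error_rate(predicted, sure, possible):
    """Alignment Error Rate; lower is better, 0.0 is a perfect score.

    predicted: iterable of predicted links A
    sure:      iterable of gold sure links S
    possible:  iterable of gold possible links P (S is added to P here)
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # every sure link also counts as possible
    if not a and not s:
        return 0.0  # nothing to predict and nothing predicted
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy usage: two sure links recovered, one predicted link matches nothing.
gold_sure = {(0, 0), (1, 1)}
gold_possible = {(2, 2)}
system = {(0, 0), (1, 1), (2, 3)}
aer = alignment_error_rate(system, gold_sure, gold_possible)
```

In this toy case |A ∩ S| = 2, |A ∩ P| = 2, |A| = 3 and |S| = 2, so the score is 1 - 4/5.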
![Page 102: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp19/slides/FA19_IITP_lecture... · 2019. 11. 19. · Phrase-Based MT e f source phrase target phrase translation features Translation](https://reader035.vdocument.in/reader035/viewer/2022081518/613d6691e1ef621e9f2db320/html5/thumbnails/102.jpg)
Problems with Lexical Translation
▪ Complexity -- exponential in sentence length
▪ Weak reordering -- the output is not fluent
▪ Many local decisions -- error propagation