Machine Translation: A Presentation by Julie Conlonova, Rob Chase, and Eric Pomerleau
Post on 22-Dec-2015
![Page 1: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/1.jpg)
Machine Translation
A Presentation by:
Julie Conlonova,
Rob Chase,
and Eric Pomerleau
![Page 2: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/2.jpg)
Overview
Language Alignment System
Datasets
Sentence-aligned sets for training (e.g. the Hansards Corpus, the European Parliament Proceedings Parallel Corpus)
A word-aligned set for testing and evaluation to measure accuracy and precision
Decoding
![Page 3: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/3.jpg)
Language Alignment
Goal: Produce a word-aligned set from a sentence-aligned dataset
First step on the road toward Statistical Machine Translation
Example Problem:
The motion to adjourn the House is now deemed to have been adopted.
La motion portant que la Chambre s'ajourne maintenant est réputée adoptée.
![Page 4: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/4.jpg)
IBM Models 1 and 2
Kevin Knight, A Statistical MT Tutorial Workbook, 1999
Each model can be used on its own to produce a word-aligned dataset.
EM Algorithm
Model 1 produces T-values (translation probabilities) based on normalized fractional counting of corresponding words.
Additionally, Model 2 uses A-values for “reverse distortion probabilities”: probabilities based on the positions of the words.
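For reference, the two models can be written out as below. This is a sketch in the notation of Knight's workbook, assuming an English sentence e of length l, a French sentence f of length m, and position 0 reserved for the NULL word. The constant ε is the usual normalization term from the literature and does not appear on the slide.

```latex
P_{\text{Model 1}}(f \mid e) = \frac{\epsilon}{(l+1)^{m}} \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)

P_{\text{Model 2}}(f \mid e) = \epsilon \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)\, a(i \mid j, l, m)
```

The only difference is the factor a(i | j, l, m): Model 1 treats every source position as equally likely, while Model 2 learns a preference from the positions of the words.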
![Page 5: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/5.jpg)
Training Data
European Parliament Proceedings Parallel Corpus 1996-2003
Aligned Languages:
English - French
English - Dutch
English - Italian
English - Finnish
English - Portuguese
English - Spanish
English - Greek
![Page 6: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/6.jpg)
Training Data cont.
Eliminated:
Misaligned sentences
Sentences with 50 or more words
XML tags
Symbols and numerical characters other than commas and periods
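The filtering steps above could look roughly like the following. This is a minimal sketch and not the presenters' code: the names `clean_sentence` and `clean_corpus` are hypothetical, apostrophes are kept on the assumption that contractions such as "s'ajourne" should survive, and a pair is treated as misaligned only when one side is empty after cleaning.

```python
import re

MAX_WORDS = 50  # slide: sentences with 50 or more words were eliminated
XML_TAG = re.compile(r"<[^>]+>")
# Assumption: drop everything that is not a letter, whitespace, apostrophe,
# comma or period; digits and underscores are dropped explicitly.
UNWANTED = re.compile(r"[^\w\s',.]|[\d_]")


def clean_sentence(sentence):
    """Strip XML tags, digits and symbols; keep commas and periods."""
    sentence = XML_TAG.sub(" ", sentence)
    sentence = UNWANTED.sub(" ", sentence)
    return " ".join(sentence.split())


def clean_corpus(pairs):
    """Yield cleaned (source, target) pairs, skipping empty or long ones."""
    for src, tgt in pairs:
        src, tgt = clean_sentence(src), clean_sentence(tgt)
        if not src or not tgt:
            continue  # assumption: an empty side means the pair is misaligned
        if len(src.split()) >= MAX_WORDS or len(tgt.split()) >= MAX_WORDS:
            continue
        yield src, tgt
```

Real misalignment detection would need more than an emptiness check, for example a length-ratio test between the two sides.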
![Page 7: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/7.jpg)
Ideally…
http://www.cs.berkeley.edu/~klein/cs294-5
![Page 8: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/8.jpg)
Bypassing Interlingua: Models I-III
Variables contributing to the probability of a sentence:
Correlation between words in the source/target languages
Fertility of a word
Correlation between order of words in source sentence and order of words in target
![Page 9: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/9.jpg)
A Translation Matrix
| | Rob | Cat | is | Dog |
|---|---|---|---|---|
| Rob | 1 | 0 | 0 | 0 |
| Gato | 0 | 1 | 0 | 0 |
| es | 0 | 0 | .5 | 0 |
| esta | 0 | 0 | .5 | 0 |
| Perro | 0 | 0 | 0 | 1 |
![Page 10: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/10.jpg)
Building the Translation Matrix: Starting from alignments
Find the sentence alignment
If a word in the source aligns with a word in the target, increment the corresponding entry of the translation matrix
Normalize the translation matrix
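When word alignments are already known, the count-and-normalize procedure can be sketched as follows. The function name and the input format (word lists plus a list of index pairs) are assumptions made for illustration. Each source word's entries are normalized to sum to 1, matching the "is" column of the example matrix.

```python
from collections import defaultdict


def build_translation_matrix(aligned_corpus):
    """Build matrix[source_word][target_word] from word-aligned sentences.

    aligned_corpus: iterable of (source_words, target_words, links), where
    links is a list of (source_index, target_index) pairs.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for source, target, links in aligned_corpus:
        for i, j in links:
            counts[source[i]][target[j]] += 1.0  # increment on each aligned pair

    # Normalize so that each source word's entries sum to 1.
    matrix = {}
    for s_word, row in counts.items():
        total = sum(row.values())
        matrix[s_word] = {t_word: c / total for t_word, c in row.items()}
    return matrix
```

With "is" aligned once to "es" and once to "esta", the result assigns each of them probability 0.5.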
![Page 11: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/11.jpg)
Can’t find alignments
Most sentences in the Hansards corpus are 60 words long, and many run to over 100 words.
100^100 possible alignments
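The size of the search space follows from a simple count, assuming each of the m target words may link to any one of the l source words (the NULL word is ignored here):

```latex
\#\text{alignments} = l^{m}, \qquad l = m = 100 \;\Rightarrow\; 100^{100} = 10^{200}
```

Enumerating that many alignments is out of the question, which is why the counts have to be estimated another way.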
![Page 12: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/12.jpg)
Counting
Rob is a boy. Rob es nino.
Rob is tall. Rob es alto.
Eric is tall. Eric es alto.
… …
Base counts on co-occurrence, weighting based on sentence length.
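A minimal sketch of co-occurrence counting on the three sentence pairs above. The slide does not say exactly how sentence length enters the weighting; the assumption here is that each English word spreads one unit of count evenly over the words of the paired sentence. The function name is hypothetical.

```python
from collections import defaultdict


def cooccurrence_counts(pairs):
    """Return counts[english_word][spanish_word] from tokenized sentence pairs."""
    counts = defaultdict(lambda: defaultdict(float))
    for e_words, s_words in pairs:
        # Assumed weighting: longer sentences give weaker evidence per pair.
        weight = 1.0 / len(s_words)
        for e in e_words:
            for s in s_words:
                counts[e][s] += weight
    return counts


pairs = [
    (["Rob", "is", "a", "boy"], ["Rob", "es", "nino"]),
    (["Rob", "is", "tall"], ["Rob", "es", "alto"]),
    (["Eric", "is", "tall"], ["Eric", "es", "alto"]),
]
counts = cooccurrence_counts(pairs)
```

After this pass "tall" is tied between "es" and "alto", since both appear in every sentence that contains "tall". Raw co-occurrence alone cannot separate them.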
![Page 13: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/13.jpg)
Iterative Convergence
Use the Expectation Maximization (EM) algorithm
Creates translation matrix

| | Rob | is | tall | boy |
|---|---|---|---|---|
| Rob | .66 | .33 | .25 | .25 |
| es | .30 | .66 | .25 | .25 |
| alto | .2 | .05 | .5 | 0 |
| nino | .2 | .05 | 0 | .5 |
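The iteration can be sketched as a bare-bones IBM Model 1 trainer. This follows the fractional-counting scheme described in Knight's workbook but leaves out the NULL word, and the name `train_model1` and the dictionary layout `t[f][e]` (row = foreign word, column = English word, as in the matrix above) are assumptions.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """pairs: list of (english_words, foreign_words). Returns t[f][e] ~ t(f | e)."""
    f_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(f_vocab)
    # Start from a uniform table: every translation is equally likely.
    t = defaultdict(lambda: defaultdict(lambda: uniform))

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: split each foreign word's count among the English words
        # in the same sentence, in proportion to the current t-values.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[f][e] for e in es)
                for e in es:
                    frac = t[f][e] / norm
                    count[f][e] += frac
                    total[e] += frac
        # M-step: renormalize so that each English word's column sums to 1.
        new_t = defaultdict(lambda: defaultdict(float))
        for f, row in count.items():
            for e, c in row.items():
                new_t[f][e] = c / total[e]
        t = new_t
    return t
```

On the three toy sentences from the Counting slide, the tie between "es" and "alto" for "tall" is broken after the second iteration, because "es" is increasingly explained by "is".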
![Page 14: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/14.jpg)
Distorting the Sentence
Word order changes between languages
How is a sentence with 2 words distorted?
How is a sentence with 3 words distorted?
How is a sentence with …
To keep track of this information we use…
![Page 15: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/15.jpg)
A tesseract!
(A quadruply nested default dictionary)
This could be a problem if there are more than 100 words in a sentence.
100 x 100 x 100 x 100 = 10^8 entries: too big for RAM and too slow to fill
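A quadruply nested default dictionary can be built in a few lines. This is a sketch under assumptions: the helper name `nested_dict` is made up, and the key order `a[l][m][j][i]` (source length, target length, target position, source position) is one plausible layout for the A-values a(i | j, l, m), not necessarily the one the presenters used.

```python
from collections import defaultdict


def nested_dict(depth, leaf=float):
    """Return a defaultdict nested `depth` levels deep with `leaf` values."""
    if depth == 1:
        return defaultdict(leaf)
    return defaultdict(lambda: nested_dict(depth - 1, leaf))


# a[l][m][j][i]: fractional count for source position i given target
# position j in a sentence pair of lengths l and m.
a = nested_dict(4)
a[3][4][1][2] += 0.5
```

Only the entries that are actually touched get stored, so memory grows with the number of observed (l, m, j, i) combinations. The worst case is still the full 100 x 100 x 100 x 100 table.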
![Page 16: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/16.jpg)
Broad Look at MT
“The translation process can be described simply as:
1. Decoding the meaning of the source text, and
2. Re-encoding this meaning in the target language.”
- “Translation Process”, Wikipedia, May 2006
![Page 17: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/17.jpg)
Decoding
How to go from the T-matrix and A-matrix to a word alignment?
There are several approaches…
![Page 18: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/18.jpg)
Viterbi
When only aligning, memory and time requirements are much smaller.
Returns optimal path.
T-Matrix probabilities function as the “emission” matrix
A-Matrix probabilities account for the positioning of words
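For alignment only, the most probable path is cheap to find. In Models 1 and 2 each target word picks its source position independently, so the best alignment can be computed one word at a time. The sketch below assumes `t` is a dict of dicts `t[f][e]` and `a` is an optional callable `a(i, j, l, m)`; both interfaces are hypothetical and the presenters' implementation may differ.

```python
def viterbi_align(e_words, f_words, t, a=None):
    """For each foreign position j, return the English index i that
    maximizes t(f_j | e_i) * a(i | j, l, m)."""
    l, m = len(e_words), len(f_words)
    alignment = []
    for j, f in enumerate(f_words):
        def score(i):
            p = t.get(f, {}).get(e_words[i], 0.0)  # translation probability
            if a is not None:
                p *= a(i, j, l, m)  # positional probability (Model 2 only)
            return p
        alignment.append(max(range(l), key=score))
    return alignment
```

Passing `a=None` gives the Model 1 alignment. Supplying the A-values lets word position override a weak translation score.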
![Page 19: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/19.jpg)
Decoding as a Translator
When no translated sentence is supplied, the program can act as a stand-alone translator instead of a word aligner.
However, while the Viterbi algorithm with pruning runs quickly when decoding an alignment, its run time skyrockets when translating.
![Page 20: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/20.jpg)
Greedy Hill Climbing
Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Best first search
2-step look ahead to avoid getting stuck in most probable local maxima
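The look-ahead idea can be illustrated with a generic hill climber. This is a sketch only, not the decoder from the tutorial: the state representation, the `neighbors` function and the `score` function are all left to the caller and are hypothetical.

```python
def hill_climb(start, neighbors, score, lookahead=2):
    """Greedy ascent. When no single move improves the score, search paths
    of up to `lookahead` moves before giving up."""
    current = start
    while True:
        frontier, improved = [current], False
        for _ in range(lookahead):
            # Expand every state on the frontier by one more move.
            frontier = [n for state in frontier for n in neighbors(state)]
            if not frontier:
                break
            candidate = max(frontier, key=score)
            if score(candidate) > score(current):
                current, improved = candidate, True
                break  # take the shallowest improvement found
        if not improved:
            return current
```

With `lookahead=1` this is plain greedy hill climbing. The second step lets the search cross a one-move dip, but a wider valley still traps it.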
![Page 21: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/21.jpg)
Beam Search
Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Optimization of Best First Search with heuristics and “beam” of choices
Exponential tradeoff when increasing the “beam” width
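A heavily simplified beam search, to show the mechanics. It translates word by word in source order with no reordering and no fertility, which is much less than a real decoder does. The `candidates` table and the `bigram_logprob` language model are hypothetical inputs.

```python
import math


def beam_search(source, candidates, bigram_logprob, beam_width=3):
    """candidates: dict mapping a source word to a list of (target_word, prob).
    bigram_logprob(prev, word): log-probability from a language model.
    Returns (best_translation, log_score)."""
    beam = [(0.0, ["<s>"])]
    for word in source:
        expanded = []
        for logp, hyp in beam:
            for target, prob in candidates[word]:
                score = logp + math.log(prob) + bigram_logprob(hyp[-1], target)
                expanded.append((score, hyp + [target]))
        # Keep only the `beam_width` best partial translations.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beam = expanded[:beam_width]
    best_score, best_hyp = beam[0]
    return best_hyp[1:], best_score
```

A wider beam keeps more partial hypotheses alive at each step, trading run time for a lower chance of pruning the best translation early.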
![Page 22: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/22.jpg)
Other Decoding Methods
Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Finite State Transducer
Mapping between languages based on a finite automaton
Parsing
String to Tree Model
![Page 23: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/23.jpg)
Problem: One to Many
It is necessary to take all alignments above a certain probability in order to capture the “probability that e has fertility at least a given value”
Al-Onaizan, Curin, Jahr, et al., Statistical Machine Translation, 1999
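One way to allow one-to-many links is to keep every link whose translation probability clears a cutoff, instead of only the single best one. The sketch below assumes a table `t[f][e]`. The function name and the threshold value of 0.2 are arbitrary choices for illustration and do not come from the slides.

```python
def links_above_threshold(e_words, f_words, t, threshold=0.2):
    """Return every (i, j) link whose t(f_j | e_i) reaches the threshold."""
    links = []
    for i, e in enumerate(e_words):
        for j, f in enumerate(f_words):
            if t.get(f, {}).get(e, 0.0) >= threshold:
                links.append((i, j))
    return links
```

For example, if "not" has t-values of 0.45 for "ne" and 0.5 for "pas", both links are kept and "not" ends up with fertility 2.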
![Page 24: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/24.jpg)
Results
Study done in 2003 on word alignment error rates in the Hansards corpus:
Model 2
29.3% on 8K training sentence pairs
19.5% on 1.47M training sentence pairs
Optimized Model 6
20.3% on 8K training sentence pairs
8.7% on 1.47M training sentence pairs
Och and Ney, A Systematic Comparison of Various Statistical Alignment Models, 2003
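The error rates quoted above are alignment error rates (AER) as defined by Och and Ney: AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), where A is the predicted link set, S the "sure" gold links and P the "possible" gold links, with S a subset of P. A small sketch follows, with a hypothetical function name and links represented as (i, j) tuples:

```python
def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), with S a subset of P."""
    a, s = set(predicted), set(sure)
    p = set(possible) | s  # sure links always count as possible
    if not a and not s:
        return 0.0  # nothing to align and nothing predicted
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

A prediction that finds both sure links plus one link outside the gold standard scores 1 - 4/5 = 0.2.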
![Page 25: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/25.jpg)
Expected Accuracy
70% overall
Language performance:
Dutch
French
Italian, Spanish, Portuguese
Greek
Finnish
![Page 26: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/26.jpg)
Possible Future Work
Given more time, we would’ve implemented IBM Model 3
Additionally uses n, p, and d parameters for weighted alignments:
n, fertility: the number of words produced by one word
d, distortion
p, a parameter for words that aren’t directly produced by a source word (spurious words)
Invokes Model 2 for scoring
![Page 27: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/27.jpg)
Another Possible Translation Scheme
Example-Based Machine Translation
Translation-by-Analogy
Can sometimes achieve better than the “gist” translations from other models
![Page 28: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/28.jpg)
Why Is Improving Machine Translation Necessary?
![Page 29: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/29.jpg)
A Chinese to English Translation
![Page 30: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/30.jpg)
The End
Are there any questions/comments?
![Page 31: Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau](https://reader030.vdocument.in/reader030/viewer/2022032523/56649d815503460f94a66e70/html5/thumbnails/31.jpg)