translation - cs.utexas.edu


CS 378 Lecture 17

Machine Translation

Today
- Phrase-based MT
- Word alignment

Announcements
- Midterm
- FP
- A3

Recap: LSTMs involve gated connections, e.g.
c_t = f ⊙ c_{t-1} + i ⊙ g

Machine translation
Today: phrase-based MT (pre-2015)
Next time: neural MT with seq2seq models (post-2015), following on how we used RNNs for LMs

Input: s̄, a source sentence
Output: t̄, a target-language sentence
Data: bitext, a set of (s̄, t̄) pairs

[Figure: Vauquois triangle (Bernard Vauquois, 1968). Translating a source sentence s̄ (e.g., French) to a target sentence t̄ (e.g., English): word-level transfer at the bottom, syntax above it, interlingua at the top. We don't know how to build a true interlingua; there is a "sweet spot" in between.]

Phrase-based MT

Bitext → word alignment → phrase table + LM → decoder

Example: "Je fais un bureau" → "I make a desk", built up phrase by phrase.

Decoder: searches over the space of phrase-by-phrase translations to find one that scores best (including an LM score; the LM helps figure out whether a candidate is grammatical English).

Word alignment (focus of today)

Input: bitext, (s̄, t̄) pairs
Output: one-to-many alignments from t̄ to s̄

s̄ = Je vais le faire NULL   (NULL is a placeholder)
t̄ = I am going to do it     (e.g., a_2 = 2: "am" aligns to "vais")

Each word in t̄ aligns to one word in s̄.
Define a vector ā: a_i = the index in s̄ that word t_i aligns to.
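The alignment vector can be represented directly as a list of indices; a minimal sketch (the specific alignment values below are illustrative guesses, not from the notes):

```python
# Toy alignment for s = "Je vais le faire" (+ NULL), t = "I am going to do it".
# a[i] = index in s that target word t[i] aligns to (0-indexed here).
s = ["Je", "vais", "le", "faire", "NULL"]
t = ["I", "am", "going", "to", "do", "it"]
a = [0, 1, 1, 4, 3, 2]  # illustrative values only

# Each target word aligns to exactly one source word.
aligned_pairs = [(t_i, s[a_i]) for t_i, a_i in zip(t, a)]
print(aligned_pairs)
```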

Alignment models: place a distribution P(t̄, ā | s̄), a generative model of t̄, ā.

Analogy to an HMM: t̄ ≈ the words, ā ≈ the tags.

IBM Model 1 (1993)

ā = (a_1, ..., a_n), t̄ = (t_1, ..., t_n)   (n target words)
s̄ = (s_1, ..., s_m, NULL)                  (m source words)

Model: P(t̄, ā | s̄) = ∏_{i=1}^n P(a_i) P(t_i | s_{a_i})

Generative process: for each target word i, pick a source index a_i.
P(a_i) = 1/(m+1): uniform over the m source words plus NULL.
Then generate t_i conditioned on s_{a_i}, the a_i-th source word.
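Model 1's joint and marginal probabilities are cheap to compute; a minimal sketch of the factorization P(t̄, ā | s̄) = ∏_i P(a_i) P(t_i | s_{a_i}) (the function names and toy probabilities are mine, loosely based on the notes' example):

```python
from math import prod

def joint_prob(t, a, s, tp):
    """P(t, a | s) = prod_i P(a_i) * P(t_i | s_{a_i}), with P(a_i) uniform = 1/len(s).
    s is assumed to already include the NULL token."""
    return prod((1.0 / len(s)) * tp[s[a_i]].get(t_i, 0.0) for t_i, a_i in zip(t, a))

def marginal_prob(t, s, tp):
    """P(t | s): in Model 1 the sum over alignments factorizes per target word."""
    return prod(sum((1.0 / len(s)) * tp[s_j].get(t_i, 0.0) for s_j in s) for t_i in t)

# Toy translation dictionary (rows of theta).
tp = {
    "J'":   {"I": 0.8, "like": 0.1, "eat": 0.1},
    "aime": {"I": 0.0, "like": 1.0, "eat": 0.0},
    "NULL": {"I": 0.4, "like": 0.3, "eat": 0.3},
}
print(joint_prob(["I", "like"], [0, 1], ["J'", "aime", "NULL"], tp))  # (1/3*0.8)*(1/3*1.0)
```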

Model params: a translation dictionary θ, e.g., P(target word | Je). These are like emissions in an HMM!

Each a_i is like a switch: it tells you which row of θ to use.

Je fais
a_1 = 1: P(t_1 | Je)
a_1 = 2: P(t_1 | fais)

T = {I, like, eat}    S = {Je, J', mange, aime, NULL}

P(t | s):      I     like   eat
  Je / J'     0.8    0.1    0.1
  mange        0      0     2/3
  aime         0     1.0     0
  NULL        0.4    0.3    0.3

Example: s̄ = Je NULL, t̄ = I. What is P(ā | t̄, s̄)?

Proportional to:
P(I | Je)   = 0.8
P(I | NULL) = 0.4

Posterior P(ā | s̄, t̄): 0.8 ⇒ 2/3, 0.4 ⇒ 1/3

HMM analogy: P(ȳ | x̄) ∝ P(x̄, ȳ), the posterior under the model (tagging).

P(t̄ | s̄) = Σ_ā P(t̄, ā | s̄)

argmax_ā P(ā | t̄, s̄) = argmax_ā P(t̄, ā | s̄) / P(t̄ | s̄)    (denominator constant w.r.t. ā)
                      = argmax_ā P(t̄, ā | s̄)
                      = argmax_ā ∏_i P(t_i | s_{a_i})        (each P(a_i) is constant)

So P(a_i | t̄, s̄) is proportional to P(t_i | s_{a_i}).

Positions:  1 = J'   2 = aime   3 = NULL        t̄ = I like

a_1 = 1: P(I | J')   = 0.8  ⇒ 2/3
a_1 = 2: P(I | aime) = 0    ⇒ 0
a_1 = 3: P(I | NULL) = 0.4  ⇒ 1/3

P(a_1 | s̄, t̄): argmax a_1 = 1
Similarly, argmax a_2 = 2 (aime).
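The posterior computation above can be checked mechanically; a minimal sketch using the same toy table (the function name is mine):

```python
def alignment_posterior(t_i, s, tp):
    """P(a_i = j | t, s) is proportional to P(t_i | s_j); the uniform P(a_i) cancels."""
    scores = [tp[s_j].get(t_i, 0.0) for s_j in s]
    z = sum(scores)
    return [sc / z for sc in scores]

tp = {
    "J'":   {"I": 0.8, "like": 0.1, "eat": 0.1},
    "aime": {"I": 0.0, "like": 1.0, "eat": 0.0},
    "NULL": {"I": 0.4, "like": 0.3, "eat": 0.3},
}
s = ["J'", "aime", "NULL"]
post = alignment_posterior("I", s, tp)
print(post)  # [2/3, 0, 1/3], so argmax a_1 = position 1 (J')
```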

Bitext (e.g., J'aime / I like) ⇒ alignments + phrases

Learning: Hard!

Unsupervised: no labeled examples of ā.

Expectation Maximization: maximizes the marginal log-likelihood over the data (s̄^(i), t̄^(i)), i = 1, ..., n:

Σ_{i=1}^n log Σ_ā P(ā, t̄^(i) | s̄^(i)) = Σ_{i=1}^n log P(t̄^(i) | s̄^(i))
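One EM iteration for Model 1 can be sketched as follows (function and variable names are mine; the E-step's fractional counts use exactly the posterior P(a_i = j | t̄, s̄) ∝ P(t_i | s_j) derived earlier):

```python
from collections import defaultdict

def em_iteration(bitext, tp):
    """One EM step for IBM Model 1.
    bitext: list of (s, t) sentence pairs, each s already including NULL.
    tp: tp[s_word][t_word] = current P(t | s)."""
    counts = defaultdict(lambda: defaultdict(float))
    for s, t in bitext:
        for t_i in t:
            z = sum(tp[s_j].get(t_i, 0.0) for s_j in s)
            for s_j in s:
                # E-step: fractional count = posterior P(a_i = j | t, s)
                counts[s_j][t_i] += tp[s_j].get(t_i, 0.0) / z
    # M-step: renormalize the expected counts into probabilities
    return {s_w: {t_w: v / sum(c.values()) for t_w, v in c.items()}
            for s_w, c in counts.items()}

# Tiny run from a uniform initialization on a two-pair toy bitext.
init = {s_w: {"I": 0.5, "like": 0.5} for s_w in ["NULL", "Je", "aime"]}
bitext = [(["NULL", "Je"], ["I"]),
          (["NULL", "Je", "aime"], ["I", "like"])]
tp1 = em_iteration(bitext, init)
print(tp1["Je"])
```

After one step, P(I | Je) rises above the uniform 0.5 because "Je" co-occurs with "I" in every pair.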

Phrase-based MT: (s̄, t̄) pairs ⇒ learn an aligner ⇒ align our data
⇒ phrase extraction: aligned sentences ⇒ phrase translation options

Phrase table: huge! (entries with a ||| separator and a score from the alignments)

Je fais ||| I make        0.9
Je fais ||| I am making   0.6
J'aime  ||| I like        0.7

+ LM → decoder
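How the decoder combines a phrase-table score with an LM score can be sketched as follows (the table entries come from the notes; the length-penalty LM is a crude stand-in, and the function names are mine):

```python
import math

phrase_table = {
    "Je fais": [("I make", 0.9), ("I am making", 0.6)],
    "J'aime":  [("I like", 0.7)],
}

def lm_logprob(text):
    # Stand-in LM that just penalizes length; a real decoder
    # would score the candidate with an n-gram or neural LM.
    return -0.5 * len(text.split())

def candidate_score(candidate, phrase_score):
    # Simplified decoder objective: log phrase score + LM score.
    return math.log(phrase_score) + lm_logprob(candidate)

best, _ = max(phrase_table["Je fais"],
              key=lambda pair: candidate_score(*pair))
print(best)
```

Here the higher phrase score and the shorter output both favor "I make" over "I am making".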

Takeaways: the idea of alignment. Learn a model that gives a distribution over source words given a target word.