MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES -Nirdesh Chauhan

Upload: austin-taylor

Post on 23-Dec-2015

MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES

-Nirdesh Chauhan

Outline

Problem statement in SMT

Translation models

Using Giza++ and Moses

Introduction to SMT

Given a sentence F in a foreign language, find the most appropriate English translation E:

$$E^* = \arg\max_E P(E \mid F) = \arg\max_E P(F \mid E) \, P(E)$$

P(F|E): translation model

P(E): language model
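
The way the translation model and the language model combine can be sketched in a few lines: the decoder prefers the English candidate E with the highest product P(F|E) * P(E), computed here in log space. The candidate list and all probabilities are made-up illustrative values, and `best_translation` is a hypothetical helper, not part of Giza++ or Moses.

```python
import math

def best_translation(candidates, tm_logprob, lm_logprob):
    """Return the candidate E that maximizes log P(F|E) + log P(E)."""
    return max(candidates, key=lambda e: tm_logprob[e] + lm_logprob[e])

# Hypothetical scores for one foreign sentence F and three English candidates.
tm = {"the house": math.log(0.4), "house the": math.log(0.4), "a home": math.log(0.1)}
lm = {"the house": math.log(0.05), "house the": math.log(0.0001), "a home": math.log(0.02)}

best = best_translation(list(tm), tm, lm)
print(best)
```

The translation model alone cannot separate "the house" from "house the" in this toy setup; the language model breaks the tie in favour of the fluent word order.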

The Generation Process

Partition: Consider all possible partitions of the source-language sentence into phrases

Lexicalization: For a given partition, translate each phrase into the foreign language

Reordering: Permute the set of all foreign words, with words possibly moving across phrase boundaries

We need the notion of alignment to better explain the mathematics behind the generation process

Alignment

Word-based alignment

For each word in the source language, align the words in the target language that this word possibly produces

Based on IBM Models 1-5

Model 1 is the simplest

As we go from Model 1 to Model 5, the models get more complex but more realistic

This is all that Giza++ does

Alignment

A function from target position to source position:

The alignment sequence is: 2,3,4,5,6,6,6

Alignment function A: A(1) = 2, A(2) = 3, ...

A different alignment function will give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ...

To allow spurious insertion, allow alignment with word 0 (NULL)

No. of possible alignments: $(I+1)^J$
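
As a small illustration, an alignment function can be stored as a list whose j-th entry is the source position A(j), with 0 standing for NULL. The sketch below enumerates every alignment for tiny, arbitrarily chosen lengths I and J and confirms that there are (I+1)^J of them; the helper name `all_alignments` is made up for this example.

```python
from itertools import product

def all_alignments(I, J):
    """Every alignment (a_1, ..., a_J) with each a_j in 0..I, where 0 means NULL."""
    return product(range(I + 1), repeat=J)

# The alignment sequence from the slide, stored so that A[j - 1] == A(j).
A = [2, 3, 4, 5, 6, 6, 6]

I, J = 2, 3  # small illustrative lengths
alignments = list(all_alignments(I, J))
count = len(alignments)
print(count, (I + 1) ** J)
```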

IBM Model 1: Generative Process

IBM Model 1: Details

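$$P(F \mid E) = \sum_{A} P(J \mid E) \, P(A \mid J, E) \, P(F \mid A, J, E)$$

$$P(A \mid J, E) \, P(F \mid A, J, E) = P(a_1^J \mid J, e_1^I) \, P(f_1^J \mid a_1^J, J, e_1^I) = \prod_{j=1}^{J} P(a_j \mid a_1^{j-1}, J, e_1^I) \, P(f_j \mid f_1^{j-1}, a_1^J, J, e_1^I)$$

$$P(F \mid E) = \sum_{A} P(J \mid E) \prod_{j=1}^{J} P(a_j \mid a_1^{j-1}, J, e_1^I) \, P(f_j \mid f_1^{j-1}, a_1^J, J, e_1^I)$$

No assumptions. The formula above is exact.

Choosing length: $P(J \mid E) = P(J \mid E, I) = P(J \mid I) = \epsilon$

Choosing alignment: all alignments are equiprobable

$$P(a_j \mid a_1^{j-1}, J, e_1^I) = \frac{1}{I+1}$$

Each foreign word depends only on the English word it is aligned to:

$$P(f_j \mid f_1^{j-1}, a_1^J, J, e_1^I) = t(f_j \mid e_{a_j})$$

Translation probability:

$$P(F \mid E) = \frac{\epsilon}{(I+1)^J} \sum_{A} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$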
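
The Model 1 translation probability, epsilon / (I+1)^J times the sum over all alignments of the product of t(f_j | e_{a_j}), can be checked numerically on a toy pair. The sketch below computes it by brute force over all (I+1)^J alignments and again with the well-known rearrangement of the sum of products into a product of sums (that rearrangement is not on the slide). The table `t` holds made-up values and `epsilon` defaults to 1 purely for illustration.

```python
from itertools import product
from math import isclose, prod

def model1_likelihood(f_words, e_words, t, epsilon=1.0):
    """Brute force: sum over every alignment; position 0 is the NULL word."""
    e_with_null = ["NULL"] + list(e_words)
    I, J = len(e_words), len(f_words)
    total = 0.0
    for a in product(range(I + 1), repeat=J):
        total += prod(t.get((f, e_with_null[i]), 0.0) for f, i in zip(f_words, a))
    return epsilon / (I + 1) ** J * total

def model1_likelihood_fast(f_words, e_words, t, epsilon=1.0):
    """Same value, using sum_A prod_j t(...) == prod_j sum_i t(f_j | e_i)."""
    e_with_null = ["NULL"] + list(e_words)
    I, J = len(e_words), len(f_words)
    per_word = (sum(t.get((f, e), 0.0) for e in e_with_null) for f in f_words)
    return epsilon / (I + 1) ** J * prod(per_word)

# Hypothetical lexical table t[(f, e)] = t(f|e); values are illustrative only.
t = {("casa", "house"): 0.8, ("casa", "green"): 0.1, ("casa", "NULL"): 0.1,
     ("verde", "green"): 0.7, ("verde", "house"): 0.1, ("verde", "NULL"): 0.2}

slow = model1_likelihood(["casa", "verde"], ["green", "house"], t)
fast = model1_likelihood_fast(["casa", "verde"], ["green", "house"], t)
print(slow, fast)
```

The brute-force version is exponential in J; the product-of-sums form is what makes Model 1 training tractable on real sentences.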

Training Alignment Models

Given a parallel corpus, for each pair (F,E) learn the best alignment A and the component probabilities:

t(f|e) for Model 1

lexicon probability P(f|e) and alignment probability $P(a_i \mid a_{i-1}, I)$

How do we compute these probabilities if all we have is a parallel corpus?

Intuition: Interdependence of Probabilities

If you knew which words are probable translations of each other, then you could guess which alignment is probable and which one is improbable

If you were given alignments with probabilities then you can compute translation probabilities

Looks like a chicken and egg problem

EM algorithm comes to the rescue

Expectation Maximization (EM) Algorithm

Used when we want the maximum likelihood estimate of the parameters of a model that depends on hidden variables. In the present case, the parameters are the translation probabilities and the hidden variables are the alignments, whose probabilities the E-step estimates.

• Init: Start with an arbitrary estimate of the parameters

• E-step: Compute the expected value of the hidden variables

• M-step: Recompute the parameters that maximize the likelihood of the data given the expected values of the hidden variables from the E-step

Example of EM Algorithm

Green house
Casa verde

The house
La casa

Init: Assume that any word can generate any word with equal probability:

P(la|house) = 1/3

E-Step

$$P(A, F \mid J, E) = P(A \mid J, E) \, P(F \mid A, J, E) = \frac{1}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$

E-Step:

$$P(A \mid F, E) = \frac{P(A, F \mid E)}{\sum_{A'} P(A', F \mid E)}$$
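
A minimal sketch of this E-step for a single sentence pair: enumerate every alignment, score it with the product of t values, and normalize. It assumes no NULL word, and it drops the constant 1/(I+1)^J because that factor cancels in the normalization. The function name and the uniform table are illustrative choices.

```python
from itertools import product
from math import prod

def alignment_posteriors(f_words, e_words, t):
    """Return {alignment: P(A|F,E)}; a[j] is the e position that generates f_words[j]."""
    joint = {}
    for a in product(range(len(e_words)), repeat=len(f_words)):
        joint[a] = prod(t[(f, e_words[i])] for f, i in zip(f_words, a))
    z = sum(joint.values())
    return {a: p / z for a, p in joint.items()}

# Uniform initialization over the three Spanish words casa, verde, la.
f_words, e_words = ["casa", "verde"], ["green", "house"]
t = {(f, e): 1.0 / 3.0 for f in f_words for e in e_words}

posteriors = alignment_posteriors(f_words, e_words, t)
for a, p in sorted(posteriors.items()):
    print(a, p)
```

With a uniform table every alignment is equally likely, so the first E-step carries no information yet; the co-occurrence pattern across sentence pairs only starts to matter once the counts are pooled in the M-step.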

M-Step

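$$tcount(f \mid e) = \sum_{(E,F)} \sum_{A} P(A \mid F, E) \cdot C(f, e \mid A, F, E)$$

$$t(f \mid e) = \frac{tcount(f \mid e)}{\sum_{f'} tcount(f' \mid e)}$$

Here P(A|F,E) is the alignment probability from the E-step, and C(f,e|A,F,E) is the number of times f is aligned to e under alignment A in the sentence pair (F,E).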
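
The M-step can be sketched as two passes: accumulate alignment-weighted counts tcount(f|e) over all sentence pairs, then normalize per English word e. The input format (a list of sentence pairs, each with a dictionary of alignment posteriors) is an assumption made for this example, and the uniform posteriors stand in for the result of a first E-step.

```python
from collections import defaultdict
from itertools import product

def m_step(weighted_alignments):
    """weighted_alignments: iterable of (f_words, e_words, {alignment: posterior})."""
    tcount = defaultdict(float)
    for f_words, e_words, posteriors in weighted_alignments:
        for a, p in posteriors.items():
            for j, i in enumerate(a):          # f_words[j] is aligned to e_words[i]
                tcount[(f_words[j], e_words[i])] += p
    totals = defaultdict(float)
    for (f, e), c in tcount.items():
        totals[e] += c
    return {(f, e): c / totals[e] for (f, e), c in tcount.items()}

# After a uniform start, all four alignments of each pair have posterior 1/4.
uniform = {a: 0.25 for a in product(range(2), repeat=2)}
data = [(["casa", "verde"], ["green", "house"], uniform),
        (["la", "casa"], ["the", "house"], uniform)]

t = m_step(data)
print(t[("casa", "house")], t[("la", "house")])
```

Because "casa" co-occurs with "house" in both sentence pairs, its count under "house" is twice that of "verde" or "la", which is the first sign of the correct translation emerging.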

E-Step again

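$$P(A, F \mid J, E) = P(A \mid J, E) \, P(F \mid A, J, E) = \frac{1}{(I+1)^J} \prod_{j=1}^{J} t(f_j \mid e_{a_j})$$

$$P(A \mid F, E) = \frac{P(A, F \mid E)}{\sum_{A'} P(A', F \mid E)}$$

Values shown in the slide figure: 1/3, 2/3, 2/3, 1/3

Repeat till convergence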
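
Putting the E-step and M-step together, a compact training loop for the two-sentence example might look like the sketch below. It is a simplified Model 1 trainer under stated assumptions: no NULL word, uniform initialization over the foreign vocabulary, and the usual per-word form of the posterior instead of explicit enumeration of alignments. `train_model1` is a hypothetical name, not a Giza++ function.

```python
from collections import defaultdict

def train_model1(corpus, iterations):
    """corpus: list of (e_words, f_words) pairs. Returns t[(f, e)] = t(f|e)."""
    f_vocab = {f for _, fs in corpus for f in fs}
    e_vocab = {e for es, _ in corpus for e in es}
    t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}  # uniform init
    for _ in range(iterations):
        tcount = defaultdict(float)
        total = defaultdict(float)
        for es, fs in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)   # E-step normalizer for this word
                for e in es:
                    p = t[(f, e)] / z            # posterior that e generated f
                    tcount[(f, e)] += p
                    total[e] += p
        t = {(f, e): tcount[(f, e)] / total[e] for (f, e) in t}  # M-step
    return t

corpus = [(["green", "house"], ["casa", "verde"]),
          (["the", "house"], ["la", "casa"])]

t1 = train_model1(corpus, 1)
t20 = train_model1(corpus, 20)
print(t1[("casa", "house")], t20[("casa", "house")])
```

After one iteration t(casa|house) is 1/2; with more iterations it keeps rising toward 1 while t(verde|house) and t(la|house) shrink.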

Limitation: Only 1->Many Alignments allowed

Phrase-based alignment

More natural

Many-to-one mappings allowed

Generating Bi-directional Alignments

Existing models only generate uni-directional alignments

Combine two uni-directional alignments to get many-to-many bi-directional alignments

Hindi-Eng Alignment

छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है

[Figure: Hindi-to-English alignment matrix between the Hindi words above (columns) and the English words Goa, is, a, premier, beach, vacation, destination (rows).]

Eng-Hindi Alignment

छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है

[Figure: English-to-Hindi alignment matrix between the Hindi words above (columns) and the English words Goa, is, a, premier, beach, vacation, destination (rows).]

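Combining Alignments

छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है

[Figure: combined alignment matrix between the Hindi words above (columns) and the English words Goa, is, a, premier, beach, vacation, destination (rows), with alignment points marked + and |.]

P=2/3=.67, R=2/7=.29

P=4/5=.8, R=4/7=.57

P=5/6=.83, R=5/7=.71

P=6/9=.67, R=6/7=.86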
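
The P and R values can be read as precision and recall of a set of alignment points measured against reference points; that reading is an assumption, since the slide does not define them. Under that reading, the sketch below reproduces the first pair of numbers (2 correct points out of 3 proposed, 7 reference points). The concrete index pairs are invented for the example.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted alignment points against gold points."""
    correct = len(predicted & gold)
    return correct / len(predicted), correct / len(gold)

# Hypothetical (english_index, hindi_index) points: 7 gold, 3 predicted, 2 correct.
gold = {(0, 3), (2, 4), (3, 5), (4, 6), (5, 0), (6, 7), (1, 8)}
predicted = {(0, 3), (2, 4), (5, 1)}

p, r = precision_recall(predicted, gold)
print(round(p, 2), round(r, 2))
```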

A Different Heuristic from Moses-Site

GROW-DIAG-FINAL(e2f,f2e):
  neighboring = ((-1,0),(0,-1),(1,0),(0,1),(-1,-1),(-1,1),(1,-1),(1,1))
  alignment = intersect(e2f,f2e);
  GROW-DIAG(); FINAL(e2f); FINAL(f2e);

GROW-DIAG():
  iterate until no new points added
    for english word e = 0 ... en
      for foreign word f = 0 ... fn
        if ( e aligned with f )
          for each neighboring point ( e-new, f-new ):
            if (( e-new, f-new ) in union( e2f, f2e ) and
                ( e-new not aligned and f-new not aligned ))
              add alignment point ( e-new, f-new )

FINAL(a):
  for english word e-new = 0 ... en
    for foreign word f-new = 0 ... fn
      if ( ( ( e-new, f-new ) in alignment a) and
           ( e-new not aligned or f-new not aligned ) )
        add alignment point ( e-new, f-new )

Proposed changes: after growing diagonally, align the shorter sentence first, and use alignments only from the corresponding directional alignment.
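
The pseudocode translates fairly directly into Python. The sketch below follows the conditions exactly as written on the slide ("and" in GROW-DIAG, "or" in FINAL), represents each directional alignment as a set of (e, f) index pairs, and derives the sentence lengths from the largest indices seen. It illustrates the heuristic and is not the Moses implementation; the example alignments are invented.

```python
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]

def grow_diag_final(e2f, f2e):
    """e2f, f2e: sets of (e, f) index pairs. Returns the symmetrized alignment."""
    union = e2f | f2e
    alignment = set(e2f & f2e)
    n_e = 1 + max((e for e, _ in union), default=-1)
    n_f = 1 + max((f for _, f in union), default=-1)

    def e_free(e):
        return all(pe != e for pe, _ in alignment)

    def f_free(f):
        return all(pf != f for _, pf in alignment)

    added = True
    while added:                              # GROW-DIAG
        added = False
        for e in range(n_e):
            for f in range(n_f):
                if (e, f) not in alignment:
                    continue
                for de, df in NEIGHBORS:
                    new = (e + de, f + df)
                    if new in union and e_free(new[0]) and f_free(new[1]):
                        alignment.add(new)
                        added = True
    for directional in (e2f, f2e):            # FINAL(e2f); FINAL(f2e)
        for e in range(n_e):
            for f in range(n_f):
                if (e, f) in directional and (e_free(e) or f_free(f)):
                    alignment.add((e, f))
    return alignment

# Hypothetical directional alignments for a 3-word sentence pair.
e2f = {(0, 0), (1, 1), (2, 2)}
f2e = {(0, 0), (1, 1), (2, 0)}
result = grow_diag_final(e2f, f2e)
print(sorted(result))
```

In this example the point (2, 2) is added during the diagonal growth, after which (2, 0) is rejected by FINAL because both of its words are already aligned.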

Generating Phrase Alignments

छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है

[Figure: final alignment matrix between the Hindi words above (columns) and the English words Goa, is, a, premier, beach, vacation, destination (rows), with alignment points marked +.]

Extracted phrase pairs:

a premier beach vacation destination
एक प्रमुख समुद्र-तटीय गंतव्य है

premier beach vacation
प्रमुख समुद्र-तटीय
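
One common way to read phrase pairs off a word alignment is to keep every pair of spans that is consistent with it, meaning that no word inside either span is aligned to a word outside the other. The sketch below implements that check in a simplified form: it does not extend phrases over unaligned boundary words, and the `max_len` limit, the function name, and the Spanish toy sentence are all choices made for this illustration.

```python
def extract_phrases(e_words, f_words, alignment, max_len=4):
    """alignment: set of (e, f) index pairs. Returns a set of (e_phrase, f_phrase)."""
    phrases = set()
    for e1 in range(len(e_words)):
        for e2 in range(e1, min(e1 + max_len, len(e_words))):
            fs = [f for e, f in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            # Consistent only if nothing inside [f1, f2] links outside [e1, e2].
            if all(e1 <= e <= e2 for e, f in alignment if f1 <= f <= f2):
                phrases.add((" ".join(e_words[e1:e2 + 1]),
                             " ".join(f_words[f1:f2 + 1])))
    return phrases

e_words = ["the", "green", "house"]
f_words = ["la", "casa", "verde"]
alignment = {(0, 0), (1, 2), (2, 1)}   # the-la, green-verde, house-casa

phrases = extract_phrases(e_words, f_words, alignment)
for pair in sorted(phrases):
    print(pair)
```

The span "the green" yields no phrase pair because its foreign span would have to include "casa", which is aligned to "house" outside the span.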

Using Moses and Giza++

Refer to http://www.statmt.org/moses_steps.html

Steps

Install all packages in Moses

Input - sentence aligned parallel corpus

Training

Tuning

Generate output on test corpus (decoding)

Example

train.en

h e l l o
h e l l o
w o r l d
c o m p o u n d w o r d
h y p h e n a t e d
o n e
b o o m
k w e e z l e b o t t e r

train.pr

hh eh l ow
hh ah l ow
w er l d
k aa m p aw n d w er d
hh ay f ah n ey t ih d
ow eh n iy
b uw m
k w iy z l ah b aa t ah r
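
In this example each letter and each phoneme is treated as one token, so a line of train.en is simply a word with its letters separated by spaces. A small sketch of that preparation step follows; the helper name is made up, and dropping internal whitespace mirrors how "compound word" appears in the sample above.

```python
def to_symbols(word):
    """Turn 'hello' into 'h e l l o': one space-separated token per letter."""
    return " ".join(ch for ch in word.lower() if not ch.isspace())

lines = [to_symbols(w) for w in ["hello", "world", "compound word"]]
print("\n".join(lines))
```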

Sample from Phrase-table

b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718
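
Each phrase-table entry in the sample has five fields separated by "|||": the source phrase, the target phrase, two word-alignment fields, and a list of numeric scores. A minimal parser for lines in exactly this layout is sketched below; the dictionary keys are names chosen for this example, and the parser deliberately assigns no meaning to the individual scores.

```python
def parse_phrase_table_line(line):
    """Split one 'src ||| tgt ||| align ||| align ||| scores' line into its fields."""
    src, tgt, align_a, align_b, scores = [part.strip() for part in line.split("|||")]
    return {
        "source": src.split(),
        "target": tgt.split(),
        "alignments": (align_a, align_b),
        "scores": [float(x) for x in scores.split()],
    }

entry = parse_phrase_table_line(
    "b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718")
print(entry["source"], entry["target"], entry["scores"])
```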

Testing output

h o t
hh aa t

p h o n e
p|UNK hh ow eh n iy

b o o k
b uw k