Syntax-based Translation
Part 1: Reordering for Phrase-based Translation

March 13, 2012

Thanks to Michael Collins for many of today’s slides. Take a look at Mike’s course: http://www.cs.columbia.edu/~cs4705/

Slides: u.cs.biu.ac.il/~yogo/courses/mt2016/l9/l9-slides.pdf

Goals

• Understand why syntax is important for reordering models
• Review non-syntactic reordering models for phrase-based machine translation
• Review the “Clause Restructuring” approach of Collins, Koehn, and Kucerova
• Understand why it is a good fit for phrase-based machine translation
• Discuss its limitations

Phrase-based model

• Foreign input is segmented into phrases
• Each phrase is translated into English
• Phrases are reordered

natuerlich hat john spass am spiel

of course john has fun with the game

Some Reordering Already Captured

natuerlich hat john spass am spiel

of course john has fun with the game

• Local reordering can be captured within phrases

Phrase translation table

• Main knowledge source: table with phrase translations and their probabilities
• Example: phrase translations for natuerlich

Source      Translation      Probability φ(e|f)
natuerlich  of course        0.5
natuerlich  naturally        0.3
natuerlich  of course ,      0.15
natuerlich  , of course ,    0.05
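The table above can be pictured as a simple lookup structure. The following is a minimal sketch, assuming the table is held as a nested Python dictionary; the names `PHRASE_TABLE`, `best_translation`, and `is_normalized` are illustrative and do not come from the slides or from any particular toolkit.

```python
# Phrase translation table for "natuerlich", copied from the slide.
# Outer key: source phrase f. Inner dict: English option e -> phi(e|f).
PHRASE_TABLE = {
    "natuerlich": {
        "of course": 0.5,
        "naturally": 0.3,
        "of course ,": 0.15,
        ", of course ,": 0.05,
    },
}


def best_translation(source_phrase, table=PHRASE_TABLE):
    """Return the (translation, probability) pair with the highest phi(e|f)."""
    options = table[source_phrase]
    return max(options.items(), key=lambda item: item[1])


def is_normalized(source_phrase, table=PHRASE_TABLE, tol=1e-9):
    """Check that the probabilities listed for one source phrase sum to 1."""
    return abs(sum(table[source_phrase].values()) - 1.0) < tol


if __name__ == "__main__":
    print(best_translation("natuerlich"))
    print(is_normalized("natuerlich"))
```

A real decoder would not simply pick the single best entry per phrase; it keeps all options and lets the language model and reordering scores decide among them.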

Probabilistic Model

• Bayes rule

$$e_{\text{best}} = \arg\max_e p(e \mid f) = \arg\max_e p(f \mid e)\, p_{\text{LM}}(e)$$

  – translation model p(f|e)
  – language model p_LM(e)

• Decomposition of the translation model (the reordering score can be incorporated in the TM)

$$p(\bar{f}_1^I \mid \bar{e}_1^I) = \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i)\; d(\text{start}_i - \text{end}_{i-1} - 1)$$

  – phrase translation probability φ
  – reordering probability d

(Chapter 5: Phrase-Based Models)

Log-linear model

Weighted Model as Log-Linear Model

$$p(e, a \mid f) = \exp\Big( \lambda_{\phi} \sum_{i=1}^{I} \log \phi(\bar{f}_i \mid \bar{e}_i) + \lambda_{d} \sum_{i=1}^{I} \log d(a_i - b_{i-1} - 1) + \lambda_{\text{LM}} \sum_{i=1}^{|e|} \log p_{\text{LM}}(e_i \mid e_1 \ldots e_{i-1}) \Big)$$

Here a_i is the start position of foreign phrase i and b_{i-1} is the end position of foreign phrase i-1, so a_i - b_{i-1} - 1 is the same quantity as start_i - end_{i-1} - 1 in the decomposition of the translation model.
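To make the weighted combination concrete, here is a minimal sketch of the log-linear score, assuming the three feature families on the slide (phrase translation, distance-based reordering with d(x) = α^|x|, and language model). The function name, argument layout, and default weights are assumptions made for illustration; in practice the weights λ are tuned on held-out data.

```python
import math


def log_linear_score(phrase_probs, distortions, lm_probs,
                     lam_phi=1.0, lam_d=1.0, lam_lm=1.0, alpha=0.5):
    """Score one translation hypothesis with the log-linear model.

    phrase_probs: phi(f_i | e_i) for each phrase pair
    distortions:  reordering distances a_i - b_{i-1} - 1 for each phrase
    lm_probs:     p_LM(e_i | e_1 ... e_{i-1}) for each English word
    alpha:        base of the distance-based reordering score d(x) = alpha ** |x|
    """
    phi_term = sum(math.log(p) for p in phrase_probs)
    # log d(x) = |x| * log(alpha)
    d_term = sum(abs(x) * math.log(alpha) for x in distortions)
    lm_term = sum(math.log(p) for p in lm_probs)
    return math.exp(lam_phi * phi_term + lam_d * d_term + lam_lm * lm_term)


if __name__ == "__main__":
    # Hypothetical feature values, chosen only to exercise the function.
    print(log_linear_score([0.5, 0.2], [0, 2], [0.1, 0.3]))
```

With all three weights set to 1 the score reduces to the plain product of the component probabilities; setting a weight to 0 switches the corresponding component off.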

Distance-based Reordering

[Figure: foreign positions 1–7 aligned to four English phrases, each arrow annotated with its reordering distance d]

phrase  translates  movement            distance
1       1–3         start at beginning  0
2       6           skip over 4–5       +2
3       4–5         move back over 4–6  -3
4       7           skip over 6         +1

Scoring function: $d(x) = \alpha^{|x|}$, exponential with distance
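The distances in the table follow directly from start_i - end_{i-1} - 1. The sketch below recomputes them from the foreign spans listed in the table and applies the scoring function d(x) = α^|x|; the function names and the choice α = 0.5 in the demo are assumptions for illustration.

```python
def reordering_distances(spans):
    """Compute start_i - end_{i-1} - 1 for each phrase.

    spans: (start, end) foreign positions, 1-based and inclusive,
           listed in the order the phrases appear in the English output.
    """
    distances = []
    prev_end = 0  # before the first phrase, nothing has been covered
    for start, end in spans:
        distances.append(start - prev_end - 1)
        prev_end = end
    return distances


def distortion_score(x, alpha):
    """Distance-based reordering score d(x) = alpha ** |x|."""
    return alpha ** abs(x)


# Foreign spans from the slide: phrases translate positions 1-3, 6, 4-5, 7.
SLIDE_SPANS = [(1, 3), (6, 6), (4, 5), (7, 7)]

if __name__ == "__main__":
    ds = reordering_distances(SLIDE_SPANS)
    print(ds)
    total = 1.0
    for x in ds:
        total *= distortion_score(x, alpha=0.5)
    print(total)
```

Because the score depends only on |x|, the jump forward over 4–5 and the jump back over 4–6 are penalized by the same rule regardless of which words are involved.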

Values of α

[Figure: probability (0 to 1.00) against distance of move (0 to 5) for α = 0.99, 0.75, 0.5, 0.25, 0.1]

Discussion: Distance-based reordering

• What do you think of it?
• Is it a good model for how reordering works across languages?
• What is it missing?

(Discuss with your neighbor)

Distance-based reordering

• Small values of α severely discourage reordering
  – Limit reordering to monotonic or a narrow window
  – OK for languages with very similar word orders
  – Bad for languages with different word orders
• The distance-based penalty applies uniformly to all words and all word types
  – Doesn’t know that adjectives and nouns should swap when translating from French to English
• Puts most responsibility on the language model

How else could we model reordering?

• Why not assign a distinct reordering probability to each word/phrase in the phrase table?
  – p(reorder | f, e)
• This is known as lexicalized reordering
• How can we estimate that probability?

Lexicalized Reordering model

[Figure: word alignment grid pairing the English words “How much”, “should”, “you”, “charge”, “for your”, “Facebook”, “profile” with the German words “Wie viel”, “sollte”, “man”, “aufgrund”, “seines”, “Profils”, “in Facebook”, “verdienen”; each phrase pair is labeled with its orientation]

m: monotone (keep order)
s: swap order
d: become discontinuous

Reordering features are probability estimates of s, d, and m

Lexicalized Reordering table

• Identical phrase pairs <f,e> as in the phrase translation table
• Contains values for p(monotone|e,f), p(swap|e,f), p(discontinuous|e,f)

Source      Translation     p(m|e,f)  p(s|e,f)  p(d|e,f)
natuerlich  of course       0.52      0.08      0.40
natuerlich  naturally       0.42      0.10      0.48
natuerlich  of course ,     0.50      0.001     0.499
natuerlich  , of course     0.27      0.17      0.56
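One way to answer the earlier question of how such probabilities can be estimated is by relative frequency over orientations observed in training data. The sketch below assumes a simple phrase-based definition of orientation (comparing the foreign span of the current phrase with that of the previously translated phrase); real systems typically read orientation off the word alignment during phrase extraction and add smoothing. All names and the counts in the demo are hypothetical.

```python
from collections import Counter, defaultdict


def orientation(prev_span, cur_span):
    """Classify the orientation of the current phrase relative to the previous one.

    Spans are (start, end) foreign positions, 1-based and inclusive.
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "m"  # monotone: continues right after the previous phrase
    if cur_end == prev_start - 1:
        return "s"  # swap: sits immediately before the previous phrase
    return "d"      # discontinuous: anything else


def estimate_orientation_probs(observations):
    """Relative-frequency estimate of p(orientation | e, f).

    observations: iterable of ((f, e), orientation) pairs.
    """
    counts = defaultdict(Counter)
    for pair, o in observations:
        counts[pair][o] += 1
    probs = {}
    for pair, c in counts.items():
        total = sum(c.values())
        probs[pair] = {o: c[o] / total for o in ("m", "s", "d")}
    return probs


if __name__ == "__main__":
    pair = ("natuerlich", "of course")
    # Hypothetical observations, not taken from any corpus.
    obs = [(pair, "m")] * 3 + [(pair, "d")]
    print(estimate_orientation_probs(obs)[pair])
```

Unlike the distance-based penalty, these estimates are specific to each phrase pair, so a phrase that usually swaps with its neighbor can be scored differently from one that usually stays in place.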

Discussion: Is this better?

• Do you think that this is a more sensible reordering model than the distance-based one?
• How could you determine if it is better or not?
• What do you think that it still lacks?

(Discuss with your neighbor)

Empirically, yes!

[Figure: bar chart comparing Baseline and Lexicalized Reordering for Arabic, Japanese, Korean, Chinese, and En-Chinese; vertical axis from 0 to 60.0. Bar labels: 16.6, 38.6, 42.3, 47.6, 50.9 and 15.2, 34.6, 35.7, 45.1, 49.9]

Koehn et al, IWSLT 2005

The Awful German Language

“The Germans have another kind of parenthesis, which they make by splitting a verb in two and putting half of it at the beginning of an exciting chapter and the OTHER HALF at the end of it. Can any one conceive of anything more confusing than that? These things are called ‘separable verbs.’ The wider the two portions of one of them are spread apart, the better the author of the crime is pleased with his performance.”

Mark Twain

German verbs

Ich werde Ihnen den Report aushaendigen .
I will to_you the report pass_on .

Ich werde Ihnen die entsprechenden Anmerkungen aushaendigen .
I will to_you the corresponding comments pass_on .

Ich werde Ihnen die entsprechenden Anmerkungen am Dienstag aushaendigen
I will to_you the corresponding comments on Tuesday pass_on

German free word order

I will to_you the report pass_on

To_you will I the report pass_on

The report will I to_you pass_on

The finite verb always appears in 2nd position, but any constituent (not just the subject) can appear in the 1st position

German verbs

Main clause:
Ich werde Ihnen den Report aushaendigen ,
I will to_you the report pass_on ,

Subordinate clause:
damit Sie den eventuell uebernehmen koennen .
so_that you it perhaps adopt can .

Collins’ Motivation

Phrase-based models have an overly simplistic way of handling different word orders.

We can describe the linguistic differences between different languages.

Collins defines a set of 6 simple, linguistically motivated rules, and demonstrates that they result in significant translation improvements.

Collins’ Pre-ordering Model

Step 1: Reorder the source language

Ich werde Ihnen den Report aushaendigen , damit Sie den eventuell uebernehmen koennen .

Ich werde aushaendigen Ihnen den Report , damit Sie koennen uebernehmen den eventuell .

(I will pass_on to_you the report, so_that you can adopt it perhaps .)

Step 2: Apply the phrase-based machine translation pipeline to the reordered input.

Example Parse Tree

S
  PPER-SB   I
  VFIN-HD   will
  VP
    PPER-DA   to_you
    NP-OA
      ART   the
      NN    Report
    VVINF-HD  pass_on

Clause Restructuring

Rule 1: Verbs are initial in VPs
Within a VP, move the head to the initial position

[Tree: S containing ... , a VP-OC with children PDS-OA den (that), ADJD-MO eventuell (perhaps), VVINF-HD uebernehmen (adopt), and VINF-HD koennen (can)]

Rule 2: Verbs follow complementizers
In a subordinated clause, move the head of the clause to follow the complementizer

[Tree: S-MO containing KOUS-CP damit (so-that), PPER-SB Sie (you), a VP-OC with VVINF-HD uebernehmen (adopt) and ... , and VINF-HD koennen (can)]

Rule 3: Move subject
The subject is moved to directly precede the head of the clause

[Tree: the same S-MO clause as for Rule 2, with PPER-SB Sie (you) as the subject and VINF-HD koennen (can) as the head]

Rule 4: Particles
In verb particle constructions, the particle is moved to precede the finite verb

[Tree: S containing PPER-SB Wir (we), VVINF-HD fordern (accept), an NP-OA with ART das (the) and NN Praesidium (presidency), and PTKVZ-SVP auf (*PARTICLE*)]

Rule 5: Infinitives
Infinitives are moved to directly follow the finite verb within a clause

[Tree: S containing PPER-SB Wir (we), VVINF-HD konnten (could), PPER-OA es (it), PTK-NEG nicht (not), and a VP-OC with ... and VVINF-HD einreichen (submit)]

Rule 6: Negation
Negative particle is moved to directly follow the finite verb

[Tree: the same S clause as for Rule 5, with PTK-NEG nicht (not) and the finite verb konnten (could)]
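As an illustration of how one of these rules might operate on a parse tree, here is a minimal sketch of Rule 1 only. It assumes a tree encoded as nested `(label, children)` tuples, with a plain string in place of `children` for leaves, and it identifies the head as the child whose label ends in `-HD`. The encoding and the function names are assumptions for this sketch, not the representation used by Collins, Koehn, and Kucerova.

```python
def is_leaf(node):
    """A leaf is a (tag, word) pair whose second element is a string."""
    return isinstance(node[1], str)


def verb_initial(node):
    """Rule 1 sketch: within every VP, move the head child to the first position.

    The input tree is not modified; a new tree is returned.
    """
    if is_leaf(node):
        return node
    label, children = node
    children = [verb_initial(child) for child in children]
    if label.startswith("VP"):
        heads = [c for c in children if c[0].endswith("-HD")]
        others = [c for c in children if not c[0].endswith("-HD")]
        children = heads + others
    return (label, children)


def words(node):
    """Read the words off the tree from left to right."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[1] for w in words(child)]


# The VP-OC fragment from the Rule 1 slide.
VP_OC = ("VP-OC", [
    ("PDS-OA", "den"),
    ("ADJD-MO", "eventuell"),
    ("VVINF-HD", "uebernehmen"),
])

if __name__ == "__main__":
    print(" ".join(words(VP_OC)))
    print(" ".join(words(verb_initial(VP_OC))))
```

The reordered fragment matches the corresponding stretch of the pre-ordered sentence shown earlier ("uebernehmen den eventuell"); the remaining five rules would each need their own tree transformation.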

A Less Awful German Language

Mark Twain

Ich werde Ihnen den Report aushaendigen, damit Sie den eventuell uebernehmen koennen.

I will to_you the report pass_on, so_that you it perhaps adopt can.

Ich werde aushaendigen Ihnen den Report, damit Sie koennen uebernehmen den eventuell.

I will pass_on to_you the report, so_that you can adopt it perhaps .

Now that seems less like the ravings of a madman.

Experiments

• Parallel training data: Europarl corpus (751k sentence pairs, 15M German words, 16M English)
• Parsed German training sentences
• Reordered the German training sentences with their 6 clause reordering rules
• Trained a phrase-based model
• Parsed and reordered the German test sentences
• Translated them
• Compared against the standard phrase-based model without parsing/reordering

Bleu score increase

System            Bleu
Baseline          25.2
Reordered System  26.8

Significant improvement at p<0.01 using the sign test

Human Translation Judgments

• 100 sentences (10-20 words in length)
• Two annotators
• Judged two different versions
  – Baseline system’s translation
  – Reordering system’s translation
• Judgments: Worse, better or equal
• Sentences were chosen at random, systems’ translations were presented in random order

Human Translation Judgments

             +    =    –
Annotator 1  40%  40%  20%
Annotator 2  44%  37%  19%

+ = reordered translation better
– = baseline better
= = equal
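As a rough way to read these percentages, one can run a two-sided exact sign test on the better/worse counts, discarding ties. This is only a sketch: it assumes each annotator judged all 100 sentences, so that the percentages translate to counts of 40 vs. 20 and 44 vs. 19, and it is not necessarily the procedure behind the significance result quoted for the Bleu scores. The function name is illustrative.

```python
from math import comb


def sign_test_p_value(wins, losses):
    """Two-sided exact sign test under the null hypothesis p(win) = 0.5.

    Ties must be dropped before calling; only wins and losses are counted.
    """
    n = wins + losses
    k = max(wins, losses)
    # Probability of a result at least as extreme as k in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Counts assumed from the percentages above (100 sentences per annotator).
    print(sign_test_p_value(40, 20))  # Annotator 1: reordered better vs. baseline better
    print(sign_test_p_value(44, 19))  # Annotator 2
```

The "equal" judgments carry no information about which system is preferred, which is why the test ignores them.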

Examples

Reference: I think it is wrong in principle to have such measures in the European Union
Reordered: I believe that it is wrong in principle to take such measures in the European Union
Baseline: I believe that it is wrong in principle, such measure in the European Union to take.

Examples

Reference: The current difficulties should encourage us to redouble our efforts to promote cooperation in the Euro-Mediterranean framework.
Baseline: The current problems should spur us, our efforts to promote cooperation within the framework of the e-prozesses to be intensified.
Reordered: The current problems should spur us to intensify our efforts to promote cooperation within the framework of the e-prozesses.

Examples

Reference: To go on subsidizing tobacco cultivation at the same time is a downright contradiction.
Baseline: At the same time, continue to subsidize tobacco growing, it is quite schizophrenic.
Reordered: At the same time, to continue to subsidize tobacco growing is schizophrenic.

Examples

Reference: We have voted against the report by Mrs. Lalumiere for reasons that include the following:
Reordered: We have voted, amongst other things, for the following reasons against the report by Mrs. Lalumiere:
Baseline: We have, among other things, for the following reasons against the report by Mrs. Lalumiere voted:

Discussion: Clause Restructuring

• Are you convinced that German-English translation has improved?
• Do you think that this is a good fit for phrase-based machine translation?
• What limitations does this method have?

(Discuss with your neighbor.)

Limitations

• Requires a parser for the source language
  – We have parsers for only a small number of languages
  – Penalizes “low resource languages”
  – Fine for translating from English into other languages
• Involves hand-crafted rules
• Removes the nice language-independent qualities of statistical machine translation

Learning the Rules Automatically

• Great term project idea!
• “Improving a statistical MT system with automatically learned rewrite patterns” by Fei Xia and Michael McCord (Coling 2004)

Syntactic LMs

• Our goal is to reorder the translated phrases so that they are grammatical English
• Isn’t the language model probability supposed to do that already?
• Instead of an n-gram model, could we augment the LM with syntactic information?

Statistical parsing

Problem: bottom-up parsing requires the whole sentence
We need the LM to be able to score partial translations

One possibility: Incremental parsing

Right-corner Incremental Parsing
(from “An Incremental Syntactic Language Model for Statistical Phrase-based Machine Translation”, Lane Schwartz)

Transform right-expanding sequences of constituents into left-expanding sequences of incomplete constituents

[Figure: two trees for “The president meets the board on Friday”. Original tree: (S (NP (DT The) (NN president)) (VP (VP (VB meets) (NP (DT the) (NN board))) (PP (IN on) (NP Friday)))). Transformed tree: the same words under left-expanding incomplete constituents NP/NN, S/VP, VP/NP, VP/NN, S/PP, S/NP]

More later

• Matt will talk more about syntax-based LMs, and why they do a poor job selecting grammatical output, later in the term.
• Thursday, I’ll move away from phrase-based MT and talk about synchronous grammar models


Questions?

• Questions about this material?

• Questions about the Project Proposals (due tonight)?

• HW2 is released today


The Awful German Language

“Some German words are so long that they have a perspective. Freundschaftsbezeigungen. Dilettantenaufdringlichkeiten. Stadtverordnetenversammlungen. These things are not words, they are alphabetical processions. And they are not rare; one can open a German newspaper at any time and see them marching majestically across the page–and if he has any imagination he can see the banners and hear the music, too.”

Mark Twain


The Awful German Language

“A dog is der Hund; now you put that dog in the genitive case, and is he the same dog he was before? No, sir; he is des Hundes; put him in the dative case and what is he? Why, he is dem Hund. Now you snatch him into the accusative case and how is it with him? Why, he is den Hunden. But suppose he happens to be twins and you have to pluralize him - what then? Why, they'll swat that twin dog around through the 4 cases until he'll think he's an entire international dog show. I don't like dogs, but I wouldn't treat a dog like that.”

Mark Twain


The Awful German Language

“The Germans have an inhuman way of cutting up their verbs. Now a verb has a hard time enough of it in this world when it's all together. It's downright inhuman to split it up. But that's just what those Germans do. They take part of a verb and put it down here, like a stake, and they take the other part of it and put it away over yonder like another stake, and between these two limits they just shovel in German.”

Mark Twain


Clause Restructuring

[Figure: parse-tree fragment with nodes S, "...", VP-OC, PDS-OA den (that), ADJD-MO eventuell (perhaps), VVINF-HD uebernehmen (adopt), VINF-HD koennen (can).]

Rule 1: Verb initial. Within a VP, move the head to the initial position.
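A minimal sketch of Rule 1, assuming the parse is stored as nested tuples, that a VP's head is a direct child whose label ends in `-HD`, and that VP labels start with `VP`. The function name `verb_initial` and the assembled example clause are illustrative; the node labels and words are copied from the slide figures.

```python
def is_leaf(node):
    """A leaf is (label, word), with the word stored as a plain string."""
    return len(node) == 2 and isinstance(node[1], str)


def verb_initial(node):
    """Return a copy of the tree in which every VP has its head child first."""
    if is_leaf(node):
        return node
    label = node[0]
    children = [verb_initial(child) for child in node[1:]]
    if label.startswith("VP"):
        # Stable sort: head children (key False) come first, the rest keep their order.
        children.sort(key=lambda child: not child[0].endswith("-HD"))
    return (label, *children)


clause = ("S-MO",
          ("KOUS-CP", "damit"),
          ("PPER-SB", "Sie"),
          ("VP-OC",
           ("PDS-OA", "den"),
           ("ADJD-MO", "eventuell"),
           ("VVINF-HD", "uebernehmen")),
          ("VINF-HD", "koennen"))

print(verb_initial(clause))
```

Only the children of the VP-OC node are permuted; the clause-level head `koennen` is left in place because the S-MO label does not start with `VP`.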


Clause Restructuring

[Figure: parse-tree fragment with nodes S-MO, KOUS-CP damit (so-that), PPER-SB Sie (you), VP-OC, VVINF-HD uebernehmen (adopt), "...", VINF-HD koennen (can).]

Rule 2: Verbs follow complementizers. In a subordinate clause, move the head of the clause to directly follow the complementizer.
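A minimal sketch of Rule 2, assuming the clause is a nested tuple whose complementizer is the direct child with a label ending in `-CP` and whose head is the direct child with a label ending in `-HD`. Those identification criteria and the function name `move_head_after_complementizer` are assumptions for illustration; the slide only states the movement itself.

```python
def move_head_after_complementizer(clause):
    """Move the clause head so that it directly follows the complementizer."""
    label, *children = clause
    comp = next((c for c in children if c[0].endswith("-CP")), None)
    head = next((c for c in children if c[0].endswith("-HD")), None)
    if comp is None or head is None:
        return clause  # nothing to do without both a complementizer and a head
    children.remove(head)
    children.insert(children.index(comp) + 1, head)
    return (label, *children)


clause = ("S-MO",
          ("KOUS-CP", "damit"),
          ("PPER-SB", "Sie"),
          ("VP-OC",
           ("VVINF-HD", "uebernehmen"),
           ("PDS-OA", "den"),
           ("ADJD-MO", "eventuell")),
          ("VINF-HD", "koennen"))

print(move_head_after_complementizer(clause))
```

In this example `koennen` moves from the end of the clause to the slot right after `damit`, ahead of the subject `Sie`.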


Clause Restructuring

[Figure: parse-tree fragment with nodes S-MO, KOUS-CP damit (so-that), "...", VP-OC, VVINF-HD uebernehmen (adopt), PPER-SB Sie (you), VINF-HD koennen (can).]

Rule 3: Move subject. The subject is moved to directly precede the head of the clause.


Clause Restructuring

[Figure: parse-tree fragment with nodes S, PPER-SB Wir (we), NP-OA, ART das (the), NN Praesidium (presidency), PTKVZ-SVP auf (*PARTICLE*), VVINF-HD fordern (accept).]

Rule 4: Particles. In verb particle constructions, the particle is moved to precede the finite verb.


Clause Restructuring

[Figure: parse-tree fragment with nodes S, PPER-SB Wir (we), PTK-NEG nicht (not), VP-OC, VVINF-HD konnten (could), VVINF-HD einreichen (submit), "...", PPER-OA es (it).]

Rule 5: Infinitives. Infinitives are moved to directly follow the finite verb within a clause.
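A minimal sketch of Rule 5, under several assumptions that the slide does not state: the finite verb is taken to be the clause-level leaf whose label ends in `-HD`, the infinitive is taken to be the `-HD` leaf inside the first VP child, and a VP left empty by the move is dropped. The function name `move_infinitive`, the example clause, and its starting word order are illustrative; the node labels follow the slide figure.

```python
def is_leaf(node):
    """A leaf is (label, word), with the word stored as a plain string."""
    return len(node) == 2 and isinstance(node[1], str)


def move_infinitive(clause):
    """Pull the infinitive out of a VP child and place it right after the finite verb."""
    label, *children = clause
    fin = next((i for i, c in enumerate(children)
                if is_leaf(c) and c[0].endswith("-HD")), None)
    vp = next((i for i, c in enumerate(children)
               if not is_leaf(c) and c[0].startswith("VP")), None)
    if fin is None or vp is None:
        return clause
    vp_label, *vp_children = children[vp]
    inf = next((i for i, c in enumerate(vp_children)
                if is_leaf(c) and c[0].endswith("-HD")), None)
    if inf is None:
        return clause
    infinitive = vp_children.pop(inf)
    out = []
    for i, child in enumerate(children):
        if i == vp:
            if vp_children:  # keep the VP only if something is left inside it
                out.append((vp_label, *vp_children))
            continue
        out.append(child)
        if i == fin:
            out.append(infinitive)  # infinitive directly follows the finite verb
    return (label, *out)


clause = ("S",
          ("PPER-SB", "Wir"),
          ("VVINF-HD", "konnten"),
          ("PTK-NEG", "nicht"),
          ("VP-OC", ("VVINF-HD", "einreichen"), ("PPER-OA", "es")))

print(move_infinitive(clause))
```

After the move the clause reads `Wir konnten einreichen nicht es`, so the negation still sits after the infinitive; that is the configuration a later negation rule would need to repair.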


Clause Restructuring

[Figure: parse-tree fragment with nodes S, PPER-SB Wir (we), VVINF-HD konnten (could), VVINF-HD einreichen (submit), PTK-NEG nicht (not), VP-OC, "...", PPER-OA es (it).]

Rule 6: Negation. The negative particle is moved to directly follow the finite verb.