
  • 6.881: Natural Language Processing

    Machine Translation II

    Philipp Koehn

    CS and AI Lab

    MIT

    November 23, 2004

  • Outline

    ◦ Lecture I

    – Introduction to Machine Translation

    – Principles of Statistical MT

    – Word-Based Models

    – Phrase-Based Models

    ◦ Lecture II

    – Beam Search Decoding

    – Evaluation

    – The Challenge of Syntax

  • Phrase-Based Translation

    Morgen fliege ich nach Kanada zur Konferenz

    Tomorrow I will fly to the conference in Canada

    ◦ Foreign input is segmented into phrases

    – any sequence of words, not necessarily linguistically motivated

    ◦ Each phrase is translated into English

    ◦ Phrases are reordered
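
    For reference, the phrase-based model from Lecture I picks the English sentence that maximizes, schematically and with feature weights omitted,

      \[
      \hat{e} \;=\; \arg\max_{e} \; \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i)\,
      d(\mathrm{start}_i - \mathrm{end}_{i-1} - 1) \;\times\; p_{\mathrm{LM}}(e)
      \]

    where φ is the phrase translation table, d the distance-based reordering (distortion) model and p_LM the language model; this product is what the decoder in the rest of this lecture tries to maximize.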

  • Decoding Algorithm

    ◦ Goal of the decoding algorithm:

    Put models to work, perform the actual translation

  • Greedy Decoder

    [Figure: greedy decoding example for "Maria no daba una bofetada a la bruja verde". Starting from the gloss "Mary no give a slap to the witch green", the operations SWAP, ERASE, CHANGE, INSERT and JOIN lead through "Mary no give a slap to the green witch", "Mary no give a slap the green witch", "Mary not give a slap the green witch" and "Mary did not give a slap the green witch" to "Mary did not slap the green witch"]

    ◦ Greedy Hill-climbing [Germann, 2003]

    – start with gloss

    – improve probability with actions

    – use 2-step look-ahead to avoid some local minima
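
    A minimal sketch of this hill-climbing loop; score() and neighbors() are hypothetical stand-ins for the model-based scoring and for the CHANGE/INSERT/ERASE/SWAP/JOIN operations of the actual decoder, and the 2-step look-ahead is left out:

      def greedy_decode(foreign, gloss, neighbors, score):
          """Hill-climb from the word-by-word gloss, moving to the best-scoring
          neighbor until no single operation improves the translation."""
          current, current_score = gloss, score(foreign, gloss)
          while True:
              best, best_score = None, current_score
              for candidate in neighbors(current):   # CHANGE, INSERT, ERASE, SWAP, JOIN
                  s = score(foreign, candidate)
                  if s > best_score:
                      best, best_score = candidate, s
              if best is None:                       # local maximum reached
                  return current
              current, current_score = best, best_score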

  • Beam-Search Decoding Process

    [Figure: the input sentence "Maria no dio una bofetada a la bruja verde"; no words have been translated yet]

    ◦ Build translation left to right

    – select foreign words to be translated

  • Beam-Search Decoding Process

    [Figure: the foreign word "Maria" is selected and translated; the partial English translation is "Mary"]

    ◦ Build translation left to right

    – select foreign words to be translated

    – find English phrase translation

    – add English phrase to end of partial translation

  • Beam-Search Decoding Process

    [Figure: "Maria" is now marked as translated; the partial translation is still "Mary"]

    ◦ Build translation left to right

    – select foreign words to be translated

    – find English phrase translation

    – add English phrase to end of partial translation

    – mark foreign words as translated

  • Beam-Search Decoding Process

    [Figure: "no" is translated as "did not"; the partial translation is "Mary did not"]

    ◦ One-to-many translation

  • Beam-Search Decoding Process

    [Figure: "dio una bofetada" is translated as "slap"; the partial translation is "Mary did not slap"]

    ◦ Many-to-one translation

  • Beam-Search Decoding Process

    [Figure: "a la" is translated as "the"; the partial translation is "Mary did not slap the"]

    ◦ Many-to-one translation

  • Beam-Search Decoding Process

    [Figure: "verde" is translated out of order as "green"; the partial translation is "Mary did not slap the green"]

    ◦ Reordering

  • Beam-Search Decoding Process

    [Figure: "bruja" is translated as "witch"; all foreign words are covered and the translation reads "Mary did not slap the green witch"]

    ◦ Translation finished

  • Translation Options

    [Figure: the input "Maria no dio una bofetada a la bruja verde" with its phrase translation options, e.g. "Maria" → "Mary"; "no" → "not" / "did not"; "dio una bofetada" → "give a slap" / "slap"; "a la" → "to the" / "to" / "the"; "bruja" → "witch" / "the witch"; "verde" → "green"; "bruja verde" → "green witch"]

    ◦ Look up possible phrase translations

    – many different ways to segment words into phrases

    – many different ways to translate each phrase
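
    A sketch of this lookup, assuming a hypothetical phrase_table dict that maps a foreign phrase to a list of (English phrase, probability) pairs:

      def translation_options(foreign_words, phrase_table, max_len=7):
          """Enumerate all phrase-table entries for every contiguous span of the input."""
          options = []   # entries (start, end, english, prob), end exclusive
          n = len(foreign_words)
          for start in range(n):
              for end in range(start + 1, min(start + max_len, n) + 1):
                  phrase = " ".join(foreign_words[start:end])
                  for english, prob in phrase_table.get(phrase, []):
                      options.append((start, end, english, prob))
          return options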

  • Hypothesis Expansion

    [Figure: the translation options as before; the search starts with a single empty hypothesis  e: (none)  f: ---------  p: 1]

    ◦ Start with null hypothesis

    – e: no English words

    – f: no foreign words covered

    – p: probability 1
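
    One way to represent such hypotheses in code (field names are illustrative, not Pharaoh's):

      from dataclasses import dataclass
      from typing import Optional, Tuple

      @dataclass(frozen=True)
      class Hypothesis:
          last_words: Tuple[str, ...]          # last English words, needed by the LM
          coverage: Tuple[bool, ...]           # which foreign words are translated
          prob: float                          # model probability so far
          phrase: str = ""                     # English phrase added in this step
          back: Optional["Hypothesis"] = None  # pointer to the previous hypothesis

      def empty_hypothesis(n_foreign):
          """The null hypothesis: no English words, no foreign words covered, p = 1."""
          return Hypothesis(last_words=(), coverage=(False,) * n_foreign, prob=1.0)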

  • Hypothesis Expansion

    [Figure: the empty hypothesis is expanded with the option "Maria" → "Mary", creating the hypothesis  e: Mary  f: *--------  p: .534]

    ◦ Pick translation option

    ◦ Create hypothesis

    – e: add English phrase Mary

    – f: first foreign word covered

    – p: probability 0.534

  • Hypothesis Expansion

    [Figure: a second hypothesis is created from the empty hypothesis using the option "bruja" → "witch":  e: witch  f: -------*-  p: .182]

    ◦ Add another hypothesis

  • Hypothesis Expansion

    [Figure: further expansions, e.g. the hypothesis  e: ... slap  f: *-***----  p: .043, which covers "Maria" and "dio una bofetada" but not "no"]

    ◦ Further hypothesis expansion

  • Hypothesis Expansion

    [Figure: expansion continues until all foreign words are covered; the best complete hypothesis  e: green witch  f: *********  p: .000271  is reached via the chain (none) → Mary (.534) → did not (.154) → slap (.015) → the (.004283) → green witch (.000271)]

    ◦ ... until all foreign words covered

    – find best hypothesis that covers all foreign words

    – backtrack to read off translation
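
    Putting the pieces together, a simplified stack-decoding loop in the spirit of Pharaoh; options comes from a translation-options lookup as above, score_extension() is a hypothetical function combining translation, language and reordering model scores, recombination is included in simplified form, and pruning and future-cost estimation (discussed next) are left out:

      def decode(foreign_words, options, score_extension):
          """options: list of (start, end, english, prob) spans;
          score_extension(hyp, option) returns the probability factor for extending hyp."""
          n = len(foreign_words)
          stacks = [dict() for _ in range(n + 1)]      # one stack per number of covered words
          empty = ((), (False,) * n, 1.0, "", None)    # (last_words, coverage, prob, phrase, back)
          stacks[0][(empty[0], empty[1])] = empty
          for covered in range(n):
              for hyp in list(stacks[covered].values()):
                  last_words, coverage, prob, _, _ = hyp
                  for (start, end, english, p) in options:
                      if any(coverage[start:end]):     # span already translated
                          continue
                      new_cov = tuple(c or (start <= i < end) for i, c in enumerate(coverage))
                      new_prob = prob * score_extension(hyp, (start, end, english, p))
                      new_last = tuple((list(last_words) + english.split())[-2:])
                      key = (new_last, new_cov)
                      stack = stacks[sum(new_cov)]
                      if key not in stack or stack[key][2] < new_prob:   # recombination
                          stack[key] = (new_last, new_cov, new_prob, english, hyp)
          best = max(stacks[n].values(), key=lambda h: h[2])   # best complete hypothesis
          phrases = []                                         # backtrack to read off translation
          while best is not None:
              if best[3]:
                  phrases.append(best[3])
              best = best[4]
          return " ".join(reversed(phrases))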

  • Hypothesis Expansion

    [Figure: the same search, now showing the many competing hypotheses that are created along the way]

    ◦ Adding more hypotheses

    ⇒ Explosion of search space

  • Explosion of Search Space

    ◦ Number of hypotheses is exponential with respect to sentence length

    ⇒ Decoding is NP-complete [Knight, 1999]

    ⇒ Need to reduce search space

    – risk-free: hypothesis recombination

    – risky: histogram/threshold pruning

  • Hypothesis Recombination

    [Figure: two search paths arrive at the same partial translation "Mary did not give": one via "Mary" (p=0.534) followed by "did not give", the other via "Mary did not" (p=0.164) followed by "give"; the resulting hypotheses have p=0.092 and p=0.044]

    ◦ Different paths to the same partial translation

  • Hypothesis Recombination

    [Figure: the same two paths after recombination; the weaker hypothesis (p=0.044) has been dropped and only the one with p=0.092 is kept]

    ◦ Different paths to the same partial translation

    ⇒ Combine paths

    – drop weaker hypothesis

    – keep pointer from worse path

  • Hypothesis Recombination

    [Figure: a third path starts with "Joe" (p=0.092) followed by "did not give" (p=0.017), reaching the partial translation "Joe did not give"]

    ◦ Recombined hypotheses do not have to match completely

    ◦ No matter what is added, the weaker path can be dropped if:

    – the last two English words match (matters for the language model)

    – the foreign word coverage vectors match (affects future paths)
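
    In code, recombination amounts to keying each stack on exactly this information; a sketch with illustrative, dictionary-based hypotheses:

      def recombination_key(hyp, lm_order=3):
          """Two hypotheses may be recombined if this key matches: the foreign coverage
          vector plus the last n-1 English words (n-1 = 2 matches the slide's
          'last two English words', i.e. a trigram LM)."""
          return (hyp["coverage"], tuple(hyp["english"][-(lm_order - 1):]))

      def add_to_stack(stack, hyp, lm_order=3):
          """Keep only the better of two recombinable hypotheses; a real decoder would
          keep a pointer from the dropped one for lattice and n-best extraction."""
          key = recombination_key(hyp, lm_order)
          if key not in stack or stack[key]["prob"] < hyp["prob"]:
              stack[key] = hyp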

  • Hypothesis Recombination

    [Figure: the paths ending in "did not give" are recombined; the weaker hypotheses are dropped and replaced by pointers to the surviving one]

    ◦ Recombined hypotheses do not have to match completely

    ◦ No matter what is added, the weaker path can be dropped if:

    – the last two English words match (matters for the language model)

    – the foreign word coverage vectors match (affects future paths)

    ⇒ Combine paths

  • Pruning

    ◦ Hypothesis recombination is not sufficient

    ⇒ Heuristically discard weak hypotheses

    ◦ Organize hypotheses in stacks, e.g. by

    – same foreign words covered

    – same number of foreign words covered (Pharaoh does this)

    – same number of English words produced

    ◦ Compare hypotheses in stacks, discard bad ones

    – histogram pruning: keep the top n hypotheses in each stack (e.g., n = 100)

    – threshold pruning: keep hypotheses that are at most α times the cost of the best hypothesis in the stack (e.g., α = 0.001)
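
    A sketch of both pruning strategies applied to one stack, with hypotheses as dicts carrying a "prob" entry (illustrative); following the slide, a hypothesis is kept only if its score is within a factor α of the best score in the stack:

      def prune(stack, histogram_limit=100, threshold=0.001):
          """Keep at most histogram_limit hypotheses, and only those whose score
          is within a factor threshold of the best hypothesis in the stack."""
          hyps = sorted(stack, key=lambda h: h["prob"], reverse=True)
          if not hyps:
              return []
          best = hyps[0]["prob"]
          kept = [h for h in hyps if h["prob"] >= threshold * best]
          return kept[:histogram_limit]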

  • Comparing Hypotheses

    ◦ Comparing hypotheses with the same number of foreign words covered

    [Figure: hypothesis  e: Mary did not  f: **-------  p: 0.154  covers "Maria no" and is the better partial translation; hypothesis  e: the  f: -----**--  p: 0.354  covers the easier part "a la" and therefore has the lower cost so far]

    ◦ Hypothesis that covers easy part of sentence is preferred

    ⇒ Need to consider future cost

  • Future Cost Estimation

    ◦ Estimate cost to translate remaining part of input

    ◦ Step 1: find cheapest translation options

    – find cheapest translation option for each input span

    – compute translation model cost

    – estimate language model cost (no prior context)

    – ignore reordering model cost

    ◦ Step 2: compute cheapest cost

    – for each contiguous span: find cheapest sequence of translation options

    ◦ Precompute and look up

    – precompute future cost for each contiguous span

    – future cost for any coverage vector: sum of costs of each contiguous span of uncovered words

    ⇒ no expensive computation during run time
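
    A sketch of the precomputation, assuming a hypothetical cheapest_option_cost(i, j) that returns the best translation-model plus language-model cost (a negative log probability) for span i..j-1, or infinity if the phrase table has no entry for it:

      import math

      def future_cost_table(n, cheapest_option_cost):
          """cost[i][j]: estimated cost of translating foreign words i..j-1,
          combining the cheapest options over all split points."""
          cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
          for length in range(1, n + 1):
              for i in range(n - length + 1):
                  j = i + length
                  cost[i][j] = cheapest_option_cost(i, j)
                  for k in range(i + 1, j):             # or split the span in two
                      cost[i][j] = min(cost[i][j], cost[i][k] + cost[k][j])
          return cost

      def future_cost(coverage, cost):
          """Future cost of a coverage vector: sum over contiguous uncovered spans."""
          total, i, n = 0.0, 0, len(coverage)
          while i < n:
              if coverage[i]:
                  i += 1
                  continue
              j = i
              while j < n and not coverage[j]:
                  j += 1
              total += cost[i][j]
              i = j
          return total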

  • Word Lattice Generation

    [Figure: the recombination search graph from before, with paths through "Mary", "Mary did not" and "Joe"]

    ◦ Search graph can be easily converted into a word lattice

    – can be further mined for n-best lists (sketched below)

    ⇒ enables reranking approaches

    ⇒ enables discriminative training

    [Figure: the resulting word lattice over "Mary" / "Joe", "did not", "give" and "did not give"]
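
    A sketch of mining such a lattice for an n-best list, with the lattice represented as a DAG mapping a node to its outgoing (next node, word, log probability) edges; node names and scores below are made up:

      import heapq

      def n_best(lattice, start, goal, n=10):
          """Best-first search over a word lattice; returns up to n complete
          paths ordered by total log probability (higher is better)."""
          heap = [(0.0, 0, start, [])]     # (cost so far, tie-breaker, node, words)
          counter, results = 1, []
          while heap and len(results) < n:
              cost, _, node, words = heapq.heappop(heap)
              if node == goal:
                  results.append((-cost, " ".join(words)))
                  continue
              for nxt, word, logprob in lattice.get(node, []):
                  heapq.heappush(heap, (cost - logprob, counter, nxt, words + [word]))
                  counter += 1
          return results

      lattice = {
          "s": [("a", "Mary", -0.6), ("a", "Joe", -2.4), ("b", "Mary did not", -1.8)],
          "a": [("b", "did not", -1.1), ("g", "did not give", -1.5)],
          "b": [("g", "give", -0.9)],
      }
      print(n_best(lattice, "s", "g", n=3))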

  • Evaluation

    ◦ Manual Evaluation

    – humans judge the output

    – expensive

    ◦ Automatic Evaluation

    – machines judge the output

    – fast

    – reliable?

    ◦ Task-Oriented Evaluation

    – humans do task with MT

    – tests usefulness of MT

  • Manual Evaluation

    ◦ Correct yes/no

    – simple

    – longer sentences almost always have at least one error

    ◦ Correct on a scale

    – 0 = bad, 5 = perfect

    – disagreement between judges

    ◦ More detailed judgments

    – adequacy: how well is meaning preserved?

    – fluency: is it good English?

    – ...

  • Manual Evaluation

    ◦ Give grade from 0 = bad to 5 = perfect

    – In the First Two Months Guangdong’s Export of High-Tech Products 3.76 Billion US Dollars

    – The Guangdong provincial foreign trade and economic growth has made important contributions.

    – Suicide explosion in Jerusalem

    ◦ Agreement

  • Automatic Evaluation

    ◦ Why automatic evaluation metrics?

    – manual evaluation is too slow

    – evaluation on large test sets reveals minor improvements

    – automatic tuning to improve machine translation performance

    ◦ History

    – Word Error Rate

    – BLEU since 2002

    – BLEU in short: overlap with reference translations

  • Bi-Lingual Evaluation Understudy (BLEU)
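
    BLEU (Papineni et al., 2002) measures overlap with reference translations as the geometric mean of modified n-gram precisions, multiplied by a brevity penalty for translations that are too short. A minimal single-reference sketch:

      import math
      from collections import Counter

      def ngrams(words, n):
          return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

      def bleu(candidate, reference, max_n=4):
          """Single-reference BLEU: geometric mean of modified 1..4-gram
          precisions times a brevity penalty."""
          cand, ref = candidate.split(), reference.split()
          log_prec = 0.0
          for n in range(1, max_n + 1):
              cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
              clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
              total = max(sum(cand_counts.values()), 1)
              if clipped == 0:
                  return 0.0                   # real implementations smooth instead
              log_prec += math.log(clipped / total) / max_n
          bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
          return bp * math.exp(log_prec)

      print(bleu("Mary did not slap the green witch",
                 "Mary did not slap the green witch"))   # 1.0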

  • Correlation with Manual Evaluation

    ◦ Correlates with human evaluation (adequacy, fluency)

  • The Challenge of Syntax

    [Figure: the translation pyramid, with the levels foreign words, foreign syntax, foreign semantics, interlingua, English semantics, English syntax and English words]

    ◦ Remember the pyramid

  • Advantages of Syntax-Based Translation

    ◦ Reordering for syntactic reasons

    – e.g., move German object to end of sentence

    ◦ Better explanation for function words

    – e.g., prepositions, determiners

    ◦ Conditioning on syntactically related words

    – translation of verb may depend on subject or object

    ◦ Use of syntactic language models

  • Inversion Transduction Grammars

    ◦ Generation of both English and foreign trees [Wu, 1997]

    ◦ Rules (binary and unary)

    – A → [A1 A2]  (straight: children keep the same order in both languages)

    – A → ⟨A1 A2⟩  (inverted: children are swapped on the foreign side)

    – A → e / f  (word translation)

    – A → e / ε  (English word with no foreign counterpart)

    – A → ε / f  (foreign word with no English counterpart)

    ⇒ Common binary tree required

    – limits the complexity of reorderings
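
    A toy sketch of what the straight and inverted binary rules do to word order, assuming each constituent carries its (English, foreign) string pair:

      def combine(rule, left, right):
          """left and right are (english, foreign) pairs for the two child constituents.
          A straight rule keeps the same order on both sides; an inverted rule
          swaps the children on the foreign side only."""
          (e1, f1), (e2, f2) = left, right
          if rule == "straight":     # A -> [A1 A2]
              return (e1 + " " + e2, f1 + " " + f2)
          if rule == "inverted":     # A -> <A1 A2>
              return (e1 + " " + e2, f2 + " " + f1)
          raise ValueError(rule)

      # "green witch" pairs with "bruja verde" via an inverted rule:
      print(combine("inverted", ("green", "verde"), ("witch", "bruja")))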

  • Syntax Trees

    Mary did not slap the green witch

    ◦ English binary tree

  • Syntax Trees (2)

    Maria no daba una bofetada a la bruja verde

    ◦ Spanish binary tree

  • Syntax Trees (3)

    [Figure: combined binary tree with aligned leaves: Mary/Maria, did/*, not/no, slap/daba, */una, */bofetada, */a, the/la, green/verde, witch/bruja]

    ◦ Combined tree with reordering of Spanish

    ◦ Can such trees be learned from data?

    ◦ Do common trees exist with real syntax on both sides?

  • Dependency Structure

    [Figure: common dependency tree with aligned nodes: gekauft/bought, hatte/had, auto/car, das/the, ich/I]

    ◦ Common dependency tree

    ◦ Interest in dependency-based translation models

  • String to Tree Translation

    [Figure: the translation pyramid again, with levels from foreign words up to interlingua and down to English words]

    ◦ Use of English syntax trees [Yamada and Knight, 2001]

    – exploit rich resources on the English side

    – obtained with statistical parser [Collins, 1997]

    – flattened tree to allow more reorderings

    – works well with syntactic language model

  • Yamada and Knight [2001]

    [Figure: an English parse tree for "he adores listening to music" is transformed step by step: reorder the children of each node, insert Japanese particles (ha, no, ga, desu), translate the leaf words, and take the leaves, yielding "Kare ha ongaku wo kiku no ga daisuki desu"]

  • Syntactic Language Model

    ◦ Good syntax tree ⇒ good English

    ◦ Allows for long distance constraints

    [Figure: two candidate translations with their parse trees: "the house of the man is good", parsed as S → NP VP with NP → NP PP, versus "the house is the man is good", marked as ill-formed]

    ◦ Left translation preferred by syntactic LM

  • String to Tree Transfer and Syntactic LM

    ◦ Work presented at this MT Summit by Charniak, Knight, and Yamada

    – more grammatically correct output

    – more perfectly translated sentences

    – ... but no improvement in BLEU

    ◦ Syntactic transfer and LM on top of phrase translation

    – parse a lattice generated by phrase-based MT

    – no results yet

  • Augment Models with Syntactic Features

    ◦ Intuition: other models work fine, syntax provides additional clues

    ◦ Define syntactic properties that should hold

    – preservation of plural

    – output should have a verb

    – no dropping of content words

    – ...

    ◦ 2003 summer workshop at Johns Hopkins: little improvement

  • Clause Structure

    [Figure: German parse tree, with the main clause and the subordinate clause marked: S PPER-SB Ich VAFIN-HD werde VP-OC PPER-DA Ihnen NP-OA ART-OA die ADJ-NK entsprechenden NN-NK Anmerkungen VVFIN aushaendigen $, , S-MO KOUS-CP damit PPER-SB Sie VP-OC PDS-OA das ADJD-MO eventuell PP-MO APRD-MO bei ART-DA der NN-NK Abstimmung VVINF uebernehmen VMFIN koennen $. .]

    Gloss: I will you the corresponding comments pass on , so that you that perhaps in the vote include can .

    ◦ Syntax tree from German parser

    – statistical parser by Amit Dubey, trained on the TIGER treebank

  • Reordering When Translating

    [Figure: the flattened German parse: S PPER-SB Ich VAFIN-HD werde PPER-DA Ihnen NP-OA ART-OA die ADJ-NK entsprechenden NN-NK Anmerkungen VVFIN aushaendigen $, , S-MO KOUS-CP damit PPER-SB Sie PDS-OA das ADJD-MO eventuell PP-MO APRD-MO bei ART-DA der NN-NK Abstimmung VVINF uebernehmen VMFIN koennen $. .]

    I will you the corresponding comments pass on, so that you that perhaps in the vote include can.

    ◦ Reordering when translating into English

    – tree is flattened

    – clause-level constituents line up

  • Clause Level Reordering

    [Figure: the flattened German parse as before, with each clause-level constituent labeled with its position in the English word order]

    I will you the corresponding comments pass on, so that you that perhaps in the vote include can.


    ◦ Clause-level reordering is a well-defined task

    – label German constituents with their English order

    – done for 300 sentences, two annotators, high agreement

  • Systematic Reordering German → English

    ◦ Many types of reorderings are systematic

    – move verb group together

    – subject - verb - object

    – move negation in front of verb

    ⇒ Write rules by hand (a toy example follows the results table below)

    – apply rules to test and training data

    – train standard phrase-based SMT system

    System               BLEU
    baseline system      25.2%
    with manual rules    26.8%
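
    A toy illustration of one such hand-written rule on a flattened clause, using constituent labels like those in the parse above; the rule and the example are illustrative only, not the actual rules behind the 26.8% system:

      def move_final_verb(clause):
          """clause: list of (label, words) constituents of one flattened clause.
          If a verb (VVFIN/VVINF) sits at the clause end, move it directly after
          the finite verb (VAFIN-HD/VMFIN) to approximate English word order."""
          labels = [label for label, _ in clause]
          finite = next((i for i, l in enumerate(labels) if l in ("VAFIN-HD", "VMFIN")), None)
          if finite is not None and labels[-1] in ("VVFIN", "VVINF"):
              verb = clause.pop()               # take the clause-final verb
              clause.insert(finite + 1, verb)   # place it after the finite verb
          return clause

      clause = [("PPER-SB", "Ich"), ("VAFIN-HD", "werde"), ("PPER-DA", "Ihnen"),
                ("NP-OA", "die entsprechenden Anmerkungen"), ("VVFIN", "aushaendigen")]
      print([words for _, words in move_final_verb(clause)])
      # ['Ich', 'werde', 'aushaendigen', 'Ihnen', 'die entsprechenden Anmerkungen']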

  • Integration

    [Figure: pipeline  f → transformation → f’ → phrase-based statistical MT → e]

    ◦ Transform f into f’ with our methods

    ◦ Translate n-best restructurings with phrase-based MT

    – uses both transformation score and translation/language model score

    – if no restructuring ⇒ baseline performance

    ◦ Transformation does not need to be perfect

    – phrase-based model may still reorder

  • Improved Translations

    ◦ we must also this criticism should be taken seriously .

      we must also take this criticism seriously .

    ◦ i am with him that it is necessary , the institutional balance by means of a political revaluation of both the commission and the council to maintain .

      i agree with him in this , that it is necessary to maintain the institutional balance by means of a political revaluation of both the commission and the council .

    ◦ thirdly , we believe that the principle of differentiation of negotiations note .

      thirdly , we maintain the principle of differentiation of negotiations .

    ◦ perhaps it would be a constructive dialog between the government and opposition parties , social representative a positive impetus in the right direction .

      perhaps a constructive dialog between government and opposition parties and social representative could give a positive impetus in the right direction .
