Towards Interactive and Automatic Refinement of Translation Rules Ariadna Font Llitjós PhD Thesis Proposal Jaime Carbonell (advisor) Alon Lavie (co-advisor) Lori Levin Bonnie Dorr (Univ. Maryland) 5 November 2004


Page 1: Towards Interactive and Automatic Refinement of Translation Rules

Towards Interactive and Automatic Refinement of

Translation Rules

Ariadna Font Llitjós PhD Thesis Proposal

Jaime Carbonell (advisor)

Alon Lavie (co-advisor)

Lori Levin

Bonnie Dorr (Univ. Maryland)

5 November 2004

Page 2: Towards Interactive and Automatic Refinement of Translation Rules

Interactive and Automatic Rule Refinement 2

Outline

• Introduction

• Thesis statement and scope

• Preliminary Research
  – Interactive elicitation of error information
  – A framework for automatic rule adaptation

• Proposed Research

• Contributions and Thesis Timeline

Page 3: Towards Interactive and Automatic Refinement of Translation Rules

Machine Translation (MT)

• Source Language (SL) sentence: Gaudi was a great artist

In Spanish, it translates as: Gaudi era un gran artista

• MT System outputs:
  *Gaudi estaba un artista grande
  *Gaudi era un artista grande

Page 4: Towards Interactive and Automatic Refinement of Translation Rules

Spanish Adjectives

General order: grande -> big in size
NP [DET ADJ N] -> NP [DET N ADJ]
a big house -> una casa grande

Exception: gran -> exceptional
NP [DET ADJ N] -> NP [DET ADJ N]
a great artist -> un gran artista

Page 5: Towards Interactive and Automatic Refinement of Translation Rules

Commercial and Online Systems

Correct Translation: Gaudi era un gran artista

• Systran, Babelfish (Altavista), WorldLingo, Translated.net: *Gaudi era gran artista
• ImTranslation: *El Gaudi era un gran artista
• 1-800-Translate: *Gaudi era un fenomenal artista

Page 6: Towards Interactive and Automatic Refinement of Translation Rules

Post-editing

• Current solutions:
  – Manual post-editing [Allen, 2003]
  – Automated post-editing module (APE) [Allen & Hogan, 2000]

Page 7: Towards Interactive and Automatic Refinement of Translation Rules

Drawbacks of Current Methods

• Manual post-editing: corrections do not generalize
  – Gaudi era un artista grande
  – Juan es un amigo grande (Juan is a great friend)
  – Era una oportunidad grande (It is a great opportunity)
• APE: humans need to predict all the errors ahead of time and hand-code the post-editing rules (and again for each new error)

Page 8: Towards Interactive and Automatic Refinement of Translation Rules

My Solution

• Automate post-editing efforts by feeding them back into the MT system.
• Possible alternatives:
  – Automatic learning of post-editing rules
    + system independent
    - several thousand sentences might need to be corrected for the same error
  – Automatic refinement of translation rules
    + attacks the core of the problem
    - only for transfer-based MT systems (need rules to fix!)

Page 9: Towards Interactive and Automatic Refinement of Translation Rules

Related Work

[Diagram: related work grouped into three areas: Machine Translation, Post-editing, Rule Adaptation, with My Thesis placed among them]

Cited work: [Corston-Oliver & Gammon, 2003], [Imamura et al. 2003], [Menezes & Richardson, 2001], [Callison-Burch, 2004], [Su et al. 1995], [Brill, 1993], [Gavaldà, 2000]

My Thesis:
• No pre-existing training data required
• No human reference translations required
• Uses non-expert user feedback

Page 10: Towards Interactive and Automatic Refinement of Translation Rules

Resource-poor Scenarios (AVENUE)

- Lack of electronic parallel data
- Lack of manual grammar (or very small initial grammar)

Need to validate elicitation corpus and automatically learned translation rules

Why bother?
- Indigenous communities have difficulty accessing crucial information that directly affects their lives (such as land laws, plagues, health warnings, etc.)
- Preservation of their language and culture

Languages: Mapudungun, Quechua, Aymara

Page 11: Towards Interactive and Automatic Refinement of Translation Rules


How is MT possible for resource-poor languages?

Bilingual speakers

Page 12: Towards Interactive and Automatic Refinement of Translation Rules

AVENUE Project Overview

[Diagram: AVENUE architecture. Stages: Elicitation, Morphology, Rule Learning, Run-Time System. Components: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Morphological analyzer, Learning Module, Handcrafted rules, Transfer Rules, Lexical Resources, Run Time Transfer System, Lattice]

Page 13: Towards Interactive and Automatic Refinement of Translation Rules

My Thesis

[Diagram: the AVENUE architecture with a Rule Refinement stage. Stages: Elicitation, Morphology, Rule Learning, Run-Time System, Rule Refinement. Components: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Morphological analyzer, Learning Module, Handcrafted rules, Transfer Rules, Lexical Resources, Run Time Transfer System, Lattice, Translation Correction Tool, Rule Refinement Module]

Page 14: Towards Interactive and Automatic Refinement of Translation Rules

Recycle corrections of Machine Translation output back into the system

by refining and expanding existing translation rules

Page 15: Towards Interactive and Automatic Refinement of Translation Rules

Thesis Statement

- Given a rule-based Transfer MT system, we can extract useful information from non-expert bilingual speakers about the corrections required to make MT output acceptable.

- We can automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information, to improve coverage and overall MT quality.

Page 16: Towards Interactive and Automatic Refinement of Translation Rules


Assumptions

• No parallel training data available

• No human reference translations available

• The SL sentence needs to be fully parsed by the translation grammar.

• Bilingual speakers can give enough information about the MT errors.

Page 17: Towards Interactive and Automatic Refinement of Translation Rules


Scope

Types of errors that:

• Focus 1: can be refined fully automatically just by using correction information.

• Focus 2: can be refined fully automatically using correction and error information.

• Focus 3: require a reasonable amount of further user interaction and can be solved by available correction and error information.

Page 18: Towards Interactive and Automatic Refinement of Translation Rules

Technical Challenges

• Elicit minimal MT information from non-expert users
• Refine and expand translation rules minimally (both manually written and automatically learned)
• Automatic evaluation of the refinement process

Page 19: Towards Interactive and Automatic Refinement of Translation Rules

Preliminary Work

• Interactive elicitation of error information

• A framework for automatic rule adaptation

Page 20: Towards Interactive and Automatic Refinement of Translation Rules

Interactive Elicitation of MT Errors

Goal:
• Simplify MT correction task maximally

Challenges:
• Find appropriate level of granularity for MT error classification
• Design a user-friendly graphical user interface with:
  – SL sentence (e.g. I see them)
  – TL sentence (e.g. Yo veo los)
  – word-to-word alignments (I-yo, see-veo, them-los)
  – (context)

Page 21: Towards Interactive and Automatic Refinement of Translation Rules

MT Error Typology for RR (simplified)

• Missing word
• Extra word
• Wrong word order
  – Local vs. long distance
  – Word vs. phrase
  – + Word change
• Incorrect word
  – Sense
  – Form
  – Selectional restrictions
  – Idiom
• Wrong agreement
  – Missing constraint
  – Extra constraint

Page 22: Towards Interactive and Automatic Refinement of Translation Rules

TCTool (Demo)

Actions:
• Add a word
• Delete a word
• Modify a word
• Change word order

Page 23: Towards Interactive and Automatic Refinement of Translation Rules

1st Eng2Spa User Study [LREC 2004]

• MT error classification: 9 linguistically-motivated classes: word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation

                      precision  recall  F1
error detection       90%        89%     89%
error classification  72%        71%     72%
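As a reminder of how the F1 column relates to the other two, here is a minimal sketch of the standard harmonic-mean formula. The function name `f1_score` is illustrative and not from the study; the reported figures were presumably computed from unrounded counts, so recomputing from the rounded percentages can differ in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Error detection row of the table: precision 90%, recall 89%.
detection_f1 = f1_score(0.90, 0.89)
print(f"error detection F1 = {detection_f1:.2f}")
```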

Page 24: Towards Interactive and Automatic Refinement of Translation Rules

Automatic Rule Refinement Framework

• Find best RR operations given a:
  – Grammar (G),
  – Lexicon (L),
  – (Set of) Source Language sentence(s) (SL),
  – (Set of) Target Language sentence(s) (TL),
  – Its Parse tree (P), and
  – Minimal correction of TL (TL’)
  such that TQ2 > TQ1
• Which can also be expressed as:
  max TQ(TL|TL’,P,SL,RR(G,L))
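The objective above can be read as a search over candidate refinements. The sketch below is only an illustration under stated assumptions: `best_refinement` and the `tq` scoring callback are hypothetical names, and how TQ is actually computed is not specified on the slide.

```python
from typing import Callable, Iterable, Optional, TypeVar

G = TypeVar("G")  # stands for a refined (grammar, lexicon) pair; representation left open


def best_refinement(
    current: G,
    candidates: Iterable[G],
    tq: Callable[[G], float],
) -> Optional[G]:
    """Return the candidate with the highest TQ, or None if none beats the current system."""
    best, best_score = None, tq(current)  # TQ1: quality before refinement
    for cand in candidates:
        score = tq(cand)  # TQ2 for this candidate
        if score > best_score:  # enforce TQ2 > TQ1
            best, best_score = cand, score
    return best


# Toy usage with made-up quality scores.
scores = {"G0": 0.50, "G1": 0.70, "G2": 0.60}
print(best_refinement("G0", ["G1", "G2"], lambda g: scores[g]))
```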

Page 25: Towards Interactive and Automatic Refinement of Translation Rules

Types of Refinement Operations

1. Refine a translation rule: R0 -> R1 (R0 modified, either made more specific or more general)

R0: NP [DET ADJ N] -> NP [DET N ADJ]
    a nice house -> una casa bonito

R1: NP [DET ADJ N] -> NP [DET N ADJ] + constraint: N gender = ADJ gender
    a nice house -> una casa bonita
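The effect of adding the agreement constraint can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two small dictionaries and the functions `np_r0` and `np_r1` are hypothetical and do not reflect the actual transfer engine or lexicon.

```python
from typing import Dict

# Hypothetical toy lexicon; entries are illustrative only.
NOUN_GENDER: Dict[str, str] = {"casa": "fem", "auto": "masc"}
ADJ_FORMS: Dict[str, Dict[str, str]] = {
    "bonito": {"masc": "bonito", "fem": "bonita"},
    "rojo": {"masc": "rojo", "fem": "roja"},
}


def np_r0(det: str, noun: str, adj_lemma: str) -> str:
    """R0: [DET N ADJ] with no agreement constraint; the masculine form is used by default."""
    return f"{det} {noun} {ADJ_FORMS[adj_lemma]['masc']}"


def np_r1(det: str, noun: str, adj_lemma: str) -> str:
    """R1: R0 plus the constraint N gender = ADJ gender."""
    return f"{det} {noun} {ADJ_FORMS[adj_lemma][NOUN_GENDER[noun]]}"


print(np_r0("una", "casa", "bonito"))  # una casa bonito
print(np_r1("una", "casa", "bonito"))  # una casa bonita
```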

Page 26: Towards Interactive and Automatic Refinement of Translation Rules

Types of Refinement Operations (2)

2. Bifurcate a translation rule: R0 -> R0 (same, general rule) + R1 (R0 modified, specific rule)

R0: NP [DET ADJ N] -> NP [DET N ADJ]
    a nice house -> una casa bonita

R1: NP [DET ADJ N] -> NP [DET ADJ N]
    a great artist -> un gran artista

Page 27: Towards Interactive and Automatic Refinement of Translation Rules

Formalizing Error Information

Wi = error
Wi’ = correction
Wc = clue word

• Example:
  SL: the red car
  TL: *el auto roja
  TL’: el auto rojo
  Wi = roja, Wi’ = rojo, Wc = auto (Wi’ and Wc need to agree)

Page 28: Towards Interactive and Automatic Refinement of Translation Rules

Triggering Feature Detection

Comparison at the feature level to detect triggering feature(s)

Delta function: δ(Wi, Wi’)

Examples:
δ(rojo, roja) = {gender}
δ(comiamos, comia) = {person, number}
δ(mujer, guitarra) = {}

If the set is empty, need to postulate a new binary feature
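A minimal sketch of such a delta function follows, assuming each word maps to a flat set of feature-value pairs. The `FEATURES` table and its values are hypothetical stand-ins for what a lexicon or morphological analyzer would supply.

```python
from typing import Dict, Set

# Hypothetical feature table; values are illustrative only.
FEATURES: Dict[str, Dict[str, str]] = {
    "rojo": {"pos": "adj", "gender": "masc", "number": "sg"},
    "roja": {"pos": "adj", "gender": "fem", "number": "sg"},
    "comiamos": {"pos": "v", "person": "1", "number": "pl"},
    "comia": {"pos": "v", "person": "3", "number": "sg"},
    "mujer": {"pos": "n", "gender": "fem", "number": "sg"},
    "guitarra": {"pos": "n", "gender": "fem", "number": "sg"},
}


def delta(w1: str, w2: str) -> Set[str]:
    """Return the names of the features on which the two words differ."""
    f1, f2 = FEATURES[w1], FEATURES[w2]
    return {name for name in f1.keys() | f2.keys() if f1.get(name) != f2.get(name)}


print(delta("rojo", "roja"))
print(delta("comiamos", "comia"))
print(delta("mujer", "guitarra"))  # empty set: a new binary feature would be postulated
```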

Page 29: Towards Interactive and Automatic Refinement of Translation Rules

Deciding on the Refinement Op

Given:
- Action performed by the user (add, delete, modify, change word order), and
- Error information available (clue word, word alignments, etc.)

-> Refinement Action

Page 30: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations

[Figure: decision tree mapping user actions (Modify, Add, Delete, Change W Order) and available error information (+Wc / –Wc, +al / –al, Wi = Wi’ vs. Wi ≠ Wi’, POSi = POSi’ vs. POSi ≠ POSi’) to refinement operations (+rule, –rule, RuleLearner)]

Page 31: Towards Interactive and Automatic Refinement of Translation Rules

Proposed Work

- Batch and Interactive mode

- User studies

- Evaluation

Page 32: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Example

Change word order

SL: Gaudí was a great artist

TL: Gaudí era un artista grande

Corrected TL (TL’): Gaudí era un gran artista

Page 33: Towards Interactive and Automatic Refinement of Translation Rules

Refinement Operation Typology

1. Error Information Elicitation

Page 34: Towards Interactive and Automatic Refinement of Translation Rules

2. Variable Instantiation from Log File

Correcting Actions:
1. Word order change (artista grande -> grande artista): Wi = grande
2. Edited grande into gran: Wi’ = gran
   User identified artista as clue word: Wc = artista

In this case, even if the user had not identified Wc, the refinement process would have been the same.
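The instantiation of Wi, Wi’ and Wc from the logged correcting actions could look roughly like the sketch below. The log format is not shown on the slide, so the `Action` record and the `instantiate` function are hypothetical and serve only to make the two correcting actions above concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Action:
    """One logged user action (hypothetical record layout)."""
    kind: str  # "add", "delete", "modify" or "change_order"
    word: str  # the TL word the action was applied to
    new_word: Optional[str] = None  # replacement word, for "modify"
    clue_word: Optional[str] = None  # clue word, if the user identified one


def instantiate(actions: List[Action]) -> Dict[str, Optional[str]]:
    """Derive Wi, Wi' and Wc from a sequence of logged actions on one sentence."""
    variables: Dict[str, Optional[str]] = {"Wi": None, "Wi'": None, "Wc": None}
    for action in actions:
        if variables["Wi"] is None:
            variables["Wi"] = action.word  # first word the user touched
        if action.kind == "modify":
            variables["Wi'"] = action.new_word
        if action.clue_word is not None:
            variables["Wc"] = action.clue_word
    return variables


log = [
    Action("change_order", "grande"),
    Action("modify", "grande", new_word="gran", clue_word="artista"),
]
print(instantiate(log))
```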

Page 35: Towards Interactive and Automatic Refinement of Translation Rules

3. Retrieve Relevant Lexical Entries

• No lexical entry for great -> gran
• Duplicate lexical entry great-grande and change TL side:

Existing entry:
ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
((x0 form) = great)
((y0 agr num) = sg)
((y0 agr gen) = masc))

Duplicated entry with new TL side:
ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
((x0 form) = great)
((y0 agr num) = sg)
((y0 agr gen) = masc))

(Morphological analyzer: grande = gran)
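The duplicate-and-change-TL-side step can be sketched as follows. The `LexicalEntry` class is a hypothetical in-memory stand-in for the lexical entries shown on the slide, not the transfer engine's real data structure.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass
class LexicalEntry:
    """Simplified lexical entry: POS, SL word, TL word and TL-side constraints."""
    pos: str
    sl: str
    tl: str
    constraints: Dict[str, str] = field(default_factory=dict)


def duplicate_with_tl(entry: LexicalEntry, new_tl: str) -> LexicalEntry:
    """Copy an entry, keeping its constraints but changing the TL side."""
    return replace(entry, tl=new_tl, constraints=dict(entry.constraints))


grande = LexicalEntry("ADJ", "great", "grande", {"agr num": "sg", "agr gen": "masc"})
gran = duplicate_with_tl(grande, "gran")
print(gran)
```

Copying the constraint dictionary keeps the two entries independent, so a feature added later to one of them does not leak into the other.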

Page 36: Towards Interactive and Automatic Refinement of Translation Rules

4. Finding Triggering Feature(s)

Feature function: δ(Wi, Wi’) = {} -> need to postulate a new binary feature: feat1

5. Blame assignment

tree: <((S,1 (NP,2 (N,5:1 "GAUDI") )
(VP,3 (VB,2 (AUX,17:2 "ERA") )
(NP,8 (DET,0:3 "UN")
(N,4:5 "ARTISTA")
(ADJ,5:4 "GRANDE") ) ) ) )>

Page 37: Towards Interactive and Automatic Refinement of Translation Rules

6. Variable Instantiation in the Rules

Wi = grande   POSi = ADJ = Y3, y3
Wc = artista  POSc = N = Y2, y2

{NP,8} (R0)
NP::NP : [DET ADJ N] -> [DET N ADJ]
( (X1::Y1) (X2::Y3) (X3::Y2)
((x0 def) = (x1 def))
(x0 = x3)
((y1 agr) = (y2 agr)) ; det-noun agreement
((y3 agr) = (y2 agr)) ; adj-noun agreement
(y2 = x3) )

Page 38: Towards Interactive and Automatic Refinement of Translation Rules

7. Refining Rules

{NP,8’} (R1)
NP::NP : [DET ADJ N] -> [DET ADJ N]
( (X1::Y1) (X2::Y2) (X3::Y3)
((x0 def) = (x1 def))
(x0 = x3)
((y1 agr) = (y3 agr)) ; det-noun agreement
((y2 agr) = (y3 agr)) ; adj-noun agreement
(y2 = x3)
((y2 feat1) =c + ))

Page 39: Towards Interactive and Automatic Refinement of Translation Rules

8. Refining Lexical Entries

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
((x0 form) = great)
((y0 agr num) = sg)
((y0 agr gen) = masc)
((y0 feat1) = -))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
((x0 form) = great)
((y0 agr num) = sg)
((y0 agr gen) = masc)
((y0 feat1) = +))

Page 40: Towards Interactive and Automatic Refinement of Translation Rules

Done? Not yet

NP,8 (R0):                ADJ(grande) [feat1 = -]
NP,8’ (R1) [feat1 =c +]:  ADJ(gran) [feat1 = +]

Need to restrict application of general rule (R0) to just post-nominal ADJ

Candidate translations:
un artista grande
un artista gran
un gran artista
*un grande artista

Page 41: Towards Interactive and Automatic Refinement of Translation Rules

Add Blocking Constraint

NP,8 (R0) [feat1 = -]:    ADJ(grande) [feat1 = -]
NP,8’ (R1) [feat1 =c +]:  ADJ(gran) [feat1 = +]

Can we also eliminate incorrect translations automatically?

Candidate translations:
un artista grande
*un artista gran
un gran artista
*un grande artista
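The effect of the blocking constraint on the candidate list can be sketched with a toy generator. This is an illustration under assumptions: the feat1 table, the helper names, and the reading of "=" as unification (succeeds when the feature is unset or equal) versus "=c" as a strict check (the feature must be present and equal) are mine, not taken from the transfer engine.

```python
from typing import Dict, List, Optional

# Hypothetical refined lexicon: TL adjective -> value of the postulated feature feat1.
ADJ_FEAT1: Dict[str, Optional[str]] = {"grande": "-", "gran": "+"}


def unifies(value: Optional[str], required: str) -> bool:
    """'=' constraint: succeeds when the feature is unset or has the required value."""
    return value is None or value == required


def constrains(value: Optional[str], required: str) -> bool:
    """'=c' constraint: the feature must be present and have the required value."""
    return value == required


def candidates(det: str, noun: str, block_r0: bool) -> List[str]:
    """Enumerate NP translations licensed by R0 (DET N ADJ) and R1 (DET ADJ N)."""
    out: List[str] = []
    for adj, feat1 in ADJ_FEAT1.items():
        # R0 (NP,8): post-nominal adjective; optionally restricted to feat1 = -
        if not block_r0 or unifies(feat1, "-"):
            out.append(f"{det} {noun} {adj}")
        # R1 (NP,8'): pre-nominal adjective; requires feat1 =c +
        if constrains(feat1, "+"):
            out.append(f"{det} {adj} {noun}")
    return out


print(candidates("un", "artista", block_r0=False))
print(candidates("un", "artista", block_r0=True))
```

Without the blocking constraint the general rule still produces "un artista gran"; with it, only "un artista grande" and "un gran artista" remain.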

Page 42: Towards Interactive and Automatic Refinement of Translation Rules

Making the grammar tighter

• If Wc = artista:
  – Add [feat1 = +] to N(artista)
  – Add agreement constraint to NP,8 (R0) between N and ADJ: ((N feat1) = (ADJ feat1))

Candidate translations:
*un artista grande
*un artista gran
un gran artista
*un grande artista

Page 43: Towards Interactive and Automatic Refinement of Translation Rules

Batch Mode Implementation

• For errors that can be refined:
  – Fully automatically just by using correction information (Focus 1)
  – Fully automatically using correction and error information (Focus 2)


Page 45: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Focus 1

• It is a nice house – Es una casa bonito -> Es una casa bonita
• John and Mary fell – Juan y Maria cayeron -> Juan y Maria se cayeron
• Gaudi was a great artist – Gaudi era un artista grande -> Gaudi era un gran artista

Page 46: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Focus 1

• Gaudi was a great artist – Gaudi era un artista grande -> Gaudi era un gran artista
• Es una casa bonito -> Es una casa bonita
• J y M cayeron -> J y M se cayeron
• I will help him fix the car – Ayudaré a él a arreglar el auto -> Le ayudaré a arreglar el auto

Page 47: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Focus 1

• I will help him fix the car – Ayudaré a él a arreglar el auto -> Le ayudaré a arreglar el auto
• I would like to go – Me gustaria que ir -> Me gustaria ir

Page 48: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Focus 1 & 2

• I am proud of you – Estoy orgullosa tu -> Estoy orgullosa de ti
  PP -> PREP NP

Page 49: Towards Interactive and Automatic Refinement of Translation Rules

Interactive Mode Implementation

• Extra error information is required
  -> More sentences need to be evaluated (and corrected) by users
  -> Relevant Minimal Pairs (MP)
  – Focus 3: types of errors that require a reasonable amount of further user interaction and can be solved by available correction and error information.

Page 50: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Focus 3

• Wally plays the guitar – Wally juega la guitarra -> Wally toca la guitarra
• I saw the woman – Vi la mujer -> Vi a la mujer
• I see them – Veo los -> Los veo

Page 51: Towards Interactive and Automatic Refinement of Translation Rules

Example Requiring Minimal Pair

1. Run SL sentence through the transfer engine
   I see them -> *veo los (corrected: los veo)
2. Wi = los but no Wi’ nor Wc
   Need a minimal pair to determine appropriate refinement:
   I see cars -> veo autos
3. Triggering feature(s): δ(los, autos) = {pos}
   PRON(los)[pos=pron]   N(autos)[pos=n]

Page 52: Towards Interactive and Automatic Refinement of Translation Rules

Refining and Adding Constraints

VP,3: VP NP -> VP NP
VP,3’: VP NP -> NP VP + [NP pos =c pron]

• Percolate triggering features up to the constituent level:
  NP: PRON -> PRON + [NP pos = PRON pos]
• Block application of general rule (VP,3):
  VP,3: VP NP -> VP NP + [NP pos = (*NOT* pron)]
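The bifurcated VP rule and its blocking constraint amount to a choice on the NP's percolated pos feature. The sketch below is a deliberately tiny illustration; `translate_vp` and its string-based interface are hypothetical and ignore everything except word order.

```python
def translate_vp(verb: str, np: str, np_pos: str) -> str:
    """Order the TL verb and object NP according to the refined VP rules."""
    if np_pos == "pron":
        # VP,3': VP NP -> NP VP, applies only when [NP pos =c pron]
        return f"{np} {verb}"
    # VP,3: VP NP -> VP NP, blocked for pronouns by [NP pos = (*NOT* pron)]
    return f"{verb} {np}"


print(translate_vp("veo", "los", "pron"))  # los veo
print(translate_vp("veo", "autos", "n"))   # veo autos
```

Because the constraint is stated on the pos feature rather than on the word "los", the same refinement would also reorder other object pronouns.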

Page 53: Towards Interactive and Automatic Refinement of Translation Rules

Generalization Power

• Other example sentences with the same error will be translated correctly after refinement!

Page 54: Towards Interactive and Automatic Refinement of Translation Rules

Rule Refinement Operations: Outside Scope of Thesis

• John read the book – A Juan leyó el libro -> Juan leyó el libro
• Where are you from? – Donde eres tu de? -> De donde eres tu?

Page 55: Towards Interactive and Automatic Refinement of Translation Rules

User Studies

• TCTool: new MT classification (Eng2Spa)
• Different language pair (Mapudungun or Quechua -> Spanish)
• Manual vs Learned grammars
• Batch vs Interactive mode (+Active Learning)
• Amount of error information elicited

Page 56: Towards Interactive and Automatic Refinement of Translation Rules

Evaluation of Refined MT System

• Evaluate best translation -> Automatic evaluation metrics (BLEU, NIST, METEOR)
• Evaluate candidate list -> precision (+parsimony)

Page 57: Towards Interactive and Automatic Refinement of Translation Rules

Evaluate Best Translation

Hypothesis file (translations to be evaluated automatically):
• Raw MT output: best sentence (picked by user to be correct or requiring the least amount of correction)
• Refined MT output: use METEOR score at sentence level to pick best candidate from the list

Run all automatic metrics on the new hypothesis file using user corrections as reference translations.
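Picking the best candidate against the user correction can be sketched as below. Note the swap: the slide uses sentence-level METEOR, while this sketch uses a plain unigram F1 overlap as a stand-in so that it stays self-contained; METEOR itself is more elaborate. The names `unigram_f1` and `pick_best` are illustrative.

```python
from collections import Counter
from typing import List


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Unigram-overlap F1 between two sentences (a stand-in score, not METEOR)."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(hyp), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def pick_best(candidates: List[str], reference: str) -> str:
    """Return the candidate translation that scores highest against the reference."""
    return max(candidates, key=lambda cand: unigram_f1(cand, reference))


candidate_list = [
    "Gaudi era un artista grande",
    "Gaudi era un gran artista",
    "Gaudi estaba un artista grande",
]
print(pick_best(candidate_list, reference="Gaudi era un gran artista"))
```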

Page 58: Towards Interactive and Automatic Refinement of Translation Rules

Evaluate Candidate List

• Precision = tp / (tp + fp), where “tp” is binary {0,1} and tp + fp = total number of TC

[Diagram: SL sentences, each paired with its list of TL candidates]
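A minimal sketch of this per-sentence precision follows, assuming TC stands for the translation candidates produced for one SL sentence and that "tp" is 1 when the corrected translation appears among them. The function name is hypothetical.

```python
from typing import List


def candidate_list_precision(candidates: List[str], correct: str) -> float:
    """tp / (tp + fp): tp is 1 if the correct translation is in the list, else 0."""
    if not candidates:
        return 0.0
    tp = 1 if correct in candidates else 0
    return tp / len(candidates)  # tp + fp = total number of candidates


before = ["un artista grande", "un artista gran", "un gran artista"]
after = ["un artista grande", "un gran artista"]
print(candidate_list_precision(before, "un gran artista"))
print(candidate_list_precision(after, "un gran artista"))
```

A shorter candidate list that still contains the correct translation scores higher, which is how this measure rewards parsimony.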

Page 59: Towards Interactive and Automatic Refinement of Translation Rules

Expected Contributions

• An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users.
• An expandable set of rule refinement operators
  – triggered by user corrections,
  – to automatically refine and expand different types of grammars.
• A mechanism to automatically evaluate rule refinements with user corrections as reference translations.

Page 60: Towards Interactive and Automatic Refinement of Translation Rules

Thesis Timeline

Research components                               Duration (months)
Back-end implementation                           7
User Studies                                      3
Resource-poor language (data + manual grammar)    2
Adapt system to new language pair                 1
Active Learning methods                           1
Evaluation                                        1
Write and defend thesis                           3
Total                                             18

Expected graduation date: May 2006

Page 61: Towards Interactive and Automatic Refinement of Translation Rules


References

Add references:

• Related work

• Probst et al. 2002

• AL

Page 62: Towards Interactive and Automatic Refinement of Translation Rules


Thanks!

Questions?

Page 63: Towards Interactive and Automatic Refinement of Translation Rules


Some Questions

• Is the refinement process deterministic?

Page 64: Towards Interactive and Automatic Refinement of Translation Rules


Others

• TCTool Demo Simulation

• RR operation patterns

• Automatic Evaluation feasibility study

• AMTA paper results

• BLEU, NIST and METEOR

• Precision, recall and F1

Page 65: Towards Interactive and Automatic Refinement of Translation Rules


Page 66: Towards Interactive and Automatic Refinement of Translation Rules

SL + best TL picked by user

Page 67: Towards Interactive and Automatic Refinement of Translation Rules

Changing “grande” into “gran”

Page 68: Towards Interactive and Automatic Refinement of Translation Rules


Page 69: Towards Interactive and Automatic Refinement of Translation Rules


Page 70: Towards Interactive and Automatic Refinement of Translation Rules

Input to RR module

- User correction log file
- Transfer engine output (+ parse tree):

sl: I see them
tl: VEO LOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
(NP,0 (PRON,2:3 "LOS") ) ) ) )>

sl: I see cars
tl: VEO AUTOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
(NP,2 (N,1:3 "AUTOS") ) ) ) )>

Page 71: Towards Interactive and Automatic Refinement of Translation Rules

Types of RR Operations

• Grammar:
  – bifurcate: R0 -> R0 + R1 [=R0’ + constr]; Cov[R0] -> Cov[R0,R1]
  – bifurcate: R0 -> R1 [=R0 + constr= -] + R2 [=R0’ + constr=c +]; Cov[R0] -> Cov[R1,R2]
  – refine: R0 -> R1 [=R0 + constr]; Cov[R0] -> Cov[R1]
• Lexicon:
  – Lex0 -> Lex0 + Lex1 [=Lex0 + constr]
  – Lex0 -> Lex1 [=Lex0 + constr]
  – Lex0 -> Lex0 + Lex1 [Lex0 + TLword]
  – Lex1 (adding lexical item)

Page 72: Towards Interactive and Automatic Refinement of Translation Rules

Manual vs Learned Grammars [AMTA 2004]

• Manual inspection:
• Automatic MT Evaluation:

                  NIST  BLEU  METEOR
Manual grammar    4.3   0.16  0.6
Learned grammar   3.7   0.14  0.55

Page 73: Towards Interactive and Automatic Refinement of Translation Rules

Human Oracle experiment

• As a feasibility experiment, compared raw MT output with manually corrected MT: the difference is statistically significant (confidence interval test)
• This is an upper bound on how much difference we should expect any refinement approach to make.

Page 74: Towards Interactive and Automatic Refinement of Translation Rules

Active Learning

• Minimize the number of examples a human annotator must label [Cohn et al. 1994], usually by processing examples in order of usefulness.
  -> Minimize the number of Minimal Pairs presented to users

Page 75: Towards Interactive and Automatic Refinement of Translation Rules

Order deterministic?

• Application of Rule Refinement operations is not deterministic; it depends directly on:
  – The order in which the system sees the corrected sentences
• Example:
  – First agr constraint, then bifurcate (WWO): C-set
  – Reverse order: a different C-set (!=)