Towards Interactive and Automatic Refinement of
Translation Rules
Ariadna Font Llitjós PhD Thesis Proposal
Jaime Carbonell (advisor)
Alon Lavie (co-advisor)
Lori Levin
Bonnie Dorr (Univ. Maryland)
5 November 2004
Interactive and Automatic Rule Refinement 2
Outline
• Introduction
• Thesis statement and scope
• Preliminary Research
  – Interactive elicitation of error information
  – A framework for automatic rule adaptation
• Proposed Research
• Contributions and Thesis Timeline
Machine Translation (MT)
• Source Language (SL) sentence: Gaudi was a great artist
In Spanish, it translates as: Gaudi era un gran artista
• MT system outputs:
  *Gaudi estaba un artista grande
  *Gaudi era un artista grande
Spanish Adjectives
Completed Work: Automatic Rule Adaptation
General order: NP [DET ADJ N] → NP [DET N ADJ]
  a big house → una casa grande   (grande: big in size)
Exception: NP [DET ADJ N] → NP [DET ADJ N]
  a great artist → un gran artista   (gran: exceptional)
Commercial and Online Systems
Correct Translation: Gaudi era un gran artista
• Systran, Babelfish (Altavista), WorldLingo, Translated.net: *Gaudi era gran artista
• ImTranslation: *El Gaudi era un gran artista
• 1-800-Translate: *Gaudi era un fenomenal artista
Post-editing
• Current solutions:
  – Manual post-editing [Allen, 2003]
  – Automated post-editing module (APE) [Allen & Hogan, 2000]
Drawbacks of Current Methods
• Manual post-editing: corrections do not generalize
  – Gaudi era un artista grande
  – Juan es un amigo grande (Juan is a great friend)
  – Era una oportunidad grande (It is a great opportunity)
• APE: humans need to predict all the errors ahead of time and code the post-editing rules; each new error type requires new rules
My Solution
• Automate post-editing efforts by feeding them back into the MT system.
• Possible alternatives:
  – Automatic learning of post-editing rules
    + system independent
    − several thousand sentences might need to be corrected for the same error
  – Automatic refinement of translation rules
    + attacks the core of the problem
    − only for transfer-based MT systems (need rules to fix!)
Related Work
[Corston-Oliver & Gammon, 2003], [Imamura et al., 2003], [Menezes & Richardson, 2001], [Callison-Burch, 2004], [Su et al., 1995], [Brill, 1993], [Gavaldà, 2000]
(spanning post-editing, rule adaptation, and machine translation)
My thesis:
• No pre-existing training data required
• No human reference translations required
• Uses non-expert user feedback
Resource-poor Scenarios (AVENUE)
• Lack of electronic parallel data
• Lack of manual grammar (or a very small initial grammar)
→ Need to validate the elicitation corpus and the automatically learned translation rules
Why bother?
• Indigenous communities have difficult access to crucial information that directly affects their lives (such as land laws, plagues, health warnings, etc.)
• Preservation of their language and culture
Languages: Mapudungun, Quechua, Aymara
How is MT possible for resource-poor languages?
Bilingual speakers
AVENUE Project Overview
• Elicitation: Elicitation Tool → Elicitation Corpus → Word-Aligned Parallel Corpus
• Rule Learning: Learning Module → Transfer Rules (plus handcrafted rules) and Lexical Resources
• Morphology: morphological analyzer
• Run-Time System: Run-Time Transfer System → translation Lattice
My Thesis
The AVENUE architecture, extended with:
• a Translation Correction Tool, through which users correct the word-aligned MT output, and
• a Rule Refinement Module operating on the transfer rules.
Recycle corrections of machine translation output back into the system by refining and expanding the existing translation rules.
Thesis Statement
- Given a rule-based transfer MT system, we can extract useful information from non-expert bilingual speakers about the corrections required to make MT output acceptable.
- We can automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information, to improve coverage and overall MT quality.
Assumptions
• No parallel training data available
• No human reference translations available
• The SL sentence needs to be fully parsed by the translation grammar.
• Bilingual speakers can give enough information about the MT errors.
Scope
Types of errors that:
• Focus 1: can be refined fully automatically just by using correction information.
• Focus 2: can be refined fully automatically using correction and error information.
• Focus 3: require a reasonable amount of further user interaction and can be solved by available correction and error information.
Technical Challenges
• Elicit minimal MT information from non-expert users
• Refine and expand translation rules minimally (manually written or automatically learned)
• Automatic evaluation of the refinement process
Preliminary Work
• Interactive elicitation of error information
• A framework for automatic rule adaptation
Interactive Elicitation of MT Errors
Goal:
• Maximally simplify the MT correction task
Challenges:
• Find the appropriate level of granularity for MT error classification
• Design a user-friendly graphical user interface with:
  – the SL sentence (e.g. I see them)
  – the TL sentence (e.g. Yo veo los)
  – word-to-word alignments (I–yo, see–veo, them–los)
  – (context)
MT Error Typology for RR (simplified)
Completed Work: Interactive elicitation of error information
• Missing word
• Extra word
• Wrong word order (local vs. long distance; word vs. phrase)
• Incorrect word (sense, form, selectional restrictions, idiom; ± word change)
• Wrong agreement (missing constraint, extra constraint)
TCTool (Demo)
Interactive elicitation of error information
Actions:
• Add a word
• Delete a word
• Modify a word
• Change word order
1st Eng2Spa User Study [LREC 2004]
• MT error classification: 9 linguistically-motivated classes:
  word order, sense, agreement errors (number, person, gender, tense), form, incorrect word, and no translation
Interactive elicitation of error information
Completed Work
precision recall F1
error detection 90% 89% 89%
error classification 72% 71% 72%
Automatic Rule Refinement Framework
• Find the best RR operations given:
  – a grammar (G),
  – a lexicon (L),
  – a (set of) source-language sentence(s) (SL),
  – a (set of) target-language sentence(s) (TL),
  – its parse tree (P), and
  – a minimal correction of TL (TL′),
such that TQ2 > TQ1. This can also be expressed as:
  max TQ(TL | TL′, P, SL, RR(G, L))
Completed Work: Automatic Rule Adaptation
Types of Refinement Operations
Completed Work: Automatic Rule Adaptation
1. Refine a translation rule: R0 → R1 (R0 modified, either made more specific or more general)
   R0: NP [DET ADJ N] → [DET N ADJ]
       a nice house → *una casa bonito
   R1: NP [DET ADJ N] → [DET N ADJ] + (N gender = ADJ gender)
       a nice house → una casa bonita
Types of Refinement Operations (2)
Completed Work: Automatic Rule Adaptation
2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (R0 modified, specific rule)
   R0: NP [DET ADJ N] → [DET N ADJ]
       a nice house → una casa bonita
   R1: NP [DET ADJ N] → [DET ADJ N]
       a great artist → un gran artista
Formalizing Error Information
Wi = error
Wi’ = correction
Wc = clue word
• Example:
SL: the red car   TL: *el auto roja   TL′: el auto rojo
Wi = roja   Wi′ = rojo   Wc = auto (Wi and Wc need to agree)
Completed Work Automatic Rule Adaptation
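The elicited error information has a natural record form. A minimal sketch (the class and field names are illustrative, not the thesis implementation):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ErrorInfo:
    """Error information elicited from a non-expert bilingual user."""
    wi: str                         # Wi: the incorrect word in the MT output
    wi_prime: Optional[str] = None  # Wi': the user's correction, if given
    wc: Optional[str] = None        # Wc: clue word Wi must agree with, if given

# The example from the slide: SL "the red car", TL *"el auto roja"
err = ErrorInfo(wi="roja", wi_prime="rojo", wc="auto")
```

Optional fields default to None because, as later slides show, the user may supply a correction without a clue word, or neither.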
Triggering Feature Detection
Comparison at the feature level to detect triggering feature(s)
Delta function: Δ(Wi, Wi′)
Examples:
  Δ(rojo, roja) = {gender}
  Δ(comíamos, comía) = {person, number}
  Δ(mujer, guitarra) = {}
If the set is empty, a new binary feature needs to be postulated.
Completed Work Automatic Rule Adaptation
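The delta function can be sketched as a set difference over per-word feature vectors; the toy feature lexicon below is a stand-in for the system's morphological analyzer, with values chosen only to reproduce the slide's examples:

```python
# Toy feature lexicon standing in for the morphological analyzer;
# the feature values are illustrative assumptions.
FEATURES = {
    "rojo":     {"pos": "adj", "gender": "masc", "number": "sg"},
    "roja":     {"pos": "adj", "gender": "fem",  "number": "sg"},
    "comiamos": {"pos": "v",   "person": "1",    "number": "pl"},
    "comia":    {"pos": "v",   "person": "3",    "number": "sg"},
    "mujer":    {"pos": "n",   "gender": "fem",  "number": "sg"},
    "guitarra": {"pos": "n",   "gender": "fem",  "number": "sg"},
}

def delta(wi, wi_prime):
    """Return the set of features on which Wi and Wi' differ.
    An empty set means a new binary feature must be postulated."""
    fi, fj = FEATURES[wi], FEATURES[wi_prime]
    return {f for f in fi.keys() & fj.keys() if fi[f] != fj[f]}
```

Here delta("mujer", "guitarra") returns the empty set, which is exactly the case where a new binary feature such as feat1 is postulated later in the example.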
Deciding on the Refinement Op
Given:
• the action performed by the user (add, delete, modify, change word order), and
• the error information available (clue word, word alignments, etc.)
→ Refinement Action
Completed Work: Automatic Rule Adaptation
Rule Refinement Operations
[Decision tree over the four user actions (Modify, Add, Delete, Change Word Order), branching on the presence of a clue word (±Wc), word alignments (±al), and whether POSi = POSi′ or POSi ≠ POSi′; some branches defer to the RuleLearner]
Proposed Work
- Batch and Interactive mode
- User studies
- Evaluation
Rule Refinement Example
Automatic Rule Adaptation
Change word order
SL: Gaudí was a great artist
TL: Gaudí era un artista grande
Corrected TL (TL’): Gaudí era un gran artista
Refinement Operation Typology
1. Error Information Elicitation
Automatic Rule Adaptation
2. Variable Instantiation from Log File
Correcting actions:
1. Word order change (artista grande → grande artista):
   Wi = grande
2. Edited grande into gran:
   Wi′ = gran
3. Identified artist as the clue word: Wc = artist
In this case, even if the user had not identified Wc, the refinement process would have been the same.
Automatic Rule Adaptation
3. Retrieve Relevant Lexical Entries
• No lexical entry for great → gran
• Existing entry:
  ADJ::ADJ |: [great] -> [grande]
  ((X1::Y1)
   ((x0 form) = great)
   ((y0 agr num) = sg)
   ((y0 agr gen) = masc))
• Duplicate the great–grande lexical entry and change its TL side:
  ADJ::ADJ |: [great] -> [gran]
  ((X1::Y1)
   ((x0 form) = great)
   ((y0 agr num) = sg)
   ((y0 agr gen) = masc))
  (Morphological analyzer: grande → gran)
Automatic Rule Adaptation
4. Finding Triggering Feature(s)
Delta function: Δ(Wi, Wi′) = {} → need to postulate a new binary feature: feat1
5. Blame assignment
tree: <((S,1 (NP,2 (N,5:1 "GAUDI") )
(VP,3 (VB,2 (AUX,17:2 "ERA") )
(NP,8 (DET,0:3 "UN")
(N,4:5 "ARTISTA")
(ADJ,5:4 "GRANDE") ) ) ) )>
Automatic Rule Adaptation
6. Variable Instantiation in the Rules
Wi = grande → POSi = ADJ (Y3, y3)
Wc = artista → POSc = N (Y2, y2)
{NP,8} NP::NP : [DET ADJ N] -> [DET N ADJ]   (R0)
( (X1::Y1) (X2::Y3) (X3::Y2)
  ((x0 def) = (x1 def))
  (x0 = x3)
  ((y1 agr) = (y2 agr)) ; det-noun agreement
  ((y3 agr) = (y2 agr)) ; adj-noun agreement
  (y2 = x3) )
Automatic Rule Adaptation
7. Refining Rules
{NP,8’}
NP::NP : [DET ADJ N] -> [DET ADJ N]
( (X1::Y1) (X2::Y2) (X3::Y3)
((x0 def) = (x1 def))
(x0 = x3)
((y1 agr) = (y3 agr)) ; det-noun agreement
((y2 agr) = (y3 agr)) ; adj-noun agreement
(y2 = x3)
((y2 feat1) =c + ))
Automatic Rule Adaptation
(R1)
8. Refining Lexical Entries
ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = -))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = +))
Automatic Rule Adaptation
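The lexical bifurcation above can be sketched as an operation on entry records; the dict representation and the helper name are illustrative assumptions, not the transfer-lexicon formalism:

```python
import copy

def bifurcate_lexical_entry(entry, new_tl, feature="feat1"):
    """Duplicate a lexical entry for a corrected TL word: the copy gets the
    new TL side and [feature = +]; the original is restricted with
    [feature = -] so the two entries never compete for the same context."""
    specific = copy.deepcopy(entry)
    specific["tl"] = new_tl
    specific["constraints"][feature] = "+"
    general = copy.deepcopy(entry)
    general["constraints"][feature] = "-"
    return general, specific

# The great-grande entry from the slide, as a plain dict:
great_grande = {"pos": "ADJ", "sl": "great", "tl": "grande",
                "constraints": {"agr num": "sg", "agr gen": "masc"}}
grande_entry, gran_entry = bifurcate_lexical_entry(great_grande, new_tl="gran")
```

Deep copies keep the two refined entries independent of the original, mirroring how the refinement module adds entries rather than mutating the lexicon in place.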
Done? Not yet
NP,8 (R0):  ADJ(grande) [feat1 = -], ADJ(gran) [feat1 = +]
NP,8′ (R1): [feat1 =c +] → ADJ(gran) [feat1 = +]
Need to restrict application of the general rule (R0) to just post-nominal ADJs.
Outputs: un artista grande · un artista gran · un gran artista · *un grande artista
Automatic Rule Adaptation
Add Blocking Constraint
NP,8 (R0) + [feat1 = -]:  ADJ(grande) [feat1 = -]
NP,8′ (R1): [feat1 =c +] → ADJ(gran) [feat1 = +]
Outputs: un artista grande · *un artista gran · un gran artista · *un grande artista
Can we also eliminate the incorrect translations automatically?
Automatic Rule Adaptation
Making the Grammar Tighter
• If Wc = artista:
  – add [feat1 = +] to N(artista)
  – add an agreement constraint between N and ADJ to NP,8 (R0): ((N feat1) = (ADJ feat1))
Outputs: *un artista grande · *un artista gran · un gran artista · *un grande artista
Automatic Rule Adaptation
Batch Mode Implementation
• For Refinement operations of errors that can be refined :
– Fully automatically just by using correction information (Focus 1)
– Fully automatically using correction and error information (Focus 2)
Proposed Work Automatic Rule Adaptation
Rule Refinement Operations
Rule Refinement Operations
Focus 1
It is a nice house – Es una casa bonito → Es una casa bonita
John and Mary fell – Juan y Maria cayeron → Juan y Maria se cayeron
Gaudi was a great artist – Gaudi era un artista grande → Gaudi era un gran artista
Rule Refinement Operations
Focus 1
Gaudi was a great artist – Gaudi era un artista grande → Gaudi era un gran artista
Es una casa bonito → Es una casa bonita
J y M cayeron → J y M se cayeron
I will help him fix the car – Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
Rule Refinement Operations
Focus 1
I will help him fix the car – Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
I would like to go – Me gustaría que ir → Me gustaría ir
Rule Refinement Operations
Focus 1 & 2
I am proud of you – Estoy orgullosa tu → Estoy orgullosa de ti   (PP → PREP NP)
Interactive Mode Implementation
• Extra error information is required → more sentences need to be evaluated (and corrected) by users → relevant minimal pairs (MPs)
– Focus 3: types of errors that require a reasonable amount of further user interaction and can be solved by available correction and error information.
Proposed Work Automatic Rule Adaptation
Interactive and Automatic Rule Refinement 50
Rule Refinement Operations
Focus 3
Wally plays the guitar – Wally juega la guitarra → Wally toca la guitarra
I saw the woman – Vi la mujer → Vi a la mujer
I see them – Veo los → Los veo
Example Requiring a Minimal Pair
Automatic Rule Adaptation
1. Run the SL sentence through the transfer engine:
   I see them → *veo los, los veo
2. Wi = los, but no Wi′ nor Wc
   → Need a minimal pair to determine the appropriate refinement:
   I see cars → veo autos
3. Triggering feature(s): Δ(los, autos) = {pos}
   PRON(los)[pos=pron] vs. N(autos)[pos=n]
Proposed Work
Refining and Adding Constraints
VP,3:  [VP NP] → [VP NP]
VP,3′: [VP NP] → [NP VP] + [NP pos =c pron]
• Percolate triggering features up to the constituent level:
  NP: [PRON] → [PRON] + [NP pos = PRON pos]
• Block application of the general rule (VP,3):
  VP,3: [VP NP] → [VP NP] + [NP pos = (*NOT* pron)]
Proposed Work
Generalization Power
• Other example sentences with the same error will be translated correctly after the refinement.
Rule Refinement Operations
Outside Scope of Thesis
John read the book – A Juan leyó el libro → Juan leyó el libro
Where are you from? – Donde eres tu de? → De donde eres tu?
User Studies
• TCTool: new MT classification (Eng2Spa)
• Different language pair (Mapudungun or Quechua to Spanish)
• Manual vs Learned grammars
• Batch vs Interactive mode (+Active Learning)
• Amount of error information elicited
Proposed Work
Evaluation of Refined MT System
• Evaluate the best translation → automatic evaluation metrics (BLEU, NIST, METEOR)
• Evaluate the candidate list → precision (+ parsimony)
Evaluate Best Translation
Hypothesis file (translations to be evaluated automatically):
• Raw MT output: the best sentence (picked by the user as correct or as requiring the least amount of correction)
• Refined MT output: use the METEOR score at sentence level to pick the best candidate from the list
Run all automatic metrics on the new hypothesis file, using the user corrections as reference translations.
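The candidate-selection step can be sketched as follows; a simple unigram-overlap F1 is a stand-in for the sentence-level METEOR score that the real pipeline would use:

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """Toy sentence-level score (multiset unigram F1) standing in for METEOR."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    prec, rec = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec)

def pick_best(candidates, user_correction):
    """Pick the refined-MT candidate closest to the user's correction."""
    return max(candidates, key=lambda c: unigram_f1(c, user_correction))

best = pick_best(["Gaudi era un artista grande", "Gaudi era un gran artista"],
                 user_correction="Gaudi era un gran artista")
```

The selected candidates form the new hypothesis file, which is then scored with the full metrics against the user corrections.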
Evaluate Candidate List
• Precision = tp / (tp + fp) = tp / (total number of translation candidates), where each candidate counts as a binary {0, 1} true positive
[Diagram: each SL sentence paired with its list of TL candidates]
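The precision measure can be sketched directly; the correctness predicate is an assumption of this sketch (the thesis judges candidates against user corrections):

```python
def candidate_list_precision(candidate_lists, is_correct):
    """Precision over all translation candidates: tp / (tp + fp), where each
    candidate contributes a binary {0, 1} true-positive count and tp + fp is
    the total number of translation candidates produced."""
    total = sum(len(cands) for cands in candidate_lists)
    tp = sum(1 for cands in candidate_lists for c in cands if is_correct(c))
    return tp / total if total else 0.0

# Two SL sentences with 2 and 1 TL candidates respectively:
lists = [["una casa bonita", "una casa bonito"], ["un gran artista"]]
p = candidate_list_precision(lists, lambda c: c in {"una casa bonita",
                                                    "un gran artista"})
```

With 2 correct candidates out of 3 produced, precision is 2/3; a parsimony measure would additionally penalize long candidate lists.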
Expected Contributions
• An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users.
• An expandable set of rule refinement operators
  – triggered by user corrections,
  – to automatically refine and expand different types of grammars.
• A mechanism to automatically evaluate rule refinements with user corrections as reference translations.
Thesis Timeline
Research components Duration (months)
Back-end implementation 7
User Studies 3
Resource-poor language (data + manual grammar) 2
Adapt system to new language pair 1
Active Learning methods 1
Evaluation 1
Write and defend thesis 3
Total 18
Expected graduation date: May 2006
References
(References to be added: related work, Probst et al. 2002, active learning)
Thanks!
Questions?
Some Questions
• Is the refinement process deterministic?
Others
• TCTool Demo Simulation
• RR operation patterns
• Automatic Evaluation feasibility study
• AMTA paper results
• BLEU, NIST and METEOR
• Precision, recall and F1
Automatic Rule Adaptation
Automatic Rule Adaptation
SL + best TL picked by user
Automatic Rule Adaptation
Changing “grande” into “gran”
Automatic Rule Adaptation
Back to main
Automatic Rule Adaptation
Input to the RR Module
Automatic Rule Adaptation
• User correction log file
• Transfer engine output (+ parse tree):
sl: I see them
tl: VEO LOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
              (NP,0 (PRON,2:3 "LOS") ) ) ) )>
sl: I see cars
tl: VEO AUTOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
(NP,2 (N,1:3 “AUTOS") ) ) ) )>
Types of RR Operations
Completed Work: Automatic Rule Adaptation
• Grammar:
  – bifurcate: R0 → R0 + R1 [= R0′ + constr];  Cov[R0] ⊆ Cov[R0, R1]
  – bifurcate: R0 → R1 [= R0 + constr = −] + R2 [= R0′ + constr =c +];  Cov[R0] vs. Cov[R1, R2]
  – refine:    R0 → R1 [= R0 + constr];  Cov[R0] ⊇ Cov[R1]
• Lexicon:
  – Lex0 → Lex0 + Lex1 [= Lex0 + constr]
  – Lex0 → Lex1 [= Lex0 + constr]
  – Lex0 → Lex0 + Lex1 [= Lex0 + TL word];  ∅ → Lex1 (adding a lexical item)
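The grammar-level operations can be sketched over a minimal rule record; the `Rule` class and constraint strings are illustrative assumptions, not the transfer formalism itself:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Minimal stand-in for a transfer rule: a name plus a constraint tuple."""
    name: str
    constraints: tuple = ()

def refine(r0, constr):
    """R0 -> R1 [= R0 + constr]: the rule is tightened,
    so its coverage can only shrink."""
    return Rule(r0.name + "'", r0.constraints + (constr,))

def bifurcate(r0, constr):
    """R0 -> R0 + R1 [= R0' + constr]: the general rule is kept alongside
    the specialized copy, so coverage can only grow."""
    return [r0, refine(r0, constr)]

grammar = bifurcate(Rule("NP,8"), "(y2 feat1) =c +")
```

The contrast between the two helpers mirrors the coverage relations on the slide: refine narrows what a rule matches, while bifurcate preserves the old coverage and adds to it.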
Manual vs. Learned Grammars [AMTA 2004]
Automatic Rule Adaptation
• Automatic MT evaluation:
                    NIST   BLEU   METEOR
  Manual grammar     4.3   0.16   0.60
  Learned grammar    3.7   0.14   0.55
• Manual inspection:
Human Oracle experiment
• As a feasibility experiment, compared raw MT output with manually corrected MT output: the difference is statistically significant (confidence-interval test).
• This is an upper bound on how much difference we should expect any refinement approach to make.
Automatic Rule Adaptation
Completed Work
Active Learning
• Minimize the number of examples a human annotator must label [Cohn et al., 1994], usually by processing examples in order of usefulness.
→ Minimize the number of minimal pairs presented to users
Proposed Work Automatic Rule Adaptation
Is the order deterministic?
• The application of rule refinement operations is not deterministic; it depends directly on the order in which the system sees the corrected sentences.
• Example:
  – First an agreement constraint, then a bifurcation (wrong word order) → one constraint set
  – Reverse order → a different constraint set (≠)