Idioms in EBMT: Idiom Processing within the EBMT System METIS-II. Dimitra Anastasiou (dimitra@d-anastasiou.com), Institut für Angewandte Informationsforschung (IAI), Saarland University, Germany. School of Computing, Dublin City University, Dublin, 15th October 2008.


Idiom Processing within the EBMT System METIS-II

Dimitra Anastasiou

dimitra@d-anastasiou.com

Institut für Angewandte Informationsforschung (IAI) Saarland University, Germany

School of Computing, Dublin City University, Dublin

15th October 2008

Aim-Methods

Aim
• Enhancement of the translation quality of idiomatic expressions (idiomatic VPs in particular) within the German-to-English EBMT system METIS-II

Resources
• Bilingual idiom dictionary
• Monolingual corpus
• Syntactic rules according to the German topological field model

Outlook

• EBMT: statistical or rule-based MT?
• Interpretation of idioms
• Topological field model
• Treatment of idioms by MT
• METIS-II idiom resources
• Translation process of METIS-II
• Evaluation of METIS-II

EBMT: Statistical or Rule-Based MT?

Two tendencies of EBMT:

1) Combinations of EBMT with rule-based MT (RBMT) as hybrid systems [Sumita et al., 1990];

2) Pure EBMT systems [Sato & Nagao, 1990].

EBMT lies between RBMT and statistical MT (SMT) [Carl & Way, 2005].

Reason: The transfer between SL and TL is always guided by translation examples, even if the replacement and/or modification of the sub-sequences is completely rule- or data-based.

Outlook

• EBMT: statistical or rule-based MT?
• Interpretation of idioms
    • Identification by MT
    • Semantics
    • Syntax
    • Grammatical and lexical variants
• Topological field model
• Treatment of idioms by MT
• METIS-II idiom resources
• Translation process of METIS-II
• Evaluation of METIS-II

Interpretation of Idioms

• Diverse terms (and accordingly definitions): idiom, semi-idiom, (cranberry) collocation, idiomatic/figurative/fixed/periphrastic phrase/expression, phraseologism, (dead) metaphor, etc.

• Irregularity of idioms depends on:
    • Fixedness of constituents [Moon, 1998; Trawinski et al., 2008];
    • Degree of compositionality;
    • Syntactic opaqueness: kick the bucket – die [Jackendoff, 1997; Gazdar et al., 1985];
    • Poetic marking of the form, e.g.
        klipp und klar (clear as daylight)
        mit Rat und Tat (help and advice)

Idiom Identification by MT

jmdn. mit Argusaugen beobachten
so.-with-Argus eyes-observe
watch so. like a hawk

Er beobachtete den Mann, der die Bank betrat, mit Argusaugen.
He was watching the man, who entered the bank, like a hawk.

• The contiguous parts of the idiom (mit Argusaugen);
• The discontinuous parts of the idiom (beobachten) in any of its inflected forms;
• The syntactic requirements of the idiom;
• The clause boundaries (usually in one clause).

More information can be found in Volk (1998).

Semantics (Degree of Compositionality)

1) Non-compositional: cranberry/unique constituents, e.g.:
    as happy as a sandboy
    on tenterhooks
A recent study on cranberry expressions in English and German is that of Trawinski et al. (2008);

2) Partially compositional: support-verb (light-verb) constructions (SVCs), e.g.:
    außer Betrieb gehen – go out of service
    außer Betrieb sein – be out of order
A recent study on German PP-verb SVCs is that of Krenn (2008);

3) Strictly compositional: collocations, e.g.:
    Maßnahmen ergreifen – take measures

Syntax

• Syntactic categories of idioms

• Realization of idioms with respect to syntactic gaps:
    • Continuous (without gaps)
    • Discontinuous (with gaps)

Syntactic Categories

• Noun phrase (NP): pink slip

• Prepositional phrase (PP): by hook or crook

• Combination NP-PP: danger for life and health

• Adjective: prim and proper

• Verbal phrase (iVP)
    • NP-Verb: kick the bucket
    • PP-Verb: fall on deaf ears
    • NP-PP-Verb: throw out the baby with the bath water

• Proverb: less is sometimes more

• Saying: gimme a break

Grammatical Variants (1)

• Number:
    pull up stakes
    *pull up stake
    Exception!
    keep tabs on sb/sth
    keep a tab on sb/sth

• Case:
    auf die Strasse gehen (take to the streets)
    *auf der Strasse gehen

• Determiner:
    play a role
    *play the role

• Possessive:
    in Verbindung treten (contact)
    *in Pos.Pron. Verbindung treten

    Pos.Pron. Ohr leihen (listen)
    *das Ohr leihen

Grammatical Variants (2)

• Negation:
    eine Rolle spielen (play a role)
    keine Rolle spielen (any-role-play)
    *nicht eine Rolle spielen (not-a-role-play)

    auf keinen grünen Zweig kommen (never get anywhere)
    nicht/nie auf einen grünen Zweig kommen

• Passivization: The more syntactically opaque an idiom is, the less likely it is to undergo passivization.
    opaque: [kick] [the bucket] – die
        *The bucket was kicked by him (only literal meaning)
    transparent: [spill] [the beans] – [tell] [a secret]
        The beans were spilled by him

Lexical Variants

• Substitution:
    kick the bucket
    *kick the pail

    hit the sack
    *hit the hay

• Modifiers
    Adjective:
        keep tabs on
        keep close tabs on
    Adverb:
        noch grün hinter den Ohren sein (be half-baked)
        noch absolut grün hinter den Ohren sein

Outlook

• EBMT: statistical or rule-based MT?
• Interpretation of idioms
• Topological field model
    • Realization of idioms
    • Discontinuous patterns
• Treatment of idioms by MT
• METIS-II idiom resources
• Translation process of METIS-II
• Evaluation of METIS-II

Topological Field Model for German

German clauses are divided into five fields; each field can be occupied by a certain number and kind of constituents [Drach, 1963; DUDEN, 1998; Dürscheid, 2000]:

• pre-field (PF): only 1 constituent!;
• left bracket (LB): finite verb (e.g. modal/auxiliary verb);
• middle field (MF): many constituents, in free order;
• right bracket (RB): non-finite verb (infinitive/participle form);
• post-field (PF): subclause(s).
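
To make the field division concrete, here is a minimal Python sketch. It is an illustration only, not part of METIS-II: it assumes a verb-second main clause and assumes that the positions of the finite and non-finite verb are already known from an earlier analysis step. The function name `split_fields` and its parameters are hypothetical.

```python
from typing import Dict, List, Optional


def split_fields(tokens: List[str], finite_idx: int,
                 nonfinite_idx: Optional[int] = None) -> Dict[str, List[str]]:
    """Split a verb-second main clause into the five topological fields.

    finite_idx    -- position of the finite verb (left bracket)
    nonfinite_idx -- position of the non-finite verb (right bracket), if any
    Both positions are assumed to come from an earlier analysis step.
    """
    if nonfinite_idx is None:
        # Without a right bracket, everything after the finite verb is
        # treated as middle field in this simplified sketch.
        middle, right, post = tokens[finite_idx + 1:], [], []
    else:
        middle = tokens[finite_idx + 1:nonfinite_idx]
        right = [tokens[nonfinite_idx]]
        post = tokens[nonfinite_idx + 1:]
    return {
        "pre-field": tokens[:finite_idx],
        "left bracket": [tokens[finite_idx]],
        "middle field": middle,
        "right bracket": right,
        "post-field": post,
    }


clause = "Er hat den Bock zum Gärtner oft gemacht".split()
fields = split_fields(clause, finite_idx=1, nonfinite_idx=7)
for name, words in fields.items():
    print(f"{name:13} {' '.join(words)}")
```

A real system would derive the verb positions from PoS tags and chunking rather than take them as arguments.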

Realization of Idioms

• Continuous form: ( iNP_MF | iPP_MF | [iNP_MF iPP_MF] ) iV_RB
(the part after the underscore names the topological field of the idiom constituent)

Er will nicht bei den Argumenten ständig den Bock (iNP_MF) zum Gärtner (iPP_MF) machen (iV_RB)!

He-wants-not-during-the-arguments-always-the-bock-to-the-gardener-make!

He does not always want to set the fox to keep the geese during the argumentation!

• Discontinuous form: iV_LB (Adverb)*_MF ( iNP_MF | iPP_MF | [iNP_MF iPP_MF] )

Er macht (iV_LB) oft (Adverb) den Bock (iNP_MF) zum Gärtner (iPP_MF).

He-makes-often-the-bock-to-the-gardener.

He often sets the fox to keep the geese.
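
The two patterns above can be read as regular expressions over a sequence of chunk tags. The following Python sketch is only an illustration under assumptions: the tag names (`iNP_MF`, `iPP_MF`, `iV_RB`, `iV_LB`, `ADV_MF`, and `O` for everything else) and the function `classify` are hypothetical and do not reproduce the actual METIS-II rule syntax.

```python
import re
from typing import List, Optional

# One idiomatic NP, one idiomatic PP, or an NP followed by a PP (middle field).
_PARTS = r"(?:iNP_MF iPP_MF|iNP_MF|iPP_MF)"
# (?<!\S) and (?!\S) anchor the patterns at whole-tag boundaries.
CONTINUOUS = re.compile(rf"(?<!\S){_PARTS} iV_RB(?!\S)")
DISCONTINUOUS = re.compile(rf"(?<!\S)iV_LB (?:ADV_MF )*{_PARTS}(?!\S)")


def classify(tags: List[str]) -> Optional[str]:
    """Return 'continuous', 'discontinuous', or None for a tag sequence."""
    sequence = " ".join(tags)
    if CONTINUOUS.search(sequence):
        return "continuous"
    if DISCONTINUOUS.search(sequence):
        return "discontinuous"
    return None


# "... ständig den Bock zum Gärtner machen"
print(classify(["O", "O", "ADV_MF", "iNP_MF", "iPP_MF", "iV_RB"]))
# "Er macht oft den Bock zum Gärtner"
print(classify(["O", "iV_LB", "ADV_MF", "iNP_MF", "iPP_MF"]))
```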

Discontinuous patterns

Den Bock zum Gärtner machen (set the fox to keep the geese)

Er macht (iV_LB) oft (Adverb) den Bock (iNP_MF) zum Gärtner (iPP_MF).

Er hat den Bock (iNP_MF) zum Gärtner (iPP_MF) oft gemacht (iV_RB).

?Den Bock (iNP_PF) zum Gärtner (iPP_PF) hat er oft gemacht (iV_RB).

?Den Bock (iNP_PF) hat er oft zum Gärtner (iPP_MF) gemacht (iV_RB).

Outlook

• EBMT: statistical or rule-based MT?
• Interpretation of idioms
• Topological field model
• Treatment of idioms by MT
    • Idioms suitable for EBMT
• METIS-II idiom resources
• Translation process of METIS-II
• Evaluation of METIS-II

Treatment of Idioms by MT

• Bar-Hillel (1952): “The only way for a machine to treat idioms is - not to have idioms!”

• Power Translator Pro user manual (2000) warns the user to avoid inputting sentences containing idioms!

• Power Translator Pro, SYSTRAN, T1 Langenscheidt cannot identify discontinuous idioms.

Idioms suitable for EBMT

• Idiomatic expressions are not suitable for rule-based MT (RBMT), but are suitable for EBMT.

“Translation of an idiomatic expression can only be used to translate the same idiomatic expression; it cannot be used to translate a similar expression.” (Sumita et al., 1990: 210).

• By contrast, Nomiyama (1992) emphasizes a disadvantage of EBMT: using only thesauri to define a general semantic distance results in over-generalization, which is a major problem in translating idiomatic expressions.

• Related work: Santos (1990), Wehrli (1998), Ryu et al. (1999), and Gangadharaiah & Balakrishnan (2006).

Outlook

• EBMT: statistical or rule-based MT?
• Interpretation of idioms
• Topological field model
• Treatment of idioms by MT
• METIS-II idiom resources
    • Idiom lexicon
    • German corpus (annotation), (statistical analysis)
    • Syntactic rules
• Translation process of METIS-II
• Evaluation of METIS-II

Idiom Resources

• Bilingual idiom dictionary of 871 entries

• Monolingual German corpus of 486 sentences

• Syntactic rules according to the German topological field model

METIS-II Project

• Hybrid MT system (EBMT, RBMT, SMT);
• Time span: 2004-2007;
• SLs: Dutch, German, Greek, Spanish;
• TL: British English;
• Based on pattern matching;
• Sources:
    • Huge monolingual TL corpus (BNC);
    • Bilingual dictionaries;
    • Tokenizer;
    • PoS tagger, chunker, lemmatizer;
    • Manually constructed matching rules.

Idiom Dictionary

871 entries

Entry example:
{de=den_Bock_zum_Gärtner_machen, mde={c=verb}, en=set_the_fox_to_keep_the_geese, men={c=verb}}.

• 826 entries with equal PoS:
    • 598 verbs/VPs
    • 163 interjections
    • 37 NPs
    • 28 PPs
• 45 entries with different PoS (verb/VP-interjection)
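
As an illustration of how an entry in the format shown above could be read into a data structure, here is a small Python sketch. The parser `parse_entry` is hypothetical and written only from the single example entry; the real METIS-II lexicon reader may handle more cases.

```python
import re
from typing import Dict, List, Union

Value = Union[List[str], Dict[str, str]]

# key=value pairs where the value is either a nested {...} block or a bare token
_PAIR = re.compile(r"(\w+)=(?:\{([^{}]*)\}|([^,{}]+))")

ENTRY = ("{de=den_Bock_zum_Gärtner_machen, mde={c=verb}, "
         "en=set_the_fox_to_keep_the_geese, men={c=verb}}.")


def parse_entry(entry: str) -> Dict[str, Value]:
    """Parse one dictionary entry into word lists and feature dictionaries."""
    parsed: Dict[str, Value] = {}
    for key, nested, plain in _PAIR.findall(entry):
        if plain:
            # Underscores join the words of a multi-word expression.
            parsed[key] = plain.strip().split("_")
        else:
            # Nested feature block such as {c=verb}.
            features: Dict[str, str] = {}
            for pair in nested.split(","):
                if "=" in pair:
                    name, value = pair.split("=", 1)
                    features[name.strip()] = value.strip()
            parsed[key] = features
    return parsed


entry = parse_entry(ENTRY)
print(entry["de"], entry["mde"])
print(entry["en"], entry["men"])
```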

Idiom Corpus

Three corpus resources:

• Europarl (EP): 80 MWEs, 63 continuous (79%), 17 discontinuous (21%)
• Mixture of data sets (MDS), consisting of manually constructed examples (IAI) and real examples (Internet): 275 MWEs, 205 continuous (75%), 70 discontinuous (25%)
• DWDS (Digital lexicon of the German language in the 20th century): 131 MWEs, 91 continuous (69%), 40 discontinuous (31%)

Annotation of Idioms in the German Corpus

Continuous form:
Er will nicht bei den Argumenten ständig <MWE id=1> den Bock zum Gärtner machen </MWE id=1>.

He does not always want to set the fox to keep the geese during the argumentation.

Discontinuous form:
Er <MWE id=1> macht </MWE id=1> oft <MWE id=1> den Bock zum Gärtner </MWE id=1>.

He often sets the fox to keep the geese.
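
With this annotation scheme, an idiom whose parts share one id but sit in several tagged spans is discontinuous. A short Python sketch of reading the annotation back follows; the helper names `mwe_spans` and `form_of` are hypothetical and only assume the tag format shown in the two examples above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Opening and closing tags both carry the id, as in the annotated examples.
MWE = re.compile(r"<MWE id=(\d+)>\s*(.*?)\s*</MWE id=\1>")


def mwe_spans(sentence: str) -> Dict[str, List[str]]:
    """Collect the tagged text spans of a sentence, grouped by MWE id."""
    spans: Dict[str, List[str]] = defaultdict(list)
    for match in MWE.finditer(sentence):
        spans[match.group(1)].append(match.group(2))
    return dict(spans)


def form_of(parts: List[str]) -> str:
    """One tagged span means continuous, several spans mean discontinuous."""
    return "continuous" if len(parts) == 1 else "discontinuous"


cont = ("Er will nicht bei den Argumenten ständig "
        "<MWE id=1> den Bock zum Gärtner machen </MWE id=1>.")
disc = "Er <MWE id=1> macht </MWE id=1> oft <MWE id=1> den Bock zum Gärtner </MWE id=1>."

for sentence in (cont, disc):
    for mwe_id, parts in mwe_spans(sentence).items():
        print(mwe_id, form_of(parts), parts)
```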

Statistical Analysis of iVPs’ Syntactic Patterns

Continuous form patterns      EP corpus   MDS corpus   DWDS corpus
NP-V                          8           65           15
PP-V                          29          106          60
NP-PP-V                       4           21           6

Discontinuous form patterns   EP corpus   MDS corpus   DWDS corpus
V-NP                          1           8            13
V-PP                          16          25           18
V-NP-PP                       -           22           9

Syntactic Rule for Continuous Idioms

Er will nicht bei den Argumenten ständig den Bock zum Gärtner machen!

En Bloc Pattern =
    A: match=yes, last idiom’s word=no, [den Bock, zum Gärtner]
    B: match=yes, last idiom’s word=yes, [machen]
    C: mark_as_continuous_iVP.

where
    A: idiom constituents before the last one
    B: last idiom constituent
    C: command to identify/match as continuous

No alien element between A and B!
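
The idea behind the rule (all idiom words up to and including the last one appear in order with nothing in between) can be sketched in Python as a contiguous sub-sequence search. This is a simplified illustration on surface tokens, not the METIS-II rule engine; `find_continuous` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def find_continuous(tokens: List[str], idiom: List[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the idiom if its words occur as one block, else None."""
    n = len(idiom)
    for start in range(len(tokens) - n + 1):
        # A and B: every idiom word matches, with no alien element in between
        if tokens[start:start + n] == idiom:
            return start, start + n  # C: mark this span as a continuous iVP
    return None


tokens = ["Er", "will", "nicht", "bei", "den", "Argumenten", "ständig",
          "den", "Bock", "zum", "Gärtner", "machen", "!"]
idiom = ["den", "Bock", "zum", "Gärtner", "machen"]
span = find_continuous(tokens, idiom)
print(span, tokens[span[0]:span[1]])
```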

Syntactic Rule for Discontinuous Idioms

Er macht (iV_LB) oft (Adverb) den Bock (iNP_MF) zum Gärtner (iPP_MF).

Discontinuous Pattern_LBMF =
    A: match=yes, field=LB, c=verb, [macht]
    B: [match=no, field=MF]*, [oft]
    C: match=yes, field=MF, [den Bock, zum Gärtner]
    D: mark_as_discontinuous_iVP.

where
    A: idiom’s verb in the left bracket
    B: arbitrarily many elements
    C: matched idiom’s constituents
    D: command to identify/match as discontinuous

Alien element(s) between A and C!
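
A possible reading of the LB-MF pattern in Python is sketched below. It assumes that every token already carries a lemma and a topological field label from the analysis step; the `Token` type, the field codes, and `match_lb_mf` are hypothetical and serve only to illustrate steps A to D.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    form: str   # surface form
    lemma: str  # lemma (assumed to come from the lemmatizer)
    field: str  # topological field: "PF", "LB", "MF" or "RB"


def match_lb_mf(tokens: List[Token], verb_lemma: str,
                rest: List[str]) -> Optional[Tuple[int, int, int]]:
    """Return (verb index, start of rest, number of alien tokens) or None."""
    n = len(rest)
    for i, tok in enumerate(tokens):
        # A: the idiom's verb in the left bracket
        if tok.field != "LB" or tok.lemma != verb_lemma:
            continue
        j = i + 1
        while j + n <= len(tokens):
            window = tokens[j:j + n]
            # C: the remaining idiom constituents, contiguous in the middle field
            if all(t.field == "MF" for t in window) and [t.form for t in window] == rest:
                return i, j, j - (i + 1)  # D: mark as discontinuous iVP
            # B: skip one non-matching middle-field token
            if tokens[j].field != "MF":
                break
            j += 1
    return None


sentence = [
    Token("Er", "er", "PF"), Token("macht", "machen", "LB"),
    Token("oft", "oft", "MF"), Token("den", "der", "MF"),
    Token("Bock", "Bock", "MF"), Token("zum", "zu", "MF"),
    Token("Gärtner", "Gärtner", "MF"),
]
result = match_lb_mf(sentence, "machen", ["den", "Bock", "zum", "Gärtner"])
print(result)
```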

Outlook

• History of EBMT
• Interpretation of idioms
• Topological field model
• Treatment of idioms by MT
• METIS-II idiom resources
• Translation process of METIS-II
    • METIS-II Idiom Matching Process
• Evaluation of METIS-II

METIS-II Translation Process

1) SL analysis (tokenization, PoS-tagging, lemmatization, and chunking or shallow parsing);

2) SL-to-TL matching, using:
    i) the bilingual idiom dictionary;
    ii) the syntactic matching rules.

3) TL generation (the main TL resource, BNC, is used as a data-set of examples). The token generator is described in Carl & Schütz (2005).

METIS-II Idiom Matching Process

Users
• Store an idiom in the bilingual dictionary;
• Load the syntactic matching rules;
• Enter an input sentence/corpus.

System
• The system reads the sentence word by word;
• If the idiom is continuous and in the same form as stored in the dictionary, it is translated correctly straight away;
• If the idiom is discontinuous, the system reads the syntactic matching rules (rule by rule) until it finds the appropriate one, which is then applied.
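
The control flow described above (dictionary form first, then the rules one after another until one applies) might look as follows in Python. This is a sketch under strong simplifications: the input is assumed to have its verb already lemmatized, and `verb_then_rest` is a hypothetical stand-in for a real syntactic matching rule.

```python
from typing import Callable, List, Optional, Tuple

Rule = Callable[[List[str], List[str]], bool]


def contains(tokens: List[str], words: List[str]) -> bool:
    """True if `words` occurs in `tokens` as one contiguous run."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def verb_then_rest(tokens: List[str], idiom: List[str]) -> bool:
    """Hypothetical rule: idiom verb first, remaining words contiguous later."""
    *rest, verb = idiom
    return verb in tokens and contains(tokens[tokens.index(verb) + 1:], rest)


def match_idiom(tokens: List[str], idiom: List[str],
                rules: List[Tuple[str, Rule]]) -> Optional[str]:
    # Continuous idiom in dictionary form: matched directly.
    if contains(tokens, idiom):
        return "continuous"
    # Otherwise try the rules in order; the first one that applies wins.
    for name, rule in rules:
        if rule(tokens, idiom):
            return name
    return None


idiom = "den Bock zum Gärtner machen".split()
rules = [("LB-MF", verb_then_rest)]
print(match_idiom("Er will den Bock zum Gärtner machen".split(), idiom, rules))
print(match_idiom("Er machen oft den Bock zum Gärtner".split(), idiom, rules))
```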

Outlook

• History of EBMT
• Interpretation of idioms
• Topological field model
• Treatment of idioms by MT
• METIS-II idiom resources
• Translation process of METIS-II
• Evaluation of METIS-II
    • For continuous idioms
    • For discontinuous idioms

Evaluation of METIS-II

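Hit: correct matching/correct translation
Miss: no matching/reuse of German input
Noise: false matching/literal translation

• Precision: $\mathrm{Pr} = \dfrac{\mathrm{hits}}{\mathrm{hits} + \mathrm{noise}}$

• Recall: $\mathrm{Re} = \dfrac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses}}$

• fscore: $\mathrm{fscore} = \dfrac{2 \cdot \mathrm{recall} \cdot \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}}$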
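
The three measures can be computed directly from the hit, miss, and noise counts. The Python sketch below uses made-up counts purely for illustration; they are not taken from the METIS-II evaluation.

```python
def precision(hits: int, noise: int) -> float:
    """Share of matches that were correct: hits / (hits + noise)."""
    total = hits + noise
    return hits / total if total else 0.0


def recall(hits: int, misses: int) -> float:
    """Share of idioms that were found: hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0


def fscore(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, for illustration only.
hits, misses, noise = 90, 5, 10
p = precision(hits, noise)
r = recall(hits, misses)
f = fscore(p, r)
print(f"precision={p:.3f} recall={r:.3f} fscore={f:.3f}")
```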

Evaluation Results for Continuous iVPs

Corpus                                                    Recall   Precision   f-score
Europarl Corpus                                           98.3%    96.8%       96.8%
Manually constructed examples and examples from the Web   99%      96.2%       97.4%
DWDS                                                      98.9%    96.7%       97.4%

Evaluation Results for Discontinuous iVPs

Corpus                                                    Recall   Precision   f-score
Europarl Corpus                                           88.2%    78.9%       83.2%
Manually constructed examples and examples from the Web   95.7%    84.8%       88.8%
DWDS                                                      92.5%    90.2%       90.6%

Conclusion

• Continuous idioms: more than 95% recall and precision

• Discontinuous idioms: recall of about 90% or more (88.2% to 95.7%) and precision of roughly 80% or more (78.9% to 90.2%).

• In all cases, the evaluation figures for continuous idioms are higher than those for discontinuous idioms.

This is attributed to the fact that discontinuous idioms are more difficult to identify because their constituents are spread across the sentence.

Thank you for your attention!

Dimitra Anastasiou

www.d-anastasiou.com

dimitra@d-anastasiou.com

References (1)

• Bar-Hillel, Y., (1952), “The Treatment of ‘idioms’ by a Translating Machine”, presented at the Conference on Mechanical Translation at Massachusetts Institute of Technology, June 1952.

• Brown, R. D., (1999), “Adding Linguistic Knowledge to a Lexical Example-based Translation System”, in: 8th TMI 1999, Chester, England 22-32.

• Carl, M.; Schütz, J., (2005), “A Reversible Lemmatizer/Token-generator for English”, in: EBMT Workshop 2005, MT Summit X, Phuket, Thailand.

• Drach, Erich, (1963), Grundgedanken der deutschen Satzlehre, Darmstadt: Wissenschaftliche Buchgesellschaft.

• DUDEN Redaktion, (1998), Grammatik der deutschen Gegenwartssprache, Mannheim.

• Dürscheid, C., (2000), Syntax: Grundlagen und Theorien, Wiesbaden.

• Gangadharaiah, R.; Balakrishnan, N., (2006), “Application of Linguistic Rules to Generalized Example Based Machine Translation for Indian Languages”, in: Proceedings of the First National Symposium on Modeling and Shallow Parsing of Indian Languages (MSPIL), Mumbai, India.

• Gazdar, G.; Klein, E.; Pullum, G.; Sag, I., (1985), Generalized Phrase Structure Grammar, Oxford: Basil Blackwell.

• Jackendoff, R., (1997), The Architecture of the Language Faculty, Cambridge, Mass.: MIT Press.

• Krenn, B., (2008), “Description of evaluation resource – German PP-verb data”, in: MWE Workshop 2009, at LREC Conference, 7-11.

References (2)

• Moon, R., (1998), Fixed Expressions and Idioms in English: A Corpus-based Approach, Oxford, England: Clarendon Press.

• Ryu, B. R.; Kim Y. K.; Yuh, S. H.; Park S. K., (1999), “FromTo K/E: A Korean English Machine Translation system based on idiom recognition and fail softening”, in: MT Summit VII, Singapore, 469-475.

• Santos, D., (1990), “Lexical gaps and idioms in Machine Translation”, in: Karlgren, H. (Ed.), 13th COLING 1990, Helsinki, Finland, 330-335.

• Sumita, E.; Iida, H.; Kohyama, H., (1990), “Translating with Examples: A New Approach to Machine Translation”, in: 3rd TMI 1990, Texas, USA, 203-212.

• Trawinski, B.; Sailer, M.; Soehn, J.P.; Lemnitzer, L.; Richter, F., (2008), “Cranberry Expressions in English and German”, in: MWE Workshop 2009, at LREC Conference, 35-39.

• Volk, M., (1998), “The Automatic Translation of Idioms. Machine Translation vs. Translation Memory Systems”, in: Nico Weber (Ed.): Machine Translation: Theory, Applications, and Evaluation. An assessment of the state of the art. St. Augustin: Gardez-Verlag.

• Wehrli, E. (1998), “Translating Idioms”, in: 17th COLING 1998, Vol. 2, 1388-1392.