Post on 18-Dec-2015
1
Historical Developments of Translation Technology (TT)
1980s
ICT Development:
• widespread use of fax machines, enabling translation services to operate internationally
TT Development:
• ongoing MT research
• translator's workbench
• fax-based translation services
• telephone interpreting services
• online access to term banks
• multilingual text processing
• low-cost PC-based MT (PCMT)
2
Historical Developments of Translation Technology (TT)
1990s
ICT Development:
• penetration of PCs - desktop publishing (DTP)
• speech recognition
• PCs connected via modem
• telework
• Internet (Web)
• Sony PlayStation
• Google
• mobile phones - texting
TT Development:
• software localisation services
• localisation tools
• data-driven MT
• online term banks
• free WebMT (1997: Babelfish)
• web localisation services
• Translation Memory
3
Historical Developments of Translation Technology (TT)
2000s
ICT Development:
• iPod
• Wi-Fi (wireless LAN)
• films on DVDs
• blogging (web log)
• Internet TV
• video search with search engines
TT Development:
• widespread use of localisation tools across translation services
• corpus tools
• DVD subtitling (incl. audio descriptions)
4
Translation Technology Continuum
Continuum from full automation to full human involvement:
• Automatic Translation: translation process automated by use of Machine Translation
• Computer-aided Translation (CAT): translation process aided by electronic tools such as Translation Memory
• Unaided Translation: translation process not aided by any electronic tools
Adapted from Hutchins & Somers (1992)
5
Machine Translation (MT)
Rationale for Technology Applications to Translation
“A computer is a device that can be used to magnify human productivity. Properly used, it does not dehumanize by imposing its own Orwellian stamp on the products of human spirit … Translation is a fine and exacting art, but there is much about it that is mechanical and routine, if this were given over to a machine, the productivity of the translator would not only be magnified but this work would become more rewarding, more exciting, more human.”
Martin Kay (1987)
6
Machine Translation (MT)
MT research began in the 1950s, following Warren Weaver’s 1949 memo:
“When I look at an article in Russian, I say: This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.”
(in Locke and Booth 1955:18)
7
Machine Translation (MT)
MT history milestones (pre-ALPAC)
• 1954: Georgetown system demonstrates successful translation of 49 Russian sentences into English
• 1955-1966: $50m spent in 20 research centres in the USA
• 1966: Automatic Language Processing Advisory Committee (ALPAC) Report concludes:
- MT was slower, less accurate and twice as expensive as human translation
- there was no prospect of useful MT either immediately or in the future
8
Machine Translation (MT)
MT history milestones (post-ALPAC)
• 1969 onwards: privately funded projects: Logos system (1969); Weidner-CAT (1977); ALPS (1980)
• 1975: Météo project in Canada
• 1976: European Commission acquires Systran
• 1979: Eurotra project in Europe for a multilingual system
• 1980 onwards: PC-based systems
• 1990 onwards: data-driven systems; WebMT
9
Machine Translation (MT)
1975 Météo project in Canada
- automatic translation of weather forecasts (en-fr)
- sublanguage approach (domain-specific MT)
- most successful MT application to date
• public broadcasting since 1977
• fr-en available since 1989
• only 4% of output needs post-editing
• rapid translation staff turnover no longer a problem
10
Machine Translation (MT)
Renewed interest in MT in the 80s and 90s
• Technological factors
- prevalence of PCs with improved processing power
• Translation market factors
- official bilingualism/multilingualism creates institutional needs
- globalisation creates huge commercial needs
• Advances in computational linguistics
• More realistic user expectations
• Internet creates casual access to multilingual information
11
Machine Translation (MT)
MT Design
• Rule-based vs Data-driven Systems (SMT & EBMT)
- Rule-based systems by far the more common
• Architecture for Rule-based Systems
- Direct: 1st generation MT systems
- Transfer and Interlingua: 2nd generation MT systems
12
Machine Translation (MT)
MT Design
• Direct Systems
- designed for one specific language pair only
- translation done directly with no intermediary representation
- minimal SL analysis with basic strategy of replacement & adjustment
- heavily dependent on bilingual dictionaries
(Hutchins, 1986)
13
Machine Translation (MT)
MT Design
• Direct Systems
Advantages:
- potential ambiguities left unresolved (German Wand/Mauer, Italian parete/muro -> wall)
- minimum resources (bilingual dictionary; rudimentary TL knowledge)
Disadvantages:
- generally poor translation due to the word-for-word model used
- no analysis of the internal structure of the SL sentence
- no generalisation (case-by-case approach)
e.g. SL (fr) Les soldats sont dans le café.
     TL (en) The soldiers are in the coffee.
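The word-for-word strategy can be sketched in a few lines of Python. This is a minimal illustration, not any historical system: the dictionary entries and the function name `direct_translate` are assumptions made up for this example. Because each SL word is replaced independently, "café" is always rendered as "coffee", whatever the context.

```python
# Toy French-English bilingual dictionary (illustrative entries only).
BILINGUAL_DICT = {
    "les": "the",
    "soldats": "soldiers",
    "sont": "are",
    "dans": "in",
    "le": "the",
    "café": "coffee",  # single stored equivalent: no sense disambiguation
}


def direct_translate(sentence):
    """Replace each SL word with its dictionary entry; no structural analysis."""
    words = sentence.rstrip(".").lower().split()
    # Unknown words are passed through unchanged.
    target = [BILINGUAL_DICT.get(word, word) for word in words]
    text = " ".join(target)
    return text[0].upper() + text[1:] + "."


print(direct_translate("Les soldats sont dans le café."))
# prints "The soldiers are in the coffee."
```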
14
Machine Translation (MT)
MT Design
• Interlingua-based Systems (language-independent representations)
- in theory the language-independent interlingua can be applied to any language pair
- requires abstraction away from SL texts to a language-independent pivot from which any TL can be generated
- true interlinguas hard to find
(Hutchins, 1986)
15
Machine Translation (MT)
MT Design
• Transfer-based Systems (language-dependent representations)
- analyse SL text to achieve an unambiguous abstract representation (an interface structure) of each sentence
- convert abstract SL representation to an abstract TL representation
- generate TL text from abstract TL representation
(Hutchins, 1986)
16
Machine Translation (MT)
MT Design
• Transfer-based Systems
Surface SL text -> 1. SL analysis -> SL Interface Structure -> 2. Transfer -> TL Interface Structure -> 3. TL generation -> Surface TL text
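The three stages can be sketched as three separate functions. This is a toy illustration under stated assumptions: the lexicons, the single adjective-noun reordering rule and the names `analyse`, `transfer` and `generate` are all invented for this example, and a real system would use full grammars and dictionaries.

```python
# Hypothetical toy resources for an English-to-French sketch.
EN_POS = {"the": "DET", "red": "ADJ", "car": "N", "book": "N"}
EN_FR = {"the": "le", "red": "rouge", "car": "voiture", "book": "livre"}
FR_GENDER = {"voiture": "f", "livre": "m"}


def analyse(text):
    """1. SL analysis: surface text -> SL interface structure (lemma + category)."""
    return [{"lemma": w, "pos": EN_POS[w]} for w in text.lower().split()]


def transfer(sl_structure):
    """2. Transfer: lexical substitution plus one structural rule (ADJ N -> N ADJ)."""
    tl = [{"lemma": EN_FR[n["lemma"]], "pos": n["pos"]} for n in sl_structure]
    i = 0
    while i < len(tl) - 1:
        if tl[i]["pos"] == "ADJ" and tl[i + 1]["pos"] == "N":
            tl[i], tl[i + 1] = tl[i + 1], tl[i]
            i += 2
        else:
            i += 1
    return tl


def generate(tl_structure):
    """3. TL generation: make the determiner agree with the noun's gender."""
    gender = next(
        (FR_GENDER[n["lemma"]] for n in tl_structure if n["pos"] == "N"), "m"
    )
    words = []
    for node in tl_structure:
        if node["pos"] == "DET":
            words.append("la" if gender == "f" else "le")
        else:
            words.append(node["lemma"])
    return " ".join(words)


print(generate(transfer(analyse("the red car"))))  # prints "la voiture rouge"
```

Only `transfer` depends on both languages, which is why a separate transfer component is needed per language pair and direction, while `analyse` and `generate` are monolingual and could be reused.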
17
Machine Translation (MT)
MT Design
• Transfer-based Systems
- based on systematic linguistic theory
- convenient way of categorising linguistic problems (monolingual or contrastive)
- modular
- need detailed coding of monolingual and bilingual dictionaries and grammars
- a dedicated transfer component is needed for each language pair, in each direction
18
Machine Translation (MT)
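[Figure: Vauquois triangle. Analysis rises from the Source Text and generation descends to the Target Text; direct translation runs along the base, transfer across the middle, and the Interlingua sits at the apex.]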
19
Machine Translation (MT)
MT Design
• Data-driven systems: Statistical MT (SMT)
- linguistic knowledge not encoded
- takes advantage of a bilingual parallel corpus to arrive at probable translations of each word
- corpus-dependent
- at run-time the system searches for the most probable translation
(Carl & Way, 2003)
e.g. IBM’s experiment with the Canadian Hansard corpus: Candide (1988)
20
Machine Translation (MT)
MT Design
• Statistical MT: Candide
In Canadian Hansard (parliamentary debates of 40K sentences in each of en and fr):
en    fr     p
the   le     .610 (i.e. 610 times out of 1000)
the   la     .178
the   l’     .083
the   les    .023
the   ce     .013
...
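How such word-translation probabilities can be derived from a sentence-aligned corpus is sketched below. The slides do not say which estimation procedure was used; this sketch assumes an expectation-maximisation loop in the style of IBM Model 1, and the three-sentence corpus and the name `train_model1` are invented for illustration (they are not Hansard data).

```python
from collections import defaultdict

# Invented toy parallel corpus (English, French); not Hansard data.
CORPUS = [
    ("the house", "la maison"),
    ("the book", "le livre"),
    ("a book", "un livre"),
]


def train_model1(corpus, iterations=10):
    """Estimate t[en_word][fr_word] = p(fr_word | en_word) by EM."""
    pairs = [(en.split(), fr.split()) for en, fr in corpus]
    en_vocab = {w for en, _ in pairs for w in en}
    fr_vocab = {w for _, fr in pairs for w in fr}
    # Start from a uniform distribution: no linguistic knowledge is encoded.
    t = {e: {f: 1.0 / len(fr_vocab) for f in fr_vocab} for e in en_vocab}
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        for en, fr in pairs:
            for f in fr:
                # E-step: share each French word among the English words
                # of the same sentence pair, in proportion to current t.
                norm = sum(t[e][f] for e in en)
                for e in en:
                    share = t[e][f] / norm
                    count[e][f] += share
                    total[e] += share
        # M-step: renormalise the expected counts into probabilities.
        t = {e: {f: count[e][f] / total[e] for f in fr_vocab} for e in en_vocab}
    return t


table = train_model1(CORPUS)
best = max(table["book"], key=table["book"].get)
print(best)  # prints "livre"
```

The resulting table is entirely corpus-dependent: with a different parallel corpus, the same code yields different probabilities.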
21
Machine Translation (MT)
MT Design
• Data-driven systems: Example-based MT (EBMT)
inspired by Nagao (1984) who talked about translation by analogy
“Man does not translate a simple sentence by doing deep linguistic analysis, rather, man does translation, first, by properly decomposing an input sentence into certain fragmental phrases, then by translating these phrases into other language phrases, and finally by properly composing these fragmental translations into one long sentence. The translation of each fragmental phrase will be done by the analogy translation principle with proper examples as its reference”
22
Machine Translation (MT)
MT Design
• Example-based MT (EBMT)
- operates on a bilingual corpus with alignments of translation units at word, phrase, and sentence level
- during runtime, the system checks whether an adequate translation is stored in the corpus
- best results are obtained if large coherent parts are found in the corpus
23
Machine Translation (MT)
MT Design
• Example-based MT (EBMT)
1. He buys a book on international politics [ST].
2. a. (E) He buys a notebook.
      (J) Kare wa noto wo kau.
          He [topic] notebook [obj] buy.
   b. (E) I read a book on international politics.
      (J) Watashi wa kokusaiseiji nitsuite kakareta hon o yomu.
          I [topic] international politics about concerned book [obj] read
3. Kare wa kokusaiseiji nitsuite kakareta hon o kau [TT].
(Sato & Nagao, 1990)
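The recombination in this example can be sketched as slot substitution. This is a minimal sketch, assuming the sub-sentential alignments are already available: the `TEMPLATES` and `FRAGMENTS` structures and the function `translate_by_analogy` are hypothetical names for this illustration, and a real EBMT system would have to find the alignments and matches itself.

```python
# Example (a) stored as a template with one aligned, replaceable fragment:
# (SL sentence, TL sentence, SL fragment, TL fragment).
TEMPLATES = [
    ("He buys a notebook", "Kare wa noto wo kau", "a notebook", "noto wo"),
]

# Fragment pair assumed to have been extracted from example (b).
FRAGMENTS = {
    "a book on international politics": "kokusaiseiji nitsuite kakareta hon o",
}


def translate_by_analogy(source, templates, fragments):
    """Match the input against a stored example, then recombine fragments."""
    for tmpl_src, tmpl_tgt, slot_src, slot_tgt in templates:
        prefix, _, suffix = tmpl_src.partition(slot_src)
        fits = (
            source.startswith(prefix)
            and source.endswith(suffix)
            and len(source) >= len(prefix) + len(suffix)
        )
        if not fits:
            continue
        # Whatever fills the slot in the input must itself be a known fragment.
        new_fragment = source[len(prefix):len(source) - len(suffix)]
        if new_fragment in fragments:
            return tmpl_tgt.replace(slot_tgt, fragments[new_fragment], 1)
    return None  # no adequate example stored


print(translate_by_analogy(
    "He buys a book on international politics", TEMPLATES, FRAGMENTS
))
# prints "Kare wa kokusaiseiji nitsuite kakareta hon o kau"
```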
24
EBMT Principle
[Figure: Vauquois triangle adapted for EBMT (Somers, 2003:8). Source Text to Target Text, with the EBMT stages overlaid on the classical ones: matching on analysis, alignment on transfer, recombination on generation, and exact match on direct translation.]
25
Translation Memory (TM)
TM
• A database of aligned SL and TL segments (translation units) that allows the translator to:
- propagate translations of internal repetitions in the source text through the target text
- recycle translations for previously encountered source text segments (exact matches or fuzzy matches with some edits)
- analyse new source texts for repetitions and matches with already translated texts stored in a translation memory
26
Translation Memory (TM)
TM: how it works
• software segments source language (SL) text
• human translator translates an SL segment
• software stores the SL and the corresponding TL segment as a translation unit
• software checks an incoming SL segment against the stored SL segments and brings up a relevant translation unit in case of a match
• translator determines whether or not to use or edit the previous translation called up by the software
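The store-and-lookup cycle above can be sketched as follows. This is a minimal illustration, not how any commercial TM tool works: the class `TranslationMemory`, the sentence-splitting rule and the 0.75 fuzzy threshold are assumptions, and similarity is measured with the character-based `difflib.SequenceMatcher` from the Python standard library.

```python
import re
from difflib import SequenceMatcher


def segment(text):
    """Split SL text into sentence-like segments (naive punctuation rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class TranslationMemory:
    def __init__(self, fuzzy_threshold=0.75):  # threshold is an assumption
        self.units = {}  # SL segment -> TL segment (translation units)
        self.fuzzy_threshold = fuzzy_threshold

    def store(self, sl_segment, tl_segment):
        self.units[sl_segment] = tl_segment

    def lookup(self, sl_segment):
        """Return (match_type, score, stored_sl, stored_tl), or None."""
        if sl_segment in self.units:
            return ("exact", 1.0, sl_segment, self.units[sl_segment])
        best = None
        for stored_sl, stored_tl in self.units.items():
            score = SequenceMatcher(None, sl_segment, stored_sl).ratio()
            if score >= self.fuzzy_threshold and (best is None or score > best[1]):
                best = ("fuzzy", score, stored_sl, stored_tl)
        return best


tm = TranslationMemory()
tm.store("Press the red button.", "Appuyez sur le bouton rouge.")

for seg in segment("Press the red button. Press the green button. Close the door."):
    print(seg, "->", tm.lookup(seg))
```

The lookup only proposes a stored unit; whether to accept, edit or discard it remains the translator's decision. The score reflects surface similarity of the strings, not similarity of meaning.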
27
Translation Memory (TM)
Advantages:
• the translator can find out the degree of internal repetition within the SL text before translating
• sentence-level matches and similarities are automatically brought to the translator’s attention for re-use
• productivity boosted when the text type is suitable (i.e. repetitive, frequent updates, sim-ship etc.)
• TM normally integrates concordance and terminology management components to assist consistency of use of words and terminology
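The first advantage, checking the degree of internal repetition before translating, can be approximated with a simple count of repeated segments. This is a rough sketch: the function names and the naive sentence splitter are assumptions, and real TM tools report more detailed statistics (for example, fuzzy-match bands).

```python
import re
from collections import Counter


def segment(text):
    """Split text into sentence-like segments (naive punctuation rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def repetition_rate(text):
    """Share of segments that repeat an earlier segment in the same text."""
    segments = segment(text)
    if not segments:
        return 0.0
    counts = Counter(segments)
    repeated = sum(n - 1 for n in counts.values())
    return repeated / len(segments)


manual = "Open the lid. Insert the disc. Close the lid. Open the lid."
print(repetition_rate(manual))  # prints 0.25
```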
28
Translation Memory (TM)
Disadvantages:
• Previous errors contained in the TM are propagated when:
- the translator forgets to update the TM
- the translator is asked not to change an existing poor translation
• A ‘sentence salad’ phenomenon (Bédard, 2000), in which the resulting text is less coherent or readable because:
- the translator is confined to working at sentence level
- the translator tries to maximise recyclability
- the TM consists of varying texts translated by different translators (Bowker & Barlow, 2004)
• Similarities in form rather than semantic similarities are picked up
• Potential de-skilling of the translator (Kenny, 2004)