Making the Most of Machine Translation Today
An eBook from the ULG library
Machine Translation (MT) is ever-present in the translation industry.
The technology is being used to shorten project
timelines for Language Service Providers (LSPs)
and reduce costs for clients as they localize
content around the globe. MT is an amazing
tool for business when it’s used properly. Often,
this means working with a linguist to post-edit
translated documents to ensure translations
are accurate and readable.
Although MT has reached amazing new levels
of accuracy, it’s still not on par with human
linguistic expertise. But that doesn’t mean
using MT is a fruitless endeavor; in fact, that’s
far from the case. This article describes how
global companies can best leverage MT without
sacrificing the quality of translation projects.
History
Machine Translation is a relatively new technology,
with origins in the early 20th century, that has
consistently experienced rapid advancements since
the 1980s. George Artsrouni and Petr Smirnov-Troyanskii
worked to create MT in the 1930s, with
Smirnov-Troyanskii laying the groundwork for what
was needed in an MT system.
Artsrouni, an engineer, and Troyanskii, a Russian
academic, both applied for patents that are considered
the definitive precursor to modern-day MT. The two
men sought patents for electromechanical tools that
could be used as translation dictionaries (Hutchins).
Troyanskii generally garners more acclaim in MT
history, given that he suggested the system would
require the following component parts: an editor
familiar with a source language to convert words to
base forms; a machine, which would turn them into
equivalent forms in the target language; and a second
editor who would edit the machine translations
(Craciunescu, Gerding-Salas, Stringer-O’Keeffe,
2004).
Attempts to create a successful MT system
continued into the 1950s and 1960s with the advent
of computers. In 1954, the first MT test run took place when
IBM partnered with Georgetown University, a demonstration
widely circulated in the media and considered to be quite a feat
at the time. Although advancements continued in the field,
the thought of digitally translating languages seemed less
and less like a possibility and more like a flawed experiment.
Hopes were diminished when the United States government
created an advisory committee on the technology, the
Automatic Language Processing Advisory Committee
(ALPAC), in 1964. The committee released an unflattering
report of MT’s progress, saying research wasn’t advancing
at the rate it should.
“No one can guarantee, of course, that we will not suddenly
or at least quickly attain machine translation, but we feel that
this is very unlikely,” the report stated.
In ALPAC’s paper, released in 1966, computational linguistics
researcher Victor Yngve, who was working at MIT at the time,
provided a similarly doubtful outlook.
“As to the possibility of fully automatic translation, I am
convinced we will someday reach the point where this will
be feasible and economical. However, there is considerable
basic knowledge required that we simply don’t have at the
moment, and it is anybody’s guess how soon this knowledge
can be obtained,” said Yngve.
The report put a stop to MT research for some time, before
important developments resurfaced in the 1980s.
MT research and development boomed in the 1990s and
2000s with the advent of the digital age. Statistical- and
example-based translation came to the fore during those
decades, and today the most recent advances, such as Neural
Machine Translation (NMT), are again making waves in the
industry.
Currently, MT is seen as a technology that’s drastically
improved since its infancy but still has a good amount of
room to grow in terms of accuracy. While MT has gotten
better at deciphering context thanks to NMT, which is able
to review complete sentences at a time instead of individual
words or phrases, it is still usually necessary to include Post
Editing (PE) for the best results. PE refers to the process in
which a human translator cleans up a document after it goes
through an MT system.
Functions
MT is now virtually ubiquitous in the modern world, and
sites like Google Translate and Bing allow anyone to use the
technology.
MT is used in many different ways by companies within
different industry verticals, including Legal, Life Sciences,
Manufacturing, Information Technology, Finance and
Consumer Products. Businesses have had success using
MT to reduce translation costs and more efficiently convey
global messages.
MT can be used together with other technologies to
expedite translation projects, including:
• Language Identification: Language Identification (LI) automatically identifies the language(s) a document contains based on a pre-populated text corpus stored in the system.
• Optical Character Recognition: Optical Character Recognition (OCR) scans photos or documents containing text and converts that text into an editable format. OCR is commonly used for scanned PDFs or other non-editable documents.
• Terminology Integration: Some MT systems can incorporate Translation Memory (TM) or glossary databases to aid in keeping terminology consistent.
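To make the idea of corpus-based Language Identification more concrete, here is a minimal Python sketch. It is an illustration only, not how any particular commercial LI tool works: the three-sentence corpus, the character-trigram approach and the names SAMPLE_CORPUS, trigram_profile and identify_language are all assumptions made for this example. A real system would store far more text per language.

```python
from collections import Counter

# Hypothetical, tiny "pre-populated corpus": one sample sentence per language.
SAMPLE_CORPUS = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sleeps in the house",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y el gato duerme en la casa",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze schläft im haus",
}


def trigram_profile(text):
    """Count overlapping three-character sequences in the lowercased text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# Build one trigram profile per language from the stored corpus.
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLE_CORPUS.items()}


def identify_language(text):
    """Return the language code whose profile overlaps most with the text."""
    query = trigram_profile(text)

    def overlap(profile):
        # Shared trigram count between the query and a language profile.
        return sum(min(count, profile[gram]) for gram, count in query.items())

    return max(PROFILES, key=lambda lang: overlap(PROFILES[lang]))
```

Calling identify_language on a short document returns the code of the closest-matching language in the stored corpus, so a document in a language missing from the corpus would still be assigned to one of the known languages.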
MT can be a boon for businesses, but its limitations may also
outweigh its capabilities if it is used inappropriately or without
proper preparation.
Recently, Google and Microsoft have developed
Neural MT (NMT) engines and incorporated them into their
applications. This relatively new technology promises to
replace Phrase-Based Machine Translation (PBMT). NMT
employs artificial intelligence algorithms, so-called neural
networks, that can derive meaning from whole sentences or
ideas, whereas PBMT works with individual words or
segments of a sentence (Wu et al., 2016).
NMT is loosely modeled on networks of neurons in the human brain:
information passes through successive “layers” to be processed before
output. The biggest benefits of NMT are its speed and accuracy,
which stem from its ability to learn linguistic patterns on its own
from example translations rather than from hand-written rules.
PBMT, a form of statistical MT, on the other hand, uses predictive
algorithms to translate text. These systems are built upon
parallel bilingual text corpora, which serve as a basis for
“matching” to create output with the highest probability of
being correct.
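The “matching” idea can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: the phrase table and its probabilities are invented for the example, and the greedy longest-match lookup stands in for the far more elaborate decoding that production statistical systems perform. The names PHRASE_TABLE, MAX_PHRASE_LEN and translate are hypothetical.

```python
# Hypothetical phrase table, as if learned from a parallel English-Spanish corpus:
# source phrase -> {candidate translation: estimated probability}.
PHRASE_TABLE = {
    ("good", "morning"): {"buenos días": 0.9, "buena mañana": 0.1},
    ("good",): {"bueno": 0.6, "buen": 0.4},
    ("morning",): {"mañana": 1.0},
    ("friend",): {"amigo": 0.7, "amiga": 0.3},
}
MAX_PHRASE_LEN = 2  # longest source phrase stored in the table


def translate(sentence):
    """Greedily match the longest known phrase and emit its most probable translation."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest phrase first, then fall back to shorter ones.
        for length in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            candidates = PHRASE_TABLE.get(tuple(words[i:i + length]))
            if candidates:
                output.append(max(candidates, key=candidates.get))
                i += length
                break
        else:
            # No match in the table: pass the unknown word through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)
```

Note how “good morning” is translated as a unit, while “good” on its own falls back to a word-level match. This is also why such a system can only handle the language pair its corpus covers.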
When MT Works For Business
Before using MT for a corporate translation project, ask yourself
a few key questions to determine whether the solution is appropriate:
• Is a quick turnaround time necessary?
• Will the translation be distributed externally?
• Is cost a critical factor?
• Is the source documentation confidential, or protected by regulation?
If quality is your top priority, it is highly unlikely that you will
create accurate translations without the help of a human
translator.
On the other hand, if there is a need to translate a large number
of documents to determine which materials are relevant (for
example, if a law firm has a large number of multilingual court
documents to review), MT can be very helpful. MT is also
handy in less formal situations to generate basic translations
of emails, internal communications or memos (getting the
so-called “gist”). Getting the “gist” of foreign-language
correspondence with MT is much less time-consuming, labor-intensive
and costly than calling in a professional linguist. But
always keep data security in mind: secure MT applications
that use standard security measures, such as 256-bit
encryption, and that do not store user data are critical for
translating emails and websites quickly and safely.
For translations that will be distributed externally
(such as PR and brand-sensitive materials), it is usually best
to use a human translator or a combination of human and
machine translation through post-editing (PE). Marketing and
advertising content isn’t usually a good candidate for MT, given the
complicated and often idiomatic language it contains.
From a practical standpoint, it’s also important to realize that
a substantial amount of setup and preparation goes into MT.
If you don’t have the time and resources to properly train an
MT system, it will produce very inconsistent results.
Related to this point is the fact that MT systems work on
language pairs and can only translate between the languages
the system has been built for. No MT engine can automatically
translate any language; it must first be properly trained
on existing data for that language pair.
When MT Doesn’t Work
The biggest drawback of MT is its ongoing inability to pick up
on linguistic nuances, something humans do naturally.
As mentioned earlier, MT systems are weak at translating
marketing copy like blogs, taglines, or proposals accurately and
readably. Likewise, literature and other creative or persuasive
writing are examples of texts where MT may leave readers
confused or put off. Since MT systems are incapable of
sensing and transmitting the “feelings” or “emotions” being
employed in creative copy, MT can’t successfully convey its
message.
In the field of healthcare, using MT for “mission critical”
documentation, such as medical device instructions, is still
not recommended. Whenever patient safety is at stake, it’s
a good rule always to rely on the human touch.
Projects where you should avoid relying solely on MT include:
• Those in highly regulated fields such as medical devices or healthcare, where safety could be compromised by a poor translation
• Those that include content that will be distributed externally or used as branding or promotional materials
• Those with content that contains nuanced, complicated meanings, such as literature or creative/persuasive texts
That said, it is a good idea for businesses to use MT as a
primer, or first shot, to figure out the “gist” of a text in another
language.
Another key issue is security. It is tempting to rely on
free, online MT offerings, but these almost always lack
confidentiality, and in the end you may be paying a higher
price for the translation than you expect.
“Successful application of Machine Translation requires post editing and an engine that will stand up to confidentiality requirements faced by users in regulated fields, or those with sensitive documentation. Modern day machine translation is an invaluable tool when used with the help of professional human linguists.”
– Kristen Giovanis, United Language Group President
Breaching client confidentiality could mean legal action and
damage to professional relationships. Google Translate, for
example, stores any and all data it receives. By inputting your
content into Google, you are allowing Google to make free use
of it for its own purposes. From the site’s terms of service:
“When you upload, submit, store, send or receive content
to or through our Services, you give Google (and those we
work with) a worldwide license to use, host, store, reproduce,
modify, create derivative works (such as those resulting from
translations, adaptations or other changes we make so that
your content works better with our Services), communicate,
publish, publicly perform, publicly display and distribute such
content.”
The bottom line is this: MT is a cost-effective and efficient
way to perform an initial translation of content. And, in some
cases, using post-edited MT output will produce the results
you need.
The biggest mistake you can make is relying solely on MT
for translation if your final product will be client-facing or
distributed to the outside world.
Know Your Audience, Industry and Scope of Your Project
MT is an extremely valuable asset to businesses that know
how to use it properly. Using MT to get the gist
of a text, or to process an enormous volume of documents
and sort out which ones must be translated by a human, is
hugely beneficial.
Combining PE and MT results in faster turnaround times than
standard human translation (because a linguist isn’t needed
to complete the initial translation) and saves money.
MT tools used in a secure environment can be extremely
advantageous in the corporate world. Perhaps most importantly,
to be truly effective, businesses that plan to use MT need to
understand how the technology will be used and what the
scope of their project is.
Always remember that MT cannot pick up on the linguistic
complexities that a human translator would, and that MT
almost always produces a less-than-perfect translation.
Here are some best practices around the use of MT:
• Use MT to get the “gist” of a text quickly and cost-effectively.
• Employ MT when you have a large number of documents that need to be translated quickly.
• Use MT when reviewing documents internally.
• Understand the risks of using MT. The best way to produce accurate translations is to combine machine and human translation.
• Know your audience, your industry and the scope of your project before you deploy MT.
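As a rough illustration, the guidance above can be written down as a small decision helper in Python. This is only one possible reading of the recommendations in this eBook, not an official ULG tool: the function name recommend_workflow, its four yes/no parameters and the wording of the returned labels are all assumptions made for the example.

```python
def recommend_workflow(*, external=False, safety_critical=False,
                       creative=False, confidential=False):
    """Suggest a translation approach from four yes/no project attributes."""
    if safety_critical or creative:
        # Regulated, safety-related or nuanced creative content: rely on a human.
        approach = "human translation"
    elif external:
        # Client-facing content: combine MT with a human post-editor.
        approach = "MT with human post-editing"
    else:
        # Internal review or large-volume triage: raw MT is usually enough.
        approach = "raw MT for gist"

    # Confidential material should never go through a free online MT service.
    if confidential and approach != "human translation":
        approach += " on a secure MT engine"
    return approach
```

For example, an internal batch of confidential court documents would come back as “raw MT for gist on a secure MT engine”, while medical device instructions would come back as “human translation” regardless of the other answers.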
References:
Craciunescu, O., Gerding-Salas, C., & Stringer-O’Keeffe, S.
(2004). Machine Translation and Computer-Assisted Translation:
A New Way of Translating? Retrieved from:
http://www.translationjournal.net/journal/29computers.htm
Wu, Y., Schuster, M., Chen, Z., Le, Q., Norouzi, M. (2016)
Google’s Neural Machine Translation System: Bridging the
Gap Between Human and Machine Translation.
Retrieved from: https://arxiv.org/pdf/1609.08144v1.pdf
AlSukhni, E., Alkabi, M., Alsmadi, I. (2016) An Automatic
Evaluation for Online Machine Translation: Holy Quran Case
Study. Retrieved from:
http://www.thesai.org/Downloads/Volume7No6/Paper_14-An_Automatic_Evaluation_for_Online_Machine_Translation.pdf
Hutchins, J. Two Precursors of Machine Translation: Artsrouni
and Trojanskij. Retrieved from:
http://www.hutchinsweb.me.uk/IJT-2004.pdf
Automatic Language Processing Advisory Committee.
Language and Machines: Computers in Translation and
Linguistics. Retrieved from:
http://www.mt-archive.info/ALPAC-1966.pdf