Overview of technologies for translators and language service providers
Belinda Maia
University of Porto
Translator as Language Services Provider
• MUST HAVE KNOWLEDGE OF:
– Science and Technology
– National and International Economics, Politics, Law and Current Affairs
– Multimedia
– Human Language Technologies - HLT
– Information Society Technologies - IST
• MUST BE:
– A Multidisciplinary Communicator
– A Multimedia Communicator
– AND an Intercultural Communicator
Translator as Intercultural Communicator
• MUST HAVE KNOWLEDGE OF:
– Psycholinguistics
– Contrastive linguistics
– Sociolinguistics
– Cultural theory
– Literary theory
• MUST BE:
– Multilingual and multiculturally sensitive
Translator as Multimedia Communicator
• MUST HAVE KNOWLEDGE OF:
– General IT as user
– Special IT for translators – MT, CAT, etc.
– Subtitling and Dubbing programmes
– Web Pages
– ETC
• MUST BE:
– Computer literate and aware of new media
Information Society Technologies
• European Programme at: http://cordis.europa.eu/ist/
• Focus on:
– Technology for providing information
– Language as vehicle of information
– Language as structuring knowledge
– Knowledge management
HLT (1) Calls for (research) proposals
• 1999/2000
• MLIS (Multi Lingual Information Society)
– the provision of multilingual language resources over global networks
– the development of multilingual networked services
HLT (2) Calls for (research) proposals
• 2000/2001
• Multilingual communication services and appliances
– Multilingual e-service and e-commerce
– Natural and multilingual interactivity
– Multilingual web
– Multimodal and multi-sensorial dialogue modes
HLT (3) Calls for (research) proposals
• 2002/6 – Focus on:
– Knowledge and Interface Technologies
  • Multi-modal interfaces
  • Semantic-based knowledge systems
– Cognitive systems
– Bio-inspired Intelligent Information Systems
Multimodal Interfaces
• Multilingual Communication > Facilitating translation for unrestricted domains, especially for spontaneous (unrestricted) or ill-formed (speech) inputs in task-oriented settings.
• Areas to be addressed include:
Multimodal Interfaces
• human-to-human;
• human-to-things;
• human-to-self;
• human-to-content;
• device-to-device;
• human-to-embodied robots.
Multimodal Interfaces
• Areas to be addressed include:
• speech-to-speech translation;
• statistical/mixed approaches to translation;
• adaptive techniques, incorporating learning;
• robustness of approach.
Don’t forget
• HLT research proposals are for cutting-edge technology
• The results will be in the future
• But the future is coming!
Technology FOR Translators
• Machine translation (MT)
• Machine assisted translation (MAT)
• Internet for information retrieval
• Corpora use
• Terminology Management
• Multimedia tools
• Summarisation and Revision
MT – a threat, a solution or a tool?
• A threat?
– Under present circumstances - No
• A solution?
– Partially > ‘gist’ translation
• A tool?
– Increasingly > + pre- and post-editing
– OR Human Assisted MT (HAMT)
On-line MT engines
– Babelfish
– FreeTranslation
– WorldLingo
– Systran
– Google
– E-Translation Server
– Amikai
Online MT - uses
• Training in awareness of lexical and syntactic difficulties for both human and machine translation
• Our experiment with METRA
• It gets hundreds of hits per day, so who is using it?
• A lot of translators….. !
MAHT Commercial Programmes
• SDL + TRADOS - Check
• http://www.sdl.com/
• http://www.trados.com/
• DÉJÀ VU - http://www.atril.com/
• STAR - TRANSIT http://www.star-group.net/eng/home.html
• WORDFAST - http://www.wordfast.net/
MAHT Basic tools
• Translation memories (TMs) + concordancer
• TM created:
– As translator works
– Using text aligner on previous texts + translations
• Terminology database created:
– Pre-translation by terminologist / company / translator
– Post-translation by aligning terms in text and translation
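A minimal sketch of how a translation memory can grow "as the translator works" and offer fuzzy matches for new segments. It is an illustration only, not how any commercial MAHT programme is implemented: the names `TM`, `tm_lookup` and `add_segment`, the sample segments, and the 0.75 threshold are all assumptions, and similarity is measured with Python's standard `difflib` rather than a real TM matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: (source segment, stored translation).
TM = [
    ("Press the red button.", "Prima o botão vermelho."),
    ("Close the window.", "Feche a janela."),
]

def tm_lookup(segment, memory, threshold=0.75):
    """Return (score, source, target) for the best match at or above
    the threshold, or None when nothing in the memory is close enough."""
    best = None
    for source, target in memory:
        # Character-level similarity between 0.0 and 1.0, case-insensitive.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best

def add_segment(memory, source, target):
    """Store a newly translated segment so it can be re-used later."""
    memory.append((source, target))

# A near match is offered to the translator for editing.
match = tm_lookup("Press the green button.", TM)
if match is not None:
    print(f"{match[0]:.0%} match: {match[1]} -> {match[2]}")

# A confirmed new translation is added to the memory.
add_segment(TM, "Open the window.", "Abra a janela.")
```

The translator still has to edit a fuzzy match; the memory only saves retyping and keeps earlier wording consistent.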
MAHT Additional tools
• Spelling and grammar checkers – in Word
• Machine Translation
• File formatting facilities
• Terminology > knowledge databases
• Project Management facilities
• ETC
• For further details come to the commercial sessions on Wednesday!
eCoLoRe Training kits for TM technology
• Problem: OK – we have bought the TM software for our university – but it is empty!
• Solutions?
• Make your own TMs
• eCoLoRe at http://ecolore.leeds.ac.uk/
Translation technology - Needs
• To find, keep and re-use information
• To work within multimedia technology
• Good understanding of Linguistics
• Understanding of how/why spelling and grammar checkers, MT, and other HLTs do(n’t) work
Using the Internet
• To find information
– Understanding how the internet works
– Using browsers intelligently
• To keep information
– Collecting site links
– Downloading useful information
• To convert information to knowledge
– Studying special subjects
Internet information
• Eurodicautom, online terminology, glossaries, dictionaries
• On-line encyclopedias – e.g. Wikipedia
• Translators’ pages
• Translators’ forums and mailing lists
• Systematic finding, analysing and storage of relevant information / knowledge
Monolingual Corpora as tools
• Large quantities of varied types of text
• British National Corpus (BNC) – online at: http://sara.natcorp.ox.ac.uk/lookup.html
• Linguateca – Portuguese corpora – online at: http://www.linguateca.pt
• PLEASE inform of others!
Multilingual Corpora as tools
• EU documents at: http://europa.eu.int/
• Parallel corpora (Translation Memories?)
– E.g. COMPARA > EN & PT (literary) online at: http://www.linguateca.pt – 1 million x 2
• Comparable corpora – originals in different languages, but same domain and/or genre
Corpora - uses
• Monolingual corpora – finding the right word or collocation
• Multilingual / parallel corpora – finding terminology and translation suggestions
• Comparable corpora – discovering expert terminology and local text conventions
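As a rough illustration of "finding the right word or collocation" in a monolingual corpus, the sketch below builds a keyword-in-context (KWIC) concordance and counts the words that follow a search word. The function names, the three-sentence sample text and the context width are assumptions made for the example; real corpora such as the BNC are queried through their own interfaces.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

def right_collocates(tokens, keyword):
    """Count the words that immediately follow the keyword."""
    return Counter(
        tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == keyword
    )

# Hypothetical miniature corpus, for illustration only.
corpus = (
    "She made a decision quickly. He made a mistake. "
    "They made a decision together."
)
tokens = tokenize(corpus)

for left, word, right in kwic(tokens, "decision", width=2):
    print(f"{left:>12} [{word}] {right}")

print(right_collocates(tokens, "a").most_common(2))
```

Sorting the concordance lines by left or right context, as most concordancers allow, makes recurring patterns easier to spot.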
Terminology > Knowledge
From:
• The ‘right word’
• Glossaries / dictionaries
• Databases
• Thesauri
• Conceptual organization
• Ontologies
• Knowledge databases
Corpógrafo – integrated suite of online tools
• Corpora construction and analysis
• Semi-automatic term extraction
• Concept databases
• Traditional terminology fields
• Semi-automatic extraction of definitions and semantic relations
• Visualization of concept systems / ontologies
• Produced by Linguateca – PoloCLUP and freely available at: http://www.linguateca.pt/corpografo
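To give a feel for what "semi-automatic term extraction" can involve, here is a small sketch that proposes term candidates by comparing word frequencies in a specialised text with those in a general reference text. This is a generic frequency-ratio heuristic chosen for illustration and is not a description of how Corpógrafo works; the function names, thresholds and sample texts are all assumptions. The "semi-automatic" part is that a terminologist still reviews the candidates.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def term_candidates(domain_text, reference_text, min_count=2, ratio=2.0):
    """Return (word, score) pairs for words that are relatively more
    frequent in the domain text than in the reference text."""
    dom = Counter(tokenize(domain_text))
    ref = Counter(tokenize(reference_text))
    dom_total = sum(dom.values()) or 1
    ref_total = sum(ref.values()) or 1
    scored = []
    for word, count in dom.items():
        if count < min_count:
            continue  # ignore words seen too rarely to judge
        dom_rel = count / dom_total
        # Add-one smoothing so words absent from the reference do not divide by zero.
        ref_rel = (ref[word] + 1) / (ref_total + 1)
        score = dom_rel / ref_rel
        if score >= ratio:
            scored.append((word, round(score, 2)))
    return sorted(scored, key=lambda pair: -pair[1])

# Hypothetical sample texts, for illustration only.
domain = (
    "The translation memory stores segments. The memory returns fuzzy "
    "matches. Each segment in the memory has a translation."
)
reference = (
    "The cat sat on the mat. The dog has a bone. "
    "Each day the sun rises in the east."
)

for word, score in term_candidates(domain, reference):
    print(word, score)
```

Single words are used here for brevity; many terms are multi-word units, so a fuller version would also score word sequences.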
Multimedia translation
• Localization
• XML
• Sub-titling
• Dubbing
• Web-pages
• Software for interpreters
• Speech-to-speech machine translation & interpretation?
Other skills ± software
• Revision
• Translation evaluation
• Summarization
• Terminology management
• Information retrieval
• Project management
Linguistics
• Essential training for translators
– General linguistics
– Contrastive linguistics
• Translators > language experts > new specializations
– Natural language processing
– Translation and terminology tools