
Microsoft Translator

William Lewis
wilewis@microsoft.com

Kites Symposium
October 31, 2013 - Helsinki, Finland


Overview

Introduction to Microsoft Translator, Tools, Products, etc.

Extent of Localization - Methods of Applying MT

Collaborative MT

Assessing Quality

Application in Knowledge Base

Building your own MT

Collaboration with Language Communities


Why MT?
The purpose

The Crude
– Extent of localization
– Data Mining & Business Intelligence
– Globalized NLP
– Triage for human translation

Research
– Machine Learning
– Statistical Linguistics
– Same-language translation

The Good
– Breaking down language barriers
– Text, Speech, Images & Video
– Language Preservation

NOT:
– Spend less money
– Take the job of human translators
– Perform miracles


Microsoft Translator – Quick Facts

Linguistically informed statistical MT system
41 languages – from any language to any other language
Runs in Microsoft Datacenter
Simple web service API: SOAP, REST, AJAX, OData, web site widget
2 million characters/month free
Available in the Enterprise Agreement, as a monthly subscription
For extreme confidentiality situations, available on-premise

Highly customizable:
– Collaborative Translations – Involve community, coworkers and customers
– Hub: Custom engine training via an easy-to-use UI

Web Scale
– Powers translations in Bing, Microsoft Office, Microsoft SharePoint, Internet Explorer, Yammer
– Powers translations in Facebook, Twitter, eBay, and many other government and enterprise sites


Microsoft Translator at a Glance

World-class Statistical Machine Translation
Built on over a decade of work at Microsoft Research

Big Data Powered
Trained with billions of “parallel” sentences (Bing index & licensed)

General Purpose System
Powers Bing Translator, supports 40+ languages, any-to-any

Unprecedented Customization Capability
Hub (train before translation) + CTF (edit after translation)

Powerful Cloud API
Rich, secure API enabling integrations, 99.9% availability


Enabling Translation in Many Products

Fully integrated across the stack, Translator extends the value of the Microsoft platform, and of the solutions you build on it, for our customers. This includes consumer-facing applications such as Bing Translator, Bing Toolbar, Bing Dictionary, and the Windows Phone app.

A few of our customers and partners… +80,000 more.


Powerful Tools and Customization

Our machine learning & big-data based translation technology brings the power of instant translations to break down language barriers for users, developers, webmasters, translators and businesses. Robust, industry leading tools such as the HUB and CTF allow for unprecedented customization of the translation experience.

Translation

Powerful API
– Instant translation and language services in web, desktop and mobile applications.
– Highly scalable and robust cloud-based machine-translation service from Microsoft.
– Supports SOAP, REST, AJAX, OData, and the Translator web site translation widget.
– Extensibility for development on SharePoint, Office, Windows Phone, and more…

Widget
– Instant translations of web pages without the need to write any code.
– Use the AJAX API to roll your own widget.
– Use the integrated “Collaborative Translations” (CTF) functionality to tap into your community.

Customization

Hub
– Custom translation portal to build, train, and deploy customized automatic language translation systems.
– Combine your data with Bing big data to tune the translation output to best fit your content.
– Free with any level of Translator subscription (including the free tier).

CTF
– Override, modify or vote for the translated output to best fit the content.
– Provide the end-user alternative translations.
– Import the edits back into Hub for further training.


Integrates with your TM tool


Top translation tools support Microsoft Translator


Give these a try! (Demo)

Bing Translator for Windows Phone
The Most Innovative Translation App on any Phone

Lync Conversation Translator
Realtime Multi-lingual Conversations with Lync

Translator Widget for Webpages
Instant On-demand Translation for any Web site

Word Web App (Microsoft Office)
Rich Web based Document Translation, now available in SharePoint, Outlook.com & SkyDrive

Contextual Thesaurus
Utilize the Power of Machine Translation to Translate “English to English”


Price
Competitively priced

Monthly subscription
Free for up to 2 million characters per month
Base price: $10 per million characters
Discounted for higher volumes
Paid by credit card or via Microsoft Enterprise Agreement


Extent of localization
Methods of applying MT

Post-Editing
Goal: Human translation quality
Increase human translator’s productivity
In practice: 0% to 25% productivity increase
– Varies by content, style and language

Raw publishing
Goals:
– Good enough for the purpose
– Speed
– Cost
Publish the output of the MT system directly to end user
Best with bilingual UI
Good results with technical audiences

Post-Publish Post-Editing (“P3”)
Know what you are human translating, and why
Make use of community
– Domain experts
– Enthusiasts
– Employees
– Professional translators
Best of both worlds
– Fast
– Better than raw
– Always current


Always there
Always current
Always retaining human translations
Always ready to take feedback and corrections

Midori Tatsumi, Takako Aikawa, Kentaro Yamamoto, and Hitoshi Isahara. Proceedings of Association for Machine Translation in the Americas (AMTA), November 2012.


Collaboration: MT + Your community

[Diagram labels: Your community; Your Web Site; Your App; Microsoft Translator; Collaborative TM; Enormous language knowledge; Microsoft Translator API. Flow labels: Translation Request, Response; Match first; Translate if no match.]

What makes this possible – fully integrated 100% matching TM

Collaborative TM entries:
– Rating 1 to 4: Unapproved
– Rating 5 to 10: Approved
– Rating -10 to -1: Rejected

1 to many is possible
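
The "match first, translate if no match" flow and the rating bands can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the names `TMEntry`, `status` and `translate` are hypothetical, and it assumes that only approved entries override machine translation and that the highest-rated approved entry wins.

```python
from dataclasses import dataclass


@dataclass
class TMEntry:
    """One candidate translation stored in the collaborative TM (hypothetical type)."""
    translation: str
    rating: int


def status(rating: int) -> str:
    """Map a rating to the bands listed on the slide."""
    if 5 <= rating <= 10:
        return "approved"
    if 1 <= rating <= 4:
        return "unapproved"
    if -10 <= rating <= -1:
        return "rejected"
    raise ValueError(f"rating out of range: {rating}")


def translate(source, tm, machine_translate):
    """Match first; fall back to MT when no approved exact match exists.

    `tm` maps a source sentence to a list of TMEntry objects, which models
    the "1 to many" case. Using only approved entries is an assumption.
    """
    approved = [e for e in tm.get(source, []) if status(e.rating) == "approved"]
    if approved:
        return max(approved, key=lambda e: e.rating).translation
    return machine_translate(source)
```

A caller would pass any function as `machine_translate`, for example a wrapper around a translation API call.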


Making it easier for the approver – Pending edits highlight


Making it easier for the approver – Managing authorized users


Making it easier for the approver – Bulk approvals


What is Important?
In this order

Quality
Access
Coverage


Measuring Quality: Human Evaluations
Knowledge powered by people

Absolute: 3 to 5 independent human evaluators are asked to rank translation quality for 200 sentences on a scale of 1 to 4
– Comparing to human translated sentence
– No source language knowledge required

4  Ideal                Grammatically correct, all information included
3  Acceptable           Not perfect, but definitely comprehensible, and with accurate transfer of all important information
2  Possibly Acceptable  May be interpretable given context/time, some information transferred accurately
1  Unacceptable         Absolutely not comprehensible and/or little or no information transferred accurately

Also: Relative evals, against a competitor, or a previous version of ourselves
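
A short sketch of how such ratings might be combined into one number. The slide does not say how scores are aggregated, so the averaging here (mean per sentence, then mean over sentences) is an assumption, and `system_score` is a hypothetical name.

```python
from statistics import mean


def system_score(ratings):
    """Average 1-4 scores per sentence, then across sentences.

    `ratings` holds one inner list per sentence, with one score per
    evaluator. The aggregation method is an assumption for illustration.
    """
    for sentence_scores in ratings:
        if not all(1 <= s <= 4 for s in sentence_scores):
            raise ValueError("scores must be between 1 and 4")
    per_sentence = [mean(scores) for scores in ratings]
    return mean(per_sentence)
```

In the setup described above, `ratings` would have 200 inner lists of 3 to 5 scores each.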


Measuring Quality: BLEU*
Cheap and effective – but be aware of the limits

A fully automated MT evaluation metric
– Modified N-gram precision, comparing a test sentence to reference sentences

Standard in the MT community
– Immediate, simple to administer
– Correlates with human judgments

Automatic and cheap: runs daily and for every change

Not suitable for cross-engine or cross-language evaluations

Results are always relative to the test set.

* BLEU: BiLingual Evaluation Understudy
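
The modified n-gram precision named above can be sketched in a few lines. This covers only that one component: full BLEU also combines several n-gram orders and applies a brevity penalty, which this sketch omits. The function names are illustrative, and tokenization by whitespace is an assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against reference sentences.

    Each candidate n-gram count is clipped to the maximum number of times
    that n-gram appears in any single reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

Clipping is what keeps a degenerate candidate that repeats one common word from scoring well, since the repeated word is only credited as often as it occurs in a reference.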


Measuring Quality In Context
Real-world data

Instrumentation to observe user’s behavior
A/B testing
Polling

In-Context gives you the most useful results
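
As one possible way to read an A/B test, the sketch below compares a success rate (for example, the share of users who found a translated page helpful) between two variants using a pooled two-proportion z statistic. The slide does not specify any statistical method; this choice, the function name and the example metric are all assumptions.

```python
from math import sqrt


def two_proportion_z(success_a, total_a, success_b, total_b):
    """Pooled two-proportion z statistic for variant B versus variant A."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Happens when every trial succeeded or every trial failed.
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    return (p_b - p_a) / se
```

An absolute value above roughly 1.96 corresponds to the conventional 5% two-sided significance level.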


Knowledge Base (since 2003)


Knowledge base feedback


Knowledge Base Resolve Rate
Customer Support

[Chart: resolve rate, Human Translation vs. Machine Translation]

Microsoft is using a customized version of Microsoft Translator

Source: Martine Smets, Microsoft


Statistical MT - The Simple View

[Diagram. Training side: Collect and store parallel and target language data; Train statistical models; High-Performance Computing Cluster; Translation Engine. Data sources: Web mined data; Government data; Microsoft manuals; Dictionaries; Phrasebooks; Publisher data. Runtime side: User Input (text, web pages, chat, etc.); Translation APIs and UX; Translation Engine; Distributed Runtime; Translated Output.]


Collaboration: MT + Your community

[Diagram labels: Your community; Your Web Site; Your App; Microsoft Translator; Collaborative TM; Enormous language knowledge; Microsoft Translator API.]

Remember the collaborative TM? There is more.


Collaboration: You, your community, and Microsoft

[Diagram labels: Your community; Your Web Site; Your App; Your TMs; Your previously translated documents; Your collaborators; Your custom MT system; Microsoft Translator; Collaborative TM; Microsoft Translator Hub; Enormous language knowledge; Microsoft Translator API.]

You, your community and Microsoft working together to create the optimal MT system for your terminology and style


Community-driven MT

Multiple community models
– Necessity: driven by crisis
– Love of language: driven by strong language/cultural identification
– Preservation: desire to preserve language

Haitian Creole
White Hmong


Haitian Creole

One of two official languages in Haiti
A creole that evolved from French, Spanish, and several African languages (large % French-like)
Spoken natively by most of Haiti’s 8M people
Recent as a written language (first literature dates to late 18th century), growing literature base
Semi-literate population, with preference to French (until recently)
Somewhat inconsistent orthography
Limited (but growing) Web presence


The earthquake of January 12th, 2010 was a significant humanitarian crisis.

Aid agencies, foreign governments, a variety of NGOs, all responded en masse

Tranbleman tè nan Pòtoprens, kapital Ayiti!

Moun ap fouye pami debri yon bilding ki kraze nan tranblemann' tè 12 Janvye a.

Pòtoprens te catastrophically afekte 12 janvye 2010 tranbleman tè a.

Need for translated materials critical, especially those related to medicine and the relief effort.

Mission 4636 text messages from the field (up to 5K/day at peak) require rapid translation


The E-mail

At 10:30 a.m. on Tuesday, January 19th 2010, our team received an e-mail from a Microsoft employee in the field:
– Do we have a translator for Haitian Creole?
– If not, could we make one?

A little soul searching:
– No one on our team knew anything about Creole
  • No native speakers
  • No linguistic background on the language
  • No idea about grammatical structure
– No idea about encoding or orthography
– No knowledge about registers or the degree of literacy
– No parallel or monolingual training data of any kind (nor readily available documents we could start with)
– In effect, we were starting at Zero

So what else could we do but say “YES!”


Mission 4636

Emergency SMS infrastructure
Set up immediately in wake of Jan. 12, 2010 quake

Mission 4636:
– Received SMSs
– Translated
– Categorized
– Triaged
– Routed to aid agencies


Mission 4636 Messages

Fanmi mwen nan Kafou, 24 Cote Plage, 41A bezwen manje ak dlo
  My family in Carrefour, 24 Cote Plage, 41A needs food and water

Moun kwense nan Sakre Kè nan Pòtoprens
  People trapped in Sacred Heart Church, PauP

Ti ekipman Lopital General genyen yo paka minm fè 24 è
  General Hospital has less than 24 hrs. supplies

Fanm gen tranche pou fè yon pitit nan Delmas 31
  Undergoing children delivery Delmas 31

Over 80,000 messages received, up to 5,000+/day


Crisis Infrastructure: Message Pipeline

[Diagram labels: SMS; Tweets; Media; Message Portal; Crowd (Translate); MT; Triage; Geolocate.]

Lewis et al, 2011
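
A toy sketch of a staged message pipeline of this kind, in which each message passes through translation, categorization and triage in turn. Everything here is hypothetical: the stage functions, the lookup table standing in for crowd or machine translation, the keyword rule and the category name are illustrative only and do not describe the Mission 4636 system. The one sample message and its translation come from the Mission 4636 examples in this deck.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    translation: str = ""
    category: str = "uncategorized"
    urgent: bool = False


# Toy lookup standing in for crowd or machine translation (hypothetical).
TOY_TRANSLATIONS = {
    "Moun kwense nan Sakre Kè nan Pòtoprens":
        "People trapped in Sacred Heart Church, PauP",
}


def translate(msg):
    """Fill in a translation, or leave it empty when none is known."""
    msg.translation = TOY_TRANSLATIONS.get(msg.text, "")
    return msg


def categorize(msg):
    """Assign a category with a simple keyword rule (assumed rule)."""
    if "trapped" in msg.translation.lower():
        msg.category = "search and rescue"
    return msg


def triage(msg):
    """Flag messages in the assumed high-priority category as urgent."""
    msg.urgent = msg.category == "search and rescue"
    return msg


def run_pipeline(msg, stages=(translate, categorize, triage)):
    """Pass a message through each stage in order."""
    for stage in stages:
        msg = stage(msg)
    return msg
```

Keeping each stage as a separate function means a stage such as geolocation or routing could be appended to `stages` without changing the others.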


White Hmong

White Hmong: not a crisis scenario like Creole
But, a language in crisis

Some background:
– The Hmong Languages
– The Hmong Diaspora
– Decline of White Hmong and its usage in younger Hmong


Community Engagement

Involves two critical groups:
– Community of native speakers
– Community leader(s)

Wide spectrum of users across the Hmong community:
– College students
– High school students
– School teachers
– School administrators, deans, professors
– Business professionals
– Elders


Building MT: Community Contributions

Locating and vetting data
– Locate data
– Review documents that contain Hmong data
– Review parallelism of Hmong-English documents

Actively correcting errors from the engine
Contributing translation “repairs” on web sites that translate to Hmong


Tools Available for Haitian Creole and Hmong

Home page (Web page viewer, cut-and-paste translator)

Haitian Creole and Hmong are among the languages available through our API (Application Programming Interface)
– Multiple interfaces: AJAX, SOAP, HTTP
– Can integrate translation directly into a variety of apps

Widget
– Integrate translation into Web pages
– Traffic kept client side


Tools Available for Haitian Creole and Hmong

Widget/Collaborative Translation Framework (CTF)
– Community can contribute translations
– These can be published to Web pages
– Mixes MT with “trusted” human translations


Just visit http://hub.microsofttranslator.com to do it yourself


Contacts

Web site: www.microsoft.com/translator

Licensing & Pricing [email protected]

General & Customer [email protected]
