Turning NMT research into commercial products
Dragos Munteanu and Adrià de Gispert
Proceedings of AMTA 2018, vol. 2: MT Users' Track Boston, March 17 - 21, 2018 | Page 166
• Founded in 1992
• 3800+ Employees
• 56 Offices
• 38 Countries
• 400 Partners
• 1500 Enterprise customers
helping big brands go global
marketing campaigns, eCommerce, documentation, web, social media, analytics
78 of the top 100 global companies work with SDL
+10 BILLION words translated monthly
SDL Research – a long history in MT
• Research labs in Los Angeles (USA) and Cambridge (UK)
• Team members have published +100 on SMT and related tech
– Bill Byrne, Abdessamad Echihabi, Dragos Munteanu, Gonzalo Iglesias, Eva Hasler, Adrià de Gispert, Steve DeNeefe, Jonathan Graehl, Wes Feely, Ling Tsou...
• Formerly Language Weaver
– 15 years of leading expertise in SMT
– major contributions (papers/patents) in phrase-based and string-to-tree MT, automata-based hierarchical MT, quality estimation, tuning, evaluation...
• Strong links with academia (University of Cambridge)
• Summer internships, industrial post-docs
Our mission: Bring MT research results to products
○ We strive to provide our customers:
• High translation quality
• Approaches that work for many language pairs
• Translation speed
• Privacy! Top-quality MT on premise and in private cloud
• Customization / Personalization
• Ability to learn over time (Adaptive MT)
• Terminology and dictionaries
• Consistency
• Respect file formats and tags
• Controllable memory and disk footprint
• Robustness to mis-spellings
• Connectors, plug-ins…
SDL Secure Enterprise Translation Server
✓ 45 NMT engines currently available
Data Security
- On premises / private cloud
- Used by gov’t for 15 years
Quality / Customization
- Neural MT
- Custom MT out-of-the-box
Cost-effective scalability
- Elastic, optimized footprint
- Commodity hardware
Ease of Use / Integration
- Deploys in hours
- MS plug-in & REST API
Neural Machine Translation
A paradigm shift
SMT
• Symbolic models
• Independence assumption (separate sub-problems)
• Maximum-likelihood estimation
• CPU-oriented training
• Source-side-guided decoding
• Large databases
Neural MT
• Continuous-space models
• Single end-to-end model
• Discriminative training
• Reliance on GPUs
• Target-side-guided decoding
• Smaller compact models
Better translation models
[Zhou et al. ’16]
[Gehring et al. ’17] [Vaswani et al. ’17]
[Sutskever et al. ’14] [Bahdanau et al. ’15]
Better BLEU scores
Rules-Based: 1970
Statistical MT: 2002
Neural MT: 2016
WAT    Jpn-Eng    Eng-Jpn
2014   23.8       35.0
2015   25.4       35.8
2016   27.6       36.2
2017   28.4       41.5
       +4.4 !!    +6.5 !!
COMPARING OUTPUTS
Observable quality improvement
Source: 国連難民高等弁務官事務所(UNHCR)は、内戦状態にあるシリアから逃れた難民の数が5百万人を超えたと発表した。
SMT: Office of the United Nations High Commissioner for Refugees (UNHCR) is in a state of civil war when the number of refugees who have escaped from Syria have exceeded 5 million people.
NMT: The United Nations High Commissioner for Refugees (UNHCR) announced that the number of refugees escaped from Syria in the civil war was over five million people.
✓ 30% improvement over SMT across all our productized engines
But… is it ALL that good?
There are situations in which NMT fails
• When it fails, it fails spectacularly
– unrelated fluent text
– repetitions, neurobabble…
• MT user/customer expectations
– “MT is not supposed to do this”!!!?!
– “Can it support the features I need”???
Over-generation and ‘neurobabble’
There was no clear correlation between the measured mass density and the measured mass density, and neither experiment A or B.
The company will pay approximately EUR 600 million in fines, and the U.S. Department of Justice (SEC) to pay for approximately EUR 600 million, and the U.S. Department of Justice and the Justice Department of Justice (SEC) to reduce the amount of internal control of the board of directors of the board of directors of the board of directors…
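The repetition pattern in the two examples above can be flagged automatically. The following is a minimal sketch, assuming whitespace tokenization and a simple repeated n-gram count; the function names and the threshold are illustrative assumptions, not part of any SDL product.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 3) -> int:
    """Return the highest count of any single word n-gram in `text`."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0  # text is shorter than n tokens
    return max(Counter(ngrams).values())


def looks_like_babble(text: str, n: int = 3, threshold: int = 2) -> bool:
    """Flag output in which some n-gram occurs at least `threshold` times."""
    return max_ngram_repeats(text, n) >= threshold


example = ("There was no clear correlation between the measured mass density "
           "and the measured mass density, and neither experiment A or B.")
print(looks_like_babble(example))  # the trigram "the measured mass" occurs twice
```

A check like this only catches repetitions; fluent but unrelated output would need a different signal, such as a source-target length or coverage check.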
Data is EVEN MORE important
• New NMT models are better learners
– A better fit to the training data
– Relevant training data is key
– Avoid babble and get huge gains!
• Domain adaptation / data selection [Freitag and Al-Onaizan ’16] [Chen et al. ’17] [Britz et al. ’17] [Farajian et al. ’17] [Van der Wees et al. ’17] [Wang et al. ’17]…
Adapting neural models
Major improvements!
Challenge:
• Adapt to customer domain/data with minimal re-training
• Maintain high quality across domains
Jpn-Eng corpus   #words
Generic          >300M
Automotive       <1M
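With a generic corpus several hundred times larger than the in-domain one, a common first step is to pick the generic sentences that look most like the customer domain. The sketch below uses cross-entropy difference with add-one-smoothed unigram models, a well-known selection heuristic swapped in here for illustration; it is not claimed to be the method used in the talk or in any of the cited papers, and all names and the smoothing constant are assumptions.

```python
import math
from collections import Counter


def unigram_logprob_fn(corpus, smoothing_vocab=10000):
    """Build an add-one-smoothed unigram log-probability function."""
    counts = Counter(w for s in corpus for w in s.split())
    total = sum(counts.values())

    def logprob(word):
        return math.log((counts[word] + 1) / (total + smoothing_vocab))

    return logprob


def cross_entropy(sentence, logprob):
    """Per-word cross-entropy of a sentence under a unigram model."""
    words = sentence.split()
    return -sum(logprob(w) for w in words) / max(len(words), 1)


def select_in_domain(generic, in_domain, k):
    """Return the k generic sentences closest to the in-domain data.

    Lower (in-domain entropy minus generic entropy) means more in-domain.
    """
    lp_in = unigram_logprob_fn(in_domain)
    lp_gen = unigram_logprob_fn(generic)
    ranked = sorted(
        generic,
        key=lambda s: cross_entropy(s, lp_in) - cross_entropy(s, lp_gen),
    )
    return ranked[:k]


generic = [
    "the engine oil filter must be replaced",
    "the minister announced new refugee policy",
    "stock markets fell sharply on monday",
]
automotive = [
    "replace the oil filter",
    "check the engine brake fluid",
    "the brake pads must be replaced",
]
print(select_in_domain(generic, automotive, 1))
```

The selected subset would then be used, together with the small in-domain corpus, to continue training the generic model.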
Lexical selection
• NMT models have freedom to produce any target word
– Guided by source, not constrained
• SMT engines were good at lexical selection – can we leverage?
– T-table, n-gram and phrase probabilities, memory-augmented models/search
[Arthur et al. EMNLP ’16] [Stahlberg et al. EACL ’17] [Wang et al.; Dahlmann et al.; Feng et al. EMNLP ’17] [Zhang et al. IJCNLP ’17]…
[Arthur et al. EMNLP ’16]
[Wang et al. EMNLP ’17]
NMT can use N-gram posterior probabilities
Stahlberg et al. (EACL ’17): “Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices”
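One way to picture this combination is a per-word score that adds weighted n-gram posteriors (taken from an SMT lattice) to the NMT log-probability. The sketch below is a rough illustration under that assumption; the linear form, the weights `thetas` and `lam`, and the toy posteriors are hypothetical and should not be read as the exact formulation in the cited paper.

```python
import math


def combined_score(history, word, nmt_logprob, ngram_posteriors, thetas, lam=1.0):
    """Score `word` given the target `history` (a list of tokens).

    ngram_posteriors maps n-gram tuples to posterior probabilities in [0, 1];
    thetas maps n-gram order n to its weight. Unseen n-grams contribute 0.
    """
    score = lam * nmt_logprob
    for n, theta in thetas.items():
        if n - 1 > len(history):
            continue  # not enough history for an n-gram of this order
        context = tuple(history[len(history) - (n - 1):])
        score += theta * ngram_posteriors.get(context + (word,), 0.0)
    return score


posteriors = {
    ("of",): 0.9,
    ("number", "of"): 0.8,
    ("the", "number", "of"): 0.7,
}
thetas = {1: 0.1, 2: 0.2, 3: 0.3}
s = combined_score(["the", "number"], "of", math.log(0.5), posteriors, thetas)
print(round(s, 4))
```

Words whose n-grams are well supported by the SMT lattice receive a bonus, which nudges the NMT decoder toward the lexical choices SMT was confident about.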
But… are there guarantees?
• Control is a must for commercial success
• One very bad sentence can put off a customer
– Back-off if needed
• Customers/Users expect certain ‘features’
– Decoding speed, dictionary support, formatting constraints, Adaptive MT, …
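The "back-off if needed" idea can be sketched as a wrapper that checks the primary engine's output and falls back to a second engine when the check fails. Everything here is an assumption for illustration: the engines are plain callables, the acceptance test is a whitespace-token length ratio, and the thresholds are arbitrary.

```python
def length_ratio_ok(source, hypothesis, low=0.5, high=2.0):
    """Accept a hypothesis whose length is within [low, high] times the source."""
    src_len = max(len(source.split()), 1)
    hyp_len = len(hypothesis.split())
    return low <= hyp_len / src_len <= high


def translate_with_backoff(source, primary, backup, accept=length_ratio_ok):
    """Return (translation, engine_name), backing off when `accept` fails."""
    hypothesis = primary(source)
    if accept(source, hypothesis):
        return hypothesis, "primary"
    return backup(source), "backup"


# Toy stand-ins for real engines (hypothetical).
def babbling_engine(source):
    return " ".join(["of the board of directors"] * 10)


def safe_engine(source):
    return source.upper()


print(translate_with_backoff("one two three", babbling_engine, safe_engine))
```

In practice the acceptance test would combine several signals; a length ratio alone is a weak check and assumes languages that are tokenized by whitespace.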
Dictionary support
• Translation output must translate dictionary matches exactly – constrained search
• Easy for SMT decoders
• NMT beam decoder does not keep an alignment between source and target words
“Zimra Games continues to innovate with the release next month of Coke Assault 3, which will satisfy the most demanding gamers.”
English          German
Zimra Games      Zimra Games GmbH
Coke Assault 3   Coke Assault III
…                …
[Anderson et al. EMNLP ’17] [Hokamp & Liu ACL ’17] [Chatterjee et al. WMT ’17]
Dictionary support
Constrained search
• Build a finite-state acceptor with the target-side constraints
• Keep one separate stack for each acceptor state
• Output only hypotheses from the final acceptor state
→ Constraints can be words or phrases
[Anderson et al. EMNLP ’17]
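The three steps above can be sketched with a toy next-word model. This is a simplified illustration, not the algorithm of the cited paper: the acceptor state is the set of completed constraints plus at most one partial match, a mismatch simply drops the partial match (no failure links), and `TABLE`, `toy_model`, and all parameter names are hypothetical.

```python
import math

# Toy next-word distribution keyed by the previous token (hypothetical).
TABLE = {
    "<s>": {"Zimra": 1.0},
    "Zimra": {"Games": 1.0},
    "Games": {"innoviert": 0.8, "GmbH": 0.2},
    "GmbH": {"innoviert": 1.0},
    "innoviert": {"weiter": 1.0},
    "weiter": {"</s>": 1.0},
}


def toy_model(tokens):
    last = tokens[-1] if tokens else "<s>"
    return {t: math.log(p) for t, p in TABLE.get(last, {}).items()}


def advance(state, token, constraints):
    """Acceptor transition; state = (completed constraint ids, partial match)."""
    done, partial = state
    if partial is not None:
        i, k = partial
        if token == constraints[i][k]:
            if k + 1 == len(constraints[i]):
                return (done | {i}, None)
            return (done, (i, k + 1))
    for i, c in enumerate(constraints):  # does the token start a pending constraint?
        if i not in done and token == c[0]:
            return (done | {i}, None) if len(c) == 1 else (done, (i, 1))
    return (done, None)


def constrained_beam_search(model, constraints, beam=2, max_len=10, eos="</s>"):
    start = (frozenset(), None)
    final = (frozenset(range(len(constraints))), None)
    stacks = {start: [(0.0, [])]}  # one stack of (score, tokens) per acceptor state
    finished = []
    for _ in range(max_len):
        new_stacks = {}
        for state, hyps in stacks.items():
            for score, tokens in hyps:
                for tok, lp in model(tokens).items():
                    if tok == eos:
                        if state == final:  # only final-state hypotheses may end
                            finished.append((score + lp, tokens))
                        continue
                    nxt = advance(state, tok, constraints)
                    new_stacks.setdefault(nxt, []).append((score + lp, tokens + [tok]))
        # prune each stack independently to the beam size
        stacks = {s: sorted(h, key=lambda x: x[0], reverse=True)[:beam]
                  for s, h in new_stacks.items()}
        if not stacks:
            break
    return max(finished, key=lambda x: x[0]) if finished else None


free = constrained_beam_search(toy_model, [])
forced = constrained_beam_search(toy_model, [("Zimra", "Games", "GmbH")])
print(free[1])    # ['Zimra', 'Games', 'innoviert', 'weiter']
print(forced[1])  # ['Zimra', 'Games', 'GmbH', 'innoviert', 'weiter']
```

Because each acceptor state has its own stack, the lower-scoring hypothesis containing "GmbH" is not pruned by the higher-scoring unconstrained one, which is the point of keeping separate stacks.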
Dictionary support
Challenges
• Computational complexity grows exponentially with the number of constraints
– order is unknown
• Nothing prevents repeated decoding:
“Zimra Games GmbH setzt mit dem Veröffentlichung auf Coke Assault III im nächsten Monat der Angriff …
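On the first challenge: when the order of the constraints is unknown, the acceptor has to remember which subset of constraints has already been produced, and every subset is a separate state with its own stack. A small sketch, ignoring the extra states needed for partially matched phrases (an assumption that only makes the count smaller):

```python
from itertools import combinations


def acceptor_states(num_constraints):
    """Enumerate every subset of satisfied constraints (one beam stack each)."""
    ids = range(num_constraints)
    return [frozenset(c)
            for r in range(num_constraints + 1)
            for c in combinations(ids, r)]


for n in (1, 2, 3, 10):
    print(n, len(acceptor_states(n)))  # the count doubles with each constraint: 2**n
```

With a fixed beam per stack, decoding cost therefore scales with the number of subsets rather than with the number of constraints.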
Entity constraints
• Decoder must also respect meta-tags
– Key to support file formats used by MT users
• NMT model should not break sequential history
• Solutions require model specialization and/or decoding restrictions
“<B>Zimra Games</B> continues to innovate with the release <I>next month</I> of <B>Coke Assault <c=red>3</c></B>, which will satisfy the most demanding gamers.”
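Whatever mechanism places the tags, the output can at least be validated afterwards. The sketch below checks that tags are well nested and that the translation carries the same tags as the source. It is a post-hoc check only, not the model specialization or decoding restriction mentioned above, and the tag pattern is an assumption modelled on the example sentence.

```python
import re

# Matches tags such as <B>, </B>, <c=red>; group 1 is "/" for closing tags.
TAG = re.compile(r"<(/?)([A-Za-z]+)[^>]*>")


def tags_well_nested(text):
    """True if every opening tag is closed in the right order."""
    stack = []
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2)
        if closing:
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return not stack


def same_tag_inventory(source, target):
    """True if source and target contain exactly the same tags (any order)."""
    src_tags = sorted(m.group(0) for m in TAG.finditer(source))
    tgt_tags = sorted(m.group(0) for m in TAG.finditer(target))
    return src_tags == tgt_tags


source = ("<B>Zimra Games</B> continues to innovate with the release "
          "<I>next month</I> of <B>Coke Assault <c=red>3</c></B>, which will "
          "satisfy the most demanding gamers.")
print(tags_well_nested(source))
print(tags_well_nested("<B>Zimra <I>Games</B></I>"))
```

A translation that fails either check could be routed to a fallback, for example tag-free translation followed by best-effort tag reinsertion.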
Decoding speed
• MT users expect certain translation speeds
– Target speed varies, but well above research engines
• Goal is to provide best quality at desired speed
– Speed vs quality trade-off
• NMT deployment scenarios
– CPU only – hand-held devices, …
– GPU
• NMT training speed also relevant
Decoding speed vs quality trade-off (1)
• Model architecture
– recurrent, convolutional, attentional…
– number of parameters, layer precomputations…
– Unfolding and shrinking ensembles [Stahlberg and Byrne, EMNLP ’17]
Stahlberg and Byrne (EMNLP ’17): “Unfolding and Shrinking Neural Machine Translation Ensembles”
Decoding speed vs quality trade-off (2)
• Hardware and Linear Algebra library
– Type of GPU card
– CPU-GPU communication
– GPU usage
• Batching
– standard in training
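When batching is applied at decoding time, one practical detail is how sentences are grouped: every sentence in a batch is padded to the longest one, so mixing short and long inputs wastes computation. A minimal sketch, assuming whitespace tokenization and hypothetical helper names, that groups sentences of similar length and counts the padding saved:

```python
def batches_by_length(sentences, batch_size):
    """Group sentence indices into batches of similar length."""
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i].split()))
    return [order[s:s + batch_size] for s in range(0, len(order), batch_size)]


def batches_in_order(sentences, batch_size):
    """Group sentence indices in arrival order (baseline)."""
    idx = list(range(len(sentences)))
    return [idx[s:s + batch_size] for s in range(0, len(idx), batch_size)]


def padding_tokens(sentences, batches):
    """Number of pad positions needed to make each batch rectangular."""
    total = 0
    for batch in batches:
        lengths = [len(sentences[i].split()) for i in batch]
        total += sum(max(lengths) - n for n in lengths)
    return total


docs = ["a", "a b c d e", "b", "f g h i j"]
print(padding_tokens(docs, batches_in_order(docs, 2)))   # 8
print(padding_tokens(docs, batches_by_length(docs, 2)))  # 0
```

The returned indices let the caller restore the original sentence order after translation.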
Decoding speed vs quality trade-off (3)
• Decoding parameters
– beam size, early stopping…
• Reduced vocabulary softmax (CPU)
• Weight clipping in training
– Low-precision inference [Wu et al. ’16] [Devlin, EMNLP ’17]…
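The link between weight clipping and low-precision inference can be illustrated with a small quantization sketch: if training keeps weights inside a known range, that range can be mapped onto 8-bit integers with a fixed scale. The symmetric int8 scheme, the clip value, and the function names below are assumptions for illustration and are not taken from the cited papers.

```python
def quantize_int8(weights, clip):
    """Map floats in [-clip, clip] to integers in [-127, 127].

    Values outside the clipping range saturate at the ends.
    """
    scale = clip / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.31, -1.0, 0.07, 3.0]  # 3.0 lies outside the clip range
codes, scale = quantize_int8(weights, clip=1.0)
restored = dequantize(codes, scale)
for w, r in zip(weights, restored):
    print(w, round(r, 4))
```

For weights inside the clipping range the rounding error is at most half a quantization step; anything outside it is lost, which is why the range is constrained during training rather than after it.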
Software and Services for Human Understanding