Machine Translation: The Neural Frontier
TRANSCRIPT
John Tinsley
GALA, Amsterdam, March 2017
Source: http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf
What we’re actually going to cover this morning!
How does it work?
What’s all the fuss about?
“Neural machine translation is ______.”
What is the status as of today?
Is it really that good?
What does all this mean for the future?
What they actually said... “In some cases human and GNMT translations are nearly indistinguishable on the relatively simplistic and isolated sentences sampled from Wikipedia and news articles for this experiment.”
What was reported...
MT developers around the world
Evolution or Revolution?
Source: (modified from) http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf
A brief history of MT…
[Timeline: Rule-Based → Statistical → Neural]
“State of the Union”
[Chart: MT quality over time. Statistical Machine Translation: 20+ years’ worth of research. Neural Machine Translation: ?]
The initial splash made by statistical MT: “wow, that’s pretty good!” (March 27th, 2007). The initial splash made by neural MT: we’re about here now, and this is where the excitement is coming from.
Neural machine translation is exciting!
Neural machine translation is the future
Neural machine translation is ultimately just another type of MT
Neural machine translation is not going to replace human translators
Neural machine translation is not a silver bullet
Neural Machine Translation: March 27th, 2017
Academia / Industry
• Still early stage
• Language independent
• Fundamental practical considerations not yet addressed
• Generic applications only
• No flexibility for customisation
• Significant hurdles for cost-effective, scalable production performance
Output can be insanely fluent!
Source: https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
They needed more computers — “G.P.U.s,” graphics processors reconfigured for neural networks — for training…
“Should we ask for a thousand G.P.U.s?”
“Why not 2,000?”
Ten days later, they had the additional 2,000 processors.
Is it really that good? (Yes, it can be!)
What evaluations are out there?
Anecdotal
• “Yeah, it looks better”
Academic
• Generally, neural is better*
• More obviously so for complex languages
• It falls over badly on long sentences
WIPO
• Stark improvements for Chinese and Arabic
• Comparable performance on other languages
WIPO: large-scale, apples-to-apples comparison
English to Chinese
Arabic to Chinese
Spanish to Chinese
French to Chinese
What evaluations are out there? (continued)
Iconic
• Practical comparison with production MT
• Mixed results depending on content type
• Clear strengths and weaknesses emerging
“Real-world” comparative use case
• Real-world languages and content: Chinese-to-English patents, mature production engine, highly tuned.
• Apples-to-apples comparison: access to the same training data and test data, including all of the ugly parts.
• Effective qualitative evaluation: there is no one-size-fits-all, so where is MT good, and where does it fall down?
[Chart: results on Short Sentences and on All Sentences, comparing Iconic Production MT with Iconic Neural MT]
Neural MT works – and it’s good! It is not a silver bullet.
+ word order
+ agreement
+ terminology
+ error-free output
− omitting phrases
− sentence structure
New Opportunities = New Challenges
• Black Box: “Why is this error happening?”
• Customisation: “Can you fix this error, please?”
• Production: “How much is that GPU??!”
Old Challenges: Data, Evaluation, Pricing. Still needed, now more than ever!
• Do we know how to quantify “quality”?
• How much does it cost now?
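On quantifying “quality”: the standard automatic proxy in MT evaluation is BLEU, which scores n-gram overlap between a system’s output and human reference translations. A minimal sketch, assuming the sacrebleu Python package; the example sentences are invented for illustration:

```python
# Hypothetical example: scoring one MT output against one human
# reference with BLEU, using the sacrebleu package.
import sacrebleu

hypotheses = ["the cat sat on the mat"]           # MT output (invented)
references = [["the cat is sitting on the mat"]]  # one reference stream (invented)

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")  # more n-gram overlap => higher score
```

Automatic scores like this are cheap to run but only correlate loosely with human judgement, which is why the evaluation question stays open.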
What does this mean for the future?
Short term: research, which takes time; more effective use of general machine translation
2-5 years: emerging use cases, new types of hybrid, and clarity
Longer term: “zero-shot” translation?
[Timeline: Rule-based → Statistical → Neural. You are here.]
[Diagram: Gaelic input “GO RAIBH MAITH AGAT” → 1st recurrent neural network (Encoder) → encoded sentence, a vector of numbers → 2nd recurrent neural network (Decoder) → English output “THANK YOU”. Memory of previously translated words influences the next result.]
Thank you!
P.S. This is kind of how neural machine translation works…
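For the curious, here is a minimal sketch of the encoder-decoder idea from the diagram above, in Python with PyTorch. The toy vocabularies, the Seq2Seq class, and the hidden size are all invented for illustration, and the model is untrained; it only shows the shape of the computation: the first recurrent network encodes the Gaelic sentence into a single vector of numbers, and the second decodes English words one at a time, with the recurrent state acting as the memory of previously translated words.

```python
# A minimal, untrained sketch of a recurrent encoder-decoder.
# All names and sizes are hypothetical; real NMT systems are trained
# on millions of sentence pairs and add attention, beam search, etc.
import torch
import torch.nn as nn

# Toy vocabularies (invented): source is Gaelic, target is English.
SRC = {"<pad>": 0, "go": 1, "raibh": 2, "maith": 3, "agat": 4}
TGT = {"<sos>": 0, "<eos>": 1, "thank": 2, "you": 3}
TGT_INV = {i: w for w, i in TGT.items()}

class Seq2Seq(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(len(SRC), hidden)
        self.tgt_emb = nn.Embedding(len(TGT), hidden)
        # 1st recurrent neural network: the encoder.
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        # 2nd recurrent neural network: the decoder.
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, len(TGT))

    def forward(self, src_ids, max_len=5):
        # Encode the whole source sentence into one state vector
        # (the "encoded sentence" in the diagram).
        _, state = self.encoder(self.src_emb(src_ids))
        # Decode one word at a time; the recurrent state carries the
        # memory of previously translated words into the next step.
        token = torch.tensor([[TGT["<sos>"]]])
        words = []
        for _ in range(max_len):
            output, state = self.decoder(self.tgt_emb(token), state)
            token = self.out(output).argmax(dim=-1)
            word = TGT_INV[token.item()]
            if word == "<eos>":
                break
            words.append(word)
        return words

model = Seq2Seq()
src = torch.tensor([[SRC["go"], SRC["raibh"], SRC["maith"], SRC["agat"]]])
print(model(src))  # untrained weights, so the output is random TGT words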
[email protected] @johntins