Page 1: Towards a Human Language Project for Multilingual Europe · 2018-04-24 · Towards a Human Language Project for Multilingual Europe AI and Interpretation. SCIC Universities Conference

Georg Rehm

German Research Center for Artificial Intelligence (DFKI) GmbH

Language Technology Lab – Berlin, Germany

META-NET, General Secretary

[email protected]

Towards a Human Language Project for Multilingual Europe

AI and Interpretation

Page 2

SCIC Universities Conference (19/20 April 2018) 2

Page 3

Data Intelligence

Current breakthroughs based on Machine Learning (“Deep Learning”)

Also still in use: symbolic, rule-based methods and systems

Artificial Intelligence

• Huge data sets + powerful algorithms + extremely fast hardware

• Enormous potential for disruptions in all sectors and areas

Page 4

Translation and Interpretation

• Since approx. 2015, with breakthroughs in neural technologies, Machine Translation has been improving steadily.

• All areas of AI look for “super-human performance”, but language is fundamentally different and much more complex.

• Neural AI approaches cannot understand language; they process it according to huge underlying data sets.

• In many use cases, mistakes can be tolerated.

• But: translation and interpretation are often mission-critical!

• Mistakes can have serious consequences (politics, medicine).

Page 5

Speech Translation

• Example: Lecture Translator

– University lectures are automatically transcribed and translated, in near-real time, into several languages

– Students can follow the translation through a web interface

• Example: Presentation Translator

– Presenter can have the speech automatically translated

– Translations are displayed as subtitles

• Example: Call Translator

– Internet telephony provider offers automatic voice translation

Page 6

Issues and Limitations

• The three example applications work surprisingly well for general-domain language and input. But:

– They are far from being perfect.

– They aren’t robust.

– They cannot cope with unforeseen situations.

– They cannot understand language as humans do.

– They are not (yet?) suited for conference interpretation.

➢ Limitations as regards their fields of application.

• Interpretation is often mission-critical.

➢ Human interpreters won’t be replaced anytime soon.

Page 7

https://slator.com/features/ai-interpreter-fail-at-china-summit-sparks-debate-about-future-of-profession/

Page 8

LT – Current Developments

• LT in Europe: world-class research, strong SME base, thousands of LSPs; immense fragmentation; need for coordination.

• Need for high-quality LT: translation, interpretation, MDSM etc.

• The European Language Challenge cannot be – it must not be – abandoned or outsourced!

➢ Need for Language Technology, made in Europe, for Europe!

➢ STOA Workshop in the EP (January 2017): “Language equality in the digital age – towards a Human Language Project”

Page 9

Human Language Project

• Goal: Deep Natural Language Understanding by 2030

• Vision: EU FET Flagship Project (10+ years)

• Broad coverage, high quality, high precision

• Create approaches, algorithms, data sets, resources

• Across modalities: text, text types, speech, video etc.

Artificial Intelligence, including cognition, perception, vision, cross-modal, cross-platform, cross-culture etc.

Machine Learning

Language Technology

Linguistics

Page 10

Summary & Conclusions

• AI is disrupting all industries – including translation and, increasingly, also interpretation.

➢ But: perfect, robust, precise language technologies (incl. written/spoken MT and interpretation) are still far away.

• Linguists are increasingly needed – new profiles emerging

➢ The machine will support human experts and help them become more efficient – it will not replace them.

• The Human Language Project is still a vision. Its goal: develop new breakthroughs in Language Technology.

Page 11

Recommendation

• SCIC Speech Repository

• 4,000 speeches (3,000 public + 1,000 private)

• Extremely interesting data set and language resource for Language Technology researchers!

• Many R&D groups currently work on TED talk data sets

• Recommendation: establish bridges between SCIC and research groups for spoken language translation

• Help build the next generation of AI tools for interpreters

• AI tools that are tailored to the needs and wishes, topics and domains of conference interpreters in the EC/EP

Page 12

Thank you!

Dr. Georg Rehm

DFKI Berlin

👉🏻 [email protected]

👉🏻 http://de.linkedin.com/in/georgrehm

👉🏻 https://www.slideshare.net/georgrehm

Strategic Research and Innovation Agenda

Language Technologies for

Multilingual Europe

Towards a Human Language Project

SRIA Editorial Team

Version 1.0 – December 2017

Page 13
Page 14

• Multilingualism is at the heart of the European idea

• 24 EU languages – all have the same status

• Dozens of regional and minority languages as well as languages of immigrants and trade partners

• Many economic and social challenges:

– The Digital Single Market needs to be multilingual

– Cross-border, cross-lingual, cross-cultural communication

Page 15

60 research centres in 34 countries (founded in 2010)

Chair of Executive Board: Jan Hajic (CUNI)

Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde)

General Secretary: Georg Rehm (DFKI)

Multilingual Europe Technology Alliance. 826 members in 67 countries

(published in 2013)

(31 volumes; published in 2012)

T4ME (META-NET), CESAR, METANET4U, META-NORD

Page 16

Basque

Bulgarian*

Catalan

Croatian*

Czech*

Danish*

Dutch*

English*

Estonian*

Finnish*

French*

Galician

German*

Greek*

Hungarian*

Icelandic

Irish*

Italian*

Latvian*

Lithuanian*

Maltese*

Norwegian

Polish*

Portuguese*

Romanian*

Serbian

Slovak*

Slovene*

Spanish*

Swedish*

Welsh

* Official EU language

http://www.meta-net.eu/whitepapers

Page 17

MT

– excellent: none

– good: English

– moderate: French, Spanish

– fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian

– weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh

Speech

– excellent: none

– good: English

– moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish

– fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish

– weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh

Text Analytics

– excellent: none

– good: English

– moderate: Dutch, French, German, Italian, Spanish

– fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish

– weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh

Resources

– excellent: none

– good: English

– moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish

– fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene

– weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh

Page 18

Source: META-NET White Paper Series: Europe's Languages in the Digital Age. Springer, Heidelberg, New York, Dordrecht, London, September 2012. Georg Rehm and Hans Uszkoreit (series editors)

Page 19

We carried out the study in 2010/2012. While support for many languages has improved in the meantime, the overall picture remains mostly the same.