learning bilingual word no bilingual data · 2017. 9. 16. · ⇒test dictionary: 1,500 word pairs...

135
Learning bilingual word embeddings with (almost) no bilingual data Mikel Artetxe, Gorka Labaka, Eneko Agirre IXA NLP group – University of the Basque Country (UPV/EHU)

Upload: others

Post on 11-May-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Learning bilingual word embeddings with (almost) no bilingual dataMikel Artetxe, Gorka Labaka, Eneko Agirre

IXA NLP group – University of the Basque Country (UPV/EHU)

Page 2: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

Page 3: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

word embeddings are useful!

Page 4: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

word embeddings are useful!

Page 5: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

word embeddings are useful!

Page 6: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

word embeddings are useful!

Page 7: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

word embeddings are useful!

Page 8: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

word embeddings are useful!

Page 9: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

word embeddings are useful!

Page 10: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

word embeddings are useful!

Page 11: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

Previous work

word embeddings are useful!

Page 12: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

Previous work

- parallel corpora

word embeddings are useful!

Page 13: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

Previous work

- parallel corpora

- comparable corpora

word embeddings are useful!

Page 14: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

Previous work

- parallel corpora

- comparable corpora

- (big) dictionaries

word embeddings are useful!

Page 15: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

Previous work

- parallel corpora

- comparable corpora

- (big) dictionaries

word embeddings are useful!

Page 16: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

This talkPrevious work

- parallel corpora

- comparable corpora

- (big) dictionaries

word embeddings are useful!

Page 17: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

This talk

- 25 word dictionary

Previous work

- parallel corpora

- comparable corpora

- (big) dictionaries

word embeddings are useful!

Page 18: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Who cares?

- inherently crosslingual tasks

- crosslingual transfer learning

bilingual signal for training

This talk

- 25 word dictionary

- numerals (1, 2, 3…)

Previous work

- parallel corpora

- comparable corpora

- (big) dictionaries

word embeddings are useful!

Page 19: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Page 20: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings𝑋 𝑍

Basque English

Page 21: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings𝑋 𝑍

Seed dictionaryBasque English

Page 22: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑋 𝑍

Seed dictionaryBasque English

Page 23: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍

Seed dictionaryBasque English

Page 24: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

Seed dictionaryBasque English

Page 25: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

Seed dictionaryBasque English

Page 26: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

Seed dictionaryBasque English

Page 27: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

arg min𝑊∈𝑂(𝑛)

𝑖

𝑋𝑖∗𝑊 − 𝑍𝑗∗2

Basque English

Page 28: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

arg min𝑊∈𝑂(𝑛)

𝑖

𝑋𝑖∗𝑊 − 𝑍𝑗∗2

Basque English

Page 29: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

arg min𝑊∈𝑂(𝑛)

𝑖

𝑋𝑖∗𝑊 − 𝑍𝑗∗2

Basque English

Page 30: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

𝑋1,∗𝑋2,∗⋮

𝑋𝑛,∗

𝑊 ≈

𝑍1,∗𝑍2,∗⋮

𝑍𝑛,∗

Txakur

Sagar

Egutegi

Dog

Apple

Calendar

𝑊

𝑋 𝑍 𝑋𝑊

arg min𝑊∈𝑂(𝑛)

𝑖

𝑋𝑖∗𝑊 − 𝑍𝑗∗2

Basque English

Page 31: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Page 32: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Monolingual embeddings

Page 33: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Monolingual embeddings

Dictionary

Page 34: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Monolingual embeddings

Dictionary

Page 35: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping

Monolingual embeddings

Dictionary

Page 36: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping

Monolingual embeddings

Dictionary

Page 37: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Page 38: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary better!

Page 39: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary better!

Page 40: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

better!

Page 41: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

better!

Page 42: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

better!

Page 43: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

better!

even better!

Page 44: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

better!

even better!

Page 45: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping

Dictionary

better!

even better!

Page 46: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping

Dictionary

better!

even better!

Page 47: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping Dictionary

Dictionary

better!

even better!

Page 48: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping Dictionary

Dictionary

better!

even better!

even better!

Page 49: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

Page 50: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

proposed self-learning method

Page 51: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

proposed self-learning methodformalization and implementation details in the paper based on the mapping method of Artetxe et al. (2016)

Page 52: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

proposed self-learning methodformalization and implementation details in the paper based on the mapping method of Artetxe et al. (2016)

Too good to be true?

Page 53: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Bilingual embedding mappings

Mapping Dictionary

Monolingual embeddings

Dictionary

proposed self-learning methodformalization and implementation details in the paper based on the mapping method of Artetxe et al. (2016)

Too good to be true?

Page 54: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

Page 55: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015)

Page 56: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015)

English-Italian

Page 57: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish

English-Italian

Page 58: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish

English-Italian English-German English-Finnish

Page 59: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)

English-Italian English-German English-Finnish

Page 60: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs

English-Italian English-German English-Finnish

5,000 5,000 5,000

Page 61: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs

English-Italian English-German English-Finnish

5,000 25 5,000 25 5,000 25

Page 62: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Page 63: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Page 64: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

word translation induction

Page 65: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a)

Xing et al. (2015)

Zhang et al. (2016)

Artetxe et al. (2016)

word translation induction

Page 66: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a)

Xing et al. (2015)

Zhang et al. (2016)

Artetxe et al. (2016)

Our method

word translation induction

Page 67: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00% 35.00% 0.00% 0.07% 25.91% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13% 41.27% 0.07% 0.53% 28.23% 0.07% 0.56%

Zhang et al. (2016) 36.73% 0.07% 0.27% 40.80% 0.13% 0.87% 28.16% 0.14% 0.42%

Artetxe et al. (2016) 39.27% 0.07% 0.40% 41.87% 0.13% 0.73% 30.62% 0.21% 0.77%

Our method 39.67% 37.27% 39.40% 40.87% 39.60% 40.27% 28.72% 28.16% 26.47%

word translation induction

Page 68: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00% 35.00% 0.00% 0.07% 25.91% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13% 41.27% 0.07% 0.53% 28.23% 0.07% 0.56%

Zhang et al. (2016) 36.73% 0.07% 0.27% 40.80% 0.13% 0.87% 28.16% 0.14% 0.42%

Artetxe et al. (2016) 39.27% 0.07% 0.40% 41.87% 0.13% 0.73% 30.62% 0.21% 0.77%

Our method 39.67% 37.27% 39.40% 40.87% 39.60% 40.27% 28.72% 28.16% 26.47%

word translation induction

Page 69: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00% 35.00% 0.00% 0.07% 25.91% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13% 41.27% 0.07% 0.53% 28.23% 0.07% 0.56%

Zhang et al. (2016) 36.73% 0.07% 0.27% 40.80% 0.13% 0.87% 28.16% 0.14% 0.42%

Artetxe et al. (2016) 39.27% 0.07% 0.40% 41.87% 0.13% 0.73% 30.62% 0.21% 0.77%

Our method 39.67% 37.27% 39.40% 40.87% 39.60% 40.27% 28.72% 28.16% 26.47%

word translation induction

Page 70: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00% 35.00% 0.00% 0.07% 25.91% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13% 41.27% 0.07% 0.53% 28.23% 0.07% 0.56%

Zhang et al. (2016) 36.73% 0.07% 0.27% 40.80% 0.13% 0.87% 28.16% 0.14% 0.42%

Artetxe et al. (2016) 39.27% 0.07% 0.40% 41.87% 0.13% 0.73% 30.62% 0.21% 0.77%

Our method 39.67% 37.27% 39.40% 40.87% 39.60% 40.27% 28.72% 28.16% 26.47%

word translation induction

Page 71: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian English-German English-Finnish

5,000 25 num. 5,000 25 num. 5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00% 35.00% 0.00% 0.07% 25.91% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13% 41.27% 0.07% 0.53% 28.23% 0.07% 0.56%

Zhang et al. (2016) 36.73% 0.07% 0.27% 40.80% 0.13% 0.87% 28.16% 0.14% 0.42%

Artetxe et al. (2016) 39.27% 0.07% 0.40% 41.87% 0.13% 0.73% 30.62% 0.21% 0.77%

Our method 39.67% 37.27% 39.40% 40.87% 39.60% 40.27% 28.72% 28.16% 26.47%

word translation induction

Page 72: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals⇒ Test dictionary: 1,500 word pairs

English-Italian

5,000 25 num.

Mikolov et al. (2013a) 34.93% 0.00% 0.00%

Xing et al. (2015) 36.87% 0.00% 0.13%

Zhang et al. (2016) 36.73% 0.07% 0.27%

Artetxe et al. (2016) 39.27% 0.07% 0.40%

Our method 39.67% 37.27% 39.40%

word translation induction

Page 73: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

Page 74: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

crosslingual word similarity

Page 75: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

WS RG WS

crosslingual word similarity

Page 76: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

crosslingual word similarity

Page 77: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl

crosslingual word similarity

Page 78: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl

Mikolov et al. (2013a) 5k dict

Xing et al. (2015) 5k dict

Zhang et al. (2016) 5k dict

Artetxe et al. (2016) 5k dict

crosslingual word similarity

Page 79: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl

Mikolov et al. (2013a) 5k dict

Xing et al. (2015) 5k dict

Zhang et al. (2016) 5k dict

Artetxe et al. (2016) 5k dict

Our method

5k dict

25 dict

num.

crosslingual word similarity

Page 80: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl 33.1% 33.5% 35.6%

Mikolov et al. (2013a) 5k dict 62.7% 64.3% 52.8%

Xing et al. (2015) 5k dict 61.4% 70.0% 59.5%

Zhang et al. (2016) 5k dict 61.6% 70.4% 59.6%

Artetxe et al. (2016) 5k dict 61.7% 71.6% 59.7%

Our method

5k dict 62.4% 74.2% 61.6%

25 dict 62.6% 74.9% 61.2%

num. 62.8% 73.9% 60.4%

crosslingual word similarity

Page 81: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl 33.1% 33.5% 35.6%

Mikolov et al. (2013a) 5k dict 62.7% 64.3% 52.8%

Xing et al. (2015) 5k dict 61.4% 70.0% 59.5%

Zhang et al. (2016) 5k dict 61.6% 70.4% 59.6%

Artetxe et al. (2016) 5k dict 61.7% 71.6% 59.7%

Our method

5k dict 62.4% 74.2% 61.6%

25 dict 62.6% 74.9% 61.2%

num. 62.8% 73.9% 60.4%

crosslingual word similarity

Page 82: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Experiments

• Dataset by Dinu et al. (2015) extended to German and Finnish⇒ Monolingual embeddings (CBOW + negative sampling)⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

EN-IT EN-DE

Bi. data WS RG WS

Luong et al. (2015) Europarl 33.1% 33.5% 35.6%

Mikolov et al. (2013a) 5k dict 62.7% 64.3% 52.8%

Xing et al. (2015) 5k dict 61.4% 70.0% 59.5%

Zhang et al. (2016) 5k dict 61.6% 70.4% 59.6%

Artetxe et al. (2016) 5k dict 61.7% 71.6% 59.7%

Our method

5k dict 62.4% 74.2% 61.6%

25 dict 62.6% 74.9% 61.2%

num. 62.8% 73.9% 60.4%

crosslingual word similarity

Page 83: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Page 84: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Page 85: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionarysmall

Page 86: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionarysmall large

Page 87: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionarysmall

no error

large

Page 88: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionarysmall

no error

large

errors

Page 89: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

small

no error

large

errors

Page 90: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

small

no error

large

errors

better?

Page 91: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping Dictionary

small

no error

large

errors

better?

worse?

Page 92: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping Dictionary

Dictionary

small

no error

large

errors

better?

worse?

Page 93: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping Dictionary

Dictionary

small

no error

large

errors

better?

worse?

even better?

Page 94: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?

Mapping Dictionary

Monolingual embeddings

Dictionary

Mapping

Mapping Dictionary

Dictionary

small

no error

large

errors

better?

worse?

even better?

even worse?

Page 95: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

Page 96: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 97: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Independent from seed dictionary!

Page 98: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 99: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 100: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 101: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 102: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 103: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 104: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 105: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 106: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 107: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 108: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 109: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 110: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 111: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 112: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 113: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 114: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 115: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Independent from seed dictionary!

Page 116: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Independent from seed dictionary!

So why do we need a seed dictionary?

Page 117: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Independent from seed dictionary!

So why do we need a seed dictionary?

Avoid poor local optima!

Page 118: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Why does it work?𝑍𝑋𝑊

𝑊∗ = arg max𝑊

𝑖

max𝑗

𝑋𝑖∗𝑊 ∙ 𝑍𝑗∗ s.t. 𝑊𝑊𝑇 = 𝑊𝑇𝑊 = 𝐼Implicit objective:

Page 119: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

Page 120: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

• Simple self-learning method to train bilingual embedding mappings

Page 121: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

• Simple self-learning method to train bilingual embedding mappings

• High quality results with almost no supervision (25 words, numerals)

Page 122: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

• Simple self-learning method to train bilingual embedding mappings

• High quality results with almost no supervision (25 words, numerals)

• Implicit optimization objective independent from seed dictionary

Page 123: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

• Simple self-learning method to train bilingual embedding mappings

• High quality results with almost no supervision (25 words, numerals)

• Implicit optimization objective independent from seed dictionary

• Seed dictionary necessary to avoid poor local optima

Page 124: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Conclusions𝑍𝑋𝑊

• Simple self-learning method to train bilingual embedding mappings

• High quality results with almost no supervision (25 words, numerals)

• Implicit optimization objective independent from seed dictionary

• Seed dictionary necessary to avoid poor local optima

• Future work: fully unsupervised training

Page 125: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊>

Page 126: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

Page 127: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

>

Page 128: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py

Page 129: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals

Page 130: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB

Page 131: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB

Page 132: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB

>

Page 133: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB

> vecmap/reproduce_acl2017.sh

Page 134: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

One more thing…𝑍𝑋𝑊> git clone https://github.com/artetxem/vecmap.git

> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB

> vecmap/reproduce_acl2017.sh--------------------------------------------------------------------------------

ENGLISH-ITALIAN

--------------------------------------------------------------------------------

5,000 WORD DICTIONARY

- Mikolov et al. (2013a) | Translation: 34.93% MWS353: 62.66%

- Xing et al. (2015) | Translation: 36.87% MWS353: 61.41%

- Zhang et al. (2016) | Translation: 36.73% MWS353: 61.62%

- Artetxe et al. (2016) | Translation: 39.27% MWS353: 61.74%

- Proposed method | Translation: 39.67% MWS353: 62.35%

25 WORD DICTIONARY

- Mikolov et al. (2013a) | Translation: 0.00% MWS353: -6.42%

- Xing et al. (2015) | Translation: 0.00% MWS353: 19.49%

- Zhang et al. (2016) | Translation: 0.07% MWS353: 15.52%

- Artetxe et al. (2016) | Translation: 0.07% MWS353: 17.45%

- Proposed method | Translation: 37.27% MWS353: 62.64%

NUMERAL DICTIONARY

- Mikolov et al. (2013a) | Translation: 0.00% MWS353: 28.75%

- Xing et al. (2015) | Translation: 0.13% MWS353: 27.75%

- Zhang et al. (2016) | Translation: 0.27% MWS353: 27.38%

- Artetxe et al. (2016) | Translation: 0.40% MWS353: 24.85%

- Proposed method | Translation: 39.40% MWS353: 62.82%

Page 135: Learning bilingual word no bilingual data · 2017. 9. 16. · ⇒Test dictionary: 1,500 word pairs English-Italian English-German English-Finnish 5,000 25 num. 5,000 25 num. 5,000

Thank you!

𝑍𝑋𝑊

https://github.com/artetxem/vecmap