[paper introduction] bilingual word representations with monolingual quality in mind

Bilingual Word Representations with Monolingual Quality in Mind Minh-Thang Luong, Hieu Pham, Christopher D. Manning Proceedings of NAACL-HLT 2015 Workshop AHC-Lab M1 Hiroyuki Fudaba 1

Upload: naist-machine-translation-study-group

Post on 15-Apr-2017

Page 1: [Paper Introduction] Bilingual word representations with monolingual quality in mind

Bilingual Word Representations with Monolingual Quality in Mind

Minh-Thang Luong, Hieu Pham, Christopher D. Manning

Proceedings of NAACL-HLT 2015 Workshop

AHC-Lab

M1 Hiroyuki Fudaba

What are Word Representations?

Vectors representing words

• One-hot word representations: 0, 0, 0, … , 0, 1, 0, 0, 0, … , 0

• Distributed word representations [Bengio et al. 2003]: 1.1, 0.5, −3.2, 0.5, … , 0.4
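The two kinds of vectors can be contrasted in a few lines. This is only a sketch: the three-word vocabulary, the dimension of 4, and the random embedding values are assumptions made for illustration, not trained vectors.

```python
import numpy as np

# Hypothetical toy vocabulary; the words are illustrative only.
vocab = ["cat", "dog", "apple"]
word_to_id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Return a |V|-dimensional vector with a single 1 at the word's index."""
    v = np.zeros(len(vocab))
    v[word_to_id[word]] = 1.0
    return v

# A distributed representation is one row of a dense embedding matrix.
# The values are random placeholders (assumption), not learned embeddings.
rng = np.random.default_rng(0)
embedding_dim = 4
E = rng.normal(size=(len(vocab), embedding_dim))

def distributed(word):
    """Look up the dense vector of a word."""
    return E[word_to_id[word]]

print(one_hot("dog"))
print(distributed("dog"))
```

Multiplying a one-hot vector by the embedding matrix selects the same row as the lookup, which is why the dense vector can be read as a learned compression of the one-hot index.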

Distributed Word Representations

• Vectors representing words’ syntactic / semantic features

Two different languages in one vector space

Why do we need bilingual word representations?

• Crosslingual document classification

Which is more appropriate?

• companies (Apple Inc., Google): アップル株式会社 (Apple Inc., the company)

• fruits (apple, banana): りんご (apple, the fruit)

How to do 2-in-1

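• Mapping: learn a linear map 𝑦 = 𝑊𝑥 from one language's vector space into the other's (e.g. dog, cat)

• Learning with a joint model: train both languages in one space so that translations lie close together (e.g. cat / 猫, dog / 犬)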
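The mapping approach, 𝑦 = 𝑊𝑥, can be sketched as a least-squares fit over a seed dictionary of translation pairs. The vectors below are random placeholders and `translate_vector` is a hypothetical helper name; this illustrates the idea and is not the exact procedure of any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # embedding dimension (assumption)

# X: source-language vectors, one row per dictionary word (placeholders).
X = rng.normal(size=(10, d))
# Y: target-language vectors, generated here from a known map so that the
# fit can be checked; real data would come from two trained embedding sets.
W_true = rng.normal(size=(d, d))
Y = X @ W_true.T  # row-wise y = W x

# Solve min_W sum_i ||W x_i - y_i||^2.
# lstsq solves X A = Y, so A corresponds to W transposed.
A, *_ = np.linalg.lstsq(X, Y, rcond=None)
W = A.T

def translate_vector(x):
    """Map a source-language vector into the target-language space."""
    return W @ x
```

A mapped vector is then compared with target-language vectors, for example by nearest-neighbour search, to find translation candidates.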

Problem of previous work

Previous bilingual representations perform poorly on monolingual tasks

Why?

There is a trade-off between performance on bilingual tasks and performance on monolingual tasks

Paper’s approach

Substitute a word with its aligned translation and predict the surrounding words

Which one to substitute?

1. No alignment needed: assume the two sentences align monotonically (BiSkip-MonoAlign)

2. Align with an unsupervised aligner before substitution (BiSkip-UnsupAlign)

I have a dog .

私は 犬を 飼って います .
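A minimal sketch of the monotonic assumption behind BiSkip-MonoAlign: the source word at position i is paired with the target word at roughly position i·T/S, where S and T are the sentence lengths. Rounding down and the function name `monotonic_alignment` are assumptions of this sketch.

```python
def monotonic_alignment(src_tokens, tgt_tokens):
    """Pair source position i with target position floor(i * T / S)."""
    S, T = len(src_tokens), len(tgt_tokens)
    # min() guards the upper bound; floor rounding is an assumption here.
    return [(i, min(T - 1, (i * T) // S)) for i in range(S)]

src = "I have a dog .".split()
tgt = "私は 犬を 飼って います .".split()

pairs = [(src[i], tgt[j]) for i, j in monotonic_alignment(src, tgt)]
for s, t in pairs:
    print(s, "->", t)
```

On this sentence pair the monotonic guess links "dog" to "います" rather than "犬を", because English and Japanese order their words differently. That mismatch is the motivation for aligning before substitution.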

Bilingual Skipgram Model

Try to predict “is my , Delicious” from “犬” (dog)
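The training pairs of such a model can be sketched as follows: ordinary skip-gram pairs within each language, plus cross-lingual pairs in which a word predicts the neighbours of its aligned counterpart. The function names, the window size, and the single alignment link are assumptions made for illustration.

```python
def skipgram_pairs(tokens, window):
    """Monolingual skip-gram pairs: (center word, context word)."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs

def bilingual_pairs(src, tgt, alignment, window):
    """Cross-lingual pairs: a word predicts the context of its aligned word."""
    pairs = []
    for i, j in alignment:
        for k in range(max(0, i - window), min(len(src), i + window + 1)):
            if k != i:
                pairs.append((tgt[j], src[k]))  # target word -> source neighbours
        for k in range(max(0, j - window), min(len(tgt), j + window + 1)):
            if k != j:
                pairs.append((src[i], tgt[k]))  # source word -> target neighbours
    return pairs

src = "I have a dog .".split()
tgt = "私は 犬を 飼って います .".split()
alignment = [(3, 1)]  # hypothetical link: dog <-> 犬を

cross = bilingual_pairs(src, tgt, alignment, window=1)
all_pairs = skipgram_pairs(src, 1) + skipgram_pairs(tgt, 1) + cross
print(cross)
```

Every pair would then be fed to the usual skip-gram objective, so the monolingual and bilingual signals are trained jointly in one space.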

Evaluation: word similarity

• Measures the semantic quality of the word vectors monolingually

e.g. tiger vs. cat; computer vs. keyboard vs. internet
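Word similarity is typically scored by comparing the model's cosine similarities with human ratings through a rank correlation. In the sketch below the 2-d vectors and the human scores are made-up placeholders, not values from any real dataset, and the `spearman` helper assumes there are no ties.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def spearman(a, b):
    """Spearman rank correlation; this simple version assumes no tied values."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

# Hypothetical toy vectors (assumption), chosen only to make the example run.
vectors = {
    "tiger": np.array([1.0, 0.1]),
    "cat": np.array([0.9, 0.2]),
    "computer": np.array([0.1, 1.0]),
    "keyboard": np.array([0.3, 0.9]),
    "internet": np.array([0.5, 0.8]),
}
pairs = [("tiger", "cat"), ("computer", "keyboard"),
         ("computer", "internet"), ("tiger", "computer")]
human = [9.0, 8.0, 6.0, 1.0]  # placeholder ratings, not real annotations

model = [cosine(vectors[a], vectors[b]) for a, b in pairs]
rho = spearman(model, human)
print(rho)
```

A higher correlation means the vector space orders word pairs more like human judges do.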

Evaluation: CLDC

CLDC: cross-lingual document classification

Train a document classifier (perceptron) on documents represented with language A's vectors, then use it to classify documents in language B
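The CLDC setup can be sketched with a plain perceptron over averaged word vectors. Everything concrete here is an assumption for illustration: the toy 2-d bilingual vectors, the two classes (finance = 1, sports = -1), and the use of an unweighted average and a non-averaged perceptron.

```python
import numpy as np

# Hypothetical shared bilingual space; the values are made up.
emb = {
    "stock": np.array([1.0, 0.0]), "bank": np.array([0.9, 0.1]),
    "goal": np.array([0.0, 1.0]), "match": np.array([0.1, 0.9]),
    "株": np.array([0.95, 0.05]), "銀行": np.array([0.85, 0.15]),
    "ゴール": np.array([0.05, 0.95]), "試合": np.array([0.15, 0.85]),
}

def doc_vector(tokens):
    """Represent a document as the mean of its word vectors."""
    return np.mean([emb[t] for t in tokens], axis=0)

def train_perceptron(X, y, epochs=10):
    """Plain perceptron with labels in {+1, -1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            if label * (w @ x + b) <= 0:  # misclassified: update
                w += label * x
                b += label
    return w, b

# Train on language A (English) documents only.
train_docs = [["stock", "bank"], ["bank"], ["goal", "match"], ["match"]]
train_y = [1, 1, -1, -1]
X = np.array([doc_vector(d) for d in train_docs])
w, b = train_perceptron(X, train_y)

# Classify language B (Japanese) documents with the same classifier.
test_docs = [["株", "銀行"], ["ゴール", "試合"]]
pred = [1 if w @ doc_vector(d) + b > 0 else -1 for d in test_docs]
print(pred)
```

The classifier never sees Japanese during training. It transfers only because translations sit near each other in the shared space, which is what CLDC accuracy measures.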

Result

Conclusion and future work

What this paper says

• Substituting words makes better bilingual word representations

Future work

• Pivoting to improve performance

References

• [Bengio et al. 2003] A Neural Probabilistic Language Model

• [Xiaochuan et al. 2011] Cross Lingual Text Classification by Mining Multilingual Topics from Wikipedia
