[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Network Language Models
Efficient Lattice Rescoring using Recurrent Neural Network Language Models
X. Liu, Y. Wang, X. Chen, M. J. F. Gales & P. C. Woodland, Proc. of ICASSP 2014
Introduced by Makoto Morishita 2016/02/25 MT Study Group
What is a Language Model
• Language models assign a probability to each sentence.
W1 = speech recognition system: P(W1) = 4.021 × 10^-3
W2 = speech cognition system: P(W2) = 8.932 × 10^-4
W3 = speck podcast histamine: P(W3) = 2.432 × 10^-7
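The scoring above follows the chain rule: a sentence probability is the product of per-word conditional probabilities. A minimal sketch (the toy uniform model and the resulting probability are illustrative, not the slide's numbers):

```python
import math

def sentence_logprob(words, cond_logprob):
    """Sum log P(w_i | w_1..w_{i-1}) over the sentence (chain rule)."""
    total = 0.0
    history = []
    for w in words:
        total += cond_logprob(tuple(history), w)
        history.append(w)
    return total

# Toy conditional model: every word equally likely in a 10-word vocabulary.
uniform = lambda history, w: math.log(1.0 / 10)

p = math.exp(sentence_logprob(["speech", "recognition", "system"], uniform))
# 3 words, each with probability 0.1 -> P = 1e-3
```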
W1 gets the highest probability, so the LM judges it best!
In this paper…
• The authors propose two new methods to efficiently re-score speech recognition lattices.
[Figure: a speech recognition lattice with nodes 0-9 whose arcs carry competing word hypotheses such as "hi" / "high" / "hy", "this", "is", "my", "mobile", "phone" / "phones".]
Language Models
n-gram back-off model
• Use the previous n-1 words to estimate the next-word probability.
[Figure: predicting the next word after "This is my mobile ...", e.g. "phone" vs. "home".]
n-gram back-off model
• For a bi-gram model, only the immediately preceding word is used to predict the next word.
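A minimal sketch of the back-off idea (the toy probabilities and the constant back-off weight are hypothetical; real models use history-dependent back-off weights): use the bi-gram estimate when the word pair was seen, otherwise fall back to a discounted uni-gram estimate.

```python
# Hypothetical toy estimates, not the paper's model.
bigram  = {("mobile", "phone"): 0.6}
unigram = {"phone": 0.01, "home": 0.005}
BACKOFF_WEIGHT = 0.4  # assumed constant for illustration

def backoff_prob(prev, word):
    # Prefer the bi-gram estimate; back off to the uni-gram otherwise.
    if (prev, word) in bigram:
        return bigram[(prev, word)]
    return BACKOFF_WEIGHT * unigram.get(word, 1e-6)

p_seen   = backoff_prob("mobile", "phone")  # direct bi-gram estimate
p_unseen = backoff_prob("mobile", "home")   # falls back to the uni-gram
```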
Feedforward neural network language model
• Estimate the next-word probability from the previous n-1 words with a feedforward neural network.
[Y. Bengio et al. 2002]
[Figure: feedforward NNLM architecture; image from http://kiyukuta.github.io/2013/12/09/mlac2013_day9_recurrent_neural_network_language_model.html]
Recurrent neural network language model
• Use the full history context with a recurrent neural network.
[T. Mikolov et al. 2010]
[Figure: RNNLM architecture. The current word w_{i-1} enters as a 1-of-k vector, and the previous hidden state s_{i-2} carries the history; a sigmoid hidden layer produces the new state s_{i-1}, and a softmax output layer gives P(w_i | w_{i-1}, s_{i-2}).]
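One RNNLM step can be sketched as follows (toy dimensions and untrained weights, purely illustrative; the structure mirrors the sigmoid hidden layer and softmax output described above):

```python
import math

VOCAB = ["this", "is", "my", "mobile", "phone"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    # Plain matrix-vector product on nested lists.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    zmax = max(z)
    exps = [math.exp(x - zmax) for x in z]
    s = sum(exps)
    return [e / s for e in exps]

def rnnlm_step(word, s_prev, U, W, V):
    x = [1.0 if w == word else 0.0 for w in VOCAB]  # 1-of-k input encoding
    pre = [u + w for u, w in zip(matvec(U, x), matvec(W, s_prev))]
    s = [sigmoid(p) for p in pre]                   # new history vector s_{i-1}
    return softmax(matvec(V, s)), s                 # P(w_i | ...), new state

H = 3  # toy hidden size
U = [[0.1] * len(VOCAB) for _ in range(H)]
W = [[0.1] * H for _ in range(H)]
V = [[0.1] * H for _ in range(len(VOCAB))]
probs, state = rnnlm_step("mobile", [0.0] * H, U, W, V)
```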
Language Model States
LM states
• To use an LM for the re-scoring task, we need to store LM states so that hypotheses in the lattice can be scored efficiently.
bi-gram
[Figure: an SR lattice with nodes 0-3 and word arcs a, b, c, d, e, together with its bi-gram LM states (0<s>, 1a, 1b, 2c, 2d, 3e): a bi-gram state is identified by the lattice node plus the last word only.]
tri-gram
[Figure: the same SR lattice expanded with tri-gram LM states (0<s>; 1<s>,a; 2<s>,b; 2a,c; 2a,d; 3e,c; 3e,d): a tri-gram state is identified by the node plus the last two words, so the lattice must be expanded into more states.]
As n grows, the set of LM states becomes larger!
Difference
• n-gram back-off model & feedforward NNLM: use only a fixed n-gram context.
• Recurrent NNLM: uses the whole past word history, so LM states grow rapidly and the computational cost becomes high.

We want to reduce the number of recurrent NNLM states.
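The state growth can be illustrated by counting states (toy paths based on the earlier lattice example; keying states by word tuples is a sketch, not the paper's implementation): an n-gram LM state is only the last n-1 words, but an exact RNNLM state is the full history, so every distinct path gets its own state.

```python
# Three competing paths through a toy lattice (hypothetical).
paths = [
    ("hi", "this", "is", "my", "mobile", "phone"),
    ("high", "this", "is", "my", "mobile", "phone"),
    ("hy", "this", "is", "my", "mobile", "phone"),
]

# RNNLM: one state per distinct full history before the final word.
full_histories = {p[:-1] for p in paths}
# Tri-gram LM: state is only the last 2 words before the final word.
trigram_states = {p[-3:-1] for p in paths}

# Three distinct full histories collapse to one tri-gram state ("my", "mobile").
```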
Hypothesis
Context information gradually diminishes
• We don't have to distinguish all of the histories.
• e.g. "I am presenting the paper about RNNLM." ≒ "We are presenting the paper about RNNLM."
Similar histories make similar vectors
• We don't have to distinguish all of the histories.
• e.g. "I am presenting the paper about RNNLM." ≒ "I am introducing the paper about RNNLM."
Proposed Method
n-gram based history clustering
• "I am presenting the paper about RNNLM." ≒ "We are presenting the paper about RNNLM."
• If the last n-gram words are the same, we use the same history vector.
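A minimal sketch of this idea (the helper names and the dummy vector are hypothetical): histories sharing the same last n-1 words are mapped to one cached history vector, so the RNN forward pass runs only once per truncated context.

```python
def make_clustered_lookup(n, run_rnn):
    cache = {}
    def history_vector(history):
        key = tuple(history[-(n - 1):])    # truncate to the last n-1 words
        if key not in cache:
            cache[key] = run_rnn(history)  # compute once per n-gram context
        return cache[key]
    return history_vector, cache

calls = []
def fake_rnn(history):  # stand-in for a real RNN forward pass
    calls.append(tuple(history))
    return [0.0]        # dummy history vector

lookup, cache = make_clustered_lookup(3, fake_rnn)
lookup(["i", "am", "presenting", "the", "paper"])
lookup(["we", "are", "presenting", "the", "paper"])  # same last 2 words: cache hit
```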
History vector based clustering
• "I am presenting the paper about RNNLM." ≒ "I am introducing the paper about RNNLM."
• If the history vector is similar to an existing one, we reuse that history vector.
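A minimal sketch of history vector based clustering (the distance measure, threshold, and toy vectors are assumptions for illustration; the paper defines its own similarity measure): a new history vector is merged into an existing cluster when it lies within a threshold of that cluster's representative vector.

```python
def find_or_add(vec, clusters, threshold=0.1):
    # Merge into an existing cluster if some representative is close enough.
    for rep in clusters:
        dist = max(abs(a - b) for a, b in zip(rep, vec))
        if dist <= threshold:
            return rep        # reuse the existing representative vector
    clusters.append(vec)      # otherwise start a new cluster
    return vec

clusters = []
v1 = find_or_add([0.50, 0.20], clusters)
v2 = find_or_add([0.52, 0.18], clusters)  # close to v1: merged, no new cluster
v3 = find_or_add([0.90, 0.70], clusters)  # far away: becomes a new cluster
```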
Experiments
Experimental results
[Table: WER for the baseline 4-gram back-off LM and feedforward NNLM, RNNLM re-ranking, and the two proposed methods (RNNLM n-gram based history clustering and RNNLM history vector based clustering).]
Comparable WER and 70% reduction in lattice size.
RNNLM n-gram based history clustering vs. RNNLM history vector based clustering: same WER and 45% reduction in lattice size.
RNNLM n-gram based history clustering vs. RNNLM history vector based clustering: same WER and 7% reduction in lattice size.
Comparable WER and 72.4% reduction in lattice size.
Conclusion
• The proposed methods achieve WER comparable to 10k-best re-ranking, with over 70% compression in lattice size.
• Smaller lattices reduce the computational cost!
References
• "This is also Deep Learning in a sense: on Recurrent Neural Network Language Models" [MLAC2013 Day 9] http://kiyukuta.github.io/2013/12/09/mlac2013_day9_recurrent_neural_network_language_model.html
Prefix tree structuring