Learning Spoken Language Representations with Neural Lattice Language Modeling

Chao-Wei Huang, Yun-Nung (Vivian) Chen

NTU MIULAB, National Taiwan University

[email protected] [email protected]

Code available at https://github.com/MiuLab/Lattice-ELMo

Post on 21-Sep-2020

Page 2: Highlights

• The idea of LM pretraining is adopted on lattices

• We introduce a lattice language modeling objective

• A 2-stage framework is proposed for learning contextualized representations of lattices efficiently

Page 3: Task: Spoken Language Understanding

• Intuitive way for SLU: a pipelined approach (ASR → NLU)

• ASR errors affect downstream tasks

We can preserve uncertainty using ASR lattices

Page 4: Preserve uncertainty using ASR lattices

• Lattices: directed acyclic graphs which encode several ASR hypotheses
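A minimal sketch of this definition: the lattice is stored as a DAG whose edges carry a word and a transition probability, and every path from the start node to the end node is one ASR hypothesis. The `Edge` type, the `hypotheses` helper, and the toy words and probabilities are illustrative assumptions, not taken from the slides or the released code.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    src: int      # node the transition leaves
    dst: int      # node the transition enters
    word: str     # word hypothesised by the ASR system
    prob: float   # transition probability (toy value)


# Toy lattice (hypothetical): competing words at the first and last positions.
EDGES = [
    Edge(0, 1, "what", 0.8),
    Edge(0, 1, "watt", 0.2),
    Edge(1, 2, "a", 1.0),
    Edge(2, 3, "day", 0.9),
    Edge(2, 3, "they", 0.1),
]


def hypotheses(edges, start, end):
    """Enumerate every start-to-end path as (sentence, path probability)."""
    outgoing = defaultdict(list)
    for e in edges:
        outgoing[e.src].append(e)

    results = []

    def walk(node, words, prob):
        if node == end:
            results.append((" ".join(words), prob))
            return
        for e in outgoing[node]:
            walk(e.dst, words + [e.word], prob * e.prob)

    walk(start, [], 1.0)
    return results


if __name__ == "__main__":
    ranked = sorted(hypotheses(EDGES, 0, 3), key=lambda item: -item[1])
    for sentence, prob in ranked:
        print(f"{prob:.2f}  {sentence}")
```

Five edges encode four hypotheses here. A 1-best pipeline would keep only the most probable path and discard the rest.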

Page 5: Preserve uncertainty using ASR lattices

• Using lattices helps: LatticeRNN

• LM pre-training helps: ELMo

Can we combine them?

Page 6: Lattice language modeling

• Use LatticeLSTM to encode nodes of a lattice

• Ask the model to predict the outgoing transitions (words) given a node’s representation

• When the lattice has only one hypothesis, this reduces to normal language modeling
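The objective above can be sketched as a cross-entropy over each node's outgoing transitions. This is a sketch under assumptions, not the released implementation: the target weights are taken to be the normalised transition probabilities, `node_logits` stands in for a linear layer applied to the LatticeLSTM node representation, and all names and toy values are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def lattice_lm_loss(node_logits, outgoing, vocab):
    """Cross-entropy between predicted and observed outgoing transitions.

    node_logits: {node: scores over the vocabulary}, a stand-in for the
                 encoder output followed by a linear layer (assumption).
    outgoing:    {node: [(word, transition_prob), ...]}.
    vocab:       {word: index into the score list}.
    """
    loss = 0.0
    for node, transitions in outgoing.items():
        log_probs = log_softmax(node_logits[node])
        total = sum(p for _, p in transitions)  # normalise target weights
        for word, p in transitions:
            loss -= (p / total) * log_probs[vocab[word]]
    return loss


VOCAB = {"what": 0, "a": 1, "day": 2, "they": 3, "<EOS>": 4}
UNIFORM = [0.0] * len(VOCAB)  # placeholder scores: every word equally likely

# Lattice with two competing words after "a" (toy probabilities).
lattice = {1: [("a", 1.0)], 2: [("day", 0.9), ("they", 0.1)]}
# Single hypothesis "what a day": each node has exactly one next word.
chain = {1: [("a", 1.0)], 2: [("day", 1.0)], 3: [("<EOS>", 1.0)]}

for name, graph in [("lattice", lattice), ("chain", chain)]:
    logits = {node: UNIFORM for node in graph}
    print(name, round(lattice_lm_loss(logits, graph, VOCAB), 4))
```

With uniform scores every word has log-probability -log 5, so the chain loss is 3 log 5. Ordinary next-word cross-entropy gives the same value, which illustrates the single-hypothesis case in the last bullet.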

Page 7: Lattice language modeling

• So now we can pre-train a LatticeELMo!

• However, LatticeLSTM runs prohibitively slowly

• Observation: sequential text is actually a lattice with only one hypothesis, so normal LM pre-training is also lattice LM pre-training

We can do pre-training in two stages!
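The observation can be checked on a toy sentence: turning the sentence into a chain-shaped lattice gives every node a single outgoing transition with probability 1.0, which is exactly the next-word target of a sequential LM. The helper names and the edge layout `(src, dst, word, prob)` are assumptions made for this sketch.

```python
def sentence_to_lattice(tokens, eos="<EOS>"):
    """Return edges (src, dst, word, prob) of a chain lattice for one sentence."""
    words = list(tokens) + [eos]
    return [(i, i + 1, word, 1.0) for i, word in enumerate(words)]


def lattice_targets(edges):
    """Map each node to the words (and probabilities) on its outgoing edges."""
    targets = {}
    for src, _dst, word, prob in edges:
        targets.setdefault(src, []).append((word, prob))
    return targets


tokens = ["what", "a", "day"]
edges = sentence_to_lattice(tokens)
targets = lattice_targets(edges)

# Node i + 1 is the state reached after reading tokens[i]; its only outgoing
# transition is the following token, i.e. the usual sequential LM target.
next_word = {i + 1: [(word, 1.0)] for i, word in enumerate(tokens[1:] + ["<EOS>"])}
assert all(targets[node] == next_word[node] for node in next_word)
```

Since plain text yields the same targets, the first stage can presumably use an ordinary sequential LSTM on large text corpora and leave the slower LatticeLSTM for the lattice stage.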

Page 8: Two-stage pre-training

[Figure: three panels]

• Stage 1: Pre-Training on Sequential Texts. LSTM layers read "What a day" and a Linear layer predicts "a day <EOS>"

• Stage 2: Pre-Training on Lattices. A LatticeLSTM reads a lattice whose transitions carry words and probabilities (e.g. "the, 1.0") and a Linear layer predicts the outgoing transitions

• Training Target Task Classifier. LatticeLSTM, max pooling, classification
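The classifier panel (max pooling over the lattice encoder's outputs, then classification) can be sketched in a few lines. The node vectors, weights, and two-class setup below are hypothetical toy values. In the framework the node vectors would come from the pre-trained LatticeLSTM, and the real classifier head may differ.

```python
def max_pool(node_vectors):
    """Element-wise maximum over a list of equal-length node vectors."""
    return [max(column) for column in zip(*node_vectors)]


def linear(vector, weights, bias):
    """One score per class: dot(weights[c], vector) + bias[c]."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


# Toy node representations, one vector per lattice node (hypothetical).
nodes = [
    [0.1, -0.4, 0.3],
    [0.7, 0.2, -0.1],
    [-0.2, 0.5, 0.0],
]
pooled = max_pool(nodes)  # one fixed-size vector, whatever the lattice size

weights = [
    [1.0, 0.0, 0.0],  # class 0
    [0.0, 1.0, 1.0],  # class 1
]
bias = [0.0, 0.0]
scores = linear(pooled, weights, bias)
predicted = max(range(len(scores)), key=scores.__getitem__)
print(pooled, scores, predicted)
```

Max pooling makes the classifier input independent of how many nodes the lattice has, which is one plausible reason to use it for variable-sized lattices.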

Page 9: Results

[Bar chart; values read from the bar labels]

System           ATIS    SNIPS   SWDA    MRDA
Manual + ELMo    100     96.8    72.18   81.48
1-best           91.6    91.89   60.54   67.35
1-best + ELMo    94.99   91.98   61.65   68.52
LatticeLSTM      91.69   93.43   61.29   69.95
Proposed         95.84   95.37   62.88   72.04
BERT-base        95.97   93.29   61.23   67.9

Page 10: Conclusion

• We extend the sequential LM objective to a lattice language modeling objective

• We propose a 2-stage framework for learning contextualized representations of lattices efficiently

• Experiments on various SLU tasks show that our proposed framework provides consistent improvements

Page 11: Thanks for listening!

Code available at https://github.com/MiuLab/Lattice-ELMo

Chao-Wei Huang, Yun-Nung (Vivian) Chen

[email protected] [email protected]