
Page 1:

Dialogue Generation
March 31, 2020

Paula Gradu, Xinyi Chen

Page 2:

Dialogue Agents

● Personal assistants:
○ Google Assistant, Alexa, Bixby

● Communicating with robots

● Therapy for mental health

● For fun

Page 3:

Two Classes of Systems

● Chatbots (this lecture and next lecture)

○ Open domain

● Goal-based dialogue agents (next Tuesday)

○ Book restaurants or flights

○ Closed domain

Page 4:

Previous Chatbots

● Eliza (1966)

● Parry (1971)

● CleverBot

● Microsoft XiaoIce (小冰)

Page 5:

Chatbot Architectures

● Rule-based
○ Pattern-action rules (Eliza)
○ Pattern-action rules + a mental model (Parry)

● Corpus-based
○ Information Retrieval (IR) (CleverBot)
○ Neural network encoder-decoder (this lecture)

Pages 6-8: Eliza

The trick: act as a Rogerian psychotherapist, drawing patients out by reflecting their statements back at them.

[Sample Eliza conversations shown as figures; not extracted.]

Page 9:

CleverBot

https://www.cleverbot.com

Page 10:

IR-based Systems

● Given a user query, find the most similar turn in the corpus and return its response

● Alternatively, return the most similar turn itself
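A minimal sketch of such an IR-based responder (the corpus, names, and data here are illustrative, not from the lecture): index corpus turns with TF-IDF, find the nearest turn to the query by cosine similarity, and return the response that followed it.

```python
# Illustrative IR chatbot: retrieve the corpus turn most similar to the
# user query and return the response that followed that turn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy (turn, response) pairs harvested from conversation logs.
corpus = [
    ("my laptop will not boot", "try holding the power button for 10 seconds"),
    ("how do i reset my password", "open the account page and click 'forgot password'"),
    ("the printer is jammed", "open the rear tray and remove the stuck sheet"),
]

turns = [turn for turn, _ in corpus]
vectorizer = TfidfVectorizer()
turn_vectors = vectorizer.fit_transform(turns)

def respond(query: str) -> str:
    """Return the response paired with the corpus turn closest to the query."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, turn_vectors)[0]
    return corpus[scores.argmax()][1]

print(respond("i forgot my password"))  # -> the 'forgot password' response
```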

Page 11:

Evaluation

● Ideally we want human evaluation, but it is infeasible during training and expensive even at evaluation time

● Heuristics: perplexity, BLEU, dialogue length/diversity

● Still an open question

Page 12:

This lecture: neural conversational models

● Paper 1: (Vinyals and Le, 2015) A Neural Conversational Model
○ Pioneered seq2seq models for dialogue generation, trained on both large closed-domain and open-domain datasets

● Paper 2: (Li et al., 2016) Deep Reinforcement Learning for Dialogue Generation
○ Proposed applying deep reinforcement learning to model future reward in chatbot conversations

Page 13:

Paper 1: A Neural Conversational Model

Motivation:

● Previously this task was specialized to a narrow domain and relied on hand-engineered features.

● Advances in seq2seq models make them suitable for conversational modeling.

● This work casts dialogue as sequence prediction: an end-to-end approach that is not domain-specific.

Page 14:

Concurrent Work

● Twitter: A Neural Network Approach to Context-Sensitive Generation of Conversational Responses (Sordoni et al. 2015)

● Weibo: Neural Responding Machine for Short-Text Conversation (Shang et al. 2015)

● Main Idea: Use RNNs to model dialogue in short conversations

Page 15:

Concurrent Work on Twitter

A Neural Network Approach to Context-Sensitive Generation of Conversational Responses (Sordoni et al. 2015)

● Encoding step: multilayer non-linear feed-forward architecture

● Decoding step: Recurrent Neural Network Language Model (RLM)

Page 16:

Concurrent Work on Twitter

Page 17:

Concurrent Work on Weibo

Neural Responding Machine for Short-Text Conversation (Shang et al. 2015)

● Encoder-decoder framework

● Both steps use an RNN

● Argues against the seq2seq encoding approach

Page 18:

Concurrent Work on Weibo

Page 19:

Seq2Seq ‘backbone’

● Encode the input sequence to fixed-size vector w/ one RNN

● Decode the vector to the target sequence w/ another RNN

● <EOS> marker enables model to define a probability distribution over sequences of all possible lengths.
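In symbols (a standard reconstruction; the notation below is ours, not from the slides): the encoder compresses the input x_1, …, x_S into a vector v, and the decoder factorizes the output distribution autoregressively,

```latex
p(y_1,\dots,y_T \mid x_1,\dots,x_S) \;=\; \prod_{t=1}^{T} p\!\left(y_t \mid v,\; y_1,\dots,y_{t-1}\right),
```

where generation stops once y_T = <EOS>; summing over all lengths T, this defines a single distribution over variable-length sequences.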

Page 20:

Seq2Seq Models for Response Generation

● Input: Context

● Output: Response

● Loss: Cross-entropy

● Teacher forcing during training

● ‘Greedy’ inference
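To make the recipe above concrete, here is a minimal PyTorch-style sketch (all class, variable, and toy-data names are ours, not from the paper): teacher forcing feeds the gold response, shifted right by one position, as the decoder input; greedy inference feeds the argmax token back in until <EOS>.

```python
# Sketch of teacher-forced training and greedy decoding for a seq2seq
# response generator (toy model and data; assumed names throughout).
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, context, response_in):
        _, state = self.encoder(self.embed(context))   # encode the context
        dec_out, _ = self.decoder(self.embed(response_in), state)
        return self.out(dec_out)                       # per-step vocab logits

VOCAB, BOS, EOS = 100, 1, 2
model = TinySeq2Seq(VOCAB)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

context = torch.randint(3, VOCAB, (8, 10))    # toy batch of contexts
response = torch.randint(3, VOCAB, (8, 6))    # toy gold responses

# Teacher forcing: the decoder sees the gold prefix, predicts the next token.
dec_in = torch.cat([torch.full((8, 1), BOS), response[:, :-1]], dim=1)
logits = model(context, dec_in)
loss = loss_fn(logits.reshape(-1, VOCAB), response.reshape(-1))  # cross-entropy
opt.zero_grad()
loss.backward()
opt.step()

# Greedy inference: feed the argmax token back in until <EOS>.
@torch.no_grad()
def greedy_decode(ctx, max_len=20):
    _, state = model.encoder(model.embed(ctx))
    tok, out = torch.tensor([[BOS]]), []
    for _ in range(max_len):
        h, state = model.decoder(model.embed(tok), state)
        tok = model.out(h)[:, -1].argmax(-1, keepdim=True)
        if tok.item() == EOS:
            break
        out.append(tok.item())
    return out

print(greedy_decode(context[:1]))
```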

Pages 21-30: Seq2Seq Models for Response Generation

[Step-by-step animation of the encoder reading the context and the decoder emitting the response token by token; figures not extracted.]

Page 31:

Closed Domain Dataset: IT Helpdesk

Page 32:

IT Helpdesk (Closed Domain) Experiment

● Dataset:
○ From an IT helpdesk troubleshooting chat service
○ ~400 words per interaction
○ Turns are clearly signaled
○ Training: ~30M tokens
○ Validation: ~3M tokens

● Model:
○ Vocab: most common 20K words
○ 1-layer LSTM, 1024 memory cells
○ SGD

Pages 33-41: IT Helpdesk Example

[A sample VPN-troubleshooting conversation with the model, revealed turn by turn; figures not extracted.]

The <URL> indeed contains information about VPN access!!

Page 42:

IT Helpdesk Example Take-aways

● Pros:
○ The model sometimes successfully resolves the issue

● Cons:
○ Doesn't always make sense
○ Doesn't always retain information
○ The English is often poor
○ Very non-human-like

Page 43:

Open Domain Dataset: OpenSubtitles

Page 44:

OpenSubtitles (Open Domain) Experiment

● Dataset:
○ Movie subtitle transcripts
○ Assume every sentence is a turn
○ Training: ~920M tokens
○ Validation: ~395M tokens

● Model:
○ Vocab: most frequent 100K words
○ 2-layer LSTM, 4096 memory cells
○ AdaGrad

Pages 45-52: OpenSubtitles Examples

[Sample conversations with the open-domain model, revealed turn by turn; figures not extracted.]

Page 53:

OpenSubtitles Examples Take-away I

● Pros:

○ Remembering facts

○ Understanding concepts

○ Common sense reasoning

○ Generalization

Pages 54-57: OpenSubtitles Examples (continued)

[Further sample conversations, illustrating the model's weaknesses; figures not extracted.]

Page 58:

OpenSubtitles Examples Take-away II

● Cons:

○ No coherent personality

○ Unsatisfying replies

Pages 59-60: Comparison with CleverBot

[Side-by-side sample responses from the proposed model and CleverBot; figures not extracted.]

Page 61:

Evaluation: Perplexity

● 8 for IT Helpdesk (vs. 18 for n-gram model)

● 17 for OpenSubtitles (vs. 28 for smoothed 5-gram)
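For reference, perplexity is just the exponentiated average per-token negative log-likelihood on held-out text; a toy computation:

```python
# Perplexity = exp(mean negative log-likelihood per token).
import math

token_probs = [0.2, 0.5, 0.1, 0.4]   # toy per-token model probabilities
nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
print(round(math.exp(nll), 2))  # ~3.98: on average the model is as uncertain
                                # as a uniform choice among ~4 tokens
```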

Pages 62-68: Human Evaluation

● Four different humans rate the proposed model vs. CleverBot

● 48.5%: proposed model preferred

● 30%: CleverBot preferred

● 21.5%: tie or disagreement

Page 69:

Designing a good metric to quickly measure the quality of a conversational model remains an open question!

Page 70:

Advantages

● General architecture, can be used in translation, Q&A, etc.

● Simple implementation

● Among the first end-to-end neural network approaches to dialogue generation

Page 71:

Limitations

● The objective function doesn’t capture the actual objective of human communication

● Lacks long-term consistency

● Doesn’t measure the amount of information exchanged

Page 72:

Paper 2: Deep RL for Dialogue Generation

● Vanilla seq2seq gives repetitive or boring replies

● MLE may be a bad objective for approximating the real-world goals of chatbots

Pages 73-76: Problem 1: Dull and generic responses

The “I don’t know” problem (Sordoni et al., 2015; Serban et al., 2015):

How old are you ?
→ I don’t know.

How is life ?
→ I don’t know what you are talking about.

Pages 77-81: Problem 2: Repetitive responses

Shut up !
→ No, you shut up !
→ No, you shut up !
→ No, you shut up !
→ ……

Pages 82-89: Problem 3: Short-sighted decisions

How old are you ?
→ i 'm 16 .
→ 16 ?
→ i don 't know what you 're talking about
→ you don 't know what you 're saying
→ i don 't know what you 're talking about
→ you don 't know what you 're saying

The answer “i 'm 16 .” is the bad action here: it gives the conversation no way to move forward.

Page 90:

Deep RL for Dialogue Generation

● Want to reward interesting, diverse and informative replies

● Need ability to model future direction of a dialogue

● Goal: integrate the seq2seq and reinforcement learning paradigms

Page 91:

Why RL?

● A powerful paradigm for optimizing long-term goals across a conversation

● The ability to integrate rewards that better mimic the true goal of chatbot development

● Overcomes the limitations of the MLE objective

Page 92:

RL Recap

● RL problem formulation

■ action space
■ state space
■ transition probability
■ reward function
■ time horizon

● Goal: learn a policy that maximizes the expected reward.

● Policy: a probability distribution over actions given a state

Page 93:

RL for Open-Domain Dialogue

● Action: the dialogue utterance to generate

● State: the previous two dialogue turns

● Policy: the LSTM encoder-decoder, defined by its parameters

● Reward: a computable reward function observed after the agent reaches the end of each sentence (more about this on the next slides!)

Page 94:

Reward I: Ease of Answering

● Define set of dull responses S = { “I have no idea”, “I don’t know what you’re talking about”, … }

● Reward negative log likelihood of responding with a sentence in S

● Hope: a model less likely to generate replies in S is also less likely to give other dull replies
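Written out as in Li et al. (2016), with N_S the number of dull responses in S and N_s the length of dull response s:

```latex
r_1 \;=\; -\frac{1}{N_S}\sum_{s \in S}\frac{1}{N_s}\,\log p_{\text{seq2seq}}(s \mid a)
```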

Page 95:

Reward II: Information Flow

● Want semantic dissimilarity between consecutive turns of the same agent

● Reward negative log of cosine similarity

● Hope: agent will contribute new info to keep dialogue moving and avoid repetitive sequences
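Concretely, letting h_{p_i} and h_{p_{i+1}} denote the encoder representations of the agent's two consecutive turns, the paper's reward is:

```latex
r_2 \;=\; -\log \cos\!\left(h_{p_i}, h_{p_{i+1}}\right) \;=\; -\log \frac{h_{p_i} \cdot h_{p_{i+1}}}{\lVert h_{p_i} \rVert\, \lVert h_{p_{i+1}} \rVert}
```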

Page 96:

Reward III: Semantic Coherence

● Want coherent and appropriate replies

● Model by mutual information between action a and previous turns

■ Train a backward seq2seq with sources and targets swapped

■ Scale probabilities by target length
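In the paper's notation, with a the generated response, [p_i, q_i] the previous two turns, and N_a, N_{q_i} the lengths used for scaling:

```latex
r_3 \;=\; \frac{1}{N_a}\log p_{\text{seq2seq}}(a \mid q_i, p_i) \;+\; \frac{1}{N_{q_i}}\log p^{\text{backward}}_{\text{seq2seq}}(q_i \mid a)
```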

Page 97:

Total Reward

● weighted sum of the 3 rewards discussed

● λ1 = 0.25 [ ease of answering ]

● λ2 = 0.25 [ information flow ]

● λ3 = 0.5 [ semantic coherence ]
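So the reward for action a given the state [p_i, q_i] is:

```latex
r(a, [p_i, q_i]) \;=\; \lambda_1 r_1 + \lambda_2 r_2 + \lambda_3 r_3, \qquad \lambda_1 = 0.25,\; \lambda_2 = 0.25,\; \lambda_3 = 0.5
```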

Page 98:

Integrating the RL paradigm: Overview

● Idea: simulate two virtual agents taking turns conversing with each other

● Stage 1: Supervised Learning

○ Seq2Seq with attention on OpenSubtitles

○ 80M source-target pairs

○ Target: each turn

○ Source: concatenate two previous sentences

Page 99:

Integrating the RL paradigm: Overview

● Stage 2: Maximize the Mutual Information

○ Treat as an RL problem

○ Training: MIXER, a form of curriculum learning

● Stage 3: Train the Policy

○ Simulation: explore the state-action space

○ Optimization

○ Curriculum learning

Page 100:

Mutual Information Model

● Motivation: promote diversity for better exploration

● Treat the maximum-mutual-information task as an RL problem with mutual information as the reward

● Recall that the policy is initialized with the parameters of a vanilla seq2seq model

Page 101:

MIM Training

● Use policy gradient

● For each state, generate a list of actions from policy

● For each generated action, obtain mutual information score as reward

Page 102:

MIM Training

● Reward: the mutual information score of the generated response

● Gradient: estimated with the likelihood-ratio (REINFORCE) estimator (see the reconstruction below)

● Add a baseline to reduce variance

● Use curriculum learning: MIXER
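The slide's equation images did not extract; reconstructing them from the paper, with m(\hat{a}; [p_i, q_i]) the mutual information score of the sampled response \hat{a} and b the learned baseline:

```latex
J(\theta) \;=\; \mathbb{E}\left[\, m(\hat{a};\, [p_i, q_i]) \,\right], \qquad \nabla J(\theta) \;\approx\; \left[\, m(\hat{a};\, [p_i, q_i]) - b \,\right] \nabla \log p(\hat{a} \mid p_i, q_i)
```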

Page 103:

MIXER

Page 104:

REINFORCE

● Loss: the negative expected reward of the generated sequence

● Gradient: weighted by the reward minus a baseline, where the baseline is estimated using a NN

● Compared to cross-entropy: the gradient flows through the model's own generated words rather than the ground-truth words

Page 105:

Final Policy Training

● Initialize the agents with the model from Stage 2

● Simulate dialogues between the two agents

● Maximize the expected future reward

Pages 106-112: Visualization of Dialogue Simulation Procedure

[Animation: the two agents take turns, each feeding its generated reply to the other as the next input; figures not extracted.]

Page 113:

Notes on Dialogue Simulation: Curriculum Learning

● 5 candidate responses generated at each step of the simulation

● simulate the dialogue for 2 turns at first

● gradually increase the number of simulated turns (up to 5)

Page 114:

Visualization of Training Procedure

Pages 115-134: Visualization of Training Procedure (continued)

Compute the accumulated reward R(S1, …, Sn) over the simulated dialogue:

1. Easy to Answer (r1): one term per generated turn, e.g. r1(S1), r1(S2), …, r1(Sn)

2. Information Flow (r2): between consecutive turns of the same agent, e.g. r2(S1, S3)

3. Semantic Coherence (r3): between adjacent turns, e.g. r3(S1, S2), r3(S2, S3)

R(S1, …, Sn) is what we want to learn to maximize!

[Build-up figures showing the per-turn rewards accumulating; not extracted.]

Page 135:

Experiments: Baseline Models

● Vanilla Seq2Seq

● Mutual Information Model

■ Seq2Seq + Mutual Information Rescoring during testing

Page 136:

Mutual Information Baseline

● Avoid responses with unconditionally high probability

● Bias towards responses specific to the given input

● Objective: maximize a weighted mutual-information criterion (reconstructed on the next slide)

● Introduce a hyperparameter λ that controls how much to penalize generic responses

Page 137:

Mutual Information Baseline

● Adapting MMI to seq2seq training is empirically nontrivial

● Want to adjust λ without repeatedly retraining neural network models from scratch

● Solution: train an MLE model, and apply the MMI criterion only during testing

● Decoding:

○ Generate the n-best responses from the MLE model p(T | S)

○ Re-rank them by the MMI criterion (reconstructed below)
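A reconstruction of the MMI criterion from Li et al. (2016a), since the slide's formula images did not extract; T ranges over candidate responses and S is the source:

```latex
\hat{T} \;=\; \arg\max_{T} \left\{\, \log p(T \mid S) \;-\; \lambda \log p(T) \,\right\}
```

By Bayes' rule this is equivalent, up to constants and a rescaled weight, to log p(T | S) + λ log p(S | T), which is why a backward model can be used to score the second term.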

Page 138:

Experiments

● For open-domain dialogue, the goal is not to predict the highest-probability response, but rather the long-term success of the dialogue

=> BLEU and perplexity are not appropriate

● Evaluate on dialogue length and diversity instead

● Plus human evaluation against the proposed baseline models

Page 139:

Dialogue Length

● Dialogue ends when “I don’t know” is generated or utterances are highly overlapping

● Test set: 1000 input messages

● Limit the number of simulated turns to fewer than 8

Page 140:

Diversity

● Number of distinct unigrams and bigrams in generated responses, scaled by the total number of generated tokens
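A minimal sketch of this distinct-n metric (token lists are toy data):

```python
# distinct-n: number of unique n-grams divided by total generated tokens.
def distinct_n(responses, n):
    ngrams, total_tokens = set(), 0
    for tokens in responses:
        total_tokens += len(tokens)
        ngrams.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(ngrams) / max(total_tokens, 1)

responses = [["i", "don't", "know"], ["i", "don't", "know"], ["maybe", "tomorrow"]]
print(distinct_n(responses, 1))  # 5 unique unigrams / 8 tokens = 0.625
print(distinct_n(responses, 2))  # 3 unique bigrams  / 8 tokens = 0.375
```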

Page 141:

Human Evaluation

● single-turn general quality: which generated reply is better for a given input message

● single-turn ease of answering: which output is easier to respond to

● multi-turn general quality: which simulated conversation is of higher quality

Page 142:

Example Responses

Page 143:

Example Conversations

Page 144:

Example Analysis

● RL model generates more interactive responses than the other baselines

● RL model has a tendency to end a sentence with another question and hand the conversation over to the user

● RL model manages to produce more interactive and sustained conversations than the mutual information model

● The model sometimes starts a less relevant topic during the conversation (a trade-off between relevance and repetitiveness!)

Page 145:

Example Conversations with Cycle

● Model sometimes enters a cycle with length greater than one

● This can be ascribed to the limited amount of conversational history considered

Page 146:

Limitations

● Currently there is a trade-off between relevance and repetitiveness

● A manually defined reward function can’t cover all crucial aspects of an ideal conversation; it would be ideal to receive real rewards from humans

● Can only afford to explore a very small number of candidates and simulated turns, since the number of cases to consider grows exponentially

Page 147:

Extra: Meena Chatbot

● Multi-turn open-domain chatbot

● 2.6B-parameter neural network
○ Seq2seq
○ Evolved Transformer

Pages 148-149: Extra: Meena Chatbot

● Trained end-to-end on social media conversations (much more appropriate data than OpenSubtitles!)

● Simply trained to minimize the perplexity of the next token

Page 150:

Extra: Meena Chatbot

Page 151:

Sensibleness and Specificity Average (SSA)

● new evaluation metric

● covers two fundamental aspects of a human-like chatbot

■ making sense

■ being specific

● Human judges label every model response on these two criteria

● There are static and interactive versions of the evaluation
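Each response receives two binary labels, and SSA is their simple average, as defined in the Meena paper (Adiwardana et al., 2020):

```latex
\text{SSA} \;=\; \tfrac{1}{2}\left(\text{sensibleness} + \text{specificity}\right)
```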

Page 152:

Sensibleness and Specificity Average (SSA)

● A (surprising) empirical observation: SSA is strongly correlated with perplexity, in both static and interactive evaluation