dilek hakkani-tur at ai frontiers: conversational machines: deep learning for goal-oriented dialogue...
Deep Learning for Goal-Oriented Conversational Understanding
Dilek Hakkani-Tur
ACKNOWLEDGMENTS:
GOKHAN TUR, LARRY HECK, ABHINAV RASTOGI, PARARTH SHAH, ANKUR BAPNA, NEHA NAYAK, ANNA KHASIN, RAGHAV GUPTA, YANG SONG, GRADY SIMON, AMIR FAYAZI, JINDONG CHEN, GEORGI NIKOLOV, BING LIU (CMU), IZZEDDIN GUR (UCSB), RAMA PASUMARTHI (CMU), SAURABH KUMAR (GT), SHYAM UDAPHYAY (UIUC), ASLI CELIKYILMAZ (MSR), VIVIAN CHEN (NTU), MARILYN WALKER (UCSB)
Data-Driven Dialogue Systems
Human-like interactions for goal/task-oriented dialogues.
Learn from data:
● High variability and noise in language
● Adapt to available meaning representations
● Integrate common sense and world knowledge
● Robust modeling of context
U> Book me a table at Cascal
S> Sure, for what time?
U> Around 7pm, for 2 people
S> Nothing is available at 7pm, would 8pm be ok?
U> That is too late, what about Amarin?
S> OK, I can book you a table at Amarin at 7pm.
Dialogue Systems
Goal/Task-Oriented
• Personal assistant, helps users achieve a certain task
• Goal: Task completion
• Combination of rules and learning.
• Examples:
• End-to-end trainable task-oriented dialogue system (Wen et al., 2016)
• End-to-end reinforcement learning dialogue system (Zhao and Eskenazi, 2016)
Chit-Chat
• No specific goal, focus on natural responses
• Goal: User engagement, naturalness
• Using variants of seq2seq models
• Examples:
• A neural conversation model (Vinyals and Le, 2015)
• Reinforcement learning for dialogue generation (Li et al., 2016)
Task-Oriented Dialogue as a Collaborative Game
USER
Has a goal (fixed/flexible)
AGENT
Has access to data
Can perform actions
Book my flu shot with Dr. Straw
OK. On Monday, October 6th, 5:15pm and 6pm are available. What time would you prefer?
Games take many forms:
● Adversarial (Chess, Go, …)
● Cooperative (20 questions, Pictionary)
● Collaborative (Dialogue)
Large space of actions and states
Multi-action turns and flexible turn-taking
Why learn?
Challenge                      Our solutions
Variety in NL & user requests  More flexible parsing mechanism
Noise in input                 Models learn to correct for likely noise (e.g., ASR errors)
Modeling context               Integrating contextual information
Dialogue-level planning        End-to-end modeling with reinforcement learning
Scale: Recall                  Continuous training from the logs, transfer learning, active learning
Scale: Intents                 Transfer learning, warm-start, multi-task modeling
Scale: Languages               Transfer learning, multi-lingual embeddings
Goal-Oriented Dialogue Systems
[Figure: SYSTEM/AGENT pipeline with four components: Conversational Language Understanding, Dialogue State Tracking, Dialogue Manager (connected to Back-End Action/Knowledge Providers), and Response Generation.]
Example turn:
U> Book me a table at Cascal for 2 people
Understanding output: restaurants, reserve_rest., Rest._name: Cascal, Num_people: 2
Dialogue Manager: back-end query and response, then system act Request(time)
S> Sure, at what time do you want the reservation?
Goal-Oriented Dialogue Systems - Components
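The four components can be wired together as one function per stage. The sketch below is only an illustration of the data flow for the Cascal example: the regex-based `understand`, the slot names, and the response templates are hypothetical stand-ins for the learned models discussed in the talk.

```python
import re

# Hypothetical response templates keyed by system act.
TEMPLATES = {
    ("request", "rest_name"): "Which restaurant would you like?",
    ("request", "num_people"): "For how many people?",
    ("request", "time"): "Sure, at what time do you want the reservation?",
    ("book", None): "OK, booking your table.",
}


def understand(utterance):
    """Toy CLU: regex stand-in for a learned semantic frame parser."""
    slots = {}
    name = re.search(r"\bat ([A-Z]\w+)", utterance)
    if name:
        slots["rest_name"] = name.group(1)
    people = re.search(r"for (\d+) people", utterance)
    if people:
        slots["num_people"] = int(people.group(1))
    time = re.search(r"\b(\d{1,2}(?::\d{2})?\s?[ap]m)\b", utterance)
    if time:
        slots["time"] = time.group(1)
    return {"domain": "restaurants", "intent": "reserve_restaurant", "slots": slots}


def track(state, frame):
    """Toy DST: merge the new slot values into the previous state."""
    return {**state, **frame["slots"]}


def decide(state):
    """Toy dialogue manager: request the first missing slot, else book."""
    for slot in ("rest_name", "num_people", "time"):
        if slot not in state:
            return ("request", slot)
    return ("book", None)


def generate(act):
    """Toy response generation: template lookup."""
    return TEMPLATES[act]


def respond(state, utterance):
    """Run one user turn through CLU, DST, DM, and response generation."""
    new_state = track(state, understand(utterance))
    return new_state, generate(decide(new_state))


state, reply = respond({}, "Book me a table at Cascal for 2 people")
state2, reply2 = respond(state, "Around 7pm")
```

Here `reply` is the request for a time, because `time` is the only slot still missing after the first turn.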
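Conversational Language Understanding (CLU): Multi-Domain, Joint Semantic Frame Parsing
[Figure: RNN unrolled over "taiwanese food please EOS" with hidden states h_{t-1}, h_t, h_{t+1}, ..., h_{T+1} and weights U, W, V. Slot filling tags each word (taiwanese: B-cuisine, food: O, please: O); domain/intent prediction emits FIND_REST at EOS.]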
Joint, Sequence-based
• Slot filling and intent prediction in the same output sequence
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/IS16_MultiJoint.pdf
➢ One model: Holistic multi-domain, multi-task modeling
➢ Estimate all semantic frames covering all domains in a single RNN model
➢ Data from each domain reinforces the others
D. Hakkani-Tur, G. Tur, A. Celikyilmaz, Y-N. Chen, J. Gao, L. Deng, and Y-Y. Wang, “Multidomain joint semantic frame parsing using bi-directional RNN-LSTM,” in INTERSPEECH, 2016.
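A minimal sketch of how slot filling and intent prediction can share one output sequence, using the slide's "taiwanese food please" example: the utterance is extended with an EOS token, each word gets a slot tag, and the intent label is attached to EOS. The function names are illustrative and not taken from the paper.

```python
def to_joint_sequence(tokens, slot_tags, intent):
    """Build (input, output) sequences for joint tagging.

    The intent label is placed at the appended EOS position, so a single
    sequence tagger predicts slots and intent together.
    """
    if len(tokens) != len(slot_tags):
        raise ValueError("need exactly one slot tag per token")
    return tokens + ["EOS"], slot_tags + [intent]


def from_joint_sequence(tokens, tags):
    """Split a tagged joint sequence back into an intent and slot values."""
    intent = tags[-1]  # label predicted at EOS
    slots = {}
    current = None
    for token, tag in zip(tokens[:-1], tags[:-1]):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return intent, slots


inputs, outputs = to_joint_sequence(
    ["taiwanese", "food", "please"], ["B-cuisine", "O", "O"], "FIND_REST"
)
intent, slots = from_joint_sequence(inputs, outputs)
```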
E2E MemNN for Contextual CLU
1. What does this utterance say? What do the previous utterances say? (what the last slide showed)
2. How relevant is each of the previous utterances to the current one?
3. What do the relevant previous utterances say?
4. Sequence tagging: Given the relevant information from the previous and current utterances, how do I tag each token?
Y-N. Chen, D. Hakkani-Tur, G. Tur, J. Gao, and L. Deng, “End-to-end memory networks with knowledge carryover for multi-turn spoken language understanding,” in INTERSPEECH, 2016.
A. Bapna, G. Tur, D. Hakkani-Tur, L. Heck. “Improving frame semantic parsing with hierarchical dialogue encoders”, SigDial, 2017.
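The relevance and carryover steps can be sketched as softmax attention over encoded previous utterances. This is a simplified illustration in plain Python, not the papers' exact architecture: the 2-d vectors are made up, the dot-product scoring is an assumption, and adding the context vector to the current encoding is just one simple way to combine them.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(current, memory):
    """Return attention weights and the weighted sum of memory vectors."""
    # How relevant is each previous utterance to the current one?
    weights = softmax([dot(current, m) for m in memory])
    # What do the relevant previous utterances say?
    context = [
        sum(w * m[i] for w, m in zip(weights, memory))
        for i in range(len(current))
    ]
    return weights, context


# Toy encodings (hypothetical; a real system learns them with an RNN encoder).
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # previous utterances
current = [0.0, 2.0]                            # current utterance
weights, context = attend(current, memory)

# The sequence tagger would then consume the current encoding plus context.
tagger_input = [c + k for c, k in zip(current, context)]
```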
Dialogue State Tracking (DST)
Do you wanna take Angela to go see a movie tonight?
Sure, I will be home by 6.
Let's grab dinner before the movie.
How about some Mexican?
Let's go to Vive Sol and see Inferno after that.
Angela wants to watch the Trolls movie.
Ok. Let's catch the 8 pm show.
[Figure: dialogue state for the conversation above.
Movies: Movie: Inferno; Date: 11/15/16; Time: 6 pm, 7 pm; #People: 2, 3
Restaurants: Restaurant: Vive Sol; Cuisine: Mexican; Date: 11/15/16; Time: 6:30 pm, 7 pm
Further values shown: 7:30 pm; Theatre: Century 16; Movie: Trolls; Time: 8 pm, 9 pm]
● System's belief of the user's goal at any time
● Inputs at user turn t: DS_{t-1}, CLU_t; Output: DS_t
● Used for accessing information and making transactions
● NN models
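The input/output contract of a tracker, DS_t from DS_{t-1} and CLU_t, can be sketched as a pure function. The overwrite rule below is a hand-written stand-in for the neural models the slide refers to, and the dictionary layout of the state and the CLU output is an assumption made for illustration.

```python
def update_state(prev_state, clu_output):
    """Return DS_t given DS_{t-1} and CLU_t (rule-based stand-in)."""
    # Copy so that the previous state is left untouched.
    state = {domain: dict(slots) for domain, slots in prev_state.items()}
    domain = clu_output["domain"]
    # Newly understood values overwrite older ones for the same slot.
    state.setdefault(domain, {}).update(clu_output["slots"])
    return state


ds0 = {}
ds1 = update_state(ds0, {"domain": "Movies", "slots": {"Movie": "Inferno"}})
ds2 = update_state(
    ds1,
    {"domain": "Restaurants",
     "slots": {"Restaurant": "Vive Sol", "Cuisine": "Mexican"}},
)
ds3 = update_state(
    ds2, {"domain": "Movies", "slots": {"Movie": "Trolls", "Time": "8 pm"}}
)
```

The turns loosely follow the movie and dinner conversation: the movie changes from Inferno to Trolls while the restaurant slots are carried over unchanged.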
Dialogue State Tracking (DST)
A. Rastogi, D. Hakkani-Tur, L. Heck. “Scalable Multi-Domain Dialogue State Tracking”, IEEE ASRU, 2017.
S> How about 6 pm?
U> I am busy at 6, book it for 7 pm instead.
● Candidate set generation
○ Slots with large/unbounded value sets
○ Previously unseen slot values
● Sharing parameters between different slots
● Transfer learning to unseen domains
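Candidate set generation can be sketched as keeping, per slot, a small list of values mentioned so far, so that a scorer only ranks those candidates instead of a fixed vocabulary. The function below is an illustrative assumption rather than the paper's procedure: the cap `max_size`, the newest-first ordering, and the already normalized time values are all hypothetical choices, and the scoring step is not shown.

```python
def update_candidates(prev_candidates, system_values, user_values, max_size=7):
    """Merge values mentioned this turn into each slot's candidate set.

    Newest mentions come first; older candidates are kept while room remains.
    """
    candidates = {}
    slots = set(prev_candidates) | set(system_values) | set(user_values)
    for slot in slots:
        ordered = (
            user_values.get(slot, [])
            + system_values.get(slot, [])
            + prev_candidates.get(slot, [])
        )
        unique = []
        for value in ordered:
            if value not in unique:
                unique.append(value)
        candidates[slot] = unique[:max_size]
    return candidates


# S> How about 6 pm?   U> I am busy at 6, book it for 7 pm instead.
cands = update_candidates(
    prev_candidates={},
    system_values={"time": ["6 pm"]},
    user_values={"time": ["6 pm", "7 pm"]},
)
```

Both times end up as candidates; deciding that the user wants 7 pm and not 6 pm is the job of the scorer that runs over this set.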
Dialogue Manager (DM) Policy
Dialogue State ~ Game Board
Dialogue Move ~ Transformation of the dialogue state
U> I’m hungry, find me a Mediterranean restaurant
User Acts: inform(category)
S> Which area do you prefer?
System Acts: request(location); Grounded Information: time
U> Near downtown Mountain View.
User Acts: inform(location)
S> Would you like to eat at Cascal?
System Acts: offer(restaurant); Grounded Information: time, location
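The game-board analogy can be made concrete by treating each dialogue move as a function from one state to the next, with the policy choosing the system's move. This is a toy sketch under stated assumptions: the `DialogueState` fields mirror the slide's labels, but the grounding rule (every informed slot becomes grounded) and the hand-written `policy` are simplifications, not the learned policy from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    user_acts: list = field(default_factory=list)
    system_acts: list = field(default_factory=list)
    grounded: set = field(default_factory=set)


def apply_move(state, speaker, act, slot):
    """Return a new state after one dialogue move, e.g. inform(location)."""
    new = DialogueState(
        list(state.user_acts), list(state.system_acts), set(state.grounded)
    )
    move = f"{act}({slot})"
    if speaker == "user":
        new.user_acts.append(move)
        if act == "inform":
            new.grounded.add(slot)  # simplification: informed means grounded
    else:
        new.system_acts.append(move)
    return new


def policy(state, required=("category", "location")):
    """Toy hand-written policy: request missing slots, then make an offer."""
    for slot in required:
        if slot not in state.grounded:
            return ("request", slot)
    return ("offer", "restaurant")


state = DialogueState()
state = apply_move(state, "user", "inform", "category")
first_act = policy(state)
state = apply_move(state, "system", *first_act)
state = apply_move(state, "user", "inform", "location")
second_act = policy(state)
```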
Learning DM Policy
Multi-stage training of the dialogue manager:
1. Supervised Learning (Bootstrap): the dialogue manager learns from a dialogue corpus of human expert and user conversations.
2. Reinforcement Learning (Simulated Refinement): the dialogue manager interacts with a user simulator and receives a task-level reward.
3. Interactive RL (Continual Learning): the dialogue manager interacts with real users and receives a task-level reward plus turn-level feedback.
P. Shah, D. Hakkani-Tur, L. Heck. “Interactive reinforcement learning for task-oriented dialogue management”, Deep Learning for Action and Interaction, NIPS, 2016.
Learning DM Policy
Learning task-oriented dialogue management through:
● Imitation (Supervised Learning): pretraining on a dialogue corpus of human expert and user conversations
● Experimentation (Reinforcement Learning): simulated play against a user simulator, guided by a reward function
● Feedback (Interactive RL): real interactions with users, guided by a reward function and user feedback
to scalably manage:
● Task complexity
● Discourse complexity
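The first two stages can be illustrated on a toy reservation task with a tabular policy. Everything in this sketch is an assumption made for illustration: the slot list, the reward of 1 for a successful booking, the discount of 0.9, and the use of tabular Q-learning in place of the neural policy described in the paper. The third stage, feedback from real users, is not modelled.

```python
import random

SLOTS = ("date", "time", "num_people")
ACTIONS = tuple(f"request({s})" for s in SLOTS) + ("book",)
GAMMA = 0.9


def row(q, state):
    """Q-values for a state, created as zeros on first access."""
    return q.setdefault(state, dict.fromkeys(ACTIONS, 0.0))


def simulate_user(filled, action):
    """Cooperative user simulator: returns (next_filled, reward, done)."""
    if action == "book":
        success = len(filled) == len(SLOTS)
        return filled, (1.0 if success else 0.0), True  # task-level reward
    slot = action[len("request("):-1]
    return filled | {slot}, 0.0, False


def pretrain(demos):
    """Stage 1 (imitation): seed Q with discounted returns of expert demos."""
    q = {}
    for episode in demos:  # each demo is assumed to end in a successful booking
        for steps_to_go, (state, action) in enumerate(reversed(episode)):
            row(q, state)[action] = GAMMA ** steps_to_go
    return q


def greedy(q, state):
    values = row(q, state)
    return max(ACTIONS, key=values.get)


def refine(q, episodes=300, epsilon=0.3, alpha=0.5, max_turns=8, seed=0):
    """Stage 2 (simulated play): epsilon-greedy Q-learning with the simulator."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = frozenset()
        for _ in range(max_turns):
            explore = rng.random() < epsilon
            action = rng.choice(ACTIONS) if explore else greedy(q, state)
            nxt, reward, done = simulate_user(state, action)
            future = 0.0 if done else max(row(q, nxt).values())
            values = row(q, state)
            values[action] += alpha * (reward + GAMMA * future - values[action])
            if done:
                break
            state = nxt
    return q


def rollout(q, max_turns=8):
    """Follow the greedy policy from an empty state."""
    state, actions = frozenset(), []
    for _ in range(max_turns):
        action = greedy(q, state)
        actions.append(action)
        state, reward, done = simulate_user(state, action)
        if done:
            return actions, reward
    return actions, 0.0


expert_demo = [
    (frozenset(), "request(date)"),
    (frozenset({"date"}), "request(time)"),
    (frozenset({"date", "time"}), "request(num_people)"),
    (frozenset(SLOTS), "book"),
]
q = refine(pretrain([expert_demo]))
actions, reward = rollout(q)
```

Pretraining only covers the states the expert visited. Simulated play lets the table acquire values for states off the demonstrated path, for example when slots are collected in a different order.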
Natural Language Generation (NLG)
● Convert system’s action into natural language system turns.
○ Sequence-to-sequence model with attention
● System action is flattened into a sequence.
● Output could be de-lexicalized NL, e.g.,
<restaurant> does not have a table at <time1>, would <time2> work for you?
● Slot values are important for surface realization.
[Figure: sequence-to-sequence model with attention. The flattened action "request time" is encoded and, via context vectors c_i, decoded into "when is your reservation".]
N. Nayak, D. Hakkani-Tur, M. Walker, L. Heck. “To Plan or not to Plan? Discourse planning in slot-value informed sequence to sequence models for language generation”, INTERSPEECH, 2017.
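The input and output conventions around the generator can be sketched without the neural model itself: the system action is flattened into tokens for the encoder, and the de-lexicalized output is filled with slot values afterwards. The helper names are hypothetical, and the decoder output below is hard-coded from the slide in place of a trained sequence-to-sequence model.

```python
def flatten_action(act, slots):
    """Flatten a system action into a token sequence for a seq2seq encoder."""
    tokens = [act]
    for name, value in slots.items():
        tokens.append(name)
        if value is not None:
            tokens.append(str(value))
    return tokens


def lexicalize(template, values):
    """Fill <placeholder> tokens in de-lexicalized output with slot values."""
    for name, value in values.items():
        template = template.replace(f"<{name}>", value)
    return template


encoder_input = flatten_action("request", {"time": None})

# Stand-in for the decoder output; a trained model would generate this string.
decoder_output = (
    "<restaurant> does not have a table at <time1>, would <time2> work for you?"
)
sentence = lexicalize(
    decoder_output,
    {"restaurant": "Il Fornaio", "time1": "7 pm", "time2": "6:30 pm"},
)
```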
Goal-Oriented Dialogue Systems - Training
Building User Simulators
● User simulators that mimic real users and interact with the system agent to collect data, bootstrap modeling, and perform evaluation.
[Figure: the SYSTEM/AGENT (Conversational Language Understanding, Dialogue State Tracking, Dialogue Manager, Response Generation; backed by Task Data) converses with a USER SIMULATOR built from the same four components and driven by a User Goal. I/O: dialogue states.]
Building User Simulators: User Characteristics
Personality traits: OCEAN (Wiggins, 1996), PEN (Eysenck, 1990)
Model aspects that change conversation flow
● Talkativeness
● Cooperativeness
● Consistency
● Flexibility
[Figure: each trait is a slider on a 0 to 1 scale, with end labels quiet/talkative, consistent/hesitant, strict/flexible, and cooperative/uncooperative. Example settings shown: 0.71, 0.49, 0.71, 0.26.]
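One way to read the sliders is as probabilities that shape what the simulated user says each turn. The sketch below is an assumption about how such traits could be used, not the simulator from the talk: cooperativeness gates answering the requested slot, talkativeness gates volunteering extra slots, and consistency and flexibility are stored but unused here.

```python
import random
from dataclasses import dataclass


@dataclass
class UserProfile:
    talkativeness: float
    cooperativeness: float
    consistency: float
    flexibility: float


def sample_profile(rng):
    """Draw each trait uniformly from [0, 1)."""
    return UserProfile(*(round(rng.random(), 2) for _ in range(4)))


def slots_to_inform(profile, pending_slots, requested_slot, rng):
    """Pick which goal slots the simulated user mentions this turn."""
    informed = []
    # A cooperative user answers what the system asked for.
    if requested_slot in pending_slots and rng.random() < profile.cooperativeness:
        informed.append(requested_slot)
    # A talkative user volunteers slots that were not requested.
    for slot in pending_slots:
        if slot != requested_slot and rng.random() < profile.talkativeness:
            informed.append(slot)
    return informed


rng = random.Random(0)
quiet_cooperative = UserProfile(0.0, 1.0, 1.0, 1.0)
chatty_uncooperative = UserProfile(1.0, 0.0, 1.0, 1.0)
answer_a = slots_to_inform(quiet_cooperative, ["date", "time"], "time", rng)
answer_b = slots_to_inform(chatty_uncooperative, ["date", "time"], "time", rng)
random_profile = sample_profile(rng)
```

With the two extreme profiles the behaviour is deterministic: the quiet cooperative user answers only the requested slot, and the chatty uncooperative user mentions only the other one.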
Machines Talking to Machines
Scenario:
User type: cooperative
User goal: Intent = reserve_restaurant, r_name = Il Fornaio, date = tonight, time = 7pm *, Num_people = 3
Dialogue Acts (User Simulator, System Agent) and Crowd Workers’ Surface Realization:
S: greeting()
   Hi, how can I help you?
U: greeting intent=reserve_restaurant inform(restaurant_name=il fornaio)
   Hey, can I reserve a spot at il Fornaio.
S: request(date,time)
   Sure, what time and day are you dining?
U: inform(date=tonight,time=7pm)
   The dinner is tonight at 7 pm
S: request(num_people)
   How many people will be attending?
U: inform(num_people=3)
   Myself and two others.
S: negate(time=7pm) offer(time=6:30)
   Il Fornaio doesn’t have a table available at 7 pm. Would you be ok with 6:30 pm?
U: affirm()
   Sure, that is also good.
S: notify_success()
   Great, We have your appointment all set.
U: thanks() bye()
   Awesome, I appreciate it. have a good day.
S: bye()
   You too. bye.
[Figure labels: NLG, CLU, DST]
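The dialogue-act outline for this scenario can be produced by scripted self-play between a rule-based agent and a cooperative user simulator. The sketch below is a hypothetical reconstruction that reproduces the act sequence on the slide: the slot grouping, the `available_times` argument (assumed non-empty), and the user always accepting the offered time are assumptions.

```python
def generate_outline(goal, available_times):
    """Self-play between a rule-based agent (S) and a user simulator (U)."""
    outline = [
        "S: greeting()",
        f"U: greeting intent={goal['intent']} "
        f"inform(restaurant_name={goal['restaurant_name']})",
    ]
    known = {"restaurant_name"}
    # The agent asks for missing slots in fixed groups.
    for group in (("date", "time"), ("num_people",)):
        missing = [slot for slot in group if slot not in known]
        if missing:
            outline.append(f"S: request({','.join(missing)})")
            values = ",".join(f"{slot}={goal[slot]}" for slot in missing)
            outline.append(f"U: inform({values})")
            known.update(missing)
    # If the requested time is unavailable, the agent offers an alternative.
    if goal["time"] not in available_times:
        outline.append(
            f"S: negate(time={goal['time']}) offer(time={available_times[0]})"
        )
        outline.append("U: affirm()")  # cooperative user accepts the offer
    outline += ["S: notify_success()", "U: thanks() bye()", "S: bye()"]
    return outline


goal = {
    "intent": "reserve_restaurant",
    "restaurant_name": "il fornaio",
    "date": "tonight",
    "time": "7pm",
    "num_people": 3,
}
outline = generate_outline(goal, available_times=["6:30"])
```

Each line of `outline` is what a crowd worker would then rewrite as a natural utterance.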
What is next?
● Understanding meaning beyond words
○ “Later today”: 7-9pm for dinner, 3-5pm for meetings
● Personalization
● More lively conversations
● Complex conversations
○ Compositionality
○ Multi-domain tasks
● Interactions beyond domain boundaries
Thanks!