dilek hakkani-tur at ai frontiers: conversational machines: deep learning for goal-oriented dialogue...

Deep Learning for Goal-Oriented Conversational Understanding

Dilek Hakkani-Tur

ACKNOWLEDGMENTS:

GOKHAN TUR, LARRY HECK, ABHINAV RASTOGI, PARARTH SHAH, ANKUR BAPNA, NEHA NAYAK, ANNA KHASIN, RAGHAV GUPTA, YANG SONG, GRADY SIMON, AMIR FAYAZI, JINDONG CHEN, GEORGI NIKOLOV, BING LIU (CMU), IZZEDDIN GUR (UCSB), RAMA PASUMARTHI (CMU), SAURABH KUMAR (GT), SHYAM UDAPHYAY (UIUC), ASLI CELIKYILMAZ (MSR), VIVIAN CHEN (NTU), MARILYN WALKER (UCSB)

Data-Driven Dialogue Systems

Human-like interactions for goal/task-oriented dialogues.

Learn from data:

● High variability and noise in language

● Adapt to available meaning representations

● Integrate common sense and world knowledge

● Robust modeling of context

U: Book me a table at Cascal
S: Sure, for what time?
U: Around 7pm, for 2 people
S: Nothing is available at 7pm, would 8pm be ok?
U: That is too late, what about Amarin?
S: OK, I can book you a table at Amarin at 7pm.

Dialogue Systems

Goal/Task-Oriented

• Personal assistant, helps users achieve a certain task
• Goal: Task completion
• Combination of rules and learning.
• Examples:
  • End-to-end trainable task-oriented dialogue system (Wen et al., 2016)
  • End-to-end reinforcement learning dialogue system (Zhao and Eskenazi, 2016)

Chit-Chat

• No specific goal, focus on natural responses
• Goal: User engagement, naturalness
• Using variants of seq2seq models
• Examples:
  • A neural conversation model (Vinyals and Le, 2015)
  • Reinforcement learning for dialogue generation (Li et al., 2016)

Task-Oriented Dialogue as a Collaborative Game

USER
● Has a goal (fixed/flexible)

AGENT
● Has access to data
● Can perform actions

U: Book my flu shot with Dr. Straw
S: OK. Monday, October 6th at 5:15pm and 6pm are available. What time would you prefer?

Games take many forms:

● Adversarial (Chess, Go, …)
● Cooperative (20 questions, Pictionary)
● Collaborative (Dialogue)

Large space of actions and states

Multi-action turns and flexible turn-taking

Why learn?

Challenge                        Our solutions
Variety in NL & user requests    More flexible parsing mechanism
Noise in input                   Models learn to correct for likely noise (e.g., ASR errors)
Modeling context                 Integrating contextual information
Dialogue-level planning          End-to-end modeling with reinforcement learning
Scale:
  Recall                         Continuous training from the logs, transfer learning, active learning
  Intents                        Transfer learning, warm-start, multi-task modeling
  Languages                      Transfer learning, multi-lingual embeddings

Goal-Oriented Dialogue Systems

[Figure: SYSTEM/AGENT pipeline made up of Conversational Language Understanding, Dialogue State Tracking, Dialogue Manager, and Response Generation, connected to Back-End Action/Knowledge Providers through a back-end query and response.]

U: Book me a table at Cascal for 2 people
Understanding output: restaurants, reserve_rest., Rest._name: Cascal, Num_people: 2
Dialogue Manager output: Request(time)
S: Sure, at what time do you want the reservation?

Goal-Oriented Dialogue Systems - Components

[Figure: SYSTEM/AGENT pipeline with its components (Conversation Understanding, Dialogue State Tracking, Dialogue Manager, Response Generation, Providers), the Request(time) action, and the back-end query and response.]

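Conversational Language Understanding (CLU): Multi-Domain, Joint Semantic Frame Parsing

[Figure: RNN unrolled over the input "taiwanese food please EOS", with hidden states h_{t-1}, h_t, h_{t+1}, ..., h_{T+1} and weight matrices U, W, V. Slot filling: "taiwanese" is tagged B-cuisine, "food" and "please" are tagged O. Domain/intent prediction: the EOS position outputs FIND_REST.]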

Joint, Sequence-based

• Slot filling and intent prediction in the same output sequence

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/IS16_MultiJoint.pdf

➢ One model: Holistic multi-domain, multi-task modeling

➢ Estimate all semantic frames covering all domains in single RNN model

➢ Data from the different domains reinforce each other

D. Hakkani-Tur, G. Tur, A. Celikyilmaz, Y-N. Chen, J. Gao, L. Deng, and Y-Y. Wang, “Multidomain joint semantic frame parsing using bi-directional RNN-LSTM,” in INTERSPEECH, 2016.
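
As a rough illustration of putting slot filling and intent prediction in the same output sequence, the sketch below appends an EOS token to the utterance and uses the intent as the tag at that final position, following the "taiwanese food please" figure. The helper names `encode_joint` and `decode_joint` are made up for this example and do not come from the paper. A real system would predict the tags with the bi-directional RNN-LSTM rather than read them from a list.

```python
from typing import Dict, List, Optional, Tuple

EOS = "EOS"  # sentence-final token whose tag carries the intent


def encode_joint(tokens: List[str], slot_tags: List[str],
                 intent: str) -> Tuple[List[str], List[str]]:
    """Build the joint target: IOB slot tags followed by the intent at EOS."""
    if len(tokens) != len(slot_tags):
        raise ValueError("need exactly one slot tag per token")
    return tokens + [EOS], slot_tags + [intent]


def decode_joint(tokens: List[str], tags: List[str]) -> Tuple[str, Dict[str, str]]:
    """Recover (intent, slots) from a joint tag sequence."""
    intent = tags[-1]  # the tag emitted at the EOS position
    slots: Dict[str, str] = {}
    current: Optional[str] = None  # slot name of the span being read, if any
    for token, tag in zip(tokens[:-1], tags[:-1]):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return intent, slots


tokens, tags = encode_joint(
    ["taiwanese", "food", "please"], ["B-cuisine", "O", "O"], "FIND_REST"
)
intent, slots = decode_joint(tokens, tags)
print(tags)           # ['B-cuisine', 'O', 'O', 'FIND_REST']
print(intent, slots)  # FIND_REST {'cuisine': 'taiwanese'}
```

Because the intent is just one more label in the sequence, a single tagger can be trained on both tasks at once.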

E2E MemNN for Contextual CLU

● What does this utterance say? (what the last slide showed)
● What do the previous utterances say?
● How relevant is each of the previous utterances to the current one?
● What do the relevant previous utterances say?

4. Sequence tagging
Given the relevant information from the previous and current utterances, how do I tag each token?

Y-N. Chen, D. Hakkani-Tur, G. Tur, J. Gao, and L. Deng, “End-to-end memory networks with knowledge carryover for multi-turn spoken language understanding,” in INTERSPEECH, 2016.

A. Bapna, G. Tur, D. Hakkani-Tur, L. Heck. “Improving frame semantic parsing with hierarchical dialogue encoders”, SigDial, 2017.
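
The "how relevant" and "what do the relevant utterances say" questions correspond to an attention step over a memory of previous utterances. The sketch below is a minimal pure-Python version, assuming each utterance has already been encoded as a vector. The vectors are toy values chosen by hand, and adding the summary to the current encoding is only one simple way to combine them, not necessarily what the cited papers do.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(current: Vector, memory: List[Vector]) -> Tuple[List[float], Vector]:
    """Return (attention weights, attention-weighted sum of memory vectors)."""
    # Relevance of each previous utterance to the current one.
    weights = softmax([dot(current, m) for m in memory])
    dim = len(memory[0])
    # What the relevant previous utterances say, as one summary vector.
    summary = [sum(w * m[i] for w, m in zip(weights, memory)) for i in range(dim)]
    return weights, summary


# Toy encodings (assumptions): two previous utterances and the current one.
memory = [[1.0, 0.0], [0.0, 1.0]]
current = [2.0, 0.0]
weights, summary = attend(current, memory)
# Combined vector that a sequence tagger could consume alongside the tokens.
context = [c + s for c, s in zip(current, summary)]
print(weights)
```

Here the first previous utterance points in the same direction as the current one, so it receives most of the attention weight.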




Do you wanna take Angela to go see a movie tonight?

Sure, I will be home by 6.

Let's grab dinner before the movie.

How about some Mexican?

Let's go to Vive Sol and see Inferno after that.

Angela wants to watch the Trolls movie.

Ok. Let's catch the 8 pm show.

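[Figure: dialogue state for the conversation above, shown as two frames with candidate values. Movies: Movie (Inferno, Trolls), Theatre (Century 16), Date (11/15/16), Time (6 pm, 7 pm, 8 pm, 9 pm), #People (2, 3). Restaurants: Restaurant (Vive Sol), Cuisine (Mexican), Date (11/15/16), Time (6:30 pm, 7 pm, 7:30 pm).]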

Dialogue State Tracking (DST)

● System's belief of the user's goal at any time

● Inputs at user turn t: DS_{t-1}, CLU_t; Output: DS_t

● Used for accessing information and making transactions

● NN models
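
The input/output contract above, DS_t computed from DS_{t-1} and CLU_t, can be sketched as a plain function. This is a hand-written rule-based stand-in, not the neural tracker the slide refers to. The `(act, slot, value)` triple format for the CLU output and the function name `update_state` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

DialogueState = Dict[str, str]
# Assumed CLU output for one user turn: a list of (act, slot, value) triples.
CluOutput = List[Tuple[str, str, str]]


def update_state(prev_state: DialogueState, clu: CluOutput) -> DialogueState:
    """Compute DS_t from DS_{t-1} and CLU_t without mutating the input."""
    state = dict(prev_state)
    for act, slot, value in clu:
        if act == "inform":
            state[slot] = value  # a new or corrected value overrides the old one
        elif act == "negate" and state.get(slot) == value:
            del state[slot]      # the user rejected the value we believed
    return state


ds0: DialogueState = {}
ds1 = update_state(ds0, [("inform", "restaurant_name", "Cascal"),
                         ("inform", "num_people", "2")])
ds2 = update_state(ds1, [("negate", "time", "6 pm"), ("inform", "time", "7 pm")])
print(ds2)  # {'restaurant_name': 'Cascal', 'num_people': '2', 'time': '7 pm'}
```

A learned tracker replaces the hard-coded rules with a model, but keeps the same shape: previous state and current understanding in, new state out.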

Dialogue State Tracking (DST)

A. Rastogi, D. Hakkani-Tur, L. Heck. “Scalable Multi-Domain Dialogue State Tracking”, IEEE ASRU, 2017.

S> How about 6 pm?
U> I am busy at 6, book it for 7 pm instead.

● Candidate set generation
  ○ Slots with large/unbounded value sets
  ○ Previously unseen slot values


● Sharing parameters between different slots

● Transfer learning to unseen domains
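
One way to picture candidate set generation is as a small, bounded list of values per slot that is refreshed every turn from what the system and the user just mentioned. The sketch below covers only that bookkeeping. The recency ordering, the `max_size` cap, and the function name `update_candidates` are assumptions made for illustration, and the model that scores the candidates is left out.

```python
from typing import Dict, List

CandidateSets = Dict[str, List[str]]


def update_candidates(prev: CandidateSets,
                      system_mentions: Dict[str, List[str]],
                      user_mentions: Dict[str, List[str]],
                      max_size: int = 5) -> CandidateSets:
    """Merge newly mentioned values into each slot's candidate list.

    Most recently mentioned values come first; each list is capped at max_size.
    """
    updated: CandidateSets = {slot: list(values) for slot, values in prev.items()}
    # Process the system turn first so user-mentioned values end up most recent.
    for mentions in (system_mentions, user_mentions):
        for slot, values in mentions.items():
            current = updated.setdefault(slot, [])
            for value in values:
                if value in current:
                    current.remove(value)
                current.insert(0, value)
    return {slot: values[:max_size] for slot, values in updated.items()}


candidates = update_candidates(
    prev={},
    system_mentions={"time": ["6 pm"]},        # S> How about 6 pm?
    # U> I am busy at 6, book it for 7 pm instead. ("6" normalized to "6 pm" by assumption)
    user_mentions={"time": ["6 pm", "7 pm"]},
)
print(candidates)  # {'time': ['7 pm', '6 pm']}
```

Because candidates come from the dialogue itself rather than a fixed vocabulary, a value never seen in training can still enter the set.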


Dialogue Manager (DM) Policy

Dialogue State ~ Game Board
Dialogue Move ~ Transformation of the dialogue state

U: I’m hungry, find me a Mediterranean restaurant
   User Acts: inform(category)
S: Which area do you prefer?
   System Acts: request(location)
   Grounded Information: time
U: Near downtown Mountain View.
   User Acts: inform(location)
S: Would you like to eat at Cascal?
   System Acts: offer(restaurant)
   Grounded Information: time, location
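
The game-board analogy can be made concrete with two small functions: one that applies a dialogue move to the state, and one that picks the next system act from the state. This is a hand-written policy under an assumed two-slot schema (`category`, `location`), standing in for a learned policy. The function names are made up for this example.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]
REQUIRED_SLOTS: List[str] = ["category", "location"]  # assumed task schema


def apply_user_act(state: State, act: str, slot: str, value: str) -> State:
    """A dialogue move: return the state transformed by one user act."""
    new_state = dict(state)
    if act == "inform":
        new_state[slot] = value
    return new_state


def policy(state: State) -> Tuple[str, str]:
    """Request the first missing slot; offer a restaurant once all are known."""
    for slot in REQUIRED_SLOTS:
        if slot not in state:
            return ("request", slot)
    return ("offer", "restaurant")


state: State = {}
state = apply_user_act(state, "inform", "category", "Mediterranean")
first_action = policy(state)   # ('request', 'location')
state = apply_user_act(state, "inform", "location", "downtown Mountain View")
second_action = policy(state)  # ('offer', 'restaurant')
print(first_action, second_action)
```

The two actions match the system turns in the example dialogue: first a request for the area, then an offer.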

Learning DM Policy

Multi-stage training of dialogue manager:

[Figure: three training stages.
Bootstrap (Supervised Learning): Human expert, User, Dialogue Corpus, Dialogue Manager.
Simulated Refinement (Reinforcement Learning): User Simulator, Dialogue Manager, Task-level Reward.
Continual Learning (Interactive RL): User, Dialogue Manager, Task-level Reward, Turn-level Feedback.]

P. Shah, D. Hakkani-Tur, L. Heck. “Interactive reinforcement learning for task-oriented dialogue management”, Deep Learning for Action and Interaction, NIPS, 2016.


Learning DM Policy

Learning task-oriented dialogue management through:

● Imitation (Pretraining, Supervised Learning): Human expert, User, Dialogue Corpus, Dialogue Manager
● Experimentation (Simulated Play, Reinforcement Learning): User Simulator, Dialogue Manager, Reward Function
● Feedback (Real Interactions, Interactive RL): User, Dialogue Manager, Reward Function, Feedback

to scalably manage:

● Task complexity
● Discourse complexity
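
To give a feel for the first two stages, the toy sketch below seeds a tabular policy from a tiny "expert corpus" and then refines it with Q-learning against a simulator. Everything here is an assumption made for illustration: the two-slot booking task, the reward values, the hyperparameters, and the use of tabular Q-learning in place of a neural policy. The third stage, learning from real users with turn-level feedback, is not shown.

```python
import random
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

SLOTS = ("time", "num_people")
ACTIONS = ("request_time", "request_num_people", "book")
State = FrozenSet[str]  # the set of slots the system already knows


def simulate_turn(state: State, action: str) -> Tuple[State, float, bool]:
    """Toy user simulator: returns (next_state, reward, done)."""
    if action == "book":
        success = all(slot in state for slot in SLOTS)
        return state, (1.0 if success else -1.0), True  # task-level reward
    slot = action[len("request_"):]
    return state | {slot}, -0.05, False                 # small per-turn cost


def pretrain(corpus: List[Tuple[State, str]]) -> Dict[State, Dict[str, float]]:
    """Stage 1 (imitation): seed action values with expert action counts."""
    q: Dict[State, Dict[str, float]] = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    for state, action in corpus:
        q[state][action] += 1.0
    return q


def refine(q, episodes: int = 500, alpha: float = 0.5,
           epsilon: float = 0.2, seed: int = 0):
    """Stage 2 (simulated play): epsilon-greedy Q-learning in the simulator."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state, done, turns = frozenset(), False, 0
        while not done and turns < 10:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = simulate_turn(state, action)
            target = reward if done else reward + max(q[next_state].values())
            q[state][action] += alpha * (target - q[state][action])
            state, turns = next_state, turns + 1
    return q


def greedy_dialogue(q) -> List[str]:
    """Roll out the learned policy without exploration."""
    state, actions, done = frozenset(), [], False
    while not done and len(actions) < 10:
        action = max(ACTIONS, key=lambda a: q[state][a])
        state, _, done = simulate_turn(state, action)
        actions.append(action)
    return actions


# A tiny, incomplete expert corpus: it never shows what to do after the first turn.
corpus = [(frozenset(), "request_time")]
q = refine(pretrain(corpus))
print(greedy_dialogue(q))
```

The corpus only covers the opening move, so simulated play is what teaches the policy to collect the remaining slot before booking.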

Natural Language Generation (NLG)

● Convert system’s action into natural language system turns.
  ○ Sequence-to-sequence model with attention

● System action is flattened into a sequence.

● Output could be de-lexicalized NL, e.g.,
  <restaurant> does not have a table at <time1>, would <time2> work for you?

● Slot values are important for surface realization.

[Figure: sequence-to-sequence model with attention (context vector c_i). Input sequence: "request time". Decoder start symbol: "go". Output: "when is your reservation …"]

N. Nayak, D. Hakkani-Tur, M. Walker, L. Heck. “To Plan or not to Plan? Discourse planning in slot-value informed sequence to sequence models for language generation”, INTERSPEECH, 2017.
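
The flattening and de-lexicalization steps can be sketched without the neural model. Below, system acts are flattened into an input token sequence with placeholders, and a de-lexicalized output is filled back in with real values. The act format, the placeholder numbering rule, and the function names are assumptions for illustration. The template is the one from the slide, used here in place of actual model output.

```python
from collections import Counter
from typing import Dict, List, Tuple

Act = Tuple[str, Dict[str, str]]  # (act type, slot -> value)


def flatten(acts: List[Act]) -> Tuple[List[str], Dict[str, str]]:
    """Flatten system acts into input tokens, replacing values by placeholders."""
    totals = Counter(slot for _, slots in acts for slot in slots)
    seen: Counter = Counter()
    tokens: List[str] = []
    values: Dict[str, str] = {}
    for act_type, slots in acts:
        tokens.append(act_type)
        for slot, value in slots.items():
            seen[slot] += 1
            # Number the placeholder only when the slot occurs more than once.
            index = str(seen[slot]) if totals[slot] > 1 else ""
            placeholder = f"<{slot}{index}>"
            tokens += [slot, placeholder]
            values[placeholder] = value
    return tokens, values


def relexicalize(template: str, values: Dict[str, str]) -> str:
    """Substitute real slot values back into a de-lexicalized output."""
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


acts = [("negate", {"restaurant": "Cascal", "time": "7pm"}), ("offer", {"time": "8pm"})]
tokens, values = flatten(acts)
template = "<restaurant> does not have a table at <time1>, would <time2> work for you?"
sentence = relexicalize(template, values)
print(tokens)
print(sentence)  # Cascal does not have a table at 7pm, would 8pm work for you?
```

Generating over placeholders keeps the output vocabulary small, while the value map preserves what is needed for the final surface form.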

Goal-Oriented Dialogue Systems - Training

[Figure: SYSTEM/AGENT pipeline with its components (Conversation Understanding, Dialogue State Tracking, Dialogue Manager, Response Generation, Providers), the Request(time) action, and the back-end query and response.]


Building User Simulators

● User simulators mimic real users and interact with the system agent to collect data, bootstrap modeling, and perform evaluation.

[Figure: a SYSTEM/AGENT (Understanding, Dialogue State Tracking, Dialogue Manager, Response Generation; Task Data) interacting with a USER SIMULATOR (Understanding, Dialogue State Tracking, Dialogue Manager, Response Generation; User Goal). I/O: dialogue states.]

Building User Simulators: User Characteristics

Personality traits: OCEAN (Wiggins, 1996), PEN (Eysenck, 1990)

Model aspects that change conversation flow

● Talkativeness

● Cooperativeness

● Consistency

● Flexibility

[Figure: four sliders on a 0 to 1 scale, with endpoint labels quiet/talkative, uncooperative/cooperative, hesitant/consistent, and strict/flexible. The example settings shown are 0.71, 0.49, 0.71, and 0.26.]
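
One possible reading of these sliders is as probabilities that shape each simulated user turn. In the sketch below, cooperativeness is the chance of answering the slot the system asked for, and talkativeness is the chance of volunteering each other slot. This mapping, the direction of each scale (0 = quiet, 0 = uncooperative), and the names `UserProfile` and `user_turn` are all assumptions, not the talk's actual simulator.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UserProfile:
    talkativeness: float    # assumed scale: 0 = quiet, 1 = talkative
    cooperativeness: float  # assumed scale: 0 = uncooperative, 1 = cooperative


def user_turn(profile: UserProfile, goal: Dict[str, str], informed: List[str],
              requested: Optional[str], rng: random.Random) -> Dict[str, str]:
    """Pick which goal slots the simulated user informs in this turn."""
    remaining = [slot for slot in goal if slot not in informed]
    chosen: List[str] = []
    # Cooperative users answer the slot the system asked for.
    if requested in remaining and rng.random() < profile.cooperativeness:
        chosen.append(requested)
    # Talkative users volunteer extra slots the system did not ask about.
    for slot in remaining:
        if slot not in chosen and rng.random() < profile.talkativeness:
            chosen.append(slot)
    return {slot: goal[slot] for slot in chosen}


goal = {"restaurant_name": "Il Fornaio", "date": "tonight", "time": "7pm", "num_people": "3"}
rng = random.Random(0)

quiet = UserProfile(talkativeness=0.0, cooperativeness=1.0)
quiet_answer = user_turn(quiet, goal, informed=["restaurant_name"], requested="time", rng=rng)

talkative = UserProfile(talkativeness=1.0, cooperativeness=1.0)
talkative_answer = user_turn(talkative, goal, informed=[], requested="date", rng=rng)
print(quiet_answer, talkative_answer)
```

At the extremes the behavior is deterministic: the quiet, cooperative user answers only the requested slot, while the fully talkative user states the whole goal at once.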

Machines Talking to Machines

User Simulator, System Agent

Scenario:
  User type: cooperative
  User goal:
    Intent = reserve_restaurant
    r_name = Il Fornaio
    date = tonight
    time = 7pm *
    Num_people = 3

Dialogue Acts

S: greeting()
U: greeting intent=reserve_restaurant inform(restaurant_name=il fornaio)
S: request(date,time)
U: inform(date=tonight,time=7pm)
S: request(num_people)
U: inform(num_people=3)
S: negate(time=7pm) offer(time=6:30)
U: affirm()
S: notify_success()
U: thanks() bye()
S: bye()
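
The outline above is the kind of act sequence that two scripted programs can produce by talking to each other. The sketch below is a deliberately simplified version, assuming a fully cooperative and flexible user and an agent that asks for one slot per turn, whereas the slide's agent asks for date and time together. The function name `generate_outline` and the `REQUIRED` slot list are made up for this example.

```python
from typing import Dict, List

REQUIRED = ["date", "time", "num_people"]  # assumed slots the agent must collect


def generate_outline(goal: Dict[str, str], available_time: str) -> List[str]:
    """Self-play between a scripted user and agent, emitting dialogue acts only."""
    outline = [
        "S: greeting()",
        f"U: greeting() inform(restaurant_name={goal['restaurant_name']})",
    ]
    known = {"restaurant_name": goal["restaurant_name"]}
    for slot in REQUIRED:
        outline.append(f"S: request({slot})")
        outline.append(f"U: inform({slot}={goal[slot]})")  # cooperative user answers directly
        known[slot] = goal[slot]
    if known["time"] != available_time:
        # The requested time is unavailable, so the agent proposes an alternative.
        outline.append(f"S: negate(time={known['time']}) offer(time={available_time})")
        outline.append("U: affirm()")  # flexible user accepts the alternative
    outline += ["S: notify_success()", "U: thanks() bye()", "S: bye()"]
    return outline


goal = {"restaurant_name": "il fornaio", "date": "tonight", "time": "7pm", "num_people": "3"}
outline = generate_outline(goal, available_time="6:30")
print("\n".join(outline))
```

Varying the goal and the availability yields different outlines, each of which can then be turned into natural language.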




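Dialogue Acts and Crowd Workers’ Surface Realization

S: greeting()
   Hi, how can I help you?
U: greeting intent=reserve_restaurant inform(restaurant_name=il fornaio)
   Hey, can I reserve a spot at il Fornaio.
S: request(date,time)
   Sure, what time and day are you dining?
U: inform(date=tonight,time=7pm)
   The dinner is tonight at 7 pm
S: request(num_people)
   How many people will be attending?
U: inform(num_people=3)
   Myself and two others.
S: negate(time=7pm) offer(time=6:30)
   Il Fornaio doesn’t have a table available at 7 pm. Would you be ok with 6:30 pm?
U: affirm()
   Sure, that is also good.
S: notify_success()
   Great, We have your appointment all set.
U: thanks() bye()
   Awesome, I appreciate it. have a good day.
S: bye()
   You too. bye.

[Figure labels between the two columns: NLG, CLU, DST]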

What is next?

● Understanding meaning beyond words

○ “Later today”: 7-9pm for dinner, 3-5pm for meetings

● Personalization

● More lively conversations

● Complex conversations

○ Compositionality

○ Multi-domain tasks

● Interactions beyond domain boundaries

Thanks!
