MITRE Dialog Management Workshop – a review
Dan Bohus
Dialogs on Dialogs reading group, CMU, November 2003
MITRE Dialog Management Workshop
The Workshop
MITRE Dialog Workshop @ MITRE, Bedford/Boston, October 27-28, 2003
Idea:
• Bring together researchers working on dialog management
• Give them a homework: adapt your dialog manager to a medical diagnosis domain (details in a sec)
• Discuss, compare, learn
workshop : godis : ravenclaw : collagen : themes
The Homework
Implement a dialog system for the medical diagnosis domain
• Task left open-ended (diagnosis, tutoring, etc.)
• No speech, just text in and out
• Backend provided (backend.doc)
  • Java version and web-based interface version
  • 3 diseases: malaria, coccidioidomycosis, another one
  • List of symptoms: headache, nausea, muscle pain, etc.
  • Decision tree involving symptoms and tests (fever, blood tests, travel patterns, etc.)
• Small enough to presumably not be lots of work, but large enough to allow illustration of functionalities and provide some skeleton for the discussions…
Participants
• MITRE (Carl Burke et al): MiDiKi
• Gothenburg (Staffan Larsson): GoDiS (TRINDIKit)
• USC ICT (David Traum): ICT Dialogue Manager
• NTT/CMU (Matthias Denecke): Ariadne
• CMU (Dan, Alex): RavenClaw
• Ames (Beth-Ann Hockey): NASA Dialogue Manager
• DFKI (Norbert Reithinger): DFKI Dialogue Manager
• MERL (Candy Sidner, Charles Rich): COLLAGEN
… and others invited but not present
GoDiS
TRINDIKit – information state update dialogue management toolkit
• Information state
  • Private: dialog plan, beliefs, agenda (short term goals)
  • Shared: established facts, QUD (questions under discussion), last utterance information
• Dialog moves
• Update rules
GoDiS: dialog management system implemented in TRINDIKit, handling:
• information oriented dialogue
• action oriented dialogue
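The information state split described above can be sketched as a small data structure plus one update rule. This is only an illustrative sketch, not TRINDIKit code: the class names, the `integrate_answer` rule and the tuple encoding of facts are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Private:
    plan: list = field(default_factory=list)      # dialog plan
    beliefs: set = field(default_factory=set)     # private beliefs
    agenda: list = field(default_factory=list)    # short term goals

@dataclass
class Shared:
    facts: set = field(default_factory=set)       # established facts
    qud: list = field(default_factory=list)       # questions under discussion, top is last
    last_utterance: Optional[tuple] = None        # (speaker, move, content)

@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)

def integrate_answer(state, speaker, question, answer):
    """Hypothetical update rule.
    Precondition: the answer addresses the topmost question under discussion.
    Effect: pop that question and record the answer as an established fact."""
    state.shared.last_utterance = (speaker, "answer", answer)
    if state.shared.qud and state.shared.qud[-1] == question:
        state.shared.qud.pop()
        state.shared.facts.add((question, answer))
        return True
    return False

state = InfoState()
state.shared.qud.append("have_fever")   # the system has just asked about fever
applied = integrate_answer(state, "user", "have_fever", "yes")
print(applied, state.shared.facts, state.shared.qud)
```

The point of the sketch is that a dialog move never touches the state directly; it only triggers rules whose preconditions and effects are stated over the information state.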
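TRINDIKit / GoDiS architecture
[Figure: architecture diagram. Modules: input, interpret, update, select, generate, output, coordinated by control; update and select form the DME. All modules read and write the TIS. Resources: DEVICES (backend interface), LEXICON (lexicon), DOMAIN (domain knowledge). Annotations for the diagnosis system: dialog plans, ontology, connection to Java backend.]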
GoDiS: Task Representation
Plans; propositional logic
• Dialogue plans for dealing with diagnosis (issues opened at dialogue start)
  • ?x.disease(x): “Which disease is diagnosed?”
  • ?confirmed_by_interview: “Is the diagnosis confirmed by additional information?”
  • ?confirmed_by_tests: “Is the diagnosis confirmed by medical tests?”
• Additional plans
  • ?x.info(x): “What information is there about a given disease?”
  • ?x.treatment(x): “What treatment is there for a given disease?”
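As a rough illustration of plans as lists of issues, the sketch below encodes the three diagnosis questions and computes which ones are still open. The `Question` class and the `open_issues` helper are hypothetical names for this example; GoDiS itself expresses plans in its own formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Question:
    predicate: str
    wh: bool        # True for wh-questions (?x.p(x)), False for yes/no questions (?p)
    gloss: str

    def __str__(self):
        return f"?x.{self.predicate}(x)" if self.wh else f"?{self.predicate}"

# Issues opened at dialogue start, as listed on the slide.
DIAGNOSIS_PLAN = [
    Question("disease", True, "Which disease is diagnosed?"),
    Question("confirmed_by_interview", False,
             "Is the diagnosis confirmed by additional information?"),
    Question("confirmed_by_tests", False,
             "Is the diagnosis confirmed by medical tests?"),
]

def open_issues(plan, facts):
    """Return the questions of the plan that no established fact resolves yet."""
    return [q for q in plan if q.predicate not in facts]

facts = {"disease": "malaria"}          # assumed shared fact, for illustration only
remaining = open_issues(DIAGNOSIS_PLAN, facts)
print([str(q) for q in remaining])
```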
GoDiS: Alternate Tasks
• User-driven dialogue (implemented)
  • Issues are not loaded on reset; the user has to raise all issues
  • User can ask the system to
    • Provide a diagnosis
    • Confirm whether the user has a given disease
• Decision trees as dialogue plans
  • Move backend knowledge into dialogue plans
  • Information conversion could be done automatically
• Separate genre: expert system dialogue
  • Add special purpose update rules
  • Dynamic dialogue planning by expert
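The "decision trees as dialogue plans" option could look roughly like the following. The tree, the step names (`findout`, `if_then_else`, `conclude`) and the outcome labels are placeholders invented for this sketch; they are not the workshop backend's decision tree and carry no medical meaning.

```python
# HYPOTHETICAL decision tree: internal nodes ask about a symptom, leaves name an outcome.
TREE = {
    "ask": "fever",
    "yes": {"ask": "headache", "yes": "outcome_A", "no": "outcome_B"},
    "no": "outcome_C",
}

def to_plan(node):
    """Convert a decision tree into a nested dialogue plan automatically."""
    if isinstance(node, str):
        return [("conclude", node)]
    return [
        ("findout", node["ask"]),
        ("if_then_else", node["ask"], to_plan(node["yes"]), to_plan(node["no"])),
    ]

def run(plan, answers):
    """Execute a plan against known answers; return (questions asked, conclusion)."""
    asked = []
    while plan:
        step, rest = plan[0], plan[1:]
        if step[0] == "findout":
            asked.append(step[1])
            plan = rest
        elif step[0] == "if_then_else":
            branch = step[2] if answers[step[1]] else step[3]
            plan = branch + rest
        else:  # "conclude"
            return asked, step[1]
    return asked, None

print(run(to_plan(TREE), {"fever": True, "headache": False}))
```

Once the tree lives in the plan, the dialog manager no longer needs to call the backend to decide what to ask next, at the cost of duplicating backend knowledge.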
GoDiS: Highlights / Lowlights
Highlights
• Reuse, you get for free:
  • Grounding
  • Accommodation / plan recognition
  • Multiple simultaneous issues & info sharing
• High-level abstraction for dialog plans
• Rapid prototyping
Lowlights
• Not used in this type of domain so far, so not entirely straightforward (update rule changes)
• Dynamic dialog plans (backend decides)
RavenClaw
Dialog Task (Specification)
• Captures all domain-specific dialog (task) logic with a hierarchical description
• The authoring effort is focused entirely here
Domain-independent Dialog Engine
• Manages dialog by executing the dialog task specification
• Provides domain-independent conversational strategies
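The two-part split can be sketched as a tree of agents (the task specification) and a generic loop that executes it with a stack (the engine). This is a toy illustration, not the RavenClaw API: the `Agent` class, the `kind` labels and `run_engine` are assumptions, and the agent names are borrowed from the Madeleine example used in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    kind: str                                   # "agency", "inform", "request" or "execute" (assumed labels)
    children: list = field(default_factory=list)
    done: bool = False

def task_spec():
    """Dialog task specification: the only domain-specific part."""
    return Agent("Madeleine", "agency", [
        Agent("Welcome", "inform"),
        Agent("LoadSymptoms", "execute"),
        Agent("GeneralFeel", "agency", [Agent("HowAreYou", "request")]),
    ])

def run_engine(root):
    """Domain-independent engine: walk the tree with a stack, left to right."""
    stack, trace = [root], []
    while stack:
        top = stack[-1]
        if top.kind == "agency":
            pending = [c for c in top.children if not c.done]
            if pending:
                stack.append(pending[0])        # focus moves to the first unfinished child
                continue
        else:
            trace.append(f"{top.kind}:{top.name}")  # a leaf agent "executes"
        top.done = True
        stack.pop()
    return trace

trace = run_engine(task_spec())
print(trace)
```

Nothing in `run_engine` mentions symptoms or diagnosis; swapping in a different `task_spec` changes the system without touching the engine.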
RavenClaw Architecture
[Figure, built up over several animation steps: dialog task tree, dialog stack and expectation agenda]
Dialog task tree
• Madeleine
  • I:Welcome
  • E:LoadSymptoms
  • GeneralFeel
    • R:HowAreYou?
    • I:Glad
    • I:Sorry
  • Diagnose
    • Fever
      • R:AskFever
      • E:MeasureTemp
      • I:InformFever
    • Travel
Concepts: general_feeling, chart, have_fever, diagnostic
Execution, in order (dialog stack listed top first)
• Stack: Madeleine
• Stack: Welcome, Madeleine. System: “Hi, this is Madeleine, the automated…”
• Stack: LoadSymptoms, Madeleine. Request agents for the loaded symptoms (R:Headache, R:…) and the headache concept are added to the tree
• Stack: GeneralFeel, Madeleine
• Stack: HowAreYou, GeneralFeel, Madeleine. System: “How are you feeling today?”
Expectation Agenda
• general_feeling: [good], [bad], [soso]
• have_fever: [fever], ![yes], ![no]
• headache: [headache], ![yes], ![no]
• cough: [cough], ![yes], ![no]
• …
User: “Not so good, I think I have a fever”
Parse: [soso](not so good) [fever](I think I have a fever)
Control then returns to GeneralFeel (I:Glad, I:Sorry)
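The way the expectation agenda turns the parsed input into concept values can be approximated as below. This is a sketch under stated assumptions: the two-level layout of `AGENDA`, the `bind` function, and the reading of the "!" prefix as "binds only when the expecting agent is in focus" are assumptions made for illustration, not a description of the actual engine.

```python
# Agenda levels, closest to the focus first.
# Each entry: (grammar slot, concept, focus_only). focus_only mirrors the "!" prefix
# on the slide, ASSUMED here to mean "only binds when the expecting agent has focus".
AGENDA = [
    [("good", "general_feeling", False),
     ("bad", "general_feeling", False),
     ("soso", "general_feeling", False)],
    [("fever", "have_fever", False), ("yes", "have_fever", True), ("no", "have_fever", True),
     ("headache", "headache", False), ("yes", "headache", True), ("no", "headache", True)],
]

def bind(agenda, parse):
    """Bind parsed slots to concepts, scanning the agenda top-down; first match wins."""
    bound = {}
    for slot, _text in parse:
        for level, expectations in enumerate(agenda):
            concept = next(
                (c for s, c, focus_only in expectations
                 if s == slot and (level == 0 or not focus_only)),
                None,
            )
            if concept is not None:
                bound.setdefault(concept, slot)
                break
    return bound

# Parse of "Not so good, I think I have a fever", as shown on the slide.
PARSE = [("soso", "not so good"), ("fever", "I think I have a fever")]
result = bind(AGENDA, PARSE)
print(result)
```

With this layout, the over-answer about fever is captured even though the system only asked about general feeling, while a bare "yes" would not be attributed to an out-of-focus symptom question.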
Illustrated Features
• Dynamic generation of dialog task structure
  • Symptoms loaded from backend, appropriate structures to “talk about them” created on-the-fly
  • New symptoms: no DM changes
• Dynamic dialog control policy
  • The order in which symptoms are addressed is controlled by the backend
• Conversational skills
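Dynamic generation of task structure amounts to building one request agent per symptom at run time. A minimal sketch, assuming a stubbed backend call and a dictionary encoding of agents (both invented for the example):

```python
def stub_backend_symptoms():
    """Stand-in for the backend query; the real backend interface is not shown here."""
    return ["headache", "nausea", "muscle pain"]

def make_symptom_agents(symptoms):
    """Create one request-agent description per symptom, on the fly."""
    agents = []
    for symptom in symptoms:
        concept = symptom.replace(" ", "_")
        agents.append({
            "agent": "R:" + symptom.title().replace(" ", ""),
            "concept": concept,
            # Grammar expectations in the style shown on the expectation agenda.
            "expects": [f"[{concept}]", "![yes]", "![no]"],
        })
    return agents

agents = make_symptom_agents(stub_backend_symptoms())
for a in agents:
    print(a["agent"], a["concept"], a["expects"])
```

Adding a symptom to the backend list yields a new agent without any change to the dialog manager code, which is the "New symptoms: no DM changes" point.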
Dynamic Dialog Control …
System: Hi, this is Madeleine, the automated…
System: How are you today?
User: Not so good, I think I have a headache
System: Sorry to hear you’re not feeling so good. Tell me more about your symptoms…
System: Do you have abdominal pain?
[Figure: the same dialog task tree and concepts as before, now including the R:Headache, R:… agents and the headache concept; Diagnose is on the dialog stack above Madeleine, and the Backend Decision Tree selects the next symptom to ask about]
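A backend-controlled policy can be sketched as a loop in which the dialog manager repeatedly asks the backend what to address next. The `stub_next_question` policy below is hypothetical and is not the workshop's decision tree; only the control flow is the point.

```python
def stub_next_question(answers):
    """HYPOTHETICAL backend policy: returns the next symptom to ask about, or None."""
    if "abdominal_pain" not in answers:
        return "abdominal_pain"
    if answers["abdominal_pain"] and "nausea" not in answers:
        return "nausea"
    return None

def diagnose(next_question, ask):
    """Dialog-manager side: the order of questions is decided entirely by the backend."""
    answers, order = {}, []
    while (symptom := next_question(answers)) is not None:
        order.append(symptom)
        answers[symptom] = ask(symptom)     # in a real system: run the request agent
    return order, answers

order, answers = diagnose(stub_next_question, lambda symptom: True)
print(order, answers)
```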
Conversational Skills
Corresponding agencies added automatically to the dialog task tree:
• Help
• What Can I Say?
• Repeat
• Suspend / Resume
• Start Over
• Timeout handling (not illustrated)
Still need all the language generation prompts and grammar, but some of those are develop-once, too
RavenClaw Conclusion
Highlights
• Set task posed no challenges to the framework
  • Easy to implement
  • Dynamic dialog structure and control
  • Automatic use of domain-independent conversational skills
Lowlights?
• Toolkit perspective: how easy would it be for someone else to build it?
• Asynchronous behaviors? (timing)
• Couple of bugs / fixes (or is that a highlight?)
Collagen
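COLLAGEN
[Figure: Collaborative Interface Agent. The user and the agent communicate with each other; each interacts with the application and observes the other's actions. Collagen maintains the plan tree and the focus stack *.]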
COLLAGEN Systems
• air travel planning
• email reading and responding (w. IBM/Lotus)
• GUI design tool operation
• car navigation system operation
• airport landing path planning (w. MITRE)
• gas turbine operator training (w. USC/ISI)
• personal video recorder operation
• programmable thermostat operation (with Delft U.)
• multi-modal web-based form-filling
Collagen: Theory and Implementation
SharedPlan Discourse Theory (Grosz, Sidner, Kraus, Lochbaum 1974-1998)
• Intentional: purposes, contributes
• Linguistic: segments, lexical items
• Attentional: focus spaces, focus stack
Java Implementation
• focus stack
• purpose tree
Collagen: Discourse Segments and Purposes
(Grosz, 1974)
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done.
E: Now remove the pump.
…
E: First you have to remove the flywheel.
…
E: Now take the pump off the base plate.
A: Already did.
Segment purposes: replace pump and belt (whole dialog), with sub-segments replace belt and replace pump
(fixing an air compressor, E = expert, A = apprentice)
Discourse state representation
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done
Focus Stack (top first)
• replace belt (current focus space)
• replace pump and belt
Purpose Tree
• replace pump and belt
  • replace pump
  • replace belt
(Grosz & Sidner, 1986)
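The discourse state for the air compressor dialog can be mimicked with two small structures, a stack for focus and a parent-to-children map for purposes. The `Discourse` class is a hypothetical illustration, not Collagen's Java implementation.

```python
class Discourse:
    """Toy discourse state: a focus stack plus a purpose tree."""

    def __init__(self):
        self.stack = []       # focus stack, top is last
        self.children = {}    # purpose tree: purpose -> list of sub-purposes

    def push(self, purpose):
        """Open a segment; it becomes a child of the purpose currently in focus."""
        if self.stack:
            self.children.setdefault(self.stack[-1], []).append(purpose)
        self.children.setdefault(purpose, [])
        self.stack.append(purpose)

    def pop(self):
        """Close the current segment; the purpose tree keeps its record."""
        return self.stack.pop()

d = Discourse()
d.push("replace pump and belt")   # E: Replace the pump and belt please.
d.push("replace belt")            # A: Ok, I found a belt in the back.
stack_after_belt = list(d.stack)
d.pop()                           # A: It's done.
d.push("replace pump")            # E: Now remove the pump.
print(stack_after_belt, d.stack, d.children)
```

The stack forgets closed segments while the tree remembers them, which is why both structures are kept.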
Discourse interpretation algorithm
(Lochbaum, 1998)
Focus stack. The current (communication or manipulation) act either:
• starts a new segment/focus space (push)
• ends the current segment/focus space (pop)
• continues (contributes to) the current segment/... (add)
Purpose tree. An act contributes to the purpose of a segment if it:
• directly achieves the purpose
• is a step in the plan for the purpose *
• identifies the recipe used to achieve the purpose
• identifies who should perform the purpose or a step in the plan
• identifies a parameter of the purpose or a step in the plan
* does not include recursive plan recognition (see later topic)
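A heavily simplified version of the push / pop / add decision is sketched below. It implements only one of the five contribution tests ("is a step in the plan for the purpose") and does no plan recognition; the `RECIPES` table and the `interpret` function are assumptions made up for the example, not Lochbaum's algorithm or Collagen's code.

```python
# HYPOTHETICAL recipe library: purpose -> steps in the plan for that purpose.
RECIPES = {
    "replace pump and belt": ["replace belt", "replace pump"],
    "replace belt": ["remove belt", "install belt"],
    "replace pump": ["remove flywheel", "remove pump"],
}

def interpret(stack, act):
    """Update the focus stack for one act and return the operations performed."""
    ops = []
    while stack:
        steps = RECIPES.get(stack[-1], [])
        if act in steps:
            if act in RECIPES:              # non-primitive step: starts a new segment
                stack.append(act)
                return ops + ["push"]
            return ops + ["add"]            # primitive step: contributes to current segment
        stack.pop()                         # act does not contribute: end the segment
        ops.append("pop")
    stack.append(act)                       # nothing left in focus: start a new top-level segment
    return ops + ["push"]

stack = []
history = [interpret(stack, act) for act in
           ["replace pump and belt", "replace belt", "remove belt", "replace pump"]]
print(history, stack)
```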
COLLAGEN … my take
• Separation of task from dialog/discourse engine
• Recipes / Domain plans / Task tree
  • Full-blown HTN (hierarchical task network)
  • Hierarchical
  • Preconditions (constraints)
  • Effects
  • Completion / failure
  • Live nodes
• Stack to keep track of focus and discourse structure
• Tree explicitly contains agent and user nodes
• Formalized / descriptive recipe specs (actually Java underneath), with procedure overwrites…
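The notions of preconditions, effects and live nodes can be illustrated with a toy recipe drawn from the air compressor example ("first you have to remove the flywheel"). The `Step` class and the `live_steps` / `perform` helpers are hypothetical; Collagen's own recipe language is not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    pre: set = field(default_factory=set)    # preconditions (constraints)
    eff: set = field(default_factory=set)    # effects
    done: bool = False                       # completion flag

def live_steps(steps, world):
    """A step is live if it is not done and its preconditions hold in the world state."""
    return [s.name for s in steps if not s.done and s.pre <= world]

def perform(step, world):
    """Mark the step complete and apply its effects to the world state (in place)."""
    step.done = True
    world |= step.eff

steps = [
    Step("remove flywheel", eff={"flywheel off"}),
    Step("remove pump", pre={"flywheel off"}, eff={"pump off"}),
]
world = set()
before = live_steps(steps, world)
perform(steps[0], world)
after = live_steps(steps, world)
print(before, after, world)
```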
Themes …
Themes: Task Representation
• Task representation
  • Separation of task representation from dialog engine
  • High-level representations of task
    • Descriptive rather than procedural
    • Procedural will be unavoidable for complex tasks
    • Expressive power
• GoDiS, RavenClaw, Collagen: plan based representations of task
Themes: Task/Domain/Genre
• The notion of dialog genre
  • Tutoring
  • Diagnosis
  • Information Access
• Where to fold it in a dialog manager?
  • GoDiS: update/select rules
  • Ariadne: plugins
  • RavenClaw: collapsed with task
• How clear is that separation: task vs. genre?
Themes: Development time
• Systems took on the order of 3-5 days to develop
• Significant effort in the backend connection
  • Some sites shortcut it
• Significant effort in grammar/language generation development
  • Some sites shortcut it
• Everyone that had an implementation: “fixed a couple of bugs, but no major changes required”
Themes: Development tools
• Regression testing (GoDiS)
  • Systems are complex. If you change something in a dialog management framework, can you prove that it did not screw up things that used to work?
  • System-wise, very intractable
  • Component-wise, maybe: i.e. DM with DM inputs/outputs
• System diagnosis / log visualization tools (Collagen)
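Component-wise regression testing of a dialog manager can be as simple as replaying logged DM inputs and comparing the outputs against the logged ones. The sketch below uses a made-up `toy_dm` and log format; it shows the replay idea only, not the GoDiS tooling.

```python
def toy_dm(state, user_input):
    """Stand-in dialog manager (HYPOTHETICAL): maps DM input to the next system act."""
    if "greeted" not in state:
        state["greeted"] = True
        return "inform(welcome)"
    if user_input == "fever":
        state["have_fever"] = True
        return "request(headache)"
    return "request(have_fever)"

def regression(dm, log):
    """Replay logged (input, expected output) pairs; return the list of mismatches."""
    state, failures = {}, []
    for turn, (user_input, expected) in enumerate(log):
        actual = dm(state, user_input)
        if actual != expected:
            failures.append((turn, user_input, expected, actual))
    return failures

# A log recorded from a version of the DM that is known to work.
LOG = [
    (None, "inform(welcome)"),
    ("hello", "request(have_fever)"),
    ("fever", "request(headache)"),
]

failures = regression(toy_dm, LOG)
print("regressions:", failures)
```

Because only the DM is exercised, the replay avoids the recognizer, parser and generator, which is what makes the component-wise case tractable.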
Themes: Timing
• (Micro)timing unaddressed
• Turn-taking models in general, very rudimentary
• Asynchronous behaviors
  • Could be accomplished, but no-one seemed to have it
• Multi-party conversation unaddressed
Themes: the important problems
Different people have different views of what those are:
• Plan / Intention recognition
• Reference resolution
• Backup in complex systems
• Tense problems
• Negations
• Grounding; error prevention / recovery
Themes: Reasoning
• Dialog Managers vs Backends
  • Where to draw the line?
  • Who does the reasoning?
  • Can we avoid duplicating it?
  • How rich is the interaction between them?
• Dialog systems use language to act in a domain, so they are generally strongly tied
• Basic set of conversational skills can be identified
• Drawing that line is still an “art”; no general agreement or solutions exist
Themes: Science of Dialog?
• How much science do we have?
  • Theory vs. experiment
• Interesting Collagen / RavenClaw similarities
• Representation or not?
• GUI analogy
  • Do we have the checkboxes and radio-buttons?