Learning the Structure of Task-Oriented Conversations from the Corpus of In-Domain
Dialogs
Ph.D. Thesis Defense
Ananlada Chotimongkol, Carnegie Mellon University, 18 December 2007
Thesis Committee: Alexander Rudnicky (Chair), William Cohen, Carolyn Penstein Rosé, Gokhan Tur (SRI International)
Outline
Introduction
Structure of task-oriented conversations
Machine learning approaches
Conclusion
A spoken dialog system
Speech Synthesizer
Speech Recognizer
Natural Language Generator
“I would like to fly to Seattle tomorrow.”
“When would you like to leave?”
Natural Language
Understanding
Dialog Manager
problem | dialog structure | learning approaches | conclusion
Domain Knowledge (tasks, steps, domain keywords)
Problems in acquiring domain knowledge
Problems: require domain expertise; subjective; may miss some cases (Yankelovich, 1997)
example dialogs
Domain Knowledge (tasks, steps, domain keywords)
problem | dialog structure | learning approaches | conclusion
Problems: require domain expertise; subjective; may miss some cases; time consuming (Bangalore et al., 2006)
Client: I'D LIKE TO FLY TO HOUSTON TEXAS
Agent: AND DEPARTING PITTSBURGH ON WHAT DATE?
Client: DEPARTING ON FEBRUARY TWENTIETH
...
Agent: DO YOU NEED A CAR?
Client: YEAH
Agent: THE LEAST EXPENSIVE RATE I HAVE WOULD BE WITH THRIFTY RENTAL CAR FOR TWENTY THREE NINETY A DAY
Client: OKAY
Agent: WOULD YOU LIKE ME TO BOOK THAT CAR FOR YOU?
Client: YES
...
Agent: OKAY AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON?
Client: YES
Agent: AND WHERE AT IN HOUSTON?
Client: /UM/ DOWNTOWN
Agent: OKAY
Agent: DID YOU HAVE A HOTEL PREFERENCE?
...
Task-oriented dialog
step1: reserve a flight
step2: reserve a car
step3: reserve a hotel
• Observable structure
• Reflects domain information
• Observable -> learnable?
Proposed solution
example dialogs
Domain Knowledge (tasks, steps, domain keywords)
dialog system
human revises
Learning system output
air travel
dialogs
Domain Knowledge
task = create a travel itinerary
steps = reserve a flight, reserve a hotel, reserve a car
keywords = airline, city name, date
Thesis statement
Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach
Thesis scope (1)
What to learn: domain-specific information in a task-oriented dialog
A list of tasks and their decompositions (travel reservation: flight, car, hotel)
Domain keywords (airline, city name, date)
Thesis scope (2)
Resources: a corpus of in-domain conversations
Recorded human-human conversations are already available
Thesis scope (3)
Learning approach: unsupervised learning
No training data is available for a new domain
Annotating data is time consuming
Proposed approach
2 research problems:
1. Specify a suitable domain-specific information representation
2. Develop a learning approach that infers domain information captured by this representation from human-human dialogs
Outline
Introduction
Structure of task-oriented conversations
  Properties of a suitable dialog structure
  Form-based dialog structure representation
  Evaluation
Machine learning approaches
Conclusion
Properties of a desired dialog structure
Sufficiency: capture all domain-specific information required to build a task-oriented dialog system
Generality (domain-independent): able to describe task-oriented dialogs in dissimilar domains and types
Learnability: can be identified by an unsupervised machine learning algorithm
Domain-specific information in task-oriented dialogs
A list of tasks and their decompositions (e.g., travel reservation = flight + car + hotel): a compositional structure of a dialog based on the characteristics of a task
Domain keywords (e.g., airline, city name, date): the actual content of a dialog
Existing discourse structures
Discourse structure | Sufficiency | Generality | Learnability
Segmented Discourse Representation Theory (Asher, 1993) | focuses on meaning, not actual entities | ? | ?
Grosz and Sidner's theory (Grosz and Sidner, 1986) | doesn't model domain keywords | | unsupervised?
DAMSL extension (Hardy et al., 2003) | doesn't model a compositional structure | ? | unsupervised?
A plan-based model (Cohen and Perrault, 1979) | | | unsupervised?
Form-based dialog structure representation
Based on the notion of a form (Ferrieux and Sadek, 1994), a data representation used in the form-based dialog system architecture
Focuses only on concrete information that can be observed directly from in-domain conversations
Form-based representation components
Consists of 3 components:
1. Task
2. Sub-task
3. Concept
Form-based representation components
1. Task: a subset of a dialog that has a specific goal (e.g., make a travel reservation)
Form-based representation components
reserve a flight
reserve a car
reserve a hotel
2. Sub-task: a step in a task that contributes toward the goal; contains sufficient information to execute a domain action
Form-based representation components
3. Concept (domain keywords): a piece of information required to perform an action
Data representation
Represented by a form: a repository of related pieces of information necessary for performing an action
Data representation
Form = a repository of related pieces of information
Sub-task: contains sufficient information to execute a domain action -> a form
Form: flight query
reserve a flight
Data representation
Form = a repository of related pieces of information
Task: a subset of a dialog that has a specific goal -> a set of forms
Form: flight query
Form: hotel query
Form: car query
Data representation
Form = a repository of related pieces of information
Concept: a piece of information required to perform an action -> a slot
Form: flight query
DepartCity: Pittsburgh
ArriveCity: Houston
ArriveState: Texas
DepartDate: February twentieth
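The form/slot layout above can be sketched as a small data structure. This is an illustrative Python sketch only; the `Form` class and field names are hypothetical, not the thesis implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Form:
    """A form: a repository of related slots (concepts) needed to
    execute one domain action."""
    name: str
    slots: dict = field(default_factory=dict)

# The flight-query form filled in from the example dialog
flight = Form("flight query", {
    "DepartCity": "Pittsburgh",
    "ArriveCity": "Houston",
    "ArriveState": "Texas",
    "DepartDate": "February twentieth",
})

# A task is then simply a set of forms
task = [flight, Form("car query"), Form("hotel query")]
```

In this view, a concept maps to a slot, a sub-task to a filled form, and a task to the collection of forms completed during the dialog.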
Form-based representation properties
Sufficiency: the form is already used in form-based dialog systems, e.g., the Philips train timetable system (Aust et al., 1995) and the CMU Communicator system (Rudnicky et al., 1999)
Generality (domain-independent): a broader interpretation of the form is provided; an analysis of six dissimilar domains
Learnability: components are observable directly from a dialog; evaluated (by human) via annotation scheme reliability and (by machine) via the accuracy of the domain information learned by the proposed approaches
Outline
Introduction
Structure of task-oriented conversations
  Properties of a suitable dialog structure
  Form-based dialog structure representation
  Evaluation
    Dialog structure analysis (generality)
    Annotation experiment (human learnability)
Machine learning approaches
Conclusion
Dialog structure analysis
Goal: to verify that the form-based representation can be applied to dissimilar domains
Approach: analyze 6 task-oriented domains
Air travel planning (information-accessing task)
Bus schedule inquiry (information-accessing task)
Map reading (problem-solving task)
UAV flight simulation (command-and-control task)
Meeting (personnel resource management)
Tutoring (physics essay revising)
Map reading domain (problem-solving task)
Task: draw a route on a map
Sub-task: draw a segment of a route
Concepts:
StartLocation = {White_Mountain, Machete, ...}
Direction = {down, left, ...}
Distance = {a couple of centimeters, an inch, ...}
Sub-task: ground a landmark
Concepts:
LandmarkName = {White_Mountain, Machete, ...}
Location = {below the start, ...}
Dialog structure analysis (map reading domain)
GIVER 1: okay ... ehm ... right, you have the start?
FOLLOWER 2: yeah. (action: (implicit) define_a_landmark)
GIVER 3: right, below the start do you have ... er like a missionary camp?
FOLLOWER 4: yeah. (action: define_a_landmark)
GIVER 5: okay, well ... if you take it from the start just run ... horizontally.
FOLLOWER 6: uh-huh.
GIVER 7: eh to the left for about an inch.
FOLLOWER 8: right. (action: draw_a_segment)
GIVER 9: and then go down along the side of the missionary camp.
FOLLOWER 10: uh-huh.
GIVER 11: 'til you're about an inch ... above the bottom of the map.
FOLLOWER 12: right.
GIVER 13: then you need to go straight along for about 'til about ...
Form: grounding
LandmarkName: missionary camp
Location: below the start
Form: segment description
StartLocation: start
Direction: left
Distance: an inch
Path:
EndLocation:
UAV flight simulation domain (command-and-control task)
Task: take photos of the targets
Sub-task: take a photo of each target
Sub-subtask: control a plane
Concepts:
Altitude = {2700, 3300, ...}
Speed = {50 knots, 200 knots, ...}
Destination = {H-area, SSTE, ...}
Sub-subtask: ground a landmark
Concepts:
LandmarkName = {H-area, SSTE, ...}
LandmarkType = {target, waypoint}
Meeting domain
Task: manage resources for a new employee
Sub-task: get a computer
Concepts:
Type = {desktop, laptop, ...}
Brand = {IBM, Dell, ...}
Sub-task: get office space
Sub-task: create an action item
Concepts:
Description = {have a space, ...}
Person = {Hardware Expert, Building Expert, ...}
StartDate = {today, ...}
EndDate = {the fourteenth of December, ...}
Characteristics of form-based representation
Focuses only on concrete information that is observable directly from in-domain conversations; describes a dialog with a simple model
Pros: possible to learn with an unsupervised learning approach
Cons: can't capture information that is not clearly expressed in a dialog (omitted concept values); nevertheless, 93% of dialog content can be accounted for
Cons: can't model a complex dialog that has a dynamic structure (the tutoring domain), but it is good enough for many real-world applications
Form-based representation properties (revisit)
Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content
Generality (domain-independent): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
Learnability: components are observable directly from a dialog; evaluated (by human) via annotation scheme reliability and (by machine) via the accuracy of the domain information learned by the proposed approaches
Annotation experiment
Goal: to verify that the form-based representation can be understood and applied by other annotators
Approach: conduct an annotation experiment with non-expert annotators
Evaluation: similarity between annotations; accuracy of annotations
Challenges in annotation comparison
Different tagsets may be used, since annotators have to design their own tagsets
Some differences are acceptable if they conform to the guideline
Different dialog structure designs can generate dialog systems with the same functionalities
Annotator 1 Annotator 2
<NoOfStop> -
<DestinationCity> <DestinationLocation><City>
<Date> <DepartureDate> and <ArrivalDate>
Cross-annotator correction
Each annotator creates his or her own tagset and then annotates dialogs
Each annotator critiques and corrects the other annotator's annotation of the same dialog
Compare the original annotation with the corrected one (cross-annotator comparison)
Annotation experiment
2 domains: air travel planning (information-accessing task) and map reading (problem-solving task)
4 subjects in each domain: people who are likely to use the form-based representation in the future
Each subject has to design a tagset and annotate the structure of dialogs, then critique other subjects' annotations of the same set of dialogs
Evaluation metrics
Annotation similarity: acceptability is the degree to which an original annotation is acceptable to a corrector
Annotation accuracy: accuracy is the degree to which a subject's annotation is acceptable to an expert
Annotation results
High acceptability and accuracy, except task/sub-task accuracy in the map reading domain
Concepts can be annotated more reliably than tasks and sub-tasks: they are smaller units and have to be communicated clearly
Concept annotation   Air Travel  Map Reading
acceptability        0.96        0.95
accuracy             0.97        0.89

Task/sub-task annotation  Air Travel  Map Reading
acceptability             0.81        0.84
accuracy                  0.90        0.65
Form-based representation properties (revisit)
Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content
Generality (domain-independent): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
Learnability: components are observable directly from a dialog; can be applied reliably by other annotators in most cases; (by machine) the accuracy of the domain information learned by the proposed approaches
Outline
Introduction
Structure of task-oriented conversations
Machine learning approaches
Conclusion
Overview of learning approaches
Divide into 2 sub-problems:
1. Concept identification: what are the concepts? what are their members?
2. Form identification: what are the forms? what are the slots (concepts) in each form?
Use unsupervised learning approaches; an acquisition (not recognition) problem
Learning example
Form: flight query
DepartCity: Pittsburgh
ArriveCity: Houston
ArriveState: Texas
ArriveAirport: Intercontinental
Form: hotel query
City: Houston
Area: Downtown
HotelName:
Form: car query
Pick up location: Houston
Pickup Time:
Return Time:
Outline
Introduction
Structure of task-oriented conversations
Machine learning approaches
  Concept identification
  Form identification
Conclusion
Concept identification
Goal: identify domain concepts and their members, e.g., City = {Pittsburgh, Boston, Austin, ...}, Month = {January, February, March, ...}
Approach: a word clustering algorithm that identifies concept words and groups similar ones into the same cluster
Word clustering algorithms
Use word co-occurrence statistics: mutual information (MI-based), Kullback-Leibler distance (KL-based)
Iterative algorithms need a stopping criterion; use information that is available during the clustering process: mutual information (MI-based), distance between clusters (KL-based), number of clusters
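As a rough illustration of the KL-based idea, one can compare the smoothed context-word distributions of two words; words that occur in similar contexts (e.g., two city names) get a small divergence. This is a sketch only: the window size, smoothing constant, and symmetric divergence are assumptions, not the thesis configuration.

```python
import math
from collections import Counter

def context_dist(word, tokens, window=2, alpha=1e-3):
    """Smoothed distribution of words seen within +/-window positions
    of each occurrence of `word` (add-alpha smoothing keeps the KL
    divergence finite)."""
    counts = Counter()
    for i, t in enumerate(tokens):
        if t == word:
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    counts[tokens[j]] += 1
    vocab = set(tokens)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def sym_kl(p, q):
    """Symmetric Kullback-Leibler divergence between two distributions
    defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) + q[w] * math.log(q[w] / p[w])
               for w in p)
```

A clustering loop would then repeatedly merge the pair of words (or clusters) with the smallest divergence, stopping when the inter-cluster distance rises sharply.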
Clustering evaluation
Allow more than one cluster to represent a concept, to discover as many concept words as possible; however, a clustering result that doesn't contain split concepts is preferred
Quality score (QS) = harmonic mean of precision (purity), recall (completeness), and singularity score (SS)
SS of concept_j = 1 / (number of clusters labeled as concept_j)
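The score combines its three components with a harmonic mean; a direct transcription in Python (the helper names are illustrative):

```python
def singularity_score(n_clusters_labeled_as_concept):
    """SS of a concept = 1 / (number of clusters labeled as that
    concept): 1.0 when the concept is not split across clusters."""
    return 1.0 / n_clusters_labeled_as_concept

def quality_score(precision, recall, ss):
    """QS = harmonic mean of precision (purity), recall (completeness),
    and singularity score (SS)."""
    return 3.0 / (1.0 / precision + 1.0 / recall + 1.0 / ss)
```

For example, a concept split across two clusters has SS = 0.5, which pulls QS down even when precision and recall are high, matching the stated preference for unsplit concepts.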
Concept clustering results

Algorithm  Precision  Recall  SS    QS    Max QS
MI-based   0.78       0.43    0.77  0.61  0.68
KL-based   0.86       0.60    0.70  0.70  0.71
Domain concepts can be identified with acceptable accuracy
Example clusters: {GATWICK, CINCINNATI, PHILADELPHIA, L.A., ATLANTA}, {HERTZ, BUDGET, THRIFTY}
Low recall for infrequent concepts
An automatic stopping criterion yields close-to-optimal results
Outline
Introduction
Structure of task-oriented conversations
Machine learning approaches
  Concept identification
  Form identification
Conclusion
Form identification
Goal: determine different types of forms and their associated slots
Approach:
1. Segment a dialog into a sequence of sub-tasks (dialog segmentation)
2. Group the sub-tasks associated with the same form type into a cluster (sub-task clustering)
3. Identify a set of slots associated with each form type (slot extraction)
Step 1: dialog segmentation
Goal: segment a dialog into a sequence of sub-tasks, equivalent to identifying sub-task boundaries
Approach:
TextTiling algorithm (Hearst, 1997), based on the lexical cohesion assumption (local context)
HMM-based segmentation algorithm, based on recurring patterns (global context): HMM states = topics (sub-tasks), transition probability = probabilities of topic shifts, emission probability = a state-specific language model
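The lexical-cohesion idea behind TextTiling can be sketched as follows: compare bag-of-words windows on either side of each candidate gap and place boundaries where similarity drops. The window size and the "take the minimum" boundary rule here are simplifications of Hearst's full algorithm:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two word-count vectors (Counters)."""
    num = sum(c * b.get(w, 0) for w, c in a.items())
    den = math.sqrt(sum(c * c for c in a.values())) * \
          math.sqrt(sum(c * c for c in b.values()))
    return num / den if den else 0.0

def cohesion_scores(tokens, w=10):
    """TextTiling-style scores: similarity of the w tokens before and
    after each candidate gap. Low scores suggest sub-task boundaries."""
    return [(gap, cosine(Counter(tokens[gap - w:gap]),
                         Counter(tokens[gap:gap + w])))
            for gap in range(w, len(tokens) - w + 1)]
```

On a toy dialog whose first half is flight talk and second half hotel talk, the lowest-scoring gap falls at the topic change.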
Modeling HMM states
HMM states = topics (sub-tasks)
Induced by clustering reference topics (Tür et al., 2001): needs annotated data
Utterance-based HMM (Barzilay and Lee, 2004): some utterances are very short
Induced by clustering predicted segments from TextTiling
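Decoding with such an HMM can be sketched with Viterbi over unigram state language models and a uniform topic-shift probability; both are simplifying assumptions, since the thesis induces the states and their language models from TextTiling output:

```python
import math

def viterbi_topics(tokens, state_lms, p_stay=0.9):
    """Assign each token a topic (sub-task) state by Viterbi decoding.
    state_lms: one {word: prob} unigram LM per state; unseen words get
    a small floor probability. Illustrative sketch only."""
    n = len(state_lms)
    p_move = (1.0 - p_stay) / (n - 1)

    def emit(s, w):
        return math.log(state_lms[s].get(w, 1e-6))

    # delta[s] = best log-probability of any path ending in state s
    delta = [emit(s, tokens[0]) - math.log(n) for s in range(n)]
    backptrs = []
    for w in tokens[1:]:
        new_delta, ptrs = [], []
        for s in range(n):
            def score(r):
                return delta[r] + math.log(p_stay if r == s else p_move)
            best = max(range(n), key=score)
            ptrs.append(best)
            new_delta.append(score(best) + emit(s, w))
        delta, backptrs = new_delta, backptrs + [ptrs]

    # Trace the best state sequence backwards
    path = [max(range(n), key=lambda s: delta[s])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return path[::-1]
```

Segment boundaries are then read off wherever the decoded state changes.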
Modifications for fine-grained segments in spoken dialogs
Average segment length: air travel domain = 84 words, map reading domain = 55 words (WSJ = 428, Broadcast News = 996)
Modifications include:
A data-driven stop word list that reflects the characteristics of spoken dialogs
A distance weight: higher weight for the context closer to a candidate boundary
Dialog segmentation experiment
Evaluation metrics:
Pk (Beeferman et al., 1999): a probabilistic error metric, sensitive to the value of k
Concept-based F-measure (C. F-1): F-measure (or F-1) is the harmonic mean of precision and recall; counts a near miss as a match if there is no concept in between
Incorporate concept information in the word token representation: a concept label plus its value -> [Airline]:northwest, or a concept label alone -> [Airline]
TextTiling results
Augmented TextTiling is significantly better than the baseline
Algorithm               Air Travel        Map Reading
                        Pk     C. F-1     Pk     C. F-1
TextTiling (baseline)   0.387  0.621      0.412  0.396
TextTiling (augmented)  0.371  0.712      0.384  0.464
58
HMM-based segmentation results
Inducing HMM states from predicted segments is better than inducing them from utterances
An abstract concept representation yields better results, especially in the map reading domain
HMM-based segmentation is significantly better than TextTiling in the map reading domain
Algorithm                    Air Travel        Map Reading
                             Pk     C. F-1     Pk     C. F-1
HMM-based (utterance)        0.398  0.624      0.392  0.436
HMM-based (segment)          0.385  0.698      0.355  0.507
HMM-based (segment + label)  0.386  0.706      0.250  0.686
TextTiling (augmented)       0.371  0.712      0.384  0.464
Segmentation error analysis
The TextTiling algorithm performs better on consecutive sub-tasks of the same type
The HMM-based algorithm performs better on very fine-grained segments (only 2-3 utterances long), as in the map reading domain
Step 2: sub-task clustering
Approach: bisecting K-means clustering algorithm; incorporate concept information in the word token representation
Evaluation metrics: similar to concept clustering
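Bisecting K-means builds k clusters by repeatedly applying a 2-way split to the largest cluster. A dense-vector sketch; the thesis clusters bag-of-words segment vectors, and the Euclidean distance and seeding here are simplifying assumptions:

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two dense vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(p[i] for p in points) / len(points)
            for i in range(len(points[0]))]

def kmeans2(points, iters=20, seed=0):
    """One 2-way K-means split of a list of points."""
    c1, c2 = random.Random(seed).sample(points, 2)
    g1, g2 = points, []
    for _ in range(iters):
        g1 = [p for p in points if sqdist(p, c1) <= sqdist(p, c2)]
        g2 = [p for p in points if sqdist(p, c1) > sqdist(p, c2)]
        if g1:
            c1 = mean(g1)
        if g2:
            c2 = mean(g2)
    return g1, g2

def bisecting_kmeans(points, k):
    """Repeatedly split the largest cluster until there are k clusters."""
    clusters = [points]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)
        g1, g2 = kmeans2(clusters.pop(0))
        clusters += [g1, g2]
    return clusters
```

Splitting only the largest cluster at each step is what distinguishes the bisecting variant from plain K-means and tends to give more balanced clusters on skewed data.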
Sub-task clustering results
Inaccurate segment boundaries affect clustering performance, but don't affect frequent sub-tasks much; missing boundaries are more problematic than false alarms
An abstract concept representation yields better results, with more improvement in the map reading domain, even better than using reference segments: an appropriate feature representation matters more than accurate segment boundaries
Concept Word Representation             Air Travel  Map Reading
concept label + value (oracle segment)  0.738       0.791
concept label + value                   0.577       0.675
concept label                           0.601       0.823
Step 3: slot extraction
Goal: identify a set of slots associated with each form type
Approach: analyze the concepts contained in each cluster
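The concept analysis in this step amounts to counting concept labels across the segments assigned to one form type; frequent concepts become the form's slots. A sketch with an assumed data layout, not the thesis code:

```python
from collections import Counter

def extract_slots(segments):
    """segments: all sub-task segments clustered into one form type,
    each a list of (concept_label, value) pairs found in that segment.
    Returns concept labels sorted by frequency."""
    counts = Counter(label for seg in segments for label, _value in seg)
    return counts.most_common()
```

A frequency threshold would then separate genuine slots (e.g., Fare, City) from concepts that leak in through segmentation or clustering errors.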
Slot extraction results
Form: flight query
Airline (79), ArriveTimeMin (46), DepartTimeHour (40), DepartTimeMin (39), ArriveTimeHour (36), ArriveCity (27), FlightNumber (15), ArriveAirport (13), DepartCity (13), DepartTimePeriod (11)
Form: hotel query
Fare (75), City (36), HotelName (33), Area (28), ArriveDateMonth (14)
Form: car query
car_type (13), city (3), state (1)
Form: flight fare query
Fare (257), City (27), CarRentalCompany (17), HotelName (15), ArriveCity (14), AirlineCompany (11)
Concepts are sorted by frequency
Outline
Introduction
Structure of task-oriented conversations
Machine learning approaches
  Concept identification and clustering
  Form identification
Conclusion
Form-based dialog structure representation
Forms are a suitable domain-specific information representation according to these criteria:
Sufficiency: can account for 93% of dialog content
Generality (domain-independent): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains
Learnability: (human) can be applied reliably by other annotators in most cases; (machine) can be identified with acceptable accuracy using unsupervised machine learning approaches
Unsupervised learning approaches for inferring domain information
Require some modifications in order to learn the structure of a spoken dialog
Can identify the components of the form-based representation with acceptable accuracy: concept accuracy QS = 0.70; sub-task boundary accuracy F-1 = 0.71 (air travel), 0.69 (map reading); form type accuracy QS = 0.60 (air travel), 0.82 (map reading)
Can learn from inaccurate information if the number of errors is moderate: propagated errors don't affect frequent components much, and dialog structure acquisition doesn't require high learning accuracy
Conclusion
To represent a dialog for learning purposes, we based our representation on an observable structure
This observable representation:
Can be generalized to various types of task-oriented dialogs
Can be understood and applied by different annotators
Can be learned by an unsupervised learning approach
The results of this investigation can be applied to:
Acquiring domain knowledge in a new task
Exploring the structure of a dialog
Could potentially reduce human effort when developing a new dialog system
References (1)
N. Asher. 1993. Reference to Abstract Objects in Discourse. Dordrecht, the Netherlands: Kluwer Academic Publishers.
H. Aust, M. Oerder, F. Seide, and V. Steinbiss. 1995. The Philips automatic train timetable information system. Speech Communication, 17(3-4):249-262.
S. Bangalore, G. D. Fabbrizio, and A. Stent. 2006. Learning the Structure of Task-Driven Human-Human Dialogs. In Proceedings of COLING/ACL 2006. Sydney, Australia.
R. Barzilay and L. Lee. 2004. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. In HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120. Boston, MA.
D. Beeferman, A. Berger, and J. Lafferty. 1999. Statistical Models for Text Segmentation. Machine Learning, 34(1-3):177-210.
P. R. Cohen and C. R. Perrault. 1979. Elements of a plan-based theory of speech acts. Cognitive Science, 3:177-212.
A. Ferrieux and M. D. Sadek. 1994. An Efficient Data-Driven Model for Cooperative Spoken Dialogue. In Proceedings of ICSLP 1994. Yokohama, Japan.
B. J. Grosz and C. L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.
References (2)
H. Hardy, K. Baker, H. Bonneau-Maynard, L. Devillers, S. Rosset, and T. Strzalkowski. 2003. Semantic and Dialogic Annotation for Automated Multilingual Customer Service. In Proceedings of Eurospeech 2003. Geneva, Switzerland.
M. A. Hearst. 1997. TextTiling: segmenting text into multi-paragraph subtopic passages. Computational Linguistics, 23(1):33-64.
W. C. Mann and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243-281.
L. Polanyi. 1996. The Linguistic Structure of Discourse, Technical Report CSLI-96-200. Stanford CA, Center for the Study of Language and Information, Stanford University.
A. I. Rudnicky, E. Thayer, P. Constantinides, C. Tchou, R. Shern, K. Lenzo, X. W., and A. Oh. 1999. Creating natural dialogs in the Carnegie Mellon Communicator system. In Proceedings of Eurospeech 1999. Budapest, Hungary.
J. M. Sinclair and M. Coulthard. 1975. Towards an analysis of Discourse: The English used by teachers and pupils: Oxford University Press.
G. Tür, A. Stolcke, D. Hakkani-Tür, and E. Shriberg. 2001. Integrating prosodic and lexical cues for automatic topic segmentation. Computational Linguistics, 27(1):31-57.
N. Yankelovich. 1997. Using Natural Dialogs as the Basis for Speech Interface Design. In Susann Luperfoy (Ed.), Automated Spoken Dialog Systems. Cambridge, MA: MIT Press.