TRANSCRIPT
Presentation: The TRIPS System
Bogdan Stanescu
Feb 23 - Mar 01, 2004
IT 803 Spring 2004 - Mixed-Initiative Intelligent Systems - Prof. G. Tecuci
2
Presentation Outline
TRIPS Overview
Sample Application
TRIPS Architecture
  Abstract Problem Solving Model
  Architecture Overview
  Architecture Components
  The Parser Component (Details)
  The Task Manager (Details)
  The Interpretation Manager (Details)
Other Mixed-Initiative Issues
TRIPS Evaluation
Lessons Learned
3
TRIPS Overview
TRIPS is a mixed-initiative planning system designed to be easily ported to new domains
TRIPS is the successor of the TRAINS system
Supports the construction of more complex plans than TRAINS
Has a more complex model of mixed-initiative problem solving than TRAINS
Developed at the University of Rochester, Department of Computer Science
4
TRIPS - The Task
TRIPS acts as an assistant to the human user in constructing a plan
TRIPS has been successfully ported to several domains such as:
Pacifica: the user collaborates with TRIPS to construct a plan in a crisis situation involving the island of Pacifica (typically planning an evacuation of the population due to an approaching hurricane)
Airlift: the user collaborates with TRIPS to schedule plans for airlift operations (e.g. specify the aircraft resources, routes, and ports of refueling) given a set of requirements
TRIPS-991: the user acts as an emergency dispatcher, cooperating with TRIPS to dynamically allocate resources (e.g. ambulances, fire trucks) and make plans for solving problems as they arise in Monroe County, NY
Underwater Survey: the user collaborates with TRIPS to control three robots for locating mines in an underwater area
5
TRIPS - Mixed-Initiative System
TRIPS follows our class definition of mixed-initiative intelligent system: it is a collaborative multi-agent system where the component agents work together to achieve a common goal, in a way that takes advantage of their complementary capabilities
The user and TRIPS collaborate in creating a plan
TRIPS has a multi-agent architecture, where the agents collaborate in achieving the goal of the system
6
TRIPS - Integrated AI System
Integration between AI components:
Speech recognition
Natural language understanding
Discourse processing
Planning
Natural language generation
Speech generation
Advantage of an integrated system: how confident can you be in the evaluation of an AI module (such as NLU or planning) if it is not evaluated as part of an integrated system?
8
The Pacifica Domain
Scenario: The island of Pacifica is in the path of an approaching hurricane and the population must be evacuated to one of the cities (the "safe" city)
The user's task is to plan the evacuation, using a variety of vehicles
There are constraints on vehicle capabilities, total time, cost, and weather-related factors
9
Demo Transcript
0:00 H: Hello.
0:02 S: Hi there.
0:03 H: Show me a map of Pacifica.
0:07 S: Ok. Displays large map of the island of Pacifica
0:08 H: Where are the people?
0:11 S: There are two groups of people at Exodus, two at Calypso, two at Barnacle, and one group of people at Abyss. Displays a bar chart showing the numbers of people in the cities. This chart is updated as the conversation progresses to show the state of the world given the plan under consideration.
0:17 H: And where are the transports?
0:19 S: There are two trucks and one helicopter at Delta. Displays a chart showing vehicle status and contents (also updated through the conversation)
[Ferguson and Allen, 1998] page 3, Copyright © 1998 American Association for Artificial Intelligence (www.aaai.org)
10
Demo Transcript
0:24 H: Use a truck to get the people from Calypso to Delta.
0:31 S: Your wish is my command. Displays a construction window with a timeline view of the plan under development, with the vehicles on the vertical axis and time horizontally. The map is also updated to show the route taken to Calypso and back.
0:34 H: How long will that take?
0:37 S: It will take six hours and forty-two minutes.
0:43 H: What if we went along the coast instead?
0:47 S: That option would take ten hours and twenty-seven minutes. Displays another construction window showing the option alongside the original plan.
0:53 H: Forget it.
0:55 S: Ok. Deletes second construction window from the display.
[Ferguson and Allen, 1998] page 3
11
Demo Transcript
1:05 H: Use the other truck to get the people from Exodus to Delta.
1:13 S: Ok. Adds the new task to the original construction window, using a different color, and updates the map. In this case, the task is clearly seen to take more than ten hours.
1:22 H: Let's use the helicopter instead.
1:34 S: Ok. Changes the construction window view of this task and updates the map display.
1:38 H: Now send truck one to Barnacle.
1:41 S: Your wish is my command. Adds a third task to the construction window (in a third color) and updates the map.
1:44 H: Pick up the people there.
1:48 S: Ok. Updates displays
1:50 H: Go on to Abyss.
1:54 S: Ok. Updates displays
1:57 H: And take the people there to Delta.
2:05 S: Ok. Updates displays
[Ferguson and Allen, 1998] page 3
12
Demo Transcript
2:10 H: That looks good.
2:12 S: Glad to be of service.
We now suppose that the manager learns of a change in road status from an external source, and informs the system so that the plan can be adjusted. Delta Bridge crosses the inlet between Delta and South Delta.
2:18 H: Delta bridge is out.
2:21 S: Ok. The highlighted actions may need to be revised. Flashes Delta Bridge in red, and highlights actions whose routes use the bridge in red in the construction window.
2:37 H: Send truck two along the coast instead.
2:39 S: Ok. Displays new route (to Calypso and back) on the map and updates the construction window (longer trips).
2:42 H: Have the other truck stop at South Delta instead.
3:02 S: Ok. Displays new route (along coast to Barnacle, then to Abyss, then to South Delta) and updates the construction window.
3:06 H: Unload the people there.
3:13 S: Ok. Updates displays
[Ferguson and Allen, 1998] page 3
13
Demo Transcript
3:18 H: And use the helicopter to get the people from South Delta to Delta.
4:03 S: No problem. Multiple flights with the helicopter are added to the third task in the construction window and the map is updated to show them.
[Ferguson and Allen, 1998] page 3
14
Demo
http://www.cs.rochester.edu/research/cisd/projects/trips/movies/TRIPS-98_v4.0/
15
Demo - Discussion
Did mixed-initiative help? Compare with the case where the entire task would have been done only by the human, or only by the system
What were the strengths and weaknesses of each of the participants (user and TRIPS system)?
17
Abstract Problem Solving Model
TRIPS uses an abstract problem solving model to represent the user and the system contributions to the task to be performed; these contributions are represented as actions operating on problem solving objects
Examples of problem solving objects: objectives, goals, resources, atomic actions, constraints
18
Example of an Objective
(RescuePersonFrame
  :goal (at-loc ?p ?h)
  :ling-desc (:id D1 :predicate Rescue :theme ?p :goal ?h :instrument ?v)
  :resources ((Vehicle ?v) (Hospital ?h))
  :input-vars ((Person ?p) (Ailment ?a) (Location ?l))
  :constraints ((Hospital ?h) (Person ?p) (Vehicle ?v) (Location ?l) (Ailment ?a)
                (has ?p ?a) (treats ?h ?a) (at-loc ?p ?l))
  :focused-solution S1
  :solutions ((:id S1
               :agreed-actions NIL
               :expected-actions (((move ?v ?l) NIL)
                                  ((load ?p ?v) ((at-loc ?p ?l) (at-loc ?v ?l)))
                                  ((move ?v ?h) ((in ?v ?p)))
                                  ((unload ?p ?v) ((in ?v ?p) (at-loc ?v ?h)))))))
[Blaylock, 2001] page 28
19
Components of an Objective
A goal
Linguistic information describing how to generate a speech act corresponding to the objective
A specification of the resources that should be used to achieve the goal (e.g. vehicles)
A set of constraints; more constraints can be added as the interaction progresses
A set of solutions (plans) to the goal that have been investigated; one of them can be the focused solution
The actions contained in a solution can be atomic or not, and they can be agreed upon (with the user) or not yet
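The components above could be sketched as Python dataclasses (a simplification for illustration only; TRIPS itself represents objectives as Lisp frames like the RescuePersonFrame example, and the names below are invented):

```python
from dataclasses import dataclass, field

@dataclass
class Solution:
    id: str
    agreed_actions: list = field(default_factory=list)    # actions the user has agreed to
    expected_actions: list = field(default_factory=list)  # (action, preconditions) pairs

@dataclass
class Objective:
    goal: tuple                    # e.g. ("at-loc", "?p", "?h")
    ling_desc: dict                # how to phrase a speech act about this objective
    resources: list                # e.g. [("Vehicle", "?v"), ("Hospital", "?h")]
    constraints: list              # grows as the interaction progresses
    solutions: list = field(default_factory=list)
    focused_solution: str = None   # id of the solution currently in focus

    def add_constraint(self, c):
        self.constraints.append(c)

    def focus(self, solution_id):
        assert any(s.id == solution_id for s in self.solutions)
        self.focused_solution = solution_id

# Mirroring part of the RescuePersonFrame example:
obj = Objective(goal=("at-loc", "?p", "?h"),
                ling_desc={"predicate": "Rescue", "theme": "?p"},
                resources=[("Vehicle", "?v"), ("Hospital", "?h")],
                constraints=[("treats", "?h", "?a")],
                solutions=[Solution(id="S1")])
obj.focus("S1")
```

The point of the sketch is the shape of the record, not the representation language: constraints accumulate and the focused solution shifts as the dialogue progresses.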
20
Abstract Problem Solving Model
Examples of problem solving actions:
create new objects (e.g. objectives, goals, resources, constraints)
focus on an objective
evaluate a solution (e.g. w.r.t. time)
compare two solutions
modify a solution
repair a solution
abandon an objective or solution
communication actions:
  describe an object
  communicate an action or action request
22
Architecture Overview
[TRIPS Home Page]
23
Architecture - KQML Message Passing
The components communicate by exchanging KQML messages using a central message-passing Input Manager (not shown in the figure)
KQML = Knowledge Query and Manipulation Language
Lisp-like syntax:
(tell :content (word "hello") :sender M :receiver S)
(request :content (kill P))
(achieve :content (killed P))
[Ferguson et al, 1996] pages 8,12
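To make the message format concrete, here is a minimal sketch (not TRIPS code) of how a KQML-style message could be read into a nested structure with a small s-expression parser:

```python
import re

def parse_sexp(text):
    """Parse a KQML-style s-expression into nested Python lists."""
    # tokens: parens, quoted strings, or bare atoms
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    def read(pos):
        items = []
        while pos < len(tokens):
            t = tokens[pos]
            if t == '(':
                sub, pos = read(pos + 1)   # recurse into a sub-expression
                items.append(sub)
            elif t == ')':
                return items, pos + 1
            else:
                items.append(t)
                pos += 1
        return items, pos
    return read(0)[0][0]

msg = parse_sexp('(tell :content (word "hello") :sender M :receiver S)')
# msg -> ['tell', ':content', ['word', '"hello"'], ':sender', 'M', ':receiver', 'S']
```

The performative (tell) comes first, followed by keyword/value pairs; a component can dispatch on the performative and look up the :content parameter.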
24
KQML - Standard Verbs
[Ferguson et al, 1996] page 8
25
KQML - Standard Parameters
[Ferguson et al, 1996] page 7
26
Architecture - Message Passing
The following is an example of a sequence of messages passed between the components as a result of a user action
27
Architecture - Message Passing
User: "Where are the ambulances?"
Speech Input Component: recognizes the words and sends a sequence of messages to the Parser, as they are recognized:
(tell :content (word "where" :uttnum 30 :index 1 :frame ... :score ...))
Parser: generates a direct speech act describing the input, containing a parse tree annotated with features such as:
the objects identified in the message (e.g. “ambulances”)
the lexical subject and object (even if missing)
[Ferguson et al, 1996] page 19
28
Architecture - Message Passing
Interpretation Manager: sends to the Discourse Context Agent a message indicating that the user has taken the discourse turn
Discourse Context Agent: forwards the message to the Generation Manager
Generation Manager: may cancel or delay an answer to a previous action
Interpretation Manager: asks the Task Manager if “ambulances” are considered resources in the application domain
Task Manager: answers affirmatively
29
Architecture - Message Passing
Interpretation Manager: sends to the Discourse Context Agent a message indicating the speech act to be recorded and a new discourse obligation of the system:
(introduce-obligation
  :id OBLIG1
  :who system
  :what (respond-to
          (wh-question :id UTT1
                       :who user
                       :what (at-loc (the-set ?x (type ?x ambulance))
                                     (wh-term ?l (type ?l location)))
                       :why (initiate PS1))))
[Allen et al, 2001] page 6, Copyright 2001 ACM 1581133251/01/0001
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation
30
Architecture - Message Passing
Interpretation Manager: sends the message to the Context Agent (continued)
Discourse Context Agent: forwards the discourse obligation to the Generation Manager
Generation Manager: plans to produce an acknowledgment and adopts the goal of answering, which can be satisfied when it receives the appropriate information
31
Architecture - Message Passing
Interpretation Manager: sends to the Behavioral Agent the interpretation of the input:
(tell :content (done (initiate :who user
                               :what (identify-resource :id PS1
                                                        :what (set-of ?x (type ?x ambulance))))))
[Allen et al, 2001] page 6
32
Architecture - Message Passing
The Behavioral Agent can decide to:
Fulfill the request: after communicating with the Task Manager it sends the information to the Generation Manager:
(identify-resource :who system
                   :what (and (at-loc amb-1 Rochester) ...)
                   :why (complete :who system :what PS1))
Request a clarification from the user, e.g. whether the user wants to identify all the ambulances that exist, or just the ambulances used in the plan
Answer with failure to complete the task
Delay the execution of the task, if more important tasks need to be performed (e.g. correct a crisis situation caused by a change in the external world)
[Allen et al, 2001] page 6
33
Architecture - Message Passing
For each possible decision, the Behavioral Agent sends to the Generation Manager a message that will help it satisfy the discourse obligation
When the discourse obligation is satisfied, the Generation Manager requests the Discourse Context Agent to remove it
34
Architecture - Message Passing
Question: What could be a typical way for the Generation Manager to satisfy a discourse obligation in each of these cases:
The Behavioral Agent (BA) fulfills the request
The BA requests a clarification
The BA answers with failure
The BA delays the task execution
35
Architecture - Message Passing
Answer:
The BA fulfills the request: the GM answers the request
The BA requests a clarification: the GM asks the clarification question (this satisfies the discourse obligation but not the problem solving task)
The BA answers with failure: the GM answers with failure (this satisfies the discourse obligation but not the problem solving task)
The BA delays the task execution: the GM can apologize and promise to address the issue later, or plan to apologize later (this satisfies the discourse obligation but not the problem solving task)
37
Speech Input Component
Performs speaker-independent speech recognition using a general-purpose speech recognizer (CMU Sphinx-II) with a language model customized for the application domain
The output is a set of word hypotheses which are sent to the parser as they are recognized. Example:
(tell :content (start :uttnum 30))
(tell :content (word "where" :uttnum 30 :index 1 :frame F :score S))
...
(tell :content (word "ambulances" :uttnum 30 :index 4 :frame F2 :score S2))
(tell :content (end :uttnum 30))
Can revise the previous recognition as more words arrive, and send a message to disregard the previous words. Example:
(tell :content (backto :uttnum 30 :index 3))
[Ferguson et al, 1996] pages 19, 20
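The incremental protocol could be sketched as a per-utterance buffer on the receiving side (illustrative code, not TRIPS; it assumes that backto with index n discards the hypotheses at index n and above, which is my reading of "disregard the previous words"):

```python
class UtteranceBuffer:
    """Collect incremental word hypotheses for one utterance, handling
    'backto' revisions from the speech recognizer."""
    def __init__(self, uttnum):
        self.uttnum = uttnum
        self.words = {}          # index -> (word, score)

    def word(self, index, word, score=1.0):
        self.words[index] = (word, score)

    def backto(self, index):
        # recognizer revised its hypothesis: drop words from `index` on (assumed semantics)
        self.words = {i: w for i, w in self.words.items() if i < index}

    def text(self):
        return " ".join(self.words[i][0] for i in sorted(self.words))

buf = UtteranceBuffer(30)
for i, w in enumerate(["where", "are", "the", "ambulances"], start=1):
    buf.word(i, w)
buf.backto(3)                 # recognizer retracts words 3 and 4
buf.word(3, "those")          # revised hypothesis
buf.word(4, "ambulances")
# buf.text() -> "where are those ambulances"
```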
38
GUI Input Components
Receive GUI user input, such as typed sentences or mouse clicks
39
Parser Component (More Later)
Receives inputs from the Speech Input and GUI Input components
Constructs plausible direct speech acts corresponding to the text and GUI input
40
Reference Component
Resolves the references in the dialogue
41
Discourse Context Agent
Manages all knowledge about the discourse:
Salient entities in the discourse, to support reference resolution and generation of anaphora
Structure and interpretation of the immediately preceding utterance, to support ellipsis resolution
Status of the discourse turn: whether it is assigned to one conversant or currently open
Discourse history consisting of the speech-act interpretations of the utterances in the conversation so far
Discourse obligations to respond to the other conversant’s last utterance; obligations may act as a stack during clarification sub-dialogues
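The stack-like behavior of obligations during clarification sub-dialogues might be sketched as follows (a minimal sketch with invented names; the real discourse context tracks much more state):

```python
class DiscourseContext:
    """Minimal sketch of obligation tracking: a clarification sub-dialogue
    pushes a new obligation on top, and it must be discharged before the
    obligation beneath it becomes current again."""
    def __init__(self):
        self.obligations = []    # stack: last pushed is most urgent
        self.turn = "open"       # 'user', 'system', or 'open'

    def introduce_obligation(self, who, what):
        self.obligations.append({"who": who, "what": what})

    def current_obligation(self):
        return self.obligations[-1] if self.obligations else None

    def discharge(self):
        return self.obligations.pop()

ctx = DiscourseContext()
ctx.introduce_obligation("system", "respond-to wh-question UTT1")
# the system needs a clarification before it can answer:
ctx.introduce_obligation("user", "respond-to clarification UTT2")
assert ctx.current_obligation()["what"].endswith("UTT2")
ctx.discharge()   # the user answers the clarification
# the original obligation (answering UTT1) is current again
```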
42
Interpretation Manager (More Later)
Interprets user input as it arises
Sends the interpretation of the recognized speech act to the Behavioral Agent
Incrementally updates the Discourse Context
43
Task Manager (More Later)
Is the only component that has domain-specific information about what each object is and how each operation is performed in the domain (the Interpretation Manager, Behavioral Agent, and Generation Manager use only the abstract problem solving model as the representation language)
Is the only component that interacts with the planner and scheduler, using domain-specific knowledge
44
Behavioral Agent
Plans system behavior based on:
Its own goals and obligations
The user's actions
Changes in the world state
Examples of initiative:
If the user initiates creating a new objective, the Behavioral Agent can adopt a new problem solving obligation to find a solution, ask the Task Manager to compute a tentative solution, and propose the solution to the user
If the Behavioral Agent receives a report that a bridge has been damaged, it can inform the user and/or adopt the goal to solve the problem without using the bridge
45
Behavioral Agent
Sends to the Generation Manager the actions that involve communication and collaboration with the user
46
Planner
A planner specific to the domain
47
Scheduler
Used to schedule plan actions (in case of domains where the plans do not contain a complete allocation of resources)
48
Monitors and Events
Inform the Behavioral Agent about facts and events from the outside world
Unanswered question: shouldn't the monitors and events communicate to the Task Manager instead?
49
Generation Manager
Receives problem solving goals requiring generation from the Behavioral Agent and discourse obligations from the Discourse Context
Produces sequences of speech acts and sends them to the Surface Generation Agent
50
Surface Generation Agent
Transforms speech acts received from the Generation Manager into surface forms to be sent to Speech Output and GUI Output
51
Speech Output Component
Synthesizes speech using a general-purpose speech synthesis engine (Entropics TrueTalk)
52
GUI Output components
Generate graphical output: maps, figures, etc.
54
Parser Component
Receives inputs from the Speech Input and GUI Input components
Constructs and sends a sequence of direct speech acts corresponding to the input
Input from the Speech Input Component can be noisy. Example:
User: “Okay now let's take the last train and go from Albany to Milwaukee.”
Speech Input Component: “Okay now I take the last train in go from Albany to is”
Parser:
CONFIRM "Okay"
TELL "Now I take the last train"
REQUEST "go from Albany"
[Ferguson et al, 1996] pages 34,36
55
Sample Speech Act
(SA-REQUEST
 :OBJECTS ((DESCRIPTION (STATUS NAME) (VAR V2242) (CLASS CITY) (LEX AVON) (SORT INDIVIDUAL))
           (DESCRIPTION (STATUS NAME) (VAR V2253) (CLASS CITY) (LEX BATH) (SORT INDIVIDUAL)))
 :PATHS ((PATH (VAR V2238) (CONSTRAINT (AND (FROM V2238 V2242) (TO V2238 V2253)))))
 :SEMANTICS (PROP (VAR V2231) (CLASS GO-BY-PATH)
                  (CONSTRAINT (AND (LSUBJ V2231 *YOU*) (LCOMP V2231 V2238))))
 :NOISE NIL
 :SOCIAL-CONTEXT (POLITE PLEASE)
 :RELIABILITY 100
 :MODE TEXT
 :SYNTAX ((SUBJECT . *YOU*) (OBJECT))
 :SETTING NIL
 :INPUT (GO FROM AVON TO BATH PLEASE))
[Ferguson et al, 1996] page 37
56
Speech Act Elements
The type of speech act (e.g. SA-REQUEST)
Objects mentioned in the utterance (e.g. the logical forms for "Avon" and "Bath")
Paths mentioned (e.g. the logical form for "from Avon to Bath")
Semantics capturing the full propositional content
Noise: words that were unknown to the parser
Social context: indication of politeness
Reliability: a score between 0 and 100 indicating the parser's confidence
Mode: whether the input was typed, spoken, or graphical (used for generating responses)
Syntax: parts of the semantics associated with syntactic constructs, such as the subject, object, etc.
Input: the input from which the speech act was constructed
57
Speech Act Types
[Ferguson et al, 1996] page 35
58
Speech Act Types
[Ferguson et al, 1996] page 35
59
Parser Component
In case of GUI input, the interpretation is based on ad-hoc rules
The grammar and lexicon are divided into domain-independent and domain-dependent pieces
60
Presentation Outline
TRIPS Overview Sample Application TRIPS Architecture
Abstract Problem Solving Model Architecture Overview Architecture Components The Parser Component (Details) The Task Manager (Details) The Interpretation Manager (Details)
Other Mixed-Initiative Issues TRIPS Evaluation Lessons Learned
61
The Task Manager
Has domain-specific information about what each object is and how each operation is performed in the domain
Is the only component that interacts with the planner and scheduler, using domain-specific knowledge
62
The Task Manager
Representation:
Hierarchical plan skeletons containing ways to decompose domain objectives into sub-objectives and ways to achieve solutions to objectives using elementary actions
The plan being constructed, containing information that allows:
  shifting the focus
  comparing alternative solutions
  undo/redo
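The undo/redo capability could be sketched with the classic two-stack pattern (illustrative only; the actual TRIPS plan representation is richer, and the names below are invented):

```python
class PlanUnderConstruction:
    """Sketch of a plan record supporting undo/redo via two stacks."""
    def __init__(self):
        self.actions = []    # the plan as built so far
        self._undone = []    # redo stack

    def add(self, action):
        self.actions.append(action)
        self._undone.clear()          # a new edit invalidates the redo history

    def undo(self):
        if self.actions:
            self._undone.append(self.actions.pop())

    def redo(self):
        if self._undone:
            self.actions.append(self._undone.pop())

plan = PlanUnderConstruction()
plan.add(("move", "truck1", "Calypso"))
plan.add(("load", "people", "truck1"))
plan.undo()
plan.redo()
# plan.actions -> [("move", "truck1", "Calypso"), ("load", "people", "truck1")]
```

This is what allows a dialogue move like "Forget it" in the Pacifica demo to cleanly retract the most recent plan modification.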
63
The Task Manager
Provides the following services to the Interpretation Manager and Behavioral Agent:
Answering queries about objects in the domain
Mapping between generic problem solving acts and domain specific actions
Helping the Interpretation Manager with intention recognition
64
The Task Manager
Answering queries about objects in the domain:
Is an "ambulance" a resource?
Which are the "send" actions in the plan?
65
The Task Manager
Mapping between problem solving acts and domain specific actions:
The Behavioral Agent sends to the Task Manager requests expressed using the abstract problem solving model (e.g. create a solution)
The Task Manager performs the domain-specific operations required by these requests (e.g. build a course of action to evacuate the city using two trucks)
The Task Manager acts like a translator that understands both languages: the abstract problem solving language and the domain-specific language of the planner. This enables TRIPS to use a complex planner without requiring extensions to the other modules (except perhaps the Task Manager itself)
66
The Task Manager
Helping the Interpretation Manager with intention recognition:
Given a problem solving act, return two scores depending on the current context:
The coherence (recognition) score: whether or not the problem solving act is something that "makes sense" in the context
The feasibility (answer) score: whether or not the requested action is possible in the context (i.e. all constraints are satisfied)
In order to evaluate each interpretation, the Task Manager may invoke the planner to perform the actual reasoning required
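The two-score filtering could be sketched as follows (the function names, thresholds, and the toy Task Manager are invented for illustration; in TRIPS the scores come from the Task Manager, possibly after invoking the planner):

```python
def rank_interpretations(candidates, score_fn, min_coherence=0.5):
    """Drop interpretations that make no sense in the current context,
    then rank the rest by combined coherence and feasibility.
    `score_fn` stands in for the Task Manager's scoring service."""
    scored = []
    for interp in candidates:
        coherence, feasibility = score_fn(interp)
        if coherence >= min_coherence:            # recognition filter
            scored.append((coherence + feasibility, interp))
    return [interp for _, interp in sorted(scored, reverse=True)]

# Toy Task Manager scores for "The bridge is blocked" (made-up numbers):
def toy_scores(interp):
    return {"replan-around-bridge":   (0.9, 0.8),
            "add-goal-reopen-bridge": (0.6, 0.7),
            "greeting":               (0.1, 1.0)}[interp]

best = rank_interpretations(
    ["replan-around-bridge", "add-goal-reopen-bridge", "greeting"], toy_scores)
# best -> ["replan-around-bridge", "add-goal-reopen-bridge"]
```

A low coherence score drops a candidate outright, while a coherent but infeasible candidate survives ranking and can instead trigger an explanation of why the action cannot be performed.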
67
The Task Manager
Questions:
What might be the typical system response to a bad coherence score?
What might be the typical system response to a bad feasibility score?
68
The Task Manager
Answers:
Typical response to a bad coherence score:
  A clarification question to the user
  No immediate response; reason with the ambiguity, expecting future speech acts to clarify it
Typical response to a bad feasibility score:
  Informing the user why the requested action cannot be performed
70
The Interpretation Manager
The main task of the Interpretation Manager is to convert a direct speech act received from the parser into a problem-solving act to be sent to the Behavioral Agent
Calls the reference component to ground the referenced entities
71
The Interpretation Manager
Generates possible indirect speech acts based on linguistic (syntactic and semantic) patterns:
E.g. if the user utterance is "Let’s use the helicopter instead", the interpretations are of the form "replace something in the plan with the helicopter". Because a truck was used in the last subtask, one possible interpretation is to replace it with the helicopter
E.g. if the user utterance is "The bridge over the Genesee is blocked", two possible indirect interpretations are:
Initiate re-planning given the new fact
Add a new goal to reopen the bridge
72
The Interpretation Manager
Filters the interpretations by using the Task Manager to score their coherence and feasibility
Question: in which situation would each of the interpretations above make sense? (Initiate re-planning vs. add a new goal)
73
The Interpretation Manager
Filters the interpretations by using the Task Manager to score their coherence and feasibility
Question: in which situation would each of the interpretations above make sense? (Initiate re-planning vs. add a new goal)
Answer:
  Re-planning: when the current plan uses the bridge
  Adding a new goal: when the user has just started a new session with the system
74
The Interpretation Manager
Identifies new discourse obligations and sends them to the discourse context
Identifies discourse obligations fulfilled by the user and removes them from the discourse context
76
The Control Issue
The user is in control most of the time, and typically the system responds to the last user utterance
The system can take the initiative:
To request clarifications (question: is this initiative?)
To propose a tentative solution to an objective
To create a new objective when the world state is changed
To inform the user when a change in the world state is detected
77
The Awareness Issue
The system uses speech to inform the user about the initiatives it takes; this makes it easy for the user to notice the system initiatives
The system also has a domain-dependent GUI that shows a summary of the current state of the plan (remember the demo of the Pacifica domain)
I did not find other means of addressing the awareness issue in the papers I investigated
[Blaylock, 2001] (page 29) explains that in future versions of TRIPS the user will be able to query the system about its state
78
Presentation Outline
TRIPS Overview Sample Application TRIPS Architecture
Abstract Problem Solving Model Architecture Overview Architecture Components The Parser Component (Details) The Task Manager (Details) The Interpretation Manager (Details)
Other Mixed-Initiative Issues TRIPS Evaluation Lessons Learned
79
Evaluation: TRAINS-96
I did not find an evaluation of the TRIPS system. The most recent evaluation I found was done for the TRAINS-96 system, the precursor of TRIPS
16 subjects, each used the system for 1 hour on average
5 tasks: in each of them, 3 trains have to be moved to 3 cities, given 3 cities experiencing delays
Subjects were shown a short tutorial movie
Each subject solved the tasks in a different order (i.e. the first subject started with task 1, the second subject with task 2, etc.)
The first task solved by each subject was not evaluated (it was considered a practice session)
After performing the tasks, each subject answered a questionnaire
80
TRAINS-96 Evaluation Results
Out of the 16*(5-1)=64 tasks:
In 42 of them the subjects met the goal (i.e. successfully completed the task to move the trains)
In 4 of them the subjects thought they had met the goal but didn't
In 7 of them the subjects met the goal at some point, but in the final configuration the goal was not met
In 6 of them the goal was never met
In 5 of them the system crashed
3 of them were restarted because the system crashed very early
[Stent and Allen, 1997] page 13
81
TRAINS-96 Evaluation Results
The experiment also tested whether 2 system components had a correlation with the time to achieve a solution and the efficiency of the achieved solution:
A component for enhancing the parser robustness
A component for displaying speech feedback
The conclusion was that there was no statistical difference with/without these components, due to the high variance of the results (presumably the experiment size was too small)
Question: how could natural language components such as parser robustness or speech feedback have a correlation with the efficiency of the achieved solution?
82
TRAINS-96 Evaluation Results
Question: how could natural language components such as parser robustness or speech feedback have a correlation with the efficiency of the achieved solution?
Answer: they should not have a direct correlation; however, they do have an indirect one: they are correlated with the time to achieve an initial solution to the task, which is correlated with the efficiency of the final solution (if a subject achieves an initial solution very fast, he/she is more likely to spend time trying to improve it, thus achieving a better final solution)
83
TRAINS-96 Evaluation Questionnaire
The subjects were asked to rate how much they blamed three parts of the system for the difficulties experienced in completing the tasks (1 = smallest blame, 10 = highest blame):
speech recognition
language understanding
route planning
[Stent and Allen, 1997] page 17
85
Lessons Learned
Multiple input methods processed by a single parser component
Speech acts contain a parse tree annotated with features such as missing lexical subjects or recognized paths
Parse trees are constructed only for fragments of the input that can be parsed
Separation between:
  Reasoning about the interpretation of the user input (Interpretation Manager)
  Reasoning about the abstract tasks (Behavioral Agent)
  Reasoning about the domain tasks (Task Manager)
  Reasoning about the response planning (Generation Manager)
Separation between domain-independent components and domain-dependent components
Separation between linguistic and discourse knowledge
86
Bibliography
James Allen, George Ferguson, and Amanda Stent, "An architecture for more realistic conversational systems," in Proceedings of Intelligent User Interfaces 2001 (IUI-01), Santa Fe, NM, January 14-17, 2001. http://www.cs.rochester.edu/research/cisd/pubs/2001/allen-ferguson-stent-iui2001.pdf
Nate Blaylock. Retroactive recognition of interleaved plans for natural language dialogue. Technical Report 761, University of Rochester, Department of Computer Science, December 2001. http://www.cs.rochester.edu/research/cisd/pubs/2001/tr761.blaylock.pdf
TRIPS Project Home Page. http://www.cs.rochester.edu/research/cisd/projects/trips/
George Ferguson and James Allen, "TRIPS: An Intelligent Integrated Problem-Solving Assistant," in Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), Madison, WI, 26-30 July 1998, pp. 567-573. http://www.cs.rochester.edu/research/cisd/pubs/1998/ferguson-allen-aaai98.pdf
Amanda J. Stent and James F. Allen, TRAINS-96 System Evaluation, TRAINS Technical Note 97-1, Computer Science Dept., University of Rochester, March 1997. ftp://ftp.cs.rochester.edu/pub/papers/ai/97.tn1.TRAINS-96_system_evaluation.ps.gz
George M. Ferguson, James F. Allen, Brad W. Miller and Eric K. Ringger, The Design and Implementation of the TRAINS-96 System: A Prototype Mixed-Initiative Planning Assistant, TRAINS Technical Note 96-5, Computer Science Dept., University of Rochester, October 1996. ftp://ftp.cs.rochester.edu/pub/papers/ai/96.tn5.Design_and_implementation_of_TRAINS-96_system.ps.gz