Learning through Interactive Behavior Specifications
Tolga Konik, CSLI, Stanford University
Douglas Pearson, Three Penny Software
John Laird, University of Michigan
Goal
Automatically generate cognitive agents
Reduce the cost of agent development
Reduce the expertise required to develop agents.
Domains
Autonomous cognitive agents
Dynamic virtual worlds
Real-time decisions based on knowledge and sensed data
Soar agent architecture
Learning by Observation
Approach: Observe expert behavior and learn to replicate it
Why?
We may want human-like agents
In complex domains, imitating humans may be easier than learning from scratch
Bottleneck in pure Learning by Observation
PROBLEM: You cannot observe the internal reasoning of the expert
SOLUTION:
Ask the expert for additional information (goal annotations)
Use additional knowledge sources (task and domain knowledge)
Learning by Observation
[Diagram: Expert, Interface, Environment, Learner, and Agent, with flows labeled Actions, Percepts, Goal annotations, and Additional Task Knowledge]
Learning by Observation
[Diagram: Agent, Interface, Environment]
ILP 2004
Machine Learning Journal (forthcoming)
Learning by Observation: Critic Mode
[Diagram: Agent, Interface, Environment, Learner, and the Expert acting as critic]
One Body, Two Minds
How and when to switch control?
How do the expert and the agent program communicate?
[Diagram: Agent, Expert, Interface, Environment]
Diagrammatic Behavior Specification
[Diagram: Expert, Redux, Environment, Learner, Agent]
Redux
Visual rule editing
Diagrammatic Behavior Specification
Goal Hierarchy
Task-performance knowledge is represented with a hierarchy of durative goals.
[Diagram: goal hierarchy with nodes Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(Door); map with rooms r1 to r4, doors d1 to d6, and items i3 and i4]
Goal Hierarchy
[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4; goal hierarchy with Get-item(i3) and Get-item-in-room(i3) instantiated; binding Item=i3]
Goal Hierarchy
[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4; goal hierarchy with Get-item(i3), Get-item-different-room(i3), and Go-to(d1) instantiated; bindings Item=i3, Door=d1]
Goal Hierarchy
[Diagram: map with rooms r1 to r4, doors d1 to d6, and items i3 and i4; goal hierarchy with Get-item(i3), Get-item-different-room(i3), and Go-through(d1) instantiated; binding Door=d1]
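
One way to picture the goal hierarchy and its bindings is as a tree of parameterized goal templates. The sketch below is illustrative only: the `Goal` class and `active_path` helper are hypothetical, and the parent/child links are an assumption, since the diagram's exact structure cannot be recovered from the transcript. Only the goal names and the bindings Item=i3 and Door=d1 come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A parameterized, durative goal template (hypothetical structure)."""
    name: str
    params: tuple = ()
    subgoals: list = field(default_factory=list)

    def instantiate(self, bindings):
        """Render the goal with each parameter replaced by its bound value."""
        args = ", ".join(str(bindings.get(p, p)) for p in self.params)
        return f"{self.name}({args})"


# Goal names are from the slides; the nesting is assumed for illustration.
go_to = Goal("Go-to", ("Door",))
go_through = Goal("Go-through", ("Door",))
in_room = Goal("Get-item-in-room", ("Item",))
different_room = Goal("Get-item-different-room", ("Item",), [go_to, go_through])
get_item = Goal("Get-item", ("Item",), [in_room, different_room])


def active_path(goals, bindings):
    """Instantiate a chain of active goals, from the top goal downward."""
    return [g.instantiate(bindings) for g in goals]


path = active_path([get_item, different_room, go_to], {"Item": "i3", "Door": "d1"})
print(path)  # ['Get-item(i3)', 'Get-item-different-room(i3)', 'Go-to(d1)']
```

Unbound parameters are left as variable names, which mirrors how the slides show Go-to(Door) before Door=d1 is chosen.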
Behavior Specification
The expert draws an initial abstract situation
The expert creates a scenario by selecting actions
[Diagram: Agent, Expert]
Goal Specification
Goals are explicitly selected
The agent contributes based on the current situation, the current goal, and its knowledge
[Diagram: Agent, Expert]
Goal Hierarchy
Learning by Observation perspective: captures the unobservable mental reasoning of the expert
Learning perspective: biases the hypothesis space; the "learn agent" problem is reduced to "learn goal selection and termination"
MI (mixed-initiative) perspective: information exchange between the expert and the agent
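
To make the reduction to goal selection and termination concrete, the sketch below drives a goal stack with two pieces of knowledge: a `select` function and a `terminate` function. Everything here is an assumption for illustration: the functions are hand-written stand-ins for what a learner would induce, the situation dictionary is a hypothetical encoding, and the map (door d1 leading from room r1 to r2, item i3 in r2) is invented. It is not the Soar implementation.

```python
def run_agent(situation, top_goal, select, terminate, execute, max_steps=50):
    """Drive a goal stack using goal selection and termination knowledge."""
    stack, trace = [top_goal], []
    for _ in range(max_steps):
        if not stack:
            break
        goal = stack[-1]
        if terminate(situation, goal):      # the durative goal has ended
            stack.pop()
            continue
        subgoal = select(situation, goal)   # choose a subgoal, if any applies
        if subgoal is None:                 # primitive goal: act in the world
            execute(situation, goal)
            trace.append(goal)
        else:
            stack.append(subgoal)
    return trace


def select(s, goal):
    """Stand-in for learned goal selection rules."""
    name, arg = goal
    if name == "Get-item":
        same_room = s["item_room"][arg] == s["agent_room"]
        return ("Get-item-in-room" if same_room else "Get-item-different-room", arg)
    if name == "Get-item-different-room":
        target = s["item_room"][arg]
        for door, (src, dst) in s["doors"].items():
            if src == s["agent_room"] and dst == target:
                return ("Go-through" if s["at_door"] == door else "Go-to", door)
    return None


def terminate(s, goal):
    """Stand-in for learned goal termination rules."""
    name, arg = goal
    if name in ("Get-item", "Get-item-in-room"):
        return arg in s["holding"]
    if name == "Get-item-different-room":
        return s["item_room"][arg] == s["agent_room"]
    if name == "Go-to":
        return s["at_door"] == arg
    if name == "Go-through":
        return s["at_door"] != arg
    return True


def execute(s, goal):
    """Apply the effect of a primitive goal to the toy situation."""
    name, arg = goal
    if name == "Go-to":
        s["at_door"] = arg
    elif name == "Go-through":
        s["agent_room"], s["at_door"] = s["doors"][arg][1], None
    elif name == "Get-item-in-room":
        s["holding"].add(arg)


situation = {
    "agent_room": "r1",
    "at_door": None,
    "holding": set(),
    "item_room": {"i3": "r2"},
    "doors": {"d1": ("r1", "r2")},   # assumed connectivity
}
trace = run_agent(situation, ("Get-item", "i3"), select, terminate, execute)
print(trace)
```

The control loop itself is fixed; only `select` and `terminate` carry task knowledge, which is why the learning problem narrows to those two functions.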
Relevant Knowledge Specification
The expert can mark important objects in a decision
[Diagram: Agent, Expert; label "Prepare food"]
Rich Behavior Trace
Expert-specified undesired actions and goals
Expert-rejected actions and goals of the approximately learned agent program
[Diagram label: Watch TV]
Rich Behavior Trace
Hypothetical actions and goals
Situation history: a tree structure of possible behaviors
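
A rich behavior trace of this kind could be stored as a tree in which each branch records a decision and whether the expert selected it, rejected it, or left it hypothetical. The sketch below is a minimal illustration under assumptions: the `TraceNode` class, the status labels, and the situation names `s0`, `s1`, `s2` are hypothetical, and treating "Prepare food" as selected and "Watch TV" as rejected is a guess based on the slide labels.

```python
from dataclasses import dataclass, field


@dataclass
class TraceNode:
    """One situation in the history tree (hypothetical structure)."""
    situation: str
    decision: str = ""         # action or goal that led to this situation
    status: str = "selected"   # "selected", "rejected", or "hypothetical"
    children: list = field(default_factory=list)

    def add(self, situation, decision, status="selected"):
        """Attach a child branch and return it."""
        child = TraceNode(situation, decision, status)
        self.children.append(child)
        return child


def examples(node):
    """Yield (situation, decision, is_positive) for selected and rejected branches."""
    for child in node.children:
        if child.status in ("selected", "rejected"):
            yield (node.situation, child.decision, child.status == "selected")
        yield from examples(child)


root = TraceNode("s0")
s1 = root.add("s1", "Prepare food")                    # assumed selected
root.add("s1-alt", "Watch TV", status="rejected")      # assumed rejected
s1.add("s2", "Go-to(d1)", status="hypothetical")       # not yet judged

labeled = list(examples(root))
print(labeled)  # [('s0', 'Prepare food', True), ('s0', 'Watch TV', False)]
```

Selections become positive examples and rejections negative ones; hypothetical branches stay in the tree but contribute no label until the expert judges them.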
Relational Learning by Observation
Input:
Relational situations
Goal and action selections and rejections
Additional annotations (i.e., important objects)
Background knowledge
Output: Rule-based agent program
Learn goal/action selection and termination, generalizing over multiple examples
Inductive Logic Programming to combine rich knowledge structures
Relational Learning by Observation
Relational Learning by Observation
Find the common structures in the decision examples
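
As a rough illustration of finding common structure, the sketch below replaces the constants in each decision example with role variables and intersects the resulting literals. This is a deliberate simplification and not the ILP procedure used in the work: real ILP systems search a much larger hypothesis space and introduce new variables on their own. The predicate names, the two example situations, and the assumption that the roles come from goal parameters plus expert-marked important objects are all hypothetical.

```python
def variabilize(facts, roles):
    """Replace role constants with variable names; drop facts that mention other constants."""
    const_to_var = {const: var for var, const in roles.items()}
    literals = set()
    for pred, *args in facts:
        if all(a in const_to_var for a in args):
            literals.add((pred, *[const_to_var[a] for a in args]))
    return literals


def common_structure(decision_examples):
    """Intersect the variabilized literals of all positive decision examples."""
    return set.intersection(*(variabilize(f, r) for f, r in decision_examples))


# Two hypothetical examples of the expert selecting a door.
example_1 = (
    {("agent_in", "r1"), ("door_in", "d1", "r1"), ("leads_to", "d1", "r2"),
     ("contains", "r2", "i3"), ("door_in", "d2", "r1"), ("open", "d1")},
    {"Room": "r1", "Door": "d1", "NextRoom": "r2", "Item": "i3"},
)
example_2 = (
    {("agent_in", "r3"), ("door_in", "d5", "r3"), ("leads_to", "d5", "r4"),
     ("contains", "r4", "i4")},
    {"Room": "r3", "Door": "d5", "NextRoom": "r4", "Item": "i4"},
)

rule_body = sorted(common_structure([example_1, example_2]))
for literal in rule_body:
    print(literal)
```

The incidental fact `open(d1)` appears in only one example, so it drops out of the intersection, while the literals linking the door, the rooms, and the item survive.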
Relational Learning by Observation
"Select a door in the current room, which leads to a room that contains the item the agent wants to get"
Learn relations between what the agent wants, perceives, and knows.
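
The quoted rule can be read as a conjunctive query over relational facts. The sketch below is one possible encoding, assuming hypothetical predicates `agent_in`, `door_in`, `leads_to`, and `contains`, and an invented map; the actual system outputs rules for the Soar architecture, not Python.

```python
def select_doors(facts, wanted_item):
    """Return doors in the agent's room that lead to a room containing the wanted item."""
    facts = set(facts)
    current_rooms = {args[0] for pred, *args in facts if pred == "agent_in"}
    doors = set()
    for pred, *args in facts:
        if pred != "door_in":
            continue
        door, room = args
        if room not in current_rooms:        # the door must be in the current room
            continue
        for pred2, *args2 in facts:
            # follow the door to its next room and check what that room contains
            if (pred2 == "leads_to" and args2[0] == door
                    and ("contains", args2[1], wanted_item) in facts):
                doors.add(door)
    return doors


# Hypothetical situation: what the agent perceives and knows.
facts = {
    ("agent_in", "r1"),
    ("door_in", "d1", "r1"), ("leads_to", "d1", "r2"),
    ("door_in", "d2", "r1"), ("leads_to", "d2", "r3"),
    ("door_in", "d5", "r3"), ("leads_to", "d5", "r4"),
    ("contains", "r2", "i3"), ("contains", "r3", "i4"),
}

print(select_doors(facts, "i3"))  # {'d1'}
print(select_doors(facts, "i4"))  # {'d2'}
```

The wanted item comes from the active goal, the agent's room from perception, and the door-to-room links from knowledge of the map, which is the combination of wants, percepts, and knowledge the slide describes.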
Summary
Diagrammatic behavior specification approach:
To extract rich behavior knowledge
Interactive behavior specification
Communication medium between the agents (explicit goals and assumed situation)
Relational learning by observation approach to combine multiple complex knowledge sources
Future Work
Improve mixed-initiative interaction of the interface
Explore domain-independent diagrammatic interface features
Allow the expert to enter context-sensitive knowledge
Mixed initiative perspective
Interactive behavior specification
Diagrammatic representation of behavior
Communication medium between the agents
Explicit goals and desired behavior
Facilitates interaction between the agents