TRANSCRIPT
Cognitive User Interfaces: An Engineering Approach
Steve Young
Machine Intelligence Laboratory, Information Engineering Division
Cambridge University Engineering Department, Cambridge, UK
ICASSP Plenary April 2009 © Steve Young
Outline of Talk
Introduction: what is a cognitive user interface?
Example: a simple gesture-driven interface.
Human decision-making and planning.
Partially Observable MDPs – an intractable solution?
Scaling up: statistical spoken dialogue systems.
Conclusions and future work.
What is a cognitive user interface?
An interface which supports intelligent, efficient and robust interaction between a human and a machine:
• Capable of reasoning and inference
• Able to optimize communicative goals
• Able to adapt to changing environments
• Able to learn from experience
Example: A Simple Gesture-Driven User Interface
[Figure: a photo sorter driven by swipe gestures, with actions ScrollForward, ScrollBackward and DeletePhoto]
Interpreting the Input
[Figure: distributions P(angle) for the Backwards, Delete and Forwards gestures, separated by decision boundaries on the angle axis]
Pattern Classification
[Figure: class-conditional distributions P(angle) for G=forwards, G=delete and G=backwards, from which a confidence Conf(G=backwards) can be computed]
Flowchart-based Decision Making
[Flowchart: Gesture? → backwards → Confidence? → if >= Threshold, Move back; if < Threshold, Do Nothing]
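The flowchart logic above can be sketched in a few lines. This is a minimal sketch: the slide only shows the "backwards" branch, so the other gesture branches, the function names and the threshold value are illustrative assumptions.

```python
THRESHOLD = 0.8  # assumed value; the slide leaves the threshold unspecified

def decide(gesture: str, confidence: float) -> str:
    """Map a classified gesture to a system action, flowchart style."""
    if confidence < THRESHOLD:
        return "do-nothing"          # low confidence: ignore the input
    if gesture == "backwards":
        return "move-back"
    if gesture == "forwards":       # assumed branch, by analogy
        return "move-forward"
    if gesture == "delete":         # assumed branch, by analogy
        return "delete-photo"
    return "do-nothing"

print(decide("backwards", 0.9))  # -> move-back
print(decide("backwards", 0.5))  # -> do-nothing
```

The hard threshold is exactly what the next slide criticizes: all information about *how* confident the classifier was is discarded once the decision is made.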
What is missing?
No modeling of uncertainty
No tracking of belief in the user’s required goal
No quantifiable objectives, hence sub-optimal decision making
Modeling Uncertainty and Inference – Bayes’ Rule
Reverend Thomas Bayes (1702-1761)
P(belief | data) = P(data | belief) P(belief) / P(data)
[Bayesian network: old belief b(s) over state s_{t-1}, together with action a_{t-1} and observation o_t, is updated to new belief b'(s) over s_t by inference via Bayes' rule, eg to decide the action "move back?"]
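One belief-update step of this kind can be sketched as b'(s) ∝ P(o | s) Σ_s' P(s | s', a) b(s'). The sketch below assumes the photo-sorter's three goal states; the likelihood values and the static-goal transition model are invented for illustration.

```python
STATES = ["forwards", "backwards", "delete"]

def belief_update(b, likelihood, transition):
    """One step of b'(s) ∝ P(o|s) · Σ_s' P(s|s',a) b(s')."""
    new_b = {}
    for s in STATES:
        prior = sum(transition[(s, sp)] * b[sp] for sp in STATES)
        new_b[s] = likelihood[s] * prior
    z = sum(new_b.values())          # P(data), the normalizing constant
    return {s: p / z for s, p in new_b.items()}

b = {s: 1 / 3 for s in STATES}                       # uniform old belief
static_goal = {(s, sp): 1.0 if s == sp else 0.0      # goal assumed fixed
               for s in STATES for sp in STATES}
angle_likelihood = {"forwards": 0.1, "backwards": 0.7, "delete": 0.2}
b = belief_update(b, angle_likelihood, static_goal)
print(b["backwards"])  # close to 0.7, the dominant likelihood
```

With a uniform prior the posterior simply follows the likelihood; with a sharper prior or a non-trivial transition model the same code combines all three sources of evidence.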
Optimizing Decisions – Bellman’s Equation
Richard E Bellman (1920-1984)
V*(b) = max_a [ r(b,a) + γ Σ_{o'} P(o' | b,a) V*(b') ]
Reward = r(b_1,a_1) + r(b_2,a_2) + … + r(b_{T-1},a_{T-1}) + r(b_T,a_T)
Policy: π(b_t) = a_t
[Figure: POMDP unrolled over time, with hidden states s_1 … s_T, observations o_1 … o_T, beliefs b_1 … b_T and actions a_1 … a_T; the policy is optimized by Reinforcement Learning]
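A Bellman backup over beliefs can be made concrete on a toy problem. The sketch below is entirely invented for illustration: a two-state world, two "commit" actions that end the episode, and an "observe" action with an assumed sensor accuracy; value iteration runs on a discretized belief grid with linear interpolation.

```python
import numpy as np

GAMMA = 0.95
GRID = np.linspace(0.0, 1.0, 101)   # discretized beliefs b = P(s = left)
ACC = 0.8                           # assumed sensor accuracy

def updated_belief(b, o):
    """Bayes update of b = P(left) after observation o ('L' or 'R')."""
    like_l = ACC if o == "L" else 1 - ACC
    like_r = 1 - ACC if o == "L" else ACC
    return like_l * b / (like_l * b + like_r * (1 - b))

V = np.zeros_like(GRID)
for _ in range(200):                # value iteration to (near) convergence
    newV = np.empty_like(V)
    for i, b in enumerate(GRID):
        q_left = 2 * b - 1          # commit-left: +1 if correct, -1 if not
        q_right = 1 - 2 * b         # commit-right, symmetrically
        p_L = ACC * b + (1 - ACC) * (1 - b)      # P(o = 'L' | b, observe)
        q_obs = -0.1 + GAMMA * (
            p_L * np.interp(updated_belief(b, "L"), GRID, V)
            + (1 - p_L) * np.interp(updated_belief(b, "R"), GRID, V))
        newV[i] = max(q_left, q_right, q_obs)   # Bellman backup
    V = newV

print(V[50] > 0)  # at b = 0.5, paying to observe beats committing blindly
```

The optimal policy that falls out of V commits when the belief is confident and gathers more evidence when it is not, which is exactly the behavior the flowchart of slide 7 cannot express.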
Optimizing the Photo-Sorter
[Figure: the photo sorter, as before]
User’s Goal (states): { scroll-forward, scroll-backward, delete-photo }
System Action: { go-forward, go-back, do-delete, do-nothing }
Rewards: +1 for a correct go-forward, +1 for a correct go-back, +5 for a correct do-delete, 0 for do-nothing, -20 for an erroneous delete, all other actions -1
Iteratively optimize policy to maximize rewards …
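The reward scheme above can be encoded directly. The sketch below is a simplified, myopic version: it picks the action with the highest *immediate* expected reward under the current belief, whereas the system on the slide optimizes long-term reward by iterative policy optimization. The belief values are invented.

```python
GOALS = ["scroll-forward", "scroll-backward", "delete-photo"]
ACTIONS = ["go-forward", "go-back", "do-delete", "do-nothing"]
MATCH = {"go-forward": "scroll-forward",
         "go-back": "scroll-backward",
         "do-delete": "delete-photo"}
RIGHT = {"go-forward": 1, "go-back": 1, "do-delete": 5}

def reward(goal, action):
    if action == "do-nothing":
        return 0
    if MATCH[action] == goal:
        return RIGHT[action]
    return -20 if action == "do-delete" else -1   # wrong delete is costly

def greedy_action(belief):
    """Pick the action maximizing expected immediate reward."""
    def q(a):
        return sum(belief[g] * reward(g, a) for g in GOALS)
    return max(ACTIONS, key=q)

b = {"scroll-forward": 0.2, "scroll-backward": 0.1, "delete-photo": 0.7}
print(greedy_action(b))  # -> do-nothing: even at 70% belief, deleting is too risky
```

The example shows how the asymmetric rewards shape behavior: a 70% belief in delete-photo is not enough to risk the -20 penalty, so doing nothing (and implicitly waiting for more evidence) wins.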
Performance on the Photo-Sorting Task
[Plot: Reward vs Effective Error Rate (0%–50%) for the Flow-charted Policy, a Fixed Policy and Model, and an Adapted Policy and Model; the training point is marked]
Is Human Decision Making Bayesian?
Humans have brains so that they can move.
So how do humans plan movement? ….
A Simple Planning Task
[Figure: a reaching task with a prior over lateral shift and a visual observation]
Kording and Wolpert (Nature, 427, 2004)
Models for Estimating Target Location
[Figure: probability vs lateral shift (0–2 cm) showing the prior, the observation and the posterior; below, deviation from target vs true lateral shift under three candidate models: prior ignored, Bayesian, and minimum-error mapping]
Kording and Wolpert (Nature, 427, 2004)
Bayesian Model Selection in Human Vision
[Figure: an inventory of visual patterns, not visible to subjects; Train: “Watch these!”; Test: “Which is more familiar?”]
Orban, Fiser, Aslin, Lengyel (Proc Nat. Academy Science, 105, 2008)
Partially Observable Markov Decision Processes
• Belief represented by distributions over states and updated from observations by Bayesian inference
• Objectives defined by the accumulation of rewards
• Policy which maps beliefs into actions and which can be optimized by reinforcement learning
[Diagram: POMDP components: state s, observation o, action a, belief b, reward r(b,a), policy π(b)]
• Principled approach to handling uncertainty and planning
• Humans appear to use similar mechanisms
So what is the problem?
Scaling-up
Applying the POMDP framework in real-world user interfaces is not straightforward:
• The state and action sets are often very large. Real-time belief update is intractable.
• The mapping π(b) → a is extremely complex. Exact policy optimization is intractable.
Spoken Dialog Systems (SDS)
[Architecture diagram: User → Recognizer → Semantic Decoder → Dialog Control (consulting a Database) → Message Generator → Synthesizer → User; waveforms are converted to words, and words to dialog acts]
Example:
System: “Is that near the tower?” [confirm(near=tower)]
User: “No, it is near the castle.” [negate(near=castle)]
Architecture of the Hidden Information State System
[Architecture diagram: the user’s speech is decoded by Speech Understanding into an observation o; Belief Update maintains the POMDP belief b(s) over states s; a Heuristic Mapping projects b into a summary-space belief b̂; the Dialog Policy chooses a summary action â, which is mapped back to a full action a and rendered by Speech Generation]
Williams and Young (CSL 2007); Young et al (ICASSP 2007)
Two key ideas:
• States are grouped into equivalence classes called partitions, and belief updating is applied to partitions rather than states
• Belief space is mapped into a much simpler summary space for policy implementation and optimization
The HIS Belief Space
Each state is composed of three factors, s = (g, u, h): the user goal, the user act and the dialog history. Young et al (CSL 2009)
User goals are grouped into partitions, eg the partition find(venue) covers find(venue(hotel,area=east)), find(venue(bar,area=east)), find(venue(hotel,area=west)), ….
[Diagram: the HIS belief space is the product of goal partitions × possible user acts ũ_1 … ũ_N × dialog history states (Initial, UserRequest, UserInformed, SystemInformed, Queried, Grounded, Denied)]
Belief update is limited to the most likely members of this set.
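The partition mechanism can be sketched as follows. This is a heavily simplified illustration: partitions are represented only by slot constraints, the split fraction is an arbitrary stand-in for the real goal prior, and (unlike the real system) the remainder partition here simply keeps the old constraints rather than explicitly excluding the new value.

```python
class Partition:
    def __init__(self, constraints, belief):
        self.constraints = dict(constraints)  # eg {'venue': 'hotel'}
        self.belief = belief

def split(partitions, slot, value, prior_frac=0.5):
    """Split every partition that says nothing about `slot` into the part
    consistent with slot=value and the remainder (mass conserved)."""
    out = []
    for p in partitions:
        if slot in p.constraints:
            out.append(p)                     # already specific: no split
            continue
        yes = dict(p.constraints, **{slot: value})
        out.append(Partition(yes, p.belief * prior_frac))
        # remainder stands for "slot is something else"; belief mass only
        out.append(Partition(p.constraints, p.belief * (1 - prior_frac)))
    return out

ps = [Partition({}, 1.0)]            # single root partition: find(venue)
ps = split(ps, "venue", "hotel")     # user mentioned a hotel
ps = split(ps, "area", "east")       # ... in the east
print(len(ps))  # -> 4 partitions
```

The point of the scheme is that partitions are only split on demand, when some user act actually distinguishes their members, so belief update never touches the astronomically many fully specified goals.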
Master <-> Summary State Mapping
Master space is mapped into a reduced summary space:
[Diagram: the master belief b over goal partitions such as find(venue(hotel,area=east,near=Museum)), find(venue(bar,area=east,near=Museum)), find(venue(hotel,area=east)), find(venue(hotel,area=west)), find(venue(hotel)), etc is compressed (eg by vector quantization, VQ) into summary features b̂: P(top), P(Nxt), T12Same, TPStatus, THStatus, TUserAct, LastSA. The policy selects a summary act type â from {Greet, Bold Request, Tentative Request, Confirm, Offer, Inform, …}; a heuristic mapping then converts â back into a full master action a, eg confirm() becomes confirm(area=east)]
Learning with a Simulated User
Learning by interaction with real users is expensive/impractical. A solution is to use a simulated user, trained on real data.
[Diagram: a User Simulator (including an ASR error model) is trained on a Dialog Corpus and generates observations o; Belief Update produces b, the Heuristic Mapping yields the summary belief b̂, and the Dialog Policy outputs â; the policy is optimized by Q-Learning, taking a random exploratory action with probability P(random)]
Schatzmann et al (Knowledge Eng Review 2006)
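The training loop can be sketched on the earlier photo-sorter example. Everything here is an invented stand-in for the real system: the simulated user is just a random goal plus a noisy observation channel, the summary state is a coarse quantization of the tracked belief, and a one-step Q-learning update with epsilon-greedy exploration plays the role of the full reinforcement-learning algorithm.

```python
import random

random.seed(0)
GOALS = ["forward", "back", "delete"]
ACTIONS = ["go-forward", "go-back", "do-delete", "do-nothing"]
ERR = 0.2                       # assumed simulated-ASR error rate
EPS, ALPHA = 0.2, 0.1           # exploration rate, learning rate
Q = {}

def summary(belief):
    """Summary state: the top goal and its probability, coarsely quantized."""
    top = max(belief, key=belief.get)
    return (top, round(belief[top], 1))

def reward(goal, action):
    good = {"go-forward": "forward", "go-back": "back", "do-delete": "delete"}
    if action == "do-nothing":
        return 0
    if good[action] == goal:
        return 5 if action == "do-delete" else 1
    return -20 if action == "do-delete" else -1

for episode in range(5000):
    goal = random.choice(GOALS)                   # simulated user's goal
    belief = {g: 1 / 3 for g in GOALS}
    for turn in range(3):                         # three noisy observations
        obs = goal if random.random() > ERR else random.choice(GOALS)
        for g in GOALS:                           # Bayes update (assumed model)
            belief[g] *= (1 - ERR) if g == obs else ERR / 2
        z = sum(belief.values())
        belief = {g: p / z for g, p in belief.items()}
    s = summary(belief)
    qs = Q.setdefault(s, {a: 0.0 for a in ACTIONS})
    a = (random.choice(ACTIONS) if random.random() < EPS
         else max(qs, key=qs.get))                # epsilon-greedy action
    qs[a] += ALPHA * (reward(goal, a) - qs[a])    # one-step Q update
```

After training, confident summary states map to the matching action while uncertain ones learn to avoid the costly delete, all without a single real user in the loop.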
HIS Performance in Noise
[Plot: Success Rate (%) from 55 to 95 vs Error Rate (%) from 0 to 45, measured with a simulated user, comparing the HIS system, an MDP baseline and a hand-crafted (HDC) system]
Representing Beliefs
Beliefs in a spoken dialog system entail a large number of so-called slot variables, eg for tourist information:
P(venue, location, pricerange, foodtype, music, …)
Cardinality is huge and we cannot handle the full joint distribution.
In the HIS system, we threshold the joint distribution and just record the high-probability values; the partitions marginalize out all the unknowns:
P(venue=bar, location=central, music=jazz) = 0.32
P(venue=bar, location=central, music=blues) = 0.27
P(venue=bar, location=east, music=jazz) = 0.11
etc
But this is approximate, and belief update now depends on the assumption that the underlying user goal does not change.
An alternative is to model beliefs directly using dynamic Bayesian nets …
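The thresholding idea can be sketched directly on the values from the slide. The low-probability fourth hypothesis and the threshold value are invented; the first three probabilities are those shown above.

```python
def prune(dist, threshold=0.05):
    """Drop joint hypotheses below the threshold and renormalize the rest."""
    kept = {h: p for h, p in dist.items() if p >= threshold}
    z = sum(kept.values())
    return {h: p / z for h, p in kept.items()}

joint = {
    ("bar", "central", "jazz"): 0.32,
    ("bar", "central", "blues"): 0.27,
    ("bar", "east", "jazz"): 0.11,
    ("pub", "west", "rock"): 0.02,    # invented low-probability tail
}
pruned = prune(joint)
print(len(pruned))  # -> 3 surviving hypotheses
```

Renormalizing over the survivors is what makes the scheme approximate: the pruned mass is silently redistributed, which is exactly the weakness the DBN approach on the next slide addresses.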
Modeling Belief with Dynamic Bayesian Networks (DBNs)
Decompose the state into a DBN, retaining only essential conditional dependencies.
[DBN diagram: at time t the state factors g (user goal), u (user act) and h (dialog history) decompose into per-slot nodes, eg g_type (eg restaurant), g_food (eg chinese), u_type, u_food, h_type and h_food, conditioned on the system action a and generating the observation o; primed copies g'_type, g'_food, … represent time t+1]
Thomson et al (ICASSP, 2008)
Factor Graph for the Full Tourist Information System
Factor graphs are very large, even with minimal dependency modeling. Hence we
• need very efficient belief updating
• need to define policies directly on full belief networks
Bayesian Update of Dialog State (BUDS) System
Thomson et al (CSL 2009)
Belief update depends on message passing. The message from a factor f to a variable x is
μ_{f→x}(x) = Σ_{x_1,…,x_M} f(x, x_1, …, x_M) Π_m μ_{x_m→f}(x_m)
where the sum runs over all combinations of variable values.
[Diagram: factor f connected to variables x, x_1, …, x_M]
Grouping possible values into partitions (eg the values Fr, It, … of P(food) grouped into Z_1, Z_2, Z_3) greatly simplifies these summations.
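The factor-to-variable message above can be sketched directly. This is a generic sum-product implementation, not the BUDS code: the toy factor table, domains and incoming message are invented for illustration.

```python
from itertools import product

def factor_to_var_message(f, var_domains, incoming):
    """mu_{f->x}(x): f maps a full assignment tuple (x, x1, ..., xM) to a
    value; incoming[m] is the message mu_{xm->f} as a dict over xm's domain."""
    x_domain, rest = var_domains[0], var_domains[1:]
    msg = {}
    for x in x_domain:
        total = 0.0
        for assign in product(*rest):        # all combinations of values
            w = f((x,) + assign)
            for m, v in enumerate(assign):   # product of incoming messages
                w *= incoming[m][v]
            total += w
        msg[x] = total
    return msg

# Toy binary factor over (x, x1): agreement scores 1.0 on match, 0.1 otherwise
dom = [["a", "b"], ["a", "b"]]
f = lambda xs: 1.0 if xs[0] == xs[1] else 0.1
inc = [{"a": 0.9, "b": 0.1}]
print(factor_to_var_message(f, dom, inc))
```

The cost of the inner loop is exponential in the number of connected variables, which is why grouping values into partitions, so that whole blocks of terms collapse into one, matters so much at this scale.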
Belief Propagation Times
[Plot: belief-update time vs network branching factor for standard LBP, LBP with grouping, and LBP with grouping and a constant probability of change]
Policy Optimization in the BUDS System
Thomson et al (ICASSP, 2008)
Summary space now depends on forming a simple characterization of each individual slot.
Define the policy as a parametric function
π(a | b, θ) = e^{θ·φ_a(b)} / Σ_{a'} e^{θ·φ_{a'}(b)}
and optimize wrt θ using the Natural Actor-Critic algorithm.
Each action-dependent basis function is separated out into slot-based components, eg
φ_a(b) = [ φ_{a,*}(b)^T, φ_{a,food}(b)^T, φ_{a,location}(b)^T, … ]^T
[Figure: each component combines an action indicator function (eg [0 1 0 0 0]) with a quantization of the slot belief into the probabilities of the 1st value, the 2nd value and the rest, eg 1.0 0.0 0.0; 0.8 0.2 0.0; 0.6 0.4 0.0; 0.4 0.4 0.2; 0.3 0.3 0.4]
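The softmax policy with slot-based basis functions can be sketched as follows. The actions, slots, belief values and (zero) weights are illustrative assumptions; the real system learns θ with Natural Actor-Critic, which is not shown here.

```python
import math

ACTIONS = ["request-food", "confirm-food", "inform"]
SLOTS = ["food", "location"]

def slot_features(belief_slot):
    """Quantize one slot's belief into P(1st), P(2nd), P(rest)."""
    probs = sorted(belief_slot.values(), reverse=True)
    top = probs[0]
    second = probs[1] if len(probs) > 1 else 0.0
    return [top, second, max(0.0, 1.0 - top - second)]

def features(action, belief):
    """phi_a(b): one slot-feature block per action, gated by an indicator."""
    phi = []
    for a in ACTIONS:
        for slot in SLOTS:
            block = slot_features(belief[slot]) if a == action else [0.0] * 3
            phi.extend(block)
    return phi

def policy(belief, theta):
    """Softmax over actions: pi(a|b,theta) proportional to exp(theta.phi_a(b))."""
    scores = {a: math.exp(sum(t * f for t, f in zip(theta, features(a, belief))))
              for a in ACTIONS}
    z = sum(scores.values())
    return {a: s / z for a, s in scores.items()}

belief = {"food": {"french": 0.8, "italian": 0.2},
          "location": {"east": 0.5, "west": 0.5}}
theta = [0.0] * (len(ACTIONS) * len(SLOTS) * 3)
print(policy(belief, theta)["inform"])  # zero weights give a uniform 1/3
```

Because each basis block only sees a quantized summary of one slot's belief, the parameter count grows linearly in the number of slots rather than with the full joint belief, which is what makes gradient-based optimization feasible at this scale.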
BUDS Performance in Noise
[Plot: Average Reward vs Error Rate (%), measured with a simulated user, comparing the BUDS and MDP systems]
Conclusions
Future generations of intelligent systems and agents will need robust, adaptive, cognitive human-computer interfaces.
Bayesian belief tracking and automatic strategy optimization provide the mathematical foundations.
Human evolution seems to have come to the same conclusion.
Early results are promising but research is needed
a) to develop scalable solutions which can handle very large networks in real time
b) to incorporate more detailed linguistic capabilities
c) to understand how to integrate different modalities: speech, gesture, emotion, etc
d) to understand how to migrate these approaches into industrial systems.
Credits
EU FP7 Project: Computational Learning in Adaptive Systems for Spoken Conversation
Spoken Dialogue Management using Partially Observable Markov Decision Processes
Past and Present Members of the CUED Dialogue Systems Group
Milica Gasic, Filip Jurcicek, Simon Keizer, Fabrice Lefevre, Francois Mairesse, Jost Schatzmann, Matt Stuttle, Blaise Thomson, Karl Weilhammer, Jason Williams, Hui Ye, Kai Yu
Colleagues in the CUED Information Engineering Division
Bill Byrne, Mark Gales, Zoubin Ghahramani, Mate Lengyel, Daniel Wolpert, Phil Woodland