TRANSCRIPT
Twenty Second Conference on Artificial Intelligence
AAAI 2007
Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces
Prashant Doshi, Dept. of Computer Science
University of Georgia
Speaker: Yifeng Zeng
Aalborg University, Denmark
State Estimation
[Figure: the physical state (Loc, Orient, ...) evolves from time t-1 to t; agent i takes action a_i^{t-1} and receives observation o_i^t]
Single agent setting – the POMDP belief update:
b_i^t(s^t) \propto O_i(s^t, a_i^{t-1}, o_i^t) \sum_{s^{t-1}} T_i(s^{t-1}, a_i^{t-1}, s^t) \, b_i^{t-1}(s^{t-1})
State Estimation
[Figure: the physical state (Loc, Orient, ...) evolves from time t-1 to t; agent i takes action a_i^{t-1} and observes o_i^t, while agent j takes action a_j^{t-1} and observes o_j^t]
Multiagent setting
Interactive state: IS_i = S \times M_j (see AAMAS'05)
State Estimation in Multiagent Settings
Ascribe intentional models (POMDPs) to other agents
Update the other agents' beliefs
Estimate the interactive state, IS_i
(See JAIR'05)
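For reference, the interactive belief update that the talk goes on to factor has roughly the following form, reconstructed from the I-POMDP definition in JAIR'05 (the notation is assumed from that paper, not transcribed from the slide); \tau denotes the other agent's own belief update:

    b_i^t(s^t, \theta_j^t) \propto \sum_{s^{t-1}, \theta_j^{t-1}} b_i^{t-1}(s^{t-1}, \theta_j^{t-1})
        \sum_{a_j^{t-1}} \Pr(a_j^{t-1} \mid \theta_j^{t-1}) \,
        T(s^{t-1}, a_i^{t-1}, a_j^{t-1}, s^t) \,
        O_i(s^t, a_i^{t-1}, a_j^{t-1}, o_i^t)
        \sum_{o_j^t} O_j(s^t, a_i^{t-1}, a_j^{t-1}, o_j^t) \,
        \tau(b_j^{t-1}, a_j^{t-1}, o_j^t, b_j^t)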
Previous Approach
Interactive particle filter (I-PF; see AAMAS'05, AAAI'05)
Generalizes PF to multiagent settings
Approximate simulation of the state estimation
Limitations of the I-PF
Large no. of particles needed even for small state spaces
Distributes particles over the physical state and model spaces
Poor performance when the physical state space is large or continuous
Factoring the State Estimation
Update the physical state space
Update other agent's model
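A sketch of what this factoring amounts to, assuming the standard Rao-Blackwell decomposition (the decomposition itself is inferred from the slide titles, not transcribed): the belief over the joint interactive state splits into a sampled part and an analytic part,

    b_i^t(s^t, m_j^t) \approx \sum_k w^{(k)} \, \delta(s^t - s^{t,(k)}) \, \Pr(m_j^t \mid s^{0:t,(k)})

where the physical-state trajectory s^{0:t,(k)} is represented by particles and \Pr(m_j^t \mid \cdot), the belief over the other agent's model, is carried in closed form per particle.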
Factoring the State Estimation
Sample particles from just the physical state space
Substitute the samples into the state estimation
Implement using a PF
Perform the remaining (model) update as exactly as possible
Rao-Blackwellisation of the I-PF
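A minimal sketch of one Rao-Blackwellised filtering step, using a toy scalar linear-Gaussian model; the dynamics, noise values, and update shapes below are illustrative assumptions, not the authors' implementation:

    import math
    import random

    def rb_ipf_step(particles, a_i, o_i, q=0.1, r=0.2):
        """One Rao-Blackwellised filtering step (illustrative sketch only).

        Each particle pairs a sampled physical state s with a closed-form
        Gaussian belief (mu, var) over the other agent's model parameter.
        """
        candidates, weights = [], []
        for s, (mu, var) in particles:
            # 1) Sample only the physical state; the model belief is never sampled.
            s_next = s + a_i + random.gauss(0.0, math.sqrt(q))
            # 2) Weight the particle by the observation likelihood.
            weights.append(math.exp(-0.5 * (o_i - s_next) ** 2 / r))
            # 3) Update the model belief in closed form, conditioned on the
            #    sampled state (a Kalman-style stand-in for the paper's update).
            gain = var / (var + r)
            mu_next = mu + gain * ((o_i - s_next) - mu)
            candidates.append((s_next, (mu_next, (1.0 - gain) * var)))
        # 4) Resample, as in a standard particle filter.
        return random.choices(candidates, weights=weights, k=len(particles))

The point of the structure: only the physical state consumes particles, which is why RB-IPF can focus its sample budget on a large or continuous physical state space.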
Assumptions on Distributions
Prior beliefs
Singly nested and conditional linear Gaussian (CLG)
Transition functions
Deterministic or CLG
Observation functions
Softmax or CLG
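For concreteness, the standard forms of these families (textbook definitions, not transcribed from the slide): a CLG density with discrete parent a and continuous parent x is

    p(y \mid a, x) = \mathcal{N}(y; \, w_a^\top x + b_a, \, \Sigma_a)

and a softmax model over discrete values o_1, ..., o_K given a continuous state x is

    \Pr(o_k \mid x) = \frac{\exp(w_k^\top x + b_k)}{\sum_{l=1}^{K} \exp(w_l^\top x + b_l)}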
Why these distributions?
Good statistical properties
Well-known methods for learning these distributions from data
Applications in target tracking, fault diagnosis
Belief Update over Models
Step 1: Update other agent's level 0 beliefs
Product of a Gaussian and a softmax
Use the variational approximation of the softmax (see Jordan '99)
Softmax → Gaussian form (a tight lower bound)
Update is then analogous to the Kalman filter
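The bound in question, stated in its binary (logistic) form due to Jaakkola and Jordan; the multiclass softmax case is handled analogously in Jordan '99:

    \sigma(z) \ge \sigma(\xi) \exp\left( \frac{z - \xi}{2} - \lambda(\xi)(z^2 - \xi^2) \right), \qquad \lambda(\xi) = \frac{1}{4\xi} \tanh\frac{\xi}{2}

Because the bound is exponential-quadratic in z, its product with a Gaussian prior remains Gaussian, which is why the resulting update looks like a Kalman filter.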
Belief Update over Models
Step 2: Update belief over other's beliefs
Solve other's models – compute other's policy
(e.g., a large-variance belief maps to the Listen action)
Obtain piecewise distributions:
Updated belief over other's belief = the updated Gaussian if the prior belief supports the action, and 0 otherwise
Approximate the piecewise distribution with a Gaussian using ML
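A sketch of that ML fit: for a Gaussian approximating family, maximum likelihood against the piecewise density amounts to matching its mean and variance. The interval representation of the action's support region below is an assumption for illustration, not the authors' code:

    import math

    def normal_pdf(x, mu, var):
        return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

    def fit_gaussian_to_piecewise(mu, var, supports, lo=-10.0, hi=10.0, n=10000):
        """Moment-match a Gaussian to a piecewise-truncated Gaussian (sketch).

        The piecewise density is the prior Gaussian N(mu, var) restricted to
        the belief regions where the observed action is optimal; 'supports'
        is a list of (a, b) intervals. Matching mean and variance is the
        ML Gaussian fit mentioned on the slide.
        """
        dx = (hi - lo) / n
        z = m1 = m2 = 0.0
        for i in range(n):
            x = lo + (i + 0.5) * dx
            if any(a <= x <= b for a, b in supports):
                p = normal_pdf(x, mu, var) * dx
                z += p
                m1 += x * p
                m2 += x * x * p
        mean = m1 / z
        return mean, m2 / z - mean ** 2  # matched mean and variance

For example, fit_gaussian_to_piecewise(0.0, 1.0, [(0.5, 10.0)]) fits a Gaussian to the prior truncated to beliefs above 0.5.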
Belief Update over Models
Step 3: Form a mixture of Gaussians
Each Gaussian is for the optimal action and a possible observation of the other agent
Weight each Gaussian with the likelihood of receiving the observation
Mixture components grow unbounded:
|Ω_j| components after one step
|Ω_j|^t components after t steps
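A sketch of how the mixture is assembled, and why the component count multiplies each step; 'update' and 'likelihood' stand for the per-observation Gaussian update and observation likelihood (assumed callables, not the authors' API):

    def mixture_step(components, observations, update, likelihood):
        """Step 3 sketch: branch each Gaussian over the other agent's
        possible observations. The component count multiplies by
        len(observations) at every step, hence the unbounded growth.
        """
        new = []
        for w, g in components:  # (weight, Gaussian) pairs
            for o in observations:
                # One component per possible observation of the other
                # agent, weighted by the likelihood of that observation.
                new.append((w * likelihood(g, o), update(g, o)))
        total = sum(w for w, _ in new)
        return [(w / total, g) for w, g in new]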
Comparative Performance
Compare accuracy of state estimation with I-PF (L1 metric)
Continuous multi-agent tiger problem
Public good problem with punishment
RB-IPF focuses particles on the large physical state space
Updates beliefs over other's models more accurately (supporting plots in paper)
Comparative Performance
Compare run times with I-PF (Linux, Xeon 3.4GHz, 4GB RAM)
Sensitivity to the Gaussian approximation of the piecewise distribution
Discussion
How restrictive are the assumptions on the distributions?
Can we generalize RB-IPF, like I-PF?
Will RB-IPF scale to a large number of update steps?
Closed-form mixtures are needed
Is RB-IPF applicable to multiply-nested beliefs?
Recursive application may not improve performance over the I-PF
Thank you
Questions?