TRANSCRIPT
Twenty Second Conference on Artificial Intelligence
AAAI 2007
Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces
Prashant Doshi, Dept. of Computer Science
University of Georgia
Speaker: Yifeng Zeng
Aalborg University, Denmark
State Estimation
[Figure: the physical state (Loc, Orient, ...) evolves from time t-1 to t; agent i takes action a_i^{t-1} and receives observation o_i^t]
Single agent setting – the POMDP belief update:
b_i^t(s^t) \propto O_i(s^t, a_i^{t-1}, o_i^t) \sum_{s^{t-1}} T_i(s^{t-1}, a_i^{t-1}, s^t) \, b_i^{t-1}(s^{t-1})
State Estimation
[Figure: the physical state (Loc, Orient, ...) evolves from time t-1 to t; agent i takes action a_i^{t-1} and observes o_i^t, while agent j takes action a_j^{t-1} and observes o_j^t]
Multiagent setting
Interactive state: IS_i = S \times M_j (see AAMAS'05)
State Estimation in Multiagent Settings
Ascribe intentional models (POMDPs) to other agents
Update the other agents' beliefs
Estimate the interactive state, IS_i
(See JAIR'05)
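For reference, the interactive belief update that the talk goes on to factor has roughly the following form, reconstructed from the I-POMDP definition in JAIR'05 (the notation is assumed from that paper, not transcribed from the slide); \tau denotes the other agent's own belief update:

    b_i^t(s^t, \theta_j^t) \propto \sum_{s^{t-1}, \theta_j^{t-1}} b_i^{t-1}(s^{t-1}, \theta_j^{t-1})
        \sum_{a_j^{t-1}} \Pr(a_j^{t-1} \mid \theta_j^{t-1}) \,
        T(s^{t-1}, a_i^{t-1}, a_j^{t-1}, s^t) \,
        O_i(s^t, a_i^{t-1}, a_j^{t-1}, o_i^t)
        \sum_{o_j^t} O_j(s^t, a_i^{t-1}, a_j^{t-1}, o_j^t) \,
        \tau(b_j^{t-1}, a_j^{t-1}, o_j^t, b_j^t)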
Previous Approach
Interactive particle filter (I-PF; see AAMAS'05, AAAI'05)
Generalizes PF to multiagent settings
Approximate simulation of the state estimation
Limitations of the I-PF
Large no. of particles needed even for small state spaces
Distributes particles over the physical state and model spaces
Poor performance when the physical state space is large or continuous
Factoring the State Estimation
Update the physical state space
Update other agent's model
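A sketch of what this factoring amounts to, assuming the standard Rao-Blackwell decomposition (the decomposition itself is inferred from the slide titles, not transcribed): the belief over the joint interactive state splits into a sampled part and an analytic part,

    b_i^t(s^t, m_j^t) \approx \sum_k w^{(k)} \, \delta(s^t - s^{t,(k)}) \, \Pr(m_j^t \mid s^{0:t,(k)})

where the physical-state trajectory s^{0:t,(k)} is represented by particles and \Pr(m_j^t \mid \cdot), the belief over the other agent's model, is carried in closed form per particle.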
Factoring the State Estimation
Sample particles from just the physical state space
Substitute the samples into the state estimation
Implement using a PF
Perform the remaining (model) update as exactly as possible
Rao-Blackwellisation of the I-PF
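A minimal sketch of one Rao-Blackwellised filtering step, using a toy scalar linear-Gaussian model; the dynamics, noise values, and update shapes below are illustrative assumptions, not the authors' implementation:

    import math
    import random

    def rb_ipf_step(particles, a_i, o_i, q=0.1, r=0.2):
        """One Rao-Blackwellised filtering step (illustrative sketch only).

        Each particle pairs a sampled physical state s with a closed-form
        Gaussian belief (mu, var) over the other agent's model parameter.
        """
        candidates, weights = [], []
        for s, (mu, var) in particles:
            # 1) Sample only the physical state; the model belief is never sampled.
            s_next = s + a_i + random.gauss(0.0, math.sqrt(q))
            # 2) Weight the particle by the observation likelihood.
            weights.append(math.exp(-0.5 * (o_i - s_next) ** 2 / r))
            # 3) Update the model belief in closed form, conditioned on the
            #    sampled state (a Kalman-style stand-in for the paper's update).
            gain = var / (var + r)
            mu_next = mu + gain * ((o_i - s_next) - mu)
            candidates.append((s_next, (mu_next, (1.0 - gain) * var)))
        # 4) Resample, as in a standard particle filter.
        return random.choices(candidates, weights=weights, k=len(particles))

The point of the structure: only the physical state consumes particles, which is why RB-IPF can focus its sample budget on a large or continuous physical state space.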
Assumptions on Distributions
Prior beliefs
Singly nested and conditional linear Gaussian (CLG)
Transition functions
Deterministic or CLG
Observation functions
Softmax or CLG
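For concreteness, the standard forms of these families (textbook definitions, not transcribed from the slide): a CLG density with discrete parent a and continuous parent x is

    p(y \mid a, x) = \mathcal{N}(y; \, w_a^\top x + b_a, \, \Sigma_a)

and a softmax model over discrete values o_1, ..., o_K given a continuous state x is

    \Pr(o_k \mid x) = \frac{\exp(w_k^\top x + b_k)}{\sum_{l=1}^{K} \exp(w_l^\top x + b_l)}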
Why these distributions?
Good statistical properties
Well-known methods for learning these distributions from data
Applications in target tracking, fault diagnosis
Belief Update over Models
Step 1: Update other agent's level 0 beliefs
Product of a Gaussian and a softmax
Use the variational approximation of the softmax (see Jordan '99)
Softmax → Gaussian form (a tight lower bound)
Update is then analogous to the Kalman filter
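The bound in question, stated in its binary (logistic) form due to Jaakkola and Jordan; the multiclass softmax case is handled analogously in Jordan '99:

    \sigma(z) \ge \sigma(\xi) \exp\left( \frac{z - \xi}{2} - \lambda(\xi)(z^2 - \xi^2) \right), \qquad \lambda(\xi) = \frac{1}{4\xi} \tanh\frac{\xi}{2}

Because the bound is exponential-quadratic in z, its product with a Gaussian prior remains Gaussian, which is why the resulting update looks like a Kalman filter.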
Belief Update over Models
Step 2: Update belief over other's beliefs
Solve other's models – compute other's policy
(e.g., a large-variance belief maps to the Listen action)
Obtain piecewise distributions:
Updated belief over other's belief = the updated Gaussian if the prior belief supports the action, and 0 otherwise
Approximate the piecewise distribution with a Gaussian using ML
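A sketch of that ML fit: for a Gaussian approximating family, maximum likelihood against the piecewise density amounts to matching its mean and variance. The interval representation of the action's support region below is an assumption for illustration, not the authors' code:

    import math

    def normal_pdf(x, mu, var):
        return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

    def fit_gaussian_to_piecewise(mu, var, supports, lo=-10.0, hi=10.0, n=10000):
        """Moment-match a Gaussian to a piecewise-truncated Gaussian (sketch).

        The piecewise density is the prior Gaussian N(mu, var) restricted to
        the belief regions where the observed action is optimal; 'supports'
        is a list of (a, b) intervals. Matching mean and variance is the
        ML Gaussian fit mentioned on the slide.
        """
        dx = (hi - lo) / n
        z = m1 = m2 = 0.0
        for i in range(n):
            x = lo + (i + 0.5) * dx
            if any(a <= x <= b for a, b in supports):
                p = normal_pdf(x, mu, var) * dx
                z += p
                m1 += x * p
                m2 += x * x * p
        mean = m1 / z
        return mean, m2 / z - mean ** 2  # matched mean and variance

For example, fit_gaussian_to_piecewise(0.0, 1.0, [(0.5, 10.0)]) fits a Gaussian to the prior truncated to beliefs above 0.5.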
Belief Update over Models
Step 3: Form a mixture of Gaussians
Each Gaussian is for the optimal action and a possible observation of the other agent
Weight each Gaussian with the likelihood of receiving the observation
Mixture components grow unbounded:
|Ω_j| components after one step
|Ω_j|^t components after t steps
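A sketch of how the mixture is assembled, and why the component count multiplies each step; 'update' and 'likelihood' stand for the per-observation Gaussian update and observation likelihood (assumed callables, not the authors' API):

    def mixture_step(components, observations, update, likelihood):
        """Step 3 sketch: branch each Gaussian over the other agent's
        possible observations. The component count multiplies by
        len(observations) at every step, hence the unbounded growth.
        """
        new = []
        for w, g in components:  # (weight, Gaussian) pairs
            for o in observations:
                # One component per possible observation of the other
                # agent, weighted by the likelihood of that observation.
                new.append((w * likelihood(g, o), update(g, o)))
        total = sum(w for w, _ in new)
        return [(w / total, g) for w, g in new]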
Comparative Performance
Compare accuracy of state estimation with I-PF (L1 metric)
Continuous multi-agent tiger problem
Public good problem with punishment
RB-IPF focuses particles on the large physical state space
Updates beliefs over other's models more accurately (supporting plots in paper)
Comparative Performance
Compare run times with I-PF (Linux, Xeon 3.4GHz, 4GB RAM)
Sensitivity to the Gaussian approximation of the piecewise distribution
Discussion
How restrictive are the assumptions on the distributions?
Can we generalize RB-IPF, like I-PF?
Will RB-IPF scale to a large number of update steps?
Closed-form mixtures are needed
Is RB-IPF applicable to multiply-nested beliefs?
Recursive application may not improve performance over the I-PF
Thank you
Questions?