tightly and loosely coupled decision paradigms in...

64
Tightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang & Frank Hanshar University of Guelph Ontario, Canada PGM 2008 September 18, 2008

Upload: phamhanh

Post on 25-Mar-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Tightly and Loosely Coupled Decision Paradigms in Multiagent Expedition

Yang Xiang & Frank HansharUniversity of Guelph

Ontario, Canada

PGM 2008September 18, 2008

Page 2: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Outline

• Introduction

• What is Multiagent Expedition?

• Collaborative Design Network

• Graphical Model for Multiagent Expedition

• Recursive Model for Multiagent Expedition

• Experimental Results & Discussion

• Conclusion

2Y. Xiang and F. Hanshar PGM ‘08

Page 3: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Introduction

• We consider frameworks for online decision making:

• Loosely-coupled frameworks (LCF): do not communicate, rely on observing other agents actions to discern state and coordinate with each other

• Tightly-coupled frameworks (TCF): agents communicate through messages over interfaces that are rigourously defined

3Y. Xiang and F. Hanshar PGM ‘08

Page 4: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Introduction cont.• Relevant computational advantages of each

paradigm are poorly understood.

• We wish to understand the tradeoffs in LCFs and TCFs for multiagent planning.

• In this work we select one example framework from LCFs (RMM) and one from TCFs (CDN).

• We resolve technical issues encountered, and compare them experimentally on a test problem called multiagent expedition.

4Y. Xiang and F. Hanshar PGM ‘08

Page 5: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

What is Multiagent Expedition (MAE)?

• Agents have no prior knowledge of how rewards are distributed in the environment.

• Multiple alternative goals with varying rewards are present.

• Coordination problem - objective is for agents to cooperate to maximize team reward.

• Possible applications: multi-robot exploration of Mars, sea-floor exploration, disaster rescue, ...

5Y. Xiang and F. Hanshar PGM ‘08

Page 6: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Instance of MAE

• Each cell has a reward pair (a).

• Observations are local (b).

- Agent can observe the 13 cells around it.

• Effect of an action uncertain (c).

6

(b)(a) (c)

0.025

0.0250.025

0.9

0.025

0.3 0.4 0.1 0.1 0.20.8 0.9 0.4 0.5 0.5

0.3 0.4 0.1 0.2 0.40.7 0.9 0.3 0.5 0.9

0.3 0.1 0.4 0.2 0.20.7 0.3 0.9 0.7 0.7

0.1 0.4 0.2 0.4 0.10.9 0.9 0.5 0.9 0.5

0.1 0.3 0.3 0.2 0.30.3 0.7 0.7 0.5 0.6

Y. Xiang and F. Hanshar PGM ‘08

Page 7: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Instance of MAE

• Each cell has a reward pair (a).

• Observations are local (b).

- Agent can observe the 13 cells around it.

• Effect of an action uncertain (c).

6

(b)(a) (c)

0.025

0.0250.025

0.9

0.025

0.3 0.4 0.1 0.1 0.20.8 0.9 0.4 0.5 0.5

0.3 0.4 0.1 0.2 0.40.7 0.9 0.3 0.5 0.9

0.3 0.1 0.4 0.2 0.20.7 0.3 0.9 0.7 0.7

0.1 0.4 0.2 0.4 0.10.9 0.9 0.5 0.9 0.5

0.1 0.3 0.3 0.2 0.30.3 0.7 0.7 0.5 0.6

Y. Xiang and F. Hanshar PGM ‘08

Page 8: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Instance of MAE

• Each cell has a reward pair (a).

• Observations are local (b).

- Agent can observe the 13 cells around it.

• Effect of an action uncertain (c).

6

(b)(a) (c)

0.025

0.0250.025

0.9

0.025

0.3 0.4 0.1 0.1 0.20.8 0.9 0.4 0.5 0.5

0.3 0.4 0.1 0.2 0.40.7 0.9 0.3 0.5 0.9

0.3 0.1 0.4 0.2 0.20.7 0.3 0.9 0.7 0.7

0.1 0.4 0.2 0.4 0.10.9 0.9 0.5 0.9 0.5

0.1 0.3 0.3 0.2 0.30.3 0.7 0.7 0.5 0.6

Y. Xiang and F. Hanshar PGM ‘08

Page 9: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Instance of MAE

• Each cell has a reward pair (a).

• Observations are local (b).

- Agent can observe the 13 cells around it.

• Effect of an action uncertain (c).

6

(b)(a) (c)

0.025

0.0250.025

0.9

0.025

0.3 0.4 0.1 0.1 0.20.8 0.9 0.4 0.5 0.5

0.3 0.4 0.1 0.2 0.40.7 0.9 0.3 0.5 0.9

0.3 0.1 0.4 0.2 0.20.7 0.3 0.9 0.7 0.7

0.1 0.4 0.2 0.4 0.10.9 0.9 0.5 0.9 0.5

0.1 0.3 0.3 0.2 0.30.3 0.7 0.7 0.5 0.6

Y. Xiang and F. Hanshar PGM ‘08

Page 10: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

MAE Rewards

• Agents move and collect utility.

• Cells revert to default rewards after they are visited.

d = (r1, r2) = (0.1, 0.2)

d

7

• Physical interaction between agents has some optimal level.

- Above or below this level will reduce the reward.

- We set this level at 2 agents, but other levels could be used.

• Thus each cell has a reward pair

- denotes unilateral reward, bilateral reward.r1 r2

(r1, r2), r1, r2 ! [0, 1]

0.20.30.1

A

Y. Xiang and F. Hanshar PGM ‘08

Page 11: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

MAE Rewards

• Agents move and collect utility.

• Cells revert to default rewards after they are visited.

d = (r1, r2) = (0.1, 0.2)

d

7

• Physical interaction between agents has some optimal level.

- Above or below this level will reduce the reward.

- We set this level at 2 agents, but other levels could be used.

• Thus each cell has a reward pair

- denotes unilateral reward, bilateral reward.r1 r2

(r1, r2), r1, r2 ! [0, 1]

0.20.30.1

A0.20.10.1

A

Y. Xiang and F. Hanshar PGM ‘08

Page 12: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

MAE Rewards

• Agents move and collect utility.

• Cells revert to default rewards after they are visited.

d = (r1, r2) = (0.1, 0.2)

d

7

• Physical interaction between agents has some optimal level.

- Above or below this level will reduce the reward.

- We set this level at 2 agents, but other levels could be used.

• Thus each cell has a reward pair

- denotes unilateral reward, bilateral reward.r1 r2

(r1, r2), r1, r2 ! [0, 1]

0.20.30.1

A0.20.10.1

A0.10.10.1

A

Y. Xiang and F. Hanshar PGM ‘08

Page 13: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Unilateral / Bilateral Rewards

8

A ➙ North B ➙ South A Reward = 0.3 B Reward = 0.3

Total = 0.6

A ➙ North B ➙ West A Reward = 0.7/2=0.35 B Reward = 0.7/2=0.35

Total = 0.7

Initial Unilateral Bilateral

0.3

0.7

0.3

0.7

A

B

0.3

0.7

0.3

0.7A

B

0.3

0.7

0.3

0.7A B

Y. Xiang and F. Hanshar PGM ‘08

Page 14: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Unilateral / Bilateral Rewards

8

A ➙ North B ➙ South A Reward = 0.3 B Reward = 0.3

Total = 0.6

A ➙ North B ➙ West A Reward = 0.7/2=0.35 B Reward = 0.7/2=0.35

Total = 0.7

• If 3 agents cooperate- Two receive bilateral reward- One receives default unilateral reward

Initial Unilateral Bilateral

0.3

0.7

0.3

0.7

A

B

0.3

0.7

0.3

0.7A

B

0.3

0.7

0.3

0.7A B

Y. Xiang and F. Hanshar PGM ‘08

Page 15: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

MAE (DEC-W-POMDP)

9

• Instance of DEC-W-POMDP (NEXP-complete)

- stochastic since effects uncertain.

- Markovian since new state is conditionally independent of the history given the current state and joint action of agents.

- partially observable agents cannot perceive other agents neighbourhoods

- w-weakly agents can perceive absolute location and their own local neighbourhood.

• For agents, and horizon , each agent needs to evaluate = possible effects.

B

C

5246 2

6! 1016

Y. Xiang and F. Hanshar PGM ‘08

Page 16: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Outline

• Introduction

• What is Multiagent Expedition?

• Collaborative Design Network

• Graphical Model for Multiagent Expedition

• Recursive Model for Multiagent Expedition

• Experimental Results & Discussion

• Conclusion & Future Work

10Y. Xiang and F. Hanshar PGM ‘08

Page 17: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Collaborative Design Network (CDN) [Xiang, Chen and Havens, AAMAS 05]

• A multiagent component-based design paradigm.

• CDN gives optimal design based on preferences of all agents.

• Scales linearly with the addition of agents.

• Efficient when the overall dependency structure is sparse.

• We use CDN in this work as a collaborative decision network.

11Y. Xiang and F. Hanshar PGM ‘08

Page 18: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Design Network (DN)• is a DAG, where

• the set of design nodes.

- design decisions

• the set of environmental nodes.

- uncertainty over working environment of the product under design

• the set of performance nodes.

- refers to objective measures of functionality of the design.

• the set of utility nodes.

- subjective measures dependent strictly on performance nodes.12

G = (V,E) V = D ! T !M ! U

D

T

M

U

Y. Xiang and F. Hanshar PGM ‘08

Page 19: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

DN Continued...

• Syntactically each node is associated with a conditional probability distribution.

• Semantically, the nodes differ. E.g encodes a design constraint.

• The goal is to find a design which is maximal.

13

d! EU(d!)

P (d|!(d))

Y. Xiang and F. Hanshar PGM ‘08

Page 20: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Collaborative Design Network (CDN)

• Collaborative design network extends multiply sectioned Bayesian networks to multiagent decision making.

- DAG domain structuring.

- Hypertree agent organization.

- Belief over private and shared variables.

- Partial evaluation of partial design communicated over small set of shared variables btw agents.

- Design is globally optimal.

- Local design at each agent remains private.14Y. Xiang and F. Hanshar PGM ‘08

Page 21: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

CDN for MAE

• Each time-step an agent:

- Utilizes a dynamic graphical model.

- Updates domains for movement and position nodes.

- Updates utility distributions from locally observed rewards.

- Communicates with other agents to find globally optimal joint action.

15Y. Xiang and F. Hanshar PGM ‘08

Page 22: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Position Nodes

• Encode probability of uncertain location given agent movement .

mvx,i psx,1 = (0, 0) psx,1 = (1, 0) psx,1 = (!1, 0) psx,1 = (0, 1) psx,1 = (0,!1)north 0.025 0.025 0.025 0.9 0.025south 0.025 0.025 0.025 0.025 0.9east 0.025 0.9 0.025 0.025 0.025west 0.025 0.025 0.9 0.025 0.025halt 0.9 0.025 0.025 0.025 0.025

mvx,i

psx,1

16Y. Xiang and F. Hanshar PGM ‘08

Page 23: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Utility Nodes

17Y. Xiang and F. Hanshar PGM ‘08

psx,1A psx,1

B psx,1C P (rwABC,1 = y|psx,1

A , psx,1B , psx,1

C )(0,0) (0,0) (0,0) 0.4(0,0) (0,0) (1,0) 0.2(0,0) (0,0) (2,0) 0.1

......

(-2,-2) (-2,-2) (-2,0) 0(-2,-2) (-2,-2) (-2,-1) 0.3(-2,-2) (-2,-2) (-2,-2) 1.0

Page 24: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Applying CDN to MAE

• Encode movements {N, S, E, W, H} in design nodes.

• Encode uncertain locations given agent movements through performance nodes.

• Encode reward in utility nodes.

• Communicate EU over design nodes btw agents to find maximal utility design which corresponds to the globally optimal joint plan.

18Y. Xiang and F. Hanshar PGM ‘08

Page 25: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical Model

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.!Y. Xiang and F. Hanshar PGM ‘08

Page 26: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.!Y. Xiang and F. Hanshar PGM ‘08

Page 27: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.

Performance

!Y. Xiang and F. Hanshar PGM ‘08

Page 28: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.

Performance

Utility

!Y. Xiang and F. Hanshar PGM ‘08

Page 29: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.

Performance

Utility

!Y. Xiang and F. Hanshar PGM ‘08

Page 30: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.

Performance

Utility

!Y. Xiang and F. Hanshar PGM ‘08

Page 31: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

psA,1 psC,2

psC,2

mvB,2mvB,1

mvB,1 mvB,2

mvB,1 mvB,2

psB,1 psB,2psB,1 psB,2

rwAB,1 rwAB,2 rwBC,1rwBC,2

psA,1

psA,2

psB,1psB,2

rwBC,2

rwBC,1

rwAB,1

rwAB,2

A B

C

A

B

C

!

mvA,1 mvA,2

psC,1

psC,1

psA,2

mvA,1

mvA,2

mvC,1 mvC,2

mvC,1

mvC,2

Graphical ModelDesign

19

CDN of a 3 agent group (A,B,C) for expedition/planning, where is the hypertree.

Performance

Utility

!Y. Xiang and F. Hanshar PGM ‘08

Page 32: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Outline

• Introduction

• What is Multiagent Expedition?

• Collaborative Design Network

• Graphical Model for Multiagent Expedition

• Recursive Model for Multiagent Expedition

• Experimental Results & Discussion

• Conclusion & Future Work

20Y. Xiang and F. Hanshar PGM ‘08

Page 33: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Recursive Modeling Method (RMM) [Gmytrasiewicz et al. IEEE 98]

• Loosely-coupled multiagent decision making paradigm.

- No explicit communication btw agents.

• Matrix-based agent representation.

• Agents model other agents in order to coordinate actions.

• Agents have probability distributions over other agent’s models.

21Y. Xiang and F. Hanshar PGM ‘08

Page 34: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 35: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 36: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 37: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 38: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 39: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 40: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

A1

A2

A3Utility

0.5

{N,N}

{N,S}

{H,W}

{H,H}

Actions

Payoff Matrix Representation

22Y. Xiang and F. Hanshar PGM ‘08

Page 41: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

RMM for MAE• Observations are local to each agent in MAE.

• How does RMM evaluate joint actions of agents when some payoffs of other agents are unknown?

23

AB

A’s Private Observations

Shared Observations

B’s Private Observations

Y. Xiang and F. Hanshar PGM ‘08

Page 42: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

RMM for MAE cont.

• Knowing the correct state that an agent is in allows for successful planning.

• Idea: Agents model other agents states in the RMM tree.

• A state categorizes a neighbourhood payoff.

• Based on past observations of agent actions, update belief on the state of neighbouring agents.

24Y. Xiang and F. Hanshar PGM ‘08

Page 43: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Recursive Model Structure

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.70(N,N) (N,S) (N,N) 0.70

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

P (nbpBA = allLow, nbpB

C = ¬allLow) P (nbpBA = ¬allLow, nbpB

C = allLow)

P (nbpBA = allLow, nbpB

C = allLow) P (nbpBA = ¬allLow, nbpB

C = ¬allLow)

25Y. Xiang and F. Hanshar PGM ‘08

Page 44: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Recursive Model Structure

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.70(N,N) (N,S) (N,N) 0.70

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

P (nbpBA = allLow, nbpB

C = ¬allLow) P (nbpBA = ¬allLow, nbpB

C = allLow)

P (nbpBA = allLow, nbpB

C = allLow) P (nbpBA = ¬allLow, nbpB

C = ¬allLow)

25

Payoff Matrix

Y. Xiang and F. Hanshar PGM ‘08

Page 45: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Recursive Model Structure

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.70(N,N) (N,S) (N,N) 0.70

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

P (nbpBA = allLow, nbpB

C = ¬allLow) P (nbpBA = ¬allLow, nbpB

C = allLow)

P (nbpBA = allLow, nbpB

C = allLow) P (nbpBA = ¬allLow, nbpB

C = ¬allLow)

25

Payoff Matrix

models for missing observations2n

Y. Xiang and F. Hanshar PGM ‘08

Page 46: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Recursive Model Structure

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.50(N,N) (N,S) (N,N) 0.50

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.40(N,N) (N,S) (N,N) 0.40

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

A B C u(N,N) (N,N) (N,N) 0.85(N,N) (N,N) (N,S) 0.70(N,N) (N,S) (N,N) 0.70

......

(H,H) (H,H) (H,E) 0.30(H,H) (H,H) (H,W) 0.41(H,H) (H,H) (H,H) 0.72

P (nbpBA = allLow, nbpB

C = ¬allLow) P (nbpBA = ¬allLow, nbpB

C = allLow)

P (nbpBA = allLow, nbpB

C = allLow) P (nbpBA = ¬allLow, nbpB

C = ¬allLow)

25

Payoff Matrix

models for missing observations2n

Model probabilities

Y. Xiang and F. Hanshar PGM ‘08

Page 47: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

RMM Update IssuesNeed to compute:

Can compute:(1)(2)

• Specifying joint probabilities is difficult in RMM.

• More difficult as the number of agents increases.

- Each joint probability distribution is larger.

- Number of distributions is exponential .

• Strong independence assumptions are needed to equate (1) & (2).

26

mn

Y. Xiang and F. Hanshar PGM ‘08

P (areaA|moveA) · P (area|moveC)P (areaA, areaC |moveA, moveC)

Page 48: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Outline

• Introduction

• What is Multiagent Expedition?

• Collaborative Design Network

• Graphical Model for Multiagent Expedition

• Recursive Model for Multiagent Expedition

• Experimental Results & Discussion

• Conclusion & Future Work

27Y. Xiang and F. Hanshar PGM ‘08

Page 49: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Agent teams• CDN

- Agents communicate over agent interfaces

• RMM (No communication)

- Agents update belief about state of other agents

• Greedy (No communication)

- GRDU: agent maximizes unilateral utility

- GRDB: agent maximizes bilateral + unilateral utility

28Y. Xiang and F. Hanshar PGM ‘08

Page 50: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Instances

29

(b)(a)

Y. Xiang and F. Hanshar PGM ‘08

Page 51: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Instances

29

Dense

(b)(a)

Y. Xiang and F. Hanshar PGM ‘08

Page 52: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Instances

29

Dense Path

(b)(a)

Y. Xiang and F. Hanshar PGM ‘08

Page 53: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Results

30

Reward collected

• 30 runs for RMM,CDN,GRDU & GRDB

• 40 time-steps per run

Table 1: Experimental results. Highest means bolded.Barren Dense Path

µ ! µ ! µ !CDN 55.84 4.21 25.14 3.27 20.41 3.39GRDU 48.56 0.56 12.32 0.20 12.20 0.15GRDB 48.64 0.62 18.57 1.10 16.80 2.39RMM 50.35 5.95 18.50 3.39 18.71 2.79

Y. Xiang and F. Hanshar PGM ‘08

Page 54: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Results

30

Reward collected

• 30 runs for RMM,CDN,GRDU & GRDB

• 40 time-steps per run

Table 1: Experimental results. Highest means bolded.Barren Dense Path

µ ! µ ! µ !CDN 55.84 4.21 25.14 3.27 20.41 3.39GRDU 48.56 0.56 12.32 0.20 12.20 0.15GRDB 48.64 0.62 18.57 1.10 16.80 2.39RMM 50.35 5.95 18.50 3.39 18.71 2.79

Y. Xiang and F. Hanshar PGM ‘08

Page 55: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Results

30

Reward collected

• 30 runs for RMM,CDN,GRDU & GRDB

• 40 time-steps per run

Table 1: Experimental results. Highest means bolded.Barren Dense Path

µ ! µ ! µ !CDN 55.84 4.21 25.14 3.27 20.41 3.39GRDU 48.56 0.56 12.32 0.20 12.20 0.15GRDB 48.64 0.62 18.57 1.10 16.80 2.39RMM 50.35 5.95 18.50 3.39 18.71 2.79

Y. Xiang and F. Hanshar PGM ‘08

Page 56: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Experimental Results

30

Reward collected

• 30 runs for RMM,CDN,GRDU & GRDB

• 40 time-steps per run

Table 1: Experimental results. Highest means bolded.Barren Dense Path

µ ! µ ! µ !CDN 55.84 4.21 25.14 3.27 20.41 3.39GRDU 48.56 0.56 12.32 0.20 12.20 0.15GRDB 48.64 0.62 18.57 1.10 16.80 2.39RMM 50.35 5.95 18.50 3.39 18.71 2.79

Y. Xiang and F. Hanshar PGM ‘08

Page 57: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Significance Testing

Comparison between CDN and each other method.

31

Table 1: The t-test results.CDN GRDU GRDB RMMBU

Barren!

99.99!

99.99!

99.99Dense

!99.99

!99.99

!99.99

Path!

99.99!

99.99!

96.20

Y. Xiang and F. Hanshar PGM ‘08

Page 58: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

• CDN has higher mean reward collected on all instances than RMM. Why?

- RMM has no communication.

- If multiple local optimal plans exist that involve bilateral action:

- No way for agents to agree which to take.

- What about adopting a social convention?

Performance Discussion

32Y. Xiang and F. Hanshar PGM ‘08

Page 59: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Social Convention• A social convention defines, for each agent what

action to take when multiple optimal actions exist.

• Lexicographic ordering as social convention:

• Assume: u < b

33

(0, 0) (1, 0) (2, 0) (3, 0) (4, 0) (5, 0) (6, 0)

A B Cu b, u b, u uS1 :

S2 : A B Cu b, u b, u b + u

2

Y. Xiang and F. Hanshar PGM ‘08

Page 60: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Discussion / Conclusion• Work motivated by lack of comparative LCF - TCF research

• Setup level

- The agent organization is easier to set-up in LCF.

- TCF more involved.

• Modeling level

- RMM and LCFs are limited by the need to model agent interactions without sufficient information.

- In TCF we design agent interfaces such that the agent sub-domains are rendered conditionally independent to take advantage of communication.

- RMM uses an exponentially complex matrix-based representation. A MAID could be adopted, but the above limitation stands.

34Y. Xiang and F. Hanshar PGM ‘08

Page 61: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Discussion / Conclusion• Decision-making level

- RMM and LCFs must guess about the states of other agents based on observation.

- RMM may misjudge states and may misjudge when multiple optimal joint plans exist.

- Social convention cannot alleviate this difficulty.

- In TCFs conditional independence rendering interfaces convey sufficient states and decisions and lead to better coordination.

35Y. Xiang and F. Hanshar PGM ‘08

Page 62: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Discussion / Conclusions

- Generality

- Both RMM and CDN are decision-theoretic.

- The difference lies in the agent coupling, and promises that our empirical results can generalize to other domains.

36Y. Xiang and F. Hanshar PGM ‘08

Page 63: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

Thanks for listening.

37

Page 64: Tightly and Loosely Coupled Decision Paradigms in ...pgm08.cs.aau.dk/SlidesAndPosters/33-slides.pdfTightly and Loosely Coupled Decision Paradigms in Multiagent Expedition Yang Xiang

References• [Pollock90] M. Pollock and M. Ringuette. Introducing the tileworld: experimentally evaluating agent architectures. In

T. Dietterich and W. Swartout, editors, Proc. of the 8th National Conf. on Artificial Intelligence, pages 183–189, Menlo Park, CA, 1990. AAAI Press.

• [Noh98] S. Noh and P. J. Gmytrasiewicz. Coordination and belief update in a distributed anti-air environment. In Proceedings of the 31st Hawaii International Conference on System Sciences, volume 5, pages 142–151, Los Alamitos, CA, January 1998. IEEE Computer Society.

• [Bern02] The Complexity of Decentralized Control of Markov Decision Processes. D.S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein. Mathematics of Operations Research, 27(4):819-840, 2002.

• [Blythe99] Decision-Theoretic Planning, Jim Blythe, AI Magazine, Summer 1999.

• [Lee02] A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems, Jae Won Lee, Jangmin O, Database and Expert Systems Applications : 13th International Conference, DEXA 2002 Aix-en-Provence, France, September 2-6, 2002. Proceedings, Lecture Notes in Computer Science, 2002.

• [Bexker04] R. Becker, S. Zilberstein, V. Lesser, and C. V. Goldman. Solving transition independent decentralized markov decision processes, Journal of Artificial Intelligence Research, 22:423-455, 2004.

• [Xiang04] A Decision-Theoretic Graphical Model for Collaborative Design on Supply Chains, Y. Xiang, J. Chen, and A. Deshmukh. Canadian AI 2004, LNAI 3060, pp. 355-369, 2004.

• [Thrun05] S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. MIT Press, 2005.

• [Xiang05] Optimal Design in Collaborative Design Network, Y. Xiang, J. Chen and W.S. Havens, Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, July 2005.

38Y. Xiang and F. Hanshar PGM ‘08