a multi-agent decision framework for ddd iii environment · a loosely coupled network of agents...

1

June 17, 2003June 17, 2003

A MultiA Multi--Agent Decision FrameworkAgent Decision Frameworkfor DDDfor DDD--III EnvironmentIII Environment

CandraCandra MeirinaMeirinaGeorgiyGeorgiy M. M. LevchukLevchukKrishna R. Krishna R. PattipatiPattipati

Dept. of Electrical and Computer EngineeringDept. of Electrical and Computer EngineeringUniversity of Connecticut University of Connecticut

Contact: Contact: [email protected]@engr.uconn.edu (860) 486(860) 486--28902890

mailto:[email protected]

2

Outline

Motivation• Incorporate agent models of human decision-making processes to drive

experiments with larger, partially or fully simulated, organizations• Introduction to the third generation distributed-dynamic-decision-making

(DDD-III) simulator• Agent driven DDD-III simulation: a sample run

Three stage agent decision-making process• Environment sensing • Information processing• Action selection: centralized and auction-based assignment

Results • Scenario 1: Defend a friendly airbase• Scenario 2: Part of A2C2 experiment 8

3

Why Agent-based Models?

Challenges:• Large-scale experiments

(human & synthetic agents)• Analysis of large teams

Proposed Method: Intelligent agent network by utilizing analytic model–based algorithms in UConn’sorganizational design process and human cognitive limitations/biases embedded.

What constitutes an intelligent agent?

• Flexible autonomous agents• Goal oriented• Task knowledge/skills

What is a multi-agent network?A loosely coupled network of agents that work together to solve problems that are beyond the individual capabilities or knowledge of each problem solver.

• Agent Model: Stimulus Hypothesis Option Response (SHOR, Wohl, 1980s)– based cooperative agents

• Multi-Agent Architecture: Heterogeneous communicating network with a flexible control architecture (hierarchy, heterarchy, or hybrid) to optimize a set of objectives (i.e., minimize completion time, minimize internal–external workloads, maximize total gain, etc.)

• Embed agents into DDD-III simulator

4

Task Execution in DDD–III SimulatorDDD-III simulator provides a controllable, multi-player, multi-platform, real-time

organizational environment

Each player represents a

decision-maker (DM)

A set of humans or agents or hybrid-humans-agents working together as a team are responsible to execute a set of tasks

Platforms: physical assets

(e.g., ships, helicopters, Ground units, bases, etc.)

4 resource types: 3 units of type 2 and

3 units of type 4

Task execution by a single

platform

Task execution by collaborating

platforms

Tasks: (e.g., hostile fighter, minefield,

friendly fighter, etc.)

P2 P3 T1P1

0 1 0 0

3 0 3 3DM-Resourcecapabilities

Task-Resourcerequirements 0 0 0 0

3 2 1 3

5

Integrated Framework

DDD-III

Scenario Generation

External Conduit

DDD Action Executionand Object Status

Update

DDD – Agent

Planning via3-Phase Design with a

Fixed Organization

Re-planning(Event-Based-Scheduling)

Communication Socket

Information Exchanges and Messages

Multi – Agent Network

Other Agents

3-Phase Design Algorithms

HierarchyGroupingScheduling

Static Mission &Org. Data

DynamicEvent Data

Strategies(Task-Plat.

Alloc., Sched.)

StaticMission

Data

Dynamic Mission and Event data

6

DDD Agents Within DDD–III Paradigm

DDD-III(LINUX)

SOCKET INTERFACE

AGENTS(WINDOWS)

Three-Stage Agent Model

Environment Sensing (ES)• Receives information about existing objects (tasks, assets,

and other DMs) from DDD via external conduit• Inquires and receives information about existing objects from

DDD or other DMs via the communication link

Information Processing (IP)Processes information via a set of computational algorithms based on limited knowledge of environment (errors in estimatingTask-resource requirements, errors in task and asset locations,limited knowledge of other DMs’ capabilities, etc.)

Action Selection (AS)• Selects actions according to a set of algorithmic rules• Dynamically updates its schedule as new information becomes

available

8

Environment Sensing Sub-model

Are there any tasks within detection range?

Can they be identified as hostile or friendly?

Can their resource requirements be measured?

Are there other members in the team?

Who owns what?

Who should be notified?

All subordinates of the current DM and his superior?

All of the team members? Simplify communication pattern(suitable for centralized C2)

Suitable for distributed C2Potential coordinating partners?

9

Information Processing Sub-modelCentralized implementation

Addresses the question of ‘what should be done now?’

READY Tasks: • Identified• Measured

known (estimated) resource reqs., stayingtime, location, speed, course, value, etc.

• Satisfy precedence constraints

FREE Platforms: Unassigned to any tasksat present

Environment Sensing (ES)

Select from FREE, subset of platforms that satisfy task resource requirements at least partially → FREE1

Order tasks by priority: • High task value• Time criticality

10

Action Selection Sub-modelAddresses the question of ‘who should do what and when?’

WHILE READY is not empty: Select from READY a task i with the highest priority

WHILE i’s resource requirements are not satisfiedSelect from FREE1 a platform with thehighest execution accuracy and minimumimpact on other tasks

END WHILE

Add task to ACTION queueEND WHILE

PeriodicOR

Event Driven

WHILE ACTION is not empty: Select from ACTION a task i (breadth first)

Execute i: Move closer, pursue, attack, coordinated attack, etc.

END WHILE

11

Scenario 1: Defend a Friendly Airbase (1)

Internal Workload Distribution

05

10152025303540

Decisionmaker ID

Inte

rnal

Wor

kloa

d

DM0DM1

DM2DM3

DM4DM5DM6

Gain Accumulation Over Time

0

20

40

60

80

100

120

1.47

2.27

2.75

3.07 3.4

3.8

4.13 4.6

4.82

5.22

5.37

5.67 5.8

6.47

7.13

7.45

7.98

8.87

Time (minutes)

Tota

l Gai

n

Workload distribution is not balanced Processing time: 8.87 minutes

Why ?

Task Arrival

01020304050607080

1 2 3 4 5

Time (min)

Num

ber

of T

asks

Each DM

XXXXXX2GNS2GNS

XXXXXX2A92A9

1F15XXX2A122A7

1F14XXX2A122A12

1BAS1AW4F148F15

Performance Measures:• Accrued Gain Over Time:

Measure of team efficiency in processing tasks → accuracy and timeliness

• Workload Distribution Among DMs:Balanced workload over all DMs is desired. Higher workload and increased differences in workload lead to degraded organizational performance

Aggregated operating

assets of a DM

Scenario:• A team of 7 identical DMs defend a friendly airbase• One hundred tasks arrive randomly from random directions

12

Scenario 1: Defend a Friendly Airbase (2)

• Basic strategy: DMs have identical capability to undertake the incoming tasks → handle tasks with minimal effort (fuel efficiency) → minimum platform-to-task distance

• Uneven platform spread among DMs → uneven platform-to-task distances among platforms belonging to different DMs → increased workload disparity

• Strategy adjustment: Better initial platform placement → more balanced workload distribution among DMs

Internal Workload Distribution

0

5

10

15

20

25

30

Decisionmaker ID

Inte

rnal

Wor

kloa

d

DM0DM1DM2DM3DM4DM5DM6

Gain Accumulation Over Time

0

20

40

60

80

100

120

1.3

2.03 2.2

2.7

3.08

3.28

3.75

4.08 4.3

4.7

5.05

5.08

5.33 5.6

6.05 6.6

7.2

8.38

Time (minutes)

Tota

l Gai

n

Lower processing time: 8.38 minutes(6% improvement)

Balanced workload distribution

13

Example 2: Part of A2C2 Experiment 8A team of 6 heterogeneous agents coordinate to execute a set of 15 complex tasks

(with values ranging from 0 to 50)FUNCTIONAL

D DM 1 2 3 4 5 6I Platform STRIKE BMD ISR AWC SuWC/MINES SOF/SARV 1 CVN 2F18S xxx 1UAV 2F18A, E2C 1FAB, 1MH53 1HH60I 2 DDGA 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFS 3 DDGB 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFI 4 CG 8TLAM 3ABM 1UAV 6SM2 1FAB,2HARP,1MH53 1HH60O 5 FFG* 2F18S xxx 1UAV 2F18A,E2C,4SM2 1FAB,2HARP,1MH53 1HH60N 6 DDGC 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFAL

Differentiated by resource

requirements

F

D

START CMDCTR

NBW ABW

PORT

ABE NBE

3SOF

2SOF+2FAB

2SOF+2FAB

6STRK

2SOF

6STRK

2SOFBLOW

BRIDGE

Functional Scenario - f

RescueEfforts

START CMDCTR

NBW ABW

PORT

ABE NBE

1SOF+2STRK

1SOF+2STRK+1FAB

1SOF+2STRK +1FAB

2STRK+1FAB

1SOF+1STRK

1SOF+2STRK

BLOWBRIDGE

1SOF+2STRK

Divisional Scenario - d

RescueEfforts

14

Divisional Scenario (d): Accrued Gain

-50

0

50

100

150

200

250

300

350

0 80 160

240

320

400

480

560

640

720

800

880

960

1040

1120

1200

1280

1360

1440

Time (secs)

Acc

rued

Gai

nD_on_d F_on_d

Centralized Assignment

32.8

33

33.2

33.4

33.6

33.8

34

34.2

34.4

Organizations

Gai

n A

rea

(x10

000)

Divisional

Functional

∆=2.5%

Congruence →higher gain

15

Divisional Scenario (d): Workload Distribution

The external coordination workload of a DM is the sum of its direct coordination (aggregated time associated with simultaneous processing

of the same set of tasks) with other DMs

( )

( ) ( )∑

∑

=

≠=

⋅=

=

Task

DM

N

iiliki

N

kll

tuulkD

lkDkE

1

,1

,min,

,)(

0

100

200

300

400

500

600

700

1 2 3 4 5 6

DM ID

Agg

rega

ted

Coo

rdin

atio

n Ti

me

Divisional Organization

0

100

200

300

400

500

600

700

800

1 2 3 4 5 6

DM ID

Agg

rega

ted

Coo

rdin

atio

n Ti

me

Functional Organization

Incongruence →more time spent on coordination

16

-50

0

50

100

150

200

250

300

350

020

028

036

044

052

060

068

076

084

092

010

0010

8011

6012

4013

2014

0014

90

Time (secs)

Acc

rued

Gai

nF_on_f D_on_f

Centralized Assignment

24.8

25

25.2

25.4

25.6

25.8

26

26.2

26.4

26.6

Gai

n A

rea

(x10

000)

Functional

Divisional

∆=4.0%

Organizations

Tasks require multiple

resources of the same type

Functional Mission (f): Accrued Gain

17

Centralized Assignment:Performance Comparison

240

260

280

300

320

340

360

Gai

n A

rea

(X10

00)

Scenario f Scenario dD/f

D/d

F/d

F/f

Functional scenario

generates smaller gain

area

Congruent with Functional OrganizationCongruent with Divisional Organization

Incongruent Organization-mission pair

170

220

270

320

370

420

470

Ave

rage

Ext

erna

l Coo

rdin

atio

n

Scenario f Scenario d

D/f

D/d

F/d

F/f

Divisional organization copes better with higher

coordination requirement

150

200

250

300

350

400

450

263.9 253.4 341.8 333.3

Gain Area (x1000)

Ave

rage

Ext

erna

l Coo

rdin

atio

n

F/f

F/dD/f

D/d

18

Platform-to-Task Allocation via Auction

Match buyers to sellers to minimizesum of excess prices

↔A task selects the ‘best’ subset of platform(s)

Each platform is assigned to the ‘most attractive’ task

Price adjustment:

•Platform sets a current price•Task adjusts its offer

Platforms (sellers): Find highest offered

price

Task (buyer): Find cheapestavailable price

19

Action Selection via AuctionWHILE READY is not empty:

Select from READY a task i with highest priority

WHILE i’s resource requirement is not satisfiedSelect to bid from FREE1 a platform withthe highest execution accuracy and minimumimpact on other tasks

END WHILE

Add task to AUCTION_READY queueEND WHILE

WHILE ACTION is not empty: Select from ACTION a task i (breadth first)

Execute i: Move closer, pursue, attack, coordinated attack, etc.

END WHILE

WHILE not all members of AUCTION_READY is MATCHED: Select from AUCTION_READY a task i with highest priorityBid for all selected platformsPlatforms offer themselves to the highest bidders

Adjust the bid prices:WHILE i’s resource requirements are not satisfied

Select to bid from FREE1 a platform with the highest executionaccuracy and minimum impact on other tasks based on theadjusted prices

END WHILE

END WHILE

Auction Initialization Action Execution

Auction Process

20

Accrued Task Gain

-50

0

50

100

150

200

250

300

0 80 160

240

300

380

460

540

620

700

780

860

940

1020

1100

1180

1260

1340

1420

1500

Time (secs)A

ccru

ed G

ain

F_on_f D_on_f

Functional Scenario

-50

0

50

100

150

200

250

300

0 80 160

240

300

380

460

540

620

700

780

860

940

1020

1100

1180

1260

1340

1420

1500

Time (secs)

Acc

rued

Gai

n

D_on_d F_on_d

Divisional ScenarioShorter

completion time

Lower accrued gain

Auction (Functional Scenario)

26.5

27

27.5

28

28.5

29

29.5

Gai

n A

rea

(x10

000)

Functional

Divisional

∆=6.4%

Organizations

Auction (Divisional Scenario)

28.1

28.2

28.3

28.4

28.5

28.6

28.7

28.8

28.9

29

Organizations

Gai

n A

rea

(x10

000)

Divisional

Functional

∆=1.3%

Organizations

21

Congruent with Functional OrganizationCongruent with Divisional Organization

Incongruent Organization-mission pair

170

220

270

320

370

420

Ave

rage

Ext

erna

l Coo

rdin

atio

n


D/f

D/d

F/d

F/f270

275

280

285

290

295

Gai

n A

rea

(X10

00)


D/f

D/d

F/d

F/f

Auction-based Assignment:Performance Comparison

200

220

240

260

280

300

320

340

360

293.5 274.8 287.7 284

Gain Area (x1000)

Ave

rage

Ext

erna

l Coo

rdin

atio

n

F/f

F/dD/f

D/d

Incongruence → performance

degradation

22

Summary and Future Work

Results from the DDDIII-based agent framework demonstrate the potential of utilizing agents to drive large-scale C2 experiments

Extend the implementation to distributed decision-making processes via limited look-ahead, improved auction-based algorithm

Incorporate human cognitive limitations into the agent model to simulate more realistic decision-making processes

Extend the system to an integrated, dynamic, decision support system

a multi-agent decision framework for ddd iii environment · a loosely coupled network of agents...

Documents