a multi-agent decision framework for ddd iii environment · a loosely coupled network of agents...
TRANSCRIPT
1
June 17, 2003June 17, 2003
A MultiA Multi--Agent Decision FrameworkAgent Decision Frameworkfor DDDfor DDD--III EnvironmentIII Environment
CandraCandra MeirinaMeirinaGeorgiyGeorgiy M. M. LevchukLevchukKrishna R. Krishna R. PattipatiPattipati
Dept. of Electrical and Computer EngineeringDept. of Electrical and Computer EngineeringUniversity of Connecticut University of Connecticut
Contact: Contact: [email protected]@engr.uconn.edu (860) 486(860) 486--28902890
2
Outline
Motivation• Incorporate agent models of human decision-making processes to drive
experiments with larger, partially or fully simulated, organizations• Introduction to the third generation distributed-dynamic-decision-making
(DDD-III) simulator• Agent driven DDD-III simulation: a sample run
Three stage agent decision-making process• Environment sensing • Information processing• Action selection: centralized and auction-based assignment
Results • Scenario 1: Defend a friendly airbase• Scenario 2: Part of A2C2 experiment 8
3
Why Agent-based Models?
Challenges:• Large-scale experiments
(human & synthetic agents)• Analysis of large teams
Proposed Method: Intelligent agent network by utilizing analytic model–based algorithms in UConn’sorganizational design process and human cognitive limitations/biases embedded.
What constitutes an intelligent agent?
• Flexible autonomous agents• Goal oriented• Task knowledge/skills
What is a multi-agent network?A loosely coupled network of agents that work together to solve problems that are beyond the individual capabilities or knowledge of each problem solver.
• Agent Model: Stimulus Hypothesis Option Response (SHOR, Wohl, 1980s)– based cooperative agents
• Multi-Agent Architecture: Heterogeneous communicating network with a flexible control architecture (hierarchy, heterarchy, or hybrid) to optimize a set of objectives (i.e., minimize completion time, minimize internal–external workloads, maximize total gain, etc.)
• Embed agents into DDD-III simulator
4
Task Execution in DDD–III SimulatorDDD-III simulator provides a controllable, multi-player, multi-platform, real-time
organizational environment
Each player represents a
decision-maker (DM)
A set of humans or agents or hybrid-humans-agents working together as a team are responsible to execute a set of tasks
Platforms: physical assets
(e.g., ships, helicopters, Ground units, bases, etc.)
4 resource types: 3 units of type 2 and
3 units of type 4
Task execution by a single
platform
Task execution by collaborating
platforms
Tasks: (e.g., hostile fighter, minefield,
friendly fighter, etc.)
P2 P3 T1P1
0 1 0 0
3 0 3 3DM-Resourcecapabilities
Task-Resourcerequirements 0 0 0 0
3 2 1 3
5
Integrated Framework
DDD-III
Scenario Generation
External Conduit
DDD Action Executionand Object Status
Update
DDD – Agent
Planning via3-Phase Design with a
Fixed Organization
Re-planning(Event-Based-Scheduling)
Communication Socket
Information Exchanges and Messages
Multi – Agent Network
Other Agents
3-Phase Design Algorithms
HierarchyGroupingScheduling
Static Mission &Org. Data
DynamicEvent Data
Strategies(Task-Plat.
Alloc., Sched.)
StaticMission
Data
Dynamic Mission and Event data
6
DDD Agents Within DDD–III Paradigm
DDD-III(LINUX)
SOCKET INTERFACE
AGENTS(WINDOWS)
Three-Stage Agent Model
Environment Sensing (ES)• Receives information about existing objects (tasks, assets,
and other DMs) from DDD via external conduit• Inquires and receives information about existing objects from
DDD or other DMs via the communication link
Information Processing (IP)Processes information via a set of computational algorithms based on limited knowledge of environment (errors in estimatingTask-resource requirements, errors in task and asset locations,limited knowledge of other DMs’ capabilities, etc.)
Action Selection (AS)• Selects actions according to a set of algorithmic rules• Dynamically updates its schedule as new information becomes
available
8
Environment Sensing Sub-model
Are there any tasks within detection range?
Can they be identified as hostile or friendly?
Can their resource requirements be measured?
Are there other members in the team?
Who owns what?
Who should be notified?
All subordinates of the current DM and his superior?
All of the team members? Simplify communication pattern(suitable for centralized C2)
Suitable for distributed C2Potential coordinating partners?
9
Information Processing Sub-modelCentralized implementation
Addresses the question of ‘what should be done now?’
READY Tasks: • Identified• Measured
known (estimated) resource reqs., stayingtime, location, speed, course, value, etc.
• Satisfy precedence constraints
FREE Platforms: Unassigned to any tasksat present
Environment Sensing (ES)
Select from FREE, subset of platforms that satisfy task resource requirements at least partially → FREE1
Order tasks by priority: • High task value• Time criticality
10
Action Selection Sub-modelAddresses the question of ‘who should do what and when?’
WHILE READY is not empty: Select from READY a task i with the highest priority
WHILE i’s resource requirements are not satisfiedSelect from FREE1 a platform with thehighest execution accuracy and minimumimpact on other tasks
END WHILE
Add task to ACTION queueEND WHILE
PeriodicOR
Event Driven
WHILE ACTION is not empty: Select from ACTION a task i (breadth first)
Execute i: Move closer, pursue, attack, coordinated attack, etc.
END WHILE
11
Scenario 1: Defend a Friendly Airbase (1)
Internal Workload Distribution
05
10152025303540
Decisionmaker ID
Inte
rnal
Wor
kloa
d
DM0DM1
DM2DM3
DM4DM5DM6
Gain Accumulation Over Time
0
20
40
60
80
100
120
1.47
2.27
2.75
3.07 3.4
3.8
4.13 4.6
4.82
5.22
5.37
5.67 5.8
6.47
7.13
7.45
7.98
8.87
Time (minutes)
Tota
l Gai
n
Workload distribution is not balanced Processing time: 8.87 minutes
Why ?
Task Arrival
01020304050607080
1 2 3 4 5
Time (min)
Num
ber
of T
asks
Each DM
XXXXXX2GNS2GNS
XXXXXX2A92A9
1F15XXX2A122A7
1F14XXX2A122A12
1BAS1AW4F148F15
Performance Measures:• Accrued Gain Over Time:
Measure of team efficiency in processing tasks → accuracy and timeliness
• Workload Distribution Among DMs:Balanced workload over all DMs is desired. Higher workload and increased differences in workload lead to degraded organizational performance
Aggregated operating
assets of a DM
Scenario:• A team of 7 identical DMs defend a friendly airbase• One hundred tasks arrive randomly from random directions
12
Scenario 1: Defend a Friendly Airbase (2)
• Basic strategy: DMs have identical capability to undertake the incoming tasks → handle tasks with minimal effort (fuel efficiency) → minimum platform-to-task distance
• Uneven platform spread among DMs → uneven platform-to-task distances among platforms belonging to different DMs → increased workload disparity
• Strategy adjustment: Better initial platform placement → more balanced workload distribution among DMs
Internal Workload Distribution
0
5
10
15
20
25
30
Decisionmaker ID
Inte
rnal
Wor
kloa
d
DM0DM1DM2DM3DM4DM5DM6
Gain Accumulation Over Time
0
20
40
60
80
100
120
1.3
2.03 2.2
2.7
3.08
3.28
3.75
4.08 4.3
4.7
5.05
5.08
5.33 5.6
6.05 6.6
7.2
8.38
Time (minutes)
Tota
l Gai
n
Lower processing time: 8.38 minutes(6% improvement)
Balanced workload distribution
13
Example 2: Part of A2C2 Experiment 8A team of 6 heterogeneous agents coordinate to execute a set of 15 complex tasks
(with values ranging from 0 to 50)FUNCTIONAL
D DM 1 2 3 4 5 6I Platform STRIKE BMD ISR AWC SuWC/MINES SOF/SARV 1 CVN 2F18S xxx 1UAV 2F18A, E2C 1FAB, 1MH53 1HH60I 2 DDGA 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFS 3 DDGB 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFI 4 CG 8TLAM 3ABM 1UAV 6SM2 1FAB,2HARP,1MH53 1HH60O 5 FFG* 2F18S xxx 1UAV 2F18A,E2C,4SM2 1FAB,2HARP,1MH53 1HH60N 6 DDGC 8TLAM 3ABM,4TTOM 1UAV 6SM2 1FAB, 2HARP 1HH60,1SOFAL
Differentiated by resource
requirements
F
D
START CMDCTR
NBW ABW
PORT
ABE NBE
3SOF
2SOF+2FAB
2SOF+2FAB
6STRK
2SOF
6STRK
2SOFBLOW
BRIDGE
Functional Scenario - f
RescueEfforts
START CMDCTR
NBW ABW
PORT
ABE NBE
1SOF+2STRK
1SOF+2STRK+1FAB
1SOF+2STRK +1FAB
2STRK+1FAB
1SOF+1STRK
1SOF+2STRK
BLOWBRIDGE
1SOF+2STRK
Divisional Scenario - d
RescueEfforts
14
Divisional Scenario (d): Accrued Gain
-50
0
50
100
150
200
250
300
350
0 80 160
240
320
400
480
560
640
720
800
880
960
1040
1120
1200
1280
1360
1440
Time (secs)
Acc
rued
Gai
nD_on_d F_on_d
Centralized Assignment
32.8
33
33.2
33.4
33.6
33.8
34
34.2
34.4
Organizations
Gai
n A
rea
(x10
000)
Divisional
Functional
∆=2.5%
Congruence →higher gain
15
Divisional Scenario (d): Workload Distribution
The external coordination workload of a DM is the sum of its direct coordination (aggregated time associated with simultaneous processing
of the same set of tasks) with other DMs
( )
( ) ( )∑
∑
=
≠=
⋅=
=
Task
DM
N
iiliki
N
kll
tuulkD
lkDkE
1
,1
,min,
,)(
0
100
200
300
400
500
600
700
1 2 3 4 5 6
DM ID
Agg
rega
ted
Coo
rdin
atio
n Ti
me
Divisional Organization
0
100
200
300
400
500
600
700
800
1 2 3 4 5 6
DM ID
Agg
rega
ted
Coo
rdin
atio
n Ti
me
Functional Organization
Incongruence →more time spent on coordination
16
-50
0
50
100
150
200
250
300
350
020
028
036
044
052
060
068
076
084
092
010
0010
8011
6012
4013
2014
0014
90
Time (secs)
Acc
rued
Gai
nF_on_f D_on_f
Centralized Assignment
24.8
25
25.2
25.4
25.6
25.8
26
26.2
26.4
26.6
Gai
n A
rea
(x10
000)
Functional
Divisional
∆=4.0%
Organizations
Tasks require multiple
resources of the same type
Functional Mission (f): Accrued Gain
17
Centralized Assignment:Performance Comparison
240
260
280
300
320
340
360
Gai
n A
rea
(X10
00)
Scenario f Scenario dD/f
D/d
F/d
F/f
Functional scenario
generates smaller gain
area
Congruent with Functional OrganizationCongruent with Divisional Organization
Incongruent Organization-mission pair
170
220
270
320
370
420
470
Ave
rage
Ext
erna
l Coo
rdin
atio
n
Scenario f Scenario d
D/f
D/d
F/d
F/f
Divisional organization copes better with higher
coordination requirement
150
200
250
300
350
400
450
263.9 253.4 341.8 333.3
Gain Area (x1000)
Ave
rage
Ext
erna
l Coo
rdin
atio
n
F/f
F/dD/f
D/d
18
Platform-to-Task Allocation via Auction
Match buyers to sellers to minimizesum of excess prices
↔A task selects the ‘best’ subset of platform(s)
Each platform is assigned to the ‘most attractive’ task
Price adjustment:
•Platform sets a current price•Task adjusts its offer
Platforms (sellers): Find highest offered
price
Task (buyer): Find cheapestavailable price
19
Action Selection via AuctionWHILE READY is not empty:
Select from READY a task i with highest priority
WHILE i’s resource requirement is not satisfiedSelect to bid from FREE1 a platform withthe highest execution accuracy and minimumimpact on other tasks
END WHILE
Add task to AUCTION_READY queueEND WHILE
WHILE ACTION is not empty: Select from ACTION a task i (breadth first)
Execute i: Move closer, pursue, attack, coordinated attack, etc.
END WHILE
WHILE not all members of AUCTION_READY is MATCHED: Select from AUCTION_READY a task i with highest priorityBid for all selected platformsPlatforms offer themselves to the highest bidders
Adjust the bid prices:WHILE i’s resource requirements are not satisfied
Select to bid from FREE1 a platform with the highest executionaccuracy and minimum impact on other tasks based on theadjusted prices
END WHILE
END WHILE
Auction Initialization Action Execution
Auction Process
20
Accrued Task Gain
-50
0
50
100
150
200
250
300
0 80 160
240
300
380
460
540
620
700
780
860
940
1020
1100
1180
1260
1340
1420
1500
Time (secs)A
ccru
ed G
ain
F_on_f D_on_f
Functional Scenario
-50
0
50
100
150
200
250
300
0 80 160
240
300
380
460
540
620
700
780
860
940
1020
1100
1180
1260
1340
1420
1500
Time (secs)
Acc
rued
Gai
n
D_on_d F_on_d
Divisional ScenarioShorter
completion time
Lower accrued gain
Auction (Functional Scenario)
26.5
27
27.5
28
28.5
29
29.5
Gai
n A
rea
(x10
000)
Functional
Divisional
∆=6.4%
Organizations
Auction (Divisional Scenario)
28.1
28.2
28.3
28.4
28.5
28.6
28.7
28.8
28.9
29
Organizations
Gai
n A
rea
(x10
000)
Divisional
Functional
∆=1.3%
Organizations
21
Congruent with Functional OrganizationCongruent with Divisional Organization
Incongruent Organization-mission pair
170
220
270
320
370
420
Ave
rage
Ext
erna
l Coo
rdin
atio
n
Scenario f Scenario d
D/f
D/d
F/d
F/f270
275
280
285
290
295
Gai
n A
rea
(X10
00)
Scenario f Scenario d
D/f
D/d
F/d
F/f
Auction-based Assignment:Performance Comparison
200
220
240
260
280
300
320
340
360
293.5 274.8 287.7 284
Gain Area (x1000)
Ave
rage
Ext
erna
l Coo
rdin
atio
n
F/f
F/dD/f
D/d
Incongruence → performance
degradation
22
Summary and Future Work
Results from the DDDIII-based agent framework demonstrate the potential of utilizing agents to drive large-scale C2 experiments
Extend the implementation to distributed decision-making processes via limited look-ahead, improved auction-based algorithm
Incorporate human cognitive limitations into the agent model to simulate more realistic decision-making processes
Extend the system to an integrated, dynamic, decision support system