Multiagent Coordination and Cooperation: Challenges and Techniques

2

No Agent is an Island

• Monitoring electricity networks (Jennings)

• Distributed design and engineering (Petrie et al.)

• Distributed meeting scheduling (Sen & Durfee)

• Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe)

• Collaborative Internet-agents (Etzioni & Weld, Weiss)

• Collaborative interfaces (Grosz & Ortiz, Andre)

• Information agent on the Internet (Klusch)

• Cooperative transportation scheduling (Fischer)

• Supporting hospital patient scheduling (Decker & Jin)

3

Design of automated agents to interact effectively

• Coordinate: to act upon one another in harmony (necessary)

• Cooperate: to work together (beneficial)

• Example: driving in Tel-Aviv vs. driving in a convoy.

4

Teams and Individuals

• Teams of agents that need to coordinate joint activities; problems: distributed information, distributed decision solving, local conflicts.

• Self-motivated agents acting in the same environment; problems: need motivation to cooperate, conflict resolution, trust, distributed and hidden information.

5

Cooperation and Coordination by Others

• Other entities coordinate their actions and cooperate in multi-entity environments: humans, animals, computers, particles.

• Formal theories: game theory, decision theory, physics, logic.

• Non-formal theories: organizational theories, political science theories, “advisory” negotiation.

6

Using other disciplines’ results

• No need to start from scratch!

• Modification and adjustment are required; AI gives insights and complementary methods.

• Is it worth using formal methods for multiagent systems?

7

Negotiations in the Pollution Sharing Problem

Collaborator: Esti Freitsis

(forthcoming book “Strategic Negotiation in Multiagent Environments”, MIT Press)

8

Environment Description

• There are some closely grouped plants in an industrial region.

• Each plant can produce several types of products and has a utility function (profit).

• There are several types of pollutants.

• Each plant has norms restricting the maximal emission of each pollutant that it emits. We refer to the situation in which only these norms have to be met as usual circumstances.

9

Special circumstances

• Sometimes there is a need to reduce pollution for some period because of external factors such as weather (high humidity, wind towards residential area). In this case plants receive new norms. We refer to this situation as special circumstances.

10

Current solution

• Current solution: each plant reduces pollution according to the new norms.

• Disadvantage: for one plant it is less costly to reduce one substance while for another it is less costly to reduce another substance.

11

Negotiations

• Our solution: plants negotiate to reach beneficial agreements on which substances each of them must reduce, and by what percentage.

– The conflict solution: following the new norms.

– First, we consider complete-information situations.

ICMAS-98 12

Strategic Negotiation Model

• Model of alternating offers (Rubinstein), which takes negotiation time into consideration; this reduces negotiation time.

• During the strategic negotiation, agents communicate their respective desires in order to reach a mutually beneficial agreement.

• The model provides a unified approach to many problems.

13

Structure of the Negotiation

• There are N self-motivated agents, randomly designated 1, 2, ...

• All the agents negotiate to reach an agreement.

• The negotiation process may include several iterations at equidistant time periods 0, 1, 2, … ∈ Time, and can continue forever.

• In each time period t, agent j(t) = t mod N makes an offer.

14

Structure of the Negotiation - cont.

• The other agents respond simultaneously: YES, NO, or OPT (opt out).

– If the offer is accepted by all the agents: the last offer is implemented.

– If at least one agent opts out: a conflict occurs.

– Otherwise (the offer was rejected by at least one agent), the negotiation proceeds to period t+1.
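The protocol on these two slides can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: the `Agent` class, its `propose` and `respond` callables, and the `max_periods` cut-off are hypothetical names introduced here, and agents are indexed from 0 so that `t % N` selects the proposer directly.

```python
YES, NO, OPT = "YES", "NO", "OPT"


class Agent:
    """Hypothetical agent: propose(t) returns an offer,
    respond(t, offer) returns YES, NO or OPT."""

    def __init__(self, propose, respond):
        self.propose = propose
        self.respond = respond


def negotiate(agents, max_periods=100):
    """Run the alternating-offers protocol.

    Returns (outcome, offer, period). max_periods is an assumption
    of this sketch; the model itself allows infinite negotiation.
    """
    n = len(agents)
    for t in range(max_periods):
        proposer = t % n  # agent j(t) = t mod N makes the offer
        offer = agents[proposer].propose(t)
        responses = [a.respond(t, offer)
                     for i, a in enumerate(agents) if i != proposer]
        if all(r == YES for r in responses):
            return ("agreement", offer, t)   # offer is implemented
        if OPT in responses:
            return ("conflict", None, t)     # someone opted out
        # Otherwise at least one agent said NO: go to period t+1.
    return ("no agreement", None, max_periods)


# Toy usage: agent i always proposes the number i, and every
# agent accepts any offer that is at least 1.
toy = [Agent(lambda t, i=i: i, lambda t, offer: YES if offer >= 1 else NO)
       for i in range(3)]
print(negotiate(toy))
```

In the toy run, agent 0's offer is rejected in period 0 and agent 1's offer is accepted in period 1.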

15

Negotiation Protocols

• Simultaneous responses: an agent responding to an offer is not informed of the other responses.

• Sequential responses: an agent responding to an offer is informed of the responses of the preceding agents (assuming that the agents are ordered).

16

Equilibrium

Nash equilibrium: a strategy profile p is a Nash equilibrium if no player i has a different strategy yielding an outcome that it prefers to the one generated when it chooses p_i, given the strategies of the other players.

Subgame-perfect equilibrium: a strategy profile is a subgame-perfect equilibrium if the strategy profile it induces in every subgame is a Nash equilibrium of that subgame.
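The Nash condition can be checked mechanically for a small game in pure strategies. The sketch below is only an illustration: the function name `is_nash` and the payoff table (a standard prisoner's dilemma) are assumptions introduced here and do not come from the negotiation model.

```python
def is_nash(payoffs, profile):
    """Check whether `profile` is a pure-strategy Nash equilibrium.

    payoffs maps every strategy profile (a tuple, one strategy per
    player) to a tuple of utilities, one per player.
    """
    for i in range(len(profile)):
        current = payoffs[profile][i]
        # All strategies that player i has available in this game.
        strategies_i = {p[i] for p in payoffs}
        for s in strategies_i:
            deviation = profile[:i] + (s,) + profile[i + 1:]
            # A profitable unilateral deviation breaks the equilibrium.
            if payoffs[deviation][i] > current:
                return False
    return True


# Prisoner's dilemma: C = cooperate, D = defect.
pd = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
print(is_nash(pd, ("D", "D")), is_nash(pd, ("C", "C")))
```

Mutual defection passes the check, while mutual cooperation fails because each player gains by deviating alone.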

17

Negotiation strategies for simultaneous responses

• For each possible agreement x that is better for all the plants than the conflict solution, there is a subgame-perfect equilibrium of the bargaining game with the outcome x offered and unanimously accepted in period 0.

18

Choosing the Allocation

• The owners of the plants can agree in advance on a joint technique for choosing x:

• guaranteeing each plant at least its conflict utility;

• maximizing a social welfare criterion:

– the sum of the plants’ utilities,

– or the generalized Nash product of the plants’ utilities: $\prod_s \bigl(U_s(x) - U_s(\mathrm{conflict})\bigr)$.
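The two welfare criteria can lead to different choices, as a small sketch shows. It assumes a finite set of candidate agreements, each given directly as a tuple of utilities; the function name `choose_agreement` and the numbers are hypothetical.

```python
from math import prod


def choose_agreement(candidates, conflict, criterion="sum"):
    """Pick an agreement among those no worse than conflict for anyone.

    candidates: name -> tuple of utilities (one per agent)
    conflict:   tuple of conflict utilities (one per agent)
    criterion:  "sum" (sum of utilities) or "nash" (Nash product)
    """
    feasible = {name: u for name, u in candidates.items()
                if all(ui >= ci for ui, ci in zip(u, conflict))}
    if not feasible:
        return None  # nothing beats conflict for everyone

    if criterion == "sum":
        def score(u):
            return sum(u)
    else:
        def score(u):
            # Product of the gains over the conflict utilities.
            return prod(ui - ci for ui, ci in zip(u, conflict))

    return max(feasible, key=lambda name: score(feasible[name]))


conflict = (10, 10)
candidates = {
    "a": (30, 11),  # large total, very uneven gains
    "b": (18, 18),  # smaller total, even gains
    "c": (40, 5),   # worse than conflict for the second agent
}
print(choose_agreement(candidates, conflict, "sum"),
      choose_agreement(candidates, conflict, "nash"))
```

Here the sum criterion selects "a" while the Nash product selects "b", because the product rewards agreements whose gains are spread more evenly.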

19

Negotiation strategies for sequential responses

• Assumption: there is a time period T after which the negotiation cannot continue. At T the conflict allocation is implemented.

• Perfect equilibrium by backward induction:

– At T-1, if the negotiation has not ended, A_{T-1} suggests the agreement that is best for itself among those that are better for all agents than the conflict solution (denoted O_{T-1}); the other agents accept.

– At T-2, A_{T-2} suggests the agreement that is best for itself among those that are better for all agents than both the conflict solution and O_{T-1} (denoted O_{T-2}); the other agents accept.

– By induction, in the first time period A_0 suggests O_0 and the other agents accept.
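The backward-induction argument can be traced on a toy instance. The sketch below makes several simplifying assumptions that are not in the slides: agreements form a small finite set, each is represented by its tuple of utilities, utilities do not depend on time, the proposer in period t is agent t mod N, and responders accept when indifferent (so "better" is read as "at least as good").

```python
def backward_induction(agreements, conflict, n_agents, deadline):
    """Return the offers [O_0, ..., O_{T-1}] found by backward induction.

    agreements: list of utility tuples (one utility per agent)
    conflict:   utility tuple of the conflict outcome, reached at T
    """
    continuation = conflict  # what happens if period t ends without a deal
    offers = [None] * deadline
    for t in range(deadline - 1, -1, -1):
        proposer = t % n_agents
        # Offers every agent weakly prefers to waiting and to conflict.
        acceptable = [a for a in agreements
                      if all(a[i] >= continuation[i] and a[i] >= conflict[i]
                             for i in range(n_agents))]
        if acceptable:
            # The proposer picks the acceptable offer best for itself.
            continuation = max(acceptable, key=lambda a: a[proposer])
        offers[t] = continuation
    return offers


agreements = [(30, 12), (20, 20), (12, 30)]
conflict = (10, 10)
print(backward_induction(agreements, conflict, n_agents=2, deadline=2))
print(backward_induction(agreements, conflict, n_agents=2, deadline=3))
```

In this toy setting the agent that proposes in the last period before the deadline determines the outcome, so changing the deadline from 2 to 3 flips the result from (12, 30) to (30, 12).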

20

Assumptions about the environment

• Profit is a linear function of the number of items of each product produced by the plant

• Pollution is a linear function of the number of items of each product produced.
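Under these two linearity assumptions, a single plant's decision can be written as a linear program. The formulation below is a sketch; the symbols $x_k$ (number of items of product $k$), $c_k$ (profit per item), $a_{jk}$ (emission of pollutant $j$ per item of product $k$), and $e_j$ (the plant's emission norm for pollutant $j$) are notation introduced here, not taken from the slides.

```latex
\max_{x \ge 0} \; \sum_{k} c_k \, x_k
\qquad \text{subject to} \qquad
\sum_{k} a_{jk} \, x_k \;\le\; e_j \quad \text{for every pollutant } j .
```

An agreement that changes the norms $e_j$ changes the feasible region, and therefore the profit each plant can reach.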

21

Techniques that were checked

• Sequential response: backtracking

• Simultaneous response:

– Maximization of the sum with guarantees of default profit (MaxSum)

– Maximization of the sum and Nash products with side payments (MaxSumNash)

• Simplex: method for linear optimization

– Maximization of the Nash product:

• Praxis: method for multi-variable nonlinear function minimization

• Hill climbing

22

Simulation Parameters

• The number of plants is varied from 5 to 20.

• The number of pollution types is varied from 5 to 20. For each product, pollution of a given type is produced with probability 1/2.

• Each plant produces Max_prod different types of products. Max_prod is varied from 5 to 20. Pollution and profit per item of product, and the pollution constraints, are set randomly.

• Results: average of 25 simulation runs.

23

Plants’ utility as a function of the number of plants

[Figure: utility per plant (y-axis, 0 to 1200) versus number of plants (x-axis, 5 to 20) for MaxSum, Nash Praxis, BackTracking, Nash Hill climbing, and MaxSumNash.]

24

Plants’ utility as a function of the number of products

[Figure: utility per plant (y-axis, 0 to 1200) versus number of products (x-axis, 5 to 20) for MaxSum, Nash Praxis, BackTracking, Nash Hill climbing, and MaxSumNash.]

25

Plants’ utility as a function of the number of pollutants

[Figure: utility per plant (y-axis, 0 to 500) versus number of pollutants (x-axis, 5 to 20) for MaxSum, Nash Praxis, BackTracking, Nash Hill climbing, and MaxSumNash.]

26

Conclusions (Complete Information)

• Simultaneous response:

– If side payments are permitted, the MaxSumNash method is the best.

– If side payments are not permitted, either BackTracking or MaxSum should be used.

• Sequential response: BackTracking should be used.

• Techniques: game theory, heuristic search, optimization methods.

27

Incomplete Information

• In real-world situations the plants do not have complete information about each other’s utility functions.

• Solution: using economic theories for distributed mechanisms for the reallocation of resources in “markets” with many agents and many divisible resources (Wellman 93).

28

General Equilibrium theory

• The general-equilibrium theory studies how the market prices are determined by the actions of the individuals.

• General equilibrium is obtained when a set of prices is found such that supply meets demand for each good and where the agents optimize their use of the goods at the current price levels.

29

General Equilibrium theory (Cont)

• Assumption: each agent behaves competitively - it takes prices as given, independently of its actions.

• Used for distributed mechanisms for resource allocation in environments with many agents and many divisible resources (Wellman).

30

Tatonnement

• It is a price-adjustment process (Walras 1954).

• The tatonnement process starts with some arbitrary price vector $p_0$.

• The agents determine their demand at those prices and report the quantities demanded to an “auctioneer”.

• The auctioneer repeatedly adjusts the prices: $p_{t+1} = p_t + (\text{quantity\_demanded} - \text{quantity\_available})$
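The price-adjustment loop can be sketched in a few lines. This is an illustration under stated assumptions: the step size `step`, the tolerance `tol`, the iteration cap, and the toy demand function (each good's demand depends only on its own price) are all introduced here; the update rule on the slide has no explicit step size.

```python
def tatonnement(demand, supply, prices, step=0.1, tol=1e-6, max_iter=10_000):
    """Adjust prices in the direction of excess demand.

    demand(prices) returns the aggregate quantity demanded per good.
    Returns (prices, converged).
    """
    for _ in range(max_iter):
        excess = [d - s for d, s in zip(demand(prices), supply)]
        if max(abs(e) for e in excess) < tol:
            return prices, True  # supply meets demand for every good
        # Raise the price of over-demanded goods, lower the others;
        # the small floor keeps prices strictly positive.
        prices = [max(p + step * e, 1e-9) for p, e in zip(prices, excess)]
    return prices, False  # did not converge within max_iter


def toy_demand(p):
    # Hypothetical aggregate demand, decreasing in each good's own price.
    return [10 / p[0], 6 / p[1]]


prices, converged = tatonnement(toy_demand, supply=[5, 3], prices=[1.0, 1.0])
print(converged, [round(p, 3) for p in prices])
```

For this well-behaved toy demand the loop settles near a price of 2 for both goods. With other demand functions the same loop can oscillate or stall, and it then returns `converged = False`.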

31

Tatonnement (Cont)

• If the sequence $p_0, p_1, \ldots$ converges, then the result is a competitive equilibrium.

• However, the tatonnement process does not converge to equilibrium in general.

• A sufficient condition for convergence is gross substitutability: if the price of one good rises, the demand for the other goods does not decrease.

• In the pollution allocation environment this condition does not hold.

32

Tatonnement (Cont)

• Moreover, in our case the utility functions are the result of constrained optimization, and therefore the aggregate demand function is not continuous.

• Thus, a general equilibrium does not always exist!

33

Market Mechanisms

• We propose three algorithms for finding suboptimal solutions to the pollution allocation problem.

• Tatonnement-based mechanism, the Competitive Equilibrium Market (CEM): the allocation of the pollutants is performed only after the process has terminated; very similar to the WALRAS algorithm [Wellman].

34

Greedy market mechanisms

• Market-Clearing with Intermediate Transactions (MCIT)

• Market-Clearing Intermediate Exchange (MCIE)

• A redistribution of the pollutants is done in each cycle of the mechanism. In MCIT a monetary transaction is performed after each cycle, and in MCIE an exchange of two pollutants is performed after each cycle.

35

The Three Market Mechanisms

• In all the mechanisms, at the beginning of the process the plants are allowed to emit their default allocation.

• In each cycle of the three mechanisms the auctioneer chooses one (or two in MCIE) of the pollutants randomly, and tries to determine its clearing price - the price at which demand is equal to supply, while keeping the prices of the other pollutants fixed. It uses binary search to find the clearing price.
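The binary search for a single pollutant's clearing price can be sketched as follows. The sketch assumes that excess demand (demand minus supply) is non-increasing in the price and changes sign between the bounds `low` and `high`; the function name, the bounds, and the toy excess-demand function are hypothetical.

```python
def clearing_price(excess_demand, low, high, tol=1e-6):
    """Binary search for the price at which demand equals supply.

    Assumes excess_demand(low) >= 0 >= excess_demand(high) and that
    excess demand does not increase with the price. The prices of
    all other pollutants are treated as fixed inside excess_demand.
    """
    while high - low > tol:
        mid = (low + high) / 2
        if excess_demand(mid) > 0:
            low = mid    # still over-demanded: the price must rise
        else:
            high = mid   # cleared or over-supplied: the price must fall
    return (low + high) / 2


def toy_excess(p):
    # Toy example: demand 10/p against a fixed supply of 5.
    return 10 / p - 5


print(round(clearing_price(toy_excess, low=0.1, high=100.0), 4))
```

If the demand function jumps across zero, no exact clearing price exists; the search then stops at the jump, and the mechanism still has to decide how to divide the pollutant at that price.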

36

Market Mechanisms (Cont)

• The process terminates when the prices have not changed for a predefined number of iterations, or when it reaches the predefined maximal number of iterations.

• The differences from tatonnement:

– the procedure used to find the clearing prices;

– the division of the pollutants given the clearing prices;

– the maximization problem that is solved by the plants when computing their demands.

37

The Influence of the Number of Plants on Plants’ Utilities

[Figure: utility per plant (y-axis, 0 to 1200) versus number of plants (x-axis, 5 to 20) for BackTracking, MCIE, MaxSumNash, CEM, and MCIT.]

38

The Influence of the Number of Products per Plant on the Plants’ Utilities

[Figure: utility per plant (y-axis, 0 to 1200) versus number of products (x-axis, 5 to 20) for BackTracking, MCIE, MaxSumNash, CEM, and MCIT.]

39

The Influence of the Number of Pollutants on the Plants’ Utilities

[Figure: utility per plant (y-axis, 0 to 500) versus number of pollutants (x-axis, 5 to 20) for BackTracking, MCIE, MaxSumNash, CEM, and MCIT.]

40

Conclusions (Incomplete Information)

• If side payments are permitted and the number of pollutants is small, then the MCIT method is the best.

• If side payments are not permitted or the number of pollutants is large, then the MCIE method is the best.

• Techniques: economics, heuristic search, optimization methods, binary search.

• Problem: will each plant behave competitively?

41

Motivating Example

• Group task: upgrade software on a network of workstations as part of a sys-admin group, tomorrow from 6-8 p.m.

• Outside offer: go to the theatre with friends, tomorrow from 7-9 p.m.

???

Agent must reconcile intentions:

• its intention to do the group task

• a potential intention to do the outside activity

42

Problem Description

• Self-interested agents:

– are committed to a collaborative activity;

– receive outside offers.

• They need to reconcile intentions, deciding between:

– defaulting on their group-related commitment;

– rejecting the outside offer.

• Agents assess outcomes using utility functions.

• How can agents be encouraged to consider the group’s good?

• What utility functions should agents use?

43

SPIRE Simulation System (SharedPlans Intention Reconciliation Experiments)

• Study the impact of:

– group norms and policies;

– agent utility functions;

– environmental factors.

• Goal: provide insights that agent developers can use to develop collaboration-capable agents (Grosz, Sullivan, Das, Kraus).

44

Decision-theory Based Frameworks

• Multi-attribute decision making (MADM); application:

– intention reconciliation in SharedPlans.

• Benefits: can use results from MADM, e.g., that the specific method is not so important, and standardization techniques.

• Problems: choosing attributes, assigning values, choosing weights.
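One common multi-attribute method is a weighted sum over standardized attribute values. The sketch below illustrates that idea only; it is not the utility function used in SPIRE. The attribute names (income, reputation), the weights, and the min-max standardization are assumptions made for the example.

```python
def weighted_scores(options, weights):
    """Score options by a weighted sum of min-max standardized attributes.

    options: name -> list of attribute values (higher is better)
    weights: one weight per attribute, assumed to sum to 1
    """
    n_attrs = len(weights)
    lows = [min(v[k] for v in options.values()) for k in range(n_attrs)]
    highs = [max(v[k] for v in options.values()) for k in range(n_attrs)]

    def standardize(x, k):
        # Map attribute k onto [0, 1]; a constant attribute maps to 0.
        if highs[k] == lows[k]:
            return 0.0
        return (x - lows[k]) / (highs[k] - lows[k])

    return {name: sum(w * standardize(v[k], k) for k, w in enumerate(weights))
            for name, v in options.items()}


# Hypothetical attributes: [income, reputation within the group].
options = {
    "keep_group_commitment": [50, 9],
    "accept_outside_offer": [80, 2],
}
scores = weighted_scores(options, weights=[0.4, 0.6])
print(max(scores, key=scores.get))
```

With these weights the agent keeps its group commitment; shifting weight towards income reverses the decision, which is why the choice of attributes and weights is listed above as the main difficulty.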

45

Game-theory Based Frameworks (Non-cooperative Models)

• Strategic-negotiation model based on the alternating-offers model of Rubinstein. Applications (forthcoming book, Kraus, 2001, MIT Press):

– pollution allocation;

– data allocation (Schwartz & Kraus AAAI97);

– resource allocation, task distribution;

– hostage crisis (Kraus & Wilkenfeld).

46

Advantages and Difficulties: Negotiation on Data Allocation

• Beneficial results; proved to be better than current methods; simple strategies.

• Problems:

– need to develop utility functions;

– finding possible actions: identifying optimal allocations is NP-complete;

– incomplete information: game theory provides limited solutions.

47

Game-theory Based Frameworks (Non-cooperative Models)

• Auctions; applications:

– data allocation (Schwartz & Kraus ATAL97, ICMAS00);

– electronic commerce.

• Subcontracting, based on principal-agent models. Applications:

– task allocation (Kraus, AIJ96).

48

Advantages and Difficulties: Auctions for Data Allocation

• Beneficial results; proved to be better than current methods.

• Problems:

– utility functions;

– it is difficult to find bidding strategies when there is incomplete information and the evaluations are dependent on each other: there are no procedures, and this needs to be combined with learning.

49

Game-theory Based Frameworks (Cooperative Models)

• Coalition theories; applications:

– group and team formation (Shehory & Kraus CI99).

• Benefits: well-defined concepts of stability; mechanisms to divide benefits.

• Difficulties: utility functions; no procedures for coalition formation; exponential problems.

• DPS model: combinatorial theories & operations research (Shehory & Kraus AIJ98).

Logical Models

Building agents on top of any software package.

Logic is a basis for an agent programming language (Subrahmanian et al., Heterogeneous Agent Systems: Theory and Implementation, MIT Press, 2000).

[Figure: layered agent architecture (service layer, message layer, authorization layer, decision layer) wrapped around existing code.]

51

Logical Models

• Modal logic, BDI models; applications:

– automated argumentation (Kraus, Sycara & Evenchik AIJ99);

– specification of SharedPlans (Grosz & Kraus AIJ96);

– bounded agents (Nirkhe, Kraus & Perlis JLC97);

– agents reasoning about other agents (Kraus & Lehmann TCT88; Kraus & Subrahmanian IJIS95).

52

Advantages and Difficulties: Logical Models

• Formal models with well-studied properties: excellent for specification.

• Problems:

– some assumptions are not valid (e.g., omniscience);

– complexity problems;

– there are no procedures for actions: this requires a lot of programming, decision making, and developing preferences.

53

Physics Based Models

• Physical models of particle dynamics. Applications: cooperation in large-scale multi-agent systems, e.g., freight deliveries within a metropolitan area (Shehory & Kraus ECAI96; Shehory, Kraus & Yadgar ATAL98, AIJ99).

• Benefits: efficient; inherits the physical properties.

• Problems: adjustments; potential functions.

54

Summary

• Benefits: formal models which have already been studied; lead to efficient results. No need to reinvent the wheel.

• Problems:

– Restrictions and assumptions made by other disciplines are not valid in real-world MAS situations: extensions are needed.

– It is difficult to develop utility functions.

– Complexity problems.
