
Distributed Rational Decision Making

Author: Tuomas W. Sandholm
Speakers: Praveen Guddeti (1-5), Tibor Moldovan (6-9)

CSE 976, April 15, 2002

Outline
1. Introduction
2. Evaluation criteria
3. Non-cooperative interaction protocols
   1. Voting
   2. Auctions
   3. Bargaining
   4. General equilibrium market mechanisms
   5. Contract nets
   6. Coalition formation
4. Conclusions

Introduction

• Automated negotiation systems with self-interested agents are becoming increasingly important.
  1. Technology push.
  2. Application pull.
• The paper deals with protocols designed using a non-cooperative, strategic perspective.

Evaluation Criteria

1. Social welfare
2. Pareto efficiency
3. Individual rationality
4. Stability
5. Computational efficiency
6. Distribution and communication efficiency

Evaluation Criteria

1. Social Welfare
• It is the sum of all agents’ payoffs or utilities in a given solution.
• Requires inter-agent utility comparisons.

Evaluation Criteria

2. Pareto Efficiency
• A solution x is Pareto efficient if there is no other solution x’ such that
  – at least one agent is better off in x’ than in x, and
  – no agent is worse off in x’ than in x.
• Does not require inter-agent utility comparisons.
• Social welfare maximizing solutions are a subset of Pareto efficient ones.
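The two criteria above can be compared on a small example. The sketch below is illustrative only: the three solutions and their payoff vectors are hypothetical, and the function names are not from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical payoffs (agent 1, agent 2) for three candidate solutions.
SOLUTIONS: Dict[str, Tuple[float, ...]] = {
    "x1": (4.0, 1.0),
    "x2": (2.0, 2.0),
    "x3": (1.0, 1.0),
}


def social_welfare(payoffs: Tuple[float, ...]) -> float:
    """Sum of all agents' payoffs; this needs inter-agent comparability."""
    return sum(payoffs)


def dominates(p: Tuple[float, ...], q: Tuple[float, ...]) -> bool:
    """True if no agent is worse off in p than in q and some agent is better off."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_efficient(solutions: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Names of solutions that no other solution Pareto-dominates."""
    return sorted(
        name
        for name, p in solutions.items()
        if not any(dominates(q, p) for q in solutions.values())
    )


def welfare_maximizers(solutions: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Names of solutions with the highest social welfare."""
    best = max(social_welfare(p) for p in solutions.values())
    return sorted(n for n, p in solutions.items() if social_welfare(p) == best)


print(pareto_efficient(SOLUTIONS))    # x1 and x2 are undominated
print(welfare_maximizers(SOLUTIONS))  # only x1 maximizes the sum
```

In this example x2 is Pareto efficient without maximizing social welfare, while the welfare maximizer x1 is also Pareto efficient, matching the subset relation stated above.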

Evaluation Criteria

3. Individual Rationality
• Participation in a negotiation is individually rational to an agent only if it is profitable.
• A mechanism is individually rational if participation is individually rational for all agents.
• Only individually rational mechanisms are viable.

Evaluation Criteria

4. Stability
• The protocol mechanisms should motivate each agent to behave in the desired manner.
• Protocol mechanisms may have dominant strategies. This means that an agent is best off by using a specific strategy no matter what strategies the other agents use.
• Nash equilibrium: Each agent chooses a strategy that is a best response to the other agents’ strategies.

Nash Equilibrium: Formal definition

• The strategy profile $S^*_A = \{S^*_1, S^*_2, \ldots, S^*_{|A|}\}$ among agents $A$ is in Nash equilibrium if for each agent $i$, $S^*_i$ is the agent’s best strategy given that the other agents choose strategies $\{S^*_1, S^*_2, \ldots, S^*_{i-1}, S^*_{i+1}, \ldots, S^*_{|A|}\}$.
• In some games no Nash equilibrium exists (in pure strategies).
• Some games have multiple Nash equilibria.
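The definition can be checked mechanically for small two-player games. The following is a minimal sketch restricted to pure strategies; the three payoff tables are standard textbook games chosen for illustration, not examples from the paper.

```python
from itertools import product
from typing import Dict, List, Tuple

Payoffs = Dict[Tuple[str, str], Tuple[float, float]]


def pure_nash_equilibria(payoffs: Payoffs) -> List[Tuple[str, str]]:
    """Return all pure-strategy profiles where each strategy is a best response.

    payoffs[(s1, s2)] = (payoff to agent 1, payoff to agent 2).
    """
    rows = sorted({s1 for s1, _ in payoffs})
    cols = sorted({s2 for _, s2 in payoffs})
    equilibria = []
    for s1, s2 in product(rows, cols):
        u1, u2 = payoffs[(s1, s2)]
        # Agent 1 cannot gain by deviating while agent 2 keeps s2, and vice versa.
        best1 = all(payoffs[(r, s2)][0] <= u1 for r in rows)
        best2 = all(payoffs[(s1, c)][1] <= u2 for c in cols)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria


# Illustrative games (assumed payoffs).
prisoners_dilemma = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
                     ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
matching_pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
                    ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
coordination = {("A", "A"): (2, 1), ("A", "B"): (0, 0),
                ("B", "A"): (0, 0), ("B", "B"): (1, 2)}

print(pure_nash_equilibria(prisoners_dilemma))  # a single equilibrium
print(pure_nash_equilibria(matching_pennies))   # none in pure strategies
print(pure_nash_equilibria(coordination))       # two equilibria
```

The three games cover the cases listed above: a unique equilibrium, no pure-strategy equilibrium, and multiple equilibria.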

Nash Equilibrium: Comments

• Even if a Nash equilibrium exists and is unique, there are limitations regarding what it guarantees.
• In sequential games it only guarantees stability at the beginning of the game.
• Subgame perfect Nash equilibrium: a refinement that requires the strategies to be in equilibrium in every subgame.
• Nash equilibrium is often too weak because subgroups of agents can deviate in a coordinated manner.
• Sometimes efficiency and stability goals conflict.

Evaluation Criteria

5. Computational Efficiency
• The protocol mechanisms, when used by agents, should need as little computation as possible.
• Trade-off between:
  – the cost of the computation needed for the protocol mechanisms, and
  – the solution quality.

Evaluation Criteria

6. Distribution and Communication Efficiency

• Distributed protocols should be preferred in order to avoid a single point of failure and a performance bottleneck – among other reasons.

• Minimize the amount of communication required to get to a desired global solution.

• These two goals can conflict.



Non-cooperative Interaction Protocols

1. Voting
• All agents give input to a mechanism.
• The outcome chosen by the mechanism is the solution for all agents.
• The outcome is enforced.
• Voters.
  – Truthful voters.
  – Strategic (insincere) voters.

Voters

Truthful Voters (1)

• Each agent $i \in A$ has an asymmetric and transitive strict preference relation $\succ_i$ on $O$.
• Social choice rule.
  – Input: the agents’ preference relations $(\succ_1, \ldots, \succ_{|A|})$.
  – Output: the social preferences, denoted by a relation $\succ^*$.

Truthful Voters (2)

Properties of a social choice rule
• $\succ^*$ should exist for all possible inputs.
• $\succ^*$ should be defined for every pair $o, o' \in O$.
• $\succ^*$ should be asymmetric and transitive over $O$.
• The outcome should be Pareto efficient.
• The scheme should be independent of irrelevant alternatives.
• No agent should be a dictator.

Truthful Voters (3)

• Arrow’s Impossibility Theorem: No social choice rule satisfies all of these six conditions.
• Relax the first property.
• Relax the third property.
  – Plurality protocol.
  – Binary protocol.
  – Borda protocol.

Truthful Voters (4)

Plurality Protocol
• Majority voting protocol.
• All alternatives are compared simultaneously.
• The one having the highest number of votes wins.
• An irrelevant alternative can split the majority.

Truthful Voters (5)

Binary Protocol (1)

• Pairwise voting, with the winner staying to challenge the remaining alternatives.

• Irrelevant alternatives can change the outcome.

• The agenda, i.e. the order of the pairings, can change the outcome.

Truthful Voters (6)

Binary Protocol (2)

• 35% of agents have preferences $c \succ d \succ b \succ a$
• 33% of agents have preferences $a \succ c \succ d \succ b$
• 32% of agents have preferences $b \succ a \succ c \succ d$
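The agenda effect can be reproduced directly from this preference profile. The sketch below assumes truthful voters and simple majority in each pairing; the function names and the four agendas tried are illustrative choices.

```python
from typing import List, Tuple

# Preference profile from the slide: (percentage of voters, ranking from best to worst).
PROFILE: List[Tuple[int, str]] = [(35, "cdba"), (33, "acdb"), (32, "bacd")]


def pairwise_winner(x: str, y: str, profile: List[Tuple[int, str]] = PROFILE) -> str:
    """Majority winner between x and y when every voter votes truthfully."""
    votes_x = sum(w for w, ranking in profile if ranking.index(x) < ranking.index(y))
    votes_y = sum(w for w, ranking in profile if ranking.index(y) < ranking.index(x))
    return x if votes_x > votes_y else y


def binary_protocol(agenda: str, profile: List[Tuple[int, str]] = PROFILE) -> str:
    """Run pairwise votes in agenda order; the winner stays to face the next alternative."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        winner = pairwise_winner(winner, challenger, profile)
    return winner


for agenda in ("abcd", "cdab", "bcda", "cabd"):
    print(agenda, "->", binary_protocol(agenda))
```

With these preferences each of the four alternatives wins under some agenda. Under the agenda c, a, b, d the winner is d, even though every voter prefers c to d.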

Truthful Voters (7)

Borda Protocol (1)

• Assign an alternative $|O|$ points whenever it is the highest in some agent’s preference list, $|O| - 1$ when it is second, and so on.
• Sum the counts of all alternatives.
• The alternative with the highest count is the winner.
• Irrelevant alternatives lead to paradoxical results.

Borda Protocol (2)

Agent   Preferences
1       a b c d
2       b c d a
3       c d a b
4       a b c d
5       b c d a
6       c d a b
7       a b c d

• Borda count: c wins with 20, b has 19, a has 18, d loses with 13.
• Borda count with d removed: a wins with 15, b has 14, c loses with 13.
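The counts in this example can be recomputed with a few lines of code. This is a minimal sketch of the Borda count as described on the previous slide; the helper names are illustrative.

```python
from collections import Counter
from typing import Dict, List

# The seven agents' rankings from the slide, best alternative first.
RANKINGS: List[str] = ["abcd", "bcda", "cdab", "abcd", "bcda", "cdab", "abcd"]


def borda(rankings: List[str]) -> Dict[str, int]:
    """Give |O| points for a first place, |O| - 1 for a second place, and so on."""
    scores: Counter = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += n - position
    return dict(scores)


def remove_alternative(rankings: List[str], alternative: str) -> List[str]:
    """Drop one alternative from every ranking, keeping the remaining order."""
    return [ranking.replace(alternative, "") for ranking in rankings]


print(borda(RANKINGS))                           # c has the highest count
print(borda(remove_alternative(RANKINGS, "d")))  # without d, a has the highest count
```

Removing the worst alternative d reverses the order of the remaining three, which is the paradox the slide refers to.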

Voters

Strategic (Insincere) Voters (1)

• Revelation principle: Suppose some protocol implements social choice function f(.) in Nash (or dominant strategy) equilibrium, then f(.) is implementable in Nash (or dominant strategy) equilibrium via a single-step protocol where the agents reveal their types truthfully.

Voters

Strategic (Insincere) Voters (2)
• Gibbard-Satterthwaite impossibility theorem:

Let each agent’s type $\theta_i$ consist of a preference order $\succ_i$ on $O$. Let there be no restrictions on $\succ_i$, i.e. each agent may rank the outcomes $O$ in any order. Let $|O| \geq 3$. Now, if the social choice function $f(\cdot)$ is truthfully implementable in a dominant strategy equilibrium, then $f(\cdot)$ is dictatorial, i.e. there is some agent $i$ who gets (one of) his most preferred outcomes chosen no matter what types the others reveal.

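Voters

Strategic (Insincere) Voters (3)

• Circumventing the Gibbard-Satterthwaite impossibility theorem (GSIT):
  – Restricted preferences.
  – Groves-Clarke Tax Mechanism.
• Groves-Clarke Tax Mechanism:
  – $o = (g, \pi_1, \ldots, \pi_{|A|})$, where $\pi_i$ is the amount agent $i$ receives.
  – $g$ encodes the other features of the outcome.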

Strategic (Insincere) Voters (4)
Groves-Clarke Tax Mechanism (1)

• Quasilinear preferences: $u_i(o) = v_i(g) + \pi_i$.

• Net benefit: $v_i(g) = v_i^{\text{gross}}(g) - P / |A|$.

• Every agent $i \in A$ reveals his valuation $v_i(g)$ for every possible $g$.

• The social choice is $g^* = \arg\max_g \sum_i v_i(g)$.

• Every agent is levied a tax: $\text{tax}_i = \sum_{j \neq i} v_j(\arg\max_g \sum_{k \neq i} v_k(g)) - \sum_{j \neq i} v_j(g^*)$.
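The steps of the mechanism can be sketched in a few lines. The tax here is computed as the total the other agents would obtain if agent i's declaration were ignored, minus the total they obtain under the chosen g*, so it is never negative. The three agents and their net valuations for a hypothetical "build" or "skip" decision are made-up numbers.

```python
from typing import Dict, List, Tuple


def clarke_mechanism(valuations: List[Dict[str, float]]) -> Tuple[str, List[float]]:
    """Choose the outcome maximizing the declared total and compute each agent's tax.

    valuations[i][g] is agent i's declared net valuation of outcome g.
    """
    outcomes = list(valuations[0].keys())
    everyone = list(range(len(valuations)))

    def best_outcome(agents: List[int]) -> str:
        # Outcome with the highest summed valuation among the given agents.
        return max(outcomes, key=lambda g: sum(valuations[j][g] for j in agents))

    g_star = best_outcome(everyone)
    taxes = []
    for i in everyone:
        others = [j for j in everyone if j != i]
        g_without_i = best_outcome(others)
        # How much agent i's declaration lowers the others' total valuation.
        tax = (sum(valuations[j][g_without_i] for j in others)
               - sum(valuations[j][g_star] for j in others))
        taxes.append(tax)
    return g_star, taxes


# Hypothetical declarations: net benefit of each outcome to each agent.
declared = [
    {"build": 30.0, "skip": 0.0},
    {"build": -10.0, "skip": 0.0},
    {"build": -15.0, "skip": 0.0},
]
print(clarke_mechanism(declared))
```

Only the first agent changes the outcome (without it the others would choose "skip"), so only that agent pays a tax, equal to the 25 units of net benefit the others lose.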

Strategic (Insincere) Voters (5)
Groves-Clarke Tax Mechanism (2)

• The size of an agent’s tax is exactly how much his vote lowers the others’ utility.
• Quasilinearity:
  – No agent should care how others divide payoffs among themselves.
  – An agent’s valuation $v_i^{\text{gross}}(g)$ should not depend on the amount of money that the agent will have.

Voters

Strategic (Insincere) Voters (6)

• If each agent has quasilinear preferences, then each agent’s dominant strategy is to reveal his true preferences.

• Agents need not waste effort in counterspeculating each other’s preference declarations.

• Participation is individually rational.

Voters

Strategic (Insincere) Voters (7)
• Problems of the Groves-Clarke Tax Mechanism:
  – Does not maintain budget balance.
  – Not coalition proof.
  – Intractable.
• Other ways to circumvent the GSIT:
  – Choosing a dictator randomly.
  – Making the computation of an untruthful revelation prohibitively costly.


2. Auctions
• Unlike voting, where the outcome binds all agents, in auctions the outcome is usually a deal between two agents.
• In voting the protocol designer wants to enhance the social good, while in auctions the auctioneer wants to maximize his own profit.
• Classical setting.
• Contracting setting.

Auctions (2)
1. Auction settings.
2. Auction protocols.
3. Efficiency of the resulting allocation.
4. Revenue equivalence and non-equivalence.
5. Bidder collusion.
6. Lying auctioneer.
7. Bidders lying in non-private-value auctions.
8. Undesirable private information revelation.
9. Roles of computation in auctions.

1. Auction settings

• Three qualitatively different auctions depending on how an agent’s value of the item is formed:

1. Private value.
2. Common value.
3. Correlated value.

2. Auction protocols

1. English (first-price open-cry) auction.
2. First-price sealed-bid auction.
3. Dutch (descending) auction.
4. Vickrey (second-price sealed-bid) auction.

• Versions of the Vickrey auction have been used in computational multiagent systems: allocating computation resources in operating systems, allocating bandwidth in computer networks, and computationally controlling building heating.

• The Vickrey auction has not been widely adopted in auctions among humans.
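The two sealed-bid protocols differ only in the price the winner pays. The sketch below illustrates that difference; the bidder names and bids are hypothetical, and the single-bidder fallback (pay one's own bid) is an assumption, not something the slides specify.

```python
from typing import Dict, Tuple


def sealed_bid_auction(bids: Dict[str, float], second_price: bool = False) -> Tuple[str, float]:
    """Return (winner, price) for a first-price or Vickrey sealed-bid auction.

    The highest bidder wins in both protocols. In the Vickrey auction the
    winner pays the second-highest bid instead of its own.
    """
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    if second_price and len(ranked) > 1:
        price = ranked[1][1]
    else:
        # First-price rule, or (assumed) fallback when there is only one bid.
        price = top_bid
    return winner, price


bids = {"ann": 10.0, "bob": 7.0, "cat": 4.0}  # hypothetical bids
print(sealed_bid_auction(bids))                     # first-price: ann pays her own bid
print(sealed_bid_auction(bids, second_price=True))  # Vickrey: ann pays bob's bid
```

Because the Vickrey price does not depend on the winner's own bid, a bidder with a private value has no reason to shade its bid, which is why truthful bidding is a dominant strategy in that setting.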

3. Efficiency of the resulting allocation.

• In isolated private value or common value auctions, each one of the four auction protocols allocates the auctioned item Pareto efficiently to the bidder who values it the most.

• All four protocols are Pareto efficient in the allocation.

• The protocols with dominant strategies (Vickrey and English) are more efficient in the sense that no effort is wasted in counterspeculating the other bidders.

4. Revenue equivalence and non-equivalence.

• Revenue equivalence: All of the four auction protocols produce the same expected revenue to the auctioneer in private value auctions where the values are independently distributed and bidders are risk-neutral.

• Among risk averse bidders, the Dutch and the first-price sealed-bid protocols give higher expected revenue to the auctioneer.

• A risk averse auctioneer achieves higher expected utility via the Vickrey or English protocols.
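Revenue equivalence can be illustrated with a small Monte Carlo experiment. The sketch below assumes risk-neutral bidders whose private values are drawn independently from the uniform distribution on [0, 1]; under that assumption the symmetric equilibrium bid in a first-price (or Dutch) auction is (n-1)/n times the bidder's value, a standard result that is not stated in the slides. The number of bidders, trial count, and seed are arbitrary.

```python
import random
from typing import Tuple


def simulate_revenue(n_bidders: int = 4, trials: int = 50_000, seed: int = 0) -> Tuple[float, float]:
    """Estimate expected revenue of first-price and second-price auctions."""
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    for _ in range(trials):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price / Dutch: the highest-value bidder wins and pays its shaded bid.
        first_price_total += (n_bidders - 1) / n_bidders * values[-1]
        # Vickrey / English: truthful bidding, the winner pays the second-highest value.
        second_price_total += values[-2]
    return first_price_total / trials, second_price_total / trials


first_price, second_price = simulate_revenue()
print(round(first_price, 3), round(second_price, 3))
```

Both estimates should be close to (n-1)/(n+1), which is 0.6 for four bidders. The sketch does not cover the risk-averse cases listed above, where the equivalence breaks down.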

Revenue equivalence and non-equivalence. (2)

• In non-private value auctions, both the English and Vickrey protocols produce greater expected revenue to the auctioneer than the first-price sealed-bid auction or Dutch auction.

• In non-private value auctions with at least three bidders, the English auction leads to higher revenue than the Vickrey auction.

5. Bidder collusion.
• The English auction and the Vickrey auction actually self-enforce some of the most likely collusion agreements.
• First-price sealed-bid and Dutch auctions are preferred for deterring collusion.
• For collusion to take place in Vickrey, first-price sealed-bid or Dutch auctions, the bidders have to identify each other before placing their bids.

6. Lying Auctioneer.
• In the Vickrey auction the auctioneer may lie about the value of the second highest bid.
• In the English auction the auctioneer can use shills that bid in the auction in order to make the real bidders increase their valuations of the item.
• The auctioneer may bid himself to guarantee that the item will not be sold below a certain price.

7. Bidders lying in non-private-value auctions.

• Winner’s curse: If an agent bids its valuation and wins the auction, it will know that its valuation was too high because the other agents bid less.
• Agents should therefore bid less than their valuations.
• This is the best strategy in Vickrey auctions as well.
• Vickrey fails to induce truthful bidding in most auction settings.

8. Undesirable private information revelation.

• In Vickrey auctions the agents often bid truthfully. This leads to the bidders revealing their true valuations.

• This information is sensitive and the bidders would prefer not to reveal it.

• Another reason why the Vickrey auction protocol is not widely used among humans.

9. Roles of computation in auctions.

• Two issues arise from computation in auctions:

1. Computationally complex look ahead that arises when auctioning interrelated items one at a time.

2. Implications of costly local marginal cost (valuation) computation or information gathering in a single-shot auction.

Interrelated auctions.

• Lookahead:
  – Without lookahead the allocation may be inefficient.
  – With lookahead the agents will not bid their true per-item cost.
  – Computation cost may be prohibitively great.
• Allow agents to backtrack from commitments by paying penalties.

Single-shot auctions

• Incentive to counterspeculate: In a single-shot private value Vickrey auction with uncertainty about an agent’s own valuation, a risk neutral agent’s best action can depend on the other agents. It follows that it is worth counterspeculating.


3. Bargaining
• Real world settings usually consist of a finite number of competing agents, so neither monopoly, nor monopsony, nor perfect competition assumptions strictly apply.
• Bargaining theory fits in this gap.
• Bargaining theory:
  1. Axiomatic
  2. Strategic

Bargaining Theory

Axiomatic
• Does not use the idea of equilibrium.
• Desirable properties for a solution, called axioms of the bargaining solution, are postulated.
• Then the solution that satisfies these axioms is sought.
• Nash bargaining solution.

Axiomatic Bargaining Theory

Nash Bargaining Solution (1)
• Nash analyzed a 2-agent setting where the agents have to decide on an outcome $o \in O$, and the fallback outcome $o_{\text{fallback}}$ occurs if no agreement is reached.
• There is a utility function $u_i: O \to \mathbb{R}$ for each agent $i \in \{1, 2\}$.
• It is assumed that the set of feasible utility vectors $\{(u_1(o), u_2(o)) \mid o \in O\}$ is convex.


Nash Bargaining Solution (2)

• Axioms for the Nash bargaining solution u* = (u1(o*), u2(o*)) are:

1. Invariance.
2. Anonymity (symmetry).
3. Independence of irrelevant alternatives.
4. Pareto efficiency.


Nash Bargaining Solution (3)

• The unique solution that satisfies these four axioms is:

$o^* = \arg\max_o \, [u_1(o) - u_1(o_{\text{fallback}})] \, [u_2(o) - u_2(o_{\text{fallback}})]$

• Other bargaining solutions also exist.
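The maximization above can be illustrated by a brute-force search over a discretized outcome set. In this sketch the agents split 100 cents, an outcome is agent 1's share, and both fallback utilities are 0; these choices, the utility functions, and the use of a grid instead of a convex set are simplifying assumptions for illustration.

```python
import math
from typing import Callable, Iterable


def nash_bargaining(
    outcomes: Iterable[int],
    u1: Callable[[int], float],
    u2: Callable[[int], float],
    d1: float = 0.0,
    d2: float = 0.0,
) -> int:
    """Return the outcome maximizing [u1(o) - d1] * [u2(o) - d2].

    d1 and d2 are the agents' utilities of the fallback outcome. Only outcomes
    that leave both agents at least as well off as the fallback are considered.
    """
    feasible = [o for o in outcomes if u1(o) >= d1 and u2(o) >= d2]
    return max(feasible, key=lambda o: (u1(o) - d1) * (u2(o) - d2))


shares = range(0, 101)  # agent 1's share in cents; agent 2 gets the rest

# Both agents have linear utility in money: the dollar is split evenly.
print(nash_bargaining(shares, lambda x: x, lambda x: 100 - x))

# Agent 2 is risk averse (concave utility): agent 1 gets the larger share.
print(nash_bargaining(shares, lambda x: x, lambda x: math.sqrt(100 - x)))
```

The second call shows a known property of the Nash solution: the more risk-averse agent ends up with the smaller share.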

Bargaining Theory

Strategic (1)
• The bargaining situation is modeled as a game.
• The solution is based on an analysis of which of the players’ strategies are in equilibrium.
• The solution is not unique.
• Explains the behavior of rational utility maximizing agents better than axiomatic approaches.
• Usually analyzes sequential bargaining.

Bargaining Theory

Strategic (2)

1. Finite number of offers with no time discount.
2. Finite number of offers with time discount.
3. Infinite number of offers with no time discount.
4. Infinite number of offers with time discount.

Strategic Bargaining Theory

Rubinstein Bargaining Solution
• In a discounted infinite round setting, the subgame perfect Nash equilibrium outcome is unique. Agent 1 gets $(1 - \delta_2) / (1 - \delta_1 \delta_2)$, where $\delta_1$ is 1’s discount factor and $\delta_2$ is 2’s. Agent 2 gets one minus this. Agreement is reached in the first round.
• The proof gives a way to solve for subgame perfect Nash equilibrium payoffs.
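The closed form can be cross-checked against backward induction on a long finite game. The sketch below assumes that agent 1 makes the first offer, that offers alternate, that the proposer in the final round keeps the whole dollar, and that a responder accepts when indifferent; the discount factors are arbitrary example values.

```python
def rubinstein_share(delta1: float, delta2: float) -> float:
    """Agent 1's share of the dollar in the infinite-round discounted game."""
    return (1 - delta2) / (1 - delta1 * delta2)


def finite_horizon_share(delta1: float, delta2: float, rounds: int) -> float:
    """Agent 1's share when agent 1 proposes in round 1 and the game lasts `rounds` rounds."""
    share = 1.0  # the proposer in the final round keeps everything
    for t in range(rounds - 1, 0, -1):
        # The responder in round t is the proposer in round t + 1, so it must be
        # offered the discounted value of what it would get by waiting.
        responder_delta = delta2 if t % 2 == 1 else delta1
        share = 1.0 - responder_delta * share
    return share


print(rubinstein_share(0.9, 0.9))
print(finite_horizon_share(0.9, 0.9, 201))
```

As the number of rounds grows, the finite-horizon share converges to the Rubinstein share, and the first proposer keeps slightly more than half when the discount factors are equal.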


Fixed Bargaining Cost per negotiation round

• If the agents have symmetric bargaining costs, the solution concept is powerless: any split of the dollar can be supported in subgame perfect Nash equilibrium.

• If 1’s bargaining cost c1 is even slightly smaller than 2’s cost c2, then 1 gets the entire dollar.

• If 1’s bargaining cost is greater than 2’s, then 1 receives a payoff that equals the second agent’s bargaining cost, and agent 2 receives the rest.
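The three cases above amount to a simple piecewise rule. The sketch below encodes it for a dollar normalized to 1; returning None for equal costs is an illustrative way to signal that the solution concept does not single out a split.

```python
from typing import Optional, Tuple


def fixed_cost_split(c1: float, c2: float) -> Optional[Tuple[float, float]]:
    """Payoffs (agent 1, agent 2) with fixed per-round bargaining costs c1 and c2.

    Agent 1 is assumed to make the first offer. Returns None when the costs
    are equal, because then no unique split is predicted.
    """
    if c1 < c2:
        return (1.0, 0.0)       # agent 1 gets the entire dollar
    if c1 > c2:
        return (c2, 1.0 - c2)   # agent 1 gets c2, agent 2 gets the rest
    return None


print(fixed_cost_split(0.01, 0.02))
print(fixed_cost_split(0.05, 0.02))
print(fixed_cost_split(0.02, 0.02))
```

The outcome depends only on which agent has the smaller cost, not on how large the difference is, which makes it very sensitive to small changes in the costs.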


Recent extensions [Kraus et al.]

• Sequential bargaining with outside options.
• Sequential bargaining where one agent gains and one loses over time.
• Negotiation over time when agents do not know each other’s types.

Bargaining

Computation (1)

• The bargaining models above assume perfect rationality.
• The space of deals is assumed to be fully comprehended by the agents.
• The value of each potential contract is assumed to be known.
• Focus of future work:
  – make the cost of search explicit, and
  – consider its trade-off with bargaining gains.

Bargaining

Computation (2)• There are two searches occurring in bargaining:

1. Intra-agent deliberative search: an agent locally generates alternatives, evaluates them, counter speculates, does look ahead etc.

2. Inter-agent committal search: the agents make agreements with each other regarding the solution.

1. Introduction
2. Evaluation criteria
3. Non-cooperative interaction protocols
4. Conclusions
