Chapter 11: Multiagent Interactions · An Introduction to Multiagent Systems 2e

Post on 14-Aug-2020


CHAPTER 11: MULTIAGENT INTERACTIONS

An Introduction to Multiagent Systems

http://www.csc.liv.ac.uk/˜mjw/pubs/imas/

Chapter 11, An Introduction to Multiagent Systems 2e

1 What are Multiagent Systems?

[Figure: labels not recoverable from the extraction.]

Thus a multiagent system contains a number of agents ...
• ... which interact through communication ...
• ... are able to act in an environment ...
• ... have different “spheres of influence” (which may coincide) ...
• ... will be linked by other (organisational) relationships.

2 Utilities and Preferences

• Assume we have just two agents: $Ag = \{i, j\}$.
• Agents are assumed to be self-interested: they have preferences over how the environment is.
• Assume $\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of “outcomes” that agents have preferences over.
• We capture preferences by utility functions:

$u_i : \Omega \to \mathbb{R}$
$u_j : \Omega \to \mathbb{R}$

• Utility functions lead to preference orderings over outcomes:

$\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
$\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$

What is Utility?

• Utility is not money (but it is a useful analogy).
• Typical relationship between utility & money:

[Figure: plot with axes labelled “utility” and “money”.]

3 Multiagent Encounters

• We need a model of the environment in which these agents will act ...
  – agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in $\Omega$ will result;
  – the actual outcome depends on the combination of actions;
  – assume each agent has just two possible actions that it can perform: “C” (“cooperate”) and “D” (“defect”).

• Environment behaviour given by state transformer function:

$\tau : \underbrace{Ac}_{\text{agent } i\text{'s action}} \times \underbrace{Ac}_{\text{agent } j\text{'s action}} \to \Omega$

• Here is a state transformer function:

$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_3 \quad \tau(C,C) = \omega_4$

(This environment is sensitive to actions of both agents.)

• Here is another:

$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_1 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_1$

(Neither agent has any influence in this environment.)

• And here is another:

$\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_2$

(This environment is controlled by j.)
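The three state transformer functions above can be written down as lookup tables. The following is a minimal sketch, assuming outcomes are represented by the string labels `"w1"` to `"w4"` and that the first component of each key is agent i's action; the helper name `influences` is hypothetical and not part of the slides.

```python
ACTIONS = ("C", "D")  # cooperate, defect

# Keys are (agent i's action, agent j's action); values are outcome labels.
tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {("D", "D"): "w1", ("D", "C"): "w1", ("C", "D"): "w1", ("C", "C"): "w1"}
tau_j_only = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}


def influences(tau, agent):
    """Return True if `agent` (0 for i, 1 for j) can change the outcome by
    changing its own action while the other agent's action is held fixed."""
    other = 1 - agent
    for fixed in ACTIONS:
        outcomes = set()
        for own in ACTIONS:
            profile = [None, None]
            profile[agent] = own
            profile[other] = fixed
            outcomes.add(tau[tuple(profile)])
        if len(outcomes) > 1:
            return True
    return False


for name, tau in (("both", tau_both), ("none", tau_none), ("j only", tau_j_only)):
    print(name, "i:", influences(tau, 0), "j:", influences(tau, 1))
```

Under these assumptions the check agrees with the parenthetical remarks: both agents matter in the first environment, neither in the second, and only j in the third.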

Rational Action

• Suppose we have the case where both agents can influence the outcome, and they have utility functions as follows:

$u_i(\omega_1) = 1 \quad u_i(\omega_2) = 1 \quad u_i(\omega_3) = 4 \quad u_i(\omega_4) = 4$
$u_j(\omega_1) = 1 \quad u_j(\omega_2) = 4 \quad u_j(\omega_3) = 1 \quad u_j(\omega_4) = 4$

• With a bit of abuse of notation:

$u_i(D,D) = 1 \quad u_i(D,C) = 1 \quad u_i(C,D) = 4 \quad u_i(C,C) = 4$
$u_j(D,D) = 1 \quad u_j(D,C) = 4 \quad u_j(C,D) = 1 \quad u_j(C,C) = 4$

• Then agent i's preferences are:

$C,C \succeq_i C,D \succ_i D,C \succeq_i D,D$

• “C” is the rational choice for i.
(Because i prefers all outcomes that arise through C over all outcomes that arise through D.)

Payoff Matrices

• We can characterise the previous scenario in a payoff matrix (each cell lists i's payoff, then j's payoff):

                 i defects    i cooperates
  j defects      1, 1         4, 1
  j cooperates   1, 4         4, 4

• Agent i is the column player.
• Agent j is the row player.

Solution Concepts

• How will a rational agent behave in any given scenario?
• Answered in solution concepts:
  – dominant strategy;
  – Nash equilibrium strategy;
  – Pareto optimal strategies;
  – strategies that maximise social welfare.

Dominant Strategies

• We will say that a strategy $s_i$ is dominant for player i if no matter what strategy $s_j$ agent j chooses, i will do at least as well playing $s_i$ as it would doing anything else.
• Unfortunately, there isn't always a dominant strategy.
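The definition of a dominant strategy translates directly into a check over all opponent strategies. This is a minimal sketch, assuming payoffs are stored as a dictionary keyed by (i's action, j's action) with values (i's utility, j's utility); the numbers are those of the “Rational Action” scenario, and the function names are illustrative only.

```python
ACTIONS = ("C", "D")

# PAYOFFS[(i's action, j's action)] = (u_i, u_j)
PAYOFFS = {
    ("D", "D"): (1, 1),
    ("D", "C"): (1, 4),
    ("C", "D"): (4, 1),
    ("C", "C"): (4, 4),
}


def utility(payoffs, agent, own, other):
    """Utility of `agent` (0 for i, 1 for j) when it plays `own`
    and the other agent plays `other`."""
    profile = (own, other) if agent == 0 else (other, own)
    return payoffs[profile][agent]


def is_dominant(payoffs, agent, strategy):
    """True if `strategy` does at least as well as every alternative,
    whatever the other agent chooses."""
    return all(
        utility(payoffs, agent, strategy, other) >= utility(payoffs, agent, alt, other)
        for other in ACTIONS
        for alt in ACTIONS
    )


def dominant_strategies(payoffs, agent):
    return [s for s in ACTIONS if is_dominant(payoffs, agent, s)]


print(dominant_strategies(PAYOFFS, 0))  # agent i
print(dominant_strategies(PAYOFFS, 1))  # agent j
```

For this scenario the check reports “C” as dominant for both agents, in line with the earlier remark that “C” is the rational choice for i.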

(Pure Strategy) Nash Equilibrium

• In general, we will say that two strategies $s_1$ and $s_2$ are in Nash equilibrium if:
  1. under the assumption that agent i plays $s_1$, agent j can do no better than play $s_2$; and
  2. under the assumption that agent j plays $s_2$, agent i can do no better than play $s_1$.
• Neither agent has any incentive to deviate from a Nash equilibrium.
• Unfortunately:
  1. Not every interaction scenario has a pure strategy Nash equilibrium.
  2. Some interaction scenarios have more than one Nash equilibrium.
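Both conditions of the definition can be tested by brute force over all strategy pairs. The sketch below assumes the same dictionary representation of payoffs, keyed by (i's action, j's action). `SCENARIO` uses the utilities of the “Rational Action” scenario, while `COORDINATION` and `CYCLIC` are made-up illustrative games chosen to show the two “unfortunately” cases.

```python
from itertools import product

ACTIONS = ("C", "D")


def pure_nash_equilibria(payoffs):
    """Return all pairs (s_i, s_j) from which neither agent can gain
    by unilaterally switching to another action."""
    equilibria = []
    for s_i, s_j in product(ACTIONS, repeat=2):
        u_i, u_j = payoffs[(s_i, s_j)]
        i_ok = all(payoffs[(alt, s_j)][0] <= u_i for alt in ACTIONS)
        j_ok = all(payoffs[(s_i, alt)][1] <= u_j for alt in ACTIONS)
        if i_ok and j_ok:
            equilibria.append((s_i, s_j))
    return equilibria


# Utilities from the "Rational Action" scenario.
SCENARIO = {("D", "D"): (1, 1), ("D", "C"): (1, 4), ("C", "D"): (4, 1), ("C", "C"): (4, 4)}
# Hypothetical game: both agents are rewarded only for choosing the same action.
COORDINATION = {("D", "D"): (1, 1), ("D", "C"): (0, 0), ("C", "D"): (0, 0), ("C", "C"): (1, 1)}
# Hypothetical game: i wants to match, j wants to differ.
CYCLIC = {("D", "D"): (1, -1), ("D", "C"): (-1, 1), ("C", "D"): (-1, 1), ("C", "C"): (1, -1)}

print(pure_nash_equilibria(SCENARIO))
print(pure_nash_equilibria(COORDINATION))
print(pure_nash_equilibria(CYCLIC))
```

The first game has a single pure strategy equilibrium, the second has two, and the third has none.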

Matching Pennies

Players i and j simultaneously choose the face of a coin, either “heads” or “tails”. If they show the same face, then i wins, while if they show different faces, then j wins.

Matching Pennies: The Payoff Matrix

(Each cell lists i's payoff, then j's payoff.)

             i heads    i tails
  j heads    1, −1      −1, 1
  j tails    −1, 1      1, −1

Mixed Strategies for Matching Pennies

• NO pair of strategies forms a pure strategy NE: whatever pair of strategies is chosen, somebody will wish they had done something else.
• The solution is to allow mixed strategies:
  – play “heads” with probability 0.5
  – play “tails” with probability 0.5.
• This is a NE strategy.
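One way to see why the 0.5/0.5 mix is an equilibrium is to compute expected utilities. The sketch below assumes a win is worth 1 and a loss −1, with i winning on a match; the function name `expected_utility_i` is hypothetical.

```python
# U_I[(i's face, j's face)] is i's utility; j's utility is its negation.
U_I = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}


def expected_utility_i(p_i_heads, p_j_heads):
    """Expected utility of i when i plays heads with probability p_i_heads
    and j plays heads with probability p_j_heads."""
    total = 0.0
    for a_i, p_a in (("H", p_i_heads), ("T", 1 - p_i_heads)):
        for a_j, p_b in (("H", p_j_heads), ("T", 1 - p_j_heads)):
            total += p_a * p_b * U_I[(a_i, a_j)]
    return total


# Against a 0.5/0.5 opponent, every strategy of i earns the same expected utility.
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, expected_utility_i(p, 0.5))

# If j leans towards heads instead, i can exploit it by always playing heads.
print(expected_utility_i(1.0, 0.6))
```

Because i's expected utility is the same for every choice when j mixes evenly, i has no incentive to deviate; since j's utility is the negation of i's, the same argument applies to j.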

Mixed Strategies

• A mixed strategy has the form
  – play $\alpha_1$ with probability $p_1$
  – play $\alpha_2$ with probability $p_2$
  – ...
  – play $\alpha_k$ with probability $p_k$,
  such that $p_1 + p_2 + \cdots + p_k = 1$.
• Nash proved that every finite game has a Nash equilibrium in mixed strategies.

Nash's Theorem

• Nash proved that every finite game has a Nash equilibrium in mixed strategies. (Unlike the case for pure strategies.)
• So this result overcomes the lack of solutions; but there still may be more than one Nash equilibrium ...

Pareto Optimality

• An outcome is said to be Pareto optimal (or Pareto efficient) if there is no other outcome that makes one agent better off without making another agent worse off.
• If an outcome is Pareto optimal, then at least one agent will be reluctant to move away from it (because this agent will be worse off).
• If an outcome $\omega$ is not Pareto optimal, then there is another outcome $\omega'$ that makes everyone as happy, if not happier, than $\omega$. “Reasonable” agents would agree to move to $\omega'$ in this case. (Even if I don't directly benefit from $\omega'$, you can benefit without me suffering.)
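The definition can be sketched as a filter that keeps every outcome not improved upon by some other outcome. The code assumes outcomes are identified by action pairs and carry a tuple of utilities, one per agent; the example numbers are those of the earlier “Rational Action” scenario, and the names are illustrative.

```python
def dominates(a, b):
    """True if utility vector `a` makes nobody worse off than `b`
    and at least one agent strictly better off."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_optimal(payoffs):
    """Return the outcomes for which no other outcome is a Pareto improvement."""
    return [
        outcome
        for outcome, utilities in payoffs.items()
        if not any(dominates(other, utilities) for other in payoffs.values())
    ]


# SCENARIO[(i's action, j's action)] = (u_i, u_j)
SCENARIO = {("D", "D"): (1, 1), ("D", "C"): (1, 4), ("C", "D"): (4, 1), ("C", "C"): (4, 4)}

print(pareto_optimal(SCENARIO))
```

In this scenario only mutual cooperation survives, because it gives each agent at least as much as any other outcome does.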

Social Welfare

• The social welfare of an outcome $\omega$ is the sum of the utilities that each agent gets from $\omega$:

$\sum_{i \in Ag} u_i(\omega)$

• Think of it as the “total amount of money in the system”.
• As a solution concept, may be appropriate when the whole system (all agents) has a single owner (then overall benefit of the system is important, not individuals).

Competitive and Zero-Sum Interactions

• Where preferences of agents are diametrically opposed we have strictly competitive scenarios.
• Zero-sum encounters are those where utilities sum to zero:

$u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$.

• Zero-sum encounters are bad news: for me to get positive utility you have to get negative utility! The best outcome for me is the worst for you!
• Zero-sum encounters in real life are very rare ... but people frequently act as if they were in a zero-sum game.

4 The Prisoner's Dilemma

Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
• if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
• if both confess, then each will be jailed for two years.
Both prisoners know that if neither confesses, then they will each be jailed for one year.

• Payoff matrix for prisoner's dilemma (each cell lists i's payoff, then j's payoff):

                 i defects    i cooperates
  j defects      2, 2         1, 4
  j cooperates   4, 1         3, 3

• Top left: If both defect, then both get punishment for mutual defection.
• Top right: If i cooperates and j defects, i gets sucker's payoff of 1, while j gets 4.
• Bottom left: If j cooperates and i defects, j gets sucker's payoff of 1, while i gets 4.
• Bottom right: Reward for mutual cooperation.

What Should You Do?

• The individually rational action is defect. This guarantees a payoff of no worse than 2, whereas cooperating guarantees only a payoff of no worse than 1.
• So defection is the best response to all possible strategies: both agents defect, and get payoff = 2.
• But intuition says this is not the best outcome: surely they should both cooperate and each get payoff of 3!

Solution Concepts

• D is a dominant strategy.
• (D,D) is the only Nash equilibrium.
• All outcomes except (D,D) are Pareto optimal.
• (C,C) maximises social welfare.
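The four claims above can be checked mechanically against the prisoner's dilemma payoffs. This is a compact sketch, assuming the payoffs are keyed by (i's action, j's action) with values (i's payoff, j's payoff); the helper names are illustrative only.

```python
from itertools import product

ACTIONS = ("C", "D")

# PD[(i's action, j's action)] = (i's payoff, j's payoff)
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}


def u(agent, own, other):
    """Payoff to `agent` (0 for i, 1 for j) playing `own` against `other`."""
    return PD[(own, other) if agent == 0 else (other, own)][agent]


# Dominant strategies: at least as good as any alternative against any opponent action.
dominant = {
    agent: [s for s in ACTIONS
            if all(u(agent, s, o) >= u(agent, a, o) for o in ACTIONS for a in ACTIONS)]
    for agent in (0, 1)
}

# Pure strategy Nash equilibria: no agent gains by deviating unilaterally.
nash = [
    (s_i, s_j) for s_i, s_j in product(ACTIONS, repeat=2)
    if all(u(0, s_i, s_j) >= u(0, a, s_j) for a in ACTIONS)
    and all(u(1, s_j, s_i) >= u(1, a, s_i) for a in ACTIONS)
]


def dominates(a, b):
    """Pareto improvement: nobody worse off, somebody strictly better off."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


pareto = [p for p, v in PD.items() if not any(dominates(w, v) for w in PD.values())]

# Social welfare: sum of the agents' payoffs for each outcome.
welfare = {p: sum(v) for p, v in PD.items()}
best_welfare = max(welfare, key=welfare.get)

print(dominant, nash, pareto, best_welfare)
```

With these payoffs the only equilibrium, (D,D), is also the only outcome that is not Pareto optimal, which is the tension the following slides discuss.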

• This apparent paradox is the fundamental problem of multi-agent interactions. It appears to imply that cooperation will not occur in societies of self-interested agents.
• Real world examples:
  – nuclear arms reduction (“why don't I keep mine ...”)
  – free rider systems: public transport;
  – in the UK: television licenses.
• The prisoner's dilemma is ubiquitous.
• Can we recover cooperation?

Arguments for Recovering Cooperation

• Conclusions that some have drawn from this analysis:
  – the game theory notion of rational action is wrong!
  – somehow the dilemma is being formulated wrongly
• Arguments to recover cooperation:
  – We are not all Machiavelli!
  – The other prisoner is my twin!
  – Program equilibria and mediators
  – The shadow of the future ...

4.1 Program Equilibria

• The strategy you really want to play in the prisoner's dilemma is: I'll cooperate if he will.
• Program equilibria provide one way of enabling this.
• Each agent submits a program strategy to a mediator which jointly executes the strategies. Crucially, strategies can be conditioned on the strategies of the others.

4.2 Program Equilibria

• Consider the following program:

  IF HisProgram == ThisProgram THEN
    DO(C);
  ELSE
    DO(D);
  END-IF.

  Here == is textual comparison.
• The best response to this program is to submit the same program, giving an outcome of (C,C)!
• You can't get the sucker's payoff by submitting this program.
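The mediator idea can be sketched in a few lines. The code below assumes that a submitted program is a string of Python source defining a function `act(his_program, this_program)`, and that the mediator runs it with `exec`; these conventions, and the names `MIRROR`, `ALWAYS_DEFECT`, `ALWAYS_COOPERATE`, `run` and `mediate`, are invented for illustration and are not taken from the slides.

```python
# The program from the slide: cooperate only with a textually identical program.
MIRROR = '''
def act(his_program, this_program):
    return "C" if his_program == this_program else "D"
'''

ALWAYS_DEFECT = '''
def act(his_program, this_program):
    return "D"
'''

ALWAYS_COOPERATE = '''
def act(his_program, this_program):
    return "C"
'''


def run(program, opponent):
    """Execute one submitted program, giving it the opponent's source text.
    exec is used for brevity; only run trusted source this way."""
    namespace = {}
    exec(program, namespace)
    return namespace["act"](opponent, program)


def mediate(program_i, program_j):
    """Jointly execute both programs and return (i's action, j's action)."""
    return run(program_i, program_j), run(program_j, program_i)


print(mediate(MIRROR, MIRROR))
print(mediate(MIRROR, ALWAYS_DEFECT))
print(mediate(MIRROR, ALWAYS_COOPERATE))
```

In this sketch the mirror program cooperates only when it meets an exact textual copy of itself, so whoever submits it never ends up cooperating against a defector.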

4.3 The Iterated Prisoner's Dilemma

• One answer: play the game more than once. If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate.
• Cooperation is the rational choice in the infinitely repeated prisoner's dilemma. (Hurrah!)

4.4 Backwards Induction

• But ... suppose you both know that you will play the game exactly n times. On round n − 1, you have an incentive to defect, to gain that extra bit of payoff ... But this makes round n − 2 the last “real” round, and so you have an incentive to defect there, too. This is the backwards induction problem.
• When playing the prisoner's dilemma with a fixed, finite, pre-determined, commonly known number of rounds, defection is the best strategy.

4.5 Axelrod's Tournament

• Suppose you play iterated prisoner's dilemma against a range of opponents ... What strategy should you choose, so as to maximise your overall payoff?
• Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner's dilemma.

Strategies in Axelrod's Tournament

• ALLD: “Always defect”, the hawk strategy;
• TIT-FOR-TAT:
  1. On round u = 0, cooperate.
  2. On round u > 0, do what your opponent did on round u − 1.
• TESTER: On 1st round, defect. If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation & defection.
• JOSS: As TIT-FOR-TAT, except periodically defect.
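The two fully specified strategies, ALLD and TIT-FOR-TAT, are easy to simulate. This is a minimal sketch, assuming the prisoner's dilemma payoffs given earlier (2 for mutual defection, 3 for mutual cooperation, 4 and 1 when one agent defects on a cooperator) and an arbitrary 10 rounds; it is not Axelrod's tournament code, and TESTER and JOSS are left out because the slide does not pin down how often they defect.

```python
# PAYOFF[(i's action, j's action)] = (i's payoff, j's payoff)
PAYOFF = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}


def all_d(my_history, opp_history):
    """ALLD: always defect."""
    return "D"


def tit_for_tat(my_history, opp_history):
    """TIT-FOR-TAT: cooperate on round 0, then copy the opponent's last move."""
    return "C" if not opp_history else opp_history[-1]


def play(strategy_i, strategy_j, rounds):
    """Play the iterated game and return the total payoffs (i, j)."""
    hist_i, hist_j = [], []
    score_i = score_j = 0
    for _ in range(rounds):
        a_i = strategy_i(hist_i, hist_j)
        a_j = strategy_j(hist_j, hist_i)
        p_i, p_j = PAYOFF[(a_i, a_j)]
        score_i += p_i
        score_j += p_j
        hist_i.append(a_i)
        hist_j.append(a_j)
    return score_i, score_j


print(play(tit_for_tat, tit_for_tat, 10))
print(play(all_d, tit_for_tat, 10))
print(play(all_d, all_d, 10))
```

In a single pairing ALLD edges out TIT-FOR-TAT, but two TIT-FOR-TAT players earn more from each other than ALLD earns from anyone in this sketch, which hints at why total payoff over a range of opponents is the measure that matters.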

Recipes for Success in Axelrod's Tournament

Axelrod suggests the following rules for succeeding in his tournament:
• Don't be envious: Don't play as if it were zero sum!
• Be nice: Start by cooperating, and reciprocate cooperation.
• Retaliate appropriately: Always punish defection immediately, but use “measured” force; don't overdo it.
• Don't hold grudges: Always reciprocate cooperation immediately.

5 Game of Chicken

• Consider another type of encounter, the game of chicken (each cell lists i's payoff, then j's payoff):

                 i defects    i cooperates
  j defects      1, 1         2, 4
  j cooperates   4, 2         3, 3

(Think of James Dean in Rebel without a Cause: swerving = coop, driving straight = defect.)

• Difference to prisoner's dilemma: Mutual defection is most feared outcome. (Whereas sucker's payoff is most feared in prisoner's dilemma.)

Solution Concepts

• There is no dominant strategy (in our sense).
• Strategy pairs (C,D) and (D,C) are Nash equilibria.
• All outcomes except (D,D) are Pareto optimal.
• All outcomes except (D,D) maximise social welfare.

6 Other Symmetric 2 x 2 Games

• Given the 4 possible outcomes of (symmetric) cooperate/defect games, there are 24 possible orderings on outcomes.
  – $CC \succ_i CD \succ_i DC \succ_i DD$: Cooperation dominates.
  – $DC \succ_i DD \succ_i CC \succ_i CD$: Deadlock. You will always do best by defecting.
  – $DC \succ_i CC \succ_i DD \succ_i CD$: Prisoner's dilemma.
  – $DC \succ_i CC \succ_i CD \succ_i DD$: Chicken.
  – $CC \succ_i DC \succ_i DD \succ_i CD$: Stag hunt.
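The count of 24 is simply 4! strict orderings of the four outcomes, which can be enumerated directly. The sketch below assumes each outcome is written with i's action first (so “DC” means i defects and j cooperates) and that utilities are strict, with no ties; the `classify` helper and the `NAMED` table are illustrative.

```python
from itertools import permutations

OUTCOMES = ("CC", "CD", "DC", "DD")  # i's action first, then j's

# The five named orderings from the list above, from most to least preferred by i.
NAMED = {
    ("CC", "CD", "DC", "DD"): "cooperation dominates",
    ("DC", "DD", "CC", "CD"): "deadlock",
    ("DC", "CC", "DD", "CD"): "prisoner's dilemma",
    ("DC", "CC", "CD", "DD"): "chicken",
    ("CC", "DC", "DD", "CD"): "stag hunt",
}

orderings = list(permutations(OUTCOMES))
print(len(orderings))  # number of strict orderings of four outcomes


def classify(utilities_i):
    """Map i's utilities (a dict outcome -> number, no ties) to a named game,
    or to "unnamed" if the ordering is not one of the five listed."""
    ordering = tuple(sorted(OUTCOMES, key=lambda o: utilities_i[o], reverse=True))
    return NAMED.get(ordering, "unnamed")


# i's payoffs in the prisoner's dilemma and in chicken, as given earlier.
print(classify({"DD": 2, "DC": 4, "CD": 1, "CC": 3}))
print(classify({"DD": 1, "DC": 4, "CD": 2, "CC": 3}))
```

Feeding in agent i's payoffs from the two payoff matrices given earlier recovers the names “prisoner's dilemma” and “chicken”.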
