CHAPTER 11: MULTIAGENT INTERACTIONS
An Introduction to Multiagent Systems
http://www.csc.liv.ac.uk/˜mjw/pubs/imas/
Chapter 11    An Introduction to Multiagent Systems 2e

1 What are Multiagent Systems?
Thus a multiagent system contains a number of agents...
• ...which interact through communication...
• ...are able to act in an environment...
• ...have different “spheres of influence” (which may coincide)...
• ...will be linked by other (organisational) relationships.

2 Utilities and Preferences
• Assume we have just two agents: $Ag = \{i, j\}$.
• Agents are assumed to be self-interested: they have preferences over how the environment is.
• Assume $\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of “outcomes” that agents have preferences over.
• We capture preferences by utility functions:
    $u_i : \Omega \to \mathbb{R}$
    $u_j : \Omega \to \mathbb{R}$
• Utility functions lead to preference orderings over outcomes:
    $\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
    $\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$
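
To make the two orderings concrete, here is a minimal Python sketch. The outcome names and utility values are illustrative assumptions, and `weakly_prefers` and `strictly_prefers` are hypothetical helper names, not anything defined in the slides.

```python
# Hypothetical outcomes and utilities for agent i (illustrative values only).
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}

def weakly_prefers(u, a, b):
    """True when outcome a is at least as good as outcome b under u."""
    return u[a] >= u[b]

def strictly_prefers(u, a, b):
    """True when outcome a is strictly better than outcome b under u."""
    return u[a] > u[b]

# Sort outcomes from most to least preferred; tied outcomes keep their input order.
ranking = sorted(u_i, key=lambda w: u_i[w], reverse=True)
print(ranking)
```

Because utilities are real numbers, any two outcomes can be compared, which is what makes the induced ordering total.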

What is Utility?
• Utility is not money (but it is a useful analogy).
• Typical relationship between utility & money:
[Figure: plot relating utility and money; axis labels: utility, money]

3 Multiagent Encounters
• We need a model of the environment in which these agents will act...
  – agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in $\Omega$ will result;
  – the actual outcome depends on the combination of actions;
  – assume each agent has just two possible actions that it can perform: C (“cooperate”) and D (“defect”).
• Environment behaviour given by state transformer function:
    $\tau : \underbrace{Ac}_{\text{agent } i\text{'s action}} \times \underbrace{Ac}_{\text{agent } j\text{'s action}} \to \Omega$
• Here is a state transformer function:
    $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_3 \quad \tau(C,C) = \omega_4$
  (This environment is sensitive to actions of both agents.)
• Here is another:
    $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_1 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_1$
  (Neither agent has any influence in this environment.)
• And here is another:
    $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_2$
  (This environment is controlled by j.)
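
The three environments above can be checked mechanically. The following Python sketch encodes each state transformer as a dictionary keyed by (i's action, j's action) and tests which agent can change the outcome. The names `tau_both`, `tau_none`, `tau_j`, `i_has_influence` and `j_has_influence` are assumptions made for this illustration.

```python
from itertools import product

ACTIONS = ("C", "D")

# The three environments from the slides; keys are (i's action, j's action).
tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {pair: "w1" for pair in product(ACTIONS, repeat=2)}
tau_j = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}

def i_has_influence(tau):
    """i matters if, for some fixed action of j, switching i's action changes the outcome."""
    return any(tau[("C", aj)] != tau[("D", aj)] for aj in ACTIONS)

def j_has_influence(tau):
    """j matters if, for some fixed action of i, switching j's action changes the outcome."""
    return any(tau[(ai, "C")] != tau[(ai, "D")] for ai in ACTIONS)

for name, tau in [("both", tau_both), ("none", tau_none), ("j only", tau_j)]:
    print(name, i_has_influence(tau), j_has_influence(tau))
```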

Rational Action
• Suppose we have the case where both agents can influence the outcome, and they have utility functions as follows:
    $u_i(\omega_1) = 1 \quad u_i(\omega_2) = 1 \quad u_i(\omega_3) = 4 \quad u_i(\omega_4) = 4$
    $u_j(\omega_1) = 1 \quad u_j(\omega_2) = 4 \quad u_j(\omega_3) = 1 \quad u_j(\omega_4) = 4$
• With a bit of abuse of notation:
    $u_i(D,D) = 1 \quad u_i(D,C) = 1 \quad u_i(C,D) = 4 \quad u_i(C,C) = 4$
    $u_j(D,D) = 1 \quad u_j(D,C) = 4 \quad u_j(C,D) = 1 \quad u_j(C,C) = 4$
• Then agent i’s preferences are:
    $C,C \succeq_i C,D \succ_i D,C \succeq_i D,D$
• “C” is the rational choice for i.
  (Because i prefers all outcomes that arise through C over all outcomes that arise through D.)
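
The argument for C can be verified directly from the utilities. This is a small Python sketch under the assumption that "rational" here means the worst outcome reachable through C still beats the best outcome reachable through D; `outcomes_via` is a hypothetical helper name.

```python
# Agent i's utilities, indexed by (i's action, j's action), as given on the slides.
u_i = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}

def outcomes_via(u, action):
    """Utilities i can end up with when it plays `action`, over all of j's replies."""
    return [value for (ai, _aj), value in u.items() if ai == action]

# C is the rational choice if its worst case beats D's best case.
c_is_rational = min(outcomes_via(u_i, "C")) > max(outcomes_via(u_i, "D"))
print(c_is_rational)
```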

Payoff Matrices
• We can characterise the previous scenario in a payoff matrix (each cell lists payoff to i, payoff to j):

                    i defects    i cooperates
    j defects        (1, 1)        (4, 1)
    j cooperates     (1, 4)        (4, 4)

• Agent i is the column player.
• Agent j is the row player.

Solution Concepts
• How will a rational agent behave in any given scenario?
• Answered in solution concepts:
  – dominant strategy;
  – Nash equilibrium strategy;
  – Pareto optimal strategies;
  – strategies that maximise social welfare.

Dominant Strategies
• We will say that a strategy $s_i$ is dominant for player i if no matter what strategy $s_j$ agent j chooses, i will do at least as well playing $s_i$ as it would doing anything else.
• Unfortunately, there isn’t always a dominant strategy.
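
The definition translates into a short check. The Python sketch below tests dominance for agent i on the payoffs of the earlier example scenario; the dictionary layout and the name `is_dominant_for_i` are assumptions made for illustration.

```python
ACTIONS = ("C", "D")

# Payoffs from the earlier example, keyed by (i's action, j's action) -> (u_i, u_j).
PAYOFF = {("D", "D"): (1, 1), ("D", "C"): (1, 4), ("C", "D"): (4, 1), ("C", "C"): (4, 4)}

def is_dominant_for_i(payoff, s_i):
    """s_i is dominant if it does at least as well as every alternative against every s_j."""
    return all(
        payoff[(s_i, s_j)][0] >= payoff[(alt, s_j)][0]
        for s_j in ACTIONS
        for alt in ACTIONS
    )

print([s for s in ACTIONS if is_dominant_for_i(PAYOFF, s)])
```

An empty list from the final line would correspond to the case where no dominant strategy exists.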

(Pure Strategy) Nash Equilibrium
• In general, we will say that two strategies $s_1$ and $s_2$ are in Nash equilibrium if:
  1. under the assumption that agent i plays $s_1$, agent j can do no better than play $s_2$; and
  2. under the assumption that agent j plays $s_2$, agent i can do no better than play $s_1$.
• Neither agent has any incentive to deviate from a Nash equilibrium.
• Unfortunately:
  1. Not every interaction scenario has a pure strategy Nash equilibrium.
  2. Some interaction scenarios have more than one Nash equilibrium.
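
For two-action games the equilibrium conditions can be checked by brute force. The Python sketch below enumerates all strategy pairs; `pure_nash_equilibria` is a hypothetical helper, `EXAMPLE` holds the payoffs of the earlier scenario, and `COORDINATION` is a made-up game included only to show that more than one equilibrium can exist.

```python
from itertools import product

ACTIONS = ("C", "D")

def pure_nash_equilibria(payoff):
    """Return all (s1, s2) where neither agent gains by deviating unilaterally."""
    equilibria = []
    for s1, s2 in product(ACTIONS, repeat=2):
        i_ok = all(payoff[(s1, s2)][0] >= payoff[(alt, s2)][0] for alt in ACTIONS)
        j_ok = all(payoff[(s1, s2)][1] >= payoff[(s1, alt)][1] for alt in ACTIONS)
        if i_ok and j_ok:
            equilibria.append((s1, s2))
    return equilibria

# Earlier example scenario, keyed by (i's action, j's action) -> (u_i, u_j).
EXAMPLE = {("D", "D"): (1, 1), ("D", "C"): (1, 4), ("C", "D"): (4, 1), ("C", "C"): (4, 4)}
# Hypothetical coordination game: both agents want to match actions.
COORDINATION = {("C", "C"): (1, 1), ("C", "D"): (0, 0), ("D", "C"): (0, 0), ("D", "D"): (1, 1)}

print(pure_nash_equilibria(EXAMPLE))
print(pure_nash_equilibria(COORDINATION))
```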

Matching Pennies
Players i and j simultaneously choose the face of a coin, either “heads” or “tails”.
If they show the same face, then i wins, while if they show different faces, then j wins.

Matching Pennies: The Payoff Matrix
(each cell lists payoff to i, payoff to j)

                 i heads      i tails
    j heads      (1, −1)      (−1, 1)
    j tails      (−1, 1)      (1, −1)

Mixed Strategies for Matching Pennies
• NO pair of strategies forms a pure strategy NE: whatever pair of strategies is chosen, somebody will wish they had done something else.
• The solution is to allow mixed strategies:
  – play “heads” with probability 0.5
  – play “tails” with probability 0.5.
• This is a NE strategy.
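
Both claims can be checked numerically. The Python sketch below first confirms that no pure strategy pair is stable, then shows that against a 50/50 opponent agent i earns the same expected utility whatever it does, so it has no incentive to deviate. The function names are assumptions made for this illustration.

```python
from itertools import product

FACES = ("heads", "tails")

def payoff(face_i, face_j):
    """(u_i, u_j): i wins on a match, j wins on a mismatch."""
    return (1, -1) if face_i == face_j else (-1, 1)

def has_pure_nash():
    """Check every pure strategy pair for a profitable unilateral deviation."""
    for fi, fj in product(FACES, repeat=2):
        i_ok = all(payoff(fi, fj)[0] >= payoff(alt, fj)[0] for alt in FACES)
        j_ok = all(payoff(fi, fj)[1] >= payoff(fi, alt)[1] for alt in FACES)
        if i_ok and j_ok:
            return True
    return False

def expected_u_i(p_i_heads, p_j_heads):
    """Expected utility of i when each agent plays heads with the given probability."""
    total = 0.0
    for fi, fj in product(FACES, repeat=2):
        prob_i = p_i_heads if fi == "heads" else 1 - p_i_heads
        prob_j = p_j_heads if fj == "heads" else 1 - p_j_heads
        total += prob_i * prob_j * payoff(fi, fj)[0]
    return total

print(has_pure_nash())
# Against a 50/50 opponent, every strategy of i earns the same expected utility.
print([expected_u_i(p, 0.5) for p in (0.0, 0.5, 1.0)])
```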

Mixed Strategies
• A mixed strategy has the form
  – play $\alpha_1$ with probability $p_1$
  – play $\alpha_2$ with probability $p_2$
  – ...
  – play $\alpha_k$ with probability $p_k$
  such that $p_1 + p_2 + \cdots + p_k = 1$.
• Nash proved that every finite game has a Nash equilibrium in mixed strategies.

Nash’s Theorem
• Nash proved that every finite game has a Nash equilibrium in mixed strategies. (Unlike the case for pure strategies.)
• So this result overcomes the lack of solutions; but there still may be more than one Nash equilibrium...

Pareto Optimality
• An outcome is said to be Pareto optimal (or Pareto efficient) if there is no other outcome that makes one agent better off without making another agent worse off.
• If an outcome is Pareto optimal, then at least one agent will be reluctant to move away from it (because this agent will be worse off).
• If an outcome $\omega$ is not Pareto optimal, then there is another outcome $\omega'$ that makes everyone as happy, if not happier, than $\omega$.
  “Reasonable” agents would agree to move to $\omega'$ in this case. (Even if I don’t directly benefit from $\omega'$, you can benefit without me suffering.)
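
A Pareto check is a pairwise comparison of payoff vectors. The Python sketch below is one way to write it; the outcome names and numbers are illustrative assumptions, and `pareto_dominates` and `pareto_optimal` are hypothetical helper names.

```python
def pareto_dominates(a, b):
    """Vector a dominates b: nobody is worse off and somebody is strictly better off."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_optimal(outcomes):
    """Names of outcomes that no other outcome Pareto-dominates."""
    return [
        name for name, vec in outcomes.items()
        if not any(pareto_dominates(other, vec) for other in outcomes.values())
    ]

# Hypothetical outcomes mapped to (u_i, u_j); the numbers are illustrative only.
OUTCOMES = {"w1": (2, 2), "w2": (1, 4), "w3": (4, 1), "w4": (3, 3)}
print(pareto_optimal(OUTCOMES))
```

In this example w1 is the only outcome that drops out, because moving to w4 makes both agents better off.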

Social Welfare
• The social welfare of an outcome $\omega$ is the sum of the utilities that each agent gets from $\omega$:
    $\sum_{i \in Ag} u_i(\omega)$
• Think of it as the “total amount of money in the system”.
• As a solution concept, may be appropriate when the whole system (all agents) has a single owner (then overall benefit of the system is important, not individuals).
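
As a brief illustration of the sum above, this Python sketch picks the welfare-maximising outcome from a small table. The outcomes and utility values are invented for the example.

```python
def social_welfare(utilities):
    """Sum of the utilities every agent receives from one outcome."""
    return sum(utilities.values())

# Hypothetical outcomes; each maps agent name -> utility (illustrative values only).
OUTCOMES = {
    "w1": {"i": 2, "j": 2},
    "w2": {"i": 1, "j": 4},
    "w3": {"i": 3, "j": 3},
}

best = max(OUTCOMES, key=lambda w: social_welfare(OUTCOMES[w]))
print(best, social_welfare(OUTCOMES[best]))
```

Note that the measure ignores how the total is split: w2 scores 5 even though agent i receives only 1.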

Competitive and Zero-Sum Interactions
• Where preferences of agents are diametrically opposed we have strictly competitive scenarios.
• Zero-sum encounters are those where utilities sum to zero:
    $u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$.
• Zero-sum encounters are bad news: for me to get positive utility you have to get negative utility! The best outcome for me is the worst for you!
• Zero-sum encounters in real life are very rare... but people frequently act as if they were in a zero-sum game.

4 The Prisoner’s Dilemma
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
• if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
• if both confess, then each will be jailed for two years.
Both prisoners know that if neither confesses, then they will each be jailed for one year.
• Payoff matrix for prisoner’s dilemma (each cell lists payoff to i, payoff to j):

                    i defects    i cooperates
    j defects        (2, 2)        (1, 4)
    j cooperates     (4, 1)        (3, 3)

• Top left: If both defect, then both get punishment for mutual defection.
• Top right: If i cooperates and j defects, i gets sucker’s payoff of 1, while j gets 4.
• Bottom left: If j cooperates and i defects, j gets sucker’s payoff of 1, while i gets 4.
• Bottom right: Reward for mutual cooperation.

What Should You Do?
• The individual rational action is defect.
  This guarantees a payoff of no worse than 2, whereas cooperating guarantees a payoff of only 1 (the sucker’s payoff).
• So defection is the best response to all possible strategies: both agents defect, and get payoff = 2.
• But intuition says this is not the best outcome:
  Surely they should both cooperate and each get payoff of 3!

Solution Concepts
• D is a dominant strategy.
• (D,D) is the only Nash equilibrium.
• All outcomes except (D,D) are Pareto optimal.
• (C,C) maximises social welfare.
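
All four statements can be confirmed by brute force on the prisoner's dilemma payoffs. The Python sketch below is one possible check; the dictionary layout and helper names are assumptions made for this illustration, with profiles written as (i's action, j's action).

```python
from itertools import product

ACTIONS = ("C", "D")
# Prisoner's dilemma payoffs from the slides: (i's action, j's action) -> (u_i, u_j).
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def dominant_for_i(s):
    """s is at least as good for i as any alternative, whatever j plays."""
    return all(PD[(s, sj)][0] >= PD[(alt, sj)][0] for sj in ACTIONS for alt in ACTIONS)

def is_nash(si, sj):
    """Neither agent can gain by changing only its own action."""
    return (all(PD[(si, sj)][0] >= PD[(alt, sj)][0] for alt in ACTIONS)
            and all(PD[(si, sj)][1] >= PD[(si, alt)][1] for alt in ACTIONS))

def is_pareto_optimal(profile):
    """No other payoff vector is at least as good for both agents and different."""
    mine = PD[profile]
    return not any(
        all(o >= m for o, m in zip(other, mine)) and other != mine
        for other in PD.values()
    )

profiles = list(product(ACTIONS, repeat=2))
print("dominant for i:", [s for s in ACTIONS if dominant_for_i(s)])
print("Nash equilibria:", [p for p in profiles if is_nash(*p)])
print("Pareto optimal:", [p for p in profiles if is_pareto_optimal(p)])
print("max social welfare:", max(profiles, key=lambda p: sum(PD[p])))
```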
• This apparent paradox is the fundamental problem of multi-agent interactions.
  It appears to imply that cooperation will not occur in societies of self-interested agents.
• Real world examples:
  – nuclear arms reduction (“why don’t I keep mine...”)
  – free rider systems: public transport;
  – in the UK: television licenses.
• The prisoner’s dilemma is ubiquitous.
• Can we recover cooperation?

Arguments for Recovering Cooperation
• Conclusions that some have drawn from this analysis:
  – the game theory notion of rational action is wrong!
  – somehow the dilemma is being formulated wrongly
• Arguments to recover cooperation:
  – We are not all Machiavelli!
  – The other prisoner is my twin!
  – Program equilibria and mediators
  – The shadow of the future...

4.1 Program Equilibria
• The strategy you really want to play in the prisoner’s dilemma is:
    I’ll cooperate if he will.
• Program equilibria provide one way of enabling this.
• Each agent submits a program strategy to a mediator which jointly executes the strategies.
  Crucially, strategies can be conditioned on the strategies of the others.

4.2 Program Equilibria
• Consider the following program:

    IF HisProgram == ThisProgram THEN
        DO(C);
    ELSE
        DO(D);
    END-IF.

  Here == is textual comparison.
• The best response to this program is to submit the same program, giving an outcome of (C,C)!
• You can’t get the sucker’s payoff by submitting this program.
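
The mediator idea can be mimicked in a few lines of Python. This is only a sketch under strong assumptions: programs are represented as Python expression strings, the hypothetical `mediator` evaluates each one with the names `his_program` and `this_program` bound to the two source texts, and `eval` is used purely for illustration on trusted input.

```python
# Each "program" is source text; == below is plain textual comparison.
MIRROR = '"C" if his_program == this_program else "D"'
ALWAYS_DEFECT = '"D"'

def run(program, other):
    """Evaluate one program, giving it its own text and the opponent's text."""
    return eval(program, {}, {"his_program": other, "this_program": program})

def mediator(prog_i, prog_j):
    """Hypothetical mediator: jointly executes both submitted programs."""
    return run(prog_i, prog_j), run(prog_j, prog_i)

print(mediator(MIRROR, MIRROR))          # both submit the conditional program
print(mediator(MIRROR, ALWAYS_DEFECT))   # the opponent submits something else
```

Whatever the opponent submits, the conditional program either meets its textual twin and cooperates, or defects, so it never cooperates against a defector.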

4.3 The Iterated Prisoner’s Dilemma
• One answer: play the game more than once.
  If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate.
• Cooperation is the rational choice in the infinitely repeated prisoner’s dilemma.
  (Hurrah!)

4.4 Backwards Induction
• But... suppose you both know that you will play the game exactly n times.
  On round n − 1, you have an incentive to defect, to gain that extra bit of payoff...
  But this makes round n − 2 the last “real” one, and so you have an incentive to defect there, too.
  This is the backwards induction problem.
• Playing the prisoner’s dilemma with a fixed, finite, pre-determined, commonly known number of rounds, defection is the best strategy.
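
The unravelling argument can be traced in code. The Python sketch below works from round n − 1 back to round 0, using the prisoner's dilemma payoffs given earlier. It relies on a simplifying assumption: play in later rounds is already fixed, so it adds the same constant to every cell of the current round and cannot change which action is dominant. All function names are hypothetical.

```python
ACTIONS = ("C", "D")
# Stage-game payoffs from the slides: (i's action, j's action) -> (u_i, u_j).
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def stage_value(game, player, own, other):
    """Payoff to `player` (0 = i, 1 = j) when it plays `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return game[profile][player]

def dominant_action(game, player):
    """An action that is a best reply to every opponent action, or None."""
    for mine in ACTIONS:
        if all(stage_value(game, player, mine, other) >= stage_value(game, player, alt, other)
               for other in ACTIONS for alt in ACTIONS):
            return mine
    return None

def backward_induction(n):
    """Solve an n-round game from the last round (n - 1) back to round 0."""
    continuation = (0, 0)  # payoff still to come after the current round
    plan = []
    for rnd in reversed(range(n)):
        # Later rounds are already decided, so they shift every cell by a constant.
        game = {p: (u[0] + continuation[0], u[1] + continuation[1]) for p, u in PD.items()}
        profile = (dominant_action(game, 0), dominant_action(game, 1))
        plan.append((rnd, profile))
        continuation = game[profile]
    return plan[::-1], continuation

plan, totals = backward_induction(3)
print(plan, totals)
```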

4.5 Axelrod’s Tournament
• Suppose you play iterated prisoner’s dilemma against a range of opponents...
  What strategy should you choose, so as to maximise your overall payoff?
• Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner’s dilemma.

Strategies in Axelrod’s Tournament
• ALLD:
  “Always defect”, the hawk strategy;
• TIT-FOR-TAT:
  1. On round u = 0, cooperate.
  2. On round u > 0, do what your opponent did on round u − 1.
• TESTER:
  On 1st round, defect. If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation & defection.
• JOSS:
  As TIT-FOR-TAT, except periodically defect.
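
A few of these strategies are easy to sketch in Python and play against each other. The code below is a minimal illustration, not Axelrod's tournament code: it uses the prisoner's dilemma payoffs from these slides, and the fixed defection period in `joss` is an assumption, since the slide only says "periodically". TESTER is left out because the slide does not specify how it intersperses cooperation and defection.

```python
# Payoffs keyed by (first player's action, second player's action) -> (first, second).
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def all_d(mine, theirs):
    """ALLD: always defect."""
    return "D"

def tit_for_tat(mine, theirs):
    """Cooperate on round 0, then copy the opponent's previous move."""
    return "C" if not theirs else theirs[-1]

def joss(mine, theirs, period=5):
    """TIT-FOR-TAT, but defect on every `period`-th round (the period is an assumption)."""
    if (len(mine) + 1) % period == 0:
        return "D"
    return tit_for_tat(mine, theirs)

def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the two total payoffs."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_a, hist_b), strategy_b(hist_b, hist_a)
        pay_a, pay_b = PD[(a, b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

for pair in [(tit_for_tat, tit_for_tat), (tit_for_tat, all_d), (joss, tit_for_tat)]:
    print(pair[0].__name__, pair[1].__name__, play(*pair, rounds=10))
```

Each strategy receives its own history and the opponent's history, so new strategies can be added as plain functions with the same two parameters.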

Recipes for Success in Axelrod’s Tournament
Axelrod suggests the following rules for succeeding in his tournament:
• Don’t be envious:
  Don’t play as if it were zero sum!
• Be nice:
  Start by cooperating, and reciprocate cooperation.
• Retaliate appropriately:
  Always punish defection immediately, but use “measured” force; don’t overdo it.
• Don’t hold grudges:
  Always reciprocate cooperation immediately.

5 Game of Chicken
• Consider another type of encounter, the game of chicken (each cell lists payoff to i, payoff to j):

                    i defects    i cooperates
    j defects        (1, 1)        (2, 4)
    j cooperates     (4, 2)        (3, 3)

  (Think of James Dean in Rebel without a Cause: swerving = coop, driving straight = defect.)
• Difference to prisoner’s dilemma:
  Mutual defection is the most feared outcome.
  (Whereas the sucker’s payoff is most feared in the prisoner’s dilemma.)

Solution Concepts
• There is no dominant strategy (in our sense).
• Strategy pairs (C,D) and (D,C) are Nash equilibria.
• All outcomes except (D,D) are Pareto optimal.
• All outcomes except (D,D) maximise social welfare.

6 Other Symmetric 2 x 2 Games
• Given the 4 possible outcomes of (symmetric) cooperate/defect games, there are 24 possible orderings on outcomes.
  – $CC \succ_i CD \succ_i DC \succ_i DD$
    Cooperation dominates.
  – $DC \succ_i DD \succ_i CC \succ_i CD$
    Deadlock. You will always do best by defecting.
  – $DC \succ_i CC \succ_i DD \succ_i CD$
    Prisoner’s dilemma.
  – $DC \succ_i CC \succ_i CD \succ_i DD$
    Chicken.
  – $CC \succ_i DC \succ_i DD \succ_i CD$
    Stag hunt.
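
The list of orderings can be turned into a small lookup table. The Python sketch below derives agent i's ordering from its utilities and names the game when it matches one of the five listed here. It assumes strict preferences (no ties), and the outcome labels follow the slides: "DC" means i defects while j cooperates. `NAMED_GAMES` and `classify` are hypothetical names.

```python
from itertools import permutations

# Orderings from the slides, most preferred first; "DC" means i defects, j cooperates.
NAMED_GAMES = {
    ("CC", "CD", "DC", "DD"): "cooperation dominates",
    ("DC", "DD", "CC", "CD"): "deadlock",
    ("DC", "CC", "DD", "CD"): "prisoner's dilemma",
    ("DC", "CC", "CD", "DD"): "chicken",
    ("CC", "DC", "DD", "CD"): "stag hunt",
}

def classify(u_i):
    """Name the game implied by agent i's strict preference ordering over the outcomes."""
    ordering = tuple(sorted(u_i, key=u_i.get, reverse=True))
    return NAMED_GAMES.get(ordering, "not named on these slides")

# Number of strict orderings over the four outcomes.
print(len(list(permutations(["CC", "CD", "DC", "DD"]))))

# Agent i's utilities from the prisoner's dilemma and chicken matrices above.
print(classify({"DD": 2, "DC": 4, "CD": 1, "CC": 3}))
print(classify({"DD": 1, "DC": 4, "CD": 2, "CC": 3}))
```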