An Introduction to Multiagent Systems, 2e
http://www.csc.liv.ac.uk/~mjw/pubs/imas/

Chapter 11: Multiagent Interactions

What are Multiagent Systems?

A multiagent system contains a number of agents...
• ...which interact through communication...
• ...are able to act in an environment...
• ...have different “spheres of influence” (which may coincide)...
• ...will be linked by other (organisational) relationships.

Utilities and Preferences

• Assume we have just two agents: $Ag = \{i, j\}$.
• Agents are assumed to be self-interested: they have preferences over how the environment is.
• Assume $\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of “outcomes” that agents have preferences over.
• We capture preferences by utility functions:
  $u_i : \Omega \to \mathbb{R}$
  $u_j : \Omega \to \mathbb{R}$

• Utility functions lead to preference orderings over outcomes:
  $\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
  $\omega \succ_i \omega'$ means $u_i(\omega) > u_i(\omega')$
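As a small illustration of how a utility function induces a preference ordering, the following Python sketch may help. The outcome names and utility values are made up for the example (they are assumptions, not taken from the slides), as are the helper names.

```python
# Hypothetical utilities for agent i over three outcomes (illustrative values only).
u_i = {"w1": 1.0, "w2": 4.0, "w3": 4.0}

def weakly_prefers(u, a, b):
    """True if outcome a is at least as good as outcome b, i.e. u(a) >= u(b)."""
    return u[a] >= u[b]

def strictly_prefers(u, a, b):
    """True if outcome a is strictly better than outcome b, i.e. u(a) > u(b)."""
    return u[a] > u[b]

# Outcomes sorted from most to least preferred; tied outcomes keep their original order.
ranking = sorted(u_i, key=u_i.get, reverse=True)
print(ranking)
```

Two outcomes with equal utility are weakly preferred to each other but neither is strictly preferred, which is why the ordering is in general only a weak one.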

What is Utility?

• Utility is not money (but it is a useful analogy).
• Typical relationship between utility & money:

[Figure: utility plotted against money]

Multiagent Encounters

• We need a model of the environment in which these agents will act...
  – agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in $\Omega$ will result;
  – the actual outcome depends on the combination of actions;
  – assume each agent has just two possible actions that it can perform: “C” (“cooperate”) and “D” (“defect”).

• Environment behaviour is given by a state transformer function:
  $\tau : \underbrace{Ac}_{\text{agent } i\text{'s action}} \times \underbrace{Ac}_{\text{agent } j\text{'s action}} \to \Omega$

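• Here is a state transformer function:
  $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_3 \quad \tau(C,C) = \omega_4$
  (This environment is sensitive to actions of both agents.)
• Here is another:
  $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_1 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_1$
  (Neither agent has any influence in this environment.)
• And here is another:
  $\tau(D,D) = \omega_1 \quad \tau(D,C) = \omega_2 \quad \tau(C,D) = \omega_1 \quad \tau(C,C) = \omega_2$
  (This environment is controlled by $j$.)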
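The three kinds of environment described here (sensitive to both agents, influenced by neither, controlled by one) can be told apart mechanically. The sketch below represents a state transformer as a Python dictionary keyed by (agent i's action, agent j's action). The specific outcome labels in the three tables and the helper name `influences` are assumptions chosen for illustration.

```python
from itertools import product

ACTIONS = ("C", "D")

def influences(tau, agent):
    """True if changing this agent's action (0 = i, 1 = j) can change the outcome."""
    for a_i, a_j in product(ACTIONS, repeat=2):
        for alt in ACTIONS:
            other = (alt, a_j) if agent == 0 else (a_i, alt)
            if tau[(a_i, a_j)] != tau[other]:
                return True
    return False

# Keys are (agent i's action, agent j's action); values are outcome labels.
tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {pair: "w1" for pair in product(ACTIONS, repeat=2)}
tau_j_only = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}

for name, tau in [("both", tau_both), ("none", tau_none), ("j only", tau_j_only)]:
    print(name, influences(tau, 0), influences(tau, 1))
```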

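Rational Action

• Suppose we have the case where both agents can influence the outcome, and they have utility functions as follows:
  $u_i(\omega_1) = 1 \quad u_i(\omega_2) = 1 \quad u_i(\omega_3) = 4 \quad u_i(\omega_4) = 4$
  $u_j(\omega_1) = 1 \quad u_j(\omega_2) = 4 \quad u_j(\omega_3) = 1 \quad u_j(\omega_4) = 4$
• With a bit of abuse of notation:
  $u_i(D,D) = 1 \quad u_i(D,C) = 1 \quad u_i(C,D) = 4 \quad u_i(C,C) = 4$
  $u_j(D,D) = 1 \quad u_j(D,C) = 4 \quad u_j(C,D) = 1 \quad u_j(C,C) = 4$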

• Then agent $i$’s preferences are:
  $C,C \succeq_i C,D \succ_i D,C \succeq_i D,D$
• “C” is the rational choice for $i$.
  (Because $i$ prefers all outcomes that arise through C over all outcomes that arise through D.)

Payoff Matrices

• We can characterise the previous scenario in a payoff matrix:

                 i: defect    i: coop
  j: defect      (1, 1)       (4, 1)
  j: coop        (1, 4)       (4, 4)

  Each cell lists (payoff to i, payoff to j).

• Agent i is the column player.
• Agent j is the row player.

Solution Concepts

• How will a rational agent behave in any given scenario?
• Answered in solution concepts:
  – dominant strategy;
  – Nash equilibrium strategy;
  – Pareto optimal strategies;
  – strategies that maximise social welfare.

Dominant Strategies

• We will say that a strategy $s_i$ is dominant for player $i$ if, no matter what strategy $s_j$ agent $j$ chooses, $i$ will do at least as well playing $s_i$ as it would doing anything else.
• Unfortunately, there isn’t always a dominant strategy.
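The definition of a dominant strategy can be checked mechanically. The sketch below is a minimal Python illustration, assuming a payoff table keyed by (i's action, j's action) whose values are (payoff to i, payoff to j). The table `GAME`, its numbers, and the helper name `is_dominant_for_i` are assumptions made for the example.

```python
ACTIONS = ("C", "D")

# (i's action, j's action) -> (payoff to i, payoff to j); illustrative values.
GAME = {
    ("D", "D"): (2, 2),
    ("C", "D"): (1, 4),
    ("D", "C"): (4, 1),
    ("C", "C"): (3, 3),
}

def is_dominant_for_i(game, s_i):
    """s_i is dominant for i if, whatever j plays, s_i does at least as well as any alternative."""
    return all(
        game[(s_i, s_j)][0] >= game[(alt, s_j)][0]
        for s_j in ACTIONS
        for alt in ACTIONS
    )

print({s: is_dominant_for_i(GAME, s) for s in ACTIONS})
```

In this particular table defecting is dominant for i, while cooperating is not; other tables may have no dominant strategy at all.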

(Pure Strategy) Nash Equilibrium

• In general, we will say that two strategies $s_1$ and $s_2$ are in Nash equilibrium if:
  1. under the assumption that agent $i$ plays $s_1$, agent $j$ can do no better than play $s_2$; and
  2. under the assumption that agent $j$ plays $s_2$, agent $i$ can do no better than play $s_1$.
• Neither agent has any incentive to deviate from a Nash equilibrium.
• Unfortunately:

  1. Not every interaction scenario has a pure strategy Nash equilibrium.
  2. Some interaction scenarios have more than one Nash equilibrium.
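Both caveats are easy to see by brute force. The following Python sketch enumerates pure strategy Nash equilibria of a two-player game by testing the two conditions of the definition for every strategy pair. The two example games (`no_eq`, a matching-pennies style game, and `two_eq`, a simple coordination game) and all names are assumptions chosen for illustration.

```python
from itertools import product

def pure_nash_equilibria(game):
    """Return all (s1, s2) where neither player gains by deviating unilaterally.

    `game` maps (i's strategy, j's strategy) -> (payoff to i, payoff to j).
    """
    acts_i = sorted({a for a, _ in game})
    acts_j = sorted({b for _, b in game})
    result = []
    for s1, s2 in product(acts_i, acts_j):
        i_ok = all(game[(s1, s2)][0] >= game[(alt, s2)][0] for alt in acts_i)
        j_ok = all(game[(s1, s2)][1] >= game[(s1, alt)][1] for alt in acts_j)
        if i_ok and j_ok:
            result.append((s1, s2))
    return result

# A game with no pure strategy equilibrium: someone always wants to switch.
no_eq = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
         ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}

# A coordination game with two equilibria.
two_eq = {("A", "A"): (1, 1), ("A", "B"): (0, 0),
          ("B", "A"): (0, 0), ("B", "B"): (1, 1)}

print(pure_nash_equilibria(no_eq))
print(pure_nash_equilibria(two_eq))
```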

Matching Pennies

Players $i$ and $j$ simultaneously choose the face of a coin, either “heads” or “tails”. If they show the same face, then $i$ wins, while if they show different faces, then $j$ wins.

Matching Pennies: Payoff Matrix

                 i: heads     i: tails
  j: heads       (1, -1)      (-1, 1)
  j: tails       (-1, 1)      (1, -1)

Each cell lists (payoff to i, payoff to j).

Mixed Strategies for Matching Pennies

• No pair of strategies forms a pure strategy Nash equilibrium (NE): whatever pair of strategies is chosen, somebody will wish they had done something else.
• The solution is to allow mixed strategies:
  – play “heads” with probability 0.5
  – play “tails” with probability 0.5.
• This is a NE strategy.
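One way to see why the 50/50 mix is an equilibrium strategy is to compute expected payoffs. The sketch below assumes a win is worth +1 and a loss -1 (i wins when the faces match); the function name and the probabilities tried are assumptions for illustration.

```python
def expected_utility_i(p_i_heads, p_j_heads):
    """Expected payoff to i: +1 when the faces match, -1 when they differ."""
    p_match = p_i_heads * p_j_heads + (1 - p_i_heads) * (1 - p_j_heads)
    return p_match * 1 + (1 - p_match) * (-1)

# If j mixes 50/50, every strategy of i (pure or mixed) earns the same expected payoff,
# so i has no incentive to deviate from its own 50/50 mix.
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, expected_utility_i(p, 0.5))

# If j leans towards heads, i can exploit that by always playing heads.
print(expected_utility_i(1.0, 0.6))
```

Because the game is zero sum, the same argument applies to j with the signs flipped.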

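Mixed Strategies

• A mixed strategy has the form
  – play $\alpha_1$ with probability $p_1$
  – play $\alpha_2$ with probability $p_2$
  – ...
  – play $\alpha_k$ with probability $p_k$
  such that $p_1 + p_2 + \cdots + p_k = 1$.
• Nash proved that every finite game has a Nash equilibrium in mixed strategies.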

Nash’s Theorem

• Nash proved that every finite game has a Nash equilibrium in mixed strategies. (Unlike the case for pure strategies.)
• So this result overcomes the lack of solutions; but there still may be more than one Nash equilibrium...

Pareto Optimality

• An outcome is said to be Pareto optimal (or Pareto efficient) if there is no other outcome that makes one agent better off without making another agent worse off.
• If an outcome is Pareto optimal, then at least one agent will be reluctant to move away from it (because this agent will be worse off).

• If an outcome $\omega$ is not Pareto optimal, then there is another outcome $\omega'$ that makes everyone as happy as, if not happier than, $\omega$.
• “Reasonable” agents would agree to move to $\omega'$ in this case. (Even if I don’t directly benefit from $\omega'$, you can benefit without me suffering.)
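The definition translates directly into a dominance test over payoff vectors. In the Python sketch below, an outcome is rejected if some other outcome gives every agent at least as much and at least one agent strictly more. The payoff table `GAME` and the function name are assumptions used only for illustration.

```python
# (i's action, j's action) -> (payoff to i, payoff to j); illustrative values.
GAME = {
    ("D", "D"): (2, 2),
    ("D", "C"): (4, 1),
    ("C", "D"): (1, 4),
    ("C", "C"): (3, 3),
}

def pareto_optimal(game):
    """Return the outcomes that no other outcome Pareto-dominates."""
    def dominated(payoffs):
        return any(
            all(o >= p for o, p in zip(other, payoffs))
            and any(o > p for o, p in zip(other, payoffs))
            for other in game.values()
        )
    return [outcome for outcome, payoffs in game.items() if not dominated(payoffs)]

print(pareto_optimal(GAME))
```

Here mutual defection is the only outcome that is not Pareto optimal, since mutual cooperation makes both agents better off.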

Social Welfare

• The social welfare of an outcome $\omega$ is the sum of the utilities that each agent gets from $\omega$:
  $\sum_{i \in Ag} u_i(\omega)$
• Think of it as the “total amount of money in the system”.
• As a solution concept, it may be appropriate when the whole system (all agents) has a single owner (then the overall benefit of the system is important, not individuals).
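As a quick illustration, social welfare is just a sum over each outcome's payoff vector, and the welfare-maximising outcomes are those with the largest sum. The payoff table and function names in this Python sketch are assumptions made for the example.

```python
# (i's action, j's action) -> (payoff to i, payoff to j); illustrative values.
GAME = {
    ("D", "D"): (2, 2),
    ("D", "C"): (4, 1),
    ("C", "D"): (1, 4),
    ("C", "C"): (3, 3),
}

def social_welfare(game, outcome):
    """Sum of the utilities that all agents receive from the given outcome."""
    return sum(game[outcome])

def welfare_maximisers(game):
    """All outcomes whose total utility is the largest available."""
    best = max(sum(payoffs) for payoffs in game.values())
    return [o for o, payoffs in game.items() if sum(payoffs) == best]

print(social_welfare(GAME, ("D", "D")))
print(welfare_maximisers(GAME))
```

Note that the sum ignores how utility is distributed: an outcome with payoffs (4, 1) counts the same as one with (1, 4).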

Competitive and Zero-Sum Interactions

• Where preferences of agents are diametrically opposed we have strictly competitive scenarios.
• Zero-sum encounters are those where utilities sum to zero:
  $u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$.
• Zero sum encounters are bad news: for me to get positive utility you have to get negative utility! The best outcome for me is the worst for you!

• Zero sum encounters in real life are very rare... but people frequently act as if they were in a zero sum game.

The Prisoner’s Dilemma

Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
• if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
• if both confess, then each will be jailed for two years.
Both prisoners know that if neither confesses, then they will each be jailed for one year.

• Payoff matrix for prisoner’s dilemma:

                 i: defect    i: coop
  j: defect      (2, 2)       (1, 4)
  j: coop        (4, 1)       (3, 3)

  Each cell lists (payoff to i, payoff to j).

• Top left: If both defect, then both get punishment for mutual defection.
• Top right: If i cooperates and j defects, i gets sucker’s payoff of 1, while j gets 4.

• Bottom left: If j cooperates and i defects, j gets sucker’s payoff of 1, while i gets 4.
• Bottom right: Reward for mutual cooperation.

What Should You Do?

• The individually rational action is defect. This guarantees a payoff of no worse than 2, whereas cooperating guarantees only a payoff of 1 (the sucker’s payoff) in the worst case.
• So defection is the best response to all possible strategies: both agents defect, and get payoff = 2.
• But intuition says this is not the best outcome: surely they should both cooperate and each get payoff of 3!

Solution Concepts

• D is a dominant strategy.
• (D, D) is the only Nash equilibrium.
• All outcomes except (D, D) are Pareto optimal.
• (C, C) maximises social welfare.

• This apparent paradox is the fundamental problem of multi-agent interactions. It appears to imply that cooperation will not occur in societies of self-interested agents.
• Real world examples:
  – nuclear arms reduction (“why don’t I keep mine...”)
  – free rider systems: public transport;
  – television licenses.
• The prisoner’s dilemma is ubiquitous.
• Can we recover cooperation?

Arguments for Recovering Cooperation

• Conclusions that some have drawn from this analysis:
  – the game theory notion of rational action is wrong!
  – somehow the dilemma is being formulated wrongly
• Arguments to recover cooperation:
  – We are not all Machiavelli!
  – The other prisoner is my twin!
  – Program equilibria and mediators
  – The shadow of the future...

Program Equilibria

• The strategy you really want to play in the prisoner’s dilemma is: I’ll cooperate if he will.
• Program equilibria provide one way of enabling this.
• Each agent submits a program strategy to a mediator which jointly executes the strategies. Crucially, strategies can be conditioned on the strategies of the others.

• Consider the following program:

  IF HisProgram == ThisProgram THEN
    DO(C);
  ELSE
    DO(D);
  END-IF.

  Here == is textual comparison.
• The best response to this program is to submit the same program, giving an outcome of (C, C)!

• You can’t get the sucker’s payoff by submitting this program.
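The mediator idea can be mimicked with a toy interpreter. In the Python sketch below, programs are plain strings and the mediator runs each one with the other's text as input. The interpreter `run`, the `ALWAYS_DEFECT` program, and the function names are hypothetical constructions for this example; only the conditional program's text follows the slide.

```python
PROGRAM = "IF HisProgram == ThisProgram THEN DO(C); ELSE DO(D); END-IF."
ALWAYS_DEFECT = "DO(D);"  # hypothetical alternative submission

def run(this_program, his_program):
    """Toy interpreter that understands only the two program texts defined above."""
    if this_program == ALWAYS_DEFECT:
        return "D"
    if this_program == PROGRAM:
        # Textual comparison of the two submitted programs.
        return "C" if his_program == this_program else "D"
    raise ValueError("unknown program")

def mediate(prog_i, prog_j):
    """The mediator jointly executes both programs and returns (i's action, j's action)."""
    return run(prog_i, prog_j), run(prog_j, prog_i)

print(mediate(PROGRAM, PROGRAM))
print(mediate(PROGRAM, ALWAYS_DEFECT))
```

Against an identical submission the program cooperates; against anything else it defects, so it never ends up cooperating with a defector.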

The Iterated Prisoner’s Dilemma

• One answer: play the game more than once. If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate.
• Cooperation is the rational choice in the infinitely repeated prisoner’s dilemma. (Hurrah!)

Backwards Induction

• But... suppose you both know that you will play the game exactly $n$ times. On round $n - 1$, you have an incentive to defect, to gain that extra bit of payoff... But this makes round $n - 2$ the last “real” round, and so you have an incentive to defect there, too. This is the backwards induction problem.
• Playing the prisoner’s dilemma with a fixed, finite, pre-determined, commonly known number of rounds, defection is the best strategy.

Axelrod’s Tournament

• Suppose you play iterated prisoner’s dilemma against a range of opponents... What strategy should you choose, so as to maximise your overall payoff?
• Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner’s dilemma.

Strategies in Axelrod’s Tournament

• ALLD: “Always defect” (the hawk strategy);
• TIT-FOR-TAT:
  1. On round $u = 0$, cooperate.
  2. On round $u > 0$, do what your opponent did on round $u - 1$.
• TESTER: On 1st round, defect. If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation & defection.

• JOSS: As TIT-FOR-TAT, except periodically defect.
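To make the strategy descriptions concrete, here is a minimal Python sketch of an iterated match between two of them, ALLD and TIT-FOR-TAT. It assumes prisoner's dilemma payoffs of 2 for mutual defection, 3 for mutual cooperation, and 1 versus 4 for the sucker and the defector; the function names and the round count are assumptions for illustration, not a reconstruction of Axelrod's tournament code.

```python
# (i's action, j's action) -> (payoff to i, payoff to j)
PAYOFF = {("D", "D"): (2, 2), ("C", "D"): (1, 4),
          ("D", "C"): (4, 1), ("C", "C"): (3, 3)}

def all_d(my_history, opp_history):
    """ALLD: always defect."""
    return "D"

def tit_for_tat(my_history, opp_history):
    """Cooperate on round 0, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def play(strategy_i, strategy_j, rounds):
    """Play the iterated game and return the total payoffs (i, j)."""
    hist_i, hist_j = [], []
    score_i = score_j = 0
    for _ in range(rounds):
        a_i = strategy_i(hist_i, hist_j)
        a_j = strategy_j(hist_j, hist_i)
        p_i, p_j = PAYOFF[(a_i, a_j)]
        score_i += p_i
        score_j += p_j
        hist_i.append(a_i)
        hist_j.append(a_j)
    return score_i, score_j

print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, all_d, 10))
print(play(all_d, all_d, 10))
```

TIT-FOR-TAT loses narrowly to ALLD head to head, but two TIT-FOR-TAT players earn far more together than two ALLD players, which hints at why it can do well over a whole tournament.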

Recipes for Success in Axelrod’s Tournament

Axelrod suggests the following rules for succeeding in his tournament:
• Don’t be envious: Don’t play as if it were zero sum!
• Be nice: Start by cooperating, and reciprocate cooperation.
• Retaliate appropriately: Always punish defection immediately, but use “measured” force; don’t overdo it.

• Don’t hold grudges: Always reciprocate cooperation immediately.

Game of Chicken

• Consider another type of encounter, the game of chicken:

                 i: defect    i: coop
  j: defect      (1, 1)       (2, 4)
  j: coop        (4, 2)       (3, 3)

  Each cell lists (payoff to i, payoff to j).

  (Think of James Dean in Rebel without a Cause: swerving = coop, driving straight = defect.)
• Difference to prisoner’s dilemma:

  Mutual defection is the most feared outcome. (Whereas the sucker’s payoff is most feared in prisoner’s dilemma.)

Solution Concepts

• There is no dominant strategy (in our sense).
• Strategy pairs (C, D) and (D, C) are in Nash equilibrium.
• All outcomes except (D, D) are Pareto optimal.
• All outcomes except (D, D) maximise social welfare.

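Other Symmetric 2 x 2 Games

• Given the 4 possible outcomes of (symmetric) cooperate/defect games, there are 24 possible orderings on outcomes.
  – $CC \succ_i CD \succ_i DC \succ_i DD$: Cooperation dominates.
  – $DC \succ_i DD \succ_i CC \succ_i CD$: Deadlock. You will always do best by defecting.
  – $DC \succ_i CC \succ_i DD \succ_i CD$: Prisoner’s dilemma.
  – $DC \succ_i CC \succ_i CD \succ_i DD$: Chicken.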
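The count of 24 is simply the number of strict orderings of four outcomes (4! = 24), and a game's ordering can be read off from agent i's payoffs. The Python sketch below assumes outcomes are written with i's action first (so "DC" means i defects and j cooperates) and uses payoff values 1 to 4 for the prisoner's dilemma and chicken; the dictionaries and function name are assumptions for illustration.

```python
from itertools import permutations

OUTCOMES = ("CC", "CD", "DC", "DD")  # first letter is i's action, second is j's

def ordering(u_i):
    """Outcomes sorted from i's most preferred to least preferred."""
    return sorted(OUTCOMES, key=lambda o: u_i[o], reverse=True)

# Agent i's payoffs in each game (illustrative values).
prisoners_dilemma_i = {"CC": 3, "CD": 1, "DC": 4, "DD": 2}
chicken_i = {"CC": 3, "CD": 2, "DC": 4, "DD": 1}

n_orderings = len(list(permutations(OUTCOMES)))

print(n_orderings)
print(ordering(prisoners_dilemma_i))
print(ordering(chicken_i))
```

The two games differ only in the relative position of the last two outcomes: in the prisoner's dilemma the sucker's payoff (CD) is worst, while in chicken mutual defection (DD) is worst.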
