evolutionary game algorithm for continuous parameter optimization alireza mirian

Evolutionary Game Algorithm for continuous parameter optimization

Alireza Mirian

Alireza Mirian, Evolutionary Computation presentation, 2012

A system in which a number of rational players make decisions in a way that maximizes their utility.

What is Game Theory?

Non-cooperative and cooperative games

Equilibrium point

Evolutionary Game Algorithm

Mapping between strategy profile and x_i

Procedure of EGA

Results and comparison with other algorithms

What is a Game?


Each player (agent) has a set of possible actions (strategies) to choose from

Each player has a utility function that determines the profit/outcome of any decision

Agents are rational, self-interested decision makers, i.e. they make decisions based on their own view of utility.

Players don't have full control over the outcome. That is, a player's success depends on the choices of others

What is a Game?


Games cover a wide range, from parlor games (chess, poker, bridge) to various economic, political, military or biological situations.

What is a Game?


Game theory: the study of mathematical models of games

Key figures: John von Neumann and John Nash

Has many applications in economics, political science, and psychology, and in other sciences such as logic and biology

Tries to find a “solution” for a game

Decision theory: a special case of a game with only one player

In non-cooperative games, the goal of each player is to achieve the largest possible individual gain (profit or payoff)

In cooperative games, the actions of the players are directed to maximize the gain of “collectives” (coalitions), without subsequent subdivision of the gain among the players within the coalition

Non-cooperative: two-player Hokm

Cooperative: four-player Hokm

Let I denote the set of players

Let S_i denote the set of all possible actions for player i (the strategies of player i)

|S_i| > 1 (why?)

At each “round” of the game, each player chooses a certain strategy s_i ∈ S_i

So, after each round, a system s = (s_1, s_2, …, s_n) is put together. This system is called a situation (strategy profile)

In each situation, each player gets a profit

The set of all situations is S = S_1 × … × S_n = ∏_{i∈I} S_i
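The product S = S_1 × … × S_n can be made concrete with a small enumeration. The sketch below is only an illustration: the three strategy sets and their labels are invented for the example and do not come from the slides.

```python
from itertools import product

# Hypothetical strategy sets for three players (labels are illustrative only).
strategy_sets = [("a", "b"), ("x", "y", "z"), (0, 1)]

# S = S_1 x S_2 x S_3: every situation picks exactly one strategy per player.
situations = list(product(*strategy_sets))

print(len(situations))  # 2 * 3 * 2 = 12
print(situations[0])    # ('a', 'x', 0)
```

The number of situations is the product of the sizes of the strategy sets, which is why exhaustive enumeration quickly becomes impractical as the number of players grows.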

Non-cooperative game

Definition of a non-cooperative game:

G = [ I, {S_i}_{i∈I}, {U_i}_{i∈I} ]

I = {1, 2, …, n}: set of players

S_i: strategy set for player i (set of possible actions)

U_i: utility function of player i, defined on the set S = ∏_{i∈I} S_i









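Example: 4-barg!

I = {1, 2}

S_1 = { [card], [card], [card], [card] }

S_2 = { [card], [card], [card], [card] }

s = { [card], [card] }

U_1(s) = U_1({ [card], [card] }) = 2

U_2(s) = U_2({ [card], [card] }) = 0

[Card images from the original slide are not reproduced]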

s = (s_1, …, s_{i-1}, s_i, s_{i+1}, …, s_n)

s || s'_i = (s_1, …, s_{i-1}, s'_i, s_{i+1}, …, s_n)

That is, s || s'_i is a situation that differs from s only in the strategy of player i

Admissible situation: a situation s is called admissible for player i if, for any other strategy s'_i of this player, we have U_i(s || s'_i) ≤ U_i(s)








Admissible situation

A situation s that is admissible for all players is called an equilibrium situation

That is, in an equilibrium situation, no player is interested in changing their strategy. (why?)

Solution of a non-cooperative game: determination of an equilibrium situation
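The definitions of admissible and equilibrium situations translate almost directly into code. The following is a minimal sketch by brute-force enumeration; the helper names (`replace_strategy`, `is_admissible`, `equilibria`) and the two-player coordination game are assumptions made for illustration, not part of the presentation.

```python
from itertools import product

def replace_strategy(s, i, new):
    """Return s || s'_i: situation s with player i's strategy replaced by new."""
    return s[:i] + (new,) + s[i + 1:]

def is_admissible(s, i, strategy_sets, utilities):
    """True if no unilateral deviation of player i yields a higher U_i."""
    base = utilities[i](s)
    return all(utilities[i](replace_strategy(s, i, alt)) <= base
               for alt in strategy_sets[i])

def equilibria(strategy_sets, utilities):
    """All situations that are admissible for every player."""
    players = range(len(strategy_sets))
    return [s for s in product(*strategy_sets)
            if all(is_admissible(s, i, strategy_sets, utilities) for i in players)]

# Hypothetical coordination game: both players gain 1 if they match, else 0.
def match(s):
    return 1 if s[0] == s[1] else 0

strategy_sets = [(0, 1), (0, 1)]
utilities = [match, match]

print(equilibria(strategy_sets, utilities))  # [(0, 0), (1, 1)]
```

In this toy game both matching situations are equilibria, which shows that a game may have more than one equilibrium situation.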








Equilibrium point

An optimization problem: arg max f(x) subject to x ∈ D

where x = (x_1, x_2, ..., x_n) ∈ R^n with x_i ∈ [x_i^l, x_i^u], i = 1, 2, ..., n, is an n-dimensional real vector, f(x) is the objective function, D = [x_1^l, x_1^u] × … × [x_n^l, x_n^u] ⊆ R^n defines the search space, and x* satisfying f(x*) = max { f(x) | x ∈ D } is the optimal solution of the problem








Optimization problem

In EGA, the optimization problem is mapped into a non-cooperative game

The optimum is found by exploring the equilibrium situations of the corresponding game

The global convergence property of the algorithm has been proved








Optimization problem and game

Optimization variable: x = (x_1, x_2, ..., x_n)

Game: G = ( I, {S_i}_{i∈I}, {U_i}_{i∈I} )

Variable x is mapped to the strategy profile of the game agents

Objective function f is mapped to the game agents' utility function

n_x: the number of agents whose strategy profile represents one variable x_i

|I| = n × n_x

|S_i| = m

The strategy profile of n_x agents encodes an integer between 0 and m^{n_x} − 1

Precision of this mapping: (x_i^u − x_i^l) / (m^{n_x} − 1)
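To get a feel for how the precision depends on n_x, the formula can be evaluated for a few coalition sizes. This is a small sketch; the function name `precision` and the interval [-2.048, 2.048] with m = 2 are assumptions chosen for illustration.

```python
def precision(x_low, x_high, m, n_x):
    """Step between neighbouring representable values of one variable."""
    return (x_high - x_low) / (m ** n_x - 1)

# Binary strategies (m = 2) on the interval [-2.048, 2.048]:
for n_x in (5, 10, 20):
    print(n_x, precision(-2.048, 2.048, m=2, n_x=n_x))
```

Each additional agent in a coalition divides the step size by roughly m, at the cost of a strategy profile space that is m times larger.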








Mapping between strategy profile and xi

Decoding function φ: x_i = φ(s_i) = x_i^l + decimal(s_i) × (x_i^u − x_i^l) / (m^{n_x} − 1)

Example: f(x) = x_1 + x_2, where x_i ∈ [−2.048, 2.048], i = 1, 2

n_x = 10, m = 2

The overall strategy profile of the n_I = n × n_x = 20 agents is a binary string of length 20:

s: 0000110111 1101110001

x_1 = −2.048 + decimal(0000110111)_2 × 4.096 / (2^10 − 1) = −2.048 + 55 × 4.096 / 1023

x_2 = −2.048 + decimal(1101110001)_2 × 4.096 / (2^10 − 1) = −2.048 + 881 × 4.096 / 1023

x_1 ≈ −1.827785, x_2 ≈ 1.479445
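The decoding step of the example can be checked with a few lines of code. This is a sketch under two assumptions: the function name `decode` is made up here, and digits are read most significant first, which matches the usual reading of decimal(·)_2 in the example.

```python
def decode(profile, bounds, m, n_x):
    """Map a flat strategy profile to x, using one coalition of n_x agents per variable."""
    x = []
    for i, (low, high) in enumerate(bounds):
        digits = profile[i * n_x:(i + 1) * n_x]
        value = 0
        for d in digits:  # base-m number, most significant digit first (assumption)
            value = value * m + d
        x.append(low + value * (high - low) / (m ** n_x - 1))
    return x

profile = [int(c) for c in "00001101111101110001"]
x1, x2 = decode(profile, [(-2.048, 2.048)] * 2, m=2, n_x=10)
print(round(x1, 6), round(x2, 6))  # -1.827785 1.479445
```

The all-zero profile decodes to the lower bound and the all-(m − 1) profile to the upper bound, so both ends of the interval are representable.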








Mapping between strategy profile and xi

All the agents have the same utility function, which is just the objective function

u = { u_i(s) ≡ f(φ(s)), i ∈ I }, where I = {1, 2, 3, …, n_I}

s is the strategy profile of the n_I = n × n_x agents

In the previous example: s = (00001101111101110001)

u_i(s) = f(φ(s)) = f(x_1, x_2) = x_1 + x_2 ≈ −0.34834, i = 1, 2, 3, …, 20








Utility function

At the start of EGA, each agent randomly selects a strategy from its strategy set {0, 1, ..., m − 1}, each with probability 1/m

After that, in each loop:

Random perturbation: the current strategy of each agent is replaced with a random strategy, chosen with probability 1/m for each strategy

Deterministic process: the agents carry out a deterministic process to reach an equilibrium point s_e(t)








Procedure of EGA

Procedure EGA
  t = 0;
  randomly initialize s(0) and set it as current solution;
  while termination condition is not satisfied do
    perform a random perturb on current solution s(t);
    do a deterministic process to reach an equilibrium point s_e(t);
    if utility of s_e(t) ≥ utility of current solution
      current solution = s_e(t)
    end
    t = t + 1;
  end
end
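The outer loop above can be sketched in Python as follows. This is an illustrative skeleton, not the authors' implementation: the names `ega` and `sweep_best_response`, the fixed iteration budget used as termination condition, the `perturb_prob` knob (1.0 means every agent redraws its strategy) and the toy utility are all assumptions. The deterministic process is passed in as a function, and a simple one-agent-at-a-time improvement sweep stands in for it here.

```python
import random

def ega(utility, n_agents, m, reach_equilibrium,
        iterations=20, perturb_prob=1.0, seed=0):
    """Outer EGA loop: perturb, settle into an equilibrium, keep it if not worse."""
    rng = random.Random(seed)
    # s(0): every agent draws a strategy uniformly from {0, ..., m - 1}.
    current = [rng.randrange(m) for _ in range(n_agents)]
    for _ in range(iterations):  # assumed termination condition: fixed budget
        # Random perturbation; perturb_prob is an assumed parameter.
        perturbed = [rng.randrange(m) if rng.random() < perturb_prob else s
                     for s in current]
        candidate = reach_equilibrium(perturbed, utility, m)
        if utility(candidate) >= utility(current):
            current = candidate
    return current

def sweep_best_response(profile, utility, m):
    """Stand-in deterministic process: agents improve in turn until none can."""
    profile = list(profile)
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            for v in range(m):
                trial = profile[:i] + [v] + profile[i + 1:]
                if utility(trial) > utility(profile):
                    profile, improved = trial, True
    return profile

# Toy utility (assumption): the number of agents playing strategy 1.
best = ega(utility=sum, n_agents=8, m=2, reach_equilibrium=sweep_best_response)
print(best, sum(best))
```

Because a candidate is accepted only when its utility is at least that of the current solution, the utility of the current solution never decreases from one loop to the next.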








Procedure of EGA

How to reach the equilibrium point?

Coalition: the n_x agents that represent the same component x_i of variable x are defined as one coalition

In our example: agents 1, 2, ..., 10, which represent x_1, form one coalition, and agents 11, 12, ..., 20, which represent x_2, form another coalition.

BRC: the strategy profile of a coalition that maximizes its utility while the strategy profiles of the other coalitions are held fixed is called the Best-Response Correspondence (BRC) of that coalition.

Process of reaching equilibrium: while the equilibrium point is not reached, all coalitions replace their strategy profile with their BRC in sequence








Reaching equilibrium point

Pseudo code of reaching equilibrium point:

while equilibrium state is not achieved
  for agent coalition i = 1, 2, ..., n
    agent coalition i replaces its strategy profile with its BRC;
  end
end

Now two other questions:

How to decide whether an equilibrium point is achieved?

How does an agent coalition find out its BRC?








Reaching equilibrium point

How to decide whether an equilibrium point is achieved? The BRC rounds stop when one of the following holds:

r (the index of the current BRC round) reaches a predefined number R

the utility has not improved in d_r consecutive rounds

How does an agent coalition find out its BRC?

Exact BRC: the coalition would have to compute the utilities of all possible strategy profiles within its strategy profile space

The cardinality of the strategy profile set of a coalition (= m^{n_x}) is usually a very large number

So an inner-level optimization is used to find an approximate BRC.
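The BRC rounds and the two stopping rules can be sketched as follows. For the sake of a small runnable example the BRC is computed exactly by enumerating all m^{n_x} coalition profiles, which is feasible only for tiny n_x; the function names, the default values of R and d_r, and the toy utility are assumptions made for illustration.

```python
from itertools import product

def exact_brc(profile, c, n_x, m, utility):
    """Try all m**n_x profiles of coalition c while the other coalitions stay fixed."""
    start, end = c * n_x, (c + 1) * n_x
    best, best_u = profile, utility(profile)
    for block in product(range(m), repeat=n_x):
        trial = profile[:start] + list(block) + profile[end:]
        u = utility(trial)
        if u > best_u:
            best, best_u = trial, u
    return best

def reach_equilibrium(profile, n, n_x, m, utility, R=50, d_r=2):
    """Run BRC rounds until R rounds are done or d_r rounds bring no improvement."""
    profile = list(profile)
    best_u, stalled = utility(profile), 0
    for _ in range(R):
        for c in range(n):  # coalitions respond in sequence
            profile = exact_brc(profile, c, n_x, m, utility)
        u = utility(profile)
        if u > best_u:
            best_u, stalled = u, 0
        else:
            stalled += 1
        if stalled >= d_r:
            break
    return profile

def to_int(bits):
    value = 0
    for b in bits:
        value = value * 2 + b
    return value

# Toy utility (assumption): two 3-bit coalitions encoding integers a and b.
def utility(profile):
    a, b = to_int(profile[:3]), to_int(profile[3:])
    return -(a - b) ** 2 - (a - 5) ** 2

result = reach_equilibrium([0] * 6, n=2, n_x=3, m=2, utility=utility)
print(result, utility(result))
```

In this toy problem the rounds settle at a = b = 4 with utility −1, although a = b = 5 would give 0. Neither coalition can improve on its own from there, so the point is an equilibrium without being the global optimum, which is why the outer loop keeps perturbing the current solution.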








Two remaining problems

Inner-level optimization for approximating the BRC has two phases:

First phase: with a perturb probability p_d, the current strategy of each agent in a coalition is replaced with a new strategy, chosen with probability 1/m for each strategy.

Second phase: each agent in the coalition, in sequence, replaces its current strategy with the optimal strategy from its strategy set {0, 1, ..., m − 1}, i.e. the one that maximizes its utility.

The inner-level optimization process has the same structure as the main loop of EGA itself if we regard one agent as a coalition (except that the inner process has only one loop, i.e. one BRC round)
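The two phases can be sketched for a single coalition as below. This is a hedged illustration: the function name `approximate_brc`, the way the random generator is passed in, and the toy utility are assumptions, and ties in the second phase are broken in favour of the lowest-numbered strategy simply because that is how `max` behaves.

```python
import random

def approximate_brc(profile, c, n_x, m, utility, p_d, rng):
    """One inner-level pass for coalition c: random perturbation, then sequential best strategies."""
    profile = list(profile)
    agents = range(c * n_x, (c + 1) * n_x)
    # Phase 1: with probability p_d an agent redraws its strategy uniformly from {0..m-1}.
    for a in agents:
        if rng.random() < p_d:
            profile[a] = rng.randrange(m)
    # Phase 2: each agent in turn picks the strategy that maximizes the shared utility.
    for a in agents:
        def utility_with(v, a=a):
            return utility(profile[:a] + [v] + profile[a + 1:])
        profile[a] = max(range(m), key=utility_with)
    return profile

# Toy utility (assumption): the number of agents playing strategy 1.
rng = random.Random(0)
print(approximate_brc([0] * 6, c=1, n_x=3, m=2, utility=sum, p_d=0.3, rng=rng))
```

The second phase costs n_x × m utility evaluations per coalition, compared with m^{n_x} for the exact BRC, which is the reason for accepting an approximation.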








Inner-level optimization

Y. Jun, L. Xiande, H. Lu, “Evolutionary game algorithm for continuous parameter optimization”, Information Processing Letters, 2004

N. N. Vorob’ev, “Game Theory: Lectures for Economists and Systems Scientists”, Springer-Verlag, 1977

R. D. Luce, H. Raiffa, “Games and Decisions”, J. Wiley & Sons, 1957

R. Cressman, “The Stability Concept of Evolutionary Game Theory”, Springer-Verlag, 1992

E. van Damme, “Non-cooperative Games”, TILEC and CentER, Tilburg University, 2004

Y. Jun, L. Xiande, H. Lu, “Evolutionary game algorithm for multiple knapsack problem”, Proc. of 2003 IEEE/WIC International Conference on Intelligent Agent Technology, 2003

D. Ross, “Game Theory”, The Stanford Encyclopedia of Philosophy (Fall 2011 Edition), Edward N. Zalta (ed.), 2011

D. K. Levine, “What is Game Theory?”, Department of Economics, UCLA

References

Thanks for your attention

:D
