Lecture 8, 11/08/07 Information, Control & Games, Fall 07, Copyright P. B. Luh, S.-C. Chang
Information, Control & Games: Lecture #8, Hierarchical Games

Last Two Times
• Finite Nash Games
  – Feedback Games and Behavior Strategies
• Infinite Nash Games
  – Open-Loop, Feedback, Closed-Loop Nash Equilibria
• Cooperative Games
  – Coalitional games
  – Redistribution of payoffs
  – Unanimity game and the core
  – Majority, vote trading, Landowner & workers game
  – Shapley Value under Differential Marginal Contributions
  – Cooperative Game and Risk

Next Time
• 11/15 No Class
• 11/22 Midterm exam
Today
• Finite Hierarchical Games
  – Motivating Examples
  – Solution Concept
  – Examples and Results on Finite Games
  – An Example of Single-Act Infinite Games
• The Inducible Region Approach
  – Approach for Single-Stage Problems
  – Principle of Optimality
  – Multi-Stage Games
• Team Decision Theory
  – A Motivating Example
  – A Formal Model and Solution Methodology
  – A Canonical Example
• Reading Assignments for Today
1. Text, Section 6.2
2. T. S. Chang and P. B. Luh, "Derivation of Necessary and Sufficient Conditions for Single-Stage Stackelberg Games via the Inducible Region Concept," IEEE Transactions on Automatic Control, Vol. AC-29, No. 1, Jan. 1984, pp. 63-66.
3. P. B. Luh, S. C. Chang, and T. S. Chang, "Solutions and Properties of Multi-Stage Stackelberg Games," Automatica, Vol. 20, No. 2, March 1984, pp. 251-256.
4. Relevant sections of T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory.
5. Y. C. Ho, "Team Decision Theory and Information Structures," Proceedings of the IEEE, Vol. 68, No. 6, June 1980, pp. 644-654.
Hierarchical Games: Motivating Examples
• Consider the Dating Game discussed previously:

  Gentleman \ Lady   Opera      Football
  Opera              (-1, -2)   (0, 0)
  Football           (0, 0)     (-2, -1)

– There are two Nash solutions, which are not equivalent (they do not have the same pair of costs) and not interchangeable (mixing the various Nash choices may not end up with a Nash solution)
Q. Suppose the lady is in a dominating position and can announce and then impose her strategy, knowing that the gentleman will react "rationally." What should she announce? Why?
– The lady would announce and impose Opera
– The DM holding the powerful position to announce and then impose his/her strategy is the Leader
– The Followers then react rationally to the Leader's strategy
– This is a Hierarchical, Leader-Follower, or Stackelberg Game
Q. Who is the leader? For example, after marriage, would the lady remain the leader?
– One earns the leader position, or holds it by the authority of a position (which is still earned)
Q. Other examples of hierarchical games?
– Grading for this course:
  • Allocation of 100 points: what is important
  • Homework: grading 2 randomly selected problems
  • Term project: extra credit for cross-discipline teaming
– What would happen if a student said, "I am not going to hand in any homework assignment for this course"?
– Other examples: seat belt laws, speed limits
Solution Concept
• Consider a two-person problem with 1 leader (DM1) and 1 follower (DM2)
  – DM1: strategy γ1 ∈ Γ1, cost function J1(γ1, γ2)
  – DM2: strategy γ2 ∈ Γ2, cost function J2(γ1, γ2)
SGD. How to describe the hierarchical game concept?
• For a given γ1, DM2 reacts rationally by minimizing his/her cost, i.e., min_{γ2∈Γ2} J2(γ1, γ2)
  – DM2's rational reaction set:
    R2(γ1) ≜ {ζ ∈ Γ2 | J2(γ1, ζ) ≤ J2(γ1, γ2) ∀γ2 ∈ Γ2}
• The leader needs to find the best strategy γ1S to minimize J1(γ1, γ2), taking into account the follower's reactions:
    min_{γ1∈Γ1} J1(γ1, γ2), s.t. γ2 ∈ R2(γ1)
Q. If you were DM1, what would you do?
– We need some behavioral assumptions
– The leader is assumed to be conservative, safeguarding against the worst case
Q. What if there are multiple elements in R2(γ1)?
Example:

  DM1 \ DM2   L         M        R
  L           (0, 0)    (1, 0)   (3, 1)
  R           (2, -1)   (2, 0)   (-1, -1)

– Select L, for the worst-case costs (1, 0). Mathematically?
    min_{γ1∈Γ1} max_{γ2∈R2(γ1)} J1(γ1, γ2)
  γ1S ~ the Stackelberg strategy for the leader
  J1S = J1(γ1S, γ2S) ~ the Stackelberg cost
Q. Is the problem easier or more difficult as compared to Nash?
• The problem is difficult even if R2(γ1) is a singleton for every γ1
  – It is difficult to characterize the reaction set R2(γ1)
  – It is difficult to optimize J1(γ1, γ2) s.t. γ2 ∈ R2(γ1)
Graphical Interpretation
Q. What is R2(u1)? What is the solution with DM1 as the leader (S1)? What is R1(u2)? The solution with DM2 as the leader (S2)? The Nash equilibrium (N)?

[Figure: u1-u2 plane with level curves for DM1 and DM2, the reaction curves R1 and R2, and the points S1, S2, and N]

Q. How do we compare S1 with N for DM1? Why?
– S1 is no worse than N for DM1 if R2(γ1) is a singleton for each γ1
– N is the intersection of R1 and R2, while S1 attains the best J1 on R2
– The same is true for DM2
Examples
Example 1. A Matrix Game
Q. What is the solution when DM1 is the leader? With DM2 as the leader? The Nash solutions?

  DM1 \ DM2   γ21        γ22       γ23       γ24
  L           (-4, -1)   (2, 0)    (0, 1)    (2, -1)
  M           (-3, -2)   (0, 3)    (0, -3)   (-3, -2)
  R           (4, -1)    (1, 0)    (1, 0)    (-2, -1)

– γ1S = M, with costs (0, -3)
– γ2S = γ24, with costs (-3, -2)
– The Nash equilibrium points (the intersection of R1 and R2) are (L, γ21) with costs (-4, -1) and (M, γ23) with costs (0, -3)
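The conservative leader computation can be sketched in a few lines of code. The following is a minimal illustration of the min-max rule (ours, not from the text): for each leader action it builds the follower's rational reaction set and applies the worst-case tie-break; the function name `stackelberg` and the dictionary encoding are our own choices.

```python
# Stackelberg solution of a finite bimatrix game with a conservative leader:
# for each leader action, find the follower's rational reaction set (all
# cost-minimizing replies), take the worst leader cost over that set, and
# let the leader pick the action minimizing this guaranteed cost.

def stackelberg(costs, leader=0):
    """costs[(a1, a2)] = (J1, J2); leader = 0 (DM1) or 1 (DM2)."""
    follower = 1 - leader
    acts = [sorted({k[i] for k in costs}) for i in (0, 1)]
    best_val, best_act = None, None
    for a in acts[leader]:
        def pair(b):  # costs when leader plays a and follower plays b
            return costs[(a, b)] if leader == 0 else costs[(b, a)]
        # Follower's rational reaction set against leader action a
        fmin = min(pair(b)[follower] for b in acts[follower])
        reaction = [b for b in acts[follower] if pair(b)[follower] == fmin]
        worst = max(pair(b)[leader] for b in reaction)  # conservative rule
        if best_val is None or worst < best_val:
            best_val, best_act = worst, a
    return best_act, best_val

# Example 1 above: rows L, M, R for DM1; columns 1..4 for DM2
G = {('L', 1): (-4, -1), ('L', 2): (2, 0), ('L', 3): (0, 1), ('L', 4): (2, -1),
     ('M', 1): (-3, -2), ('M', 2): (0, 3), ('M', 3): (0, -3), ('M', 4): (-3, -2),
     ('R', 1): (4, -1), ('R', 2): (1, 0), ('R', 3): (1, 0), ('R', 4): (-2, -1)}

print(stackelberg(G, leader=0))  # ('M', 0): DM1 leads, guaranteed J1 = 0
print(stackelberg(G, leader=1))  # (4, -2): DM2 leads with column 4, J2 = -2
```

Both outputs agree with the solutions stated above: M with costs (0, -3) when DM1 leads, and γ24 with costs (-3, -2) when DM2 leads.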
Example 2. A Game in Extensive Form

[Game tree: DM1 moves first (L or R); DM2 observes the move and then chooses L or R. Outcomes (J1, J2): (L, L) → (6, -6), (L, R) → (3, -7), (R, L) → (4, -10), (R, R) → (10, -5)]

Q. Who is the leader? What is the solution when DM1 is the leader? When DM2 is the leader? Nash?
– DM1 as the leader: γ1S = L, with costs (3, -7) ~ Easy
– DM2 can also be the leader ~ The leader does not have to move first; he has to announce his strategy first and then impose it
– DM2 has 2² = 4 strategies:

            γ21        γ22       γ23       γ24
  DM1 = L   L          L         R         R
  DM1 = R   L          R         L         R
  Outcome   (4, -10)   (6, -6)   (3, -7)   (3, -7)

– Nash (by backward induction): γ2N plays R after L and L after R (i.e., γ23), γ1N = L, with costs (3, -7)
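The conversion to normal form can be scripted directly. A minimal sketch (ours; the leaf assignment follows the tree placeholder above): enumerate DM2's four strategies, compute DM1's rational reaction to each (worst case for ties), and pick DM2's best column.

```python
from itertools import product

# Extensive-form game: DM1 moves first (L/R), DM2 replies (L/R).
# leaf[(DM1's move, DM2's move)] = (J1, J2); both players minimize.
leaf = {('L', 'L'): (6, -6), ('L', 'R'): (3, -7),
        ('R', 'L'): (4, -10), ('R', 'R'): (10, -5)}

best = None
for s2 in product('LR', repeat=2):          # s2 = (reply to L, reply to R)
    g2 = dict(zip('LR', s2))                # DM2's strategy as a map
    # DM1's rational reaction to the announced strategy: minimize J1
    j1 = {a: leaf[(a, g2[a])][0] for a in 'LR'}
    m = min(j1.values())
    reaction = [a for a in 'LR' if j1[a] == m]
    worst_j2 = max(leaf[(a, g2[a])][1] for a in reaction)  # conservative
    if best is None or worst_j2 < best[0]:
        best = (worst_j2, s2)

print(best)  # (-10, ('L', 'L')): DM2 announces "always L", outcome (4, -10)
```

As the leader, DM2 can thus guarantee J2 = -10, strictly better than the Nash value of -7: leadership pays even for the player who moves second.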
Example 2. (Continued)
SGD. The problem with DM2 as the leader was solved by converting it to normal form. Can it be solved directly in extensive form? How?
• We will come back to this in the next hour
Relevant Results on Finite Games
Theorem 3.3. Every two-person finite game admits a pure Stackelberg strategy for the leader.
Proof. Intuitively clear from the finiteness of Γ1 and Γ2.
Proposition 3.16. For a given two-person finite game, if
– a pure Nash solution (γ1, γ2) exists, and
– R2(γ1) is a singleton for every γ1 ∈ Γ1,
then J1S ≤ J1N.
Proof. By contradiction.
An Example of Single-Act Infinite Games
Problem Formulation
• Two manufacturers, M1 and M2, produce a single product type with the same manufacturing technology
  – Mi produces quantity ui at cost Ci = c·ui + d
  – d > 0 is the setup cost, and c > 0 is the unit production cost
• The price is determined by the demand-supply relationship: p = a − b(u1 + u2), with a > 0 and b > 0
• Profit for M1: p·u1 − C1 = −J1(u1, u2), or
    J1(u1, u2) = [b(u1 + u2) − a]u1 + c·u1 + d
  – Similarly, J2(u1, u2) = [b(u1 + u2) − a]u2 + c·u2 + d
• Several cases will be examined:
  – Single manufacturer (monopoly case, with u2 ≡ 0)
  – Nash equilibrium
  – Stackelberg solution with M1 as the leader

Case 1. Single Manufacturer
Q. How to solve the problem?
• Necessary condition: dJ1/du1 = 2b·u1 − (a − c) = 0 ⟹ u1 = (a − c)/2b
• Suppose a = 50, b = 1, c = 2, and d = 10. Then
    u1M = (a − c)/2b = 24, pM = a − b·u1M = 50 − 24 = 26
    C1M = c·u1M + d = 2·24 + 10 = 58
    J1M = −pM·u1M + C1M = −26·24 + 58 = −566
Case 2. Nash Equilibrium
Q. How to solve the problem?
• Necessary conditions (recall J1(u1, u2) = [b(u1 + u2) − a]u1 + c·u1 + d):
    ∂J1/∂u1 = 2b·u1 + b·u2 − (a − c) = 0
    ∂J2/∂u2 = b·u1 + 2b·u2 − (a − c) = 0
    ⟹ u1N = u2N = (a − c)/3b ~ Symmetric
• With the same a = 50, b = 1, c = 2, and d = 10, then what?
    u1N = u2N = (a − c)/3b = 16, u1N + u2N = 32 > u1M = 24
    pN = a − b(u1N + u2N) = 50 − 32 = 18 < pM = 26
    C1N = C2N = c·u1N + d = 2·16 + 10 = 42, C1N + C2N = 84 > 58
    J1N = J2N = −18·16 + 42 = −246
    −(J1N + J2N) = 492 < −J1M = 566
Case 3. Stackelberg Solution with M1 as the Leader
SGD. How to solve the problem?
• Reaction of M2:
    ∂J2/∂u2 = 0 ⟹ u2 = (a − c − b·u1)/2b
• M1's problem: min J1(u1, u2), s.t. u2 = (a − c − b·u1)/2b. How to solve it?
    J1(u1, u2) = [b(u1 + u2) − a]u1 + c·u1 + d
               = (b·u1 − a − c)u1/2 + c·u1 + d
    dJ1/du1 = b·u1 − 0.5(a − c) = 0 ⟹ u1S = (a − c)/2b
• From here, one can get
    u2S = (a − c − b·u1S)/2b = (a − c)/4b = u1S/2
    pS = a − b(u1S + u2S) = (a + 3c)/4
• With the same a = 50, b = 1, c = 2, and d = 10, then what?
    u1S = (a − c)/2b = 24 = u1M > u1N = 16
    u2S = 0.5·u1S = 12 < u2N = 16
    u1S + u2S = 36 > u1N + u2N = 32 > u1M = 24
    pS = (a + 3c)/4 = 56/4 = 14 < pN = 18 < pM = 26
    C1S = c·u1S + d = 2·24 + 10 = 58
    C2S = c·u2S + d = 2·12 + 10 = 34
    J1S = −pS·u1S + C1S = −14·24 + 58 = −278
    −J1S = 278 > −J1N = 246; −J1S = 278 < −J1M = 566
    J2S = −pS·u2S + C2S = −14·12 + 34 = −134
    −J2S = 134 < −J2N = 246
    −(J1S + J2S) = 412 < −(J1N + J2N) = 492 < −J1M = 566
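The three cases can be checked numerically. A small sketch (ours): it recomputes the monopoly, Nash, and Stackelberg quantities and profits from the formulas above, with a = 50, b = 1, c = 2, d = 10.

```python
# Duopoly example: compare monopoly, Nash, and Stackelberg (M1 leads).
# Price p = a - b(u1 + u2); cost Ci = c*ui + d; Ji = -(profit of Mi).
a, b, c, d = 50.0, 1.0, 2.0, 10.0

def profits(u1, u2):
    p = a - b * (u1 + u2)
    return p * u1 - (c * u1 + d), p * u2 - (c * u2 + d)

# Case 1: monopoly (u2 = 0)
uM = (a - c) / (2 * b)
pM = profits(uM, 0.0)[0]                    # 566.0

# Case 2: Nash equilibrium (symmetric first-order conditions)
uN = (a - c) / (3 * b)
pN1, pN2 = profits(uN, uN)                  # 246.0 each

# Case 3: Stackelberg, M1 leads; follower reacts u2 = (a - c - b*u1)/(2b)
u1S = (a - c) / (2 * b)
u2S = (a - c - b * u1S) / (2 * b)
pS1, pS2 = profits(u1S, u2S)                # 278.0 and 134.0

print(pM, pN1, pS1, pS2)  # 566.0 246.0 278.0 134.0
```

The ordering matches the slide: the leader earns more than at the Nash point (278 > 246), the follower less (134 < 246), and total industry profit shrinks from monopoly to Nash to Stackelberg.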
Today
• Finite Hierarchical Games
  – Motivating Examples
  – Solution Concept
  – Examples and Results on Finite Games
  – An Example of Single-Act Infinite Games
• The Inducible Region Approach
  – Approach for Single-Stage Problems
  – Principle of Optimality
  – Multi-Stage Games
• Team Decision Theory
  – A Motivating Example
  – A Formal Model and Solution Methodology
  – A Canonical Example
The Inducible Region Approach
• Change DM2 to DM0, with the costs now stated as (J0, J1)
• If DM1 is the leader, the problem is quite simple
• The best DM0 can do depends on his/her ability to influence DM1's cost
Q. What is the worst that DM0 can penalize DM1 with?

[Game tree: DM1 moves (L or R); DM0 observes the move and replies (L or R). Outcomes (J0, J1): under DM1 = L, (3, 8) and (-3, 7); under DM1 = R, (-9, 10) and (10, 3)]

SGD. Now, with DM0 as the leader, we want to solve the problem directly in extensive form. How?
    min_{u1} max_{u0} J1(u0, u1) → M, the maximum penalizing strategy, with outcome (u0M, u1M)
– M is the worst value that DM1 could ever get. Implications?
– Any outcome with J1 > M cannot be achieved
– Conversely, any outcome with J1(u0, u1) < M is "inducible," i.e., there exists a strategy γ0 so that (u0, u1) is the resulting outcome
– On the boundary J1(u0, u1) = M:
  • (u0M, u1M) itself is inducible
  • An outcome with J0(u0, u1) < J0(u0M, u1M) is not inducible (by the behavioral assumption); e.g., in the tree variant where the (10, 3) leaf is replaced by (-9, 8), the outcome (-9, 8) has J1 = M but cannot be induced
  • An outcome with J0(u0, u1) > J0(u0M, u1M), e.g., (6, 8), is inducible but not worthwhile to consider
DM0's Optimization Problem
• Select the minimal cost within the "inducible region":
    IR = {(u0, u1) | J1(u0, u1) ≤ M}, with the boundary analyzed as above
    min_{(u0, u1)∈IR} J0(u0, u1)
– If u1 = u1S, then u0 = u0S
– If u1 ≠ u1S, make the resulting J1 > 7, to induce DM1 to select u1S
⟹ (u0S, u1S) = (L, L), with (J0S, J1S) = (-3, 7)
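For this small tree, the whole inducible-region computation fits in a few lines. A sketch (ours; the leaf-to-action assignment is our assumption, chosen so that the result matches the stated solution (u0S, u1S) = (L, L)):

```python
# Inducible region approach on the 2x2 extensive-form game.
# outcomes[(u1, u0)] = (J0, J1); DM1 picks u1 first, DM0 replies with u0.
outcomes = {('L', 'L'): (-3, 7), ('L', 'R'): (3, 8),
            ('R', 'L'): (10, 3), ('R', 'R'): (-9, 10)}

# Maximum penalizing strategy: for each u1, DM0 maximizes J1
pen = {u1: max('LR', key=lambda u0: outcomes[(u1, u0)][1]) for u1 in 'LR'}
u1M = min('LR', key=lambda u1: outcomes[(u1, pen[u1])][1])
M = outcomes[(u1M, pen[u1M])][1]   # worst value DM1 can be forced to: 8

# Inducible region: all outcomes with J1 < M strictly, plus the
# max-penalizing point itself on the boundary
IR = [k for k, v in outcomes.items() if v[1] < M] + [(u1M, pen[u1M])]

# DM0 selects the minimal J0 over IR
u1S, u0S = min(IR, key=lambda k: outcomes[k][0])
print(M, (u0S, u1S), outcomes[(u1S, u0S)])  # 8 ('L', 'L') (-3, 7)
```

The announced strategy is then: play u0S when DM1 plays u1S, and the maximum penalizing reply pen[u1] otherwise, which drives J1 to 10 > 7 off the path.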
• To construct DM0's strategy γ0S:
Q. How to interpret the approach graphically?

[Figure: u0-u1 plane showing the curve J1(u0, u1) = M delineating IR, the maximum penalizing strategy γ0M(u1) = arg max_{u0} J1(u0, u1), the level curve J1(u0, u1) = J1S, and the point (u0S, u1S) = arg min_{IR} J0(u0, u1)]

Q. How to construct γ0S?
– Any curve u0 = γ0S(u1) works, as long as it stays outside the level curve J1(u0, u1) = J1S but is tangent to it at (u0S, u1S)
– It could be linear, nonlinear, or even discontinuous
Example
Problem Formulation
    J0 = ½u0² − ½u0·u1 + u1² + a·u1
    J1 = (u0 − ½)² + 4(u1 − ½)²
– with u0, u1 ∈ [0, 1], and "a" a parameter to be varied

[Figure: the unit square in the u0-u1 plane, with the elliptical region IR centered at (½, ½)]

• Delineate IR, find (u0S, u1S), and construct a γ0S
Delineation of IR
    M ≜ min_{u1} max_{u0} J1(u0, u1)
• As an intermediate step: max_{u0} J1(u0, u1) → γ0M(u1). What is it?
– It is easy to see that γ0M(u1) = 0 or 1 (whichever is farther from u0 = ½), and the minimizing u1 is ½
– Consequently, M = max_{u0} (u0 − ½)² = (½)² = ¼, and
    IR = {(u0, u1) | (u0 − ½)² + 4(u1 − ½)² ≤ ¼}
Finding (u0S, u1S)
– min_{(u0, u1)∈IR} J0(u0, u1) ~ May not be an easy task
Q. Any shortcut?
– Find the "team solution," i.e., select both u0 and u1 to minimize J0 (DM0 and DM1 acting as a team, the best possible)
– If the team solution is in IR, then we are almost done
– Otherwise, we have to perform the "hard" constrained optimization
• First, find the "team solution" of J0 = ½u0² − ½u0·u1 + u1² + a·u1:
    ∂J0/∂u0 = u0 − ½u1 = 0 ⟹ u0 = ½u1
    ∂J0/∂u1 = −½u0 + 2u1 + a = 0 ⟹ (7/4)u1 + a = 0
– Consequently, u0 = −(2/7)a and u1 = −(4/7)a
– If (−2a/7, −4a/7) ∈ IR, then it is (u0S, u1S)
Q. Is (−2a/7, −4a/7) in IR?
    J1 = (u0 − ½)² + 4(u1 − ½)²
       = (−2a/7 − ½)² + 4(−4a/7 − ½)²
       = (4a²/49 + 2a/7 + ¼) + 4(16a²/49 + 4a/7 + ¼)
       = (68/49)a² + (18/7)a + 5/4 ≤ ¼
– Therefore, (−2a/7, −4a/7) ∈ IR iff
    (68/49)a² + (18/7)a + 1 ≤ 0, or
    a² + (63/34)a + 49/68 ≤ 0, or
    a ∈ [(−63 − √637)/68, (−63 + √637)/68] ≈ [−1.298, −0.555] ≜ A
• If a ∈ A, then u0S = −(2/7)a and u1S = −(4/7)a
• A possible γ0S:
  – If u1 = −(4/7)a, then u0 = −(2/7)a
  – Otherwise, u0 = 0
• Suppose a = −1, and we want to find a linear γ0S:
    u0S = −(2/7)a = 2/7; u1S = −(4/7)a = 4/7
Q. How to find a linear γ0S?
– Find the level curve of J1 through (u0S, u1S):
    J1 = (u0 − ½)² + 4(u1 − ½)² = 13/196
– Find the tangent line of the level curve at (u0S, u1S):
    dJ1/du1 = 2(u0 − ½)(du0/du1) + 8(u1 − ½) = 0
    du0/du1 at (u0S, u1S) = −8(u1S − ½)/[2(u0S − ½)] = 4/3
– The tangent line at (u0S, u1S) is then described by
    (u0 − 2/7) = (4/3)(u1 − 4/7), or
    u0 = (4/3)u1 − 10/21 for u1 ∈ [5/14, 1], and u0 = 0 otherwise
– Or simply: u0 = 2/7 for u1 = 4/7, and u0 = 0 otherwise. Credible?
Q. What to do if a ∉ A?
• Then we have to solve the constrained problem:
    min J0 = ½u0² − ½u0·u1 + u1² + a·u1
    s.t. (u0 − ½)² + 4(u1 − ½)² ≤ ¼
– With a nonlinear constraint, this is not an easy problem
– Left as an exercise
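The a = −1 case can be verified numerically. A sketch (ours): it checks that the team solution lies in IR, and that against the announced tangent-line strategy, u1 = 4/7 is indeed the follower's best reply (on a fine grid).

```python
# Inducible region example with a = -1: verify the team solution and
# the linear (tangent-line) leader strategy gamma0.
a = -1.0
u0S, u1S = -2 * a / 7, -4 * a / 7          # team solution (2/7, 4/7)

def J1(u0, u1):
    return (u0 - 0.5) ** 2 + 4 * (u1 - 0.5) ** 2

# The team solution lies inside IR = {J1 <= 1/4} since a is in A
assert J1(u0S, u1S) <= 0.25

def gamma0(u1):
    # tangent line to the level curve J1 = 13/196 at (2/7, 4/7)
    return 4.0 / 3.0 * u1 - 10.0 / 21.0 if u1 >= 5.0 / 14.0 else 0.0

# Follower's best reply to the announced strategy, on a fine grid
grid = [i / 10000.0 for i in range(10001)]
best = min(grid, key=lambda u1: J1(gamma0(u1), u1))
print(abs(best - u1S) < 1e-3)   # True: DM1 is induced to play u1S
```

By convexity, every point of the tangent line other than the tangency point lies outside the J1 = 13/196 ellipse, which is why this announcement induces exactly u1S.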
Principle of Optimality
The Issue
• Nash games in extensive form are solved by backward induction. This is based on the Principle of Optimality:
  – An optimal strategy has the property that, whatever the initial state and time are, all remaining decisions must also constitute an optimal strategy
• For hierarchical games, we have not been able to use backward induction
Q. Why? Does the Principle of Optimality still hold here?

[Timeline: DM0 announces γ0; DM1 selects u1; DM0 then sets u0 = γ0(u1)]

– Once γ0S is announced and u1 is observed, DM0 should set u0 = γ0S(u1). Is there an incentive for DM0 to deviate from this?
Example (Slightly modified)

[Game tree: DM1 chooses L or R; DM0 then chooses L or R. Outcomes (J0, J1): under DM1 = L, (-10, 9) and (-3, 5); under DM1 = R, (-2, -2) and (4, 7)]

Q. Now suppose we are the leader, DM0. We announced the strategy, and DM1 selected L. What should we do?
– There is a strong incentive for DM0 to select R instead of carrying out what was announced
– Along the optimal path, there is an incentive for DM0 to deviate
Q. Suppose for some unknown reason DM1 selects R. What should we do?
– There is also a strong incentive for DM0 to select L instead of carrying out what was announced
– Off the optimal path, there is an incentive for DM0 to deviate
• The Principle of Optimality does not hold for hierarchical games
• Multi-stage games cannot be solved one stage at a time, back from the terminal stage, as in backward induction
• There exists an inherent incentive for the leader to deviate from what was announced, even in the absence of uncertainties or any unforeseeable interruptions
• Government/CEO credibility is at risk ~ a very familiar phenomenon
Multi-Stage Hierarchical Games
• Since the Principle of Optimality does not hold, a multi-stage problem cannot be solved by backward induction
Q. What to do?
– The inducible region approach can be extended. How?
– The basic ideas of IR presented earlier still hold here
The General Approach
– Delineate IR
  • A worst-case analysis for the follower
– Find min_{(u0, u1)∈IR} J0(u0, u1)
  • A parameter optimization problem that may not be easy to solve
– Construct γ0S
  • Theoretically not hard, but there may be practical difficulties
[Figure: a two-stage game tree in which DM1 moves (U or D) and DM0 replies at each stage. The 16 leaves carry costs (J0, J1): (2, 4.5), (3, 9), (4, 10), (5, 7), (-100, 7), (3, 3), (5, 1), (-2, 9), (0, 5), (-100, 10), (3, 5), (7, 6), (4, 2), (5, 3), (9, 5), (-100, 4). The maximum penalizing value Mt at each node is obtained by backward induction]

SGD. What is the worst that DM0 can penalize DM1 with?
    min_{u11} max_{u01} [min_{u12} max_{u02} J1(u0, u1)] = min_{u11} max_{u01} [M2]
    ⟹ M = 6
γ0tM(u1t) ~ "the maximum penalizing strategy"
• Obtained by backward induction
Q. Which outcomes are inducible, and which are not?
– For any (u0, u1), if J1(u0, u1) > Mt for any t (any stage) along the path, u1 will never be selected
– Any (u0, u1) such that J1(u0, u1) < Mt for all t (all stages) along the path is "inducible"
– The boundary J1(u0, u1) = Mt must be analyzed carefully, as before
    IR = {(u0, u1) | J1(u0, u1) ≤ Mt for all t along the path}
Finding (u0S, u1S)
– min_{(u0, u1)∈IR} J0(u0, u1)
  • A parameter optimization problem, here with the resulting outcome (0, 5)
Constructing γ0S
– If u1t = u1tS, then u0t = u0tS
– Otherwise, try to make the resulting J1 greater than J1S
  • One way is to use the maximum penalizing strategy γ0tM(u1t)
  • Other strategies might be better in case of deviation by DM1
  • E.g., under the maximum penalizing strategy, (U, D, U, U) would result in (J0, J1) = (-100, 7)!
Today
• Finite Hierarchical Games
  – Motivating Examples
  – Solution Concept
  – Examples and Results on Finite Games
  – An Example of Single-Act Infinite Games
• The Inducible Region Approach
  – Approach for Single-Stage Problems
  – Principle of Optimality
  – Multi-Stage Games
• Team Decision Theory
  – A Motivating Example
  – A Formal Model and Solution Methodology
  – A Canonical Example
Team Decision Theory
• Decentralized decision-making, where the DMs have access to different information and are responsible for different decisions, but share the same objective

A Motivating Example
• Problem Context
  – Back to the old days, with no radio or telephone
  – Mr. B of Boston and Mr. N of NYC, each knowing only his local weather, have to decide whether to go to Hartford today
  – The meeting requires good weather at Hartford. If it rains there, they waste the trip
Q. Should they go or not?
• Mr. B and Mr. N share the same objective (cost) function:

  Shine at Hartford:            Rain at Hartford:
  B \ N   Go    Don't           B \ N   Go   Don't
  Go      -10   3               Go      4    2
  Don't   3     0               Don't   2    -5

• The only information available is the local weather: ξB at Boston and ξN at NYC. ξB, ξN, and ξH are correlated:

  ξB   R     R     R    R    S    S    S     S
  ξN   R     R     S    S    R    R    S     S
  ξH   R     S     R    S    R    S    R     S
  Pr.  0.25  0.05  0.1  0.1  0.1  0.1  0.05  0.25
Q. What are the possible strategies for Mr. B? For Mr. N?
• B: 4 strategies; N: 4 strategies

            γB1   γB2   γB3   γB4             γN1   γN2   γN3   γN4
  S in B    G     G     D     D      S in N   G     G     D     D
  R in B    G     D     G     D      R in N   G     D     G     D

• Off-line coordination ~ We want to find the best pair of strategies. How?
• Game in normal form: compute the expected cost for each pair of strategies (16 of them), and select the best pair:
    J(γB, γN) = E[J(γB(ξB), γN(ξN), ξH)]
              = Σ_{ξB, ξN, ξH} J(γB(ξB), γN(ξN), ξH)·Pr(ξB, ξN, ξH)
– For example,
    J(γB1, γN1) = 0.25·4 + 0.05·(-10) + 0.1·4 + 0.1·(-10) + 0.1·4 + 0.1·(-10) + 0.05·4 + 0.25·(-10) = -3
    J(γB1, γN2) = 0.25·2 + 0.05·3 + 0.1·4 + 0.1·(-10) + 0.1·2 + 0.1·3 + 0.05·4 + 0.25·(-10) = -1.75
• The pair with the minimum expected cost is the solution:
    (γB*, γN*) = (γB1, γN1), with J* = -3
– With pre-game coordination, there is no difficulty associated with the non-uniqueness of solutions
Q. Is there a systematic way to find the optimal solution?
– Not an easy task. We shall present a formal model and then go over several examples
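The 16-pair search above is easy to automate. A minimal sketch (ours): each strategy maps the local weather to Go/Don't, and the expected cost is summed over the eight joint weather outcomes.

```python
from itertools import product

# Joint distribution of weather at Boston (B), NYC (N), Hartford (H)
prob = {('R','R','R'): 0.25, ('R','R','S'): 0.05, ('R','S','R'): 0.1,
        ('R','S','S'): 0.1,  ('S','R','R'): 0.1,  ('S','R','S'): 0.1,
        ('S','S','R'): 0.05, ('S','S','S'): 0.25}

# Shared team cost: cost[H][(B's action, N's action)]
cost = {'S': {('G','G'): -10, ('G','D'): 3, ('D','G'): 3, ('D','D'): 0},
        'R': {('G','G'): 4,   ('G','D'): 2, ('D','G'): 2, ('D','D'): -5}}

# A strategy maps the local weather ('S' or 'R') to an action ('G' or 'D')
strategies = [dict(zip('SR', s)) for s in product('GD', repeat=2)]

def expected_cost(gB, gN):
    return sum(p * cost[h][(gB[b], gN[n])] for (b, n, h), p in prob.items())

best = min(product(strategies, repeat=2), key=lambda g: expected_cost(*g))
print(expected_cost(*best), best)  # -3.0, with both always choosing 'G'
```

This confirms the slide's answer: the "always Go" pair (γB1, γN1) achieves the minimum expected cost of -3.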
A Formal Model and Solution Methodology

A Formal Model
• A set of DMs: {1, 2, .., N}
  – Without loss of generality, assume that each DM observes one measurement, makes one decision, and then leaves
• A set of uncertainties ξ = {ξ1, ξ2, .., ξm}, with ξj ∈ Ξj
  – Nature's decisions
• A set of observations: z = {z1, z2, .., zN}
  – zn = ηn(ξ, u) ∈ Zn ~ DM n's observation, subject to causality
  – If zn = ηn(ξ) for all n, then the information is static
• A set of decision variables: u = {u1, u2, .., uN}
  – un ∈ Un ~ DM n's decision
  – γn: Zn → Un ~ DM n's strategy, un = γn(zn)
  – γn ∈ Γn ~ the set of admissible strategies, e.g., linear strategies
• The cost function J: U × Ξ → R
  – For a given realization of ξ:
    J(u, ξ) = J(u1, u2, .., uN, ξ) ~ Extensive form
            = J(γ1(z1), γ2(z2), .., γN(zN), ξ)
            = J(γ, ξ) ~ Normal form
    J(γ1, γ2, .., γN) = E[J(γ, ξ)] ~ Expected cost function
• The problem: min_{γ1, γ2, .., γN} J(γ1, γ2, .., γN)
Q. How to solve the problem?

Solution Methodology
• Functional (as opposed to parameter) optimization
• There is no systematic way to solve it, except under special conditions
• Possible methods:
  – Brute-force exhaustive search
  – Impose more structure, e.g., best linear/quadratic strategies
  – Relax the conditions, e.g., person-by-person optimality
Example:

         γ21   γ22   γ23
  γ11    0     0     -1
  γ12    0     -3    0
  γ13    -1    0     -2

  J(γ12, γ22) = -3 ~ Optimal team solution
  J(γ13, γ23) = -2 ~ Person-by-person optimal ~ a Nash solution
• Team optimality ⟹ person-by-person optimality, but not vice versa
• (γ1*, γ2*, .., γN*) is a person-by-person optimal solution iff
    J(γ1*, .., γn*, .., γN*) ≤ J(γ1*, .., γn-1*, γn, γn+1*, .., γN*) ∀n and ∀γn ∈ Γn
  – Equivalently, min_{γn} J(γ1*, .., γn, .., γN*) → γn* ∀n
  – Each such minimization is an optimal control problem, and may be solvable
  – The DMs could coordinate off-line, with no difficulty from the non-uniqueness of solutions
• Consequently, one way to find (γ1*, γ2*, .., γN*):
    Start → guess (γ1g, .., γNg) → min_{γn} J(γ1g, .., γn, .., γNg) → γn* → is γn* = γng ∀n? → Yes: a person-by-person optimal solution / No: revise the guess and repeat
  – There is no systematic way to revise the guess
  – Proving convergence is quite difficult
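The guess-and-improve loop can be illustrated on the 3×3 table above. A sketch (ours): coordinate descent, one DM at a time, converges to a person-by-person optimal point, which may or may not be the team optimum depending on the initial guess.

```python
# Person-by-person optimization by coordinate descent on the 3x3
# team table J[i][j] from the example (both DMs minimize the same J).
J = [[0, 0, -1],
     [0, -3, 0],
     [-1, 0, -2]]

def pbpo(i, j):
    """Iterate one-DM-at-a-time minimization until no one can improve."""
    while True:
        i2 = min(range(3), key=lambda k: J[k][j])   # DM1's best reply
        j2 = min(range(3), key=lambda k: J[i2][k])  # DM2's best reply
        if (i2, j2) == (i, j):
            return i, j, J[i][j]
        i, j = i2, j2

print(pbpo(0, 0))  # converges to (2, 2) with J = -2: PBPO, not team optimal
print(pbpo(1, 1))  # stays at (1, 1) with J = -3: the team optimum
```

Starting from (γ11, γ21), the iteration gets stuck at (γ13, γ23) with cost -2, illustrating why person-by-person optimality is only a necessary condition for team optimality.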
Today
• The Inducible Region Approach
  – Approach for Single-Stage Problems
  – Principle of Optimality
  – Multi-Stage Games
• Team Decision Theory
  – A Motivating Example
  – A Formal Model and Solution Methodology
  – A Canonical Example
• Paper Review by Priscillia Hunt, Selini Katsaiti, Dinesh Padmanabhan, Ivailo Kotzev
A Canonical Example
Problem Formulation
• Two DMs: DM0 with control u0, and DM1 with control u1
    y0 = x0 + b·v0, b > 0
    y1 = x0 + c·u0 + d·v1, c ≥ 0, d > 0
  – x0, v0, v1 are independent random variables, with x0 ~ N(0, σ²), v0 ~ N(0, 1), and v1 ~ N(0, 1)
  – The information structure is yet to be specified
  – The cost function:
    J = E{½(x0 + a·u0 + u1)² + h·u0² + g·u1²}, with a, h, g ≥ 0
• By appropriately assigning values to the parameters, the above can represent different problems. We shall consider a few

The Static Case
– Let a = b = d = 1, c = 0, and h = g = ½:
    J = E{½(x0 + u0 + u1)² + ½u0² + ½u1²}
  with information sets
    z0 = {y0}, where y0 = x0 + v0
    z1 = {y1}, where y1 = x0 + v1
– With a random initial state x0, two noisy measurements are made: y0 by DM0, and y1 by DM1
– This is a static information structure, since both y0 and y1 are independent of the decisions
Q. How to solve it?
• We shall find the linear person-by-person optimal solution first, and then show that it is also team optimal
Linear Person-by-Person Optimal Solution
– Assume u0* = k0·y0 and u1* = k1·y1, with k0 and k1 yet to be determined

DM0's perspective
• If u1* = k1·y1 = k1(x0 + v1), then min_{γ0} J(γ0, γ1*) → γ0*:
    J = E{½(x0 + u0 + u1)² + ½u0² + ½u1²}
      = E{E{½[x0 + u0 + k1(x0 + v1)]² + ½u0² + ½[k1(x0 + v1)]² | y0}}
    ∂J/∂u0 = 0 = 2u0 + E[(k1 + 1)x0 + k1·v1 | y0]
           = 2u0 + (k1 + 1)E[x0 | y0]
    ⟹ u0* = −½(k1 + 1)E[x0 | y0]
~ What is this? Since x0 and y0 are jointly Gaussian,
    [x0, y0]ᵀ ~ N(0, [[σ², σ²], [σ², σ² + 1]]),
    E[x0 | y0] = E[x0] + Σ_{x0y0}·Σ_{y0y0}⁻¹·(y0 − E[y0]) = σ²·y0/(σ² + 1)
– Consequently,
    u0* = −(k1 + 1)σ²·y0/[2(σ² + 1)] = k0·y0
    ⟹ 2(σ² + 1)k0 + σ²·k1 = −σ²

DM1's perspective
– Similarly, if u0* = k0·y0, then min_{γ1} J(γ0*, γ1) → γ1*:
    u1* = −½(k0 + 1)E[x0 | y1] = −(k0 + 1)σ²·y1/[2(σ² + 1)] = k1·y1
    ⟹ σ²·k0 + 2(σ² + 1)k1 = −σ²

Determining k0 and k1
– Two unknowns, k0 and k1, and two linear conditions:
    [[2(σ² + 1), σ²], [σ², 2(σ² + 1)]]·[k0, k1]ᵀ = [−σ², −σ²]ᵀ
• The determinant is nonzero ⟹ a unique solution always exists
• Once k0 and k1 are solved for, the solution is obtained
Q. Is the solution team optimal? Why or why not?
Team Optimal Solution
• The above solution is also team optimal, in view of the strict convexity of J
– For a given realization of ξ (including x0, v0, and v1), y0 and y1 are determined, with u0* = k0·y0 and u1* = k1·y1
– J(u0, u1, ξ) = ½(x0 + u0 + u1)² + ½u0² + ½u1² is strictly convex in (u0, u1), so
    J(u0, u1, ξ) > J(u0*, u1*, ξ) + (∂J/∂u0)(u0 − u0*) + (∂J/∂u1)(u1 − u1*) for all (u0, u1) ≠ (u0*, u1*)
– Taking expectations, the first-order terms vanish by the person-by-person optimality conditions, so
    E[J(γ0(y0), γ1(y1), ξ)] > E[J(γ0*(y0), γ1*(y1), ξ)] for all (γ0, γ1) ≠ (γ0*, γ1*)
⟹ (γ0*, γ1*) is team optimal

Special Case When σ = 2
• In this case (σ² = 4), the equations for k0 and k1 become
    [[1, 0.4], [0.4, 1]]·[k0, k1]ᵀ = [−0.4, −0.4]ᵀ
    ⟹ k0 = k1 = −0.4/(1 + 0.4) = −2/7
    u0* = −(2/7)y0, and u1* = −(2/7)y1
    J_static = E{½(x0 + u0 + u1)² + ½u0² + ½u1²}
             = ½E{[(3/7)x0 − (2/7)v0 − (2/7)v1]² + [(2/7)(x0 + v0)]² + [(2/7)(x0 + v1)]²}
             = ½{(36/49 + 4/49 + 4/49) + 20/49 + 20/49}
             = 42/49 = 6/7 ≈ 0.857
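The gains and the resulting cost are easy to check numerically. A sketch (ours), for σ = 2 (σ² = 4): the symmetric 2×2 system collapses to one equation, and the expected cost follows in closed form from the independence of x0, v0, v1.

```python
# Static LQG team: solve for the linear strategy gains k0 = k1 = k and
# evaluate the expected team cost, for sigma^2 = 4 (sigma = 2).
s2 = 4.0  # variance of x0

# 2(s2+1)*k0 + s2*k1 = -s2 and s2*k0 + 2(s2+1)*k1 = -s2;
# by symmetry k0 = k1 = k, so (2(s2+1) + s2)*k = -s2
k = -s2 / (2 * (s2 + 1) + s2)      # k = -2/7
assert abs(k + 2.0 / 7.0) < 1e-12

# Expected cost with u0 = k*(x0+v0), u1 = k*(x0+v1):
# x0+u0+u1 = (1+2k)*x0 + k*v0 + k*v1, with x0, v0, v1 independent
J = 0.5 * (((1 + 2 * k) ** 2) * s2 + 2 * k ** 2    # E[(x0+u0+u1)^2]
           + 2 * (k ** 2) * (s2 + 1))              # E[u0^2] + E[u1^2]
print(J)  # 42/49 = 6/7 = 0.857...
```

The printed value matches the closed-form J_static = 6/7 derived above.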
General Results
Proposition 1
• For an LQG static team decision problem with
    J = ½uᵀQu + uᵀSξ, Q > 0, ξ ~ N(0, Σ), and y = Hξ,
  the team optimal solution exists, is unique, is linear in the measurements, and can be obtained by solving the person-by-person optimal problem
Today
• Finite Hierarchical Games
  – Motivating Examples
  – Solution Concept
  – Examples and Results on Finite Games
  – An Example of Single-Act Infinite Games
• The Inducible Region Approach
  – Approach for Single-Stage Problems
  – Principle of Optimality
  – Multi-Stage Games
• Team Decision Theory
  – A Motivating Example
  – A Formal Model and Solution Methodology
  – A Canonical Example
Next Time
• 11/14 No Class
• 11/21 Midterm exam