TRANSCRIPT
Game Theoretic Learning and Social Influence

Jeff S. Shamma

IPAM Workshop on Decision Support for Traffic
November 16–20, 2015
Jeff S. Shamma Game Theoretic Learning and Social Influence ∼ 1/42
Interconnected decisions

Traffic decision makers:
- Users
- Fleet managers
- Infrastructure planners

cf., Ritter: “Traffic decision support”, Thursday AM

Satisfaction with a decision depends on the decisions of others.
Illustrations

Commuting:
- Decision = Path
- Satisfaction depends on paths of others

Planning:
- Decision = Infrastructure allocation
- Satisfaction depends on utilization
Viewpoint: Game theory

“... the study of mathematical models of conflict and cooperation between intelligent rational decision-makers”

Myerson (1991), Game Theory.

Extensive literature...

Game elements:
- Players/Agents/Actors
- Actions/Strategies/Choices
- Preferences over joint choices
Outline
Game models
Influence models
Viewpoint: Game theory

Game elements:
- Players/Agents/Actors
- Actions/Strategies/Choices
- Preferences over joint choices

Solution concept: What to expect?
Making decisions vs modeling decisions

Single decision maker:
- Choice set: x ∈ X
- Utility function: U(x)

    x = argmax_{x′ ∈ X} U(x′)

Linear/Convex/Semidefinite/Integer/Dynamic... Programming

Good model?

Issues:
- Complexity, randomness, incompleteness, framing...
- Furthermore... are preferences even consistent with a utility function?
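As a trivial sketch of the model before the issues above are raised (the choice set and utilities are illustrative assumptions):

```python
# Single decision maker as utility maximization over a finite choice set.
X = ["drive", "transit", "bike"]                 # choice set
U = {"drive": 2.0, "transit": 3.0, "bike": 1.5}  # utility function
x_star = max(X, key=lambda x: U[x])              # x = argmax_{x' in X} U(x')
print(x_star)
```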
Modeling decisions in games

Elements: Players, choices, and preferences over joint choices:

    U_i(x) = U_i(x_i, x_{−i})

Solution concept: What to expect?

Nash Equilibrium: Everyone’s choice is optimal given the choices of others.

Alternatives:
- Bounded rationality models
- Hannan consistency, correlated equilibrium, ...
Illustration: Congestion games (discrete)

Setup:
- Players: 1, 2, ..., n
- Set of resources: R = {r_1, ..., r_m} (roads)
- Action sets: A_i ⊂ 2^R (paths)
- Joint action: (a_1, a_2, ..., a_n)

Cost: (vs Utility)
- Resource level: c_r(a) = φ_r(σ_r(a)), where σ_r(a) is the number of users of resource r
- User level: C_i(a_i, a_{−i}) = Σ_{r ∈ a_i} c_r(a)

Nash equilibrium: a* = (a*_1, ..., a*_n) such that for every player i and every a′_i ∈ A_i,

    C_i(a*_i, a*_{−i}) ≤ C_i(a′_i, a*_{−i})
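The setup above can be sketched in code. Below is a minimal best-response computation on an assumed two-road instance (n = 4 players, φ_A(k) = k, φ_B(k) = k + 1; the instance and names are illustrative, not from the slides). Best-response dynamics terminate in finite congestion games because they are potential games.

```python
# Best-response dynamics in a tiny discrete congestion game.
# Assumed instance: 4 players, two single-road paths, linear costs.
n = 4
paths = [frozenset({"A"}), frozenset({"B"})]
phi = {"A": lambda k: k, "B": lambda k: k + 1}  # phi_r(#users)

def loads(a):
    """sigma_r(a): number of users on each resource."""
    sigma = {r: 0 for r in phi}
    for ai in a:
        for r in ai:
            sigma[r] += 1
    return sigma

def cost(i, a):
    """C_i(a_i, a_-i) = sum over r in a_i of phi_r(sigma_r(a))."""
    sigma = loads(a)
    return sum(phi[r](sigma[r]) for r in a[i])

a = [paths[0]] * n  # start: everyone on road A
changed = True
while changed:  # iterate best responses until no player wants to move
    changed = False
    for i in range(n):
        best = min(paths, key=lambda p: cost(i, a[:i] + [p] + a[i + 1:]))
        if cost(i, a[:i] + [best] + a[i + 1:]) < cost(i, a):
            a[i], changed = best, True
print(loads(a))  # a Nash equilibrium load profile
```

At the resulting profile no player can lower its own cost by switching roads, which is exactly the NE condition above.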
Social influence: Equilibrium shaping

Claim: For such congestion games, NE minimizes

    P(a) = Σ_r Σ_{k=0}^{σ_r(a)} φ_r(k)

Overall congestion:

    G(a) = Σ_r φ_r(σ_r(a)) · σ_r(a)   (cost per user × # users)

Claim: Modified resource cost

    φ_r(σ_r(a)) + (σ_r(a) − 1) · (φ_r(σ_r(a)) − φ_r(σ_r(a) − 1))

(the second term is the “imposition toll”) results in NE that minimizes overall congestion.
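A numeric sanity check of the second claim on an assumed two-road instance (two players, φ_A(k) = k congestible, φ_B(k) = 2 constant; the instance is illustrative). Without the toll, the inefficient profile with both players on road A is also a NE; with the imposition toll, only G-minimizing profiles survive:

```python
from itertools import product

# Assumed instance: 2 players, road A with phi_A(k) = k (congestible),
# road B with phi_B(k) = 2 (constant).
phi = {"A": lambda k: k, "B": lambda k: 2}
roads = ["A", "B"]

def sigma(a, r):
    return sum(1 for ai in a if ai == r)

def G(a):  # overall congestion: sum over roads of cost * #users
    return sum(phi[r](sigma(a, r)) * sigma(a, r) for r in roads)

def tolled(r, s):  # phi_r(s) plus the imposition toll
    return phi[r](s) + (s - 1) * (phi[r](s) - phi[r](s - 1))

def nash_profiles(cost):
    """Pure NE under per-user cost cost(road, #users on that road)."""
    eq = []
    for a in product(roads, repeat=2):
        stable = all(
            cost(a[i], sigma(a, a[i])) <= cost(r, sigma(a, r) + 1)
            for i in range(2) for r in roads if r != a[i])
        if stable:
            eq.append(a)
    return eq

base = nash_profiles(lambda r, s: phi[r](s))
shaped = nash_profiles(tolled)
print(sorted({G(a) for a in base}))    # includes an inefficient NE
print(sorted({G(a) for a in shaped}))  # only the G-minimizing NE remain
```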
NE & Price of X

Discussion:
- Model presumes NE as outcome
- How does NE compare to the social planner optimum measured by G(·)?

Price-of-Anarchy (PoA):

    PoA = max_{a ∈ NE} G(a) / min_a G(a)

pessimistic ratio of performance at NE vs optimal performance

Price-of-Stability (PoS):

    PoS = min_{a ∈ NE} G(a) / min_a G(a)

optimistic ratio of performance at NE vs optimal performance
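Both ratios can be computed by brute force in small games. A sketch over an assumed 2-player coordination-style cost game (the cost table is illustrative):

```python
from itertools import product

# Assumed 2-player cost game: (X, X) is the good equilibrium,
# (Y, Y) a worse one; mismatched actions are costly for both.
actions = ["X", "Y"]
C = {  # C[a] = (cost to player 1, cost to player 2)
    ("X", "X"): (1.0, 1.0),
    ("Y", "Y"): (2.0, 2.0),
    ("X", "Y"): (3.0, 3.0),
    ("Y", "X"): (3.0, 3.0),
}

def G(a):  # social cost: sum of individual costs
    return sum(C[a])

def is_ne(a):  # no player can strictly reduce its own cost unilaterally
    for i in range(2):
        for alt in actions:
            dev = tuple(alt if j == i else a[j] for j in range(2))
            if C[dev][i] < C[a][i]:
                return False
    return True

profiles = list(product(actions, repeat=2))
ne = [a for a in profiles if is_ne(a)]
opt = min(G(a) for a in profiles)
poa = max(G(a) for a in ne) / opt  # pessimistic: worst NE vs optimum
pos = min(G(a) for a in ne) / opt  # optimistic: best NE vs optimum
print(ne, poa, pos)
```

Here both (X, X) and (Y, Y) are NE, so PoS = 1 while PoA = 2.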
Illustration: Equal cost sharing

Setup:
- Equally shared cost of resource: α / (# users)  (vs increasing congestion)
- High road: n − ε
- Low road: 1
- [Diagram: two parallel roads, High and Low, from source S to destination D]

NE:
- All use High road at individual cost (n − ε)/n < 1
- All use Low road at individual cost 1/n

PoX: G(a) is the sum of individual costs

    PoA ≈ n  &  PoS = 1
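The two equilibria and the resulting ratios can be checked directly (n = 10 and ε = 0.1 are illustrative assumptions):

```python
# Assumed instance of the slide's network: n = 10 users travel S -> D,
# the High road costs n - eps in total, the Low road costs 1 in total,
# and each road's cost is split equally among its users.
n, eps = 10, 0.1

# All on High: each pays (n - eps)/n < 1, while a lone deviator to the
# Low road would pay the full 1 -- so "all High" is a Nash equilibrium.
high_each = (n - eps) / n
assert high_each < 1.0

# All on Low: each pays 1/n, while a lone deviator to the High road
# would pay the full n - eps -- so "all Low" is also a Nash equilibrium.
low_each = 1 / n
assert n - eps > low_each

# G(a) = sum of individual costs; compare worst and best NE to optimum.
poa = (n * high_each) / (n * low_each)  # = (n - eps)/1, approximately n
pos = (n * low_each) / (n * low_each)   # = 1
print(poa, pos)
```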
Illustration: Equal cost sharing, cont.

Setup:
- Equally shared cost as before
- User specific starting points

NE:
- All use private resource at individual cost 1
- All use shared resource at individual cost k/n

PoX: G(a) is the sum of individual costs

    PoA = n/k  &  PoS = 1

Balcan, Blum, & Mansour (2013), “Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior”
PoX analysis

Extensions: Broader solution concepts, various families of games, price-of-uncertainty, price-of-byzantine, price-of-...

(λ, µ)-smoothness: For any two action profiles a* & a′,

    Σ_i C_i(a*_i, a′_{−i}) ≤ λ Σ_i C_i(a*) + µ Σ_i C_i(a′)

Theorem: Under (λ, µ)-smoothness,

    PoA ≤ λ / (1 − µ)

Think of a′ as NE and a* as central optimum

Roughgarden (2009), “Intrinsic robustness of the price of anarchy”
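The theorem follows in two lines. Writing C(a) = Σ_i C_i(a), with a′ a Nash equilibrium and a* the central optimum:

```latex
C(a') = \sum_i C_i(a')
      \le \sum_i C_i(a_i^*, a'_{-i})
      % a' is a NE: deviating to a_i^* cannot lower player i's cost
      \le \lambda \sum_i C_i(a^*) + \mu \sum_i C_i(a')
      % (\lambda,\mu)-smoothness applied to the pair (a^*, a')
\;\Longrightarrow\; (1-\mu)\, C(a') \le \lambda\, C(a^*)
\;\Longrightarrow\; \frac{C(a')}{C(a^*)} \le \frac{\lambda}{1-\mu}.
```

Since a′ was an arbitrary NE, the worst-case NE obeys the same bound, which is exactly PoA ≤ λ/(1 − µ).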
Outline/Recap

Game models
- Setup & equilibrium
- Price-of-X

Influence models
- Equilibrium shaping

Lingering issues:
- Uncertain landscapes
- Equilibrium analysis
Equilibrium shaping & uncertain landscapes

Marginal contribution utility:
- Assume global objective, G(·)
- Define “null” action ∅
- Set

    U(a_i, a_{−i}) = G(a_i, a_{−i}) − G(∅, a_{−i})

Claim: PoS = 1

Recall (β is the new term):

    φ_r(σ_r(a)) + β (σ_r(a) − 1) · (φ_r(σ_r(a)) − φ_r(σ_r(a) − 1))

where the β-weighted term is the imposition toll τ(a).

Uncertain landscape: What if users have different β’s? (many more sources of uncertainty...)
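The PoS = 1 claim rests on G itself becoming an exact potential under this utility design: a player's utility change from any unilateral switch equals the change in the global objective, for any G. A sketch with a toy G (the objective and action names are illustrative assumptions):

```python
from itertools import product

# Marginal-contribution utility design:
#   U_i(a) = G(a_i, a_-i) - G(null, a_-i)
actions = [None, "A", "B"]  # None plays the role of the "null" action

def G(a):
    # Toy global objective (assumed): reward distinct covered resources,
    # penalize duplicated choices.
    used = [x for x in a if x is not None]
    return len(set(used)) - 0.25 * (len(used) - len(set(used)))

def U(i, a):
    null = tuple(None if j == i else a[j] for j in range(len(a)))
    return G(a) - G(null)

# Exact-potential check: Delta U_i == Delta G for every unilateral switch.
for a in product(actions, repeat=2):
    for i in range(2):
        for alt in actions:
            b = tuple(alt if j == i else a[j] for j in range(2))
            assert abs((U(i, b) - U(i, a)) - (G(b) - G(a))) < 1e-12
print("G is an exact potential for marginal-contribution utilities")
```

Because G is an exact potential, its global minimizer (here, maximizer of welfare) is itself an equilibrium, which is what gives PoS = 1.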
Equilibrium shaping under uncertainty

Recall the tolled cost (β is the new term):

    φ_r(σ_r(a)) + β (σ_r(a) − 1) · (φ_r(σ_r(a)) − φ_r(σ_r(a) − 1))

where the β-weighted term is the imposition toll τ(a).

Theorem: As κ → ∞, the scaled cost

    κ · (φ_r(σ_r(a)) + τ(a))

leads to PoA → 1

Extensions: Optimal bounded tolls in the special case of parallel links & affine costs.

Brown & Marden (2015), “Optimal mechanisms for robust coordination in congestion games”
Social influence: Mechanism design

    Private info —D→ Social decision

vs

    Private info —S→ Messages —M→ Social decision

A “mechanism” M is a rule from reports to decisions.

Mechanism M induces a game in reporting strategies.

Seek to implement D as a solution of the game, i.e.,

    D = M ∘ S?
Standard illustration: Sealed bid/second price auctions

    Private info —D→ Social decision

vs

    Private info —S→ Messages —M→ Social decision

Planner objective: Assign item to highest private valuation

Agent objective: Item value minus payment

Messages: Bids

Social decision:
- Item to high bidder
- Payment from high bidder = second highest bid

Claim: Truthful bidding is a NE.

Special case of broad discussion...
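The claim can be spot-checked numerically. The valuations and the deviation grid below are illustrative assumptions; the check verifies the NE property, i.e., no bidder gains by unilaterally deviating from the all-truthful profile:

```python
# Sealed-bid second-price auction: winner pays the second-highest bid;
# winner's utility = value - payment, losers get 0. Ties go to the
# lowest-index bidder.
values = [7.0, 4.0, 2.0]  # private valuations (assumed example)

def utility(i, bids):
    winner = max(range(len(bids)), key=lambda j: (bids[j], -j))
    if i != winner:
        return 0.0
    second = max(b for j, b in enumerate(bids) if j != winner)
    return values[i] - second

truthful = list(values)
for i in range(len(values)):
    u_truth = utility(i, truthful)
    for dev in [0.0, 1.0, 3.0, 5.0, 6.0, 8.0, 10.0]:  # sampled deviations
        bids = truthful.copy()
        bids[i] = dev
        assert utility(i, bids) <= u_truth + 1e-12  # deviating never helps
print("truthful bidding survives all tested deviations")
```

In fact truthful bidding is weakly dominant in this auction, which is stronger than the NE property checked here.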
Illustration: Sequential resource allocation

    Private info —D→ Social decision

vs

    Private info —S→ Messages —M→ Social decision

- Private info is revealed sequentially
- Agents do not know own valuations in advance
- Decisions based on sequential messages
Example

- Two users: {1, 2}
- User’s need for resource is low or high: θ_i ∈ {L, H}
- Planner allocates resource a ∈ {1, 2}
- Planner objective: Fair allocation with User 2 priority

Dynamics: Coupled state transitions according to 4 × 4 matrices over the set

    {(L, L), (H, L), (L, H), (H, H)}
Sequential resource allocation, cont

    Private info —D→ Social decision

vs

    Private info —S→ Messages —M→ Social decision

Planner objective: Induce truthful reporting

Agent objective: Future discounted resource access minus payments

    Σ_{t=0}^∞ δ^t ( v_i(π(r^t_1, ..., r^t_n), θ^t_i) − q^t_i(r^t_1, ..., r^t_n) )

Messages: High/Low resource need

Theorem: LP computations so that truthful reporting is a NE.

Kotsalis & Shamma (2013): “Dynamic mechanism design in correlated environments”
Example, cont
Optimal efficient policy: Favor Agent 2
π(L, L) = 1/2, π(H, L) = 1, π(L, H) = 2, π(H, H) = 2
Agent 2 can monopolize by misreporting
Payment rule: q1(·, ·) = 0
q2(L, L) < 0, q2(L,H) < 0, q2(H, L) < 0
q2(H,H) > 0
Ex ante payment from agent 2 = 0
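The sign pattern can be illustrated with hypothetical numbers: assuming (purely for illustration) a uniform stationary distribution over the four states, agent 2's expected payment nets to zero.

```python
# Hypothetical payments matching the stated signs: agent 2 is subsidized in three
# states and pays in (H, H); agent 1 pays nothing. Numbers are illustrative.
q2 = {("L", "L"): -1.0, ("L", "H"): -1.0, ("H", "L"): -1.0, ("H", "H"): 3.0}
q1 = {s: 0.0 for s in q2}

# Illustrative assumption: the joint need process is uniform over the four states.
stationary = {s: 0.25 for s in q2}

ex_ante = sum(stationary[s] * q2[s] for s in q2)
print(ex_ante)  # nets to zero by construction
```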
Critique
Lingering issues:
Uncertain landscapes
Equilibrium analysis:
Tuned for behavior only at a specific Nash equilibrium
Presumes agents solve a coordinated (dynamic) optimization
Presumes knowledge of full system dynamics available to all
Neglects model mismatch
Outline/Recap
Game models
Setup & equilibrium
Price-of-X

Influence models
Equilibrium shaping
Mechanism design

Lingering issues:
Uncertain landscapes
Equilibrium analysis
Learning/evolutionary games
Shift of focus:

Away from equilibrium (Nash equilibrium)

Towards how players might arrive at a solution, i.e., dynamics

"The attainment of equilibrium requires a disequilibrium process."
Arrow, 1987.

"The explanatory significance of the equilibrium concept depends on the underlying dynamics."
Skyrms, 1992.
Literature
Monographs:
Weibull, Evolutionary Game Theory, 1997.
Young, Individual Strategy and Social Structure, 1998.
Fudenberg & Levine, The Theory of Learning in Games, 1998.
Samuelson, Evolutionary Games and Equilibrium Selection, 1998.
Young, Strategic Learning and Its Limits, 2004.
Sandholm, Population Games and Evolutionary Dynamics, 2010.
Surveys:
Hart, “Adaptive heuristics”, Econometrica, 2005.
Fudenberg & Levine, "Learning and equilibrium", Annual Review of Economics, 2009.
Stability & multi-agent learning
Caution! Single-agent learning ≠ multi-agent learning
Sato, Akiyama, & Farmer, "Chaos in a simple two-person game", PNAS, 2002.
Piliouras & JSS, "Optimization despite chaos: Convex relaxations to complete limit sets via Poincaré recurrence", SODA, 2014.
Best reply dynamics (with inertia)
\[
a_i(t) \in
\begin{cases}
B_i(a_{-i}(t-1)) & \text{w.p. } \rho \\
a_i(t-1) & \text{w.p. } 1-\rho
\end{cases}
\qquad 0 < \rho < 1
\]

Features:

Pure NE is a stationary point

Based on greedy response to a myopic forecast:
\( a^{\text{guess}}_{-i}(t) = a_{-i}(t-1) \)?

Need not converge to NE

Theorem
For finite-improvement-property games under best reply with inertia, player strategies converge to NE.

(Includes anonymous congestion games...)
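A minimal sketch of best reply with inertia on a two-route anonymous congestion game (a potential game, hence with the finite improvement property). The game, player count, and parameters are illustrative.

```python
import random

# Two parallel routes; a player's cost is the number of players on its route.
# This anonymous congestion game is a potential game, so best reply with
# inertia converges to a pure NE: an even split of the n players.

def best_reply(i, a):
    """Cheapest route for player i given profile a (counting i's own presence)."""
    others = a[:i] + a[i + 1:]
    cost0 = others.count(0) + 1
    cost1 = others.count(1) + 1
    return 0 if cost0 <= cost1 else 1

def run(n=10, rho=0.5, steps=200, seed=0):
    rng = random.Random(seed)
    a = [0] * n                          # everyone starts on route 0
    for _ in range(steps):
        for i in range(n):
            if rng.random() < rho:       # revise w.p. rho, else keep current action
                a[i] = best_reply(i, a)
    return a

a = run()
print(a.count(0), a.count(1))            # even split at the pure NE
is_ne = all(best_reply(i, a) == a[i] for i in range(len(a)))
print(is_ne)
```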
Fictitious play
Each player:
Maintains empirical frequencies (histograms) of opposing actions
Forecasts (incorrectly) that others play independently according to observed empirical frequencies
Selects an action that maximizes expected payoff

Compare to best reply:
\( a^{\text{guess}}_{-i}(t) = a_{-i}(t-1) \)?
vs
\( a^{\text{guess}}_{-i}(t) \sim q_{-i}(t) \in \Delta(A_{-i}) \)
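A minimal two-player fictitious play sketch on Rock-Paper-Scissors (zero-sum, so empirical frequencies converge to the mixed NE of 1/3 each, per the convergence results on the next slide). The initial count vectors are arbitrary tie-breakers.

```python
# Fictitious play: each player best-responds to the opponent's empirical
# action frequencies. Actions: 0 = Rock, 1 = Paper, 2 = Scissors.

A = [[0, -1, 1],        # row player's zero-sum payoff matrix
     [1, 0, -1],
     [-1, 1, 0]]
B = [[-A[i][j] for j in range(3)] for i in range(3)]   # column player's payoffs

def best_response_row(payoff, opp_counts):
    """Row action maximizing expected payoff against empirical opponent counts."""
    scores = [sum(payoff[i][j] * opp_counts[j] for j in range(3)) for i in range(3)]
    return max(range(3), key=lambda i: scores[i])

counts = [[1, 0, 0], [0, 1, 0]]          # arbitrary initial (fictitious) counts
for _ in range(3000):
    a0 = best_response_row(A, counts[1])
    # Column player maximizes sum_i B[i][j] * counts[0][i] over its actions j.
    a1 = max(range(3), key=lambda j: sum(B[i][j] * counts[0][i] for i in range(3)))
    counts[0][a0] += 1
    counts[1][a1] += 1

freqs = [c / sum(counts[0]) for c in counts[0]]
print([round(f, 2) for f in freqs])      # approaching (1/3, 1/3, 1/3)
```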
Fictitious play: Convergence results
Theorem
For zero-sum games (1951), 2 × 2 games (1961), potential games (1996), and 2 × N games (2003) under fictitious play, player empirical frequencies converge to NE.

Theorem
For the Shapley "fashion game" (1964), the Jordan anti-coordination game (1993), and the Foster & Young merry-go-round game (1998) under fictitious play, player empirical frequencies DO NOT converge to NE.

Detail: Discussion extended to mixed/randomized NE
FP simulations
[Two plots: players' empirical frequencies (0 to 1) over 250 stages]
Rock-Paper-Scissors
FP simulations
Shapley "Fashion Game"

       R      G      B
  R  0, 1   0, 0   1, 0
  G  1, 0   0, 1   0, 0
  B  0, 0   1, 0   0, 1
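A sketch of fictitious play on this game: instead of settling down, play cycles with run lengths that keep growing, so empirical frequencies do not converge (Shapley's example). The payoff matrices are read off the table above; the initial counts are arbitrary.

```python
# Bimatrix payoffs from the table: entry (x, y) gives row payoff x (matrix A)
# and column payoff y (matrix B). Actions: 0 = R, 1 = G, 2 = B.
A = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
B = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

def br(payoff, opp_counts, row_player):
    """Best response to the opponent's empirical counts."""
    if row_player:
        scores = [sum(payoff[i][j] * opp_counts[j] for j in range(3)) for i in range(3)]
    else:
        scores = [sum(payoff[i][j] * opp_counts[i] for i in range(3)) for j in range(3)]
    return max(range(3), key=lambda k: scores[k])

counts = [[1, 0, 0], [0, 1, 0]]              # arbitrary start off the mixed NE
row_actions = []
for _ in range(2000):
    a0 = br(A, counts[1], row_player=True)
    a1 = br(B, counts[0], row_player=False)
    counts[0][a0] += 1
    counts[1][a1] += 1
    row_actions.append(a0)

# Runs of identical row actions grow over time: the hallmark of Shapley cycling.
runs, cur = [], 1
for prev, nxt in zip(row_actions, row_actions[1:]):
    if nxt == prev:
        cur += 1
    else:
        runs.append(cur)
        cur = 1
runs.append(cur)
print(runs[-4:])                              # later runs dwarf the early ones
```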
FP & large games
FP bookkeeping
Observe actions of all players
Construct probability distribution of all possible opponent configurations
Prohibitive for large games
Joint strategy FP
Modification:
Maintain empirical frequencies (histograms) of opposing actions
Forecasts (incorrectly) that others play according to observed joint empirical frequencies (the independence assumption is dropped)
Selects an action that maximizes expected payoff
JSFP bookkeeping
Virtual payoff vector: The payoffs that could have been obtained
\[
U_i(t) =
\begin{bmatrix}
u_i(1, a_{-i}(t)) \\
u_i(2, a_{-i}(t)) \\
\vdots \\
u_i(m, a_{-i}(t))
\end{bmatrix}
\]

Time-averaged virtual payoff:

\( V_i(t+1) = (1-\rho)\,V_i(t) + \rho\,U_i(t) \)
Stepsize ρ is either constant (fading) or diminishing (averaging)
Equivalent JSFP: At each stage, select best virtual payoff action
Viewpoint: Bookkeeping is oracle based (cf., traffic reports)
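A sketch of the bookkeeping with the fading-memory (constant ρ) variant; the two-route congestion utility is an illustrative stand-in for u_i.

```python
# Joint strategy fictitious play bookkeeping: each player tracks a time-averaged
# "virtual payoff" per action (what it would have earned) and picks the best one.

def jsfp_update(V_i, U_i, rho=0.1):
    """Fading-memory update V_i(t+1) = (1 - rho) V_i(t) + rho U_i(t)."""
    return [(1 - rho) * v + rho * u for v, u in zip(V_i, U_i)]

def virtual_payoffs(u_i, actions_i, a_minus_i):
    """Payoffs player i could have obtained: u_i(k, a_{-i}(t)) for each action k."""
    return [u_i(k, a_minus_i) for k in actions_i]

# Illustrative stand-in: two-route congestion, u_i = -(load on the chosen route).
def u_i(route, others_routes):
    return -(others_routes.count(route) + 1)

others = [0, 0, 1]                         # hypothetical routes of the other players
U = virtual_payoffs(u_i, [0, 1], others)   # [-3, -2]: route 1 is less congested
V = [0.0, 0.0]
for _ in range(50):                        # hold a_{-i} fixed for the sketch
    V = jsfp_update(V, U)
best = max(range(2), key=lambda k: V[k])
print(V, best)                             # best virtual-payoff action is route 1
```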
JSFP simulation
Anonymous congestion
Outline/Recap
Game models
Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models
Equilibrium shaping
Mechanism design

Lingering issues:
Uncertain landscapes
Equilibrium analysis
Social influence: Sparse seeding
Public service advertising
Phase I:
Receptive agents: Follow planner's advice (e.g., with probability α)
Non-receptive agents: Unilateral best-response dynamics
Phase II: Receptive agents may revert to best-response dynamics

Main results: Desirable bounds on resulting PoA for various settings (anonymous congestion, shared cost, set coverage...)

Compare: Nash equilibrium vs learning agents

Balcan, Blum, & Mansour (2013), "Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior"
Balcan, Krehbiel, Piliouras, & Shin (2012), "Minimally invasive mechanism design: Distributed covering with carefully chosen advice"
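The two-phase scheme can be sketched on a toy coordination game with a good and a bad convention. The game, the deterministic receptive set, and all parameters are illustrative, not taken from the cited papers.

```python
# n players each choose convention "A" (good) or "B" (bad); payoff is the number
# of others matched, doubled for "A". Both all-A and all-B are Nash equilibria,
# but all-A has higher welfare. Everyone starts at the bad equilibrium.

def payoff(action, profile, i):
    matches = sum(1 for j, a in enumerate(profile) if j != i and a == action)
    return matches * (2 if action == "A" else 1)

def best_response(profile, i):
    return max(("A", "B"), key=lambda x: payoff(x, profile, i))

n = 20
receptive = [i < n // 2 for i in range(n)]    # deterministic "seeded" half (sketch)
profile = ["B"] * n

for _ in range(5):                            # Phase I: advice + best response
    for i in range(n):
        profile[i] = "A" if receptive[i] else best_response(profile, i)

for _ in range(5):                            # Phase II: advice dropped, all best respond
    for i in range(n):
        profile[i] = best_response(profile, i)

print(profile.count("A"))                     # the good convention has taken over
```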
Outline/Recap
Game models
Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models
Equilibrium shaping
Mechanism design
Dynamic incentives

Lingering issues:
Uncertain landscapes
Equilibrium analysis
Social influence & feedback control
Benefits of feedback (Åström):

Reliable behavior from unreliable components (here: humans)
Mitigate disturbances
Shape dynamic behavior
Dynamic incentive challenges
Modeling:
Order?
Time-scale?
Heterogeneity?
Non-stationarity?
Resolution?
Social network effects?
"Thus unless we know quite a lot about the topology of interaction and the agents' decision-making processes, estimates of the speed of adjustment could be off by many orders of magnitude."
Young, “Social dynamics: Theory and applications”, 2001.
Reinforcement learning vs Trend-based reinforcement learning
Measurement & actuation
Recap/Outline/Conclusions
Game models
Setup & equilibrium
Price-of-X
Learning/evolutionary games

Influence models
Equilibrium shaping
Mechanism design
Dynamic incentives

Challenges
Modeling
Measurement & actuation

June 2015