download the slides: tutorialcognitiveradiotechnologies.com/files/wsu_handouts_may_2010.pdf · 1...

Game Theory in the Analysis and Design of Cognitive Radio Networks

James Neel
[email protected]
(540) 230-6012
www.crtwireless.com

Download the slides: http://www.crtwireless.com/WSU_Tutorial.html

Wright State University
May 12, 2010

© Cognitive Radio Technologies, 2010

Cognitive Radio Technologies

Business Details
• Founded in 2007 by Dr. James Neel and Professor Jeff Reed to commercialize cognitive radio research out of Virginia Tech
• 6 employees / contractors
• 07 Sales = 64k, 08 Sales = 127k
• 09 Sales = 394k, 10 (contracts) = 960k

Business Model
• Partner with established companies to spin in cognitive radio research
  – Navy SBIR 08-099 => L3-Nova
  – Air Force SBIR 083-160 => GDC4S
• Contract research and consulting related to cognitive radio and software radio
  – DARPA, DTI, CERDEC, Global Electronics
• Position for entry in emerging wireless markets
  – Cognitive Zigbee

Selected Projects

CR Projects
• Distributed spectrum management for WNW
• White Space Networking
• Cognitive gateway with ad-hoc extensions

SDR Projects
• Prototype SDR for software controllable antenna
• Fundamental limits to SDR performance
• Rapid estimation of SDR resources

[Figure: TV white space usage along frequency f. 6 MHz channels occupied by a TV incumbent user, by a microphone user (fractional use of TV channel), and by an incumbent or other CR user (except microphone user); an unused 6 MHz channel open to another CR user or non-microphone incumbent (regulations permitting); guard band.]

CRT’s Strengths
• Analysis of networked cognitive radio algorithms (game theory)
• Design of low complexity, low overhead (scalable), convergent and stable cognitive radio algorithms
  – Infrastructure, mesh, and ad-hoc networks
  – DFS, TPC, AIA, beamforming, routing, topology formation

[Figures: Frequency Adjustments (channel vs. iteration); Average Link Interference, I_i(f) (dBm) vs. iteration; Net Interference, Φ(f) (dBm) vs. iteration; steady-state interference levels (dBm) vs. number of links, with curves for Typical Worst Case Without Algorithm, Average Without Algorithm, Typical Worst Case With Algorithm, Average With Algorithm, and the Collision Threshold.]

Tutorial Background
• Minor modifications to tutorial given at DySPAN in 2007
• Most material from my three week defense
  – Very understanding committee
  – Dissertation online @ http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/
  – Original defense slides @ http://www.mprg.org/people/gametheory/Meetings.shtml
• Other material from training short course I gave in summer 2003
  – http://www.mprg.org/people/gametheory/Class.shtml
• Eventually will be formalized into a book
  – Been saying that for a while…
• Soft copy of tutorial at
  – http://www.crtwireless.com/WSU_Tutorial.html

Approximate Tutorial Schedule

Time          Material
08:00-09:00   Cognitive Radio and Game Theory (51)
09:00-09:45   Steady-state Solution Concepts (38)
09:45-10:00   Performance Metrics (11)
10:00-10:15   Break
10:15-11:00   Notion of Time and Imperfections in Games (34)
11:00-11:45   Using Game Theory to Design Cognitive Radio Networks (28)
11:45-12:00   Summary (14)

Break: ~20 min, 1000-1020

General Comments on Tutorial
• “This talk is intended to provide attendees with knowledge of the most important game theoretic concepts employed in state-of-the-art dynamic spectrum access networks.”
• Lots of concepts, no proofs – cramming 2-3 semesters of game theory into 3.5 hours
• Tutorial can provide quick reference for concepts discussed at conference
• More leisurely sources of information:
  – D. Fudenberg, J. Tirole, Game Theory, MIT Press, 1991.
  – R. Myerson, Game Theory: Analysis of Conflict, Harvard University Press, 1991.
  – M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
  – J. Neel, J. Reed, A. MacKenzie, Cognitive Radio Network Performance Analysis, in Cognitive Radio Technology, B. Fette, ed., Elsevier, August 2006.

Image modified from http://hacks.mit.edu/Hacks/by_year/1991/fire_hydrant/

Cognitive Radio and Game Theory

Cognitive Radio, Game Theory, Relationship Between the Two

Basic Game Concepts and Cognitive Radio Networks
• Assumptions about Cognitive Radios and Cognitive Radio Networks
  – Definition and concept of cognitive radio as used in this presentation
  – Design Challenges Posed by Cognitive Radio Networks
  – A Model of a Cognitive Radio Network
• High Level View of Game Theory
  – Common Components
  – Common Models
• Relationship between Game Theory and Cognitive Radio Networks
  – Modeling a Generic Cognitive Radio Network as a Game
  – Differences in Typical Assumptions
  – Limitations of Application

Cognitive Radio: Basic Idea
• Software radios permit network or user to control the operation of a software radio
• Cognitive radios enhance the control process by adding
  – Intelligent, autonomous control of the radio
  – An ability to sense the environment
  – Goal driven operation
  – Processes for learning about environmental parameters
  – Awareness of its environment
    • Signals
    • Channels
  – Awareness of capabilities of the radio
  – An ability to negotiate waveforms with other radios

[Figure: software radio stack. Board package (RF, processors), Board APIs, OS, Software Arch Services, Waveform Software, Control Plane.]

Conceptual Operation

OODA Loop: (continuously)
• Observe outside world
• Orient to infer meaning of observations
• Adjust waveform as needed to achieve goal
• Implement processes needed to change waveform

Other processes: (as needed)
• Adjust goals (Plan)
• Learn about the outside world, needs of user, …

[Figure: cognition cycle. Outside World → Observe (Parse Stimuli, Pre-process) → Orient (Infer from Context, Infer from Radio Model, Establish Priority; Urgent / Immediate / Normal paths) → Plan (Select Alternate Goals) → Decide (Generate “Best” Waveform, Allocate Resources) → Act (Initiate Processes, Negotiate Protocols), with Learn (New States) and User Driven (Buttons) / Autonomous States.]

Figure adapted from Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp 3-10.

Implementation Classes
• Weak cognitive radio
  – Radio’s adaptations determined by hard coded algorithms and informed by observations
  – Many may not consider this to be cognitive (see discussion related to Fig 6 in 1900.1 draft)
• Strong cognitive radio
  – Radio’s adaptations determined by conscious reasoning
  – Closest approximation is the ontology reasoning cognitive radios

In general, strong cognitive radios have potential to achieve both much better and much worse behavior in a network, but may not be realizable.

Brilliant Algorithms and Cognitive Engines
• Most research focuses on development of algorithms for:
  – Observation
  – Decision processes
  – Learning
  – Policy
  – Context Awareness
• Some complete OODA loop algorithms
• In general different algorithms will perform better in different situations
• Cognitive engine can be viewed as a software architecture
• Provides structure for incorporating and interfacing different algorithms
• Mechanism for sharing information across algorithms
• No current implementation standard

Example Architecture from CWT

[Figure: CWT cognitive engine block diagram. Blocks shown: Radio with CE-Radio Interface (Performance API, Hardware/platform API); Radio-domain cognition (Radio Resource Monitor, Radio Performance Monitor, WMS, Channel Identifier, Waveform Recognizer); Cognitive System Controller and Cognitive System Module; User Domain (User Model, Evolver; user preference, local service facility, security); Policy Domain (Policy Model; user data security, system/network security); Decision Maker; Knowledge Base (Short Term Memory, Long Term Memory); WSGA parameter set, initial chromosomes, objectives and weights, system chromosome; regulatory information; simulated vs. actual meters; Observation, Orientation, Decision, Action, Learning, and Models annotations.]

DFS in 802.16h
• Drafts of 802.16h defined a generic DFS algorithm which implements observation, decision, action, and learning processes
• Very simple implementation

[Figure: DFS flowchart.
  Observation: Channel Availability Check on next channel. Available? No → Choose Different Channel. Yes → Service in function.
  Observation: In service monitoring of operating channel. Detection? No → keep monitoring. Yes → Stop Transmission.
  Learning: Start Channel Exclusion timer; Log of Channel Availability (channel unavailable for Channel Exclusion time).
  Decision, Action: Select and change to new available channel in a defined time with a max. transmission time.
  Background In service monitoring (on non-operational channels). Available? No → Start Channel Exclusion timer.]

Modified from Figure h1 IEEE 802.16h-06/010 Draft IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems Amendment for Improved Coexistence Mechanisms for License-Exempt Operation, 2006-03-29
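
The observe / decide / act / learn structure of the flowchart can be sketched as a small state machine. This is only an illustration under simplifying assumptions: the class name `DfsController`, its methods, the numeric times, and the idea of passing in the set of currently occupied channels are hypothetical and are not taken from the 802.16h draft.

```python
class DfsController:
    """Toy DFS loop: availability check, in-service monitoring, exclusion log."""

    def __init__(self, channels, exclusion_time):
        self.channels = list(channels)
        self.exclusion_time = exclusion_time   # hypothetical Channel Exclusion time
        self.excluded_until = {}               # learning: log of channel availability
        self.operating = None

    def is_excluded(self, channel, now):
        return self.excluded_until.get(channel, float("-inf")) > now

    def select_channel(self, now, occupied):
        """Decision/action: availability check on each candidate channel in turn."""
        for channel in self.channels:
            if self.is_excluded(channel, now):
                continue
            if channel in occupied:            # observation: check failed
                self.excluded_until[channel] = now + self.exclusion_time
                continue
            self.operating = channel
            return channel
        self.operating = None                  # nothing available right now
        return None

    def monitor(self, now, occupied):
        """In-service monitoring: on detection, stop, exclude, and move."""
        if self.operating is not None and self.operating in occupied:
            self.excluded_until[self.operating] = now + self.exclusion_time
            self.operating = None              # stop transmission
            return self.select_channel(now, occupied)
        return self.operating


ctrl = DfsController(channels=[1, 2, 3], exclusion_time=30)
print(ctrl.select_channel(now=0, occupied={1}))   # channel 1 fails its check
print(ctrl.monitor(now=5, occupied={2}))          # detection on the operating channel
```

The exclusion log is the only "learning" here, which matches the slide's point that this is a very simple implementation.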

Other Cognitive Radio Efforts
• TVWS PHY/MAC
  – 802.22 TVWS
    • 802.22.1 beacons
  – 802.11af WhiteFi
  – CogNeA
• 802.19.1 TVWS Coexistence
• WhiteSpace Database Group
• Self-Organizing Networks (3GPP / NGMN)
• 802.21 Media Independent Handoffs
• SCC41
  – 1900.4 Architectural building blocks
  – 1900.5 Policy Languages
  – 1900.6 Sensing interfaces
• WinnForum (SDRF)
  – MLM – metalanguages
  – CRWG – database, IPA
• Government
  – NTIA testbed
  – DARPA: xG, WNAN
  – Various service efforts
  – NIJ Interoperability

Used cognitive radio definition

• A cognitive radio is a radio whose control processes permit the radio to leverage situational knowledge and intelligent processing to autonomously adapt towards some goal.
• Intelligence as defined by [American Heritage_00] as “The capacity to acquire and apply knowledge, especially toward a purposeful goal.”
  – To eliminate some of the mess, I would love to just call cognitive radio, “intelligent” radio, i.e.,
  – a radio with the capacity to acquire and apply knowledge especially toward a purposeful goal

Cognitive Networks
• Rather than having intelligence reside in a single device, intelligence can reside in the network
• Effectively the same as a centralized approach
• Gives greater scope to the available adaptations
  – Topology, routing
  – Conceptually permits adaptation of core and edge devices
• Can be combined with cognitive radio for mix of capabilities
• Focus of E2R program

R. Thomas et al., “Cognitive networks: adaptation and learning to achieve end-to-end performance objectives,” IEEE Communications Magazine, Dec. 2006

The Interaction Problem


• Outside world is determined by the interaction of numerous cognitive radios

• Adaptations spawn adaptations

Issues Can Occur When Multiple Intelligences Interact
• Crash of May 6, 2010
  – Not just a fat finger
  – Combination of bad economic news, big bet by Universa, and interactions of traders and computers
  – http://www.legitreviews.com/images/reviews/news/dow_drop.jpg
• Housing Bubble
  – Bounce up instead of down
  – Slower interactions lead to slower changes
  – Also indicative of the role beliefs play in instability
  – http://www.nytimes.com/imagepages/2006/08/26/weekinreview/27leon_graph2.html

In heavily loaded networks, a single vacation can spawn an infinite adaptation process
• Suppose
  – g31>g21; g12>g32; g23>g13
• Without loss of generality
  – g31, g12, g23 = 1
  – g21, g32, g13 = 0.5
• Infinite Loop!
  – 4,5,1,3,2,6,4,…

Interference Characterization

Index   Chan.      Interf.
0       (0,0,0)    (1.5,1.5,1.5)
1       (0,0,1)    (0.5,1,0)
2       (0,1,0)    (1,0,0.5)
3       (0,1,1)    (0,0.5,1)
4       (1,0,0)    (0,0.5,1)
5       (1,0,1)    (1,0,0.5)
6       (1,1,0)    (0.5,1,0)
7       (1,1,1)    (1.5,1.5,1.5)

Phone Image: http://www1.istockphoto.com/file_thumbview_approve/2820949/2/istockphoto_2820949_dect_phone.jpg
Cradle Image: http://www.skypejournal.com/blog/archives/images/AVM_7170_D.jpg
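
The loop can be reproduced with a few lines of simulation. The sketch below assumes that g_ji is the interference link j causes at link i when both use the same channel (an interpretation that reproduces every row of the interference table) and that, at each step, the first link able to strictly lower its own interference switches channels. The function names are illustrative only.

```python
# Cross gains g[(j, i)]: interference link j causes at link i on a shared
# channel, using the "without loss of generality" values from the slide.
GAINS = {(3, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0,
         (2, 1): 0.5, (3, 2): 0.5, (1, 3): 0.5}
LINKS = (1, 2, 3)


def interference(channels, i):
    """Interference seen by link i: sum of gains from co-channel links."""
    return sum(GAINS[(j, i)] for j in LINKS
               if j != i and channels[j - 1] == channels[i - 1])


def adapt(channels):
    """Let the first link that can strictly reduce its interference switch."""
    for i in LINKS:
        trial = list(channels)
        trial[i - 1] = 1 - trial[i - 1]      # two channels: 0 <-> 1
        if interference(trial, i) < interference(channels, i):
            return tuple(trial)
    return channels                          # nobody wants to move


def state_index(channels):
    """Read (c1, c2, c3) as a binary number, matching the 0..7 labels."""
    return 4 * channels[0] + 2 * channels[1] + channels[2]


def trajectory(start, steps):
    states = [start]
    for _ in range(steps):
        states.append(adapt(states[-1]))
    return [state_index(s) for s in states]


print(trajectory((1, 0, 0), 6))
```

Starting from state 4, exactly one link wants to move at every step, so the order in which links are polled does not change the cycle.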

Generalized Insights from the DECT Example
• If # links / clusters > # channels, decentralized channel choices will have a non-zero looping probability
• As # links / clusters → ∞, looping probability goes to 1

  – 2 channels: $p(\text{loop}) \ge 1 - (3/4)^{{}_{n}C_{3}}$
  – k channels: $p(\text{loop}) \ge 1 - \left(1 - 2^{-(k+1)}\right)^{{}_{n}C_{k+1}}$
• Can be mitigated by increasing # of channels (DECT has 120) or reducing frequency of adaptations (DECT is every 30 minutes)
  – Both waste spectrum

  – And we’re talking 100’s of ms for vacation times
• “Centralized” solutions become distributed as networks scale
  – “Rippling” in Cisco WiFi Enterprise Networks
    • www.hubbert.org/labels/Ripple.html
• Also shows up in more recent proposals
  – Recent White Spaces paper from Microsoft

Locally optimal decisions that lead to globally undesirable networks
• Scenario: Distributed SINR maximizing power control in a single cluster
• For each link, it is desirable to increase transmit power in response to increased interference
• Steady state of network is all nodes transmitting at maximum power

[Figure: SINR vs. Power]

Insufficient to consider only a single link, must consider interaction
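
A small simulation makes the steady state concrete. All numbers below (noise power, maximum power, link gains, the discrete power levels) are made-up values for illustration, and the SINR expression is a generic single-cluster form rather than anything specific from the slides.

```python
NOISE = 0.1                     # hypothetical noise power
P_MAX = 1.0                     # hypothetical maximum transmit power
GAINS = [1.0, 0.8, 0.6]         # hypothetical link gains to the common receiver
LEVELS = [P_MAX * k / 10 for k in range(11)]   # allowed power levels


def sinr(powers, i):
    """SINR of link i when every other link in the cluster is interference."""
    interference = sum(GAINS[j] * powers[j]
                       for j in range(len(powers)) if j != i)
    return GAINS[i] * powers[i] / (NOISE + interference)


def best_response(powers, i):
    """Power level that maximizes link i's own SINR, others held fixed."""
    def sinr_at(level):
        trial = list(powers)
        trial[i] = level
        return sinr(trial, i)
    return max(LEVELS, key=sinr_at)


def run(powers, sweeps=5):
    """Round-robin adaptation: each link in turn plays its best response."""
    powers = list(powers)
    for _ in range(sweeps):
        for i in range(len(powers)):
            powers[i] = best_response(powers, i)
    return powers


final = run([0.1, 0.1, 0.1])
print(final)
```

Because each link's SINR is strictly increasing in its own power, the best response never depends on what the others do, and every link ends at `P_MAX`.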

Potential Problems with Networked Cognitive Radios

Distributed
• Infinite recursions
• Instability (chaos)
• Vicious cycles
• Adaptation collisions
• Equitable distribution of resources
• Byzantine failure
• Information distribution

Centralized
• Signaling Overhead
• Complexity
• Responsiveness
• Single point of failure

Network Analysis Objectives
1. Steady state characterization
2. Steady state optimality
3. Convergence
4. Stability/Noise
5. Scalability

[Figure: action space with Radio 1’s available actions on the horizontal axis and Radio 2’s available actions on the vertical axis, showing action tuples a1, a2, a3 and steady states NE1, NE2, NE3; one panel per objective in focus.]

Steady State Characterization
• Is it possible to predict behavior in the system?
• How many different outcomes are possible?

Optimality
• Are these outcomes desirable?
• Do these outcomes maximize the system target parameters?

Convergence
• How do initial conditions impact the system steady state?
• What processes will lead to steady state conditions?
• How long does it take to reach the steady state?

Stability/Noise
• How do system variations/noise impact the system?
• Do the steady states change with small variations/noise?
• Is convergence affected by system variations/noise?

Scalability
• As the number of devices increases,
  – How is the system impacted?
  – Do previously optimal steady states remain optimal?

General Model (Focus on OODA Loop Interactions)
• Cognitive Radios
• Set N
• Particular radios, i, j

General Model (Focus on OODA Loop Interactions)
Actions
• Different radios may have different capabilities
• May be constrained by policy
• Should specify each radio’s available actions to account for variations
• Actions for radio i
  – Ai

General Model (Focus on OODA Loop Interactions)
Decision Rules
• Maps observations to actions
  – di:O→Ai
  – Implies very simple, deterministic function, e.g., standard interference function
• Intelligence implies that these actions further the radio’s goal
  – ui:O→R
• Interesting problem: simultaneously modeling behavior of ontological and procedural radios

Comments on Timing
• When decisions are made also matters and different radios will likely make decisions at different times
• Tj – when radio j makes its adaptations
  – Generally assumed to be an infinite set
  – Assumed to occur at discrete time
    • Consistent with DSP implementation
• T=T1∪T2∪⋅⋅⋅∪Tn
• t ∈ T

Decision timing classes
• Synchronous
  – All at once
• Round-robin
  – One at a time in order
  – Used in a lot of analysis
• Random
  – One at a time in no order
• Asynchronous
  – Random subset at a time
  – Least overhead for a network

Cognitive Radio Network Modeling Summary
• Decision making radios: i,j ∈ N, |N| = n
• Actions for each radio: A=A1×A2×⋅⋅⋅×An
• Observed Outcome Space: O
• Goals: uj:O→R (uj:A→R)
• Decision Rules: dj:O→Aj (dj:A→Aj)
• Timing: T=T1∪T2∪⋅⋅⋅∪Tn
• Network: ⟨N, A, {uj}, {dj}, T⟩

Basic Game Components
1. A (well-defined) set of 2 or more players
2. A set of actions for each player.
3. A set of preference relationships for each player for each possible action tuple.

• More elaborate games exist with more components but these three must always be there.
• Some also introduce an outcome function which maps action tuples to outcomes which are then valued by the preference relations.
• Games with just these three components (or a variation on the preference relationships) are said to be in Normal Form or Strategic Form

Set of Players (decision makers)

• N – set of n players consisting of players “named” {1, 2, 3,…,i, j,…,n}
• Note the n does not mean that there are 14 players in every game.
• Other components of the game that “belong” to a particular player are normally indicated by a subscript.
• Generic players are most commonly written as i or j.
• Usage: N is the SET of players, n is the number of players.
• N \ i = {1,2,…,i-1, i+1,…, n}: all players in N except for i

Actions
Ai – Set of available actions for player i
ai – A particular action chosen by i, ai ∈ Ai
A – Action Space, Cartesian product of all Ai
    A = A1 × A2 × · · · × An
a – Action tuple – a point in the Action Space
A-i – Another action space formed from
    A-i = A1 × A2 × · · · × Ai-1 × Ai+1 × · · · × An
a-i – A point from the space A-i
A = Ai × A-i

Example Two Player Action Space
A1 = A2 = [0, ∞)
A = A1 × A2

[Figure: two-player action space with axes A1 = A-2 and A2 = A-1; point a with a1 = a-2 and a2 = a-1, point b with b1 = b-2 and b2 = b-1.]

Preference Relations (1/2)

Preference Relation expresses an individual player’s desirability of one outcome over another (a binary relationship)

$o \succsim_i o^*$: o is preferred at least as much as o* by player i

$\succsim_i$  Preference Relationship (prefers at least as much as)
$\succ_i$  Strict Preference Relationship (prefers strictly more than): $o \succ_i o^*$ iff $o \succsim_i o^*$ but not $o^* \succsim_i o$
$\sim_i$  “Indifference” Relationship (prefers equally): $o \sim_i o^*$ iff $o \succsim_i o^*$ and $o^* \succsim_i o$

Preference Relations (2/2)
• Games generally assume the relationship between actions and outcomes is invertible so preferences can be expressed over action vectors.
• Preferences are really an ordinal relationship
  – Know that player prefers one outcome to another, but quantifying by how much introduces difficulties

Utility Functions (1/2)
(Objective Fcns, Payoff Fcns)

A mathematical description of preference relationships.

Maps action space to set of real numbers.
$u_i : A \to \mathbb{R}$

Preference Relation then defined as
$a \succsim_i a^*$ iff $u_i(a) \ge u_i(a^*)$
$a \succ_i a^*$ iff $u_i(a) > u_i(a^*)$
$a \sim_i a^*$ iff $u_i(a) = u_i(a^*)$

Utility Functions (2/2)

By quantifying preference relationships all sorts of valuable mathematical operations can be introduced.

Also note that the quantification operation is not unique as long as relationships are preserved. Many map preference relationships to [0,1].

Example
Jack prefers Apples to Oranges
$\text{Apples} \succ_{\text{Jack}} \text{Oranges}$, so $u_{\text{Jack}}(\text{Apples}) > u_{\text{Jack}}(\text{Oranges})$

a) uJack(Apples) = 1, uJack(Oranges) = 0

b) uJack(Apples) = -1, uJack(Oranges) = -7.5
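
The non-uniqueness can be checked mechanically: two utility functions are interchangeable when they induce the same "at least as much as" relation over every pair of outcomes. The helper names below are illustrative; the numbers are the two assignments from the example.

```python
def prefers_weakly(u, a, b):
    """True if outcome a is preferred at least as much as b under utility u."""
    return u[a] >= u[b]


def same_ordering(u, v):
    """True if u and v induce the same preference relation on all outcome pairs."""
    outcomes = list(u)
    return all(prefers_weakly(u, x, y) == prefers_weakly(v, x, y)
               for x in outcomes for y in outcomes)


u_a = {"Apples": 1.0, "Oranges": 0.0}     # assignment (a)
u_b = {"Apples": -1.0, "Oranges": -7.5}   # assignment (b)
u_reversed = {"Apples": 0.0, "Oranges": 1.0}

print(same_ordering(u_a, u_b))
print(same_ordering(u_a, u_reversed))
```

Only the ordering matters here, which is why the magnitudes in (a) and (b) can differ so much.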

Normal Form Games
(Strategic Form Games)

In normal form, a game consists of three primary components

$G = \langle N, A, \{u_i\} \rangle$

N – Set of Players
Ai – Set of Actions Available to Player i
A – Action Space, $A = A_1 \times A_2 \times \cdots \times A_n$
{ui} – Set of Individual Objective Functions, $u_i : A \to \mathbb{R}$

Normal Form Games in Matrix Representation

Useful for representing 2 player games with finite action sets.
Player 1’s actions are indexed by rows.
Player 2’s actions are indexed by columns.
Each entry is the payoff vector, (u1, u2), corresponding to the action tuple

N = {1,2}
A1 = {a1,b1}
A2 = {a2,b2}

        a2                          b2
a1      u1(a1, a2), u2(a1, a2)      u1(a1, b2), u2(a1, b2)
b1      u1(b1, a2), u2(b1, a2)      u1(b1, b2), u2(b1, b2)

Cognitive radios are naturally modeled as players in a game

[Figure: cognition cycle (Observe, Orient, Plan, Learn, Decide, Act, Outside World) annotated with the game components Goal, Utility Function, Utility Function Arguments, Outcome Space, Action Sets, and Decision Rules.]

Adapted From Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp 3-10.

Interaction is naturally modeled as a game

[Figure: Radio 1 and Radio 2 each choose Actions through their Decision Rules; the Action Space maps to the Outcome Space through $f : A \to O$ (informed by communications theory); the outcome $(\hat{\gamma}_1, \hat{\gamma}_2)$ is valued by $u_1(\hat{\gamma}_1)$ and $u_2(\hat{\gamma}_2)$.]

When Game Theory can be Applied

If distributed adaptation doesn’t occur, it’s not a game

Level
0 SDR
1 Goal Driven
2 Context Aware
3 Radio Aware
4 Planning
5 Negotiating
6 Learns Environment
7 Adapts Plans
8 Adapts Protocols

Game Theory applies to:
1. Adaptive aware radios
2. Cognitive radios that learn about their environment

Unconstrained action sets (radios can make up new adaptations) or undefined goals (utility functions) make analysis impractical

[Figure: cognition cycle with the levels suitable for game theory analysis marked.]

Conditions for Applying Game Theory to CRNs
• Conditions for rationality
  – Well defined decision making processes
  – Expectation of how changes impact performance
• Conditions for a nontrivial game
  – Multiple interactive decision makers
  – Nonsingleton action sets
• Conditions generally satisfied by distributed dynamic CRN schemes

Example Application Appropriateness
• Inappropriate applications
  – Cellular Downlink power control (single cell)
  – Site Planning
  – A single cognitive network
• Appropriate applications
  – Multiple interactive cognitive networks
  – Distributed power control on non-orthogonal waveforms
    • Ad-hoc power control
    • Cell breathing
  – Adaptive MAC
  – Distributed Dynamic Frequency Selection
  – Network formation (localized objectives)

Some differences between game models and cognitive radio network model
• Assuming numerous iterations, normal form game only has a single stage.
  – Useful for compactly capturing modeling components at a single stage
  – Normal form game properties will be exploited in the analysis of other games
  – Other game models discussed throughout this presentation

               Player                         Cognitive Radio
Knowledge      Knows A                        Can learn O (may know or learn A)
f : A → O      Invertible, constant, known    Not invertible (noise); may change over time (though relatively fixed for short periods); has to learn
Preferences    Ordinal                        Cardinal (goals)

Summary
• Adaptations of cognitive radios interact
  – Adaptations can have unexpected negative results
    • Infinite recursions, vicious cycles
  – Insufficient to consider behavior of only a single link in the design
• Behavior of collection of radios can be modeled as a game
• Some differences in models and assumptions but high level mapping is fairly close
• As we look at convergence, performance, collaboration, and stability, we’ll extend the model

Equilibrium Concepts

Nash Equilibria, Mixed Strategy Equilibria, Coalitional Games, the Core, Shapley Value, Nash Bargaining

WSU May 10, 2010

Steady-states
• Recall model of <N,A,{di},T> which we characterize with the evolution function d
• Steady-state is a point where a* = d(a*) for all t ≥ t*
• Obvious solution: solve for fixed points of d.
• For non-cooperative radios, if a* is a fixed point under synchronous timing, then it is under the other three timings (round-robin, random, asynchronous)
• Works well for convex action spaces
  – Not always guaranteed to exist
  – Value of fixed point theorems
• Not so well for finite spaces
  – Generally requires exhaustive search

Nash Equilibrium

“A steady-state where each player holds a correct expectation of the other players’ behavior and acts rationally.” - Osborne

An action vector from which no player can profitably unilaterally deviate.

Definition
An action tuple a is a NE if for every i ∈ N
$u_i(a_i, a_{-i}) \ge u_i(b_i, a_{-i})$
for all bi ∈ Ai.

Note showing that a point is a NE says nothing about the process by which the steady state is reached. Nor anything about its uniqueness.
Also note that we are implicitly assuming that only pure strategies are possible in this case.

Examples
• Cognitive Radios’ Dilemma
  – Two radios have two signals to choose between {n,w} and {N,W}
  – n and N do not overlap
  – Higher throughput from operating as a high power wideband signal when other is narrowband
• Jamming Avoidance
  – Two channels
  – No NE

                     Jammer
                     0           1
Transmitter   0      (-1,1)      (1,-1)
              1      (1,-1)      (-1,1)
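
For small finite games, the NE definition can be checked by brute force: test every action tuple for a profitable unilateral deviation. In the sketch below the jamming payoffs are the ones in the matrix above (transmitter as row player); the payoffs used for the Cognitive Radios’ Dilemma are hypothetical numbers chosen only to have the described structure, since the slide does not give them.

```python
from itertools import product


def pure_nash_equilibria(payoffs, actions1, actions2):
    """Return action tuples from which neither player can profitably deviate."""
    equilibria = []
    for a1, a2 in product(actions1, actions2):
        u1, u2 = payoffs[(a1, a2)]
        best1 = all(u1 >= payoffs[(b1, a2)][0] for b1 in actions1)
        best2 = all(u2 >= payoffs[(a1, b2)][1] for b2 in actions2)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


# Jamming avoidance: (transmitter channel, jammer channel) -> (u_tx, u_jam).
jamming = {(0, 0): (-1, 1), (0, 1): (1, -1),
           (1, 0): (1, -1), (1, 1): (-1, 1)}

# Hypothetical dilemma payoffs: wideband beats narrowband for each radio
# individually, but both radios do better at (n, N) than at (w, W).
dilemma = {("n", "N"): (2, 2), ("n", "W"): (0, 3),
           ("w", "N"): (3, 0), ("w", "W"): (1, 1)}

print(pure_nash_equilibria(jamming, [0, 1], [0, 1]))
print(pure_nash_equilibria(dilemma, ["n", "w"], ["N", "W"]))
```

This is the exhaustive search that finite action spaces generally require; it scales with the size of the action space, not with any structure of the utilities.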

How do the players find the Nash Equilibrium?
• Preplay Communication
  – Before the game, discuss their options. Note only NE are suitable candidates for coordination as one player could profitably violate any other agreement.
• Rational Introspection
  – Based on what each player knows about the other players, reason what the other players would do in their own best interest. (Best Response - tomorrow) Points where everyone would be playing “correctly” are the NE.
• Focal Point
  – Some distinguishing characteristic of the tuple causes it to stand out. The NE stands out because it’s every player’s best response.
• Trial and Error
  – Starting on some tuple which is not a NE a player “discovers” that deviating improves its payoff. This continues until no player can improve by deviating. Only guaranteed to work for Potential Games (couple weeks)

Nash Equilibrium as a Fixed Point
• Individual Best Response
  $\hat{B}_i(a) = \{ b_i \in A_i : u_i(b_i, a_{-i}) \ge u_i(a_i, a_{-i}) \ \forall a_i \in A_i \}$
• Synchronous Best Response
  $\hat{B}(a) = \times_{i \in N} \hat{B}_i(a)$
• Nash Equilibrium as a fixed point
  $a^* = \hat{B}(a^*)$
• Fixed point theorems can be used to establish existence of NE (see dissertation)
• NE can be solved by implied system of equations

Example solution for Fixed Point by Solving for Best Response Fixed Point
• Bandwidth Allocation Game
  – Five cognitive radios with each radio, i, free to determine the number of simultaneous frequency hopping channels the radio implements, ci ∈ [0,∞).
  – Goal: $u_i(c) = P(c)\,c_i - C_i(c_i)$
  – P(c) fraction of symbols that are not interfered with (making P(c)ci the goodput for radio i)
  – Ci(ci) is radio i’s cost for supporting ci simultaneous channels.

$u_i(c) = \left( B - \sum_{k \in N} c_k \right) c_i - K c_i$

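Best Response Analysis

Goal
$u_i(c) = \left( B - \sum_{k \in N} c_k \right) c_i - K c_i$

Best Response
$\hat{c}_i = \hat{B}_i(c) = \left( B - K - \sum_{k \in N \setminus i} c_k \right) / 2$

Simultaneous System of Equations
The best response equation holds for every i ∈ N.

Solution
$\hat{c}_i = (B - K)/6 \quad \forall i \in N$

Generalization
$\hat{c}_i = (B - K)/(|N| + 1) \quad \forall i \in N$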
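
The closed-form solution can be cross-checked numerically by letting the radios take turns playing the best response until the allocations stop changing. The values of B and K below are arbitrary illustrative choices, and round-robin timing is assumed for the iteration; neither is specified on the slide.

```python
B = 10.0          # hypothetical value for the constant B in the utility
K = 4.0           # hypothetical per-channel cost K
N_RADIOS = 5      # five radios, as in the example


def utility(c, i):
    """u_i(c) = (B - sum_k c_k) * c_i - K * c_i"""
    return (B - sum(c)) * c[i] - K * c[i]


def best_response(c, i):
    """c_i = (B - K - sum of the other radios' channels) / 2, kept in [0, inf)."""
    others = sum(c) - c[i]
    return max(0.0, (B - K - others) / 2.0)


def round_robin(c, sweeps=200):
    """Each radio in turn replaces its choice with its best response."""
    c = list(c)
    for _ in range(sweeps):
        for i in range(len(c)):
            c[i] = best_response(c, i)
    return c


equilibrium = round_robin([0.0] * N_RADIOS)
predicted = (B - K) / (N_RADIOS + 1)
print(equilibrium, predicted)
```

At the resulting point every radio's choice equals its own best response, which is the fixed-point condition from the previous slide.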

Significance of NE for CRNs

Autonomously Rational Decision Rule

• Why not “if and only if”?
  – Consider a self-motivated game with a local maximum and a hill-climbing algorithm.
  – For many decision rules, NE do capture all fixed points (see dissertation)
• Identifies steady-states for all “intelligent” decision rules with the same goal.
• Implies a mechanism for policy design while accommodating differing implementations
  – Verify goals result in desired performance
  – Verify radios act intelligently

Nash Equilibrium Existence

Visualizable Definition of Quasi-Concavity
All upper-level sets are convex

$U(a^*) = \{ a \in A : f(a) \ge f(a^*) \}$

[Figure: two plots of f(a) versus a with points a0, a1, a2 and the upper-level set U(a1) marked.]

Not all games have an NE, but games with mixed strategies do

My Favorite Mixed Strategy Story

Pure Strategies in an Extended Game
Consider an extensive form game where each stage is a strategic form game and the action space remains the same at each stage. Before play begins, each player chooses a probabilistic strategy that assigns a probability to each action in his action set. At each stage, the player chooses an action from his action set according to the probabilities he assigned before play began.

Example
Consider a video football game which will be simulated. Before the game begins two players assign probabilities of calling running plays or passing plays for both offense and defense. In the simulation, for each down the kind of play chosen by each team is based on the initial probabilities assigned to kinds of plays. (Play NCAA2003)

Example Mixed Strategy Game

Jamming game

              a2 (q)      b2 (1-q)
a1 (p)        1, -1       -1, 1
b1 (1-p)      -1, 1       1, -1

Action Tuples    Probabilities
(a1,a2)          pq
(a1,b2)          p(1-q)
(b1,a2)          (1-p)q
(b1,b2)          (1-p)(1-q)

Expected Utilities
$U_1(p,q) = pq(1) + p(1-q)(-1) + (1-p)(1-q)(1) + (1-p)q(-1)$
$U_2(p,q) = pq(-1) + p(1-q)(1) + (1-p)q(1) + (1-p)(1-q)(-1)$

Sets of probability distributions
Δ(A1)={(p,(1-p)): ∀p∈[0,1]}
Δ(A2)={(q,(1-q)): ∀q∈[0,1]}

Nash Equilibrium in a Mixed Strategy Game

Definition: Mixed Strategy Nash Equilibrium
A mixed strategy profile α* is a NE iff ∀i∈N
$U_i(\alpha_i^*, \alpha_{-i}^*) \ge U_i(\beta_i, \alpha_{-i}^*) \quad \forall \beta_i \in \Delta(A_i)$

Best Response Correspondence
$BR_i(\alpha_{-i}) = \arg\max_{\alpha_i \in \Delta(A_i)} U_i(\alpha_i, \alpha_{-i})$

Alternate NE Definition
Consider $B(\alpha) = \times_{i \in N} BR_i(\alpha)$
A mixed strategy profile α* is a NE iff $\alpha^* \in B(\alpha^*)$

Nash Equilibrium

$U_1(p,q) = 4pq - 2p - 2q + 1$
$U_2(p,q) = -(4pq - 2p - 2q + 1)$

Best Response
$\partial U_1(p,q) / \partial p = 4q - 2$
$\partial U_2(p,q) / \partial q = -4p + 2$

Best Response Correspondences
$BR_1(q) = 0$ if $q < 1/2$; $[0,1]$ if $q = 1/2$; $1$ if $q > 1/2$
$BR_2(p) = 1$ if $p < 1/2$; $[0,1]$ if $p = 1/2$; $0$ if $p > 1/2$

[Figure: BR1(q) and BR2(p) plotted against p(a1) and p(a2) on [0,1]; the two correspondences cross at p = q = 0.5.]

Note: NE in mixed extension which did not exist in original
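
The mixed-strategy equilibrium can be checked numerically from the payoff matrix alone. The sketch below evaluates player 1's expected utility directly from the four action tuples and searches a coarse grid of probabilities for best responses; the grid resolution and function names are arbitrary illustrative choices.

```python
def expected_utility_1(p, q):
    """Player 1's expected payoff when a1 has probability p and a2 probability q."""
    return (p * q * 1 + p * (1 - q) * (-1)
            + (1 - p) * q * (-1) + (1 - p) * (1 - q) * 1)


def expected_utility_2(p, q):
    """The game is zero-sum, so player 2 receives the negative."""
    return -expected_utility_1(p, q)


GRID = [k / 10 for k in range(11)]


def best_response_1(q):
    """All grid values of p that maximize U1(p, q)."""
    best = max(expected_utility_1(p, q) for p in GRID)
    return [p for p in GRID if abs(expected_utility_1(p, q) - best) < 1e-12]


def best_response_2(p):
    """All grid values of q that maximize U2(p, q)."""
    best = max(expected_utility_2(p, q) for q in GRID)
    return [q for q in GRID if abs(expected_utility_2(p, q) - best) < 1e-12]


print(best_response_1(0.25), best_response_1(0.5), best_response_1(0.75))
print(best_response_2(0.25), best_response_2(0.5), best_response_2(0.75))
```

At q = 1/2 every p is a best response for player 1, and at p = 1/2 every q is a best response for player 2, so (1/2, 1/2) is where the two correspondences intersect.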

Interesting Properties of Mixed Strategy Games
1. Every Mixed Extension of a finite Strategic Game has an NE.
2. A mixed strategy αi is a best response to α-i iff every action in the support of αi is itself a best response to α-i.
3. Every action in the support of any player’s equilibrium mixed strategy yields the same payoff to that player.

Coalitional Game (with transferable payoff or utilities)
• Concept: groups of players (called coalitions) conspire together to implement actions which yield a result for the coalition. The value received by the coalition is then distributed among the coalition members.
• Where do radios collaborate and distribute value?
– 802.16h interference groups – allocation of bandwidth
– Distribution of frequencies/spreading codes among cells
– File sharing in P2P network
• Transferable utility refers to existence of some commodity for which a player's utility increases by one unit for every unit of the commodity it receives
• Game Components, ⟨N,v⟩
– N set of players
– Characteristic function $v: 2^N \setminus \emptyset \to \mathbb{R}$
– Coalition, S⊆N
• How is this value distributed?
– Payoff vector, (xi)i∈S
• Payoff vector is said to be S-feasible if x(S) ≤ v(S), where $x(S) = \sum_{i \in S} x_i$

The Core (Transferable)
• The Core
– For ⟨N,v⟩, the set of feasible payoff profiles (xi)i∈N for which there is no coalition S and S-feasible payoff vector (yi)i∈S for which yi > xi for all i∈S.
• General principles of the NE also apply to the Core:
– Number of solutions for a game may be anywhere from 0 to ∞
– May be stable or unstable.

Example
• Suppose three radios, N = {1,2,3}, can choose to participate in a peer-to-peer network.
• Characteristic Function
– v(N) = 1
– v({1,2}) = v({1,3}) = v({2,3}) = α∈[0,1]
– v({1}) = v({2}) = v({3}) = 0
• Loosely, α indicates # of duplicated files
• If α>2/3, Core is empty

[Figure: example adaptations for α=4/5 among the payoff vectors x = (1/3, 1/3, 1/3), x = (2/5, 2/5, 0), x = (0, 3/5, 1/5), x = (2/5, 0, 2/5)]
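The empty-core threshold in the three-radio example can be explored with a short check. This is an illustrative sketch only: `make_v`, `blocking_coalitions`, and `in_core` are hypothetical helper names, and the test relies on the transferable-utility fact that a coalition S can make all of its members strictly better off exactly when x(S) < v(S). For this symmetric game the equal split lies in the core iff 2/3 ≥ α, which matches the α > 2/3 condition.

```python
from fractions import Fraction as Fr
from itertools import combinations


def make_v(alpha):
    """Characteristic function of the three-radio P2P example (depends only on |S|)."""
    def v(S):
        return {1: Fr(0), 2: alpha, 3: Fr(1)}[len(S)]
    return v


def blocking_coalitions(x, v, players=(1, 2, 3)):
    """Coalitions whose members jointly receive less than the coalition's value."""
    blocks = []
    for size in range(1, len(players) + 1):
        for S in combinations(players, size):
            if sum(x[i - 1] for i in S) < v(S):
                blocks.append(S)
    return blocks


def in_core(x, v, players=(1, 2, 3)):
    """x is in the core if it distributes v(N) and no coalition blocks it."""
    return sum(x) == v(players) and not blocking_coalitions(x, v, players)


equal_split = (Fr(1, 3), Fr(1, 3), Fr(1, 3))
print(in_core(equal_split, make_v(Fr(1, 2))))              # alpha below 2/3
print(blocking_coalitions(equal_split, make_v(Fr(4, 5))))  # alpha = 4/5
```

With α = 4/5 every pair receives 2/3 under the equal split but could secure 4/5 on its own, so each pair blocks.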

Comments on the Core
• Possibility of empty core implies that even when radios can freely negotiate and form arbitrary coalitions, no steady-state may exist
• Frequently very large (infinite) number of steady-states, e.g., α<2/3
– Makes it impossible to predict exact behavior
• Existence conditions for the Core are available, but covering them would require some linear programming concepts
• Related (but not addressed today) concepts:
– Bargaining Sets, Kernel, Nucleolus

Strong NE
• Concept: Assume radios are able to collaborate, but utilities aren't necessarily transferable
• An action tuple a* such that

$u_i(a^*) \ge u_i(a_S, a^*_{-S}) \;\; \forall S \subseteq N,\; a_S \in \times_{i \in S} A_i$

[Figure: two example games, labeled "No Strong NE" and "Unique Strong NE"]

       N              W
n      (9.6, 9.6)     (9.6, 21)
w      (21, 9.6)      (22, 22)

Motivation for Shapley value
• Core was generally either empty or very large.
– Want a "good" single solution.
• Roughly, defines a formal distribution function
• Terminology
– Marginal Contribution of i: $\Delta_i(S) = v(S \cup \{i\}) - v(S)$
– Interchangeability of i, j: $\Delta_i(S) = \Delta_j(S) \;\; \forall S \subseteq N \setminus \{i, j\}$
– Dummy player (no synergy): $\Delta_i(S) = v(\{i\}) \;\; \forall S \subseteq N \setminus \{i\}$

Axioms for Shapley Value
• Let ψ be some distribution of value for a TU coalition game
• Symmetry:
– If i and j are interchangeable, then ψi(v)=ψj(v)
• Dummy:
– If i is a dummy, then ψi(v) = v({i})
• Additivity:
– Given ⟨N,v⟩ and ⟨N,w⟩, ψi(v + w)= ψi(v)+ ψi(w) for all i∈N, where (v+w)(S) = v(S) + w(S)
• Balanced Contributions
– Given ⟨N,v⟩, $\psi_i(N, v) - \psi_i(N \setminus j, v^{N \setminus j}) = \psi_j(N, v) - \psi_j(N \setminus i, v^{N \setminus i})$

Shapley Value

$\psi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right)$

• $v(S \cup \{i\}) - v(S) = \Delta_i(S)$: marginal value contributed by i
• $|S|!\,(|N| - |S| - 1)! / |N|!$: probability that i will be the next one invited to the grand coalition given that coalition S is already part of the coalition, assuming random ordering.
• Only assignment (value) that satisfies balanced contributions; only assignment that simultaneously satisfies symmetry, dummy, and additivity axioms
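The Shapley formula can be evaluated directly by enumerating coalitions. The sketch below is illustrative rather than taken from the slides: the function name `shapley` and the two characteristic functions are assumptions chosen for the example. The first game is the three-radio peer-to-peer game with α = 4/5; the second is a made-up game in which player 3 is a dummy.

```python
from fractions import Fraction as Fr
from itertools import combinations
from math import factorial


def shapley(players, v):
    """Shapley value of each player for characteristic function v(frozenset)."""
    n = len(players)
    psi = {}
    for i in players:
        others = [j for j in players if j != i]
        total = Fr(0)
        for size in range(n):
            # Probability that exactly the members of S precede i in a random ordering.
            weight = Fr(factorial(size) * factorial(n - size - 1), factorial(n))
            for S in combinations(others, size):
                S = frozenset(S)
                total += weight * (v(S | {i}) - v(S))
        psi[i] = total
    return psi


def v_p2p(S, alpha=Fr(4, 5)):
    """Three-radio P2P game: value depends only on coalition size."""
    return {0: Fr(0), 1: Fr(0), 2: alpha, 3: Fr(1)}[len(S)]


def v_dummy(S):
    """Hypothetical game: value 1 is created only when players 1 and 2 are both present."""
    return Fr(1) if {1, 2} <= S else Fr(0)


print(shapley([1, 2, 3], v_p2p))    # symmetric players share equally
print(shapley([1, 2, 3], v_dummy))  # player 3 adds nothing
```

The symmetric game illustrates the symmetry axiom, and the second game illustrates the dummy axiom, since player 3 receives v({3}) = 0.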

Implications of Shapley Value
• One form of a fair allocation
– What you receive is based on the value you add
– Independent of order of arrival
– I liken it to setting salaries according to the Value Over Replacement Player concept
• "Better" solution concept than the core as it's a single payoff as opposed to a potentially infinite number
• Allows for analysis of relative "power" of different players in the system

Steady-State Summary
• Not every game has a steady-state
• NE are analogous to fixed points of self-interested decision processes
• NE can be applied to procedural and ontological radios
– Don't need to know decision rule, only goals, actions, and assumption that radios act in their own interest
• A game (network) may have 0, 1, or many steady-states
• All finite normal form games have an NE in their mixed extension
– Over multiple iterations, implies constant adaptation
• More complex game models yield more complex steady-state concepts
• Can define steady-state concepts for coalitional games
– Frequently so broad that specific solutions are used

5/11/2010

Evaluating Equilibria

Objective Function Maximization, Pareto Efficiency, Notions of Fairness

WSU May 12, 2010

Optimality
• In general we assume the existence of some design objective function J:A→R
• The desirableness of a network state, a, is the value of J(a).
• In general maximizers of J are unrelated to fixed points of d.

Figure from Fig 2.6 in I. Akbar, "Statistical Analysis of Wireless Systems Using Markov Models," PhD Dissertation, Virginia Tech, January 2007

Example Functions
• Utilitarian
– Sum of all players' utilities
– Product of all players' utilities
• Practical
– Total system throughput
– Average SINR
– Maximum End-to-End Latency
– Minimal sum system interference
• Objective can be unrelated to utilities

[Figure panels: Utilitarian Maximizers, System Throughput Maximizers, Interference Minimization]

Price of Anarchy (Factor)

$\frac{\text{Performance of Centralized Algorithm Solution}}{\text{Performance of Distributed Algorithm Solution}} \ge 1$

• Centralized solution always at least as good as distributed solution
– Like ASIC is always at least as good as DSP
• Ignores costs of implementing algorithms
– Sometimes centralized is infeasible (e.g., routing the Internet)
– Distributed can sometimes (but not generally) be more costly than centralized

[Figure: example values 9.6 and 7]

Price of Anarchy Discussion
• Best of All Possible Worlds
– Low complexity distributed algorithms with low anarchy factors
• Reality implies mix of methods
– Hodgepodge of mixed solutions
   • Policy – bounds the price of anarchy
   • Utility adjustments – align distributed solution with centralized solution
   • Market methods – sometimes distributed, sometimes centralized
   • Punishment – sometimes centralized, sometimes distributed, sometimes both
   • Radio environment maps – "centralized" information for distributed decision processes
– Fully distributed
   • Potential game design – really, the Panglossian solution, but only applies to particular problems

Pareto efficiency (optimality)
• Formal definition: An action vector a* is Pareto efficient if there exists no other action vector a such that every radio's valuation of the network is at least as good and at least one radio assigns a higher valuation
• Informal definition: An action tuple is Pareto efficient if some radios must be hurt in order to improve the payoff of other radios.
• Important note
– Like design objective function, unrelated to fixed points (NE)
– Generally less specific than evaluating design objective function

Example Games

Legend: PE = Pareto Efficient, NE = Nash equilibrium, NE + PE = both

        a2             b2
a1      1, 1 (PE)      -5, 5 (PE)
b1      5, -5 (PE)     -1, -1 (NE)

        a2             b2
a1      1, 1           -5, 5 (PE)
b1      5, -5 (PE)     3, 3 (NE + PE)
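The Pareto-efficiency and NE labels for two-by-two games like the ones above can be computed mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the payoff tables are the two example games as read from the slide (rows a1/b1, columns a2/b2).

```python
def pareto_efficient(payoffs):
    """Action tuples not Pareto-dominated by any other action tuple."""
    def dominated(a):
        ua = payoffs[a]
        for b, ub in payoffs.items():
            if b == a:
                continue
            no_worse = all(y >= x for x, y in zip(ua, ub))
            better = any(y > x for x, y in zip(ua, ub))
            if no_worse and better:
                return True
        return False
    return {a for a in payoffs if not dominated(a)}


def pure_nash(payoffs):
    """Action tuples from which no player gains by a unilateral deviation."""
    n_players = len(next(iter(payoffs)))
    actions = [sorted({a[i] for a in payoffs}) for i in range(n_players)]
    equilibria = set()
    for a in payoffs:
        gains = False
        for i in range(n_players):
            for alt in actions[i]:
                b = a[:i] + (alt,) + a[i + 1:]
                if payoffs[b][i] > payoffs[a][i]:
                    gains = True
        if not gains:
            equilibria.add(a)
    return equilibria


game1 = {("a1", "a2"): (1, 1), ("a1", "b2"): (-5, 5),
         ("b1", "a2"): (5, -5), ("b1", "b2"): (-1, -1)}
game2 = {("a1", "a2"): (1, 1), ("a1", "b2"): (-5, 5),
         ("b1", "a2"): (5, -5), ("b1", "b2"): (3, 3)}

for game in (game1, game2):
    print(sorted(pareto_efficient(game)), sorted(pure_nash(game)))
```

In the first game the NE is the only tuple that is not Pareto efficient, while in the second game the NE is also Pareto efficient, which is the contrast the two examples are meant to show.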

Notions of Fairness
• What is "Fair"?
– Abstractly, "fair" means different things to different analysts
– In everyday life, "unfair" is shorthand for "I deserve more than I got"
• Nonetheless, fairness is used to evaluate how equitably radio resources are distributed

Gini Coefficient
• Basic concept:
– Order players by utility.
– Form CDF for sorted utility distribution (Lorenz curve)
– Integrate (sum) the difference between perfect equality (of outcome) and CDF
– Divide result by sum of all players' utilities
• Formula

$G(a) = \frac{1}{n} \left( n + 1 - 2\, \frac{\sum_{i \in N} (n + 1 - i)\, u_i(a)}{\sum_{i \in N} u_i(a)} \right)$

[Figure: Lorenz curve, aggregate utility vs. player #]

• Used in a lot of macro-economic comparisons of income distributions
• Relatively simple, independent of scale, independent of size of N, anonymity
• Radically different outcomes can give the same result

G      N       W
n      0       0.37
w      0.37    0

Other Metrics of Fairness
• Theil Index

$T(a) = \frac{1}{n} \sum_{i \in N} \frac{u_i(a)}{\bar{u}(a)} \ln\left( \frac{u_i(a)}{\bar{u}(a)} \right)$, where $\bar{u}(a) = \frac{1}{n} \sum_{i \in N} u_i(a)$

• Atkinson Index, ε is income inequality aversion

$T(a) = 1 - \frac{1}{\bar{u}(a)} \left( \frac{1}{n} \sum_{i \in N} u_i(a)^{1-\varepsilon} \right)^{1/(1-\varepsilon)}, \;\; \varepsilon \in [0,1)$

$T(a) = 1 - \frac{1}{\bar{u}(a)} \left( \prod_{i \in N} u_i(a) \right)^{1/n}, \;\; \varepsilon = 1$

Bargaining Problem
• Components: ⟨F, v⟩
– Feasible payoffs F, closed convex subset of R^n
– Disagreement Point v = (v1, v2)
   • What 1 or 2 could achieve without bargaining
   • Example:
      – Even if system is jammed, still gets some throughput
      – Member of 802.16h interference group and try its luck
• F is said to be essential if there is some y∈F such that y1>v1 and y2>v2
• If contracts are "binding" then F could be the payoffs corresponding to entire original action space
• Otherwise, F may need to be drawn from the set of NE or from enforceable set (see punishment in repeated games)
• A particular solution is referred to by φ(F, v) ∈ R^n

Desirable Bargaining Axioms about a Solution
• Strong Efficiency
– φ(F, v) is Pareto Efficient
• Individually Rational
– φ(F, v) ≥ v
• Scale Covariance
– For any λ1, λ2, γ1, γ2∈R, λ1, λ2 >0, if
$G = \{ (\lambda_1 x_1 + \gamma_1, \lambda_2 x_2 + \gamma_2) \mid (x_1, x_2) \in F \}$ and $w = (\lambda_1 v_1 + \gamma_1, \lambda_2 v_2 + \gamma_2)$
then
$\phi(G, w) = (\lambda_1 \phi_1(F, v) + \gamma_1, \lambda_2 \phi_2(F, v) + \gamma_2)$
• Independence of Irrelevant Alternatives
– If G ⊆F and G is closed and convex and φ(F, v)∈G, then φ(G, v)=φ(F, v)
• Symmetry
– If v1=v2 and {(x1,x2)|(x2,x1)∈F}=F, then φ1(F, v)= φ2(F, v)

Nash Bargaining Solution
• NBS

$\phi(F, v) \in \arg\max_{x \in F,\, x \ge v} \prod_{i \in N} (x_i - v_i)$

• Interestingly, this is the only bargaining solution which simultaneously satisfies the preceding 5 axioms
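To make the NBS maximization concrete, here is a minimal two-player sketch. The feasible set is an assumption made for illustration (two users splitting a shared capacity, F = {x ≥ 0 : x1 + x2 ≤ capacity}), and `nash_bargaining_grid` is a hypothetical helper. Since the Nash product increases in each coordinate, the search is restricted to the frontier x1 + x2 = capacity; for this F the closed form is x_i = v_i + (capacity − v1 − v2)/2.

```python
def nash_bargaining_grid(capacity, v, steps=1000):
    """Grid search for the point on x1 + x2 = capacity maximizing (x1 - v1)(x2 - v2)."""
    best_product, best_x = None, None
    for k in range(steps + 1):
        x1 = capacity * k / steps
        x2 = capacity - x1
        if x1 < v[0] or x2 < v[1]:
            continue  # not individually rational
        product = (x1 - v[0]) * (x2 - v[1])
        if best_product is None or product > best_product:
            best_product, best_x = product, (x1, x2)
    return best_x


capacity = 10.0
disagreement = (2.0, 0.0)  # user 1 is assumed to get 2 units even without agreement
solution = nash_bargaining_grid(capacity, disagreement)
closed_form = tuple(vi + (capacity - sum(disagreement)) / 2 for vi in disagreement)
print(solution, closed_form)
```

The user with the better disagreement point ends up with the larger share, because the surplus above the disagreement point is what gets split evenly.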

GT framework for BW allocation [Yaiche]: System Model
• N users
• L links
• Users compete for the total link capacity
• Each user has a minimum rate MRi and peak rate PRi
• Admissible rate vector is given by

$X_0 = \{ x \in \mathbb{R}^N \mid x \ge MR,\; x \le PR,\; \text{and } Ax \le C \}$

C: vector of link capacities
A (L×N): a_lp = 1 if link l belongs to path p, else 0.

Scenario given in H. Yaiche, R. Mazumdar, C. Rosenberg, "A game theoretic framework for bandwidth allocation and pricing in broadband networks", IEEE/ACM Transactions on Networking, Volume: 8, Issue: 5, Oct. 2000, pp. 667-678.

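Centralized Optimization Problem

$\max_{x} \prod_{i=1}^{N} (x_i - MR_i)$

subject to
$x_i \ge MR_i, \;\; i \in \{1, \ldots, N\}$
$x_i \le PR_i, \;\; i \in \{1, \ldots, N\}$
$(Ax)_l \le (C)_l, \;\; l \in \{1, \ldots, L\}$

• Unique NBS exists

[Figure: feasible region F with disagreement point v]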

Summary of Equilibria Evaluation
• Lots of different ways in which a point can be evaluated
• Many are contradictory
• Loosely, any point could be said to be optimal given the right objective function
• Insufficient to say that a point is optimal
– Must describe the metric in use
• Suggestion: use whatever metric makes sense to you as a network designer

The Notion of Time and Imperfections in Games and Networks

Extensive Form Games, Repeated Games, Convergence Concepts in Normal Form Games, Trembling Hand Games, Noisy Observations

WSU May 12, 2010

Model Timing Review
• When decisions are made also matters and different radios will likely make decisions at different times
• Tj – when radio j makes its adaptations
– Generally assumed to be an infinite set
– Assumed to occur at discrete time
   • Consistent with DSP implementation
• T=T1∪T2∪⋅⋅⋅∪Tn
• t ∈ T

Decision timing classes
• Synchronous
– All at once
• Round-robin
– One at a time in order
– Used in a lot of analysis
• Random
– One at a time in no order
• Asynchronous
– Random subset at a time
– Least overhead for a network
2

Extensive Form Game Components1. A set of players.2 Th ti il bl t h l t h d i i t ( t t )

Components

2. The actions available to each player at each decision moment (state).3. A way of deciding who is the current decision maker.4. Outcomes on the sequence of actions.5. Preferences over all outcomes.

1B 1,-1Strategic Form Equivalence

Strategies for AGame Tree Representation

A Silly Jammer Avoidance Game

3

A1

2

2

1’

2’

B

B

,

-1,1

-1,1

1,-1

1

2

1,1’ 1,2’ 2,1’ 2,2’

1,-1

1,-11,-1

1,-1 -1,1 -1,1

-1,1 -1,1

Strategies for A{1,2}Strategies for B{(1,1’),(1,2’),(2,1’),(2,2’)}

Backwards Induction
• Concept
– Reason backwards based on what each player would rationally play
– Predicated on Sequential Rationality
– Sequential Rationality – if starting at any decision point for a player in the game, his strategy from that point on represents a best response to the strategies of the other players
– Subgame Perfect Nash Equilibrium is a key concept (not formally discussed today).

Alternating Packet Forwarding Game

[Figure: game tree in which players 1 and 2 alternate choosing C or S. Choosing S at successive decision nodes ends the game with payoffs 1,0; 0,2; 3,1; 2,4; 5,3; 4,6. If C is chosen at every node the payoffs are 7,5.]
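Backwards induction on an alternating game of this kind can be written as a single backward pass. The sketch below is an illustration under stated assumptions: it reads the Alternating Packet Forwarding Game as a six-node game in which players 1 and 2 alternate, S ends the game with the listed payoff pair, and continuing through every node yields (7, 5). The function name and the tie-breaking rule (stop when indifferent) are hypothetical choices.

```python
def backward_induction(movers, stop_payoffs, final_payoff):
    """Solve an alternating stop/continue game from the last node to the first.

    movers[k]       index (0 or 1) of the player deciding at node k
    stop_payoffs[k] payoff pair if that player chooses S at node k
    final_payoff    payoff pair if C is chosen at every node
    """
    value = final_payoff  # value of the subgame after the current node
    plan = []
    for k in reversed(range(len(movers))):
        i = movers[k]
        if stop_payoffs[k][i] >= value[i]:
            value, choice = stop_payoffs[k], "S"
        else:
            choice = "C"
        plan.append(choice)
    plan.reverse()
    return value, plan


movers = [0, 1, 0, 1, 0, 1]
stop_payoffs = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6)]
final_payoff = (7, 5)

outcome, plan = backward_induction(movers, stop_payoffs, final_payoff)
print(outcome, plan)
```

Under this reading the sequentially rational outcome is far from the (7, 5) payoff available if both radios kept forwarding, which is the usual lesson of such games.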

Comments on Extensive Form Games
• Actions will generally not be directly observable
• However, likely that cognitive radios will build up histories
• Ability to apply backwards induction is predicated on knowing other radio's objectives, actions, observations and what they know they know…
– Likely not practical
• Really the best choice for modeling notion of time when actions available to radios change with history

Repeated Games
• Same game is repeated
– Indefinitely
– Finitely
• Players consider discounted payoffs across multiple stages
– Stage k: $u_i^k(a^k) = \delta^k u_i(a^k)$
– Expected value over all future stages: $u_i\left((a^k)\right) = \sum_{k=0}^{\infty} \delta^k u_i(a^k)$

[Figure: Stage 1, Stage 2, ..., Stage k]

Lesser Rationality: Myopic Processes
• Players have no knowledge about utility functions, or expectations about future play; typically can observe or infer current actions
• Best response dynamic – maximize individual performance presuming other players' actions are fixed
• Better response dynamic – improve individual performance presuming other players' actions are fixed
• Interesting convergence results can be established

Paths and Convergence
• Path [Monderer_96]
– A path in Γ is a sequence γ = (a0, a1,…) such that for every k ≥ 1 there exists a unique player such that the strategy combinations (ak-1, ak) differ in exactly one coordinate.
– Equivalently, a path is a sequence of unilateral deviations. When discussing paths, we make use of the following conventions.
– Each element of γ is called a step.
– a0 is referred to as the initial or starting point of γ.
– Assuming γ is finite with m steps, am is called the terminal point or ending point of γ and we say that γ has length m.
• Cycle [Voorneveld_96]
– A finite path γ = (a0, a1,…,ak) where ak = a0

Improvement Paths
• Improvement Path
– A path γ = (a0, a1,…) where for all k≥1, ui(ak)>ui(ak-1) where i is the unique deviator at k
• Improvement Cycle
– An improvement path that is also a cycle
– See the DFS example

[Figure: example paths γ1 through γ6]

Convergence Properties
• Finite Improvement Property (FIP)
– All improvement paths in a game are finite
• Weak Finite Improvement Property (weak FIP)
– From every action tuple, there exists an improvement path that terminates in an NE.
• FIP implies weak FIP
• FIP implies lack of improvement cycles
• Weak FIP implies existence of an NE

Examples

Game with FIP

       A         B
a      1,-1      0,2
b      -1,1      2,2

Weak FIP but not FIP

       A         B         C
a      1,-1      -1,1      0,2
b      -1,1      1,-1      1,2
c      2,1       2,0       2,2
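For a finite game, FIP and weak FIP can be tested by building the graph of unilateral improvements: FIP holds iff that graph has no cycle, and weak FIP holds iff every action tuple can reach an NE (a node with no outgoing improvement). The sketch below is illustrative; the function names are hypothetical, and the two payoff tables are one reading of the example games, with actions indexed from 0 and cell placement treated as an assumption.

```python
def improvement_edges(payoffs):
    """Map each action tuple to the tuples reachable by one improving unilateral deviation."""
    n_players = len(next(iter(payoffs)))
    actions = [sorted({a[i] for a in payoffs}) for i in range(n_players)]
    edges = {a: [] for a in payoffs}
    for a in payoffs:
        for i in range(n_players):
            for alt in actions[i]:
                b = a[:i] + (alt,) + a[i + 1:]
                if payoffs[b][i] > payoffs[a][i]:
                    edges[a].append(b)
    return edges


def has_fip(edges):
    """FIP in a finite game: the improvement graph contains no cycle."""
    state = {}  # 1 = on the current DFS stack, 2 = finished

    def acyclic_from(u):
        state[u] = 1
        for w in edges[u]:
            if state.get(w) == 1:
                return False
            if w not in state and not acyclic_from(w):
                return False
        state[u] = 2
        return True

    return all(acyclic_from(u) for u in edges if u not in state)


def has_weak_fip(edges):
    """Weak FIP: from every action tuple some improvement path ends in an NE."""
    def reaches_ne(start):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            if not edges[u]:
                return True  # no improving deviation, so u is an NE
            for w in edges[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    return all(reaches_ne(a) for a in edges)


fip_game = {(0, 0): (1, -1), (0, 1): (0, 2),
            (1, 0): (-1, 1), (1, 1): (2, 2)}
weak_game = {(0, 0): (1, -1), (0, 1): (-1, 1), (0, 2): (0, 2),
             (1, 0): (-1, 1), (1, 1): (1, -1), (1, 2): (1, 2),
             (2, 0): (2, 1), (2, 1): (2, 0), (2, 2): (2, 2)}

for game in (fip_game, weak_game):
    e = improvement_edges(game)
    print(has_fip(e), has_weak_fip(e))
```

In the three-by-three game the top-left block behaves like matching pennies and produces an improvement cycle, yet every tuple still has some improvement path into the NE in the bottom-right cell.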

Implications of FIP and weak FIP
• Assumes radios are incapable of reasoning ahead and must react to internal states and current observations
• Unless the game model of a CRN has weak FIP, no autonomously rational decision rule can be guaranteed to converge from all initial states under random and round-robin timing (Theorem 4.10 in dissertation).
• If the game model of a CRN has FIP, then ALL autonomously rational decision rules are guaranteed to converge from all initial states under random and round-robin timing.
– And asynchronous timings, but not immediate from definition
• More insights possible by considering more refined classes of decision rules and timings

Decision Rules


Markov Chains
• Describes adaptations as probabilistic transitions between network states.
– d is nondeterministic
• Sources of randomness:
– Nondeterministic timing
– Noise
• Frequently depicted as a weighted digraph or as a transition matrix

General Insights ([Stewart_94])
• Probability of occupying a state after two iterations.
– Form PP.
– Now entry pmn in the mth row and nth column of PP represents the probability that system is in state an two iterations after being in state am.
• Consider P^k.
– Then entry pmn in the mth row and nth column of P^k represents the probability that system is in state an k iterations after being in state am.

Steady-states of Markov chains
• May be inaccurate to consider a Markov chain to have a fixed point
– Actually ok for absorbing Markov chains
• Stationary Distribution
– A probability distribution π* such that $\pi^{*T} \mathbf{P} = \pi^{*T}$ is said to be a stationary distribution for the Markov chain defined by P.
• Limiting distribution
– Given initial distribution π0 and transition matrix P, the limiting distribution is the distribution that results from evaluating

$\lim_{k \to \infty} \pi^{0T} \mathbf{P}^k$
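The matrix-power and limiting-distribution ideas can be tried on a toy chain. The following sketch is illustrative only: the two-state transition matrix is made up for the example (it is not one of the networks in this tutorial), and `matmul`, `matpow`, and `distribution_after` are hypothetical helpers written in plain Python to avoid any library dependency.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matpow(P, k):
    """P raised to the k-th power by repeated multiplication."""
    n = len(P)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, P)
    return result


def distribution_after(pi0, P, k):
    """Row vector pi0^T P^k: state distribution k iterations after starting from pi0."""
    return matmul([pi0], matpow(P, k))[0]


# Made-up two-state chain; entry P[m][n] is the probability of moving from state m to n.
P = [[0.9, 0.1],
     [0.5, 0.5]]

two_step = matpow(P, 2)
limit_from_0 = distribution_after([1.0, 0.0], P, 100)
limit_from_1 = distribution_after([0.0, 1.0], P, 100)
print(two_step, limit_from_0, limit_from_1)
```

For this chain both starting points approach the same distribution, (5/6, 1/6), and multiplying that distribution by P leaves it unchanged, so it is also stationary.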

Ergodic Markov Chain
• [Stewart_94] states that a Markov chain is ergodic if it is a) irreducible, b) positive recurrent, and c) aperiodic.
• Easier to identify rule:
– For some k, P^k has only nonzero entries
• (Convergence, steady-state) If ergodic, then chain has a unique limiting stationary distribution.

Absorbing Markov Chains
• Absorbing state
– Given a Markov chain with transition matrix P, a state am is said to be an absorbing state if pmm=1.
• Absorbing Markov Chain
– A Markov chain is said to be an absorbing Markov chain if
   • it has at least one absorbing state and
   • from every state in the Markov chain there exists a sequence of state transitions with nonzero probability that leads to an absorbing state. These nonabsorbing states are called transient states.

[Figure: chain of states a0, a1, a2, a3, a4, a5]

Absorbing Markov Chain Insights
• Canonical Form

$\mathbf{P}' = \begin{bmatrix} \mathbf{Q} & \mathbf{R} \\ \mathbf{0} & \mathbf{I}^{ab} \end{bmatrix}$

• Fundamental Matrix

$\mathbf{N} = (\mathbf{I} - \mathbf{Q})^{-1}$

• Expected number of times that the system will pass through state am given that the system starts in state ak.
– nkm
• (Convergence Rate) Expected number of iterations before the system ends in an absorbing state starting in state am is given by tm where 1 is a ones vector
– t=N1
• (Final distribution) Probability of ending up in absorbing state am given that the system started in ak is bkm where

$\mathbf{B} = \mathbf{N}\mathbf{R}$
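The quantities N, t, and B can be computed for a small absorbing chain. This is a sketch under stated assumptions: the chain is a made-up symmetric random walk on four states whose two end states are absorbing (not a network from this tutorial), and `inverse` and `matmul` are hypothetical helpers that use exact fractions and Gauss-Jordan elimination.

```python
from fractions import Fraction as Fr


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(M):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(M)
    aug = [list(row) + [Fr(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


half = Fr(1, 2)
# Canonical-form blocks for a walk with two transient states and two absorbing states.
Q = [[Fr(0), half],
     [half, Fr(0)]]  # transient -> transient
R = [[half, Fr(0)],
     [Fr(0), half]]  # transient -> absorbing

identity = [[Fr(int(i == j)) for j in range(2)] for i in range(2)]
I_minus_Q = [[identity[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

N = inverse(I_minus_Q)       # expected visits to each transient state
t = [sum(row) for row in N]  # expected iterations before absorption (t = N1)
B = matmul(N, R)             # probability of ending in each absorbing state
print(N, t, B)
```

Each row of B sums to one, as it must, since the chain is eventually absorbed from every transient state.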

Absorbing Markov Chains and Improvement Paths
• Sources of randomness
– Timing (Random, Asynchronous)
– Decision rule (random decision rule)
– Corrupted observations
• An NE is an absorbing state for autonomously rational decision rules.
• Weak FIP implies that the game is an absorbing Markov chain as long as the NE terminating improvement path always has a nonzero probability of being implemented.
• This then allows us to characterize
– convergence rate,
– probability of ending up in a particular NE,
– expected number of times a particular transient state will be visited

Connecting Markov models, improvement paths, and decision rules
• Suppose we need the path γ = (a0, a1,…am) for convergence by weak FIP.
• Must get right sequence of players and right sequence of adaptations.
• Friedman Random Better Response
– Random or Asynchronous
   • Every sequence of players has a chance to occur
   • Random decision rule means that all improvements have a chance to be chosen
– Synchronous not guaranteed
• Alternate random better response (chance of choosing same action)
– Because of chance to choose same action, every sequence of players can result from every decision timing.
– Because of random choice, every improvement path has a chance of occurring

Convergence Results (Finite Games)
• If a decision rule converges under round-robin, random, or synchronous timing, then it also converges under asynchronous timing.
• Random better responses converge for the most decision timings and the most surveyed game conditions.
– Implies that non-deterministic procedural cognitive radio implementations are a good approach if you don't know much about the network.

Impact of Noise
• Noise impacts the mapping from actions to outcomes, f :A→O
• Same action tuple can lead to different outcomes
• Most noise encountered in wireless systems is theoretically unbounded.
• Implies that every outcome has a nonzero chance of being observed for a particular action tuple.
• Some outcomes are more likely to be observed than others (and some outcomes may have a very small chance of occurring)

Another DFS Example
• Consider a radio observing the spectral energy across the bands defined by the set C where each radio k is choosing its band of operation fk.
• Noiseless observation of channel ck

$o_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k)$

• Noisy observation

$o_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k) + n_i(c_k, t)$

• If radio is attempting to minimize inband interference, then noise can lead a radio to believe that a band has lower or higher interference than it does

Trembling Hand ("Noise" in Games)
• Assumes players have a nonzero chance of making an error implementing their action.
– Who has not accidentally handed over the wrong amount of cash at a restaurant?
– Who has not accidentally written a "tpyo"?
• Related to errors in observation as erroneous observations cause errors in implementation (from an outside observer's perspective).

Noisy decision rules
• Noisy utility: $u_i(a, t) = u_i(a) + n_i(a, t)$

[Figure labels: Trembling Hand, Observation Errors]

Implications of noise
• For random timing, [Friedman] shows game with noisy random better response is an ergodic Markov chain.
• Likewise other observation based noisy decision rules are ergodic Markov chains
– Unbounded noise implies chance of adapting (or not adapting) to any action
– If coupled with random, synchronous, or asynchronous timings, then CRNs with corrupted observation can be modeled as ergodic Markov chains.
– Not so for round-robin (violates aperiodicity)
• Somewhat disappointing
– No real steady-state (though unique limiting stationary distribution)

DFS Example with three access points
• 3 access nodes, 3 channels, attempting to operate in band with least spectral energy.
• Constant power
• Link gain matrix
• Noiseless observations
• Random timing

[Figure: three access nodes labeled 1, 2, 3]

Trembling Hand
• Transition Matrix, p=0.1
• Limiting distribution

Noisy Best Response
• Transition Matrix, N(0,1) Gaussian Noise
• Limiting stationary distributions

Comment on Noise and Observations
• Cardinality of goals makes a difference for cognitive radios
– Probability of making an error is a function of the difference in utilities
– With ordinal preferences, utility functions are just useful fictions
   • Might as well assume a trembling hand
• Unboundedness of noise implies that no state can be absorbing for most decision rules
• NE retains significant predictive power
– While CRN is an ergodic Markov chain, NE (and the adjacent states) remain most likely states to visit
– Stronger prediction with less noise
– Also stronger when network has a Lyapunov function
– Exception - elusive equilibria ([Hicks_04])

Stability

[Figure: trajectories in the x-y plane for two cases, "Stable, but not attractive" and "Attractive, but not stable"]

Lyapunov's Direct Method

Left unanswered: where does L come from? Can it be inferred from radio goals?

Summary
• Given a set of goals, an NE is a fixed point for all radios with those goals for all autonomously rational decision processes
• Traditional engineering analysis techniques can be applied in a game theoretic setting
– Markov chains to improvement paths
• Network must have weak FIP for autonomously rational radios to converge
– Weak FIP implies existence of absorbing Markov chain for many decision rules/timings
• In practical system, network has a theoretically nonzero chance of visiting every possible state (ergodicity), but does have unique limiting stationary distribution
– Specific distribution function of decision rules, goals
– Will be important to show Lyapunov stability
• Shortly, we'll cover potential games and supermodular games which can be shown to have FIP or weak FIP. Further, potential games have a Lyapunov function!

Designing Cognitive Radio Networks to Yield Desired Behavior

Policy, Cost Functions, Repeated Games, Supermodular Games, Potential Games

WSU May10, 2010

Policy
• Concept: Constrain the available actions so the worst cases of distributed decision making can be avoided
• Not a new concept
– Policy has been used since there's been an FCC
• What's new is assuming decision makers are the radios instead of the people controlling the radios

Policy applied to radios instead of humans
• Need a language to convey policy
– Learn what it is
– Expand upon policy later
• How do radios interpret policy
– Policy engine?
• Need an enforcement mechanism
– Might need to tie in to humans
• Need a source for policy
– Who sets it?
– Who resolves disputes?
• Logical extreme can be quite complex, but logical extreme may not be necessary.

[Figure: policies, frequency mask]

802.22 Example Policies
• Detection
– Digital TV: -116 dBm over a 6 MHz channel
– Analog TV: -94 dBm at the peak of the NTSC (National Television System Committee) picture carrier
– Wireless microphone: -107 dBm in a 200 kHz bandwidth.
• Transmitted Signal
– 4 W Effective Isotropic Radiated Power (EIRP)
– Specific spectral masks
– Channel vacation times

C. Cordeiro, L. Challapali, D. Birru, S. Shankar, "IEEE 802.22: The First Worldwide Wireless Standard based on Cognitive Radios," IEEE DySPAN2005, Nov 8-11, 2005 Baltimore, MD.

Cost Adjustments
• Concept: Centralized unit dynamically adjusts costs in radios' objective functions to ensure radios operate on desired point

$\tilde{u}_i(a) = u_i(a) + c_i(a)$

• Example: Add -12 to the use of wideband waveform

Repeated Games
• Same game is repeated
– Indefinitely
– Finitely
• Players consider discounted payoffs across multiple stages
– Stage k: $u_i^k(a^k) = \delta^k u_i(a^k)$
– Expected value over all future stages: $u_i\left((a^k)\right) = \sum_{k=0}^{\infty} \delta^k u_i(a^k)$

[Figure: Stage 1, Stage 2, ..., Stage k]

Impact of Strategies
• Rather than merely reacting to the state of the network, radios can choose their actions to influence the actions of other radios
• Threaten to act in a way that minimizes another radio's performance unless it implements the desired actions
• Common strategies
– Tit-for-tat
– Grim trigger
– Generous tit-for-tat
• Play can be forced to any "feasible" payoff vector with proper selection of punishment strategy.

Impact of Communication on Strategies
• Players agree to play in a certain manner
• Threats can force play to almost any state
– Breaks down for finite number of stages

          Nada         C            N
nada      0,0          -5,5         -100,0
c         5,-5         -1,-1        -100,-1
n         0,-100       -1,-100      -100,-100
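How a threat sustains cooperation depends on the discount factor δ. The sketch below is an illustration, not the tutorial's own model: it uses only the two-by-two corner of the stage game (actions "nada" and "c"), assumes the mutual-"c" payoff is (-1, -1), truncates the infinite sum at a long finite horizon, and uses hypothetical strategy functions. Under these assumptions, deviating against a grim-trigger opponent earns 5 once and -1 thereafter, so it pays only when 5 − δ/(1 − δ) > 0, that is, when δ < 5/6.

```python
STAGE = {("nada", "nada"): (0, 0), ("nada", "c"): (-5, 5),
         ("c", "nada"): (5, -5), ("c", "c"): (-1, -1)}


def grim_trigger(history, me):
    """Play 'nada' until the other player has ever played 'c', then play 'c' forever."""
    other = 1 - me
    return "c" if any(stage[other] == "c" for stage in history) else "nada"


def always_c(history, me):
    """Deviate from the agreement in every stage."""
    return "c"


def discounted_payoffs(strategy1, strategy2, delta, horizon=2000):
    """Sum of delta**k * u_i(a^k) over a long finite horizon for both players."""
    history = []
    totals = [0.0, 0.0]
    for k in range(horizon):
        actions = (strategy1(history, 0), strategy2(history, 1))
        u = STAGE[actions]
        totals[0] += delta ** k * u[0]
        totals[1] += delta ** k * u[1]
        history.append(actions)
    return totals


for delta in (0.5, 0.9):
    cooperate = discounted_payoffs(grim_trigger, grim_trigger, delta)[0]
    deviate = discounted_payoffs(always_c, grim_trigger, delta)[0]
    print(delta, cooperate, deviate)
```

With a patient radio (δ = 0.9) the threatened punishment outweighs the one-time gain, while with an impatient one (δ = 0.5) it does not.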

Improvement from Punishment
• Throughput/unit power gains by enforcing a common received power level at a base station
• Punishment by jamming
• Without benefit to deviating, players can operate at lower power level and achieve same throughput

A. MacKenzie and S. Wicker, "Game Theory in Communications: Motivation, Explanation, and Application to Power Control," Globecom2001, pp. 821-825.

Instability in Punishment
• Issues arise when radios aren't directly observing actions and are punishing with their actions without announcing punishment
• Eventually, a deviation will be falsely detected and punished; without signaling, this leads to a cascade of problems

V. Srivastava, L. DaSilva, "Equilibria for Node Participation in Ad Hoc Networks – An Imperfect Monitoring Approach," ICC 06, June 2006, vol 8, pp. 3850-3855

Comments on Punishment
• Works best with a common controller to announce punishment
• Problems in fully distributed system
– Need to elect a controller
– Otherwise competing punishments, without knowing other players' utilities, can spiral out of control
• Problems when actions cannot be directly observed
– Leads to Byzantine problem
• No single best strategy exists
– Strategy flexibility is important
– Significant problems with jammers (they nominally receive higher utility when "punished")
• Generally better to implement centralized controller
– Operating point has to be announced anyways

Supermodular Games
• A game such that
– Action space is a lattice
– Utility functions are supermodular
• Identification

$\frac{\partial^2 u_i}{\partial a_i \partial a_j} \ge 0 \;\; \forall i, j \in N,\; a \in A$

• NE Properties
– NE Existence: All supermodular games have a NE
– NE Identification: NE form a lattice
• Convergence
– Has weak FIP
– Best response algorithms converge
• Stability
– Unique NE is an attractive fixed point for best response

Ad-hoc power control
• Network description
• Each radio attempts to achieve a target SINR at the receiving end of its link.
• System objective is ensuring every radio achieves its target SINR

$J(\mathbf{p}) = -\sum_{k \in N} (\hat{\gamma}_k - \gamma_k)^2$

[Figure: clustered ad-hoc network with Gateway and ClusterHead nodes]

Generalized repeated game stage game
• Players – $N$
• Actions – $P_j = \left[0, p_j^{\max}\right]$
• Utility function: $u_j(o) = -\left(\hat{\gamma}_j - \gamma_j\right)^2$
• Action space formulation:

$u_j(\mathbf{p}) = -\left(\hat{\gamma}_j - 10\log_{10}\left(g_{jj} p_j\right) + 10\log_{10}\left(\sum_{k \in N \setminus j} g_{kj} p_k + N_j\right)\right)^2$

$g_{jk}$: fraction of power transmitted by $j$ that can’t be removed by the receiving end of radio $k$’s link
$N_j$: noise power at the receiving end of radio $j$’s link

Model identification & analysis
• Supermodular game
– Action space is a lattice
– $\frac{\partial^2 u_j(\mathbf{p})}{\partial p_j \partial p_k} = \frac{200\, g_{kj}}{\ln^2(10)\, p_j \left(\sum_{k \in N \setminus j} g_{kj} p_k + N_j\right)} > 0$
– Implications
  • NE exists
  • Best response converges
  • Stable if discrete action space
• Best response is also standard
– Unique NE
– Solvable (see prelim report)
– Stable (pseudo-contraction) for infinite action spaces

Best response: $\hat{B}_j(\mathbf{p}) = \frac{\hat{\gamma}_j}{\gamma_j}\, p_j$
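The supermodularity claim for this stage game can be checked by hand in two steps. The sketch below starts from the stage-game utility and uses $I_j$ as shorthand for the interference-plus-noise term at the receiving end of link $j$; the symbol $I_j$ is introduced here for brevity and does not appear on the slide.

```latex
\begin{align*}
I_j &= \sum_{k \in N \setminus j} g_{kj} p_k + N_j \\
u_j(\mathbf{p}) &= -\left(\hat{\gamma}_j - 10\log_{10}(g_{jj} p_j) + 10\log_{10} I_j\right)^2 \\
\frac{\partial u_j}{\partial p_j} &= \frac{20}{\ln(10)\, p_j}\left(\hat{\gamma}_j - \gamma_j\right) \\
\frac{\partial^2 u_j}{\partial p_j\, \partial p_k} &= \frac{20}{\ln(10)\, p_j} \cdot \frac{10\, g_{kj}}{\ln(10)\, I_j}
  = \frac{200\, g_{kj}}{\ln^2(10)\, p_j\, I_j} > 0
\end{align*}
```

The last line is positive whenever gains, powers, and noise are positive, which is the condition used to identify the game as supermodular.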

Validation
Implies all radios achieved target SINR

[Figure panels: Noiseless Best Response; Noisy Best Response]
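The noiseless best-response behavior can be illustrated with a small simulation. This is a minimal sketch, not the tutorial's simulation: it assumes SINR in linear units (the slides work in dB), a synchronous update p_j <- min(p_max, (target / sinr_j) * p_j), and a made-up 3-link gain matrix. The function names and all numeric values are hypothetical.

```python
def sinr(p, G, noise):
    """Linear SINR of each link; G[k][j] is the gain from transmitter k to receiver j."""
    n = len(p)
    out = []
    for j in range(n):
        interference = sum(G[k][j] * p[k] for k in range(n) if k != j) + noise[j]
        out.append(G[j][j] * p[j] / interference)
    return out


def best_response_power_control(G, noise, target, p_max, iters=200):
    """Every radio rescales its power toward the target SINR, capped at p_max."""
    p = [1e-3] * len(noise)  # arbitrary small starting powers
    for _ in range(iters):
        gamma = sinr(p, G, noise)
        p = [min(p_max, target / gamma[j] * p[j]) for j in range(len(p))]
    return p


# Hypothetical 3-link network with weak cross-link gains (assumed values).
G = [[1.00, 0.05, 0.02],
     [0.03, 1.00, 0.04],
     [0.02, 0.06, 1.00]]
noise = [1e-3, 1e-3, 1e-3]
p_star = best_response_power_control(G, noise, target=5.0, p_max=1.0)
print(sinr(p_star, G, noise))
```

With these assumed gains the target is feasible, so the printed SINRs should all sit at the target; with stronger cross-link gains the cap p_max would bind and some links would fall short.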


Comments on Designing Networks with Supermodular Games
• Scales well
– Sum of supermodular functions is a supermodular function
– Add additional action types, e.g., power, frequency, routing,..., as long as action space remains a lattice and utilities are supermodular
• Says nothing about desirability or stability of equilibria
• Convergence is sensitive to the specific decision rule and the ability of the radios to implement it

Potential Games
• Existence of a function (called the potential function, V) that reflects the change in utility seen by a unilaterally deviating player.
• Cognitive radio interpretation:
– Every time a cognitive radio unilaterally adapts in a way that furthers its own goal, some real-valued function increases.

[Figure: Φ(ω) versus time]

Exact Potential Game Forms
• Many exact potential games can be recognized by the form of the utility function

Implications of Monotonicity
• Monotonicity implies
– Existence of steady-states (maximizers of V)
– Convergence to maximizers of V for numerous combinations of decision timings and decision rules (all self-interested adaptations)
• Does not mean that we get good performance
– Only if V is a function we want to maximize

Other Potential Game Properties
• All finite potential games have FIP
• All finite games with FIP are potential games
– Very important for ensuring convergence of distributed cognitive radio networks
• -V is a Lyapunov function for isolated maximizers
• Stable NE solvable by maximizers of V
• Linear combination of exact potential games is an exact potential game
• Maximizer of potential game need not maximize your objective function
– Cognitive Radios’ Dilemma is a potential game

Interference Reducing Networks
• Concept
– Cognitive radio network is a potential game with a potential function that is the negation of observed network interference
• Definition
– A network of cognitive radios where each adaptation decreases the sum of each radio’s observed interference is an IRN

$\Phi(\omega) = \sum_{i \in N} I_i(\omega)$

• Implementation:
– Design DFS algorithms such that network is a potential game with Φ ∝ -V

[Figure: Φ(ω) versus time]

Bilateral Symmetric Interference
What’s good for the goose is good for the gander…
• Two cognitive radios, $j, k \in N$, exhibit bilateral symmetric interference if

$g_{jk}\, p_j\, \rho(\omega_j, \omega_k) = g_{kj}\, p_k\, \rho(\omega_k, \omega_j) \quad \forall \omega_j \in \Omega_j,\ \forall \omega_k \in \Omega_k$

• $\omega_k$ – waveform of radio k
• $p_k$ – the transmission power of radio k’s waveform
• $g_{kj}$ – link gain from the transmission source of radio k’s signal to the point where radio j measures its interference
• $\rho(\omega_k, \omega_j)$ – the fraction of radio k’s signal that radio j cannot exclude via processing (perhaps via filtering, despreading, or MUD techniques)

Image source: http://radio.weblogs.com/0120124/Graphics/geese2.jpg

Bilateral Symmetric Interference Implies an Interference Reducing Network
• Cognitive Radio Goal:

$u_i(\omega) = -I_i(\omega) = -\sum_{j \in N \setminus i} g_{ji}\, p_j\, \rho(\omega_j, \omega_i)$

• By bilateral symmetric interference

$g_{ki}\, p_k\, \rho(\omega_k, \omega_i) = g_{ik}\, p_i\, \rho(\omega_i, \omega_k) = b_{ki}(\omega_k, \omega_i) = b_{ik}(\omega_i, \omega_k)$

• Rewrite goal

$u_i(\omega) = -\sum_{k \in N \setminus i} b_{ik}(\omega_i, \omega_k)$

• Therefore a BSI game ($S_i = 0$)

$V(\omega) = -\sum_{i \in N} \sum_{k=1}^{i-1} g_{ki}\, p_k\, \rho(\omega_k, \omega_i)$

• Interference Function

$\Phi(\omega) = -2V(\omega)$

• Therefore profitable unilateral deviations increase V and decrease Φ(ω) – an IRN
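The two identities on this slide, that V is an exact potential and that Φ = -2V, can be spot-checked numerically. The sketch below is an illustration under assumptions: it uses a special case of bilateral symmetric interference in which the pair weight c[i][k] (standing in for the gain-power product) and the waveform correlation rho are each symmetric, and all data are random placeholders. The function names are hypothetical.

```python
import random


def make_bsi_game(n, n_waveforms, seed=0):
    """Random symmetric pair weights c and waveform correlations rho (placeholder data)."""
    rng = random.Random(seed)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i):
            c[i][k] = c[k][i] = rng.random()  # assumed: g_ik * p_i == g_ki * p_k
    rho = [[0.0] * n_waveforms for _ in range(n_waveforms)]
    for a in range(n_waveforms):
        for b in range(a + 1):
            rho[a][b] = rho[b][a] = rng.random()
    return c, rho


def utility(i, w, c, rho):
    """u_i(w) = -I_i(w): negated interference observed by radio i."""
    return -sum(c[i][k] * rho[w[i]][w[k]] for k in range(len(w)) if k != i)


def potential(w, c, rho):
    """V(w): every interfering pair counted once."""
    return -sum(c[i][k] * rho[w[i]][w[k]] for i in range(len(w)) for k in range(i))


def net_interference(w, c, rho):
    """Phi(w): sum of every radio's observed interference."""
    return -sum(utility(i, w, c, rho) for i in range(len(w)))


def max_potential_gap(n=6, n_waveforms=4, trials=500, seed=0):
    """Largest |delta u_i - delta V| over random unilateral deviations."""
    c, rho = make_bsi_game(n, n_waveforms, seed)
    rng = random.Random(seed + 1)
    gap = 0.0
    for _ in range(trials):
        w = [rng.randrange(n_waveforms) for _ in range(n)]
        i = rng.randrange(n)
        w2 = list(w)
        w2[i] = rng.randrange(n_waveforms)  # radio i alone changes waveform
        du = utility(i, w2, c, rho) - utility(i, w, c, rho)
        dv = potential(w2, c, rho) - potential(w, c, rho)
        gap = max(gap, abs(du - dv))
    return gap


print(max_potential_gap())
```

The printed gap should be at floating-point rounding level, which is what an exact potential requires: the deviating radio's change in utility equals the change in V.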


An IRN 802.11 DFS Algorithm
• Suppose each access node measures the received signal power and frequency of the RTS/CTS (or BSSID) messages sent by observable access nodes in the network.
• Assumed out-of-channel interference is negligible and RTS/CTS transmitted at same power

$g_{jk}\, p_j\, \sigma(f_j, f_k) = g_{kj}\, p_k\, \sigma(f_k, f_j)$

$u_i(f) = -I_i(f) = -\sum_{k \in N \setminus i} g_{ki}\, p_k\, \sigma(f_i, f_k)$

$\sigma(f_i, f_k) = \begin{cases} 1 & f_i = f_k \\ 0 & f_i \neq f_k \end{cases}$

[Flowchart: Start; pick channel to listen on, LC; listen on channel LC; RTS/CTS energy detected? (y/n); if yes, measure power of access node in message, p, note address of access node, a, and update interference table; time for decision? (y/n); if yes, apply decision criteria for new operating channel, OC, and use 802.11h to signal change in OC to clients]
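The channel-selection rule above can be sketched as a simulation in which each access node, in turn, moves to the channel where it observes the least co-channel interference. This is a rough illustration under stated assumptions, not the algorithm as fielded: equal transmit powers, symmetric distance-based gains d^(-path_loss_exp), round-robin timing, and perfect measurements. The channel count, exponent, and function name are hypothetical choices.

```python
import math
import random


def simulate_dfs(n_nodes=30, n_channels=4, path_loss_exp=3.0, rounds=20, seed=1):
    """Round-robin lowest-interference channel selection; returns (history, channels)."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n_nodes)]  # unit square
    # Symmetric gains with equal powers satisfy bilateral symmetric interference.
    g = [[0.0 if i == k else math.dist(pos[i], pos[k]) ** -path_loss_exp
          for k in range(n_nodes)] for i in range(n_nodes)]
    chan = [rng.randrange(n_channels) for _ in range(n_nodes)]

    def observed(i, f):
        """Interference node i would see on channel f from co-channel nodes."""
        return sum(g[k][i] for k in range(n_nodes) if k != i and chan[k] == f)

    def net():
        """Sum of every node's observed interference on its current channel."""
        return sum(observed(i, chan[i]) for i in range(n_nodes))

    history = [net()]
    for _ in range(rounds):
        changed = False
        for i in range(n_nodes):
            best = min(range(n_channels), key=lambda f: observed(i, f))
            if observed(i, best) < observed(i, chan[i]):
                chan[i] = best  # unilateral, self-interested adaptation
                changed = True
            history.append(net())
        if not changed:
            break  # no node wants to move: a steady state
    return history, chan


history, channels = simulate_dfs()
print(history[0], history[-1])
```

Under these assumptions the recorded net interference never increases from one adaptation to the next, which is the interference reducing network property; measurement noise or asymmetric gains would break that guarantee.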

Statistics
• 30 cognitive access nodes in European UNII bands
• Choose channel with lowest interference
• Random timing
• n=3
• Random initial channels
• Randomly distributed positions over 1 km²

[Figure: Reduction in Net Interference (dB) versus Number of Access Nodes; curves: Round-robin, Asynchronous, Legacy Devices]

Ad-hoc Network
• Possible to adjust previous algorithm to not favor access nodes over clients
• Suitable for ad-hoc networks
• CRT has IRN based distributed zero-overhead low-complexity algorithms for
– Spreading codes
– Power variations
– Subcarrier allocation
– Bandwidth variations
– Activity levels weighted by interference
– Noninteractive terms – modulation, FEC, interleaving
– Beamforming
– And combinations of the above

Comments on Potential Games
• Every network for which there is no better response interaction loop is a potential game
• Before implementing fully distributed GA, SA, or most CBR decision rules, important to show that goals and actions satisfy the potential game model
• Sum of exact potential games is itself an exact potential game
– Permits (with a little work) scaling up of algorithms that adjust single parameters to multiple parameters
• Possible to combine with other techniques
– Policy restricts action space, but subset of action space remains a potential game (see J. Neel, J. Reed, “Performance of Distributed Dynamic Frequency Selection Schemes for Interference Reducing Networks,” Milcom 2006)
– As a self-interested additive cost function is also a potential game, easy to combine with additive cost approaches (see J. Neel, J. Reed, R. Gilles, “The Role of Game Theory in the Analysis of Software Radio Networks,” SDR Forum02)
• Read more on potential games:
– Chapter 5 in Dissertation of J. Neel, available at http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/

Token Economies
• Pairs of cognitive radios exchange tokens for services rendered or bandwidth rented
• Example:
– Primary users leasing spectrum to secondary users
  • D. Grandblaise, K. Moessner, G. Vivier and R. Tafazolli, “Credit Token based Rental Protocol for Dynamic Channel Allocation,” CrownCom06.
– Node participation in peer-to-peer networks
  • T. Moreton, “Trading in Trust, Tokens, and Stamps,” Workshop on the Economics of Peer-to-Peer Systems, Berkeley, CA June 2003.
• Why it works – it’s a potential game when there’s no externality to the trade
– Ordinal potential function given by sum of utilities

Comments on Network Options
• Approaches can be combined
– Policy + potential
– Punishment + cost adjustment
– Cost adjustment + token economies
• Mix of centralized and distributed is likely best approach
• Potential game approach has lowest complexity, but cannot be extended to every problem
• Token economies require strong property rights to ensure proper behavior
• Punishment can also be implemented at a choke point in the network

Global Altruism: distributed, but more costly
• Concept: All radios distribute all relevant information to all other radios and then each independently computes the jointly optimal solution
– Proposed for spreading code allocation in Popescu04, Sung03
– Used in xG Program (Comments of G. Denker, SDR Forum Panel Session on “A Policy Engine Framework”); overhead ranges from 5%-27%
• C = cost of computation
• I = cost of information transfer from node to node
• n = number of nodes
• Distributed
– nC + n(n-1)I/2
• Centralized (election)
– C + 2(n-1)I
• Price of anarchy = 1
• May differ if I is asymmetric
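The two cost expressions are easy to compare numerically. The helper names below are hypothetical and the units of C and I are arbitrary; the docstring for the centralized case gives one reading of the 2(n-1)I term (reports in, decisions out), which the slide does not spell out.

```python
def distributed_cost(n, C, I):
    """Global altruism: n computations plus n(n-1)/2 pairwise information transfers."""
    return n * C + n * (n - 1) * I / 2


def centralized_cost(n, C, I):
    """Elected controller: one computation plus 2(n-1) transfers (assumed: reports in, decisions out)."""
    return C + 2 * (n - 1) * I


# Example with unit costs and 10 nodes (assumed values).
print(distributed_cost(10, 1, 1), centralized_cost(10, 1, 1))
```

For n = 10 and C = I = 1 this gives 55 for the distributed scheme against 19 for the centralized one, and the gap widens quadratically in n.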

Improving Global Altruism
• Global altruism is clearly inferior to a centralized solution for a single problem.
• However, suppose radios reported information to, and used information from, a common database
– n(n-1)I/2 => 2nI
• And suppose different radios are concerned with different problems with costs C1,…,Cn
• Centralized
– Resources = 2(n-1)I + sum(C1,…,Cn)
– Time = 2(n-1)I + sum(C1,…,Cn)
• Distributed
– Resources = 2nI + sum(C1,…,Cn)
– Time = 2I + max(C1,…,Cn)
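The resource and time expressions above can be evaluated side by side. This is a small sketch with hypothetical function names and made-up problem costs, in arbitrary units.

```python
def centralized(n, I, costs):
    """One controller gathers, solves every problem in sequence, and reports back."""
    total = 2 * (n - 1) * I + sum(costs)
    return {"resources": total, "time": total}


def distributed_with_database(n, I, costs):
    """Radios share a common database and solve their own problems in parallel."""
    return {"resources": 2 * n * I + sum(costs), "time": 2 * I + max(costs)}


# Four radios with assumed problem costs C1..C4 and unit transfer cost.
costs = [3, 1, 2, 2]
print(centralized(4, 1, costs))
print(distributed_with_database(4, 1, costs))
```

With these numbers the distributed scheme spends slightly more resources (16 versus 14) but finishes in far less time (5 versus 14), because the computations run in parallel.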


Comments on Cost Adjustments
• Permits more flexibility than policy
– If a radio really needs to deviate, then it can
• Easy to turn off and on as a policy tool
– Example: protected user shows up in a channel, cost to use that channel goes up
– Example: prioritized user requests channel, other users’ cost to use prioritized user’s channel goes up (and back down when done)

Example Application:
• Overlay network of secondary users (SU) free to adapt power, transmit time, and channel
• Without REM:
– Decisions solely based on link SINR
• With REM
– Radios effectively know everything

Upshot: A little gain for the secondary users; big gain for primary users

From: Y. Zhao, J. Gaeddert, K. Bae, J. Reed, “Radio Environment Map Enabled Situation-Aware Cognitive Radio Learning Algorithms,” SDR Forum Technical Conference 2006.

Comments on Radio Environment Map
• Local altruism also possible
– Less information transfer
• Like policy, effectively needs a common language
• Nominally could be centralized or distributed database
• Read more:
– Y. Zhao, B. Le, J. Reed, “Infrastructure Support – The Radio Environment MAP,” in Cognitive Radio Technology, B. Fette, ed., Elsevier 2006.

5/11/2010


Summary and Conclusions
• Summary of Critical Concepts
• The Future Role of Game Theory in the Design and Regulation of Dynamic Spectrum Access Networks
• Topics for Further Study and Research

WSU May 10, 2010

What does game theory bring to the design of cognitive radio networks? (1/2)
• A natural “language” for modeling cognitive radio networks
• Permits analysis of ontological radios
– Only need to know the goals and that radios will adapt towards their goals
• Simplifies analysis of random procedural radios
• Permits simultaneous analysis of multiple decision rules – only need goal
• Provides condition to be assured of possibility of convergence for all autonomously myopic cognitive radios (weak FIP)

What does game theory bring to the design of cognitive radio networks? (2/2)
• Provides condition to be assured of convergence for all autonomously myopic cognitive radios (FIP, not synchronous timing)
• Rapid analysis
– Verify that goals and actions satisfy a single model, and steady-states, convergence, and stability follow
• An intuition as to what conditions will be needed to field successful cognitive radio decision rules.
• A natural understanding of distributed interactive behavior which simplifies the design of low complexity distributed algorithms

Game Models of Cognitive Radio Networks
• Almost as many models as there are algorithms
• Normal form game excellent for capturing single iteration of a complex system
• Most other models add features to this model
– Time, decision rules, noisy observations, natural states
• Some can be recast as a normal form game
– Extensive form game

• Normal Form Game
– ⟨N, A, {ui}⟩
• Supermodular Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} \geq 0$
• Potential Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} = \frac{\partial^2 u_j(a)}{\partial a_j \partial a_i}$
• Repeated Game
– ⟨N, A, {ui}, {di}⟩
• Asynchronous Game
– ⟨N, A, {ui}, {di}, T⟩
• Extensive Form Game
– ⟨N, A, {ui}, {di}, T⟩
• TU Game
– ⟨N, v⟩
• Bargaining Game
– ⟨F, v⟩

Steady-states
• Different game models have different steady-state concepts
• Games can have many, one, or no steady-states
• Nash equilibrium (and its variants) is most commonly applied concept
– Excellent for distributed noncollaborative algorithms
• Games with punishment and coalitional games tend to have a very large number of equilibria
• Game theory permits analysis of steady-states without knowing specific decision rules

• Nash Equilibrium
• Strong Nash Equilibrium
• Core
• Shapley Value
• Nash Bargaining Solution

Optimality
• Numerous different notions of optimality
• Many are contradictory
• Use whatever metric makes sense

• Pareto Efficiency
• Objective Maximization
• Gini Index
• Shapley Value
• Nash Bargaining Solution

Convergence
• Showing existence of steady-states is insufficient; need to know if radios can reach those states
• FIP (potential games) gives the broadest convergence conditions
• Random timing actually helps convergence

Noise
• Unbounded noise causes all networks to theoretically behave as ergodic Markov chains
• Important to show Lyapunov stability
• Noisy observations cause noisy implementation to an outside observer
– Trembling hand

Game Theory and Design
• Numerous techniques for improving the behavior of cognitive radio networks
• Techniques can be combined
• Potential games yield lowest complexity implementations
– Judicious design of goals and actions
• Practical limitations limit effectiveness of punishment
– Observing actions
– Likely best when a referee exists
• Policy can limit the worst effects, but doesn’t really address optimality or convergence issues

• Supermodular games
– Steady state exists
– Best response convergence
• Potential games
– Identifiable steady-states
– All self-interested decision rules converge
– Lyapunov function exists for isolated equilibria
• Punishment
– Can enforce any action tuple
– Can be brittle when distributed
• Policy
– Limits worst case performance
• Cost function
– Reshapes preferences
– Could damage underlying structure if not a self-interested cost
• Centralized
– Can theoretically realize any result
– Consumes overhead
– Slower reactions

Future Directions in Game Theory and Design
• Integrate policy and potential games
• Integration of coalitional and distributed forms
• Increasing dimensionality of action sets
– Cross-layer
• Integration of dynamic and hierarchical policies and games

Future Direction in Regulation
• Can incorporate optimization into policy by specifying goals
• In theory, correctly implementing goals, correctly implementing actions, and exhaustive self-interested adaptation is enough to predict behavior (at least for potential games)
– Simpler policy certification
• Provable network behavior

Avenues for Future Research on Game Theory and CRNs
• Integration of bargaining, centralized, and distributed algorithms into a common framework
• Cross-layer algorithms
• Better incorporating performance of classification techniques into behavior
• Asymmetric potential games
• Bargaining algorithms for cognitive radio
• Improving the brittleness of punishment in distributed implementations with imperfect observations

• Imperfection in observations in general
• Time varying game models while inferring convergence, stability…
• Combination of policy, potential games, coalition formation, and token economies
• Can be modeled as a game with two types of players
– Distributed cognitive radios
– Dynamic policy provider

Questions?


www.crtwireless.com
