Game Theory in the Analysis and Design of Cognitive Radio Networks
Download the slides: http://www.crtwireless.com/WSU_Tutorial.html
James Neel, (540) 230-6012, www.crtwireless.com
© Cognitive Radio Technologies, 2010
Wright State University, May 12, 2010
Cognitive Radio Technologies
Business Details
• Founded in 2007 by Dr. James Neel and Professor Jeff Reed to commercialize cognitive radio research out of Virginia Tech
• 6 employees / contractors
• 07 Sales = 64k, 08 Sales = 127k
• 09 Sales = 394k, 10 (contracts) = 960k
Business Model
• Partner with established companies to spin in cognitive radio research
  – Navy SBIR 08-099 => L3-Nova
  – Air Force SBIR 083-160 => GDC4S
• Contract research and consulting related to cognitive radio and software radio
  – DARPA, DTI, CERDEC, Global Electronics
• Position for entry in emerging wireless markets
  – Cognitive Zigbee
Selected Projects
CR Projects
• Distributed spectrum management for WNW
• White Space Networking
• Cognitive gateway with ad-hoc extensions
SDR Projects
• Prototype SDR for software controllable antenna
• Fundamental limits to SDR performance
• Rapid estimation of SDR resources
[Figure: use of 6 MHz TV channels along the frequency axis f. Labels: TV incumbent user; microphone user; fractional use of TV channel; guard band; unused (6 MHz); incumbent or other CR user (except microphone user); other CR user or non-microphone incumbent (regulations permitting).]
CRT’s Strengths
• Analysis of networked cognitive radio algorithms (game theory)
• Design of low complexity, low overhead (scalable), convergent and stable cognitive radio algorithms
  – Infrastructure, mesh, and ad-hoc networks
  – DFS, TPC, AIA, beamforming, routing, topology formation
[Figure: three plots versus iteration: frequency adjustments (channel), average link interference I_i(f) (dBm), and net interference Φ(f) (dBm).]
[Figure: steady-state interference levels (dBm) versus number of links. Curves: typical worst case without algorithm; average without algorithm; typical worst case with algorithm; average with algorithm; collision threshold.]
Tutorial Background
• Minor modifications to tutorial given at DySPAN in 2007
• Most material from my three week defense
  – Very understanding committee
  – Dissertation online @ http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/
  – Original defense slides @ http://www.mprg.org/people/gametheory/Meetings.shtml
• Other material from training short course I gave in summer 2003
  – http://www.mprg.org/people/gametheory/Class.shtml
• Eventually will be formalized into a book
  – Been saying that for a while…
• Soft copy of tutorial at
  – http://www.crtwireless.com/WSU_Tutorial.html
Approximate Tutorial Schedule
Time          Material
08:00-09:00   Cognitive Radio and Game Theory (51)
09:00-09:45   Steady-state Solution Concepts (38)
09:45-10:00   Performance Metrics (11)
10:00-10:15   Break
10:15-11:00   Notion of Time and Imperfections in Games (34)
11:00-11:45   Using Game Theory to Design Cognitive Radio Networks (28)
11:45-12:00   Summary (14)
Break ~20 min, 1000-1020
General Comments on Tutorial
• “This talk is intended to provide attendees with knowledge of the most important game theoretic concepts employed in state-of-the-art dynamic spectrum access networks.”
• Lots of concepts, no proofs – cramming 2-3 semesters of game theory into 3.5 hours
• Tutorial can provide quick reference for concepts discussed at conference
• More leisurely sources of information:
  – D. Fudenberg, J. Tirole, Game Theory, MIT Press, 1991.
  – R. Myerson, Game Theory: Analysis of Conflict, Harvard University Press, 1991.
  – M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
  – J. Neel, J. Reed, A. MacKenzie, Cognitive Radio Network Performance Analysis, in Cognitive Radio Technology, B. Fette, ed., Elsevier, August 2006.
Image modified from http://hacks.mit.edu/Hacks/by_year/1991/fire_hydrant/
Cognitive Radio and Game Theory
Cognitive Radio, Game Theory, Relationship Between the Two
Basic Game Concepts and Cognitive Radio Networks
• Assumptions about Cognitive Radios and Cognitive Radio Networks
  – Definition and concept of cognitive radio as used in this presentation
  – Design Challenges Posed by Cognitive Radio Networks
  – A Model of a Cognitive Radio Network
• High Level View of Game Theory
  – Common Components
  – Common Models
• Relationship between Game Theory and Cognitive Radio Networks
  – Modeling a Generic Cognitive Radio Network as a Game
  – Differences in Typical Assumptions
  – Limitations of Application
Cognitive Radio: Basic Idea
• Software radios permit network or user to control the operation of a software radio
• Cognitive radios enhance the control process by adding
  – Intelligent, autonomous control of the radio
  – An ability to sense the environment
  – Goal driven operation
  – Processes for learning about environmental parameters
  – Awareness of its environment
    • Signals
    • Channels
  – Awareness of capabilities of the radio
  – An ability to negotiate waveforms with other radios
[Figure: software radio stack. Labels: control plane; waveform software; software arch services; OS; board APIs; board package (RF, processors).]
Conceptual Operation
OODA Loop: (continuously)
• Observe outside world
• Orient to infer meaning of observations
• Adjust waveform as needed to achieve goal
• Implement processes needed to change waveform
Other processes: (as needed)
• Adjust goals (Plan)
• Learn about the outside world, needs of user,…
[Figure: cognition cycle. Labels: Outside World; Observe (parse stimuli, pre-process); Orient (infer from context, infer from radio model, establish priority); urgent, normal, and immediate paths; Plan (select alternate goals); Learn (new states); Decide (generate “best” waveform); Act (allocate resources, initiate processes, negotiate protocols); user driven (buttons); autonomous; states.]
Figure adapted from Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp 3-10.
© Cognitive Radio Technologies, 2007
Implementation Classes
• Weak cognitive radio
  – Radio’s adaptations determined by hard coded algorithms and informed by observations
  – Many may not consider this to be cognitive (see discussion related to Fig 6 in 1900.1 draft)
• Strong cognitive radio
  – Radio’s adaptations determined by conscious reasoning
  – Closest approximation is the ontology reasoning cognitive radios
In general, strong cognitive radios have potential to achieve both much better and much worse behavior in a network, but may not be realizable.
Brilliant Algorithms and Cognitive Engines
• Most research focuses on development of algorithms for:
  – Observation
  – Decision processes
  – Learning
  – Policy
  – Context Awareness
• Some complete OODA loop algorithms
• In general different algorithms will perform better in different situations
• Cognitive engine can be viewed as a software architecture
• Provides structure for incorporating and interfacing different algorithms
• Mechanism for sharing information across algorithms
• No current implementation standard
Example Architecture from CWT
[Figure: cognitive engine block diagram. Recoverable labels: radio (performance API, hardware/platform API, CE-Radio interface); radio-domain cognition (radio resource monitor, radio performance monitor, WMS, channel identifier, waveform recognizer, search space config); cognitive system controller and cognitive system module; user domain and policy domain (user preference, local service facility, security); user model, policy model, and evolvers; decision maker; knowledge base (short term memory, long term memory); WSGA parameter set, initial chromosomes, objectives and weights, system chromosome, regulatory information; simulated versus actual meters; X86/Unix terminal; stages marked observation, orientation, decision, action, learning, and models.]
DFS in 802.16h
• Drafts of 802.16h defined a generic DFS algorithm which implements observation, decision, action, and learning processes
• Very simple implementation
[Figure: DFS flowchart with steps marked as observation, decision/action, or learning. Steps: channel availability check on next channel; available? (if no, choose different channel); service in function; in service monitoring of operating channel; detection? (if yes, stop transmission, start channel exclusion timer, and select and change to new available channel in a defined time with a max. transmission time); log of channel availability; channel unavailable for channel exclusion time; background in service monitoring (on non-operational channels).]
Modified from Figure h1, IEEE 802.16h-06/010, Draft IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems Amendment for Improved Coexistence Mechanisms for License-Exempt Operation, 2006-03-29
Other Cognitive Radio Efforts
• TVWS PHY/MAC
  – 802.22 TVWS
    • 802.22.1 beacons
  – 802.11af WhiteFi
  – CogNeA
• 802.19.1 TVWS Coexistence
• WhiteSpace Database Group
• Self-Organizing Networks (3GPP / NGMN)
• 802.21 Media Independent Handoffs
• SCC41
  – 1900.4 Architectural building blocks
  – 1900.5 Policy Languages
  – 1900.6 Sensing interfaces
• WinnForum (SDRF)
  – MLM – metalanguages
  – CRWG – database, IPA
• Government
  – NTIA testbed
  – DARPA: xG, WNAN
  – Various service efforts
  – NIJ Interoperability
Used cognitive radio definition
• A cognitive radio is a radio whose control processes permit the radio to leverage situational knowledge and intelligent processing to autonomously adapt towards some goal.
• Intelligence as defined by [American Heritage_00] as “The capacity to acquire and apply knowledge, especially toward a purposeful goal.”
  – To eliminate some of the mess, I would love to just call cognitive radio “intelligent” radio, i.e., a radio with the capacity to acquire and apply knowledge especially toward a purposeful goal
Cognitive Networks
• Rather than having intelligence reside in a single device, intelligence can reside in the network
• Effectively the same as a centralized approach
• Gives greater scope to the available adaptations
  – Topology, routing
  – Conceptually permits adaptation of core and edge devices
• Can be combined with cognitive radio for mix of capabilities
• Focus of E2R program
R. Thomas et al., “Cognitive networks: adaptation and learning to achieve end-to-end performance objectives,” IEEE Communications Magazine, Dec. 2006
The Interaction Problem
• Outside world is determined by the interaction of numerous cognitive radios
• Adaptations spawn adaptations
Issues Can Occur When Multiple Intelligences Interact
• Crash of May 6, 2010
  – Not just a fat finger
  – Combination of bad economic news, big bet by Universa, and interactions of traders and computers
  – Image: http://www.legitreviews.com/images/reviews/news/dow_drop.jpg
• Housing Bubble
  – Bounce up instead of down
  – Slower interactions lead to slower changes
  – Also indicative of the role beliefs play in instability
  – Image: http://www.nytimes.com/imagepages/2006/08/26/weekinreview/27leon_graph2.html
In heavily loaded networks, a single vacation can spawn an infinite adaptation process
• Suppose
  – g31 > g21; g12 > g32; g23 > g13
• Without loss of generality
  – g31, g12, g23 = 1
  – g21, g32, g13 = 0.5
• Infinite Loop!
  – 4,5,1,3,2,6,4,…
Interference Characterization
Index   Chan.     Interf.
0       (0,0,0)   (1.5,1.5,1.5)
1       (0,0,1)   (0.5,1,0)
2       (0,1,0)   (1,0,0.5)
3       (0,1,1)   (0,0.5,1)
4       (1,0,0)   (0,0.5,1)
5       (1,0,1)   (1,0,0.5)
6       (1,1,0)   (0.5,1,0)
7       (1,1,1)   (1.5,1.5,1.5)
Phone Image: http://www1.istockphoto.com/file_thumbview_approve/2820949/2/istockphoto_2820949_dect_phone.jpg
Cradle Image: http://www.skypejournal.com/blog/archives/images/AVM_7170_D.jpg
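The loop in this example can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the interference a link sees is taken to be the sum of the gains from the other links on its channel (which reproduces the table), one link adapts at a time, and a link switches channels only when that strictly lowers its own interference. The function names are illustrative and not from the slides.

```python
# GAIN[(j, k)] is the interference link j causes at link k when both
# occupy the same channel (g_jk in the example above).
GAIN = {(3, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0,
        (2, 1): 0.5, (3, 2): 0.5, (1, 3): 0.5}
LINKS = (1, 2, 3)


def interference(chan, k):
    """Interference seen by link k for the channel tuple chan."""
    return sum(GAIN[(j, k)] for j in LINKS
               if j != k and chan[j - 1] == chan[k - 1])


def adapt_once(chan):
    """Let the first link that can lower its interference switch channels."""
    for k in LINKS:
        flipped = list(chan)
        flipped[k - 1] = 1 - flipped[k - 1]
        flipped = tuple(flipped)
        if interference(flipped, k) < interference(chan, k):
            return flipped
    return chan  # no link wants to move: a steady state


def state_index(chan):
    """Table index: the channel tuple read as a binary number."""
    return 4 * chan[0] + 2 * chan[1] + chan[2]


def trajectory(start, steps):
    """Indices of the channel tuples visited over the given number of steps."""
    states = [start]
    for _ in range(steps):
        states.append(adapt_once(states[-1]))
    return [state_index(s) for s in states]


if __name__ == "__main__":
    print(trajectory((1, 0, 0), 6))  # [4, 5, 1, 3, 2, 6, 4]
```

In every state of the cycle exactly one link can improve by switching, so the order in which links are polled does not change the outcome here.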
Generalized Insights from the DECT Example
• If # links / clusters > # channels, decentralized channel choices will have a non-zero looping probability
• As # links / clusters → ∞, looping probability goes to 1
  – 2 channels: $p(\text{loop}) \ge 1 - (3/4)^{\binom{n}{3}}$
  – k channels: $p(\text{loop}) \ge 1 - \left(1 - 2^{-k+1}\right)^{\binom{n}{k+1}}$
• Can be mitigated by increasing # of channels (DECT has 120) or reducing frequency of adaptations (DECT is every 30 minutes)
  – Both waste spectrum
  – And we’re talking 100’s of ms for vacation times
• “Centralized” solutions become distributed as networks scale
  – “Rippling” in Cisco WiFi Enterprise Networks
    • www.hubbert.org/labels/Ripple.html
• Also shows up in more recent proposals
  – Recent White Spaces paper from Microsoft
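To get a feel for how quickly the two-channel bound approaches 1, it can be evaluated directly. The sketch below assumes the bound reads 1 − (3/4)^C(n,3), with n the number of links; the function name is illustrative.

```python
from math import comb


def loop_probability_bound(n_links):
    """Lower bound on looping probability with 2 channels: 1 - (3/4)^C(n, 3)."""
    return 1.0 - 0.75 ** comb(n_links, 3)


if __name__ == "__main__":
    for n in (3, 4, 5, 10):
        print(n, round(loop_probability_bound(n), 4))
```

With three links the bound is 0.25. It grows rapidly because the number of link triples grows as n cubed.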
Locally optimal decisions that lead to globally undesirable networks
• Scenario: Distributed SINR maximizing power control in a single cluster
• For each link, it is desirable to increase transmit power in response to increased interference
• Steady state of network is all nodes transmitting at maximum power
[Figure: plots of power and SINR.]
Insufficient to consider only a single link, must consider interaction
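A small simulation illustrates the escalation. Everything numeric here is an assumption made for illustration (three links sharing one receiver, the gains, the noise power, and a discrete set of power levels); only the decision rule, each link picking the power that maximizes its own SINR, comes from the scenario above.

```python
POWER_LEVELS = [0.1, 0.5, 1.0]  # assumed available transmit powers
GAINS = [1.0, 0.8, 0.6]         # assumed link gains to the shared receiver
NOISE = 0.01                    # assumed noise power


def sinr(powers, i):
    """SINR of link i when every other link is treated as interference."""
    interference = sum(GAINS[j] * powers[j]
                       for j in range(len(powers)) if j != i)
    return GAINS[i] * powers[i] / (NOISE + interference)


def best_power(powers, i):
    """Power level that maximizes link i's own SINR, others held fixed."""
    def sinr_if(level):
        trial = list(powers)
        trial[i] = level
        return sinr(trial, i)
    return max(POWER_LEVELS, key=sinr_if)


def run(powers, sweeps=3):
    """Round-robin adaptation: each link in turn picks its best power."""
    powers = list(powers)
    for _ in range(sweeps):
        for i in range(len(powers)):
            powers[i] = best_power(powers, i)
    return powers


if __name__ == "__main__":
    start = [0.1, 0.1, 0.1]
    final = run(start)
    print(final)
    print([round(sinr(start, i), 3) for i in range(3)])
    print([round(sinr(final, i), 3) for i in range(3)])
```

With these assumed numbers every link ends at the maximum level, and each link's SINR is only slightly higher than when all links used the lowest level, at ten times the transmit power.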
Potential Problems with Networked Cognitive Radios
Distributed
• Infinite recursions
• Instability (chaos)
• Vicious cycles
• Adaptation collisions
• Equitable distribution of resources
• Byzantine failure
• Information distribution
Centralized
• Signaling Overhead
• Complexity
• Responsiveness
• Single point of failure
Network Analysis Objectives
1. Steady state characterization
2. Steady state optimality
3. Convergence
4. Stability/Noise
5. Scalability
[Figure: action space with axes “Radio 1’s available actions” and “Radio 2’s available actions”, showing action tuples a1, a2, a3 and equilibria NE1, NE2, NE3.]
Steady State Characterization
• Is it possible to predict behavior in the system?
• How many different outcomes are possible?
Optimality
• Are these outcomes desirable?
• Do these outcomes maximize the system target parameters?
Convergence
• How do initial conditions impact the system steady state?
• What processes will lead to steady state conditions?
• How long does it take to reach the steady state?
Stability/Noise
• How do system variations/noise impact the system?
• Do the steady states change with small variations/noise?
• Is convergence affected by system variations/noise?
Scalability
• As the number of devices increases,
  – How is the system impacted?
  – Do previously optimal steady states remain optimal?
General Model (Focus on OODA Loop Interactions)
• Cognitive Radios
• Set N
• Particular radios, i, j
General Model (Focus on OODA Loop Interactions)
Actions
• Different radios may have different capabilities
• May be constrained by policy
• Should specify each radio’s available actions to account for variations
• Actions for radio i
  – Ai
General Model (Focus on OODA Loop Interactions)
Decision Rules
• Maps observations to actions
  – di: O→Ai
  – Implies very simple, deterministic function, e.g., standard interference function
• Intelligence implies that these actions further the radio’s goal
  – ui: O→R
• Interesting problem: simultaneously modeling behavior of ontological and procedural radios
Comments on Timing
• When decisions are made also matters, and different radios will likely make decisions at different times
• Tj – when radio j makes its adaptations
  – Generally assumed to be an infinite set
  – Assumed to occur at discrete time
    • Consistent with DSP implementation
• T=T1∪T2∪⋅⋅⋅∪Tn
• t ∈ T
Decision timing classes
• Synchronous
  – All at once
• Round-robin
  – One at a time in order
  – Used in a lot of analysis
• Random
  – One at a time in no order
• Asynchronous
  – Random subset at a time
  – Least overhead for a network
Cognitive Radio Network Modeling Summary
• Decision making radios: i,j ∈ N, |N| = n
• Actions for each radio: A=A1×A2×⋅⋅⋅×An
• Observed Outcome Space: O
• Goals: uj:O→R (uj:A→R)
• Decision Rules: dj:O→Aj (dj:A→Aj)
• Timing: T=T1∪T2∪⋅⋅⋅∪Tn
• Network: ⟨N, A, {uj}, {dj}, T⟩
Basic Game Components
1. A (well-defined) set of 2 or more players
2. A set of actions for each player.
3. A set of preference relationships for each player for each possible action tuple.
• More elaborate games exist with more components but these three must always be there.
• Some also introduce an outcome function which maps action tuples to outcomes which are then valued by the preference relations.
• Games with just these three components (or a variation on the preference relationships) are said to be in Normal form or Strategic Form
Set of Players (decision makers)
• N – set of n players consisting of players “named” {1, 2, 3,…,i, j,…,n}
• Note the n does not mean that there are 14 players in every game.
• Other components of the game that “belong” to a particular player are normally indicated by a subscript.
• Generic players are most commonly written as i or j.
• Usage: N is the SET of players, n is the number of players.
• N \ i = {1,2,…,i-1, i+1,…, n}: all players in N except for i
Actions
Ai – Set of available actions for player i
ai – A particular action chosen by i, ai ∈ Ai
A – Action Space, Cartesian product of all Ai
  A = A1 × A2 × · · · × An
a – Action tuple – a point in the Action Space
A-i – Another action space A formed from
  A-i = A1 × A2 × · · · × Ai-1 × Ai+1 × · · · × An
a-i – A point from the space A-i
A = Ai × A-i
Example Two Player Action Space
A1 = A2 = [0, ∞)
A = A1 × A2
[Figure: the plane A with axes A1 = A-2 and A2 = A-1, showing the points a (a1 = a-2, a2 = a-1) and b (b1 = b-2, b2 = b-1).]
Preference Relations (1/2)
Preference Relation expresses an individual player’s desirability of one outcome over another (a binary relationship)
$o \succsim_i o^*$: o is preferred at least as much as o* by player i
$\succsim_i$ Preference Relationship (prefers at least as much as)
$\succ_i$ Strict Preference Relationship (prefers strictly more than): $o \succ_i o^*$ iff $o \succsim_i o^*$ but not $o^* \succsim_i o$
$\sim_i$ “Indifference” Relationship (prefers equally): $o \sim_i o^*$ iff $o \succsim_i o^*$ and $o^* \succsim_i o$
Preference Relations (2/2)
• Games generally assume the relationship between actions and outcomes is invertible so preferences can be expressed over action vectors.
• Preferences are really an ordinal relationship
  – Know that player prefers one outcome to another, but quantifying by how much introduces difficulties
Utility Functions (1/2)
(Objective Fcns, Payoff Fcns)
A mathematical description of preference relationships.
Maps action space to set of real numbers: $u_i : A \to \mathbb{R}$
Preference Relation then defined as
$a \succsim_i a^*$ iff $u_i(a) \ge u_i(a^*)$
$a \succ_i a^*$ iff $u_i(a) > u_i(a^*)$
$a \sim_i a^*$ iff $u_i(a) = u_i(a^*)$
Utility Functions (2/2)
By quantifying preference relationships all sorts of valuable mathematical operations can be introduced.
Also note that the quantification operation is not unique as long as relationships are preserved. Many map preference relationships to [0,1].
Example
Jack prefers Apples to Oranges
$\text{Apples} \succ_{\text{Jack}} \text{Oranges}$, so $u_{\text{Jack}}(\text{Apples}) > u_{\text{Jack}}(\text{Oranges})$
a) uJack(Apples) = 1, uJack(Oranges) = 0
b) uJack(Apples) = -1, uJack(Oranges) = -7.5
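The non-uniqueness point can be checked mechanically: two utility assignments describe the same preferences whenever they rank the outcomes identically. A minimal sketch, with illustrative names, using the two assignments for Jack:

```python
def preference_order(utility):
    """Rank outcomes from most to least preferred under a utility function."""
    return sorted(utility, key=utility.get, reverse=True)


# The two quantifications (a) and (b) of Jack's preferences.
U_JACK_A = {"Apples": 1.0, "Oranges": 0.0}
U_JACK_B = {"Apples": -1.0, "Oranges": -7.5}

if __name__ == "__main__":
    print(preference_order(U_JACK_A))
    print(preference_order(U_JACK_B))
```

Both assignments give the same ordering, so either can stand in for the underlying ordinal preference.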
Normal Form Games
(Strategic Form Games)
In normal form, a game consists of three primary components
$G = \langle N, A, \{u_i\} \rangle$
N – Set of Players
Ai – Set of Actions Available to Player i
A – Action Space, $A = A_1 \times A_2 \times \cdots \times A_n$
{ui} – Set of Individual Objective Functions, $u_i : A \to \mathbb{R}$
Normal Form Games in Matrix Representation
Useful for representing 2 player games with finite action sets.
Player 1’s actions are indexed by rows.
Player 2’s actions are indexed by columns.
Each entry is the payoff vector, (u1, u2), corresponding to the action tuple
N = {1,2}, A1 = {a1,b1}, A2 = {a2,b2}
        a2                          b2
a1      u1(a1, a2), u2(a1, a2)      u1(a1, b2), u2(a1, b2)
b1      u1(b1, a2), u2(b1, a2)      u1(b1, b2), u2(b1, b2)
Cognitive radios are naturally modeled as players in a game
[Figure: cognition cycle (observe, orient, plan, learn, decide, act, outside world) annotated with the game components: goal, utility function, utility function arguments, outcome space, action sets, decision rules.]
Adapted from Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp 3-10.
Interaction is naturally modeled as a game
[Figure: Radio 1 and Radio 2 each apply decision rules to choose actions; the actions form the action space; $f : A \to O$ (informed by communications theory) maps the action space to the outcome space $(\hat{\gamma}_1, \hat{\gamma}_2)$, which the radios value through $u_1(\hat{\gamma}_1)$ and $u_2(\hat{\gamma}_2)$.]
When Game Theory can be Applied
If distributed adaptation doesn’t occur, it’s not a game
Level
0 SDR
1 Goal Driven
2 Context Aware
3 Radio Aware
4 Planning
5 Negotiating
6 Learns Environment
7 Adapts Plans
8 Adapts Protocols
Unconstrained action sets (radios can make up new adaptations) or undefined goals (utility functions) make analysis impractical
Game Theory applies to:
1. Adaptive aware radios
2. Cognitive radios that learn about their environment
[Figure: cognition cycle (observe, orient, plan, learn, decide, act, outside world) with a region of the level list marked “Suitable for game theory analysis”.]
Conditions for Applying Game Theory to CRNs
• Conditions for rationality
  – Well defined decision making processes
  – Expectation of how changes impact performance
• Conditions for a nontrivial game
  – Multiple interactive decision makers
  – Nonsingleton action sets
• Conditions generally satisfied by distributed dynamic CRN schemes
Example Application Appropriateness
• Inappropriate applications
  – Cellular Downlink power control (single cell)
  – Site Planning
  – A single cognitive network
• Appropriate applications
  – Multiple interactive cognitive networks
  – Distributed power control on non-orthogonal waveforms
    • Ad-hoc power control
    • Cell breathing
  – Adaptive MAC
  – Distributed Dynamic Frequency Selection
  – Network formation (localized objectives)
Some differences between game models and cognitive radio network model
• Assuming numerous iterations, normal form game only has a single stage.
  – Useful for compactly capturing modeling components at a single stage
  – Normal form game properties will be exploited in the analysis of other games
  – Other game models discussed throughout this presentation
              Player        Cognitive Radio
Knowledge     Knows A       Can learn O (may know or learn A)
f : A → O     Invertible    Not invertible (noise)
              Constant      May change over time (though relatively fixed for short periods)
              Known         Has to learn
Preferences   Ordinal       Cardinal (goals)
Summary
• Adaptations of cognitive radios interact
  – Adaptations can have unexpected negative results
    • Infinite recursions, vicious cycles
  – Insufficient to consider behavior of only a single link in the design
• Behavior of collection of radios can be modeled as a game
• Some differences in models and assumptions but high level mapping is fairly close
• As we look at convergence, performance, collaboration, and stability, we’ll extend the model
Equilibrium Concepts
Nash Equilibria, Mixed Strategy Equilibria, Coalitional Games, the Core, Shapley Value, Nash Bargaining
WSU May 10, 2010
Steady-states
• Recall model of ⟨N,A,{di},T⟩ which we characterize with the evolution function d
• Steady-state is a point where a* = d(a*) for all t ≥ t*
• Obvious solution: solve for fixed points of d.
• For non-cooperative radios, if a* is a fixed point under synchronous timing, then it is under the other three timings (round-robin, random, asynchronous)
• Works well for convex action spaces
  – Not always guaranteed to exist
  – Value of fixed point theorems
• Not so well for finite spaces
  – Generally requires exhaustive search
Nash Equilibrium
“A steady-state where each player holds a correct expectation of the other players’ behavior and acts rationally.” - Osborne
An action vector from which no player can profitably unilaterally deviate.
Definition
An action tuple a is a NE if for every i ∈ N, $u_i(a_i, a_{-i}) \ge u_i(b_i, a_{-i})$ for all bi ∈ Ai.
Note showing that a point is a NE says nothing about the process by which the steady state is reached. Nor anything about its uniqueness.
Also note that we are implicitly assuming that only pure strategies are possible in this case.
Examples
• Cognitive Radios’ Dilemma
  – Two radios have two signals to choose between {n,w} and {N,W}
  – n and N do not overlap
  – Higher throughput from operating as a high power wideband signal when other is narrowband
• Jamming Avoidance
  – Two channels
  – No NE
                    Jammer
                    0          1
Transmitter   0     (-1,1)     (1,-1)
              1     (1,-1)     (-1,1)
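For small finite games the NE definition can be checked by exhaustive search over action tuples. The sketch below does this for the jamming avoidance game, assuming the transmitter is the first player and the jammer the second; the coordination game is a hypothetical contrast case that is not in the slides, and the function name is illustrative.

```python
from itertools import product


def pure_nash_equilibria(payoffs, action_sets):
    """Return every action tuple from which no player gains by deviating alone."""
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, actions in enumerate(action_sets):
            for b in actions:
                deviation = profile[:i] + (b,) + profile[i + 1:]
                if payoffs[deviation][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Jamming avoidance: (transmitter channel, jammer channel) -> (u1, u2).
JAMMING = {(0, 0): (-1, 1), (0, 1): (1, -1),
           (1, 0): (1, -1), (1, 1): (-1, 1)}

# Hypothetical coordination game, included only for contrast.
COORDINATION = {(0, 0): (1, 1), (0, 1): (0, 0),
                (1, 0): (0, 0), (1, 1): (1, 1)}

if __name__ == "__main__":
    channels = [(0, 1), (0, 1)]
    print(pure_nash_equilibria(JAMMING, channels))       # []
    print(pure_nash_equilibria(COORDINATION, channels))  # [(0, 0), (1, 1)]
```

The search visits every action tuple, which is why finite games generally scale poorly for this kind of analysis.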
How do the players find the Nash Equilibrium?
• Preplay Communication
  – Before the game, discuss their options. Note that only NE are suitable candidates for coordination, as a player could profitably violate any other agreement.
• Rational Introspection
  – Based on what each player knows about the other players, reason what the other players would do in their own best interest. (Best Response - tomorrow) Points where everyone would be playing “correctly” are the NE.
• Focal Point
  – Some distinguishing characteristic of the tuple causes it to stand out. The NE stands out because it’s every player’s best response.
• Trial and Error
  – Starting on some tuple which is not a NE, a player “discovers” that deviating improves its payoff. This continues until no player can improve by deviating. Only guaranteed to work for Potential Games (couple weeks)
Nash Equilibrium as a Fixed Point
• Individual Best Response
  $\hat{B}_i(a) = \{ b_i \in A_i : u_i(b_i, a_{-i}) \ge u_i(a_i, a_{-i}) \ \forall a_i \in A_i \}$
• Synchronous Best Response
  $\hat{B}(a) = \times_{i \in N} \hat{B}_i(a)$
• Nash Equilibrium as a fixed point
  $a^* = \hat{B}(a^*)$
• Fixed point theorems can be used to establish existence of NE (see dissertation)
• NE can be solved by implied system of equations
Example solution for Fixed Point by Solving for Best Response Fixed Point
• Bandwidth Allocation Game
  – Five cognitive radios with each radio, i, free to determine the number of simultaneous frequency hopping channels the radio implements, ci ∈ [0,∞).
  – Goal: $u_i(c) = P(c)\,c_i - C_i(c_i)$
  – P(c): fraction of symbols that are not interfered with (making P(c)ci the goodput for radio i)
  – Ci(ci) is radio i’s cost for supporting ci simultaneous channels.
  $u_i(c) = \left( B - \sum_{k \in N} c_k \right) c_i - K c_i$
Best Response Analysis
Goal: $u_i(c) = \left( B - \sum_{k \in N} c_k \right) c_i - K c_i$
Best Response: $\hat{c}_i = B_i(c) = \left( B - K - \sum_{k \in N \setminus i} c_k \right) / 2$
Simultaneous System of Equations
Solution: $\hat{c}_i = (B - K)/6 \quad \forall i \in N$
Generalization: $\hat{c}_i = (B - K)/(|N| + 1) \quad \forall i \in N$
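The closed-form solution can be cross-checked by iterating the best response numerically. In this sketch B = 10 and K = 4 are assumed values chosen so that (B − K)/6 = 1, actions are clamped to ci ≥ 0, and the function names are illustrative. It also contrasts round-robin timing with synchronous timing from the same starting point.

```python
def best_response(c, i, B, K):
    """Radio i's best response (B - K - sum of the others' channels) / 2, clamped at 0."""
    others = sum(c) - c[i]
    return max(0.0, (B - K - others) / 2.0)


def round_robin(n=5, B=10.0, K=4.0, sweeps=200):
    """Radios update one at a time, each seeing the latest choices of the others."""
    c = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            c[i] = best_response(c, i, B, K)
    return c


def synchronous(n=5, B=10.0, K=4.0, rounds=4):
    """All radios update at once from the same snapshot; returns the whole history."""
    c = [0.0] * n
    history = [list(c)]
    for _ in range(rounds):
        c = [best_response(c, i, B, K) for i in range(n)]
        history.append(list(c))
    return history


if __name__ == "__main__":
    print([round(x, 6) for x in round_robin()])
    print([state[0] for state in synchronous()])
```

Under these assumptions the round-robin iteration settles at ci = 1 = (B − K)/6 for every radio, while the synchronous iteration started from all zeros bounces between 0 and 3 and never settles. The fixed point is the same in both cases, but reaching it depends on the timing.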
Significance of NE for CRNs
Autonomously Rational Decision Rule
• Why not “if and only if”?
  – Consider a self-motivated game with a local maximum and a hill-climbing algorithm.
  – For many decision rules, NE do capture all fixed points (see dissertation)
• Identifies steady-states for all “intelligent” decision rules with the same goal.
• Implies a mechanism for policy design while accommodating differing implementations
  – Verify goals result in desired performance
  – Verify radios act intelligently
Nash Equilibrium Existence
Visualizable Definition of Quasi-Concavity
All upper-level sets are convex
$U(a^*) = \{ a \in A : f(a) \ge a^* \}$
[Figure: two plots of f(a) versus a, marking a0, a1, a2 and the upper-level set U(a1).]
Not all games have an NE, but games with mixed strategies do
My Favorite Mixed Strategy Story
Pure Strategies in an Extended Game
Consider an extensive form game where each stage is a strategic form game and the action space remains the same at each stage. Before play begins, each player chooses a probabilistic strategy that assigns a probability to each action in his action set. At each stage, the player chooses an action from his action set according to the probabilities he assigned before play began.
Example
Consider a video football game which will be simulated. Before the game begins, two players assign probabilities of calling running plays or passing plays for both offense and defense. In the simulation, for each down the kind of play chosen by each team is based on the initial probabilities assigned to kinds of plays. (Play NCAA2003)
Example Mixed Strategy Game
Jamming game
              a2 (q)     b2 (1-q)
a1 (p)        1, -1      -1, 1
b1 (1-p)      -1, 1      1, -1
Action Tuples   Probabilities
(a1,a2)         pq
(a1,b2)         p(1-q)
(b1,a2)         (1-p)q
(b1,b2)         (1-p)(1-q)
Expected Utilities
$U_1(p,q) = pq(1) + p(1-q)(-1) + (1-p)q(-1) + (1-p)(1-q)(1)$
$U_2(p,q) = pq(-1) + p(1-q)(1) + (1-p)q(1) + (1-p)(1-q)(-1)$
Sets of probability distributions
Δ(A1) = {(p, 1-p): ∀p∈[0,1]}
Δ(A2) = {(q, 1-q): ∀q∈[0,1]}
Nash Equilibrium in a Mixed Strategy Game
Definition: Mixed Strategy Nash Equilibrium
A mixed strategy profile α* is a NE iff ∀i∈N
$U_i(\alpha_i^*, \alpha_{-i}^*) \ge U_i(\beta_i, \alpha_{-i}^*) \quad \forall \beta_i \in \Delta(A_i)$
Best Response Correspondence
$BR_i(\alpha_{-i}) = \arg\max_{\alpha_i \in \Delta(A_i)} U_i(\alpha_i, \alpha_{-i})$
Alternate NE Definition
Consider $B(\alpha) = \times_{i \in N} BR_i(\alpha_{-i})$
A mixed strategy profile α* is a NE iff $\alpha^* \in B(\alpha^*)$
Best Response Correspondences
Expected utilities
$U_1(p,q) = 4pq - 2p - 2q + 1$
$U_2(p,q) = -(4pq - 2p - 2q + 1)$
Best Response
$\partial u_1 / \partial p = 4q - 2$
$\partial u_2 / \partial q = -4p + 2$
$BR_1(q) = 0$ if q < 1/2; [0,1] if q = 1/2; 1 if q > 1/2
$BR_2(p) = 1$ if p < 1/2; [0,1] if p = 1/2; 0 if p > 1/2
Nash Equilibrium: p = q = 0.5
[Figure: BR1(q) and BR2(p) plotted on the unit square with axes p(a1) and p(a2); the two correspondences cross at (0.5, 0.5).]
Note: NE in mixed extension which did not exist in original
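The mixed strategy equilibrium of the jamming game can be verified numerically from the expected utilities U1(p,q) = 4pq − 2p − 2q + 1 and U2 = −U1. The sketch below checks candidate profiles against deviations on a grid of probabilities; the grid size and function names are assumptions made for illustration.

```python
def u1(p, q):
    """Expected utility of player 1 when a1 has probability p and a2 has probability q."""
    return 4 * p * q - 2 * p - 2 * q + 1


def u2(p, q):
    """The jamming game is zero sum, so player 2 receives the negative of u1."""
    return -u1(p, q)


def is_mixed_ne(p, q, grid=101, tol=1e-12):
    """True if neither player can gain by deviating to any grid probability."""
    candidates = [k / (grid - 1) for k in range(grid)]
    best1 = max(u1(x, q) for x in candidates)
    best2 = max(u2(p, y) for y in candidates)
    return u1(p, q) >= best1 - tol and u2(p, q) >= best2 - tol


if __name__ == "__main__":
    print(is_mixed_ne(0.5, 0.5))
    print(is_mixed_ne(1.0, 1.0))
```

At p = q = 0.5 each player's expected utility is 0 regardless of its own mixing probability, so no deviation helps. At any pure profile one of the players can gain by switching.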
Interesting Properties of Mixed Strategy Games
1. Every Mixed Extension of a Strategic Game has an NE.
2. A mixed strategy αi is a best response to α-i iff every action in the support of αi is itself a best response to α-i.
3. Every action in the support of any player’s equilibrium mixed strategy yields the same payoff to that player.
Coalitional Game (with transferable payoff or utilities)
• Concept: groups of players (called coalitions) conspire together to implement actions which yield a result for the coalition. The value received by the coalition is then distributed among the coalition members.
• Where do radios collaborate and distribute value?
  – 802.16h interference groups – allocation of bandwidth
  – Distribution of frequencies/spreading codes among cells
  – File sharing in P2P network
• Transferable utility refers to existence of some commodity for which a player’s utility increases by one unit for every unit of the commodity it receives
• Game Components, ⟨N,v⟩
  – N set of players
  – Characteristic function, $v : 2^N \setminus \emptyset \to \mathbb{R}$
  – Coalition, S⊆N
• How is this value distributed?
  – Payoff vector, (xi)i∈S
• Payoff vector is said to be S-feasible if x(S) ≤ v(S), where $x(S) = \sum_{i \in S} x_i$
The Core (Transferable)
• The Core
  – For ⟨N,v⟩, the set of feasible payoff profiles (xi)i∈N for which there is no coalition S and S-feasible payoff vector (yi)i∈S for which yi > xi for all i∈S.
• General principles of the NE also apply to the Core:
  – Number of solutions for a game may be anywhere from 0 to ∞
  – May be stable or unstable.
Example
• Suppose three radios, N = {1,2,3}, can choose to participate in a peer-to-peer network.
• Characteristic Function
  – v(N) = 1
  – v({1,2}) = v({1,3}) = v({2,3}) = α∈[0,1]
  – v({1}) = v({2}) = v({3}) = 0
• Loosely, α indicates # of duplicated files
• If α>2/3, Core is empty
• Example adaptations for α=4/5: x = (1/3, 1/3, 1/3), x = (2/5, 2/5, 0), x = (0, 3/5, 1/5), x = (2/5, 0, 2/5), x = (2/5, 2/5, 0)
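The empty-core threshold can be checked directly. This is a sketch under a stated assumption: it uses the standard equivalent characterization of the core of a transferable-utility game (x(N) = v(N) and x(S) ≥ v(S) for every coalition S), and the function names are illustrative. Summing the three pairwise constraints gives 2 ≥ 3α, which is where the 2/3 threshold comes from; the grid search below only illustrates it.

```python
from fractions import Fraction
from itertools import combinations

def v(S, alpha):
    # Characteristic function of the three-radio peer-to-peer example.
    return {0: 0, 1: 0, 2: alpha, 3: 1}[len(S)]

def in_core(x, alpha):
    # x is a payoff vector for players 0, 1, 2.
    if sum(x) != v((0, 1, 2), alpha):
        return False                      # must distribute exactly v(N)
    for size in (1, 2):
        for S in combinations(range(3), size):
            if sum(x[i] for i in S) < v(S, alpha):
                return False              # coalition S could do better on its own
    return True

def core_points_on_grid(alpha, denom=30):
    # Enumerate payoff vectors with entries k/denom that sum to 1.
    points = []
    for i in range(denom + 1):
        for j in range(denom + 1 - i):
            x = (Fraction(i, denom), Fraction(j, denom), Fraction(denom - i - j, denom))
            if in_core(x, alpha):
                points.append(x)
    return points

third = Fraction(1, 3)
print(in_core((third, third, third), Fraction(2, 3)))  # equal split at the threshold
print(len(core_points_on_grid(Fraction(4, 5))))        # alpha = 4/5 > 2/3
```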
Comments on the Core
• Possibility of empty core implies that even when radios can freely negotiate and form arbitrary coalitions, no steady-state may exist
• Frequently very large (infinite) number of steady-states, e.g., α<2/3
  – Makes it impossible to predict exact behavior
• Existence conditions for the Core, but would need to cover some linear programming concepts
• Related (but not addressed today) concepts:
  – Bargaining Sets, Kernel, Nucleolus
Strong NE
• Concept: Assume radios are able to collaborate, but utilities aren’t necessarily transferable
• An action tuple a* such that
$u_i(a^*) \ge u_i(a_S, a^*_{-S}) \quad \forall S \subseteq N,\ a_S \in \times_{i \in S} A_i$
(Figure: two example games, one labeled "No Strong NE" and one labeled "Unique Strong NE". The recoverable payoff table is below.)
     N            W
n    (9.6, 9.6)   (9.6, 21)
w    (21, 9.6)    (22, 22)
Motivation for Shapley value
• Core was generally either empty or very large.
  – Want a “good” single solution.
• In effect, defines a formal distribution function
• Terminology
  – Marginal Contribution of i: $\Delta_i(S) = v(S \cup \{i\}) - v(S)$
  – Interchangeability of i, j: $\Delta_i(S) = \Delta_j(S) \quad \forall S \subseteq N \setminus \{i, j\}$
  – Dummy player (no synergy): $\Delta_i(S) = v(\{i\}) \quad \forall S \subseteq N \setminus \{i\}$
Axioms for Shapley Value
• Let ψ be some distribution of value for a TU coalition game
• Symmetry:
  – If i and j are interchangeable, then ψi(v)=ψj(v)
• Dummy:
  – If i is a dummy, then ψi(v) = v({i})
• Additivity:
  – Given ⟨N,v⟩ and ⟨N,w⟩, ψi(v + w) = ψi(v) + ψi(w) for all i∈N, where (v+w)(S) = v(S) + w(S)
• Balanced Contributions
  – Given ⟨N,v⟩, $\psi_i(N, v) - \psi_i(N \setminus j, v^{N \setminus j}) = \psi_j(N, v) - \psi_j(N \setminus i, v^{N \setminus i})$
Shapley Value
$\psi_i(v) = \sum_{S \subseteq N \setminus i} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right)$
• $v(S \cup \{i\}) - v(S) = \Delta_i(S)$: marginal value contributed by i
• $\frac{|S|!\,(|N| - |S| - 1)!}{|N|!}$: probability that i will be the next one invited to the grand coalition given that coalition S is already part of the coalition, assuming random ordering.
• Only assignment (value) that satisfies balanced contributions; only assignment that simultaneously satisfies symmetry, dummy, and additivity axioms
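As an illustration, the Shapley value formula (a weighted sum of marginal contributions over all coalitions S ⊆ N\i) can be evaluated by brute force for small games. The sketch below applies it to the three-radio peer-to-peer example with α = 4/5; the function name `shapley` and the encoding of v are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley(players, v):
    # Shapley value: sum over S of |S|!(|N|-|S|-1)!/|N|! times the marginal contribution of i.
    n = len(players)
    value = {}
    for i in players:
        others = [j for j in players if j != i]
        total = Fraction(0)
        for size in range(n):
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for S in combinations(others, size):
                S = frozenset(S)
                total += weight * (v(S | {i}) - v(S))
        value[i] = total
    return value

alpha = Fraction(4, 5)

def v_p2p(S):
    # Characteristic function of the three-radio example.
    return {0: 0, 1: 0, 2: alpha, 3: 1}[len(S)]

psi = shapley([1, 2, 3], v_p2p)
print(psi)
```

By symmetry each radio receives the same share, and the shares sum to v(N) = 1 even though the core is empty for this α.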
Implications of Shapley Value
• One form of a fair allocation
  – What you receive is based on the value you add
  – Independent of order of arrival
  – I liken it to setting salaries according to the Value Over Replacement Player concept
• “Better” solution concept than the core as it’s a single payoff as opposed to a potentially infinite number
• Allows for analysis of relative “power” of different players in the system
Steady-State Summary
• Not every game has a steady-state
• NE are analogous to fixed points of self-interested decision processes
• NE can be applied to procedural and ontological radios
  – Don’t need to know decision rule, only goals, actions, and assumption that radios act in their own interest
• A game (network) may have 0, 1, or many steady-states
• All finite normal form games have an NE in their mixed extension
  – Over multiple iterations, implies constant adaptation
• More complex game models yield more complex steady-state concepts
• Can define steady-state concepts for coalitional games
  – Frequently so broad that specific solutions are used
Evaluating Equilibria
Objective Function Maximization, Pareto Efficiency, Notions of Fairness
Optimality
• In general we assume the existence of some design objective function J:A→R
• The desirableness of a network state, a, is the value of J(a).
• In general maximizers of J are unrelated to fixed points of d.
Figure from Fig 2.6 in I. Akbar, “Statistical Analysis of Wireless Systems Using Markov Models,” PhD Dissertation, Virginia Tech, January 2007
Example Functions
• Utilitarian
  – Sum of all players’ utilities
  – Product of all players’ utilities
• Practical
  – Total system throughput
  – Average SINR
  – Maximum End-to-End Latency
  – Minimal sum system interference
• Objective can be unrelated to utilities
(Figure panels: Utilitarian Maximizers; System Throughput Maximizers; Interference Minimization)
Price of Anarchy (Factor)
(Performance of Centralized Algorithm Solution) / (Performance of Distributed Algorithm Solution) ≥ 1
• Centralized solution always at least as good as distributed solution
  – Like ASIC is always at least as good as DSP
• Ignores costs of implementing algorithms
  – Sometimes centralized is infeasible (e.g., routing the Internet)
  – Distributed can sometimes (but not generally) be more costly than centralized
(Values shown on the slide: 9.6, 7)
Price of Anarchy Discussion
• Best of All Possible Worlds
  – Low complexity distributed algorithms with low anarchy factors
• Reality implies mix of methods
  – Hodgepodge of mixed solutions
    • Policy – bounds the price of anarchy
    • Utility adjustments – align distributed solution with centralized solution
    • Market methods – sometimes distributed, sometimes centralized
    • Punishment – sometimes centralized, sometimes distributed, sometimes both
    • Radio environment maps – “centralized” information for distributed decision processes
  – Fully distributed
    • Potential game design – really, the Panglossian solution, but only applies to particular problems
Pareto efficiency (optimality)
• Formal definition: An action vector a* is Pareto efficient if there exists no other action vector a such that every radio’s valuation of the network is at least as good and at least one radio assigns a higher valuation
• Informal definition: An action tuple is Pareto efficient if some radios must be hurt in order to improve the payoff of other radios.
• Important note
  – Like design objective function, unrelated to fixed points (NE)
  – Generally less specific than evaluating design objective function
Example Games
Legend: NE, Pareto Efficient, NE + PE

First game
      a2      b2
a1    1,1     -5,5
b1    5,-5    -1,-1

Second game
      a2      b2
a1    1,1     -5,5
b1    5,-5    3,3
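The legend categories can be computed by brute force. This is a sketch that assumes the payoff layout read from the two example tables (rows a1/b1 for radio 1, columns a2/b2 for radio 2); the function names are illustrative.

```python
def nash_equilibria(game):
    # game maps (row_action, col_action) -> (u1, u2); NE = no profitable unilateral deviation.
    rows = {r for r, _ in game}
    cols = {c for _, c in game}
    result = []
    for (r, c), (u1, u2) in game.items():
        row_ok = all(game[(r2, c)][0] <= u1 for r2 in rows)
        col_ok = all(game[(r, c2)][1] <= u2 for c2 in cols)
        if row_ok and col_ok:
            result.append((r, c))
    return sorted(result)

def pareto_efficient(game):
    # A profile is Pareto efficient if no other profile is at least as good for
    # everyone and strictly better for someone.
    result = []
    for a, ua in game.items():
        dominated = any(
            all(x >= y for x, y in zip(ub, ua)) and any(x > y for x, y in zip(ub, ua))
            for b, ub in game.items() if b != a
        )
        if not dominated:
            result.append(a)
    return sorted(result)

game1 = {("a1", "a2"): (1, 1), ("a1", "b2"): (-5, 5),
         ("b1", "a2"): (5, -5), ("b1", "b2"): (-1, -1)}
game2 = {("a1", "a2"): (1, 1), ("a1", "b2"): (-5, 5),
         ("b1", "a2"): (5, -5), ("b1", "b2"): (3, 3)}

for name, game in (("game1", game1), ("game2", game2)):
    print(name, "NE:", nash_equilibria(game), "PE:", pareto_efficient(game))
```

In the first game the only NE is the one profile that is not Pareto efficient; in the second the NE is also Pareto efficient, which illustrates that the two notions are unrelated.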
Notions of Fairness
• What is “Fair”?
  – Abstractly “fair” means different things to different analysts
  – In everyday life, “unfair” is shorthand for “I deserve more than I got”
• Nonetheless is used to evaluate how equitably radio resources are distributed
Gini Coefficient
• Basic concept:
  – Order players by utility.
  – Form CDF for sorted utility distribution (Lorenz curve)
  – Integrate (sum) the difference between perfect equality (of outcome) and CDF
  – Divide result by sum of all players’ utilities
• Formula
$G(a) = \frac{1}{n}\left( n + 1 - 2\,\frac{\sum_{i \in N} (n + 1 - i)\, u_i(a)}{\sum_{i \in N} u_i(a)} \right)$
• Used in a lot of macro-economic comparisons of income distributions
• Relatively simple, independent of scale, independent of size of N, anonymity
• Radically different outcomes can give the same result
(Figure: Lorenz curve, aggregate utility vs. player #)
G    N      W
n    0      0.37
w    0.37   0
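A small sketch of the Gini computation, assuming the normalization $G(a) = \frac{1}{n}\left(n + 1 - 2\sum_i (n+1-i)u_i / \sum_i u_i\right)$ with utilities sorted in ascending order. Under this normalization the (9.6, 21) outcome evaluates to about 0.19; the 0.37 shown in the slide's table equals the same expression without the leading 1/n factor, which is an observation about the numbers rather than something the slides state.

```python
def gini(utilities):
    # Gini coefficient with players ordered by ascending utility (rank i = 1..n).
    u = sorted(utilities)
    n = len(u)
    total = sum(u)
    weighted = sum((n + 1 - i) * ui for i, ui in enumerate(u, start=1))
    return (n + 1 - 2 * weighted / total) / n

print(gini([9.6, 9.6]))   # perfectly equal outcome
print(gini([9.6, 21]))    # unequal outcome from the example game
print(gini([21, 9.6]))    # anonymity: order of players does not matter
```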
Other Metrics of Fairness
• Theil Index
$T(a) = \frac{1}{n} \sum_{i \in N} \frac{u_i(a)}{\bar{u}} \ln\left( \frac{u_i(a)}{\bar{u}} \right)$, where $\bar{u} = \frac{1}{n} \sum_{i \in N} u_i(a)$
• Atkinson Index, ε is income inequality aversion
$T(a) = 1 - \frac{1}{\bar{u}} \left( \frac{1}{n} \sum_{i \in N} u_i(a)^{1 - \varepsilon} \right)^{1/(1 - \varepsilon)}, \quad \varepsilon \in [0, 1)$
$T(a) = 1 - \frac{1}{\bar{u}} \left( \prod_{i \in N} u_i(a) \right)^{1/n}, \quad \varepsilon = 1$
Bargaining Problem
• Components: ⟨F, v⟩
  – Feasible payoffs F, closed convex subset of Rn
  – Disagreement Point v = (v1, v2)
    • What 1 or 2 could achieve without bargaining
• Example:
  – Even if system is jammed, still gets some throughput
  – Member of 802.16h interference group and try its luck
• F is said to be essential if there is some y∈F such that y1>v1 and y2>v2
• If contracts are “binding” then F could be the payoffs corresponding to entire original action space
• Otherwise, F may need to be drawn from the set of NE or from enforceable set (see punishment in repeated games)
• A particular solution is referred to by φ(F, v) ∈ Rn
Desirable Bargaining Axioms about a Solution
• Strong Efficiency
  – φ(F, v) is Pareto Efficient
• Individually Rational
  – φ(F, v) ≥ v
• Scale Covariance
  – For any λ1, λ2, γ1, γ2∈R, λ1, λ2 > 0, if
    $G = \{ (\lambda_1 x_1 + \gamma_1, \lambda_2 x_2 + \gamma_2) \mid (x_1, x_2) \in F \}$ and $w = (\lambda_1 v_1 + \gamma_1, \lambda_2 v_2 + \gamma_2)$
    then $\phi(G, w) = (\lambda_1 \phi_1(F, v) + \gamma_1, \lambda_2 \phi_2(F, v) + \gamma_2)$
• Independence of Irrelevant Alternatives
  – If G ⊆ F and G is closed and convex and φ(F, v)∈G, then φ(G, v) = φ(F, v)
• Symmetry
  – If v1=v2 and {(x1,x2)|(x2,x1)∈F}=F, then φ1(F, v) = φ2(F, v)
Nash Bargaining Solution
• NBS
$\phi(F, v) \in \arg\max_{x \in F,\ x \ge v} \prod_{i \in N} (x_i - v_i)$
• Interestingly, this is the only bargaining solution which simultaneously satisfies the preceding 5 axioms
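A toy instance makes the Nash product concrete. The sketch below is an assumption-laden illustration, not a scenario from the slides: two users split a hypothetical link capacity, F = {x : x1 + x2 ≤ capacity}, and the search is restricted to the frontier x1 + x2 = capacity, which is justified by the strong efficiency axiom. For this F the closed form is x_i = v_i + (capacity − v1 − v2)/2.

```python
from fractions import Fraction

def nash_bargaining_split(capacity, v, steps=1000):
    # Grid search for the point maximizing (x1 - v1)(x2 - v2) on the efficient frontier.
    best_product, best_x = None, None
    for k in range(steps + 1):
        x1 = Fraction(capacity * k, steps)
        x2 = capacity - x1
        if x1 < v[0] or x2 < v[1]:
            continue                      # individually rational points only
        product = (x1 - v[0]) * (x2 - v[1])
        if best_product is None or product > best_product:
            best_product, best_x = product, (x1, x2)
    return best_x

# Hypothetical numbers: capacity 10, disagreement point (1, 3).
solution = nash_bargaining_split(10, (1, 3))
print(solution)
```

Each user receives its disagreement payoff plus half of the surplus, which is what the symmetry and scale covariance axioms suggest for this feasible set.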
GT framework for BW allocation [Yaiche]: System Model
• N users
• L links
• Users compete for the total link capacity
• Each user has a minimum rate MRi and peak rate PRi
• Admissible rate vector is given by
$X_0 = \{ x \in \mathbb{R}^N \mid x \ge MR,\ x \le PR,\ \text{and } Ax \le C \}$
C: vector of link capacities
A (L×N): a_lp = 1 if link l belongs to path p, else 0.
Scenario given in H. Yaiche, R. Mazumdar, C. Rosenberg, “A game theoretic framework for bandwidth allocation and pricing in broadband networks”, IEEE/ACM Transactions on Networking, Volume: 8, Issue: 5, Oct. 2000, pp. 667-678.
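Centralized Optimization Problem
$\max_x \prod_{i=1}^{N} (x_i - MR_i)$
subject to
$x_i \ge MR_i, \quad i \in \{1, \dots, N\}$
$x_i \le PR_i, \quad i \in \{1, \dots, N\}$
$(Ax)_l \le (C)_l, \quad l \in \{1, \dots, L\}$
• Unique NBS exists
(Figure: feasible set F with disagreement point v.)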
Summary of Equilibria Evaluation
• Lots of different ways in which a point can be evaluated
• Many are contradictory
• Loosely, any point could be said to be optimal given the right objective function
• Insufficient to say that a point is optimal
  – Must describe the metric in use
• Suggestion: use whatever metric makes sense to you as a network designer
The Notion of Time and Imperfections in Games and Networks
Extensive Form Games, Repeated Games, Convergence Concepts in Normal Form Games, Trembling Hand Games, Noisy Observations
Model Timing Review
• When decisions are made also matters, and different radios will likely make decisions at different times
• Tj – when radio j makes its adaptations
  – Generally assumed to be an infinite set
  – Assumed to occur at discrete time
    • Consistent with DSP implementation
• T=T1∪T2∪⋅⋅⋅∪Tn
• t ∈ T
Decision timing classes
• Synchronous
  – All at once
• Round-robin
  – One at a time in order
  – Used in a lot of analysis
• Random
  – One at a time in no order
• Asynchronous
  – Random subset at a time
  – Least overhead for a network
Extensive Form Game Components
Components
1. A set of players.
2. The actions available to each player at each decision moment (state).
3. A way of deciding who is the current decision maker.
4. Outcomes on the sequence of actions.
5. Preferences over all outcomes.
A Silly Jammer Avoidance Game
(Figure: game tree representation and strategic form equivalence; every terminal payoff is either 1,-1 or -1,1.)
Strategies for A: {1, 2}
Strategies for B: {(1,1’), (1,2’), (2,1’), (2,2’)}
Backwards Induction
• Concept
  – Reason backwards based on what each player would rationally play
  – Predicated on Sequential Rationality
  – Sequential Rationality – if starting at any decision point for a player in the game, his strategy from that point on represents a best response to the strategies of the other players
  – Subgame Perfect Nash Equilibrium is a key concept (not formally discussed today).
Alternating Packet Forwarding Game
(Figure: players 1 and 2 alternate choosing C or S; terminal payoffs shown are 1,0; 0,2; 3,1; 2,4; 5,3; 4,6; 7,5.)
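Backwards induction on the Alternating Packet Forwarding Game can be sketched as follows. The tree layout is an assumption reconstructed from the payoffs that survive in the transcript: players 1 and 2 alternate, choosing S at node k ends the game with the k-th listed payoff (1,0; 0,2; 3,1; 2,4; 5,3; 4,6), and choosing C at every node ends with 7,5. The function name is illustrative.

```python
def backward_induction(stop_payoffs, final_payoff, movers):
    # stop_payoffs[k]: payoff pair if the mover at node k plays S.
    # final_payoff: payoff pair if every mover plays C.
    # movers[k]: index (0 or 1) of the player who decides at node k.
    outcome = final_payoff
    plan = []
    for k in reversed(range(len(stop_payoffs))):
        mover = movers[k]
        # The mover compares stopping now with the outcome of rational play afterwards.
        if stop_payoffs[k][mover] >= outcome[mover]:
            outcome = stop_payoffs[k]
            plan.append("S")
        else:
            plan.append("C")
    plan.reverse()
    return outcome, plan

stop_payoffs = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6)]
outcome, plan = backward_induction(stop_payoffs, (7, 5), movers=[0, 1, 0, 1, 0, 1])
print(outcome, plan)
```

Under these assumed payoffs each mover prefers stopping to what follows, so play unravels to the first node even though both radios would do better by continuing to the end.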
Comments on Extensive Form Games
• Actions will generally not be directly observable
• However, likely that cognitive radios will build up histories
• Ability to apply backwards induction is predicated on knowing other radio’s objectives, actions, observations and what they know they know…
  – Likely not practical
• Really the best choice for modeling notion of time when actions available to radios change with history
Repeated Games
• Same game is repeated
  – Indefinitely
  – Finitely
• Players consider discounted payoffs across multiple stages
  – Stage k: $\tilde{u}_i(a^k) = \delta^k u_i(a^k)$
  – Expected value over all future stages: $u_i\left( (a^k) \right) = \sum_{k=0}^{\infty} \delta^k u_i(a^k)$
(Figure: Stage 1, Stage 2, ..., Stage k)
Lesser Rationality: Myopic Processes
• Players have no knowledge about utility functions, or expectations about future play; typically can observe or infer current actions
• Best response dynamic – maximize individual performance presuming other players’ actions are fixed
• Better response dynamic – improve individual performance presuming other players’ actions are fixed
• Interesting convergence results can be established
Paths and Convergence
• Path [Monderer_96]
  – A path in Γ is a sequence γ = (a0, a1,…) such that for every k ≥ 1 there exists a unique player such that the strategy combinations (ak-1, ak) differ in exactly one coordinate.
  – Equivalently, a path is a sequence of unilateral deviations. When discussing paths, we make use of the following conventions.
  – Each element of γ is called a step.
  – a0 is referred to as the initial or starting point of γ.
  – Assuming γ is finite with m steps, am is called the terminal point or ending point of γ and we say that γ has length m.
• Cycle [Voorneveld_96]
  – A finite path γ = (a0, a1,…,ak) where ak = a0
Improvement Paths
• Improvement Path
  – A path γ = (a0, a1,…) where for all k≥1, ui(ak)>ui(ak-1) where i is the unique deviator at k
• Improvement Cycle
  – An improvement path that is also a cycle
  – See the DFS example
(Figure: paths γ1 through γ6)
Convergence Properties
• Finite Improvement Property (FIP)
  – All improvement paths in a game are finite
• Weak Finite Improvement Property (weak FIP)
  – From every action tuple, there exists an improvement path that terminates in an NE.
• FIP implies weak FIP
• FIP implies lack of improvement cycles
• Weak FIP implies existence of an NE
Examples

Game with FIP
     A       B
a    1,-1    0,2
b    -1,1    2,2

Weak FIP but not FIP
     A       B       C
a    1,-1    -1,1    0,2
b    -1,1    1,-1    1,2
c    2,0     2,1     2,2
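Both properties can be tested mechanically on small games. The sketch below builds the improvement graph (one edge per strictly improving unilateral deviation): for a finite game, FIP holds iff this graph has no cycle, and weak FIP holds iff every action tuple can reach an NE (a node with no outgoing edge). The 3×3 payoff layout is an assumption read from the "Weak FIP but not FIP" example, and the function names are illustrative.

```python
from itertools import product

def improvement_graph(actions, payoff):
    # actions: one list of actions per player; payoff: profile -> tuple of utilities.
    graph = {}
    for a in product(*actions):
        successors = []
        for i, own_actions in enumerate(actions):
            for b in own_actions:
                deviation = a[:i] + (b,) + a[i + 1:]
                if b != a[i] and payoff[deviation][i] > payoff[a][i]:
                    successors.append(deviation)
        graph[a] = successors
    return graph

def has_fip(graph):
    # Depth-first search for a cycle in the improvement graph.
    state = {}
    def finds_cycle(a):
        state[a] = "open"
        for b in graph[a]:
            if state.get(b) == "open" or (b not in state and finds_cycle(b)):
                return True
        state[a] = "done"
        return False
    return not any(a not in state and finds_cycle(a) for a in graph)

def has_weak_fip(graph):
    # Grow the set of profiles from which some improvement path reaches an NE.
    reach = {a for a, succ in graph.items() if not succ}
    changed = True
    while changed:
        changed = False
        for a, succ in graph.items():
            if a not in reach and any(b in reach for b in succ):
                reach.add(a)
                changed = True
    return reach == set(graph)

rows, cols = ["a", "b", "c"], ["A", "B", "C"]
table = [[(1, -1), (-1, 1), (0, 2)],
         [(-1, 1), (1, -1), (1, 2)],
         [(2, 0), (2, 1), (2, 2)]]
payoff = {(r, c): table[i][j] for i, r in enumerate(rows) for j, c in enumerate(cols)}
graph = improvement_graph([rows, cols], payoff)
print(has_fip(graph), has_weak_fip(graph))
```

The a/b × A/B block is a matching-pennies style improvement cycle, while row c and column C provide an exit to the NE at (c, C).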
Implications of FIP and weak FIP
• Assumes radios are incapable of reasoning ahead and must react to internal states and current observations
• Unless the game model of a CRN has weak FIP, no autonomously rational decision rule can be guaranteed to converge from all initial states under random and round-robin timing (Theorem 4.10 in dissertation).
• If the game model of a CRN has FIP, then ALL autonomously rational decision rules are guaranteed to converge from all initial states under random and round-robin timing.
  – And asynchronous timings, but not immediate from definition
• More insights possible by considering more refined classes of decision rules and timings
Decision Rules
Markov Chains
• Describes adaptations as probabilistic transitions between network states.
  – d is nondeterministic
• Sources of randomness:
  – Nondeterministic timing
  – Noise
• Frequently depicted as a weighted digraph or as a transition matrix
General Insights ([Stewart_94])
• Probability of occupying a state after two iterations.
  – Form PP.
  – Now entry pmn in the mth row and nth column of PP represents the probability that system is in state an two iterations after being in state am.
• Consider Pk.
  – Then entry pmn in the mth row and nth column of Pk represents the probability that system is in state an k iterations after being in state am.
Steady-states of Markov chains
• May be inaccurate to consider a Markov chain to have a fixed point
  – Actually ok for absorbing Markov chains
• Stationary Distribution
  – A probability distribution π* such that π*T P = π*T is said to be a stationary distribution for the Markov chain defined by P.
• Limiting distribution
  – Given initial distribution π0 and transition matrix P, the limiting distribution is the distribution that results from evaluating
$\lim_{k \to \infty} \pi^{0T} P^k$
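The limiting distribution can be approximated by repeatedly applying the transition matrix to an initial distribution. This is a minimal sketch with a hypothetical two-state chain (the matrix values are illustrative, not from the slides); for this P the stationary distribution solves π = πP and equals (5/6, 1/6).

```python
def step(pi, P):
    # One iteration: row vector pi times transition matrix P.
    n = len(P)
    return [sum(pi[m] * P[m][k] for m in range(n)) for k in range(n)]

def limiting_distribution(pi0, P, iterations=200):
    # Approximates the limit of pi0^T P^k by iterating k = iterations times.
    pi = list(pi0)
    for _ in range(iterations):
        pi = step(pi, P)
    return pi

P = [[0.9, 0.1],
     [0.5, 0.5]]
from_state_0 = limiting_distribution([1.0, 0.0], P)
from_state_1 = limiting_distribution([0.0, 1.0], P)
print(from_state_0, from_state_1)
```

Because every entry of P is nonzero the chain is ergodic, so both starting points lead to the same limit.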
Ergodic Markov Chain
• [Stewart_94] states that a Markov chain is ergodic if it is a) irreducible, b) positive recurrent, and c) aperiodic.
• Easier to identify rule:
  – For some k, Pk has only nonzero entries
• (Convergence, steady-state) If ergodic, then chain has a unique limiting stationary distribution.
Absorbing Markov Chains
• Absorbing state
  – Given a Markov chain with transition matrix P, a state am is said to be an absorbing state if pmm=1.
• Absorbing Markov Chain
  – A Markov chain is said to be an absorbing Markov chain if
    • it has at least one absorbing state and
    • from every state in the Markov chain there exists a sequence of state transitions with nonzero probability that leads to an absorbing state. These nonabsorbing states are called transient states.
(Figure: chain of states a0, a1, a2, a3, a4, a5)
Absorbing Markov Chain Insights
• Canonical Form
$P' = \begin{bmatrix} Q & R \\ 0 & I^{ab} \end{bmatrix}$
• Fundamental Matrix
$N = (I - Q)^{-1}$
• Expected number of times that the system will pass through state am given that the system starts in state ak.
  – nkm
• (Convergence Rate) Expected number of iterations before the system ends in an absorbing state starting in state am is given by tm where 1 is a ones vector
  – t=N1
• (Final distribution) Probability of ending up in absorbing state am given that the system started in ak is bkm where
$B = NR$
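The three quantities N = (I − Q)^{-1}, t = N1, and B = NR can be computed exactly for a small chain. The sketch below uses a hypothetical chain with two transient and two absorbing states (each transient state moves to the other transient state or to its own absorbing state with probability 1/2); the matrices and helper names are assumptions made for the example.

```python
from fractions import Fraction

def inverse(M):
    # Gauss-Jordan elimination with exact fractions.
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        scale = A[c][c]
        A[c] = [x / scale for x in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

half = Fraction(1, 2)
Q = [[0, half], [half, 0]]     # transient -> transient
R = [[half, 0], [0, half]]     # transient -> absorbing

I_minus_Q = [[int(i == j) - Q[i][j] for j in range(2)] for i in range(2)]
N = inverse(I_minus_Q)         # fundamental matrix: expected visits to each transient state
t = [sum(row) for row in N]    # expected iterations before absorption (N times ones vector)
B = matmul(N, R)               # absorption probabilities
print(N, t, B)
```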
Absorbing Markov Chains and Improvement Paths
• Sources of randomness
  – Timing (Random, Asynchronous)
  – Decision rule (random decision rule)
  – Corrupted observations
• An NE is an absorbing state for autonomously rational decision rules.
• Weak FIP implies that the game is an absorbing Markov chain as long as the NE terminating improvement path always has a nonzero probability of being implemented.
• This then allows us to characterize
  – convergence rate,
  – probability of ending up in a particular NE,
  – expected number of times a particular transient state will be visited
Connecting Markov models, improvement paths, and decision rules
• Suppose we need the path γ = (a0, a1,…am) for convergence by weak FIP.
• Must get right sequence of players and right sequence of adaptations.
• Friedman Random Better Response
  – Random or Asynchronous
    • Every sequence of players has a chance to occur
    • Random decision rule means that all improvements have a chance to be chosen
  – Synchronous not guaranteed
• Alternate random better response (chance of choosing same action)
  – Because of chance to choose same action, every sequence of players can result from every decision timing.
  – Because of random choice, every improvement path has a chance of occurring
Convergence Results (Finite Games)
• If a decision rule converges under round-robin, random, or synchronous timing, then it also converges under asynchronous timing.
• Random better responses converge for the most decision timings and the most surveyed game conditions.
  – Implies that non-deterministic procedural cognitive radio implementations are a good approach if you don’t know much about the network.
Impact of Noise
• Noise impacts the mapping from actions to outcomes, f :A→O
• Same action tuple can lead to different outcomes
• Most noise encountered in wireless systems is theoretically unbounded.
• Implies that every outcome has a nonzero chance of being observed for a particular action tuple.
• Some outcomes are more likely to be observed than others (and some outcomes may have a very small chance of occurring)
Another DFS Example
• Consider a radio observing the spectral energy across the bands defined by the set C where each radio k is choosing its band of operation fk.
• Noiseless observation of channel ck
$o_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k)$
• Noisy observation
$\tilde{o}_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k) + n_i(c_k, t)$
• If radio is attempting to minimize inband interference, then noise can lead a radio to believe that a band has lower or higher interference than it does
Trembling Hand (“Noise” in Games)
• Assumes players have a nonzero chance of making an error implementing their action.
  – Who has not accidentally handed over the wrong amount of cash at a restaurant?
  – Who has not accidentally written a “tpyo”?
• Related to errors in observation as erroneous observations cause errors in implementation (from an outside observer’s perspective).
Noisy decision rules
• Noisy utility $\tilde{u}_i(a, t) = u_i(a) + n_i(a, t)$
(Figure: Trembling Hand; Observation Errors)
Implications of noise
• For random timing, [Friedman] shows game with noisy random better response is an ergodic Markov chain.
• Likewise other observation based noisy decision rules are ergodic Markov chains
  – Unbounded noise implies chance of adapting (or not adapting) to any action
  – If coupled with random, synchronous, or asynchronous timings, then CRNs with corrupted observation can be modeled as ergodic Markov chains.
  – Not so for round-robin (violates aperiodicity)
• Somewhat disappointing
  – No real steady-state (though unique limiting stationary distribution)
DFS Example with three access points
• 3 access nodes, 3 channels, attempting to operate in band with least spectral energy.
• Constant power
• Link gain matrix
• Noiseless observations
• Random timing
(Figure: three access nodes labeled 1, 2, 3)
Trembling Hand
• Transition Matrix, p=0.1
• Limiting distribution
Noisy Best Response
• Transition Matrix, N(0,1) Gaussian Noise
• Limiting stationary distributions
Comment on Noise and Observations
• Cardinality of goals makes a difference for cognitive radios
  – Probability of making an error is a function of the difference in utilities
  – With ordinal preferences, utility functions are just useful fictions
    • Might as well assume a trembling hand
• Unboundedness of noise implies that no state can be absorbing for most decision rules
• NE retains significant predictive power
  – While CRN is an ergodic Markov chain, NE (and the adjacent states) remain most likely states to visit
  – Stronger prediction with less noise
  – Also stronger when network has a Lyapunov function
  – Exception - elusive equilibria ([Hicks_04])
Stability
(Figure: trajectories in the x-y plane illustrating "Stable, but not attractive" and "Attractive, but not stable".)
Lyapunov’s Direct Method
Left unanswered: where does L come from? Can it be inferred from radio goals?
Summary
• Given a set of goals, an NE is a fixed point for all radios with those goals for all autonomously rational decision processes
• Traditional engineering analysis techniques can be applied in a game theoretic setting
  – Markov chains to improvement paths
• Network must have weak FIP for autonomously rational radios to converge
  – Weak FIP implies existence of absorbing Markov chain for many decision rules/timings
• In practical system, network has a theoretically nonzero chance of visiting every possible state (ergodicity), but does have unique limiting stationary distribution
  – Specific distribution function of decision rules, goals
  – Will be important to show Lyapunov stability
• Shortly, we’ll cover potential games and supermodular games which can be shown to have FIP or weak FIP. Further, potential games have a Lyapunov function!
Designing Cognitive Radio Networks to Yield Desired Behavior
Policy, Cost Functions, Repeated Games, Supermodular Games, Potential Games
Policy
• Concept: Constrain the available actions so the worst cases of distributed decision making can be avoided
• Not a new concept –
  – Policy has been used since there’s been an FCC
• What’s new is assuming decision makers are the radios instead of the people controlling the radios
Policy applied to radios instead of humans
• Need a language to convey policy
  – Learn what it is
  – Expand upon policy later
• How do radios interpret policy
  – Policy engine?
• Need an enforcement mechanism
  – Might need to tie in to humans
• Need a source for policy
  – Who sets it?
  – Who resolves disputes?
• Logical extreme can be quite complex, but logical extreme may not be necessary.
(Figure: policies depicted as a frequency mask)
802.22 Example Policies
• Detection
  – Digital TV: -116 dBm over a 6 MHz channel
  – Analog TV: -94 dBm at the peak of the NTSC (National Television System Committee) picture carrier
  – Wireless microphone: -107 dBm in a 200 kHz bandwidth.
• Transmitted Signal
  – 4 W Effective Isotropic Radiated Power (EIRP)
  – Specific spectral masks
  – Channel vacation times
C. Cordeiro, L. Challapali, D. Birru, S. Shankar, “IEEE 802.22: The First Worldwide Wireless Standard based on Cognitive Radios,” IEEE DySPAN2005, Nov 8-11, 2005 Baltimore, MD.
Cost Adjustments
• Concept: Centralized unit dynamically adjusts costs in radios’ objective functions to ensure radios operate on desired point
$\tilde{u}_i(a) = u_i(a) + c_i(a)$
• Example: Add -12 to the use of wideband waveform
Repeated Games
• Same game is repeated
  – Indefinitely
  – Finitely
• Players consider discounted payoffs across multiple stages
  – Stage k: $\tilde{u}_i(a^k) = \delta^k u_i(a^k)$
  – Expected value over all future stages: $u_i\left( (a^k) \right) = \sum_{k=0}^{\infty} \delta^k u_i(a^k)$
(Figure: Stage 1, Stage 2, ..., Stage k)
Impact of Strategies
• Rather than merely reacting to the state of the network, radios can choose their actions to influence the actions of other radios
• Threaten to act in a way that minimizes another radio’s performance unless it implements the desired actions
• Common strategies
  – Tit-for-tat
  – Grim trigger
  – Generous tit-for-tat
• Play can be forced to any “feasible” payoff vector with proper selection of punishment strategy.
Impact of Communication on Strategies
• Players agree to play in a certain manner
• Threats can force play to almost any state
  – Breaks down for finite number of stages
        Nada       C          N
nada    0,0        -5,5       -100,0
c       5,-5       -1,1       -100,-1
n       0,-100     -1,-100    -100,-100
Improvement from Punishment
• Throughput/unit power gains by enforcing a common received power level at a base station
• Punishment by jamming
• Without benefit to deviating, players can operate at lower power level and achieve same throughput
A. MacKenzie and S. Wicker, “Game Theory in Communications: Motivation, Explanation, and Application to Power Control,” Globecom2001, pp. 821-825.
Instability in Punishment
• Issues arise when radios aren’t directly observing actions and are punishing with their actions without announcing punishment
• Eventually, a deviation will be falsely detected and punished, and without signaling this leads to a cascade of problems
V. Srivastava, L. DaSilva, “Equilibria for Node Participation in Ad Hoc Networks – An Imperfect Monitoring Approach,” ICC 06, June 2006, vol 8, pp. 3850-3855
Comments on Punishment
• Works best with a common controller to announce
• Problems in fully distributed system
  – Need to elect a controller
  – Otherwise competing punishments, without knowing other players’ utilities, can spiral out of control
• Problems when actions cannot be directly observed
  – Leads to Byzantine problem
• No single best strategy exists
  – Strategy flexibility is important
  – Significant problems with jammers (they nominally receive higher utility when “punished”)
• Generally better to implement centralized controller
  – Operating point has to be announced anyways
Supermodular Games
• A game such that
  – Action space is a lattice
  – Utility functions are supermodular
• Identification
$\frac{\partial^2 u_i}{\partial a_i \partial a_j} \ge 0 \quad \forall i, j \in N,\ a \in A$
• NE Properties
  – NE Existence: All supermodular games have a NE
  – NE Identification: NE form a lattice
• Convergence
  – Has weak FIP
  – Best response algorithms converge
• Stability
  – Unique NE is an attractive fixed point for best response
Ad-hoc power control
• Network description
• Each radio attempts to achieve a target SINR at the receiving end of its link.
• System objective is ensuring every radio achieves its target SINR
$J(\mathbf{p}) = -\sum_{k \in N} (\hat{\gamma}_k - \gamma_k)^2$
(Figure: network of clusters with Gateway and ClusterHead nodes)
• Players – N⎡ ⎤• Actions –
• Utility function
• Action space formulation
( ) ( )2ˆj j ju o γ γ= − −
max0,j jP p⎡ ⎤= ⎣ ⎦
( )2
⎛ ⎞⎛ ⎞
14
( ) ( )10 10\
ˆ 10 log 10logj j jj j kj k jk N j
u p g p g p Nγ∈
⎛ ⎞⎛ ⎞= − − + +⎜ ⎟⎜ ⎟⎜ ⎟⎝ ⎠⎝ ⎠
∑
gjk fraction of power transmitted by j that can’t be removed by receiving end of radio j’s linkNj noise power at receiving end of radio j’s link
8
Model identification & analysis
• Supermodular game
  – Action space is a lattice
$\frac{\partial^2 u_j}{\partial p_j \partial p_k} = \frac{200\, g_{kj}}{\ln^2(10)\; p_j \left( \sum_{k \in N \setminus j} g_{kj} p_k + N_j \right)} > 0$
  – Implications
    • NE exists
    • Best response converges
    • Stable if discrete action space
• Best response is also standard
$\hat{B}_j(\mathbf{p}) = \frac{\hat{\gamma}_j}{\gamma_j}\, p_j$
  – Unique NE
  – Solvable (see prelim report)
  – Stable (pseudo-contraction) for infinite action spaces
j jj
B pγγ
=p
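A minimal sketch of the best response iteration for this power control game, assuming each radio scales its power by the ratio of target to measured SINR (taken in linear units) and clips at its maximum power. The function names, the synchronous update order, and every numeric value in the usage note are assumptions for illustration, not taken from the tutorial.

```python
import math


def sinr_db(p, g, noise, j):
    """SINR in dB at the receiving end of link j; g[k][j] is the gain from radio k to receiver j."""
    interference = sum(g[k][j] * p[k] for k in range(len(p)) if k != j) + noise[j]
    return 10 * math.log10(g[j][j] * p[j]) - 10 * math.log10(interference)


def best_response_round(p, g, noise, target_db, p_max):
    """One synchronous round: every radio scales its power by (target SINR / current SINR)."""
    new_p = []
    for j in range(len(p)):
        ratio = 10 ** ((target_db[j] - sinr_db(p, g, noise, j)) / 10)  # linear-scale ratio
        new_p.append(min(p_max[j], ratio * p[j]))                      # stay inside [0, p_max]
    return new_p


def run_power_control(p, g, noise, target_db, p_max, rounds=50):
    """Repeat the best response round a fixed number of times; powers must start positive."""
    for _ in range(rounds):
        p = best_response_round(p, g, noise, target_db, p_max)
    return p
```

For a hypothetical three-link network with unit direct gains, cross gains of 0.05, noise of 0.01, and 6 dB targets, `run_power_control([0.01] * 3, g, noise, [6.0] * 3, [1.0] * 3)` drives every link to its target SINR, because the targets are jointly feasible within the power limits.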
Validation
Implies all radios achieved target SINR
[Figure: two panels, "Noiseless Best Response" and "Noisy Best Response"]
Comments on Designing Networks with Supermodular Games
• Scales well
– Sum of supermodular functions is a supermodular function
– Add additional action types, e.g., power, frequency, routing, ..., as long as action space remains a lattice and utilities are supermodular
• Says nothing about desirability or stability of equilibria
• Convergence is sensitive to the specific decision rule and the ability of the radios to implement it
Potential Games
• Existence of a function (called the potential function, V) that reflects the change in utility seen by a unilaterally deviating player.
• Cognitive radio interpretation:
– Every time a cognitive radio unilaterally adapts in a way that furthers its own goal, some real-valued function increases.
[Figure: Φ(ω) versus time]
Exact Potential Game Forms
• Many exact potential games can be recognized by the form of the utility function
© Cognitive Radio Technologies, 2007
Implications of Monotonicity
• Monotonicity implies
– Existence of steady-states (maximizers of V)
– Convergence to maximizers of V for numerous combinations of decision timings and decision rules – all self-interested adaptations
• Does not mean that we get good performance
– Only if V is a function we want to maximize
Other Potential Game Properties
• All finite potential games have FIP
• All finite games with FIP are (generalized ordinal) potential games
– Very important for ensuring convergence of distributed cognitive radio networks
• -V is a Lyapunov function for isolated maximizers
• Stable NE can be found by solving for the maximizers of V
• Linear combination of exact potential games is an exact potential game
• Maximizer of potential game need not maximize your objective function
– Cognitive Radios’ Dilemma is a potential game
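The exact potential property and the finite improvement property can both be checked by brute force on a small game. The sketch below assumes a hypothetical three-radio, two-channel game with symmetric pairwise weights `W`; the game, the weights, and the function names are illustrative and do not come from the tutorial.

```python
import itertools


def is_exact_potential(utilities, action_sets, potential, tol=1e-9):
    """True if every unilateral deviation changes the deviator's utility and V by the same amount."""
    for a in itertools.product(*action_sets):
        for i in range(len(action_sets)):
            for alt in action_sets[i]:
                b = a[:i] + (alt,) + a[i + 1:]
                delta_u = utilities[i](b) - utilities[i](a)
                delta_v = potential(b) - potential(a)
                if abs(delta_u - delta_v) > tol:
                    return False
    return True


def better_response_path(utilities, action_sets, potential, start):
    """Apply profitable unilateral deviations until none remain; record V along the way.

    Guaranteed to stop only for finite games with FIP (e.g., finite potential games).
    """
    a = tuple(start)
    trace = [potential(a)]
    improved = True
    while improved:
        improved = False
        for i in range(len(action_sets)):
            for alt in action_sets[i]:
                b = a[:i] + (alt,) + a[i + 1:]
                if utilities[i](b) > utilities[i](a):
                    a, improved = b, True
                    trace.append(potential(a))
                    break
    return a, trace


# Hypothetical symmetric weights between three radios sharing two channels.
W = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
ACTION_SETS = [(0, 1)] * 3


def make_utility(i):
    # Radio i is penalized by the weight of every other radio on its channel.
    return lambda a: -sum(W[i][k] for k in range(len(a)) if k != i and a[k] == a[i])


UTILITIES = [make_utility(i) for i in range(3)]


def potential(a):
    # Each co-channel pair is counted once.
    return -sum(W[i][k] for i in range(len(a)) for k in range(i) if a[k] == a[i])
```

Starting all three radios on channel 0, `better_response_path(UTILITIES, ACTION_SETS, potential, (0, 0, 0))` returns a profile with no profitable unilateral deviation, and the recorded values of V increase at every step.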
Interference Reducing Networks
• Concept
– Cognitive radio network is a potential game with a potential function that is the negation of observed network interference
• Definition
– A network of cognitive radios where each adaptation decreases the sum of each radio’s observed interference is an IRN
$$\Phi(\omega) = \sum_{i \in N} I_i(\omega)$$
• Implementation:
– Design DFS algorithms such that network is a potential game with Φ ∝ -V
[Figure: Φ(ω) versus time]
Bilateral Symmetric Interference
• Two cognitive radios, j,k∈N, exhibit bilateral symmetric interference if
$$g_{jk}\, p_j\, \rho(\omega_j, \omega_k) = g_{kj}\, p_k\, \rho(\omega_k, \omega_j) \quad \forall \omega_j \in \Omega_j,\ \forall \omega_k \in \Omega_k$$
• ωk – waveform of radio k
• pk – the transmission power of radio k’s waveform
• gkj – link gain from the transmission source of radio k’s signal to the point where radio j measures its interference
• ρ(ωk, ωj) – the fraction of radio k’s signal that radio j cannot exclude via processing (perhaps via filtering, despreading, or MUD techniques)
What’s good for the goose is good for the gander…
[Image source: http://radio.weblogs.com/0120124/Graphics/geese2.jpg]
Bilateral Symmetric Interference Implies an Interference Reducing Network
• Cognitive Radio Goal:
$$u_i(\omega) = -I_i(\omega) = -\sum_{j \in N \setminus i} g_{ji}\, p_j\, \rho(\omega_j, \omega_i)$$
• By bilateral symmetric interference
$$g_{ki}\, p_k\, \rho(\omega_k, \omega_i) = g_{ik}\, p_i\, \rho(\omega_i, \omega_k) = b_{ik}(\omega_i, \omega_k) = b_{ki}(\omega_k, \omega_i)$$
• Rewrite goal
$$u_i(\omega) = -\sum_{k \in N \setminus i} b_{ik}(\omega_i, \omega_k)$$
• Therefore a BSI game (Si = 0)
$$V(\omega) = -\sum_{i \in N} \sum_{k=1}^{i-1} g_{ki}\, p_k\, \rho(\omega_k, \omega_i)$$
• Interference Function
$$\Phi(\omega) = -2V(\omega)$$
• Therefore profitable unilateral deviations increase V and decrease Φ(ω) – an IRN
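The step from the rewritten goal to the potential function can be spelled out as a short derivation sketch. It uses the symmetric terms b_ik = b_ki defined on this slide; the primed waveform ω_i' (radio i's new choice, with all other waveforms ω_{-i} held fixed) is notation introduced here for illustration.

```latex
% Unilateral deviation by radio i from \omega_i to \omega_i' (others fixed at \omega_{-i}).
\begin{align*}
V(\omega) &= -\sum_{i \in N} \sum_{k=1}^{i-1} b_{ki}(\omega_k, \omega_i)
  && \text{(each unordered pair counted once)} \\
V(\omega_i', \omega_{-i}) - V(\omega)
  &= -\sum_{k \in N \setminus i} \left[ b_{ik}(\omega_i', \omega_k) - b_{ik}(\omega_i, \omega_k) \right]
  && \text{(only pairs containing } i \text{ change)} \\
  &= u_i(\omega_i', \omega_{-i}) - u_i(\omega) \\
\Phi(\omega) &= \sum_{i \in N} I_i(\omega)
  = \sum_{i \in N} \sum_{k \in N \setminus i} b_{ik}(\omega_i, \omega_k)
  = 2 \sum_{i \in N} \sum_{k=1}^{i-1} b_{ki}(\omega_k, \omega_i)
  = -2V(\omega)
\end{align*}
```

The second line is the exact potential property: a deviation that raises u_i raises V by the same amount, so it lowers Φ by twice that amount.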
An IRN 802.11 DFS Algorithm
• Suppose each access node measures the received signal power and frequency of the RTS/CTS (or BSSID) messages sent by observable access nodes in the network.
• Assumed out-of-channel interference is negligible and RTS/CTS transmitted at same power
$$g_{jk}\, p_j\, \sigma(f_j, f_k) = g_{kj}\, p_k\, \sigma(f_k, f_j)$$
$$u_i(f) = -I_i(f) = -\sum_{k \in N \setminus i} g_{ki}\, p_k\, \sigma(f_i, f_k)$$
$$\sigma(f_i, f_k) = \begin{cases} 1, & f_i = f_k \\ 0, & f_i \neq f_k \end{cases}$$
[Flowchart boxes, in order: Start; Pick channel to listen on, LC; Listen on Channel LC; RTS/CTS energy detected? (y/n); Measure power of access node in message, p; Note address of access node, a; Update interference table; Time for decision? (y/n); Apply decision criteria for new operating channel, OC; Use 802.11h to signal change in OC to clients]
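The channel-selection rule can be sketched as a small simulation under the slide's assumptions (equal transmit power, no out-of-channel interference). Everything else below is an assumption made for illustration: the symmetric distance-based gain model, the path loss exponent, the random seeds, the round-robin update order, and the function names.

```python
import math
import random


def make_gains(num_nodes, path_loss_exp=3.0, side_m=1000.0, seed=0):
    """Symmetric link gains d^(-n) between nodes dropped uniformly on a square."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, side_m), rng.uniform(0, side_m)) for _ in range(num_nodes)]
    g = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for k in range(i + 1, num_nodes):
            d = max(1.0, math.dist(pos[i], pos[k]))
            g[i][k] = g[k][i] = d ** -path_loss_exp
    return g


def interference(i, channels, g, channel=None):
    """Co-channel interference seen by node i (unit transmit power), on its own or a trial channel."""
    f = channels[i] if channel is None else channel
    return sum(g[k][i] for k in range(len(channels)) if k != i and channels[k] == f)


def net_interference(channels, g):
    """Sum of every node's observed interference."""
    return sum(interference(i, channels, g) for i in range(len(channels)))


def run_dfs(g, num_channels, seed=1, max_sweeps=100):
    """Round-robin sweeps: each node moves to its lowest-interference channel if that is an improvement."""
    rng = random.Random(seed)
    n = len(g)
    channels = [rng.randrange(num_channels) for _ in range(n)]
    trace = [net_interference(channels, g)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            best = min(range(num_channels), key=lambda f: interference(i, channels, g, f))
            if interference(i, channels, g, best) < interference(i, channels, g):
                channels[i] = best
                changed = True
                trace.append(net_interference(channels, g))
        if not changed:
            break
    return channels, trace
```

With symmetric gains and equal powers the bilateral symmetric interference condition holds, so each recorded value in `trace` should be no larger than the one before it, and the sweeps stop once no node can lower its own interference.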
Statistics
• 30 cognitive access nodes in European UNII bands
• Choose channel with lowest interference
• Random timing
• n=3
• Random initial channels
• Randomly distributed positions over 1 km²
[Figure: "Reduction in Net Interference". x-axis: Number of Access Nodes; y-axis: Reduction in Net Interference (dB); curves: Round-robin, Asynchronous, Legacy Devices]
Ad-hoc Network
• Possible to adjust previous algorithm to not favor access nodes over clients
• Suitable for ad-hoc networks
• CRT has IRN based distributed zero-overhead low-complexity algorithms for
– Spreading codes
– Power variations
– Subcarrier allocation
– Bandwidth variations
– Activity levels weighted by interference
– Noninteractive terms – modulation, FEC, interleaving
– Beamforming
– And combinations of the above
Comments on Potential Games
• Every network without a better response interaction loop is a potential game
• Before implementing fully distributed GA, SA, or most CBR decision rules, important to show that goals and actions satisfy the potential game model
• Sum of exact potential games is itself an exact potential game
– Permits (with a little work) scaling up of algorithms that adjust single parameters to multiple parameters
• Possible to combine with other techniques
– Policy restricts action space, but subset of action space remains a potential game (see J. Neel, J. Reed, “Performance of Distributed Dynamic Frequency Selection Schemes for Interference Reducing Networks,” Milcom 2006)
– As a self-interested additive cost function is also a potential game, easy to combine with additive cost approaches (see J. Neel, J. Reed, R. Gilles, “The Role of Game Theory in the Analysis of Software Radio Networks,” SDR Forum02)
• Read more on potential games:
– Chapter 5 in Dissertation of J. Neel, Available at http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/
Token Economies
• Pairs of cognitive radios exchange tokens for services rendered or bandwidth rented
• Example:
– Primary users leasing spectrum to secondary users
  • D. Grandblaise, K. Moessner, G. Vivier and R. Tafazolli, “Credit Token based Rental Protocol for Dynamic Channel Allocation,” CrownCom06.
– Node participation in peer-to-peer networks
  • T. Moreton, “Trading in Trust, Tokens, and Stamps,” Workshop on the Economics of Peer-to-Peer Systems, Berkeley, CA June 2003.
• Why it works – it’s a potential game when there’s no externality to the trade
– Ordinal potential function given by sum of utilities
Comments on Network Options
• Approaches can be combined
– Policy + potential
– Punishment + cost adjustment
– Cost adjustment + token economies
• Mix of centralized and distributed is likely best approach
• Potential game approach has lowest complexity, but cannot be extended to every problem
• Token economies require strong property rights to ensure proper behavior
• Punishment can also be implemented at a choke point in the network
Global Altruism: distributed, but more costly
• Concept: All radios distribute all relevant information to all other radios and then each independently computes the jointly optimal solution
– Proposed for spreading code allocation in Popescu04, Sung03
– Used in xG Program (Comments of G. Denker, SDR Forum Panel Session on “A Policy Engine Framework”); overhead ranges from 5%-27%
• C = cost of computation
• I = cost of information transfer from node to node
• n = number of nodes
• Distributed
– nC + n(n-1)I/2
• Centralized (election)
– C + 2(n-1)I
• Price of anarchy = 1
• May differ if I is asymmetric
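The two cost expressions are easy to compare numerically. The sketch below simply evaluates the formulas from the slide; the function names and the sample values in the usage note are assumptions chosen for illustration.

```python
def distributed_cost(n, C, I):
    """Global altruism: every node computes (nC) and information is exchanged pairwise (n(n-1)I/2)."""
    return n * C + n * (n - 1) * I / 2


def centralized_cost(n, C, I):
    """Elected controller: one computation (C) plus collection and distribution (2(n-1)I)."""
    return C + 2 * (n - 1) * I
```

For example, with hypothetical values n = 10 and C = I = 1, the distributed cost is 55 against 19 for the centralized approach, and the gap widens quadratically in n.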
Improving Global Altruism
• Global altruism is clearly inferior to a centralized solution for a single problem.
• However, suppose radios reported information to, and used information from, a common database
– n(n-1)I/2 => 2nI
• And suppose different radios are concerned with different problems with costs C1,…,Cn
• Centralized
– Resources = 2(n-1)I + sum(C1,…,Cn)
– Time = 2(n-1)I + sum(C1,…,Cn)
• Distributed
– Resources = 2nI + sum(C1,…,Cn)
– Time = 2I + max(C1,…,Cn)
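A small sketch evaluating the resource and time expressions above, to show where the distributed approach gains once a common database is available. The function names and the sample costs in the usage note are hypothetical.

```python
def centralized_with_database(n, I, costs):
    """Centralized: all n problems are solved one after another at a single node."""
    total = 2 * (n - 1) * I + sum(costs)
    return {"resources": total, "time": total}


def distributed_with_database(n, I, costs):
    """Distributed: each radio reads from and writes to the database, then solves its own problem in parallel."""
    return {"resources": 2 * n * I + sum(costs), "time": 2 * I + max(costs)}
```

With n = 4, I = 1 and hypothetical problem costs [3, 1, 2, 2], the centralized approach needs 14 units of both resources and time, while the distributed approach needs 16 units of resources but only 5 units of time.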
Comments on Cost Adjustments
• Permits more flexibility than policy
– If a radio really needs to deviate, then it can
• Easy to turn off and on as a policy tool
– Example: protected user shows up in a channel, cost to use that channel goes up
– Example: prioritized user requests channel, other users’ cost to use prioritized user’s channel goes up (and back down when done)
Example Application:
• Overlay network of secondary users (SU) free to adapt power, transmit time, and channel
• Without REM:
– Decisions solely based on link SINR
• With REM
– Radios effectively know everything
Upshot: A little gain for the secondary users; big gain for primary users
From: Y. Zhao, J. Gaeddert, K. Bae, J. Reed, “Radio Environment Map Enabled Situation-Aware Cognitive Radio Learning Algorithms,” SDR Forum Technical Conference 2006.
Comments on Radio Environment Map
• Local altruism also possible
– Less information transfer
• Like policy, effectively needs a common language
• Nominally could be centralized or distributed database
• Read more:
– Y. Zhao, B. Le, J. Reed, “Infrastructure Support – The Radio Environment MAP,” in Cognitive Radio Technology, B. Fette, ed., Elsevier 2006.
5/11/2010
Summary and Conclusions
• Summary of Critical Concepts
• The Future Role of Game Theory in the Design and Regulation of Dynamic Spectrum Access Networks
• Topics for Further Study and Research
WSU May 10, 2010
What does game theory bring to the design of cognitive radio networks? (1/2)
• A natural “language” for modeling cognitive radio networks
• Permits analysis of ontological radios
– Only know goals and that radios will adapt towards their goals
• Simplifies analysis of random procedural radios
• Permits simultaneous analysis of multiple decision rules – only need goal
• Provides condition to be assured of possibility of convergence for all autonomously myopic cognitive radios (weak FIP)
What does game theory bring to the design of cognitive radio networks? (2/2)
• Provides condition to be assured of convergence for all autonomously myopic cognitive radios (FIP, not synchronous timing)
• Rapid analysis
– Verify goals and actions satisfy a single model, and steady-states, convergence, and stability follow
• An intuition as to what conditions will be needed to field successful cognitive radio decision rules.
• A natural understanding of distributed interactive behavior which simplifies the design of low complexity distributed algorithms
Game Models of Cognitive Radio Networks
• Almost as many models as there are algorithms
• Normal form game excellent for capturing single iteration of a complex system
• Most other models add features to this model
– Time, decision rules, noisy observations, natural states
• Some can be recast as a normal form game
– Extensive form game
• Normal Form Game
– ⟨N, A, {ui}⟩
• Supermodular Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} \geq 0$
• Potential Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} = \frac{\partial^2 u_j(a)}{\partial a_j \partial a_i}$
• Repeated Game
– ⟨N, A, {ui}, {di}⟩
• Asynchronous Game
– ⟨N, A, {ui}, {di}, T⟩
• Extensive Form Game
– ⟨N, A, {ui}, {di}, T⟩
• TU Game
– ⟨N, v⟩
• Bargaining Game
– ⟨F, v⟩
Steady-states
• Different game models have different steady-state concepts
• Games can have many, one, or no steady-states
• Nash equilibrium (and its variants) is most commonly applied concept
– Excellent for distributed noncollaborative algorithms
• Games with punishment and coalitional games tend to have a very large number of equilibria
• Game theory permits analysis of steady-states without knowing specific decision rules
Steady-state concepts:
• Nash Equilibrium
• Strong Nash Equilibrium
• Core
• Shapley Value
• Nash Bargaining Solution
Optimality
• Numerous different notions of optimality
• Many are contradictory
• Use whatever metric makes sense
Notions of optimality:
• Pareto Efficiency
• Objective Maximization
• Gini Index
• Shapley Value
• Nash Bargaining Solution
Convergence
• Showing existence of steady-states is insufficient; need to know if radios can reach those states
• FIP (potential games) gives the broadest convergence conditions
• Random timing actually helps convergence
Noise
• Unbounded noise causes all networks to theoretically behave as ergodic Markov chains
• Important to show Lyapunov stability
• Noisy observations appear as noisy implementation to an outside observer
– Trembling hand
Game Theory and Design
• Numerous techniques for improving the behavior of cognitive radio networks
• Techniques can be combined
• Potential games yield lowest complexity implementations
– Judicious design of goals, actions
• Practical limitations limit effectiveness of punishment
– Observing actions
– Likely best when a referee exists
• Policy can limit the worst effects, doesn’t really address optimality or convergence issues
• Supermodular games
– Steady state exists
– Best response convergence
• Potential games
– Identifiable steady-states
– All self-interested decision rules converge
– Lyapunov function exists for isolated equilibria
• Punishment
– Can enforce any action tuple
– Can be brittle when distributed
• Policy
– Limits worst case performance
• Cost function
– Reshapes preferences
– Could damage underlying structure if not a self-interested cost
• Centralized
– Can theoretically realize any result
– Consumes overhead
– Slower reactions
Future Directions in Game Theory and Design
• Integrate policy and potential games
• Integration of coalitional and distributed forms
• Increasing dimensionality of action sets
– Cross-layer
• Integration of dynamic and hierarchical policies and games
Future Direction in Regulation
• Can incorporate optimization into policy by specifying goals
• In theory, correctly implementing goals, correctly implementing actions, and exhaustive self-interested adaptation is enough to predict behavior (at least for potential games)
– Simpler policy certification
• Provable network behavior
Avenues for Future Research on Game Theory and CRNs
• Integration of bargaining, centralized, and distributed algorithms into a common framework
• Cross-layer algorithms
• Better incorporating performance of classification techniques into behavior
• Asymmetric potential games
• Bargaining algorithms for cognitive radio
• Reducing the brittleness of punishment in distributed implementations with imperfect observations
• Imperfection in observations in general
• Time varying game models while inferring convergence, stability…
• Combination of policy, potential games, coalition formation, and token economies
• Can be modeled as a game with two types of players
– Distributed cognitive radios
– Dynamic policy provider
Questions?
www.crtwireless.com