stochastic coalitional games for cooperative random access ... · stochastic coalitional games for...

Stochastic Coalitional Games for CooperativeRandom Access in M2M Communications

Mehdi Naderi Soorki, Walid Saad, Mohammad Hossein Manshaei, and Hossein Saidi

Abstract—In this paper, the problem of random access con-tention between machine type devices (MTDs) in the uplink ofa wireless cellular network is studied. In particular, the possi-bility of forming cooperative groups to coordinate the MTDs’requests for the random access channel (RACH) is analyzed.The problem is formulated as a stochastic coalition formationgame in which the MTDs are the players that seek to formcooperative coalitions to optimize a utility function that captureseach MTD’s energy consumption and time-varying queue length.Within each coalition, an MTD acts as a coalition head thatsends the access requests of the coalition members over theRACH. One key feature of this game is its ability to copewith stochastic environments in which the arrival requests ofMTDs and the packet success rate over RACH are dynamicallytime-varying. The proposed stochastic coalitional is composedof multiple stages, each of which corresponds to a coalitionalgame in stochastic characteristic form that is played by theMTDs at each time step. To solve this game, a novel distributedcoalition formation algorithm is proposed and shown to convergeto a stable MTD partition. Simulation results show that, on theaverage, the proposed stochastic coalition formation algorithmcan reduce the average fail ratio and energy consumption ofup to 36% and 31% for a cluster-based distribution of MTDs,respectively, compared to a noncooperative case. Moreover, whenthe MTDs are more sensitive to the energy consumption (queuelength), the coalitions’ size will increase (decrease).

Index Terms— Game Theory; Machine-to-Machine Communica-tions; Internet of Things; Coalitional Games.

I. INTRODUCTION

Machine-to-machine (M2M) communication between ma-chine type devices (MTDs) such as sensors or wearableswill lie at the heart of tomorrow’s Internet of Things (IoT)system [1]. In order to support massive M2M communica-tions, there is a need for a reliable wireless infrastructure. Inthis respect, cellular networks provide an ideal platform forM2M communications, due to their proven effectiveness andreliability. However, deploying M2M over cellular networkssuch as LTE faces many challenges that range from networkdeployment to resource allocation and multiple access [1]–[6].

In particular, in cellular LTE systems, whenever a deviceintends to access the network, it begins by following a ran-dom access (RA) procedure that is done before the resource

M. Naderi Soorki, M. H. Manshaei, and H. Saidi are with the De-partment of Electrical and Computer Engineering, Isfahan University ofTechnology, Isfahan 84156-83111, Iran. e-mail: [email protected],{manshaei,hsaidi}@cc.iut.ac.ir.

W. Saad is with the Wireless@VT, Bradley Department of Electrical andComputer Engineering, Virginia Tech, Blacksburg, VA 24061, USA, e-mail:[email protected]

This research was supported by the Office of Naval Research (ONR) underGrant N00014-15-1-2709.

Manuscript received 13 April 2016; revised 25 Sep. 2016, second revised15 Jan. 2017, third revised 24 April 2017, accepted 15 June 2017.

allocation phase as explained in [4] and [7]. During the RAprocedure, each device needs to send a preamble signal over aspecial channel, known as the physical random access channel(RACH), which is used to transmit an initial preamble. Inthe RA procedure of a cellular network, the RACH is formedby a periodic sequence of allocated time-frequency resources,called random access (RA) slots. These slots are reserved inthe uplink channel of the network for the transmission ofaccess requests [7]. Thus, the transmission of preambles thatshows the requests made by MTDs for uplink resource, issynchronized during the RA slots. Each device selects thepreamble uniformly from the available preambles [4]. LTEtypically uses a contention-based random access procedurefor the initial association to the network, for the request ofresources for transmission, and for re-establishing a connectionupon failure [4] and [7]. The contention-based RA procedureconsists of a four-message handshake between any device suchas MTDs and the base station (BS) in order to successfullytransmit an initial preamble. The RA procedure in existingcellular systems has been mainly designed for human-to-human (H2H) communication scenarios in which the amountof uplink (UL) traffic is normally lower than the downlink(DL) traffic. In contrast, M2M applications will producesignificantly more UL traffic than in the downlink [1]. Inan M2M scenario, during the RA process of LTE, a largenumber of MTDs will simultaneously attempt to access ashared preamble. This can have several drawbacks such as to alow random access success rate, the waste of radio resources,packet loss, latency, and extra energy consumption as pointedout in [1] and [4].

Several recent works have proposed new techniques forreducing RA congestion in M2M scenarios such as in [4], [8]–[13]. A number of such works, such as [4], [9], and [10], focuson the RA process that involves preamble selection, designinga new preamble sequence, and efficient preamble allocation.In [11], the authors propose a new backoff algorithm whilethe work in [14] introduces a prioritized RA architecture. Theobjective in these works is to improve RA efficiency. On theother hand, there has been a number of recent works suchas [6], [12], [13], [15]–[19], and [20] that focus on how tocluster MTDs in an efficient way so as to decrease the loadover the RACH. An enhanced RA scheme based on spatialgrouping for reusable preamble allocation is proposed in [12]and [13]. This scheme reuses the preamble resources based onspatial grouping during the RA procedure. A group mobilitymanagement mechanism is studied in [15] using which MTDsare grouped based on the similarity of their mobility patternsat the location database, and only the leader machine performsmobility management. Clustering techniques based on quality-

arX

iv:1

707.

0775

9v1

[cs

.GT

] 2

4 Ju

l 201

7

2

of-service (QoS) requirements have been proposed [16], [18]and [19], for power allocation and energy-efficient M2Mcommunications. Optimal cluster formation and power controlfor maximizing throughput and minimizing transmit power arederived using mixed-integer non-linear programming in [6]and [20].

These existing works for M2M clustering such as in [6],[12], [13], [15]–[19], and [20] have mainly focused on cluster-ing using spatial metrics such as the distance between MTDsand some basic QoS metrics. Moreover, these works rely oncentralized algorithms that are normally used to find the opti-mal clustering of the network. However, a centralized approachrequires collecting a significant amount of information on therandom arrival of requests at the level of the MTDs as wellas on the random collisions that can occur over the RACH. Inan M2M network, this information collection must be updatedevery time slot due to the stochastic changes in the arrival ofrequests to the MTDs and the possible collisions. If done ina centralized manner, such a dynamic information update willsignificantly increase the signaling overhead over the uplink ofthe M2M network and, thus, will not be practical. In additionto signaling overhead, a centralized approach to coalitionformation is generally known to be NP-complete as shownin [21], especially for large number of MTDs. The complexityof such centralized approach grows exponentially with thenumber of MTDs because the number of all possible partitionsfor MTD set given by a value known as the Bell number [21].Thus, to decrease the complexity and signaling overheadit is highly desirable to equip the MTDs with distributedcooperative strategies that require little or no reliance oncentralized entities such as base stations. Moreover, clusteringMTDs in a practical cellular network must account not onlyfor spatial metrics and QoS such as in [6], [12], [13], [15]–[19] and [20], but also for the stochastic changes in the M2Mcommunication environment. Thus, the need for a stochasticcoalition formation approach results from the fact that, in prac-tice, an M2M communication network is highly dynamic andstochastic in nature. This stochastic nature stems from variousfeatures of the M2M environment such as random arrival ofaccess requests to the MTDs and random preamble collisionover the RACH. Note that modeling and capturing the variousdynamics of the M2M system, such as the random requeststhat arrive at the MTDs or random collision over RACH isvery challenging even for a single-cell scenario. In fact, thisdynamic clustering problem has not been considered in any ofthe existing literature on M2M such as the works in [6], [12],[13], [15]–[19] and [20] that also consider a single BS. Theadvantages of using a stochastic coalition formation approachare: 1) distributed solutions do not require any database thatrecords information such as the MTDs’ locations or stochasticenvironment changes such as the random arrival of requestsat the level of the MTDs or the collisions that can occur overthe RACH, 2) the signaling overhead for updating dynamicallyvarying information decreases in a coalitional game solutiondue to the fact that the MTDs will autonomously performcoalition formation to adapt to the new changes without anyneed to send any information to a centralized controller, 3)the complexity of clustering MTDs is more manageable in

a distributed coalitional game solution because the MTDswill individually perform distributed coalition formation and,unlike in the centralized approach, there is no need to searchover all the partitions of the MTDs’ set, and 4) a coalitionalgame formulation allows understanding how each MTD canmake its own decision on forming cooperative group in aself-configuring M2M network. Here, we note that the useof overlapping coalition formation approaches such as theones in [22] and [23] is not suitable for M2M commu-nication scenarios. Overlapping coalition game models areuseful in problems in which devices can further improve thesystem performance and efficiency by splitting their coalitionmembership between multiple, overlapping coalitions [22]and [23]. In the cooperative M2M random access problem,having an overlap between two coalitions of MTDs will notlead to sharing additional preambles between the overlappingcoalitions. Moreover, the incoming packet rate of the queuesof devices belonging to overlapping coalitions will increase.Consequently, as proposed in this work, one must adopt themore tractable non-overlapping coalition formation approachesfor M2M clustering.

The main contribution of this paper is to analyze theRA procedure for M2M communications and design a newcoalition formation protocol using which the MTDs can au-tonomously form clusters or coalitions in the presence ofstochastic arrival requests and a stochastic number of success-fully transmitted packets over the RACH. We formulate theproblem as a stochastic coalition formation game in whichthe MTDs are the players. In this game, the MTDs seek tocooperate with one another in order to coordinate their RAand use of the RACH. In particular, within each coalition,a coalition head sends the access requests of the coalition’smembers. The performance of each coalition is captured viaa utility function that reflects the number of requests that thecoalition members want to send over the RACH and the energyconsumption of its members during each time slot. To solvethis game, we propose an algorithm that enables the MTDs toform the optimal coalitions while optimizing a utility functionthat captures stochastic changes such as the arrival requests ofMTDs and the packet success rate of the RACH. Under thesestochastic changes, we show that the proposed algorithm canreach a stable partition, if the MTDs are sufficiently farsightedand they value future payoffs more than the present ones.For this game, we compute the required threshold of thefarsighted level of the MTDs that is needed to form stablecoalitions under proposed algorithm. Simulation results showthat the proposed approach can reduce the fail ratio and energyconsumption compared to a traditional noncooperative randomaccess model. The results show that, on the average, theproposed stochastic cooperative random access model providesa reduction of the fail ratio and energy power up to 36% and31% for a cluster-based distribution of MTDs, respectively,compared to a noncooperative case. We note that, althoughcoalitional game theory has been used in many works relatedto wireless communication such as [24], [25] and [26], tothe best of our knowledge, none of these existing works hasdeveloped a stochastic game model, in general, and for M2Mcommunication, in particular. In summary, the novelty of our

3

contribution, compared to existing work, lies in the followingkey points:

• We develop a new, M2M-specific model for the stochasticvalue function of an M2M coalition. This model showsthat, after forming coalitions, the value function of thecoalitions will change during the next time slots andthe players will become uncertain about their payoff infuture time slots. Then, in a given time slot, we use anew coalitional game class, known as games in stochasticcharacteristic function form.

• In addition to modeling the cooperation of MTDs duringone time slot as a coalitional game in stochastic char-acteristic function form, we modeled the cooperation ofMTDs during different time slots as a stochastic coalitiongame. In each stage of the stochastic coalition game,the cooperation of MTDs is modeled using the gamementioned in the previous bullet.

• For finding stable coalitions under unknown stochasticchanges in the value function, we have developed a novelcoalition formation algorithm that explicitly accounts forthe presence of a discount factor δ in the value function.

The rest of this paper is organized as follows. Section IIpresents the noncooperative random access model in a cellularLTE network. In Section III, we model the problem usingstochastic coalition formation in games. Simulation results arepresented and analyzed in Section IV. Finally, conclusions aredrawn in Section V.

II. SYSTEM MODEL

Consider a cellular network composed of one BS and aset M of M MTDs that seek to access the network’s uplinkresources to send their data. In this network, the RACHincludes µ preambles. In each time slot, each MTD willrandomly transmit one of these preambles to the BS. Weconsider an infinite number of discrete time slots {1, 2, ..., t, ...}with duration T for each RA slot. In each RA slot, every MTDm ∈ M transmits its access request with a fixed probabilityp. This is a practical assumption when considering the accessbarrier algorithm that is used to reshape and distribute thetraffic over the RACH, as discussed in [14] and [27]. Sinceour model focuses on the RA process, we do not considerhuman type devices as their impact will be equivalent to theMTDs.

When the MTDs are acting in a noncooperative manner,if an MTD m selects an RA preamble while other MTDsthat want to transmit RA requests do not select that specificpreamble, then MTD m can successfully transmit an RArequest since there will be no collisions. Hence, if onlyMTD m submits an RA request then the probability of itssuccessful transmission will be equal to the probability ofdata transmission p. Thus, the probability that an MTD msuccessfully transmits an RA request in a noncooperativemanner is given by:

PMs = p(1 − p)M−1 +

M∑j=2

1µ

(1 − 1

µ

) j−1PM ( j), (1)

where PM ( j) = M!j!(M−j)! pj(1 − p)M−j is the probability that j

MTDs out of a total of M MTDs transmit their access requests

during the current RA time slot. In (1), 1µ

(1 − 1

µ

) j−1is the

probability that MTD m selects a preamble which is differentfrom the preambles that were selected by the other j−1 MTDs.

Due to the random nature of the arrival requests at eachMTD and the departure requests over the RACH, we define adiscrete-time queueing system with Qm,t being the number ofrequests that MTD m has buffered at time slot t. We assumethat the maximum number of requests in this queue is K . WhenMTDs do not cooperate, the change in the queue length of eachMTD m at each time slot is given by:

Qm,t+1 = max{Qm,t − dm,t + am,t,K}, (2)

where am,t and dm,t represent, respectively, the number ofarrival and departure requests for MTD m conditioned on thepacket success rate over the RACH at time slot t. Qm,t canbe modeled as a G/G/1 queue which represents the queuelength in a system with a single server where inter-arrival timeshave a general (arbitrary) distribution and service times havea (different) general distribution [28]. In particular, we havePr(am,t = 1) = p and Pr(dm,t = 1) = PMs . (2) models the B-bitpreambles as a queue at the MAC layer. Some packets can bedropped due to the limited length of the buffer at the MAClayer or due to collision over the RACH. At the MAC layer,the loss is defined by the packet loss rate. For sending oneB-bit preamble from the queue of the MAC layer, each MTDmust transmit B bits over the RA slot.

For simplicity, we assume that each MTD transmits a single,fixed-sized packet request of size B bits to the BS with transmitpower Pz

LR. Let ZLR be the number of subcarriers in thenetwork, with each subcarrier having a bandwidth Bz . Lethzm = H0 |dm |−νξ be the channel gain for cellular link between

MTD m and BS, where H0 is the path loss constant, dm isthe distance between MTD m and the BS, ν is the path lossexponent, and ξ is the flat fading Rayleigh parameter withmean 1. The achievable rate of the cellular link between MTDm and BS for subcarrier z can be given by:

Rm = Bz log2(1 +PzLRhz

m

N0 +∑

n,m,z′=zhz′n Pz′

LR

), (3)

where∑

n,m,z′=zhz′n Pz′

LR is the interference received from other

MTDs over the cellular link. (3) shows the achievable rate overthe resource of RA slot in the physical layer of the cellularlink. The loss in the physical layer is due to channel gain,noise, and interference which are captured by (3). The timeneeded to send the B-bit packet will be B

Rm. We introduce

a power allocation mechanism that allocates power over thecellular links to guarantee B

Rm≤ T at each time slot. Thus,

EzLR = Pz

LRBRm

. The energy consumption per-packet over thecellular link ELR is defined as the total energy spent duringthe RA procedure until the successful transmission of the firstpacket over the RACH [4]. In such a noncooperative manner,the average per-request energy consumption of each MTD mis given by the series: EM

LR = PMs EzLR + PMs (1 − PMs )2Ez

LR +

PMs (1 − PMs )23EzLR + ... which can be written as follow:

4

EMLR =

∞∑t=1

PMs (1 − PMs )t−1tEzLR =

∞∑t=1

PMs (1 − PMs )t−1EzLR +

∞∑t=2

PMs (1 − PMs )t−1(t − 1)EzLR =

∞∑t=1

PMs (1 − PMs )t−1EzLR+

(1 − PMs )( ∞∑t=2

PMs (1 − PMs )t−2(t − 1)EzLR

)=

EzLR

PMs. (4)

Here, if the probability of successful transmission overRACH is equal to 1, PMs = 1, the average per-request energyconsumption of each MTD m will be Ez

LR because there is nocollision over the RACH.

The massive amount of incoming access requests stem-ming from MTDs can lead to a low packet success rateof the RACH, increased packet loss, intolerable latency, andincreased energy consumption [4], [29]. For M2M communi-cations, the number of access requests that the MTDs can sendis a more important metric than the bit rate or throughput. Thisis due to the fact that, the MTDs usually need to send data at avery low bit rate (M2M traffic payload size is small) becausethe size of the messages is generally very short in M2Mapplications (e.g. very few bits coming from a smart meteror sensor, or even just 1 bit used to inform of the existence orabsence of a given event) [4]. Thus, in the presence of a mas-sive number of MTDs, one must develop new approaches todecrease packet loss and energy consumption. Consequently,the goal for each MTD m is to minimize two objectives: queuelength Qm(t) and energy consumption ETL . For each MTD m,the objective can be viewed as a multi-objective optimizationproblem in which the MTD must balance the tradeoff betweenqueue length and energy consumption. According to linearscalarization technique [30], the single objective of a multi-objective optimization scalarized problem is the weightedsummation of multiple objectives, with the weights being theparameters of the scalarization. Thus, for each MTD m theobjective can be given by:

min(αmQm,t + βmEMTL), (5)

where αm and βm are the weights or preferences of MTDm with respect to the queue length and average energyconsumption per-packet, respectively. These parameters areused to adjust the scales and units of the queue length (numberof packets) and the energy consumption (joule).

Formally, a coalition S ⊆ M is defined as a subset of M,while a partition Πt = {S1,S2, ...,S |Πt |} is a set of mutuallydisjoint coalitions that span all ofM at time slot t. Whenever acoalition S of MTDs forms, its members exchange the accessrequests in their queues , Qm,t , over short-range (SR) M2Mchannels. Thus, the queue of requests in the coalition Si isequal to QSi,t =

∑m∈Si Qm,t . Let Q(t) = {Q1,Q2, ...,Q |Πt |} be

a set that represents the lengths of the queues of all coalitionsat time slot t. Once an MTD in S is selected as a coalitionhead, it will be responsible to send the access request of itscoalition’s members over the RACH. This coalition head willbe referred to as a machine type head (MTH). In our model,

the set of MTHs of all coalitions within a partition Πt is Ht =

{H1,H2, ...,H |Πt |} at time slot t.Let aSi,t and dSi,t be, respectively, the arrival rate of

requests to Si and the departure rate of requests from coalitionSi at time slot t. During each time slot, aSi,t can change from0, which means that none of the MTDs in the coalition Si issending an access request, to |Si | which implies that all of theMTDs in the coalition Si need to send an access request. Theprobability that aSi,t = n, where 0 ≤ n ≤ |Si |, is given by:

Pr(aSi,t = n) = |Si |!n!(|Si | − n)! pn(1 − p) |Si |−n. (6)

Here, dSi,t can be 0 or 1, because the head of coalition Sican successfully send data or a RACH collision may occur,during each time slot. The probability that dSi,t = 1 is:

Pr(dSi,t = 1) = PHts =

1µ

(1 − 1

µ

) |Ht |−1, (7)

where |Ht | −1 is the number of all coalition heads except thatof coalition Si . The queues of all of these |Ht | − 1 heads arenot empty and all of them want to access to RACH duringtime slot. Then, the evolution of queue length of coalition Sican be given by:

QSi,t+1 = max{QSi,t − dSi,t + aSi,t, k}. (8)

The evolution of the queue length at each MTD m in thecoalition Si , can be given by:

Qm,t+1 = max{Qm,t − wqmdSi,t + am,t, k}, (9)

where wqm is a coefficient that is related to fair scheduling.

In this regard, each coalition head applies a fair schedulingscheme to select the request of its coalition’s members and,subsequently, send it over RACH. For example, if we usea simple round-robin scheme in which the head will collectsequentially one request from each member in the coalitionSi from queue Qi , the coefficient in (9) will be w

qm =

1|Si | .

We define PzSR

as transmission power over a direct M2Mlink between a MTD and MTH. Let ZSR be the subcarriersfor M2M links, with each subcarrier having a bandwidth Bz .Further, we let hz

im = H0 |dim |−νξ as the channel gain for theM2M link between MTD m and head of the coalition i wheredim is the distance between MTDs i and m. The achievable rateof the M2M link between MTD m and head of the coalitioni for subcarrier z can be given by:

Rim = Bz log2©«1 +

PzSR

hzim

N0 +∑

n,m,z′=zhz′nmPz′

SR

ª®®¬ . (10)

Here, we assume that M2M communications occur oversubcarriers that are orthogonal to the uplink cellular commu-nication links. Thus, there is no interference between uplinkcellular links and M2M links. Since M2M links are sharedamong MTDs to transmit data to the MTHs, there is interfer-ence among M2M links. The decoding success probability ofMTHs depends on the interference received from other MTDs.Assuming an interference limited regime, each MTH listenssuccessfully receives the MTD packets during one time slot

5

MTD1

MTD2

MTD4

MTD3

All machine type devices want to access to the RACH.

Only the heads of coalitions want to access to the RACH. (CFR protocol)

Heavy load RACH

MTD5

MTD6

MTD7

MTD2

MTD4

MTD3

MTD6

MTD7

Low load RACH

MTD5

MTD1

H = fH1; H2; H3g M = fMTD1; MTD2; :::; MTD7g

S1

S2

S3

¼ = fS1;S2;S3g

Fig. 1. An illustration example of stochastic coalition formation and tradi-tional RACH for M = 7.

if the channel gains between members of each coalition andallocated power should be high enough to guarantee conditionB

Rim≤ T . The per-packet energy consumption over M2M link

ESR is defined as the total energy consumed by an MTD totransmit a single fixed-sized packet request to the MTH of itscoalition. It is given by ESR = Pz

SRB

Rim.

Fig. 1 shows the noncooperative and cooperative randomaccess model of 7 MTDs for M2M applications. Under anoncooperative random access model, all 7 MTDs individuallysend their access request on their cellular links to BS. Inthis case, the RACH model is overloaded by all 7 MTDs.However, under a cooperative random access model, MTDs,which are within the coverage of the short-range M2M links,form a cooperative coalition. Following Fig. 1, MTD 3, MTD6 and MTD 7 form coalition S1, MTD 2 forms coalition S2and MTD 1, MTD 4 and MTD 5 form coalition S3. Thus,during time slot t, the 7 MTDs form cooperative coalitionsΠ = {S1,S2,S3}. In this case, the RACH load is affected justby the three MTHs of the formed coalitions which is muchless than 7 MTDs in noncooperative model.

A. Optimal Solution for Cooperative Random Access in M2MCommunications

We present an optimization formulation for the cooperativerandom access in M2M communications. If global networkinformation is available, then the optimal coalitions can becomputed centrally at the base station. Given a networkpartition Π composed of N coalitions, then, let kmn be a binaryvariable such that kmn = 1 if MTDm belongs to the coalitionSn otherwise kmn = 0. The centralized optimization problemwill be:

min{Π,[kmn]M×N }

∑m

∞∑t=t0

δt−t0Qm,t, (11)

∞∑t=t0

δt−t0(wmELR(Π) + ESR(Si)

)≤ Emax, ∀m ∈ M, (12)

kmn × krnBT≤ Rrm, ∀Sn ∈ Π, (13)∑

n

kmn = 1 , ∀m ∈ M, (14)

∑m∈M

∑n∈N

kmn = M, (15)

kmn ∈ {0, 1}, ∀m ∈ M, ∀Si ∈ Π. (16)

The objective function in (11) minimizes the discounted sumof the queue lengths of the MTDs over time. (12) guaranteesthat the discounted sum of per-request energy consumption ofall MTDs is less than its predefined maximum value. (13)indicates that the bit rate of the M2M link between twoMTDs in each coalition should be high enough to allow thetransmission of one B-bit packet between them. (14) showsthat each MTD can only be on one coalition. (15) guaranteesthat all MTDs are considered in the coalition formation.(16) shows that kmn is a binary. (11) is a combinatorialoptimization problem with an exponential search space. Thereason is that the number of all partitions given by a valueknown as the Bell number which grows exponentially with thenumber of MTDs in the coalitions [21] and [31]. (11) can besolved by potentially deterministic exhaustive search algorithm[21]. However, due to size of our problem, we apply geneticalgorithms which are a class of stochastic search algorithmsthat have been used widely to solve large-scale NP-completecombinatorial optimization problems including searching foroptimal coalition structures [31].

III. STOCHASTIC GAME-THEORETIC MODEL FOR M2MCOOPERATION

In this section, we model the proposed cooperative randomaccess model for M2M communication over the RACH usingcooperative game theory [32]. First, we focus on the coop-eration of MTDs during each time slot given the stochasticqueue lengths of MTDs and we model it using a coalitiongame in stochastic characteristic function form (CGSC) [33].The goal of this CGSC is to find the formed coalitionsduring each time slot. Then, for modeling the cooperation ofMTDs during different time slots, we use the framework ofstochastic coalitional games (SCGs) [34]. An SCG captureshow the formation of cooperative coalitions can change due tostochastic factors, such as random arrivals. Thus, in our model,an SCG can be seen as a repeated game with one of its stagesbeing a CGSC that is played by MTDs. Finally, we propose analgorithm to solve the SCG in stochastic characteristic functionform and find a stable partition.

A. Coalitional game formulation during one time slot

In each time slot, a coalitional game in stochastic charac-teristic function form is uniquely defined by the pair (M, vt ),where the setM of players is the set of MTDs and vt : Πt →R |Si | is a stochastic value that reflects the utilities achievedby the coalitions formed at a given time slot t. This gameis different from a classical coalitional game such as in [35]in that the coalitional value is stochastic. First, from (7) and(8), we can see that queue length of each coalition, whichdepends on the packet success rate over RACH, is affected byother coalitions. Thus, the cooperation model in each time slotcan be mapped to a coalitional game in partition form which

6

can be significantly challenging to solve when the value isstochastic [36]. To over come this challenge, the cooperatingMTDs will assume that all of the other MTDs at each time slotwant to access the RACH and do not form any coalition. Suchan assumption maps to a conservative strategy in which theMTDs in each coalition assume the worst-case collision ratefrom other MTDs. This is in line with existing works suchas the jamming games in [37], in which it is assumed thatall other players jam a coalition in a cognitive radio network.Consequently, in this worst case scenario, the probability ofthe packet success rate over RACH, dSi,t = 1, is given by:

Pr(dSi,t = 1) = PHs =1µ(1 − 1

µ)M−|Si | . (17)

In this case, we can compute the value of each coalitionindependently of the coalition decisions of other MTDs. Now,the value of a coalition Si will depend only on its members,vt : Si → R |Si | , and, thus, we have a game in characteristicfunction form [36].

Given the dynamic changes in queues , Qt , and the probabil-ity of the packet success rate, PHt

s , over the RACH, the valueof the coalitions will randomly change over time. Thus, duringeach time slot t, the stochastic value of this game is a randomvariable which consists of two components (ud , ur ) [33]: adeterministic value (ud) gained in the current time slot and arandom value (ur ) that captures the prospective gains in futuretime slots.

1) Deterministic value for the current time slot: The de-terministic value for the current time slot of a coalition canbe directly computed by the members of a coalition sinceit is a certain outcome. This deterministic component is notrelated to the random value changes that will happen infuture time slots. We consider two cooperation schemes forthe proposed RACH coalitional game: altruistic cooperation,in which MTDs cooperate to maximize the value of theircoalition or selfish cooperation in which MTDs may cooperateto increase their individual payoffs.

Altruistic cooperation: altruistic MTDs seek to decreasethe overall group queue length and group energy consumptionof their formed coalition. In such a scenario, MTDs are mainlyconcerned with the overall gain of the coalition rather thantheir individual payoffs. In this case, the deterministic currenttime slot value of the coalition ud(Si,QSi,t ) ∈ R is a realvalue that is equal to the payoff of each machine MTD m inthe coalition Si . For each MTD m in Si , the payoff functionwill be equal to the value function::

ud(Si,QSi,t ) = −αm |QSi,t |−βm

(EHLR

|Si |+ ESR

)−γm |Si |, (18)

where γm is a unit cost parameter for MTD m. EHTL and QSi,t

are respectively given by (4) and (8), when we use (17) as theprobability of the packet success rate over the RACH whichis same as the probability of departure from the queue.

Selfish cooperation: each selfish MTD will seek to decreaseits individual queue length and energy consumption in theformed coalition, while disregarding the overall social welfareof the entire coalition. In this case, the deterministic currenttime slot value of the coalition is no longer a function over

3

Qi(t)Qi(t)¡ 1 Qi(t) + 1 Qi(t) + jSij

Pr(di;t = 1,ai;t = 0) Pr(di;t = 0,ai;t = 1)+Pr(di;t = 1,ai;t = 2)

Pr(di;t = 0,ai;t = jSij)

Pr(di;t = 0,ai;t = 2)+Pr(di;t = 1,ai;t = 3)

Qi(t) + 2

Pr(di;t = 0,ai;t = 0)+Pr(di;t = 1,ai;t = 1)

Fig. 2. The transition probability model for queue changes during one timeslot.

the real line. Instead, the deterministic current time slot valueof a coalition is a set of payoff vectors, Ud(Si,QSi,t ) ⊆ R |Si | ,where each element ud,m(Si,QSi,t ) of any vector ud ∈ Ud

represents the payoff of each MTD m in the coalition Si ,which is given by:

ud,m(Si,QSi,t ) = −αm |Qm,t | − βm(wemEH

LR + ESR) − γm |Si |,(19)

where EHTL and Qm,t are respectively given by (4) and (9) when

we use (17) as a probability of the RACH packet success rate(which is the same as the probability of departure from thequeue) and we

m is the fairness weight for time of being head.2) Random value for future time slots: To capture the

random component of the value of each coalition, we considerdiscounted future rewards for each MTD. For each MTD mat time slot t, a discounted future reward is the sum of futurepayoffs during the following time slots, t+1, t+2, ..., which arediscounted by a constant factor [38]. If we consider the samediscount factor δ, 0 ≤ δ ≤ 1, for all MTDs, then a higher δwill reflect MTDs that are farsighted and, thus, more interestedin the payoff of the next time slots rather than the currentone. Thus, the random values ur (Si,QSi,t ) for the altruisticscheme, and ur,m(Si,QSi,t ) for the selfish scheme that capturethe payoff during future time slots, will be given by:

ur (Si,QSi,t ) =nδ∑

n=t+1δn−t ud(Si,QSi,t→n)

ur,m(Si,QSi,t ) =nδ∑

n=t+1δn−t ud,m(Si,QSi,t→n) (20)

Here, nδ is a discrete value where δm ' 0 for nδ ≤ m.ud(Si,QSi,t→n) and ud,m(Si,QSi,t→n) are stochastic processeswhich stochastically change from current time slot t to thefuture time slot n. ud(Si,QSi,t→n) and ud,m(Si,QSi,t→n) arethe expectation of random changes of ud(Si,QSi,t→n) andud,m(Si,QSi,t→n) from time slot t to the time slot n. Thesevalues depend on the change in the queue during n − t timeslots, QSi,t→n. The stochastic change of the queue in one timeslot, QSi,t→t+1, is modeled by the probabilistic model shownin Fig. 2. Since the arrivals and departures of requests fromeach coalition are independent event, Pr(di,t = i, ai,t = j) =Pr(di,t = i) × Pr(ai,t = j) where Pr(di,t = i) and Pr(ai,t = j)are given by (6) and (7). In general, ud(Si,QSi,t→n) is:

ud(Si,QSi,t→n) = ∆nt ud(Si,QSi,n), for the altruistic scheme, and

ud,m(Si,QSi,t→n) = ∆nt ud,m(Si,QSi,n), for the selfish scheme.

(21)

7

Here, ud(Si,QSi,n) and ud,m(Si,QSi,n) are given by (18) and(19), respectively, and ∆n

t is the probability of transition fromQSi,t to QSi,n during n−t time slots. This probability dependson the (n− t)-hop path from QSi,t to QSi,n in Fig. 2. This pathreflects how the queue of coalition Si changes during n−t timeslots, QSi,t → QSi,n. The probability of change in the queueis equal to the sum of probabilities over all paths from QSi,tto QSi,n:

∆nt =

∑all paths: QSi , t→QSi ,n

Pr(QSi,t → QSi,n). (22)

From Fig. 2, we can see that there is a one-hop path betweenQSi,t and QSi,t+1. This path captures the stochastic change ofa queue during one time slot. This one-hop path reflect theevent during which queue QSi,t changes to QSi,t + j wherej = −1, 0, ..., |Si |. The probability that the queue changes dueto a one-hop path, p(QSi,t → QSi,t + j), can be found in Fig.2. For ∆t+2

t , the number of paths becomes greater than one.For example, if QSi,t+2 = QSi,t +3 then there are |Si |+2 two-hop paths during 2 time slots to change the queue length fromQSi,t to QSi,t + 3, which are QSi,t → QSi,t + i → QSi,t + 3for i = −1, 0, ..., |Si |. Then, the probability of each path iscalculated by multiplying the probability of two hops of Fig. 2.

After computing the deterministic value for current time slotand the random value for future time slots, we can explicitlydefine the value of each coalition or payoff function of eachMTD. For instance, when MTDs are altruistic, the value ofcoalition Si is equal to the payoff function of each MTD inthe coalition as follows:

va(Si,QSi,t ) = ud(Si,QSi,t0 ) + ur (Si,QSi,t0 ). (23)

When MTDs are altruistic, the value of each coalition Siwill be a real value that is equal to the benefit achievedby each individual MTD. If the MTDs are selfish, then thedeterministic current time slot value and random value forfuture time slots will be, respectively, a set of payoff vectorsUd(Si,QSi,t ) and Ur (Si,QSi,t ) ⊆ R |Si | . Thus, the value of acoalition, which is the sum of the deterministic current timeslot value and random value for future time slots, becomesa set of vectors, Vs(Si,QSi,t ) ⊆ R |Si | with each elementvsm(Si,QSi,t ) of any vector vs ∈ Vs is the payoff functionof each MTD in the coalition. The payoff function of a givenMTD m in coalition Si is given by:

vsm(Si,QSi,t ) = ud,m(Si,QSi,t0 ) + ur,m(Si,QSi,t0 ). (24)

Following (23) and (24), in the CGSC, the value of acoalition and the payoff of each MTD will depend on theformed coalition and its queue. Since the queue cannot bearbitrary shared among members of coalition, the CGSC is agame with non-transferable utility (NTU) [36].

B. Random access as an M-person stochastic coalition game

Next, we model the cooperation of MTDs during differenttime slots as an SCG [34]. In each stage of this SCG, thecooperation of MTDs is modeled using a CGSC [33]. Theplayers in the SGC are also the MTDs in the setM. The SCGis played in such a way that, at each time slot t, the MTDs

decide on their membership in a formed coalition Π(t). Theirdecisions depend on the random conditions of the game. Thestate of the SCG at a slot t is denoted by ht = (Π(t),Q(t)). Thisstate is a two-dimensional random variable with discrete finitestates which means MTDs form coalitions Π(t) while Q(t)is the set of queue of the coalitions. Q(t) is stochasticallychanging at each time slot t. Consequently, the RACH M-person stochastic coalition game in stochastic characteristicfunction (MSCF) is uniquely defined by the triplet (M, vt, ht ).Thus, the MTDs choose the actions that lead to formingdisjoint coalitions and each MTD receives a stochastic payoffvt at each time slot t which is given by (23) or (24).

The formed MTD coalitions may randomly change duringdifferent time slots, because the arrival and departure processesof the coalitions’ queues can stochastically change. Therefore,at each stage of an MSCF, the MTDs form the most suitablecoalitions depending on the queues. Let ht = (Π(t),Q(t)) bethe state of the coalition formation process at time slot t.As shown in [35], the process of coalition formation is astochastic process ζ which starts from an initial state h0 andmoves to another state ht following the stochastic changes inthe queues of the coalitions when going form the time slot 0 tothe time slot t. Our goal is to find a stable state of ζ which isessentially a stable partition, Π∗, ofM. Consequently, we canfind how coalitions are formed by MTDs given the stochasticchanges of their queues.

First, we define the moves or decisions that are going tobe used in our proposed algorithm. Consider a state ht =(Π(t),Q(t)) and a coalition Si , then, we make the followingdefinition:

Definition 1. Let FSi (ht ) be the set of states achievable by aone-step coalitional move (by Si) which changes the coalitionformation, Π(t), when an M-person stochastic coalition gamein stochastic characteristic function form is in state ht .

There are three types of moves for each MTD m at eachtime slot t. The first is Si , which means that the members ofSi do not change their coalition. The second type of movesfor each MTD m is Cm = {C1, C2, ..., C|Si |−1}, where Ck is ak-person coalition for MTD m that consists of MTD m andk − 1 members from Si . Cm is the set of all possible k-personcoalitions that MTD m ∈ Si can form with k < |Si | othermembers in the coalition Si . The total number of moves inCm is |Cm | =

∑ |Si |−1k=1 Ck where |Ck | = ( |Si |−1)!

(k−1)!( |Si |−k−2)! . Thelast type of move occurs when MTD m in the coalition Siasks other coalition Sj to form a new larger coalition whichis Si ∪ Sj . Consequently, the total of one-step move for eachMTD m in the coalition Si given by:

FmSi (ht ) = Si ∪ Cm ∪ {Si j |∀ j , i}. (25)

Therefore, a one-step coalitional move is equal to the unionof the one-step moves of all of the MTDs in Si denotedby FSi (ht ) = {∪m∈SiFm

Si (ht )}. After determining all possibleone-step coalitional moves, we can now define the concept ofa profitable move [39]:

Definition 2. Si has a (weakly) profitable move from Π1t

(under ζ) if there is Π2t ∈ FSi (ht ) (with Π2(t) , Π1(t)) such

8

TABLE ISTOCHASTIC COALITION FORMATION ALGORITHM FOR M2M

COMMUNICATION

Inputs: M, Π(t0), Q(t0)Initialize:

Set initial state and discount factor as ht0 = {Π(t0), Q(t0)} and δ,respectively.Find the discrete value nδ where δm ' 0 for nδ ≤ m.Stage 1:

(a) For each coalition Si , calculate stochastic change of thequeue, QSi , t0→t0+nδ , until time slot t0+nδ , using the transitionprobability model in Fig. 2.(b) Determine the value functions of MTDs in the partition Π(t0),using (20).(c) For each MTD m ∈ Si , find Cm including all possible k-person coalitions.(d) For each MTD m ∈ Si , find all possible coalitions such asSj that forming Si ∪ Sj is a profitable move.

Stage 2:(a) For each coalition Sk ∈ Fm

Si(ht ), calculate stochastic change

of the queue, QSk , t→t+nδ, until time slot t0 + nδ , using the

transition probability model in Fig. 2.(b) Do the most preferable one-step coalitional move in FSi (ht ).

Stage 3: while Π(t0) changes for two consecutive iterationsrepeat Stage 1 to Stage 2Output: Stably formed coalition: Π∗

that vm(Π2(t), ζ) ≥ vm(Π1(t), ζ) for all m ∈ Si . Si has a strictlyprofitable move from Π1(t) if there is Π2(t) ∈ FSi (ht ) suchthat vm(Π2(t), ζ) > vm(Π1(t), ζ) for all m ∈ Si .

By using the notion of a profitable move, we will proposean M-person stochastic coalition formation algorithm that canbe used to enable the MTDs to perform distributed coalitionformation under a stochastic characteristic function form. Thephases of algorithm are:

Initial Phase: At each time slot t0, the initial partition canbe a singleton network partition: Π(t0) = {{1}, {2}, ..., {M}}or a grand coalition:Π(t0) = {1, 2, ..., M}.

Sequential Phase: The stochastic coalition formation algo-rithm keeps on iterating over all the MTDs in the networkuntil all MTDs decide to stay in their current coalition, whichindicates that the algorithm has converged to a final stablenetwork partition Π∗(t0). In each iteration, each MTD mmember of a the coalition Si follows its most preferable one-step coalitional move which is in Fm

Si (ht ).A summary of the stochastic coalition formation algorithm

is presented in Table I. Next, we analyze the convergenceand stability of the proposed algorithm. First, we define adeterministic process of coalition formation as follow [39]:

Definition 3. For all formed coalitions Π1t and Π2

t , a process ofcoalition formation is said to be deterministic if the transitionprobability from coalition Π1

t to coalition Π2t in each time slot

t, p(Π1t ,Π

2t ) is in {0, 1}.

Lemma 1. The proposed M-person stochastic coalition forma-tion algorithm in Table I is a deterministic process of coalitionformation.

Proof. During the coalition formation process in Table I, whencoalition Π1

t is formed at a given step, if there is at least oneprofitable move for MTDs, the algorithm in Table I will goto coalition Π2

t in the next step with probability one. If there

is no profitable move for MTDs, the probability to changethe formed coalition from Π1

t to Π2t is zero. Therefore, the

algorithm in Table I deterministically changes the coalitionformation process. �

Definition 4. Under a stochastic change of the coalition value,a formed partition Π∗ is said to be stable, if no MTD m inthe coalition S∗i nor any group of MTDs can form anothercoalition. For a stable coalition, Π∗, the following conditionsare satisfied for all S∗i and S∗j ∈ Π∗:

vm(S∗i ,QSi,t ) ≥ vm(Ck,QCk,t ), k = 1, 2, ..., |S∗i | − 1,

vm(S∗i ,QSi,t ) ≥ vm(S∗i ∪ S∗j ,QSi,t +QSj,t ). (26)

Based on (26), in a stable partition, Π∗, no MTD m in S∗ican benefit by forming any k-person coalition that is differentfrom the current coalition S∗i nor by enabling its coalition toform a new, bigger coalition with other coalition S∗j . Thus, ina stable partition, each MTD m prefers to stay in its currentcoalition. As shown in [39, Theorem 4.1], for any deterministicprocess of coalition formation in characteristic function form,there exists a discount factor δ∗ ∈ (0, 1) such that for anycollection of discount factors in (δ∗, 1), that deterministicprocess converges to the unique limit which is the coalitionsthat will effectively form. According to [39, Theorem 4.1] andLemma 1, we can state the following result for our proposedalgorithm in Table I.

Theorem 1. For the proposed deterministic M-person stochas-tic coalition formation algorithm in Table I, there exists adiscount factor δ∗ such that for any collection of discountfactors in (δ∗, 1), the unique limit of algorithm in Table I willbe Π∗ with v(S∗i ,QSi,t ) . δ∗ can be found from the followingconditions:∞∑

n=t+1δn−t

(∆nt ud(S∗i ,QS∗i ,n) − ∆

nt ud(Ck,QCk,n)

)≥

ud,m(Ck,QCk,t ) − ud,m(S∗i ,QS∗i ,t ),∀S∗i ∈ Π∗, k = 1, 2, ..., |S∗i | − 1, and,∞∑

n=t+1δn−t


nt ud(S∗i ∪ S∗j ,QS∗i∪ S∗j,n)

)≥

ud,m(S∗i ∪ S∗j ,QS∗i∪ S∗j,t ) − ud,m(S∗i ,QS∗i ,t ),∀S∗j ∈ Π∗,S∗j , S∗i . (27)

Proof. See Appendix A. �

Following Lemma 1, Theorem 1, and Definition 4, we cansee that there exists δ∗ ∈ (0, 1) such that for all discount factorsin (δ∗, 1), the proposed algorithm in Table I will converge toa stable partition Π∗.

C. Complexity and Stability Analysis

The complexity of the algorithm in Table I lies in thecomplexity of the preferable one-step coalitional movement.Therefore, a good measure of complexity will be the number

9

of possible (per coalition) movements. For a given networkstructure, one possible move for a coalition Si is to form anew larger coalition Si ∪ Sj . This move implies that a givencoalition will try to merge with other coalitions while ensuringthat the bit rate between the members of the merged coalitionis equal to or larger than the desirable rate, i.e., B

T ≤ Rim. Inthe most complex case, the total number of merge attemptswill be M(M−1)

2 . The second type of coalition moves occurswhen the MTDs in a given coalition Si want to split and forma k-person coalition out of coalition Si where k < |Si |. Thetotal number of split moves for each MTD m is |Cm |. The mostcomplex case for splitting occurs when the network forms anM-person grand coalition. The total number of split attemptswill be

∑Mm=1 |Cm |. In terms of computations, for each move-

ment, the value function should be calculated under stochasticchanges. This calculation depends on the queue length in thecoalition and the value of δ. A larger value of δ leads to theneed for computing additional terms in (14) and (16). To showthe effect of δ on the number of terms of the series in (14)and (16), for each value of δ, we define a discrete value nδwhere δm ' 0 for nδ ≤ m. This means that value functionshould be calculated until next nδ time slots. Following (14)and (16), in the worst case, the complexity of calculating thevalue function is

(min{M,K}

)nδ where M is the maximumsize of grand coalition and K is maximum available buffersize. Consequently, the complexity of the proposed algorithmin Table I is O

( ( ∑Mm=1 |Cm | + M(M−1)

2)×

(min{M,K}

)nδ ) .In practice, the merge process requires a significantly lower

number of attempts because the MTDs can only attemptto merge with coalitions within their communication rangethat also satisfy the bit rate requirement. Since the size ofa coalition is typically small and limited due to the costfor cooperation, the split moves will be limited to findingall possible partitions for small sets. Consequently, we willseldom encounter an M−sized coalition. In fact, most formedcoalitions will be of size K that is much smaller than M .

IV. SIMULATION RESULTS AND ANALYSIS

For our simulations, we consider a single cell served by oneBS that is located at the center of a 400 m × 400 m square area.We consider two distribution models of the MTDs around theBS: (A) uniform and (B) cluster-based distributions. In theuniform distribution model, the MTDs randomly distributedacross square area. In a cluster-based distribution, we considercluster centers with density of 5 × 10−5 cluster/m2 whoselocations result from the realization of a Poisson point process.In each cluster, the locations of the MTDs result from therealization of a normal distribution around the cluster center.The number of MTDs per cluster is determined by a Poissondistribution whose mean value is equal to the ratio of thenumber of MTDs to the number of cluster centers in eachrealization. The time slot for complete RA request is set toT = 1 millisecond. The number of available preambles forRACH is µ = 256. The maximum power consumption on eachRB for sending data over the cellular link to the BS is set to25 dBm. The maximum size of each queue is K = 30. Weassume that the maximum power consumption used by MTDs

for sending requests to a coalition head is 5 dBm on eachRB [19] during each time slot. The bandwidth of each resourceblock is 15 kHz. We consider a 2 GHz carrier frequency. Thenoise power spectral density N0 is âLŠ170 dBm per Hz. Weconsider a path loss exponent of 2.5 and a Rayleigh fadingwith mean 1 for the channel model. Moreover, the length ofeach packet is set to 50 bits. We consider that the probabilityof sending a request for each MTD, p, is equal to 0.3. The unitcost parameter, γ is 0.2. For each number of MTDs, we applyproposed stochastic coalition formation algorithm in Table Ito find out the cooperative groups.

A. Effect of the number of MTDs

Figs. 3, 4, and 5 show the effect of the number of MTDson the stochastic cooperative random access model. In thesefigures, the number of MTDs is varied from 200 to 1000.

Fig. 3 shows the network’s fail ratio which is defined asthe ratio between the number of requests that collided andthe total number of requests that all MTDs sent over theRACH during each time slot. From Fig. 3, we can see that, byincreasing the number of MTDs, the fail ratio increases. Thisis due to the fact that, when the number of MTDs increases,the average arrival rate of the queue in each formed coalitionincreases. Consequently, sending the access requests over theRACH increases, and, thus, the fail ratio resulting from thestochastic cooperative random access model will increase tothe same value of traditional noncooperative random accessmodel. On the average, Fig. 3 shows that the stochasticcooperative random access model reduces the fail ratio ofaround 14% and 34%, for uniform distribution and cluster-based distribution, respectively, compared to the traditionalrandom access protocol, and, on the average, the stochasticcooperative random access model increases the fail ratio ofaround 17% and 32%, for uniform distribution and cluster-based distributions, respectively, compared to the optimalsolution. However, the complexity of the optimal solutionis significantly larger than the complexity of our proposeddistributed algorithm. In particular, the complexity of theoptimal solution grows exponentially with the number ofMTDs while the required iterations for convergence of theproposed algorithm is restricted to the number of merge-and-split moves per coalition. Thus, the proposed solution offersa better balance between complexity and performance.

In Fig. 4, we show the average per-MTD energy consump-tion for transmitting over the RACH. From Fig. 4, we can seethat the average per-MTD energy consumption resulting fromthe proposed stochastic cooperative random access model isless than the one resulting from the traditional noncooperativerandom access model. The reason is that, in each coalition, allof the MTDs share the energy consumed for acting as head.Fig. 4 also shows that selfish and altruistic coalition formationachieve an almost equal fail ratio and energy consumption.When the MTDs are distributed according to cluster-based dis-tribution, the performance of the proposed coalition formationalgorithm improves for both the fail ratio and energy con-sumption. Fig. 4 also shows that, on the average, the proposedapproach reduces the energy consumption of around 16% and

10

200 300 400 500 600 700 800 900 1000

Number of MTDs

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8F

ail a

cce

ss r

atio

pe

r M

TD

Noncooperative case

Coalition formation: self (uniform)

Coalition formation: altru (uniform)

Coalition formation: self (cluster)

Coalition formation: altru (cluster)

Optimal clustering: (uniform)

Optimal clustering: (cluster)

Fig. 3. Effect of number of MTDs on fail ratio.

200 300 400 500 600 700 800 900 1000

Number of MTDs

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

Energ

y c

onsum

ption p

er

MT

D [m

Joule

]

Noncooperative case





Optimal clustering: (uniform)

Optimal clustering: (cluster)

Fig. 4. Effect of number of MTDs on the energy consumption per MTD.

31% for uniform distribution and cluster-based distributions,respectively, compared to a traditional random access protocol.Moreover, on the average, the stochastic cooperative randomaccess model consumes around 14% and 27% more energycompared to the optimal solution, for the uniform and cluster-based distribution, respectively.

Fig. 5 shows the price of anarchy which is defined asthe ratio between the utility achieved by the MTDs at theconvergence of the proposed stochastic cooperative randomaccess model and the utility achieved by the MTDs underthe optimal solution. Note that we are interested in price ofanarchy values that are close to 1 (or 100%) in which case theformed cooperative groups at the convergence of the proposedstochastic coalitional game provide a good approximation ofthe optimal solution. In Fig. 5, we can see that, by increasingthe number of MTDs, the price of anarchy decreases due tothe fact that a a network having more MTDs will providemore opportunities for cooperation. Thus, the number of stablecooperative groups resulting from the proposed stochasticcooperative random access model is greater than the number ofclusters under the optimal solution in (11). The highest price ofanarchy is observed when the MTDs are distributed accordingto a cluster-based distribution under altruistic cooperation. Onthe average, Fig. 5 shows that the price of anarchy is 80% and93% (76% and 83%) for uniform distribution and cluster-baseddistribution under altruistic (selfish) cooperation, respectively.

200 300 400 500 600 700 800 900 1000

Number of MTDs

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

Price o

f anarc

hy





Fig. 5. Price of anarchy as function of the number of MTDs.

Thus, on the average, the performance gap between the pro-posed distributed algorithm and the optimal solution are 20.5%and 13.5% for the uniform and cluster-based distributions,respectively. However, the complexity and signaling overheadof our proposed distributed algorithm are much smaller thanthe signaling overhead and complexity of the optimal solution.

Fig. 6 shows how the number of iterations needed for con-vergence of the proposed algorithm in Table I, for two differentvalues of the discount factor δ when the MTDs are altruistic.From this figure, we can see that, when the number of MTDsis 600 for uniform (cluster-based) distribution, the number ofiterations will be 150 and 115 (195 and 165) for discount factorvalues of 0.5 and 0.8, respectively. By increasing the numberof MTDs, the number of iterations increases. When the numberof MTDs increases, the MTDs will form more coalitionswith smaller size. This means that MTDs in a large coalitionwould prefer to split into smaller ones. In the cluster-baseddistribution, the size of coalitions is larger than in the uniformdistribution. Thus, the number of iterations needed to formstable coalitions is smaller for the cluster-based distribution.When the discount factor increases, the MTDs become moremotivated to stay in their coalition for future payoff, thus thenumber of iterations decreases. Such a number of iterationsis quite reasonable for a dense network having thousandsof MTDs, as it implies roughly about 3 to 4 iterations perdevice. As explained next, such a convergence time, whichis needed only for the initial coalition formation process, ispractical. In essence, an initial delay is required for achievingthe stability of all coalitions. In addition, the convergenceresults shown correspond to a network which starts with aninitial state in which each MTD is a singleton, noncooperativeplayer. Such an initial state is, in fact, the worst-case, interms of convergence time. Once these singleton MTDs formtheir initial set of coalitions, if there is a need to re-runthe coalition formation process due to stochastic changes inenvironment, the algorithm will be performed starting fromthe last convergence state not the initial state. Naturally, sucha process will require fewer iterations than in the initial state,as the MTDs would have already self-organized into an initialset of coalitions. Moreover, in practice, the changes in the

11

200 300 400 500 600 700 800 900 1000

Number of MTDs

0

50

100

150

200

250

300

350

400N

um

be

r o

f ite

ratio

ns

Coalition formation: self (uniform):δ=0.8


Coalition formation: altru (cluster):δ=0.8

Coalition formation: altru (cluster):δ=0.5

Fig. 6. Effect of number of MTDs on the number of iterations.

200 300 400 500 600 700 800 900 1000

Number of MTDs

0

1

2

3

4

5

6

7

8

9

Num

ber

of m

oves p

er

coalition



Coalition formation: self (cluster):δ=0.8

Coalition formation: self (cluster):δ=0.5

Fig. 7. Effect of number of MTDs on the number of moves per coalition.

location or incoming traffic of MTDs happen only within alimited area of the network. Thus, there is no need to performthe proposed distributed algorithm for all of the MTDs.

Fig. 7 shows how the average number of moves per coalitionchanges as a function of the network size and the discountfactor, in the case of selfish cooperation. For example, whenthe number of MTs is 600 for the uniform (cluster-based)distribution, the average number of moves per coalition will bearound 1 and 2 (3 and 6) for δ = 0.5 and δ = 0.8, respectively.As the number of MTDs increases, the average number ofmoves per coalition decreases. Moreover, the average numberof moves per coalition becomes smaller for higher valuesof δ. This decrease in the average number of moves percoalition can be explained as follow. When the number ofMTDs increases, the fail ratio (energy consumption) of MTDsachieved by our proposed coalition formation under stochasticchanges decreases (increases) (See Figs. 3 and 4), and theMTDs prefer to form smaller-sized coalitions. This naturallyjustifies the smaller number of moves per coalition. Moreover,as δ increases, the MTDs become more farsighted and, as aresult, they will be more inclined stay in their coalitions insteadof moving to form other coalitions.

B. Effect of preferences

As shown in Figs. 8 and 9, we consider three scenarios:Scenario I in which MTDs are more sensitive to the queue

Scenario I Scenario II Scenario III0.1

0.2

0.3

0.4

0.5

0.6

0.7

Fa

il a

cce

ss r

atio

pe

r M

TD

Coalition formation: self (uniform):γ=0.2

Coalition formation: altru (uniform):γ=0.2

Coalition formation: self (cluster):γ=0.2

Coalition formation: altru (cluster):γ=0.2





Fig. 8. Effect of preference of MTDs on fail ratio when M = 500.

length than the energy consumption (α = 0.9, β = 0.1), andScenario II in which MTDs are equally sensitive to the queuelength and as energy consumption (α = 0.5, β = 0.5), andScenario III where MTDs favor the queue length less thanenergy consumption (α = 0.1, β = 0.9). To show the effectof the signaling overhead, we analyze the results of the threescenarios for two different values of γ, i.e., 0.05 and 0.2.The number of MTDs is 500 and the probability of sendingrequests to the BS is 0.3.

According to Fig. 8 and Fig. 9, in Scenario I (III) theaverage fail ratio and energy consumption increase (decrease)under the stochastic cooperative random access model whencompared with Scenario II. In addition, when the cost of thesignaling overhead decreases, the average fail ratio and energyconsumption decrease in all of the scenarios. For example,when γ = 0.2, under the selfish (altruistic) cooperation schemefor uniform (cluster-base) distribution, the average fail ratiosare 0.62, 0.44, and 0.38 (0.34, 0.28, and 0.21), and theenergy consumption per MTD will be 0.82, 0.53, and 5.1(0.47, 0.44, and 0.41) mJoules for Scenarios I, II, and III,respectively. Here, we observe that the coalition size decreases(increases) in Scenario I (III) and the number of formedcoalitions increases (decreases) in Scenario I (III). An increasein the number of formed coalitions leads to additional loadover the RACH and a larger coalition size leads to less energyconsumption for the members of the coalition. Consequently,the fail ratio and energy consumption increase for Scenario I,but the fail ratio and energy consumption decrease for ScenarioIII. When the unit cost parameter of the signaling overheadbecomes smaller for the MTDs, the coalition size increases.Thus, the fail ratio and energy consumption decreases whenγ = 0.05.

V. CONCLUSION

In this paper, we have proposed a novel queue lengthand energy-aware cooperation scheme for optimizing RACHaccess in M2M cellular networks. In the proposed model, theMTDs autonomously engage in a coalition formation processto coordinate their RACH access by forming cooperativegroups. We have modeled the cooperation of MTDs duringeach time slot as a coalitional game in stochastic characteristicfunction form. In this game, the MTDs seek to optimize apayoff function that includes two components: a deterministic

12

Scenario I Scenario II Scenario III0.3

0.4

0.5

0.6

0.7

0.8

0.9E

nerg

y c

onsum

ption p

er

MT

D [m

Joule

]Coalition formation: self (uniform):γ=0.2








Fig. 9. Effect of preference of MTDs on energy consumption when M = 500.

component related to the certain payoff achieved at the currenttime slot and a random term that corresponds to the correspondstochastic payoff achieved in future time slots. To model thecooperation of MTDs during different time slots, we have usedan M-person stochastic coalition game. To solve this game,we have introduced a novel coalition formation algorithm thatis shown to reach a stable partition. We have considered notonly the energy consumption but also queue length of accessrequests as the payoff of the MTDs. We have shown that,despite the stochastic changes in the payoff of the MTDs,the MTDs can form stable coalitions, if they are sufficientlyfarsighted. Simulation results have also shown that on theaverage, the proposed stochastic coalition formation algorithmcan significantly reduce the fail ratio and energy consumptionof the system. For future work, we can analyze the system inpresence of multiple base stations and inter-cell interference.

REFERENCES

[1] F. Ghavimi and H. H. Chen, “M2M communications in 3GPP LTE/LTE-A networks: architectures, service requirements, challenges, and appli-cations,” IEEE Communications Surveys & Tutorials, vol. 17, no. 2, pp.525–549, May 2015.

[2] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, andM. Ayyash, “Internet of things: a survey on enabling technologies,protocols, and applications,” IEEE Communications Surveys & Tutorials,vol. 17, no. 4, pp. 2347–2376, June 2015.

[3] R.-S. Cheng, C.-M. Huang, and G.-S. Cheng, “A D2D cooperative relayscheme for machine-to-machine communication in the LTE-A cellularnetwork,” Proc. of IEEE Int. Conf Information Networking (ICOIN), pp.153–158, Cambodia, Jan 2015.

[4] A. Laya, L. Alonso, and J. Alonso-Zarate, “Is the random access channelof LTE and LTE-A suitable for M2M communications? a survey ofalternatives,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1,pp. 4–16, February 2014.

[5] M. Z. Shafiq, L. Ji, A. X. Liu, J. Pang, and J. Wang, “Large-scale mea-surement and characterization of cellular machine-to-machine traffic,”IEEE/ACM Trans. on Networking, vol. 21, no. 6, pp. 1960–1973, Dec2013.

[6] F. Hussain, A. Anpalagan, and M. Naeem, “Multi-objective MTCdevice controller resource optimization in M2M communication,” Proc.IEEE Biennial Symposium on Communications, pp. 184–188, Kingston,Canada, June 2014.

[7] G. T. . V10.4.0, “Evolved universal terrestrial radio access (e-utra);physical channels and modulation,” Dec. 2011.

[8] M. Hasan, E. Hossain, and D. Niyato, “Random access for machine-to-machine communication in LTE-advanced networks: issues and ap-proaches,” IEEE Communications Magazine, vol. 51, no. 6, pp. 86–93,June 2013.

[9] A. Ilori, Z. Tang, J. He, K. Blow, and H.-H. Chen, “A random channelaccess scheme for massive machine devices in LTE cellular networks,”Proc. of IEEE Int. Conf. on Communications (ICC), pp. 2985–2990,London, UK, June 2015.

[10] H. Thomsen, N. K. Pratas, C. Stefanovic, and P. Popovski, “Code-expanded radio access protocol for machine-to-machine communica-tions,” Trans. on Emerging Telecommunications Technologies, vol. 24,no. 4, pp. 355–365, 2013.

[11] X. Yang, A. Fapojuwo, and E. Egbogah, “Performance analysis andparameter optimization of random access backoff algorithm in lte,” Proc.of IEEE Vehicular Technology Conference (VTC), pp. 1–5, Quebec City,Canada, Sep 2012.

[12] T. Kim, H. S. Jang, and D. K. Sung, “An enhanced random accessscheme with spatial group based reusable preamble allocation in cellularM2M networks,” IEEE Communications Letters, vol. 19, no. 10, pp.1714–1717, Oct 2015.

[13] H. S. Jang, S. M. Kim, K. S. Ko, J. Cha, and D. K. Sung, “Spatial groupbased random access for M2M communications,” IEEE CommunicationsLetters, vol. 18, no. 6, pp. 961–964, June 2014.

[14] J. P. Cheng, C. h. Lee, and T. M. Lin, “Prioritized random access withdynamic access barring for RAN overload in 3GPP LTE-A networks,”Proc. of IEEE GLOBECOM Workshops, pp. 368–372, Houston, USA,Dec 2011.

[15] H.-L. Fu, P. Lin, H. Yue, G.-M. Huang, and C.-P. Lee, “Group mobilitymanagement for large-scale machine-to-machine mobile networking,”IEEE Trans. on Vehicular Technology, vol. 63, no. 3, pp. 1296–1305,March 2014.

[16] S. Y. Lien, K. C. Chen, and Y. Lin, “Toward ubiquitous massive accessesin 3GPP machine-to-machine communications,” IEEE CommunicationsMagazine, vol. 49, no. 4, pp. 66–74, April 2011.

[17] H. K. Lee, D. M. Kim, Y. Hwang, S. M. Yu, and S. L. Kim, “Feasibilityof cognitive machine-to-machine communication using cellular bands,”IEEE Wireless Communications Magazine, vol. 20, no. 2, pp. 97–103,April 2013.

[18] C. Y. Tu, C. Y. Ho, and C. Y. Huang, “Energy-efficient algorithms andevaluations for massive access management in cellular based machineto machine communications,” Proc. of IEEE Vehicular TechnologyConference (VTC), pp. 1–5, San Francisco, USA, Sep 2011.

[19] C. Y. Ho and C. Y. Huang, “Energy-saving massive access controland resource allocation schemes for M2M communications in OFDMAcellular networks,” IEEE Wireless Communications Letters, vol. 1, no. 3,pp. 209–212, June 2012.

[20] S.-E. Wei, H.-Y. Hsieh, and H.-J. Su, “Joint optimization of cluster for-mation and power control for interference-limited machine-to-machinecommunications,” Proc. of IEEE Global Communications Conference(GLOBECOM), pp. 5512–5518, Anaheim, CA, USA, Dec 2012.

[21] M. A. O. S. T. Sandholm, K. Larson and F. Tohme, “Coalition structuregeneration with worst case guarantees,” Artificial Intelligence, vol. 111,no. 1, pp. 209–238, July, 1999.

[22] X. Lu, P. Wang, and D. Niyato, “A layered coalitional game frameworkof wireless relay network,” IEEE Transactions on Vehicular Technology,vol. 63, no. 1, pp. 472–478, Jan, 2014.

[23] Y. Xiao, K. C. Chen, C. Yuen, Z. Han, and L. A. DaSilva, “Abayesian overlapping coalition formation game for device-to-devicespectrum sharing in cellular networks,” IEEE Transactions on WirelessCommunications, vol. 14, no. 7, pp. 4034–4051, July, 2015.

[24] Z. Han, W. Saad, T. Basar, and A. Hjørungnes, Game theory inwireless and communication networks: theory, models, and applications.Cambridge University Press, 2012.

[25] W. Saad, Z. Han, M. Debbah, A. Hjørungnes, and T. Basar, “Coalitionalgame theory for communication networks,” IEEE Signal ProcessingMagazine, vol. 26, no. 5, pp. 77–97, Sep 2009.

[26] Y. Zhang, F. Li, X. Ma, K. Wang, and X. Liu, “Cooperative energy-efficient content dissemination using coalition formation game overdevice-to-device communications,” Canadian Journal of Electrical andComputer Engineering, vol. 39, no. 1, pp. 2–10, January 2016.

[27] R. G. Cheng, J. Chen, D. W. Chen, and C. H. Wei, “Modeling andAnalysis of an Extended Access Barring Algorithm for Machine-TypeCommunications in LTE-A Networks,” IEEE Transactions on WirelessCommunications, vol. 14, no. 6, pp. 2956–2968, June 2015.

[28] R. B. Cooper, Introduction to queueing theory. North Holland, 1981.[29] A. Larmo and R. Susitaival, “RAN overload control for machine type

communications in LTE,” Proc. of IEEE GLOBECOM Workshops, pp.1626–1631, Anaheim, CA, USA, Dec 2012.

[30] C. Hwang and M. A. Syed, Multiple objective decision makingâATmeth-ods and applications: a state-of-the-art survey. Springer Science &Business Media, 2012.

13

[31] S. Sen and P. S. Dutta, “Searching for optimal coalition structures,” pp.287–292, 2000.

[32] Z. Dawy, W. Saad, A. Ghosh, J. G. Andrews, and E. Yaacoub, “To-ward massive machine type cellular communications,” IEEE WirelessCommunications, vol. 24, no. 1, pp. 120–128, February 2017.

[33] D. Granot, “Cooperative games in stochastic characteristic functionform,” Management Science, vol. 23, no. 6, pp. 621–630, 1977.

[34] J. Kaluski, “N-Person stochastic games with coalitions,” Cybernetics andSystems Analysis, vol. 38, no. 3, pp. 387–395, 2002.

[35] R. J. Aumann and S. Hart, Handbook of game theory with economicapplications. Elsevier, 1992, vol. 2.

[36] P. Young and S. Zamir, Handbook of game theory. Elsevier, 2014.[37] L. Xiao, J. Liu, Q. Li, N. B. Mandayam, and H. V. Poor, “User-centric

view of jamming games in cognitive radio networks,” IEEE Trans. onInformation Forensics and Security, vol. 10, no. 12, pp. 2578–2590, Dec2015.

[38] Y. Shoham and K. Leyton-Brown, Multiagent systems: algorithmic,game-theoretic, and logical foundations. Cambridge University Press,2008.

[39] H. Konishi and D. Ray, “Coalition formation as a dynamic process,”Journal of Economic Theory, vol. 110, no. 1, pp. 1–41, 2003.

APPENDIX

A. Proof of Theorem 1

In this appendix, we are going to prove the minimum of δ.Considering the stability conditions in (26) and Ck in Cm:

vm(S∗i ,QSi,t ) ≥ vm(Ck,QCk,t ),

following (24), we can write:

ud,m(S∗i ,QS∗i ,t ) + ur,m(S∗i ,QS∗i ,t ) ≥ ud,m(Ck,QCk,t )+ ur,m(Ck,QCk,t ),

according to (20), we can write:

ud,m(S∗i ,QS∗i ,t ) +∞∑

n=t+1δn−t ud(S∗i ,QSi∗,t→n) ≥

ud,m(Ck,QCk,t ) +∞∑

n=t+1δn−t ud(Ck,QCk,t→n),

considering (21), we can write:

ud,m(S∗i ,QS∗i ,t ) +∞∑

n=t+1δn−t∆n

t ud(S∗i ,QS∗i ,n) ≥

ud,m(Ck,QCk,t ) +∞∑

n=t+1δn−t∆n

t ud(Ck,QCk,n),

where ∆nt can be calculated by (22). By some simplification,

we can write:∞∑

n=t+1δn−t


nt ud(Ck,QCk,n)

)≥

ud,m(Ck,QCk,t ) − ud,m(S∗i ,QS∗i ,t ).

Consequently, we can find out the minimum of δ byconsidering all of these conditions simultaneously:∞∑

n=t+1δn−t


nt ud(Ck,QCk,n)

)≥

ud,m(Ck,QCk,t ) − ud,m(S∗i ,QS∗i ,t )∀S∗i ∈ Π∗, k = 1, 2, ..., |S∗i | − 1, and,∞∑

n=t+1δn−t


nt ud(S∗i ∪ S∗j ,QS∗i∪ S∗j,n)

)≥

ud,m(S∗i ∪ S∗j ,QS∗i∪ S∗j,t ) − ud,m(S∗i ,QS∗i ,t )∀S∗j ∈ Π∗,S∗j , S∗i ,

Due to considering all of these conditions together, theminimum of δ is given.

stochastic coalitional games for cooperative random access ... · stochastic coalitional games for...

Documents