Performance Evaluation 36–37 (1999) 249–266
www.elsevier.com/locate/peva

Convergence of a dynamic policy for buffer management in shared buffer ATM switches

Supriya Sharma a, Yannis Viniotis b,*

a Strategic Network Planning, Alcatel USA, Plano, TX 75075, USA
b Department of ECE, North Carolina State University, Raleigh, NC 27695-7911, USA
Abstract
Lack of effective buffer management can lead to severe cell loss in shared buffer ATM switches. With the aim of reducing cell loss, we study a class of non-anticipative buffer management policies that always admit cells to the buffer while there is space in it and may push out a cell when the buffer becomes full. We propose a dynamic algorithm that operates without any knowledge of the arrival process; we consider a two-class system and show that the algorithm is optimal when the arrivals to each class are a superposition of identically distributed and independent Bernoulli processes. © 1999 Published by Elsevier Science B.V. All rights reserved.
Keywords: ATM shared buffer switches; Cell loss; Dynamic pushout algorithms; Convergence of Markov chains; Buffer management
1. Introduction
In this paper we study a 2 × 2 shared buffer ATM switch, shown in Fig. 1. Each arrival, called a cell in this paper, has a predetermined destination, either output line 1 or output line 2. If the output line is busy at the time of cell arrival, the cell waits in a finite (size $K$) common buffer that is shared between cells of the two output lines. The transmission time of all cells is constant (and equal to 2.83 µs for ATM lines at 155 Mbps). Even though the cells of the two classes share a common storage of size $K$ cells, each output line has its own logical queue. There is a buffer management policy, $f$, which controls the entry of cells into the buffer and their departure from it. This policy is the focus of our study.

Fig. 1. The shared buffer switch.

* Corresponding author. E-mail: [email protected]

0166-5316/99/$ – see front matter © 1999 Published by Elsevier Science B.V. All rights reserved.
PII: S0166-5316(99)00021-8
Previous work (e.g. [4,5]) shows that lack of effective buffer management can lead to severe cell loss in the buffer. For example, with complete sharing of the buffer space between the two output lines, the entire buffer can be dominated by cells of only one of the output lines, which causes the arriving cells of the other output line to be lost, resulting in unfairness and excessive loss, since one line may be forced to idle. Our goal, then, is to find policies that regulate the queue lengths of the output lines and thus reduce the cell loss probability of the buffer. The class of policies we consider has the following structure: policies always admit cells to the buffer while there is space in it and may push out a cell when the buffer becomes full. In other words, a policy may remove a previously accepted cell from the buffer and insert the newly arrived one in its place. It is assumed that the policies do not have any knowledge of the future (arrivals). In [7] it was determined that, under certain assumptions, the optimal policy (the one that results in the least probability of cell loss) is static. Static policies are problematic to implement in real systems, since they require knowledge of the probability mass function of the arrival process.
In this paper, we propose a dynamic buffer management policy that operates without knowledge of the arrival process parameters; instead, it uses estimates of the optimal policy parameters. Our main result, Theorem 11, is a proof that these estimates (and consequently the performance of the dynamic policy) converge to those of the optimal one.

In Section 2 we present a discrete-time queueing model for a switch with two inputs and outputs. The static optimal policy, $f^*$, and the optimality criterion are stated in Section 3. In Section 4, we introduce the measurements collected by the dynamic policy and define this policy formally. Finally, in Section 5, we present the proof of convergence.
2. The stochastic model
A queueing model for the shared buffer ATM switch is depicted in Fig. 2. The output lines of the system are represented by two servers, called $X$ and $Y$, serving cells from a common buffer of size $K > 2$ (which includes the space for cells in service). Time is discrete, denoted by $n \in \mathbb{N}_0 = \{0, 1, 2, \ldots\}$ or $n \in \mathbb{N} = \{1, 2, \ldots\}$. Arrivals take place at the end of a time slot, the length of which is equal to the service time of a cell.

Fig. 2. The queueing model.
The class $F$ of buffer management policies that we consider is the set of all policies $f$ that are non-anticipative and work conserving. As we shall see, since the policies are work conserving, Eqs. (3)–(5) and (10) hold true.
The arrival process on each input line is modeled as an iid Bernoulli process; the destination of the cell (server $X$ or $Y$) is also determined in an iid manner. Let the random processes $\{A^X_n\}, \{A^Y_n\}$ denote the number of arrivals to servers $X$ and $Y$ (also called class $X$ and $Y$), respectively, in slot $n \in \mathbb{N}_0$. Since the arrivals to a server are the superposition of two Bernoulli processes (the arrivals on the input lines), there can be at most two arrivals in a slot; i.e.,

$$A^X_n, A^Y_n \geq 0, \quad A^X_n + A^Y_n \leq 2. \tag{1}$$

For a fixed $n \in \mathbb{N}_0$, the vector $(A^X_n, A^Y_n)$ has the following known probability mass function:

$$P\{(A^X_n, A^Y_n) = (i, j)\} = r_{ij}, \quad i, j \geq 0,\ i + j \leq 2. \tag{2}$$
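As an illustration of Eq. (2), the following sketch computes a pmf $r_{ij}$ under a hypothetical parameterization in which input line $l$ carries a cell with probability $p_l$ and each cell is routed to output $X$ with probability $q$; these parameters are illustrative assumptions, not part of the model above, which only assumes the joint pmf is known:

```python
from itertools import product
from math import comb

def arrival_pmf(p1, p2, q):
    """Joint pmf r_ij = P{A_X = i, A_Y = j} for one slot, assuming input
    line l delivers a cell with probability p_l and each cell goes to
    output X with probability q (hypothetical, for illustration)."""
    r = {(i, j): 0.0 for i in range(3) for j in range(3) if i + j <= 2}
    for c1, c2 in product((0, 1), repeat=2):       # cell on each input line?
        p_cells = (p1 if c1 else 1 - p1) * (p2 if c2 else 1 - p2)
        n = c1 + c2                                # total arrivals this slot
        for i in range(n + 1):                     # i of them routed to X
            r[(i, n - i)] += p_cells * comb(n, i) * q**i * (1 - q)**(n - i)
    return r

pmf = arrival_pmf(0.4, 0.6, 0.5)
assert abs(sum(pmf.values()) - 1.0) < 1e-12        # a proper pmf
```

Note that the support is exactly $\{(i, j): i, j \geq 0,\ i + j \leq 2\}$, matching Eq. (1).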
Consider a buffer management policy $f \in F$. Let the sequence $\{X^f_n, Y^f_n\}$, $n \in \mathbb{N}_0$, called the buffer process, denote the number of cells of each server in the buffer at the beginning of the $n$th time slot. The state space of the sequence is given by $S = \{(x, y): x, y \geq 0,\ x + y \leq K\}$, where $K$ is a fixed positive constant.

Define the sequence $\{\hat X^f_n, \hat Y^f_n\}$, $n \in \mathbb{N}_0$, as follows. Let $(x)^+ = \max(0, x)$; then

$$(\hat X^f_0, \hat Y^f_0) = (X^f_0, Y^f_0),$$
$$\hat X^f_n = (X^f_{n-1} - 1)^+ + A^X_{n-1}, \quad n \in \mathbb{N}, \tag{3}$$
$$\hat Y^f_n = (Y^f_{n-1} - 1)^+ + A^Y_{n-1}, \quad n \in \mathbb{N}.$$

This sequence is, therefore, the demand for storage in slot $n$, but is called the overflow process for reasons that will be apparent later. Together with the buffer process, the overflow process defines the queueing model of the shared buffer ATM switch.
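The recursion of Eq. (3) is simple enough to restate directly in code; this is only a transcription of the equation, with the previous slot's buffer contents and arrivals as inputs:

```python
def overflow_step(x_prev, y_prev, ax, ay):
    """Demand for storage per Eq. (3): each non-empty queue serves one
    cell, then the previous slot's arrivals are added."""
    pos = lambda v: max(0, v)                      # (x)^+ = max(0, x)
    return pos(x_prev - 1) + ax, pos(y_prev - 1) + ay

# From (X, Y) = (0, K) with two class-Y arrivals, the demand is (0, K+1):
K = 8
assert overflow_step(0, K, 0, 2) == (0, K + 1)
```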
Define the sequence $\{\sigma^f_k\}$, $k \in \mathbb{N}_0$, as follows. Let $\sigma^f_0 = 0$, and

$$\sigma^f_k = \min\{n > \sigma^f_{k-1}: \hat X^f_n + \hat Y^f_n > K\}, \quad k \in \mathbb{N}. \tag{4}$$

$\sigma^f_k$ is the $k$th return time to the set $\{\hat X^f_n + \hat Y^f_n > K\}$. At this time, the demand for buffer space exceeds the capacity of the buffer and cell loss takes place. Since our buffer management policies accept all cells into the buffer while it is not full, the state of the buffer in the $n$th slot, $(X^f_n, Y^f_n)$, is the same as the demand for space, $(\hat X^f_n, \hat Y^f_n)$, except at cell loss times; i.e.,

$$(X^f_n, Y^f_n) = (\hat X^f_n, \hat Y^f_n), \quad n \neq \sigma^f_k,\ k \in \mathbb{N}. \tag{5}$$

The decision of the buffer management policy, $f$ (regarding the class of cell to be pushed out), determines the value of $(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k})$, $\forall k \in \mathbb{N}$, and is explained in the next section.

It can be seen that the event $\{\sigma^f_k = m\}$ depends entirely on $\{\hat X^f_n, \hat Y^f_n\}_{n=0}^{m}$. Therefore, $\{\sigma^f_k\}$, $k \in \mathbb{N}_0$, is a sequence of stopping times [1,6] on the process $\{\hat X^f_n, \hat Y^f_n\}$.

Since the time slots at which cell loss takes place can be determined by examining the $\{\hat X^f_n, \hat Y^f_n\}$ process, it is this process that is of particular interest to us, and it shall be the focus of this paper. In the next subsection we examine how loss takes place in the shared buffer, while in Section 2.2 we investigate some properties of the overflow process.
2.1. Cell loss and policy actions
In order to find out exactly how cell loss takes place, we consider the total number of cells in the buffer, given by $X^f_n + Y^f_n$ or $\hat X^f_n + \hat Y^f_n$. From Eq. (3), we can easily determine the conditions necessary for cell loss to occur. These conditions are therefore stated without proof in the following lemma.

Lemma 1. There can be at most one cell loss in any time slot, and this cell loss can take place only from the vertices of the state space, $(0, K)$ and $(K, 0)$; i.e., at all cell loss times $n = \sigma^f_k$, $\forall k \in \mathbb{N}$, we have that:

$$\hat X^f_n + \hat Y^f_n = K + 1,$$
$$(X^f_{n-1}, Y^f_{n-1}) = (K, 0) \text{ or } (0, K),$$
$$A^X_{n-1} + A^Y_{n-1} = 2.$$

Therefore, from Lemma 1, we can see that when $(X^f_{n-1}, Y^f_{n-1}) = (0, K)$, we get that:

$$(\hat X^f_n, \hat Y^f_n) = \begin{cases} (0, K+1), & \text{if } A^X_{n-1} = 0,\ A^Y_{n-1} = 2, \\ (1, K), & \text{if } A^X_{n-1} = 1,\ A^Y_{n-1} = 1, \\ (2, K-1), & \text{if } A^X_{n-1} = 2,\ A^Y_{n-1} = 0. \end{cases} \tag{6}$$

When $(X^f_{n-1}, Y^f_{n-1}) = (K, 0)$, we get that:

$$(\hat X^f_n, \hat Y^f_n) = \begin{cases} (K+1, 0), & \text{if } A^X_{n-1} = 2,\ A^Y_{n-1} = 0, \\ (K, 1), & \text{if } A^X_{n-1} = 1,\ A^Y_{n-1} = 1, \\ (K-1, 2), & \text{if } A^X_{n-1} = 0,\ A^Y_{n-1} = 2. \end{cases} \tag{7}$$
Thus, the state space of the process $(\hat X^f_{\sigma^f_k}, \hat Y^f_{\sigma^f_k})$, $k \in \mathbb{N}$, is given by $\Lambda$, where

$$\Lambda = \{(0, K+1), (1, K), (2, K-1), (K+1, 0), (K, 1), (K-1, 2)\}. \tag{8}$$
Recall that the state of the buffer process at an overflow instant is determined by the actions of the buffer management policy. In order to formally define a buffer management policy, we need to determine the states of $(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k})$, $k \in \mathbb{N}$. Since, by Lemma 1, there can be only one cell loss at a loss instant, the policy $f$ has to drop or push out only one cell. Therefore, $\forall k \in \mathbb{N}$, $(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k})$ is constrained to lie within the following bounds:

$$X^f_{\sigma^f_k} \in \{\hat X^f_{\sigma^f_k},\ \hat X^f_{\sigma^f_k} - 1\}, \quad Y^f_{\sigma^f_k} \in \{\hat Y^f_{\sigma^f_k},\ \hat Y^f_{\sigma^f_k} - 1\}. \tag{9}$$

From Eq. (9), the state space of $(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k})$ can be easily determined as the following subset of $S$:

$$\{(0, K), (1, K-1), (2, K-2), (K-2, 2), (K-1, 1), (K, 0)\}. \tag{10}$$
A buffer management policy, $f$, is then defined by the sequence of states at cell loss times, $\{\hat X^f_{\sigma^f_k}, \hat Y^f_{\sigma^f_k}, X^f_{\sigma^f_k}, Y^f_{\sigma^f_k}\}$, $k \in \mathbb{N}_0$.
2.2. Properties of $\{\hat X^f_n, \hat Y^f_n\}$

We now study a few properties of the overflow process that will be useful in Sections 4.1 and 5. The first property characterizes the passage from a buffer content of $c$ cells to one of $c + 1$ cells. The second states the one-step transition probabilities of the overflow process, while the third gives its recurrence properties.

Lemma 2. The total number of cells in the buffer may increase by only one cell in a slot, provided that the buffer is not empty and one server idles; i.e., if at some time $n_0$ that is not a loss instant we have that

$$\hat X^f_{n_0} + \hat Y^f_{n_0} = c, \quad 1 \leq c \leq K,$$
$$\tau = \min\{n > n_0: \hat X^f_n + \hat Y^f_n = c + 1\},$$

then

$$\hat X^f_n + \hat Y^f_n \leq c, \quad n_0 \leq n < \tau,$$
$$\hat X^f_{\tau-1} = 0 \text{ or } \hat Y^f_{\tau-1} = 0.$$
Proof. Since $\hat X^f_{n_0} + \hat Y^f_{n_0} \leq K$, we know that $(\hat X^f_{n_0}, \hat Y^f_{n_0}) = (X^f_{n_0}, Y^f_{n_0})$. Therefore, by expanding Eq. (3), at time $n = n_0$,

$$\hat X^f_{n+1} + \hat Y^f_{n+1} = \begin{cases} X^f_n + Y^f_n - 2 + A^X_n + A^Y_n, & \text{if } X^f_n, Y^f_n > 0, \\ Y^f_n - 1 + A^X_n + A^Y_n, & \text{if } X^f_n = 0,\ Y^f_n > 0, \\ X^f_n - 1 + A^X_n + A^Y_n, & \text{if } X^f_n > 0,\ Y^f_n = 0, \\ A^X_n + A^Y_n, & \text{if } X^f_n = 0,\ Y^f_n = 0. \end{cases} \tag{11}$$

From Eq. (1) we have $A^X_{n_0} + A^Y_{n_0} \leq 2$, and we can easily see that

$$\hat X^f_{n_0+1} + \hat Y^f_{n_0+1} \leq \hat X^f_{n_0} + \hat Y^f_{n_0} = c$$

when $X^f_{n_0}, Y^f_{n_0} > 0$. In the next two cases, when one of the two servers idles, we see that we may have $\hat X^f_{n_0+1} + \hat Y^f_{n_0+1} = c + 1$ and $\tau = n_0 + 1$ if $A^X_{n_0} + A^Y_{n_0} = 2$. In this case, we see that the statement of the lemma is satisfied.

If $\tau > n_0 + 1$, then $\hat X^f_{n_0+1} + \hat Y^f_{n_0+1} \leq c \leq K$, and we get $(\hat X^f_{n_0+1}, \hat Y^f_{n_0+1}) = (X^f_{n_0+1}, Y^f_{n_0+1})$. Then, at time $n = n_0 + 1$, the arguments made for time $n = n_0$ can be applied again. Continuing inductively, we get the required result. $\square$
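The case analysis of Eq. (11) can be checked exhaustively for a small $K$; this sketch verifies the content of Lemma 2: starting from a non-empty buffer, the total demand grows by at most one per slot, and grows only when a server idles.

```python
def next_total(x, y, ax, ay):
    """Total demand in the next slot, per Eq. (11): each busy server
    removes one cell, then the slot's arrivals are added."""
    served = (1 if x > 0 else 0) + (1 if y > 0 else 0)
    return x + y - served + ax + ay

K = 6
for x in range(K + 1):
    for y in range(K + 1 - x):
        if x + y == 0:
            continue                       # Lemma 2 assumes c >= 1
        for ax in range(3):
            for ay in range(3 - ax):       # Eq. (1): ax + ay <= 2
                growth = next_total(x, y, ax, ay) - (x + y)
                assert growth <= 1
                if growth == 1:            # growth requires an idle server
                    assert x == 0 or y == 0
```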
For the particular case that $c = K$, $\tau$ is also a cell loss time. It can easily be seen that at time $n = \tau - 1$ we must have either $(\hat X^f_n, \hat Y^f_n) = (K, 0)$ or $(0, K)$. However, this property does not hold true at a general loss instant; i.e., we cannot claim that $(\hat X^f_{\sigma^f_k - 1}, \hat Y^f_{\sigma^f_k - 1}) = (K, 0)$ or $(0, K)$. This is because we may have $\sigma^f_{k-1} = \sigma^f_k - 1$, in which case $\hat X^f_{\sigma^f_k - 1} + \hat Y^f_{\sigma^f_k - 1} > K$. This problem is circumvented by the use of the state of the buffer process, $(X^f_n, Y^f_n)$, at time $\sigma^f_k - 1$, as in Lemma 1.

For any $(x_1, y_1) \in S$ and $(x_2, y_2) \in S \cup \Lambda$, $n \in \mathbb{N}_0$, we define the following function:

$$p((x_1, y_1), (x_2, y_2)) = P\{A^X_n = x_2 - (x_1 - 1)^+,\ A^Y_n = y_2 - (y_1 - 1)^+\}. \tag{12}$$
We can see from Eq. (2) that for $x = x_2 - (x_1 - 1)^+$ and $y = y_2 - (y_1 - 1)^+$, $p((x_1, y_1), (x_2, y_2)) = r_{xy}$. The next lemma (reminiscent of the Strong Markov Property [1,3]) proves that the function $p((x_1, y_1), (x_2, y_2))$ is the one-step transition probability of the overflow process from state $(x_1, y_1)$ to $(x_2, y_2)$. Our choice of state space for $(x_1, y_1)$ makes it clear that the transition probabilities at loss instants are not considered, as those transitions depend upon the specific policy used.

Lemma 3. For any stopping time $\tau$ on the process $\{\hat X^f_n, \hat Y^f_n\}$ that is not a loss instant ($\tau \neq \sigma^f_k$, $\forall k \in \mathbb{N}_0$), the conditional distribution of the vector $(\hat X^f_n, \hat Y^f_n)$ at time $\tau + 1$, given the entire past, depends only on the state of the process at time $\tau$ and is stationary; i.e., for $(x_{\tau+1}, y_{\tau+1}) \in S \cup \Lambda$, $(x_\tau, y_\tau) \in S$,

$$P\{(\hat X^f_{\tau+1}, \hat Y^f_{\tau+1}) = (x_{\tau+1}, y_{\tau+1}) \mid (\hat X^f_\tau, \hat Y^f_\tau) = (x_\tau, y_\tau), \ldots, (\hat X^f_0, \hat Y^f_0) = (x_0, y_0)\} = p((x_\tau, y_\tau), (x_{\tau+1}, y_{\tau+1})).$$
Proof. Since $\tau$ is not a loss instant, $x_\tau + y_\tau \leq K$, and we can use Eqs. (3) and (5) to write:

$$\hat X^f_{\tau+1} = (\hat X^f_\tau - 1)^+ + A^X_\tau,$$
$$\hat Y^f_{\tau+1} = (\hat Y^f_\tau - 1)^+ + A^Y_\tau.$$

For simplicity, we define:

$$C = \{(\hat X^f_{\tau+1}, \hat Y^f_{\tau+1}) = (x_{\tau+1}, y_{\tau+1})\},$$
$$B = \{(\hat X^f_\tau, \hat Y^f_\tau) = (x_\tau, y_\tau), \ldots, (\hat X^f_0, \hat Y^f_0) = (x_0, y_0)\},$$
$$x = x_{\tau+1} - (x_\tau - 1)^+, \quad y = y_{\tau+1} - (y_\tau - 1)^+.$$

Then, $P\{C \mid B\} = P\{A^X_\tau = x,\ A^Y_\tau = y \mid B\}$. Therefore,

$$P\{C \mid B\} = \sum_{j=0}^{\infty} P\{A^X_\tau = x,\ A^Y_\tau = y \mid \tau = j,\ B\} \cdot P\{\tau = j,\ B\} / P\{B\}.$$

Now,

$$P\{A^X_\tau = x,\ A^Y_\tau = y \mid \tau = j,\ B\} = P\{A^X_j = x,\ A^Y_j = y \mid (\hat X^f_j, \hat Y^f_j) = (x_\tau, y_\tau), \ldots, (\hat X^f_0, \hat Y^f_0) = (x_0, y_0)\}.$$

From Eq. (3), we know that $(\hat X^f_j, \hat Y^f_j)$ depends upon $\{A^X_i, A^Y_i\}$, $i \leq j - 1$, and the policy $f$. Since arrivals are iid, $(A^X_j, A^Y_j)$ is independent of $\{A^X_i, A^Y_i\}$, $i \leq j - 1$. Since $f$ is non-anticipative, $(A^X_j, A^Y_j)$ is independent of $\{\hat X^f_i, \hat Y^f_i\}$, $i \leq j$. Therefore,

$$P\{A^X_j = x,\ A^Y_j = y \mid (\hat X^f_j, \hat Y^f_j) = (x_\tau, y_\tau), \ldots, (\hat X^f_0, \hat Y^f_0) = (x_0, y_0)\} = P\{A^X_j = x,\ A^Y_j = y\} = r_{xy},$$

$$P\{C \mid B\} = p((x_\tau, y_\tau), (x_{\tau+1}, y_{\tau+1})) \cdot \sum_{j=0}^{\infty} P\{\tau = j,\ B\} / P\{B\} = p((x_\tau, y_\tau), (x_{\tau+1}, y_{\tau+1})). \quad \square$$
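For a concrete view of Eq. (12), here is a minimal sketch of the one-step transition probability; the pmf `r` below is an arbitrary illustrative choice satisfying Eqs. (1) and (2), not data from the paper:

```python
def p_trans(state1, state2, r):
    """p((x1, y1), (x2, y2)) of Eq. (12): the probability that the slot's
    arrivals equal (x2 - (x1-1)^+, y2 - (y1-1)^+); zero if infeasible."""
    (x1, y1), (x2, y2) = state1, state2
    need = (x2 - max(0, x1 - 1), y2 - max(0, y1 - 1))
    return r.get(need, 0.0)

# An illustrative pmf r_ij with support {i, j >= 0, i + j <= 2}:
r = {(0, 0): 0.25, (1, 0): 0.2, (0, 1): 0.2,
     (1, 1): 0.15, (2, 0): 0.1, (0, 2): 0.1}

assert p_trans((3, 2), (3, 2), r) == 0.15   # needs arrivals (1, 1)
assert p_trans((3, 2), (4, 2), r) == 0.0    # would need arrivals (2, 1)
```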
In [7], the following lemma is proved.

Lemma 4. All states in the state space of $\{\hat X^f_n, \hat Y^f_n\}$ are visited infinitely often.
3. The optimality criterion and the optimal policy
Consider two systems controlled by policies $f$ and $g \in F$. Define a sequence of stopping times, $\{\sigma^{fg}_k\}$, $k \in \mathbb{N}_0$, on the stochastic process $\{\hat X^f_n, \hat Y^f_n, \hat X^g_n, \hat Y^g_n\}$, as:

$$\sigma^{fg}_0 = 0, \quad \sigma^{fg}_k = \min\{n > \sigma^{fg}_{k-1}: \hat X^f_n + \hat Y^f_n > K \text{ or } \hat X^g_n + \hat Y^g_n > K\}.$$

Therefore, $\sigma^{fg}_k$ is the $k$th time that there is a loss in either system.

The cell loss under systems $f$ and $g$ at time $\sigma^{fg}_k$, $k \in \mathbb{N}$, is defined as:

$$L^f_k = \hat X^f_{\sigma^{fg}_k} + \hat Y^f_{\sigma^{fg}_k} - K, \quad L^g_k = \hat X^g_{\sigma^{fg}_k} + \hat Y^g_{\sigma^{fg}_k} - K.$$

We define as our optimality criterion the minimization of the average expected cell loss in the system; i.e., $f^*$ is optimal if the following inequality is satisfied:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} E L^{f^*}_k \leq \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} E L^g_k, \quad \forall g \in F.$$
In [7], it is proved that the optimal policy, $f^*$, has the following structure. For every $n = \sigma^{f^*}_k$, $k \in \mathbb{N}$:

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (0, K+1)$, then $(X^{f^*}_n, Y^{f^*}_n) = (0, K)$.

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (1, K)$, then $(X^{f^*}_n, Y^{f^*}_n) = (1, K-1)$.

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (K+1, 0)$, then $(X^{f^*}_n, Y^{f^*}_n) = (K, 0)$.

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (K, 1)$, then $(X^{f^*}_n, Y^{f^*}_n) = (K-1, 1)$.

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (2, K-1)$, then

$$(X^{f^*}_n, Y^{f^*}_n) = \begin{cases} (1, K-1), & \text{if } \gamma(1, K-2) < \delta(1, K-2), \\ (2, K-2), & \text{otherwise}. \end{cases}$$

If $(\hat X^{f^*}_n, \hat Y^{f^*}_n) = (K-1, 2)$, then

$$(X^{f^*}_n, Y^{f^*}_n) = \begin{cases} (K-1, 1), & \text{if } \delta(K-2, 1) < \gamma(K-2, 1), \\ (K-2, 2), & \text{otherwise}. \end{cases}$$

In the above expressions, $\gamma(x, y)$ and $\delta(x, y)$, for $(x, y) \in S \setminus \{(0, 0)\}$, are taboo probabilities that are defined as follows: let $\tau = \min\{n > 0: \hat X^f_n + \hat Y^f_n = \hat X^f_0 + \hat Y^f_0 + 1\}$; then

$$\gamma(x, y) = P\{\hat Y^f_n > 0,\ 0 \leq n < \tau \mid \hat X^f_0 = x,\ \hat Y^f_0 = y\},$$
$$\delta(x, y) = P\{\hat X^f_n > 0,\ 0 \leq n < \tau \mid \hat X^f_0 = x,\ \hat Y^f_0 = y\}. \tag{13}$$
The definition of $\tau$ is similar to that in Lemma 2, with $c = \hat X^f_0 + \hat Y^f_0$. From the lemma, we know that for $0 \leq n < \tau$ we have $\hat X^f_n + \hat Y^f_n \leq x + y \leq K$. Therefore, the actions of the policy do not affect $\{\hat X^f_n, \hat Y^f_n\}_{n=0}^{\tau}$, which is why the probabilities $\gamma(x, y)$ and $\delta(x, y)$ are independent of the policy $f$. Lemma 2 tells us that one of $\hat X^f_{\tau-1}$ and $\hat Y^f_{\tau-1}$ must be 0; $\gamma(x, y)$ and $\delta(x, y)$ give us the probability that only one of the two events occurs. It can easily be seen that the optimal policy chooses to push out cells from the class whose server has the smaller probability of going idle before the next cell loss takes place. For more details, see [7].
Thus, with knowledge of $\gamma(1, K-2)$, $\delta(1, K-2)$, $\gamma(K-2, 1)$, and $\delta(K-2, 1)$, a buffer policy is able to make optimal decisions. The optimal policy, $f^*$, is static and Markov, in the sense that the decision taken at an overflow time depends only upon the state of the overflow process at that time, and, given the same state at different overflow times, $f^*$ takes the same decision. However, it requires knowledge of the probability mass function of the arrival process, something that is unlikely to be available in a real network. Therefore, we present in the next section a dynamic algorithm, which uses estimates for $\gamma(\cdot)$ and $\delta(\cdot)$ that are available to the policy at time $\sigma^f_k$. As we will show in Section 5, this dynamic algorithm is also optimal.
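The structure of the static policy above can be summarized as a lookup. In the sketch below, `gamma` and `delta` denote the two taboo probabilities of Eq. (13), supplied as callables; how they would be obtained without the arrival pmf is exactly the difficulty the next section addresses.

```python
def optimal_pushout(overflow_state, K, gamma, delta):
    """Post-loss buffer state chosen by the static optimal policy f*.
    In four overflow states the action is fixed; in the other two it
    compares the taboo probabilities of Eq. (13)."""
    fixed = {(0, K + 1): (0, K), (1, K): (1, K - 1),
             (K + 1, 0): (K, 0), (K, 1): (K - 1, 1)}
    if overflow_state in fixed:
        return fixed[overflow_state]
    if overflow_state == (2, K - 1):
        return (1, K - 1) if gamma(1, K - 2) < delta(1, K - 2) else (2, K - 2)
    if overflow_state == (K - 1, 2):
        return (K - 1, 1) if delta(K - 2, 1) < gamma(K - 2, 1) else (K - 2, 2)
    raise ValueError("not an overflow state of Eq. (8)")

# With dummy taboo probabilities that favour keeping class-Y cells:
assert optimal_pushout((2, 9), 10, lambda x, y: 0.2, lambda x, y: 0.7) == (1, 9)
```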
4. The dynamic policy
In this section, we describe the system information that the dynamic policy must measure in order to obtain the estimates it needs, and then formally define the policy. For simplicity of notation, we retain the symbol $f$ to denote the (yet unstated) dynamic policy.
4.1. Measurements taken by the policy
First we define the sequences $\{S^f_k\}$, $k \in \mathbb{N}_0$, and $\{T^f_k\}$, $k \in \mathbb{N}_0$. For any $(a, b) \in S \setminus \{(0, 0)\}$ and $k \in \mathbb{N}$, let

$$S^f_0 = T^f_0 = 0,$$
$$S^f_k = \min\{n \geq T^f_{k-1}: (\hat X^f_n, \hat Y^f_n) = (a, b)\}, \tag{14}$$
$$T^f_k = \min\{n > S^f_k: \hat X^f_n + \hat Y^f_n = a + b + 1\}.$$

$S^f_k$ and $T^f_k$ are actually functions of $(a, b)$, but the argument $(a, b)$ is dropped for simplicity of notation. The same is true of all the other variables to follow in this section.

$S^f_k$ is not the $k$th return time to the state $(a, b)$; it is the first time that the overflow process reaches the state $(a, b)$ after time $T^f_{k-1}$. At the times $\{S^f_{k-1} + 1, S^f_{k-1} + 2, \ldots, T^f_{k-1} - 1\}$, the state $(a, b)$ may be visited several times; however, from the definition in Eq. (14), none of these instants may be $S^f_k$.

Similarly, $T^f_k$ is not the $k$th return time to the set $\{\hat X^f_n + \hat Y^f_n = a + b + 1\}$ (referred to hereafter as state $(a + b + 1)$); it is the first time that the state $a + b + 1$ is visited after time $S^f_k$. In the times $\{T^f_{k-1} + 1, \ldots, S^f_k\}$, state $a + b + 1$ may be visited several times, with none of them being equal to $T^f_k$. However, it can be seen that both $\{S^f_k\}$, $k \in \mathbb{N}_0$, and $\{T^f_k\}$, $k \in \mathbb{N}_0$, are sequences of stopping times on the $\{\hat X^f_n, \hat Y^f_n\}$ process.

Since from Lemma 4 we know that all states in the state space of $\{\hat X^f_n, \hat Y^f_n\}$ are visited infinitely often, we get the following result, stated as a lemma.

Lemma 5. The random variables $S^f_k - T^f_{k-1}$ and $T^f_k - S^f_k$ are finite with probability 1.

Thus the time axis is divided into non-overlapping, non-contiguous cycles of time $\{S^f_k, \ldots, T^f_k\}$. The term $k$th cycle is used to refer to the interval $\{S^f_k, \ldots, T^f_k\}$, $k \in \mathbb{N}_0$. Since these cycles have been proven to be finite with probability 1 (wp 1), the random variables defined below are finite wp 1 too.
Let $\|A\|$ denote the cardinality of a set $A$. Define the sequence of sets $\{A^f_k\}$, $k \in \mathbb{N}_0$, as $A^f_0 = \emptyset$, and

$$A^f_k = \{n \in \{S^f_k, S^f_k + 1, \ldots, T^f_k\}: (\hat X^f_n, \hat Y^f_n) = (a, b)\}.$$

The sequence of random variables $\{H^f_k\}$, $k \in \mathbb{N}_0$, is defined as $H^f_k = \|A^f_k\|$. The random variable $H^f_k$ counts the number of times that the state $(a, b)$ is visited in the $k$th cycle. By the definition of $S^f_k$ in Eq. (14), $H^f_k \geq 1$ for $k \neq 0$. From Lemma 5, all cycles are of finite length wp 1, and thus

$$P\{H^f_k < \infty\} = 1. \tag{15}$$

Define the sequence of random variables $\{C^f_n\}$, $n \in \mathbb{N}_0$, as follows:

$$C^f_n = \max\{i \geq 0: T^f_i \leq n,\ i \in \mathbb{N}_0\}. \tag{16}$$

$C^f_n$ counts the number of cycles completed in $\{0, \ldots, n\}$. Since the length of the 0th interval is 0, it is not really counted as a cycle, and thus $C^f_n = 0$ for $n < T^f_1$. It can be easily seen that the (integer) sequence $\{C^f_n\}$ is non-decreasing. However, by Lemma 5, since the length of cycles is finite wp 1, the number of times that any integer is repeated must be finite wp 1 too.

Define the sequence of random variables $\{J^f_k\}$, $k \in \mathbb{N}$, as follows:

$$J^f_k = \max\Big\{m \geq 0: \sum_{i=0}^{m} H^f_i < k,\ m \in \mathbb{N}_0\Big\} + 1. \tag{17}$$

$J^f_k$ denotes the cycle in which the $k$th visit to state $(a, b)$ takes place. Since we know by Eq. (15) that the number of visits per cycle ($H^f_k$) is finite wp 1, $J^f_k$ has to increase with $k$, though not strictly.

Define the sequence of random variables $\{N^f_k\}$, $k \in \mathbb{N}$, as:

$$N^f_1 = S^f_1,$$
$$N^f_k = \min\{n > N^f_{k-1}: (\hat X^f_n, \hat Y^f_n) = (a, b),\ n \in \mathbb{N}_0\}, \quad k = 2, 3, \ldots \tag{18}$$

$N^f_k$ is the $k$th return time of the state $(a, b)$. It is a strictly increasing random sequence (therefore $\lim_{k \to \infty} N^f_k = \infty$ wp 1), with the property that

$$S^f_{J^f_k} \leq N^f_k < T^f_{J^f_k}. \tag{19}$$

The above equation merely says that the time of the $k$th visit to state $(a, b)$ must lie between the beginning and end of the cycle in which it takes place (i.e., cycle $J^f_k$). By the definitions of the cycles in Eq. (14), we know that none of these visits can take place outside the cycles.

From Lemma 2, we know that between every visit to the state $(a, b)$, say the $k$th one, and the end of the cycle in which it occurs ($T^f_{J^f_k}$), either the state $\{\hat X^f_n = 0\}$ or $\{\hat Y^f_n = 0\}$ or both must be visited; i.e.,

$$\big\{\hat X^f_n > 0,\ \hat Y^f_n > 0,\ n = N^f_k, \ldots, T^f_{J^f_k}\big\} = \emptyset, \quad k \in \mathbb{N}. \tag{20}$$
The variables $\{I^{fx}_k\}$ and $\{I^{fy}_k\}$, $k \in \mathbb{N}$, that we introduce next track whether the state $\{\hat X^f_n = 0\}$ or the state $\{\hat Y^f_n = 0\}$ is visited exclusively after the $k$th visit to state $(a, b)$. More formally,

$$I^{fx}_k = \begin{cases} 1, & \text{if } \hat X^f_n > 0,\ \forall n = N^f_k, N^f_k + 1, \ldots, T^f_{J^f_k} - 1, \\ 0, & \text{otherwise}. \end{cases} \tag{21}$$

$$I^{fy}_k = \begin{cases} 1, & \text{if } \hat Y^f_n > 0,\ \forall n = N^f_k, N^f_k + 1, \ldots, T^f_{J^f_k} - 1, \\ 0, & \text{otherwise}. \end{cases} \tag{22}$$

From Eq. (20), it can be seen that for any fixed $k$, at most one of $I^{fx}_k$ and $I^{fy}_k$ will be equal to 1. If both states $\{\hat X^f_n = 0\}$ and $\{\hat Y^f_n = 0\}$ are visited, then these random variables will both be 0.

Define the sequence of random variables $\{G^f_k\}$, $k \in \mathbb{N}$, as:

$$G^f_k = \sum_{i=1}^{k} H^f_i. \tag{23}$$

So $G^f_k$ counts the number of times that the state $(a, b)$ is visited in the first $k$ cycles, viz. in the time interval $\{0, \ldots, T^f_k\}$. Since $H^f_i \geq 1$ for $i \neq 0$, we know that $G^f_k$ is a positive, strictly increasing integer sequence.
4.2. Estimates used by the dynamic policy
We are now in a position to define $\{\tilde\delta^f_k(a, b)\}$, $k \in \mathbb{N}_0$, and $\{\tilde\gamma^f_k(a, b)\}$, $k \in \mathbb{N}_0$, the estimates for $\delta(a, b)$ and $\gamma(a, b)$, respectively. These are calculated at the end of every cycle as:

$$\tilde\delta^f_0(a, b) = \tilde\gamma^f_0(a, b) = 0,$$
$$\tilde\delta^f_k(a, b) = \frac{\sum_{i=1}^{G^f_k} I^{fx}_i}{G^f_k}, \quad \tilde\gamma^f_k(a, b) = \frac{\sum_{i=1}^{G^f_k} I^{fy}_i}{G^f_k}, \quad k \in \mathbb{N}. \tag{24}$$

Since $G^f_k \geq 1$, $\forall k \in \mathbb{N}$, the estimates are well defined. Thus, we have expressed the estimates as a time average, or Cesàro sum, of a sequence of random variables.
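To make the bookkeeping of Eqs. (14)–(24) concrete, the following sketch scans a given sample path of the overflow process, forms the cycles for a fixed $(a, b)$, and returns the two estimates of Eq. (24) (written here as a `(delta, gamma)` pair for the two taboo probabilities). It is an offline illustration of what the policy computes online, and the example path is hand-made:

```python
def estimate_taboo(path, a, b):
    """Computes (delta-tilde, gamma-tilde) of Eq. (24) from a list of
    overflow states (x, y). Each visit to (a, b) inside a cycle opens an
    indicator window (Eqs. (21)-(22)); all open windows are settled when
    the cycle ends, i.e. when the total first reaches a + b + 1."""
    target, top = (a, b), a + b + 1
    in_cycle = False
    x_ok, y_ok = [], []            # per-window "coordinate stayed > 0" flags
    ix_sum = iy_sum = visits = 0   # running sums of I^fx, I^fy and G
    for x, y in path:
        if not in_cycle:
            if (x, y) != target:   # outside the {S_k, ..., T_k} cycles
                continue
            in_cycle = True        # an S_k: a new cycle starts
        if x + y == top:           # a T_k: settle every open window
            ix_sum += sum(x_ok)
            iy_sum += sum(y_ok)
            visits += len(x_ok)
            x_ok, y_ok, in_cycle = [], [], False
            continue
        if (x, y) == target:       # a visit to (a, b): open a window
            x_ok.append(True)
            y_ok.append(True)
        x_ok = [ok and x > 0 for ok in x_ok]
        y_ok = [ok and y > 0 for ok in y_ok]
    return (0.0, 0.0) if visits == 0 else (ix_sum / visits, iy_sum / visits)

# Two hand-made cycles for (a, b) = (1, 1): in the first, Y-hat hits 0
# before the total reaches 3; in the second, X-hat does.
path = [(2, 0), (1, 1), (1, 0), (2, 1), (0, 1), (1, 1), (0, 2), (1, 2)]
assert estimate_taboo(path, 1, 1) == (0.5, 0.5)
```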
4.3. The decision made by the dynamic policy
When the overflow state is one of $\{(0, K+1), (1, K), (K+1, 0), (K, 1)\}$, the actions of the (static) optimal policy, $f^*$, are fixed. In these cases, our dynamic policy follows the actions of the optimal policy. In the other two states, $\{(2, K-1), (K-1, 2)\}$, it uses estimates as shown in the following equations:

If $(\hat X^f_{\sigma^f_k}, \hat Y^f_{\sigma^f_k}) = (2, K-1)$, then

$$(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k}) = \begin{cases} (1, K-1), & \text{if } \tilde\gamma^f_{C^f_{\sigma^f_k}}(1, K-2) < \tilde\delta^f_{C^f_{\sigma^f_k}}(1, K-2), \\ (2, K-2), & \text{otherwise}. \end{cases}$$

If $(\hat X^f_{\sigma^f_k}, \hat Y^f_{\sigma^f_k}) = (K-1, 2)$, then

$$(X^f_{\sigma^f_k}, Y^f_{\sigma^f_k}) = \begin{cases} (K-1, 1), & \text{if } \tilde\delta^f_{C^f_{\sigma^f_k}}(K-2, 1) < \tilde\gamma^f_{C^f_{\sigma^f_k}}(K-2, 1), \\ (K-2, 2), & \text{otherwise}. \end{cases}$$

$\tilde\delta^f_k$ and $\tilde\gamma^f_k$ are updated at the end of every cycle, i.e., at the times $T^f_k$, $k \in \mathbb{N}$ (one of the times that the state $\{\hat X^f_n + \hat Y^f_n = K\}$ is visited). However, the policy itself uses these estimates only at loss instants. We know the $k$th one occurs at time $\sigma^f_k$, $k \in \mathbb{N}$, which is also the $k$th time that the state $\{\hat X^f_n + \hat Y^f_n = K + 1\}$ is visited. Since by Lemma 2 we know that during a cycle the total number of cells in the buffer (the value of $\hat X^f_n + \hat Y^f_n$) cannot exceed its value at the beginning, cell loss can only take place between cycles, not during a cycle. And since at the time of the $k$th loss, $\sigma^f_k$, there have been $C^f_{\sigma^f_k}$ complete cycles, the estimates used by the policy at this time are $\tilde\delta^f_{C^f_{\sigma^f_k}}$ and $\tilde\gamma^f_{C^f_{\sigma^f_k}}$. (An example of a sequence of estimates that policy $f$ might use is $\tilde\delta^f_0, \tilde\delta^f_0, \tilde\delta^f_1, \tilde\delta^f_3, \tilde\delta^f_7, \tilde\delta^f_7, \ldots$.) In order to relate our policy $f$ to the static optimal policy, $f^*$, we will first show that:

$$\lim_{k \to \infty} \tilde\delta^f_{C^f_{\sigma^f_k}}(a, b) = \delta(a, b), \quad \lim_{k \to \infty} \tilde\gamma^f_{C^f_{\sigma^f_k}}(a, b) = \gamma(a, b). \tag{25}$$

This has to be shown for both $(a, b) = (K-2, 1)$ and $(1, K-2)$. It should be remembered that for each value of $(a, b)$, a different set of cycles and variables as defined in Section 4.1 is obtained. In the next section, we deal with the convergence for the general case $(a, b) \in S \setminus \{(0, 0)\}$.
5. Proof of convergence
To prove Eq. (25), we prove the stronger result

$$\lim_{k \to \infty} \tilde\delta^f_k(a, b) = \delta(a, b), \quad \lim_{k \to \infty} \tilde\gamma^f_k(a, b) = \gamma(a, b). \tag{26}$$

First we note that $\{\sigma^f_k\}$ is a strictly increasing sequence with $\lim_{k \to \infty} \sigma^f_k = \infty$. This follows from the facts that the set $\{\hat X^f_n + \hat Y^f_n > K\}$ is visited infinitely often (see Lemma 4) and that, by Lemma 1, there is at most one cell loss in any time slot. Since $\{C^f_k\}$ is a non-decreasing integer sequence, so is $\{C^f_{\sigma^f_k}\}$. Therefore, $\{\tilde\delta^f_{C^f_{\sigma^f_k}}\}$ is composed of a subsequence of $\{\tilde\delta^f_k\}$ in which certain terms are repeated, but only a finite number of times. The same relation applies to $\{\tilde\gamma^f_{C^f_{\sigma^f_k}}\}$ and $\{\tilde\gamma^f_k\}$.
From Eq. (24), Eq. (26) can be restated as the following:

$$\lim_{k \to \infty} \frac{\sum_{i=1}^{G^f_k} I^{fx}_i}{G^f_k} = \delta(a, b), \quad \lim_{k \to \infty} \frac{\sum_{i=1}^{G^f_k} I^{fy}_i}{G^f_k} = \gamma(a, b). \tag{27}$$

Since $\{I^{fx}_k\}$ and $\{I^{fy}_k\}$ are not iid sequences, we show that they are regenerative in order to prove convergence of their time averages. However, both processes are defined upon the $\{\hat X^f_n, \hat Y^f_n\}$ process, and since the decisions taken by the dynamic policy at times $\sigma^f_k$, $k \in \mathbb{N}$, depend upon the entire past of the process, $\{I^{fx}_k\}_{k=1}^{\infty}$ and $\{I^{fy}_k\}_{k=1}^{\infty}$ are not necessarily regenerative.

We first extract a regenerative (and Markov) process, $\{\tilde X^f_n, \tilde Y^f_n\}$, from $\{\hat X^f_n, \hat Y^f_n\}$ by concatenating all the $\{S^f_k, \ldots, T^f_k\}$ cycles (or deleting the $\{T^f_k + 1, \ldots, S^f_{k+1} - 1\}$ periods). Then we show that the $I^{fx}_k$ and $I^{fy}_k$ variables remain unchanged if redefined on this new process. This enables us to use properties of Markov chains to prove the required regenerative properties.
5.1. The $\{\tilde X^f_n, \tilde Y^f_n\}$ process

This process is defined as:

$$\tilde X^f_n = \hat X^f_{\eta^f_n}, \quad \tilde Y^f_n = \hat Y^f_{\eta^f_n}, \quad n \in \mathbb{N}_0. \tag{28}$$

$\eta^f_n$ can be explained in the following way: if a counter were started at 0, incrementing in every time slot that occurs within an $\{S^f_k, \ldots, T^f_k\}$ cycle of the $\{\hat X^f_n, \hat Y^f_n\}$ process, then $\eta^f_n$ would be the value of time at which the counter equalled $n$. An example is given in Fig. 3. $\eta^f_n$ is defined formally as follows: set $\eta^f_0 = S^f_1$, and

$$\eta^f_n = \begin{cases} \eta^f_{n-1} + 1, & \text{if } \hat X^f_{\eta^f_{n-1}} + \hat Y^f_{\eta^f_{n-1}} \neq a + b + 1, \\ \min\{i > \eta^f_{n-1}: (\hat X^f_i, \hat Y^f_i) = (a, b)\}, & \text{otherwise}. \end{cases} \tag{29}$$

Lemma 6. $\{\eta^f_n\}$ is a sequence of stopping times on $\{\hat X^f_n, \hat Y^f_n\}$.

Proof. By induction. Since $S^f_1$ is a stopping time, so is $\eta^f_0$. Suppose that $\eta^f_{n-1}$ is a stopping time. From the definition, it can be seen that the event $\{\eta^f_n \leq i\}$ can be determined by observing the values of $\{\hat X^f_m, \hat Y^f_m\}$, $\eta^f_{n-1} \leq m \leq i$. Therefore, $\eta^f_n$ is a stopping time on $\{\hat X^f_n, \hat Y^f_n\}$. $\square$
5.2. The Markov property of $\{\tilde X^f_n, \tilde Y^f_n\}$

Now we use the property of $\{\hat X^f_n, \hat Y^f_n\}$ stated in Lemma 3 to show that $\{\tilde X^f_n, \tilde Y^f_n\}$ is a Markov chain.
Lemma 7. The process $\{\tilde X^f_n, \tilde Y^f_n\}$ is a time-homogeneous Markov chain with state space $\tilde S = \{(x, y): x, y \geq 0,\ x + y \leq a + b + 1\}$ and one-step transition probability from state $(x, y)$ to state $(\bar x, \bar y)$ denoted by $\tilde p((x, y), (\bar x, \bar y))$, given in Eqs. (30) and (31).

Proof. Consider the transition from state $(\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1})$ to state $(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n)$. For $x_{n-1} + y_{n-1} = a + b + 1$, from Eq. (14), we know that $n - 1$ is the end of some cycle. So $(\tilde X^f_n, \tilde Y^f_n)$ can only be $(a, b)$; i.e., whenever the process reaches the state $\tilde X^f_n + \tilde Y^f_n = a + b + 1$, it is forced to return to state $(a, b)$ in the next time slot. Also, from Lemma 2, $\hat X^f_n + \hat Y^f_n \leq a + b + 1$ for all $n$ that occur within an $\{S^f_k, \ldots, T^f_k\}$ cycle. Therefore, $\tilde X^f_n + \tilde Y^f_n \leq a + b + 1$, $\forall n$.

To write out the one-step transition probabilities, we consider two cases:

• $x_{n-1} + y_{n-1} = a + b + 1$. For $(x, y), (\bar x, \bar y) \in \tilde S$ and $x + y = a + b + 1$ we define

$$\tilde p((x, y), (\bar x, \bar y)) = \begin{cases} 1, & \text{if } (\bar x, \bar y) = (a, b), \\ 0, & \text{if } (\bar x, \bar y) \neq (a, b). \end{cases} \tag{30}$$

We see that:

$$P\{(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n) \mid (\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1})\} = \tilde p((x_{n-1}, y_{n-1}), (x_n, y_n)).$$

From Eqs. (28) and (29),

$$P\{(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n) \mid (\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1}), \ldots, (\tilde X^f_0, \tilde Y^f_0) = (x_0, y_0)\} = \tilde p((x_{n-1}, y_{n-1}), (x_n, y_n)).$$

• $x_{n-1} + y_{n-1} < a + b + 1$.

$$P\{(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n) \mid (\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1})\} = P\{(\hat X^f_{\eta^f_n}, \hat Y^f_{\eta^f_n}) = (x_n, y_n) \mid (\hat X^f_{\eta^f_{n-1}}, \hat Y^f_{\eta^f_{n-1}}) = (x_{n-1}, y_{n-1})\}.$$
Fig. 3. The new process.
Since $\hat X^f_{\eta^f_{n-1}} + \hat Y^f_{\eta^f_{n-1}} = x_{n-1} + y_{n-1} < a + b + 1$, from Eq. (29), $\eta^f_n = \eta^f_{n-1} + 1$. The left-hand side of the above equation can instead be written as:

$$P\{(\hat X^f_{\eta^f_{n-1}+1}, \hat Y^f_{\eta^f_{n-1}+1}) = (x_n, y_n) \mid (\hat X^f_{\eta^f_{n-1}}, \hat Y^f_{\eta^f_{n-1}}) = (x_{n-1}, y_{n-1})\}.$$

Now, by Lemma 6, $\eta^f_{n-1}$ is a stopping time ($\eta^f_{n-1}$ is a stopping time and should not be confused with $\eta^f_n - 1$, which is not) on $\{\hat X^f_n, \hat Y^f_n\}$. Therefore we can apply the property of Lemma 3 to get:

$$P\{(\hat X^f_{\eta^f_{n-1}+1}, \hat Y^f_{\eta^f_{n-1}+1}) = (x_n, y_n) \mid (\hat X^f_{\eta^f_{n-1}}, \hat Y^f_{\eta^f_{n-1}}) = (x_{n-1}, y_{n-1})\} = p((x_{n-1}, y_{n-1}), (x_n, y_n)).$$

For $(x, y), (\bar x, \bar y) \in \tilde S$ and $x + y < a + b + 1$ we define

$$\tilde p((x, y), (\bar x, \bar y)) = p((x, y), (\bar x, \bar y)). \tag{31}$$

Therefore,

$$P\{(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n) \mid (\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1})\} = \tilde p((x_{n-1}, y_{n-1}), (x_n, y_n)).$$

Again, by using Lemma 3, we can extend the result given above to the case when the entire past of the process is provided:

$$P\{(\tilde X^f_n, \tilde Y^f_n) = (x_n, y_n) \mid (\tilde X^f_{n-1}, \tilde Y^f_{n-1}) = (x_{n-1}, y_{n-1}),\ \{(\tilde X^f_m, \tilde Y^f_m) = (x_m, y_m)\},\ m \leq n - 2\}$$
$$= P\{(\hat X^f_{\eta^f_n}, \hat Y^f_{\eta^f_n}) = (x_n, y_n) \mid (\hat X^f_{\eta^f_{n-1}}, \hat Y^f_{\eta^f_{n-1}}) = (x_{n-1}, y_{n-1}),\ \{(\hat X^f_{\eta^f_m}, \hat Y^f_{\eta^f_m}) = (x_m, y_m)\},\ m \leq n - 2\}$$
$$= P\{(\hat X^f_{\eta^f_n}, \hat Y^f_{\eta^f_n}) = (x_n, y_n) \mid (\hat X^f_{\eta^f_{n-1}}, \hat Y^f_{\eta^f_{n-1}}) = (x_{n-1}, y_{n-1})\}$$
$$= \tilde p((x_{n-1}, y_{n-1}), (x_n, y_n)). \quad \square$$
5.3. The measurements on the new process
Define variables $\tilde S^f_k, \tilde T^f_k, \tilde N^f_k$, $k \in \mathbb N_0$, on the $\{\tilde X^f_n, \tilde Y^f_n\}$ process in the same manner that $S^f_k, T^f_k, N^f_k$ were defined on the $\{\hat X^f_n, \hat Y^f_n\}$ process in Section 4.1:
$$\tilde S^f_0 = 0, \quad \tilde T^f_0 = 0, \quad \tilde N^f_0 = 0,$$
$$\tilde N^f_k = \min\{n \ge \tilde N^f_{k-1} : (\tilde X^f_n, \tilde Y^f_n) = (a, b)\},$$
$$\tilde S^f_k = \min\{n \ge \tilde T^f_{k-1} : (\tilde X^f_n, \tilde Y^f_n) = (a, b)\},$$
$$\tilde T^f_k = \min\{n > \tilde S^f_k : \tilde X^f_n + \tilde Y^f_n = a + b + 1\}, \quad k \in \mathbb N.$$
Lemma 8. The realizations of the random variables $H^f_k$, $J^f_k$, $I^{fx}_k$, $I^{fy}_k$ and $G^f_k$ are unchanged if the stochastic process they are defined on is changed from $\{\hat X^f_n, \hat Y^f_n\}$ to $\{\tilde X^f_n, \tilde Y^f_n\}$.

The proof of the above lemma is trivial and hence omitted.
Finally, we come to the results for which we had to define a new process composed of the cycles and show that it is a Markov Chain.
Lemma 9. The random variables $\{I^{fx}_k\}$ (respectively $\{I^{fy}_k\}$), $\forall k \in \mathbb N$, have identical distribution, given by $P\{I^{fx}_k = 1\} = \pi(a, b)$ (respectively $P\{I^{fy}_k = 1\} = \psi(a, b)$).
Proof. $\tilde N^f_k$ is the $k$th return time of the $\{\tilde X^f_n, \tilde Y^f_n\}$ process to the state $(a, b)$. Therefore, $(\tilde X^f_{\tilde N^f_k}, \tilde Y^f_{\tilde N^f_k}) = (a, b)$, $\forall k \in \mathbb N$. The stochastic process $\{\tilde X^f_n, \tilde Y^f_n\}$, which is a Markov Chain by Lemma 7, regenerates at all $\{\tilde N^f_k\}$ instants by the Strong Markov Property. Also, we know that $\tilde N^f_1 = 0$, i.e., $(\tilde X_0, \tilde Y_0) = (a, b)$. Therefore, from [1], $\{\tilde X^f_n, \tilde Y^f_n\}$, $n \ge \tilde N^f_k$, is stochastically equivalent to (has the same joint distribution as) $\{\tilde X^f_n, \tilde Y^f_n\}$, $n \in \mathbb N_0$.

Consider the process $\{I^{fx}_k\}$, $k \in \mathbb N$, defined on the $\{\tilde X^f_n, \tilde Y^f_n\}$ process. $I^{fx}_k$ is a function of the post-$\tilde N^f_k$ process, i.e., of $\{\tilde X^f_n, \tilde Y^f_n\}$, $n \ge \tilde N^f_k$, which has a common joint distribution for all $k$. Therefore, for any $k \in \mathbb N$, the random variables $I^{fx}_k$ too have a common marginal distribution, which by definition is
$$\begin{aligned}
P\{I^{fx}_k = 1\} &= P\{\tilde X^f_n > 0, \ \forall n = \tilde N^f_k, \tilde N^f_k + 1, \ldots, \tilde T^f_{J^f_k} - 1\} \\
&= P\{\tilde X^f_n > 0, \ \forall n = \tilde N^f_1, \tilde N^f_1 + 1, \ldots, \tilde T^f_{J^f_1} - 1\} \\
&= P\{\tilde X^f_n > 0, \ \forall n = 0, 1, \ldots, \tilde T^f_1 - 1\} \\
&= P\{\hat X^f_n > 0, \ \forall n = S^f_1, \ldots, T^f_1 - 1\} = \pi(a, b).
\end{aligned}$$
The last equality follows from the definition of $\pi(a, b)$ in Eq. (13) and a trivial extension of Lemma 3. Similarly, we can show that $P\{I^{fy}_k = 1\} = \psi(a, b)$, $\forall k \in \mathbb N$. $\square$
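Concretely, the indicator $I^{fx}_k$ checks whether the class-$x$ occupancy stays positive from the $k$th visit to $(a, b)$ until just before the enclosing cycle ends. The sketch below is hypothetical: the function name, trajectory, and window indices are invented for illustration.

```python
# Hypothetical sketch: I_k^x is 1 iff the class-x occupancy stays positive
# over the window from the k-th visit up to (but excluding) the cycle end.

def indicator_x(path, visit_n, cycle_end_t):
    """1 if x_n > 0 for all n = visit_n, ..., cycle_end_t - 1, else 0."""
    return int(all(path[n][0] > 0 for n in range(visit_n, cycle_end_t)))

# Invented trajectory: x empties at n = 1, so the first window's indicator is 0.
path = [(1, 2), (0, 3), (1, 3), (1, 2), (2, 2)]
i_first = indicator_x(path, 0, 2)  # x drops to 0 inside the window
i_later = indicator_x(path, 3, 5)  # x stays positive throughout
```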
Lemma 10. The random processes $\{I^{fx}_k\}$ and $\{I^{fy}_k\}$, $k \in \mathbb N$, regenerate at all $G_k + 1$ instants.
Proof. It can easily be seen that the indicator variables $I^{fx}_k$ are not independent. However, from the definition of $I^{fx}_k$, its dependence on the $\{\tilde X^f_n, \tilde Y^f_n\}$ process starts and ends with the cycle in which the $k$th visit takes place. Therefore, if we show that the cycles are independent, then indicator variables for visits in different cycles are independent.
From its definition, $\tilde S^f_k$ is a strictly increasing sequence of stopping times on the Markov Chain $\{\tilde X^f_n, \tilde Y^f_n\}$, with the property that
$$(\tilde X^f_{\tilde S^f_k}, \tilde Y^f_{\tilde S^f_k}) = (a, b), \quad k \in \mathbb N.$$
Therefore, by the corollary to Theorem 13.3 of [3], the collections $\{\tilde X^f_n, \tilde Y^f_n\}$, $\tilde S^f_k \le n \le \tilde T^f_k$, $k \in \mathbb N$, are independent.
Therefore, the beginning of every cycle in the $\{\tilde X^f_n, \tilde Y^f_n\}$ process marks a regeneration in the indicator process. At $n = \tilde S^f_{k+1}$, the beginning of the $(k+1)$st cycle, there have been $G_k + 1$ visits to the state $(a, b)$. Therefore, the process $\{I^{fx}_k\}$ regenerates at all $G_k + 1$ instants. The same is true of the $\{I^{fy}_k\}$ process. $\square$
Our main result is the following:
Theorem 11. The estimates $\tilde\pi^f_k(a, b)$ and $\tilde\psi^f_k(a, b)$ converge to $\pi(a, b)$ and $\psi(a, b)$ respectively, wp 1, i.e.,
$$\lim_{k \to \infty} \tilde\pi^f_k(a, b) = \pi(a, b), \qquad \lim_{k \to \infty} \tilde\psi^f_k(a, b) = \psi(a, b).$$
Moreover, the average expected loss incurred under the dynamic policy $f$ converges to that incurred by the optimal static policy $f^*$, i.e.,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} E[L^f_k] = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} E[L^{f^*}_k].$$
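The second claim rests on a Cesàro-averaging observation: once the per-slot expected losses under the two policies agree beyond some finite time, their running averages share the same limit. A minimal numeric sketch, in which the loss values and the cutoff $k_2 = 50$ are invented for illustration:

```python
# Invented numeric sketch of the Cesaro-averaging step: two per-slot expected
# loss sequences that coincide after a finite cutoff k2 have running averages
# with the same limit; the transient contributes O(k2/n) and vanishes.

def cesaro(seq_fn, n):
    """Running average (1/n) * sum_{k=1}^{n} seq_fn(k)."""
    return sum(seq_fn(k) for k in range(1, n + 1)) / n

k2 = 50
f_loss = lambda k: 1.0 if k <= k2 else 0.2   # dynamic policy: transient, then agrees
fstar_loss = lambda k: 0.2                   # optimal static policy: constant
gap = abs(cesaro(f_loss, 100_000) - cesaro(fstar_loss, 100_000))
```

Here the residual gap is $0.8 \cdot k_2 / n = 0.0004$, which shrinks to zero as $n$ grows.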
Proof. Since by Lemma 10 the processes $\{I^{fx}_k\}$, $k \in \mathbb N$, and $\{I^{fy}_k\}$, $k \in \mathbb N$, are regenerative, and by Lemma 9 their distribution is known, we can apply the theorem on convergence of time-averages for regenerative processes from [8]. Therefore,
$$\lim_{k \to \infty} \tilde\pi^f_k(a, b) = \lim_{k \to \infty} \frac{\sum_{i=1}^{G^f_k} I^{fx}_i}{G^f_k} = E[I^{fx}_k] = \pi(a, b).$$
Similarly,
$$\lim_{k \to \infty} \tilde\psi^f_k(a, b) = \lim_{k \to \infty} \frac{\sum_{i=1}^{G^f_k} I^{fy}_i}{G^f_k} = E[I^{fy}_k] = \psi(a, b).$$
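The time-average estimator above can be exercised on a surrogate: for iid Bernoulli($\pi$) indicator values, the running fraction of ones converges to $\pi$ by the strong law of large numbers. The parameters below ($\pi = 0.7$, the seed, and the number of visits) are invented for illustration.

```python
# Surrogate check (invented parameters): the running fraction of ones among
# iid Bernoulli(pi) indicators -- the analogue of the estimate pi~_k(a, b) --
# concentrates around pi as the number of observed visits grows.
import random

random.seed(1)
pi = 0.7
G = 100_000  # number of observed visits (illustrative)
indicators = [1 if random.random() < pi else 0 for _ in range(G)]
estimate = sum(indicators) / G  # analogue of pi~_G(a, b)
```

With $10^5$ samples the standard deviation of the estimate is about $0.0014$, so it sits well within $0.01$ of $\pi$.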
For the second result, we note from Section 4.3 that the decision made by the dynamic policy is determined by the sign of the difference $\tilde\pi^f_k(a, b) - \tilde\psi^f_k(a, b)$. If this sign is the same as that of $\pi(a, b) - \psi(a, b)$ (for both $(a, b) = (2, K-1)$ and $(a, b) = (K-1, 2)$), then the dynamic policy makes the same decision as the optimal static policy, $f^*$.
For any one value of $(a, b)$, say $(a, b) = (2, K-1)$, assume without loss of generality that the sign of $\pi(a, b) - \psi(a, b)$ is positive. Then, since $\tilde\pi^f_k(a, b)$ converges to $\pi(a, b)$ and $\tilde\psi^f_k(a, b)$ to $\psi(a, b)$, there must be a positive integer time, say $k_0$, such that
$$|\tilde\pi^f_k(a, b) - \pi(a, b)| < \frac{\pi(a, b) - \psi(a, b)}{2}, \quad \forall k > k_0,$$
$$|\tilde\psi^f_k(a, b) - \psi(a, b)| < \frac{\pi(a, b) - \psi(a, b)}{2}, \quad \forall k > k_0.$$
Therefore, we get that $\tilde\pi^f_k(a, b) - \tilde\psi^f_k(a, b) > 0$, $\forall k > k_0$, i.e., $\tilde\pi^f_k(a, b) - \tilde\psi^f_k(a, b)$ has the same sign as $\pi(a, b) - \psi(a, b)$ after time $k_0$. A similar relationship can be proved for the other value, $(a, b) = (K-1, 2)$. Let the time after which that result holds be $k_1 \in \mathbb N_0$, and let $k_2$ be the maximum of $k_0$ and $k_1$. We can see that after time $k_2$, the dynamic policy, $f$, starts taking the same decisions as the optimal policy, $f^*$, with the result
that $\{\hat X^f_n, \hat Y^f_n\}$, $n \ge k_2$, has the same finite-dimensional joint probability mass function as $\{\hat X^{f^*}_n, \hat Y^{f^*}_n\}$, $n \in \mathbb N_0$. This fact proves the second claim of the theorem. $\square$
6. Conclusions
In this paper, we considered the problem of minimizing average expected cell losses in an ATM shared buffer switch with two input and two output ports. Based on a static policy, we defined a dynamic buffer management policy that does not require knowledge of the pmf of the input arrival process. We have shown that when the input processes to the switch are iid Bernoulli (with unknown parameters), the estimates used by the dynamic policy converge to the parameters used by the optimal static policy; therefore, the cost incurred by the dynamic policy converges to that incurred by the optimal policy, proving that the dynamic policy is also optimal.
The main difficulty in this work lay first in identifying the estimates to be used, and then in proving their regenerative properties. However, the dynamic policy itself is simple to implement and has the advantage over the optimal static policy that it does not need to know the pmf of the arrival process, information that is unlikely to be available in a real network. Arrival processes to switches in real networks may not always be iid Bernoulli. When these processes are modeled by Discrete Batch Markov Arrival Processes [2], the optimal static policy has been identified in [7] to have a form similar to that of the optimal static policy for the iid Bernoulli case. This leads us to believe that a dynamic policy similar to the one presented in this paper can be proven to be optimal as well. However, this work is left for the future.
References
[1] R. Bhattacharya, E. Waymire, Stochastic Processes and Applications, Wiley, New York, 1990.
[2] C. Blondia, O. Casals, Performance analysis of statistical multiplexing of VBR sources, in: Proc. IEEE Infocom, Florence, Italy, May 1992, Vol. 2, pp. 828–838 (6C.2).
[3] K.-L. Chung, Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.
[4] M.G. Hluchyj, M.J. Karol, Queueing in high-performance packet switching, IEEE J. Selected Areas Commun. 6 (9) (1988) 1587–1597.
[5] F. Kamoun, L. Kleinrock, Analysis of shared finite storage in a computer network node environment under general traffic conditions, IEEE Trans. Commun. 28 (1980) 992–1003.
[6] S. Karlin, H.M. Taylor, A First Course in Stochastic Processes, 2nd ed., Academic Press, San Diego, CA, 1975.
[7] S. Sharma, Optimal buffer management in shared buffer ATM switches, Ph.D. Thesis, North Carolina State University, Raleigh, NC, 1996.
[8] R.W. Wolff, Stochastic Modeling and the Theory of Queues, Prentice-Hall, Englewood Cliffs, NJ, 1989.
Supriya Sharma received a B.E. from VRCE, Nagpur, India, in '91, an M.S. from Clemson Univ., S.C., in '93, and a Ph.D. from NCSU, Raleigh, N.C., in '96. She currently works in the Strategic Network Planning group in Alcatel USA. Her research interests are data network architecture, traffic management, quality of service, and performance analysis of networking software. Supriya was awarded the National Merit Scholarship from the Govt. of India from '86 to '91, the Motorola Scholarship in '91, and an Internet Achievement Award for her work on Patricia trees at IBM, RTP, in '97. Her e-mail address is [email protected]
Yannis Viniotis received his Ph.D. from the University of Maryland, College Park, in 1988 and is currently Associate Professor of Electrical and Computer Engineering at North Carolina State University. Dr. Viniotis is the author of over fifty technical publications, including an engineering textbook on probability and random processes (published by McGraw-Hill, 1998). He has served as the guest editor of a special issue of the Performance Evaluation Journal and as the co-editor of the proceedings of two international conferences in computer networking. His research interests include quality of service issues in high speed networks, multicast routing, traffic management, and design and analysis of stochastic algorithms. Dr. Viniotis is the Associate Director of the Masters in Computer Networking Program, a program offered jointly by the Departments of Electrical and Computer Engineering, Computer Science, and the School of Business at NCSU; he is also the co-founder of a startup networking company in Research Triangle Park that specializes in ASIC implementation of integrated QoS solutions for IP and ATM networks.