[ieee 2012 2nd ieee international conference on parallel, distributed and grid computing (pdgc) -...



Page 1: [IEEE 2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing (PDGC) - Solan, India (2012.12.6-2012.12.8)] 2012 2nd IEEE International Conference on Parallel,

2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing

Deterministic 1-2 Skip List in Distributed System

Subhrangsu Mandal, Sandip Chakraborty, Sushanta Karmakar

Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Assam, India

Email: {subhrangsu, c.sandip, sushantak}@iitg.ernet.in

Abstract-Searching data efficiently in distributed applications such as peer-to-peer systems is a challenging task because data is distributed randomly among the participating nodes. Efficient data structures are designed and implemented to reduce the complexity of searching in such an environment. In this paper, a data structure called the deterministic 1-2 skip list is proposed as a solution to search problems in a distributed environment. The data structure has three main operations: search, insert, and delete. Each operation is described in detail. The message complexity of the insertion, deletion and search algorithms is shown to be $O(\log n)$, where $n$ is the total number of nodes in the skip list.

I. INTRODUCTION

Searching data efficiently in distributed applications such as peer-to-peer systems is a challenging task because data is distributed randomly among the participating nodes. Efficient data structures are designed and implemented to reduce the complexity of searching in such an environment. One widely used data structure is the distributed hash table (DHT), which has been implemented efficiently for structured peer-to-peer systems [1], overlay networks [2], [3] and content distribution systems [4]. The advantages of a DHT are its scalability and fault-tolerant architecture, which make it useful for several distributed search applications. However, because a DHT relies on hash-based lookup, it does not support range-query lookup and can serve only single-shot queries. Applications that require range queries therefore cannot be implemented efficiently using a DHT. In [5], the authors introduced distributed segment trees (DST) to support range queries over a DHT. A DST becomes insufficient when its leaf nodes get saturated, so scalability is a problem for DST.

The solution to the above problem requires an efficient list-like or tree-like data structure that supports range queries with improved scalability. One such data structure is the skip list [6], which has the properties of both a list and a balanced tree, and thus makes searching efficient for both single-shot queries and range queries. However, the skip list is designed and implemented for centralized architectures only. Compared to balanced search trees, it is a relatively simple data structure from the maintenance and implementation point of view. A skip list stores a sorted list of elements using a hierarchy of linked lists that connect increasingly sparse sequences of items. While searching in a skip list, some nodes can be probabilistically skipped based on the distribution of references to the next node. Fig. 1 shows a skip list. Suppose we need to search for 10 in this skip list. Then from the node with value 6, we can jump directly to the node with value 10, skipping the node with value 8. With a geometric or negative binomial distribution of references to the next nodes, the search, insert and delete operations in a skip list run in $O(\log n)$ on average, where $n$ is the total number of nodes [6].

Fig. 1. A skip list

The second author of this paper is supported by TATA Consultancy Services Research Fellowship, 2011, India.

Being probabilistic in nature, the skip list has some drawbacks. The nodes that appear in the next level are determined probabilistically, so the shape of the skip list is not deterministic. It is therefore not possible to give an upper bound on the worst-case insertion and maintenance cost. To overcome this difficulty, Munro et al. provided a deterministic version of the skip list [7], called the 1-2 skip list. A 1-2 skip list is a deterministic skip list in which there are either 1 or 2 nodes of height $(h-1)$ between any two nodes of height $h$ or higher. Here the height of a node denotes the maximum level of list to which the node belongs. The 1-2 deterministic skip list has a one-to-one correspondence with the 2-3 tree. 1-2 skip lists have a worst-case complexity of $O(\log n)$ for the search operation and $O(\log^2 n)$ for the update operation [7]. The randomized skip list has also been used to solve the problem of finding all intervals that overlap a point. To solve this problem, Hanson et al. proposed a data structure called the "interval skip list" (IS-list) [8]. The IS-list allows stabbing queries and dynamic insertion and deletion of intervals. In [9], the authors used a skip list in an efficient and practical technique for dynamically maintaining an authenticated dictionary. Applications of their work include certificate revocation in public key infrastructure and the publication of data collections on the Internet.
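The 1-2 gap property defined above can be made concrete with a small checker. The following is a minimal sketch, not part of the cited construction: it assumes the list is given only as the heights of its interior nodes in key order, and it models the header and null sentinels as one level taller than the tallest interior node. The function name `is_one_two_skip_list` is hypothetical.

```python
from typing import List


def is_one_two_skip_list(heights: List[int]) -> bool:
    """Check the 1-2 gap property on the heights of the interior nodes.

    heights[i] >= 1 is the height of the i-th node in key order. The header
    and null sentinels are modeled (an assumption of this sketch) as nodes
    one level taller than the tallest interior node.
    """
    if not heights:
        return True  # an empty list trivially satisfies the property
    top = max(heights) + 1
    # Pad with the two sentinels so every gap has a left and a right end.
    padded = [top] + list(heights) + [top]
    for h in range(2, top + 1):
        gap = 0  # nodes of height exactly h-1 seen in the current gap
        for height in padded[1:]:
            if height >= h:
                # A gap between two nodes of height >= h has just closed.
                if gap not in (1, 2):
                    return False
                gap = 0
            elif height == h - 1:
                gap += 1
    return True
```

For example, the heights `[1, 2, 1, 1, 2, 1]` satisfy the property, whereas `[1, 1, 1]` does not, because three nodes of height 1 lie between the two sentinels of height 2.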

There are some preliminary works in the literature that introduce the skip list as an alternative data structure for efficient searching in distributed peer-to-peer and overlay networks. Guerraoui et al. proposed a self-organizing and fully distributed overlay called GosSkip [10]. This overlay is based on a skip-list-like data structure designed for distributed environments. In this scheme no hashing is applied to data attributes, so it preserves semantic locality and supports range queries. In [11], the authors designed a distributed data structure called the skip graph to provide DHT-like functionality. Wang et al. developed a peer-to-peer asynchronous video streaming system using a skip list [12]. They used a randomized, distributed skip list to overcome the challenge of on-demand streaming with asynchronous requests. Clouser et al. proposed a peer-to-peer system called Tiara [13], in which they used a self-stabilizing sparse 0-1 skip list on top of a self-stabilizing sorted list. They first provided a self-stabilizing algorithm for a sorted list and then extended it to the sparse 0-1 skip list. Recently, Scheideler et al. proposed a self-stabilizing deterministic message-passing 1-2 skip list called Corona [14]. In this scheme every process periodically sends its status at each level to its neighbors by message passing.

978-1-4673-2925-5/12/$31.00 ©2012 IEEE

This paper describes a set of algorithms for implementing a deterministic 1-2 skip list in a distributed environment, supporting efficient searching and list stabilization after a new node is inserted or an existing node is deleted. The proposed algorithms overcome shortcomings of the existing architectures proposed in [13] and [14]. After a node is inserted into or deleted from Tiara [13], stabilization is violated, and in the worst case restoring it may take up to $O(n)$ steps. In a deterministic 1-2 skip list, stabilization is always achieved in $O(\log n)$ steps for both insert and delete operations. The Corona architecture [14] provides a self-stabilization algorithm for the deterministic 1-2 skip list by means of periodic broadcasting to check the skip-list properties. The system proposed in this paper stabilizes on the fly after an insertion or deletion procedure is executed, and thus does not require periodic broadcasting. The search, insertion and deletion algorithms each have $O(\log n)$ message complexity.

II. SYSTEM MODEL

In a distributed environment, a skip list can be considered as a collection of nodes that hold data elements and can belong to different lists at different levels. Each of these nodes is represented as a separate process. It is assumed that processes communicate only by sending messages. Channels are assumed to be asynchronous, FIFO and reliable. Each process holds a data item called the key, and the nodes are ordered by these keys. The list contains two special nodes, header and null, which are the first and the last node of the list respectively. Two dedicated processes act as the header and null nodes. The header node contains the key $-\infty$ and the null node contains the key $\infty$. The keys are sorted in ascending order in the list. Every node maintains some local variables: $data(v)$ is the local key stored at node $v$, $nd_{level}(v)$ is the list of keys of the right neighbors at every level, $d$ is the data to be searched in the list, and $level$ is the level currently being explored. Let $l_m(v)$ denote the maximum level of node $v$. $R_{level}(v)$ and $L_{level}(v)$ are the right neighbor and left neighbor, respectively, of node $v$ at the corresponding level. We assume no node or link failures in the system.
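The local state described above can be sketched as a small record per process. This is an illustrative, single-machine model only: the class name `Node`, its field names, and the helper `build` are assumptions introduced here, and the sentinels are given one level more than the tallest interior node, which the text does not prescribe.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Node:
    """Local state of one process; field names are illustrative."""
    key: float                 # data(v)
    max_level: int             # l_m(v)
    right: Dict[int, "Node"] = field(default_factory=dict, repr=False)  # R_level(v)
    left: Dict[int, "Node"] = field(default_factory=dict, repr=False)   # L_level(v)

    def nd(self, level: int) -> float:
        """nd_level(v): key of the right neighbor at the given level."""
        return self.right[level].key


def build(keys: List[float], heights: List[int]) -> Node:
    """Link sorted keys into levels 1..top and return the header node."""
    top = max(heights, default=0) + 1   # sentinel height (assumption)
    header = Node(-math.inf, top)       # header holds the key -infinity
    null = Node(math.inf, top)          # null holds the key +infinity
    nodes = [header] + [Node(k, h) for k, h in zip(keys, heights)] + [null]
    for level in range(1, top + 1):
        members = [n for n in nodes if n.max_level >= level]
        for a, b in zip(members, members[1:]):
            a.right[level] = b
            b.left[level] = a
    return header
```

In a real deployment each `Node` would be a separate process and `right`/`left` would hold process identifiers rather than object references.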


III. DATA SEARCHING IN A DISTRIBUTED 1-2 SKIP LIST

The search procedure is started from the header node of the skip list. If the search procedure is successful, it returns the process id of the node that contains the key value. If search is unsuccessful, the procedure returns an error message to the header node. The whole procedure is initiated by a message ("SEARCH", data, l) sent by the header node. The searching procedure is described in Algorithm 1.

Algorithm 1 received("SEARCH", d, l)_{u,v}
 1: if v = header then
 2:   level := l_m(v)
 3: else
 4:   level := l
 5: end if
 6: if data(v) = d then
 7:   return(v)
 8: else if data(v) < d then
 9:   while level ≠ 0 do
10:     if nd_level(v) ≤ d then
11:       send("SEARCH", d, level) to R_level(v)
12:       break
13:     else
14:       level := level - 1
15:     end if
16:   end while
17: end if
18: if level = 0 then
19:   send an "ERROR" message to header
20: end if
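To see how Algorithm 1 behaves on a concrete list, the hop-by-hop forwarding can be simulated in a single process. The sketch below is an assumption-laden illustration, not the distributed implementation: message passing is replaced by a loop, `build` and `search` are hypothetical helpers, the sentinels are modeled one level taller than the tallest interior node, and one extra message is counted for reporting the result or the error to the header.

```python
import math
from typing import List, Optional, Tuple


class Node:
    def __init__(self, key: float, max_level: int) -> None:
        self.key = key              # data(v)
        self.max_level = max_level  # l_m(v)
        self.right = {}             # level -> right neighbor, R_level(v)


def build(keys: List[float], heights: List[int]) -> Node:
    """Build a linked skip list from sorted keys and return the header."""
    top = max(heights, default=0) + 1   # sentinel height (assumption)
    nodes = [Node(-math.inf, top)]
    nodes += [Node(k, h) for k, h in zip(keys, heights)]
    nodes.append(Node(math.inf, top))
    for level in range(1, top + 1):
        members = [n for n in nodes if n.max_level >= level]
        for a, b in zip(members, members[1:]):
            a.right[level] = b
    return nodes[0]


def search(header: Node, d: float) -> Tuple[Optional[Node], int]:
    """Follow Algorithm 1 for a finite d; return (node or None, messages)."""
    v, level, messages = header, header.max_level, 0
    while True:
        if v.key == d:
            return v, messages + 1        # one message reports the result
        while level != 0:                 # lines 9-16; data(v) < d holds here
            if v.right[level].key <= d:   # nd_level(v) <= d
                v = v.right[level]        # forward SEARCH to R_level(v)
                messages += 1
                break
            level -= 1
        if level == 0:
            return None, messages + 1     # ERROR message to the header
```

With keys `[2, 4, 6, 8, 10, 12]` and heights `[1, 2, 1, 1, 2, 1]`, searching for 12 forwards the message through the nodes holding 4, 10 and 12, and searching for 7 ends with an error after reaching the node holding 6 at level 1.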

The following lemma and theorems establish the correctness of the search procedure.

Lemma 1: Assume node $N_i$ receives the SEARCH token at level $l_i$ and forwards it to node $N_j$ at level $l_j$. Then $l_i \geq l_j$.

Proof: According to Algorithm 1, when a node $N_i$ receives a SEARCH token at level $l_i$, it performs some local computation. According to line 8 of Algorithm 1, if the key of the node is less than the data being searched, the search procedure continues. In that case the data is compared with the key of the next node, according to line 10. If that key is less than or equal to the data, the message is forwarded to the next node at the same level. Otherwise the level is decremented, according to line 14, until it reaches 0. Thus whenever the search message is forwarded, the level value either remains the same or has been decremented. This implies $l_i \geq l_j$.

Theorem 1: The search algorithm eventually terminates, either with a search failure or with the address of the node that contains the search data.

Proof: Let $l_{N_i}$ and $d_{N_i}$ denote the level and key at node $N_i$ to which the "SEARCH" message is forwarded. From Lemma 1,

(2)

As the list is sorted in ascending order,

(3)

Now we have $d_{header} = -\infty$ and $d_{null} = \infty$, so $d_{header} < data < d_{null}$. Thus from equation (2) and equation (3), it can be concluded that the traversal of "SEARCH" messages forms a partially ordered set in decreasing order of level values, which implies that the level eventually becomes zero, at which point an unsuccessful search terminates. If the search is successful, it terminates immediately, returning the result. ∎

Theorem 2: The search procedure has $O(\log n)$ message complexity.

Proof: In the proposed distributed searching algorithm, every step requires one search message to be forwarded to the next process. If the data is not found, the level is decreased by one. At any level there can be at most two nodes between any two nodes of height greater than the current level, so at each level we step down after forwarding at most two messages. There are $\log n + 1$ levels in the skip list [7]. So at most $2 \log n + 2$ messages are sent for searching. Finally, one message is required to send the result to the header. So the message complexity is asymptotically bounded by $O(\log n)$. ∎

Note that the search operation can be extended to support range queries in this data structure. As the list is in sorted order, a range query can be implemented using two search operations.

IV. INSERTION AND DELETION FROM A DISTRIBUTED 1-2 SKIP LIST

Every node $v$ uses two local variables $Y_1(v)$ and $Y_2(v)$. Let $T$ stand for the true predicate and $F$ for the false predicate. Let $L_{l_m(v)}(v) = u$ and $R_{l_m(v)}(v) = w$. Then,

Definition 1: $Y_1(v) = T$ implies $l_m(u) > l_m(v)$.

Definition 2: $Y_2(v) = T$ implies (i) $Y_1(u) = T$, (ii) $l_m(u) = l_m(v)$ and (iii) $l_m(w) > l_m(v)$.

The following property can be derived from the definitions of $Y_1$ and $Y_2$.

Property 1: For a node $v$, $Y_1(v) = T \Rightarrow Y_2(v) = F$ and $Y_2(v) = T \Rightarrow Y_1(v) = F$.

Let $v$ be a node that violates the properties of the 1-2 skip list after an insertion or deletion. Let $L_{l_m(v)}(v) = u$ and $R_{l_m(v)}(v) = w$. Then there are 9 possible conditions, (1a)-(1i), given in Table I, that denote the state of node $v$. These conditions can be derived from Definition 1, Definition 2 and Property 1.

TABLE I
CONDITIONS FOR 1-2 SKIP LIST MAINTENANCE

$((l_m(u) = l_m(v)) \wedge (Y_1(u) = T)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_2(w) = T))$ (1a)

$((l_m(u) = l_m(v)) \wedge (Y_1(u) = T)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_1(w) = T))$ (1b)

$((l_m(u) = l_m(v)) \wedge (Y_2(u) = T)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_1(w) = T))$ (1c)

$((l_m(u) = l_m(v)) \wedge (Y_2(u) = T)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_2(w) = T))$ (1d)

$((l_m(u) = l_m(v)) \wedge (Y_2(u) = T)) \wedge (l_m(w) > l_m(v))$ (1e)

$((l_m(u) = l_m(v)) \wedge (Y_1(u) = T)) \wedge (l_m(w) > l_m(v))$ (1f)

$(l_m(u) > l_m(v)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_2(w) = T))$ (1g)

$(l_m(u) > l_m(v)) \wedge ((l_m(w) = l_m(v)) \wedge (Y_1(w) = T))$ (1h)

$(l_m(u) > l_m(v)) \wedge (l_m(w) > l_m(v))$ (1i)

A. Insertion Procedure

Let $v$ be the node to be inserted into a 1-2 skip list, with $L_{l_m(v)}(v) = u$ and $R_{l_m(v)}(v) = w$. We have the following properties. Their proofs are omitted due to space constraints.

Property 2: Let the following conditions be true,

$l_m(u) = l_m(v)$ (4a)

$Y_2(u) = T$ (4b)

Then, $l_m(w) > l_m(v)$.

Property 3: Let the following conditions be true along with Condition (4a),

$Y_1(u) = T$ (5a)

$l_m(w) = l_m(v)$ (5b)

Then, $Y_2(w) = T$.

Property 4: Let the following condition be true along with Condition (5b),

$l_m(u) > l_m(v)$ (6a)

Then, $Y_1(w) = T$.

Note that, by Property 2, Conditions (1c) and (1d) cannot evaluate to $T$ when a new node is inserted. Similarly, Condition (1b) cannot hold by Property 3, and Condition (1g) cannot hold by Property 4, during the insertion procedure. So we are left with five possible conditions. Once a node $v$ receives the state information of $u$ and $w$ using a pair of message transmissions, it executes one of the five possible actions shown in Table II.

TABLE II
INSERTION PROCEDURE FOR 1-2 SKIP LIST

Condition (1a) $\rightarrow$ upgrade($v$), $Y_1(w) = T$ (7a)

Condition (1e) $\rightarrow$ upgrade($u$), $Y_1(v) = T$ (7b)

Condition (1f) $\rightarrow$ $Y_2(v) = T$ (7c)

Condition (1h) $\rightarrow$ checkup($w$) (7d)

Condition (1i) $\rightarrow$ $Y_1(v) = T$ (7e)

Condition (1a) denotes that before insertion $u$ and $w$ were at the same maximum level. So after insertion $v$ should be upgraded to $l_m(u) + 1$. Once $v$ is upgraded, $Y_1(w)$ should be set to $T$. Condition (1e) denotes that before insertion $u$ and $L_{l_m(u)}(u)$ were at the same level. So after insertion $u$ should be upgraded to $l_m(u) + 1$, and $Y_1(v)$ is set to $T$. Condition (1f) denotes that before insertion $Y_1(u) = T$ and after insertion $l_m(w) > l_m(v)$, so $Y_2(v)$ is set to $T$. Condition (1h) denotes that after insertion $l_m(u) > l_m(v)$ and $Y_1(w) = T$. The following property shows the limitation of Condition (1h).

Property 5: Node $v$ can always determine whether an upgrade is possible at $l_m(u)$, but not at $l_m(w)$.

So, by Property 5, node $v$ sends a checkup($w$) message to node $w$ to check for any possible upgrade. The following property limits the checkup operation to a single hop.

Property 6: If a node $x$ receives checkup from $y$, where $x = R_{l_m(y)}(y)$, then Condition (1h) cannot hold at node $x$.

Hence the checkup message does not propagate further. Condition (1i) denotes that after insertion both $u$ and $w$ are at a higher maximum level, so $v$ sets $Y_1(v)$ to $T$.

The insertion procedure works as follows:

Step 1. Let $v$ be the node to be inserted. Node $v$ first sends a query to the header node, which in turn finds $u_i = L_i(v)$ and $w_i = R_i(v)$, the probable left and right neighbors of node $v$ respectively, using a procedure similar to the search operation. Here $i$ denotes the corresponding level of the skip list. Node $v$ then sends an LNODE message to $u$ and an RNODE message to $w$, carrying its level information, to the neighbors at level $l$, as shown in Algorithm 2 and Algorithm 3.

Step 2. On receiving LNODE and RNODE with the level information, node $u$ and node $w$ evaluate the stability conditions according to Table I and send a RESPONSE message back to $v$. The RESPONSE message contains a parameter $m$, whose value is defined as follows for node $u$:
• If $((l_m(u) = l_m(v)) \wedge (Y_1(u) = T))$, then $m = 1$
• If $((l_m(u) = l_m(v)) \wedge (Y_2(u) = T))$, then $m = 2$
• If $l_m(u) > l_m(v)$, then $m = 3$
The value of $m$ is defined similarly for node $w$.

Step 3. The operations performed by node $v$ on receiving RESPONSE from both $u$ and $w$ are shown in Algorithm 4. Node $v$ first sets its $state_{i,j}$ value to 1, where $i$ is the $m$ value, and $j = 0$ if the message is received from $u$ and $j = 1$ if it is received from $w$. Based on the $state_{i,j}$ values, node $v$ then executes the operation according to Table II. The upgrade procedure is shown in Algorithm 6.

Step 4. If node $v$ cannot take the necessary actions based on the information from $w$, as described in Property 5, it sends a CHECKUP message to node $w$. The handling of the CHECKUP message is shown in Algorithm 5.

Step 5. After all possible upgrades, if an upgrade is required at header or null, as shown in Algorithm 6, it is executed, and the insertion operation terminates.

Algorithm 2 Node v received "LNODE" from node u (here v denotes the receiving node and u the sender)
  R_l(v) := u
  if v ≠ header ∧ l_m(v) = l then
    if Y1(v) = T then
      send("RESPONSE", 1) to u
    else
      send("RESPONSE", 2) to u
    end if
  else
    send("RESPONSE", 3) to u
  end if

Algorithm 3 Node v received "RNODE" from node u (here v denotes the receiving node and u the sender)
  L_l(v) := u
  if v ≠ header ∧ l_m(v) = l then
    if Y1(v) = T then
      send("RESPONSE", 2) to u
    else
      send("RESPONSE", 1) to u
    end if
  else
    send("RESPONSE", 3) to u
  end if
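The way Algorithms 2 and 3 choose the value carried in RESPONSE can be summarized in one small function. This is a sketch under stated assumptions: the function name `response_m` and its parameters are hypothetical, and the test "receiver is not the header and its maximum level equals $l$" is passed in as a single boolean rather than computed from node state.

```python
def response_m(message: str, same_level: bool, y1: bool) -> int:
    """Return the m value a neighbor puts in its RESPONSE.

    message    -- "LNODE" (receiver is the left neighbor, Algorithm 2)
                  or "RNODE" (receiver is the right neighbor, Algorithm 3)
    same_level -- True when the receiver is not the header and its maximum
                  level equals the level l carried by the message
    y1         -- the receiver's Y1 flag
    """
    if not same_level:
        return 3
    if message == "LNODE":
        return 1 if y1 else 2
    if message == "RNODE":
        return 2 if y1 else 1
    raise ValueError(f"unexpected message type: {message!r}")
```

Note that, following the pseudocode, the two message types assign the values 1 and 2 in opposite ways for the same $Y_1$ flag.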

Algorithm 4 Node v received "RESPONSE" from u or w
  set the state_{i,j} value based on m and the sender (u or w)
  if state_{1,0} = 1 ∧ state_{1,1} = 1 then
    upgrade(v)
  else if state_{1,0} = 1 ∧ state_{3,1} = 1 then
    Y2(v) := T
  else if state_{2,0} = 1 then
    Y1(v) := T
    upgrade(L_i(v))
  else if state_{2,1} = 1 ∧ state_{3,0} = 1 then
    Y1(v) := T
    send("CHECKUP") to w
  else if state_{3,0} = 1 ∧ state_{3,1} = 1 then
    Y1(v) := T
  end if
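The branch structure of Algorithm 4 can be read as a lookup on the pair of $m$ values received from $u$ and $w$. The sketch below is illustrative only: `insertion_actions` is a hypothetical name, the returned strings are labels invented here rather than messages of the protocol, and raising an error for the combinations excluded by Properties 2-4 is a choice of this sketch.

```python
from typing import List


def insertion_actions(m_u: int, m_w: int) -> List[str]:
    """Map the two RESPONSE values to the branch Algorithm 4 would take.

    m_u is the m value received from the left neighbor u (state_{m_u,0}),
    m_w the one from the right neighbor w (state_{m_w,1}).
    """
    if m_u == 1 and m_w == 1:
        return ["upgrade(v)"]
    if m_u == 1 and m_w == 3:
        return ["Y2(v) := T"]
    if m_u == 2:
        # upgrade(L_i(v)) in Algorithm 4; the left neighbor is u.
        return ["Y1(v) := T", "upgrade(u)"]
    if m_w == 2 and m_u == 3:
        return ["Y1(v) := T", "send CHECKUP to w"]
    if m_u == 3 and m_w == 3:
        return ["Y1(v) := T"]
    # Combinations ruled out for insertion by Properties 2-4.
    raise ValueError(f"no branch for m_u={m_u}, m_w={m_w}")
```

For instance, `insertion_actions(3, 2)` corresponds to Condition (1h) and yields the CHECKUP step, while `insertion_actions(1, 2)` corresponds to Condition (1b), which cannot arise during insertion.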

1) Correctness: The following lemmas and theorems can be used to check the correctness and termination of the proposed algorithm. Their proofs are omitted due to space constraints; however, they can be derived with little effort.


Algorithm 5 Node w received "CHECKUP" from node v
  if l_m(R_{l_m(w)}(w)) = l_m(w) then
    Y1(R_{l_m(w)}(w)) := T
    upgrade(w)
  else
    Y2(w) := T
  end if

Algorithm 6 Procedure upgrade(u)
  level := level + 1
  l_m(u) := level
  send("LNODE", level) to L_level(u)
  send("RNODE", level) to R_level(u)
  if level > l_m(header) then
    send("UPHEAD") to header   /* upgrade header */
    send("UPNULL") to null     /* upgrade null */
    L_level(u) := header
    R_level(u) := null
  end if

Theorem 3: The insertion algorithm eventually terminates.

Either Condition (1f) or Condition (1i) evaluates to true, or control reaches the maximum level of the skip list, where no further upgrade is required. The algorithm terminates at this point.

Theorem 4: After the insertion procedure, the properties of the 1-2 skip list remain maintained.

2) Message Complexity:

Theorem 5: The insertion procedure has $O(\log n)$ message complexity.

Proof: Finding all the neighbors of the new node takes $2 \times O(\log n)$ messages. According to [7], there can be $O(\log n)$ levels in an $n$-node skip list. So there can be $O(\log n)$ upgrades after a new node is inserted. Every upgrade requires a pair of message communications. So, overall, the insertion procedure uses $O(\log n)$ messages. ∎

B. Deletion Procedure

Deletion requires downgrade operations as well as upgrade operations to keep a 1-2 skip list stable. Consider Fig. 2(a). Suppose node $v$ is deleted. Then either node $u$ or node $w$ needs to be downgraded. We have the following properties for deletion:

Property 7: Let node $v$ be deleted from a stable 1-2 skip list. Let $u = L_{l_m(v)}(v)$ and $w = R_{l_m(v)}(v)$. Then the downgrade operation can occur at either $l_m(u)$ or $l_m(w)$.

Property 8: Let node $v$ be downgraded in a 1-2 skip list. Let $l'_m(v)$ be the maximum level of node $v$ before the downgrade operation, and let $u' = L_{l'_m(v)}(v)$ and $w' = R_{l'_m(v)}(v)$. Then the downgrade operation can occur at either $l_m(u')$ or $l_m(w')$.

Property 9: There cannot be more than a one-level downgrade operation at any node.

Property 10: Let node $v$ be downgraded and let $l_m(v)$ be the maximum level of node $v$ after the downgrade operation. Let $u = L_{l_m(v)}(v)$ and $w = R_{l_m(v)}(v)$. If either $u$ or $w$ is upgraded as a result of this downgrade operation, then the downgrade operation terminates.

Fig. 2. Deletion in a skip list

TABLE III
DELETION PROCEDURE FOR 1-2 SKIP LIST

Condition (1a) $\rightarrow$ upgrade($v$), $Y_1(w) = T$ (8a)

Condition (1b) $\rightarrow$ upgrade($v$) (8b)

Condition (1c) $\rightarrow$ upgrade($v$) (8c)

Condition (1d) $\rightarrow$ upgrade($v$) (8d)

Condition (1e) $\rightarrow$ upgrade($u$) (8e)

Condition (1f) $\rightarrow$ $Y_2(v) = T$ (8f)

Condition (1g) $\rightarrow$ $Y_1(v) = T$ (8g)

Condition (1h) $\rightarrow$ checkup($w$) (8h)

Condition (1i) $\rightarrow$ $Y_1(v) = T$ (8i)
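Table III can be read as a two-stage lookup: first decide which of Conditions (1a)-(1i) from Table I holds at node $v$, then take the listed action. The following sketch illustrates that reading. The names `classify` and `DELETION_ACTIONS` are hypothetical, the neighbor state is passed in as plain arguments rather than gathered by messages, and returning `None` when no condition matches is a choice of this sketch.

```python
from typing import Optional

# Table III as a lookup from condition label to the action taken at node v.
DELETION_ACTIONS = {
    "1a": "upgrade(v), Y1(w) = T",
    "1b": "upgrade(v)",
    "1c": "upgrade(v)",
    "1d": "upgrade(v)",
    "1e": "upgrade(u)",
    "1f": "Y2(v) = T",
    "1g": "Y1(v) = T",
    "1h": "checkup(w)",
    "1i": "Y1(v) = T",
}


def classify(lm_u: int, y1_u: bool, y2_u: bool,
             lm_v: int,
             lm_w: int, y1_w: bool, y2_w: bool) -> Optional[str]:
    """Return the label (1a)-(1i) of Table I that holds at node v, if any."""
    def side(lm: int, y1: bool, y2: bool) -> Optional[str]:
        # Summarize one neighbor relative to v's maximum level.
        if lm > lm_v:
            return "higher"
        if lm == lm_v and y1:
            return "Y1"
        if lm == lm_v and y2:
            return "Y2"
        return None

    table = {
        ("Y1", "Y2"): "1a", ("Y1", "Y1"): "1b", ("Y2", "Y1"): "1c",
        ("Y2", "Y2"): "1d", ("Y2", "higher"): "1e", ("Y1", "higher"): "1f",
        ("higher", "Y2"): "1g", ("higher", "Y1"): "1h",
        ("higher", "higher"): "1i",
    }
    return table.get((side(lm_u, y1_u, y2_u), side(lm_w, y1_w, y2_w)))
```

For example, a left neighbor at the same level with $Y_2 = T$ and a right neighbor at a higher level is classified as (1e), for which Table III prescribes upgrade($u$).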

Property 11: The downgrade operation terminates after $O(\log n)$ operations.

Once the downgrade operation has terminated according to Property 11, the next task is to check for possible upgrades. An upgrade can occur only in the neighborhood of the deleted node. The check for upgrades starts from the level-1 neighbors of the deleted node. The upgrade is similar to the operations described for the insertion procedure. Starting from the level-1 neighbors, every node uses a pair of message communications to obtain the information of its neighbors at its maximum level, and evaluates Conditions (1a)-(1i) to check for possible upgrades. The actions taken by a node $v$ are given in Table III. The following property optimizes the neighbor search for upgrades in the deletion procedure.

Property 12: Let node $v$ be deleted. Let $u_i = L_i(v)$ and $w_i = R_i(v)$ denote the $i$th-level left and right neighbors, respectively, of node $v$. Suppose $w_j$ ($j < i$) is upgraded to the $i$th level. Then $L_{l_m(w_j)}(w_j) = u_i$ and $R_{l_m(w_j)}(w_j) = w_i$. Similarly, if $u_j$ ($j < i$) is upgraded to the $i$th level, then $L_{l_m(u_j)}(u_j) = u_i$ and $R_{l_m(u_j)}(u_j) = w_i$.

The complete deletion procedure works as follows. The pseudocode is not given due to space constraints.

1) Let $v$ be the node to be deleted. It first sends its right neighbor information to its left neighbors so that they can check for possible downgrade operations. Then it constructs two lists, $list_1$ and $list_2$, containing its neighbor information at all levels on its left and right respectively. It then forwards an UPNEXT message to node $u = L_{l_m(v)}(v)$.

2) On receiving the UPNEXT message, the node first stores its local variables and then sends an LNEXT message to its right neighbor to update the neighbor information. If a downgrade operation is possible, it is performed at this stage, and the node then checks for upgrades made possible by the downgrade by sending a CHECKUP message. If there is no downgrade operation, it sends a CHECKUP message to the next node in $list_1$ to check for any possible upgrade. It then sends a CHECKST message to its left and right neighbors at the maximum level.

3) On receiving CHECKST, the node checks the conditions given in Table I and sends a RESPONSE message to node $v$ with an $m$ value, which is set in the same way as described for the insertion operation.

4) On receiving RESPONSE, node $v$ executes the operations according to Table III.

5) These operations are performed for all nodes in $list_1$ and $list_2$. When both lists become empty, a WRAPUP message is sent to the header to terminate the algorithm.

1) Correctness: Following lemmas and theorems show the correctness of the deletion procedure. The detailed proof are not given due to space constraints.

Lemma 2: The sizes of list1 and list2 are O(log n).

Theorem 6: The deletion procedure eventually terminates.

Theorem 7: After a node is deleted from the skip list, the properties of the 1-2 skip list are maintained.

2) Message Complexity:

Theorem 8: The deletion procedure has O(log n) message complexity.

Proof: There can be O(log n) levels in an n-node 1-2 skip list [7], so downgrade operations can occur at most O(log n) times. Every downgrade operation requires forwarding one DTOKEN message; thus the downgrade operations require O(log n) messages. Every upgrade requires one pair of message communications. From Theorem 12, checking for an upgrade does not require any extra search operation, and the neighbors can be extracted directly from the list Φ(i) piggybacked with the DTOKEN message. So the upgrades require another 2 × O(log n) messages in the worst case. The message complexity of the deletion procedure is therefore 3 × O(log n), which is essentially O(log n). ∎
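The count in the proof can be written out explicitly. As a sketch, assume a constant $c$ (not named in the paper) such that an n-node 1-2 skip list has at most $c \log n$ levels, and write $M_{\mathrm{del}}(n)$ for the number of messages used by one deletion (also a label introduced here for illustration):

```latex
\begin{align*}
M_{\mathrm{del}}(n)
  &\le \underbrace{c \log n}_{\text{one DTOKEN per downgrade}}
     + \underbrace{2\,c \log n}_{\text{one message pair per upgrade}} \\
  &= 3\,c \log n \\
  &= O(\log n).
\end{align*}
```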

V. DISCUSSION

It should be noted that in the proposed scheme the insertion and deletion procedures are atomic (i.e., insertion and deletion procedures cannot run concurrently with one another). However, search procedures can run concurrently. For insertion and deletion operations, serializability can be achieved with the cooperation of the header node. Every insertion and deletion procedure is initiated by the header node, so the header node can use a list to serialize the insertion and deletion operations. Only when an insertion or deletion procedure completes does the header node start executing the next procedure. As the system is assumed to be reliable, with no sudden failures, concurrency control in the system can be achieved with the cooperation of the header node.
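The serialization idea can be sketched as a header node that keeps pending updates in a FIFO list and starts the next one only after the current one completes. The class and method names below (`Header`, `submit`, `on_wrapup`) are hypothetical. Using WRAPUP as the completion signal follows the deletion procedure; that insertion signals completion in the same way is an assumption of this sketch.

```python
from collections import deque
from typing import Deque, List, Tuple

Operation = Tuple[str, int]  # ("insert" | "delete", key)


class Header:
    """Header node that serializes insert/delete procedures (illustrative only)."""

    def __init__(self) -> None:
        self.pending: Deque[Operation] = deque()  # waiting operations, FIFO
        self.busy = False                          # True while one update runs
        self.started: List[Operation] = []         # order in which updates began

    def submit(self, op: str, key: int) -> None:
        """Queue an update; it starts at once only if no update is running."""
        self.pending.append((op, key))
        self._start_next()

    def on_wrapup(self) -> None:
        """Called when the running procedure reports completion (WRAPUP)."""
        if not self.busy:
            return  # stray completion signal; nothing is running
        self.busy = False
        self._start_next()

    def _start_next(self) -> None:
        if self.busy or not self.pending:
            return
        self.busy = True
        # In a real system this is where the header would initiate the
        # distributed insertion or deletion procedure.
        self.started.append(self.pending.popleft())
```

Submitting an insert and then a delete starts only the insert; the delete begins after `on_wrapup()` is called. Search requests would bypass this queue, since searches are allowed to run concurrently.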

VI. CONCLUSION

In this paper, a set of algorithms has been proposed for a distributed deterministic 1-2 skip list that addresses the problem of searching a large data set in a distributed environment. It has been shown that the worst-case message complexity of the search, insertion and deletion procedures is O(log n). The proposed data structure is novel for distributed message-passing applications such as peer-to-peer and overlay systems, where it can be used effectively to solve the search problem. Enforcing fine-grained concurrency control for insert and delete operations without the help of the header node is an interesting direction for future work.

REFERENCES

[1] F. Dabek, B. Zhao, P. Druschel, J. Kubiatowicz, and I. Stoica, "Towards a common API for structured peer-to-peer overlays," in Proceedings of International Workshop on Peer-to-Peer Systems, 2003.

[2] J. Jannotti, D. K. Gifford, K. L. Johnson, M. F. Kaashoek, and J. W. O'Toole, Jr., "Overcast: reliable multicasting with an overlay network," in Proceedings of the 4th Conference on Symposium on Operating System Design & Implementation, 2000.

[3] N. J. A. Harvey, M. B. Jones, S. Saroiu, M. Theimer, and A. Wolman, "Skipnet: a scalable overlay network with practical locality properties," in Proceedings of the 4th Conference on USENIX Symposium on Internet Technologies and Systems, 2003.

[4] S. Androutsellis-Theotokis and D. Spinellis, "A survey of peer-to-peer content distribution technologies," ACM Comput. Surv., vol. 36, pp. 335-371, December 2004.

[5] C. Zheng, G. Shen, S. Li, and S. Shenker, "Distributed segment tree: Support range query and cover query over DHT," in Proceedings of the Fifth International Workshop on Peer-to-Peer Systems, Feb. 2006.

[6] W. Pugh, "Skip lists: a probabilistic alternative to balanced trees," Commun. ACM, vol. 33, no. 6, pp. 668-676, Jun. 1990.

[7] J. I. Munro, T. Papadakis, and R. Sedgewick, "Deterministic skip lists," in Proceedings of the Third Annual ACM-SIAM Symposium on Discrete Algorithms, 1992, pp. 367-375.

[8] E. N. Hanson and T. Johnson, "The interval skip list: A data structure for finding all intervals that overlap a point," in Proceedings of the 2nd Workshop on Algorithms and Data Structures, 1992, pp. 153-164.

[9] M. T. Goodrich and R. Tamassia, "Efficient authenticated dictionaries with skip lists and commutative hashing," Johns Hopkins Information Security Institute, Tech. Rep., 2001.

[10] R. Guerraoui, S. B. Handurukande, K. Huguenin, A.-M. Kermarrec, F. Le Fessant, and E. Riviere, "Gosskip, an efficient, fault-tolerant and self organizing overlay using gossip-based construction and skip-lists principles," in Proceedings of the Sixth IEEE International Conference on Peer-to-Peer Computing, 2006, pp. 12-22.

[11] J. Aspnes and G. Shah, "Skip graphs," ACM Trans. Algorithms, vol. 3, November 2007.

[12] D. Wang and J. Liu, "Peer-to-peer asynchronous video streaming using skip list," 2006, pp. 1397-1400.

[13] T. Clouser, M. Nesterenko, and C. Scheideler, "Tiara: A self-stabilizing deterministic skip list," in Proceedings of the 10th International Symposium of Stabilization, Safety, and Security of Distributed Systems, vol. 5340, 2008, pp. 124-140.

[14] R. M. Nor, M. Nesterenko, and C. Scheideler, "Corona: a stabilizing deterministic message-passing skip list," in Proceedings of the 13th International Symposium of Stabilization, Safety, and Security of Distributed Systems, 2011, pp. 356-370.